|
1 | 1 | # silero-vad-hs
|
2 |
| -Voice activity detection powered by SileroVAD |
| 2 | + |
| 3 | +[License: MIT](https://opensource.org/licenses/MIT) [Hackage](https://hackage.haskell.org/package/silero-vad) |
| 4 | + |
| 5 | +Voice activity detection powered by SileroVAD. |
| 6 | + |
| 7 | +## Supported architectures |
| 8 | + |
| 9 | +- [linux-x64](https://github.com/qwbarch/silero-vad-hs/actions/workflows/linux-x64.yml) |
| 10 | +- [mac-arm64](https://github.com/qwbarch/silero-vad-hs/actions/workflows/mac-arm64.yml) |
| 11 | +- [mac-x64](https://github.com/qwbarch/silero-vad-hs/actions/workflows/mac-x64.yml) |
| 12 | +- [windows-x64](https://github.com/qwbarch/silero-vad-hs/actions/workflows/windows-x64.yml) |
| 13 | + |
| 14 | +## Quick start |
| 15 | + |
| 16 | +This README is a literate Haskell file. To run the example: |
| 17 | +```bash |
| 18 | +nix develop --command bash -c ' |
| 19 | + export LD_LIBRARY_PATH=lib:$(nix path-info .#stdenv.cc.cc.lib)/lib |
| 20 | + cabal run --flags="build-readme" |
| 21 | +' |
| 22 | +``` |
| 23 | + |
| 24 | +Imports needed for the example: |
| 25 | +```haskell |
| 26 | +import qualified Data.Vector.Storable as Vector |
| 27 | +import Data.Function ((&)) |
| 28 | +import Data.WAVE (sampleToDouble, WAVE (waveSamples), getWAVEFile) |
| 29 | +import Silero (withVad, withModel, detectSegments, detectSpeech, windowLength) |
| 30 | +``` |
| 31 | + |
| 32 | +This example uses the [WAVE](https://hackage.haskell.org/package/WAVE) library for simplicity. |
| 33 | +However, WAVE represents audio as a lazy linked list, which performs poorly. |
| 34 | +Prefer [wave](https://hackage.haskell.org/package/wave) when performance matters. |
| 35 | + |
| 36 | +```haskell |
| 37 | +main :: IO () |
| 38 | +main = do |
| 39 | + wav <- getWAVEFile "lib/jfk.wav" |
| 40 | +``` |
| 41 | +The functions below expect a ``Vector Float``, so convert the samples to that format: |
| 42 | +```haskell |
| 43 | + let samples = |
| 44 | +       concat (waveSamples wav) |
| 45 | +         & Vector.fromList |
| 46 | +         & Vector.map (realToFrac . sampleToDouble) |
| 47 | +``` |
| 48 | +Use ``detectSegments`` to detect the start/end times of voice activity segments. |
| 49 | +```haskell |
| 50 | + withVad $ \vad -> do |
| 51 | +   segments <- detectSegments vad samples |
| 52 | +   print segments |
| 53 | +   pure () |
| 54 | +``` |
| 55 | +Alternatively, use ``detectSpeech`` to get the probability that a single window contains speech: |
| 56 | +```haskell |
| 57 | + withModel $ \model -> do |
| 58 | +   probability <- detectSpeech model $ Vector.take windowLength samples |
| 59 | +   putStrLn $ "Probability: " <> show probability |
| 60 | +``` |
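
The ``detectSpeech`` example above only looks at the first window. To score a whole recording window by window, the samples first need to be cut into fixed-size chunks. The sketch below is one way to do that. The helper name `windows` is hypothetical and not part of this library, and dropping the trailing partial window is an assumption made here for simplicity (padding it with zeros would be another option).

```haskell
import qualified Data.Vector.Storable as Vector

-- | Hypothetical helper, not part of silero-vad:
-- split samples into consecutive, non-overlapping windows of exactly @n@ samples.
-- A trailing remainder shorter than @n@ is dropped.
windows :: Int -> Vector.Vector Float -> [Vector.Vector Float]
windows n samples
  | n <= 0 = []                      -- guard against a non-positive window size
  | Vector.length samples < n = []   -- not enough samples left for a full window
  | otherwise =
      Vector.take n samples : windows n (Vector.drop n samples)
```

Assuming ``detectSpeech`` and ``windowLength`` behave as in the example above, something like `traverse (detectSpeech model) (windows windowLength samples)` inside the ``withModel`` block would then yield one probability per window.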