Commit Graph

17 Commits

Author SHA1 Message Date
Mahmoud Ashraf
dea24cbcc6 Upgrade to Silero-VAD V6 (#1373)
Co-authored-by: sssshhhhhh 193317444+sssshhhhhh@users.noreply.github.com
2025-10-14 15:29:56 +03:00
Mahmoud Ashraf
a0c3cb9802 Remove Silence in Batched transcription (#1297) 2025-08-06 03:30:59 +03:00
Mahmoud Ashraf
fbeb1ba731 get correct index for samples (#1336) 2025-08-06 03:17:45 +03:00
Purfview
11fd8ab301 Fix neg_threshold (#1191) 2024-12-29 14:38:58 +03:00
Purfview
1b24f284c9 Reduce VAD memory usage (#1198)
Co-authored-by: Mahmoud Ashraf <hassouna97.ma@gmail.com>
2024-12-12 15:23:30 +03:00
Purfview
8327d8cc64 Brings back original VAD parameters naming (#1181) 2024-12-01 20:41:53 +03:00
Mahmoud Ashraf
3e0ba86571 Remove torch dependency, Faster numpy Feature extraction (#1106) 2024-11-14 12:57:10 +03:00
Mahmoud Ashraf
203dddb047 replace NamedTuple with dataclass (#1105)
* replace `NamedTuple` with `dataclass`

* add deprecation warnings
2024-11-05 12:32:20 +03:00
Mahmoud Ashraf
2dbca5e559 Use Silero VAD in Batched Mode (#936)
Replace Pyannote VAD with Silero to reduce code duplication and requirements
2024-10-24 12:05:25 +03:00
Jilt Sebastian
eb8390233c New PR for Faster Whisper: Batching Support, Speed Boosts, and Quality Enhancements (#856)
Batching Support, Speed Boosts, and Quality Enhancements

---------

Co-authored-by: Hargun Mujral <83234565+hargunmujral@users.noreply.github.com>
Co-authored-by: MahmoudAshraf97 <hassouna97.ma@gmail.com>
2024-07-18 16:48:52 +07:00
Ki Hoon Kim
8d400e9870 Upgrade to Silero-Vad V5 (#884)
* Fix window_size_samples to 512

* Update SileroVADModel

* Replace ONNX file with V5 version
2024-07-01 15:40:37 +07:00
kh
ad58ba26ab Fix typo (#304)
https://github.com/snakers4/silero-vad/discussions/319#discussion-5081706
2023-06-16 07:37:45 +02:00
Guillaume Klein
4db549b800 Make get_speech_timestamps backward compatible with the previous usage (#259) 2023-05-24 15:49:36 +02:00
FlippFuzz
5d8f3e2d90 Implement VadOptions (#198)
* Implement VadOptions

* Fix line too long

./faster_whisper/transcribe.py:226:101: E501 line too long (111 > 100 characters)

* Reformatted files with black

* black .\faster_whisper\vad.py    
* black .\faster_whisper\transcribe.py

* Fix import order with isort

* isort .\faster_whisper\vad.py
* isort .\faster_whisper\transcribe.py

* Made recommended changes

Recommended in https://github.com/guillaumekln/faster-whisper/pull/198

* Fix typing of vad_options argument

---------

Co-authored-by: Guillaume Klein <guillaumekln@users.noreply.github.com>
2023-05-09 12:47:02 +02:00
Guillaume Klein
8cf5d5a4b3 Increase the default value of speech_pad_ms to 400 ms (#179) 2023-04-25 15:54:22 +02:00
Guillaume Klein
2f266eb844 Fix VAD index error when a predicted timestamps is too large (#107) 2023-04-03 19:34:54 +02:00
Guillaume Klein
19698c95f8 Support VAD filter (#95)
* Support VAD filter

* Generalize function collect_samples

* Define AudioSegment class

* Only pass prompt and prefix to the first chunk

* Add dict argument vad_parameters

* Fix isort format

* Rename method

* Update README

* Add shortcut when the chunk offset is 0

* Reword readme

* Fix end property

* Concatenate the speech chunks

* Cleanup diff

* Increase default speech pad

* Update README

* Increase default speech pad
2023-04-03 17:22:48 +02:00