Mahmoud Ashraf
93001a9438
bump version to 1.2.0
v1.2.0
2025-08-06 03:31:36 +03:00
Mahmoud Ashraf
a0c3cb9802
Remove Silence in Batched transcription ( #1297 )
2025-08-06 03:30:59 +03:00
Mahmoud Ashraf
fbeb1ba731
get correct index for samples ( #1336 )
2025-08-06 03:17:45 +03:00
Rishil
d3bfd0a305
feat: Allow loading of private HF models ( #1309 )
...
* feat: add HuggingFace auth token support to model download
* Format
2025-06-02 14:12:34 +03:00
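A minimal sketch of using a private Hugging Face model after this change. The private repo id below is a placeholder, and authentication goes through the standard `huggingface_hub` token mechanisms (`HF_TOKEN` / `login`) rather than any faster-whisper-specific keyword the PR may also expose.

```python
import os

from huggingface_hub import login

from faster_whisper import WhisperModel

# Authenticate so that gated/private repos can be downloaded. Either export
# HF_TOKEN before running, or call login() explicitly as below.
login(token=os.environ["HF_TOKEN"])  # assumes HF_TOKEN is set in the environment

# Placeholder repo id for a private CTranslate2 conversion of a Whisper model.
model = WhisperModel("your-org/private-whisper-ct2", device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.wav")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```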
Mahmoud Ashraf
43d4163fe0
Support distil-large-v3.5 ( #1311 )
2025-06-02 14:09:20 +03:00
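Assuming the new checkpoint is registered under a size string that follows the existing distil naming (`distil-large-v3.5` is an assumption from the PR title, not verified), loading it looks the same as for any other size:

```python
from faster_whisper import WhisperModel

# "distil-large-v3.5" is assumed to be the size string added by #1311; if the
# alias differs, pass the converted CTranslate2 repo id instead.
model = WhisperModel("distil-large-v3.5", device="cuda", compute_type="float16")

segments, info = model.transcribe("audio.wav", beam_size=5)
print(info.language, info.language_probability)
```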
Felix Mosheev
700584b2e6
feat: allow passing specific revision to download ( #1292 )
2025-04-30 00:55:48 +03:00
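A sketch of pinning a download to a specific repo revision. The `revision` keyword is assumed from the PR title and the `huggingface_hub` convention; a branch name, tag, or commit hash should all be accepted.

```python
from faster_whisper import WhisperModel, download_model

# Download a specific revision (branch, tag, or commit hash) of the model repo.
# The `revision` keyword is assumed here from the PR title.
model_dir = download_model("large-v3", revision="main")

# The returned local directory can be passed straight to WhisperModel.
model = WhisperModel(model_dir, device="cpu", compute_type="int8")
```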
David Jiménez
1383fd4d37
Update README.md with speaches instead of faster-whisper-server ( #1267 )
...
The project was previously named faster-whisper-server; it has been renamed to speaches, as it has evolved to support more than just ASR.
2025-03-20 17:20:26 +03:00
Mahmoud Ashraf
9e657b47cb
Bump version to 1.1.1
v1.1.1
2025-01-01 17:44:54 +03:00
Purfview
11fd8ab301
Fix neg_threshold ( #1191 )
2024-12-29 14:38:58 +03:00
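`neg_threshold` is the probability below which Silero VAD closes a speech chunk. A minimal sketch of overriding it, assuming the option names of the `VadOptions` dataclass; the values are illustrative.

```python
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")

# vad_parameters accepts a plain dict (or a VadOptions instance). `threshold`
# opens a speech chunk and `neg_threshold` closes it, so it should sit lower.
segments, info = model.transcribe(
    "audio.wav",
    vad_filter=True,
    vad_parameters=dict(threshold=0.5, neg_threshold=0.35, min_silence_duration_ms=500),
)
for segment in segments:
    print(f"{segment.start:.2f} {segment.end:.2f} {segment.text}")
```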
Dragoș Bălan
95164297ff
Add duration of audio and VAD removed duration to BatchedInferencePipeline ( #1186 )
...
Co-authored-by: MahmoudAshraf97 <hassouna97.ma@gmail.com>
2024-12-23 17:23:40 +02:00
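A sketch of reading those durations from the batched pipeline; the `TranscriptionInfo` field names (`duration`, `duration_after_vad`) are assumed to mirror the sequential pipeline.

```python
from faster_whisper import BatchedInferencePipeline, WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")
batched_model = BatchedInferencePipeline(model=model)

segments, info = batched_model.transcribe("audio.wav", batch_size=8)

# Field names assumed to mirror the sequential TranscriptionInfo.
print(f"audio duration: {info.duration:.1f}s")
print(f"duration after VAD removed silence: {info.duration_after_vad:.1f}s")
```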
Purfview
1b24f284c9
Reduce VAD memory usage ( #1198 )
...
Co-authored-by: Mahmoud Ashraf <hassouna97.ma@gmail.com>
2024-12-12 15:23:30 +03:00
Jordi Mas
b568faec40
Add Open-dubbing into community projects ( #1034 )
...
* Add Open-dubbing into community projects
* Update URL
2024-12-12 13:36:04 +03:00
Purfview
f32c0e8af3
Make batched suppress_tokens behaviour the same as in sequential ( #1194 )
2024-12-11 14:51:38 +03:00
Purfview
8327d8cc64
Bring back the original VAD parameter naming ( #1181 )
2024-12-01 20:41:53 +03:00
Mahmoud Ashraf
22a5238b56
Upgrade CI to Python 3.9 and drop Python 3.8 support ( #1184 )
2024-12-01 20:38:27 +03:00
Mahmoud Ashraf
97a4785fa1
Bump version to 1.1.0 and update benchmarks ( #1161 )
...
* Update version
* Update CPU benchmarks
* Update GPU benchmarks
* More GPU benchmarks
v1.1.0
2024-11-21 19:22:01 +03:00
Mahmoud Ashraf
08f6900217
remove log_prob_low_threshold ( #1160 )
2024-11-21 00:03:21 +03:00
Mahmoud Ashraf
9c8ef76c98
use jiwer instead of evaluate in benchmarks ( #1159 )
2024-11-20 23:51:55 +03:00
Mahmoud Ashraf
491852e1b9
Add new tests ( #1158 )
2024-11-20 14:50:57 +03:00
Mahmoud Ashraf
f830c6f241
Fix list index out of range in word timestamps ( #1157 )
2024-11-20 13:36:58 +03:00
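For context, this fix affects the word-timestamp path; a minimal usage sketch:

```python
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.wav", word_timestamps=True)
for segment in segments:
    for word in segment.words:
        print(f"[{word.start:.2f} -> {word.end:.2f}] {word.word} (p={word.probability:.2f})")
```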
Mahmoud Ashraf
bcd8ce0fc7
refactor multilingual option ( #1148 )
...
* Added a test for the `multilingual` option with English-German audio
* Removed the `output_language` argument as it is redundant; the same functionality is available with `task="translate"`
* Used the correct `encoder_output` for language detection in sequential transcription
* Enabled `multilingual` functionality for batched inference
2024-11-20 00:14:59 +03:00
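A sketch of the refactored options: `multilingual=True` re-detects the language per segment for code-switched audio, while `task="translate"` covers what the removed `output_language` argument used to do. The file name is a placeholder.

```python
from faster_whisper import WhisperModel

model = WhisperModel("large-v3", device="cuda", compute_type="float16")

# Re-detect the language for every segment of code-switched audio.
segments, info = model.transcribe("english_german.wav", multilingual=True)

# Or translate everything into English (replaces the removed output_language option).
segments, info = model.transcribe("english_german.wav", task="translate")
```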
Mahmoud Ashraf
be9fb36ed3
Cleanup of BatchedInferencePipeline ( #1135 )
2024-11-17 16:45:32 +03:00
Mahmoud Ashraf
a6f8fbae00
Refactor of language detection functions ( #1146 )
...
* Supported new options for batched transcription:
  * `language_detection_threshold`
  * `language_detection_segments`
* Updated the `WhisperModel.detect_language` function to include the improved language detection from #732 and added docstrings; it is now used inside the `transcribe` function.
* Removed the following functions as they are no longer needed:
  * `WhisperModel.detect_language_multi_segment` and its test
  * `BatchedInferencePipeline.get_language_and_tokenizer`
* Added tests for empty audio inputs
2024-11-16 13:53:07 +03:00
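A sketch of the two language-detection options (parameter names taken from the commit body; values illustrative):

```python
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")

# Roughly: probe up to 4 thirty-second windows and accept a language once its
# probability exceeds 0.7, otherwise fall back to the most frequent candidate.
segments, info = model.transcribe(
    "audio.wav",
    language_detection_segments=4,
    language_detection_threshold=0.7,
)
print(info.language, info.language_probability)
```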
黑墨水鱼
53bbe54016
fix: Use the correct seek value in output and fix word timestamps when the initial timestamp is not zero ( #1141 )
...
Co-authored-by: Mahmoud Ashraf <hassouna97.ma@gmail.com>
2024-11-15 14:57:38 +03:00
Mahmoud Ashraf
85e61ea111
Add progress bar to WhisperModel.transcribe ( #1138 )
2024-11-14 17:12:39 +03:00
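A sketch of enabling the progress bar; the `log_progress` flag name is assumed from the batched pipeline's existing option.

```python
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")

# `log_progress` is assumed to be the flag added here; it shows a progress bar
# over the audio duration while transcribing.
segments, info = model.transcribe("long_audio.wav", log_progress=True)

# transcribe() is lazy: the generator must be consumed for the bar to advance.
text = "".join(segment.text for segment in segments)
```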
Mahmoud Ashraf
3e0ba86571
Remove torch dependency, faster NumPy feature extraction ( #1106 )
2024-11-14 12:57:10 +03:00
Mahmoud Ashraf
8f01aee36b
Update WhisperModel documentation to list all available models ( #1137 )
2024-11-13 19:26:01 +03:00
Mahmoud Ashraf
c2bf036234
change language_detection_threshold default value ( #1134 )
2024-11-13 17:07:46 +03:00
Mahmoud Ashraf
fb65cd387f
Update cuda instructions in readme ( #1125 )
...
* Update README.md
* Update README.md
* Update version.py
* Update README.md
* Update README.md
* Update README.md
2024-11-12 15:51:26 +03:00
Mahmoud Ashraf
203dddb047
replace NamedTuple with dataclass ( #1105 )
...
* replace `NamedTuple` with `dataclass`
* add deprecation warnings
2024-11-05 12:32:20 +03:00
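Not the library's actual code, just a minimal sketch of the pattern described above: a `dataclass` that keeps tuple-style indexing alive behind a deprecation warning.

```python
import warnings
from dataclasses import astuple, dataclass


@dataclass
class Word:
    start: float
    end: float
    word: str
    probability: float

    def __getitem__(self, index):
        # Preserve the old NamedTuple-style indexing, but warn callers.
        warnings.warn(
            "Indexing Word like a tuple is deprecated; use attribute access instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return astuple(self)[index]


w = Word(start=0.0, end=0.4, word="hello", probability=0.98)
print(w.word)  # preferred attribute access
print(w[2])    # still works, but emits a DeprecationWarning
```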
Mahmoud Ashraf
814472fdbf
Revert CPU default threads to 0
...
https://github.com/SYSTRAN/faster-whisper/pull/965#issuecomment-2448208010
2024-10-30 23:00:36 +03:00
Ozan Caglayan
f978fa2979
Revert CPU default threads to 4 ( #965 )
...
Co-authored-by: Mahmoud Ashraf <hassouna97.ma@gmail.com>
2024-10-30 16:50:49 +03:00
Mahmoud Ashraf
2386843fd7
Use correct features padding for encoder input ( #1101 )
...
* pad to 3000 instead of `feature_extractor.nb_max_frames`
* correct trimming for batched features
2024-10-29 17:58:05 +03:00
黑墨水鱼
c2a1da1bd9
typo: trubo -> turbo ( #1092 )
2024-10-26 00:28:16 +03:00
Mahmoud Ashraf
b2da05582c
Add support for turbo model ( #1090 )
2024-10-25 15:50:23 +03:00
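A sketch of loading the new checkpoint, assuming `large-v3-turbo` is the size string registered by this change:

```python
from faster_whisper import WhisperModel

# "large-v3-turbo" is assumed to be the size string registered for the turbo model.
model = WhisperModel("large-v3-turbo", device="cuda", compute_type="float16")

segments, info = model.transcribe("audio.wav", beam_size=5)
print(info.language, info.language_probability)
```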
Mahmoud Ashraf
2dbca5e559
Use Silero VAD in Batched Mode ( #936 )
...
Replace Pyannote VAD with Silero to reduce code duplication and requirements
2024-10-24 12:05:25 +03:00
Mahmoud Ashraf
574e2563e7
Update Dockerfile to ensure compatibility with CT2==4.5.0
2024-10-23 18:28:27 +03:00
Mahmoud Ashraf
42b8681edb
revert back to using PyAV instead of torchaudio ( #961 )
...
* revert back to using PyAV instead of torchaudio
* Update audio.py
2024-10-23 15:26:18 +03:00
Mahmoud Ashraf
d57c5b40b0
Remove the usage of transformers.pipeline from BatchedInferencePipeline and fix word timestamps for batched inference ( #921 )
...
* fix word timestamps for batched inference
* remove hf pipeline
2024-07-27 09:02:58 +07:00
zh-plus
83a368e98a
Make VAD-related parameters configurable for batched inference ( #923 )
2024-07-24 09:00:32 +07:00
Jilt Sebastian
eb8390233c
New PR for Faster Whisper: Batching Support, Speed Boosts, and Quality Enhancements ( #856 )
...
Batching Support, Speed Boosts, and Quality Enhancements
Co-authored-by: Hargun Mujral <83234565+hargunmujral@users.noreply.github.com>
Co-authored-by: MahmoudAshraf97 <hassouna97.ma@gmail.com>
2024-07-18 16:48:52 +07:00
trungkienbkhn
fbcf58bf98
Fix language detection with non-speech audio ( #895 )
2024-07-05 14:43:45 +07:00
Jordi Mas
1195359984
Filter out non_speech_tokens in suppressed tokens ( #898 )
...
* Filter out non_speech_tokens in suppressed tokens
2024-07-05 14:43:11 +07:00
trungkienbkhn
c22db5125d
Bump version to 1.0.3 ( #887 )
v1.0.3
2024-07-01 16:36:12 +07:00
ABen
8862bee1f8
Improve language detection when using clip_timestamps ( #867 )
2024-07-01 16:12:45 +07:00
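`clip_timestamps` limits transcription (and, after this change, language detection) to selected regions of the audio. A minimal sketch; the list-of-seconds form is assumed alongside the comma-separated string form.

```python
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")

# Only transcribe 0-15s and 30-45s of the file. clip_timestamps takes either a
# comma-separated string ("0,15,30,45") or a list of start/end seconds.
segments, info = model.transcribe("audio.wav", clip_timestamps=[0.0, 15.0, 30.0, 45.0])
print(info.language, info.language_probability)
```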
Ki Hoon Kim
8d400e9870
Upgrade to Silero-Vad V5 ( #884 )
...
* Fix window_size_samples to 512
* Update SileroVADModel
* Replace ONNX file with V5 version
2024-07-01 15:40:37 +07:00
Fedir Zadniprovskyi
bced5f04c0
docs: add 'faster-whisper-server' community integration ( #861 )
...
Co-authored-by: Fedir Zadniprovskyi <github.g1k56@simplelogin.com>
2024-06-05 22:27:41 +07:00
Fedir Zadniprovskyi
65551c081f
Docker file improvements ( #848 )
...
Docker file improvements
Co-authored-by: Fedir Zadniprovskyi <github.g1k56@simplelogin.com>
2024-05-20 09:13:19 +07:00
Napuh
f53be1e811
Add distil models to WhisperModel init and download_model docstrings ( #847 )
...
* chore: add distil models to WhisperModel init docstring and download_model docstring
2024-05-20 08:51:22 +07:00
Natanael Tan
4acdb5c619
Fix #839: incorrect clip_timestamps being used in the model ( #842 )
...
* Fix #839
Changed the code to update the options object instead of the TranscriptionOptions class, which was likely the cause of the unexpected behaviour
2024-05-17 16:35:07 +07:00