Mahmoud Ashraf
93001a9438
bump version to 1.2.0
v1.2.0
2025-08-06 03:31:36 +03:00
Mahmoud Ashraf
a0c3cb9802
Remove Silence in Batched transcription ( #1297 )
2025-08-06 03:30:59 +03:00
Mahmoud Ashraf
fbeb1ba731
get correct index for samples ( #1336 )
2025-08-06 03:17:45 +03:00
Rishil
d3bfd0a305
feat: Allow loading of private HF models ( #1309 )
...
* feat: add HuggingFace auth token support to model download
* Format
2025-06-02 14:12:34 +03:00
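A minimal sketch of using a private Hugging Face model after this change. The private repo id below is a placeholder, and authentication goes through the standard `huggingface_hub` token mechanisms (`HF_TOKEN` / `login`) rather than any faster-whisper-specific keyword the PR may also expose.

```python
import os

from huggingface_hub import login

from faster_whisper import WhisperModel

# Authenticate so that gated/private repos can be downloaded. Either export
# HF_TOKEN before running, or call login() explicitly as below.
login(token=os.environ["HF_TOKEN"])  # assumes HF_TOKEN is set in the environment

# Placeholder repo id for a private CTranslate2 conversion of a Whisper model.
model = WhisperModel("your-org/private-whisper-ct2", device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.wav")
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```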
Mahmoud Ashraf
43d4163fe0
Support distil-large-v3.5 ( #1311 )
2025-06-02 14:09:20 +03:00
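Assuming the new checkpoint is registered under a size string that follows the existing distil naming (`distil-large-v3.5` is an assumption from the PR title, not verified), loading it looks the same as for any other size:

```python
from faster_whisper import WhisperModel

# "distil-large-v3.5" is assumed to be the size string added by #1311; if the
# alias differs, pass the converted CTranslate2 repo id instead.
model = WhisperModel("distil-large-v3.5", device="cuda", compute_type="float16")

segments, info = model.transcribe("audio.wav", beam_size=5)
print(info.language, info.language_probability)
```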
Felix Mosheev
700584b2e6
feat: allow passing specific revision to download ( #1292 )
2025-04-30 00:55:48 +03:00
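A sketch of pinning a download to a specific repo revision. The `revision` keyword is assumed from the PR title and the `huggingface_hub` convention; a branch name, tag, or commit hash should all be accepted.

```python
from faster_whisper import WhisperModel, download_model

# Download a specific revision (branch, tag, or commit hash) of the model repo.
# The `revision` keyword is assumed here from the PR title.
model_dir = download_model("large-v3", revision="main")

# The returned local directory can be passed straight to WhisperModel.
model = WhisperModel(model_dir, device="cpu", compute_type="int8")
```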
David Jiménez
1383fd4d37
Update README.md with speaches instead of faster-whisper-server ( #1267 )
...
The project was previously named faster-whisper-server; it has been renamed to speaches, as it has evolved to support more than just ASR.
2025-03-20 17:20:26 +03:00
Mahmoud Ashraf
9e657b47cb
Bump version to 1.1.1
v1.1.1
2025-01-01 17:44:54 +03:00
Purfview
11fd8ab301
Fix neg_threshold ( #1191 )
2024-12-29 14:38:58 +03:00
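`neg_threshold` is the probability below which Silero VAD closes a speech chunk. A minimal sketch of overriding it, assuming the option names of the `VadOptions` dataclass; the values are illustrative.

```python
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")

# vad_parameters accepts a plain dict (or a VadOptions instance). `threshold`
# opens a speech chunk and `neg_threshold` closes it, so it should sit lower.
segments, info = model.transcribe(
    "audio.wav",
    vad_filter=True,
    vad_parameters=dict(threshold=0.5, neg_threshold=0.35, min_silence_duration_ms=500),
)
for segment in segments:
    print(f"{segment.start:.2f} {segment.end:.2f} {segment.text}")
```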
Dragoș Bălan
95164297ff
Add duration of audio and VAD removed duration to BatchedInferencePipeline ( #1186 )
...
Co-authored-by: MahmoudAshraf97 <hassouna97.ma@gmail.com>
2024-12-23 17:23:40 +02:00
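A sketch of reading those durations from the batched pipeline; the `TranscriptionInfo` field names (`duration`, `duration_after_vad`) are assumed to mirror the sequential pipeline.

```python
from faster_whisper import BatchedInferencePipeline, WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")
batched_model = BatchedInferencePipeline(model=model)

segments, info = batched_model.transcribe("audio.wav", batch_size=8)

# Field names assumed to mirror the sequential TranscriptionInfo.
print(f"audio duration: {info.duration:.1f}s")
print(f"duration after VAD removed silence: {info.duration_after_vad:.1f}s")
```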
Purfview
1b24f284c9
Reduce VAD memory usage ( #1198 )
...
Co-authored-by: Mahmoud Ashraf <hassouna97.ma@gmail.com>
2024-12-12 15:23:30 +03:00
Jordi Mas
b568faec40
Add Open-dubbing into community projects ( #1034 )
...
* Add Open-dubbing into community projects
* Update URL
2024-12-12 13:36:04 +03:00
Purfview
f32c0e8af3
Make batched suppress_tokens behaviour the same as in sequential ( #1194 )
2024-12-11 14:51:38 +03:00
Purfview
8327d8cc64
Bring back the original VAD parameter naming ( #1181 )
2024-12-01 20:41:53 +03:00
Mahmoud Ashraf
22a5238b56
Upgrade CI to Python 3.9 and drop Python 3.8 support ( #1184 )
2024-12-01 20:38:27 +03:00
Mahmoud Ashraf
97a4785fa1
Bump version to 1.1.0 and update benchmarks ( #1161 )
...
* Update version
* Update CPU benchmarks
* Update GPU benchmarks
* More GPU benchmarks
v1.1.0
2024-11-21 19:22:01 +03:00
Mahmoud Ashraf
08f6900217
remove log_prob_low_threshold ( #1160 )
2024-11-21 00:03:21 +03:00
Mahmoud Ashraf
9c8ef76c98
use jiwer instead of evaluate in benchmarks ( #1159 )
2024-11-20 23:51:55 +03:00
Mahmoud Ashraf
491852e1b9
Add new tests ( #1158 )
2024-11-20 14:50:57 +03:00
Mahmoud Ashraf
f830c6f241
Fix list index out of range in word timestamps ( #1157 )
2024-11-20 13:36:58 +03:00
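For context, this fix affects the word-timestamp path; a minimal usage sketch:

```python
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")

segments, info = model.transcribe("audio.wav", word_timestamps=True)
for segment in segments:
    for word in segment.words:
        print(f"[{word.start:.2f} -> {word.end:.2f}] {word.word} (p={word.probability:.2f})")
```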
Mahmoud Ashraf
bcd8ce0fc7
refactor multilingual option ( #1148 )
...
* Added a test for the `multilingual` option with English-German audio
* Removed the `output_language` argument as it is redundant; the same functionality is available with `task="translate"`
* Used the correct `encoder_output` for language detection in sequential transcription
* Enabled `multilingual` functionality for batched inference
2024-11-20 00:14:59 +03:00
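A sketch of the refactored options: `multilingual=True` re-detects the language per segment for code-switched audio, while `task="translate"` covers what the removed `output_language` argument used to do. The file name is a placeholder.

```python
from faster_whisper import WhisperModel

model = WhisperModel("large-v3", device="cuda", compute_type="float16")

# Re-detect the language for every segment of code-switched audio.
segments, info = model.transcribe("english_german.wav", multilingual=True)

# Or translate everything into English (replaces the removed output_language option).
segments, info = model.transcribe("english_german.wav", task="translate")
```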
Mahmoud Ashraf
be9fb36ed3
Cleanup of BatchedInferencePipeline ( #1135 )
2024-11-17 16:45:32 +03:00
Mahmoud Ashraf
a6f8fbae00
Refactor of language detection functions ( #1146 )
...
* Supported new options for batched transcription:
  * `language_detection_threshold`
  * `language_detection_segments`
* Updated the `WhisperModel.detect_language` function to include the improved language detection from #732 and added docstrings; it is now used inside the `transcribe` function.
* Removed the following functions as they are no longer needed:
  * `WhisperModel.detect_language_multi_segment` and its test
  * `BatchedInferencePipeline.get_language_and_tokenizer`
* Added tests for empty audio inputs
2024-11-16 13:53:07 +03:00
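A sketch of the two language-detection options (parameter names taken from the commit body; values illustrative):

```python
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")

# Roughly: probe up to 4 thirty-second windows and accept a language once its
# probability exceeds 0.7, otherwise fall back to the most frequent candidate.
segments, info = model.transcribe(
    "audio.wav",
    language_detection_segments=4,
    language_detection_threshold=0.7,
)
print(info.language, info.language_probability)
```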
黑墨水鱼
53bbe54016
fix: Use the correct seek value in output and fix word timestamps when the initial timestamp is not zero ( #1141 )
...
Co-authored-by: Mahmoud Ashraf <hassouna97.ma@gmail.com>
2024-11-15 14:57:38 +03:00
Mahmoud Ashraf
85e61ea111
Add progress bar to WhisperModel.transcribe ( #1138 )
2024-11-14 17:12:39 +03:00
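A sketch of enabling the progress bar; the `log_progress` flag name is assumed from the batched pipeline's existing option.

```python
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")

# `log_progress` is assumed to be the flag added here; it shows a progress bar
# over the audio duration while transcribing.
segments, info = model.transcribe("long_audio.wav", log_progress=True)

# transcribe() is lazy: the generator must be consumed for the bar to advance.
text = "".join(segment.text for segment in segments)
```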
Mahmoud Ashraf
3e0ba86571
Remove torch dependency, faster NumPy feature extraction ( #1106 )
2024-11-14 12:57:10 +03:00
Mahmoud Ashraf
8f01aee36b
Update WhisperModel documentation to list all available models ( #1137 )
2024-11-13 19:26:01 +03:00
Mahmoud Ashraf
c2bf036234
change language_detection_threshold default value ( #1134 )
2024-11-13 17:07:46 +03:00
Mahmoud Ashraf
fb65cd387f
Update cuda instructions in readme ( #1125 )
...
* Update README.md
* Update README.md
* Update version.py
* Update README.md
* Update README.md
* Update README.md
2024-11-12 15:51:26 +03:00
Mahmoud Ashraf
203dddb047
replace NamedTuple with dataclass ( #1105 )
...
* replace `NamedTuple` with `dataclass`
* add deprecation warnings
2024-11-05 12:32:20 +03:00
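Not the library's actual code, just a minimal sketch of the pattern described above: a `dataclass` that keeps tuple-style indexing alive behind a deprecation warning.

```python
import warnings
from dataclasses import astuple, dataclass


@dataclass
class Word:
    start: float
    end: float
    word: str
    probability: float

    def __getitem__(self, index):
        # Preserve the old NamedTuple-style indexing, but warn callers.
        warnings.warn(
            "Indexing Word like a tuple is deprecated; use attribute access instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return astuple(self)[index]


w = Word(start=0.0, end=0.4, word="hello", probability=0.98)
print(w.word)  # preferred attribute access
print(w[2])    # still works, but emits a DeprecationWarning
```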
Mahmoud Ashraf
814472fdbf
Revert CPU default threads to 0
...
https://github.com/SYSTRAN/faster-whisper/pull/965#issuecomment-2448208010
2024-10-30 23:00:36 +03:00
Ozan Caglayan
f978fa2979
Revert CPU default threads to 4 ( #965 )
...
Co-authored-by: Mahmoud Ashraf <hassouna97.ma@gmail.com>
2024-10-30 16:50:49 +03:00
Mahmoud Ashraf
2386843fd7
Use correct features padding for encoder input ( #1101 )
...
* pad to 3000 instead of `feature_extractor.nb_max_frames`
* correct trimming for batched features
2024-10-29 17:58:05 +03:00
黑墨水鱼
c2a1da1bd9
typo: trubo -> turbo ( #1092 )
2024-10-26 00:28:16 +03:00
Mahmoud Ashraf
b2da05582c
Add support for turbo model ( #1090 )
2024-10-25 15:50:23 +03:00
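A sketch of loading the new checkpoint, assuming `large-v3-turbo` is the size string registered by this change:

```python
from faster_whisper import WhisperModel

# "large-v3-turbo" is assumed to be the size string registered for the turbo model.
model = WhisperModel("large-v3-turbo", device="cuda", compute_type="float16")

segments, info = model.transcribe("audio.wav", beam_size=5)
print(info.language, info.language_probability)
```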
Mahmoud Ashraf
2dbca5e559
Use Silero VAD in Batched Mode ( #936 )
...
Replace Pyannote VAD with Silero to reduce code duplication and requirements
2024-10-24 12:05:25 +03:00
Mahmoud Ashraf
574e2563e7
Update Dockerfile to ensure compatibility with CT2==4.5.0
2024-10-23 18:28:27 +03:00
Mahmoud Ashraf
42b8681edb
revert back to using PyAV instead of torchaudio ( #961 )
...
* revert back to using PyAV instead of torchaudio
* Update audio.py
2024-10-23 15:26:18 +03:00
Mahmoud Ashraf
d57c5b40b0
Remove the usage of transformers.pipeline from BatchedInferencePipeline and fix word timestamps for batched inference ( #921 )
...
* fix word timestamps for batched inference
* remove hf pipeline
2024-07-27 09:02:58 +07:00
zh-plus
83a368e98a
Make VAD-related parameters configurable for batched inference ( #923 )
2024-07-24 09:00:32 +07:00
Jilt Sebastian
eb8390233c
New PR for Faster Whisper: Batching Support, Speed Boosts, and Quality Enhancements ( #856 )
...
Batching Support, Speed Boosts, and Quality Enhancements
Co-authored-by: Hargun Mujral <83234565+hargunmujral@users.noreply.github.com>
Co-authored-by: MahmoudAshraf97 <hassouna97.ma@gmail.com>
2024-07-18 16:48:52 +07:00
trungkienbkhn
fbcf58bf98
Fix language detection with non-speech audio ( #895 )
2024-07-05 14:43:45 +07:00
Jordi Mas
1195359984
Filter out non_speech_tokens in suppressed tokens ( #898 )
...
* Filter out non_speech_tokens in suppressed tokens
2024-07-05 14:43:11 +07:00
trungkienbkhn
c22db5125d
Bump version to 1.0.3 ( #887 )
v1.0.3
2024-07-01 16:36:12 +07:00
ABen
8862bee1f8
Improve language detection when using clip_timestamps ( #867 )
2024-07-01 16:12:45 +07:00
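`clip_timestamps` limits transcription (and, after this change, language detection) to selected regions of the audio. A minimal sketch; the list-of-seconds form is assumed alongside the comma-separated string form.

```python
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")

# Only transcribe 0-15s and 30-45s of the file. clip_timestamps takes either a
# comma-separated string ("0,15,30,45") or a list of start/end seconds.
segments, info = model.transcribe("audio.wav", clip_timestamps=[0.0, 15.0, 30.0, 45.0])
print(info.language, info.language_probability)
```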
Ki Hoon Kim
8d400e9870
Upgrade to Silero-Vad V5 ( #884 )
...
* Fix window_size_samples to 512
* Update SileroVADModel
* Replace ONNX file with V5 version
2024-07-01 15:40:37 +07:00
Fedir Zadniprovskyi
bced5f04c0
docs: add 'faster-whisper-server' community integration ( #861 )
...
Co-authored-by: Fedir Zadniprovskyi <github.g1k56@simplelogin.com>
2024-06-05 22:27:41 +07:00
Fedir Zadniprovskyi
65551c081f
Docker file improvements ( #848 )
...
Docker file improvements
Co-authored-by: Fedir Zadniprovskyi <github.g1k56@simplelogin.com>
2024-05-20 09:13:19 +07:00
Napuh
f53be1e811
Add distil models to WhisperModel init and download_model docstrings ( #847 )
...
* chore: add distil models to WhisperModel init docstring and download_model docstring
2024-05-20 08:51:22 +07:00
Natanael Tan
4acdb5c619
Fix #839: incorrect clip_timestamps being used in the model ( #842 )
...
* Fix #839
Changed the code to update the options object instead of the TranscriptionOptions class, which was likely the cause of the unexpected behaviour
2024-05-17 16:35:07 +07:00