Mahmoud Ashraf
85e61ea111
Add progress bar to WhisperModel.transcribe ( #1138 )
2024-11-14 17:12:39 +03:00
Mahmoud Ashraf
3e0ba86571
Remove torch dependency, faster numpy feature extraction ( #1106 )
2024-11-14 12:57:10 +03:00
Mahmoud Ashraf
8f01aee36b
Update WhisperModel documentation to list all available models ( #1137 )
2024-11-13 19:26:01 +03:00
Mahmoud Ashraf
c2bf036234
change language_detection_threshold default value ( #1134 )
2024-11-13 17:07:46 +03:00
Mahmoud Ashraf
fb65cd387f
Update cuda instructions in readme ( #1125 )
...
* Update README.md
* Update README.md
* Update version.py
* Update README.md
* Update README.md
* Update README.md
2024-11-12 15:51:26 +03:00
Mahmoud Ashraf
203dddb047
replace NamedTuple with dataclass ( #1105 )
...
* replace `NamedTuple` with `dataclass`
* add deprecation warnings
2024-11-05 12:32:20 +03:00
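A minimal illustration of the migration pattern named in the commit above (#1105): a `NamedTuple` replaced with a `dataclass` that keeps old tuple-style indexing working behind a deprecation warning. The class name and fields here are illustrative placeholders, not the project's actual code.

```python
# Hypothetical sketch of a NamedTuple -> dataclass migration with a deprecation
# warning for legacy tuple-style access; names and fields are assumptions.
import warnings
from dataclasses import dataclass, astuple


@dataclass
class Word:
    start: float
    end: float
    word: str
    probability: float

    def __getitem__(self, index):
        # Keep old tuple-style access working, but warn that it is deprecated.
        warnings.warn(
            "Indexing Word like a tuple is deprecated; use attribute access instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return astuple(self)[index]


w = Word(start=0.0, end=0.5, word="hello", probability=0.98)
print(w.word)   # new attribute access
print(w[2])     # old tuple-style access still works, with a DeprecationWarning
```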
Mahmoud Ashraf
814472fdbf
Revert CPU default threads to 0
...
https://github.com/SYSTRAN/faster-whisper/pull/965#issuecomment-2448208010
2024-10-30 23:00:36 +03:00
Ozan Caglayan
f978fa2979
Revert CPU default threads to 4 ( #965 )
...
Co-authored-by: Mahmoud Ashraf <hassouna97.ma@gmail.com >
2024-10-30 16:50:49 +03:00
Mahmoud Ashraf
2386843fd7
Use correct features padding for encoder input ( #1101 )
...
* pad to 3000 instead of `feature_extractor.nb_max_frames`
* correct trimming for batched features
2024-10-29 17:58:05 +03:00
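A hedged sketch of the padding idea described in the commit above (#1101): the encoder expects a fixed number of mel frames (3000 for 30 s of audio), so shorter feature arrays are right-padded and longer ones trimmed. The function and constant names are assumptions for illustration, not faster-whisper's implementation.

```python
# Illustrative only: pad or trim a mel-spectrogram array to a fixed frame count.
import numpy as np

ENCODER_INPUT_FRAMES = 3000  # 30 s of audio at Whisper's hop length


def pad_features(features: np.ndarray) -> np.ndarray:
    """features has shape (n_mels, n_frames); returns shape (n_mels, 3000)."""
    n_frames = features.shape[-1]
    if n_frames >= ENCODER_INPUT_FRAMES:
        return features[:, :ENCODER_INPUT_FRAMES]
    pad_width = ENCODER_INPUT_FRAMES - n_frames
    return np.pad(features, ((0, 0), (0, pad_width)))


padded = pad_features(np.zeros((80, 1200), dtype=np.float32))
print(padded.shape)  # (80, 3000)
```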
黑墨水鱼
c2a1da1bd9
typo: trubo -> turbo ( #1092 )
2024-10-26 00:28:16 +03:00
Mahmoud Ashraf
b2da05582c
Add support for turbo model ( #1090 )
2024-10-25 15:50:23 +03:00
Mahmoud Ashraf
2dbca5e559
Use Silero VAD in Batched Mode ( #936 )
...
Replace Pyannote VAD with Silero to reduce code duplication and requirements
2024-10-24 12:05:25 +03:00
Mahmoud Ashraf
42b8681edb
revert back to using PyAV instead of torchaudio ( #961 )
...
* revert back to using PyAV instead of torchaudio
* Update audio.py
2024-10-23 15:26:18 +03:00
Mahmoud Ashraf
d57c5b40b0
Remove the usage of transformers.pipeline from BatchedInferencePipeline and fix word timestamps for batched inference ( #921 )
...
* fix word timestamps for batched inference
* remove hf pipeline
2024-07-27 09:02:58 +07:00
zh-plus
83a368e98a
Make vad-related parameters configurable for batched inference. ( #923 )
2024-07-24 09:00:32 +07:00
Jilt Sebastian
eb8390233c
New PR for Faster Whisper: Batching Support, Speed Boosts, and Quality Enhancements ( #856 )
...
Batching Support, Speed Boosts, and Quality Enhancements
---------
Co-authored-by: Hargun Mujral <83234565+hargunmujral@users.noreply.github.com>
Co-authored-by: MahmoudAshraf97 <hassouna97.ma@gmail.com>
2024-07-18 16:48:52 +07:00
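A hedged usage sketch of the batched pipeline introduced in the PR above (#856); the model size, audio file name, and batch size are placeholders.

```python
# Sketch: wrap a WhisperModel in the batched pipeline for faster transcription.
from faster_whisper import WhisperModel, BatchedInferencePipeline

model = WhisperModel("medium", device="cpu", compute_type="int8")
batched_model = BatchedInferencePipeline(model=model)

segments, info = batched_model.transcribe("audio.mp3", batch_size=8)
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")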
trungkienbkhn
fbcf58bf98
Fix language detection with non-speech audio ( #895 )
2024-07-05 14:43:45 +07:00
Jordi Mas
1195359984
Filter out non_speech_tokens in suppressed tokens ( #898 )
...
* Filter out non_speech_tokens in suppressed tokens
2024-07-05 14:43:11 +07:00
trungkienbkhn
c22db5125d
Bump version to 1.0.3 ( #887 )
2024-07-01 16:36:12 +07:00
ABen
8862bee1f8
Improve language detection when using clip_timestamps ( #867 )
2024-07-01 16:12:45 +07:00
Ki Hoon Kim
8d400e9870
Upgrade to Silero-Vad V5 ( #884 )
...
* Fix window_size_samples to 512
* Update SileroVADModel
* Replace ONNX file with V5 version
2024-07-01 15:40:37 +07:00
Napuh
f53be1e811
Add distil models to WhisperModel init and download_model docstrings ( #847 )
...
* chore: add distil models to WhisperModel init docstring and download_model docstring
2024-05-20 08:51:22 +07:00
Natanael Tan
4acdb5c619
Fix #839 incorrect clip_timestamps being used in model ( #842 )
...
* Fix #839
Changed the code to update the options object rather than the TranscriptionOptions class, which was likely the cause of the unexpected behaviour
2024-05-17 16:35:07 +07:00
trungkienbkhn
2f6913efc8
Bump version to 1.0.2 ( #816 )
2024-05-06 09:02:54 +07:00
Keating Reid
49a80eb8a8
Clarify documentation for hotwords ( #817 )
...
* Clarify documentation for hotwords
* Remove redundant type specifications
2024-05-06 08:52:59 +07:00
trungkienbkhn
8d5e6d56d9
Support initializing more whisper model args ( #807 )
2024-05-04 15:12:59 +07:00
jax
847fec4492
Feature/add hotwords ( #731 )
...
* add hotword params
---------
Co-authored-by: jax <jax_builder@gamil.com>
2024-05-04 15:11:52 +07:00
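A hedged usage sketch of the hotwords argument added in the PR above (#731); the model size, audio file name, and hotword strings are placeholders.

```python
# Sketch: bias recognition toward domain-specific terms via the hotwords argument.
from faster_whisper import WhisperModel

model = WhisperModel("small", device="cpu", compute_type="int8")
segments, info = model.transcribe(
    "meeting.wav",
    hotwords="Kubernetes PostgreSQL SYSTRAN",  # terms to prime the decoder with
)
for segment in segments:
    print(f"[{segment.start:.2f} -> {segment.end:.2f}] {segment.text}")
```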
otakutyrant
91c8307aa6
make faster_whisper.assets a valid python package to distribute ( #772 ) ( #774 )
2024-04-02 18:22:22 +02:00
Purfview
b024972a56
Foolproof: Disable VAD if clip_timestamps is in use ( #769 )
...
* Foolproof: Disable VAD if clip_timestamps is in use
Prevent silly things from happening.
2024-04-02 18:20:34 +02:00
Purfview
8ae82c8372
Bugfix: code breaks if audio is empty ( #768 )
...
* Bugfix: code breaks if audio is empty
Regression since PR https://github.com/SYSTRAN/faster-whisper/pull/732
2024-04-02 18:18:12 +02:00
trungkienbkhn
e0c3a9ed34
Update project github link to SYSTRAN ( #746 )
2024-03-27 08:31:17 +01:00
Sanchit Gandhi
a67e0e47ae
Add support for distil-large-v3 ( #755 )
...
* add distil-large-v3
* Update README.md
* use fp16 weights from Systran
2024-03-26 14:58:39 +01:00
trungkienbkhn
1eb9a8004c
Improve language detection ( #732 )
2024-03-12 15:44:49 +01:00
trungkienbkhn
a342b028b7
Bump version to 1.0.1 ( #725 )
2024-03-01 11:32:12 +01:00
Purfview
5090cc9d0d
Fix window end heuristic for hallucination_silence_threshold ( #706 )
...
Removes the wishful heuristic causing more issues than it's fixing.
Same as https://github.com/openai/whisper/pull/2043
Example of the issue: https://github.com/openai/whisper/pull/1838#issuecomment-1960041500
2024-02-29 17:59:32 +01:00
trungkienbkhn
16141e65d9
Add pad_or_trim function to handle segment before encoding ( #705 )
2024-02-29 17:08:28 +01:00
trungkienbkhn
06d32bf0c1
Bump version to 1.0.0 ( #696 )
2024-02-22 09:49:01 +01:00
Purfview
30d6043e90
Prevent infinite loop for out-of-bound timestamps in clip_timestamps ( #697 )
...
Same as https://github.com/openai/whisper/pull/2005
2024-02-22 09:48:35 +01:00
trungkienbkhn
092067208b
Add clip_timestamps and hallucination_silence_threshold options ( #646 )
2024-02-20 17:34:54 +01:00
Purfview
3aec421849
Add: More clarity of what "max_new_tokens" does ( #658 )
...
* Add: More clarity of what "max_new_tokens" does
2024-01-28 21:40:33 +01:00
Purfview
00efce1696
Bugfix: Illogical "Avoid computing higher temperatures on no_speech" ( #652 )
2024-01-24 11:54:43 +01:00
metame
ad3c83045b
support distil-whisper ( #557 )
2024-01-24 10:17:12 +01:00
Purfview
ebcfd6b964
Fix broken prompt_reset_on_temperature ( #604 )
...
* Fix broken prompt_reset_on_temperature
Fixing: https://github.com/SYSTRAN/faster-whisper/issues/603
Broken because `generate_with_fallback()` doesn't return final temperature.
Regression since PR356 -> https://github.com/SYSTRAN/faster-whisper/pull/356
2023-12-13 13:14:39 +01:00
trungkienbkhn
19329a3611
Word timing tweaks ( #616 )
2023-12-13 12:38:44 +01:00
Clayton Yochum
9641d5f56a
Force read-mode in av.open ( #566 )
...
The `av.open` function checks input metadata to determine the mode to open with ("r" or "w"). If an input passed to `decode_audio` is in write mode, it cannot be read without this change. Forcing read mode solves this.
2023-11-27 10:43:35 +01:00
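A minimal sketch of the fix described in the commit above (#566): always open the input for reading so PyAV does not infer write mode from the file object's metadata. The helper name is an assumption for illustration.

```python
# Sketch: force read mode when opening an input container with PyAV.
import av


def open_for_decoding(input_file):
    # Forcing mode="r" makes decoding work even if the file object was created
    # in write mode; without it, av.open may infer "w" and fail to read.
    return av.open(input_file, mode="r")
```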
Dang Chuan Nguyen
e1a218fab1
Bump version to 0.10.0
2023-11-24 23:19:47 +01:00
Oscaarjs
3084409633
Add V3 Support ( #578 )
...
* Add V3 Support
* update conversion example
---------
Co-authored-by: oscaarjs <oscar.johansson@conversy.se>
2023-11-24 23:16:12 +01:00
Guillaume Klein
5a0541ea7d
Bump version to 0.9.0
2023-09-18 16:21:37 +02:00
Guillaume Klein
e94711bb5c
Add property WhisperModel.supported_languages ( #476 )
...
* Expose function supported_languages
* Make it a method
2023-09-14 17:42:02 +02:00
Guillaume Klein
0048844f54
Expose function available_models ( #475 )
...
* Expose function available_models
* Add test case
2023-09-14 17:17:01 +02:00
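A hedged usage sketch of the two helpers exposed in the last two commits above (#475 and #476); the model size and device are placeholders.

```python
# Sketch: list the downloadable model names and the languages a loaded model supports.
from faster_whisper import WhisperModel, available_models

print(available_models())            # names accepted by WhisperModel / download_model

model = WhisperModel("tiny", device="cpu", compute_type="int8")
print(model.supported_languages)     # language codes the loaded model can transcribe
```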