whisperX

mirror of https://github.com/m-bain/whisperX.git synced 2026-01-10 04:58:07 -05:00

Author	SHA1	Message	Date
Barabazs	d32ec3e301	fix: add missing comma	2025-10-21 09:13:50 -06:00
pplkit	db317c358b	feat: add language-aware sentence tokenization (#1269) * feat: add language-aware sentence tokenization * feat: add missing punkt languages --------- Co-authored-by: pulkit <129310466+p1kit@users.noreply.github.com> Co-authored-by: Barabazs <31799121+Barabazs@users.noreply.github.com>	2025-10-21 15:57:26 +02:00
JulianFP	6e1d1caaf4	fix: incorrect type annotation in get_writer return value The audio_path attribute that the __call__ method of the ResultWriter class takes is a str, not TextIO	2025-10-17 09:43:43 -06:00
Barabazs	c8f7597345	feat: add hotwords argument to CLI for improved recognition of rare terms	2025-10-17 09:21:56 -06:00
Paffe	0fa81b31f1	feat: add Swedish alignment model (#1110) --------- Co-authored-by: Paffe <paffe@Sigge.home>	2025-10-15 08:08:18 +02:00
Barabazs	a51ae7a81a	feat: add centralized logging to replace ad-hoc print statements (#1254) * feat: add logging utility functions * feat: add logging setup and log level argument to CLI * feat: integrate logging across modules	2025-10-10 08:41:06 +02:00
Barabazs	3b1b9a8c4d	refactor: rename types.py to schema.py to avoid stdlib conflict	2025-10-09 14:25:58 +02:00
3manifold	64e307cc29	chore: remove redundant variable & improve load_model function documentation (#1197) * Remove redundant variable * Improve function documentation	2025-10-09 09:32:02 +02:00
Nguyen Binh	b1c8ac7de6	Change alignment model for Vietnamese language. Since the current model is a wav2vec2 pre-trained model for Vietnamese audio, it won't work with alignment tasks. To make it work as expected, I recommend changing to a fine-tuned ASR version.	2025-10-03 09:41:03 +02:00
Barabazs	bf150e442e	feat: update Punkt tokenizer to use pre-trained model and handle missing data	2025-10-03 09:11:24 +02:00
Alex Cannan	c7d31883bc	Add jr, sr, and ph.d to punkt abbreviations	2025-10-01 08:59:53 +02:00
Jean Du	2d9ce44329	fix(asr): load VAD model on correct CUDA device (#835) Previously, the VAD sub‐model was always initialized on the default CUDA device (cuda:0), even when a higher device_index was specified. This change sets `device_vad` to `cuda:{device_index}` whenever `device == 'cuda'`, while falling back to the original `device` string for non‐CUDA cases. This ensures the VAD model is loaded on the intended GPU. Co-authored-by: dujing <dujing@xmov.ai> Co-authored-by: Barabazs <31799121+Barabazs@users.noreply.github.com>	2025-07-02 08:07:59 +02:00
3manifold	f4261f34e9	Remove unused code in Vad class	2025-07-01 09:06:04 +02:00
Howard	e0833da5dc	Fix: Ensure integer tensor indexing in get_wildcard_emission()	2025-06-27 09:17:44 +02:00
Barabazs	ffedc5cdf0	fix: speaker embedding bug (#1178) * fix: improve handling of speaker embeddings in transcribe_task * chore: bump version to 3.4.1	2025-06-25 13:55:20 +02:00
Barabazs	844736e4e4	style: minor code formatting	2025-06-24 15:01:09 +02:00
Radu-Sebastian Amarie	220fec9aea	refactor: update type hints in diarization module (PEP 585)	2025-06-24 15:01:09 +02:00
Radu-Sebastian Amarie	1631c3040f	feat: enhance diarization with optional output of speaker embeddings - Updated DiarizationPipeline to include a return_embeddings parameter for optional speaker embeddings. - Modified assign_word_speakers to accept and process speaker embeddings. - Updated CLI to support --speaker_embeddings flag for JSON output. - Ensured backward compatibility for existing functionality.	2025-06-24 15:01:09 +02:00
bog	b343241253	feat: add diarize_model arg to CLI (#1101)	2025-05-31 13:32:31 +02:00
Barabazs	36d552cad3	fix: remove DiarizationPipeline from public API	2025-05-03 09:25:59 +02:00
Barabazs	7d36b832f9	refactor: update CLI entry point	2025-05-03 09:25:59 +02:00
Barabazs	d2a493e910	refactor: implement lazy loading for module imports in whisperx	2025-05-03 09:25:59 +02:00
Barabazs	ac0c8bd79a	feat: add version and Python version arguments to CLI	2025-05-01 11:08:54 +02:00
Yan Cheng Cheok	0aed874589	Remove duplicated item "lv": "latvian"	2025-04-12 11:08:15 +02:00
Barabazs	e7712f496e	refactor: update import statements to use explicit module paths across multiple files	2025-03-25 16:24:21 +01:00
jademlc	8e53866704	feat: pass hotwords argument to get_prompt (#1073) Co-authored-by: Jade Moillic <jade.moillic@radiofrance.com>	2025-03-24 10:47:47 +01:00
Barabazs	8c58c54635	Revert "feat: add Basque alignment model (#1074)" (#1077) This reverts commit `0d9807adc5`.	2025-03-05 15:19:23 +01:00
Xabi	0d9807adc5	feat: add Basque alignment model (#1074)	2025-03-04 14:55:30 +01:00
Amerogin Kamid	4db839018c	feat: add Tagalog (tl - Filipino) Phoneme-based ASR Model (#1067)	2025-02-23 09:59:48 +01:00
Max Bain	44e8bf5bb6	Merge pull request #1024 from philmcmahon/local-files-only-param Add models_cache_only param	2025-01-27 14:26:19 +00:00
philmcmahon	7b3c9ce629	Add models_cache_only param	2025-01-27 12:16:37 +00:00
Reinis Ivanovs	36d2622e27	feat: add Latvian align model	2025-01-25 09:45:17 +01:00
tan90xx	acbeba6057	Update silero.py	2025-01-20 20:01:21 +08:00
tan90xx	fca563a782	Update silero.py	2025-01-20 19:52:37 +08:00
tan90xx	de0d8fe313	chore: handle empty segments_list case in silero to prevent errors	2025-01-19 21:20:56 +08:00
Barabazs	86e2b3ee74	chore: remove deprecated VAD_SEGMENTATION_URL	2025-01-17 09:12:05 +01:00
liupeng	ffbc73664c	change the docstrings and comments to English	2025-01-13 22:56:48 +08:00
liupeng	289eadfc76	fix a merge error.	2025-01-13 20:26:27 +08:00
bfs18	22a93f2932	Merge branch 'main' into main	2025-01-13 19:34:21 +08:00
Max Bain	5e54b872a9	Merge branch 'main' into main	2025-01-13 10:09:20 +00:00
Max Bain	6be02cccfa	Update asr.py	2025-01-13 10:08:09 +00:00
Barabazs	2f93e029c7	feat: add SegmentData type for temporary processing during alignment	2025-01-13 10:45:50 +01:00
Barabazs	024bc8481b	refactor: consolidate segment data handling in alignment function	2025-01-13 10:45:50 +01:00
Barabazs	f286e7f3de	refactor: improve type hints and clean up imports	2025-01-13 10:45:50 +01:00
Barabazs	73e644559d	refactor: remove namespace for consistency	2025-01-13 10:45:50 +01:00
winking324	1ec527375a	fix vad_method is none	2025-01-13 13:53:35 +08:00
Max Bain	6695426a85	fix new vad paths	2025-01-12 12:50:15 +00:00
Max Bain	aaddb83aa5	switch from case to ifelse	2025-01-11 17:11:21 +00:00
Max Bain	c288f4812a	Merge branch 'main' into silero-vad	2025-01-11 17:05:53 +00:00
liupeng	4ebfb078c5	make no beam consistent with backtrack.	2025-01-09 23:13:11 +08:00


278 Commits