whisperX

mirror of https://github.com/m-bain/whisperX.git synced 2026-01-08 20:18:00 -05:00

Author	SHA1	Message	Date
Barabazs	d32ec3e301	fix: add missing comma	2025-10-21 09:13:50 -06:00
pplkit	db317c358b	feat: add language-aware sentence tokenization (#1269) * feat: add language-aware sentence tokenization * feat: add missing punkt languages --------- Co-authored-by: pulkit <129310466+p1kit@users.noreply.github.com> Co-authored-by: Barabazs <31799121+Barabazs@users.noreply.github.com>	2025-10-21 15:57:26 +02:00
JulianFP	6e1d1caaf4	fix: incorrect type annotation in get_writer return value The audio_path attribute that the __call__ method of the ResultWriter class takes is a str, not TextIO	2025-10-17 09:43:43 -06:00
Barabazs	c8f7597345	feat: add hotwords argument to CLI for improved recognition of rare terms	2025-10-17 09:21:56 -06:00
Barabazs	5925e5f8c7	docs: add cuDNN troubleshooting for common issues (#1266) * docs: add troubleshooting guide for cuDNN loading errors * docs: add cuDNN version incompatibility troubleshooting	2025-10-16 10:56:51 +02:00
Barabazs	617835dc27	chore: upgrade torch and torchaudio dependencies to 2.8.0 v3.7.4	2025-10-16 07:41:45 +00:00
Barabazs	92227e7412	fix: lock down torch and torchaudio versions (#1265) * fix: update torch and torchaudio dependencies to use compatible version specifiers * chore: update version to 3.7.3 v3.7.3	2025-10-16 08:42:10 +02:00
Paffe	0fa81b31f1	feat: add Swedish alignment model (#1110) --------- Co-authored-by: Paffe <paffe@Sigge.home>	2025-10-15 08:08:18 +02:00
Barabazs	505bd9c0b5	chore: refine triton dependency to restrict installation to x86_64 Linux (#1259) * chore: refine triton dependency to restrict installation to x86_64 Linux * bump: update version to 3.7.2 v3.7.2	2025-10-12 10:38:29 +02:00
Barabazs	895e5a8493	chore: update numpy dependency constraints for Python 3.13 compatibility (#1258) * chore: update numpy dependency constraints for Python 3.13 compatibility * bump: update version to 3.7.1 v3.7.1	2025-10-12 10:31:44 +02:00
Barabazs	a58ff9cb20	bump: update version to 3.7.0 v3.7.0	2025-10-10 07:37:17 +00:00
Barabazs	d13171cdde	feat: add support for python 3.13 (#1256) * feat: update Python version requirement to be compatible with 3.13 * feat: add Python 3.13 to compatibility matrix * feat: update onnxruntime dependency for Python version compatibility * fix: drop onnxruntime restriction for python >= 3.10	2025-10-10 09:36:24 +02:00
Barabazs	c1c08c472f	bump: update version to 3.6.0 v3.6.0	2025-10-10 06:45:00 +00:00
Barabazs	a51ae7a81a	feat: add centralized logging to replace ad-hoc print statements (#1254) * feat: add logging utility functions * feat: add logging setup and log level argument to CLI * feat: integrate logging across modules	2025-10-10 08:41:06 +02:00
Barabazs	3b1b9a8c4d	refactor: rename types.py to schema.py to avoid stdlib conflict	2025-10-09 14:25:58 +02:00
Tomáš Hnyk	027ec57aee	doc: update cpu only example (#1164)	2025-10-09 09:34:54 +02:00
3manifold	64e307cc29	chore: remove redundant variable & improve load_model function documentation (#1197) * Remove redundant variable * Improve function documentation	2025-10-09 09:32:02 +02:00
Adrian Wan	2663f2edb5	doc: fix diarize import in example script (#1192)	2025-10-09 09:27:07 +02:00
Barabazs	c266ac5459	chore: update version to 3.5.0 v3.5.0	2025-10-08 09:24:25 +00:00
Jim Chen	95fecb91c8	build: upgrade PyTorch to 2.7.1 with CUDA 12.8 and multi-platform support - feat: upgrade PyTorch to 2.7.1 and CUDA 12.8 * Update README setup to require CUDA toolkit 12.8 instead of 12.4 (Linux and Windows) * Bump torch dependency from 2.6.0 to 2.7.1 * Switch the PyTorch CUDA wheel index from cu124 to cu128 - Revert "docs: add troubleshooting section for libcudnn dependencies in README" * The issue of relying on two different versions of CUDNN in this project has been resolved. - build(pyproject): relax python version and constrain package deps * Only download torch from PyTorch; obtain all other packages from PyPI. * Restrict numpy, onnxruntime, pandas to be compatible with Python 3.9 - build(pyproject): require triton 3.3.0+ for arm64 support * Add triton version 3.3.0 or newer to the dependencies to support arm64 architecture. - build: skip Triton on Windows since it isn't supported * Add a platform marker to the triton dependency to skip it on Windows, as triton does not support Windows. - build: configure PyTorch sources for cross-platform compatibility * macOS uses CPU-only PyTorch from pytorch-cpu index * Linux and Windows use CUDA 12.8 PyTorch from pytorch index * triton only installs on Linux with CUDA 12.8 support * Update lockfile to support multi-platform builds - fix: restrict av to <16.0.0 for Python 3.9 compatibility * Add av<16.0.0 to dependencies to maintain Python 3.9 support * Update comment to include av in the restriction list * Update uv.lock accordingly PyAV dropped Python 3.9 support in v16.0.0: `106089447c` - fix: resolve PyTorch ARM64 platform compatibility issue * Update uv.lock to properly handle aarch64 platforms for PyTorch dependencies * Add resolution markers for ARM64 Linux systems to use CPU-only PyTorch builds * Ensure CUDA builds are only used on x86_64 platforms where supported Resolves ARM64 Docker build failures by preventing uv from attempting to install CUDA PyTorch on unsupported platforms - chore: change .python-version to 3.10 --- Signed-off-by: CHEN, CHUN <jim60105@gmail.com> Signed-off-by: Jim Chen <Jim@ChenJ.im> Co-authored-by: GitHub Copilot <bot@ChenJ.im>	2025-10-08 11:21:28 +02:00
Nguyen Binh	b1c8ac7de6	Change alignment model for Vietnamese language Since the current model is a wav2vec2 pre-trained model for Vietnamese audio, it won't work with alignment tasks. To make it work as expected, I recommend chaining to a fine-tuned ASR version.	2025-10-03 09:41:03 +02:00
Barabazs	bf150e442e	feat: update Punkt tokenizer to use pre-trained model and handle missing data	2025-10-03 09:11:24 +02:00
Max Bain	ed13dc8c6c	recall.ai sponsor Added information about Recall.ai's Meeting Transcription API.	2025-10-03 00:12:53 +01:00
Alex Cannan	c7d31883bc	Add jr, sr, and ph.d to punkt abbreviations	2025-10-01 08:59:53 +02:00
Barabazs	83afb81ac7	fix: restrict pyannote-audio version to avoid compatibility issues (#1242) * fix: restrict pyannote-audio version to avoid compatibility issues * chore: bump whisperX version to 3.4.3 v3.4.3	2025-10-01 08:37:00 +02:00
Jean Du	2d9ce44329	fix(asr): load VAD model on correct CUDA device (#835) fix(asr): load VAD model on correct CUDA device Previously, the VAD sub‐model was always initialized on the default CUDA device (cuda:0), even when a higher device_index was specified. This change sets `device_vad` to `cuda:{device_index}` whenever `device == 'cuda'`, while falling back to the original `device` string for non‐CUDA cases. This ensures the VAD model is loaded on the intended GPU. Co-authored-by: dujing <dujing@xmov.ai> Co-authored-by: Barabazs <31799121+Barabazs@users.noreply.github.com>	2025-07-02 08:07:59 +02:00
3manifold	f4261f34e9	Remove unused code in Vad class	2025-07-01 09:06:04 +02:00
Barabazs	429658d4cc	chore: bump version to 3.4.2 v3.4.2	2025-06-27 07:18:39 +00:00
Howard	e0833da5dc	Fix: Ensure integer tensor indexing in get_wildcard_emission()	2025-06-27 09:17:44 +02:00
Barabazs	ffedc5cdf0	fix: speaker embedding bug (#1178) * fix: improve handling of speaker embeddings in transcribe_task * chore: bump version to 3.4.1 v3.4.1	2025-06-25 13:55:20 +02:00
Barabazs	b93e9b6f57	chore: bump version to 3.4.0 v3.4.0	2025-06-24 16:21:23 +02:00
Barabazs	844736e4e4	style: minor code formatting	2025-06-24 15:01:09 +02:00
Radu-Sebastian Amarie	220fec9aea	refactor: update type hints in diarization module (PEP 585)	2025-06-24 15:01:09 +02:00
Radu-Sebastian Amarie	1631c3040f	feat: enhance diarization with optional output of speaker embeddings - Updated DiarizationPipeline to include a return_embeddings parameter for optional speaker embeddings. - Modified assign_word_speakers to accept and process speaker embeddings. - Updated CLI to support --speaker_embeddings flag for JSON output. - Ensured backward compatibility for existing functionality.	2025-06-24 15:01:09 +02:00
Kirill	d700b56c9c	docs: add missing torch import to Python usage example in README	2025-06-08 03:34:49 -06:00
bog	b343241253	feat: add diarize_model arg to CLI (#1101)	2025-05-31 13:32:31 +02:00
Barabazs	6fe0a8784a	docs: add troubleshooting section for libcudnn dependencies in README	2025-05-31 05:20:06 -06:00
Barabazs	5012650d0f	chore: update lockfile	2025-05-03 16:25:43 +02:00
Barabazs	108bd0c400	chore: add lockfile check step to CI workflows	2025-05-03 16:25:43 +02:00
Barabazs	b2d50a027b	chore: bump version v3.3.4	2025-05-03 11:38:54 +02:00
Barabazs	36d552cad3	fix: remove DiarizationPipeline from public API	2025-05-03 09:25:59 +02:00
Barabazs	7d36b832f9	refactor: update CLI entry point	2025-05-03 09:25:59 +02:00
Barabazs	d2a493e910	refactor: implement lazy loading for module imports in whisperx	2025-05-03 09:25:59 +02:00
Barabazs	f5b40b5366	chore: update version to 3.3.3 in pyproject.toml and uv.lock v3.3.3	2025-05-01 11:08:54 +02:00
Barabazs	ac0c8bd79a	feat: add version and Python version arguments to CLI	2025-05-01 11:08:54 +02:00
Barabazs	cd59f21d1a	fix: downgrade ctranslate2 dependency version	2025-05-01 11:08:54 +02:00
Yan Cheng Cheok	0aed874589	Remove duplicated item "lv": "latvian"	2025-04-12 11:08:15 +02:00
Barabazs	f10dbf6ab1	fix: update setuptools configuration to include package discovery for whisperx	2025-03-25 18:49:44 +01:00
Barabazs	a7564c2ad6	docs: update installation instructions	2025-03-25 17:02:41 +01:00
Barabazs	e7712f496e	refactor: update import statements to use explicit module paths across multiple files	2025-03-25 16:24:21 +01:00
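
The device selection described in commit 2d9ce44329 (load the VAD model on `cuda:{device_index}` when `device == 'cuda'`, otherwise reuse the `device` string) can be sketched roughly as follows. This is only an illustration of the rule stated in that commit message, not the repository's actual code; the function name `resolve_vad_device` is hypothetical.

```python
def resolve_vad_device(device: str, device_index: int = 0) -> str:
    """Pick the device string for the VAD sub-model.

    Hypothetical helper: mirrors the rule in the commit message, where the
    result corresponds to what the message calls `device_vad`.
    """
    if device == "cuda":
        # Pin the VAD model to the requested GPU instead of the default cuda:0.
        return f"cuda:{device_index}"
    # Non-CUDA devices (for example "cpu") are passed through unchanged.
    return device


if __name__ == "__main__":
    print(resolve_vad_device("cuda", 1))
    print(resolve_vad_device("cpu", 1))
```

With this rule, a caller that asks for the second GPU gets a VAD device string that names that GPU explicitly, while CPU runs are unaffected.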


506 Commits