build: upgrade PyTorch to 2.7.1 with CUDA 12.8 and multi-platform support

- feat: upgrade PyTorch to 2.7.1 and CUDA 12.8
    * Update README setup to require CUDA toolkit 12.8 instead of 12.4 (Linux and Windows)
    * Bump torch dependency from 2.6.0 to 2.7.1
    * Switch the PyTorch CUDA wheel index from cu124 to cu128


- Revert "docs: add troubleshooting section for libcudnn dependencies in README"
    * This project no longer depends on two different versions of cuDNN, so the troubleshooting section is obsolete.


- build(pyproject): relax python version and constrain package deps
    * Download torch only from the PyTorch wheel index; obtain all other packages from PyPI.
    * Restrict numpy, onnxruntime, and pandas to versions that remain compatible with Python 3.9
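The Python 3.9 pins above can be illustrated with a small version check (a hypothetical helper for illustration only; pip and uv implement full PEP 440 semantics, so this numeric-tuple comparison is a simplification):

```python
def satisfies(version: str, lower: str, upper: str) -> bool:
    """Return True when lower <= version < upper, comparing dotted
    numeric versions as integer tuples (simplified sketch of range
    pins such as numpy>=2.0.2,<2.1.0)."""
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(lower) <= as_tuple(version) < as_tuple(upper)

# Patch releases stay inside the pin; the next minor is excluded:
print(satisfies("2.0.4", "2.0.2", "2.1.0"))  # True
print(satisfies("2.1.0", "2.0.2", "2.1.0"))  # False
```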


- build(pyproject): require triton 3.3.0+ for arm64 support
    * Require triton 3.3.0 or newer to support the arm64 architecture.

- build: skip Triton on Windows since it isn't supported
    * Add a platform marker to the triton dependency to skip it on Windows, as triton does not support Windows.
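How the environment marker behaves can be sketched in Python (a hypothetical helper, not project code; pip and uv evaluate the marker themselves at resolution time):

```python
import sys

def triton_requested() -> bool:
    """Sketch of the marker "triton>=3.3.0; sys_platform == 'linux'":
    the dependency applies only when sys_platform is 'linux', so it
    is skipped on Windows ('win32') and macOS ('darwin')."""
    return sys.platform == "linux"

print(triton_requested())  # True only on Linux
```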

- build: configure PyTorch sources for cross-platform compatibility
    * macOS uses CPU-only PyTorch from pytorch-cpu index
    * Linux and Windows use CUDA 12.8 PyTorch from pytorch index
    * triton only installs on Linux with CUDA 12.8 support
    * Update lockfile to support multi-platform builds
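The per-platform index selection above can be sketched as a plain function (hypothetical, for illustration; uv itself evaluates the `[tool.uv.sources]` markers in pyproject.toml):

```python
def torch_index(sys_platform: str, machine: str) -> str:
    """Pick the wheel index the markers resolve to: "pytorch" is the
    CUDA 12.8 index, "pytorch-cpu" the CPU-only index (sketch)."""
    if sys_platform == "darwin":
        return "pytorch-cpu"  # macOS: CPU-only wheels
    if machine != "x86_64":
        return "pytorch-cpu"  # e.g. aarch64 Linux: no CUDA wheels published
    return "pytorch"          # x86_64 Linux/Windows: CUDA 12.8 wheels

print(torch_index("linux", "x86_64"))   # pytorch
print(torch_index("linux", "aarch64"))  # pytorch-cpu
print(torch_index("darwin", "arm64"))   # pytorch-cpu
```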

- fix: restrict av to <16.0.0 for Python 3.9 compatibility
    * Add av<16.0.0 to dependencies to maintain Python 3.9 support
    * Update comment to include av in the restriction list
    * Update uv.lock accordingly

        PyAV dropped Python 3.9 support in v16.0.0 (commit 106089447c).


- fix: resolve PyTorch ARM64 platform compatibility issue

    * Update uv.lock to properly handle aarch64 platforms for PyTorch dependencies
    * Add resolution markers for ARM64 Linux systems to use CPU-only PyTorch builds
    * Ensure CUDA builds are only used on x86_64 platforms where supported

    Resolves ARM64 Docker build failures by preventing uv from attempting to install CUDA-enabled PyTorch wheels on unsupported platforms.

- chore: change .python-version to 3.10

---

Signed-off-by: CHEN, CHUN <jim60105@gmail.com>
Signed-off-by: Jim Chen <Jim@ChenJ.im>
Co-authored-by: GitHub Copilot <bot@ChenJ.im>
This commit is contained in:
Jim Chen, 2025-10-08 17:21:28 +08:00, committed by GitHub
parent b1c8ac7de6, commit 95fecb91c8
4 changed files with 1989 additions and 1465 deletions

.python-version (new file, 1 line)

@@ -0,0 +1 @@
+3.10

README.md

@@ -62,6 +62,15 @@ This repository provides fast automatic speech recognition (70x realtime with la
 <h2 align="left" id="setup">Setup ⚙️</h2>
+### 0. CUDA Installation
+To use WhisperX with GPU acceleration, install the CUDA toolkit 12.8 before WhisperX. Skip this step if using only the CPU.
+- For **Linux** users, install the CUDA toolkit 12.8 following this guide:
+  [CUDA Installation Guide for Linux](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/).
+- For **Windows** users, download and install the CUDA toolkit 12.8:
+  [CUDA Downloads](https://developer.nvidia.com/cuda-12-8-1-download-archive).
 ### 1. Simple Installation (Recommended)
 The easiest way to install WhisperX is through PyPi:
@@ -102,25 +111,6 @@ uv sync --all-extras --dev
 You may also need to install ffmpeg, rust etc. Follow openAI instructions here https://github.com/openai/whisper#setup.
-### Common Issues & Troubleshooting 🔧
-#### libcudnn Dependencies (GPU Users)
-If you're using WhisperX with GPU support and encounter errors like:
-- `Could not load library libcudnn_ops_infer.so.8`
-- `Unable to load any of {libcudnn_cnn.so.9.1.0, libcudnn_cnn.so.9.1, libcudnn_cnn.so.9, libcudnn_cnn.so}`
-- `libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory`
-This means your system is missing the CUDA Deep Neural Network library (cuDNN). This library is needed for GPU acceleration but isn't always installed by default.
-**Install cuDNN (example for apt based systems):**
-```bash
-sudo apt update
-sudo apt install libcudnn8 libcudnn8-dev -y
-```
 ### Speaker Diarization
 To **enable Speaker Diarization**, include your Hugging Face access token (read) that you can generate from [Here](https://huggingface.co/settings/tokens) after the `--hf_token` argument and accept the user agreement for the following models: [Segmentation](https://huggingface.co/pyannote/segmentation-3.0) and [Speaker-Diarization-3.1](https://huggingface.co/pyannote/speaker-diarization-3.1) (if you choose to use Speaker-Diarization 2.x, follow requirements [here](https://huggingface.co/pyannote/speaker-diarization) instead.)

pyproject.toml

@@ -9,16 +9,19 @@ requires-python = ">=3.9, <3.13"
 license = { text = "BSD-2-Clause" }
 dependencies = [
-    "ctranslate2<4.5.0",
+    "ctranslate2>=4.5.0",
     "faster-whisper>=1.1.1",
     "nltk>=3.9.1",
-    "numpy>=2.0.2",
-    "onnxruntime>=1.19",
-    "pandas>=2.2.3",
+    # Restrict numpy, onnxruntime, pandas, av to be compatible with Python 3.9
+    "numpy>=2.0.2,<2.1.0",
+    "onnxruntime>=1.19,<1.20.0",
+    "pandas>=2.2.3,<2.3.0",
+    "av<16.0.0",
     "pyannote-audio>=3.3.2,<4.0.0",
-    "torch>=2.5.1",
-    "torchaudio>=2.5.1",
+    "torch>=2.7.1",
+    "torchaudio",
     "transformers>=4.48.0",
+    "triton>=3.3.0; sys_platform == 'linux'" # only install triton on Linux
 ]
@@ -34,3 +37,28 @@ include-package-data = true
 [tool.setuptools.packages.find]
 where = ["."]
 include = ["whisperx*"]
+
+[tool.uv.sources]
+torch = [
+    { index = "pytorch-cpu", marker = "sys_platform == 'darwin'" },
+    { index = "pytorch-cpu", marker = "platform_machine != 'x86_64' and sys_platform != 'darwin'" },
+    { index = "pytorch", marker = "platform_machine == 'x86_64' and sys_platform != 'darwin'" },
+]
+torchaudio = [
+    { index = "pytorch-cpu", marker = "sys_platform == 'darwin'" },
+    { index = "pytorch-cpu", marker = "platform_machine != 'x86_64' and sys_platform != 'darwin'" },
+    { index = "pytorch", marker = "platform_machine == 'x86_64' and sys_platform != 'darwin'" },
+]
+triton = [
+    { index = "pytorch", marker = "sys_platform == 'linux'" },
+]
+
+[[tool.uv.index]]
+name = "pytorch"
+url = "https://download.pytorch.org/whl/cu128"
+explicit = true
+
+[[tool.uv.index]]
+name = "pytorch-cpu"
+url = "https://download.pytorch.org/whl/cpu"
+explicit = true

uv.lock (generated, 3385 changed lines)

File diff suppressed because it is too large.