Update cuda instructions in readme (#1125)

* Update README.md

* Update README.md

* Update version.py

* Update README.md

* Update README.md

* Update README.md
Mahmoud Ashraf
2024-11-12 14:51:26 +02:00
committed by GitHub
parent 203dddb047
commit fb65cd387f
2 changed files with 13 additions and 13 deletions

README.md

@@ -75,9 +75,9 @@ segments, info = model.transcribe("audio.mp3", beam_size=5, language="en")
GPU execution requires the following NVIDIA libraries to be installed:
* [cuBLAS for CUDA 12](https://developer.nvidia.com/cublas)
* [cuDNN 9 for CUDA 12](https://developer.nvidia.com/cudnn)
**Note**: The latest versions of `ctranslate2` only support CUDA 12 and cuDNN 9. For CUDA 11 and cuDNN 8, downgrade to version `3.24.0` of `ctranslate2`; for CUDA 12 and cuDNN 8, downgrade to version `4.4.0` (e.g. with `pip install --force-reinstall ctranslate2==4.4.0`, or by pinning the version in a `requirements.txt`).
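A quick way to check which `ctranslate2` you ended up with and whether it can see the GPU (a minimal sketch; `get_cuda_device_count` is part of the `ctranslate2` Python API and should report 0 when the CUDA stack cannot be loaded):
```python
import importlib.metadata

import ctranslate2

# Which ctranslate2 is installed, and how many CUDA devices are visible?
print("ctranslate2 version:", importlib.metadata.version("ctranslate2"))
print("visible CUDA devices:", ctranslate2.get_cuda_device_count())
```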
There are multiple ways to install the NVIDIA libraries mentioned above. The recommended way is described in the official NVIDIA documentation, but we also suggest other installation methods below.
@@ -89,20 +89,18 @@ There are multiple ways to install the NVIDIA libraries mentioned above. The rec
#### Use Docker
The libraries (cuBLAS, cuDNN) are installed in this official NVIDIA CUDA Docker image: `nvidia/cuda:12.3.2-cudnn9-runtime-ubuntu22.04`.
#### Install with `pip` (Linux only)
On Linux these libraries can be installed with `pip`. Note that `LD_LIBRARY_PATH` must be set before launching Python.
```bash
pip install nvidia-cublas-cu12 nvidia-cudnn-cu12==9.*
export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`
```
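To confirm the libraries actually resolve once `LD_LIBRARY_PATH` is set, one option is to load them directly (a sketch; the sonames below assume cuBLAS for CUDA 12 and cuDNN 9 and may differ on your system):
```python
import ctypes

# Try to dlopen the shared libraries needed at runtime.
# Sonames assume CUDA 12 / cuDNN 9; adjust if your versions differ.
for lib in ("libcublas.so.12", "libcudnn.so.9"):
    ctypes.CDLL(lib)  # raises OSError if the library cannot be found
print("cuBLAS and cuDNN loaded")
```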
#### Download the libraries from Purfview's repository (Windows & Linux)
Purfview's [whisper-standalone-win](https://github.com/Purfview/whisper-standalone-win) provides the required NVIDIA libraries for Windows & Linux in a [single archive](https://github.com/Purfview/whisper-standalone-win/releases/tag/libs). Decompress the archive and place the libraries in a directory included in the `PATH`.
@@ -166,24 +164,24 @@ segments, _ = model.transcribe("audio.mp3")
segments = list(segments)  # The transcription will actually run here.
```
### Multi-Segment Language Detection
To directly use the model for improved language detection, the following code snippet can be used:
```python
from faster_whisper import WhisperModel
model = WhisperModel("turbo", device="cuda", compute_type="float16")
language_info = model.detect_language_multi_segment("audio.mp3")
```
### Batched Transcription
The following code snippet illustrates how to run batched transcription on an example audio file. `BatchedInferencePipeline.transcribe` is a drop-in replacement for `WhisperModel.transcribe`.
```python
from faster_whisper import WhisperModel, BatchedInferencePipeline
model = WhisperModel("turbo", device="cuda", compute_type="float16")
batched_model = BatchedInferencePipeline(model=model)
segments, info = batched_model.transcribe("audio.mp3", batch_size=16)
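# As with WhisperModel.transcribe, the returned segments are a generator;
# iterating over it is what actually runs the batched transcription:
for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))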
@@ -238,6 +236,7 @@ segments, _ = model.transcribe(
    vad_parameters=dict(min_silence_duration_ms=500),
)
```
The VAD filter is enabled by default for batched transcription.
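Since `BatchedInferencePipeline.transcribe` mirrors `WhisperModel.transcribe`, the same VAD options apply to the batched pipeline as well (a sketch combining the two earlier snippets):
```python
from faster_whisper import WhisperModel, BatchedInferencePipeline

model = WhisperModel("turbo", device="cuda", compute_type="float16")
batched_model = BatchedInferencePipeline(model=model)

# Keep the default VAD filter but require longer silences before splitting:
segments, info = batched_model.transcribe(
    "audio.mp3",
    batch_size=16,
    vad_parameters=dict(min_silence_duration_ms=500),
)
```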
### Logging
@@ -309,6 +308,7 @@ model = faster_whisper.WhisperModel("username/whisper-large-v3-ct2")
If you are comparing the performance against other Whisper implementations, you should make sure to run the comparison with similar settings. In particular:
* Verify that the same transcription options are used, especially the same beam size. For example in openai/whisper, `model.transcribe` uses a default beam size of 1 but here we use a default beam size of 5.
* Transcription speed is closely affected by the number of words in the transcript, so ensure that other implementations have a similar WER (Word Error Rate) to this one.
* When running on CPU, make sure to set the same number of threads. Many frameworks will read the environment variable `OMP_NUM_THREADS`, which can be set when running your script:
```bash
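# Example: pin the thread count for a comparable CPU benchmark
# (my_script.py is a placeholder for your own script):
OMP_NUM_THREADS=4 python3 my_script.py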

version.py

@@ -1,3 +1,3 @@
"""Version information.""" """Version information."""
__version__ = "1.0.3" __version__ = "1.1.0rc0"