Merge pull request #1419 from DrewThomasson/main

sync from main
2026-01-10 06:18:02 -05:00 · 2025-12-26 06:18:49 -08:00
parent 5c2427d2e2 4b781b45c8
commit 618ec6a180
1 changed file with 14 additions and 19 deletions
--- a/README.md
+++ b/README.md
@@ -102,10 +102,11 @@ https://github.com/user-attachments/assets/81c4baad-117e-4db5-ac86-efc2b7fea921

 ## Features
 - 📚 Splits eBook into chapters for organized audio.
- 🎙️ High-quality text-to-speech with [Coqui XTTSv2](https://huggingface.co/coqui/XTTS-v2) and [Fairseq](https://github.com/facebookresearch/fairseq/tree/main/examples/mms) (and more).
+- 🎙️ High-quality text-to-speech with [XTTSv2](https://huggingface.co/coqui/XTTS-v2), [Fairseq](https://github.com/facebookresearch/fairseq/tree/main/examples/mms), and more.
 - 🗣️ Optional voice cloning with your own voice file.
- 🌍 Supports +1110 languages (English by default). [List of Supported languages](https://dl.fbaipublicfiles.com/mms/tts/all-tts-languages.html)
- 🖥️ Designed to run on 4GB RAM.
+- 🗣️ Optional custom model support with your own trained model.
+- 🌍 Supports 1158 languages. [List of Supported languages](https://dl.fbaipublicfiles.com/mms/tts/all-tts-languages.html)
+- 🖥️ Designed to run on a minimum of 2GB RAM and 1GB VRAM.


 ## Supported Languages
@@ -121,11 +122,12 @@ https://github.com/user-attachments/assets/81c4baad-117e-4db5-ac86-efc2b7fea921


 ##  Hardware Requirements
- 2gb RAM minimum, 8GB recommended
- Virtualization enabled if running on windows (Docker only)
- CPU (intel, AMD, ARM)*
- GPU (CUDA, ROCm, XPU)
- MPS (Apple Silicon CPU)
+- 2GB RAM min, 8GB recommended.
+- 1GB VRAM min, 4GB recommended.
+- Virtualization enabled if running on Windows (Docker only).
+- CPU (Intel, AMD, ARM)*.
+- GPU (CUDA, ROCm, XPU).
+- MPS (Apple Silicon CPU).

 *<i> Modern TTS are very slow on CPU</i>

@@ -179,13 +181,11 @@ to let the web page reconnect to the new connection socket.**
 ### Basic  Usage
   - **Linux/MacOS**:
     ```bash
-     ./ebook2audiobook.sh --headless --ebook <path_to_ebook_file> \
-         --voice [path_to_voice_file] --language [language_code]
+     ./ebook2audiobook.sh --headless --ebook <path_to_ebook_file> --voice [path_to_voice_file] --language [language_code]
     ```
   - **Windows**
     ```bash
-     ebook2audiobook.cmd --headless --ebook <path_to_ebook_file>
-         --voice [path_to_voice_file] --language [language_code]
+     ebook2audiobook.cmd --headless --ebook <path_to_ebook_file> --voice [path_to_voice_file] --language [language_code]
     ```
     
  - **[--ebook]**: Path to your eBook file
@@ -199,13 +199,11 @@ to let the web page reconnect to the new connection socket.**
  (must be a .zip file containing the mandatory model files. Example for XTTSv2: config.json, model.pth, vocab.json and ref.wav)
   - **Linux/MacOS**
     ```bash
-     ./ebook2audiobook.sh --headless --ebook <ebook_file_path> \
-         --language <language> --custom_model <custom_model_path>
+     ./ebook2audiobook.sh --headless --ebook <ebook_file_path> --language <language> --custom_model <custom_model_path>
     ```
   - **Windows**
     ```bash
-     ebook2audiobook.cmd --headless --ebook <ebook_file_path> \
-         --language <language> --custom_model <custom_model_path>
+     ebook2audiobook.cmd --headless --ebook <ebook_file_path> --language <language> --custom_model <custom_model_path>
     ```
     <i>Note: the ref.wav of your custom model is always the voice selected for the conversion</i>
     
@@ -213,7 +211,6 @@ to let the web page reconnect to the new connection socket.**
      which must contain (according to the tts engine) all the mandatory files<br>
      (see ./lib/models.py).

-
 ### For Detailed Guide with list of all Parameters to use
   - **Linux/MacOS**
     ```bash
@@ -374,8 +371,6 @@ one [pause] is a random between 0.8 to 1.6 seconds


 ### Docker
-
-#### Steps to Run
 1. **Clone the Repository**:
 ```bash
   git clone https://github.com/DrewThomasson/ebook2audiobook.git