docker run \
  --rm \
  --detach \
  --publish 8000:8000 \
  --name speaches \
  --volume hf-hub-cache:/home/ubuntu/.cache/huggingface/hub \
  ghcr.io/speaches-ai/speaches:latest-cpu
--rm
: Automatically removes the container once it stops, so no residual data occupies disk space. Think of this flag as a way to "try out" the service rather than permanently installing it. For a more permanent setup, skip the --rm flag and perhaps configure the container to start automatically (e.g. with a restart policy).

--detach
: Runs the container in the background, leaving the terminal free for other tasks.

--publish 8000:8000
: Maps port 8000 of the container to port 8000 on the host machine, so the application's services are reachable from outside the container. Speaches starts working immediately, and you can access it in your browser or from your system via http://localhost:8000.

--volume hf-hub-cache:/home/ubuntu/.cache/huggingface/hub
: Mounts a Docker volume named hf-hub-cache at the specified directory within the container. This caches Hugging Face models, avoiding redundant downloads and improving startup performance.

# Verify that the server is running
curl http://localhost:8000
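Right after `docker run`, the server may need a few seconds before it answers, so a single curl can fail even though everything is fine. A small polling helper can wrap the check above (a sketch; the URL and retry count are assumptions, and it requires curl on the host):

```shell
# Poll a URL until it responds or the retry budget runs out
wait_for_server() {
  local url=$1 tries=${2:-30}
  for _ in $(seq "$tries"); do
    # --silent --fail: succeed only on an HTTP success response
    if curl --silent --fail "$url" > /dev/null; then
      echo "server is up"
      return 0
    fi
    sleep 1
  done
  echo "timed out waiting for $url"
  return 1
}

wait_for_server http://localhost:8000 5
```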
Download the Kokoro Model and Voices:
--volume hf-hub-cache:/home/ubuntu/.cache/huggingface/hub creates a bind mount between your local disk and the container. So, although the download process happens within the container, the models are stored on your local disk (thanks to the --volume option) and remain there permanently unless you delete the hf-hub-cache directory manually. Even if the container is removed (because of the --rm flag), the models stay on your local disk in the hf-hub-cache directory. When you start a new container with the same --volume configuration, it can access the models without downloading them again.
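The `models--hexgrad--Kokoro-82M` directory name used below follows the Hugging Face hub cache layout, where the slash in a repo id becomes `--`. A small sketch of how the snapshot path is assembled (the naming rule is inferred from the paths in this guide; requires bash):

```shell
# Build a hub-cache snapshot path from a repo id and revision
repo_id="hexgrad/Kokoro-82M"
revision="c97b7bbc3e60f447383c79b2f94fee861ff156ac"

# Slashes in the repo id become "--", prefixed with "models--"
repo_dir="models--${repo_id//\//--}"
echo "$repo_dir/snapshots/$revision"
```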
export KOKORO_REVISION=c97b7bbc3e60f447383c79b2f94fee861ff156ac
# Download the ONNX model (~346 MB)
docker exec -it speaches huggingface-cli download hexgrad/Kokoro-82M --include 'kokoro-v0_19.onnx' --revision $KOKORO_REVISION

# Download the voices.json file (~54 MB)
# as shown in the docs (this will fail because curl is not installed in the container):
# docker exec -it speaches curl --location --output /home/ubuntu/.cache/huggingface/hub/models--hexgrad--Kokoro-82M/snapshots/$KOKORO_REVISION/voices.json https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/voices.json
# 1. **Download voices.json Locally**:
curl --location --output voices.json \
  https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files/voices.json

# 2. **Create the Directory Structure**:
mkdir -p hf-hub-cache/models--hexgrad--Kokoro-82M/snapshots/$KOKORO_REVISION

# 3. **Move It to the Mounted Volume**:
mv voices.json hf-hub-cache/models--hexgrad--Kokoro-82M/snapshots/$KOKORO_REVISION/

# 4. **Verify Inside the Container**:
docker exec -it speaches ls /home/ubuntu/.cache/huggingface/hub/models--hexgrad--Kokoro-82M/snapshots/$KOKORO_REVISION/
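The move-and-verify steps above can also be checked from the host side with a small helper that reports whether both required files are in place (a sketch; the file names and directory layout are taken from this guide, and $KOKORO_REVISION is assumed to be exported as above):

```shell
# Report whether a snapshot directory contains both required files
check_snapshot() {
  local dir=$1 ok=1
  for f in kokoro-v0_19.onnx voices.json; do
    if [ ! -e "$dir/$f" ]; then
      echo "missing: $f"
      ok=0
    fi
  done
  if [ "$ok" -eq 1 ]; then
    echo "snapshot complete"
  fi
}

check_snapshot "hf-hub-cache/models--hexgrad--Kokoro-82M/snapshots/$KOKORO_REVISION"
```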
Inside the container, the snapshot directory must contain both the model file and the voices file. Check with:

docker exec -it speaches ls /home/ubuntu/.cache/huggingface/hub/models--hexgrad--Kokoro-82M/snapshots/c97b7bbc3e60f447383c79b2f94fee861ff156ac/
> kokoro-v0_19.onnx voices.json
Restart the Container (Optional): If the container is running, it should automatically detect the new file in the mounted volume. If not, you can restart it with docker restart speaches.