# Docker Container Dependencies

This document lists all Python packages explicitly installed in each Docker container, matching the exact versions from the working virtual environments.

## Soprano Container (CUDA/Python 3.11.14)

### Base Image

- `nvidia/cuda:11.8.0-runtime-ubuntu22.04`
- Python 3.11.14 (explicitly installed via deadsnakes PPA)
- pip 25.3

### PyTorch Stack (CUDA 11.8)

Installed from PyTorch index (https://download.pytorch.org/whl/cu118):

- `torch==2.7.1+cu118`
- `torchaudio==2.7.1+cu118`
- `torchvision==0.22.1+cu118`

### Core Dependencies (from requirements-soprano.txt)

```
fastapi==0.128.0
uvicorn==0.40.0
pyzmq==27.1.0
numpy==2.4.1
pydantic==2.12.5
python-multipart==0.0.21
sounddevice==0.5.3
pydub==0.25.1
```

### LMDeploy and ML Dependencies

```
lmdeploy==0.11.1
transformers==4.57.5
tokenizers==0.22.2
huggingface-hub==0.36.0
safetensors==0.7.0
accelerate==1.12.0
sentencepiece==0.2.1
einops==0.8.1
peft==0.14.0
scipy==1.17.0
```

### Soprano TTS

- Installed from source: `pip install -e '.[lmdeploy]'` from `/app/soprano/`
- Version: `soprano-tts==0.1.0` (from local editable install)
- Git commit: `f2e7c19b7de51e18d6d977dd0d1027c4b9967f50`
- Features: Hallucination detection and automatic regeneration

### Supporting Libraries

```
protobuf==6.33.4
tiktoken==0.12.0
requests==2.32.5
tqdm==4.67.1
PyYAML==6.0.3
Jinja2==3.1.6
click==8.3.1
psutil==7.2.1
packaging==25.0
filelock==3.20.3
fsspec==2026.1.0
regex==2026.1.14
certifi==2026.1.4
charset-normalizer==3.4.4
urllib3==2.6.3
idna==3.11
```

---

## RVC Container (ROCm/Python 3.10.19)

### Base Image

- `rocm/pytorch:rocm6.2_ubuntu22.04_py3.10_pytorch_release_2.3.0`
- Python 3.10.x (from base image, targeting 3.10.19)
- pip 24.0
- PyTorch 2.5.1+rocm6.2 (pre-installed in base)
- TorchAudio 2.5.1+rocm6.2 (pre-installed in base)
- TorchVision 0.20.1+rocm6.2 (pre-installed in base)

### Core Dependencies (from requirements-rvc.txt)

```
fastapi==0.128.0
uvicorn==0.40.0
pyzmq==27.1.0
numpy==1.23.5
pydantic==2.12.5
python-multipart==0.0.21
```

### Audio Processing Stack

```
librosa==0.10.2
soundfile==0.13.1
sounddevice==0.5.3
pydub==0.25.1
audioread==3.1.0
resampy==0.4.3
soxr==1.0.0
pyworld==0.3.2
praat-parselmouth==0.4.7
torchcrepe==0.0.23
torchfcpe==0.0.4
```

### RVC Core Dependencies

```
fairseq==0.12.2
faiss-cpu==1.7.3
gradio==3.48.0
gradio_client==0.6.1
numba==0.56.4
llvmlite==0.39.0
local-attention==1.11.2
```

### LMDeploy and ML Dependencies

```
lmdeploy==0.11.1
transformers==4.57.3
tokenizers==0.22.2
huggingface-hub==0.36.0
safetensors==0.7.0
accelerate==1.12.0
sentencepiece==0.2.1
einops==0.8.1
peft==0.14.0
```

### Scientific Computing

```
scipy==1.15.3
scikit-learn==1.7.2
matplotlib==3.10.8
pandas==2.3.3
```

### Additional Dependencies

```
av==16.1.0
pillow==10.4.0
omegaconf==2.0.6
hydra-core==1.0.7
python-dotenv==1.2.1
ray==2.53.0
```

### Supporting Libraries

```
protobuf==6.33.4
tiktoken==0.12.0
requests==2.32.5
tqdm==4.67.1
PyYAML==6.0.3
Jinja2==3.1.6
click==8.3.1
psutil==7.2.1
packaging==25.0
filelock==3.20.0
fsspec==2025.10.0
regex==2025.11.3
certifi==2026.1.4
charset-normalizer==3.4.4
urllib3==2.6.3
idna==3.11
```

---

## Key Differences Between Containers

### Python Versions

- **Soprano**: Python 3.11.14 (installed via deadsnakes PPA)
- **RVC**: Python 3.10.x (from ROCm base image, targeting 3.10.19)

### pip Versions

- **Soprano**: pip 25.3
- **RVC**: pip 24.0

### PyTorch Versions

- **Soprano**: PyTorch 2.7.1 with CUDA 11.8
- **RVC**: PyTorch 2.5.1 with ROCm 6.2

### NumPy Versions

- **Soprano**: NumPy 2.4.1 (latest compatible with PyTorch 2.7.1)
- **RVC**: NumPy 1.23.5 (required for fairseq/RVC compatibility)

### FastAPI Versions

- **Both**: FastAPI 0.128.0 (updated from 0.88.0 in original requirements.txt)
- **Both**: Starlette 0.50.0 (dependency of FastAPI 0.128.0)

### Transformers Versions

- **Soprano**: transformers==4.57.5
- **RVC**: transformers==4.57.3
- Slight version difference but compatible

### Unique to Soprano

- `soprano-tts` (editable install from source)
- CUDA-specific NVIDIA packages (cublas, cudnn, etc.)

### Unique to RVC

- `fairseq` (sequence-to-sequence modeling)
- `faiss-cpu` (similarity search)
- `gradio` (web UI framework)
- Audio processing: `librosa`, `soundfile`, `pyworld`, `praat-parselmouth`
- Voice pitch: `torchcrepe`, `torchfcpe`
- Performance: `numba`, `llvmlite`
- Attention: `local-attention`
- Configuration: `omegaconf`, `hydra-core`

---

## Installation Order

### Soprano Container

1. System packages (Python 3.11.14 via deadsnakes PPA, git, build tools)
2. Upgrade pip to 25.3
3. PyTorch with CUDA 11.8
4. Dependencies from `requirements-soprano.txt`
5. Soprano from source with `pip install -e '.[lmdeploy]'`

### RVC Container

1. System packages (ffmpeg, libsndfile1, curl)
2. Set pip to 24.0 (downgrade if needed to match venv)
3. PyTorch with ROCm 6.2 (pre-installed in base image)
4. Dependencies from `requirements-rvc.txt`
5. Models copied to `/app/models/`

---

## Verification

After building containers, verify installations:

### Soprano Container

```bash
docker exec miku-soprano-tts python3 -c "
import torch
import soprano
import lmdeploy
print(f'PyTorch: {torch.__version__}')
print(f'CUDA available: {torch.cuda.is_available()}')
print(f'Soprano: {soprano.__version__}')
print(f'LMDeploy: {lmdeploy.__version__}')
"
```

### RVC Container

```bash
docker exec miku-rvc-api python3 -c "
import torch
import fairseq
import librosa
import pyworld
print(f'PyTorch: {torch.__version__}')
print(f'ROCm available: {torch.cuda.is_available()}')
print(f'Fairseq: {fairseq.__version__}')
print(f'Librosa: {librosa.__version__}')
"
```

---

## Maintenance

When updating packages:

1. **Update venv first**: Test in local virtual environment
2. **Export packages**: `pip list --format=freeze > packages.txt`
3. **Update requirements**: Update `requirements-soprano.txt` or `requirements-rvc.txt`
4. **Rebuild container**: `docker-compose build --no-cache`
5. **Test**: Verify functionality matches bare metal

---

## Notes

- All version numbers are explicitly pinned for reproducibility
- PyTorch versions are installed from specific indexes (CUDA/ROCm)
- Soprano is installed as editable package to match bare metal setup
- ROCm base image comes with PyTorch pre-installed (not in requirements file)
- CUDA packages are handled by PyTorch wheel (nvidia-cublas, nvidia-cudnn, etc.)
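
Because every version is pinned, the pins can be compared mechanically against what is actually installed in a container or venv. The following is a minimal sketch of such a check, not part of the existing setup: the helper names `parse_pins` and `check_pins` are hypothetical, and it assumes the requirements files contain plain `name==version` lines (lines using other specifiers are skipped).

```python
from importlib import metadata


def parse_pins(text: str) -> dict[str, str]:
    """Return {package_name: version} for every `name==version` line in `text`."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments and whitespace
        if "==" not in line:
            continue  # blank line, option line, or a non-exact specifier
        name, version = line.split("==", 1)
        name = name.split("[", 1)[0].strip()        # ignore extras such as pkg[extra]
        version = version.split(";", 1)[0].strip()  # ignore environment markers
        pins[name] = version
    return pins


def check_pins(pins: dict[str, str], installed_version=metadata.version):
    """Return (name, pinned, installed) tuples for every pin that does not match.

    `installed` is None when the package is not installed at all.
    `installed_version` is injectable so the check can be tested without
    touching the real environment.
    """
    mismatches = []
    for name, pinned in pins.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != pinned:
            mismatches.append((name, pinned, found))
    return mismatches
```

One way to use it, assuming the functions are saved in a module available inside the container: read `requirements-soprano.txt` or `requirements-rvc.txt` with `pathlib.Path(...).read_text()`, pass the text to `parse_pins`, and print whatever `check_pins` returns. An empty list means every pinned package is installed at the pinned version. Packages that come from the base image or a separate index (for example the ROCm PyTorch build) are only checked if they appear in the file being parsed.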