Voice conversion pipeline (Soprano TTS → RVC) with Docker support. Previously tracked as bare gitlink; removed .git/ directories and absorbed into main repo for unified tracking. Includes: Soprano TTS, RVC WebUI integration, Docker configs, WebSocket API, and benchmark scripts. Updated .gitignore to exclude large model weights (*.pth, *.pt, *.onnx, *.index). 287 files (3.1GB of ML weights properly excluded via gitignore).
Docker Container Dependencies
This document lists all Python packages explicitly installed in each Docker container, matching the exact versions from the working virtual environments.
Soprano Container (CUDA/Python 3.11.14)
Base Image
- nvidia/cuda:11.8.0-runtime-ubuntu22.04
- Python 3.11.14 (explicitly installed via deadsnakes PPA)
- pip 25.3
PyTorch Stack (CUDA 11.8)
Installed from PyTorch index (https://download.pytorch.org/whl/cu118):
torch==2.7.1+cu118
torchaudio==2.7.1+cu118
torchvision==0.22.1+cu118
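As a rough illustration of installing these pins from the dedicated index, the sketch below only assembles a pip command line and does not run it. The helper name `build_torch_install_command` is hypothetical, and the use of `--index-url` (rather than `--extra-index-url`) is an assumption about how the image is built, not something taken from the Dockerfile.

```python
import sys

# Index URL and pins are copied from this document.
PYTORCH_CU118_INDEX = "https://download.pytorch.org/whl/cu118"
PYTORCH_PINS = [
    "torch==2.7.1+cu118",
    "torchaudio==2.7.1+cu118",
    "torchvision==0.22.1+cu118",
]


def build_torch_install_command(python=sys.executable):
    """Return the pip argv that would install the pinned PyTorch stack.

    Hypothetical helper: it builds the argument list only, so the caller
    decides whether and how to execute it.
    """
    return [
        python, "-m", "pip", "install",
        *PYTORCH_PINS,
        "--index-url", PYTORCH_CU118_INDEX,  # assumption: primary index, not an extra one
    ]


if __name__ == "__main__":
    print(" ".join(build_torch_install_command("python3")))
```

Keeping the pins in one list makes it harder for torch, torchaudio, and torchvision to drift apart, since they are released as a matched set.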
Core Dependencies (from requirements-soprano.txt)
fastapi==0.128.0
uvicorn==0.40.0
pyzmq==27.1.0
numpy==2.4.1
pydantic==2.12.5
python-multipart==0.0.21
sounddevice==0.5.3
pydub==0.25.1
LMDeploy and ML Dependencies
lmdeploy==0.11.1
transformers==4.57.5
tokenizers==0.22.2
huggingface-hub==0.36.0
safetensors==0.7.0
accelerate==1.12.0
sentencepiece==0.2.1
einops==0.8.1
peft==0.14.0
scipy==1.17.0
Soprano TTS
- Installed from source: pip install -e '.[lmdeploy]' from /app/soprano/
- Version: soprano-tts==0.1.0 (from local editable install)
- Git commit: f2e7c19b7de51e18d6d977dd0d1027c4b9967f50
- Features: Hallucination detection and automatic regeneration
Supporting Libraries
protobuf==6.33.4
tiktoken==0.12.0
requests==2.32.5
tqdm==4.67.1
PyYAML==6.0.3
Jinja2==3.1.6
click==8.3.1
psutil==7.2.1
packaging==25.0
filelock==3.20.3
fsspec==2026.1.0
regex==2026.1.14
certifi==2026.1.4
charset-normalizer==3.4.4
urllib3==2.6.3
idna==3.11
RVC Container (ROCm/Python 3.10.19)
Base Image
- rocm/pytorch:rocm6.2_ubuntu22.04_py3.10_pytorch_release_2.3.0
- Python 3.10.x (from base image, targeting 3.10.19)
- pip 24.0
- PyTorch 2.5.1+rocm6.2 (pre-installed in base)
- TorchAudio 2.5.1+rocm6.2 (pre-installed in base)
- TorchVision 0.20.1+rocm6.2 (pre-installed in base)
Core Dependencies (from requirements-rvc.txt)
fastapi==0.128.0
uvicorn==0.40.0
pyzmq==27.1.0
numpy==1.23.5
pydantic==2.12.5
python-multipart==0.0.21
Audio Processing Stack
librosa==0.10.2
soundfile==0.13.1
sounddevice==0.5.3
pydub==0.25.1
audioread==3.1.0
resampy==0.4.3
soxr==1.0.0
pyworld==0.3.2
praat-parselmouth==0.4.7
torchcrepe==0.0.23
torchfcpe==0.0.4
RVC Core Dependencies
fairseq==0.12.2
faiss-cpu==1.7.3
gradio==3.48.0
gradio_client==0.6.1
numba==0.56.4
llvmlite==0.39.0
local-attention==1.11.2
LMDeploy and ML Dependencies
lmdeploy==0.11.1
transformers==4.57.3
tokenizers==0.22.2
huggingface-hub==0.36.0
safetensors==0.7.0
accelerate==1.12.0
sentencepiece==0.2.1
einops==0.8.1
peft==0.14.0
Scientific Computing
scipy==1.15.3
scikit-learn==1.7.2
matplotlib==3.10.8
pandas==2.3.3
Additional Dependencies
av==16.1.0
pillow==10.4.0
omegaconf==2.0.6
hydra-core==1.0.7
python-dotenv==1.2.1
ray==2.53.0
Supporting Libraries
protobuf==6.33.4
tiktoken==0.12.0
requests==2.32.5
tqdm==4.67.1
PyYAML==6.0.3
Jinja2==3.1.6
click==8.3.1
psutil==7.2.1
packaging==25.0
filelock==3.20.0
fsspec==2025.10.0
regex==2025.11.3
certifi==2026.1.4
charset-normalizer==3.4.4
urllib3==2.6.3
idna==3.11
Key Differences Between Containers
Python Versions
- Soprano: Python 3.11.14 (installed via deadsnakes PPA)
- RVC: Python 3.10.x (from ROCm base image, targeting 3.10.19)
pip Versions
- Soprano: pip 25.3
- RVC: pip 24.0
PyTorch Versions
- Soprano: PyTorch 2.7.1 with CUDA 11.8
- RVC: PyTorch 2.5.1 with ROCm 6.2
NumPy Versions
- Soprano: NumPy 2.4.1 (latest compatible with PyTorch 2.7.1)
- RVC: NumPy 1.23.5 (required for fairseq/RVC compatibility)
FastAPI Versions
- Both: FastAPI 0.128.0 (updated from 0.88.0 in original requirements.txt)
- Both: Starlette 0.50.0 (dependency of FastAPI 0.128.0)
Transformers Versions
- Soprano: transformers==4.57.5
- RVC: transformers==4.57.3
- The two pins differ only at the patch level and are compatible
Unique to Soprano
- soprano-tts (editable install from source)
- CUDA-specific NVIDIA packages (cublas, cudnn, etc.)
Unique to RVC
- fairseq (sequence-to-sequence modeling)
- faiss-cpu (similarity search)
- gradio (web UI framework)
- Audio processing: librosa, soundfile, pyworld, praat-parselmouth
- Voice pitch: torchcrepe, torchfcpe
- Performance: numba, llvmlite
- Attention: local-attention
- Configuration: omegaconf, hydra-core
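The differences listed above could also be derived mechanically from the two requirements files. The following is a minimal sketch, assuming both files use plain `name==version` lines; the function names `parse_pins` and `differing_pins` are illustrative and not part of the repository.

```python
import re


def parse_pins(text):
    """Map normalized package name -> pinned version for each 'name==version' line."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if "==" not in line:
            continue  # skip blanks, options, and unpinned entries
        name, version = line.split("==", 1)
        # Normalize names so that PyYAML, pyyaml, and py_yaml compare equal.
        key = re.sub(r"[-_.]+", "-", name.strip()).lower()
        pins[key] = version.split(";", 1)[0].strip()  # ignore environment markers
    return pins


def differing_pins(a_text, b_text):
    """Return {name: (version_in_a, version_in_b)} for packages pinned differently."""
    a, b = parse_pins(a_text), parse_pins(b_text)
    return {n: (a[n], b[n]) for n in sorted(a.keys() & b.keys()) if a[n] != b[n]}


if __name__ == "__main__":
    # A few pins copied from the lists in this document.
    soprano = "numpy==2.4.1\ntransformers==4.57.5\nfastapi==0.128.0\nscipy==1.17.0"
    rvc = "numpy==1.23.5\ntransformers==4.57.3\nfastapi==0.128.0\nscipy==1.15.3\nfairseq==0.12.2"
    for name, (s, r) in differing_pins(soprano, rvc).items():
        print(f"{name}: soprano={s} rvc={r}")
```

Packages that appear in only one file (such as fairseq here) are deliberately left out of the result, since they are unique rather than conflicting.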
Installation Order
Soprano Container
- System packages (Python 3.11.14 via deadsnakes PPA, git, build tools)
- Upgrade pip to 25.3
- PyTorch with CUDA 11.8
- Dependencies from requirements-soprano.txt
- Soprano from source with pip install -e '.[lmdeploy]'
RVC Container
- System packages (ffmpeg, libsndfile1, curl)
- Set pip to 24.0 (downgrade if needed to match venv)
- PyTorch with ROCm 6.2 (pre-installed in base image)
- Dependencies from requirements-rvc.txt
- Models copied to /app/models/
Verification
After building containers, verify installations:
Soprano Container
docker exec miku-soprano-tts python3 -c "
import torch
import soprano
import lmdeploy
print(f'PyTorch: {torch.__version__}')
print(f'CUDA available: {torch.cuda.is_available()}')
print(f'Soprano: {soprano.__version__}')
print(f'LMDeploy: {lmdeploy.__version__}')
"
RVC Container
docker exec miku-rvc-api python3 -c "
import torch
import fairseq
import librosa
import pyworld
print(f'PyTorch: {torch.__version__}')
print(f'ROCm available: {torch.cuda.is_available()}')
print(f'Fairseq: {fairseq.__version__}')
print(f'Librosa: {librosa.__version__}')
"
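The checks above rely on each module exposing `__version__`, which not every package does. As an alternative, the sketch below reads versions from installed distribution metadata via the standard library; the helper name `installed_versions` is hypothetical, and the distribution names in the example are taken from this document.

```python
from importlib import metadata


def installed_versions(distributions):
    """Return {distribution name: version string, or None if it is not installed}."""
    report = {}
    for name in distributions:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None  # report the gap instead of aborting the whole check
    return report


if __name__ == "__main__":
    names = ["torch", "fairseq", "librosa", "pyworld", "soprano-tts"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

Because nothing is imported, this runs quickly and does not initialize a GPU runtime, but it also does not prove that a package imports cleanly, so it complements the commands above rather than replacing them.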
Maintenance
When updating packages:
- Update venv first: Test in local virtual environment
- Export packages: pip list --format=freeze > packages.txt
- Update requirements: Update requirements-soprano.txt or requirements-rvc.txt
- Rebuild container: docker-compose build --no-cache <service>
- Test: Verify functionality matches bare metal
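Between the export and update steps, it can help to see exactly which pins have drifted. The following is a small sketch under the assumption that both inputs contain `name==version` lines; `requirement_drift` is an illustrative name, not an existing script in the repository.

```python
import re


def _pins(text):
    """Parse 'name==version' lines into {normalized name: version}."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[re.sub(r"[-_.]+", "-", name.strip()).lower()] = version.strip()
    return pins


def requirement_drift(frozen_text, requirements_text):
    """Compare a freeze export against a requirements file.

    Returns (missing, mismatched): packages required but absent from the
    export, and {name: (required_version, frozen_version)} for differing pins.
    """
    frozen, required = _pins(frozen_text), _pins(requirements_text)
    missing = sorted(n for n in required if n not in frozen)
    mismatched = {
        n: (required[n], frozen[n])
        for n in sorted(required)
        if n in frozen and frozen[n] != required[n]
    }
    return missing, mismatched


if __name__ == "__main__":
    frozen = "numpy==1.23.5\nPyYAML==6.0.3\nfilelock==3.20.0"
    required = "numpy==1.23.5\npyyaml==6.0.3\nfilelock==3.20.3\nfairseq==0.12.2"
    print(requirement_drift(frozen, required))
```

Extra packages in the export are ignored on purpose: the venv usually contains transitive dependencies that the requirements files do not list.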
Notes
- All version numbers are explicitly pinned for reproducibility
- PyTorch versions are installed from specific indexes (CUDA/ROCm)
- Soprano is installed as editable package to match bare metal setup
- ROCm base image comes with PyTorch pre-installed (not in requirements file)
- CUDA packages are handled by PyTorch wheel (nvidia-cublas, nvidia-cudnn, etc.)
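The first note, that every version is explicitly pinned, is easy to check automatically. Here is a minimal sketch, assuming a requirements file with one entry per line; the function name `unpinned_requirements` is hypothetical.

```python
def unpinned_requirements(text):
    """Return requirement lines that are not pinned with an exact '==' version."""
    loose = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            continue  # skip blanks and pip options such as --index-url or -e
        if "==" not in line:
            loose.append(line)
    return loose


if __name__ == "__main__":
    sample = "fastapi==0.128.0\nnumpy>=1.23\n# comment\nuvicorn\n--index-url https://example.invalid"
    print(unpinned_requirements(sample))
```

An empty result means every listed requirement carries an exact pin; editable installs such as the Soprano source install are skipped because they are versioned by Git commit rather than by `==`.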