add: absorb soprano_to_rvc as regular subdirectory
Voice conversion pipeline (Soprano TTS → RVC) with Docker support. Previously tracked as bare gitlink; removed .git/ directories and absorbed into main repo for unified tracking. Includes: Soprano TTS, RVC WebUI integration, Docker configs, WebSocket API, and benchmark scripts. Updated .gitignore to exclude large model weights (*.pth, *.pt, *.onnx, *.index). 287 files (3.1GB of ML weights properly excluded via gitignore).
283
soprano_to_rvc/DOCKER_DEPENDENCIES.md
Normal file
@@ -0,0 +1,283 @@
# Docker Container Dependencies

This document lists all Python packages explicitly installed in each Docker container, matching the exact versions from the working virtual environments.

## Soprano Container (CUDA/Python 3.11.14)

### Base Image
- `nvidia/cuda:11.8.0-runtime-ubuntu22.04`
- Python 3.11.14 (explicitly installed via deadsnakes PPA)
- pip 25.3

### PyTorch Stack (CUDA 11.8)
Installed from PyTorch index (https://download.pytorch.org/whl/cu118):
- `torch==2.7.1+cu118`
- `torchaudio==2.7.1+cu118`
- `torchvision==0.22.1+cu118`

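The `+cu118` suffix on these pins is the local version label that identifies the CUDA 11.8 build, so it is worth checking separately from the release number. The following is a minimal sketch, assuming the version string is passed in as text (for example the value of `torch.__version__`); `split_build_tag` is a hypothetical helper name and is not part of PyTorch.

```python
from typing import Tuple


def split_build_tag(version: str) -> Tuple[str, str]:
    """Split a version such as '2.7.1+cu118' into ('2.7.1', 'cu118').

    The part after '+' is the local version label; it is returned as an
    empty string when the version carries no build tag.
    """
    release, _, tag = version.partition("+")
    return release, tag


# Pins copied from the list above.
for pinned in ("2.7.1+cu118", "0.22.1+cu118"):
    release, tag = split_build_tag(pinned)
    print(f"{release} built for {tag}")
```

Comparing the tag against the expected `cu118` catches the case where the release number is right but the wheel was built for a different backend.
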
### Core Dependencies (from requirements-soprano.txt)
```
fastapi==0.128.0
uvicorn==0.40.0
pyzmq==27.1.0
numpy==2.4.1
pydantic==2.12.5
python-multipart==0.0.21
sounddevice==0.5.3
pydub==0.25.1
```

### LMDeploy and ML Dependencies
```
lmdeploy==0.11.1
transformers==4.57.5
tokenizers==0.22.2
huggingface-hub==0.36.0
safetensors==0.7.0
accelerate==1.12.0
sentencepiece==0.2.1
einops==0.8.1
peft==0.14.0
scipy==1.17.0
```

### Soprano TTS
- Installed from source: `pip install -e '.[lmdeploy]'` from `/app/soprano/`
- Version: `soprano-tts==0.1.0` (from local editable install)
- Git commit: `f2e7c19b7de51e18d6d977dd0d1027c4b9967f50`
- Features: Hallucination detection and automatic regeneration

### Supporting Libraries
```
protobuf==6.33.4
tiktoken==0.12.0
requests==2.32.5
tqdm==4.67.1
PyYAML==6.0.3
Jinja2==3.1.6
click==8.3.1
psutil==7.2.1
packaging==25.0
filelock==3.20.3
fsspec==2026.1.0
regex==2026.1.14
certifi==2026.1.4
charset-normalizer==3.4.4
urllib3==2.6.3
idna==3.11
```

---

## RVC Container (ROCm/Python 3.10.19)

### Base Image
- `rocm/pytorch:rocm6.2_ubuntu22.04_py3.10_pytorch_release_2.3.0`
- Python 3.10.x (from base image, targeting 3.10.19)
- pip 24.0
- PyTorch 2.5.1+rocm6.2 (pre-installed in base)
- TorchAudio 2.5.1+rocm6.2 (pre-installed in base)
- TorchVision 0.20.1+rocm6.2 (pre-installed in base)

### Core Dependencies (from requirements-rvc.txt)
```
fastapi==0.128.0
uvicorn==0.40.0
pyzmq==27.1.0
numpy==1.23.5
pydantic==2.12.5
python-multipart==0.0.21
```

### Audio Processing Stack
```
librosa==0.10.2
soundfile==0.13.1
sounddevice==0.5.3
pydub==0.25.1
audioread==3.1.0
resampy==0.4.3
soxr==1.0.0
pyworld==0.3.2
praat-parselmouth==0.4.7
torchcrepe==0.0.23
torchfcpe==0.0.4
```

### RVC Core Dependencies
```
fairseq==0.12.2
faiss-cpu==1.7.3
gradio==3.48.0
gradio_client==0.6.1
numba==0.56.4
llvmlite==0.39.0
local-attention==1.11.2
```

### LMDeploy and ML Dependencies
```
lmdeploy==0.11.1
transformers==4.57.3
tokenizers==0.22.2
huggingface-hub==0.36.0
safetensors==0.7.0
accelerate==1.12.0
sentencepiece==0.2.1
einops==0.8.1
peft==0.14.0
```

### Scientific Computing
```
scipy==1.15.3
scikit-learn==1.7.2
matplotlib==3.10.8
pandas==2.3.3
```

### Additional Dependencies
```
av==16.1.0
pillow==10.4.0
omegaconf==2.0.6
hydra-core==1.0.7
python-dotenv==1.2.1
ray==2.53.0
```

### Supporting Libraries
```
protobuf==6.33.4
tiktoken==0.12.0
requests==2.32.5
tqdm==4.67.1
PyYAML==6.0.3
Jinja2==3.1.6
click==8.3.1
psutil==7.2.1
packaging==25.0
filelock==3.20.0
fsspec==2025.10.0
regex==2025.11.3
certifi==2026.1.4
charset-normalizer==3.4.4
urllib3==2.6.3
idna==3.11
```

---

## Key Differences Between Containers

### Python Versions
- **Soprano**: Python 3.11.14 (installed via deadsnakes PPA)
- **RVC**: Python 3.10.x (from ROCm base image, targeting 3.10.19)

### pip Versions
- **Soprano**: pip 25.3
- **RVC**: pip 24.0

### PyTorch Versions
- **Soprano**: PyTorch 2.7.1 with CUDA 11.8
- **RVC**: PyTorch 2.5.1 with ROCm 6.2

### NumPy Versions
- **Soprano**: NumPy 2.4.1 (latest compatible with PyTorch 2.7.1)
- **RVC**: NumPy 1.23.5 (required for fairseq/RVC compatibility)

### FastAPI Versions
- **Both**: FastAPI 0.128.0 (updated from 0.88.0 in original requirements.txt)
- **Both**: Starlette 0.50.0 (dependency of FastAPI 0.128.0)

### Transformers Versions
- **Soprano**: transformers==4.57.5
- **RVC**: transformers==4.57.3
- The two differ only at the patch level and are compatible

### Unique to Soprano
- `soprano-tts` (editable install from source)
- CUDA-specific NVIDIA packages (cublas, cudnn, etc.)

### Unique to RVC
- `fairseq` (sequence-to-sequence modeling)
- `faiss-cpu` (similarity search)
- `gradio` (web UI framework)
- Audio processing: `librosa`, `soundfile`, `pyworld`, `praat-parselmouth`
- Voice pitch: `torchcrepe`, `torchfcpe`
- Performance: `numba`, `llvmlite`
- Attention: `local-attention`
- Configuration: `omegaconf`, `hydra-core`

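A comparison like the one in this section can be regenerated from the two pin lists rather than maintained by hand. The sketch below is illustrative only: `parse_pins` and `compare_pins` are hypothetical helper names, the input is assumed to be plain `name==version` lines, and the two sample strings are short excerpts of the lists above rather than the full requirements files.

```python
from typing import Dict, List


def parse_pins(text: str) -> Dict[str, str]:
    """Map package name -> pinned version for every 'name==version' line."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an exact pin.
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip()
    return pins


def compare_pins(a: Dict[str, str], b: Dict[str, str]) -> Dict[str, List]:
    """Report packages unique to each side and shared packages whose pins differ."""
    shared = a.keys() & b.keys()
    return {
        "only_a": sorted(a.keys() - b.keys()),
        "only_b": sorted(b.keys() - a.keys()),
        "differ": sorted((n, a[n], b[n]) for n in shared if a[n] != b[n]),
    }


# Excerpts of the Soprano and RVC lists, inlined so the sketch runs on its own.
soprano = parse_pins("numpy==2.4.1\ntransformers==4.57.5\nfastapi==0.128.0\nsoprano-tts==0.1.0")
rvc = parse_pins("numpy==1.23.5\ntransformers==4.57.3\nfastapi==0.128.0\nfairseq==0.12.2")
report = compare_pins(soprano, rvc)
print(report)
```
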
---

## Installation Order

### Soprano Container
1. System packages (Python 3.11.14 via deadsnakes PPA, git, build tools)
2. Upgrade pip to 25.3
3. PyTorch with CUDA 11.8
4. Dependencies from `requirements-soprano.txt`
5. Soprano from source with `pip install -e '.[lmdeploy]'`

### RVC Container
1. System packages (ffmpeg, libsndfile1, curl)
2. Set pip to 24.0 (downgrade if needed to match venv)
3. PyTorch with ROCm 6.2 (pre-installed in base image)
4. Dependencies from `requirements-rvc.txt`
5. Models copied to `/app/models/`

---

## Verification

After building containers, verify installations:

### Soprano Container
```bash
docker exec miku-soprano-tts python3 -c "
# Each import fails with ImportError if the package is missing
import torch
import soprano
import lmdeploy
print(f'PyTorch: {torch.__version__}')
print(f'CUDA available: {torch.cuda.is_available()}')
print(f'Soprano: {soprano.__version__}')
print(f'LMDeploy: {lmdeploy.__version__}')
"
```

### RVC Container
```bash
docker exec miku-rvc-api python3 -c "
# Each import fails with ImportError if the package is missing
import torch
import fairseq
import librosa
import pyworld
print(f'PyTorch: {torch.__version__}')
# ROCm builds of PyTorch expose the GPU through the torch.cuda API
print(f'ROCm available: {torch.cuda.is_available()}')
print(f'Fairseq: {fairseq.__version__}')
print(f'Librosa: {librosa.__version__}')
"
```

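The commands above confirm that the packages import; they do not compare installed versions against the pins. One possible way to do that, sketched here under the assumption that the script is run with the container's Python, uses the standard-library `importlib.metadata`. `installed_version` and `check_pins` are hypothetical helper names, and the pins in the example are a small subset of the RVC lists.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Callable, Dict, List, Optional


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_pins(
    pins: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> List[str]:
    """Return one message per package whose installed version differs from its pin."""
    problems = []
    for name, wanted in pins.items():
        found = lookup(name)
        if found is None:
            problems.append(f"{name}: not installed (expected {wanted})")
        elif found != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    # A few pins from the RVC lists above; extend as needed.
    for message in check_pins({"fairseq": "0.12.2", "librosa": "0.10.2", "numpy": "1.23.5"}):
        print(message)
```
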
---

## Maintenance

When updating packages:

1. **Update venv first**: Test in local virtual environment
2. **Export packages**: `pip list --format=freeze > packages.txt`
3. **Update requirements**: Update `requirements-soprano.txt` or `requirements-rvc.txt`
4. **Rebuild container**: `docker-compose build --no-cache <service>`
5. **Test**: Verify functionality matches bare metal

---

## Notes

- All version numbers are explicitly pinned for reproducibility
- PyTorch versions are installed from specific indexes (CUDA/ROCm)
- Soprano is installed as an editable package to match the bare metal setup
- The ROCm base image comes with PyTorch pre-installed (not in the requirements file)
- CUDA packages are handled by the PyTorch wheel (nvidia-cublas, nvidia-cudnn, etc.)

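Since reproducibility depends on every entry being pinned with `==`, a small check can flag entries that are not. This is a sketch only: `find_unpinned` is a hypothetical helper, and the sample text is made up for illustration and is not taken from either requirements file.

```python
from typing import List


def find_unpinned(requirements_text: str) -> List[str]:
    """Return requirement lines that are not pinned with '=='."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments and whitespace
        # Skip blank lines and pip options such as --index-url or -e.
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


# Hypothetical input: two exact pins, one range, one bare name.
sample = """\
fastapi==0.128.0
uvicorn>=0.40   # not an exact pin
numpy
pydub==0.25.1
"""
print(find_unpinned(sample))
```

An empty result means every requirement line in the input carries an exact pin.
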