- miku-stt: switch PyTorch CUDA -> CPU-only (~2.5 GB savings)
  - Silero VAD already runs on CPU via ONNX (onnx=True), CUDA PyTorch was waste
  - faster-whisper/CTranslate2 uses CUDA directly, no PyTorch GPU needed
  - torch+torchaudio layer: 3.3 GB -> 796 MB; total image 9+ GB -> 6.83 GB
  - Tested: Silero VAD loads (ONNX), Whisper loads on cuda, server ready
- llama-swap-rocm: add root .dockerignore to fix 31 GB build context
  - Dockerfile clones all sources from git, never COPYs from context
  - 19 GB of GGUF model files were being transferred on every build
  - Now excludes everything (*), near-zero context transfer
- anime-face-detector: add .dockerignore to exclude accumulated outputs
  - api/outputs/ (56 accumulated detection files) no longer baked into image
  - api/__pycache__/ and images/ also excluded
- .gitignore: remove .dockerignore exclusion so these files are tracked
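The miku-stt change relies on the two models taking separate paths to their hardware: Silero VAD through ONNX on the CPU, and faster-whisper through CTranslate2 on CUDA. A minimal sketch of that split could look like the following. The function names, the "small" model size, and the float16 compute type are illustrative assumptions, not taken from the miku-stt server code, and the ONNX variant of Silero VAD also needs onnxruntime installed.

```python
def torch_is_cpu_only() -> bool:
    """Return True when the installed PyTorch build has no CUDA support."""
    import torch  # lazy import: this module loads even without torch installed

    # torch.version.cuda is None on CPU-only wheels.
    return torch.version.cuda is None


def load_models(whisper_size: str = "small"):
    """Load Silero VAD (ONNX, CPU) and faster-whisper (CUDA via CTranslate2).

    whisper_size and compute_type below are assumptions for illustration.
    """
    import torch
    from faster_whisper import WhisperModel

    # onnx=True asks the hub entry point for the ONNX variant of the VAD,
    # so no PyTorch GPU kernels are involved.
    vad_model, vad_utils = torch.hub.load(
        repo_or_dir="snakers4/silero-vad", model="silero_vad", onnx=True
    )

    # CTranslate2 drives CUDA itself, so this does not depend on torch.cuda.
    whisper = WhisperModel(whisper_size, device="cuda", compute_type="float16")
    return vad_model, vad_utils, whisper
```

Calling torch_is_cpu_only() inside the built image is one way to confirm that the CPU-only wheel was installed before load_models() is tried on the GPU host.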
17 lines
335 B
Plaintext
# Low-latency STT dependencies
websockets>=12.0
numpy>=1.24.0

# Faster-whisper backend (GPU accelerated)
faster-whisper>=1.0.0
ctranslate2>=4.4.0

# Audio processing
soundfile>=0.12.0

# VAD - Silero (loaded via torch.hub, runs on CPU via ONNX)
# Requires torch (CPU-only) - see requirements-gpu-torch.txt

# Utilities
aiohttp>=3.9.0
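Since every pin in this file is a `name>=version` minimum, a small script can check an environment against them after an image rebuild. The sketch below is a hypothetical helper, not part of the repository. It uses only the standard library, and its version comparison is deliberately naive: numeric release segments only, ignoring pre-release and local tags such as `+cpu`.

```python
import re
from importlib import metadata

# Matches simple "name>=version" lines only; other specifiers are skipped.
PIN = re.compile(r"^([A-Za-z0-9._-]+)>=([0-9.]+)$")


def parse_minimums(text: str) -> dict[str, str]:
    """Map package name -> minimum version for every `name>=version` line."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        match = PIN.match(line)
        if match:
            pins[match.group(1)] = match.group(2)
    return pins


def as_tuple(version: str) -> tuple[int, ...]:
    """Naive comparison key built from the leading numeric release segment."""
    release = re.match(r"\d+(?:\.\d+)*", version)
    parts = [int(p) for p in release.group(0).split(".")] if release else []
    while parts and parts[-1] == 0:  # treat 4.4 and 4.4.0 as equal
        parts.pop()
    return tuple(parts)


def unmet(pins: dict[str, str]) -> list[str]:
    """Return a message for each package that is missing or too old."""
    problems = []
    for name, minimum in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if as_tuple(installed) < as_tuple(minimum):
            problems.append(f"{name}: {installed} < {minimum}")
    return problems
```

Reading the requirements file, passing its text to parse_minimums(), and then calling unmet() on the result yields an empty list when every minimum is satisfied. For anything beyond simple `>=` pins, the `packaging` library handles specifiers and pre-releases properly.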