miku-discord/stt-realtime/requirements.txt at 8b96f4dc8af0b621c1076a0c170c6c4ecbbf9dcb - miku-discord - Koko210Tea


koko210Serve 9e5511da21 perf: reduce container sizes and build times

- miku-stt: switch PyTorch CUDA -> CPU-only (~2.5 GB savings)
  - Silero VAD already runs on CPU via ONNX (onnx=True), so the CUDA build of PyTorch was wasted space
  - faster-whisper/CTranslate2 uses CUDA directly, no PyTorch GPU needed
  - torch+torchaudio layer: 3.3 GB -> 796 MB; total image 9+ GB -> 6.83 GB
  - Tested: Silero VAD loads (ONNX), Whisper loads on cuda, server ready

- llama-swap-rocm: add root .dockerignore to fix 31 GB build context
  - Dockerfile clones all sources from git, never COPYs from context
  - 19 GB of GGUF model files were being transferred on every build
  - Now excludes everything (*), near-zero context transfer

- anime-face-detector: add .dockerignore to exclude accumulated outputs
  - api/outputs/ (56 accumulated detection files) no longer baked into image
  - api/__pycache__/ and images/ also excluded

- .gitignore: remove .dockerignore exclusion so these files are tracked
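
The CPU/GPU split described for miku-stt can be sketched roughly as follows. This is an illustrative sketch, not the repository's actual server code: the function names `load_silero_vad` and `load_whisper`, the `"small"` model size, and the `float16` compute type are assumptions. It also assumes `onnxruntime` is installed, which the ONNX variant of Silero VAD needs at load time.

```python
# Hypothetical loader sketch: VAD on CPU through ONNX, Whisper on CUDA
# through CTranslate2. Heavy imports are deferred into the functions so the
# module can be imported on a machine without torch or faster-whisper.

WHISPER_DEVICE = "cuda"   # CTranslate2 talks to CUDA directly; no GPU PyTorch needed
WHISPER_MODEL = "small"   # assumed model size, not taken from the repository


def load_silero_vad():
    """Load Silero VAD via torch.hub as an ONNX model (runs on CPU)."""
    import torch  # a CPU-only torch build is sufficient here

    model, utils = torch.hub.load(
        repo_or_dir="snakers4/silero-vad",
        model="silero_vad",
        onnx=True,  # use the ONNX runtime instead of a PyTorch/JIT model
    )
    return model, utils


def load_whisper(model_size=WHISPER_MODEL, device=WHISPER_DEVICE):
    """Load a faster-whisper model on the requested device."""
    from faster_whisper import WhisperModel

    # compute_type="float16" is an assumed choice for GPU inference.
    return WhisperModel(model_size, device=device, compute_type="float16")


if __name__ == "__main__":
    vad_model, _ = load_silero_vad()
    whisper = load_whisper()
    print("VAD and Whisper loaded")
```

Because the only consumer of PyTorch in this sketch is the `torch.hub` download and wrapper code, the CPU-only wheel is enough; the GPU work happens inside CTranslate2.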

2026-02-25 14:41:04 +02:00



 # Low-latency STT dependencies
 websockets>=12.0
 numpy>=1.24.0
 # Faster-whisper backend (GPU accelerated)
 faster-whisper>=1.0.0
 ctranslate2>=4.4.0
 # Audio processing
 soundfile>=0.12.0
 # VAD - Silero (loaded via torch.hub, runs on CPU via ONNX)
 # Requires torch (CPU-only) - see requirements-gpu-torch.txt
 # Utilities
 aiohttp>=3.9.0
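
Since torch is installed from a separate file, it can be useful to confirm that a built image actually satisfies the minimum versions listed above. The following is a minimal sketch using only the standard library; the helper names (`parse_minimums`, `version_tuple`, `meets_minimum`, `check_installed`) are hypothetical, and the version comparison is deliberately simplified (it looks only at leading numeric components and treats pre-releases as equal to the final release).

```python
import re
from importlib import metadata

# Copy of the pinned minimums from the requirements file above.
REQUIREMENTS = """\
# Low-latency STT dependencies
websockets>=12.0
numpy>=1.24.0
faster-whisper>=1.0.0
ctranslate2>=4.4.0
soundfile>=0.12.0
aiohttp>=3.9.0
"""


def parse_minimums(text):
    """Return {package: minimum version} for simple 'name>=version' lines."""
    minimums = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = re.fullmatch(r"([A-Za-z0-9_.\-]+)\s*>=\s*([0-9][0-9.]*)", line)
        if match:
            minimums[match.group(1)] = match.group(2)
    return minimums


def version_tuple(version):
    """Leading numeric components only, e.g. '2.1.0+cpu' -> (2, 1, 0)."""
    parts = []
    for piece in version.split("."):
        digits = re.match(r"\d+", piece)
        if not digits:
            break
        parts.append(int(digits.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare zero-padded numeric tuples so that '12' satisfies '>=12.0'."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_installed(minimums):
    """Return a list of human-readable problems (empty if all is well)."""
    problems = []
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if not meets_minimum(installed, minimum):
            problems.append(f"{name}: {installed} is older than {minimum}")
    return problems


if __name__ == "__main__":
    for problem in check_installed(parse_minimums(REQUIREMENTS)):
        print(problem)
```

Running the script inside the container prints one line per missing or outdated package and nothing when every minimum is met. For strict PEP 440 comparison, the third-party `packaging` library would be the more robust choice.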