- miku-stt: switch PyTorch CUDA -> CPU-only (~2.5 GB savings)
- Silero VAD already runs on CPU via ONNX (onnx=True), so the CUDA PyTorch build was dead weight
- faster-whisper/CTranslate2 uses CUDA directly, no PyTorch GPU needed
- torch+torchaudio layer: 3.3 GB -> 796 MB; total image 9+ GB -> 6.83 GB
- Tested: Silero VAD loads (ONNX), Whisper loads on cuda, server ready
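The CPU-only switch above amounts to pulling torch/torchaudio from PyTorch's CPU wheel index instead of the default CUDA one. A minimal Dockerfile sketch (package pins are illustrative, not the ones in this repo):

```dockerfile
# CPU-only wheels: no bundled CUDA/cuDNN libraries, ~2.5 GB smaller layer.
# faster-whisper still gets CUDA via CTranslate2's own runtime, and
# Silero VAD runs through onnxruntime, so neither needs GPU PyTorch.
RUN pip install --no-cache-dir torch torchaudio \
    --index-url https://download.pytorch.org/whl/cpu
```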
- llama-swap-rocm: add root .dockerignore to fix 31 GB build context
- Dockerfile clones all sources from git, never COPYs from context
- 19 GB of GGUF model files were being transferred on every build
- Now excludes everything (*), near-zero context transfer
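Since the Dockerfile clones everything from git and never COPYs local files, the root .dockerignore can exclude the whole context. A sketch of the file described above:

```
# Exclude the entire build context; the Dockerfile clones all sources
# from git and never COPYs local files, so nothing needs to transfer.
# This keeps the 19 GB of local GGUF models out of every build.
*
```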
- anime-face-detector: add .dockerignore to exclude accumulated outputs
- api/outputs/ (56 accumulated detection files) no longer baked into image
- api/__pycache__/ and images/ also excluded
- .gitignore: remove .dockerignore exclusion so these files are tracked
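The anime-face-detector exclusions can be sketched as a .dockerignore like the following (paths taken from the bullets above):

```
# Keep accumulated runtime artifacts out of the build context/image
api/outputs/
api/__pycache__/
images/
```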