perf: reduce container sizes and build times

- miku-stt: switch PyTorch CUDA -> CPU-only (~2.5 GB savings)
  - Silero VAD already runs on CPU via ONNX (onnx=True), CUDA PyTorch was waste
  - faster-whisper/CTranslate2 uses CUDA directly, no PyTorch GPU needed
  - torch+torchaudio layer: 3.3 GB -> 796 MB; total image 9+ GB -> 6.83 GB
  - Tested: Silero VAD loads (ONNX), Whisper loads on cuda, server ready
- llama-swap-rocm: add root .dockerignore to fix 31 GB build context
  - Dockerfile clones all sources from git, never COPYs from context
  - 19 GB of GGUF model files were being transferred on every build
  - Now excludes everything (*), near-zero context transfer
- anime-face-detector: add .dockerignore to exclude accumulated outputs
  - api/outputs/ (56 accumulated detection files) no longer baked into image
  - api/__pycache__/ and images/ also excluded
- .gitignore: remove .dockerignore exclusion so these files are tracked
2026-02-25 14:41:04 +02:00
parent 0edf1ef1c0
commit 9e5511da21
6 changed files with 29 additions and 14 deletions
--- a/.dockerignore
+++ b/.dockerignore
@@ -0,0 +1,10 @@
+# .dockerignore for llama-swap-rocm (build context is project root)
+# The Dockerfile.llamaswap-rocm doesn't COPY anything from the build context —
+# everything is git-cloned in multi-stage builds. Exclude everything to avoid
+# sending ~31 GB of unnecessary build context (models, backups, etc.)
+
+# Exclude everything by default
+*
+
+# Only include what the Dockerfile actually needs (nothing from context currently)
+# If the Dockerfile changes to COPY files, add exceptions here with !filename
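
Reviewer note: the effect of the `*` rule, and of a later `!filename` exception, can be estimated before running a build. The sketch below is only an approximation. It matches patterns with Python's `fnmatchcase` rather than Docker's own matcher, so `**` handling and directory-level rules may differ. The function names and the sample files (`models/a.gguf`, `README.md`) are hypothetical and do not come from this repository.

```python
import os
import tempfile
from fnmatch import fnmatchcase


def context_size(root, patterns):
    """Sum the sizes (in bytes) of files under root that survive the patterns.

    Simplification (assumption): each pattern is matched against the
    forward-slash relative path with fnmatchcase. A leading "!" re-includes
    a path, and the last matching pattern wins.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            ignored = False
            for pattern in patterns:
                negate = pattern.startswith("!")
                glob = pattern[1:] if negate else pattern
                if fnmatchcase(rel, glob):
                    ignored = not negate
            if not ignored:
                total += os.path.getsize(path)
    return total


def demo():
    """Build a tiny fake project and measure it under three pattern sets."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "models"))
        # Stand-in for a large model file (hypothetical name and size).
        with open(os.path.join(root, "models", "a.gguf"), "wb") as f:
            f.write(b"\0" * 4096)
        with open(os.path.join(root, "README.md"), "wb") as f:
            f.write(b"\0" * 100)
        return (
            context_size(root, []),                   # no .dockerignore
            context_size(root, ["*"]),                # exclude everything
            context_size(root, ["*", "!README.md"]),  # one exception
        )


if __name__ == "__main__":
    print(demo())
```

Calling `context_size` on the real project root with the patterns from this file gives a rough idea of how much would still be sent to the builder. For an authoritative figure, check the context transfer size reported by the build itself.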