perf: reduce container sizes and build times

- miku-stt: switch PyTorch CUDA -> CPU-only (~2.5 GB savings)
  - Silero VAD already runs on CPU via ONNX (onnx=True), CUDA PyTorch was waste
  - faster-whisper/CTranslate2 uses CUDA directly, no PyTorch GPU needed
  - torch+torchaudio layer: 3.3 GB -> 796 MB; total image 9+ GB -> 6.83 GB
  - Tested: Silero VAD loads (ONNX), Whisper loads on cuda, server ready

- llama-swap-rocm: add root .dockerignore to fix 31 GB build context
  - Dockerfile clones all sources from git, never COPYs from context
  - 19 GB of GGUF model files were being transferred on every build
  - Now excludes everything (*), near-zero context transfer

- anime-face-detector: add .dockerignore to exclude accumulated outputs
  - api/outputs/ (56 accumulated detection files) no longer baked into image
  - api/__pycache__/ and images/ also excluded

- .gitignore: remove .dockerignore exclusion so these files are tracked
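
One way to find what inflates a build context (such as the GGUF files mentioned above) is to total the bytes under each top-level entry of the context directory. The sketch below is illustrative only and is not part of this commit; the function name `context_size_by_entry` is hypothetical, and it does not apply .dockerignore rules, so it reports what would be sent without any exclusions.

```python
import os
from pathlib import Path


def context_size_by_entry(context_dir: str) -> dict[str, int]:
    """Return total bytes under each top-level entry of context_dir.

    Symlinks are skipped. .dockerignore is NOT consulted, so the result
    approximates the context size with no exclusions in place.
    """
    sizes: dict[str, int] = {}
    for entry in Path(context_dir).iterdir():
        if entry.is_symlink():
            continue
        if entry.is_file():
            sizes[entry.name] = entry.stat().st_size
        elif entry.is_dir():
            total = 0
            for dirpath, _dirnames, filenames in os.walk(entry):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    if not os.path.islink(path):
                        total += os.path.getsize(path)
            sizes[entry.name] = total
    return sizes
```

Calling `context_size_by_entry(".")` from the directory passed to `docker build` and sorting the result by value puts the largest entries first; those are the candidates for a .dockerignore entry.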
@@ -1,5 +1,7 @@
-# PyTorch with CUDA 12.8 support
-# Updated per RealtimeSTT PR #295 for better performance
-torch==2.7.1+cu128
-torchaudio==2.7.1+cu128
---index-url https://download.pytorch.org/whl/cu128
+# PyTorch CPU-only (used solely for Silero VAD which runs on CPU)
+# Silero VAD's OnnxWrapper uses torch tensors internally but does not need GPU.
+# Faster-Whisper/CTranslate2 handles GPU transcription via CUDA directly.
+# torchaudio is required by silero-vad's utils_vad.py top-level import.
+torch==2.7.1+cpu
+torchaudio==2.7.1+cpu
+--index-url https://download.pytorch.org/whl/cpu
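
After rebuilding the image, a quick sanity check can confirm that the CPU wheel pinned above is what actually got installed. This is a minimal sketch and not part of the commit; the helper names `is_cpu_wheel` and `check_installed_torch` are hypothetical. It relies on `torch.__version__` carrying the `+cpu` local tag and on `torch.version.cuda` being `None` for builds without CUDA.

```python
from typing import Optional


def is_cpu_wheel(version: str, cuda_version: Optional[str]) -> bool:
    """True when the version string carries the '+cpu' local tag
    and no CUDA version is compiled into the build."""
    _, _, local_tag = version.partition("+")
    return local_tag == "cpu" and cuda_version is None


def check_installed_torch() -> bool:
    """Check the torch installed in the current environment."""
    # Imported lazily so is_cpu_wheel stays usable without torch installed.
    import torch

    return (
        is_cpu_wheel(torch.__version__, torch.version.cuda)
        and not torch.cuda.is_available()
    )
```

Running `check_installed_torch()` inside the miku-stt container should return `True` with these pins. Note that it only checks for the `+cpu` tag specifically, so an untagged wheel such as `2.7.1` is reported as `False` even when it has no GPU support.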