- miku-bot: Re-add scikit-learn to requirements.txt (needed for vision color extraction)
- miku-stt: Upgrade from CUDA 12.6.2 to 12.8.1 and PyTorch 2.5.1 to 2.7.1, per RealtimeSTT PR #295
- miku-stt: Use Ubuntu 24.04 with Python 3.12 (single Python installation, no dual-Python setup)
- miku-stt: Add requirements-gpu-torch.txt for separate PyTorch installation
- miku-stt: Use --break-system-packages flag for Ubuntu 24.04 pip compatibility
Major changes:
- Remove unused ML libraries: torch, scikit-learn, langchain-core, langchain-text-splitters, langchain-community, faiss-cpu
- Comment out unused langchain imports in utils/core.py (referenced only in already commented-out code)
- Keep transformers (used in persona_dialogue.py for sentiment analysis)
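As context for keeping transformers, below is a minimal sketch of typical sentiment-analysis usage with that library; the actual persona_dialogue.py code and model choice may differ.

```python
# Sketch only: typical transformers sentiment-analysis usage, which is why the
# package stays. persona_dialogue.py may load a specific model instead.
from transformers import pipeline

_sentiment = None  # shared pipeline so the model is loaded only once

def get_sentiment(text: str) -> dict:
    """Return e.g. {'label': 'POSITIVE', 'score': 0.99} for the given text."""
    global _sentiment
    if _sentiment is None:
        # Relying on the default checkpoint is an assumption; pass model=... in real code.
        _sentiment = pipeline("sentiment-analysis")
    return _sentiment(text)[0]
```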
Results:
- Container size reduced from 14.5GB to 2.6GB
- 82% reduction (11.9GB saved)
- Bot runs correctly without errors
- All functionality preserved
Removed packages:
- torch: ~1.0-1.5GB (not used, only in soprano_to_rvc/)
- scikit-learn: ~200-300MB (not used in bot/)
- langchain-core: ~50-100MB (not used, only in commented code)
- langchain-text-splitters: ~30-50MB (not used, only in commented code)
- langchain-community: ~50-80MB (not used, only in commented code)
- faiss-cpu: ~100-200MB (not used in bot/)
This is Phase 1 of container optimization (Quick Wins).
Further optimizations possible:
- OpenCV headless (150-200MB)
- Evaluate Playwright usage (500MB-1GB)
- Alpine base image (1-1.5GB)
- Multi-stage builds (200-400MB)
Major changes:
- Add Pydantic-based configuration system (bot/config.py, bot/config_manager.py); see the loader sketch after this list
- Add config.yaml with all service URLs, models, and feature flags
- Fix config.yaml path resolution in Docker (check /app/config.yaml first)
- Remove Fish Audio API integration (a feature that was tested but did not work)
- Remove hardcoded ERROR_WEBHOOK_URL, import from config instead
- Add missing Pydantic models (LogConfigUpdateRequest, LogFilterUpdateRequest)
- Enable Cheshire Cat memory system by default (USE_CHESHIRE_CAT=true)
- Add .env.example template with all required environment variables
- Add setup.sh script for user-friendly initialization
- Update docker-compose.yml with proper env file mounting
- Update .gitignore for config files and temporary files
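A minimal sketch of what the loader side of this could look like, assuming Pydantic models over a YAML file and the /app/config.yaml-first path check; field names beyond the ones mentioned in this log are illustrative assumptions.

```python
# Sketch, not the actual bot/config.py: Pydantic config with Docker-aware path
# resolution. Field names beyond those mentioned in this log are assumptions.
from pathlib import Path
import yaml
from pydantic import BaseModel

class BotConfig(BaseModel):
    use_cheshire_cat: bool = True                  # USE_CHESHIRE_CAT default per this change
    llama_url: str = "http://llama-swap:8090"      # assumed service URL field
    error_webhook_url: str = ""                    # secret, filled in from .env

def find_config_file() -> Path | None:
    """Check the Docker mount point first, then the path next to this module."""
    for candidate in (Path("/app/config.yaml"), Path(__file__).parent / "config.yaml"):
        if candidate.is_file():
            return candidate
    return None

def load_config() -> BotConfig:
    path = find_config_file()
    if path is None:
        return BotConfig()                         # graceful fallback to defaults
    with open(path) as f:
        return BotConfig(**(yaml.safe_load(f) or {}))
```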
Config system features:
- Static configuration from config.yaml
- Runtime overrides from config_runtime.yaml
- Environment variables for secrets (.env)
- Web UI integration via config_manager
- Graceful fallback to defaults
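Roughly, that layering could be implemented as below; the shallow dict merge, the helper names, and the exact file handling are assumptions about config_manager.

```python
# Sketch of the layering handled by bot/config_manager.py (assumed structure):
# config.yaml (static) -> config_runtime.yaml (runtime overrides) -> .env (secrets).
import os
from pathlib import Path
import yaml

def _read_yaml(path: Path) -> dict:
    """Return the file's contents, or {} when it is missing (graceful fallback)."""
    try:
        with open(path) as f:
            return yaml.safe_load(f) or {}
    except FileNotFoundError:
        return {}

def load_effective_config(base_dir: Path = Path("/app")) -> dict:
    config = _read_yaml(base_dir / "config.yaml")                 # static configuration
    config.update(_read_yaml(base_dir / "config_runtime.yaml"))   # web-UI runtime overrides
    if os.getenv("ERROR_WEBHOOK_URL"):                            # secrets stay in .env
        config["error_webhook_url"] = os.environ["ERROR_WEBHOOK_URL"]
    return config

def set_runtime_override(key: str, value, base_dir: Path = Path("/app")) -> None:
    """Persist a single web-UI change so it survives container restarts."""
    overrides = _read_yaml(base_dir / "config_runtime.yaml")
    overrides[key] = value
    with open(base_dir / "config_runtime.yaml", "w") as f:
        yaml.safe_dump(overrides, f)
```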
Secrets handling:
- Move ERROR_WEBHOOK_URL from a hardcoded value to .env
- Add .env.example with all placeholder values
- Document all required secrets
- Remove Fish API key and voice ID from .env
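The webhook URL, for example, is now expected from the environment rather than from source; the empty-string fallback in this sketch is an assumption.

```python
# Sketch: read the secret from the environment (populated from .env via
# docker-compose) instead of a hardcoded constant. Fallback behaviour assumed.
import os

ERROR_WEBHOOK_URL = os.getenv("ERROR_WEBHOOK_URL", "")
if not ERROR_WEBHOOK_URL:
    print("ERROR_WEBHOOK_URL is not set; error webhook reporting is disabled")
```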
Documentation:
- CONFIG_README.md - Configuration system guide
- CONFIG_SYSTEM_COMPLETE.md - Implementation summary
- FISH_API_REMOVAL_COMPLETE.md - Removal record
- SECRETS_CONFIGURED.md - Secrets setup record
- BOT_STARTUP_FIX.md - Pydantic model fixes
- MIGRATION_CHECKLIST.md - Setup checklist
- WEB_UI_INTEGRATION_COMPLETE.md - Web UI config guide
- Updated readmes/README.md with new features
Features:
- Built custom ROCm container for AMD RX 6800 GPU
- Added GPU selection toggle in web UI (NVIDIA/AMD)
- Unified model names across both GPUs for seamless switching
- Vision model always uses NVIDIA GPU (optimal performance)
- Text models (llama3.1, darkidol) can use either GPU
- Added /gpu-status and /gpu-select API endpoints
- Implemented GPU state persistence in memory/gpu_state.json
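A rough sketch of how the two endpoints and the JSON state file could fit together; FastAPI, the request model, and the "nvidia" default are assumptions, while the endpoint paths and memory/gpu_state.json come from this change.

```python
# Sketch of bot/api.py's GPU endpoints. FastAPI, the request model and the
# default selection are assumptions; the paths and state file are from this change.
import json
from pathlib import Path
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel

GPU_STATE_FILE = Path("memory/gpu_state.json")
router = APIRouter()

class GPUSelectRequest(BaseModel):
    gpu: str  # "nvidia" or "amd"

def load_gpu_state() -> dict:
    if GPU_STATE_FILE.is_file():
        return json.loads(GPU_STATE_FILE.read_text())
    return {"gpu": "nvidia"}  # assumed default selection

@router.get("/gpu-status")
def gpu_status() -> dict:
    return load_gpu_state()

@router.post("/gpu-select")
def gpu_select(req: GPUSelectRequest) -> dict:
    if req.gpu not in ("nvidia", "amd"):
        raise HTTPException(status_code=400, detail="gpu must be 'nvidia' or 'amd'")
    GPU_STATE_FILE.parent.mkdir(parents=True, exist_ok=True)
    GPU_STATE_FILE.write_text(json.dumps({"gpu": req.gpu}))
    return {"gpu": req.gpu}
```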
Technical details:
- Multi-stage Dockerfile.llamaswap-rocm with ROCm 6.2.4
- llama.cpp compiled with GGML_HIP=ON for gfx1030 (RX 6800)
- Proper GPU permissions without root (groups 187/989)
- AMD container on port 8091, NVIDIA on port 8090
- Updated bot/utils/llm.py with get_current_gpu_url() and get_vision_gpu_url() (sketched after this list)
- Modified bot/utils/image_handling.py to always use NVIDIA for vision
- Enhanced web UI with GPU selector button (blue=NVIDIA, red=AMD)
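The two routing helpers could be as simple as the following; LLAMA_AMD_URL matches the globals.py addition noted under "Files modified", while the NVIDIA constant name and the container hostnames are assumptions.

```python
# Sketch of the routing helpers in bot/utils/llm.py. LLAMA_AMD_URL matches the
# globals.py addition noted below; the other names and hostnames are assumptions.
import json
from pathlib import Path

LLAMA_URL = "http://llama-swap:8090"          # NVIDIA backend (assumed name/host)
LLAMA_AMD_URL = "http://llama-swap-amd:8091"  # AMD ROCm backend (assumed host)
GPU_STATE_FILE = Path("memory/gpu_state.json")

def get_current_gpu_url() -> str:
    """Text models (llama3.1, darkidol) go to whichever GPU is currently selected."""
    try:
        selected = json.loads(GPU_STATE_FILE.read_text()).get("gpu", "nvidia")
    except (FileNotFoundError, json.JSONDecodeError):
        selected = "nvidia"
    return LLAMA_AMD_URL if selected == "amd" else LLAMA_URL

def get_vision_gpu_url() -> str:
    """Vision models always run on the NVIDIA GPU, regardless of the selection."""
    return LLAMA_URL
```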
Files modified:
- docker-compose.yml (added llama-swap-amd service)
- bot/globals.py (added LLAMA_AMD_URL)
- bot/api.py (added GPU selection endpoints and helper function)
- bot/utils/llm.py (GPU routing for text models)
- bot/utils/image_handling.py (GPU routing for vision models)
- bot/static/index.html (GPU selector UI)
- llama-swap-rocm-config.yaml (unified model names)
New files:
- Dockerfile.llamaswap-rocm
- bot/memory/gpu_state.json
- bot/utils/gpu_router.py (load balancing utility; sketched below)
- setup-dual-gpu.sh (setup verification script)
- DUAL_GPU_*.md (documentation files)
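gpu_router.py is described here only as a load balancing utility, so the following round-robin-over-healthy-backends sketch is just one plausible shape for it; the /health probe, hostnames, and strategy are assumptions.

```python
# Sketch only: one plausible shape for bot/utils/gpu_router.py. The /health
# endpoint, hostnames and round-robin strategy are assumptions.
from itertools import cycle
import requests

BACKENDS = [
    "http://llama-swap:8090",      # NVIDIA (assumed hostname)
    "http://llama-swap-amd:8091",  # AMD ROCm (assumed hostname)
]
_rotation = cycle(BACKENDS)

def _healthy(url: str) -> bool:
    """Probe the backend; treat any connection error as unhealthy."""
    try:
        return requests.get(f"{url}/health", timeout=2).ok
    except requests.RequestException:
        return False

def next_backend() -> str:
    """Return the next healthy backend, falling back to the first configured one."""
    for _ in range(len(BACKENDS)):
        url = next(_rotation)
        if _healthy(url):
            return url
    return BACKENDS[0]
```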