Commit Graph

6 Commits

Author SHA1 Message Date
a226bc41df Rewrite is_miku_addressed() to only trigger when addressed, not mentioned
- Pre-compile 393 name variants into 4 regex patterns at module load
  (was 7,300+ raw re.search() calls per message)
- Strict addressing detection using punctuation context:
  START:  name at beginning + punctuation (Miku, ... / みく!...)
  END:    comma + name at end (..., Miku / ...、ミク)
  MIDDLE: commas on both sides - vocative (..., Miku, ...)
  ALONE:  name is the entire message (Miku! / ミクちゃん)
- Rejects mere mentions: 'I like Miku' / 'Miku is cool' no longer trigger
- Script-family-aware pattern generation (Latin, Cyrillic, Japanese)
  eliminates nonsensical cross-script combos (e.g. o-みく)
- Word boundary enforcement prevents substring matches (mikumiku)
- Fixes regex 'unbalanced parenthesis' errors from old implementation
- Add comprehensive test suite (94 cases, all passing)
2026-03-03 12:42:33 +02:00
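The four punctuation contexts listed in commit a226bc41df (START, END, MIDDLE, ALONE) can be illustrated with a minimal sketch. This is not the repository's code: `NAME_VARIANTS` is a tiny hypothetical stand-in for the 393 variants the commit mentions, `is_addressed` is an illustrative name, and the exact punctuation sets are assumptions. The script-family-aware variant generation is not shown.

```python
import re

# Hypothetical, tiny subset of name variants (the commit mentions 393).
NAME_VARIANTS = ["miku", "みく", "ミク", "ミクちゃん", "мику"]

# Longest variants first, so "ミクちゃん" is tried before "ミク".
_NAMES = "|".join(
    re.escape(n) for n in sorted(NAME_VARIANTS, key=len, reverse=True)
)
# Lookarounds act as word boundaries, so "mikumiku" does not match.
_NAME = rf"(?<!\w)(?:{_NAMES})(?!\w)"
# Assumed set of trailing punctuation allowed after a name.
_TRAIL = r"[!?.!?。…~]*"
_FLAGS = re.IGNORECASE

# Compiled once at module load rather than once per message.
PATTERNS = {
    # Name at the beginning, followed by punctuation: "Miku, ..."
    "START": re.compile(rf"^\s*{_NAME}\s*[,!?.:、。!?…]", _FLAGS),
    # Comma, then name at the end: "..., Miku"
    "END": re.compile(rf"[,、]\s*{_NAME}\s*{_TRAIL}\s*$", _FLAGS),
    # Commas on both sides (vocative): "..., Miku, ..."
    "MIDDLE": re.compile(rf"[,、]\s*{_NAME}\s*[,、]", _FLAGS),
    # Name is the entire message: "Miku!"
    "ALONE": re.compile(rf"^\s*{_NAME}\s*{_TRAIL}\s*$", _FLAGS),
}


def is_addressed(message: str) -> bool:
    """Return True if the message addresses the name rather than mentions it."""
    return any(p.search(message) for p in PATTERNS.values())
```

Under these assumptions, `is_addressed("How are you, Miku")` is true while `is_addressed("I like Miku")` is false, because the second message has no comma or other addressing punctuation around the name.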
33e5095607 Optimize miku-bot container size by removing unused dependencies
Major changes:
- Remove unused ML libraries: torch, scikit-learn, langchain-core, langchain-text-splitters, langchain-community, faiss-cpu
- Comment out unused langchain imports in utils/core.py (only used in commented-out code)
- Keep transformers (used in persona_dialogue.py for sentiment analysis)

Results:
- Container size reduced from 14.5GB to 2.6GB
- 82% reduction (11.9GB saved)
- Bot runs correctly without errors
- All functionality preserved

Removed packages:
- torch: ~1.0-1.5GB (not used in bot/, only in soprano_to_rvc/)
- scikit-learn: ~200-300MB (not used in bot/)
- langchain-core: ~50-100MB (not used, only in commented code)
- langchain-text-splitters: ~30-50MB (not used, only in commented code)
- langchain-community: ~50-80MB (not used, only in commented code)
- faiss-cpu: ~100-200MB (not used in bot/)

This is Phase 1 of container optimization (Quick Wins).
Further optimizations possible (estimated savings):
- OpenCV headless (150-200MB)
- Evaluate Playwright usage (500MB-1GB)
- Alpine base image (1-1.5GB)
- Multi-stage builds (200-400MB)
2026-02-15 20:56:25 +02:00
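One way to check claims such as "not used in bot/" before removing a dependency is a static import scan. The sketch below is an assumption about how such a check could be done, not the method used in commit 33e5095607; `CANDIDATES`, `imported_modules`, and `unused_packages` are hypothetical names. Because it parses the syntax tree, imports that appear only in comments are ignored.

```python
import ast
from pathlib import Path

# Distribution name -> top-level import name (they differ for some packages).
CANDIDATES = {
    "torch": "torch",
    "scikit-learn": "sklearn",
    "langchain-core": "langchain_core",
    "langchain-text-splitters": "langchain_text_splitters",
    "langchain-community": "langchain_community",
    "faiss-cpu": "faiss",
}


def imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported by one source string."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as "from . import x".
            found.add(node.module.split(".")[0])
    return found


def unused_packages(root: str) -> list[str]:
    """List candidate packages that no .py file under root imports."""
    used = set()
    for path in Path(root).rglob("*.py"):
        try:
            used |= imported_modules(path.read_text(encoding="utf-8"))
        except (SyntaxError, UnicodeDecodeError):
            continue  # skip files that cannot be parsed
    return sorted(pkg for pkg, mod in CANDIDATES.items() if mod not in used)
```

A scan like this only sees static `import` statements. It cannot detect dynamic imports or tell whether a package that is kept needs a removed one at runtime, so running the bot after removal remains the real test.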
7368ef0cd5 Added Japanese and Bulgarian addressing 2026-01-30 21:34:24 +02:00
ecd14cf704 Miku can now be addressed in Cyrillic, kanji, and both kana scripts, including Japanese honorifics 2026-01-27 19:53:18 +02:00
32c2a7b930 feat: Implement comprehensive non-hierarchical logging system
- Created new logging infrastructure with per-component filtering
- Added 6 log levels: DEBUG, INFO, API, WARNING, ERROR, CRITICAL
- Implemented non-hierarchical level control (any combination can be enabled)
- Migrated 917 print() statements across 31 files to structured logging
- Created web UI (system.html) for runtime configuration with dark theme
- Added global level controls to enable/disable levels across all components
- Added timestamp format control (off/time/date/datetime options)
- Implemented log rotation (10MB per file, 5 backups)
- Added API endpoints for dynamic log configuration
- Configured HTTP request logging with filtering via api.requests component
- Intercepted APScheduler logs with proper formatting
- Fixed persistence paths to use /app/memory for Docker volume compatibility
- Fixed checkbox display bug in web UI (enabled_levels now properly shown)
- Changed System Settings button to open in same tab instead of new window

Components: bot, api, api.requests, autonomous, persona, vision, llm,
conversation, mood, dm, scheduled, gpu, media, server, commands,
sentiment, core, apscheduler

All settings persist across container restarts via JSON config.
2026-01-10 20:46:19 +02:00
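The "non-hierarchical" level control in commit 32c2a7b930 differs from the standard library's threshold model, where enabling a level also enables every level above it. A minimal sketch of the idea with Python's `logging` module follows. It is an illustration under assumptions, not the repository's implementation: the numeric value 25 for the API level, the `miku.` logger prefix, and the names `EnabledLevelsFilter` and `get_component_logger` are all hypothetical. The 10MB and 5-backup rotation values come from the commit message.

```python
import logging
from logging.handlers import RotatingFileHandler

# Assumed numeric value, placed between INFO (20) and WARNING (30).
API = 25
logging.addLevelName(API, "API")


class EnabledLevelsFilter(logging.Filter):
    """Pass a record only if its level name is in an explicit set."""

    def __init__(self, enabled_levels):
        super().__init__()
        self.enabled_levels = set(enabled_levels)

    def filter(self, record):
        return record.levelname in self.enabled_levels


def get_component_logger(component, enabled_levels, log_path=None):
    """Build a per-component logger whose levels are toggled individually."""
    logger = logging.getLogger(f"miku.{component}")
    logger.setLevel(logging.DEBUG)  # no threshold; the filter decides
    logger.propagate = False
    logger.handlers.clear()
    if log_path:
        # Rotation as described in the commit: 10MB per file, 5 backups.
        handler = RotatingFileHandler(
            log_path, maxBytes=10 * 1024 * 1024, backupCount=5
        )
    else:
        handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s [%(levelname)s] %(name)s: %(message)s")
    )
    handler.addFilter(EnabledLevelsFilter(enabled_levels))
    logger.addHandler(handler)
    return logger


# Enable DEBUG and ERROR only: INFO, API and WARNING are dropped even
# though they sit between the two enabled levels.
log = get_component_logger("vision", {"DEBUG", "ERROR"})
log.debug("emitted")
log.info("suppressed")
log.log(API, "suppressed")
log.error("emitted")
```

Attaching the filter to the handler rather than setting a logger level is what allows any combination of levels to be enabled at once.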
8c74ad5260 Initial commit: Miku Discord Bot 2025-12-07 17:15:09 +02:00