Commit Graph

89 Commits

Author SHA1 Message Date
8ca94fbafc fix: persist active tab via localStorage + fix implicit event bug in switchTab
- Add data-tab attributes to tab buttons for reliable identification
- Replace implicit window.event usage with querySelector by data-tab
- Save active tab to localStorage on switch, restore on page load
2026-02-28 22:59:12 +02:00
66881f4c88 refactor: deduplicate prompts, reorganize persona files, update paths
Prompt deduplication (~20% reduction, 4,743 chars saved):
- evil_miku_lore.txt: remove intra-file duplication (height rule 2x,
  cruelty-has-substance 2x, music secret 2x, adoration secret 2x),
  trim verbose restatements, cut speech examples from 10 to 6
- evil_miku_prompt.txt: remove entire PERSONALITY section (in lore),
  remove entire RESPONSE STYLE section (now only in preamble),
  soften height from prohibition to knowledge
- miku_lore.txt: remove RELATIONSHIPS section (duplicates FRIENDS)
- miku_prompt.txt: remove duplicate intro, 4 personality traits
  already in lore, FAMOUS SONGS section (in lore), fix response
  length inconsistency (1-2 vs 2-3 -> consistent 2-3)

Preamble updates (evil_mode.py, evil_miku_personality.py, llm.py,
miku_personality.py):
- Response rules now exist in ONE place only (preamble)
- Height rule softened: model knows 15.8m, can say it if asked,
  but won't default to quoting it when taunting
- Response length: 2-4 sentences (was 1-3), removed action template
  list that model was copying literally (*scoffs*, *rolls eyes*)
- Added: always include actual words, never action-only responses
- Normal Miku: trim CHARACTER CONTEXT, fix 1-3 -> 2-3 sentences

Directory reorganization:
- Move 6 persona files to bot/persona/{evil,miku}/ subdirectories
- Update all open() paths in evil_mode.py, context_manager.py,
  voice_manager.py, both Cat plugins
- Dockerfile: 6 COPY lines -> 1 (COPY persona /app/persona)
- docker-compose: 6 file mounts -> 2 directory mounts
  (bot/persona/evil -> cat/data/evil, bot/persona/miku -> cat/data/miku)

Evil Miku system (previously unstaged):
- Full evil mood management: 2h rotation timer, mood persistence,
  10 mood-specific autonomous template pools, mood-aware DMs
- Evil mode toggle with role color/nickname/pfp management
- get_evil_system_prompt() with mood integration

Add test_evil_moods.py: 10-mood x 3-message comprehensive test
2026-02-27 13:14:03 +02:00
9038f442a3 feat(evil-miku): add 10-mood system and Evil Miku Cat plugin
- Add 6 new evil mood files: bored, contemptuous, jealous, manic,
  melancholic, playful_cruel
- Rewrite 4 existing mood files: aggressive, cunning, evil_neutral,
  sarcastic (shorter, more focused descriptions)
- Add evil_miku_personality Cat plugin (parallel to miku_personality)
  with mood-aware system prompt, softened height rule, and balanced
  response length rules (2-4 sentences)
2026-02-27 13:11:37 +02:00
7aafd06da1 added new evil mood emoji map to web UI and minor fixes 2026-02-26 12:08:41 +02:00
9e5511da21 perf: reduce container sizes and build times
- miku-stt: switch PyTorch CUDA -> CPU-only (~2.5 GB savings)
  - Silero VAD already runs on CPU via ONNX (onnx=True), CUDA PyTorch was waste
  - faster-whisper/CTranslate2 uses CUDA directly, no PyTorch GPU needed
  - torch+torchaudio layer: 3.3 GB -> 796 MB; total image 9+ GB -> 6.83 GB
  - Tested: Silero VAD loads (ONNX), Whisper loads on cuda, server ready

- llama-swap-rocm: add root .dockerignore to fix 31 GB build context
  - Dockerfile clones all sources from git, never COPYs from context
  - 19 GB of GGUF model files were being transferred on every build
  - Now excludes everything (*), near-zero context transfer

- anime-face-detector: add .dockerignore to exclude accumulated outputs
  - api/outputs/ (56 accumulated detection files) no longer baked into image
  - api/__pycache__/ and images/ also excluded

- .gitignore: remove .dockerignore exclusion so these files are tracked
2026-02-25 14:41:04 +02:00
0edf1ef1c0 Fix webhook avatar mismatch: pass avatar_url at send time
- Fixed missing client parameter in animated GIF webhook update path
- Added get_persona_avatar_urls() helper that returns bot's current Discord
  avatar URL for Miku persona (always fresh, no cache lag)
- Pass avatar_url on every webhook.send() call in bipolar_mode.py,
  persona_dialogue.py, and api.py so avatars always match current pfp
  regardless of webhook cache state
2026-02-25 13:20:18 +02:00
9b74acd03b Fix missing sklearn module in miku-bot; upgrade miku-stt to CUDA 12.8.1 + PyTorch 2.7.1
- miku-bot: Re-add scikit-learn to requirements.txt (needed for vision color extraction)
- miku-stt: Upgrade from CUDA 12.6.2 to 12.8.1, PyTorch 2.5.1 to 2.7.1 per RealtimeSTT PR #295
- miku-stt: Use Ubuntu 24.04 with Python 3.12 (single installation, no dual Python)
- miku-stt: Add requirements-gpu-torch.txt for separate PyTorch installation
- miku-stt: Use --break-system-packages flag for Ubuntu 24.04 pip compatibility
2026-02-23 14:31:48 +02:00
615dd4a5ef fix(P3): 3 priority-3 fixes — timezone, decay rounding, rate limiter
#16  Timezone consistency — added TZ=Europe/Sofia to docker-compose.yml
     so datetime.now() returns local time inside the container. Removed
     the +3 hour hack from get_time_of_day(). All three time-of-day
     consumers (autonomous_v1_legacy, moods, autonomous_engine) now
     use the same correct local hour automatically.

#17  Decay truncation — replaced int() with round() in decay_events()
     so a counter of 1 survives one more 15-minute cycle instead of
     being immediately zeroed (round(0.841)=1 vs int(0.841)=0).

#20  Unpersisted rate limiter — _last_action_execution dict in
     autonomous.py is now seeded from the engine's persisted
     server_last_action on import, so restarts don't bypass the
     30-second cooldown.

Note: #18 (dead config fields) was a false positive — autonomous_interval_minutes
IS used by the scheduler. #19 deferred to bipolar mode rework.
2026-02-23 13:53:22 +02:00
2b743ed65e fix(P2): 5 priority-2 bug fixes — emoji consolidation, DM safety, pause gap
#10  Redundant coin flip in join_conversation — removed the 50% random
     gate that doubled the V2 engine's own decision to act.

#11  Message-triggered actions skip _autonomous_paused — _check_and_act
     and _check_and_react now bail out immediately when the autonomous
     system is paused (voice session), matching the scheduled-tick path.

#12  Duplicate emoji dictionaries — removed MOOD_EMOJIS and
     EVIL_MOOD_EMOJIS from globals.py (had different emojis from moods.py).
     bipolar_mode.py and evil_mode.py now import the canonical dicts
     from utils/moods.py so all code sees the same emojis.

#13  DM mood can spontaneously become 'asleep' — rotate_dm_mood() now
     filters 'asleep' out of the candidate list since DMs have no
     sleepy-to-asleep transition guard and no wakeup timer.

#15  Engage-user fallback misreports action type — log level raised to
     WARNING with an explicit [engage_user->general] prefix so the
     cooldown-triggered fallback is visible in logs.
2026-02-23 13:43:15 +02:00
0e4aebf353 fix(P1): 6 priority-1 bug fixes for autonomous engine and mood system
#4  Sleep/mood desync — set_server_mood() now clears is_sleeping when
    mood changes away from 'asleep', preventing ghost-sleep state.

#5  Race condition in _check_and_act — added per-guild asyncio.Lock so
    overlapping ticks + message-triggered calls cannot fire concurrently.

#6  Class-level attrs on ServerConfig — sleepy_responses_left,
    angry_wakeup_timer, and forced_angry_until are now proper dataclass
    fields with defaults, so asdict()/from_dict() round-trip correctly.
    Also strips unknown keys in from_dict() to survive schema changes.

#7  Persistence decay_factor crash — initialise decay_factor = 1.0
    before the loop so empty-server or zero-downtime paths don't
    raise NameError.

#8  Double record_action — removed the redundant call in
    autonomous_tick_v2(); only _check_and_act records the action now.

#9  Engine mood desync — on_mood_change() is now called inside
    set_server_mood() (single source of truth) and removed from 4
    call-sites in api.py, moods.py, and server_manager wakeup task.
2026-02-23 13:31:15 +02:00
422366df4c fix: 3 critical autonomous engine & mood system bugs
1. Momentum cliff at 10 messages (P0): The conversation momentum formula
   had a discontinuity where the 10th message caused momentum to DROP from
   0.9 to 0.5. Replaced with a smooth log1p curve that monotonically
   increases (0→0→0.20→0.32→...→0.70→0.89→1.0 at 30 msgs).

2. Neutral keywords overriding all moods (P0): detect_mood_shift() checked
   neutral early with generic keywords (okay, sure, hmm) that matched
   almost any response, constantly resetting mood to neutral. Now: all
   specific moods are scored by match count first (best-match wins),
   neutral is only checked as fallback and requires 2+ keyword matches.

3. Uncancellable delayed_wakeup tasks (P0): Fire-and-forget sleep tasks
   could stack and overwrite mood state after manual wake-up. Added a
   centralized wakeup task registry in ServerManager with automatic
   cancellation on manual wake or new sleep cycle.
2026-02-20 15:37:57 +02:00
2f0d430c35 feat: Add manual trigger bypass for web UI autonomous engagement
- Added manual_trigger parameter to /autonomous/engage endpoint to bypass 12h cooldown
- Updated miku_engage_random_user_for_server() and miku_engage_random_user() to accept manual_trigger flag
- Modified Web UI to always send manual_trigger=true when engaging users from the UI
- Users can now manually engage the same user multiple times from web UI without cooldown restriction
- Regular autonomous schedules still respect the 12h cooldown between engagements to the same user

Changes:
- bot/api.py: Added manual_trigger parameter with string-to-boolean conversion
- bot/static/index.html: Added manual_trigger=true to engage user request
- bot/utils/autonomous_v1_legacy.py: Added manual_trigger parameter and cooldown bypass logic
2026-02-20 00:53:42 +02:00
9972edb06d fix(docker): add config_manager.py to Dockerfile and logger components
- Add COPY config_manager.py to Dockerfile so it's included in the image
- Add 'config_manager' to logger COMPONENTS list to enable logging

Fixes the ModuleNotFoundError and ValueError when importing config_manager
2026-02-19 11:02:58 +02:00
305605fde5 docs: add comprehensive COMMANDS.md reference
Document all bot commands, features and API endpoints:
- 7 voice commands, 4 UNO commands, 2 inline commands
- Conversational features (name detection, DMs, media analysis, image gen)
- Mood system (14 regular + 4 evil moods)
- Personality modes (evil, bipolar, persona dialogue)
- Voice chat architecture (dual GPU, STT, TTS, resource locking)
- Autonomous behavior system (6 action types)
- Memory system (Cheshire Cat declarative + episodic)
- Profile picture system
- ~126 API endpoints organized into 20 categories
- Discord event handlers and environment variables

Resolves #18
2026-02-18 12:37:25 +02:00
d44f08af18 fix(config): persist runtime settings across bot restarts
Add restore_runtime_settings() to ConfigManager that reads config_runtime.yaml
on startup and restores persisted values into globals:
- LANGUAGE_MODE, AUTONOMOUS_DEBUG, VOICE_DEBUG_MODE
- USE_CHESHIRE_CAT, PREFER_AMD_GPU, DM_MOOD

Add missing persistence calls to API endpoints:
- POST /language/set now persists to config_runtime.yaml
- POST /voice/debug-mode now persists to config_runtime.yaml
- POST /memory/toggle now persists to config_runtime.yaml

Call restore_runtime_settings() in on_ready() after evil/bipolar restore.

Resolves #22
2026-02-18 12:18:12 +02:00
8d5137046c fix(shutdown): implement graceful async shutdown handler
Replace the minimal sync-only shutdown (which only saved autonomous state)
with a comprehensive async graceful_shutdown() coroutine that:

1. Ends active voice sessions (disconnect, release GPU locks, cleanup audio)
2. Saves autonomous engine state
3. Stops the APScheduler
4. Cancels all tracked background tasks (from task_tracker)
5. Closes the Discord gateway connection

Signal handlers (SIGTERM/SIGINT) now schedule the async shutdown on the
running event loop. The atexit handler is kept as a last-resort sync fallback.

Resolves #5, also addresses #4 (voice cleanup at shutdown)
2026-02-18 12:08:32 +02:00
7b7abcfc68 fix(tasks): replace fire-and-forget asyncio.create_task with create_tracked_task
Add utils/task_tracker.py with create_tracked_task() that wraps background
tasks with error logging, cancellation handling, and reference tracking.

Replace all 17 fire-and-forget asyncio.create_task() calls across 7 files:
- bot/bot.py (5 interjection checks)
- bot/utils/autonomous.py (2 check-and-act/react tasks)
- bot/utils/bipolar_mode.py (3 argument tasks)
- bot/commands/uno.py (1 game loop task)
- bot/utils/voice_receiver.py (3 STT/interruption callbacks)
- bot/utils/persona_dialogue.py (4 dialogue turn/interjection tasks)

Previously-tracked tasks (voice_audio.py, voice_manager.py) were left as-is
since they already store task references for cancellation.

Closes #1
2026-02-18 12:01:08 +02:00
cf55b15745 Optimize miku-bot container: remove unused packages and caches
Optimizations applied:
- Add pip cache purge after pip install (~7.5MB saved)
- Remove /usr/share/doc documentation (~7.5MB saved)
- Remove pocketsphinx speech recognition packages (~37MB saved)
- Remove libflite1 TTS library (~28MB saved)

Packages removed:
- pocketsphinx-en-us (US English speech model)
- pocketsphinx (speech recognition library)
- libflite1 (text-to-speech engine)
- libpocketsphinx3 (speech recognition frontend)

Reason: These packages are not used in Python code:
- Speech recognition: Handled by external stt-realtime container
- Text-to-speech: Handled by external RVC container

Note: Could not remove Vulkan/Mesa drivers (~130MB) because:
- Playwright installs them via --with-deps flag
- Removing them also removes libgl1 (required by OpenCV)
- libgl1 pulls back Mesa graphics drivers

Total savings: ~80MB (from previous 2.41GB baseline)
Container size remains 2.41GB due to essential package dependencies
2026-02-15 22:21:30 +02:00
33e5095607 Optimize miku-bot container size by removing unused dependencies
Major changes:
- Remove unused ML libraries: torch, scikit-learn, langchain-core, langchain-text-splitters, langchain-community, faiss-cpu
- Comment out unused langchain imports in utils/core.py (only used in commented-out code)
- Keep transformers (used in persona_dialogue.py for sentiment analysis)

Results:
- Container size reduced from 14.5GB to 2.6GB
- 82% reduction (11.9GB saved)
- Bot runs correctly without errors
- All functionality preserved

Removed packages:
- torch: ~1.0-1.5GB (not used, only in soprano_to_rvc/)
- scikit-learn: ~200-300MB (not used in bot/)
- langchain-core: ~50-100MB (not used, only in commented code)
- langchain-text-splitters: ~30-50MB (not used, only in commented code)
- langchain-community: ~50-80MB (not used, only in commented code)
- faiss-cpu: ~100-200MB (not used in bot/)

This is Phase 1 of container optimization (Quick Wins).
Further optimizations possible:
- OpenCV headless (150-200MB)
- Evaluate Playwright usage (500MB-1GB)
- Alpine base image (1-1.5GB)
- Multi-stage builds (200-400MB)
2026-02-15 20:56:25 +02:00
8d09a8a52f Implement comprehensive config system and clean up codebase
Major changes:
- Add Pydantic-based configuration system (bot/config.py, bot/config_manager.py)
- Add config.yaml with all service URLs, models, and feature flags
- Fix config.yaml path resolution in Docker (check /app/config.yaml first)
- Remove Fish Audio API integration (tested feature that didn't work)
- Remove hardcoded ERROR_WEBHOOK_URL, import from config instead
- Add missing Pydantic models (LogConfigUpdateRequest, LogFilterUpdateRequest)
- Enable Cheshire Cat memory system by default (USE_CHESHIRE_CAT=true)
- Add .env.example template with all required environment variables
- Add setup.sh script for user-friendly initialization
- Update docker-compose.yml with proper env file mounting
- Update .gitignore for config files and temporary files

Config system features:
- Static configuration from config.yaml
- Runtime overrides from config_runtime.yaml
- Environment variables for secrets (.env)
- Web UI integration via config_manager
- Graceful fallback to defaults

Secrets handling:
- Move ERROR_WEBHOOK_URL from hardcoded to .env
- Add .env.example with all placeholder values
- Document all required secrets
- Fish API key and voice ID removed from .env

Documentation:
- CONFIG_README.md - Configuration system guide
- CONFIG_SYSTEM_COMPLETE.md - Implementation summary
- FISH_API_REMOVAL_COMPLETE.md - Removal record
- SECRETS_CONFIGURED.md - Secrets setup record
- BOT_STARTUP_FIX.md - Pydantic model fixes
- MIGRATION_CHECKLIST.md - Setup checklist
- WEB_UI_INTEGRATION_COMPLETE.md - Web UI config guide
- Updated readmes/README.md with new features
2026-02-15 19:51:00 +02:00
bb5067a89e fix: Add settings.json and enable profile_picture_context plugin
- Added empty settings.json required by Cat plugin system
- Plugin now appears in ACTIVE PLUGINS list
- Enabled via /plugins/toggle API endpoint
- Ready to inject PFP descriptions when user asks about it
2026-02-11 00:09:58 +02:00
eb557f655c feat: Add profile picture context plugin with regex-based injection
- Create profile_picture_context plugin to detect PFP queries via regex
- Inject current_description.txt only when user asks about profile picture
- Mount bot/memory directory in Cat container for PFP access
- Avoids context bloat by only adding PFP description when relevant
- Patterns match: 'what does your pfp look like', 'describe your avatar', etc.
- Works seamlessly with existing profile picture update system
- No manual sync needed - description auto-updates with PFP changes
2026-02-10 23:41:14 +02:00
985ac60191 Webhook pfp updates properly now 2026-02-10 22:57:55 +02:00
34167eddae feat: Restore mood system and implement comprehensive memory editor UI
MOOD SYSTEM FIX:
- Mount bot/moods directory in docker-compose.yml for Cat container access
- Update miku_personality plugin to load mood descriptions from .txt files
- Add Cat logger for debugging mood loading (replaces print statements)
- Moods now dynamically loaded from working_memory instead of hardcoded neutral
2026-02-10 22:03:54 +02:00
6ba8e19d99 Ability to edit and add memories from the web UI with fixed escapeHtml 2026-02-10 21:41:28 +02:00
fbd940e711 fix: Restore declarative memory recall by preserving suffix template
Root cause: The miku_personality plugin's agent_prompt_suffix hook was returning
an empty string, which wiped out the {declarative_memory} and {episodic_memory}
placeholders from the prompt template. This caused the LLM to never receive any
stored facts about users, resulting in hallucinated responses.

Changes:
- miku_personality: Changed agent_prompt_suffix to return the memory context
  section with {episodic_memory}, {declarative_memory}, and {tools_output}
  placeholders instead of empty string

- discord_bridge: Added before_cat_recalls_declarative_memories hook to increase
  k-value from 3 to 10 and lower threshold from 0.7 to 0.5 for better fact
  retrieval. Added agent_prompt_prefix to emphasize factual accuracy. Added
  debug logging via before_agent_starts hook.

Result: Miku now correctly recalls user facts (favorite songs, games, etc.)
from declarative memory with 100% accuracy.

Tested with:
- 'What is my favorite song?' → Correctly answers 'Monitoring (Best Friend Remix) by DECO*27'
- 'Do you remember my favorite song?' → Correctly recalls the song
- 'What is my favorite video game?' → Correctly answers 'Sonic Adventure'
2026-02-09 12:33:31 +02:00
beb1a89000 Fix: Optimize Twitter fetching to avoid Playwright hangs
- Replaced Playwright browser scraping with direct API media extraction
- Both fetch_miku_tweets() and fetch_figurine_tweets_latest() now use twscrape's built-in media info
- Reduced tweet fetching from 10-15 minutes to ~5 seconds
- Eliminated browser timeout/hanging issues
- Relaxed autonomous tweet sharing conditions:
  * Increased message threshold from 10 to 20 per hour
  * Reduced cooldown from 3600s to 2400s (40 minutes)
  * Increased energy threshold from 50% to 70%
  * Added 'silly' and 'flirty' moods to allowed sharing moods

This makes both figurine notifications and tweet sharing much more reliable and responsive.
2026-02-08 14:55:01 +02:00
b9d1f67d70 llama-swap-rocm now uses official image and adjusted accordingly 2026-02-07 23:43:01 +02:00
11b90ebb46 fix: Phase 3 bug fixes - memory APIs, username visibility, web UI layout, Docker
**Critical Bug Fixes:**

1. Per-user memory isolation bug
   - Changed CatAdapter from HTTP POST to WebSocket /ws/{user_id}
   - User_id now comes from URL path parameter (true per-user isolation)
   - Verified: Different users can't see each other's memories

2. Memory API 405 errors
   - Replaced non-existent Cat endpoint calls with Qdrant direct queries
   - get_memory_points(): Now uses POST /collections/{collection}/points/scroll
   - delete_memory_point(): Now uses POST /collections/{collection}/points/delete

3. Memory stats showing null counts
   - Reimplemented get_memory_stats() to query Qdrant directly
   - Now returns accurate counts: episodic: 20, declarative: 6, procedural: 4

4. Miku couldn't see usernames
   - Modified discord_bridge before_cat_reads_message hook
   - Prepends [Username says:] to every message text
   - LLM now knows who is texting: [Alice says:] Hello Miku!

5. Web UI Memory tab layout
   - Tab9 was positioned outside .tab-container div (showed to the right)
   - Moved tab9 HTML inside container, before closing divs
   - Memory tab now displays below tab buttons like other tabs

**Code Changes:**

bot/utils/cat_client.py:
- Line 25: Logger name changed to 'llm' (available component)
- get_memory_stats() (lines 256-285): Query Qdrant directly via HTTP GET
- get_memory_points() (lines 275-310): Use Qdrant POST /points/scroll
- delete_memory_point() (lines 350-370): Use Qdrant POST /points/delete

cat-plugins/discord_bridge/discord_bridge.py:
- Fixed .pop() → .get() (UserMessage is Pydantic BaseModelDict)
- Added before_cat_reads_message logic to prepend [Username says:]
- Message format: [Alice says:] message content

Dockerfile.llamaswap-rocm:
- Lines 37-44: Added conditional check for UI directory
- if [ -d ui ] before npm install && npm run build
- Fixes build failure when llama-swap UI dir doesn't exist

bot/static/index.html:
- Moved tab9 from lines 1554-1688 (outside container)
- To position before container closing divs (now inside)
- Memory tab button at line 673: 🧠 Memories

**Testing & Verification:**
 Per-user isolation verified (Docker exec test)
 Memory stats showing real counts (curl test)
 Memory API working (facts/episodic loading)
 Web UI layout fixed (tab displays correctly)
 All 5 services running (llama-swap, llama-swap-amd, qdrant, cat, bot)
 Username prepending working (message context for LLM)

**Result:** All Phase 3 critical bugs fixed and verified working.
2026-02-07 23:27:15 +02:00
5fe420b7bc Web UI tabs made into two rows 2026-02-07 22:16:01 +02:00
14e1a8df51 Phase 3: Unified Cheshire Cat integration with WebSocket-based per-user isolation
Key changes:
- CatAdapter (bot/utils/cat_client.py): WebSocket /ws/{user_id} for chat
  queries instead of HTTP POST (fixes per-user memory isolation when no
  API keys are configured — HTTP defaults all users to user_id='user')
- Memory management API: 8 endpoints for status, stats, facts, episodic
  memories, consolidation trigger, multi-step delete with confirmation
- Web UI: Memory tab (tab9) with collection stats, fact/episodic browser,
  manual consolidation trigger, and 3-step delete flow requiring exact
  confirmation string
- Bot integration: Cat-first response path with query_llama fallback for
  both text and embed responses, server mood detection
- Discord bridge plugin: fixed .pop() to .get() (UserMessage is a Pydantic
  BaseModelDict, not a raw dict), metadata extraction via extra attributes
- Unified docker-compose: Cat + Qdrant services merged into main compose,
  bot depends_on Cat healthcheck
- All plugins (discord_bridge, memory_consolidation, miku_personality)
  consolidated into cat-plugins/ for volume mount
- query_llama deprecated but functional for compatibility
2026-02-07 20:22:03 +02:00
edb88e9ede fix: Phase 2 integrity review - v2.0.0 rewrite & bugfixes
Memory Consolidation Plugin (828 -> 465 lines):
- Replace SentenceTransformer with cat.embedder.embed_query() for vector consistency
- Fix per-user fact isolation: source=user_id instead of global
- Add duplicate fact detection (_is_duplicate_fact, score_threshold=0.85)
- Remove ~350 lines of dead async run_consolidation() code
- Remove duplicate declarative search in before_cat_sends_message
- Unify trivial patterns into TRIVIAL_PATTERNS frozenset
- Remove all sys.stderr.write debug logging
- Remove sentence-transformers from requirements.txt (no external deps)

Loguru Fix (cheshire-cat/cat/log.py):
- Patch Cat v1.6.2 loguru format to provide default extra fields
- Fixes KeyError: 'original_name' from third-party libs (fastembed)
- Mounted via docker-compose volume

Discord Bridge:
- Copy discord_bridge.py to cat-plugins/ (was empty directory)

Test Results (6/7 pass, 100% fact recall):
- 11 facts extracted, per-user isolation working
- Duplicate detection effective (+2 on 2nd run)
- 5/5 natural language recall queries correct
2026-02-07 19:24:46 +02:00
83c103324c feat: Phase 2 Memory Consolidation - Production Ready
Implements intelligent memory consolidation system with LLM-based fact extraction:

Features:
- Bidirectional memory: stores both user and Miku messages
- LLM-based fact extraction (replaces regex for intelligent pattern detection)
- Filters Miku's responses during fact extraction (only user messages analyzed)
- Trivial message filtering (removes lol, k, ok, etc.)
- Manual consolidation trigger via 'consolidate now' command
- Declarative fact recall with semantic search
- User separation via metadata (user_id, guild_id)
- Tested: 60% fact recall accuracy, 39 episodic memories, 11 facts extracted

Phase 2 Requirements Complete:
 Minimal real-time filtering
 Nightly consolidation task (manual trigger works)
 Context-aware LLM analysis
 Extract declarative facts
 Metadata enrichment

Test Results:
- Episodic memories: 39 stored (user + Miku)
- Declarative facts: 11 extracted from user messages only
- Fact recall accuracy: 3/5 queries (60%)
- Pipeline test: PASS

Ready for production deployment with scheduled consolidation.
2026-02-03 23:17:27 +02:00
323ca753d1 feat: Phase 1 - Discord bridge with unified user identity
Implements unified cross-server memory system for Miku bot:

**Core Changes:**
- discord_bridge plugin with 3 hooks for metadata enrichment
- Unified user identity: discord_user_{id} across servers and DMs
- Minimal filtering: skip only trivial messages (lol, k, 1-2 chars)
- Marks all memories as consolidated=False for Phase 2 processing

**Testing:**
- test_phase1.py validates cross-server memory recall
- PHASE1_TEST_RESULTS.md documents successful validation
- Cross-server test: User says 'blue' in Server A, Miku remembers in Server B 

**Documentation:**
- IMPLEMENTATION_PLAN.md - Complete architecture and roadmap
- Phase 2 (sleep consolidation) ready for implementation

This lays the foundation for human-like memory consolidation.
2026-01-31 18:54:00 +02:00
0a9145728e Ability to play Uno implemented in early stages! 2026-01-30 21:43:20 +02:00
5b1163c7af Removed KV Cache offloading to increase performance 2026-01-30 21:35:07 +02:00
7368ef0cd5 Added Japanese and Bulgarian addressing 2026-01-30 21:34:24 +02:00
38a986658d moved AI generated readmes to readme folder (may delete) 2026-01-27 19:58:26 +02:00
c58b941587 moved AI generated readmes to readme folder (may delete) 2026-01-27 19:57:48 +02:00
0f1c30f757 Added verbose logging to llama-swap-rocm. Not sure if does anything... 2026-01-27 19:57:04 +02:00
55fd3e0953 Cleanup. Moved prototype and testing STT/TTS to 1TB HDD 2026-01-27 19:55:13 +02:00
ecd14cf704 Able to now address Miku in Cyrillic, Kanji and both Kanas, incl. Japanese honorifics 2026-01-27 19:53:18 +02:00
641a5b83e8 Improved Evil Mode toggle to handle edge cases of the pfp and role color change. Japanese swallow model compatible (should be). 2026-01-27 19:52:39 +02:00
c0aaab0c3a Disabled KV cache offloading on llama-server and enabled Flash Attention. Performance gains in the tens. 2026-01-27 19:11:49 +02:00
dca58328e4 Tuned the Japanese mode system prompt and model better 2026-01-23 17:01:47 +02:00
fe0962118b Implemented new Japanese only text mode with WebUI toggle, utilizing a llama3.1 swallow dataset model. Next up is Japanese TTS. 2026-01-23 15:02:36 +02:00
eb03dfce4d refactor: Implement low-latency STT pipeline with speculative transcription
Major architectural overhaul of the speech-to-text pipeline for real-time voice chat:

STT Server Rewrite:
- Replaced RealtimeSTT dependency with direct Silero VAD + Faster-Whisper integration
- Achieved sub-second latency by eliminating unnecessary abstractions
- Uses small.en Whisper model for fast transcription (~850ms)

Speculative Transcription (NEW):
- Start transcribing at 150ms silence (speculative) while still listening
- If speech continues, discard speculative result and keep buffering
- If 400ms silence confirmed, use pre-computed speculative result immediately
- Reduces latency by ~250-850ms for typical utterances with clear pauses

VAD Implementation:
- Silero VAD with ONNX (CPU-efficient) for 32ms chunk processing
- Direct speech boundary detection without RealtimeSTT overhead
- Configurable thresholds for silence detection (400ms final, 150ms speculative)

Architecture:
- Single Whisper model loaded once, shared across sessions
- VAD runs on every 512-sample chunk for immediate speech detection
- Background transcription worker thread for non-blocking processing
- Greedy decoding (beam_size=1) for maximum speed

Performance:
- Previous: 400ms silence wait + ~850ms transcription = ~1.25s total latency
- Current: 400ms silence wait + 0ms (speculative ready) = ~400ms (best case)
- Single model reduces VRAM usage, prevents OOM on GTX 1660

Container Manager Updates:
- Updated health check logic to work with new response format
- Changed from checking 'warmed_up' flag to just 'status: ready'
- Improved terminology from 'warmup' to 'models loading'

Files Changed:
- stt-realtime/stt_server.py: Complete rewrite with Silero VAD + speculative transcription
- stt-realtime/requirements.txt: Removed RealtimeSTT, using torch.hub for Silero VAD
- bot/utils/container_manager.py: Updated health check for new STT response format
- bot/api.py: Updated docstring to reflect new architecture
- backups/: Archived old RealtimeSTT-based implementation

This addresses low latency requirements while maintaining accuracy with configurable
speech detection thresholds.
2026-01-22 22:08:07 +02:00
2934efba22 Implemented experimental real production ready voice chat, relegated old flow to voice debug mode. New Web UI panel for Voice Chat. 2026-01-20 23:06:17 +02:00
362108f4b0 Decided on Parakeet ONNX Runtime. Works pretty great. Realtime voice chat possible now. UX lacking. 2026-01-19 00:29:44 +02:00
0a8910fff8 Changed stt to parakeet — still experiemntal, though performance seems to be better 2026-01-18 03:35:50 +02:00