Commit Graph

4 Commits

Author SHA1 Message Date
fbd940e711 fix: Restore declarative memory recall by preserving suffix template
Root cause: The miku_personality plugin's agent_prompt_suffix hook was returning
an empty string, which wiped out the {declarative_memory} and {episodic_memory}
placeholders from the prompt template. This caused the LLM to never receive any
stored facts about users, resulting in hallucinated responses.

Changes:
- miku_personality: Changed agent_prompt_suffix to return the memory context
  section with {episodic_memory}, {declarative_memory}, and {tools_output}
  placeholders instead of empty string

- discord_bridge: Added before_cat_recalls_declarative_memories hook to increase
  k-value from 3 to 10 and lower threshold from 0.7 to 0.5 for better fact
  retrieval. Added agent_prompt_prefix to emphasize factual accuracy. Added
  debug logging via before_agent_starts hook.

Result: Miku now correctly recalls user facts (favorite songs, games, etc.)
from declarative memory with 100% accuracy.

Tested with:
- 'What is my favorite song?' → Correctly answers 'Monitoring (Best Friend Remix) by DECO*27'
- 'Do you remember my favorite song?' → Correctly recalls the song
- 'What is my favorite video game?' → Correctly answers 'Sonic Adventure'
2026-02-09 12:33:31 +02:00
11b90ebb46 fix: Phase 3 bug fixes - memory APIs, username visibility, web UI layout, Docker
**Critical Bug Fixes:**

1. Per-user memory isolation bug
   - Changed CatAdapter from HTTP POST to WebSocket /ws/{user_id}
   - User_id now comes from URL path parameter (true per-user isolation)
   - Verified: Different users can't see each other's memories

2. Memory API 405 errors
   - Replaced non-existent Cat endpoint calls with Qdrant direct queries
   - get_memory_points(): Now uses POST /collections/{collection}/points/scroll
   - delete_memory_point(): Now uses POST /collections/{collection}/points/delete

3. Memory stats showing null counts
   - Reimplemented get_memory_stats() to query Qdrant directly
   - Now returns accurate counts: episodic: 20, declarative: 6, procedural: 4

4. Miku couldn't see usernames
   - Modified discord_bridge before_cat_reads_message hook
   - Prepends [Username says:] to every message text
   - LLM now knows who is texting: [Alice says:] Hello Miku!

5. Web UI Memory tab layout
   - Tab9 was positioned outside .tab-container div (showed to the right)
   - Moved tab9 HTML inside container, before closing divs
   - Memory tab now displays below tab buttons like other tabs

**Code Changes:**

bot/utils/cat_client.py:
- Line 25: Logger name changed to 'llm' (available component)
- get_memory_stats() (lines 256-285): Query Qdrant directly via HTTP GET
- get_memory_points() (lines 275-310): Use Qdrant POST /points/scroll
- delete_memory_point() (lines 350-370): Use Qdrant POST /points/delete

cat-plugins/discord_bridge/discord_bridge.py:
- Fixed .pop() → .get() (UserMessage is Pydantic BaseModelDict)
- Added before_cat_reads_message logic to prepend [Username says:]
- Message format: [Alice says:] message content

Dockerfile.llamaswap-rocm:
- Lines 37-44: Added conditional check for UI directory
- if [ -d ui ] before npm install && npm run build
- Fixes build failure when llama-swap UI dir doesn't exist

bot/static/index.html:
- Moved tab9 from lines 1554-1688 (outside container)
- To position before container closing divs (now inside)
- Memory tab button at line 673: 🧠 Memories

**Testing & Verification:**
 Per-user isolation verified (Docker exec test)
 Memory stats showing real counts (curl test)
 Memory API working (facts/episodic loading)
 Web UI layout fixed (tab displays correctly)
 All 5 services running (llama-swap, llama-swap-amd, qdrant, cat, bot)
 Username prepending working (message context for LLM)

**Result:** All Phase 3 critical bugs fixed and verified working.
2026-02-07 23:27:15 +02:00
14e1a8df51 Phase 3: Unified Cheshire Cat integration with WebSocket-based per-user isolation
Key changes:
- CatAdapter (bot/utils/cat_client.py): WebSocket /ws/{user_id} for chat
  queries instead of HTTP POST (fixes per-user memory isolation when no
  API keys are configured — HTTP defaults all users to user_id='user')
- Memory management API: 8 endpoints for status, stats, facts, episodic
  memories, consolidation trigger, multi-step delete with confirmation
- Web UI: Memory tab (tab9) with collection stats, fact/episodic browser,
  manual consolidation trigger, and 3-step delete flow requiring exact
  confirmation string
- Bot integration: Cat-first response path with query_llama fallback for
  both text and embed responses, server mood detection
- Discord bridge plugin: fixed .pop() to .get() (UserMessage is a Pydantic
  BaseModelDict, not a raw dict), metadata extraction via extra attributes
- Unified docker-compose: Cat + Qdrant services merged into main compose,
  bot depends_on Cat healthcheck
- All plugins (discord_bridge, memory_consolidation, miku_personality)
  consolidated into cat-plugins/ for volume mount
- query_llama deprecated but functional for compatibility
2026-02-07 20:22:03 +02:00
edb88e9ede fix: Phase 2 integrity review - v2.0.0 rewrite & bugfixes
Memory Consolidation Plugin (828 -> 465 lines):
- Replace SentenceTransformer with cat.embedder.embed_query() for vector consistency
- Fix per-user fact isolation: source=user_id instead of global
- Add duplicate fact detection (_is_duplicate_fact, score_threshold=0.85)
- Remove ~350 lines of dead async run_consolidation() code
- Remove duplicate declarative search in before_cat_sends_message
- Unify trivial patterns into TRIVIAL_PATTERNS frozenset
- Remove all sys.stderr.write debug logging
- Remove sentence-transformers from requirements.txt (no external deps)

Loguru Fix (cheshire-cat/cat/log.py):
- Patch Cat v1.6.2 loguru format to provide default extra fields
- Fixes KeyError: 'original_name' from third-party libs (fastembed)
- Mounted via docker-compose volume

Discord Bridge:
- Copy discord_bridge.py to cat-plugins/ (was empty directory)

Test Results (6/7 pass, 100% fact recall):
- 11 facts extracted, per-user isolation working
- Duplicate detection effective (+2 on 2nd run)
- 5/5 natural language recall queries correct
2026-02-07 19:24:46 +02:00