Phase 3: Unified Cheshire Cat integration with WebSocket-based per-user isolation

Key changes:
- CatAdapter (bot/utils/cat_client.py): WebSocket /ws/{user_id} for chat
  queries instead of HTTP POST (fixes per-user memory isolation when no
  API keys are configured — HTTP defaults all users to user_id='user')
- Memory management API: 8 endpoints for status, stats, facts, episodic
  memories, consolidation trigger, multi-step delete with confirmation
- Web UI: Memory tab (tab9) with collection stats, fact/episodic browser,
  manual consolidation trigger, and 3-step delete flow requiring exact
  confirmation string
- Bot integration: Cat-first response path with query_llama fallback for
  both text and embed responses, server mood detection
- Discord bridge plugin: fixed .pop() to .get() (UserMessage is a Pydantic
  BaseModelDict, not a raw dict), metadata extraction via extra attributes
- Unified docker-compose: Cat + Qdrant services merged into main compose,
  bot depends_on Cat healthcheck
- All plugins (discord_bridge, memory_consolidation, miku_personality)
  consolidated into cat-plugins/ for volume mount
- query_llama deprecated but functional for compatibility
This commit is contained in:
2026-02-07 20:22:03 +02:00
parent edb88e9ede
commit 14e1a8df51
14 changed files with 1382 additions and 70 deletions

View File

@@ -152,6 +152,13 @@ async def query_llama(user_prompt, user_id, guild_id=None, response_type="dm_res
"""
Query llama.cpp server via llama-swap with OpenAI-compatible API.
.. deprecated:: Phase 3
For main conversation flow, prefer routing through the Cheshire Cat pipeline
(via cat_client.CatAdapter.query) which provides memory-augmented responses.
This function remains available for specialized use cases (vision, bipolar mode,
image generation, autonomous, sentiment analysis) and as a fallback when Cat
is unavailable.
Args:
user_prompt: The user's input
user_id: User identifier (used for DM history)