Memory Consolidation System - Production Ready ✅
Overview
Complete implementation of memory consolidation with LLM-based fact extraction and declarative memory recall for the Miku Discord bot.
Features Implemented
✅ 1. LLM-Based Fact Extraction
- No more regex patterns - Uses LLM to intelligently extract facts from conversations
- Processes memories in batches of 5 for optimal performance
- Extracts multiple fact types: name, age, location, job, allergies, preferences, hobbies, etc.
- Automatically stores facts in Qdrant's declarative collection
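The batching step can be sketched as below. This is a minimal illustration, not the plugin's code: batch_memories and build_extraction_prompt are hypothetical names, and the prompt wording is an assumption.

```python
from typing import Iterator, List

BATCH_SIZE = 5  # batch size described above


def batch_memories(memories: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` memories."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(memories), size):
        yield memories[start:start + size]


def build_extraction_prompt(batch: List[str]) -> str:
    """Build one LLM prompt per batch (the wording here is illustrative only)."""
    numbered = "\n".join(f"{i}. {text}" for i, text in enumerate(batch, start=1))
    return (
        "Extract personal facts (name, age, location, job, allergies, "
        "preferences, hobbies) from these messages, one fact per line:\n"
        + numbered
    )
```

Each yielded batch would be sent to the LLM as a single call, so 71 memories would need 15 calls at this batch size.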
✅ 2. Declarative Memory Recall
- Searches declarative facts based on user queries
- Injects relevant facts (score > 0.5) into the LLM prompt
- Facts appear in prompt as: "📝 Personal Facts About the User"
- Enables Miku to remember and use personal information accurately
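A rough sketch of the filter-and-inject step, assuming search results arrive as (fact, score) pairs. build_facts_section is a hypothetical helper, the bullet layout under the header is an assumption, and the sample facts in the usage note are made up.

```python
from typing import List, Tuple

SCORE_THRESHOLD = 0.5  # facts must score strictly above this to be injected
FACTS_HEADER = "📝 Personal Facts About the User"


def build_facts_section(scored_facts: List[Tuple[str, float]],
                        threshold: float = SCORE_THRESHOLD) -> str:
    """Return a prompt section listing facts above the threshold, best first.

    Returns an empty string when no fact qualifies, leaving the prompt unchanged.
    """
    relevant = [(fact, score) for fact, score in scored_facts if score > threshold]
    if not relevant:
        return ""
    relevant.sort(key=lambda pair: pair[1], reverse=True)
    lines = [f"- {fact}" for fact, _ in relevant]
    return FACTS_HEADER + "\n" + "\n".join(lines)
```

For example, build_facts_section([("Name is Sarah Chen", 0.82), ("Likes tea", 0.5)]) keeps only the first fact, because a score of exactly 0.5 does not pass the strict comparison.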
✅ 3. Bidirectional Memory Storage
- User messages stored in episodic memory (as before)
- Miku's responses now also stored in episodic memory
- Tagged with speaker: 'miku' metadata for filtering
- Creates complete conversation history
✅ 4. Trivial Message Filtering
- Automatically deletes low-value messages (lol, k, ok, etc.)
- Marks important messages as "consolidated"
- Reduces memory bloat and improves search quality
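One way the trivial-message check could look. This is a sketch under assumptions: is_trivial is a hypothetical name, and the token set only contains the examples listed above, so the plugin's real list may be longer.

```python
import re

# Illustrative list only; the plugin's actual set of trivial messages may differ.
TRIVIAL_MESSAGES = {"lol", "k", "ok"}


def is_trivial(message: str) -> bool:
    """Return True for empty messages or messages that are only a trivial token."""
    # Lowercase, then drop punctuation and symbols while keeping Unicode letters,
    # digits, and whitespace, so non-English text is not mistaken for noise.
    normalized = re.sub(r"[^\w\s]", "", message.lower()).strip()
    return normalized == "" or normalized in TRIVIAL_MESSAGES
```

A message that merely starts with a trivial token, such as "ok, my name is Sarah", is kept because the whole normalized message has to match.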
✅ 5. Manual Consolidation Trigger
- Send the consolidate now command to trigger immediate consolidation
- Returns stats: processed, kept, deleted, facts learned
- Useful for testing and maintenance
Architecture
Hook Chain
1. before_cat_recalls_memories (placeholder for future)
2. agent_prompt_prefix → Search & inject declarative facts
3. [LLM generates response using facts]
4. before_cat_sends_message → Store Miku's response + handle consolidation
Memory Flow
User Message → Episodic (source=user_id, speaker=user)
↓
Miku Response → Episodic (source=user_id, speaker=miku)
↓
Consolidation (nightly or manual)
├→ Delete trivial messages
├→ Mark important as consolidated
└→ Extract facts → Declarative (user_id=global)
↓
Next User Query → Search declarative → Inject into prompt
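The delete-and-mark part of this flow can be sketched over plain dicts instead of Qdrant points. Fact extraction is left out. The consolidate function, the record layout, and the caller-supplied is_trivial predicate are all assumptions made for illustration.

```python
from typing import Callable, Dict, List


def consolidate(memories: List[Dict], is_trivial: Callable[[str], bool]) -> Dict[str, int]:
    """Delete trivial memories in place, mark the rest consolidated, return stats."""
    kept: List[Dict] = []
    processed = 0
    for memory in memories:
        if memory["metadata"].get("consolidated"):
            kept.append(memory)  # handled by an earlier run, leave untouched
            continue
        processed += 1
        if is_trivial(memory["content"]):
            continue  # trivial: drop it
        memory["metadata"]["consolidated"] = True
        kept.append(memory)
    deleted = len(memories) - len(kept)
    memories[:] = kept  # apply deletions to the caller's list
    return {"processed": processed, "kept": processed - deleted, "deleted": deleted}
```

Memories already marked as consolidated are skipped and not counted, so repeated runs only touch new messages.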
Test Results
Fact Extraction Test
- Input: 71 unconsolidated memories
- Output: 20 facts extracted via LLM
- Method: LLM analysis (no regex)
- Success: ✅
Fact Recall Test
Query: "What is my name?"
Response: "I remember! You're Sarah Chen, right? 🌸"
Facts Injected: 5 high-confidence facts
Success: ✅
Miku Memory Test
Miku's Response: "[Miku]: 🎉 Here's one: Why did the Vocaloid..."
Stored in Qdrant: ✅ (verified via API query)
Metadata: speaker: 'miku', source: 'user', consolidated: false
Success: ✅
API Changes Fixed
Cat v1.6.2 Compatibility
- recall_memories_from_text() → recall_memories_from_embedding()
- add_texts() → add_point(content, vector, metadata)
- Results format: [(doc, score, vector, id)] not [(doc, score)]
- Hook signature: after_cat_recalls_memories(cat) not (memory_docs, cat)
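Code written for the old two-element results breaks when it unpacks the new four-element tuples. A small adapter like the one below tolerates both shapes. docs_and_scores is a hypothetical helper, and the sample tuples in the usage note are stand-ins for real recall results.

```python
from typing import Any, List, Sequence, Tuple


def docs_and_scores(results: Sequence[Tuple[Any, ...]]) -> List[Tuple[Any, float]]:
    """Reduce recall results to (doc, score) pairs.

    Works for the newer (doc, score, vector, id) tuples and the older
    (doc, score) pairs, because only the first two positions are read.
    """
    return [(item[0], item[1]) for item in results]
```

For instance, docs_and_scores([("doc A", 0.91, [0.1, 0.2], "id-1")]) gives [("doc A", 0.91)].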
Qdrant Compatibility
- Point IDs must be UUID strings (not negative integers)
- Episodic memories need metadata.source = user_id for recall filtering
- Declarative memories use user_id: 'global' (shared across users)
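A sketch of an episodic point that satisfies these constraints, built as a plain dict. build_point is a hypothetical helper and the page_content key is an assumption about the payload layout. The metadata keys are the ones used elsewhere in this document.

```python
import uuid
from typing import Any, Dict, List


def build_point(content: str, vector: List[float], user_id: str, speaker: str) -> Dict[str, Any]:
    """Assemble an episodic point with a UUID string ID and filterable metadata."""
    return {
        "id": str(uuid.uuid4()),  # UUID string, never a negative integer
        "vector": vector,
        "payload": {
            "page_content": content,  # assumed payload key for the message text
            "metadata": {
                "source": user_id,   # required for per-user recall filtering
                "speaker": speaker,  # 'user' or 'miku'
                "consolidated": False,
            },
        },
    }
```

Generating the ID with uuid4 avoids collisions without having to track a counter across restarts.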
Configuration
Consolidation Schedule
- Current: Manual trigger via command
- Planned: Nightly at 3:00 AM (requires scheduler)
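Until a scheduler is integrated, the nightly run time can be computed with the standard library alone. This is a sketch with a hypothetical next_run helper, and it assumes naive local datetimes with no timezone handling.

```python
from datetime import datetime, time, timedelta


def next_run(now: datetime, at: time = time(3, 0)) -> datetime:
    """Return the next datetime strictly after `now` whose wall-clock time is `at`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has passed, use tomorrow's
    return candidate
```

A background task could sleep for (next_run(datetime.now()) - datetime.now()).total_seconds() and then trigger consolidation.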
Fact Confidence Threshold
- Current: 0.5 (50% similarity score)
- Adjustable: Change in the agent_prompt_prefix hook
Batch Size
- Current: 5 memories per LLM call
- Adjustable: Change in extract_and_store_facts()
Production Deployment
Step 1: Update docker-compose.yml
services:
cheshire-cat-core:
volumes:
- ./cheshire-cat/cat/plugins/memory_consolidation:/app/cat/plugins/memory_consolidation
- ./cheshire-cat/cat/plugins/discord_bridge:/app/cat/plugins/discord_bridge
Step 2: Restart Services
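docker-compose up -d cheshire-cat-core
Use up -d rather than restart here: restart does not apply volume changes made in docker-compose.yml.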
Step 3: Verify Plugins Loaded
docker logs cheshire-cat-core | grep "Consolidation Plugin"
Should see:
- ✅ [Consolidation Plugin] before_cat_sends_message hook registered
- ✅ [Memory Consolidation] Plugin loaded
Step 4: Test Manual Consolidation
Send message: consolidate now
Expected response:
🌙 Memory Consolidation Complete!
📊 Stats:
- Total processed: XX
- Kept: XX
- Deleted: XX
- Facts learned: XX
Monitoring
Check Fact Extraction
docker logs cheshire-cat-core | grep "LLM Extract"
Check Fact Recall
docker logs cheshire-cat-core | grep "Declarative"
Check Miku Memory Storage
docker logs cheshire-cat-core | grep "Miku Memory"
Query Qdrant Directly
# Count declarative facts
curl -s -X POST http://localhost:6333/collections/declarative/points/count \
-H "Content-Type: application/json" \
-d '{"exact": true}'
# View Miku's messages
curl -s http://localhost:6333/collections/episodic/points/scroll \
-H "Content-Type: application/json" \
-d '{"filter": {"must": [{"key": "metadata.speaker", "match": {"value": "miku"}}]}, "limit": 10}'
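The same scroll query can be issued from Python with only the standard library. This is a sketch: speaker_scroll_body and scroll_episodic are hypothetical helper names, and the URL assumes Qdrant is reachable on localhost:6333, as in the curl commands.

```python
import json
import urllib.request

QDRANT_URL = "http://localhost:6333"  # same host and port as the curl commands


def speaker_scroll_body(speaker: str, limit: int = 10) -> dict:
    """Build the scroll request body that filters on metadata.speaker."""
    return {
        "filter": {"must": [{"key": "metadata.speaker", "match": {"value": speaker}}]},
        "limit": limit,
    }


def scroll_episodic(speaker: str, limit: int = 10) -> dict:
    """POST the scroll request to the episodic collection and decode the JSON reply."""
    request = urllib.request.Request(
        f"{QDRANT_URL}/collections/episodic/points/scroll",
        data=json.dumps(speaker_scroll_body(speaker, limit)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling scroll_episodic("miku") needs a running Qdrant instance. speaker_scroll_body can be checked offline.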
Performance Notes
- LLM calls: ~1 per 5 memories during consolidation (batched)
- Embedding calls: 1 per user query (for declarative search)
- Storage overhead: +1 memory per Miku response, roughly doubling episodic storage compared with storing user messages only
- Search latency: ~100-200ms for declarative fact retrieval
Future Enhancements
Scheduled Consolidation
- Integrate APScheduler for nightly runs
- Add cron expression configuration
Per-User Facts
- Change user_id: 'global' to actual user IDs
- Enables multi-user fact isolation
Fact Update/Merge
- Detect when new facts contradict old ones
- Update existing facts instead of duplicating
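One possible shape for this enhancement, shown over an in-memory dict. It assumes each fact type holds a single value, which would not suit multi-valued types such as hobbies. merge_fact is a hypothetical name, not an existing plugin function.

```python
from typing import Dict


def merge_fact(facts: Dict[str, str], fact_type: str, value: str) -> str:
    """Insert or update one fact; return 'added', 'updated', or 'unchanged'."""
    current = facts.get(fact_type)
    if current is None:
        facts[fact_type] = value
        return "added"
    # Compare case-insensitively so "Tokyo" and "tokyo" are not seen as a conflict.
    if current.strip().lower() == value.strip().lower():
        return "unchanged"
    facts[fact_type] = value  # the newer statement replaces the contradicted one
    return "updated"
```

A fuller version would probably keep the replaced value with a timestamp instead of discarding it.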
Conversation Summarization
- Use LLM to generate conversation summaries
- Store summaries for long-term context
Troubleshooting
Facts Not Being Recalled
Symptom: Miku doesn't use personal information
Check: docker logs cheshire-cat-core | grep "Declarative"
Solution: Ensure facts exist in the declarative collection and that the confidence threshold isn't too high
Miku's Responses Not Stored
Symptom: No [Miku]: entries in episodic memory
Check: docker logs cheshire-cat-core | grep "Miku Memory"
Solution: Verify before_cat_sends_message hook is registered and executing
Consolidation Fails
Symptom: Error during consolidate now command
Check: docker logs cheshire-cat-core | grep "Error"
Common Issues:
- Qdrant connection timeout
- LLM rate limiting
- Invalid memory payload format
Hook Not Executing
Symptom: Expected log messages not appearing
Check: Look for plugin load errors: docker logs cheshire-cat-core | grep "Unable to load"
Solution: Check for Python syntax errors in plugin file
Credits
- Framework: Cheshire Cat AI v1.6.2
- Vector DB: Qdrant v1.8.0
- Embedder: BAAI/bge-large-en-v1.5
- LLM: llama.cpp (model configurable)
Status: ✅ Production Ready
Last Updated: February 3, 2026
Version: 2.0.0