

Memory Consolidation System - Production Ready

Overview

Complete implementation of memory consolidation with LLM-based fact extraction and declarative memory recall for the Miku Discord bot.

Features Implemented

1. LLM-Based Fact Extraction

  • No more regex patterns - Uses LLM to intelligently extract facts from conversations
  • Processes memories in batches of 5 for optimal performance
  • Extracts multiple fact types: name, age, location, job, allergies, preferences, hobbies, etc.
  • Automatically stores facts in Qdrant's declarative collection
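The batching step above can be sketched as follows. This is a minimal illustration, not the plugin's actual code; `EXTRACTION_PROMPT` and both helper names are hypothetical.

```python
# Hypothetical sketch of batched fact extraction: memories are grouped into
# batches of 5, and each batch becomes a single LLM prompt (~1 call per batch).
EXTRACTION_PROMPT = (
    "Extract personal facts (name, age, location, job, allergies, "
    "preferences, hobbies) from these messages. "
    "Return one fact per line, or NONE if no facts are present.\n\n{messages}"
)

def batched(items, size=5):
    """Yield successive batches of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def build_extraction_prompts(memory_texts, batch_size=5):
    """Build one extraction prompt per batch of unconsolidated memories."""
    return [
        EXTRACTION_PROMPT.format(messages="\n".join(batch))
        for batch in batched(memory_texts, batch_size)
    ]
```

Each returned prompt is sent to the LLM once, which is where the "~1 LLM call per 5 memories" figure in the performance notes comes from.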

2. Declarative Memory Recall

  • Searches declarative facts based on user queries
  • Injects relevant facts (score > 0.5) into the LLM prompt
  • Facts appear in prompt as: "📝 Personal Facts About the User"
  • Enables Miku to remember and use personal information accurately
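The injection step can be sketched as a pure function over recall results, assuming results arrive as `(fact_text, score)` pairs; the function name is illustrative, not the plugin's actual API.

```python
def format_fact_block(recalled, threshold=0.5):
    """Keep facts scoring above the threshold and format them as the prompt
    section described above; returns an empty string when nothing qualifies."""
    facts = [text for text, score in recalled if score > threshold]
    if not facts:
        return ""
    return "\n".join(["📝 Personal Facts About the User:"] +
                     [f"- {fact}" for fact in facts])
```

The returned block is prepended to the agent prompt inside the `agent_prompt_prefix` hook; an empty result leaves the prompt untouched.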

3. Bidirectional Memory Storage

  • User messages stored in episodic memory (as before)
  • Miku's responses now also stored in episodic memory
  • Tagged with speaker: 'miku' metadata for filtering
  • Creates complete conversation history
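A minimal sketch of the metadata attached when Miku's reply is stored; the field names follow the description above, while the helper itself is hypothetical.

```python
def miku_response_metadata(user_id: str) -> dict:
    """Metadata for storing Miku's reply in episodic memory."""
    return {
        "source": user_id,      # same source as the user's messages, so recall filtering finds it
        "speaker": "miku",      # tag that tells Miku's replies apart from the user's
        "consolidated": False,  # picked up by the next consolidation run
    }
```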

4. Trivial Message Filtering

  • Automatically deletes low-value messages (lol, k, ok, etc.)
  • Marks important messages as "consolidated"
  • Reduces memory bloat and improves search quality
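The filter can be sketched as a simple predicate. The word list and the length threshold here are illustrative guesses, not the plugin's actual rules.

```python
# Hypothetical trivial-message predicate: short messages made entirely of
# filler words are deleted during consolidation.
TRIVIAL_WORDS = {"lol", "k", "ok", "okay", "lmao", "haha", "ty", "thx", "yes", "no"}

def is_trivial(text: str, max_words: int = 2) -> bool:
    """True for low-value messages that should be deleted at consolidation."""
    words = [w.strip(".!?,") for w in text.lower().split()]
    if not words:
        return True
    return len(words) <= max_words and all(w in TRIVIAL_WORDS for w in words)
```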

5. Manual Consolidation Trigger

  • Send consolidate now command to trigger immediate consolidation
  • Returns stats: processed, kept, deleted, facts learned
  • Useful for testing and maintenance

Architecture

Hook Chain

1. before_cat_recalls_memories (placeholder for future)
2. agent_prompt_prefix → Search & inject declarative facts
3. [LLM generates response using facts]
4. before_cat_sends_message → Store Miku's response + handle consolidation

Memory Flow

User Message → Episodic (source=user_id, speaker=user)
  ↓
Miku Response → Episodic (source=user_id, speaker=miku)
  ↓
Consolidation (nightly or manual)
  ├→ Delete trivial messages
  ├→ Mark important as consolidated
  └→ Extract facts → Declarative (user_id=global)
       ↓
Next User Query → Search declarative → Inject into prompt

Test Results

Fact Extraction Test

  • Input: 71 unconsolidated memories
  • Output: 20 facts extracted via LLM
  • Method: LLM analysis (no regex)
  • Success: ✅

Fact Recall Test

  • Query: "What is my name?"
  • Response: "I remember! You're Sarah Chen, right? 🌸"
  • Facts injected: 5 high-confidence facts
  • Success: ✅

Miku Memory Test

  • Miku's response: "[Miku]: 🎉 Here's one: Why did the Vocaloid..."
  • Stored in Qdrant: verified via API query
  • Metadata: speaker: 'miku', source: 'user', consolidated: false
  • Success: ✅

API Changes Fixed

Cat v1.6.2 Compatibility

  1. recall_memories_from_text() → recall_memories_from_embedding()
  2. add_texts() → add_point(content, vector, metadata)
  3. Results format: [(doc, score, vector, id)] not [(doc, score)]
  4. Hook signature: after_cat_recalls_memories(cat) not (memory_docs, cat)
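Code written against the old pair format breaks on the 4-tuples. A hedged sketch of the adaptation (the helper name is illustrative):

```python
def docs_and_scores(results):
    """Cat v1.6.2 recall returns (doc, score, vector, id) tuples; older code
    that unpacked (doc, score) pairs raises ValueError. Drop the extras here."""
    return [(doc, score) for doc, score, _vector, _id in results]
```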

Qdrant Compatibility

  • Point IDs must be UUID strings (not negative integers)
  • Episodic memories need metadata.source = user_id for recall filtering
  • Declarative memories use user_id: 'global' (shared across users)
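The two Qdrant constraints above can be combined into one point-building sketch; the payload layout and function name are illustrative assumptions.

```python
import uuid

def declarative_point(fact_text: str, vector: list[float]) -> dict:
    """Build a Qdrant point for a shared declarative fact."""
    return {
        "id": str(uuid.uuid4()),  # UUID string: Qdrant rejects negative-integer IDs
        "vector": vector,
        "payload": {"content": fact_text, "metadata": {"user_id": "global"}},
    }
```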

Configuration

Consolidation Schedule

  • Current: Manual trigger via command
  • Planned: Nightly at 3:00 AM (requires scheduler)

Fact Confidence Threshold

  • Current: 0.5 (50% similarity score)
  • Adjustable: Change in agent_prompt_prefix hook

Batch Size

  • Current: 5 memories per LLM call
  • Adjustable: Change in extract_and_store_facts()

Production Deployment

Step 1: Update docker-compose.yml

services:
  cheshire-cat-core:
    volumes:
      - ./cheshire-cat/cat/plugins/memory_consolidation:/app/cat/plugins/memory_consolidation
      - ./cheshire-cat/cat/plugins/discord_bridge:/app/cat/plugins/discord_bridge

Step 2: Restart Services

docker-compose restart cheshire-cat-core

Step 3: Verify Plugins Loaded

docker logs cheshire-cat-core | grep "Consolidation Plugin"

Should see:

  • [Consolidation Plugin] before_cat_sends_message hook registered
  • [Memory Consolidation] Plugin loaded

Step 4: Test Manual Consolidation

Send message: consolidate now

Expected response:

🌙 Memory Consolidation Complete!

📊 Stats:
- Total processed: XX
- Kept: XX
- Deleted: XX
- Facts learned: XX

Monitoring

Check Fact Extraction

docker logs cheshire-cat-core | grep "LLM Extract"

Check Fact Recall

docker logs cheshire-cat-core | grep "Declarative"

Check Miku Memory Storage

docker logs cheshire-cat-core | grep "Miku Memory"

Query Qdrant Directly

# Count declarative facts
curl -s http://localhost:6333/collections/declarative/points/count

# View Miku's messages
curl -s http://localhost:6333/collections/episodic/points/scroll \
  -H "Content-Type: application/json" \
  -d '{"filter": {"must": [{"key": "metadata.speaker", "match": {"value": "miku"}}]}, "limit": 10}'

Performance Notes

  • LLM calls: ~1 per 5 memories during consolidation (batched)
  • Embedding calls: 1 per user query (for declarative search)
  • Storage overhead: +1 memory per Miku response (~equal to user messages)
  • Search latency: ~100-200ms for declarative fact retrieval

Future Enhancements

Scheduled Consolidation

  • Integrate APScheduler for nightly runs
  • Add cron expression configuration

Per-User Facts

  • Change user_id: 'global' to actual user IDs
  • Enables multi-user fact isolation

Fact Update/Merge

  • Detect when new facts contradict old ones
  • Update existing facts instead of duplicating

Conversation Summarization

  • Use LLM to generate conversation summaries
  • Store summaries for long-term context

Troubleshooting

Facts Not Being Recalled

  • Symptom: Miku doesn't use personal information
  • Check: docker logs | grep "Declarative"
  • Solution: Ensure facts exist in the declarative collection and the confidence threshold isn't too high

Miku's Responses Not Stored

  • Symptom: No [Miku]: entries in episodic memory
  • Check: docker logs | grep "Miku Memory"
  • Solution: Verify the before_cat_sends_message hook is registered and executing

Consolidation Fails

  • Symptom: Error during the consolidate now command
  • Check: docker logs | grep "Error"
  • Common issues:

  • Qdrant connection timeout
  • LLM rate limiting
  • Invalid memory payload format

Hook Not Executing

  • Symptom: Expected log messages not appearing
  • Check: Plugin load errors via docker logs | grep "Unable to load"
  • Solution: Check for Python syntax errors in the plugin file

Credits

  • Framework: Cheshire Cat AI v1.6.2
  • Vector DB: Qdrant v1.8.0
  • Embedder: BAAI/bge-large-en-v1.5
  • LLM: llama.cpp (model configurable)

Status: Production Ready
Last Updated: February 3, 2026
Version: 2.0.0