- Fix silent None return in analyze_image_with_vision exception handler
- Add None/empty guards after vision analysis in bot.py (image, video, GIF, Tenor)
- Route all image/video/GIF responses through Cheshire Cat pipeline (was
calling query_llama directly), enabling episodic memory storage for media
interactions and correct Last Prompt display in Web UI
- Add media_type parameter to cat_adapter.query() and forward as
discord_media_type in WebSocket payload
- Update discord_bridge plugin to read media_type from payload and inject
MEDIA NOTE into system prefix in before_agent_starts hook
- Add _extract_vision_question() helper to strip Discord mentions and bot-name
triggers from user message; pass cleaned question to vision model so specific
questions (e.g. 'what is the person wearing?') go directly to the vision model
instead of the generic 'Describe this image in detail.' fallback
- Pass user_prompt to all analyze_image_with_qwen / analyze_video_with_vision
call sites in bot.py (image, video, GIF, Tenor, embed paths)
- Fix autonomous reaction loops to skip messages that @mention the bot or have
media attachments in DMs, preventing duplicate vision model calls for images
already being processed by the main message handler
- Increase vision max_tokens: images 300->800, video/GIF 400->1000 (no VRAM
impact; KV cache is pre-allocated at model load time)
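A minimal sketch of the question-extraction idea, not the actual helper in bot.py. It assumes mentions arrive in Discord's `<@id>` / `<@!id>` form; the trigger names are hypothetical placeholders, and the fallback string is the generic prompt quoted above.

```python
import re

# Hypothetical trigger names; the real bot-name list is not shown in this log.
BOT_NAME_TRIGGERS = ("miku", "hatsune miku")
FALLBACK_PROMPT = "Describe this image in detail."

# Discord user mentions look like <@1234> or <@!1234>.
_MENTION_RE = re.compile(r"<@!?\d+>")
# Longest names first so "hatsune miku" is removed as a whole.
_NAME_RE = re.compile(
    r"\b(?:"
    + "|".join(re.escape(n) for n in sorted(BOT_NAME_TRIGGERS, key=len, reverse=True))
    + r")\b[,:!]?",
    re.IGNORECASE,
)


def extract_vision_question(content: str) -> str:
    """Strip mentions and bot-name triggers; fall back to the generic prompt."""
    text = _MENTION_RE.sub(" ", content)
    text = _NAME_RE.sub(" ", text)
    text = " ".join(text.split()).strip(" ,")
    return text or FALLBACK_PROMPT
```

With this sketch, a message consisting only of a mention yields the generic prompt, while a specific question is passed through unchanged apart from the stripped trigger.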
- Change bot/memory/*.json to bot/memory/** to properly ignore all
subdirectories (dms/, dm_reports/, profile_pictures/)
- Untrack bot/memory/ files from index (DMs, profile pics, dm reports)
- Untrack cheshire-cat discord_bridge __pycache__/*.pyc from index
- These files are runtime/user data that should never be in version control
Voice conversion pipeline (Soprano TTS → RVC) with Docker support.
Previously tracked as bare gitlink; removed .git/ directories and
absorbed into main repo for unified tracking.
Includes: Soprano TTS, RVC WebUI integration, Docker configs,
WebSocket API, and benchmark scripts.
Updated .gitignore to exclude large model weights (*.pth, *.pt, *.onnx, *.index).
287 files (3.1 GB of ML weights properly excluded via gitignore).
UNO card game web app (Node.js/React) with Miku bot integration.
Previously an independent git repo (fork of mizanxali/uno-online).
Removed .git/ and absorbed into main repo for unified tracking.
Includes bot integration code: botActionExecutor, cardParser,
gameStateBuilder, and server-side bot action support.
37 files, node_modules excluded via local .gitignore.
- Moved 20 root-level markdown files to readmes/
- Includes COMMANDS.md, CONFIG_README.md, all UNO docs, all completion reports
- Added new: MEMORY_EDITOR_FEATURE.md, MEMORY_EDITOR_ESCAPING_FIX.md,
CONFIG_SOURCES_ANALYSIS.md, MCP_TOOL_CALLING_ANALYSIS.md, and others
- Root directory is now clean of documentation clutter
- Moved 8 root-level test scripts + 2 from bot/ to tests/
- Moved run_rocinante_test.sh runner script to tests/
- Added tests/README.md documenting each test's purpose, type, and requirements
- Added test_pfp_context.py and test_rocinante_comparison.py (previously untracked)
- Pre-compile 393 name variants into 4 regex patterns at module load
(was 7,300+ raw re.search() calls per message)
- Strict addressing detection using punctuation context:
START: name at beginning + punctuation (Miku, ... / みく!...)
END: comma + name at end (..., Miku / ...、ミク)
MIDDLE: commas on both sides - vocative (..., Miku, ...)
ALONE: name is the entire message (Miku! / ミクちゃん)
- Rejects mere mentions: 'I like Miku' / 'Miku is cool' no longer trigger
- Script-family-aware pattern generation (Latin, Cyrillic, Japanese)
eliminates nonsensical cross-script combos (e.g. o-みく)
- Word boundary enforcement prevents substring matches (mikumiku)
- Fixes regex 'unbalanced parenthesis' errors from old implementation
- Add comprehensive test suite (94 cases, all passing)
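The four addressing contexts can be illustrated with a small sketch. This is an assumption-laden reduction, not the shipped implementation: it uses four hypothetical name variants instead of the 393 generated ones, and a simplified punctuation set.

```python
import re

# Hypothetical subset of name variants; the real module generates 393 of them.
NAMES = ("miku", "みく", "ミク", "ミクちゃん")
_ALT = "|".join(re.escape(n) for n in sorted(NAMES, key=len, reverse=True))
_COMMA = "[,、]"
_ANY_PUNCT = "[,!?.:、!?。]"
_TRAILING = "[!?.!?。]*"

# Compiled once at module load, then reused for every message.
START = re.compile(rf"^\s*(?:{_ALT})\s*{_ANY_PUNCT}", re.IGNORECASE)
END = re.compile(rf"{_COMMA}\s*(?:{_ALT})\s*{_TRAILING}\s*$", re.IGNORECASE)
MIDDLE = re.compile(rf"{_COMMA}\s*(?:{_ALT})\s*{_COMMA}", re.IGNORECASE)
ALONE = re.compile(rf"^\s*(?:{_ALT})\s*{_TRAILING}\s*$", re.IGNORECASE)


def is_addressed(message: str) -> bool:
    """True only when the name appears in a vocative (addressing) position."""
    return any(p.search(message) for p in (START, END, MIDDLE, ALONE))
```

Because every pattern requires punctuation or a message boundary directly around the name, plain mentions and substring hits such as `mikumiku` fall through all four checks.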
- discord_bridge before_agent_starts now checks evil_mode from
working_memory to load the correct personality files:
Normal: miku_lore/prompt/lyrics + /app/moods/{mood}.txt
Evil: evil_miku_lore/prompt/lyrics + /app/moods/evil/{mood}.txt
- Reads files directly instead of relying on cross-plugin working_memory
- cat_client.query() returns (response, full_prompt) tuple
- Full prompt includes system prefix + recalled memories + conversation
- API /prompt/cat returns full_prompt field
Bot was calling restore_evil_cat_state() in on_ready() before Cheshire
Cat finished booting (~25s), causing all plugin toggle API calls to fail
silently. Evil Miku plugin was left disabled and the bot used Cat's
default personality instead.
Changes:
- cat_client.py: add wait_for_ready() that polls Cat health endpoint
every 5s for up to 120s before attempting any admin API calls
- evil_mode.py: rewrite restore_evil_cat_state() with:
- wait_for_ready() gate before any plugin/model switching
- 3-second extra delay after Cat is up (plugin registry fully loaded)
- up to 3 retries on failure
- post-switch verification that the correct plugins are actually active
Also fixes helcyon model references that leaked into the container image
(cat_client.py was switching Cat's LLM to 'helcyon' which has no
llama-swap handler; reverted to correct 'darkidol' / 'llama3.1').
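A rough sketch of the readiness gate and retry flow described above. The health probe, the switch step, and the verification step are passed in as hypothetical callables because the actual Cat endpoints and client code are not shown here; only the timings (5s poll, 120s limit, 3s settle, 3 retries) come from the description.

```python
import asyncio
from typing import Awaitable, Callable

AsyncBool = Callable[[], Awaitable[bool]]
AsyncStep = Callable[[], Awaitable[None]]


async def wait_for_ready(is_healthy: AsyncBool, interval: float = 5.0,
                         timeout: float = 120.0) -> bool:
    """Poll is_healthy() every `interval` seconds until True or `timeout` elapses."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        try:
            if await is_healthy():
                return True
        except Exception:
            pass  # service not accepting connections yet; treat as "not ready"
        if loop.time() + interval > deadline:
            return False
        await asyncio.sleep(interval)


async def restore_state(is_healthy: AsyncBool, apply_switch: AsyncStep,
                        verify: AsyncBool, *, retries: int = 3,
                        settle: float = 3.0, interval: float = 5.0,
                        timeout: float = 120.0) -> bool:
    """Gate on readiness, wait for the registry to settle, then switch with retries."""
    if not await wait_for_ready(is_healthy, interval, timeout):
        return False
    await asyncio.sleep(settle)  # extra delay so the plugin registry is fully loaded
    for _ in range(retries):
        try:
            await apply_switch()
            if await verify():  # post-switch check that the right plugins are active
                return True
        except Exception:
            pass  # failed attempt; fall through to the next retry
    return False
```

Returning a boolean instead of raising keeps the caller in on_ready() simple: it can log a failure and continue with the default personality.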
Show a CSS spinner overlay when switching to Autonomous Stats (tab6),
Memories (tab9), and DM Management (tab10). Spinner only shows on
first visit when content is empty, removed after data loads.
Replace hardcoded <option> lists in #mood (tab1 DM mood) and
#chat-mood-select (tab7 chat mood) with empty selects populated
by populateMoodDropdowns(). Respects evil mode emoji mapping.
Called on DOMContentLoaded and after server cards render.
Convert 47 raw fetch+response.json+error-handling patterns to use the
centralized apiCall() utility. The 11 remaining raw fetch() calls are
FormData uploads or SSE streaming that require direct fetch access.
- Extract initTabState, initTabWheelScroll, initVisibilityPolling,
initChatImagePreview, initModalAccessibility as named functions
- Move polling interval vars to outer scope so all init functions can access them
- Single DOMContentLoaded calls all init functions in logical order
- Replace scattered listeners with comment markers at original locations
- Escape sender name via escapeHtml in innerHTML template
- Set message content via textContent instead of innerHTML injection
- Prevents HTML/script injection from user input or LLM responses
- Escape key closes any open memory modal
- Clicking the dark backdrop behind a modal closes it
- Add role=dialog, aria-modal, aria-label for accessibility
First block of conversation-view, conversations-list, conversation-message,
message-header, sender, timestamp, message-content, message-attachments was
silently overridden by identical selectors defined later. Kept the unique
reaction/delete-button styles.
- Cancel previous timer before starting new one (prevents early dismissal)
- Add green background for type='success' notifications
- Bump z-index from 1000 to 3000 so notifications show above modals
- Add fade-out transition for smoother dismissal
- Replace raw setInterval with startPolling/stopPolling functions
- Add visibilitychange listener to pause when tab is hidden
- Immediately refresh data when tab becomes visible again
- Saves bandwidth and CPU when the dashboard is in background
- Add data-tab attributes to tab buttons for reliable identification
- Replace implicit window.event usage with querySelector by data-tab
- Save active tab to localStorage on switch, restore on page load
- miku-stt: switch PyTorch CUDA -> CPU-only (~2.5 GB savings)
- Silero VAD already runs on CPU via ONNX (onnx=True), so the CUDA build of PyTorch was wasted space
- faster-whisper/CTranslate2 uses CUDA directly, no PyTorch GPU needed
- torch+torchaudio layer: 3.3 GB -> 796 MB; total image 9+ GB -> 6.83 GB
- Tested: Silero VAD loads (ONNX), Whisper loads on cuda, server ready
- llama-swap-rocm: add root .dockerignore to fix 31 GB build context
- Dockerfile clones all sources from git, never COPYs from context
- 19 GB of GGUF model files were being transferred on every build
- Now excludes everything (*), near-zero context transfer
- anime-face-detector: add .dockerignore to exclude accumulated outputs
- api/outputs/ (56 accumulated detection files) no longer baked into image
- api/__pycache__/ and images/ also excluded
- .gitignore: remove .dockerignore exclusion so these files are tracked
- Fixed missing client parameter in animated GIF webhook update path
- Added get_persona_avatar_urls() helper that returns bot's current Discord
avatar URL for Miku persona (always fresh, no cache lag)
- Pass avatar_url on every webhook.send() call in bipolar_mode.py,
persona_dialogue.py, and api.py so avatars always match current pfp
regardless of webhook cache state
- miku-bot: Re-add scikit-learn to requirements.txt (needed for vision color extraction)
- miku-stt: Upgrade from CUDA 12.6.2 to 12.8.1, PyTorch 2.5.1 to 2.7.1 per RealtimeSTT PR #295
- miku-stt: Use Ubuntu 24.04 with Python 3.12 (single installation, no dual Python)
- miku-stt: Add requirements-gpu-torch.txt for separate PyTorch installation
- miku-stt: Use --break-system-packages flag for Ubuntu 24.04 pip compatibility
#16 Timezone consistency — added TZ=Europe/Sofia to docker-compose.yml
so datetime.now() returns local time inside the container. Removed
the +3 hour hack from get_time_of_day(). All three time-of-day
consumers (autonomous_v1_legacy, moods, autonomous_engine) now
use the same correct local hour automatically.
#17 Decay truncation — replaced int() with round() in decay_events()
so a counter of 1 survives one more 15-minute cycle instead of
being immediately zeroed (round(0.841)=1 vs int(0.841)=0).
#20 Unpersisted rate limiter — _last_action_execution dict in
autonomous.py is now seeded from the engine's persisted
server_last_action on import, so restarts don't bypass the
30-second cooldown.
Note: #18 (dead config fields) was a false positive — autonomous_interval_minutes
IS used by the scheduler. #19 deferred to bipolar mode rework.
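The truncation fix in #17 comes down to one step of arithmetic, sketched below. The factor 0.841 is taken from the example above and treated as a fixed per-cycle constant, which is an assumption; the function names are illustrative only.

```python
# Per-cycle decay factor implied by the example above; assumed constant here.
DECAY_FACTOR = 0.841


def decay_truncating(count: int) -> int:
    """Old behaviour: int() drops the fractional part, so 1 becomes 0 at once."""
    return int(count * DECAY_FACTOR)


def decay_rounding(count: int) -> int:
    """New behaviour: round() goes to the nearest integer, so 1 stays 1 this cycle."""
    return round(count * DECAY_FACTOR)
```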
#10 Redundant coin flip in join_conversation — removed the 50% random
gate that duplicated the V2 engine's own decision to act.
#11 Message-triggered actions skip _autonomous_paused — _check_and_act
and _check_and_react now bail out immediately when the autonomous
system is paused (voice session), matching the scheduled-tick path.
#12 Duplicate emoji dictionaries — removed MOOD_EMOJIS and
EVIL_MOOD_EMOJIS from globals.py (had different emojis from moods.py).
bipolar_mode.py and evil_mode.py now import the canonical dicts
from utils/moods.py so all code sees the same emojis.
#13 DM mood can spontaneously become 'asleep' — rotate_dm_mood() now
filters 'asleep' out of the candidate list since DMs have no
sleepy-to-asleep transition guard and no wakeup timer.
#15 Engage-user fallback misreports action type — log level raised to
WARNING with an explicit [engage_user->general] prefix so the
cooldown-triggered fallback is visible in logs.
#4 Sleep/mood desync — set_server_mood() now clears is_sleeping when
mood changes away from 'asleep', preventing ghost-sleep state.
#5 Race condition in _check_and_act — added per-guild asyncio.Lock so
overlapping ticks + message-triggered calls cannot fire concurrently.
#6 Class-level attrs on ServerConfig — sleepy_responses_left,
angry_wakeup_timer, and forced_angry_until are now proper dataclass
fields with defaults, so asdict()/from_dict() round-trip correctly.
Also strips unknown keys in from_dict() to survive schema changes.
#7 Persistence decay_factor crash — initialise decay_factor = 1.0
before the loop so empty-server or zero-downtime paths don't
raise NameError.
#8 Double record_action — removed the redundant call in
autonomous_tick_v2(); only _check_and_act records the action now.
#9 Engine mood desync — on_mood_change() is now called inside
set_server_mood() (single source of truth) and removed from 4
call-sites in api.py, moods.py, and server_manager wakeup task.
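For #6, a sketch of why proper dataclass fields matter for persistence. The three field names come from the item above; their types and defaults, and the extra `guild_id` and `current_mood` fields, are assumptions for illustration.

```python
from dataclasses import asdict, dataclass, fields
from typing import Any, Optional


@dataclass
class ServerConfig:
    guild_id: int                                 # hypothetical field
    current_mood: str = "neutral"                 # hypothetical field
    # Declared as fields (not class-level attributes) so asdict() includes them.
    sleepy_responses_left: Optional[int] = None   # type/default assumed
    angry_wakeup_timer: Optional[float] = None    # type/default assumed
    forced_angry_until: Optional[float] = None    # type/default assumed

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "ServerConfig":
        """Build a config, ignoring keys that are no longer part of the schema."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})
```

Filtering against `fields(cls)` means a saved file containing a removed or renamed key loads cleanly instead of raising `TypeError` on an unexpected keyword argument.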
1. Momentum cliff at 10 messages (P0): The conversation momentum formula
had a discontinuity where the 10th message caused momentum to DROP from
0.9 to 0.5. Replaced with a smooth log1p curve that monotonically
increases (0→0→0.20→0.32→...→0.70→0.89→1.0 at 30 msgs).
2. Neutral keywords overriding all moods (P0): detect_mood_shift() checked
neutral early with generic keywords (okay, sure, hmm) that matched
almost any response, constantly resetting mood to neutral. Now: all
specific moods are scored by match count first (best-match wins),
neutral is only checked as fallback and requires 2+ keyword matches.
3. Uncancellable delayed_wakeup tasks (P0): Fire-and-forget sleep tasks
could stack and overwrite mood state after manual wake-up. Added a
centralized wakeup task registry in ServerManager with automatic
cancellation on manual wake or new sleep cycle.
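One curve that reproduces the momentum values quoted in item 1 is log1p(n) / log1p(30), if those values are read as samples at 1, 2, 10, 20, and 30 messages. Whether the actual code uses exactly this normalization is an assumption; the sketch only shows that such a curve is smooth and monotonic.

```python
import math

# Message count at which momentum saturates at 1.0, per the description above.
SATURATION_MESSAGES = 30


def conversation_momentum(message_count: int) -> float:
    """Smooth, monotonically increasing momentum in [0, 1]."""
    if message_count <= 0:
        return 0.0
    return min(1.0, math.log1p(message_count) / math.log1p(SATURATION_MESSAGES))
```

Unlike a piecewise formula, this has no threshold at which one extra message can lower the result.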
- Added manual_trigger parameter to /autonomous/engage endpoint to bypass 12h cooldown
- Updated miku_engage_random_user_for_server() and miku_engage_random_user() to accept manual_trigger flag
- Modified Web UI to always send manual_trigger=true when engaging users from the UI
- Users can now manually engage the same user multiple times from web UI without cooldown restriction
- Regular autonomous schedules still respect the 12h cooldown between engagements to the same user
Changes:
- bot/api.py: Added manual_trigger parameter with string-to-boolean conversion
- bot/static/index.html: Added manual_trigger=true to engage user request
- bot/utils/autonomous_v1_legacy.py: Added manual_trigger parameter and cooldown bypass logic
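A hedged sketch of the two pieces involved: converting the query-string value to a boolean and bypassing the cooldown. The accepted truthy spellings and both function names are assumptions; only the 12h cooldown and the bypass behaviour come from the description above.

```python
from typing import Optional

COOLDOWN_SECONDS = 12 * 3600  # 12h cooldown between engagements of the same user
_TRUTHY = {"true", "1", "yes", "on"}  # accepted spellings are an assumption


def parse_bool(value: object) -> bool:
    """Convert a query-string style value such as 'true' to a real bool."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in _TRUTHY


def is_on_cooldown(last_engaged_at: Optional[float], now: float,
                   manual_trigger: bool = False) -> bool:
    """Manual triggers always bypass the cooldown; scheduled runs respect it."""
    if manual_trigger:
        return False
    return last_engaged_at is not None and (now - last_engaged_at) < COOLDOWN_SECONDS
```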
- Add COPY config_manager.py to Dockerfile so it's included in the image
- Add 'config_manager' to logger COMPONENTS list to enable logging
Fixes the ModuleNotFoundError and ValueError when importing config_manager
Add restore_runtime_settings() to ConfigManager that reads config_runtime.yaml
on startup and restores persisted values into globals:
- LANGUAGE_MODE, AUTONOMOUS_DEBUG, VOICE_DEBUG_MODE
- USE_CHESHIRE_CAT, PREFER_AMD_GPU, DM_MOOD
Add missing persistence calls to API endpoints:
- POST /language/set now persists to config_runtime.yaml
- POST /voice/debug-mode now persists to config_runtime.yaml
- POST /memory/toggle now persists to config_runtime.yaml
Call restore_runtime_settings() in on_ready() after evil/bipolar restore.
Resolves #22
Replace the minimal sync-only shutdown (which only saved autonomous state)
with a comprehensive async graceful_shutdown() coroutine that:
1. Ends active voice sessions (disconnect, release GPU locks, cleanup audio)
2. Saves autonomous engine state
3. Stops the APScheduler
4. Cancels all tracked background tasks (from task_tracker)
5. Closes the Discord gateway connection
Signal handlers (SIGTERM/SIGINT) now schedule the async shutdown on the
running event loop. The atexit handler is kept as a last-resort sync fallback.
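The shutdown flow can be sketched as an ordered list of async steps plus signal handlers that schedule it on the running loop. The step callables are hypothetical stand-ins for the five stages listed above, and continuing past a failed step is a design assumption, not something the commit states.

```python
import asyncio
import signal
from typing import Awaitable, Callable, Iterable, Tuple

Step = Tuple[str, Callable[[], Awaitable[None]]]


async def graceful_shutdown(steps: Iterable[Step]) -> list[str]:
    """Run each named step in order; a failing step does not block later ones."""
    completed = []
    for name, step in steps:
        try:
            await step()
            completed.append(name)
        except Exception as exc:  # assumption: log and keep shutting down
            print(f"shutdown step {name!r} failed: {exc}")
    return completed


def install_signal_handlers(loop: asyncio.AbstractEventLoop,
                            steps: Iterable[Step]) -> None:
    """Schedule graceful_shutdown() on the loop for SIGTERM/SIGINT (Unix only)."""
    steps = list(steps)
    for sig in (signal.SIGTERM, signal.SIGINT):
        loop.add_signal_handler(
            sig, lambda: loop.create_task(graceful_shutdown(steps))
        )
```

Scheduling a task from the handler, rather than doing the work inside it, lets the async cleanup run on the same loop that owns the voice sessions and gateway connection.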
Resolves #5; also addresses #4 (voice cleanup at shutdown)