HIGH: Add Circuit Breakers for Critical Services #28
Critical external services (Cheshire Cat, LLM, STT) lack circuit breakers, causing cascading failures when services are down.
Where It Occurs
Why This Is a Problem
What Can Go Wrong
Scenario 1: Cheshire Cat Downtime
Scenario 2: LLM Service Overload
Proposed Fix
Implement circuit breaker pattern with fallback:
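A minimal sketch of what such a breaker could look like, written for the proposed `bot/utils/circuit_breaker.py`. The class name, method names, and default values here are illustrative assumptions, not code from the repository. The sketch counts consecutive failures, opens the circuit once a threshold is reached, serves the fallback while the circuit is open, and allows one trial call after the cooldown.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Hypothetical breaker: closed -> open -> half_open -> closed."""

    def __init__(
        self,
        max_failures: int = 3,            # assumed default, tune per service
        cooldown_seconds: float = 60.0,   # assumed default, tune per service
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self.cooldown_seconds:
            return "half_open"  # cooldown elapsed, one trial call is allowed
        return "open"

    def call(self, primary: Callable[[], T], fallback: Callable[[], T]) -> T:
        """Run primary unless the circuit is open; use fallback on failure."""
        if self.state == "open":
            return fallback()  # fail fast without touching the service
        try:
            result = primary()
        except Exception:
            self._failures += 1
            # Trip the breaker at the threshold, or re-trip it if the
            # half-open trial call failed.
            if self._opened_at is not None or self._failures >= self.max_failures:
                self._opened_at = self._clock()
            return fallback()
        # Success closes the circuit and resets the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each external request, for example `breaker.call(lambda: query_cat(text), lambda: query_llm_directly(text))`, where `query_cat` and `query_llm_directly` are hypothetical stand-ins for the bot's real client functions. Using one breaker instance per service (Cheshire Cat, LLM, STT) keeps an outage in one from opening the circuit for the others.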
Severity
HIGH - Lack of circuit breakers causes cascading failures and complete bot unresponsiveness.
Files Affected
cat-plugins/cat_client.py, bot/utils/llm.py, bot/stt_client.py, bot/utils/voice_audio.py, new file: bot/utils/circuit_breaker.py
Closing as Already Implemented - A circuit breaker already exists for the Cheshire Cat service in bot/utils/cat_client.py lines 45-100. The CatAdapter class has full circuit breaker functionality: consecutive failure tracking (max 3 failures), 60-second cooldown period, automatic state transitions, and graceful fallback to direct LLM queries when the circuit is open. The implementation matches the pattern proposed in this issue.