Add interactive Chat with LLM interface to Web UI
Features:
- Real-time streaming chat interface (ChatGPT-like experience)
- Model selection: Text model (fast) or Vision model (image analysis)
- System prompt toggle: Chat with Miku's personality or raw LLM
- Mood selector: Choose from 14 different emotional states
- Full context integration: Uses complete miku_lore.txt, miku_prompt.txt, and miku_lyrics.txt
- Conversation memory: Maintains chat history throughout session
- Image upload support for vision model
- Horizontal scrolling tabs for responsive design
- Clear chat history functionality
- SSE (Server-Sent Events) for streaming responses
- Keyboard shortcuts (Ctrl+Enter to send)

Technical changes:
- Added POST /chat/stream endpoint in api.py with streaming support
- Updated ChatMessage model with mood, conversation_history, and image_data
- Integrated context_manager for proper Miku personality context
- Added Chat with LLM tab to index.html
- Implemented JavaScript streaming client with EventSource-like handling
- Added CSS for chat messages, typing indicators, and animations
- Made tab navigation horizontally scrollable for narrow viewports
New file: CHAT_INTERFACE_FEATURE.md (296 lines)

# Chat Interface Feature Documentation

## Overview

A new **"Chat with LLM"** tab has been added to the Miku bot Web UI, allowing you to chat directly with the language models, with full streaming support (similar to ChatGPT).

## Features

### 1. Model Selection
- **💬 Text Model (Fast)**: Chat with the text-based LLM for quick conversations
- **👁️ Vision Model (Images)**: Use the vision model to analyze and discuss images

### 2. System Prompt Options
- **✅ Use Miku Personality**: Attach the standard Miku personality system prompt
  - Text model: gets the full Miku character prompt (same as `query_llama`)
  - Vision model: gets a simplified Miku-themed image analysis prompt
- **❌ Raw LLM (No Prompt)**: Chat directly with the base LLM, without any personality
  - Great for testing raw model responses
  - No character constraints

### 3. Real-time Streaming
- Messages stream in character by character, like ChatGPT
- A typing indicator is shown while waiting for the response
- Smooth, responsive interface

### 4. Vision Model Support
- Upload images when using the vision model
- Image preview before sending
- Analyze images with Miku's personality or with raw vision capabilities

### 5. Chat Management
- Clear chat history button
- Timestamps on all messages
- Color-coded messages (user vs. assistant)
- Auto-scroll to the latest message
- Keyboard shortcut: **Ctrl+Enter** to send messages

## Technical Implementation

### Backend (api.py)

#### New Endpoint: `POST /chat/stream`
```text
# Accepts:
{
    "message": "Your chat message",
    "model_type": "text" | "vision",
    "use_system_prompt": true | false,
    "image_data": "base64_encoded_image"   # optional, for the vision model
}

# Returns: a Server-Sent Events (SSE) stream
data: {"content": "streamed text chunk"}
data: {"done": true}
data: {"error": "error message"}
```

**Key Features:**
- Uses Server-Sent Events (SSE) for streaming
- Supports both `TEXT_MODEL` and `VISION_MODEL` from globals
- Dynamically switches system prompts based on configuration
- Integrates with llama.cpp's streaming API
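
To make the flow concrete, here is a minimal, hedged sketch of how such an endpoint can be wired up with FastAPI and aiohttp. It is illustrative only, not the shipped `api.py`: the real endpoint builds the full Miku context, and the llama.cpp route and placeholder prompt shown here are assumptions.

```python
# Hedged sketch of an SSE chat endpoint. Assumptions: a FastAPI app, a
# llama.cpp server exposing the OpenAI-compatible /v1/chat/completions
# route, and a placeholder system prompt instead of the real Miku context.
import json
from typing import Optional

import aiohttp
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()
LLAMA_URL = "http://localhost:8080"  # stand-in for globals.LLAMA_URL


class ChatMessage(BaseModel):
    message: str
    model_type: str = "text"          # "text" or "vision"
    use_system_prompt: bool = True
    image_data: Optional[str] = None  # base64 image for the vision model


@app.post("/chat/stream")
async def chat_stream(msg: ChatMessage):
    async def event_stream():
        # The real endpoint selects TEXT_MODEL or VISION_MODEL via msg.model_type.
        messages = []
        if msg.use_system_prompt:
            # Placeholder; the real code attaches the full Miku prompt here.
            messages.append({"role": "system", "content": "You are Hatsune Miku."})
        messages.append({"role": "user", "content": msg.message})
        try:
            async with aiohttp.ClientSession() as session:
                async with session.post(
                    f"{LLAMA_URL}/v1/chat/completions",
                    json={"messages": messages, "stream": True},
                ) as resp:
                    # llama.cpp streams OpenAI-style "data: {...}" lines.
                    async for raw in resp.content:
                        line = raw.decode().strip()
                        if not line.startswith("data: "):
                            continue
                        data = line[len("data: "):]
                        if data == "[DONE]":
                            break
                        delta = json.loads(data)["choices"][0]["delta"]
                        if delta.get("content"):
                            yield f'data: {json.dumps({"content": delta["content"]})}\n\n'
            yield 'data: {"done": true}\n\n'
        except Exception as exc:
            yield f'data: {json.dumps({"error": str(exc)})}\n\n'

    return StreamingResponse(event_stream(), media_type="text/event-stream")
```
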
### Frontend (index.html)

#### New Tab: "💬 Chat with LLM"

Located in the main navigation tabs (tab6).

**Components:**
1. **Configuration Panel**
   - Radio buttons for model selection
   - Radio buttons for the system prompt toggle
   - Image upload section (shown or hidden depending on the selected model)
   - Clear chat history button

2. **Chat Messages Container**
   - Scrollable message history
   - Animated message appearance
   - Typing indicator during streaming
   - Color-coded messages with timestamps

3. **Input Area**
   - Multi-line text input
   - Send button with a loading state
   - Keyboard shortcuts

**JavaScript Functions:**
- `sendChatMessage()`: Handles message sending and streamed reception
- `toggleChatImageUpload()`: Shows/hides the image upload for the vision model
- `addChatMessage()`: Adds a message to the chat display
- `showTypingIndicator()` / `hideTypingIndicator()`: Typing animation
- `clearChatHistory()`: Clears all messages
- `handleChatKeyPress()`: Keyboard shortcuts

## Usage Guide

### Basic Text Chat with Miku
1. Go to the "💬 Chat with LLM" tab
2. Ensure "💬 Text Model" is selected
3. Ensure "✅ Use Miku Personality" is selected
4. Type your message and click "📤 Send" (or press Ctrl+Enter)
5. Watch as Miku's response streams in real time!

### Raw LLM Testing
1. Select "💬 Text Model"
2. Select "❌ Raw LLM (No Prompt)"
3. Chat directly with the base language model, free of personality constraints

### Vision Model Chat
1. Select "👁️ Vision Model"
2. Click "Upload Image" and select an image
3. Type a message about the image (e.g., "What do you see in this image?")
4. Click "📤 Send"
5. The vision model will analyze the image and respond

### Vision Model with Miku Personality
1. Select "👁️ Vision Model"
2. Keep "✅ Use Miku Personality" selected
3. Upload an image
4. Miku will analyze and comment on the image with her cheerful personality!

## System Prompts

### Text Model (with Miku personality)
Uses the same comprehensive system prompt as `query_llama()`:
- Full Miku character context
- Current mood integration
- Character consistency rules
- Natural conversation guidelines

### Vision Model (with Miku personality)
Simplified prompt optimized for image analysis:

```
You are Hatsune Miku analyzing an image. Describe what you see naturally
and enthusiastically as Miku would. Be detailed but conversational.
React to what you see with Miku's cheerful, playful personality.
```

### No System Prompt
Both models respond without personality constraints when this option is selected.

## Streaming Technology

The interface uses **Server-Sent Events (SSE)** for real-time streaming:
- The backend relays chunked responses from llama.cpp
- The frontend receives and renders chunks as they arrive
- Smooth, ChatGPT-like experience
- Works with both the text and vision models

## UI/UX Features

### Message Styling
- **User messages**: Green accent, right-aligned feel
- **Assistant messages**: Blue accent, left-aligned feel
- **Error messages**: Red accent with an error icon
- **Fade-in animation**: Smooth appearance for new messages

### Responsive Design
- Chat container scrolls automatically
- Image preview for the vision model
- Loading states on buttons
- Typing indicators
- Custom scrollbar styling

### Keyboard Shortcuts
- **Ctrl+Enter**: Send the message quickly
- **Tab**: Navigate between input fields

## Configuration Options

All settings are preserved for the duration of the chat session:
- Model type (text/vision)
- System prompt toggle (Miku/raw)
- Uploaded image (for the vision model)

Settings do NOT persist after a page refresh (each page load starts a fresh session).

## Error Handling

The interface handles various errors gracefully:
- Connection failures
- Model errors
- Invalid image files
- Empty messages
- Timeouts

All errors are displayed in the chat with clear error messages.

## Performance Considerations

### Text Model
- Fast responses (typically 1-3 seconds)
- Streaming starts almost immediately
- Low latency

### Vision Model
- Slower due to image processing
- The first token may take 3-10 seconds
- Streaming continues smoothly once started
- The image is sent as base64 (roughly 33% size overhead, but it keeps the request a single JSON payload)

## Development Notes

### File Changes
1. **`bot/api.py`**
   - Added `from fastapi.responses import StreamingResponse`
   - Added the `ChatMessage` Pydantic model (see the sketch after this list)
   - Added the `POST /chat/stream` endpoint with SSE support

2. **`bot/static/index.html`**
   - Added the tab6 button in the navigation
   - Added the complete chat interface HTML
   - Added CSS styles for chat messages and animations
   - Added JavaScript functions for the chat functionality
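
The commit message additionally mentions `mood` and `conversation_history` fields on `ChatMessage`. Here is a hedged sketch of what the full model plausibly looks like; the types and defaults beyond the documented request body are assumptions:

```python
# Hedged sketch of the ChatMessage request model. The first four fields
# follow the documented request body; mood and conversation_history come
# from the commit message, and their types/defaults are assumptions.
from typing import List, Optional
from pydantic import BaseModel


class ChatMessage(BaseModel):
    message: str                            # required: the user's message
    model_type: str = "text"                # "text" or "vision"
    use_system_prompt: bool = True          # attach the Miku personality?
    image_data: Optional[str] = None        # base64 image (vision model only)
    mood: Optional[str] = None              # one of the 14 mood states
    conversation_history: List[dict] = []   # prior {role, content} turns
```
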
### Dependencies
- Uses the existing `aiohttp` for HTTP streaming
- Uses the existing `globals.TEXT_MODEL` and `globals.VISION_MODEL`
- Uses the existing `globals.LLAMA_URL` for the llama.cpp connection
- No new dependencies required!

## Future Enhancements (Ideas)

Potential improvements for future versions:
- [ ] Save/load chat sessions
- [ ] Export chat history to a file
- [ ] Multi-user chat history (separate sessions per user)
- [ ] Temperature and max_tokens controls
- [ ] Model selection dropdown (if multiple models are available)
- [ ] Token count display
- [ ] Voice input support
- [ ] Markdown rendering in responses
- [ ] Code syntax highlighting
- [ ] Copy message button
- [ ] Regenerate response button

## Troubleshooting

### "No response received from LLM"
- Check whether the llama.cpp server is running
- Verify that `LLAMA_URL` in globals is correct
- Check the bot logs for connection errors
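
The first two checks can be scripted; here is a hedged connectivity probe (it assumes the `/health` route of recent llama.cpp server builds, which may differ by version):

```python
# Hedged connectivity check for the llama.cpp server. The /health route
# is an assumption based on recent llama.cpp builds; adjust as needed.
import requests

LLAMA_URL = "http://localhost:8080"  # substitute your globals.LLAMA_URL

try:
    r = requests.get(f"{LLAMA_URL}/health", timeout=5)
    print(r.status_code, r.text)  # HTTP 200 means the server is up
except requests.ConnectionError:
    print("llama.cpp server unreachable; check that it is running and the URL is right")
```
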
### "Failed to read image file"
- Ensure the image is a supported format (JPEG, PNG, GIF)
- Check the file size (very large images may cause issues)
- Try a different image

### Streaming not working
- Check the browser console for JavaScript errors
- Verify that SSE is not blocked by a proxy or firewall
- Try refreshing the page

### Model not responding
- Check that the correct model is loaded in llama.cpp
- Verify that the model type matches what is configured
- Check the llama.cpp logs for errors

## API Reference

### POST /chat/stream

**Request Body:**
```jsonc
{
    "message": "string",           // required: the user's message
    "model_type": "text|vision",   // required: which model to use
    "use_system_prompt": boolean,  // required: whether to add the system prompt
    "image_data": "string|null"    // optional: base64 image for the vision model
}
```

**Response:**
```
Content-Type: text/event-stream

data: {"content": "Hello"}

data: {"content": " there"}

data: {"content": "!"}

data: {"done": true}
```

Each `data:` line is one SSE event; events are separated by blank lines.

**Error Response:**
```
data: {"error": "Error message here"}
```
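
For programmatic testing outside the browser, here is a hedged example client. The base URL is an assumption (substitute wherever the bot's Web UI is served), and the script is not part of the shipped code:

```python
# Hedged example client for POST /chat/stream. The URL below is an
# assumption; point it at the host/port serving the bot's Web UI.
import base64
import json
import requests

API_URL = "http://localhost:8000/chat/stream"  # assumption

payload = {
    "message": "What do you see in this image?",
    "model_type": "vision",
    "use_system_prompt": True,
    # Optional: attach a base64-encoded image for the vision model.
    "image_data": base64.b64encode(open("photo.jpg", "rb").read()).decode(),
}

with requests.post(API_URL, json=payload, stream=True) as resp:
    for line in resp.iter_lines(decode_unicode=True):
        if not line or not line.startswith("data: "):
            continue  # skip blank event separators and keep-alives
        event = json.loads(line[len("data: "):])
        if "content" in event:
            print(event["content"], end="", flush=True)  # stream to stdout
        elif event.get("done"):
            print()  # stream finished; emit the trailing newline
        elif "error" in event:
            print(f"\n[error] {event['error']}")
```
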
## Conclusion

The chat interface provides a powerful, user-friendly way to:
- Test LLM responses interactively
- Experiment with different prompting strategies
- Analyze images with the vision model
- Chat with Miku's personality in real time
- Debug and understand model behavior

All with a smooth, modern streaming interface that feels like ChatGPT! 🎉