# Remote Microphone Streaming Setup

This guide shows how to use the ASR system with a client on one machine streaming audio to a server on another machine.

## Architecture

```
┌─────────────────┐                    ┌─────────────────┐
│  Client Machine │                    │  Server Machine │
│                 │                    │                 │
│  🎤 Microphone  │  ───WebSocket───▶  │  🖥️ Display     │
│                 │      (Audio)       │                 │
│  client/        │                    │  server/        │
│  mic_stream.py  │                    │  display_server │
└─────────────────┘                    └─────────────────┘
```

## Server Setup (Machine with GPU)

### 1. Start the server with live display

```bash
cd /home/koko210Serve/parakeet-test
source venv/bin/activate
PYTHONPATH=/home/koko210Serve/parakeet-test python server/display_server.py
```

**Options:**

```bash
python server/display_server.py --host 0.0.0.0 --port 8766
```

The server will:

- ✅ Bind to all network interfaces (0.0.0.0)
- ✅ Display transcriptions in real-time with color coding
- ✅ Show progressive updates as audio streams in
- ✅ Highlight final transcriptions when complete

### 2. Configure firewall (if needed)

Allow incoming connections on port 8766:

```bash
# Ubuntu/Debian
sudo ufw allow 8766/tcp

# CentOS/RHEL
sudo firewall-cmd --permanent --add-port=8766/tcp
sudo firewall-cmd --reload
```

### 3. Get the server's IP address

```bash
# Find your server's IP address
ip addr show | grep "inet " | grep -v 127.0.0.1
```

The address appears after `inet` in the output, for example `192.168.1.100`.

## Client Setup (Remote Machine)

### 1. Install dependencies on client machine

Create a minimal Python environment:

```bash
# Create virtual environment
python3 -m venv asr-client
source asr-client/bin/activate

# Install only client dependencies
pip install websockets sounddevice numpy
```

### 2. Copy the client script

Copy `client/mic_stream.py` to your client machine:

```bash
# On server machine
scp client/mic_stream.py user@client-machine:~/

# Or download it via your preferred method
```

### 3. List available microphones

```bash
python mic_stream.py --list-devices
```

Example output:

```
Available audio input devices:
--------------------------------------------------------------------------------
[0] Built-in Microphone
    Channels: 2
    Sample rate: 44100.0 Hz
[1] USB Microphone
    Channels: 1
    Sample rate: 48000.0 Hz
--------------------------------------------------------------------------------
```

### 4. Start streaming

```bash
python mic_stream.py --url ws://SERVER_IP:8766
```

Replace `SERVER_IP` with your server's IP address (e.g., `ws://192.168.1.100:8766`).

**Options:**

```bash
# Use specific microphone device
python mic_stream.py --url ws://192.168.1.100:8766 --device 1

# Change sample rate (if needed)
python mic_stream.py --url ws://192.168.1.100:8766 --sample-rate 16000

# Adjust chunk size for network latency
python mic_stream.py --url ws://192.168.1.100:8766 --chunk-duration 0.2
```

## Usage Flow

### 1. Start Server

On the server machine:

```bash
cd /home/koko210Serve/parakeet-test
source venv/bin/activate
PYTHONPATH=/home/koko210Serve/parakeet-test python server/display_server.py
```

You'll see:

```
================================================================================
ASR Server - Live Transcription Display
================================================================================
Server: ws://0.0.0.0:8766
Sample Rate: 16000 Hz
Model: Parakeet TDT 0.6B V3
================================================================================

Server is running and ready for connections!
Waiting for clients...
```

### 2. Connect Client

On the client machine:

```bash
python mic_stream.py --url ws://192.168.1.100:8766
```

You'll see:

```
Connected to server: ws://192.168.1.100:8766
Recording started. Press Ctrl+C to stop.
```

### 3. Speak into Microphone

- Speak naturally into your microphone
- Watch the **server terminal** for real-time transcriptions
- Progressive updates appear in yellow as you speak
- Final transcriptions appear in green when you pause

### 4. Stop Streaming

Press `Ctrl+C` on the client to stop recording and disconnect.

## Display Color Coding

On the server display:

- **🟢 GREEN** = Final transcription (complete, accurate)
- **🟡 YELLOW** = Progressive update (in progress)
- **🔵 BLUE** = Connection events
- **⚪ WHITE** = Server status messages

## Example Session

### Server Display:

```
================================================================================
✓ Client connected: 192.168.1.50:45232
================================================================================

[14:23:15] 192.168.1.50:45232 → Hello this is
[14:23:17] 192.168.1.50:45232 → Hello this is a test of the remote
[14:23:19] 192.168.1.50:45232 ✓ FINAL: Hello this is a test of the remote microphone streaming system.
[14:23:25] 192.168.1.50:45232 → Can you hear me
[14:23:27] 192.168.1.50:45232 ✓ FINAL: Can you hear me clearly?

================================================================================
✗ Client disconnected: 192.168.1.50:45232
================================================================================
```

### Client Display:

```
Connected to server: ws://192.168.1.100:8766
Recording started. Press Ctrl+C to stop.

Server: Connected to ASR server with live display
[PARTIAL] Hello this is
[PARTIAL] Hello this is a test of the remote
[FINAL] Hello this is a test of the remote microphone streaming system.
[PARTIAL] Can you hear me
[FINAL] Can you hear me clearly?
^C
Stopped by user
Disconnected from server
Client stopped by user
```

## Network Considerations

### Bandwidth Usage

- Sample rate: 16000 Hz
- Bit depth: 16-bit (int16)
- Bandwidth: ~32 KB/s per client (16,000 samples/s × 2 bytes per sample, one channel)
- Very low bandwidth - works well over Wi-Fi or LAN

### Latency

- Progressive updates: Every ~2 seconds
- Final transcription: When audio stops or on demand
- Total latency: ~2-3 seconds (network + processing)

### Multiple Clients

The server supports multiple simultaneous clients:

- Each client gets its own session
- Transcriptions are tagged with client IP:port
- No interference between clients

## Troubleshooting

### Client Can't Connect

```
Error: [Errno 111] Connection refused
```

**Solution:**

1. Check that the server is running
2. Verify that the firewall allows port 8766
3. Confirm that the server IP address is correct
4. Test connectivity: `ping SERVER_IP`

### No Audio Being Captured

```
Recording started but no transcriptions appear
```

**Solution:**

1. Check microphone permissions
2. List devices: `python mic_stream.py --list-devices`
3. Try a different device: `--device N`
4. Test the microphone in other apps first

### Poor Transcription Quality

**Solution:**

1. Move closer to the microphone
2. Reduce background noise
3. Speak clearly and at a normal pace
4. Check microphone quality/settings

### High Latency

**Solution:**

1. Use a wired connection instead of Wi-Fi
2. Reduce chunk duration: `--chunk-duration 0.05`
3. Check network latency: `ping SERVER_IP`

## Security Notes

⚠️ **Important:** This setup uses WebSocket without encryption (ws://).

For production use:

- Use WSS (WebSocket Secure) with TLS certificates
- Add authentication (API keys, tokens)
- Restrict firewall rules to specific IP ranges
- Consider using a VPN for remote access

## Advanced: Auto-start Server

Create a systemd service (Linux):

```bash
sudo nano /etc/systemd/system/asr-server.service
```

```ini
[Unit]
Description=ASR WebSocket Server
After=network.target

[Service]
Type=simple
User=YOUR_USERNAME
WorkingDirectory=/home/koko210Serve/parakeet-test
Environment="PYTHONPATH=/home/koko210Serve/parakeet-test"
ExecStart=/home/koko210Serve/parakeet-test/venv/bin/python server/display_server.py
Restart=always

[Install]
WantedBy=multi-user.target
```

Enable and start:

```bash
sudo systemctl enable asr-server
sudo systemctl start asr-server
sudo systemctl status asr-server
```

## Performance Tips

1. **Server:** Use GPU for best performance (~100ms latency)
2. **Client:** Use low chunk duration for responsiveness (0.1s default)
3. **Network:** Wired connection preferred, Wi-Fi works fine
4. **Audio Quality:** 16kHz sample rate is optimal for speech

## Summary

- ✅ **Server displays transcriptions in real-time**
- ✅ **Client sends audio from remote microphone**
- ✅ **Progressive updates show live transcription**
- ✅ **Final results when speech pauses**
- ✅ **Multiple clients supported**
- ✅ **Low bandwidth, low latency**

Enjoy your remote ASR streaming system! 🎤 → 🌐 → 🖥️
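
The ~32 KB/s figure under Bandwidth Usage, and the effect of `--chunk-duration` on message size, can be checked with a few lines of arithmetic. The sketch below assumes mono capture and uncompressed int16 samples; the helper names `bytes_per_second` and `chunk_bytes` are illustrative and are not taken from `mic_stream.py`.

```python
# Back-of-the-envelope check of the streaming figures quoted in this guide.
SAMPLE_RATE_HZ = 16_000  # from "Bandwidth Usage"
BYTES_PER_SAMPLE = 2     # 16-bit int16
CHANNELS = 1             # assumption: mono capture


def bytes_per_second(sample_rate_hz=SAMPLE_RATE_HZ):
    """Raw PCM payload rate, excluding WebSocket/TCP framing overhead."""
    return sample_rate_hz * BYTES_PER_SAMPLE * CHANNELS


def chunk_bytes(chunk_duration_s, sample_rate_hz=SAMPLE_RATE_HZ):
    """Payload size of one audio chunk of the given duration in seconds."""
    samples = round(sample_rate_hz * chunk_duration_s)
    return samples * BYTES_PER_SAMPLE * CHANNELS


if __name__ == "__main__":
    rate = bytes_per_second()
    print(f"Payload rate: {rate} B/s (~{rate / 1000:.0f} KB/s)")
    for duration in (0.05, 0.1, 0.2):
        print(f"--chunk-duration {duration}: {chunk_bytes(duration)} bytes per chunk")
```

Under these assumptions, halving the chunk duration halves the size of each message and doubles the number of messages per second, while the overall payload rate stays the same.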
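
For the "Client Can't Connect" case, `ping` only shows that the server host is reachable; it does not show whether port 8766 is accepting connections. A minimal sketch of a TCP-level check using only the Python standard library follows. The function name `port_is_open` is hypothetical and not part of the client script.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

Calling `port_is_open("192.168.1.100", 8766)` from the client machine (with the example address replaced by your server's IP) returns `False` when the server is not running or a firewall blocks the port. A `True` result only confirms that something is listening; it does not verify the WebSocket handshake itself.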