# ROCm Docker Image Update - February 2026

## Issue Summary

The custom-built ROCm container for llama-swap failed to build due to changes in the llama-swap repository structure:

- UI directory changed from `ui` to `ui-svelte`
- Build output changed from `ui/dist` to `proxy/ui_dist`
- Go build was failing with "pattern ui_dist: no matching files found"
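As an illustration of the rename, build steps along these lines had to change (these Dockerfile lines are hypothetical, not the exact ones from this repository):

```diff
-WORKDIR /build/llama-swap/ui
-RUN npm install && npm run build   # output: ui/dist
+WORKDIR /build/llama-swap/ui-svelte
+RUN npm install && npm run build   # output: proxy/ui_dist
```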
## Resolution

Switched to the **official llama.cpp ROCm Docker images**, available since PR #18439 was merged on December 29, 2025.

### What Changed

1. **Base Image**: Now using `ghcr.io/ggml-org/llama.cpp:server-rocm` instead of building llama.cpp from source
2. **UI Build Path**: Updated to use the `ui-svelte` directory (was `ui`)
3. **Simpler Build**: Removed the custom llama.cpp compilation stage; the official image already includes an optimized ROCm build
### Benefits of Official Image

- **Maintained by upstream**: Always up-to-date with the latest llama.cpp ROCm optimizations
- **Pre-compiled**: Faster build times, no need to compile llama.cpp from source
- **Tested**: Official builds are tested by llama.cpp CI/CD
- **Smaller Dockerfile**: Reduced from 4 stages to 3 stages
## Updated Dockerfile Structure

### Stage 1: UI Builder (Node.js)

- Clones llama-swap repository
- Builds UI from `ui-svelte` directory
- Outputs to `proxy/ui_dist`

### Stage 2: Binary Builder (Go)

- Copies llama-swap source with built UI
- Compiles Go binary with `GOTOOLCHAIN=auto`

### Stage 3: Runtime (Official ROCm Image)

- Based on `ghcr.io/ggml-org/llama.cpp:server-rocm`
- Includes llama-server with ROCm support
- Adds llama-swap binary
- Creates non-root user with GPU access groups
- Sets environment variables for AMD RX 6800 (gfx1030)
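A minimal sketch of this three-stage layout is below. The llama-swap repository URL and the final base image are from this document; stage names, file paths, base image tags for Node and Go, and the entrypoint are illustrative assumptions, not the exact Dockerfile:

```dockerfile
# Stage 1: UI builder — ui-svelte -> proxy/ui_dist (paths assumed)
FROM node:22-alpine AS ui-builder
RUN apk add --no-cache git
RUN git clone https://github.com/mostlygeek/llama-swap /src
WORKDIR /src/ui-svelte
RUN npm ci && npm run build

# Stage 2: Go binary builder, with the built UI already in place
FROM golang:1.23 AS go-builder
COPY --from=ui-builder /src /src
WORKDIR /src
ENV GOTOOLCHAIN=auto
RUN go build -o /llama-swap .

# Stage 3: runtime on the official ROCm server image
FROM ghcr.io/ggml-org/llama.cpp:server-rocm
COPY --from=go-builder /llama-swap /usr/local/bin/llama-swap
ENV HSA_OVERRIDE_GFX_VERSION=10.3.0 \
    ROCM_PATH=/opt/rocm \
    HIP_VISIBLE_DEVICES=0
# Non-root user joined to the host GPU groups (GIDs from this setup)
RUN groupadd -g 187 hostrender && groupadd -g 989 hostvideo \
    && useradd -m -G hostrender,hostvideo swap
USER swap
ENTRYPOINT ["/usr/local/bin/llama-swap"]
```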
## Environment Variables

```bash
HSA_OVERRIDE_GFX_VERSION=10.3.0 # RX 6800 compatibility
ROCM_PATH=/opt/rocm
HIP_VISIBLE_DEVICES=0
```
## GPU Access Groups

- GID 187: `hostrender` group (render access)
- GID 989: `hostvideo` group (kfd/video access)
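In docker-compose terms, the environment variables and group IDs above can be wired up roughly like this (the service name matches the commands later in this document; the device paths are the standard ROCm nodes and are an assumption about this setup):

```yaml
services:
  llama-swap-amd:
    build: .
    environment:
      HSA_OVERRIDE_GFX_VERSION: "10.3.0"
      HIP_VISIBLE_DEVICES: "0"
    devices:
      - /dev/kfd   # ROCm compute interface
      - /dev/dri   # render nodes
    group_add:
      - "187"   # hostrender: render access
      - "989"   # hostvideo: kfd/video access
```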
## References

- **Upstream Issue**: https://github.com/ggml-org/llama.cpp/issues/11913
- **Fix PR**: https://github.com/ggml-org/llama.cpp/pull/18439
- **llama-swap Repo**: https://github.com/mostlygeek/llama-swap
- **Official ROCm Images**: `ghcr.io/ggml-org/llama.cpp:server-rocm`
## Testing

After building, verify GPU detection:

```bash
docker compose up llama-swap-amd -d
docker compose exec llama-swap-amd rocminfo
```

Expected output should show the AMD RX 6800 (gfx1030) device.
## Migration Notes

If you had the old custom-built container running:

1. Stop the old container: `docker compose down llama-swap-amd`
2. Remove the old image: `docker rmi miku-discord-llama-swap-amd`
3. Rebuild with the new Dockerfile: `docker compose build llama-swap-amd`
4. Start the new container: `docker compose up llama-swap-amd -d`

Configuration files (`llama-swap-rocm-config.yaml`) remain unchanged.