Lemonade Server

By Ken VanDine

View on Snapcraft.io

Versionv10.2.0

Revision145

Size102.3 MB

LicenseMIT

Confinementstrict

Basecore24

CategoriesProductivity

Local AI server with OpenAI-compatible API

Website Source Code Report Bug Contact

Lemonade Server is a lightweight, high-performance local AI inference server
that provides an OpenAI-compatible API for running large language models on
your own hardware.

Features:
- OpenAI-compatible REST API (chat completions, embeddings, etc.)
- Multiple backend support: Vulkan, ROCm (AMD GPUs), and CPU
- Automatic model management and caching
- Support for GGUF models from Hugging Face
- Low latency local inference
- Runs as a background service

Supported Hardware:
- AMD GPUs: RDNA3 (RX 7000), RDNA4 (RX 9000), Strix Point/Halo APUs
- Any Vulkan-capable GPU
- CPU fallback for systems without GPU acceleration

Quick Start:
The server starts automatically after installation. Access the API at: http://localhost:8000/api/v1

ROCm Support (AMD GPUs):
For ROCm GPU acceleration, connect the process-control interface:
sudo snap connect lemonade-server:process-control

Documentation: https://lemonade-server.ai/

Update History

v10.0.1 (134) → v10.2.0 (145)

11 Apr 2026, 14:29 UTC

v10.0.1 123 → 134

7 Apr 2026, 06:12 UTC

v10.0.1 123 → 134

6 Apr 2026, 18:45 UTC

Published14 Jan 2026, 20:51 UTC

Last updated11 Apr 2026, 04:06 UTC

First seen15 Jan 2026, 04:37 UTC