Lemonade Server

By Ken VanDine Star developer

View on Snapcraft.io
Versionv10.6.0
Revision193
Size143.9 MB
LicenseMIT
Confinementstrict
Basecore24
CategoriesProductivity

Local AI server with OpenAI-compatible API


Lemonade Server is a lightweight, high-performance local AI inference server
that provides an OpenAI-compatible API for running large language models on
your own hardware.

Features:
- OpenAI-compatible REST API (chat completions, embeddings, etc.)
- Multiple backend support: Vulkan, ROCm (AMD GPUs), and CPU
- Automatic model management and caching
- Support for GGUF models from Hugging Face
- Low latency local inference
- Runs as a background service

Supported Hardware:
- AMD GPUs: RDNA3 (RX 7000), RDNA4 (RX 9000), Strix Point/Halo APUs
- Any Vulkan-capable GPU
- CPU fallback for systems without GPU acceleration

Quick Start:
The server starts automatically after installation. Access the API at: http://localhost:13305/api/v1

Documentation: https://lemonade-server.ai/

Update History

v10.5.1 (190)v10.6.0 (193)
3 Jun 2026, 13:45 UTC
v10.5.0 (185)v10.5.1 (190)
20 May 2026, 15:30 UTC
v10.4.0 (183)v10.5.0 (185)
18 May 2026, 03:00 UTC
v10.4.0 177 → 183
17 May 2026, 01:30 UTC
v10.3.0 (165)v10.4.0 (177)
14 May 2026, 15:30 UTC
v10.2.0 (145)v10.3.0 (165)
29 Apr 2026, 01:45 UTC
v10.0.1 (134)v10.2.0 (145)
11 Apr 2026, 14:29 UTC
v10.0.1 123 → 134
7 Apr 2026, 06:12 UTC
v10.0.1 123 → 134
6 Apr 2026, 18:45 UTC
v10.0.0 (116)v10.0.1 (123)
25 Mar 2026, 01:21 UTC

Published14 Jan 2026, 20:51 UTC

Last updated21 May 2026, 21:29 UTC

First seen15 Jan 2026, 04:37 UTC