Version: 2024.1 (34)
License: Apache-2.0
Confinement: strict
Base: core24
NUMA-aware CPU and hugepage resource orchestrator for snaps
EPA Orchestrator is designed to provide secure, policy-driven resource orchestration for snaps and workloads on Linux systems. Its vision is to enable fine-grained, dynamic allocation and management of system resources—starting with CPU pinning and memory management, with plans to expand to other resource types and orchestration policies. The orchestrator exposes a secure Unix socket API for resource allocation and introspection, making it easy for other snaps (such as openstack-hypervisor) and workloads to request and manage dedicated or shared resources in a controlled manner.
Features:
- CPU Pinning and Allocation: Allocate isolated and shared CPU sets to snaps and workloads, supporting both dedicated and shared CPU usage models with basic system-size heuristics.
- Memory Management and Hugepage Tracking: Introspect NUMA hugepages and track hugepage allocations across NUMA nodes with per-service allocation tracking.
- NUMA-Aware Core Allocation: Request a specific number of cores from a particular NUMA node with override/append semantics and exact-count guarantees.
- Resource Introspection: Query current allocations and available resources via a secure API.
- Secure Unix Socket API: All orchestration actions are performed via a secure, local Unix socket with JSON-based requests and responses.
- Basic Allocation Heuristics: Automatic allocation based on system size (small vs large systems) when no specific core count is requested.
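The secure Unix socket API can be sketched as a short client/server exchange. This is an illustrative sketch only: the socket path, the `action` and `numofcores` field names, and the stub server are assumptions for demonstration, not the orchestrator's documented API.

```python
import json
import os
import socket
import tempfile
import threading
import time

# Illustrative socket path; the real orchestrator's path may differ.
SOCK = os.path.join(tempfile.mkdtemp(), "epa.sock")

def send_request(sock_path, payload):
    """Send one JSON request over a Unix socket and read one JSON response."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        s.sendall(json.dumps(payload).encode())
        s.shutdown(socket.SHUT_WR)  # signal end of request
        return json.loads(s.recv(65536).decode())

def _stub_server(sock_path):
    """Stand-in server that acknowledges a single request (demo only)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as srv:
        srv.bind(sock_path)
        srv.listen(1)
        conn, _ = srv.accept()
        with conn:
            request = json.loads(conn.recv(65536).decode())
            conn.sendall(json.dumps({"status": "ok", "echo": request}).encode())

threading.Thread(target=_stub_server, args=(SOCK,), daemon=True).start()

reply = None
for _ in range(200):  # wait briefly until the stub server is listening
    try:
        reply = send_request(SOCK, {"action": "allocate-cores", "numofcores": 0})
        break
    except (FileNotFoundError, ConnectionRefusedError):
        time.sleep(0.01)
print(reply)
```

A real client would replace the stub server with the orchestrator's socket and use the request schema from its documentation.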
CPU Allocation Policy: Small vs. Large Systems:
When a client requests core allocation with numofcores: 0, EPA Orchestrator applies a policy based on the total number of CPUs detected:
- Small systems (≤100 CPUs):
- By default, 80% of the available CPUs are allocated to the requesting snap or workload.
- The remaining 20% are left unallocated (shared).
- Large systems (>100 CPUs):
- By default, 16 CPUs are always reserved (left unallocated/shared).
- All other CPUs are allocated to the requesting snap or workload.
This policy ensures that on large servers, a fixed number of CPUs are always available for system or shared use, while on smaller systems, a proportional allocation is used.
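The policy arithmetic above can be sketched in a few lines. Only the thresholds (100 CPUs, 80%, 16 reserved) come from the policy text; the function name and the rounding behavior are illustrative assumptions.

```python
def default_cpu_split(total_cpus):
    """Return (allocated, shared) CPU counts under the default policy sketch."""
    if total_cpus <= 100:
        allocated = int(total_cpus * 0.8)   # small system: 80% allocated
    else:
        allocated = total_cpus - 16         # large system: 16 CPUs reserved
    return allocated, total_cpus - allocated

print(default_cpu_split(64))    # small system: 80% allocated, rest shared
print(default_cpu_split(256))   # large system: all but 16 allocated
```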
NUMA-Aware Core Allocation Policy:
The NUMA-aware allocation action allows services to request a specific number of cores from a particular NUMA node:
- NUMA Locality: Cores are allocated from the specified NUMA node to ensure optimal memory access patterns.
- Force Reallocation: NUMA allocation will override any existing non-explicit allocations to other services, even if they span multiple NUMA nodes.
- Atomic Exact-Count: If fewer than the requested number of cores are available in the NUMA node, the request fails with an error; no partial allocation occurs.
- Priority System: NUMA allocations take precedence over automatic allocations and cannot be overridden by other services.
- Per-NUMA override/append semantics: If the same service requests the same NUMA node again, it overrides previous cores from that node. If it requests a different NUMA node, the new cores are appended so the service may hold allocations across multiple NUMA nodes.
- Per-NUMA deallocation: Sending numofcores = -1 for a node deallocates any existing cores for that service in that node. numofcores = 0 is invalid for NUMA requests.
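The override/append/deallocate semantics above can be modeled as simple bookkeeping. This is a minimal sketch under assumed data structures (dicts of core lists per service and per node), not the orchestrator's implementation.

```python
def apply_numa_request(allocations, service, node, numofcores, free_cores):
    """allocations: {service: {node: [core, ...]}}; free_cores: {node: [core, ...]}."""
    held = allocations.setdefault(service, {})
    if numofcores == -1:
        # Per-NUMA deallocation: release this service's cores on this node.
        free_cores[node] = sorted(free_cores[node] + held.pop(node, []))
        return
    if numofcores == 0:
        raise ValueError("numofcores = 0 is invalid for NUMA requests")
    # Override semantics: cores the service already holds on this node are reusable.
    pool = sorted(free_cores[node] + held.get(node, []))
    if len(pool) < numofcores:
        # Atomic exact-count: fail without changing any state.
        raise RuntimeError(f"node {node}: only {len(pool)} cores available")
    held[node] = pool[:numofcores]          # exact-count allocation
    free_cores[node] = pool[numofcores:]

free = {0: [0, 1, 2, 3], 1: [4, 5, 6, 7]}
alloc = {}
apply_numa_request(alloc, "svc", 0, 2, free)    # allocate 2 cores from node 0
apply_numa_request(alloc, "svc", 1, 2, free)    # different node: appended
apply_numa_request(alloc, "svc", 0, 3, free)    # same node: overrides node 0
apply_numa_request(alloc, "svc", 1, -1, free)   # deallocate node 1 only
print(alloc)
```

Note how the exact-count check runs before any state changes, so a failed request leaves prior allocations intact.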
Update History:
- 2024.1 (34): 1 Apr 2026, 21:28 UTC
- 30 Jun 2025, 07:33 UTC
- 16 Mar 2026, 11:02 UTC