Version: 1.14.0

Epinio Performance Settings

Rate Limiting

Epinio limits its client-side request rate to the Kubernetes API with two environment variables. See the client-go rate limiting constants for background on how these work.

Variable        Default  Description
KUBE_API_QPS    5        Sustained queries per second to the Kubernetes API
KUBE_API_BURST  10       Maximum burst above the QPS limit
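As a rough illustration of how the two variables and their defaults might be resolved, here is a minimal Go sketch. It is not Epinio's actual code: the `rateLimits` type and `limitsFromEnv` helper are hypothetical names, and the fallback-on-invalid-input behaviour is an assumption. The field types mirror the `QPS` (float32) and `Burst` (int) fields of client-go's `rest.Config`.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Defaults taken from the table above.
const (
	defaultQPS   = 5.0
	defaultBurst = 10
)

// rateLimits is a hypothetical holder for the two client-side limits.
type rateLimits struct {
	QPS   float32
	Burst int
}

// limitsFromEnv reads KUBE_API_QPS and KUBE_API_BURST through the given
// lookup function. Unset, unparsable, or non-positive values fall back to
// the defaults (an assumption made for this sketch).
func limitsFromEnv(getenv func(string) string) rateLimits {
	l := rateLimits{QPS: defaultQPS, Burst: defaultBurst}
	if v, err := strconv.ParseFloat(getenv("KUBE_API_QPS"), 32); err == nil && v > 0 {
		l.QPS = float32(v)
	}
	if v, err := strconv.Atoi(getenv("KUBE_API_BURST")); err == nil && v > 0 {
		l.Burst = v
	}
	return l
}

func main() {
	l := limitsFromEnv(os.Getenv)
	fmt.Printf("QPS=%v Burst=%d\n", l.QPS, l.Burst)
}
```

Passing the lookup function as a parameter keeps the helper easy to exercise without touching the real process environment.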

The defaults are conservative. With many concurrent users or high-frequency deployments, raising these values can reduce latency under load, because fewer requests wait on the client-side limiter. Changes take effect only after the Epinio server restarts.
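To get a feel for why the defaults can add latency, the sketch below estimates how long a client-side token bucket delays a batch of back-to-back requests. It is a simplified model, not a measurement: it assumes the bucket starts full, refills at exactly the QPS rate, and ignores network and API server time. The function name `drainSeconds` is hypothetical.

```go
package main

import "fmt"

// drainSeconds estimates the seconds of client-side throttling needed to
// issue n back-to-back requests through a token bucket that starts full.
// The first `burst` requests go out immediately; each remaining request
// waits for a token, and tokens arrive at `qps` per second.
func drainSeconds(n int, qps float64, burst int) float64 {
	if n <= burst {
		return 0
	}
	return float64(n-burst) / qps
}

func main() {
	fmt.Println(drainSeconds(100, 5, 10))   // defaults: prints 18
	fmt.Println(drainSeconds(100, 50, 100)) // raised limits: prints 0
}
```

Under this model, 100 requests spend about 18 seconds in the client-side limiter with the defaults, while a QPS of 50 with a burst of 100 lets the same batch through without client-side waiting.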

Note: The Kubernetes API server has its own rate limits. Setting client-side limits too high can exhaust server-side capacity. If you see 429 Too Many Requests errors, either lower KUBE_API_QPS/KUBE_API_BURST or increase the API server limits.

API Priority and Fairness, enabled by default since Kubernetes 1.20 and stable since 1.29, is configured via FlowSchema and PriorityLevelConfiguration objects. On clusters where it is disabled or unavailable, the --max-requests-inflight and --max-mutating-requests-inflight kube-apiserver flags limit concurrent requests directly.