r/softwarearchitecture • u/doublecore20 • 7d ago
Discussion/Advice Warm Pool vs KubeAPI
We have a debate at our workplace:
We're in the process of a big refactor of a monolithic project into microservices, which will be deployed with k8s on EKS (and k8s on-prem). We use Traefik as our gateway (important for option #2).
Our use case is very specific and requires us to route a user to a specific pod which does a very user-specific isolated workload. The pod serves only 1 user at a time. When the workload ends - the worker must be discarded (security requirement).
We have two options:
1. Use KubeAPI directly and spin up pods on demand. Each pod gets a label, and a custom proxy routes by that label. This gives us "native" scaling per user request; pods are deleted when no longer needed, and monitoring is done manually, also via KubeAPI.
2. Keep a warm pool of "workers" with an HPA for elasticity, driven by a custom metric for minimum available workers. The state of the workers (workload pods) is managed in Redis (ZSET for heartbeat and O(1) allocation). Each worker has a random unique ID assigned on startup. Traefik (our gateway) can use Redis as an external provider and create HTTP routes dynamically based on worker state (when a worker is allocated, its heartbeat creates a key in Redis, and this triggers creation of an HTTP route). This allows us to route the user to a pod by the unique ID (Traefik routes to the pod IP by worker ID). Monitoring is done by querying Redis.
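To make the bookkeeping in option #2 concrete, here is a rough sketch of the worker lifecycle (register, heartbeat, allocate, discard). It is an in-memory stand-in, not a Redis client: the `WarmPool` class, its method names, and the 10-second heartbeat TTL are all made up for illustration, and a plain dict plays the role of the ZSET.

```python
import time
import uuid


class WarmPool:
    """In-memory stand-in for the Redis state (hypothetical, not a Redis client)."""

    def __init__(self, heartbeat_ttl=10.0, clock=time.monotonic):
        self.heartbeat_ttl = heartbeat_ttl  # assumed value, tune for real workloads
        self.clock = clock
        self.idle = {}       # worker_id -> last heartbeat time (role of the ZSET)
        self.allocated = {}  # worker_id -> user_id (role of the routing keys)

    def register(self):
        """Called by a worker on startup: pick a random unique ID and join the pool."""
        worker_id = uuid.uuid4().hex
        self.idle[worker_id] = self.clock()
        return worker_id

    def heartbeat(self, worker_id):
        """Refresh an idle worker's timestamp; unknown or allocated IDs are ignored."""
        if worker_id in self.idle:
            self.idle[worker_id] = self.clock()

    def allocate(self, user_id):
        """Hand one live idle worker to a user, dropping stale entries on the way."""
        now = self.clock()
        for worker_id, seen in list(self.idle.items()):
            del self.idle[worker_id]
            if now - seen <= self.heartbeat_ttl:
                self.allocated[worker_id] = user_id
                return worker_id
        return None  # pool exhausted; the caller has to wait or fail the request

    def release(self, worker_id):
        """Workload finished: forget the worker. It never goes back to the idle set."""
        self.allocated.pop(worker_id, None)

    def available(self):
        """Number of live idle workers, the value a custom HPA metric could expose."""
        now = self.clock()
        return sum(1 for seen in self.idle.values() if now - seen <= self.heartbeat_ttl)
```

In a real deployment each of these methods would have to be atomic across replicas (for example a Lua script or a transaction in Redis), which this sketch does not attempt to show.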
Option #1 is simple, easy to implement and mostly easy to maintain (code-wise) - but it couples us to k8s (we cannot be deployment agnostic) and sounds like a total abuse of KubeAPI, especially at larger scale.
Option #2 is more complex in theory, but it avoids using KubeAPI for application-specific needs. It decouples infrastructure from application and does not require highly privileged RBAC policies. The infrastructure supports the application based on custom metrics and load.
The question - is option #2 really over-engineering, and is using KubeAPI not as bad as it sounds? (Controllers and Operators exist for a reason, but I am not sure they are meant to be used like that.)
u/SufficientFrame 7d ago
I'd be careful framing this as "KubeAPI abuse" versus "clean decoupling." In your case the lifecycle of a worker is part of the product behavior, so talking to Kubernetes is not automatically the wrong boundary. What usually becomes painful is putting too much scheduling logic in the app layer and then re-implementing reliability, leases, retries, and cleanup yourself in Redis plus dynamic Traefik config. That second option can work, but operationally it's now two control planes: Kubernetes for pods and Redis/Traefik for allocation and routing state, and keeping those perfectly in sync during crashes or network partitions is where teams bleed time.
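As a rough illustration of what "keeping the two in sync" means in practice, a periodic reconciliation pass could compare the pods Kubernetes reports with the routes stored in Redis. The function below is only a sketch: `reconcile`, its arguments, and the result keys are hypothetical names, and fetching the two inputs from the real systems is left out.

```python
def reconcile(live_pods, routes):
    """Compare the two sources of truth and report where they disagree.

    live_pods: set of worker IDs the cluster reports as running.
    routes:    dict of worker ID -> user ID taken from the allocation store.
    """
    routed = set(routes)
    return {
        # A route points at a pod that no longer exists: the user gets errors.
        "dangling_routes": sorted(routed - live_pods),
        # A pod is running but has no route: either idle in the pool or leaked.
        "unrouted_pods": sorted(live_pods - routed),
    }


report = reconcile(
    live_pods={"w-1", "w-2", "w-3"},
    routes={"w-2": "alice", "w-9": "bob"},
)
print(report)
```

Someone has to own this loop, decide how often it runs, and decide what happens to a user whose route turns out to be dangling.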
A middle path might fit better: keep Kubernetes responsible for creating and destroying short-lived isolated workers, but introduce a small allocator service that owns the user-to-worker assignment and exposes a simple app-level contract. The allocator can request a pod/job, wait for readiness, issue a short-lived session token or worker ID, and garbage collect aggressively after completion. If startup latency is the main reason for the warm pool, I'd test that explicitly before committing to the extra moving parts: image pull times, init cost, CNI attach, and readiness delays often decide this more than architecture diagrams do. Also worth asking whether these are really long-lived "pods per user" or closer to jobs with a routing phase, because that pushes the design in different directions.
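A minimal sketch of that allocator contract, under stated assumptions: `WorkerBackend` is a hypothetical interface (a real implementation might wrap the Kubernetes API, which is not shown here), and the timeout and polling values are placeholders.

```python
import secrets
import time
from typing import Protocol


class WorkerBackend(Protocol):
    """Hypothetical interface to whatever creates and destroys workers."""

    def create(self) -> str: ...
    def is_ready(self, worker_id: str) -> bool: ...
    def delete(self, worker_id: str) -> None: ...


class Allocator:
    """Owns the user-to-worker assignment; callers never touch the backend."""

    def __init__(self, backend, ready_timeout=30.0, poll_interval=0.5,
                 sleep=time.sleep, clock=time.monotonic):
        self.backend = backend
        self.ready_timeout = ready_timeout  # placeholder value
        self.poll_interval = poll_interval  # placeholder value
        self.sleep = sleep
        self.clock = clock
        self.sessions = {}  # token -> (user_id, worker_id)

    def start_session(self, user_id):
        """Create a worker, wait for readiness, and return (token, worker_id)."""
        worker_id = self.backend.create()
        deadline = self.clock() + self.ready_timeout
        while not self.backend.is_ready(worker_id):
            if self.clock() >= deadline:
                # Never leave a half-started worker behind.
                self.backend.delete(worker_id)
                raise TimeoutError(f"worker {worker_id} did not become ready")
            self.sleep(self.poll_interval)
        token = secrets.token_urlsafe(16)
        self.sessions[token] = (user_id, worker_id)
        return token, worker_id

    def end_session(self, token):
        """Workload finished: drop the assignment and destroy the worker."""
        _user_id, worker_id = self.sessions.pop(token)
        self.backend.delete(worker_id)
```

The same contract would also work with a warm pool behind `WorkerBackend.create`, so measuring startup latency first does not lock in either design.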