AWS EKS: Kubernetes on AWS — power, complexity, and when it's worth it


Series: AWS Field Guide · Part 5 of 6: Compute
This series: 01 Overview · 02 EC2 · 03 Lambda · 04 ECS & Fargate · 05 EKS · 06 Wrap-up


The scenario

Your engineering team has grown from five to fifty engineers. You're running twelve microservices on ECS. Deployments are getting complicated — different services have different scaling policies, some need custom scheduling rules, and your platform team keeps writing one-off scripts to fill gaps in what ECS can do natively. A senior engineer makes the case for Kubernetes. The conversation has been going for three months.

This is the moment EKS either makes sense or doesn't. Getting it wrong in either direction is expensive — adopting it too early burns engineering time on infrastructure; delaying it too long means migrating under pressure.

TL;DR: EKS is AWS's managed Kubernetes service — it runs the control plane for you, leaving you responsible for the worker nodes, networking, add-ons, and everything that runs on top. It gives you the full Kubernetes API, which means access to a vast ecosystem of tooling, portability across cloud providers, and the ability to handle workloads of almost arbitrary complexity. The honest cost is real operational weight. Don't adopt EKS because Kubernetes is impressive. Adopt it when ECS genuinely isn't enough.


The problem it solves

Running Kubernetes yourself means managing the control plane — the API server, etcd, the scheduler, the controller manager. This is notoriously hard to do reliably at scale. Control plane outages are high-severity incidents; etcd corruption can be catastrophic.

EKS removes that burden. AWS runs a highly available control plane across multiple availability zones, handles version upgrades for the control plane, and integrates it with IAM, VPC, and the rest of the AWS ecosystem. What remains — worker nodes, networking, storage, observability — is still your responsibility, but the hardest part is managed.


Core concepts

The control plane

The Kubernetes control plane is the brain of the cluster — it schedules workloads, maintains desired state, and exposes the Kubernetes API. In EKS, AWS runs this for you across three availability zones. You interact with it via kubectl or any Kubernetes-native tooling, but you never SSH into it.

You pay $0.10/hour (~$73/month) per EKS cluster for the managed control plane, regardless of how many nodes you run.

Node groups

Worker nodes are the machines your workloads actually run on. EKS supports three node types:

Managed node groups — AWS provisions and manages EC2 instances for you. You define the instance type, scaling limits, and AMI; EKS handles node provisioning, updates, and graceful draining during upgrades. The default choice for most teams.

Self-managed nodes — you manage the EC2 instances yourself, joining them to the cluster manually. Maximum control, maximum operational burden. Only worth it for very specific requirements.

Fargate profiles — run pods serverlessly on Fargate, just like ECS Fargate. No node management at all. Works well for batch jobs and isolated workloads; has limitations around DaemonSets, privileged containers, and stateful workloads.
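The node types can sit side by side in one cluster. As a minimal sketch, the snippet below writes an eksctl config file that declares one managed node group and one Fargate profile. The cluster and node group names follow the example later in this article; the `batch-jobs` profile and the `batch` namespace are hypothetical.

```shell
# Write an eksctl cluster config: one managed node group plus one Fargate profile.
# The "batch-jobs" profile and "batch" namespace are illustrative assumptions.
cat > cluster.yaml <<'EOF'
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: my-cluster
  region: us-east-1

managedNodeGroups:
  - name: standard-workers
    instanceType: t3.medium
    minSize: 1
    maxSize: 4
    desiredCapacity: 2

fargateProfiles:
  - name: batch-jobs
    selectors:
      # Pods created in this namespace are scheduled onto Fargate
      - namespace: batch
EOF

echo "wrote cluster.yaml"
```

Assuming eksctl is installed, `eksctl create cluster -f cluster.yaml` would create the cluster from this file. Keeping the definition in a file makes node group changes reviewable in version control.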

Pods, deployments, and services

The Kubernetes primitives that matter most day-to-day:

Pod — the smallest deployable unit. One or more containers that share a network namespace and storage. Pods are ephemeral — they're created and destroyed constantly.

Deployment — declares the desired state for a set of pods ("run 3 replicas of this container image"). The controller continuously reconciles actual state with desired state, replacing failed pods and managing rolling updates.

Service — a stable network endpoint for a set of pods. Pods come and go; a Service gives them a consistent DNS name and load-balanced IP. The Kubernetes equivalent of an ECS service + load balancer target group.

Namespace — a logical partition within a cluster. Teams typically use namespaces to isolate environments (staging, production) or separate workloads from different teams.
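These four primitives usually appear together. A minimal sketch that writes a single manifest containing a Namespace, a Deployment and a Service; the `staging` namespace, the `app: my-app` label and the image URI (with a placeholder account ID) are assumptions for illustration.

```shell
# Write one manifest with a Namespace, a Deployment (3 replicas) and a Service.
# The image URI is a placeholder; substitute your own repository.
cat > my-app.yaml <<'EOF'
apiVersion: v1
kind: Namespace
metadata:
  name: staging
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: staging
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: staging
spec:
  # Routes to every pod carrying this label, whichever pods currently exist
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
EOF

echo "wrote my-app.yaml"
```

With kubectl configured, `kubectl apply -f my-app.yaml` would create all three objects. With the default cluster domain, other pods could then reach the Service at `my-app.staging.svc.cluster.local`.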

Add-ons and the operational surface

This is where EKS diverges sharply from ECS in complexity. A production-ready EKS cluster needs several add-ons that ECS handles natively or doesn't require:

Add-on                           Purpose
VPC CNI                          Pod networking: assigns VPC IPs to pods
CoreDNS                          Cluster-internal DNS resolution
kube-proxy                       Network rules for Service routing
AWS Load Balancer Controller     Provisions ALBs/NLBs from Kubernetes Ingress/Service
Cluster Autoscaler or Karpenter  Adds and removes nodes based on pod demand
EBS/EFS CSI Driver               Persistent storage for stateful workloads
metrics-server                   Resource metrics that horizontal pod autoscaling relies on

Each add-on has its own version lifecycle, upgrade cadence, and failure modes. Managing these well is a non-trivial ongoing commitment.
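A first step towards managing that lifecycle is knowing which add-on versions a cluster is actually running. A minimal sketch using the AWS CLI, assuming configured credentials; `addon_report` is a hypothetical helper name, and it only covers add-ons installed through the EKS add-ons API (anything installed with Helm or raw manifests will not appear).

```shell
# Print "<add-on> <version>" for each EKS-managed add-on on a cluster.
# Arguments: cluster name, region.
addon_report() {
  cluster="$1"
  region="$2"
  # list-addons returns the add-on names; text output is whitespace-separated
  for addon in $(aws eks list-addons --cluster-name "$cluster" --region "$region" \
      --query 'addons[]' --output text); do
    version=$(aws eks describe-addon --cluster-name "$cluster" --region "$region" \
      --addon-name "$addon" --query 'addon.addonVersion' --output text)
    echo "$addon $version"
  done
}
```

Calling `addon_report my-cluster us-east-1` would print one line per add-on. Before an upgrade, `aws eks describe-addon-versions --kubernetes-version <target>` lists the add-on versions compatible with the target Kubernetes version.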

IAM and RBAC

EKS uses two overlapping access control systems, which is a persistent source of confusion:

IAM controls who can call the EKS API (create clusters, describe node groups, update add-ons). Standard AWS IAM policies apply here.

Kubernetes RBAC controls who can call the Kubernetes API (deploy pods, read secrets, list services). This uses Kubernetes-native roles and role bindings.

EKS bridges these via IAM Roles for Service Accounts (IRSA), a mechanism that lets pods assume IAM roles, so your application code can call AWS APIs without storing credentials. IRSA, or the newer EKS Pod Identity feature that serves the same purpose, is the right way to give pods AWS permissions; falling back to instance profile credentials or hardcoded keys is an antipattern.
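As a rough sketch of what IRSA involves, the snippet below writes the two artefacts that tie a service account to an IAM role: a trust policy and an annotated ServiceAccount manifest. Every value in it (account ID, OIDC provider ID, namespace, service account name and role name) is a hypothetical placeholder, and it assumes the cluster's OIDC provider is already registered in IAM.

```shell
# All values below are hypothetical placeholders; substitute your own.
ACCOUNT_ID="123456789012"
OIDC_PROVIDER="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
NAMESPACE="staging"
SERVICE_ACCOUNT="my-app"
ROLE_NAME="my-app-role"

# 1. Trust policy: only this one service account may assume the role
cat > trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_PROVIDER}:sub": "system:serviceaccount:${NAMESPACE}:${SERVICE_ACCOUNT}",
          "${OIDC_PROVIDER}:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
EOF

# 2. Service account annotated with the role it should assume
cat > service-account.yaml <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ${SERVICE_ACCOUNT}
  namespace: ${NAMESPACE}
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::${ACCOUNT_ID}:role/${ROLE_NAME}
EOF

echo "wrote trust-policy.json and service-account.yaml"
```

One way to use these, assuming the AWS CLI and kubectl are configured: create the role with `aws iam create-role --role-name my-app-role --assume-role-policy-document file://trust-policy.json`, attach whatever permissions policy the application needs, then run `kubectl apply -f service-account.yaml`. Pods that run under the `my-app` service account would then receive credentials for that role only.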


Minimal working example

Create an EKS cluster with a managed node group and deploy a simple application:

# Install eksctl (the easiest way to manage EKS clusters)
brew install eksctl  # or see eksctl.io for other platforms

# Create a cluster with a managed node group
eksctl create cluster \
  --name my-cluster \
  --region us-east-1 \
  --nodegroup-name standard-workers \
  --node-type t3.medium \
  --nodes 2 \
  --nodes-min 1 \
  --nodes-max 4 \
  --managed

# Configure kubectl
aws eks update-kubeconfig --region us-east-1 --name my-cluster

# Deploy a simple application
kubectl create deployment my-app \
  --image=123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest \
  --replicas=2

# Expose it via a LoadBalancer service
kubectl expose deployment my-app \
  --type=LoadBalancer \
  --port=80 \
  --target-port=8080

# Check the external endpoint
kubectl get service my-app
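The example leaves billable resources running: the control plane, the worker nodes and a load balancer. A hedged sketch of a teardown follows, wrapped in a hypothetical helper function so that nothing runs until you call it. It assumes the names used above and that kubectl still points at this cluster.

```shell
# Tear down the example. Deleting the Service first gives Kubernetes the
# chance to remove the load balancer it provisioned before the cluster goes away.
teardown_example() {
  kubectl delete service my-app
  kubectl delete deployment my-app
  eksctl delete cluster --name my-cluster --region us-east-1
}
```

Run `teardown_example` once you are finished experimenting.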

Pricing model

EKS pricing has two components — and the control plane cost is the one that surprises people first.

Control plane: $0.10/hour per cluster = ~$73/month, regardless of workload size. This is the EKS tax. Running three clusters (production, staging, dev) costs $219/month before a single workload runs.

Worker nodes: standard EC2 pricing for your managed node group instances. The cluster autoscaler or Karpenter scales these based on demand.

Example — small production cluster:

Control plane:        1 cluster × $0.10/hr × 730 hrs      =  $73/month
Worker nodes:         3 × t3.medium On-Demand             =  $91/month
Load balancer (ALB):  1 × ALB                             =  $16/month
Total baseline:                                            ~$180/month

Compare this to two small ECS Fargate tasks at ~$18/month in compute (roughly the price of the smallest task size, before any load balancer). EKS carries a significant baseline cost that only becomes justified at scale.

The control plane cost is fixed, so the economics improve as you consolidate more workloads onto a single cluster. Large organisations running hundreds of services on one or two clusters amortise the control plane cost effectively. Small teams running one or two services rarely do.
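The baseline is plain arithmetic, so it is easy to rerun with your own numbers. A minimal sketch in shell, using awk for the decimal maths; the hourly rates for t3.medium ($0.0416) and the ALB ($0.0225, excluding LCU charges) are assumptions based on us-east-1 On-Demand list prices that reproduce the figures above, so check current pricing before relying on them. The function name is hypothetical.

```shell
# Monthly baseline for a given number of clusters, t3.medium nodes and ALBs.
# Rates are assumed us-east-1 On-Demand prices; 730 hours approximates a month.
monthly_baseline() {
  awk -v clusters="$1" -v nodes="$2" -v albs="$3" 'BEGIN {
    hours         = 730
    control_plane = clusters * 0.10   * hours
    workers       = nodes    * 0.0416 * hours
    alb           = albs     * 0.0225 * hours
    printf "control plane: $%.1f\n", control_plane
    printf "workers: $%.1f\n", workers
    printf "ALB: $%.1f\n", alb
    printf "total: $%.1f\n", control_plane + workers + alb
  }'
}

monthly_baseline 1 3 1
```

The call above prints a total of $180.5, matching the ~$180 baseline. `monthly_baseline 3 0 0` reproduces the $219/month figure for three empty clusters.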


When to use EKS (and when not to)

Use EKS when:

  • ✅ Your team is already running Kubernetes and the tooling investment is made

  • ✅ You need Kubernetes-native features: custom operators, CRDs, Helm-based deployment workflows

  • ✅ Workload portability across cloud providers is a genuine requirement

  • ✅ You're running complex multi-tenant workloads that benefit from namespace isolation and fine-grained RBAC

  • ✅ You need advanced scheduling (node affinity, pod topology spread, priority classes)

  • ✅ You're large enough that the operational investment is justified by the control you gain

Don't use EKS when:

  • ❌ You're a small team — the operational overhead will consume engineering time that should go to product

  • ❌ ECS covers your needs — if you don't have a specific Kubernetes requirement, you don't need EKS

  • ❌ Your team doesn't know Kubernetes — adopting EKS and learning Kubernetes simultaneously in production is a painful experience

  • ❌ Cost efficiency at small scale matters — the $73/month control plane cost is hard to justify for one or two services

  • ❌ You want fast iteration — Kubernetes adds abstraction layers that slow down simple changes


Common gotchas

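1. The Kubernetes version upgrade treadmill. AWS provides standard support for each Kubernetes minor version for approximately 14 months. After that the version moves into extended support for a further 12 months, during which the control plane is billed at a higher rate ($0.60/hour per cluster instead of $0.10), and once extended support ends AWS upgrades the cluster automatically. Miss an upgrade window and you either pay that premium or get upgraded on AWS's schedule rather than your own. In practice, EKS clusters require a minor version upgrade at least every 12 months, and because upstream Kubernetes ships roughly three minor releases a year and EKS upgrades the control plane one minor version at a time, waiting that long means several upgrades back to back. Each upgrade needs to be tested, add-ons need to be updated in the right order, and node groups need to be rotated. This is a real ongoing maintenance commitment that teams underestimate when they first adopt EKS.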

2. VPC CNI and IP exhaustion. EKS's default networking uses the VPC CNI plugin, which assigns a real VPC IP address to every pod. In a large cluster, this consumes IP addresses at a surprising rate — a node running 30 pods consumes 31 VPC IPs (one for the node, one per pod). Small VPC subnets fill up fast. Plan your VPC CIDR ranges before deploying EKS to production, or use prefix delegation to increase pod density per node.

3. The add-on upgrade order matters. When upgrading an EKS cluster, add-ons need to be updated in a specific order and to versions compatible with the new Kubernetes version. Upgrading the control plane without updating the VPC CNI or kube-proxy can cause networking failures. Always check the EKS version compatibility matrix before upgrading and follow the documented order: control plane first, then add-ons, then node groups.

4. IRSA is not automatic. Giving pods AWS permissions via IRSA requires creating an OIDC provider for your cluster, writing an IAM role with a trust policy referencing the pod's service account, and annotating the service account. Teams who skip this and fall back to node-level IAM instance profiles end up with every pod on a node sharing the same broad permissions — a significant security risk.

5. kubectl access and the aws-auth ConfigMap. By default, only the IAM principal that created the EKS cluster can access it via kubectl. Adding other users or roles has traditionally meant editing the aws-auth ConfigMap, a YAML document in the kube-system namespace that maps IAM identities to Kubernetes users and groups, which RBAC bindings then grant permissions to. Getting this wrong either locks people out or grants unintended access. EKS access entries, a newer mechanism managed through the EKS API, are replacing aws-auth (which AWS now marks as deprecated) and are worth using for new clusters.
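Returning to gotcha 2: the number of pods, and therefore VPC IPs, that a node can hold is bounded by its network interfaces. A minimal sketch of the commonly documented formula for the default VPC CNI mode, without prefix delegation. The ENI and per-ENI address limits used for the two instance types are assumptions here, so confirm them against current EC2 documentation; the function name is hypothetical.

```shell
# Max pods per node under default VPC CNI settings (no prefix delegation):
#   ENIs x (IPv4 addresses per ENI - 1) + 2
# One address per ENI is the ENI's own primary IP; the +2 covers
# host-networked pods that do not take a VPC IP of their own.
max_pods() {
  enis="$1"
  ips_per_eni="$2"
  echo $(( enis * (ips_per_eni - 1) + 2 ))
}

max_pods 3 6    # t3.medium: 3 ENIs, 6 IPv4 addresses each  -> prints 17
max_pods 3 10   # m5.large:  3 ENIs, 10 IPv4 addresses each -> prints 29
```

Multiplying the result by your expected node count gives a rough lower bound on the addresses a subnet needs, before allowing for the spare IPs the CNI keeps warm on each node.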


Compared to the alternatives

EKS vs ECS + Fargate

ECS is simpler, cheaper at small scale, and has less operational overhead. EKS wins when you need the Kubernetes API specifically — custom operators, Helm-based workflows, multi-cloud portability, or advanced scheduling. The right question isn't "which is better" but "do I have a requirement that ECS can't meet?" If not, ECS is the correct choice.

EKS vs self-managed Kubernetes (kubeadm, kops)

Self-managed Kubernetes gives you maximum control over the control plane and is cheaper if you're already running on EC2. The operational burden is significantly higher — you're responsible for etcd backups, control plane HA, and upgrades. For most teams on AWS, EKS's $73/month is well worth the managed control plane. Self-managed Kubernetes on AWS is rare outside very specific compliance or cost scenarios at large scale.

EKS vs GKE (Google Kubernetes Engine)

GKE is widely considered the most polished managed Kubernetes experience — Google invented Kubernetes, and it shows. Autopilot mode (GKE's equivalent of Fargate for pods) is more mature than EKS Fargate profiles. If you're not AWS-native, GKE is worth serious consideration. For teams already invested in the AWS ecosystem, EKS's IAM and VPC integration tips the balance.


Key takeaways

  • EKS manages the Kubernetes control plane; you're still responsible for nodes, networking, add-ons, and upgrades.

  • The $0.10/hour control plane cost (~$73/month per cluster) makes EKS expensive at small scale. The economics only improve as you consolidate more workloads onto fewer clusters.

  • Don't adopt EKS without a specific Kubernetes requirement. ECS + Fargate covers the vast majority of containerised workloads with far less operational weight.

  • The version upgrade treadmill and VPC IP exhaustion are the two production problems most teams don't anticipate when they first adopt EKS.

  • IRSA is the correct way to give pods AWS permissions. Node-level instance profiles are a security antipattern in multi-workload clusters.


Up next

Part 6 → AWS Compute in review: what we learned, what surprised us, what to pick

The wrap-up — a consolidated decision framework, a one-page cheat sheet across all four services, and the honest verdict on each. Plus a preview of the next series: AWS Storage.


Previously

Part 4 → AWS ECS & Fargate: containers without the cluster headache

Covers ECS task definitions, services, the Fargate vs EC2 launch type decision, and the IAM execution role vs task role confusion that trips up almost every team on their first ECS deployment.


Part of the AWS Field Guide series. Tags: #aws #eks #kubernetes #containers #aws-field-guide