AWS Lambda: the serverless sweet spot (and where it falls apart)

Series: AWS Field Guide · Part 3 of 6: Compute
This series: 01 Overview · 02 EC2 · 03 Lambda · 04 ECS & Fargate · 05 EKS · 06 Wrap-up

The scenario

Your S3 bucket receives a new image upload every few seconds — sometimes. At 2am on a Tuesday it might be one upload per hour. At noon on a product launch day it might be a thousand per minute. You need to resize each image as it arrives.

Provisioning an EC2 instance for this means paying for a server that sits idle most of the time. Autoscaling helps, but there's still a floor — a minimum number of instances running even at zero load. And you'd be maintaining a server just to run a function that takes 300 milliseconds.

That's exactly the problem Lambda was built for.

TL;DR: Lambda lets you run code without provisioning or managing servers. You write a function, attach a trigger, and AWS handles the rest — scaling from zero to thousands of concurrent executions automatically. You pay only for the milliseconds your code actually runs. The tradeoffs are real though: 15-minute execution limit, cold starts, and concurrency constraints that can surprise you in production.

The problem it solves

Traditional server-based compute has a fundamental mismatch problem: you provision for peak load, but pay for it at all times. For workloads that are spiky, infrequent, or event-driven, you're constantly paying for idle capacity.

Lambda inverts this. There's no server to provision. No fleet to maintain. No paying for idle. The unit of work is the function invocation — and you're billed only when work is actually happening. For the right workloads, this isn't just cheaper — it removes an entire category of operational burden.

Core concepts

The execution model

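When a Lambda function is invoked and no warm execution environment is available, AWS:

1. Spins up a sandboxed execution environment
2. Downloads your deployment package (code + dependencies)
3. Initialises your runtime and runs your initialisation code
4. Runs your handler function
5. Returns the response and freezes the environment for possible reuse

On a warm invocation, steps 1 to 3 are skipped and AWS goes straight to the handler.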

That execution environment — the container AWS manages for you — is what makes Lambda both powerful and occasionally surprising. Understanding its lifecycle is the key to using Lambda well.
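A minimal sketch of that lifecycle, assuming the usual Python runtime behaviour where module-level code runs once per execution environment and the handler runs on every invocation. The names `_ENV_STARTED_AT` and `_invocation_count` are illustrative and not part of any Lambda API.

```python
import time

# Module-level code runs once per execution environment (the cold start).
_ENV_STARTED_AT = time.time()
_invocation_count = 0


def handler(event, context):
    """Report whether this invocation hit a fresh or a reused environment."""
    global _invocation_count
    _invocation_count += 1
    return {
        # Only the first call in a given environment counts as a cold start.
        "cold_start": _invocation_count == 1,
        "invocations_in_this_environment": _invocation_count,
        "environment_age_seconds": round(time.time() - _ENV_STARTED_AT, 3),
    }
```

The counter only lives as long as one environment does, and parallel environments each keep their own copy, so it is useful for diagnostics but not as application state.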

Triggers

Lambda functions don't run on their own. They respond to events. Common triggers include:

Trigger	Use case
API Gateway / Function URL	HTTP APIs and webhooks
S3 events	Process files on upload
DynamoDB Streams	React to database changes
SQS / SNS	Process queue messages
EventBridge	Scheduled jobs, event routing
Cognito	Custom auth flows
CloudFront	Edge logic (Lambda@Edge)

The trigger defines the event shape — what your handler function receives as its event argument.
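To make "event shape" concrete, here is a rough sketch of a helper that guesses which trigger produced an event by looking at a few well-known fields. The function name `detect_source` and its return labels are hypothetical, and the checks are best-effort rather than an exhaustive map of every trigger.

```python
def detect_source(event):
    """Best-effort guess at which trigger produced this event."""
    records = event.get("Records")
    if records:
        first = records[0]
        # S3, SQS and DynamoDB Streams use "eventSource"; SNS uses "EventSource".
        source = first.get("eventSource") or first.get("EventSource")
        if source:
            return source
    if "requestContext" in event:
        return "http"  # API Gateway or Function URL style payload
    if "detail-type" in event:
        return "eventbridge"
    return "unknown"
```

In practice most functions are wired to a single trigger and can assume one shape, so a helper like this mainly earns its keep in shared logging or debugging code.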

Handler function

Every Lambda function has a handler — the entry point AWS calls on each invocation. In Python:

def handler(event, context):
    # event: the trigger payload (dict)
    # context: runtime info (timeout remaining, request ID, etc.)
    print(f"Processing: {event}")
    return {"statusCode": 200, "body": "Done"}

The handler is stateless by design. Any state that needs to persist across invocations must live outside the function — in S3, DynamoDB, ElastiCache, or another external store.

Invocation types

Lambda supports three invocation types:

Synchronous — the caller waits for the function to complete and return a response. Used by API Gateway, ALB, and direct SDK calls. If the function errors, the caller gets the error immediately.

Asynchronous — the caller hands off the event and moves on. Lambda retries on failure (up to twice by default). Used by S3 events, SNS, EventBridge. Configure a Dead Letter Queue (DLQ) or destination to capture failed events.

Stream/poll-based — Lambda polls a source (SQS, DynamoDB Streams, Kinesis) and invokes the function with batches of records. Failure handling and retry behaviour varies by source.

Getting the invocation type right matters — especially for error handling. Synchronous errors surface immediately; asynchronous failures can disappear silently without a DLQ.
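As one illustration of source-specific failure handling, the sketch below shows an SQS batch handler that reports only the failed messages back to Lambda, so the successful ones are not retried. It assumes the event source mapping has the ReportBatchItemFailures response type enabled; `process` and the `image_key` field are hypothetical stand-ins for real business logic.

```python
import json


def process(body):
    """Hypothetical business logic: reject payloads without an 'image_key'."""
    if "image_key" not in body:
        raise ValueError("missing image_key")


def handler(event, context):
    failures = []
    for record in event.get("Records", []):
        try:
            process(json.loads(record["body"]))
        except Exception as exc:
            # Log the failure and report just this message for retry.
            print(f"Failed {record['messageId']}: {exc}")
            failures.append({"itemIdentifier": record["messageId"]})
    # An empty list tells Lambda the whole batch succeeded.
    return {"batchItemFailures": failures}
```

Without the partial-failure response, raising an exception makes the whole batch visible again on the queue, including the messages that were already processed successfully.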

Concurrency

Lambda scales by running multiple instances of your function in parallel — one instance per concurrent request. This is both the magic and the gotcha.

Concurrency limit: By default, your AWS account has a regional concurrency limit of 1,000 concurrent Lambda executions across all functions. Hit that limit and new invocations are throttled.

Reserved concurrency: You can allocate a fixed concurrency pool to a specific function — guaranteeing it always has capacity, but also capping it so it can't consume the entire account limit.

Provisioned concurrency: Pre-warms a set number of execution environments, eliminating cold starts for those instances. Costs more, but essential for latency-sensitive APIs.
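A quick way to judge how close a workload gets to the limit is the rule of thumb that concurrency is roughly request rate multiplied by average duration. The sketch below applies it to the scenario from the top of this article (a thousand uploads per minute at about 300 milliseconds each); the function names are illustrative, and the 1,000 default is the account limit quoted above.

```python
def estimated_concurrency(requests_per_second, avg_duration_seconds):
    """Approximate number of function instances busy at the same time."""
    return requests_per_second * avg_duration_seconds


def fits_within_limit(requests_per_second, avg_duration_seconds, limit=1000):
    """True if the estimate stays at or below the concurrency limit."""
    return estimated_concurrency(requests_per_second, avg_duration_seconds) <= limit


# Launch-day peak from the scenario: 1,000 uploads per minute, 300ms each.
peak = estimated_concurrency(1000 / 60, 0.3)
print(f"Estimated peak concurrency: {peak:.1f}")
```

The estimate is an average, so short bursts can exceed it. It is still enough to show that this workload sits far below the limit, while something like 2,000 requests per second at 800ms each would not.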

Minimal working example

A Lambda function triggered by an S3 upload that logs the filename:

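import urllib.parse

def handler(event, context):
    processed = []
    # An S3 notification can carry more than one record, so handle them all.
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        # Object keys arrive URL-encoded (spaces become '+'), so decode them.
        key = urllib.parse.unquote_plus(record['s3']['object']['key'])
        print(f"New file uploaded: s3://{bucket}/{key}")
        processed.append(key)
    return {"processed": processed}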

Deploy it with the AWS CLI:

# Package the function (assumes the code is saved as lambda_function.py)
zip function.zip lambda_function.py

# Create the function (the account ID and role name are placeholders)
aws lambda create-function \
  --function-name process-s3-upload \
  --runtime python3.12 \
  --role arn:aws:iam::123456789012:role/lambda-s3-role \
  --handler lambda_function.handler \
  --zip-file fileb://function.zip

# Invoke it manually to test
# AWS CLI v2 expects a base64 payload by default; this flag lets it accept raw JSON
aws lambda invoke \
  --function-name process-s3-upload \
  --cli-binary-format raw-in-base64-out \
  --payload '{"Records":[{"s3":{"bucket":{"name":"my-bucket"},"object":{"key":"test.jpg"}}}]}' \
  response.json

Pricing model

Lambda pricing has two components:

Requests: $0.20 per 1 million invocations. The first 1 million per month are free.

Duration: Billed in 1ms increments at $0.0000166667 per GB-second (the amount of memory allocated × execution time in seconds).

Example — image resize function:

Memory: 512MB (0.5GB)
Avg execution time: 800ms
Volume: 5 million invocations/month

Requests cost:  5M invocations × $0.20/1M          = $1.00
Duration cost:  5M × 0.8s × 0.5GB × $0.0000166667  = $33.33
Total:                                               ~$34/month

For comparison, a t3.small EC2 instance running 24/7 for the same month costs ~$15, but its capacity is fixed at 2 vCPUs and 2GB of memory, so a launch-day burst queues up behind it, and it sits idle between uploads. Lambda scales to handle thousands simultaneously without any additional configuration.

EC2 typically becomes cheaper than Lambda for sustained, high-concurrency workloads with long execution times. For spiky or infrequent workloads, Lambda almost always wins on cost.
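The arithmetic in the example can be wrapped in a small calculator for trying out other memory sizes and volumes. This is a sketch using only the two list prices quoted above; it ignores the free tier and any regional or architecture-specific pricing, and the function name `monthly_cost` is illustrative.

```python
REQUEST_PRICE_PER_MILLION = 0.20      # USD per 1 million invocations
PRICE_PER_GB_SECOND = 0.0000166667    # USD per GB-second of duration


def monthly_cost(invocations, avg_duration_seconds, memory_gb):
    """Return the request, duration and total cost in USD for one month."""
    request_cost = invocations / 1_000_000 * REQUEST_PRICE_PER_MILLION
    gb_seconds = invocations * avg_duration_seconds * memory_gb
    duration_cost = gb_seconds * PRICE_PER_GB_SECOND
    return {
        "requests": round(request_cost, 2),
        "duration": round(duration_cost, 2),
        "total": round(request_cost + duration_cost, 2),
    }


# The image resize example: 5M invocations, 800ms average, 0.5GB of memory.
print(monthly_cost(5_000_000, 0.8, 0.5))
```

Duration dominates the bill in this example, which is why trimming execution time or memory usually matters more than reducing the number of requests.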

When to use Lambda (and when not to)

Use Lambda when:

✅ The workload is event-driven — something happens, you react to it
✅ Traffic is spiky or unpredictable — Lambda scales to zero and to thousands without intervention
✅ Tasks are short-lived — well within the 15-minute limit
✅ You want zero server management — no patching, no AMIs, no autoscaling groups
✅ You're building glue logic — connecting AWS services together (S3 → process → DynamoDB)
✅ Cost at low volume matters — Lambda is genuinely free at small scale

Don't use Lambda when:

❌ Tasks run longer than 15 minutes — use ECS, Batch, or EC2
❌ You need persistent in-memory state between invocations — Lambda is stateless by design
❌ Cold start latency is unacceptable and provisioned concurrency is too expensive
❌ The workload is constantly running at high concurrency — EC2 or ECS becomes cheaper
❌ You need long-lived TCP connections (WebSockets, persistent database connections at scale)
❌ Your team needs to debug with standard server tooling — Lambda's local development experience is genuinely worse

Common gotchas

1. Cold starts in production latency budgets. When a Lambda function hasn't been invoked recently, the execution environment needs to be initialised — downloading the package, starting the runtime, running your initialisation code. This adds anywhere from 100ms to several seconds depending on your runtime (Java and .NET are the worst offenders; Python and Node are much faster). For user-facing APIs, cold starts surface as occasional slow responses that are hard to reproduce. Provisioned concurrency solves this but adds cost.

2. Connection pool exhaustion with databases. Lambda's scaling model creates a database connection problem. Each concurrent function instance opens its own connection. 500 concurrent Lambda executions means 500 database connections — easily overwhelming a standard RDS instance. The solution is RDS Proxy, which pools connections between Lambda and your database. Missing this is a common production incident waiting to happen.

3. Silent failures on async invocations. Asynchronous Lambda invocations retry twice on failure and then, by default, drop the event silently. Without a Dead Letter Queue or an on-failure destination configured, failed events vanish without a trace. Always configure failure destinations for async workloads.

4. The 15-minute timeout is a hard wall. Lambda terminates your function when it reaches its configured timeout (15 minutes at most) with no grace period, and it never resumes the work where it left off. If your function is doing something that occasionally tips over that limit (a slow external API call, a large file processing job), you'll get partial failures that are difficult to reason about. Design around the limit or use a different compute option.

5. Package size affects cold start time. Every dependency you include in your deployment package adds to initialisation time. Lambda layers help share common dependencies across functions, but the real fix is keeping functions lean. A function that imports half of numpy for one utility method will always cold-start slowly.
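For gotcha 4, one way to design around the timeout is to check how much time is left before starting each unit of work and hand back whatever did not fit. The sketch below relies on the `get_remaining_time_in_millis()` method of the Lambda context object; the `keys` event field, the `resize` placeholder, and the 10-second safety margin are assumptions made for illustration.

```python
SAFETY_MARGIN_MS = 10_000  # stop starting new work with under 10s left


def resize(key):
    """Hypothetical placeholder for the real per-item work."""
    return f"resized/{key}"


def handler(event, context):
    pending = list(event.get("keys", []))
    done = []
    while pending:
        # Bail out early instead of being killed halfway through an item.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break
        done.append(resize(pending.pop(0)))
    # The caller (or a follow-up invocation) can pick up what is left.
    return {"done": done, "remaining": pending}
```

Returning the leftover items keeps the failure explicit: the function finishes cleanly, and the unprocessed work can be queued again rather than being lost to a mid-item termination.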

Compared to the alternatives

Lambda vs EC2

EC2 is the right choice when you need a long-running process, OS-level control, or consistent high-throughput compute. Lambda wins on operational simplicity and cost for event-driven, short-lived tasks. They complement each other — most real architectures use both.

Lambda vs ECS + Fargate

For containerised workloads that need more than 15 minutes or persistent connections, Fargate is the better fit. Lambda is simpler to deploy and cheaper at low volumes; Fargate is more predictable at sustained load and, because its tasks stay running between requests, avoids per-request cold starts.

Lambda vs Cloudflare Workers

Workers run at the edge (closer to users), have a near-zero cold start, and are genuinely cheaper for simple request/response workloads. The tradeoff: much tighter runtime constraints, no native AWS service integrations, and a different programming model. For globally distributed low-latency APIs, Workers are worth evaluating. For anything tightly integrated with AWS services, Lambda wins.

Key takeaways

Lambda runs your code in response to events — no servers to manage, no idle capacity to pay for.
The execution model is stateless and ephemeral. Anything that needs to persist across invocations lives outside the function.
Pricing is genuinely cheap for spiky or low-volume workloads. At sustained high concurrency, EC2 or Fargate often becomes more cost-effective.
Cold starts, concurrency limits, the 15-minute timeout, and database connection exhaustion are the four production problems worth understanding before you go live.
Async invocations need explicit failure handling. Silent drops are a real risk without DLQs or destinations configured.

Up next

Part 4 → AWS ECS & Fargate: containers without the cluster headache

We go deep on ECS — task definitions, services, the Fargate vs EC2 launch type decision, and the IAM task role confusion that trips up almost every team on their first ECS deployment.

Previously

Part 2 → AWS EC2: the workhorse you should understand before anything else

Covers EC2 instance types, AMIs, security groups, and the On-Demand vs Reserved vs Spot pricing decision that has the biggest impact on your AWS bill.

Part of the AWS Field Guide series. Tags: #aws #lambda #serverless #compute #aws-field-guide
