Day 6: Microservices & API Design
What You'll Learn Today
- Monolith vs microservices tradeoffs
- API design: REST vs gRPC vs GraphQL
- API Gateway pattern
- Service discovery mechanisms
- Rate limiting and throttling strategies
- Authentication with OAuth 2.0 and JWT
- Idempotency in API design
Monolith vs Microservices
flowchart LR
subgraph Monolith["Monolith Architecture"]
direction TB
UI1["UI Layer"]
BL1["Business Logic"]
DB1[("Single Database")]
UI1 --> BL1 --> DB1
end
subgraph Micro["Microservices Architecture"]
direction TB
GW["API Gateway"]
S1["User Service"]
S2["Order Service"]
S3["Payment Service"]
DB2[("User DB")]
DB3[("Order DB")]
DB4[("Payment DB")]
GW --> S1 & S2 & S3
S1 --> DB2
S2 --> DB3
S3 --> DB4
end
style Monolith fill:#f59e0b,color:#fff
style Micro fill:#3b82f6,color:#fff
| Aspect | Monolith | Microservices |
|---|---|---|
| Deployment | Single unit, simple | Independent, complex |
| Scaling | Scale everything | Scale individual services |
| Development | Easy to start | Better for large teams |
| Data consistency | ACID transactions | Eventual consistency |
| Latency | In-process calls | Network calls (higher) |
| Debugging | Simpler stack traces | Distributed tracing needed |
| Technology | Single stack | Polyglot possible |
| Failure | Entire app fails | Partial failures |
When to Choose Each
- Start with a monolith when you have a small team, unclear domain boundaries, or are building an MVP.
- Move to microservices when you need independent scaling, have distinct team boundaries, or need different tech stacks per service.
API Design: REST vs gRPC vs GraphQL
REST (Representational State Transfer)
GET /api/v1/users/123 → Get user
POST /api/v1/users → Create user
PUT /api/v1/users/123 → Update user
DELETE /api/v1/users/123 → Delete user
PATCH /api/v1/users/123 → Partial update
Key principles:
- Resource-based URLs
- HTTP methods for actions
- Stateless
- JSON payloads (typically)
- HTTP status codes for responses
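As a rough illustration of these principles, the sketch below maps HTTP methods on a `/api/v1/users` resource to actions and status codes. `UserStore` and `handle` are hypothetical names invented for this example, and the in-memory dictionary stands in for a real database; a production service would use a web framework instead.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UserStore:
    """Hypothetical in-memory store standing in for a real database."""
    users: dict = field(default_factory=dict)
    next_id: int = 1


def handle(store: UserStore, method: str, path: str, body: Optional[dict] = None):
    """Return (status_code, payload) for a request against /api/v1/users."""
    parts = path.strip("/").split("/")  # e.g. ["api", "v1", "users", "123"]
    if parts[:3] != ["api", "v1", "users"] or len(parts) > 4:
        return 404, {"error": "not found"}
    user_id = parts[3] if len(parts) == 4 else None

    if user_id is None:
        if method == "POST":  # create on the collection URL
            new_id = str(store.next_id)
            store.next_id += 1
            store.users[new_id] = dict(body or {})
            return 201, {**store.users[new_id], "id": new_id}
        return 405, {"error": "method not allowed"}

    if user_id not in store.users:
        return 404, {"error": "not found"}
    if method == "GET":
        return 200, {**store.users[user_id], "id": user_id}
    if method == "PUT":  # replace the whole resource
        store.users[user_id] = dict(body or {})
        return 200, {**store.users[user_id], "id": user_id}
    if method == "PATCH":  # merge only the supplied fields
        store.users[user_id].update(body or {})
        return 200, {**store.users[user_id], "id": user_id}
    if method == "DELETE":
        del store.users[user_id]
        return 204, None
    return 405, {"error": "method not allowed"}
```

Because each call carries everything the handler needs (method, path, body), no per-client session state is required, which is what "stateless" means here.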
gRPC (Google Remote Procedure Call)
service RideService {
  rpc RequestRide(RideRequest) returns (RideResponse);
  rpc StreamLocation(stream LocationUpdate) returns (stream DriverLocation);
}
message RideRequest {
  string user_id = 1;
  Location pickup = 2;
  Location dropoff = 3;
}
Key features:
- Protocol Buffers (binary serialization)
- HTTP/2 with multiplexing
- Bidirectional streaming
- Code generation for multiple languages
GraphQL
query {
  user(id: "123") {
    name
    email
    rides(last: 5) {
      id
      status
      driver {
        name
        rating
      }
    }
  }
}
Key features:
- Client specifies exact data needed
- Single endpoint
- Reduces over-fetching and under-fetching, because the response contains only the requested fields
- Strong type system with schema
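To make "client specifies exact data needed" concrete, here is a toy projection function. It is not a GraphQL engine (no parsing, schema, or resolvers); `select` and the sample record are assumptions made up for this sketch, with the selection written as a nested dictionary that mirrors the query above.

```python
def select(data, selection):
    """Return only the fields named in `selection`.

    `selection` maps a field name to None (leaf field) or to a nested
    selection. Lists are projected element by element.
    """
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for name, sub in selection.items():
        value = data[name]
        result[name] = value if sub is None else select(value, sub)
    return result


# Hypothetical full record as a backend might hold it.
user = {
    "name": "Ada", "email": "ada@example.com", "phone": "555-0100",
    "rides": [{"id": "r1", "status": "COMPLETED", "fare": 12.5,
               "driver": {"name": "Sam", "rating": 4.9, "license": "X1"}}],
}

# Same shape as the query: name, email, rides { id status driver { name rating } }
query = {"name": None, "email": None,
         "rides": {"id": None, "status": None,
                   "driver": {"name": None, "rating": None}}}

projected = select(user, query)
```

Fields the client did not ask for (`phone`, `fare`, `license`) never appear in `projected`, which is the over-fetching reduction in miniature.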
Comparison
| Feature | REST | gRPC | GraphQL |
|---|---|---|---|
| Protocol | HTTP/1.1+ | HTTP/2 | HTTP |
| Data format | JSON | Protobuf (binary) | JSON |
| Performance | Good | Excellent | Good |
| Streaming | Limited | Bidirectional | Subscriptions |
| Browser support | Native | Requires proxy | Native |
| Best for | Public APIs | Service-to-service | Mobile/frontend |
| Learning curve | Low | Medium | Medium |
| Caching | HTTP caching | Custom | Complex |
API Gateway Pattern
flowchart TB
C1["Mobile App"] & C2["Web App"] & C3["Third Party"]
GW["API Gateway"]
C1 & C2 & C3 --> GW
subgraph Services["Backend Services"]
S1["User Service"]
S2["Ride Service"]
S3["Payment Service"]
S4["Notification Service"]
end
GW --> S1 & S2 & S3 & S4
subgraph GWFeatures["Gateway Responsibilities"]
F1["Authentication"]
F2["Rate Limiting"]
F3["Load Balancing"]
F4["Request Routing"]
F5["Response Aggregation"]
F6["SSL Termination"]
end
style GW fill:#8b5cf6,color:#fff
style Services fill:#3b82f6,color:#fff
style GWFeatures fill:#22c55e,color:#fff
The API Gateway acts as a single entry point for all clients. It handles cross-cutting concerns so individual services don't have to.
Popular implementations: Kong, AWS API Gateway, Netflix Zuul, Envoy
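A minimal sketch of two gateway responsibilities, authentication and request routing, might look like the following. The `Gateway` class, the static token set, and the lambda backends are all hypothetical stand-ins; real gateways such as those listed above are configured rather than hand-written.

```python
from typing import Callable, Dict, Set, Tuple

Handler = Callable[[str], Tuple[int, str]]


class Gateway:
    """Toy gateway: checks a token, then routes by longest path prefix."""

    def __init__(self, valid_tokens: Set[str]):
        self.routes: Dict[str, Handler] = {}
        self.valid_tokens = valid_tokens

    def register(self, prefix: str, handler: Handler) -> None:
        self.routes[prefix] = handler

    def handle(self, path: str, token: str) -> Tuple[int, str]:
        # Cross-cutting concern handled once, before any service is called.
        if token not in self.valid_tokens:
            return 401, "unauthorized"
        # Longest matching prefix wins, so /rides/admin can override /rides.
        for prefix in sorted(self.routes, key=len, reverse=True):
            if path == prefix or path.startswith(prefix + "/"):
                return self.routes[prefix](path)
        return 404, "no route"


gateway = Gateway({"secret-token"})
gateway.register("/users", lambda path: (200, "user-service:" + path))
gateway.register("/rides", lambda path: (200, "ride-service:" + path))
```

The backends never see unauthenticated traffic, which is the point of centralizing the check at the entry point.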
Service Discovery
In a microservices architecture, services need to find each other. Because instances scale up and down and their IP addresses change, addresses cannot be hard-coded.
flowchart TB
subgraph Client["Client-Side Discovery"]
C1["Service A"] -->|"1. Query"| R1["Service Registry"]
R1 -->|"2. Return addresses"| C1
C1 -->|"3. Direct call"| S1["Service B (instance 1)"]
end
subgraph Server["Server-Side Discovery"]
C2["Service A"] -->|"1. Request"| LB["Load Balancer"]
LB -->|"2. Query"| R2["Service Registry"]
LB -->|"3. Forward"| S2["Service B (instance 2)"]
end
style Client fill:#3b82f6,color:#fff
style Server fill:#8b5cf6,color:#fff
| Approach | How It Works | Example |
|---|---|---|
| Client-side discovery | Client queries registry, picks instance | Netflix Eureka |
| Server-side discovery | Load balancer queries registry | AWS ELB, Kubernetes |
| DNS-based | Services register DNS entries | Consul, CoreDNS |
| Service mesh | Sidecar proxy handles routing | Istio, Linkerd |
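The client-side approach in the table can be sketched as a registry plus a client that picks an instance itself. `ServiceRegistry` and `DiscoveryClient` are hypothetical names for this example, and round-robin is just one possible selection policy; real registries also handle health checks and lease expiry.

```python
import itertools
from collections import defaultdict


class ServiceRegistry:
    """Toy registry: service name -> list of instance addresses."""

    def __init__(self):
        self._instances = defaultdict(list)

    def register(self, service, address):
        if address not in self._instances[service]:
            self._instances[service].append(address)

    def deregister(self, service, address):
        if address in self._instances[service]:
            self._instances[service].remove(address)

    def lookup(self, service):
        return list(self._instances[service])


class DiscoveryClient:
    """Queries the registry on every call and rotates through the results."""

    def __init__(self, registry):
        self._registry = registry
        self._counter = itertools.count()

    def pick(self, service):
        instances = self._registry.lookup(service)
        if not instances:
            raise LookupError(f"no instances registered for {service}")
        return instances[next(self._counter) % len(instances)]
```

Since the client looks up instances on each call, a deregistered instance stops receiving traffic immediately in this sketch; real clients usually cache the list and refresh it periodically.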
Rate Limiting & Throttling
Rate limiting protects services from being overwhelmed. It's critical for public APIs and shared resources.
Common Algorithms
flowchart LR
subgraph TB["Token Bucket"]
direction TB
T1["Tokens added at fixed rate"]
T2["Request consumes a token"]
T3["No token → rejected"]
T1 --> T2 --> T3
end
subgraph SW["Sliding Window"]
direction TB
W1["Track requests in time window"]
W2["Count requests"]
W3["Over limit → rejected"]
W1 --> W2 --> W3
end
style TB fill:#3b82f6,color:#fff
style SW fill:#22c55e,color:#fff
| Algorithm | Pros | Cons |
|---|---|---|
| Token Bucket | Allows bursts, smooth | Two parameters (capacity, refill rate) to tune |
| Leaky Bucket | Smooth output rate | No burst handling |
| Fixed Window | Simple | Burst at window edges |
| Sliding Window Log | Precise | High memory usage |
| Sliding Window Counter | Good balance | Approximate |
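As one concrete case from the table, a token bucket can be sketched in a few lines. This is a single-process illustration under assumed names (`TokenBucket`, `allow`); the injectable `clock` parameter exists only to make the behavior easy to test, and a multi-server deployment would need shared state.

```python
import time


class TokenBucket:
    """Refills continuously at `refill_rate` tokens/second up to `capacity`."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self._clock = clock
        self._tokens = float(capacity)  # start full so an initial burst is allowed
        self._last = clock()

    def allow(self, cost=1.0):
        """Consume `cost` tokens if available; return whether the request passes."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Lazy refill: add the tokens earned since the last call, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

With `capacity=2` and `refill_rate=1.0`, two back-to-back requests pass, a third is rejected, and one more passes after a second has elapsed. The only state per key is a token count and a timestamp.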
Rate Limit Headers
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1625097600
Retry-After: 60
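A server might assemble these headers along the following lines. The `X-RateLimit-*` names are a widely used convention rather than a formal standard, while `Retry-After` is a standard HTTP header; `rate_limit_headers` and its parameters are hypothetical names for this sketch.

```python
import math


def rate_limit_headers(limit, remaining, reset_epoch, now_epoch):
    """Build rate-limit headers; times are Unix timestamps in seconds."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(int(reset_epoch)),
    }
    if remaining <= 0:
        # Tell the client how many seconds to wait before retrying.
        headers["Retry-After"] = str(max(0, math.ceil(reset_epoch - now_epoch)))
    return headers
```

Calling `rate_limit_headers(100, 0, 1625097600, 1625097540)` reproduces the header values shown above, with `Retry-After` equal to the 60 seconds left in the window.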
Authentication: OAuth 2.0 & JWT
OAuth 2.0 Flow
sequenceDiagram
participant U as User
participant A as App (Client)
participant AS as Auth Server
participant RS as Resource Server
U->>A: 1. Click "Login"
A->>AS: 2. Redirect to auth page
U->>AS: 3. Enter credentials
AS->>A: 4. Authorization code
A->>AS: 5. Exchange code for tokens
AS->>A: 6. Access token + Refresh token
A->>RS: 7. API call with access token
RS->>A: 8. Protected resource
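The steps in the diagram can be walked through with a toy in-memory model. Everything here (`AuthServer`, `ResourceServer`, the `introspect` lookup) is a simplified assumption for illustration; a real authorization server also validates client credentials, redirect URIs, scopes, and token expiry.

```python
import secrets


class AuthServer:
    """Toy authorization server for the authorization-code flow."""

    def __init__(self):
        self._codes = {}   # authorization code -> user id
        self._tokens = {}  # access token -> user id

    def authorize(self, user_id):
        # Steps 3-4: the user has authenticated; issue a one-time code.
        code = secrets.token_urlsafe(16)
        self._codes[code] = user_id
        return code

    def exchange(self, code):
        # Steps 5-6: swap the code for tokens; pop() makes the code single-use.
        user_id = self._codes.pop(code, None)
        if user_id is None:
            raise PermissionError("invalid or already used authorization code")
        access = secrets.token_urlsafe(32)
        self._tokens[access] = user_id
        return {"access_token": access, "refresh_token": secrets.token_urlsafe(32)}

    def introspect(self, access_token):
        return self._tokens.get(access_token)


class ResourceServer:
    def __init__(self, auth_server):
        self._auth = auth_server

    def get_profile(self, access_token):
        # Steps 7-8: serve the resource only for a token the auth server issued.
        user_id = self._auth.introspect(access_token)
        if user_id is None:
            return 401, None
        return 200, {"user_id": user_id}
```

Note that the app never sees the user's credentials in this flow; it only handles the short-lived code and the resulting tokens.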
JWT (JSON Web Token)
A JWT has three Base64URL-encoded parts separated by dots: Header.Payload.Signature
eyJhbGciOiJIUzI1NiJ9. → Header (algorithm)
eyJ1c2VyX2lkIjoiMTIzIn0. → Payload (claims)
SflKxwRJSMeKKF2QT4fwpM... → Signature (verification)
| Aspect | Session-based | JWT |
|---|---|---|
| Storage | Server-side | Client-side |
| Scalability | Requires shared store | Stateless, scales easily |
| Revocation | Easy (delete session) | Hard (need blocklist) |
| Size | Small session ID | Larger token |
| Best for | Traditional web apps | Microservices, APIs |
Idempotency
An idempotent operation leaves the system in the same state no matter how many times it is applied, so repeating a request is safe. This is critical in distributed systems, where timeouts and retries are common.
| HTTP Method | Idempotent? | Example |
|---|---|---|
| GET | Yes | Fetch user profile |
| PUT | Yes | Update entire resource |
| DELETE | Yes | Remove resource |
| POST | No | Create new resource |
| PATCH | Not guaranteed | Partial update (idempotent only if it sets values rather than, for example, incrementing them) |
Idempotency Key Pattern
POST /api/v1/payments
Idempotency-Key: abc-123-unique-key
{
  "amount": 50.00,
  "currency": "USD"
}
The server stores the result keyed by the idempotency key. If the same key is sent again, the server returns the stored result instead of processing again. This prevents duplicate payments, duplicate orders, etc.
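The server-side half of this pattern might be sketched as follows. `PaymentService` is a hypothetical in-memory stand-in; the 422 response for a key reused with a different body is an assumption of this example, and a real implementation would need an atomic insert in a shared store plus key expiry.

```python
class PaymentService:
    """Toy payment handler that deduplicates by idempotency key."""

    def __init__(self):
        self._results = {}     # idempotency key -> (request, stored response)
        self.charges_made = 0  # counts how often the side effect really ran

    def create_payment(self, idempotency_key, amount, currency):
        request = (amount, currency)
        if idempotency_key in self._results:
            stored_request, response = self._results[idempotency_key]
            if stored_request != request:
                # Same key, different body: refuse rather than guess intent.
                return 422, {"error": "idempotency key reused with a different request"}
            return response  # replay the stored result; no second charge
        self.charges_made += 1
        response = (201, {"payment_id": f"pay_{self.charges_made}",
                          "amount": amount, "currency": currency})
        self._results[idempotency_key] = (request, response)
        return response
```

A client that times out and retries with the same key gets the original response back, and `charges_made` stays at 1.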
Practice Problem: Design APIs for a Ride-Sharing Service
Core Entities
- User (riders and drivers)
- Ride (a trip from pickup to dropoff)
- Payment (transaction for a ride)
- Location (real-time GPS coordinates)
API Design
# User Service
POST /api/v1/users → Register
POST /api/v1/auth/login → Login (returns JWT)
GET /api/v1/users/{id}/profile → Get profile
# Ride Service
POST /api/v1/rides → Request a ride
GET /api/v1/rides/{id} → Get ride details
PUT /api/v1/rides/{id}/accept → Driver accepts
PUT /api/v1/rides/{id}/start → Start ride
PUT /api/v1/rides/{id}/complete → Complete ride
PUT /api/v1/rides/{id}/cancel → Cancel ride
GET /api/v1/rides/{id}/eta → Get ETA
# Location Service (gRPC for real-time)
rpc UpdateDriverLocation(stream LocationUpdate) returns (Ack)
rpc SubscribeRiderLocation(RideId) returns (stream DriverLocation)
# Payment Service
POST /api/v1/payments → Process payment
GET /api/v1/payments/{id} → Get payment status
POST /api/v1/payments/{id}/refund → Refund
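The ride endpoints above imply a lifecycle, which the Ride Service could enforce with a small transition table. The status names and the choice of 409 for an invalid transition are assumptions made for this sketch, not part of the API listing.

```python
# action -> {current status: next status}; status names are hypothetical.
ALLOWED = {
    "accept":   {"REQUESTED": "ACCEPTED"},
    "start":    {"ACCEPTED": "IN_PROGRESS"},
    "complete": {"IN_PROGRESS": "COMPLETED"},
    "cancel":   {"REQUESTED": "CANCELLED", "ACCEPTED": "CANCELLED"},
}


def apply_action(status, action):
    """Return (http_status, new_ride_status) for PUT /rides/{id}/{action}."""
    transitions = ALLOWED.get(action)
    if transitions is None:
        return 404, status  # unknown action path
    if status not in transitions:
        return 409, status  # action conflicts with the ride's current state
    return 200, transitions[status]
```

For example, `apply_action("REQUESTED", "accept")` moves the ride to `ACCEPTED`, while trying to start a ride that no driver has accepted leaves it unchanged and returns 409.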
Summary
| Concept | Description |
|---|---|
| Monolith vs Microservices | Start simple, split when needed |
| REST | Resource-based, widely adopted |
| gRPC | High-performance service-to-service |
| GraphQL | Client-driven queries, reduces over-fetching |
| API Gateway | Single entry point, cross-cutting concerns |
| Service Discovery | Dynamically locate service instances |
| Rate Limiting | Protect services from overload |
| OAuth 2.0 / JWT | Delegated authorization and token-based authentication for distributed systems |
| Idempotency | Safe retries in unreliable networks |
Key Takeaways
- Choose your API style based on your use case: REST for public APIs, gRPC for internal services, GraphQL for flexible frontends
- An API Gateway simplifies client interactions and centralizes cross-cutting concerns
- Rate limiting is essential for any production API
- Design every write API to be idempotent to handle retries safely
Practice Problems
Problem 1: Basic
Design a REST API for a simple blog platform with users, posts, and comments. Define the endpoints, HTTP methods, and response codes.
Problem 2: Intermediate
You're migrating a monolithic e-commerce app to microservices. Identify the service boundaries, define the APIs between services, and explain how you'd handle a transaction that spans multiple services (e.g., placing an order).
Challenge
Design a rate limiting system for a public API that supports: per-user limits, per-endpoint limits, and global limits. The system must work across multiple API server instances. Describe the algorithm, data store, and how you handle edge cases like clock skew.
Next up: In Day 7, we'll walk through a complete system design interview, designing a URL Shortener from scratch.