Learn Networking in 10 Days

Day 7: HTTP & How the Web Works

What You'll Learn Today

  • The complete flow from typing a URL to seeing a web page
  • How HTTP/1.1, HTTP/2, and HTTP/3 (QUIC) differ
  • HTTP methods, status codes, and headers
  • Cookies, sessions, and stateful communication
  • REST API design principles and WebSocket for real-time communication

From URL to Web Page

When you type https://www.example.com/page into your browser and press Enter, a remarkable chain of events unfolds in milliseconds.

sequenceDiagram
    participant User
    participant Browser
    participant DNS as DNS Resolver
    participant Server as Web Server

    User->>Browser: Type URL and press Enter
    Browser->>Browser: Parse URL (scheme, host, path)
    Browser->>DNS: Resolve www.example.com
    DNS->>Browser: 93.184.216.34
    Browser->>Server: TCP 3-way handshake
    Browser->>Server: TLS handshake (for HTTPS)
    Browser->>Server: HTTP GET /page
    Server->>Browser: HTTP 200 OK + HTML
    Browser->>Browser: Parse HTML, request CSS/JS/images
    Browser->>User: Render the page
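
The same sequence can be sketched with Python's standard library. This is a minimal illustration rather than how a browser is implemented: the helper names `connection_target` and `fetch` are made up for this example, and `fetch` needs network access to run.

```python
import socket
import ssl
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def connection_target(url: str) -> tuple:
    """Step 1: parse the URL into (host, port, use_tls, request_target)."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return parts.hostname, port, parts.scheme == "https", target


def fetch(url: str, timeout: float = 5.0) -> bytes:
    """Run the remaining steps against a real server (needs network access)."""
    host, port, use_tls, target = connection_target(url)
    # Steps 2-3: DNS resolution and the TCP handshake both happen in this call.
    sock = socket.create_connection((host, port), timeout=timeout)
    if use_tls:
        # Step 4: TLS handshake, verifying the certificate against the host name.
        context = ssl.create_default_context()
        sock = context.wrap_socket(sock, server_hostname=host)
    with sock:
        # Step 5: send the request. "Connection: close" asks the server to
        # close the connection once the response is complete.
        request = f"GET {target} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        # Step 6: read the raw response until the server closes the connection.
        chunks = []
        while chunk := sock.recv(4096):
            chunks.append(chunk)
    return b"".join(chunks)


print(connection_target("https://www.example.com/page"))
```

Calling `fetch("https://www.example.com/page")` returns the raw response bytes (status line, headers, and HTML) that a browser would then parse and render.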

URL Anatomy

A URL (Uniform Resource Locator) has a precise structure.

https://www.example.com:443/path/page?query=value#section
|___|   |_____________|___|__________|___________|_______|
scheme      host       port   path      query     fragment

| Component | Purpose | Example |
| --- | --- | --- |
| Scheme | Protocol to use | https, http, ftp |
| Host | Server to contact | www.example.com |
| Port | Network port (default: 80 for HTTP, 443 for HTTPS) | :443 |
| Path | Resource location on the server | /path/page |
| Query | Key-value parameters | ?query=value&sort=asc |
| Fragment | Client-side anchor (not sent to server) | #section |
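
To see these components programmatically, here is a small sketch using Python's `urllib.parse`. The URL is the one from the diagram above, extended with the `sort=asc` parameter from the table.

```python
from urllib.parse import urlsplit, parse_qs

url = "https://www.example.com:443/path/page?query=value&sort=asc#section"
parts = urlsplit(url)

components = {
    "scheme": parts.scheme,
    "host": parts.hostname,
    "port": parts.port,
    "path": parts.path,
    "query": parse_qs(parts.query),  # each key maps to a list of values
    "fragment": parts.fragment,      # kept by the client, never sent to the server
}
for name, value in components.items():
    print(f"{name:9} {value!r}")
```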

HTTP Protocol Basics

HTTP (HyperText Transfer Protocol) is a request-response protocol. The client sends a request, and the server returns a response. HTTP is stateless by design; each request is independent.

HTTP Request Structure

GET /index.html HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0
Accept: text/html
Accept-Language: en-US
Connection: keep-alive

HTTP Response Structure

HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Content-Length: 1234
Cache-Control: max-age=3600

<!DOCTYPE html>
<html>...

flowchart LR
    subgraph Request["HTTP Request"]
        RL["Request Line\nGET /path HTTP/1.1"]
        RH["Headers\nHost, User-Agent, Accept"]
        RB["Body\n(optional, for POST/PUT)"]
    end
    subgraph Response["HTTP Response"]
        SL["Status Line\nHTTP/1.1 200 OK"]
        SRH["Headers\nContent-Type, Cache-Control"]
        SB["Body\nHTML, JSON, image data"]
    end
    RL --> RH --> RB
    SL --> SRH --> SB
    style Request fill:#3b82f6,color:#fff
    style Response fill:#22c55e,color:#fff
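
The request and response layouts above can be sketched as two small functions. This is a simplified illustration, not a full HTTP parser: `build_request` and `parse_response` are hypothetical helper names, and the parser assumes a well-formed response with a reason phrase and no repeated headers.

```python
def build_request(method: str, target: str, headers: dict, body: bytes = b"") -> bytes:
    """Serialize a request: request line, headers, blank line, optional body."""
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"  # the blank line ends the header section
    return head.encode("ascii") + body


def parse_response(raw: bytes) -> tuple:
    """Split a response into (status code, reason phrase, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return int(code), reason, headers, body


request = build_request("GET", "/index.html", {"Host": "www.example.com", "Accept": "text/html"})
print(request.decode("ascii"))

raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/html; charset=UTF-8\r\n"
       b"Content-Length: 13\r\n"
       b"\r\n"
       b"<html></html>")
print(parse_response(raw))
```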

HTTP Methods

HTTP defines several methods (also called verbs) that indicate the desired action on a resource.

| Method | Purpose | Request Body | Idempotent | Safe |
| --- | --- | --- | --- | --- |
| GET | Retrieve a resource | No | Yes | Yes |
| POST | Create a new resource | Yes | No | No |
| PUT | Replace a resource entirely | Yes | Yes | No |
| PATCH | Partially update a resource | Yes | No | No |
| DELETE | Remove a resource | Optional | Yes | No |
| HEAD | Same as GET but no body | No | Yes | Yes |
| OPTIONS | Describe communication options | No | Yes | Yes |

Idempotent means that repeating the same request leaves the server in the same state as sending it once. Safe means the request does not modify server state.
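
The difference is easiest to see with a toy in-memory store. The sketch below is only an illustration: the `users` dictionary and the `post_user`, `put_user`, and `get_user` functions are hypothetical stand-ins for a server's handlers.

```python
import itertools
from typing import Optional

users: dict = {}
next_id = itertools.count(1)


def post_user(data: dict) -> int:
    """POST /users: every call creates a new resource, so repeats add duplicates."""
    user_id = next(next_id)
    users[user_id] = dict(data)
    return user_id


def put_user(user_id: int, data: dict) -> None:
    """PUT /users/{id}: replaces the resource, so repeats leave the same state."""
    users[user_id] = dict(data)


def get_user(user_id: int) -> Optional[dict]:
    """GET /users/{id}: safe, because it only reads."""
    return users.get(user_id)


put_user(42, {"name": "Ada"})
put_user(42, {"name": "Ada"})        # repeat: state is unchanged
first = post_user({"name": "Bob"})
second = post_user({"name": "Bob"})  # repeat: a second resource appears
print(len(users), first, second)
```

This is why a client can retry a timed-out PUT or DELETE without worrying, while retrying a POST may create a duplicate.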

flowchart TB
    subgraph Methods["HTTP Methods"]
        GET["GET\nRead"]
        POST["POST\nCreate"]
        PUT["PUT\nReplace"]
        PATCH["PATCH\nUpdate"]
        DELETE["DELETE\nRemove"]
    end
    style GET fill:#22c55e,color:#fff
    style POST fill:#3b82f6,color:#fff
    style PUT fill:#f59e0b,color:#fff
    style PATCH fill:#8b5cf6,color:#fff
    style DELETE fill:#ef4444,color:#fff

HTTP Status Codes

Status codes are grouped into five categories.

| Range | Category | Meaning |
| --- | --- | --- |
| 1xx | Informational | Request received, continuing |
| 2xx | Success | Request successfully processed |
| 3xx | Redirection | Further action needed |
| 4xx | Client Error | Problem with the request |
| 5xx | Server Error | Server failed to fulfill a valid request |

Common Status Codes

| Code | Name | Description |
| --- | --- | --- |
| 200 | OK | Request succeeded |
| 201 | Created | Resource was created (typically after POST) |
| 204 | No Content | Success, but no response body |
| 301 | Moved Permanently | Resource has a new permanent URL |
| 302 | Found | Temporary redirect |
| 304 | Not Modified | Cached version is still valid |
| 400 | Bad Request | Malformed request syntax |
| 401 | Unauthorized | Authentication required |
| 403 | Forbidden | Authenticated but not authorized |
| 404 | Not Found | Resource does not exist |
| 405 | Method Not Allowed | HTTP method not supported for this resource |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Generic server failure |
| 502 | Bad Gateway | Upstream server returned invalid response |
| 503 | Service Unavailable | Server temporarily overloaded or in maintenance |
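
Python's standard library ships these names in `http.HTTPStatus`, which makes a handy lookup while debugging. The sketch below combines it with the five categories; the `describe` helper and `CATEGORIES` mapping are illustrative names, not part of any library.

```python
from http import HTTPStatus

CATEGORIES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code: int) -> str:
    """Return e.g. '404 Not Found (Client Error)' for a known status code."""
    status = HTTPStatus(code)  # raises ValueError for codes Python does not know
    return f"{status.value} {status.phrase} ({CATEGORIES[code // 100]})"


for code in (200, 201, 304, 404, 429, 503):
    print(describe(code))
```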

HTTP Headers

Headers carry metadata about the request or response. They fall into several categories.

Request Headers

| Header | Purpose | Example |
| --- | --- | --- |
| Host | Target server (required in HTTP/1.1) | Host: www.example.com |
| User-Agent | Client software identification | User-Agent: Mozilla/5.0... |
| Accept | Preferred response content types | Accept: text/html, application/json |
| Authorization | Authentication credentials | Authorization: Bearer eyJhbG... |
| Cookie | Send stored cookies to server | Cookie: session_id=abc123 |
| Cache-Control | Caching directives | Cache-Control: no-cache |

Response Headers

| Header | Purpose | Example |
| --- | --- | --- |
| Content-Type | MIME type of the response body | Content-Type: application/json |
| Content-Length | Size of the response body in bytes | Content-Length: 1234 |
| Set-Cookie | Store a cookie on the client | Set-Cookie: session_id=abc123; HttpOnly |
| Cache-Control | Caching instructions for the client | Cache-Control: max-age=3600 |
| Location | URL for redirects (3xx) | Location: https://new-url.com |
| Access-Control-Allow-Origin | CORS policy | Access-Control-Allow-Origin: * |
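
Header values often carry structure of their own. As a rough sketch, the standard library's `email.message.Message` can hold HTTP-style headers (lookups are case-insensitive), and a few lines are enough to split `Cache-Control` into directives. The `parse_cache_control` helper is a hypothetical name, and it ignores edge cases such as quoted commas.

```python
from email.message import Message


def parse_cache_control(value: str) -> dict:
    """Turn 'public, max-age=3600' into {'public': None, 'max-age': '3600'}."""
    directives = {}
    for item in value.split(","):
        name, sep, arg = item.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') if sep else None
    return directives


headers = Message()
headers["Content-Type"] = "text/html; charset=UTF-8"
headers["Cache-Control"] = "public, max-age=3600"

print(headers["content-type"])         # header names are case-insensitive
print(headers.get_content_type())      # the MIME type without parameters
print(headers.get_content_charset())   # the charset parameter, lowercased
print(parse_cache_control(headers["cache-control"]))
```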

HTTP Version Evolution

HTTP/1.1 (1997)

HTTP/1.1 made persistent connections (keep-alive) the default, so multiple requests can reuse a single TCP connection. However, it suffers from head-of-line blocking: responses on a connection must come back in the order the requests were made, so in practice each request waits for the previous response to finish before the next can be sent on the same connection.

sequenceDiagram
    participant Client
    participant Server

    Client->>Server: GET /style.css
    Server->>Client: 200 OK (style.css)
    Client->>Server: GET /script.js
    Server->>Client: 200 OK (script.js)
    Client->>Server: GET /image.png
    Server->>Client: 200 OK (image.png)
    Note over Client,Server: Sequential requests on one connection

To work around this, browsers typically open up to 6 parallel TCP connections per host.
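
A back-of-the-envelope model shows why parallel connections help. The sketch below assumes each connection handles one request at a time and that requests are handed to whichever connection frees up first; the response times are made-up numbers, and `total_time` is an illustrative helper.

```python
import heapq


def total_time(durations_ms: list, connections: int) -> int:
    """Time until the last response finishes, one request at a time per connection."""
    free_at = [0] * connections         # when each connection next becomes idle
    heapq.heapify(free_at)
    finish = 0
    for duration in durations_ms:
        start = heapq.heappop(free_at)  # the earliest idle connection takes the request
        end = start + duration
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


# One slow response (300 ms) followed by five fast ones (50 ms each).
durations = [300, 50, 50, 50, 50, 50]
print(total_time(durations, connections=1))  # everything queues behind the slow response
print(total_time(durations, connections=6))  # bounded by the slowest single response
```

With one connection the fast responses wait behind the slow one; with several, only the slow response determines the total.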

HTTP/2 (2015)

HTTP/2 introduced multiplexing: multiple requests and responses can be interleaved on a single TCP connection using binary framing.

flowchart TB
    subgraph HTTP1["HTTP/1.1"]
        direction LR
        C1["Connection 1"] --> R1["Request 1 → Response 1"]
        C2["Connection 2"] --> R2["Request 2 → Response 2"]
        C3["Connection 3"] --> R3["Request 3 → Response 3"]
    end
    subgraph HTTP2["HTTP/2"]
        direction LR
        SC["Single Connection"] --> MUX["Multiplexed Streams\nAll requests/responses interleaved"]
    end
    style HTTP1 fill:#ef4444,color:#fff
    style HTTP2 fill:#22c55e,color:#fff

| Feature | HTTP/1.1 | HTTP/2 |
| --- | --- | --- |
| Format | Text-based | Binary framing |
| Multiplexing | No (1 request at a time per connection) | Yes (many streams per connection) |
| Header compression | No | HPACK compression |
| Server push | No | Yes (server can proactively send resources) |
| Connections needed | Multiple (typically 6) | Single connection |
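
To make binary framing concrete, here is a sketch of the 9-byte HTTP/2 frame header (24-bit length, 8-bit type, 8-bit flags, 31-bit stream ID) and of how interleaved frames are sorted back into streams. It is a teaching aid only: the function names are invented, the payloads are arbitrary bytes, and real HTTP/2 also involves HPACK-encoded HEADERS frames, flow control, and connection setup.

```python
import struct
from collections import defaultdict

DATA = 0x0        # frame type for body bytes
END_STREAM = 0x1  # flag: last frame of this stream


def encode_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Prefix the payload with the 9-byte frame header."""
    length = len(payload)
    header = struct.pack("!BHBBI", length >> 16, length & 0xFFFF,
                         frame_type, flags, stream_id & 0x7FFFFFFF)
    return header + payload


def decode_frames(data: bytes):
    """Yield (type, flags, stream_id, payload) for each frame in a byte stream."""
    offset = 0
    while offset < len(data):
        hi, lo, frame_type, flags, stream_id = struct.unpack_from("!BHBBI", data, offset)
        length = (hi << 16) | lo
        offset += 9
        yield frame_type, flags, stream_id & 0x7FFFFFFF, data[offset:offset + length]
        offset += length


# Two responses (streams 1 and 3) interleaved on one connection.
wire = (encode_frame(DATA, 0, 1, b"body {")
        + encode_frame(DATA, 0, 3, b"alert(")
        + encode_frame(DATA, END_STREAM, 1, b"}")
        + encode_frame(DATA, END_STREAM, 3, b"1)"))

streams = defaultdict(bytes)
for _type, _flags, stream_id, payload in decode_frames(wire):
    streams[stream_id] += payload
print(dict(streams))
```

Because every frame names its stream, the receiver can reassemble each response independently of the order in which frames arrive.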

HTTP/3 (2022)

HTTP/3 replaces TCP with QUIC, a UDP-based transport protocol originally developed by Google and later standardized by the IETF. QUIC solves TCP-level head-of-line blocking: if one stream's packet is lost, other streams continue unaffected.

flowchart TB
    subgraph Stack1["HTTP/1.1 & HTTP/2"]
        H12["HTTP"]
        TLS12["TLS"]
        TCP["TCP"]
        IP1["IP"]
    end
    subgraph Stack2["HTTP/3"]
        H3["HTTP/3"]
        QUIC["QUIC (includes TLS 1.3)"]
        UDP["UDP"]
        IP2["IP"]
    end
    H12 --> TLS12 --> TCP --> IP1
    H3 --> QUIC --> UDP --> IP2
    style Stack1 fill:#f59e0b,color:#fff
    style Stack2 fill:#22c55e,color:#fff

| Feature | HTTP/2 (over TCP) | HTTP/3 (over QUIC) |
| --- | --- | --- |
| Transport | TCP | UDP (QUIC) |
| Head-of-line blocking | At TCP level, yes | No (independent streams) |
| Connection setup | TCP + TLS (2-3 RTT) | 1 RTT (0-RTT for resumption) |
| Connection migration | No (new IP = new connection) | Yes (connection ID survives IP changes) |
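
The connection setup row translates directly into waiting time. The arithmetic below assumes a hypothetical 80 ms round trip and splits the "2-3 RTT" figure into its usual components: one round trip for the TCP handshake, plus one for TLS 1.3 or two for TLS 1.2.

```python
SETUP_RTTS = {
    "TCP + TLS 1.2": 3,            # 1 (TCP) + 2 (TLS 1.2)
    "TCP + TLS 1.3": 2,            # 1 (TCP) + 1 (TLS 1.3)
    "QUIC (new connection)": 1,    # transport and TLS 1.3 handshakes combined
    "QUIC (0-RTT resumption)": 0,  # request rides along with the first packet
}


def time_to_first_request_ms(rtt_ms: int, setup_rtts: int) -> int:
    """Delay before the first HTTP request can leave the client."""
    return rtt_ms * setup_rtts


for name, rtts in SETUP_RTTS.items():
    print(f"{name:24} {time_to_first_request_ms(80, rtts)} ms")
```

On a high-latency mobile link the difference between three round trips and one is noticeable before any content has been transferred.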

Cookies and Sessions

HTTP is stateless, but web applications need to remember users. Cookies bridge this gap.

How Cookies Work

sequenceDiagram
    participant Browser
    participant Server

    Browser->>Server: POST /login (username, password)
    Server->>Browser: 200 OK + Set-Cookie: session_id=abc123
    Browser->>Server: GET /dashboard + Cookie: session_id=abc123
    Server->>Browser: 200 OK (personalized dashboard)

Cookie Attributes

| Attribute | Purpose | Example |
| --- | --- | --- |
| HttpOnly | Prevents JavaScript access (XSS protection) | Set-Cookie: id=abc; HttpOnly |
| Secure | Only sent over HTTPS | Set-Cookie: id=abc; Secure |
| SameSite | Controls cross-site sending (CSRF protection) | SameSite=Strict, Lax, None |
| Max-Age | Cookie lifetime in seconds | Max-Age=3600 |
| Domain | Which domains receive the cookie | Domain=.example.com |
| Path | URL path scope | Path=/api |
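
Python's `http.cookies` module can build and parse these headers. The sketch below sets the attributes from the table on a `session_id` cookie and then parses a `Cookie` header as a server would; the `theme=dark` cookie is an invented extra, and the `samesite` attribute needs Python 3.8 or newer.

```python
from http.cookies import SimpleCookie

# Server side: build the value of a Set-Cookie header.
jar = SimpleCookie()
jar["session_id"] = "abc123"
morsel = jar["session_id"]
morsel["httponly"] = True   # hide the cookie from JavaScript
morsel["secure"] = True     # send only over HTTPS
morsel["samesite"] = "Lax"  # limit cross-site sending
morsel["max-age"] = 3600    # lifetime in seconds
morsel["path"] = "/"
set_cookie_value = morsel.OutputString()
print("Set-Cookie:", set_cookie_value)

# Next request: parse the Cookie header the browser sends back.
incoming = SimpleCookie()
incoming.load("session_id=abc123; theme=dark")
print(incoming["session_id"].value)
```

Note that the browser returns only the name and value; the attributes stay on the client and control when the cookie is sent.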

Sessions vs Tokens

| Approach | Storage | Scalability |
| --- | --- | --- |
| Server-side sessions | Session data on server; only ID in cookie | Requires shared session store for multiple servers |
| JWT tokens | All data in the token (signed, not encrypted by default) | Stateless; any server can verify the token |
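
To illustrate "signed, not encrypted", here is a stripped-down sketch of an HS256-style token built with the standard library. It is not a substitute for a maintained JWT library: the function names are illustrative, the secret is a placeholder, and the sketch skips expiry checks and algorithm validation.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(payload: dict, secret: bytes) -> str:
    """Build header.payload.signature, signing the first two parts with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
             for part in (header, payload)]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


def verify_token(token: str, secret: bytes) -> Optional[dict]:
    """Return the payload if the signature matches, otherwise None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        return None
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        return None
    return json.loads(b64url_decode(payload_b64))


secret = b"placeholder-secret"  # hypothetical; every server shares this key
token = sign_token({"sub": "42", "role": "admin"}, secret)
print(verify_token(token, secret))

# Anyone can read the payload without the secret: it is encoded, not encrypted.
print(json.loads(b64url_decode(token.split(".")[1])))
```

Any server holding the secret can verify the token without a shared session store, which is the scalability advantage in the table. The flip side is that the payload is readable by anyone who holds the token.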

REST API Design

REST (Representational State Transfer) is an architectural style for designing web APIs. It uses standard HTTP methods and URLs to represent resources.

| Operation | HTTP Method | URL | Description |
| --- | --- | --- | --- |
| List all users | GET | /api/users | Get all users |
| Get one user | GET | /api/users/42 | Get user with ID 42 |
| Create a user | POST | /api/users | Create new user |
| Update a user | PUT | /api/users/42 | Replace user 42 entirely |
| Partial update | PATCH | /api/users/42 | Update specific fields |
| Delete a user | DELETE | /api/users/42 | Remove user 42 |

REST Principles

  1. Resource-based URLs: Nouns, not verbs (/users, not /getUsers)
  2. Stateless: Each request contains all information needed
  3. Uniform interface: Consistent use of HTTP methods
  4. Proper status codes: 201 for creation, 404 for not found, etc.
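
These principles can be sketched as a single dispatch function over an in-memory store. This is a toy, framework-free illustration: the `handle` function, the `users` dictionary, and the choice of 200 for PUT and PATCH responses are assumptions made for the example.

```python
import itertools
import re
from typing import Optional

users = {42: {"name": "Ada"}}
_ids = itertools.count(43)
ITEM = re.compile(r"^/api/users/(\d+)$")


def handle(method: str, path: str, body: Optional[dict] = None) -> tuple:
    """Return (status_code, response_body) for a request to the users resource."""
    if path == "/api/users":                 # the collection
        if method == "GET":
            return 200, list(users.values())
        if method == "POST":
            user_id = next(_ids)
            users[user_id] = dict(body or {})
            return 201, {"id": user_id, **users[user_id]}
        return 405, None
    match = ITEM.match(path)                 # a single item
    if not match:
        return 404, None
    user_id = int(match.group(1))
    if user_id not in users:
        return 404, None
    if method == "GET":
        return 200, users[user_id]
    if method == "PUT":
        users[user_id] = dict(body or {})    # replace entirely
        return 200, users[user_id]
    if method == "PATCH":
        users[user_id].update(body or {})    # change only the given fields
        return 200, users[user_id]
    if method == "DELETE":
        del users[user_id]
        return 204, None
    return 405, None
```

For example, `handle("POST", "/api/users", {"name": "Bob"})` returns a 201 with the new user, while a verb-style path such as `/api/getUsers` simply yields 404 because it does not name a resource.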

WebSocket

HTTP is request-response: the client always initiates. WebSocket provides full-duplex communication over a single TCP connection, ideal for real-time applications.

sequenceDiagram
    participant Client
    participant Server

    Client->>Server: HTTP GET /chat (Upgrade: websocket)
    Server->>Client: 101 Switching Protocols
    Note over Client,Server: WebSocket connection established
    Client->>Server: Message: "Hello"
    Server->>Client: Message: "Hi there"
    Server->>Client: Message: "New notification"
    Client->>Server: Message: "Thanks"
    Note over Client,Server: Either side can send at any time

| Feature | HTTP | WebSocket |
| --- | --- | --- |
| Communication | Request-response (client initiates) | Full-duplex (both sides) |
| Connection | New request per interaction (or keep-alive) | Single persistent connection |
| Overhead | Headers on every request | Minimal framing after handshake |
| Use case | Web pages, REST APIs | Chat, live feeds, gaming, collaborative editing |
| URL scheme | http:// / https:// | ws:// / wss:// |
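
One detail of the upgrade handshake is worth seeing in code. Per RFC 6455, the client sends a random `Sec-WebSocket-Key` header, and the server proves it understood the upgrade by returning `Sec-WebSocket-Accept`: the Base64-encoded SHA-1 hash of that key concatenated with a fixed GUID. The sketch below covers only this step, with illustrative function names, and does not implement WebSocket message framing.

```python
import base64
import hashlib

WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed value from RFC 6455


def accept_key(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(sec_websocket_key: str) -> str:
    """Build the server's 101 response that switches the connection to WebSocket."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_key(sec_websocket_key)}\r\n"
        "\r\n"
    )


# Sample key taken from the RFC 6455 handshake example.
print(handshake_response("dGhlIHNhbXBsZSBub25jZQ=="))
```

After this response, both sides stop speaking HTTP on the connection and exchange WebSocket frames instead.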

Summary

| Concept | Description |
| --- | --- |
| URL | Scheme + host + port + path + query + fragment |
| HTTP Request | Method + path + headers + optional body |
| HTTP Response | Status code + headers + body |
| HTTP Methods | GET (read), POST (create), PUT (replace), PATCH (update), DELETE (remove) |
| Status Codes | 1xx informational, 2xx success, 3xx redirect, 4xx client error, 5xx server error |
| HTTP/1.1 | Text-based, persistent connections, head-of-line blocking |
| HTTP/2 | Binary framing, multiplexing, header compression, server push |
| HTTP/3 | QUIC (UDP-based), no TCP head-of-line blocking, connection migration |
| Cookies | Server-set key-value pairs stored in the browser for stateful sessions |
| REST | Resource-oriented API design using standard HTTP methods |
| WebSocket | Full-duplex persistent connection for real-time communication |

Key Takeaways

  1. The journey from URL to rendered page involves DNS, TCP, TLS, and HTTP working together
  2. HTTP/2 multiplexing removes HTTP/1.1's one-request-at-a-time queuing on each connection, and HTTP/3's QUIC transport removes TCP-level head-of-line blocking
  3. Status codes are your primary debugging tool; learn the common ones by heart
  4. Cookies add state to stateless HTTP, but must be secured with HttpOnly, Secure, and SameSite
  5. WebSocket enables real-time communication that HTTP's request-response model cannot provide

Practice Problems

Beginner

Use curl -v https://www.example.com to inspect the full HTTP request and response. Identify the HTTP version, status code, and at least 5 response headers. Explain what each header does.

Intermediate

Design a REST API for a simple blog application. Define the endpoints (URL, method, request body, response) for these operations: list posts, get a single post, create a post, update a post, delete a post, and list comments for a post. Include appropriate status codes for success and error cases.

Advanced

You notice that a web application loads slowly. Using browser developer tools (Network tab), analyze the requests and propose optimizations. Consider: How many connections are being opened? Are resources being loaded in parallel? Could HTTP/2 help? Are there opportunities for caching (check Cache-Control headers)? Would any resources benefit from preloading or server push?



Next up: In Day 8, we'll dive into "TLS/SSL & Network Security." You'll learn how encryption protects data in transit, how the TLS handshake establishes a secure connection, and how to defend against common network attacks!