Networking Primer

From AWS Re/Start (June 2025) — the foundations that everything else in networking builds on.


Ultra-Short Summary

Before VPCs make sense, you need the underlying networking model: how data moves from one machine to another across the internet. The answer is layers of abstraction, each with its own addressing and rules. The OSI model is the conceptual framework; TCP/IP is the protocol suite actually in use; DNS, HTTP, and TLS are protocols that run on top of it.


The OSI Model

Seven layers — each layer handles a specific job and only talks to the layer above/below:

7. Application   -> HTTP, HTTPS, DNS, SMTP, FTP -- what the app uses
6. Presentation  -> Encryption, encoding, compression (SSL/TLS lives here conceptually)
5. Session       -> Manages connections (open/close/maintain)
4. Transport     -> TCP / UDP -- end-to-end delivery, ports
3. Network       -> IP -- routing packets between networks
2. Data Link     -> MAC addresses -- delivery on the same network (Ethernet, Wi-Fi)
1. Physical      -> Actual signals -- cables, radio waves, electrical pulses

How to remember: "Please Do Not Throw Sausage Pizza Away" (bottom up: Physical, Data Link, Network, Transport, Session, Presentation, Application).

In practice, most engineers work with layers 3-7. Layers 1-2 are the network hardware team's problem.


TCP vs UDP

Both are transport layer (Layer 4) protocols:

Feature      TCP                                             UDP
Connection   Connection-oriented (handshake first)           Connectionless
Reliability  Reliable delivery, retransmits lost segments    Best-effort, no retransmission
Order        In-order delivery to the application            Datagrams may arrive out of order
Speed        Higher latency (handshake, acks, retransmits)   Lower latency (minimal overhead)
Use cases    HTTP/S, SSH, databases, email                   DNS, live video streaming, gaming, VoIP
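
A minimal sketch of the "connectionless" column, assuming a loopback address and an OS-assigned port (the function name udp_round_trip is made up for illustration). The sender never connects or handshakes; it just addresses a datagram and sends it.

```python
import socket


def udp_round_trip(payload: bytes) -> bytes:
    """Send one datagram to a local receiver and return what arrived."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0 = let the OS pick a free port
    receiver.settimeout(2.0)         # UDP gives no delivery signal, so bound the wait
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No connect(), no handshake: the datagram is simply addressed and sent.
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```

On loopback the datagram practically always arrives; across a real network the application would have to handle loss and reordering itself.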

TCP Three-Way Handshake

Client --> SYN       --> Server   "I want to connect"
Client <-- SYN-ACK   <-- Server   "OK, acknowledged"
Client --> ACK       --> Server   "Connection established"

Then data flows bidirectionally.
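
Application code never sends SYN or ACK itself; the operating system performs the handshake inside connect() and accept(). The sketch below assumes a throwaway echo server on loopback (tcp_echo_once is a hypothetical helper, not a standard function) to show where the handshake happens and that data then flows both ways.

```python
import socket
import threading


def tcp_echo_once(message: bytes) -> bytes:
    """Open one TCP connection on loopback, send a message, return the echo."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 = OS picks a free port
    server.listen(1)

    def serve() -> None:
        conn, _addr = server.accept()  # returns a connection whose handshake is done
        with conn:
            conn.sendall(conn.recv(1024))  # echo the bytes back

    worker = threading.Thread(target=serve, daemon=True)
    worker.start()
    try:
        # create_connection() calls connect(); the kernel exchanges
        # SYN / SYN-ACK / ACK before it returns.
        with socket.create_connection(server.getsockname(), timeout=2.0) as client:
            client.sendall(message)
            reply = client.recv(1024)
    finally:
        worker.join(timeout=2.0)
        server.close()
    return reply
```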

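Closing (each direction is closed separately):
Client --> FIN       --> Server   "I'm done sending"
Client <-- ACK       <-- Server   "Acknowledged"
Client <-- FIN       <-- Server   "I'm done sending too"
Client --> ACK       --> Server   "Connection closed"

The server's ACK and FIN are often combined into a single segment.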

IP Addresses and Ports

IP address: identifies the machine (which house on the street)
Port:       identifies the service running on the machine (which room in the house)

Full connection identifier: source IP:port -> destination IP:port

Common ports:
  22    SSH
  80    HTTP
  443   HTTPS
  3306  MySQL
  5432  PostgreSQL
  6379  Redis
  27017 MongoDB

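Ephemeral ports: A server listens on a well-known port (for example 443). The client's OS picks a temporary high-numbered source port for each connection, and the server's replies go back to that port. The exact range depends on the OS (Linux defaults to 32768-60999; IANA suggests 49152-65535), so 1024-65535 is the range commonly allowed to cover every client. NACLs are stateless, so return traffic is not allowed automatically; this is why a NACL in front of a server needs an outbound rule for the ephemeral port range.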
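
To see an ephemeral port being assigned, the sketch below opens one loopback connection and reads both ends' port numbers. It assumes a local listener on an OS-chosen port (a real server would use a fixed port such as 443), and observe_ports is a made-up name.

```python
import socket


def observe_ports() -> tuple[int, int]:
    """Return (server_port, client_port) for one loopback TCP connection."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # stand-in for a well-known port like 443
    server.listen(1)
    client = socket.create_connection(server.getsockname(), timeout=2.0)
    try:
        server_port = client.getpeername()[1]  # the port we asked to reach
        client_port = client.getsockname()[1]  # chosen by the OS, not by us
        return server_port, client_port
    finally:
        client.close()
        server.close()
```

Calling it repeatedly usually yields a different client port each time, which is why firewall rules for return traffic cover a range rather than a single port.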


DNS — Domain Name System

Translates human-readable names to IP addresses:

You type: google.com
Browser asks: "What IP is google.com?"
DNS resolver answers: "142.250.184.46"
Browser connects to: 142.250.184.46:443

DNS hierarchy:
  Root (.)
    -> TLD (.com, .org, .io)
      -> Authoritative nameserver (google.com NS)
        -> Returns A record (IPv4) or AAAA (IPv6)

DNS record types:
  A      -> domain -> IPv4 address
  AAAA   -> domain -> IPv6 address
  CNAME  -> domain -> another domain (alias)
  MX     -> mail server for the domain
  TXT    -> arbitrary text (often used for verification, SPF)
  NS     -> which nameservers are authoritative for this domain
  SOA    -> start of authority (zone metadata)
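
On the wire, each record type is a small integer inside the query message. The sketch below builds a raw DNS question by hand, assuming the standard numeric type codes and a fixed query ID chosen only for illustration; build_query is a hypothetical helper and nothing is sent over the network.

```python
import struct

# Numeric codes for the record types listed above.
RECORD_TYPES = {"A": 1, "NS": 2, "CNAME": 5, "SOA": 6, "MX": 15, "TXT": 16, "AAAA": 28}


def build_query(name: str, record_type: str, query_id: int = 0x1234) -> bytes:
    """Encode a single-question DNS query message."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: every label is prefixed with its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE is the record type; QCLASS 1 means "Internet".
    question = qname + struct.pack("!HH", RECORD_TYPES[record_type], 1)
    return header + question
```

For example, build_query("google.com", "AAAA") differs from the "A" query only in the two QTYPE bytes.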

TTL (Time to Live): every DNS record carries a TTL, the number of seconds a resolver may cache the answer. Low TTL = changes propagate quickly but resolvers query more often. High TTL = fewer queries but changes take longer to reach everyone.
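
The trade-off can be sketched with a toy caching resolver. This is an illustration under simplifying assumptions: one TTL for every name (real TTLs are set per record by the zone owner), an injected lookup function standing in for the upstream query, and an injectable clock so the behaviour can be checked without waiting. TtlCache is a made-up class.

```python
import time
from typing import Callable


class TtlCache:
    """Cache answers from `lookup` and reuse them until the TTL expires."""

    def __init__(self, lookup: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[str, float]] = {}  # name -> (answer, expiry)
        self.upstream_queries = 0

    def resolve(self, name: str) -> str:
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and cached[1] > now:
            return cached[0]           # still fresh: no upstream query
        self.upstream_queries += 1     # expired or never seen: ask upstream
        answer = self._lookup(name)
        self._entries[name] = (answer, now + self._ttl)
        return answer
```

A shorter ttl_seconds makes upstream_queries grow faster but picks up a changed answer sooner.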


HTTP and HTTPS

HTTP (HyperText Transfer Protocol) is the language browsers and servers use:

Request:
  GET /path HTTP/1.1
  Host: example.com
  Accept: application/json

Response:
  HTTP/1.1 200 OK
  Content-Type: application/json

  {"key": "value"}
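
HTTP/1.1 messages are plain text: a start line, header lines, a blank line, then the body. As a rough sketch (real clients should use a library such as http.client), the response above can be taken apart like this; parse_response and RAW are illustrative names.

```python
def parse_response(raw: bytes) -> tuple[int, dict[str, str], bytes]:
    """Split a raw HTTP/1.1 response into status code, headers, and body."""
    # The first blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # header names are case-insensitive
    return int(code), headers, body


RAW = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/json\r\n"
    b"\r\n"
    b'{"key": "value"}'
)
```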

HTTP Methods

Method   Purpose
GET      Read a resource
POST     Create a resource
PUT      Replace a resource
PATCH    Partially update a resource
DELETE   Delete a resource

HTTP Status Codes

2xx Success:       200 OK, 201 Created, 204 No Content
3xx Redirect:      301 Moved Permanently, 302 Found, 304 Not Modified
4xx Client error:  400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found
5xx Server error:  500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable
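
The first digit decides the class, so a status code can be classified with integer division. A small sketch using the standard library's http.HTTPStatus for the reason phrase; describe is a made-up helper, and it raises ValueError for codes HTTPStatus does not know.

```python
from http import HTTPStatus

# Class labels follow the grouping above.
CLASSES = {2: "Success", 3: "Redirect", 4: "Client error", 5: "Server error"}


def describe(code: int) -> str:
    """Return e.g. '404 Not Found (Client error)' for a known status code."""
    family = CLASSES.get(code // 100, "Other")
    return f"{code} {HTTPStatus(code).phrase} ({family})"
```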

HTTPS = HTTP + TLS. The connection is encrypted. The server proves its identity with a certificate.


TLS Handshake (Simplified)

1. Client Hello      -> "I support TLS 1.3, here are my cipher suites and my key share"
2. Server Hello      -> "I pick TLS_AES_256_GCM_SHA384. Here's my key share and my certificate."
3. Client verifies the certificate against a trusted CA
4. Key derivation    -> Both sides derive the same symmetric keys from the exchanged key shares
5. Encrypted traffic flows using the symmetric keys

TLS = asymmetric crypto to authenticate the server and agree on a key, symmetric crypto for the actual data
(Asymmetric is too slow for bulk data; symmetric is fast once you have a shared secret)
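
In application code the whole handshake is usually one library call. A sketch with Python's ssl module, assuming the system's default trusted CAs; the two function names are made up, and the second one needs network access, so no output is shown for it.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client settings: verify the certificate chain and the hostname."""
    context = ssl.create_default_context()            # loads the system's trusted CAs
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    return context


def negotiated_parameters(host: str, port: int = 443) -> tuple:
    """Complete a handshake and report the protocol version and cipher suite."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=5.0) as tcp:
        # wrap_socket() runs the handshake: hellos, certificate check, key derivation.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            return tls.version(), tls.cipher()
```

If the certificate does not chain to a trusted CA or does not match the hostname, wrap_socket() raises an ssl.SSLError subclass instead of returning.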

Load Balancers

Distribute traffic across multiple servers:

Client -> Load Balancer -> Server 1
                       -> Server 2
                       -> Server 3

If Server 2 fails: LB stops sending traffic there (health check)
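
One way to picture the behaviour in the diagram is round-robin selection that skips servers marked unhealthy. This is a toy sketch, not how any particular load balancer is implemented; RoundRobinBalancer and the server names are hypothetical, and real health checks would probe the servers over the network.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in rotation, skipping any marked unhealthy."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._rotation = cycle(self._servers)

    def mark(self, server, healthy):
        """Record the result of a health check for one server."""
        if healthy:
            self._healthy.add(server)
        else:
            self._healthy.discard(server)

    def pick(self):
        """Return the next healthy server in rotation."""
        for _ in range(len(self._servers)):  # at most one full lap
            candidate = next(self._rotation)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy servers")
```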

Layer 4 (NLB — Network Load Balancer): Routes by IP and port — extremely fast, no content inspection.

Layer 7 (ALB — Application Load Balancer): Routes by HTTP content — host headers, URL paths, cookies.

ALB routing rules (examples):
  path /api/*       -> API server target group
  path /static/*    -> static content target group
  host *.mobile.*   -> Mobile-optimised target group
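
Layer 7 routing boils down to matching request fields against patterns and taking the first rule that fits. The sketch below uses shell-style wildcards from Python's fnmatch as a stand-in for ALB's matching; the target group names and the default are assumptions for illustration.

```python
from fnmatch import fnmatchcase

# (field, pattern, target), evaluated top to bottom; first match wins.
RULES = [
    ("path", "/api/*", "api-target-group"),
    ("path", "/static/*", "static-target-group"),
    ("host", "*.mobile.*", "mobile-target-group"),
]


def route(host: str, path: str, default: str = "default-target-group") -> str:
    """Pick a target group from the Host header and URL path."""
    for field, pattern, target in RULES:
        value = path if field == "path" else host
        if fnmatchcase(value, pattern):
            return target
    return default
```

A Layer 4 balancer could not apply these rules, because it never looks past the IP addresses and ports.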

NAT — Network Address Translation

How private IP addresses reach the internet:

Your laptop: 192.168.1.10 (private, not routable on internet)
Router/NAT:  203.0.113.5  (public IP)

Outbound:
  192.168.1.10:52000 -> 8.8.8.8:53
  NAT rewrites: 203.0.113.5:52000 -> 8.8.8.8:53

Inbound response:
  8.8.8.8:53 -> 203.0.113.5:52000
  NAT rewrites back: -> 192.168.1.10:52000

External world only sees 203.0.113.5 -- private IPs hidden

AWS NAT Gateway does the same thing for private subnet EC2 instances.
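
The rewrite-and-remember step can be sketched as a lookup table keyed by port. This toy version keeps the original port number, as in the example above; a real NAT device may also rewrite the port when two private hosts pick the same one, and it expires idle entries. The Nat class is hypothetical.

```python
Endpoint = tuple[str, int]  # (ip address, port)


class Nat:
    """Rewrite private source addresses to one public IP and back."""

    def __init__(self, public_ip: str) -> None:
        self.public_ip = public_ip
        self._table: dict[int, Endpoint] = {}  # public port -> private endpoint

    def outbound(self, src: Endpoint, dst: Endpoint) -> tuple[Endpoint, Endpoint]:
        """Replace the private source with the public IP and remember the mapping."""
        _private_ip, port = src
        self._table[port] = src
        return (self.public_ip, port), dst

    def inbound(self, src: Endpoint, dst: Endpoint) -> tuple[Endpoint, Endpoint]:
        """Send a reply back to the private host that opened the flow."""
        _public_ip, port = dst
        return src, self._table[port]  # KeyError = unsolicited traffic, no mapping
```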


Mental Model

Networking = a series of envelopes inside envelopes

Your HTTP request gets wrapped in:
  HTTP data
  -> TCP segment (adds port numbers)
  -> IP packet (adds source/destination IP)
  -> Ethernet frame (adds MAC addresses)
  -> Physical signal

Each layer adds its own header in front of the data it receives (Ethernet also appends a trailing checksum). The receiving end strips them off in reverse order.
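
The envelope idea can be imitated with byte strings. The headers below are readable text labels invented for illustration, not real TCP, IP, or Ethernet header formats, and the addresses are placeholders.

```python
def encapsulate(http_data: bytes) -> bytes:
    """Wrap application data in toy TCP, IP, and Ethernet headers."""
    segment = b"[TCP 52000->443]" + http_data
    packet = b"[IP 192.168.1.10->142.250.184.46]" + segment
    frame = b"[ETH aa:aa:aa:aa:aa:aa->bb:bb:bb:bb:bb:bb]" + packet
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Strip the headers in reverse order: Ethernet, then IP, then TCP."""
    data = frame
    for _layer in ("Ethernet", "IP", "TCP"):
        _header, _, data = data.partition(b"]")  # drop everything up to the header's end
    return data
```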

DNS = the address book
IP  = the postal address system
TCP = the registered mail service (guaranteed delivery)
UDP = dropping a letter in the box (best effort, fast)
TLS = a locked box inside the envelope

Self-Quiz

  1. What are the 7 OSI layers? What does each one do?
  2. What's the difference between TCP and UDP? Give an example use case for each.
  3. Walk through what happens when you type https://google.com in a browser.
  4. What port does HTTPS use? SSH? MySQL?
  5. What's the difference between an ALB and NLB?
  6. What's NAT and why does your home network need it?
  7. What DNS record type maps a domain to an IP address?
  8. What does TTL mean in the context of DNS?