Networking for Senior Engineers: Everything You Actually Use


The Big Picture

When a user clicks a link in your app, here’s what actually happens:

  1. Browser resolves the domain name to an IP address (DNS)
  2. Establishes a TCP connection with a 3-way handshake
  3. Encrypts the connection with TLS
  4. Sends the HTTP request through your load balancer
  5. Your middleware pipeline runs (auth, rate limiting, logging)
  6. Your controller handles the request, reads from cache or database
  7. Response flows back through the same path

Most developers only think about step 6. Seniors understand all 7 — because when step 2 is slow, or step 4 is misconfigured, or step 5 has a bug, the app breaks and you need to know where to look.

The OSI Model

The OSI model has 7 layers. Each layer adds headers to your data and handles one responsibility. You rarely think about it directly, but every networking problem maps to a specific layer.

  • Layer 7 (Application) — This is where your code lives. HTTP, DNS, SSH.
  • Layer 4 (Transport) — TCP and UDP. Ports live here.
  • Layer 3 (Network) — IP addresses and routing. Routers work here.
  • Layers 2 and 1 — Ethernet, MAC addresses, cables. You almost never touch these.

In practice, everyone uses the simplified TCP/IP model (4 layers instead of 7). Same idea, fewer boxes.

OSI layers and example protocols:

| Layer | Name | Examples |
| --- | --- | --- |
| L7 | Application | HTTP, DNS, SSH, FTP |
| L6 | Presentation | SSL/TLS, JPEG, ASCII |
| L5 | Session | NetBIOS, RPC, PPTP |
| L4 | Transport | TCP, UDP |
| L3 | Network | IP, ICMP, OSPF |
| L2 | Data Link | Ethernet, MAC, ARP |
| L1 | Physical | Cables, Wi-Fi, Fiber |

Real work example: Your app can’t connect to the database. You ping the IP and it responds. You telnet to port 5432 and it connects. So Layers 3 and 4 are fine and the problem is at Layer 7: your ORM is sending wrong auth parameters. If all you knew was “networking is broken,” you’d waste hours. Layer thinking narrows it down in minutes.

TCP vs UDP

Two transport protocols. Pick one per use case.

TCP guarantees reliable, in-order delivery: corrupted segments are detected by checksum and lost ones are retransmitted. The cost: overhead (handshake, acknowledgments, retransmission). Used for everything where correctness matters.

UDP sends packets and doesn’t care if they arrive. No handshake, no ordering, no retries. Used when speed matters more than reliability.

| Use Case | Protocol | Why |
| --- | --- | --- |
| REST APIs | TCP | You need every response intact |
| Live video streaming | UDP | A dropped frame is fine, lag is not |
| Database connections | TCP | A missing row is not fine |
| DNS queries | UDP | Fast lookup, retry if no response |
| Online gaming | UDP | Old position updates are useless |
| File uploads | TCP | Every byte must arrive correctly |

Feature Comparison

| Feature | TCP | UDP |
| --- | --- | --- |
| Connection | 3-way handshake required | No connection needed |
| Ordering | Guaranteed order | No ordering |
| Retransmit | Lost packets resent | Lost = gone |
| Flow Control | Sliding window | None |
| Speed | Slower (overhead) | Faster (minimal) |
| Use Case | APIs, files, payments | Streaming, gaming, DNS |

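Real work example: You’re building a real-time feature. WebSockets run over TCP, so messages arrive intact and in order while the connection is up, but a dropped connection is not restored for you: the client has to reconnect itself. Server-Sent Events also use TCP, and the browser’s EventSource reconnects automatically. If you’re building a game server, use UDP with your own reliability layer on top.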
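
A minimal sketch of what the receiving half of such a reliability layer might look like for position updates, assuming every datagram carries a sequence number. The message shape (`seq`, `x`, `y`) and the name `createPositionReceiver` are hypothetical, and nothing is sent over the network here; the array simply simulates datagrams arriving out of order.

```javascript
// Hypothetical receive-side logic for position updates sent over UDP.
// UDP may drop or reorder datagrams, so each update carries a sequence
// number and anything older than the newest one seen is discarded.
function createPositionReceiver() {
  let lastSeq = -1;
  let position = null;

  return {
    // Returns true if the update was applied, false if it was stale.
    receive(update) {
      if (update.seq <= lastSeq) return false; // late or duplicate: ignore
      lastSeq = update.seq;
      position = { x: update.x, y: update.y };
      return true;
    },
    current() {
      return position;
    },
  };
}

// Datagrams arrive out of order; seq 2 was lost in transit.
const receiver = createPositionReceiver();
const applied = [
  { seq: 1, x: 10, y: 0 },
  { seq: 3, x: 30, y: 0 },
  { seq: 0, x: 0, y: 0 }, // arrives late
].map((u) => receiver.receive(u));

console.log(applied);            // [ true, true, false ]
console.log(receiver.current()); // { x: 30, y: 0 }
```

The lost update is never requested again, which is the point: for positions, the newest value supersedes everything before it.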

The TCP Handshake

Every TCP connection starts with a 3-way handshake: SYN (client says “I want to connect”), SYN-ACK (server says “OK, I’m ready”), ACK (client says “Let’s go”). After that, data flows. When done, a 4-way teardown closes it.

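Without connection reuse, this happens for every HTTP request. HTTP/1.1 keep-alive reuses a connection for sequential requests, and HTTP/2 multiplexes many concurrent requests over one connection. For your API, each new connection costs ~1 round trip (usually 20-100ms depending on distance) before TLS even starts.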


Real work example: Your API is slow from another country. The TCP handshake alone takes 150ms. Solution: reuse connections (HTTP keep-alive, or HTTP/2 with one connection carrying multiple requests), or use a CDN with edge servers closer to the user.
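
To put rough numbers on that, here is a back-of-the-envelope model. It assumes sequential requests, a 1-round-trip TCP handshake, a 1-round-trip TLS 1.3 handshake, and 1 round trip per request, and it ignores server processing time. The function name and parameters are illustrative only.

```javascript
// Rough latency model: every new connection pays 1 RTT for the TCP
// handshake and 1 RTT for the TLS 1.3 handshake before the first request.
// Each request/response then costs 1 RTT. Server time is ignored.
function totalLatencyMs({ rttMs, requests, reuseConnection }) {
  const setupPerConnection = 2 * rttMs; // TCP (1 RTT) + TLS 1.3 (1 RTT)
  const connections = reuseConnection ? 1 : requests;
  return connections * setupPerConnection + requests * rttMs;
}

const rttMs = 150; // the cross-country round trip from the example
const fresh = totalLatencyMs({ rttMs, requests: 10, reuseConnection: false });
const reused = totalLatencyMs({ rttMs, requests: 10, reuseConnection: true });

console.log(fresh);  // 4500
console.log(reused); // 1800
```

Under these assumptions, reusing one connection for ten requests removes nine connection setups, which is where the 2.7 seconds go.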

DNS: How Domains Become IPs

DNS is the phonebook of the internet. api.example.com becomes 104.21.76.8. The resolution chain goes: browser cache -> OS cache -> recursive resolver (like 1.1.1.1 or 8.8.8.8) -> root nameserver -> TLD nameserver -> authoritative nameserver.

A lookup that misses the local caches typically takes 20-120ms. If the recursive resolver has nothing cached either and must walk the chain from the root, it can take 200ms+.

How DNS Resolution Works

DNS translates human-readable domain names into IP addresses. The lookup for dotsdecoded.com proceeds in these steps:

  1. Browser: the user types dotsdecoded.com in the address bar
  2. Local cache: the browser checks its local DNS cache
  3. Resolver: the query is sent to a recursive resolver
  4. Root NS: the root nameserver is queried for dotsdecoded.com
  5. TLD NS: the .com TLD server is queried
  6. Auth NS: the authoritative nameserver responds
  7. Cached: the resolver caches the result and returns it to the browser
  8. Connected: the browser establishes the TCP/TLS connection

Real work example: Your staging API resolves in 5ms (cached), but your production API takes 180ms. You dig into it and find DNS is hitting the authoritative nameserver every time because the TTL is set to 0. Fix: increase TTL to 300 (5 minutes) and add a local DNS resolver.
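
The effect of the TTL can be seen in a toy caching resolver. This is a sketch, not how any particular resolver is implemented: `createCachingResolver` and `lookupUpstream` are hypothetical names, and the upstream lookup is a stub that returns a fixed address rather than querying a real nameserver.

```javascript
// A toy caching resolver. `lookupUpstream` and `now` are injected so the
// sketch runs without touching the network or depending on real time.
function createCachingResolver(lookupUpstream, now = () => Date.now()) {
  const cache = new Map(); // hostname -> { address, expiresAt }

  return function resolve(hostname) {
    const hit = cache.get(hostname);
    if (hit && hit.expiresAt > now()) return hit.address;

    const { address, ttlSeconds } = lookupUpstream(hostname);
    // With a TTL of 0 the entry is never stored, so every call goes upstream.
    if (ttlSeconds > 0) {
      cache.set(hostname, { address, expiresAt: now() + ttlSeconds * 1000 });
    }
    return address;
  };
}

// Count upstream queries for 5 lookups with a given TTL.
function countUpstreamQueries(ttlSeconds) {
  let queries = 0;
  const resolve = createCachingResolver(() => {
    queries += 1;
    return { address: "104.21.76.8", ttlSeconds };
  });
  for (let i = 0; i < 5; i++) resolve("api.example.com");
  return queries;
}

console.log(countUpstreamQueries(0));   // 5
console.log(countUpstreamQueries(300)); // 1
```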

HTTP Deep Dive

HTTP is stateless request/response. Every API call you write is one of these methods:

  • GET — Read. Safe and idempotent. Repeating it never changes server state, although the response can differ if the resource changed in between.
  • POST — Create. Not idempotent. Two identical POSTs create two resources.
  • PUT — Replace. Idempotent. Send it 10 times, result is the same as sending once.
  • PATCH — Partial update. Not guaranteed to be idempotent (depends on implementation).
  • DELETE — Remove. Idempotent. Deleting something twice is the same as deleting once.

Status codes you’ll see daily: 200 (OK), 201 (Created), 204 (No Content), 301 (Moved Permanently), 400 (Bad Request), 401 (Unauthorized, which in practice means unauthenticated), 403 (Forbidden: authenticated but not allowed), 404 (Not Found), 429 (Too Many Requests: rate limited), 500 (Internal Server Error), 502 (Bad Gateway: your reverse proxy got no valid response from the app), 503 (Service Unavailable), 504 (Gateway Timeout: the app took too long to answer the proxy).

Status Code Families

| Family | Meaning | Examples |
| --- | --- | --- |
| 1xx | Informational | 100 Continue, 101 Switching Protocols |
| 2xx | Success | 200 OK, 201 Created, 204 No Content |
| 3xx | Redirection | 301 Moved Permanently, 302 Found, 304 Not Modified |
| 4xx | Client Error | 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 429 Too Many Requests |
| 5xx | Server Error | 500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout |

Real work example: Your mobile app gets 400 errors on a specific endpoint. You check the request body and it looks correct. The issue is the Content-Type header — the app sends text/plain but your API expects application/json. A 5-minute fix, but only if you know HTTP headers.

SSL/TLS: How HTTPS Actually Works

TLS encrypts everything between client and server. TLS 1.3 (the current standard) completes the handshake in 1 round trip.

The key insight: the server’s private key never leaves the server. The client and server independently derive a shared secret using Diffie-Hellman key exchange. Even if someone captures all traffic, they can’t decrypt it without that secret.
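
The key-agreement step can be demonstrated with Node’s built-in `crypto` module. This sketch shows only the elliptic-curve Diffie-Hellman exchange; the curve (`prime256v1`) is an arbitrary choice for illustration, and a real TLS handshake also authenticates the server with its certificate and runs the shared secret through a key-derivation step before any data is encrypted.

```javascript
const crypto = require("node:crypto");

// Each side generates its own ephemeral key pair. Only the public halves
// are exchanged; the private halves never leave their owner.
const client = crypto.createECDH("prime256v1");
const server = crypto.createECDH("prime256v1");
const clientPublic = client.generateKeys();
const serverPublic = server.generateKeys();

// Each side combines its own private key with the other side's public key.
const clientSecret = client.computeSecret(serverPublic);
const serverSecret = server.computeSecret(clientPublic);

// Both arrive at the same bytes without the secret ever crossing the wire.
console.log(clientSecret.equals(serverSecret)); // true
```

An eavesdropper sees both public keys but neither private key, which is not enough to compute the same secret.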

What TLS prevents:

  • Man-in-the-middle — attackers can’t impersonate your server (certificate verification)
  • Eavesdropping — captured packets are unreadable (symmetric encryption such as AES-GCM or ChaCha20-Poly1305)
  • Tampering — modified packets are detected and rejected (authentication tags)

Real work example: Your staging environment uses a self-signed certificate and your API client refuses to connect. You don’t disable certificate verification (that’s what juniors do). You add the certificate, or the private CA that issued it, to the client’s trust store. In production, you use Let’s Encrypt with auto-renewal via Certbot.

NAT: Why Your Local Dev Doesn’t Match Production

Your laptop has IP 192.168.1.42 (private). The internet can’t route to private IPs. Your router translates it to a public IP (like 203.0.113.5) using NAT (Network Address Translation).

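This is why localhost:3000 works on your machine but not on your phone: localhost always means “this device,” so on the phone it points at the phone itself. On the same Wi-Fi you can reach the laptop at 192.168.1.42:3000, provided the dev server listens on all interfaces. From outside your network that private IP isn’t routable, and the router has no port forwarding set up.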
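
A toy translation table makes the mechanism concrete. This is a simplified sketch: real NAT implementations also track protocol, remote endpoint, and timeouts, and the starting public port (40000) and the function names are made up for illustration.

```javascript
// Toy NAT: rewrites the source of outbound packets to the router's public
// IP and remembers the mapping so replies can be routed back.
function createNat(publicIp, firstPort = 40000) {
  const outbound = new Map(); // "privateIp:port" -> public port
  const inbound = new Map();  // public port -> { ip, port }
  let nextPort = firstPort;

  return {
    translateOutbound(packet) {
      const key = `${packet.srcIp}:${packet.srcPort}`;
      if (!outbound.has(key)) {
        outbound.set(key, nextPort);
        inbound.set(nextPort, { ip: packet.srcIp, port: packet.srcPort });
        nextPort += 1;
      }
      return { ...packet, srcIp: publicIp, srcPort: outbound.get(key) };
    },
    // Returns null when no mapping exists: unsolicited inbound traffic
    // has nowhere to go, which is why private hosts are unreachable.
    translateInbound(packet) {
      const target = inbound.get(packet.dstPort);
      if (!target) return null;
      return { ...packet, dstIp: target.ip, dstPort: target.port };
    },
  };
}

const nat = createNat("203.0.113.5");
const sent = nat.translateOutbound({
  srcIp: "192.168.1.42", srcPort: 52341, dstIp: "104.21.76.8", dstPort: 443,
});
const reply = nat.translateInbound({
  srcIp: "104.21.76.8", srcPort: 443, dstIp: "203.0.113.5", dstPort: sent.srcPort,
});
const unsolicited = nat.translateInbound({
  srcIp: "198.51.100.7", srcPort: 5555, dstIp: "203.0.113.5", dstPort: 3000,
});

console.log(sent.srcIp, sent.srcPort);   // 203.0.113.5 40000
console.log(reply.dstIp, reply.dstPort); // 192.168.1.42 52341
console.log(unsolicited);                // null
```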

Your Network

When your device sends data to the internet, it passes through your router which performs NAT, replacing your private IP with a public one so the response can find its way back.

Path: Your Device (192.168.1.42) -> Router (192.168.1.1) -> ISP (203.0.113.1) -> Internet

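Real work example: You’re running a WebSocket server locally and it works fine. In production, idle connections drop after 30 seconds. The AWS load balancer has a 60-second idle timeout, and a NAT device on the path has a 30-second idle timeout. Fix: send a heartbeat every 15 seconds. WebSocket ping/pong frames are the safer choice, because TCP keep-alive probes carry no data and some load balancers don’t count them as activity.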

Ports: Which Application Gets the Data

An IP identifies a machine. A port identifies an application on that machine. When data arrives at 104.21.76.8:443, port 443 tells the OS to hand it to the HTTPS server.

Common ports you’ll encounter: 22 (SSH), 80 (HTTP), 443 (HTTPS), 3000 (dev server), 3306 (MySQL), 5432 (PostgreSQL), 6379 (Redis), 8080 (alternative HTTP).

Client ports (ephemeral ports) are picked by the OS from a high range: 49152-65535 per IANA, although Linux defaults to 32768-60999. That’s why you see 52341 -> 443 in logs: your browser opened a temporary port to talk to the server on port 443.

Ports identify which application should receive data on a device. An IP address is the building, a port is the apartment number.

Example connection: 192.168.1.42:52341 (your device, ephemeral port) -> 93.184.216.34:443 (remote HTTPS server) via TCP

| Range | Ports | Used for |
| --- | --- | --- |
| Well-known | 0-1023 | System services (HTTP, SSH, DNS) |
| Registered | 1024-49151 | Application services (MySQL, Redis) |
| Ephemeral | 49152-65535 | Temporary client ports |
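
The three ranges above translate directly into a small classifier. `classifyPort` is a hypothetical helper written for this article, handy for annotating the port pairs you see in connection logs.

```javascript
// Classifies a port number using the well-known / registered / ephemeral
// ranges listed above.
function classifyPort(port) {
  if (!Number.isInteger(port) || port < 0 || port > 65535) {
    throw new RangeError(`not a valid port: ${port}`);
  }
  if (port <= 1023) return "well-known";
  if (port <= 49151) return "registered";
  return "ephemeral";
}

console.log(classifyPort(443));   // well-known
console.log(classifyPort(5432));  // registered
console.log(classifyPort(52341)); // ephemeral
```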

Real work example: Your app can’t connect to PostgreSQL. You check the host, username, and password, and all are correct. The issue: PostgreSQL listens on 5432 inside the Docker container, but that port was never published, so nothing is listening on 5432 on the host. Fix: publish the container port with docker run -p 5432:5432.

Load Balancing

One server can’t handle all traffic. A load balancer distributes requests across multiple servers. You use one even with 2 servers.

Algorithms:

  • Round Robin — cycles through servers equally. Simple, works when servers are identical.
  • Least Connections — sends to the server with fewest active requests. Better when requests take different times.
  • IP Hash — same client always hits the same server. Useful for session stickiness (but use shared sessions instead).
  • Weighted — stronger servers get more traffic. Used when servers have different specs.
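
The first two algorithms fit in a few lines each. This is a sketch over a plain array of server objects; the `active` field (number of in-flight requests) and the server names are assumptions for illustration, and a real load balancer would update those counts as requests start and finish.

```javascript
// Round robin: hand out servers in order, wrapping around at the end.
function roundRobin(servers) {
  let index = 0;
  return () => servers[index++ % servers.length];
}

// Least connections: pick the server with the fewest active requests.
function leastConnections(servers) {
  return () =>
    servers.reduce((best, s) => (s.active < best.active ? s : best));
}

const servers = [
  { name: "A", active: 4 },
  { name: "B", active: 1 },
  { name: "C", active: 7 },
];

const nextRoundRobin = roundRobin(servers);
const order = [1, 2, 3, 4].map(() => nextRoundRobin().name);
console.log(order); // [ 'A', 'B', 'C', 'A' ]

const nextLeastBusy = leastConnections(servers);
console.log(nextLeastBusy().name); // B
```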

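Real work example: Your API has 3 servers behind Nginx. Users report intermittent errors. You check the logs and find Server C is returning 500s because it ran out of memory. Least-connections balancing kept routing traffic to it, and because failing requests finish quickly, it often looked like the least busy server. Fix: add health checks so the load balancer removes unhealthy servers from the pool. In Nginx, passive checks are configured per upstream server with max_fails=3 fail_timeout=30s. By default only connection errors and timeouts count as failures, so add http_500 to proxy_next_upstream if 500 responses should count too.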

Caching

Caching is often the single most impactful performance optimization. Every layer can cache:

  1. Browser cache — controlled by Cache-Control and ETag headers. Free, zero server cost.
  2. CDN cache — Cloudflare, Fastly. Caches at edge locations worldwide. 5-20ms latency.
  3. Application cache — Redis, Memcached. In-memory, sub-millisecond reads. You control the TTL.
  4. Database query cache — some databases cache query results. Usually not enough — use Redis instead.

The cache hierarchy: browser -> CDN -> Redis -> database. Each miss falls through to the next layer.

Caching Strategies

Caching is the #1 way to make your API fast. Each cache layer has different latency and scope.

Example: user list (changes rarely), TTL: 300s, cache key: users:all

| Layer | Latency | Where it lives |
| --- | --- | --- |
| Browser Cache | 0ms | Cached by the browser using Cache-Control headers |
| CDN (Cloudflare) | 5-20ms | Edge servers worldwide cache static content |
| Redis Cache | <1ms | In-memory cache on your server for API responses |
| Database | 5-50ms | Source of truth, slowest but always fresh |

Real work example: Your /api/products endpoint hits the database every time and takes 200ms. You add Redis caching with a 5-minute TTL. Latency drops to 2ms. Cache invalidation: when a product is updated, you delete the cache key (DEL products:all) and the next request rebuilds it. Simple pattern: cache-aside.
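
The cache-aside pattern from this example can be sketched as follows. To keep it self-contained, a `Map` stands in for Redis and everything is synchronous; with a real Redis client the reads and writes would be asynchronous. `createCacheAside` and `loadFromDb` are hypothetical names.

```javascript
// Cache-aside with a Map standing in for Redis. `loadFromDb` is a
// hypothetical loader; `now` is injectable so expiry can be tested.
function createCacheAside(loadFromDb, ttlMs, now = () => Date.now()) {
  const cache = new Map(); // key -> { value, expiresAt }

  return {
    get(key) {
      const entry = cache.get(key);
      if (entry && entry.expiresAt > now()) return entry.value; // hit
      const value = loadFromDb(key);                             // miss
      cache.set(key, { value, expiresAt: now() + ttlMs });
      return value;
    },
    // Call after a write so the next read rebuilds the entry.
    invalidate(key) {
      cache.delete(key);
    },
  };
}

let dbReads = 0;
const products = createCacheAside(() => {
  dbReads += 1;
  return [{ id: 1, name: "Widget" }];
}, 5 * 60 * 1000);

products.get("products:all"); // miss: reads the database
products.get("products:all"); // hit: served from the cache
products.invalidate("products:all");
products.get("products:all"); // miss again after invalidation

console.log(dbReads); // 2
```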

CDN: Edge Caching

A CDN (Content Delivery Network) puts your content on servers around the world. A user in Tokyo hits a CDN node in Tokyo, not your origin server in Virginia. Latency drops from 200ms to 20ms.

CDNs cache static assets (JS, CSS, images) and can also cache API responses if you set the right headers. Cloudflare, Fastly, and AWS CloudFront are the big ones.

Real work example: Your product images load from images.example.com hosted in Virginia. Users in Europe see 800ms load times. You put Cloudflare in front and set Cache-Control: public, max-age=86400. Images load from the nearest edge. Load time drops to 40ms.

Firewalls

A firewall filters traffic based on rules: allow or deny, by port, protocol, and source IP. Every cloud provider has one (AWS Security Groups, GCP Firewall Rules).

Default posture: deny everything, allow only what you need.

| Rule | Port | Why |
| --- | --- | --- |
| Allow TCP 443 | 443 | HTTPS traffic (your app) |
| Allow TCP 80 | 80 | HTTP (redirect to HTTPS) |
| Deny TCP 22 from internet | 22 | SSH only from VPN/bastion |
| Allow TCP 5432 from internal | 5432 | DB only reachable from app servers |
| Deny TCP 3306 from internet | 3306 | MySQL never exposed publicly |

Firewall Rules

A firewall controls which traffic can enter and leave your network. Rules are evaluated top to bottom:

| Action | Direction | Port | Protocol | Purpose |
| --- | --- | --- | --- | --- |
| ALLOW | inbound | 443 | TCP | HTTPS web traffic |
| ALLOW | inbound | 80 | TCP | HTTP web traffic |
| DENY | inbound | 22 | TCP | SSH (block external access) |
| ALLOW | inbound | 5432 | TCP | PostgreSQL (internal only) |
| ALLOW | outbound | 443 | TCP | API calls to external services |
| ALLOW | outbound | 6379 | TCP | Redis cache (internal only) |
Real work example: Your production database has port 5432 open to 0.0.0.0/0 (the entire internet). Someone brute-forces the password and dumps your users table. Fix: restrict the security group to only allow connections from your app server’s IP. Never expose database ports to the internet.
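
A deny-by-default rule set with first-match evaluation can be sketched like this. It is a generic illustration, not how any specific cloud firewall works: the `source` labels (`"any"`, `"internal"`, `"internet"`) are simplified stand-ins for CIDR ranges, and the rule shape is hypothetical.

```javascript
// Rules are checked top to bottom; the first match decides. If nothing
// matches, the packet is denied (default-deny posture).
const rules = [
  { action: "allow", port: 443, source: "any" },
  { action: "allow", port: 80, source: "any" },
  { action: "allow", port: 22, source: "internal" },
  { action: "allow", port: 5432, source: "internal" },
];

function evaluate(packet, ruleList) {
  for (const rule of ruleList) {
    const portMatches = rule.port === packet.port;
    const sourceMatches = rule.source === "any" || rule.source === packet.source;
    if (portMatches && sourceMatches) return rule.action;
  }
  return "deny";
}

console.log(evaluate({ port: 443, source: "internet" }, rules));  // allow
console.log(evaluate({ port: 5432, source: "internet" }, rules)); // deny
console.log(evaluate({ port: 5432, source: "internal" }, rules)); // allow
```

Note that the database port needs no explicit deny rule for internet traffic: it is blocked simply because nothing allows it.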

Middleware: Where Networking Meets Your Code

Middleware is code that runs before (and after) your request handler. It’s the bridge between raw networking and your application logic.

Every web framework has middleware. The order matters — CORS runs before auth (because the preflight request has no auth headers), auth runs before your controller (because you need to know who the user is).

Common middleware you’ll write or configure:

| Middleware | Purpose | Real Work Example |
| --- | --- | --- |
| CORS | Allow cross-origin requests | Your React app on app.com calls API on api.com |
| Rate Limiter | Prevent abuse | Limit to 100 req/min per IP, return 429 when exceeded |
| Auth (JWT) | Identify the user | Extract token from header, set current_user |
| Request Logger | Observability | Log method, path, status, response time for monitoring |
| Strong Params | Input validation | Only allow name and email on user creation |
| CSRF | Prevent cross-site forgery | Verify token on POST/PUT/DELETE requests |
| Error Handler | Catch exceptions | Return 500 with error ID instead of stack traces |
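
As one concrete case, the rate limiter row could be implemented with a fixed-window counter. This is a minimal sketch of that one strategy (sliding windows and token buckets are common alternatives); it keeps counts in process memory, and `createRateLimiter` and its options are hypothetical names. It returns the status code a middleware would respond with.

```javascript
// Fixed-window limiter: at most `limit` requests per `windowMs` per key.
// `now` is injectable so the window rollover can be exercised in tests.
function createRateLimiter({ limit, windowMs, now = () => Date.now() }) {
  const windows = new Map(); // key -> { start, count }

  // Returns the HTTP status the middleware would respond with.
  return function check(key) {
    const t = now();
    const w = windows.get(key);
    if (!w || t - w.start >= windowMs) {
      windows.set(key, { start: t, count: 1 });
      return 200;
    }
    w.count += 1;
    return w.count > limit ? 429 : 200;
  };
}

let fakeTime = 0;
const check = createRateLimiter({ limit: 3, windowMs: 60_000, now: () => fakeTime });

const results = [1, 2, 3, 4].map(() => check("203.0.113.5"));
console.log(results); // [ 200, 200, 200, 429 ]

fakeTime = 60_000; // a new window starts
const afterReset = check("203.0.113.5");
console.log(afterReset); // 200
```
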
Middleware Pipeline

Middleware wraps your request handler like layers of an onion. Each one can read, modify, or reject the request before it reaches your code.

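const express = require('express')
const cors = require('cors')
const rateLimit = require('express-rate-limit')
// verifyJWT, requestLogger, and User are defined elsewhere in your app

const app = express()
app.use(cors())                                     // first: preflight requests carry no auth headers
app.use(rateLimit({ windowMs: 60_000, max: 100 }))  // 100 requests per minute per IP
app.use(verifyJWT)
app.use(requestLogger)
app.use(express.json())                             // parses JSON bodies into req.body

app.post('/users', (req, res) => {
  const user = User.create(req.body)
  res.status(201).json(user)                        // set the status first, then send the body
})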

Real work example: Your API returns 500 errors but you can’t reproduce them locally. You add request logging middleware that logs the full request (headers, body, params) with a unique request ID. You correlate the request ID with your error tracker. Turns out a mobile client is sending Content-Type: text/xml and your JSON parser crashes. Add a middleware that validates Content-Type before it reaches the controller.
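
A sketch of such a Content-Type check, written as an Express-style `(req, res, next)` function. The name `requireJson`, the choice of 415 as the rejection status, and the error message are assumptions; the small `run` helper only fakes `req` and `res` so the example runs without Express installed.

```javascript
// Express-style middleware (req, res, next). Rejects bodies that are not
// JSON with 415 Unsupported Media Type before any parser or handler runs.
function requireJson(req, res, next) {
  const hasBody = ["POST", "PUT", "PATCH"].includes(req.method);
  const contentType = (req.headers["content-type"] || "").toLowerCase();
  // Match "application/json" with or without parameters such as charset.
  const isJson = contentType.split(";")[0].trim() === "application/json";

  if (hasBody && !isJson) {
    res.status(415).json({ error: "Content-Type must be application/json" });
    return;
  }
  next();
}

// Minimal stand-ins for Express's req/res so the sketch runs on its own.
function run(method, contentType) {
  const outcome = { status: null, passed: false };
  const req = { method, headers: { "content-type": contentType } };
  const res = {
    status(code) { outcome.status = code; return this; },
    json() { return this; },
  };
  requireJson(req, res, () => { outcome.passed = true; });
  return outcome;
}

console.log(run("POST", "text/xml"));
// { status: 415, passed: false }
console.log(run("POST", "application/json; charset=utf-8"));
// { status: null, passed: true }
```

Registered before the body parser, this turns the crash into a clear client error that the mobile team can act on.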

Debugging Slow APIs

When your API is slow, check each layer in order:

| Layer | Check | Tool |
| --- | --- | --- |
| DNS | Is resolution slow? | dig api.example.com, check TTL |
| TCP | Is the handshake slow? | curl -w "%{time_connect}" -o /dev/null -s URL |
| TLS | Is the handshake slow? | curl -w "%{time_appconnect}" -o /dev/null -s URL |
| Load Balancer | Is it healthy? | Check health endpoints, connection counts |
| Middleware | Is something blocking? | Add timing logs around each middleware |
| Cache | Are you hitting the DB? | Check Redis hit rate, cache TTLs |
| Database | Is the query slow? | EXPLAIN ANALYZE, check indexes |
| Network | Is bandwidth the issue? | Check response size, enable gzip |

The senior move: Add a timing header to every response (X-Response-Time: 142ms) and log the breakdown (DNS: 5ms, TCP: 28ms, middleware: 12ms, DB: 89ms, render: 8ms). When someone says “the API is slow,” you immediately know where.
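
One way the server-side part of that breakdown might be collected is a small per-request timer. This sketch is synchronous and uses a simulated clock so its output is deterministic; `createRequestTimer` is a hypothetical helper, and wiring its `headerValue()` into an actual X-Response-Time header depends on your framework.

```javascript
// Records how long each named phase of a request took. `now` is
// injectable; in real code it could be performance.now().
function createRequestTimer(now = () => Date.now()) {
  const phases = [];
  const startedAt = now();

  return {
    // Runs `fn`, records its duration under `name`, returns its result.
    measure(name, fn) {
      const t0 = now();
      const result = fn();
      phases.push({ name, ms: now() - t0 });
      return result;
    },
    headerValue() {
      return `${now() - startedAt}ms`;
    },
    breakdown() {
      return phases.map((p) => `${p.name}: ${p.ms}ms`).join(", ");
    },
  };
}

// Simulated clock: each phase advances it by a fixed amount.
let clock = 0;
const timer = createRequestTimer(() => clock);
timer.measure("middleware", () => { clock += 12; });
timer.measure("DB", () => { clock += 89; });
timer.measure("render", () => { clock += 8; });

console.log(timer.headerValue()); // 109ms
console.log(timer.breakdown());   // middleware: 12ms, DB: 89ms, render: 8ms
```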

Quick Reference

Latency numbers every senior should know:

| Operation | Typical Latency |
| --- | --- |
| DNS cache hit | <1ms |
| DNS cache miss | 20-120ms |
| TCP handshake (same region) | 10-30ms |
| TCP handshake (cross-continent) | 100-200ms |
| TLS handshake | 30-50ms |
| Redis read | <1ms |
| PostgreSQL query (indexed) | 1-10ms |
| PostgreSQL query (no index) | 100-1000ms+ |
| CDN cache hit | 5-20ms |
| API response (cached) | 2-5ms |
| API response (DB hit) | 50-500ms |

Idempotency cheat sheet: GET and PUT are idempotent. POST is not. DELETE is idempotent. PATCH depends on implementation. If your payment API uses POST, add an idempotency key header so retrying a failed payment doesn’t double-charge.
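
A minimal sketch of the idempotency-key idea, assuming the client sends the same key on every retry of one logical payment. `createPaymentHandler` and `chargeCard` are hypothetical names, and the in-memory `Map` is only for illustration: a production version would need a shared, durable store and would have to handle two requests with the same key arriving concurrently.

```javascript
// Remembers the result of each idempotency key so a retried request
// returns the original outcome instead of charging again.
function createPaymentHandler(chargeCard) {
  const completed = new Map(); // idempotency key -> stored response

  return function handle(idempotencyKey, payment) {
    if (completed.has(idempotencyKey)) {
      return completed.get(idempotencyKey); // retry: replay, do not charge
    }
    const response = chargeCard(payment);
    completed.set(idempotencyKey, response);
    return response;
  };
}

let charges = 0;
const handlePayment = createPaymentHandler((payment) => {
  charges += 1;
  return { status: 201, chargeId: `ch_${charges}`, amount: payment.amount };
});

const first = handlePayment("key-123", { amount: 4999 });
const retry = handlePayment("key-123", { amount: 4999 }); // client retried

console.log(charges);                           // 1
console.log(first.chargeId === retry.chargeId); // true
```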

Timeout defaults you should set: DNS: 5s, TCP connect: 10s, TLS: 10s, HTTP request: 30s (API), 300s (file upload). Always set timeouts. A connection without a timeout is a connection that hangs forever.