WebSockets Deep Dive: Protocol, Framing, and Real-Time Bidirectional Communication

Imagine you are in a library. You want to ask the librarian a series of questions. In the old web model, you would walk up to the desk, ask a question, wait for the answer, walk back to your seat, process the answer, then walk back to ask the next question. Every question requires a new trip. That is HTTP polling.

Now imagine a different library. You walk up once, hand the librarian a note that says “I will be asking follow-up questions.” The librarian nods. You ask your first question, get an answer immediately, ask a follow-up, get another answer. The conversation flows naturally, both directions, without you leaving the desk. That is WebSocket.

What Problem Does WebSocket Solve?

The web was built on a request-response cycle. The client sends a request, the server sends one response, and the connection closes. This is fine for documents. But what about chat messages, live game state, streaming stock prices, or collaborative editing?

Engineers tried workarounds:

HTTP polling: The client sends a request every N seconds asking “any updates?” The server responds with the data or an empty body. Simple to implement, but wasteful. Most requests return nothing, and there is always latency equal to the polling interval.
Long-polling: The client sends a request. The server holds it open until new data is available, then responds. The client immediately sends a new request. This reduces empty responses but still requires a new HTTP request for every message. Headers are sent each time, adding overhead.
HTTP streaming: The server sends partial chunks of a response without closing the connection (Transfer-Encoding: chunked). The client reads chunks as they arrive. This avoids reconnection overhead but is still unidirectional (server to client only) and the client cannot send data through the same stream.

Each approach has tradeoffs. None of them give true bidirectional, low-latency communication over a single connection.

Enter WebSocket

WebSocket (RFC 6455) solves this by upgrading an HTTP connection into a persistent, full-duplex communication channel over a single TCP socket.

The key properties:

Single TCP connection — no repeated handshakes after the initial upgrade
Full-duplex — both sides send data simultaneously
Low framing overhead — 2-14 bytes per frame vs. hundreds of bytes for HTTP headers
Binary and text — native support for both UTF-8 text and raw binary
Sub-protocol negotiation — client and server agree on a higher-level protocol

The connection starts as HTTP, then upgrades. The server responds with a 101 Switching Protocols status, and from that point forward, both sides speak the WebSocket protocol over the same TCP socket.

WebSocket vs Polling vs SSE

| Feature | HTTP Polling | Long-Polling | SSE | WebSocket |
|---------|-------------|--------------|-----|-----------|
| Direction | Client to Server | Client to Server | Server to Client only | Bidirectional |
| Overhead | High (headers each time) | High (headers each time) | Low (one connection) | Very low (2-byte min frame) |
| Latency | Poll interval | One HTTP round trip | Immediate | Immediate |
| Binary | Yes (HTTP body) | Yes (HTTP body) | Base64 needed | Native binary |
| Auto-reconnect | Implicit (next poll) | Implicit (next poll) | Built-in | Manual implementation |
| Proxy friendly | Yes | Yes | Yes | May be blocked |
| Complexity | Trivial | Moderate | Simple | Moderate |

WebSocket wins on latency and overhead. SSE wins on simplicity and auto-reconnection. Polling wins on compatibility. Choose based on your use case.

The Opening Handshake

Every WebSocket connection begins as an HTTP request. The client sends a standard GET request with special headers:

Upgrade: websocket — signals the intent to switch protocols
Connection: Upgrade — tells intermediaries not to treat this as a regular HTTP request
Sec-WebSocket-Key — a 16-byte random value, base64-encoded. Used to prove the server understands the WebSocket protocol
Sec-WebSocket-Version: 13 — the protocol version (currently the only standardized version)

The server must not simply accept any upgrade request. It needs to prove it understands the protocol. The server computes a response token by:

Concatenating the Sec-WebSocket-Key with the magic GUID 258EAFA5-E914-47DA-95CA-C5AB0DC85B11
Taking the SHA-1 hash of this concatenation
Base64-encoding the resulting 20 bytes

The result is sent as Sec-WebSocket-Accept. This proves the server read and understood the WebSocket specification, because only someone who knows the magic GUID can produce the correct accept value.
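The three steps above can be sketched in a few lines of Python using only the standard library. The function names (make_key, compute_accept, accept_matches) are illustrative choices, not part of any library; the sample key and expected accept value are the pair given in RFC 6455.

```python
import base64
import hashlib
import os

MAGIC_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def make_key() -> str:
    """Client side: 16 random bytes, base64-encoded (always 24 characters)."""
    return base64.b64encode(os.urandom(16)).decode("ascii")

def compute_accept(key: str) -> str:
    """Server side: base64(SHA-1(key + GUID))."""
    digest = hashlib.sha1((key + MAGIC_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

def accept_matches(key: str, accept_header: str) -> bool:
    """Client side: check the Sec-WebSocket-Accept value the server returned."""
    return compute_accept(key) == accept_header.strip()

# Sample key/accept pair from RFC 6455
print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client that receives a 101 response would call something like accept_matches with the key it sent, and abandon the connection if the check fails.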


Client Request

GET /ws HTTP/1.1
Host: example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: s5CEclUR4qUOqcpaSo5kTw==
Sec-WebSocket-Version: 13

Server Response

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: <accept value derived from the key, see the steps below>

Accept Computation

Step 1: Concatenate the key with the magic GUID
s5CEclUR4qUOqcpaSo5kTw==258EAFA5-E914-47DA-95CA-C5AB0DC85B11
Step 2: Compute the SHA-1 hash of that string (20 bytes)
Step 3: Base64-encode the hash to get the Sec-WebSocket-Accept value

The handshake is deliberately simple. It reuses HTTP infrastructure (port 80/443, proxies, authentication) while establishing a protocol switch. After the handshake, the HTTP connection ceases to exist — both sides now speak WebSocket frames over the raw TCP socket.

WebSocket Frame Format

After the upgrade, all data is sent in frames. The frame format is binary and compact:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
|N|V|V|V|       |S|             |   (if payload len==126/127)   |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
|     Extended payload length continued (if payload len==127)   |
+ - - - - - - - - - - - - - - - +-------------------------------+
|                               |Masking-key (if MASK set)       |
+-------------------------------+-------------------------------+
| Masking-key (continued)       |          Payload Data         |
+-------------------------------- - - - - - - - - - - - - - - - +
:                     Payload Data continued ...                :
+ - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
|                     Payload Data (continued)                  |
+---------------------------------------------------------------+

FIN (1 bit): Marks the final frame of a message. 0 means more fragments follow.
RSV1-3 (3 bits): Reserved for extensions. Must be 0 unless an extension is negotiated.
Opcode (4 bits): The frame type: continuation (0), text (1), binary (2), close (8), ping (9), pong (10).
MASK (1 bit): Whether the payload is masked. Must be 1 for client-to-server frames.
Payload length (7 bits, or 7+16, or 7+64): The length of the payload. Values 0-125 are direct. 126 means the next 2 bytes are the length (big-endian). 127 means the next 8 bytes are the length.
Masking key (0 or 4 bytes): Present only if MASK=1. Used to XOR-mask the payload.
Payload data: The actual message content, possibly masked.
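The three-tier length encoding is the part of the header that most often trips people up, so here is a minimal sketch of it. The helper names encode_length and decode_length are made up for this example; encode_length returns the second header byte plus any extended length bytes, and decode_length expects a buffer that starts at the first byte of the frame.

```python
def encode_length(n: int, masked: bool = False) -> bytes:
    """Return the second header byte plus any extended length bytes."""
    mask_bit = 0x80 if masked else 0x00
    if n <= 125:
        return bytes([mask_bit | n])                            # length fits in 7 bits
    if n <= 0xFFFF:
        return bytes([mask_bit | 126]) + n.to_bytes(2, "big")   # 16-bit extended length
    return bytes([mask_bit | 127]) + n.to_bytes(8, "big")       # 64-bit extended length

def decode_length(frame: bytes) -> tuple:
    """Return (payload_length, offset of the first byte after the length field)."""
    short = frame[1] & 0x7F
    if short <= 125:
        return short, 2
    if short == 126:
        return int.from_bytes(frame[2:4], "big"), 4
    return int.from_bytes(frame[2:10], "big"), 10

print(encode_length(5, masked=True).hex())   # 85
print(encode_length(300).hex())              # 7e012c
```

If a masking key is present, it starts at the offset that decode_length returns.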

Example: a masked text frame

Hex bytes (11 total): 81 85 14 11 6b 1e 5c 74 07 72 7b

81: FIN=1 (final fragment), RSV=000, opcode=0001 (text)
85: MASK=1 (masked), payload length=5 (7-bit form)
14 11 6b 1e: masking key
5c 74 07 72 7b: masked payload, 5 bytes, which unmask to "Hello"

The frame format is minimal by design. A server-to-client text frame with a payload of up to 125 bytes carries only 2 bytes of overhead (FIN, opcode, and length); the same frame sent by a client carries 6, because of the 4-byte masking key. Compare that to the hundreds of bytes of HTTP headers in a polling request, and the efficiency gain is clear.
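To make the bit layout concrete, the following sketch pulls the header fields out of the 11-byte example frame 81 85 14 11 6b 1e 5c 74 07 72 7b using shifts and bit masks. It only handles the short (7-bit) length form, which is all this frame needs.

```python
frame = bytes.fromhex("818514116b1e5c7407727b")

first, second = frame[0], frame[1]
header = {
    "fin": first >> 7,             # top bit of byte 0
    "rsv": (first >> 4) & 0b111,   # next three bits
    "opcode": first & 0x0F,        # low four bits of byte 0
    "masked": second >> 7,         # top bit of byte 1
    "length": second & 0x7F,       # low seven bits of byte 1
}

# Everything that is not payload: 2 header bytes + 4 masking-key bytes
overhead = len(frame) - header["length"]

print(header)    # {'fin': 1, 'rsv': 0, 'opcode': 1, 'masked': 1, 'length': 5}
print(overhead)  # 6
```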

Opcodes and Control Frames

WebSocket defines six opcodes, divided into data frames and control frames:

Data frames:

0x1 (Text): UTF-8 encoded text data. Most applications use this for JSON messages.
0x2 (Binary): Raw binary data. No encoding. Use for blobs, protobuf, images, etc.
0x0 (Continuation): A continuation of a previous frame’s message. Used for fragmentation.

Control frames:

0x8 (Close): Initiates the close handshake. Contains an optional 2-byte status code and reason string.
0x9 (Ping): Keepalive probe. The receiver must respond with a Pong frame.
0xA (Pong): Response to a Ping frame.

Control frames must not be fragmented. They can appear between fragments of a data message. Control frames have a maximum payload length of 125 bytes.

The opcode in the first fragment of a message tells you the type of the entire message (text or binary). Continuation frames always have opcode 0. When FIN=1 on a continuation frame, the message is complete.
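The rules in this section can be collected into a small validator. This is a sketch with made-up names (OPCODES, is_control, check_frame); a real implementation would also track fragmentation state across frames.

```python
OPCODES = {
    0x0: "continuation", 0x1: "text", 0x2: "binary",
    0x8: "close", 0x9: "ping", 0xA: "pong",
}

def is_control(opcode: int) -> bool:
    """Control opcodes have the high bit of the 4-bit opcode set (0x8 and up)."""
    return (opcode & 0x8) != 0

def check_frame(fin: bool, opcode: int, payload_len: int) -> None:
    """Raise ValueError for frames that break the rules described above."""
    if opcode not in OPCODES:
        raise ValueError(f"reserved opcode {opcode:#x}")
    if is_control(opcode):
        if not fin:
            raise ValueError("control frames must not be fragmented")
        if payload_len > 125:
            raise ValueError("control frame payload exceeds 125 bytes")

check_frame(fin=True, opcode=0x9, payload_len=4)        # valid ping
check_frame(fin=False, opcode=0x1, payload_len=70000)   # valid first text fragment
```

A receiver that catches one of these errors would normally fail the connection with close code 1002 (Protocol Error).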

Client-to-Server Masking

One quirk of the WebSocket protocol: client-to-server frames must have MASK=1, while server-to-client frames must have MASK=0.

Why? The WebSocket working group identified a security issue called “cache poisoning” or “cross-protocol attack.” An attacker could craft a WebSocket client that sends data that looks like a valid HTTP request to an intermediary (proxy, cache). If the intermediary misinterpreted the WebSocket data as HTTP, it could poison its cache.

Masking prevents this by XORing the payload with a random 4-byte key. The intermediary sees random bytes that do not match any known protocol. Once the connection is established, the intermediary treats it as opaque TCP data.

The masking key is chosen randomly per frame. Each byte of the payload is XORed with maskingKey[i % 4]. The receiver XORs with the same key to recover the original payload.

This is not encryption. Masking is a defense against broken intermediaries, not a confidentiality mechanism. For actual security, use WSS (WebSocket over TLS).
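Because XOR is its own inverse, one function can both mask and unmask. The sketch below uses hypothetical helper names (apply_mask, mask_client_payload) and reuses the key 14 11 6b 1e from the example frame so the output can be checked by hand.

```python
import os

def apply_mask(data: bytes, key: bytes) -> bytes:
    """XOR each byte with key[i % 4]; the same call masks and unmasks."""
    return bytes(b ^ key[i % 4] for i, b in enumerate(data))

def mask_client_payload(data: bytes) -> tuple:
    """Pick a fresh random key for this frame and return (key, masked payload)."""
    key = os.urandom(4)
    return key, apply_mask(data, key)

key = bytes.fromhex("14116b1e")
masked = apply_mask(b"Hello", key)
print(masked.hex())             # 5c7407727b
print(apply_mask(masked, key))  # b'Hello'
```

Anyone who sees the frame also sees the key, which is why masking offers no confidentiality.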

Building a WebSocket Server from Scratch

Let us build a minimal WebSocket server in Node.js to see how the protocol works in practice. We will use only the built-in http and crypto modules — no third-party libraries.

import { createServer } from 'http'
import { createHash } from 'crypto'

const MAGIC_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'

// Ordinary HTTP requests get a 404; only upgrade requests matter here
const server = createServer((req, res) => {
  res.writeHead(404)
  res.end()
})

// Node emits 'upgrade' when the client sends an Upgrade header
server.on('upgrade', (req, socket) => {
  const key = req.headers['sec-websocket-key']
  const upgrade = req.headers['upgrade']

  if (req.url !== '/ws' || upgrade?.toLowerCase() !== 'websocket' || !key) {
    socket.end('HTTP/1.1 400 Bad Request\r\n\r\n')
    return
  }

  const accept = createHash('sha1')
    .update(key + MAGIC_GUID)
    .digest('base64')

  // There is no response object for upgrades, so write the 101 by hand
  socket.write(
    'HTTP/1.1 101 Switching Protocols\r\n' +
    'Upgrade: websocket\r\n' +
    'Connection: Upgrade\r\n' +
    `Sec-WebSocket-Accept: ${accept}\r\n` +
    '\r\n'
  )
  socket.setNoDelay(true)

  // Now we can read/write WebSocket frames using the raw socket
  // (see next section for frame parsing)
})

server.listen(8080)

The createHash('sha1').update(key + MAGIC_GUID).digest('base64') call computes the accept token we explored earlier.

When an 'upgrade' listener is registered, Node passes upgrade requests to it instead of the normal request handler and hands over the raw TCP socket. There is no HTTP response object at that point, so we write the 101 response by hand, then read and write WebSocket frames directly on the socket.

A simple frame parser in Node.js:

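function parseFrame(buffer) {
  if (buffer.length < 2) return null // header not fully received yet

  const firstByte = buffer[0]
  const secondByte = buffer[1]
  const fin = (firstByte & 0x80) !== 0
  const opcode = firstByte & 0x0f
  const masked = (secondByte & 0x80) !== 0
  let payloadLen = secondByte & 0x7f
  let offset = 2

  // 126 and 127 are markers for the 16-bit and 64-bit extended lengths
  if (payloadLen === 126) {
    if (buffer.length < 4) return null
    payloadLen = buffer.readUInt16BE(2)
    offset = 4
  } else if (payloadLen === 127) {
    if (buffer.length < 10) return null
    payloadLen = Number(buffer.readBigUInt64BE(2))
    offset = 10
  }

  // Wait for more data if the masking key or payload is incomplete
  if (buffer.length < offset + (masked ? 4 : 0) + payloadLen) return null

  let maskKey = null
  if (masked) {
    maskKey = buffer.subarray(offset, offset + 4)
    offset += 4
  }

  let payload = buffer.subarray(offset, offset + payloadLen)
  if (masked) {
    // XOR into a copy so the caller's buffer is left untouched
    payload = Buffer.from(payload.map((byte, i) => byte ^ maskKey[i % 4]))
  }

  // toString() suits text frames; keep the Buffer itself for binary (opcode 0x2)
  return { fin, opcode, masked, payloadLen, payload: payload.toString() }
}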

And a frame builder:

function buildFrame(payload, opcode = 0x1) {
  const payloadBuf = Buffer.from(payload, 'utf-8')
  const len = payloadBuf.length
  const header = []

  header.push(0x80 | opcode)

  if (len < 126) {
    header.push(len)
  } else if (len < 65536) {
    header.push(126, (len >> 8) & 0xff, len & 0xff)
  } else {
    header.push(127)
    const bigLen = BigInt(len)
    for (let i = 7; i >= 0; i--) {
      header.push(Number((bigLen >> BigInt(i * 8)) & 0xffn))
    }
  }

  return Buffer.concat([Buffer.from(header), payloadBuf])
}

The same concepts apply in any language. Here is a Python server using the asyncio and hashlib standard libraries:

import asyncio
import hashlib
import base64

MAGIC_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11'

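def compute_accept(key):
    sha1 = hashlib.sha1()
    sha1.update((key + MAGIC_GUID).encode())
    return base64.b64encode(sha1.digest()).decode()

def build_frame(payload, opcode=0x1):
    # Server-to-client frames are unmasked: FIN=1 plus opcode, then the length
    data = payload.encode()
    header = bytearray([0x80 | opcode])
    if len(data) < 126:
        header.append(len(data))
    elif len(data) < 65536:
        header.append(126)
        header += len(data).to_bytes(2, 'big')
    else:
        header.append(127)
        header += len(data).to_bytes(8, 'big')
    return bytes(header) + data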

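async def handle_client(reader, writer):
    data = await reader.read(4096)
    request = data.decode()
    key = None
    for line in request.split('\r\n'):
        if line.lower().startswith('sec-websocket-key:'):
            key = line.split(':', 1)[1].strip()
            break

    if key is None:  # Not a WebSocket handshake
        writer.write(b'HTTP/1.1 400 Bad Request\r\n\r\n')
        await writer.drain()
        writer.close()
        return

    accept = compute_accept(key)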
    response = (
        'HTTP/1.1 101 Switching Protocols\r\n'
        'Upgrade: websocket\r\n'
        'Connection: Upgrade\r\n'
        f'Sec-WebSocket-Accept: {accept}\r\n'
        '\r\n'
    )
    writer.write(response.encode())
    await writer.drain()

    # Parse and echo frames (simplification: assumes each read() returns exactly one whole frame)
    while True:
        frame = await reader.read(4096)
        if not frame or len(frame) < 2:
            break

        first_byte = frame[0]
        second_byte = frame[1]
        opcode = first_byte & 0x0f

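        if opcode == 0x8:  # Close: echo an empty Close frame, then stop
            writer.write(bytes([0x88, 0x00]))
            await writer.drain()
            break
        elif opcode == 0x9:  # Ping: reply with an empty Pong (a full server echoes the Ping payload)
            writer.write(bytes([0x8a, 0x00]))
            await writer.drain()
            continue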

        # Parse payload length
        payload_len = second_byte & 0x7f
        offset = 2
        if payload_len == 126:
            payload_len = int.from_bytes(frame[2:4], 'big')
            offset = 4
        elif payload_len == 127:
            payload_len = int.from_bytes(frame[2:10], 'big')
            offset = 10

        # Unmask
        mask_key = frame[offset:offset+4]
        offset += 4
        payload = bytearray(frame[offset:offset+payload_len])
        for i in range(len(payload)):
            payload[i] ^= mask_key[i % 4]

        print(f"Received: {payload.decode()}")

        # Echo back (unmasked)
        echo = build_frame(payload.decode(), 0x1)
        writer.write(echo)
        await writer.drain()

    writer.close()

async def main():
    server = await asyncio.start_server(handle_client, '0.0.0.0', 8080)
    async with server:
        await server.serve_forever()

asyncio.run(main())

You can test the handshake with curl:

# WebSocket handshake using curl
curl -i -N \
  -H "Connection: Upgrade" \
  -H "Upgrade: websocket" \
  -H "Sec-WebSocket-Version: 13" \
  -H "Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==" \
  http://localhost:8080/ws

The server responds with 101 Switching Protocols and the Sec-WebSocket-Accept header. curl does not encode or decode WebSocket frames here; it simply keeps the connection open and prints whatever bytes arrive. For interactive testing, use a tool like websocat instead:

# Interactive WebSocket test with websocat
websocat ws://localhost:8080/ws

Message Fragmentation

WebSocket allows messages to be split across multiple frames. This is called fragmentation. It is useful when:

The sender generates data progressively (streaming a large response)
The total message size is not known in advance
A control frame (ping/pong) needs to be interleaved with a long message

The rules are straightforward:

The first fragment has the message’s opcode (text or binary) and FIN=0
Continuation fragments have opcode=0 and FIN=0
The final fragment has opcode=0 and FIN=1
Control frames may be interleaved between fragments

Example: "Hello World!" sent as three frames

First frame: FIN=0, opcode=1 (text), payload "Hello " (6 bytes)
Continuation: FIN=0, opcode=0 (continuation), payload "Wor" (3 bytes)
Final frame: FIN=1, opcode=0 (continuation), payload "ld!" (3 bytes)
Reassembled message: "Hello World!"

The receiver reassembles the message by concatenating payloads from all fragments in order. The opcode from the first fragment determines the message type (text or binary). If a text or binary opcode appears on a non-initial fragment, it is a protocol error; control frames are the only frames allowed in between.

Fragmentation is transparent to the application layer. Most WebSocket libraries reassemble frames before delivering the message to your code. But understanding fragmentation matters for:

Memory management: A fragmented message can be arbitrarily large. You may need to set maximum message size limits.
Streaming: If you want to send progressive data, you control fragmentation at the frame level.
Debugging: Wire-level debugging tools show individual frames, not reassembled messages.
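The fragmentation rules and the message size limit can be sketched together. In this simplified model a frame is a (fin, opcode, payload) tuple, the function names fragment and reassemble are hypothetical, and control frames are assumed to have been handled before reassembly.

```python
def fragment(payload: bytes, opcode: int, chunk_size: int) -> list:
    """Split one message into (fin, opcode, chunk) tuples."""
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)] or [b""]
    frames = []
    for index, chunk in enumerate(chunks):
        fin = 1 if index == len(chunks) - 1 else 0
        # Only the first fragment carries the real opcode; the rest use 0
        frames.append((fin, opcode if index == 0 else 0x0, chunk))
    return frames

def reassemble(frames, max_size: int = 1 << 20):
    """Rebuild (opcode, payload) from the data frames of one message."""
    message_opcode = None
    parts, total = [], 0
    for fin, opcode, chunk in frames:
        if message_opcode is None:
            if opcode == 0x0:
                raise ValueError("continuation frame without a first fragment")
            message_opcode = opcode
        elif opcode != 0x0:
            raise ValueError("new data frame before the previous message finished")
        total += len(chunk)
        if total > max_size:
            raise ValueError("message too big")  # would map to close code 1009
        parts.append(chunk)
        if fin:
            return message_opcode, b"".join(parts)
    raise ValueError("message incomplete: no frame with FIN=1")

frames = fragment(b"Hello World!", 0x1, 6)
print(frames)              # [(0, 1, b'Hello '), (1, 0, b'World!')]
print(reassemble(frames))  # (1, b'Hello World!')
```

Checking the running total on every fragment, rather than once at the end, lets a receiver reject an oversized message before buffering all of it.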

Ping/Pong Keepalive

WebSocket connections over TCP can remain open indefinitely. But network equipment (NATs, firewalls, proxies, load balancers) has idle timeouts. If no data passes through for a configurable period, the intermediary may close the connection.

Ping/Pong frames keep the connection alive. Either endpoint may send a Ping frame (opcode 9); the receiver must answer with a Pong frame (opcode 10) that echoes the Ping's payload as soon as practical.

If no Pong arrives within the timeout you choose, the connection is considered dead and should be closed.

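The JavaScript WebSocket API in browsers does not expose ping/pong (the browser answers pings automatically). On the server you schedule the heartbeat yourself. The snippet below assumes socket is a connection object from the ws library, which provides ping(), terminate(), and a 'pong' event:

import WebSocket from 'ws'

// Server-side heartbeat for one connection
const INTERVAL = 30000 // 30 seconds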

const heartbeat = setInterval(() => {
  if (socket.readyState === WebSocket.OPEN) {
    socket.ping()
    socket._pingTimeout = setTimeout(() => {
      socket.terminate() // No pong received
    }, 10000) // 10 second timeout
  }
}, INTERVAL)

socket.on('pong', () => {
  clearTimeout(socket._pingTimeout)
})

socket.on('close', () => {
  clearInterval(heartbeat)
  clearTimeout(socket._pingTimeout)
})

The ws library for Node.js answers incoming pings for you, but detecting dead peers is still your job. A common pattern with ws tracks an isAlive flag per connection:

import { WebSocketServer } from 'ws'

const wss = new WebSocketServer({ port: 8080 })

wss.on('connection', (ws) => {
  ws.isAlive = true
  ws.on('pong', () => { ws.isAlive = true })
})

// Heartbeat check every 30 seconds
const interval = setInterval(() => {
  wss.clients.forEach((ws) => {
    if (ws.isAlive === false) return ws.terminate()
    ws.isAlive = false
    ws.ping()
  })
}, 30000)

wss.on('close', () => clearInterval(interval))

The heartbeat interval should be shorter than the network path’s idle timeout. A common choice is 30-45 seconds, which works behind most NATs and cloud load balancers.
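The isAlive bookkeeping is independent of any particular library. As a language-neutral illustration, here is the same state machine in Python; the Heartbeat class and its method names are made up for this sketch, and the caller is assumed to invoke tick() once per interval and on_pong() whenever a Pong arrives.

```python
class Heartbeat:
    """Tracks liveness for one connection, mirroring the isAlive pattern."""

    def __init__(self) -> None:
        self.alive = True

    def on_pong(self) -> None:
        self.alive = True

    def tick(self) -> str:
        """Call once per interval. Returns 'terminate' or 'ping'."""
        if not self.alive:
            return "terminate"   # no Pong since the previous tick
        self.alive = False       # stays False until the next Pong arrives
        return "ping"

hb = Heartbeat()
print(hb.tick())   # ping
hb.on_pong()
print(hb.tick())   # ping
print(hb.tick())   # terminate (the second ping was never answered)
```

With this scheme a peer that stops responding is dropped somewhere between one and two intervals later, which is worth keeping in mind when choosing the interval.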

Close Handshake

Closing a WebSocket connection is a handshake, not an abrupt teardown. Either side can initiate a close by sending a Close frame (opcode 8). The receiving side must respond with its own Close frame.

On the wire, an unmasked Close frame with status code 1000 and no reason is four bytes: 88 02 03 e8 (FIN plus opcode 8, payload length 2, status code 0x03e8).

The Close frame has an optional body:

Status code (2 bytes, big-endian): A numeric code indicating why the connection closed
Reason (up to 123 bytes): A UTF-8 string explaining the reason

Common status codes:

| Code | Name | Meaning |
|------|------|---------|
| 1000 | Normal Closure | The purpose of the connection was fulfilled |
| 1001 | Going Away | Server is shutting down, or client navigated away |
| 1002 | Protocol Error | Received an invalid frame |
| 1003 | Unsupported Data | Received a data type that cannot be accepted |
| 1007 | Invalid Payload Data | Received data that does not match the type (e.g., invalid UTF-8) |
| 1008 | Policy Violation | Received a message that violates server policy |
| 1009 | Message Too Big | The message exceeds the maximum allowed size |
| 1011 | Internal Error | Server encountered an unexpected condition |
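The Close frame body is simple enough to build and parse by hand. The sketch below uses hypothetical helper names and produces an unmasked frame, as a server would send; a client would additionally mask the payload.

```python
import struct
from typing import Optional, Tuple

def build_close_frame(code: int = 1000, reason: str = "") -> bytes:
    """Unmasked Close frame: 0x88, payload length, 2-byte status code, reason."""
    reason_bytes = reason.encode("utf-8")
    if len(reason_bytes) > 123:
        raise ValueError("reason must fit in 123 bytes (125-byte limit minus the code)")
    body = struct.pack("!H", code) + reason_bytes   # "!H" = big-endian unsigned 16-bit
    return bytes([0x88, len(body)]) + body

def parse_close_payload(payload: bytes) -> Tuple[Optional[int], str]:
    """Return (status code or None, reason) from an unmasked Close payload."""
    if len(payload) < 2:
        return None, ""          # a Close frame may carry no body at all
    (code,) = struct.unpack("!H", payload[:2])
    return code, payload[2:].decode("utf-8")

print(build_close_frame(1000).hex())                        # 880203e8
print(parse_close_payload(bytes.fromhex("03e9") + b"bye"))  # (1001, 'bye')
```

The 123-byte limit applies to the encoded reason, so a string with multi-byte UTF-8 characters can hit it well before 123 characters.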

If a Close frame is not received (e.g., the TCP connection drops), the closure is considered abnormal. APIs report this locally as code 1006, which is reserved and never sent on the wire. The side that detects the TCP close should assume the connection is dead and clean up local resources.

import WebSocket from 'ws'

function gracefulClose(ws, code = 1000, reason = '') {
  // If the peer never echoes the Close frame, force the TCP connection shut
  const closeTimeout = setTimeout(() => {
    if (ws.readyState !== WebSocket.CLOSED) {
      console.warn('Abnormal close - terminating')
      ws.terminate()
    }
  }, 5000)

  // 'close' fires once the closing handshake completes (or the socket drops)
  ws.once('close', (closedCode, closedReason) => {
    clearTimeout(closeTimeout)
    console.log(`Closed: ${closedCode} ${closedReason}`)
  })

  ws.close(code, reason)
}

Scaling WebSockets

WebSocket servers are stateful. Each connection maintains server-side state (session, authentication, subscription channels). This creates scaling challenges that stateless HTTP does not have.

Sticky Sessions

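When a client connects through a load balancer, the initial HTTP upgrade request goes to one server. Frames on that established connection always reach the same server, because they travel over the same TCP connection. The problem appears on reconnect: a new handshake may land on a different server that lacks the in-memory state (session, subscriptions) held by the original one. The same applies to fallback transports that spread one logical session across several HTTP requests.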

Load balancers solve this with sticky sessions (also called session affinity):

Cookie-based (AWS ALB, HAProxy): The load balancer sets a cookie directing subsequent requests to the same backend
Source IP hash: The load balancer hashes the client IP to select a backend
Proxy Protocol: Some LBs pass the original client IP so the server can maintain its own affinity table

# Nginx WebSocket proxy with sticky sessions
upstream ws_backend {
    ip_hash;
    server 10.0.1.1:8080;
    server 10.0.2.1:8080;
    server 10.0.3.1:8080;
}

server {
    listen 80;
    location /ws {
        proxy_pass http://ws_backend;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_read_timeout 86400s;
    }
}
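The idea behind source-IP hashing can be shown in a few lines. This is an illustration of the general technique, not a reimplementation of what Nginx's ip_hash does internally; the function name pick_backend and the use of SHA-256 are assumptions made for the sketch, and the backend addresses are the ones from the config above.

```python
import hashlib

BACKENDS = ["10.0.1.1:8080", "10.0.2.1:8080", "10.0.3.1:8080"]

def pick_backend(client_ip: str, backends=BACKENDS) -> str:
    """Map a client IP to a backend deterministically."""
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

# The same client always lands on the same backend
first = pick_backend("203.0.113.7")
second = pick_backend("203.0.113.7")
print(first == second)   # True
```

One known weakness of plain modulo hashing is that adding or removing a backend remaps most clients; consistent hashing is the usual remedy when the backend set changes often.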

Pub/Sub Backend

High-traffic WebSocket applications use a pub/sub backend to broadcast messages across servers:

A client sends a message to its connected WebSocket server
The server publishes the message to a channel (Redis Pub/Sub, Kafka, RabbitMQ)
All WebSocket servers subscribe to relevant channels
When a server receives a published message, it forwards it to all connected clients subscribed to that channel

import { createClient } from 'redis'
import { WebSocketServer } from 'ws'

const redis = createClient()
await redis.connect()

const wss = new WebSocketServer({ port: 8080 })
const subscriptions = new Map()

wss.on('connection', (ws) => {
  ws.on('message', async (data) => {
    const msg = JSON.parse(data.toString())

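    if (msg.type === 'subscribe') {
      // Reuse one subscriber connection per client instead of opening a new one per subscribe
      let subscriber = subscriptions.get(ws)
      if (!subscriber) {
        subscriber = redis.duplicate()
        await subscriber.connect()
        subscriptions.set(ws, subscriber)
      }
      await subscriber.subscribe(msg.channel, (message) => {
        ws.send(message)
      })
    }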

    if (msg.type === 'publish') {
      await redis.publish(msg.channel, JSON.stringify(msg.data))
    }
  })

  ws.on('close', () => {
    const subscriber = subscriptions.get(ws)
    if (subscriber) {
      subscriber.quit()
      subscriptions.delete(ws)
    }
  })
})
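To see the fan-out flow without running Redis, the four steps can be simulated in-process. Everything here is a stand-in: Broker plays the role of the pub/sub backend, Node plays one WebSocket server, and the delivered list records what each local client would have been sent.

```python
from collections import defaultdict

class Broker:
    """In-process stand-in for a pub/sub backend (illustration only)."""

    def __init__(self):
        self.subscribers = defaultdict(list)   # channel -> list of callbacks

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, message):
        for callback in self.subscribers[channel]:
            callback(channel, message)

class Node:
    """One WebSocket server process."""

    def __init__(self, broker):
        self.broker = broker
        self.local = defaultdict(set)   # channel -> ids of locally connected clients
        self.delivered = []             # (client_id, message) pairs

    def client_subscribes(self, client_id, channel):
        if not self.local[channel]:     # first local subscriber: join the channel once
            self.broker.subscribe(channel, self.on_broker_message)
        self.local[channel].add(client_id)

    def client_publishes(self, channel, message):
        self.broker.publish(channel, message)          # hand off to the broker

    def on_broker_message(self, channel, message):
        for client_id in sorted(self.local[channel]):  # fan out to local clients
            self.delivered.append((client_id, message))

broker = Broker()
a, b = Node(broker), Node(broker)
a.client_subscribes("alice", "room1")
b.client_subscribes("bob", "room1")
a.client_publishes("room1", "hi")
print(a.delivered, b.delivered)   # [('alice', 'hi')] [('bob', 'hi')]
```

Note the design choice in client_subscribes: each server holds one broker subscription per channel and fans out locally, rather than one broker connection per client. That keeps the number of backend connections proportional to servers, not users.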

Multiplexing Challenge

Unlike HTTP/2, which multiplexes multiple streams over a single connection, WebSocket defines a single message stream per connection. To multiplex, you need either:

Multiple WebSocket connections: Each logical channel gets its own TCP/TLS connection
A multiplexing sub-protocol: Define your own framing inside WebSocket messages (e.g., add a channel ID header to each JSON message)
WebSocket over HTTP/2: RFC 8441 defines a way to tunnel WebSocket over individual HTTP/2 streams, enabling true multiplexing
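The second option, a multiplexing sub-protocol, can be as small as an envelope with a channel field. The following sketch assumes JSON text messages; the envelope keys ("channel", "data") and the names wrap and Demux are arbitrary choices for illustration.

```python
import json
from collections import defaultdict

def wrap(channel, data):
    """Sender side: put the logical channel ID next to the payload."""
    return json.dumps({"channel": channel, "data": data})

class Demux:
    """Receiver side: route each incoming message to handlers for its channel."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def on(self, channel, handler):
        self.handlers[channel].append(handler)

    def dispatch(self, raw):
        envelope = json.loads(raw)
        handlers = self.handlers.get(envelope["channel"], [])
        for handler in handlers:
            handler(envelope["data"])
        return len(handlers)   # 0 means nobody listens on that channel

chat, prices = [], []
demux = Demux()
demux.on("chat", chat.append)
demux.on("prices", prices.append)
demux.dispatch(wrap("chat", "hello"))
demux.dispatch(wrap("prices", {"tick": 1}))
print(chat, prices)   # ['hello'] [{'tick': 1}]
```

All channels still share one ordered byte stream, so a large message on one channel delays the others; that head-of-line blocking is what the HTTP/2 approach avoids at the stream level.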

WebSocket over HTTP/2

RFC 8441 defines how to run WebSocket over HTTP/2. Instead of a single TCP connection per WebSocket, the WebSocket is tunneled over an HTTP/2 stream. Multiple WebSocket connections can share one HTTP/2 connection.

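This eliminates the TCP connection overhead per WebSocket and enables true multiplexing. The client opens the tunnel with an extended CONNECT request whose :protocol pseudo-header is set to websocket, and the WebSocket frames then travel unchanged as a byte stream inside the DATA frames of that HTTP/2 stream.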

Browser support for wss:// over HTTP/2 exists in modern browsers (Chrome, Firefox, Safari). The browser automatically negotiates the transport at the connection level.

WebSocket vs SSE vs gRPC

These three technologies overlap in the real-time communication space but have different strengths:

| Feature | WebSocket | SSE | gRPC Stream |
|---------|-----------|-----|-------------|
| Direction | Bidirectional | Server to Client | Bidirectional |
| Transport | TCP (or HTTP/2) | HTTP/1.1+ | HTTP/2 |
| Message format | Binary or Text | Text only | Protobuf (binary) |
| Streaming | Full-duplex | Server -> Client | Full-duplex |
| Auto-reconnect | Manual | Built-in | Manual |
| Language support | All languages | Browser + Server | gRPC ecosystem |
| Proxy complexity | May be blocked | Works through proxies | Requires HTTP/2 |
| Typical use case | Chat, gaming, live sync | Notifications, feeds | Microservices, streaming RPC |

Use WebSocket for low-latency bidirectional communication where you control both client and server (chat apps, multiplayer games, collaborative editing, live trading).
Use SSE for server-to-client streaming when you need simplicity, HTTP compatibility, and automatic reconnection (live notifications, AI token streaming, monitoring dashboards).
Use gRPC streams in microservice-to-microservice communication when you already use protobuf and need HTTP/2 features like multiplexing and flow control.

Self-Check

Before you close this page, make sure you can answer these:

[ ] Can you explain the WebSocket opening handshake step by step, including the accept computation?
[ ] Can you draw the wire format of a WebSocket frame (FIN, opcode, MASK, length, masking key, payload)?
[ ] Why must client-to-server frames be masked? What attack does this prevent?
[ ] What is the difference between a data frame and a control frame? Which opcodes belong to each?
[ ] How does message fragmentation work? When would you use it?
[ ] How does Ping/Pong keepalive work, and what happens when a Pong is not received?
[ ] What status codes can a Close frame carry, and what do 1000, 1001, 1008, and 1011 mean?
[ ] Why do WebSocket servers need sticky sessions? How do load balancers implement this?
[ ] When would you choose SSE over WebSocket? When would you choose gRPC streams?
