Design Pastebin: Building a Code Snippet Sharing Service

You are in a system design interview. The interviewer says: “Design a pastebin / code snippet service.” You have used these before — Pastebin, GitHub Gist, Hastebin, Glot.io. They let you paste text, get a shareable URL, and optionally set an expiration time. The problem sounds simple. The depth is in the details: how do you store millions of pastes efficiently, deduplicate content, handle syntax highlighting for 50+ languages, enforce expiration, and prevent abuse?

This is a complete walkthrough from zero to deployed architecture.

What We Are Building

A pastebin is like a sticky note for the internet. You write some text (usually code), the service gives you back a short URL, and anyone with that URL can view the paste. Services like Pastebin and GitHub Gist have been around for two decades because developers constantly need to share code snippets — in chat, on forums, in bug reports, during code reviews.

Real-world examples: Pastebin (launched 2002, still one of the most-trafficked sites on the internet), GitHub Gist (2008, integrated with Git for versioning), Hastebin (lightweight, used in Discord), Glot.io (runnable snippets with Docker), and PrivateBin (zero-knowledge, encrypted pastes).

Why do these services matter? Three reasons. First, collaboration — sharing a code snippet in Slack or IRC is useless if the code wraps badly or gets lost in the scrollback. A paste link is clean and persistent. Second, debugging — users paste error logs, stack traces, and server output to get help. Third, archival — pastebins serve as lightweight documentation for one-off scripts, configuration files, and SQL queries.

Requirements

Paste Configuration

Each paste stores its content, metadata, and behavior settings.

Syntax Highlighting

Python, JavaScript, TypeScript, Go, Rust, SQL, Bash, Ruby, and 6 more

Expiration Options

Burn After Read, 10 min, 1 hr, 24 hr, 1 week, 1 month, Never

Visibility

Public: visible in search and listings
Unlisted: only accessible via direct URL
Private: only accessible by creator

Functional Requirements

Create a paste — paste text content, optionally set a title, language, expiration, and visibility
View a paste — given a short slug URL, render the content with syntax highlighting
Raw text access — serve plain text at /raw/<slug> for curl and programmatic consumption
Syntax highlighting — support 20+ programming languages with automatic detection
Expiration — support burn-after-read, 10 minutes, 1 hour, 1 day, 1 week, 1 month, and never
Custom slugs — let users specify a custom URL alias
Visibility — public (searchable), unlisted (only with direct link), private (only creator)

Non-Functional Requirements

High availability — the view endpoint targets 99.9% uptime, so 5xx responses must stay within that error budget
Low latency — paste views complete in under 100ms at p99
Durability — no paste content is lost before its expiration
Scalability — handle 10,000+ writes/second and 1M+ reads/second at peak
Abuse resistance — rate limiting, content scanning, and spam detection

Out of Scope

We are not building: user authentication (optional for MVP), paste editing (complex versioning), collaborative editing (multiple simultaneous editors), runnable code execution (like Glot.io), or a full admin dashboard.

Paste Anatomy

Every paste has the same internal structure regardless of how users interact with it.

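CREATE TABLE pastes (
    id            BIGSERIAL PRIMARY KEY,
    slug          VARCHAR(100) UNIQUE NOT NULL,  -- wide enough for custom slugs (max 100 chars)
    title         VARCHAR(200),
    content_hash  CHAR(64) NOT NULL,             -- hex SHA-256 of the raw content
    content_path  VARCHAR(500) NOT NULL,         -- object storage key
    language      VARCHAR(30) NOT NULL DEFAULT 'auto',
    visibility    VARCHAR(10) NOT NULL DEFAULT 'public',
    expiration    TIMESTAMPTZ,                   -- NULL = never expires
    burn_after_read BOOLEAN NOT NULL DEFAULT false,
    access_count  INTEGER NOT NULL DEFAULT 0,
    content_size  INTEGER NOT NULL,              -- uncompressed bytes
    created_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    deleted_at    TIMESTAMPTZ                    -- set by the cleanup worker on soft delete
);

-- The UNIQUE constraint on slug already creates an index, so no separate slug index is needed.
CREATE INDEX idx_pastes_content_hash ON pastes(content_hash);
CREATE INDEX idx_pastes_expiration ON pastes(expiration);

-- Reference counts for deduplicated content, as used by the deduplication code.
CREATE TABLE content_refs (
    content_hash    CHAR(64) PRIMARY KEY,
    content_path    VARCHAR(500) NOT NULL,
    original_size   INTEGER NOT NULL,
    compressed_size INTEGER NOT NULL,
    ref_count       INTEGER NOT NULL DEFAULT 1
);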

Each field serves a specific purpose:

slug — the short string in the URL (pastebin.com/aB3x9Q). Must be unique, generated randomly or user-specified.
content_hash — SHA-256 of the raw paste content. Used for deduplication — if two pastes have the same hash, we store the content once.
content_path — the key in object storage (S3, R2) where the full compressed content lives. Metadata stays in SQL; content stays in blob storage.
language — the syntax highlighting language. Auto-detected on upload but user-overridable.
expiration — nullable timestamp. NULL means the paste never expires. When the timestamp is reached, the paste is eligible for cleanup.

URL Generation: Random vs Custom Slugs

Slugs must be short, unique, and URL-safe. Two approaches:

Random slugs (default): Generate a 7-character string from a 62-character alphabet (a-z, A-Z, 0-9). This gives 62^7 = ~3.5 trillion combinations — more than enough. Use a cryptographically random generator to prevent sequential enumeration. Check the database for collisions (extremely rare at this keyspace, but handle it with a retry).

import secrets
import string

ALPHABET = string.ascii_letters + string.digits

def generate_slug(length=7):
    return ''.join(secrets.choice(ALPHABET) for _ in range(length))

Custom slugs: Let users specify their own slug (e.g., deploy-script-v2). Validate length (max 100 chars), character set (alphanumeric plus hyphens), and uniqueness. Custom slugs need an explicit uniqueness check before insert because users will collide on common names like test or config.
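One way to express these validation rules is a single regular expression plus a reserved-word check. This is a minimal sketch, not a definitive policy: the 3-character minimum, the rule that a slug cannot start or end with a hyphen, and the RESERVED_SLUGS set are assumptions added for illustration (api and raw are reserved here because they would shadow the /api and /raw routes).

```python
import re
from typing import Optional

# Assumed policy: 3 to 100 characters, letters, digits and hyphens,
# starting and ending with a letter or digit.
SLUG_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{1,98}[A-Za-z0-9]")

# Hypothetical reserved words that would clash with application routes.
RESERVED_SLUGS = {"api", "raw"}


def validate_custom_slug(slug: str) -> Optional[str]:
    """Return an error message, or None if the slug is acceptable."""
    if len(slug) > 100:
        return "slug must be at most 100 characters"
    if not SLUG_PATTERN.fullmatch(slug):
        return "slug must be 3+ characters of letters, digits and hyphens"
    if slug.lower() in RESERVED_SLUGS:
        return "slug is reserved"
    return None
```

validate_custom_slug("deploy-script-v2") returns None, while "raw" or a slug containing spaces returns an error message. Uniqueness still has to be checked against the database separately.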

Edge case: What if a random slug happens to match a custom slug? The random generator retries (typically 0-1 retries). What if a user requests an already-taken custom slug? Return a 409 Conflict with a suggestion.
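The retry behavior can be sketched as a bounded loop. The is_taken callback is a stand-in for the real database lookup, and max_attempts is an assumed safety limit. The second function shows why retries are rare: with one billion slugs already stored, a single 7-character draw collides with probability of roughly 0.03%.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # 62 characters


def generate_unique_slug(is_taken, length=7, max_attempts=5):
    """Draw random slugs until one is free.

    is_taken is a caller-supplied function (for example a database lookup)
    that returns True when the slug already exists.
    """
    for _ in range(max_attempts):
        slug = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if not is_taken(slug):
            return slug
    raise RuntimeError("could not find a free slug; consider a longer length")


def collision_probability(existing: int, length: int = 7) -> float:
    """Chance that one random draw hits one of `existing` stored slugs."""
    return existing / (len(ALPHABET) ** length)
```

In production the UNIQUE constraint on slug remains the final arbiter: two requests can both pass the check and race on insert, so a constraint violation should also trigger a retry.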

Storage Strategy

Metadata lives in a relational database. Content lives in object storage. This separation lets us scale each independently — the database handles millions of rows with fast indexed lookups, while object storage handles gigabytes of raw paste content cheaply.

Storage Architecture

The metadata columns at a glance; content itself is referenced by content_path.

Column	Type	Description
id	BIGSERIAL PK	Internal primary key
slug	VARCHAR(100) UNIQUE	Public short identifier
title	VARCHAR(200)	Optional paste title
content_hash	CHAR(64) INDEX	SHA-256 of content (dedup)
content_path	VARCHAR(500)	Object storage key or blob path
language	VARCHAR(30)	Syntax highlight language
visibility	VARCHAR(10)	public / unlisted / private
expiration	TIMESTAMPTZ NULL	Expiry time, NULL = never
burn_after_read	BOOLEAN	Delete after first view
access_count	INT DEFAULT 0	Number of views
content_size	INT	Uncompressed content bytes
compressed_size	INT	Compressed content bytes (kept in content_refs)
created_at	TIMESTAMPTZ	Creation time
deleted_at	TIMESTAMPTZ NULL	Soft-delete time set by the cleanup worker

Why Not Store Content in SQL?

Paste content can be anything from a 3-line config to a multi-megabyte log file (up to the 10 MB limit enforced at creation). Storing large blobs in PostgreSQL or MySQL causes table bloat, slow backups, and expensive queries. Object storage (S3, Cloudflare R2, GCS) costs pennies per GB and handles multi-TB scale without manual sharding.

The database stores only:

A 64-character SHA-256 hash (content deduplication key)
A content path string (the S3 key)
Metadata (title, language, expiration, visibility, access count)

Content Deduplication

When a user creates a paste:

Compute SHA-256 of the raw content
Look up existing stored content with the same content_hash
If found: reuse the existing content_path and increment its reference counter (ref_count in the content_refs table)
If not found: compress the content with gzip, upload to S3, and store the path with a reference count of 1

This dramatically reduces storage for common snippets. SQL queries, common config files, and popular code patterns get pasted thousands of times but stored once.

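import gzip
import hashlib

import boto3

s3 = boto3.client('s3')
# `db` is assumed to be a thin database helper exposing query() and execute().

def store_content(content: str) -> tuple[str, str, int]:
    """Store content once per unique hash; return (content_hash, content_path, raw_size)."""
    raw_bytes = content.encode('utf-8')
    content_hash = hashlib.sha256(raw_bytes).hexdigest()

    # Dedup check against content_refs, the table that owns the reference count.
    existing = db.query("SELECT content_path FROM content_refs WHERE content_hash = %s LIMIT 1", [content_hash])
    if existing:
        db.execute("UPDATE content_refs SET ref_count = ref_count + 1 WHERE content_hash = %s", [content_hash])
        return (content_hash, existing[0][0], len(raw_bytes))

    # Level 6 matches the create flow; gzip.compress defaults to level 9.
    compressed = gzip.compress(raw_bytes, compresslevel=6)
    # Two hash-prefix path segments spread keys evenly across the bucket.
    key = f"pastes/{content_hash[:2]}/{content_hash[2:4]}/{content_hash}"
    s3.put_object(Bucket="paste-content", Key=key, Body=compressed,
                  ContentType="text/plain; charset=utf-8", ContentEncoding="gzip")
    # Note: two concurrent first uploads of the same content can race here;
    # a unique key on content_hash plus a retry would resolve it.
    db.execute("INSERT INTO content_refs (content_hash, content_path, original_size, compressed_size, ref_count) VALUES (%s, %s, %s, %s, 1)",
               [content_hash, key, len(raw_bytes), len(compressed)])
    return (content_hash, key, len(raw_bytes))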
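To see the reference-counting idea without S3 or a database, here is a small in-memory sketch. DedupStore is a hypothetical stand-in for the object store plus the content_refs table; it is meant to illustrate the bookkeeping, not to replace the real storage path.

```python
import gzip
import hashlib


class DedupStore:
    """In-memory stand-in for object storage plus the content_refs table."""

    def __init__(self):
        self.blobs = {}       # content_hash -> gzip-compressed bytes
        self.ref_counts = {}  # content_hash -> number of pastes using it

    def put(self, content: str) -> str:
        raw = content.encode("utf-8")
        content_hash = hashlib.sha256(raw).hexdigest()
        if content_hash not in self.blobs:
            # First time this content is seen: compress and store it.
            self.blobs[content_hash] = gzip.compress(raw)
        self.ref_counts[content_hash] = self.ref_counts.get(content_hash, 0) + 1
        return content_hash

    def get(self, content_hash: str) -> str:
        return gzip.decompress(self.blobs[content_hash]).decode("utf-8")

    def release(self, content_hash: str) -> None:
        """Drop one reference; delete the blob only when nobody uses it."""
        self.ref_counts[content_hash] -= 1
        if self.ref_counts[content_hash] == 0:
            del self.ref_counts[content_hash]
            del self.blobs[content_hash]
```

Two pastes of the same text produce one blob with a reference count of 2. Deleting one paste must leave the blob in place for the other, which is why deletion goes through release rather than removing the blob directly.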

Syntax Highlighting Pipeline

Syntax highlighting turns raw code into colored, formatted HTML. Every paste service needs this — without it, a paste is just monospace text.

The pipeline works in four stages:

Language detection — on upload, analyze the content to guess the language. Heuristics: filename extension (main.py → Python), shebang line (#!/usr/bin/env node → JavaScript), and content patterns (fn main() → Rust).
Tokenization — the highlighter (PrismJS, highlight.js, Pygments) breaks the code into tokens: keywords, strings, comments, operators, punctuation.
HTML rendering — each token is wrapped in a <span> with a CSS class representing its type (.token.keyword, .token.string).
Server-side caching — the rendered HTML is cached alongside the paste. Re-rendering happens only if the language preference changes.

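from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import TextLexer, get_lexer_by_name, guess_lexer
from pygments.util import ClassNotFound

def render_paste(content: str, language: str = 'auto') -> str:
    try:
        if language == 'auto':
            lexer = guess_lexer(content)
        else:
            lexer = get_lexer_by_name(language)
    except ClassNotFound:
        # Unknown language name, or no lexer matched: fall back to plain text.
        lexer = TextLexer()
    # classprefix is prepended to Pygments' short class names (e.g. "k" for
    # keywords), so the stylesheet has to target those names.
    formatter = HtmlFormatter(classprefix='token ')
    return highlight(content, lexer, formatter)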
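The detection heuristics from the first stage (extension, shebang, content patterns) can be sketched in plain Python. The lookup tables below are deliberately tiny and illustrative, the function name detect_language is hypothetical, and the "text" fallback is an assumption; a real service would cover far more languages or defer to the highlighter's own guesser.

```python
import os
from typing import Optional

# Small illustrative tables, checked in order of reliability.
EXTENSIONS = {".py": "python", ".js": "javascript", ".rs": "rust",
              ".go": "go", ".sql": "sql", ".sh": "bash"}
INTERPRETERS = {"python": "python", "node": "javascript", "bash": "bash",
                "sh": "bash", "ruby": "ruby"}
CONTENT_HINTS = [("fn main()", "rust"), ("package main", "go"),
                 ("def ", "python"), ("SELECT ", "sql")]


def detect_language(content: str, filename: Optional[str] = None) -> str:
    # 1. Filename extension, when the client supplied a filename.
    if filename:
        ext = os.path.splitext(filename)[1].lower()
        if ext in EXTENSIONS:
            return EXTENSIONS[ext]

    # 2. Shebang line: "#!/usr/bin/env node" or "#!/usr/bin/python3".
    first_line = content.split("\n", 1)[0].strip()
    if first_line.startswith("#!"):
        parts = first_line[2:].split()
        if parts:
            interpreter = parts[0].rsplit("/", 1)[-1]
            if interpreter == "env" and len(parts) > 1:
                interpreter = parts[1]
            # Strip version suffixes such as "3" or "3.11".
            interpreter = interpreter.rstrip("0123456789.")
            if interpreter in INTERPRETERS:
                return INTERPRETERS[interpreter]

    # 3. Content patterns, first match wins.
    for needle, language in CONTENT_HINTS:
        if needle in content:
            return language

    return "text"
```

The ordering reflects how trustworthy each signal is: an explicit extension beats a shebang, and both beat substring matching, which is prone to false positives.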

For the frontend, PrismJS handles highlighting on the client side for dynamic content. The server-side rendered HTML serves as the initial load, while the raw text endpoint (/raw/<slug>) serves the unhighlighted content for API consumers.

Create Paste Flow

When a user submits a new paste, the following happens in sequence:

Client sends POST /api/pastes with { content, title?, language?, expiration?, visibility?, slug? }
API Gateway validates — content size check (max 10 MB), slug format (if custom), expiration validation (must be in the future or null)
Rate limiter checks — per-IP limit (e.g., 10 pastes/minute for anonymous, 100/minute for authenticated)
Content hash computed — SHA-256 for deduplication
Slug generated — random if not custom, with collision retry
Content compressed — gzip at level 6 (good balance of speed and compression)
Content stored — upload to S3 if not already present (dedup check)
Metadata inserted — INSERT into pastes table
Cache populated — store the rendered paste in Redis for hot access
Response returned — 201 Created with the paste URL

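from fastapi import FastAPI, HTTPException, Request
from pydantic import BaseModel

app = FastAPI()
# db, cache, rate_limiter, slug_collides and parse_expiration are assumed
# helpers defined elsewhere in the service.

MAX_SLUG_ATTEMPTS = 5

class CreatePasteRequest(BaseModel):
    content: str
    title: str | None = None
    language: str = 'auto'
    expiration: str | None = None
    burn_after_read: bool = False
    visibility: str = 'public'
    slug: str | None = None

@app.post("/api/pastes", status_code=201)
async def create_paste(req: CreatePasteRequest, request: Request):
    # Measure bytes, not characters: multi-byte UTF-8 text can exceed the limit.
    if len(req.content.encode('utf-8')) > 10 * 1024 * 1024:
        raise HTTPException(413, "Content exceeds 10 MB limit")

    client_ip = request.client.host
    if not rate_limiter.check(client_ip, "create_paste"):
        raise HTTPException(429, "Rate limit exceeded")

    if req.slug:
        # A taken custom slug is reported, never silently replaced.
        if slug_collides(req.slug):
            raise HTTPException(409, "Slug already taken")
        slug = req.slug
    else:
        for _ in range(MAX_SLUG_ATTEMPTS):
            slug = generate_slug()
            if not slug_collides(slug):
                break
        else:
            raise HTTPException(503, "Could not allocate a slug")

    # Store content only after the slug is settled, so a rejected request
    # does not bump the reference count.
    content_hash, content_path, size = store_content(req.content)

    paste_id = db.insert("pastes", {
        "slug": slug,
        "title": req.title,
        "content_hash": content_hash,
        "content_path": content_path,
        "language": req.language,
        "expiration": parse_expiration(req.expiration),
        "burn_after_read": req.burn_after_read,
        "visibility": req.visibility,
        "content_size": size,
    })

    # Burn-after-read pastes are never cached; a cached copy would outlive the burn.
    if not req.burn_after_read:
        cache.set(f"paste:{slug}", render_paste(req.content, req.language), ttl=3600)
    return {"url": f"https://paste.example.com/{slug}", "slug": slug}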
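The handler above calls parse_expiration without defining it. A minimal sketch follows, assuming the client sends one of a fixed set of option strings. The option names ("10m", "1h", and so on) and the 30-day approximation of one month are assumptions; burn-after-read is a separate flag and is not handled here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed option names; the intervals mirror the expiration options in the
# functional requirements.
EXPIRATION_OPTIONS = {
    "10m": timedelta(minutes=10),
    "1h": timedelta(hours=1),
    "1d": timedelta(days=1),
    "1w": timedelta(weeks=1),
    "1mo": timedelta(days=30),  # "1 month" approximated as 30 days
}


def parse_expiration(option: Optional[str],
                     now: Optional[datetime] = None) -> Optional[datetime]:
    """Translate an option string into an absolute UTC expiry, or None for never."""
    if option is None or option == "never":
        return None
    if option not in EXPIRATION_OPTIONS:
        raise ValueError(f"unknown expiration option: {option!r}")
    now = now or datetime.now(timezone.utc)
    return now + EXPIRATION_OPTIONS[option]
```

Accepting a closed set of options rather than an arbitrary timestamp keeps validation trivial and guarantees the result is always in the future. In the API handler, the ValueError would be mapped to a 400 or 422 response.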

View Paste Flow

When a user opens a paste URL:

Request hits CDN — Cloudflare checks if the paste is cached at the edge (for extremely popular pastes)
Load balancer — routes to the nearest healthy API instance
Cache check — Redis lookup by slug. If found (cache hit), return immediately (~5ms latency)
Database lookup — on cache miss, query the pastes table by slug index
Expiration check — if expiration < NOW() or burn_after_read and already viewed, return 410 Gone
Content fetch — load from S3 using content_path, decompress gzip
Render — apply syntax highlighting, build the HTML page
Cache populate — store in Redis with a TTL proportional to remaining expiration (or 1 hour for permanent pastes)
Return — serve the rendered page, increment access_count

from datetime import datetime, timezone
from fastapi.responses import HTMLResponse

@app.get("/{slug}")
async def view_paste(slug: str):
    cached = cache.get(f"paste:{slug}")
    if cached:
        db.execute("UPDATE pastes SET access_count = access_count + 1 WHERE slug = %s", [slug])
        return HTMLResponse(cached)

    # Soft-deleted pastes are treated as missing.
    paste = db.query_one("SELECT * FROM pastes WHERE slug = %s AND deleted_at IS NULL", [slug])
    if not paste:
        raise HTTPException(404, "Paste not found")

    # expiration is TIMESTAMPTZ, so compare against a timezone-aware "now".
    now = datetime.now(timezone.utc)
    if paste.expiration and paste.expiration < now:
        raise HTTPException(410, "Paste has expired")

    if paste.burn_after_read and paste.access_count > 0:
        hard_delete_paste(paste.id)
        raise HTTPException(410, "Paste has been burned")

    content = fetch_from_s3(paste.content_path)
    rendered = render_paste(content, paste.language)

    # Never cache burn-after-read pastes: a cached copy would outlive the burn.
    if not paste.burn_after_read:
        ttl = int((paste.expiration - now).total_seconds()) if paste.expiration else 3600
        cache.set(f"paste:{slug}", rendered, ttl=max(1, min(ttl, 3600)))
    db.execute("UPDATE pastes SET access_count = access_count + 1 WHERE id = %s", [paste.id])
    return HTMLResponse(rendered)
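The cache TTL rule from the flow (remaining lifetime, capped at one hour) is easy to get subtly wrong around the expiry boundary, so it helps to isolate it in a small function. This is a sketch; the name cache_ttl_seconds and the convention that 0 means "do not cache" are assumptions.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_CACHE_TTL = 3600  # one hour, the cap used for permanent pastes


def cache_ttl_seconds(expiration: Optional[datetime], now: datetime) -> int:
    """Seconds a rendered paste may stay cached; 0 means do not cache."""
    if expiration is None:
        return MAX_CACHE_TTL
    remaining = int((expiration - now).total_seconds())
    # Clamp to [0, MAX_CACHE_TTL] so an expired paste never gets a TTL.
    return max(0, min(remaining, MAX_CACHE_TTL))
```

A paste with 10 minutes left is cached for 600 seconds, so the cache entry disappears at the same moment the paste expires and a stale copy is never served.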

Caching Hot Pastes

Paste traffic follows a power-law distribution: 80% of views hit 20% of pastes. A top-viral paste (e.g., a leaked config or a popular code snippet) can get millions of views in hours.

The cache strategy:

| Layer | Store | TTL | Size | Hit Rate Target |
|-------|-------|-----|------|-----------------|
| CDN edge | Cloudflare cache | 5 minutes | Unlimited | 40% of all requests |
| Application cache | Redis | Remaining expiration (capped at 1 hour) | 10 GB | 80% of requests reaching app |
| Database | PostgreSQL | N/A (source of truth) | N/A | 100% for misses |

Cache invalidation happens when: the paste is deleted (explicit delete by user, or expiration cleanup), the paste is edited (not in our MVP, but would clear the cache), or the TTL expires naturally.

For extremely hot pastes (a single paste getting >100K views/min), the CDN edge cache is the safety valve. The 5-minute CDN TTL means even a massive spike hits the origin only once per 5 minutes per edge location.

Rate Limiting

A pastebin without rate limiting is a pastebin that will be used to store spam, malware, and stolen data. We need rate limiting at multiple levels:

| Scope | Limit | Target |
|-------|-------|--------|
| Per IP (create) | 10 pastes / minute | Anonymous abuse |
| Per IP (view) | 1000 views / minute | Scraping and DDoS |
| Global (create) | 10,000 pastes / second | Infrastructure protection |
| Global (view) | 1,000,000 views / second | Infrastructure protection |
| Per slug (view) | 10,000 views / minute | Hot paste DoS |

The global limits are expressed per second so that they line up with the peak throughput targets in the non-functional requirements.

Rate limiting uses a sliding window counter with Redis. The counter tracks requests per IP per endpoint over a rolling 60-second window. When a user exceeds the limit, the API returns a 429 response with a Retry-After header.
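A sliding window counter can be sketched without Redis to show the arithmetic. It keeps one count per fixed window and estimates the rolling total by weighting the previous window by how much of it still overlaps the last 60 seconds. The class below is an in-memory illustration under that assumption; in the Redis version the two counters would be keys with an expiry, and the class and method names here are hypothetical.

```python
import math
import time
from collections import defaultdict


class SlidingWindowCounter:
    """Approximate rolling-window limiter using two fixed-window counters."""

    def __init__(self, limit: int, window_seconds: int = 60, clock=time.time):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable for testing
        # (key, window_index) -> request count. Old windows are never pruned
        # in this sketch; Redis key expiry would handle that.
        self.counts = defaultdict(int)

    def allow(self, key: str) -> bool:
        now = self.clock()
        index = int(now // self.window)
        elapsed_fraction = (now % self.window) / self.window
        previous = self.counts.get((key, index - 1), 0)
        current = self.counts.get((key, index), 0)
        # Weight the previous window by the share of it still inside
        # the rolling window.
        estimated = previous * (1 - elapsed_fraction) + current
        if estimated >= self.limit:
            return False
        self.counts[(key, index)] += 1
        return True

    def retry_after(self) -> int:
        """Conservative Retry-After value: seconds until the current window ends."""
        return int(math.ceil(self.window - (self.clock() % self.window)))
```

The key would typically combine the client IP and the endpoint, for example "203.0.113.7:create_paste". Compared with a plain fixed window, this avoids the burst of 2x the limit that is possible at a window boundary.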

Expiration, Cleanup & Burn After Read

Every paste has a lifecycle. It is created, possibly viewed many times, and eventually deleted. Three mechanisms enforce this.

Expiration & Cleanup

Each paste has a TTL countdown. A background worker scans for expired pastes, soft-deletes them, then permanently removes them after a grace period.

The worker queries for pastes where expiration < NOW(), moves them to a soft-delete state with a 24-hour grace period, then hard-deletes after the grace window expires. Burn-after-read pastes skip the grace period: they stop being served after the first view and are hard-deleted on the next request.

TTL Expiration

When a paste is created with an expiration time, the expiration column is set to NOW() + interval. The paste is returned normally until that timestamp passes. After expiration, the paste returns a 410 Gone status.

Background Cleanup Worker

A separate worker process runs on a cron schedule (every 5 minutes) to:

Scan — SELECT id, content_path FROM pastes WHERE expiration < NOW() AND deleted_at IS NULL
Soft delete — set deleted_at = NOW() on matched rows (grace period starts)
Grace period — 24 hours during which an admin could theoretically restore the paste (in practice, this is a safety net for accidental bulk deletes)
Hard delete — after 24 hours, delete the row from the database and remove the content from S3

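def cleanup_expired_pastes():
    # One set-based UPDATE soft-deletes a batch without a per-row loop.
    with db.transaction():
        db.execute("""
            UPDATE pastes SET deleted_at = NOW()
            WHERE id IN (
                SELECT id FROM pastes
                WHERE expiration < NOW() AND deleted_at IS NULL
                LIMIT 1000
            )
        """)

    hard_delete_pending()

def hard_delete_pending():
    pending = db.query("""
        SELECT id, content_path, content_hash
        FROM pastes
        WHERE deleted_at < NOW() - INTERVAL '24 hours'
        LIMIT 500
    """)

    for paste in pending:
        with db.transaction():
            db.execute("DELETE FROM pastes WHERE id = %s", [paste.id])
            # Content is shared between pastes through deduplication, so drop
            # one reference instead of deleting the content outright.
            remaining = db.query(
                "UPDATE content_refs SET ref_count = ref_count - 1 WHERE content_hash = %s RETURNING ref_count",
                [paste.content_hash])
            last_reference = bool(remaining) and remaining[0][0] <= 0
            if last_reference:
                db.execute("DELETE FROM content_refs WHERE content_hash = %s", [paste.content_hash])
        # Remove the blob only after the transaction commits, and only when
        # no other paste still points at it.
        if last_reference:
            s3.delete_object(Bucket="paste-content", Key=paste.content_path)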

Why the grace period? If a bug in the cleanup worker deletes 10,000 active pastes, the grace period gives you 24 hours to catch it and restore. Without it, the data is gone forever.

Burn After Read

Burn-after-read pastes are the opposite of persistent storage. They are designed for secrets, API keys, and one-time sharing. The flow:

User creates a paste with burn_after_read: true
On the first view request, the paste is returned normally
The access_count is incremented to 1
On any subsequent view request, the paste returns 410 and is immediately hard-deleted

@app.get("/{slug}")
async def view_paste(slug: str):
    paste = db.query_one("SELECT * FROM pastes WHERE slug = %s", [slug])
    if not paste:
        raise HTTPException(404, "Paste not found")
    # access_count >= 1 means the single allowed view has already happened.
    if paste.burn_after_read and paste.access_count >= 1:
        hard_delete_paste(paste.id)
        raise HTTPException(410, "This paste was burned after reading")
    ...
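The read-then-check approach has a race: two simultaneous first requests can both see access_count = 0 and both receive the secret. One way to close it is to make the "first view" claim a single conditional UPDATE and look at the affected row count. The sketch below uses SQLite so that it runs standalone; the table is a cut-down version of pastes and claim_first_view is a hypothetical helper name.

```python
import sqlite3


def claim_first_view(conn: sqlite3.Connection, slug: str) -> bool:
    """Return True for exactly one caller: the first viewer of the paste."""
    cursor = conn.execute(
        "UPDATE pastes SET access_count = access_count + 1 "
        "WHERE slug = ? AND burn_after_read = 1 AND access_count = 0",
        (slug,),
    )
    conn.commit()
    # The database applies the condition and the increment atomically,
    # so only one request can ever match the row.
    return cursor.rowcount == 1


def demo_connection() -> sqlite3.Connection:
    """Build a throwaway in-memory table holding one burn-after-read paste."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pastes (slug TEXT PRIMARY KEY, "
        "burn_after_read INTEGER NOT NULL, "
        "access_count INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute("INSERT INTO pastes (slug, burn_after_read) VALUES ('xY4z8R', 1)")
    conn.commit()
    return conn
```

With this in place, the view handler serves the content only when claim_first_view returns True and answers 410 otherwise. The same statement works in PostgreSQL with %s placeholders.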

Full Architecture

The system spans CDN, load balancer, API service, Redis cache, PostgreSQL, and S3-compatible object storage. A background worker handles expiration cleanup.

Client: Browser / API
CDN: Cloudflare
Load Balancer: NGINX
API Gateway: Paste Service
Cache: Redis (hot pastes)
Database: PostgreSQL (metadata)
Object Store: S3 / R2 (content)
Cleanup Worker: Expired paste scanner

The complete architecture connects all components:

Create flow: Client → CDN → Load Balancer → API Gateway → (compute content hash, dedup check) → Object Storage (store content) → Database (store metadata) → Cache (populate hot paste) → Response

View flow (cache hit): Client → CDN → Load Balancer → API Gateway → Cache (return rendered paste) → Response

View flow (cache miss): Client → CDN → Load Balancer → API Gateway → Cache (miss) → Database (lookup metadata) → Object Storage (fetch content) → Render → Cache (populate) → Response

Cleanup flow: Worker → Database (scan expired) → Object Storage (delete content) → Database (delete metadata)

Abuse Prevention

A public pastebin attracts abuse. Malware authors paste stolen data, spammers post links, and bad actors use it for C2 communication. Prevention strategies:

Content scanning: Every paste goes through a content scanner before it is stored. The scanner checks against known malware hashes, spam patterns, and credential patterns (passwords, API keys). Suspicious pastes are flagged for human review.
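As a rough illustration of the credential-pattern part of the scanner, a handful of regular expressions already catches the most common leaks. The three patterns below are examples only (an AWS access key ID prefix, a PEM private key header, and a generic password assignment); a production scanner would use a much larger, maintained rule set, and scan_for_credentials is a hypothetical name.

```python
import re
from typing import List

# Illustrative patterns only; real scanners carry hundreds of rules.
CREDENTIAL_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(
        r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"
    ),
    "password_assignment": re.compile(
        r"(?i)\b(?:password|passwd|secret)\s*[=:]\s*\S+"
    ),
}


def scan_for_credentials(content: str) -> List[str]:
    """Return the sorted names of all patterns that match the content."""
    return sorted(
        name
        for name, pattern in CREDENTIAL_PATTERNS.items()
        if pattern.search(content)
    )
```

A non-empty result would flag the paste for review rather than reject it outright, since patterns like password_assignment produce false positives on sample configs and tutorials.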

IP reputation: Block known VPNs, Tor exit nodes, and datacenter IPs for paste creation (viewing is unrestricted). This removes a large share of automated abuse, at the cost of also blocking some legitimate users behind those networks.

Rate limiting by fingerprint: Even without authentication, you can fingerprint clients using TLS fingerprint, HTTP headers, and timing patterns. A spammer rotating IPs still gets caught by JA3 fingerprint matching.

Reporting: Expose a report endpoint (POST /api/report/{slug}). Reported pastes are queued for moderator review and removed if they violate terms of service.

Scaling Reads vs Writes

This system is dramatically read-heavy. Each paste is created once (write) but viewed dozens to thousands of times (reads). The ratio can be 1:100 or higher for viral pastes.

Scaling Reads

CDN edge caching — Cloudflare caches rendered paste HTML at 200+ edge locations. With the 5-minute TTL, a popular paste reaches the origin at most once every 5 minutes per edge location.
Redis read replicas — the cache layer uses Redis with read replicas. One write master accepts cache population, multiple read replicas handle the view traffic.
Database read replicas — PostgreSQL streaming replicas handle the cache-miss traffic. The write master only handles paste creation.
Pagination for listing — public paste listings use cursor-based pagination (not offset-based) to avoid table scans on large datasets.
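Cursor-based pagination for the public listing can be sketched as a keyset query: instead of OFFSET, each page asks for rows with an id below the last one already seen. The example runs on SQLite with a cut-down pastes table so it is self-contained; list_public_pastes and demo_connection are hypothetical names.

```python
import sqlite3


def list_public_pastes(conn, limit=20, cursor=None):
    """Return (rows, next_cursor); cursor is the id of the last row already seen."""
    query = "SELECT id, slug FROM pastes WHERE visibility = 'public'"
    params = []
    if cursor is not None:
        # Seek directly past the previous page using the primary key index.
        query += " AND id < ?"
        params.append(cursor)
    query += " ORDER BY id DESC LIMIT ?"
    params.append(limit)
    rows = conn.execute(query, params).fetchall()
    # A short page means there is nothing left to fetch.
    next_cursor = rows[-1][0] if len(rows) == limit else None
    return rows, next_cursor


def demo_connection():
    """Five public pastes (ids 1-5) and one unlisted paste (id 6)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pastes (id INTEGER PRIMARY KEY, slug TEXT, visibility TEXT)"
    )
    rows = [(i, f"slug{i}", "public") for i in range(1, 6)]
    rows.append((6, "slug6", "unlisted"))
    conn.executemany("INSERT INTO pastes VALUES (?, ?, ?)", rows)
    return conn
```

The cost of each page stays constant no matter how deep the client scrolls, whereas OFFSET forces the database to walk and discard every skipped row.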

Scaling Writes

Writes are easier to scale because the volume is lower, but we still need durability:

Batch deduplication — content hashing means many writes produce zero new storage operations (existing pastes with the same hash reuse the S3 object).
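Async S3 uploads — the content upload to S3 can happen asynchronously, with the API returning the paste URL as soon as the metadata is in the database. This only preserves durability if the content is first written somewhere durable (for example, the upload queue itself). If the S3 upload fails, a retry worker picks it up from that copy.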
Write queue — during traffic spikes, incoming paste creates are queued in a message broker (RabbitMQ or Redis streams) and processed at a steady rate.

Database Sharding

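At enormous scale (billions of pastes), the pastes table is sharded by slug prefix. The first character of the slug determines the shard. Because random slugs draw from all 62 characters, the ranges must cover uppercase letters as well and should be sized so each shard owns a similar share of first characters. Custom slugs are not random, so popular prefixes can still skew a shard; hashing the whole slug is a common way to even that out.

-- Shard 0: slugs starting with a-u          (21 of 62 characters)
-- Shard 1: slugs starting with v-z or A-P   (21 of 62 characters)
-- Shard 2: slugs starting with Q-Z or 0-9   (20 of 62 characters)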

The shard mapping is handled by the API gateway or a lightweight proxy. Each shard is an independent PostgreSQL instance, with its own read replicas.
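A hash-based variant of the shard mapping is sketched below as an alternative to first-character ranges. It is not the scheme described above: it hashes the whole slug, so custom slugs with popular prefixes spread as evenly as random ones. The function name shard_for_slug is an assumption.

```python
import hashlib


def shard_for_slug(slug: str, shard_count: int) -> int:
    """Map a slug to a shard index in [0, shard_count)."""
    digest = hashlib.sha256(slug.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer; any fixed slice works.
    return int.from_bytes(digest[:8], "big") % shard_count
```

The mapping is deterministic, so every API instance routes a given slug to the same shard without coordination. The trade-off is that changing shard_count remaps most slugs, which is why larger deployments often move to consistent hashing or a lookup table.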

Self-Check Questions

After reading this walkthrough, test your understanding:

Why do we store content in S3 instead of directly in PostgreSQL? What problems does this solve?
How does content deduplication via SHA-256 work? What happens if two users paste the same code?
What is the difference between random slugs and custom slugs? How do you handle collisions for each?
How does the burn-after-read mechanism differ from TTL expiration?
Why does the cleanup worker use a 24-hour grace period before hard-deleting?
What rate limits would you set for anonymous users vs authenticated users?
How does the cache invalidation strategy ensure consistency?
What happens when a cache miss occurs for a very hot paste?