You are in a system design interview. The interviewer says: “Design a pastebin / code snippet service.” You have used these before — Pastebin, GitHub Gist, Hastebin, Glot.io. They let you paste text, get a shareable URL, and optionally set an expiration time. The problem sounds simple. The depth is in the details: how do you store millions of pastes efficiently, deduplicate content, handle syntax highlighting for 50+ languages, enforce expiration, and prevent abuse?
This is a complete walkthrough from zero to deployed architecture.
A pastebin is like a sticky note for the internet. You write some text (usually code), the service gives you back a short URL, and anyone with that URL can view the paste. Services like Pastebin and GitHub Gist have been around for two decades because developers constantly need to share code snippets — in chat, on forums, in bug reports, during code reviews.
Real-world examples: Pastebin (launched 2002 and still widely used), GitHub Gist (2008, backed by Git for versioning), Hastebin (lightweight, popular in Discord communities), Glot.io (runnable snippets with Docker), and PrivateBin (zero-knowledge, encrypted pastes).
Why do these services matter? Three reasons. First, collaboration — sharing a code snippet in Slack or IRC is useless if the code wraps badly or gets lost in the scrollback. A paste link is clean and persistent. Second, debugging — users paste error logs, stack traces, and server output to get help. Third, archival — pastebins serve as lightweight documentation for one-off scripts, configuration files, and SQL queries.
A pastebin service needs more than just storing text.
Each paste stores its content, metadata, and behavior settings.
- Raw text endpoint at /raw/<slug> for curl and programmatic consumption

We are not building: user authentication (optional for MVP), paste editing (complex versioning), collaborative editing (multiple simultaneous editors), runnable code execution (like Glot.io), or a full admin dashboard.
Every paste has the same internal structure regardless of how users interact with it.
```sql
CREATE TABLE pastes (
    id              BIGSERIAL PRIMARY KEY,
    slug            VARCHAR(100) UNIQUE NOT NULL,  -- wide enough for custom slugs (max 100 chars)
    title           VARCHAR(200),
    content_hash    CHAR(64) NOT NULL,             -- SHA-256 hex digest
    content_path    VARCHAR(500) NOT NULL,
    language        VARCHAR(30) NOT NULL DEFAULT 'auto',
    visibility      VARCHAR(10) NOT NULL DEFAULT 'public',
    expiration      TIMESTAMPTZ,                   -- NULL = never expires
    burn_after_read BOOLEAN NOT NULL DEFAULT false,
    access_count    INTEGER NOT NULL DEFAULT 0,
    content_size    INTEGER NOT NULL,
    compressed_size INTEGER,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at      TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    deleted_at      TIMESTAMPTZ                    -- soft-delete marker set by the cleanup worker
);
-- The UNIQUE constraint on slug already creates an index, so no separate slug index is needed.
CREATE INDEX idx_pastes_content_hash ON pastes(content_hash);
CREATE INDEX idx_pastes_expiration ON pastes(expiration);
```
Each field serves a specific purpose:
- slug: the public identifier that appears in the URL (e.g., pastebin.com/aB3x9Q). Must be unique, generated randomly or user-specified.

Slugs must be short, unique, and URL-safe. Two approaches:
Random slugs (default): Generate a 7-character string from a 62-character alphabet (a-z, A-Z, 0-9). This gives 62^7 = ~3.5 trillion combinations — more than enough. Use a cryptographically random generator to prevent sequential enumeration. Check the database for collisions (extremely rare at this keyspace, but handle it with a retry).
```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits

def generate_slug(length=7):
    return ''.join(secrets.choice(ALPHABET) for _ in range(length))
```
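The keyspace arithmetic for random slugs can be checked directly. The sketch below computes the size of the keyspace and, as a rough approximation, the chance that a single freshly generated slug collides with one already stored. The function names and the figure of 100 million existing pastes are illustrative assumptions, not numbers from this design.

```python
ALPHABET_SIZE = 62  # a-z, A-Z, 0-9
SLUG_LENGTH = 7


def keyspace(alphabet_size: int = ALPHABET_SIZE, length: int = SLUG_LENGTH) -> int:
    """Number of distinct slugs of the given length."""
    return alphabet_size ** length


def collision_probability(existing_pastes: int) -> float:
    """Approximate chance that one new random slug hits an existing one."""
    return existing_pastes / keyspace()


if __name__ == "__main__":
    print(keyspace())  # 3521614606208, about 3.5 trillion
    print(collision_probability(100_000_000))
```

Even with a large number of stored pastes the per-insert collision chance stays tiny, which is why a simple retry is enough.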
Custom slugs: Let users specify their own slug (e.g., deploy-script-v2). Validate length (max 100 chars), character set (alphanumeric plus hyphens), and uniqueness. Custom slugs need an explicit uniqueness check before insert because users will collide on common names like test or config.
Edge case: What if a random slug happens to match a custom slug? The random generator retries (typically 0-1 retries). What if a user requests an already-taken custom slug? Return a 409 Conflict with a suggestion.
Metadata lives in a relational database. Content lives in object storage. This separation lets us scale each independently — the database handles millions of rows with fast indexed lookups, while object storage handles gigabytes of raw paste content cheaply.
The metadata columns are summarized below. Content itself is stored separately in object storage and referenced by path.
| Column | Type | Description |
|---|---|---|
| id | BIGSERIAL PK | Internal primary key |
| slug | VARCHAR(100) UNIQUE | Public short identifier (random or custom) |
| title | VARCHAR(200) | Optional paste title |
| content_hash | CHAR(64) INDEX | SHA-256 of content (dedup) |
| content_path | VARCHAR(500) | Object storage key or blob path |
| language | VARCHAR(30) | Syntax highlight language |
| visibility | VARCHAR(10) | public / unlisted / private |
| expiration | TIMESTAMPTZ NULL | TTL expiry time, NULL = never |
| burn_after_read | BOOLEAN | Delete after first view |
| access_count | INT DEFAULT 0 | Number of views |
| content_size | INT | Uncompressed content bytes |
| compressed_size | INT | Compressed content bytes |
| created_at | TIMESTAMPTZ | Creation time |
| updated_at | TIMESTAMPTZ | Last modification time |
| deleted_at | TIMESTAMPTZ NULL | Soft-delete time set by the cleanup worker, NULL = active |
Paste content can be anything from a 3-line config to a multi-megabyte log file (the create endpoint in this design caps a paste at 10 MB). Storing large blobs in PostgreSQL or MySQL causes table bloat, slow backups, and expensive queries. Object storage (S3, Cloudflare R2, GCS) costs a few cents per GB per month and handles multi-TB scale without manual sharding.
The database stores only the metadata and a reference (content_path) to the stored object.
When a user creates a paste:
- Query the pastes table for existing rows with the same content_hash
- If a match exists, reuse its content_path and increment the reference counter

This can sharply reduce storage for common snippets. SQL queries, common config files, and popular code patterns get pasted thousands of times but stored once.
```python
import gzip
import hashlib

import boto3

s3 = boto3.client('s3')

# `db` is assumed to be an application-level database helper defined elsewhere.
def store_content(content: str) -> tuple[str, str, int]:
    """Store content once per unique hash; return (hash, object key, raw size)."""
    raw_bytes = content.encode('utf-8')
    content_hash = hashlib.sha256(raw_bytes).hexdigest()
    # Dedup check: has identical content been stored before?
    existing = db.query("SELECT content_path FROM pastes WHERE content_hash = %s LIMIT 1", [content_hash])
    if existing:
        db.execute("UPDATE content_refs SET ref_count = ref_count + 1 WHERE content_hash = %s", [content_hash])
        return (content_hash, existing[0][0], len(raw_bytes))
    compressed = gzip.compress(raw_bytes)
    # A two-level hash prefix keeps object keys spread across the namespace.
    key = f"pastes/{content_hash[:2]}/{content_hash[2:4]}/{content_hash}"
    s3.put_object(Bucket="paste-content", Key=key, Body=compressed, ContentType="text/plain")
    db.execute(
        "INSERT INTO content_refs (content_hash, content_path, original_size, compressed_size, ref_count) "
        "VALUES (%s, %s, %s, %s, 1)",
        [content_hash, key, len(raw_bytes), len(compressed)],
    )
    return (content_hash, key, len(raw_bytes))
```
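To see the deduplication and reference counting in isolation, here is a minimal in-memory sketch of a content-addressed store. It is an illustration only: `ContentStore` and its methods are hypothetical names, and two dictionaries stand in for object storage and the `content_refs` table.

```python
import gzip
import hashlib


class ContentStore:
    """In-memory stand-in for object storage plus a reference-count table."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}     # hash -> gzip-compressed content
        self.ref_counts: dict[str, int] = {}  # hash -> number of pastes using it

    def put(self, content: str) -> str:
        raw = content.encode("utf-8")
        digest = hashlib.sha256(raw).hexdigest()
        if digest in self.blobs:
            # Same content seen before: only bump the counter.
            self.ref_counts[digest] += 1
        else:
            self.blobs[digest] = gzip.compress(raw)
            self.ref_counts[digest] = 1
        return digest

    def get(self, digest: str) -> str:
        return gzip.decompress(self.blobs[digest]).decode("utf-8")

    def release(self, digest: str) -> None:
        """Drop one reference; remove the blob when nothing points at it."""
        self.ref_counts[digest] -= 1
        if self.ref_counts[digest] == 0:
            del self.blobs[digest]
            del self.ref_counts[digest]
```

The `release` step matters as much as `put`: deleting one paste must not remove content that other pastes still reference.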
Syntax highlighting turns raw code into colored, formatted HTML. Every paste service needs this — without it, a paste is just monospace text.
The pipeline works in four stages:
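- Detect the language from the file extension (main.py → Python), shebang line (#!/usr/bin/env node → JavaScript), and content patterns (fn main() → Rust).
- Wrap each token in a <span> with a CSS class representing its type (.token.keyword, .token.string).

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import TextLexer, get_lexer_by_name, guess_lexer
from pygments.util import ClassNotFound

# SUPPORTED_LANGUAGES is assumed to be a set of Pygments lexer aliases defined elsewhere.
def render_paste(content: str, language: str = 'auto') -> str:
    try:
        if language == 'auto' or language not in SUPPORTED_LANGUAGES:
            lexer = guess_lexer(content)
        else:
            lexer = get_lexer_by_name(language)
    except ClassNotFound:
        lexer = TextLexer()  # fall back to plain text when no lexer matches
    # Pygments emits short class names (e.g. "k"); the prefix yields class="token k".
    formatter = HtmlFormatter(classprefix='token ')
    return highlight(content, lexer, formatter)
```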
For the frontend, PrismJS handles highlighting on the client side for dynamic content. The server-side rendered HTML serves as the initial load, while the raw text endpoint (/raw/<slug>) serves the unhighlighted content for API consumers.
When a user submits a new paste, the following happens in sequence:
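- The client sends a request body: { content, title?, language?, expiration?, visibility?, slug? }
- The metadata row is inserted into the pastes table
- The API responds with 201 Created with the paste URL

```python
from fastapi import FastAPI, HTTPException, Request
from pydantic import BaseModel

app = FastAPI()
MAX_PASTE_BYTES = 10 * 1024 * 1024

class CreatePasteRequest(BaseModel):
    content: str
    title: str | None = None
    language: str = 'auto'
    expiration: str | None = None
    visibility: str = 'public'
    slug: str | None = None

# rate_limiter, db, cache, slug_collides, and parse_expiration are assumed helpers defined elsewhere.
@app.post("/api/pastes", status_code=201)
async def create_paste(req: CreatePasteRequest, request: Request):
    # Enforce the limit in bytes, not characters.
    if len(req.content.encode('utf-8')) > MAX_PASTE_BYTES:
        raise HTTPException(413, "Content exceeds 10 MB limit")
    client_ip = request.client.host
    if not rate_limiter.check(client_ip, "create_paste"):
        raise HTTPException(429, "Rate limit exceeded")
    if req.slug:
        # A taken custom slug is a conflict; never silently swap in a random one.
        if slug_collides(req.slug):
            raise HTTPException(409, "Slug already taken")
        slug = req.slug
    else:
        slug = generate_slug()
        while slug_collides(slug):  # collisions are rare, but retry until free
            slug = generate_slug()
    # Store content only after validation so rejected requests leave no references behind.
    content_hash, content_path, size = store_content(req.content)
    db.insert("pastes", {
        "slug": slug,
        "title": req.title,
        "content_hash": content_hash,
        "content_path": content_path,
        "language": req.language,
        "expiration": parse_expiration(req.expiration),
        "visibility": req.visibility,
        "content_size": size,
    })
    cache.set(f"paste:{slug}", render_paste(req.content, req.language), ttl=3600)
    return {"url": f"https://paste.example.com/{slug}", "slug": slug}
```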
When a user opens a paste URL:
- Look up the pastes table by slug index
- If expiration < NOW() or burn_after_read and already viewed, return 410 Gone
- Fetch the content from content_path, decompress gzip
- Increment access_count

```python
from datetime import datetime, timezone

from fastapi.responses import HTMLResponse

# db, cache, fetch_from_s3, and hard_delete_paste are assumed helpers defined elsewhere.
@app.get("/{slug}")
async def view_paste(slug: str):
    cached = cache.get(f"paste:{slug}")
    if cached:
        db.execute("UPDATE pastes SET access_count = access_count + 1 WHERE slug = %s", [slug])
        return HTMLResponse(cached)
    paste = db.query_one("SELECT * FROM pastes WHERE slug = %s", [slug])
    if not paste:
        raise HTTPException(404, "Paste not found")
    # Timezone-aware "now", so it compares cleanly with TIMESTAMPTZ values.
    now = datetime.now(timezone.utc)
    if paste.expiration and paste.expiration < now:
        raise HTTPException(410, "Paste has expired")
    if paste.burn_after_read and paste.access_count > 0:
        hard_delete_paste(paste.id)
        raise HTTPException(410, "Paste has been burned")
    content = fetch_from_s3(paste.content_path)
    rendered = render_paste(content, paste.language)
    if not paste.burn_after_read:
        # Never cache burn-after-read pastes: a cache hit would skip the burn check.
        ttl = int((paste.expiration - now).total_seconds()) if paste.expiration else 3600
        cache.set(f"paste:{slug}", rendered, ttl=max(1, min(ttl, 3600)))
    db.execute("UPDATE pastes SET access_count = access_count + 1 WHERE id = %s", [paste.id])
    return HTMLResponse(rendered)
```
Paste traffic tends to follow a power-law distribution: roughly 80% of views hit 20% of pastes. A top-viral paste (e.g., a leaked config or a popular code snippet) can get millions of views in hours.
The cache strategy:
| Layer | Store | TTL | Size | Hit Rate Target |
|---|---|---|---|---|
| CDN edge | Cloudflare cache | 5 minutes | Unlimited | 40% of all requests |
| Application cache | Redis | Remaining expiration (capped at 1 hour) | 10 GB | 80% of requests reaching app |
| Database | PostgreSQL | N/A (source of truth) | N/A | 100% for misses |
Cache invalidation happens when: the paste is deleted (explicit delete by user, or expiration cleanup), the paste is edited (not in our MVP, but would clear the cache), or the TTL expires naturally.
For extremely hot pastes (a single paste getting >100K views/min), the CDN edge cache is the safety valve. The 5-minute CDN TTL means even a massive spike hits the origin only once per 5 minutes per edge location.
A pastebin without rate limiting is a pastebin that will be used to store spam, malware, and stolen data. We need rate limiting at multiple levels:
| Scope | Limit | Target |
|---|---|---|
| Per IP (create) | 10 pastes / minute | Anonymous abuse |
| Per IP (view) | 1000 views / minute | Scraping and DDoS |
| Global (create) | 10,000 pastes / minute | Infrastructure protection |
| Global (view) | 1,000,000 views / minute | Infrastructure protection |
| Per slug (view) | 10,000 views / minute | Hot paste DoS |
Rate limiting uses a sliding window counter with Redis. The counter tracks requests per IP per endpoint over a rolling 60-second window. When a user exceeds the limit, the API returns a 429 response with a Retry-After header.
Every paste has a lifecycle. It is created, possibly viewed many times, and eventually deleted. Three mechanisms enforce this.
Each paste has a TTL countdown. A background worker scans for expired pastes, soft-deletes them, then permanently removes them after a grace period.
When a paste is created with an expiration time, the expiration column is set to NOW() + interval. The paste is returned normally until that timestamp passes. After expiration, the paste returns a 410 Gone status.
A separate worker process runs on a cron schedule (every 5 minutes) to:
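- Find expired pastes: SELECT id, content_path FROM pastes WHERE expiration < NOW() AND deleted_at IS NULL
- Set deleted_at = NOW() on matched rows (grace period starts)

```python
# db and s3 are assumed to be the database helper and boto3 client used earlier.
def cleanup_expired_pastes():
    # Phase 1: soft-delete expired pastes (starts the grace period).
    with db.transaction():
        expired = db.query("""
            SELECT id, content_path, content_hash
            FROM pastes
            WHERE expiration < NOW() AND deleted_at IS NULL
            LIMIT 1000
        """)
        for paste in expired:
            db.execute("UPDATE pastes SET deleted_at = NOW() WHERE id = %s", [paste.id])
    hard_delete_pending()

def hard_delete_pending():
    # Phase 2: permanently remove pastes whose grace period has passed.
    with db.transaction():
        pending = db.query("""
            SELECT id, content_path, content_hash
            FROM pastes
            WHERE deleted_at < NOW() - INTERVAL '24 hours'
            LIMIT 500
        """)
        for paste in pending:
            db.execute("DELETE FROM pastes WHERE id = %s", [paste.id])
            # Content may be shared by deduplicated pastes: drop one reference and
            # delete the object only when no paste points at it any more.
            db.execute("UPDATE content_refs SET ref_count = ref_count - 1 WHERE content_hash = %s", [paste.content_hash])
            remaining = db.query("SELECT ref_count FROM content_refs WHERE content_hash = %s", [paste.content_hash])
            if remaining and remaining[0][0] <= 0:
                s3.delete_object(Bucket="paste-content", Key=paste.content_path)
                db.execute("DELETE FROM content_refs WHERE content_hash = %s", [paste.content_hash])
```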
Why the grace period? If a bug in the cleanup worker deletes 10,000 active pastes, the grace period gives you 24 hours to catch it and restore. Without it, the data is gone forever.
Burn-after-read pastes are the opposite of persistent storage. They are designed for secrets, API keys, and one-time sharing. The flow:
- The paste is created with burn_after_read: true
- On the first view, access_count is incremented to 1

```python
@app.get("/{slug}")
async def view_paste(slug: str):
    paste = db.query_one("SELECT * FROM pastes WHERE slug = %s", [slug])
    if not paste:
        raise HTTPException(404, "Paste not found")
    if paste.burn_after_read and paste.access_count >= 1:
        hard_delete_paste(paste.id)
        raise HTTPException(410, "This paste was burned after reading")
    ...
```
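A read-then-check flow like the one above can let two concurrent viewers both observe access_count = 0 and both receive the secret. One way to close that gap is to make the first view an atomic conditional update, so exactly one request "wins". The sketch below demonstrates the idea with Python's built-in sqlite3 as a stand-in for PostgreSQL; `claim_burn_after_read` and the reduced table layout are hypothetical.

```python
import sqlite3


def claim_burn_after_read(conn: sqlite3.Connection, slug: str) -> bool:
    """Return True for exactly one caller: the first viewer of the paste."""
    cur = conn.execute(
        "UPDATE pastes SET access_count = access_count + 1 "
        "WHERE slug = ? AND burn_after_read = 1 AND access_count = 0",
        (slug,),
    )
    conn.commit()
    # rowcount is 1 only if this call flipped access_count from 0 to 1.
    return cur.rowcount == 1


def demo() -> list[bool]:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pastes (slug TEXT PRIMARY KEY, burn_after_read INTEGER, access_count INTEGER)"
    )
    conn.execute("INSERT INTO pastes VALUES ('s3cret', 1, 0)")
    # The first claim succeeds; any later claim sees access_count = 1 and fails.
    return [claim_burn_after_read(conn, "s3cret"), claim_burn_after_read(conn, "s3cret")]


if __name__ == "__main__":
    print(demo())  # [True, False]
```

Only the request that gets True would fetch and return the content; every other request returns 410 Gone.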
The system spans CDN, load balancer, API service, Redis cache, PostgreSQL, and S3-compatible object storage. A background worker handles expiration cleanup.
The complete architecture connects all components:
Create flow: Client → CDN → Load Balancer → API Gateway → (compute content hash, dedup check) → Object Storage (store content) → Database (store metadata) → Cache (populate hot paste) → Response
View flow (cache hit): Client → CDN → Load Balancer → API Gateway → Cache (return rendered paste) → Response
View flow (cache miss): Client → CDN → Load Balancer → API Gateway → Cache (miss) → Database (lookup metadata) → Object Storage (fetch content) → Render → Cache (populate) → Response
Cleanup flow: Worker → Database (scan expired) → Object Storage (delete content) → Database (delete metadata)
A public pastebin attracts abuse. Malware authors paste stolen data, spammers post links, and bad actors use it for C2 communication. Prevention strategies:
Content scanning: Every paste goes through a content scanner before it is stored. The scanner checks against known malware hashes, spam patterns, and credential patterns (passwords, API keys). Suspicious pastes are flagged for human review.
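As a rough illustration of the scanning step, the sketch below flags a paste when its hash is on a blocklist or when it matches a few credential-like patterns. The pattern set is deliberately tiny and illustrative, `KNOWN_BAD_HASHES` is an empty placeholder for whatever hash feed a real deployment would use, and `scan_content` is a hypothetical name.

```python
import hashlib
import re

# Placeholder: a real deployment would load this from a malware-hash feed.
KNOWN_BAD_HASHES: set[str] = set()

# A few illustrative credential patterns; real scanners use far larger rule sets.
CREDENTIAL_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "password_assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}


def scan_content(content: str) -> list[str]:
    """Return the names of all rules the content triggers (empty = clean)."""
    findings = []
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    if digest in KNOWN_BAD_HASHES:
        findings.append("known_bad_hash")
    for name, pattern in CREDENTIAL_PATTERNS.items():
        if pattern.search(content):
            findings.append(name)
    return findings
```

A non-empty result would send the paste to the human review queue rather than reject it outright, since pattern matches produce false positives.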
IP reputation: Block or challenge known VPNs, Tor exit nodes, and datacenter IPs for paste creation (viewing is unrestricted). This removes a large share of automated abuse, at the cost of inconveniencing some legitimate users on those networks.
Rate limiting by fingerprint: Even without authentication, you can fingerprint clients using TLS fingerprint, HTTP headers, and timing patterns. A spammer rotating IPs can often still be linked across requests by JA3 fingerprint matching.
Reporting: Expose a report endpoint (POST /api/report/{slug}). Reported pastes are queued for moderator review and removed if they violate terms of service.
This system is dramatically read-heavy. Each paste is created once (write) but viewed dozens to thousands of times (reads). The ratio can be 1:100 or higher for viral pastes.
Writes are easier to scale because the volume is low, but we still need durability.
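At enormous scale (billions of pastes), the pastes table is sharded by slug prefix. The first character of the slug determines the shard. This distributes reads and writes roughly evenly for random slugs, provided the character ranges cover the whole 62-character alphabet in similar-sized groups. Custom slugs are not random and can skew the distribution.
```sql
-- Shard 0: slugs starting with 0-9 or A-K (21 characters)
-- Shard 1: slugs starting with L-Z or a-f (21 characters)
-- Shard 2: slugs starting with g-z (20 characters)
```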
The shard mapping is handled by the API gateway or a lightweight proxy. Each shard is an independent PostgreSQL instance, with its own read replicas.
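As an alternative to prefix ranges, the mapping layer could hash the whole slug and take the result modulo the shard count. This is a different technique from the first-character scheme described above, shown here because it stays balanced even when user-chosen slugs cluster on common prefixes. `shard_for_slug` and the shard count of 3 are illustrative assumptions.

```python
import hashlib

NUM_SHARDS = 3  # assumed shard count for the example


def shard_for_slug(slug: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a slug to a shard index using a stable hash of the full slug."""
    digest = hashlib.sha256(slug.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce to a shard index.
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    for slug in ("aB3x9Q", "deploy-script-v2", "test"):
        print(slug, shard_for_slug(slug))
```

The trade-off is that plain modulo hashing remaps most slugs when the number of shards changes, so growing the cluster needs a migration plan or a scheme such as consistent hashing.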