Imagine you walk into a restaurant. With REST, you order from a fixed menu: you ask the waiter for the “User Platter,” and they bring you a plate with every side dish the kitchen includes — name, email, address, phone number, profile picture, friend list, and 14 other fields you didn’t ask for. You wanted just the name and email, but the restaurant decided what a “User Platter” contains.
With GraphQL, you order à la carte. You tell the waiter exactly what you want: “Bring me the user’s name and email, and for each of their posts, bring me the title and the first 100 characters of the body.” The waiter writes it down, the kitchen prepares exactly that, and nothing else arrives at your table.
REST APIs return fixed data shapes per endpoint. /api/users/1 returns a User object with all its fields, whether you need them or not. If you need the user’s posts too, you make a second call to /api/users/1/posts. If you need the post’s comments, that’s a third call.
GraphQL solves this with a single endpoint and a query language that lets the client specify the exact shape of the response. One request, one response, no over-fetching, no under-fetching.
| Concern | REST | GraphQL |
|---|---|---|
| Data shape | Fixed per endpoint | Client-specified per request |
| Over-fetching | Common (unused fields) | None (client asks for what it needs) |
| Under-fetching | Common (multiple round-trips) | None (one query fetches nested data) |
| Endpoints | Many (/users, /users/:id/posts, etc.) | One (/graphql) |
| Versioning | URL or header based (/v2/users) | Evolve schema, deprecate fields |
| Tooling | Swagger, OpenAPI | GraphiQL, Apollo Studio, Introspection |
The trade-off: GraphQL shifts complexity from the client to the server. The server must understand the query, validate it against the schema, and resolve each requested field efficiently. That is what we explore in this post.
GraphQL queries look like JSON without the values — a mirror of the response shape you want back. Every query starts at the root Query type and walks down the field tree.
query GetUserWithPosts {
user(id: "1") {
name
email
posts {
title
body
}
}
}
The response comes back as JSON matching the query structure exactly:
{
"data": {
"user": {
"name": "Alice Wonderland",
"email": "alice@example.com",
"posts": [
{ "title": "Hello World", "body": "First post content..." },
{ "title": "GraphQL is great", "body": "Second post..." }
]
}
}
}
Three types of operations exist: queries (read data), mutations (write data), and subscriptions (receive updates as events occur).
Every operation specifies a selection set — the fields you want at each level. Fields can have arguments (like id: "1" or limit: 10) and can be aliased to avoid naming conflicts:
query {
myself: user(id: "1") { name }
myFriend: user(id: "2") { name }
}
Variables keep queries reusable and secure:
query GetUser($id: ID!, $limit: Int) {
user(id: $id) {
name
posts(limit: $limit) { title }
}
}
Sent as JSON with a separate variables dictionary:
{
"query": "query GetUser($id: ID!, $limit: Int) { ... }",
"variables": { "id": "1", "limit": 5 }
}
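As a rough sketch of how a client might assemble that payload, the helper below builds the options for an HTTP POST. The function name `buildGraphQLRequest` and the header set are assumptions made for illustration, not part of any library.

```javascript
// Build the options for an HTTP POST carrying a GraphQL operation.
// buildGraphQLRequest is a hypothetical helper, not a library function.
function buildGraphQLRequest(query, variables = {}) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    // Query text and variables travel as separate JSON fields, so user
    // input is never spliced into the query string itself.
    body: JSON.stringify({ query, variables }),
  }
}

const GET_USER = `
  query GetUser($id: ID!, $limit: Int) {
    user(id: $id) {
      name
      posts(limit: $limit) { title }
    }
  }
`

const request = buildGraphQLRequest(GET_USER, { id: '1', limit: 5 })
// A client would hand this to fetch('/graphql', request).
console.log(request.body)
```

Keeping the variables out of the query string is what makes the same query text reusable across calls.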
The GraphQL schema defines what queries are possible, what arguments each field accepts, what types each field returns, and what data the client can ask for. The schema is the contract between client and server.
Two concepts form the backbone of every GraphQL server: the schema and the resolvers.
The schema is the what: it declares what types exist, what fields they have, and what arguments each field accepts. It is a blueprint written in the Schema Definition Language (SDL).
type User {
id: ID!
name: String!
email: String!
posts(limit: Int): [Post!]!
}
type Post {
id: ID!
title: String!
body: String!
author: User!
comments: [Comment!]!
}
The resolvers are the how: they are functions that actually fetch or compute the data for each field. When GraphQL encounters user(id: "1") in a query, it calls the user resolver on the Query type, passing { id: "1" } as arguments. The resolver fetches the user from the database and returns it.
const resolvers = {
Query: {
user: (_, { id }) => db.users.find((u) => u.id === id),
},
User: {
posts: (user, { limit }) => {
const userPosts = db.posts.filter((p) => p.authorId === user.id)
return limit ? userPosts.slice(0, limit) : userPosts
},
},
}
The resolver for Query.user returns a User object. Then GraphQL looks at the query’s selection set under user: it needs name and email. These are scalar fields on User — GraphQL can resolve them directly (a name resolver by default returns user.name). But posts is a relation — GraphQL needs to call the User.posts resolver to fetch the related data.
This is the resolver chain: GraphQL walks the query tree depth-first, calling resolvers for each field and collecting the results. Each resolver receives the parent object (the return value of the parent resolver), the field arguments, the context (shared state like database connections or auth info), and the field info.
Every value in GraphQL has a type. The type system defines what data looks like and what operations are valid. Understanding the type system is the key to designing good schemas.
Object types represent entities with multiple fields:
type Comment {
id: ID!
text: String!
author: User!
createdAt: DateTime
}
Scalar types are leaf values. GraphQL ships with five built-in scalars: Int, Float, String, Boolean, ID. Custom scalars (like DateTime, JSON, URL) require serialization logic.
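To give a feel for what that serialization logic can look like, here is a minimal sketch for a DateTime scalar. The `serialize` and `parseValue` names mirror the config fields of graphql-js's GraphQLScalarType, but the object below is plain JavaScript so it runs without dependencies; the ISO 8601 wire format is an assumption.

```javascript
// Plain-object sketch of a DateTime custom scalar (assumed ISO 8601 on the wire).
const DateTimeScalar = {
  name: 'DateTime',
  // Outgoing: resolver value -> JSON-safe value sent to the client.
  serialize(value) {
    const date = value instanceof Date ? value : new Date(value)
    if (Number.isNaN(date.getTime())) throw new TypeError('Invalid DateTime')
    return date.toISOString()
  },
  // Incoming: value supplied through a variable -> internal Date object.
  parseValue(value) {
    if (typeof value !== 'string') throw new TypeError('DateTime must be a string')
    const date = new Date(value)
    if (Number.isNaN(date.getTime())) throw new TypeError('Invalid DateTime')
    return date
  },
}

console.log(DateTimeScalar.serialize(new Date(Date.UTC(2024, 0, 15))))
// prints 2024-01-15T00:00:00.000Z
```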
Enum types restrict a field to a fixed set of values:
enum PostStatus {
DRAFT
PUBLISHED
ARCHIVED
}
Interface types define shared fields that multiple object types implement:
interface Node {
id: ID!
createdAt: DateTime!
}
type User implements Node {
id: ID!
createdAt: DateTime!
name: String!
email: String!
}
type Post implements Node {
id: ID!
createdAt: DateTime!
title: String!
body: String!
}
Union types represent a field that can return one of several types, with no shared fields required:
union SearchResult = User | Post | Comment
type Query {
search(term: String!): [SearchResult!]!
}
Clients query unions with inline fragments:
query {
search(term: "alice") {
... on User { name email }
... on Post { title body }
... on Comment { text }
}
}
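On the server side, the executor has to know which concrete type each search result is before it can apply those fragments. In Apollo Server and graphql-tools resolver maps this is done with a `__resolveType` function; the sketch below assumes each type can be recognized by a distinguishing property (email, title, text), which is a simplification.

```javascript
// Resolver-map entry for the SearchResult union. The property checks are an
// assumption about how the underlying records differ.
const resolvers = {
  SearchResult: {
    __resolveType(obj) {
      if ('email' in obj) return 'User'
      if ('title' in obj) return 'Post'
      if ('text' in obj) return 'Comment'
      return null // unknown shape: the server would report an error
    },
  },
}

const results = [
  { id: '1', name: 'Alice Wonderland', email: 'alice@example.com' },
  { id: '7', title: 'Hello World', body: 'First post content...' },
]
console.log(results.map((r) => resolvers.SearchResult.__resolveType(r)))
// prints [ 'User', 'Post' ]
```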
The type system also includes input types for mutations. Input types are like object types but used for arguments:
input CreatePostInput {
title: String!
body: String!
authorId: ID!
}
type Mutation {
createPost(input: CreatePostInput!): Post!
}
The ! suffix means non-null. If a field is String!, the server guarantees it will never return null. If a resolver throws an error for a non-null field, GraphQL propagates the null up to the nearest nullable parent, potentially losing the entire object.
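The null propagation rule is easier to see in a toy model. The sketch below is not how any real executor is written: it describes fields with a hypothetical `{ nonNull, resolve, fields }` shape, records only thrown errors, and shows how a failing non-null field nulls out its nearest nullable ancestor.

```javascript
// Toy model of null propagation. The field description format is made up
// for this example.
function completeObject(fields, errors, path = []) {
  const result = {}
  for (const [name, field] of Object.entries(fields)) {
    const fieldPath = [...path, name]
    let value = null
    try {
      value = field.fields
        ? completeObject(field.fields, errors, fieldPath)
        : field.resolve()
    } catch (err) {
      errors.push({ message: err.message, path: fieldPath })
    }
    if (value === null || value === undefined) {
      if (field.nonNull) return null // null bubbles up to the parent object
      value = null // a nullable field absorbs the null here
    }
    result[name] = value
  }
  return result
}

const errors = []
const data = completeObject({
  user: { // user: User (nullable)
    nonNull: false,
    fields: {
      name: { nonNull: true, resolve: () => 'Alice Wonderland' },
      email: { nonNull: true, resolve: () => { throw new Error('email lookup failed') } },
    },
  },
}, errors)

console.log(JSON.stringify({ data, errors }))
// prints {"data":{"user":null},"errors":[{"message":"email lookup failed","path":["user","email"]}]}
```

The name field resolved fine, yet it is lost along with the rest of the user because email was declared non-null.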
The non-null (!) and list ([]) modifiers combine in six ways:
| Syntax | Meaning |
|---|---|
| String | Nullable string |
| String! | Non-null string |
| [String] | Nullable list of nullable strings |
| [String!] | Nullable list of non-null strings |
| [String]! | Non-null list of nullable strings |
| [String!]! | Non-null list of non-null strings |
SDL is the syntax used to write GraphQL schemas. It is language-agnostic — the same schema works with JavaScript, Python, Ruby, Go, or Rust servers.
Here is a complete schema for a blog:
type Query {
user(id: ID!): User
users(limit: Int): [User!]!
post(id: ID!): Post
posts(limit: Int): [Post!]!
search(term: String!): [SearchResult!]!
}
type Mutation {
createPost(input: CreatePostInput!): Post!
updatePost(id: ID!, input: UpdatePostInput!): Post!
deletePost(id: ID!): Boolean!
addComment(postId: ID!, text: String!): Comment!
}
scalar DateTime
type Subscription {
postCreated: Post!
}
type User {
id: ID!
name: String!
email: String!
posts(limit: Int): [Post!]!
}
type Post {
id: ID!
title: String!
body: String!
author: User!
comments: [Comment!]!
status: PostStatus!
createdAt: DateTime!
}
type Comment {
id: ID!
text: String!
author: User!
createdAt: DateTime!
}
enum PostStatus {
DRAFT
PUBLISHED
ARCHIVED
}
union SearchResult = User | Post | Comment
input CreatePostInput {
title: String!
body: String!
authorId: ID!
}
input UpdatePostInput {
title: String
body: String
status: PostStatus
}
Every schema has up to three special root types: Query, Mutation, and Subscription. Only Query is required. These define the entry points for operations: every request starts at a field on one of the root types.
SDL supports comments with #:
# A blog post with markdown body and moderation status
type Post {
id: ID!
title: String!
body: String!
status: PostStatus!
}
And deprecation with the @deprecated directive:
type User {
phoneNumber: String @deprecated(reason: "Use contact instead")
contact: ContactInfo
}
Directives are annotations that modify behavior. Built-in directives include @deprecated, @skip, @include, and @specifiedBy. Custom directives extend the schema with validation, authentication, or formatting logic.
Before executing any query, GraphQL validates it against the schema. Validation catches errors early, before any resolvers run. This is one of GraphQL’s superpowers: malformed queries never reach your database.
Validation rules fall into several categories:
Document-level rules: Operation names must be unique within a document, and an anonymous operation is allowed only when it is the sole operation in the document. Fragment names must be unique. Operations must be of a known type (query, mutation, subscription).
Field rules: Every field in the query must exist on the parent type. Fields cannot query for subfields if they return a scalar. Fields cannot be leaf values if they return an object (you must specify a selection set).
# INVALID: User has no 'age' field
query { user(id: "1") { age } }
# INVALID: Post is an object, needs a selection set
query { user(id: "1") { posts } }
# VALID: Post has a selection set
query { user(id: "1") { posts { title } } }
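The field rules above can be approximated in a few lines. This is only a sketch under simplifying assumptions: the schema is reduced to a hypothetical map of type to fields, and a selection set is written as nested objects instead of a parsed query.

```javascript
// A schema reduced to "type -> field -> return type"; null marks a scalar.
const schema = {
  Query: { user: 'User' },
  User: { name: null, email: null, posts: 'Post' },
  Post: { title: null, body: null },
}
const hasField = (type, field) => Object.prototype.hasOwnProperty.call(schema[type], field)

// A selection set as nested objects: {} means "no sub-selection".
function validate(selection, typeName, path = []) {
  const errors = []
  for (const [field, sub] of Object.entries(selection)) {
    const here = [...path, field].join('.')
    if (!hasField(typeName, field)) {
      errors.push(`${here}: no such field on ${typeName}`)
      continue
    }
    const returnType = schema[typeName][field]
    const hasSub = Object.keys(sub).length > 0
    if (returnType === null && hasSub) errors.push(`${here}: scalar cannot have subfields`)
    if (returnType !== null && !hasSub) errors.push(`${here}: ${returnType} needs a selection set`)
    if (returnType !== null && hasSub) errors.push(...validate(sub, returnType, [...path, field]))
  }
  return errors
}

console.log(validate({ user: { age: {} } }, 'Query'))
// prints [ 'user.age: no such field on User' ]
console.log(validate({ user: { posts: {} } }, 'Query'))
// prints [ 'user.posts: Post needs a selection set' ]
console.log(validate({ user: { posts: { title: {} } } }, 'Query'))
// prints []
```

No resolver is involved anywhere in this check, which is why invalid queries can be rejected before any data is touched.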
Argument rules: Every required argument must be provided with a value of the correct type. Unknown arguments are rejected.
# INVALID: user requires id: ID!
query { user { name } }
# INVALID: wrong type for id
query { user(id: true) { name } }
# VALID
query { user(id: "1") { name } }
Type rules: Fragment conditions must exist in the schema. Inline fragment type conditions must be a possible type for the field. Union and interface fragments must use ... on Type syntax.
Directive rules: @skip(if: Boolean!) and @include(if: Boolean!) are evaluated at execution time, but validation still checks that each directive exists and is used in an allowed location. If a variable is used, it must be declared and its type must match.
query ($showEmail: Boolean!) {
user(id: "1") {
name
email @include(if: $showEmail)
}
}
The schema that drives validation also makes GraphQL self-documenting in a way that REST endpoints are not. Tools like GraphiQL use the schema to provide autocomplete, inline documentation, and real-time validation feedback. The editor applies the same rules as the server, so any validation error a developer sees in the editor is one the server would also raise.
When a validated query reaches the execution engine, GraphQL walks the query tree depth-first, calling resolvers for each field. Understanding this execution model is essential for performance.
The resolver signature in most GraphQL implementations follows this pattern:
fieldName(parent, args, context, info) => data
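The four arguments:
- parent: the return value of the parent field's resolver (also called root or source)
- args: the arguments passed to the field in the query
- context: shared per-request state such as database connections or auth info
- info: details about the field and the query being executed
Default resolvers: if a field has no explicit resolver, GraphQL looks for a property with the same name on the parent object. This is how scalar fields like User.name resolve without writing resolver code.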
Consider this query and the execution order:
query {
user(id: "1") {
name
posts {
title
comments { text }
}
}
}
Execution order:
1. Query.user({ id: "1" }) resolves and returns a user object
2. name (scalar, default resolver reads user.name)
3. posts (calls User.posts(user, {}))
4. title (default resolver)
5. comments (calls Post.comments(post, {}))
6. text (default resolver)
GraphQL uses a per-field resolution model. Every field, no matter how deeply nested, gets its own resolver. This is what makes GraphQL so flexible, and what makes the N+1 problem so dangerous if you are not careful.
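The execution order above can be reproduced with a deliberately tiny executor. This is a synchronous sketch, not how graphql-js is implemented: the in-memory `db`, the `returnTypes` map, and the nested-object selection format are all assumptions made so the example runs on its own.

```javascript
const db = {
  users: [{ id: '1', name: 'Alice Wonderland' }],
  posts: [{ id: 'p1', authorId: '1', title: 'Hello World' }],
  comments: [{ id: 'c1', postId: 'p1', text: 'Nice post' }],
}
const resolvers = {
  Query: { user: (_, { id }) => db.users.find((u) => u.id === id) },
  User: { posts: (user) => db.posts.filter((p) => p.authorId === user.id) },
  Post: { comments: (post) => db.comments.filter((c) => c.postId === post.id) },
}
// Return type of each object-valued field; scalar fields are omitted.
const returnTypes = { 'Query.user': 'User', 'User.posts': 'Post', 'Post.comments': 'Comment' }
const trace = []

// selection format: { fieldName: { args, selection } }
function execute(typeName, parent, selection, context) {
  const out = {}
  for (const [field, { args = {}, selection: sub }] of Object.entries(selection)) {
    trace.push(`${typeName}.${field}`)
    const explicit = resolvers[typeName] && resolvers[typeName][field]
    // Default resolver: read the same-named property from the parent.
    const value = explicit ? explicit(parent, args, context) : parent[field]
    const childType = returnTypes[`${typeName}.${field}`]
    if (!childType) out[field] = value
    else if (Array.isArray(value)) out[field] = value.map((v) => execute(childType, v, sub, context))
    else out[field] = value == null ? null : execute(childType, value, sub, context)
  }
  return out
}

const data = execute('Query', {}, {
  user: { args: { id: '1' }, selection: {
    name: {},
    posts: { selection: { title: {}, comments: { selection: { text: {} } } } },
  } },
}, {})
console.log(trace.join(' -> '))
// prints Query.user -> User.name -> User.posts -> Post.title -> Post.comments -> Comment.text
```

With more than one post, the Post-level entries would repeat once per post, which is exactly where the N+1 problem comes from.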
Each resolver returns either a value or a Promise (for async operations like database queries). GraphQL waits for a field's promise to settle before resolving that field's children, while sibling fields in a query are free to resolve concurrently.
const resolvers = {
Query: { user: async (_, { id }) => db.users.findByPk(id) },
User: { posts: async (user) => db.posts.findAll({ where: { authorId: user.id } }) },
Post: { comments: async (post) => db.comments.findAll({ where: { postId: post.id } }) },
}
This naive implementation triggers the N+1 problem immediately. Let us see why.
The N+1 problem is the most common performance pitfall in GraphQL. It happens because resolvers fire independently for each parent object.
Suppose we query 5 users and their posts. With naive resolvers like the ones above, the database sees:
- SELECT * FROM users LIMIT 5 (1 query to fetch all users)
So far so good: the resolver for Query.users returns all 5 users in one query. The problem is in the next level, where User.posts fires once per user:
- SELECT * FROM posts WHERE authorId = 1 (for user 1)
- SELECT * FROM posts WHERE authorId = 2 (for user 2)
- SELECT * FROM posts WHERE authorId = 3 (for user 3)
- SELECT * FROM posts WHERE authorId = 4 (for user 4)
- SELECT * FROM posts WHERE authorId = 5 (for user 5)
Six queries total: 1 for users + 5 for posts. As the number of users grows, the query count grows linearly. At 100 users, you get 101 queries.
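The arithmetic is easy to confirm with a counting stub. The fake database below is purely illustrative (`makeDb` and `countQueries` are made-up names); it only tallies how many times it is asked for data.

```javascript
// In-memory stand-in for a database that counts every query it receives.
function makeDb(userCount) {
  const users = Array.from({ length: userCount }, (_, i) => ({ id: i + 1 }))
  const db = {
    queries: 0,
    findUsers() { db.queries += 1; return users },
    findPostsByAuthor(authorId) { db.queries += 1; return [{ authorId, title: `Post by ${authorId}` }] },
  }
  return db
}

// Mimics Query.users followed by one User.posts call per returned user.
function countQueries(userCount) {
  const db = makeDb(userCount)
  db.findUsers().forEach((user) => db.findPostsByAuthor(user.id))
  return db.queries
}

console.log(countQueries(5)) // prints 6
console.log(countQueries(100)) // prints 101
```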
The fix: batch the post queries into a single WHERE authorId IN (...) query. But how do you know which author IDs you need before all users resolve? You need a deferred batching mechanism — collect all the author IDs from the users level, then issue one batched query at the posts level.
DataLoader is a utility that batches and caches individual database loads within a single request. It was created by Lee Byron (GraphQL co-author) and is the standard solution for N+1 in GraphQL.
The core idea is simple: instead of calling db.load(id) directly, you call loader.load(id). DataLoader collects all IDs loaded during a single tick of the event loop, then calls your batch function once with all of them.
import DataLoader from 'dataloader'
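const batchUsers = async (ids) => {
  const users = await db.users.findAll({ where: { id: ids } })
  // Index by stringified id so 1 and '1' match and lookups stay O(1)
  const byId = new Map(users.map((u) => [String(u.id), u]))
  // Must return exactly one result per key, in the same order as ids
  return ids.map((id) => byId.get(String(id)) ?? null)
}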
const userLoader = new DataLoader(batchUsers)
// Usage: loader.load(id) instead of db.users.findByPk(id)
For the N+1 scenario:
const resolvers = {
Query: {
users: async (_, args, context) => {
return context.userLoader.loadMany([1, 2, 3, 4, 5])
},
},
User: {
posts: async (user, _, context) => {
// Instead of querying per user, we batch all post loads
return context.postsByAuthorLoader.load(user.id)
},
},
}
The postsByAuthorLoader collects all author IDs from all User.posts resolvers and issues:
SELECT * FROM posts WHERE authorId IN (1, 2, 3, 4, 5)
One query instead of five. Total: 2 queries (1 for users, 1 for posts) instead of 6.
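To show the batching mechanism itself, here is a stripped-down stand-in for DataLoader. It is not the real library: `TinyLoader` is a hypothetical class with no caching, and it simply queues keys until the current tick ends and then calls the batch function once.

```javascript
// Minimal stand-in for DataLoader's batching. Keys requested in the same
// tick are queued, then handed to batchFn in a single call.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn
    this.queue = []
  }
  load(key) {
    return new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject })
      // The first key of a batch schedules the flush after the current tick.
      if (this.queue.length === 1) process.nextTick(() => this.flush())
    })
  }
  async flush() {
    const batch = this.queue
    this.queue = []
    try {
      const values = await this.batchFn(batch.map((item) => item.key))
      batch.forEach((item, i) => item.resolve(values[i]))
    } catch (err) {
      batch.forEach((item) => item.reject(err))
    }
  }
}

async function demo() {
  let queryCount = 0
  const posts = [
    { authorId: 1, title: 'Hello World' },
    { authorId: 2, title: 'GraphQL is great' },
    { authorId: 1, title: 'Second post' },
  ]
  // One "query" for all authors; results are regrouped in key order.
  const batchPostsByAuthor = async (authorIds) => {
    queryCount += 1
    const rows = posts.filter((p) => authorIds.includes(p.authorId))
    return authorIds.map((id) => rows.filter((p) => p.authorId === id))
  }
  const loader = new TinyLoader(batchPostsByAuthor)
  const perUser = await Promise.all([1, 2, 3].map((id) => loader.load(id)))
  return { queryCount, counts: perUser.map((list) => list.length) }
}

demo().then((result) => console.log(result))
// prints { queryCount: 1, counts: [ 2, 1, 0 ] }
```

Three load calls produce one batch call, and the author with no posts still gets an entry (an empty list) so that results line up with keys.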
DataLoader provides two more benefits:
Caching per request: If two fields need the same user (e.g., author on two different posts), DataLoader’s per-request cache returns the user from memory without a second database query.
// Comment.author and Post.author may both need the same user
// DataLoader's cache serves the second request from memory
const resolvers = {
Comment: {
author: (comment, _, context) => context.userLoader.load(comment.authorId),
},
Post: {
author: (post, _, context) => context.userLoader.load(post.authorId),
},
}
Priming: You can pre-populate the cache if you already have data, avoiding unnecessary database calls:
userLoader.prime('1', { id: '1', name: 'Alice' })
// Subsequent userLoader.load('1') returns this cached object
The context object is created fresh per request and passed to every resolver (shown here in the Apollo Server 3 style, where context is a constructor option):
const server = new ApolloServer({
typeDefs,
resolvers,
context: () => ({
userLoader: new DataLoader(batchUsers),
postsByAuthorLoader: new DataLoader(batchPostsByAuthor),
}),
})
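Three rules for effective DataLoader usage:
1. Create new loader instances for every request, so cached data never leaks between users
2. Return exactly one result per key from the batch function, in the same order as the keys
3. Call loader.load(id) from resolvers instead of querying the database directly, so every lookup can join a batch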
Queries and mutations use HTTP request-response — one-shot. Subscriptions use WebSockets for persistent, real-time communication. The client subscribes, and the server pushes data whenever an event occurs.
subscription OnPostCreated {
postCreated {
id
title
author { name }
}
}
The WebSocket connection follows the graphql-transport-ws protocol:
1. Client sends a connection_init message
2. Server replies with connection_ack
3. Client sends a subscribe message with the subscription query
4. Server sends next messages when events occur
5. Either side sends complete to end the subscription
On the server, subscriptions use a pub/sub pattern. An event source (database trigger, message queue, webhook) publishes events. The subscription resolver subscribes to the appropriate channel.
const { PubSub } = require('graphql-subscriptions')
const pubsub = new PubSub()
const resolvers = {
Subscription: {
postCreated: {
subscribe: () => pubsub.asyncIterator(['POST_CREATED']),
},
},
Mutation: {
createPost: async (_, { input }, context) => {
const post = await db.posts.create(input)
pubsub.publish('POST_CREATED', { postCreated: post })
return post
},
},
}
The subscribe function returns an AsyncIterator. When the createPost mutation publishes an event, the iterator yields the data, and GraphQL sends it over the WebSocket to all subscribed clients.
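To make the graphql-transport-ws message flow described earlier more concrete, the sketch below handles the messages for a single connection. The message `type` values come from the protocol; everything else (`createConnection`, the `send` callback, and the `startSubscription` hook) is hypothetical scaffolding, and a real server would close the socket with an error code instead of throwing.

```javascript
// Server-side handling of graphql-transport-ws messages for one socket.
// `send` stands in for socket.send; `startSubscription` is a hypothetical
// hook that would connect the query to a pub/sub iterator.
function createConnection(send, startSubscription) {
  let acknowledged = false
  const active = new Map() // subscription id -> stop function
  return function onMessage(raw) {
    const message = JSON.parse(raw)
    switch (message.type) {
      case 'connection_init':
        acknowledged = true
        send(JSON.stringify({ type: 'connection_ack' }))
        break
      case 'subscribe': {
        if (!acknowledged) throw new Error('subscribe before connection_init')
        const { id, payload } = message
        // Each event becomes a `next` message tagged with the same id.
        const stop = startSubscription(payload.query, (data) =>
          send(JSON.stringify({ id, type: 'next', payload: { data } })))
        active.set(id, stop)
        break
      }
      case 'complete': {
        const stop = active.get(message.id)
        if (stop) stop()
        active.delete(message.id)
        break
      }
    }
  }
}

const sent = []
let emit = null
const onMessage = createConnection(
  (msg) => sent.push(JSON.parse(msg)),
  (query, push) => { emit = push; return () => { emit = null } },
)
onMessage(JSON.stringify({ type: 'connection_init' }))
onMessage(JSON.stringify({ id: '1', type: 'subscribe', payload: { query: 'subscription { postCreated { id title } }' } }))
emit({ postCreated: { id: '42', title: 'Hello World' } })
onMessage(JSON.stringify({ id: '1', type: 'complete' }))
console.log(sent.map((m) => m.type))
// prints [ 'connection_ack', 'next' ]
```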
Production considerations for subscriptions:
| Concern | Solution |
|---|---|
| Connection loss | Client sends connection_init on reconnect, server replays missed events if tracked |
| Backpressure | Bound each client's event buffer, and drop or disconnect clients that cannot keep up |
| Auth | Validate auth token in onConnect callback, reject unauthorized connections |
| Rate limiting | Limit subscriptions per user, throttle event delivery |
| Scaling | Use external pub/sub (Redis, RabbitMQ) so any server can publish to all subscribers |
Real use case: A collaborative editor where DocumentSubscription pushes changes to all connected clients. When user A inserts text, the mutation publishes the delta, and the subscription pushes it to user B’s client in real time. The resolver chain: Mutation.updateDocument -> pubsub.publish('DOCUMENT_UPDATED', ...) -> Subscription.documentUpdated.asyncIterator -> WebSocket push.
As your API grows past a single team’s scope, you need to split the GraphQL schema across multiple services. Federation is Apollo’s approach to distributed GraphQL.
Each team owns a subgraph, an independent GraphQL service with its own schema and resolvers. A gateway (also called a router) composes all subgraphs into one unified schema, the supergraph, and routes queries to the appropriate services.
Think of it like a power grid: each power plant (subgraph) generates electricity independently. The grid infrastructure (gateway) routes electricity from the right plants to meet demand.
Here is a Users subgraph:
type User @key(fields: "id") {
id: ID!
name: String!
email: String!
}
type Query {
user(id: ID!): User
users: [User!]!
}
And a Posts subgraph:
type Post @key(fields: "id") {
id: ID!
title: String!
body: String!
author: User!
}
type User @key(fields: "id") {
id: ID!
posts: [Post!]!
}
extend type Query {
post(id: ID!): Post
posts: [Post!]!
}
The @key(fields: "id") directive tells the gateway that User is an entity that can be referenced across subgraphs. When the Posts service returns an author field, it includes only the id. The gateway resolves the rest of the User fields by querying the Users service.
# Query that spans both subgraphs
query {
user(id: "1") {
name # Users service
posts { # Posts service
title # Posts service
}
}
}
The gateway’s execution plan:
1. Send { user(id: "1") { __typename id name } } to Users service
2. Receive { id: "1", name: "Alice", __typename: "User" }
3. Note that posts is provided by Posts service, which needs the User’s id
4. Send { _entities(representations: [{ __typename: "User", id: "1" }]) { ... on User { posts { title } } } } to Posts service
Entity resolution is the heart of federation. Each subgraph that contributes fields to an entity implements a __resolveReference function, which turns a reference into an object that the subgraph's other resolvers can work with:
// Posts service
const resolvers = {
User: {
__resolveReference: (reference) => {
// reference = { __typename: "User", id: "1" }
return { id: reference.id }
},
posts: (user) => db.posts.findAll({ where: { authorId: user.id } }),
},
}
The gateway sends an _entities query with a list of representations (typed references). Each subgraph resolves the reference and returns the entity, then resolves the requested fields.
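A simplified picture of what happens inside the subgraph when that _entities query arrives is sketched below. The `resolveEntities` function and the flat list of requested field names are assumptions for illustration; a real subgraph library handles this through the normal GraphQL executor.

```javascript
// How a subgraph could answer the gateway's _entities query: each
// representation is routed by __typename to that type's
// __resolveReference, then the requested fields are resolved on the result.
const postsDb = [
  { id: 'p1', authorId: '1', title: 'Hello World' },
  { id: 'p2', authorId: '2', title: 'GraphQL is great' },
]
const resolvers = {
  User: {
    __resolveReference: (reference) => ({ id: reference.id }),
    posts: (user) => postsDb.filter((p) => p.authorId === user.id),
  },
}

function resolveEntities(representations, requestedFields) {
  return representations.map((representation) => {
    const typeResolvers = resolvers[representation.__typename]
    const entity = typeResolvers.__resolveReference(representation)
    const result = { __typename: representation.__typename }
    for (const field of requestedFields) {
      // Explicit resolver if one exists, otherwise read the property.
      result[field] = typeResolvers[field] ? typeResolvers[field](entity) : entity[field]
    }
    return result
  })
}

const entities = resolveEntities([{ __typename: 'User', id: '1' }], ['posts'])
console.log(JSON.stringify(entities))
// prints [{"__typename":"User","posts":[{"id":"p1","authorId":"1","title":"Hello World"}]}]
```

Because representations arrive as a list, a subgraph can resolve many entities in one round-trip, which pairs naturally with the batching approach from the DataLoader section.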
Federation directives summary:
| Directive | Purpose |
|---|---|
| @key(fields: "id") | Declares the key fields that identify an entity across subgraphs |
| @extends | Marks a type as an extension of a type defined in another subgraph |
| @external | Marks a field that is defined in another subgraph |
| @requires(fields: "authorId") | Declares that resolving a field needs the listed @external fields of the same entity |
| @provides(fields: "name") | Declares that a subgraph can provide a field normally owned elsewhere |
Production use cases:
Product catalog: Products service owns Product.name, Product.price. Inventory service owns Product.stockCount, Product.warehouseLocation. Reviews service owns Product.rating, Product.reviews. The gateway composes all three, and a single query fetches price, stock, and reviews.
Multi-team ownership: Team A owns Users. Team B owns Posts. Team C owns Analytics. Each team deploys independently, uses its own database, and sets its own scaling policies. The gateway handles cross-service resolution.
GraphQL’s flexibility comes with security risks that REST APIs do not face. A client can craft a single query that overwhelms your server by exploiting deep nesting, list multiplication, or expensive field resolution.
Depth limiting: A query like { user { posts { comments { author { posts { ... } } } } } } can nest arbitrarily deep. Limit query depth to prevent malicious nesting:
const depthLimit = require('graphql-depth-limit')
const server = new ApolloServer({
schema,
validationRules: [depthLimit(7)],
})
A depth limit of 7-10 stops pathological queries while allowing legitimate deeply-nested queries.
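The check itself is a short recursion. The sketch below measures depth on a selection set written as nested objects, which is an assumption made to avoid a parser dependency; a real validation rule walks the parsed query AST, and libraries may count levels slightly differently.

```javascript
// Depth of a selection set written as nested objects, where a leaf field
// maps to an empty object.
function selectionDepth(selection) {
  const children = Object.values(selection)
  if (children.length === 0) return 0
  return 1 + Math.max(...children.map(selectionDepth))
}

// Throws when the query is deeper than allowed, otherwise returns its depth.
function assertDepth(selection, maxDepth) {
  const depth = selectionDepth(selection)
  if (depth > maxDepth) {
    throw new Error(`Query depth ${depth} exceeds the limit of ${maxDepth}`)
  }
  return depth
}

const nested = { user: { posts: { comments: { author: { posts: { title: {} } } } } } }
console.log(selectionDepth(nested)) // prints 6
```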
Query cost analysis: Not all fields cost the same. A user(id: "1") field costs one database query. A search(term: String) field might scan millions of documents. Assign cost values to fields and reject queries that exceed the budget:
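const costAnalysis = require('graphql-cost-analysis')
const server = new ApolloServer({
  schema,
  validationRules: [
    costAnalysis({
      maximumCost: 1000,
      defaultCost: 1,
      // Option names follow the graphql-cost-analysis README; check them against the version you install
      costMap: {
        // search is expensive whatever its arguments, so it gets a flat (illustrative) cost
        Query: { search: { complexity: 50 } },
        // posts scales with its numeric limit argument
        User: { posts: { multipliers: ['limit'], useMultipliers: true, complexity: 1 } },
      },
    }),
  ],
})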
Costs are summed over the whole query: 10 root fields that each cost 1,000 total 10,000, which is over the budget, so the query is rejected before it executes.
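For intuition about how such a budget is computed, here is a hand-rolled cost function. It does not reproduce the algorithm of any particular library: the `costMap` format, the cost numbers, and the rule that a list multiplier scales everything beneath it are all assumptions of this sketch.

```javascript
// Per-field cost settings; the numbers are illustrative only.
const costMap = {
  'Query.search': { complexity: 50 },
  'Query.user': { complexity: 1 },
  'User.posts': { complexity: 1, multiplierArg: 'limit' },
}
const returnTypes = { 'Query.user': 'User', 'Query.search': 'SearchResult', 'User.posts': 'Post' }

// selections: [{ field, alias?, args?, selections? }]. Every entry is
// counted, so aliasing one field 100 times costs 100 times as much.
function queryCost(typeName, selections) {
  let total = 0
  for (const { field, args = {}, selections: children = [] } of selections) {
    const key = `${typeName}.${field}`
    const { complexity = 1, multiplierArg } = costMap[key] || {}
    const multiplier = multiplierArg && args[multiplierArg] > 0 ? args[multiplierArg] : 1
    // A list field multiplies its own cost and the cost of its children.
    total += multiplier * (complexity + queryCost(returnTypes[key], children))
  }
  return total
}

const userQuery = [{ field: 'user', selections: [
  { field: 'posts', args: { limit: 10 }, selections: [{ field: 'title' }] },
] }]
console.log(queryCost('Query', userQuery)) // prints 21

const aliased = Array.from({ length: 100 }, (_, i) => ({ field: 'search', alias: `a${i}` }))
console.log(queryCost('Query', aliased)) // prints 5000
```

Counting selections instead of schema fields is what lets the same budget catch queries that repeat one expensive field under many aliases.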
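Rate limiting: GraphQL rate limiting is harder than REST because all requests hit one endpoint. You cannot rate-limit by path. Instead, rate-limit by:
- Operation: since every request hits /graphql, use operationName as a discriminator
- Field: give expensive fields tighter limits (search gets 10 calls/min per user, user gets 1000 calls/min)
const { getGraphQLRateLimiter } = require('graphql-rate-limit')
// identifyContext picks the key to count against; context.user.id is an assumption about your context shape
const rateLimiter = getGraphQLRateLimiter({ identifyContext: (context) => context.user.id })
const resolvers = {
  Query: {
    search: async (parent, args, context, info) => {
      // Resolves to an error message when the limit is exceeded, otherwise undefined
      const errorMessage = await rateLimiter({ parent, args, context, info }, { window: '1m', max: 10 })
      if (errorMessage) throw new Error(errorMessage)
      return searchEngine.search(args.term)
    },
  },
}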
Alias-based attacks: A query can use aliases to request the same expensive field many times:
query {
  a: search(term: "x") { __typename }
  b: search(term: "y") { __typename }
  # ... 100 more aliases
}
The query still has one search field in the schema, but the same resolver fires 100+ times. Defend with cost analysis that counts each alias instance.
Batching attacks: Since GraphQL allows batching multiple queries in one request, a client could send 100 queries in one HTTP request:
[
{ "query": "{ user(id: \"1\") { name } }" },
{ "query": "{ user(id: \"2\") { name } }" },
// ...
]
Apollo Server 4 disables HTTP batching by default; it is turned on with the allowBatchedHttpRequests option. Enable batching only with a cap (e.g., max 5 queries per batch) and authentication so anonymous users cannot batch.
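A cap like that can be enforced before any operation executes. The check below is a framework-agnostic sketch: `checkBatch`, `MAX_BATCH_SIZE`, and the shape of the returned object are made up for this example, and `user` stands for whatever your authentication layer produces.

```javascript
// Reject oversized or anonymous batches before any operation executes.
const MAX_BATCH_SIZE = 5

function checkBatch(body, user) {
  // A single operation is not a batch and passes straight through.
  if (!Array.isArray(body)) return { ok: true, operations: [body] }
  if (!user) return { ok: false, status: 401, error: 'Batching requires authentication' }
  if (body.length > MAX_BATCH_SIZE) {
    return { ok: false, status: 400, error: `Batch of ${body.length} exceeds the limit of ${MAX_BATCH_SIZE}` }
  }
  return { ok: true, operations: body }
}

const hundred = Array.from({ length: 100 }, (_, i) => ({ query: `{ user(id: "${i}") { name } }` }))
console.log(checkBatch(hundred, { id: 'u1' }).error)
// prints Batch of 100 exceeds the limit of 5
```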
Security checklist:
| Risk | Mitigation |
|---|---|
| Deep nesting | Depth limiting (max 7-10 levels) |
| Expensive fields | Query cost analysis (budget per user) |
| Alias multiplication | Cost analysis counts alias instances |
| Batch abuse | Limit batch size, require auth for batching |
| Introspection leak | Disable introspection in production (or restrict to authenticated users) |
| Field suggestion leak | Disable field suggestions in production (they leak schema info on error) |
| DataLoader cache shared across requests | Always create DataLoader instances per request context |
Comparison: GraphQL vs REST Security
| Attack | REST Defense | GraphQL Additional Defense |
|---|---|---|
| DDoS | Rate limit per endpoint | Cost analysis + depth limiting |
| Data over-fetching | Server controls response shape | Client controls it (more risk) |
| Schema leak | Endpoints are visible | Introspection leaks full schema (disable in prod) |
| Batch abuse | One request = one response | One query = many operations (limit batch size) |
A production GraphQL server combines every concept we covered:
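const express = require('express')
const { ApolloServer } = require('@apollo/server')
const { expressMiddleware } = require('@apollo/server/express4')
const { ApolloGateway, IntrospectAndCompose } = require('@apollo/gateway')
const DataLoader = require('dataloader')
const depthLimit = require('graphql-depth-limit')
// authenticate, batchUsers, batchPosts, and batchComments are application code defined elsewhere
const gateway = new ApolloGateway({
  supergraphSdl: new IntrospectAndCompose({
    subgraphs: [
      { name: 'users', url: 'http://users-service:4001/graphql' },
      { name: 'posts', url: 'http://posts-service:4002/graphql' },
    ],
  }),
})
const server = new ApolloServer({
  gateway,
  validationRules: [depthLimit(8)],
})
async function main() {
  const app = express()
  // Apollo Server 4 must be started before its middleware is mounted
  await server.start()
  app.use('/graphql', express.json(), expressMiddleware(server, {
    context: async ({ req }) => ({
      user: await authenticate(req),
      // Fresh loaders per request. In a federated deployment the resolvers run in the
      // subgraphs, so each subgraph builds loaders like these in its own context.
      loaders: {
        user: new DataLoader(batchUsers),
        posts: new DataLoader(batchPosts),
        comments: new DataLoader(batchComments),
      },
    }),
  }))
  app.listen(4000) // port 4000 is an arbitrary choice
}
main()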
Building a GraphQL API requires thinking differently than building a REST API.
The same query language that makes frontend development so productive — one request, exactly the data you need, strongly typed — also demands discipline on the backend. Schema design, resolver architecture, batching strategy, and security hardening are not afterthoughts. They are the foundation.