Imagine you are writing a program that needs to call a function on another computer. You want it to feel like a local function call — you pass arguments, you get a return value, and you move on. That is Remote Procedure Call (RPC).
Before gRPC, there were older RPC systems: CORBA, Java RMI, XML-RPC, SOAP. Each had its own way of serializing data, describing services, and handling network communication. They tended to be complex or verbose on the wire, and some (such as Java RMI) were tied to a single language.
gRPC, open-sourced by Google in 2015, addresses this with a clean stack:
| Layer | Technology | Role |
|---|---|---|
| Interface Definition | Protocol Buffers (.proto files) | Define services and message shapes |
| Serialization | Protocol Buffers (binary wire format) | Compact, fast encoding of structured data |
| Transport | HTTP/2 | Multiplexed streams, headers, flow control |
| Code Generation | protoc + language plugin | Generate stubs and skeletons in any language |
gRPC generates client and server code from a .proto file. The client calls a local method, which serializes the arguments into protobuf bytes, sends them over HTTP/2 to the server, which deserializes them, calls the handler, and sends the response back.
When you define a service in a .proto file and run the protoc compiler, it generates two pieces of code:
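- Client stub: a local object that exposes the service's methods. You call stub.GetUser(request), and the stub handles serialization, framing, and network I/O.
- Server skeleton: a base class or interface that receives the request, deserializes it, and dispatches to the handler you implement.
service UserService {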
rpc GetUser (GetUserRequest) returns (User);
}
After code generation, the stub translates GetUser into an HTTP/2 request with path /UserService/GetUser. The server skeleton routes this to your handler. The network is abstracted away — you work with native objects on both sides.
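// Generated client stub (JavaScript, using @grpc/grpc-js)
const grpc = require('@grpc/grpc-js')
// Hypothetical paths to the protoc-generated modules
const { UserServiceClient } = require('./user_grpc_pb')
const { GetUserRequest } = require('./user_pb')
// The target is host:port, not a URL; createInsecure() means plaintext (development only)
const client = new UserServiceClient('api.example.com:50051', grpc.credentials.createInsecure())
const request = new GetUserRequest()
request.setId(42)
client.getUser(request, (error, response) => {
  if (error) return console.error(error.code, error.details)
  console.log(response.getName()) // "Alice"
})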
Protocol Buffers (protobuf) is both an IDL (Interface Definition Language) and a serialization mechanism. You write a schema in a .proto file, and protobuf compiles it into code for any supported language.
syntax = "proto3";
message User {
int32 id = 1;
string name = 2;
string email = 3;
bool is_active = 4;
repeated string roles = 5;
}
service UserService {
rpc GetUser (GetUserRequest) returns (User);
rpc ListUsers (ListUsersRequest) returns (stream User);
rpc CreateUser (stream CreateUserRequest) returns (User);
rpc Chat (stream ChatMessage) returns (stream ChatMessage);
}
Key protobuf features:
- Typed fields (int32, string, bool, enum, nested message)
- repeated: Lists of a type (the counterpart of arrays in JSON)
- stream: Marks the request or response side of an RPC as streaming
- oneof: At most one of several fields can be set at a time
- map<K,V>: Key-value pairs (like dictionaries)
Compared to JSON Schema or OpenAPI, protobuf is more compact and code-generation-first.
When protobuf serializes a message to bytes, it uses a compact binary format. Each field is encoded as a tag-value pair:
tag = (field_number << 3) | wire_type
There are only six wire types:
| Wire Type | Meaning | Used For |
|---|---|---|
| 0 | Varint | int32, int64, uint32, bool, enum |
| 1 | 64-bit | fixed64, sfixed64, double |
| 2 | Length-delimited | string, bytes, embedded messages, packed repeated fields |
| 3 | Start group | Deprecated (proto2 only) |
| 4 | End group | Deprecated (proto2 only) |
| 5 | 32-bit | fixed32, sfixed32, float |
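The tag formula and the wire-type table can be sketched in a few lines of JavaScript. The names `WIRE_TYPES`, `makeTag`, and `splitTag` are illustrative and not part of any protobuf library.

```javascript
// Wire type numbers from the table above.
const WIRE_TYPES = { VARINT: 0, I64: 1, LEN: 2, SGROUP: 3, EGROUP: 4, I32: 5 }

// Combine a field number and a wire type into a tag value.
function makeTag(fieldNumber, wireType) {
  return (fieldNumber << 3) | wireType
}

// Split a tag value back into its two parts.
function splitTag(tag) {
  return { fieldNumber: tag >>> 3, wireType: tag & 0x07 }
}

console.log(makeTag(1, WIRE_TYPES.VARINT)) // 8  (int32 id = 1)
console.log(makeTag(2, WIRE_TYPES.LEN))    // 18 (string name = 2)
console.log(splitTag(18))                  // { fieldNumber: 2, wireType: 2 }
```

The tag is itself written as a varint, so field numbers 1 to 15 fit in a single tag byte, while larger field numbers need two or more.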
Varint encoding packs integers into fewer bytes. The most significant bit (MSB) of each byte indicates whether more bytes follow. For values under 128, it is just one byte. For larger values, it takes as many bytes as needed.
For example, int32 id = 1 with value 150:
Binary: 10010110 00000001
-> Strip MSBs: 0010110 0000001
-> Reverse groups: 0000001 0010110
-> Value: 128 + 22 = 150
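The worked example above can be reproduced with a short sketch. `encodeVarint` and `decodeVarint` are hypothetical helper names, and the sketch assumes non-negative integers that fit in a JavaScript number after decoding.

```javascript
// Encode a non-negative integer as a protobuf varint (array of byte values).
function encodeVarint(value) {
  const bytes = []
  let v = BigInt(value)
  while (v >= 0x80n) {
    bytes.push(Number((v & 0x7fn) | 0x80n)) // low 7 bits, MSB set: more bytes follow
    v >>= 7n
  }
  bytes.push(Number(v)) // final byte, MSB clear
  return bytes
}

// Decode a varint starting at `offset`; returns the value and the next offset.
function decodeVarint(bytes, offset = 0) {
  let result = 0n
  let shift = 0n
  let pos = offset
  while (true) {
    if (pos >= bytes.length) throw new Error('truncated varint')
    const byte = bytes[pos++]
    result |= BigInt(byte & 0x7f) << shift // groups arrive least significant first
    if ((byte & 0x80) === 0) break
    shift += 7n
  }
  return { value: Number(result), next: pos }
}

console.log(encodeVarint(150).map((b) => b.toString(2).padStart(8, '0')).join(' '))
// 10010110 00000001
console.log(decodeVarint([0x96, 0x01]).value) // 150
```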
Protobuf messages are self-describing enough that tools can decode them without the schema (using field numbers and wire types), but field names are lost — that is why you need the .proto file to get meaningful output.
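To illustrate schema-less decoding, here is a minimal sketch that walks the tag-value pairs of a hand-encoded message. The helper names `readVarint` and `decodeRaw` and the sample bytes are assumptions made for this example, and group wire types are not handled.

```javascript
// Read one varint from `buf` at `pos`; small values only (fits in a JS number).
function readVarint(buf, pos) {
  let value = 0
  let shift = 0
  while (true) {
    const byte = buf[pos++]
    value += (byte & 0x7f) * 2 ** shift
    if ((byte & 0x80) === 0) return { value, pos }
    shift += 7
  }
}

// Walk the tag-value pairs of a message without knowing its schema.
function decodeRaw(buf) {
  const fields = []
  let pos = 0
  while (pos < buf.length) {
    const tag = readVarint(buf, pos)
    pos = tag.pos
    const fieldNumber = Math.floor(tag.value / 8)
    const wireType = tag.value % 8
    if (wireType === 0) {
      const v = readVarint(buf, pos)
      pos = v.pos
      fields.push({ fieldNumber, wireType, value: v.value })
    } else if (wireType === 2) {
      const len = readVarint(buf, pos)
      pos = len.pos
      fields.push({ fieldNumber, wireType, value: buf.subarray(pos, pos + len.value) })
      pos += len.value
    } else if (wireType === 1 || wireType === 5) {
      const size = wireType === 1 ? 8 : 4 // fixed-width 64-bit or 32-bit value
      fields.push({ fieldNumber, wireType, value: buf.subarray(pos, pos + size) })
      pos += size
    } else {
      throw new Error(`unsupported wire type ${wireType}`)
    }
  }
  return fields
}

// Hand-encoded sample: field 1 = 42 (varint), field 2 = "Alice" (length-delimited).
const sample = Buffer.from([0x08, 0x2a, 0x12, 0x05, 0x41, 0x6c, 0x69, 0x63, 0x65])
for (const f of decodeRaw(sample)) {
  const shown = Buffer.isBuffer(f.value) ? JSON.stringify(f.value.toString('utf8')) : f.value
  console.log(`field ${f.fieldNumber} (wire type ${f.wireType}): ${shown}`)
}
```

The decoder recovers field numbers and raw values, but it cannot tell whether a length-delimited field holds a string, raw bytes, or a nested message. That distinction lives in the .proto file.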
The superpower of protobuf is backward and forward compatibility. You can add fields, remove fields, and even change some field types, as long as you follow the rules:
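- Never change the number of an existing field, and give every new field a number that has not been used before.
- When you remove a field, never reuse its number; mark it as reserved.
- Change a field's type only between wire-compatible types: you can go from int32 to int64 or uint32 (same wire type), but not from int32 to string (different wire type).
message User {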
reserved 6, 10 to 15; // These field numbers can never be used
int32 id = 1;
string name = 2;
string email = 3;
bool is_active = 4;
repeated string roles = 5;
// New fields added later:
string phone = 16; // Safe: new number
Address address = 17; // Safe: new number
}
A client built with the old schema (without phone and address) can still deserialize a response from a new server: it simply skips the fields it does not know. A new client deserializing an old response gets "" for phone and an unset address (exposed as null or as an empty default Address, depending on the language).
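The two directions of compatibility can be modeled with a simplified sketch that works on already-decoded (field number, value) pairs rather than raw bytes. The schema tables, `parseWith`, and the sample phone number are all hypothetical.

```javascript
// Field tables for two schema versions (field number -> name and proto3 default).
const OLD_SCHEMA = {
  1: { name: 'id', default: 0 },
  2: { name: 'name', default: '' },
}
const NEW_SCHEMA = {
  ...OLD_SCHEMA,
  16: { name: 'phone', default: '' },
}

// Map raw (fieldNumber, value) pairs onto a schema, the way a parser would:
// known fields are assigned, unknown ones are set aside, missing ones keep defaults.
function parseWith(schema, rawFields) {
  const message = {}
  for (const spec of Object.values(schema)) message[spec.name] = spec.default
  const unknown = []
  for (const { fieldNumber, value } of rawFields) {
    const spec = schema[fieldNumber]
    if (spec) message[spec.name] = value
    else unknown.push(fieldNumber)
  }
  return { message, unknown }
}

const fromNewServer = [
  { fieldNumber: 1, value: 42 },
  { fieldNumber: 2, value: 'Alice' },
  { fieldNumber: 16, value: '+1-555-0100' },
]
const fromOldServer = fromNewServer.slice(0, 2)

console.log(parseWith(OLD_SCHEMA, fromNewServer)) // field 16 lands in `unknown`
console.log(parseWith(NEW_SCHEMA, fromOldServer)) // phone falls back to ''
```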
This is more predictable than schemaless JSON, where adding a field is technically easy but every client must decide for itself how to handle a missing field.
gRPC uses HTTP/2 as its transport. HTTP/2 provides features that HTTP/1.1 cannot:
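- Multiplexing: many concurrent streams share a single TCP connection.
- Flow control: each stream and the connection as a whole have receive windows, so a slow reader is not overwhelmed.
- Header compression (HPACK): repeated headers (such as content-type: application/grpc) are compressed to a few bytes.
The gRPC protocol maps onto HTTP/2 like this:
- Request headers: :method = POST, :path = /package.Service/Method, content-type = application/grpc
- Request and response messages: length-prefixed protobuf bytes in DATA frames
- Response headers and trailers: :status = 200 in the headers, grpc-status = 0 in the trailers
For streaming RPCs, multiple DATA frames flow in one or both directions on the same stream. The stream stays open until both sides signal completion.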
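The header mapping can be sketched as plain objects. The function names and the `users.v1` package are hypothetical, `:scheme` and `:authority` are omitted for brevity, and the `te: trailers` header is one the gRPC-over-HTTP/2 protocol expects even though it is not listed above.

```javascript
// Build the HTTP/2 request headers for a gRPC call (pseudo-headers included).
function grpcRequestHeaders(pkg, service, method) {
  const qualified = pkg ? `${pkg}.${service}` : service
  return {
    ':method': 'POST',
    ':path': `/${qualified}/${method}`,
    'content-type': 'application/grpc',
    te: 'trailers', // signals that the client can receive trailers
  }
}

// Interpret the end of a response: HTTP status from headers, gRPC status from trailers.
function isSuccess(responseHeaders, trailers) {
  return responseHeaders[':status'] === '200' && trailers['grpc-status'] === '0'
}

console.log(grpcRequestHeaders('users.v1', 'UserService', 'GetUser')[':path'])
// /users.v1.UserService/GetUser
console.log(isSuccess({ ':status': '200' }, { 'grpc-status': '5' })) // false
```

Note that a failed RPC usually still carries `:status = 200`; the outcome of the call is in the `grpc-status` trailer.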
Unary RPC is the simplest pattern: the client sends exactly one request message, and the server sends exactly one response message. It maps directly to how HTTP/1.1 works, but with HTTP/2 multiplexing and protobuf efficiency.
The key gRPC-unique behavior happens in the framing:
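- Each message is prefixed with a 5-byte header: a 1-byte compression flag followed by a 4-byte big-endian message length
- The prefixed message travels in one or more HTTP/2 DATA frames
- After the response, a grpc-status trailer is sent to signal success (0 = OK) or error (non-zero)
// Unary RPC call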
const request = { userId: 42 }
client.getUser(request, { deadline: Date.now() + 5000 }, (err, response) => {
if (err) {
console.error('gRPC error:', err.code, err.details)
return
}
console.log('User:', response.name)
})
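Under the hood, this triggers a short frame exchange on one HTTP/2 stream: HEADERS and DATA from the client, then HEADERS, DATA, and a trailing HEADERS frame carrying grpc-status from the server.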
Server-streaming RPC is one of the patterns that gRPC makes easy but REST struggles with. The client sends a single request, and the server sends a stream of responses over time.
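This is a good fit for results that arrive over time or are too large for a single message, such as the user listing below: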
// Server-streaming RPC call
const call = client.listUsers({ role: 'admin' })
call.on('data', (user) => {
console.log('Received user:', user.name)
})
call.on('end', () => {
console.log('All users received')
})
call.on('error', (err) => {
console.error('Stream error:', err)
})
The server writes multiple response messages on the same HTTP/2 stream. Each message has the standard 5-byte frame header. The stream stays open until the server sends a grpc-status trailer.
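The 5-byte prefix mentioned above consists of a 1-byte compression flag and a 4-byte big-endian length. A minimal sketch of framing and unframing, with hypothetical helper names and made-up two-byte payloads:

```javascript
// Prefix a serialized message with the gRPC 5-byte header:
// 1 byte compression flag + 4 bytes big-endian payload length.
function frameMessage(payload, compressed = false) {
  const header = Buffer.alloc(5)
  header.writeUInt8(compressed ? 1 : 0, 0)
  header.writeUInt32BE(payload.length, 1)
  return Buffer.concat([header, payload])
}

// Split a buffer holding several framed messages back into payloads.
function unframeMessages(buf) {
  const messages = []
  let pos = 0
  while (pos + 5 <= buf.length) {
    const length = buf.readUInt32BE(pos + 1)
    messages.push(buf.subarray(pos + 5, pos + 5 + length))
    pos += 5 + length
  }
  return messages
}

// Two messages on one stream, as in a server-streaming response.
const stream = Buffer.concat([
  frameMessage(Buffer.from([0x08, 0x01])),
  frameMessage(Buffer.from([0x08, 0x02])),
])
console.log(stream.toString('hex')) // 0000000002080100000000020802
console.log(unframeMessages(stream).length) // 2
```

Because each message carries its own length, the receiver can find message boundaries regardless of how the bytes are split across HTTP/2 DATA frames.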
Client-streaming RPC reverses the pattern: the client sends multiple messages, and the server sends a single response after receiving all of them.
This is useful for uploads and batch operations, such as creating several users in one call:
// Client-streaming RPC call
const call = client.createUser((error, response) => {
if (error) {
console.error('Upload failed:', error)
return
}
console.log('Batch created:', response.count, 'users')
})
call.write({ name: 'Alice', email: 'alice@x.com' })
call.write({ name: 'Bob', email: 'bob@x.com' })
call.write({ name: 'Carol', email: 'carol@x.com' })
call.end()
The client sends DATA frames for each message. The server accumulates them or processes them incrementally. When the client calls end(), the server sends its single response.
Bidirectional streaming allows both client and server to send messages independently on the same stream. Unlike server-streaming (where the server responds to a single request) or client-streaming (where the client sends before the server), the two directions are not ordered relative to each other; message order is preserved only within each direction.
Messages flow asynchronously. The client can send 5 messages, then the server sends 3. Or the client sends 1, the server sends 1, the client sends 2. Any pattern works.
This is the foundation for interactive, real-time exchanges such as chat:
// Bidirectional streaming RPC call
const call = client.chat()
call.on('data', (message) => {
console.log('Server says:', message.text)
})
call.write({ text: 'hello', userId: 1 })
call.write({ text: 'how are you?', userId: 1 })
// Meanwhile, the server can send messages whenever it wants
The key insight: bidirectional streaming in gRPC does NOT require the client and server to take turns. The stream is full-duplex on top of HTTP/2’s multiplexed framing.
Every gRPC call should have a deadline (client-side timeout). If the server does not respond within the deadline, the client cancels the request and receives a DEADLINE_EXCEEDED error.
// Set a deadline of 5 seconds from now
const deadline = new Date()
deadline.setSeconds(deadline.getSeconds() + 5)
client.getUser(request, { deadline }, (err, response) => {
if (err && err.code === grpc.status.DEADLINE_EXCEEDED) {
console.error('Request timed out')
}
})
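Without a deadline, a gRPC client could wait forever if the server hangs. Deadlines can also propagate through the call chain: if service A calls B with a 5-second deadline, and B calls C, the remaining time is passed on to C. Some gRPC implementations do this automatically through the request context, while in others you forward the deadline yourself. This is called deadline propagation, and it helps prevent cascading failures by stopping downstream work on requests whose caller has already given up.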
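// Server-side: check remaining time before doing expensive work
function getUser(call, callback) {
  const remaining = call.getDeadline() - Date.now()
  if (remaining < 500) {
    callback({ code: grpc.status.DEADLINE_EXCEEDED, details: 'Not enough time to process' })
    return
  }
  // Process normally
}
// UserServiceService is a hypothetical name for the generated service definition
server.addService(UserServiceService, { getUser })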
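The budget arithmetic behind deadline propagation can be sketched without any gRPC library. On the wire, the timeout travels in the `grpc-timeout` header as an integer of at most eight digits plus a unit. The function names and the timings below are assumptions for illustration.

```javascript
// Time left before an absolute deadline (ms since epoch), never negative.
function remainingMs(deadlineMs, nowMs = Date.now()) {
  return Math.max(0, deadlineMs - nowMs)
}

// Format a timeout for the grpc-timeout header: an integer plus a unit
// ("m" = milliseconds, "S" = seconds). Values must fit in 8 digits.
function toGrpcTimeout(ms) {
  const whole = Math.ceil(ms)
  return whole < 1e8 ? `${whole}m` : `${Math.ceil(whole / 1000)}S`
}

// Service A received a call with a 5 s budget at t = 0 and spent 1.2 s
// before calling B; B spends another 0.8 s before calling C.
const deadline = 0 + 5000
const budgetForB = remainingMs(deadline, 1200)
const budgetForC = remainingMs(deadline, 2000)
console.log(toGrpcTimeout(budgetForB)) // 3800m
console.log(toGrpcTimeout(budgetForC)) // 3000m
```

Each hop forwards only what is left of the original budget, so no downstream service keeps working after the original caller has timed out.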
Interceptors are the gRPC equivalent of middleware in web frameworks. They wrap every RPC call with cross-cutting behavior, without modifying the business logic.
Client interceptors run on the client side, wrapping outgoing calls.
Server interceptors run on the server side, wrapping incoming calls, as in this logging example:
// Server-side interceptor (pseudocode)
function loggingInterceptor(ctx, next) {
const start = Date.now()
console.log(`[gRPC] -> ${ctx.method}`)
return next(ctx).then(response => {
const duration = Date.now() - start
console.log(`[gRPC] <- ${ctx.method} (${duration}ms)`)
return response
})
}
Interceptors compose like a chain: the first interceptor wraps the next, which wraps the next, until the actual handler runs. The response flows back through the chain in reverse order. This pattern is called the middleware chain or pipeline pattern.
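The chaining behavior can be demonstrated with a small composition helper that uses the same `(ctx, next)` shape as the pseudocode. This is a generic middleware sketch, not the interceptor API of any particular gRPC library, and the names `chain` and `named` are hypothetical.

```javascript
// Compose interceptors so the first one is the outermost wrapper.
// Each interceptor has the shape (ctx, next) => Promise.
function chain(interceptors, handler) {
  return interceptors.reduceRight(
    (next, interceptor) => (ctx) => interceptor(ctx, next),
    handler
  )
}

// Records the order in which interceptors and the handler run.
const trace = []
const named = (name) => async (ctx, next) => {
  trace.push(`${name} in`)
  const response = await next(ctx)
  trace.push(`${name} out`)
  return response
}

const handler = async (ctx) => {
  trace.push('handler')
  return { ok: true, method: ctx.method }
}

const call = chain([named('auth'), named('logging')], handler)
const done = call({ method: '/UserService/GetUser' }).then((response) => {
  console.log(trace.join(' -> '))
  // auth in -> logging in -> handler -> logging out -> auth out
  return response
})
```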
gRPC and REST solve the same problem — client-server communication — but with radically different trade-offs.
| Dimension | gRPC | REST |
|---|---|---|
| Serialization | Binary (protobuf) | Text (JSON/XML) |
| Schema | Required (.proto file) | Optional (OpenAPI is separate) |
| Streaming | Native (server, client, bidirectional) | Polling or SSE (workarounds) |
| Browser support | gRPC-Web (limited) | Native (fetch, XMLHttpRequest) |
| Human readability | No (binary) | Yes (JSON) |
| Caching | No (POST only) | Yes (GET caching) |
| Code generation | Built-in | External tools (OpenAPI Generator) |
| Performance | Typically faster serialization, smaller payloads | Slower serialization, larger payloads |
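The payload-size difference can be estimated for a single small message. This sketch computes the protobuf size by hand from the wire-format rules and compares it with the JSON encoding of the same data. The helper names are hypothetical, and the ratio applies only to this sample, not to workloads in general.

```javascript
// Size in bytes of a varint holding a non-negative integer.
function varintSize(n) {
  let size = 1
  while (n >= 128) {
    n = Math.floor(n / 128)
    size++
  }
  return size
}

// Encoded size of a varint field: tag (wire type 0) + value.
const intFieldSize = (fieldNumber, value) =>
  varintSize(fieldNumber * 8) + varintSize(value)

// Encoded size of a string field: tag (wire type 2) + length + UTF-8 bytes.
const stringFieldSize = (fieldNumber, text) => {
  const bytes = Buffer.byteLength(text, 'utf8')
  return varintSize(fieldNumber * 8 + 2) + varintSize(bytes) + bytes
}

const user = { id: 42, name: 'Alice', email: 'alice@example.com' }
const protobufBytes =
  intFieldSize(1, user.id) + stringFieldSize(2, user.name) + stringFieldSize(3, user.email)
const jsonBytes = Buffer.byteLength(JSON.stringify(user), 'utf8')

console.log({ protobufBytes, jsonBytes }) // { protobufBytes: 28, jsonBytes: 52 }
```

Most of the saving here comes from replacing field names with one-byte tags; messages dominated by long strings shrink much less.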
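gRPC excels in two scenarios: service-to-service calls where compact payloads and generated, typed clients matter, and workloads that need native streaming.
REST still wins for browser-facing and public APIs, human-readable payloads, and responses that benefit from HTTP caching.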
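Browsers cannot send raw HTTP/2 frames or use the gRPC trailers mechanism. gRPC-Web bridges this gap with a browser-friendly variant of the protocol, which a proxy such as Envoy translates to standard gRPC: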
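// gRPC-Web client (browser)
// Hypothetical paths to the modules generated by protoc-gen-grpc-web
import { UserServiceClient } from './user_grpc_web_pb'
import { GetUserRequest } from './user_pb'
const client = new UserServiceClient('https://api.example.com')
const request = new GetUserRequest()
request.setId(42)
// The second argument is call metadata (custom headers)
client.getUser(request, {}, (err, response) => {
  // Same callback shape as the standard gRPC client
})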
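gRPC-Web has limitations: it supports unary and server-streaming calls but not client-streaming or bidirectional streaming, and it usually needs a translating proxy in front of the server.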
For internal browser applications that need streaming, WebSocket or SSE are often better choices than gRPC-Web.
Load balancing gRPC is different from HTTP load balancing because gRPC connections are long-lived (HTTP/2 persistent connections) and do not use the typical request-per-connection model.
Client-side load balancing: The gRPC client maintains a list of server addresses and distributes calls across them. Built-in strategies include pick_first (the default) and round_robin:
// Client-side load balancing with round robin
const client = new UserServiceClient('dns:///api.example.com:50051',
grpc.credentials.createInsecure(),
{ 'grpc.lb_policy_name': 'round_robin' }
)
Proxy-based load balancing: An L7 proxy (Envoy, Linkerd, NGINX) terminates the gRPC connection and distributes individual RPCs to backend servers. This is simpler and works with any language, but adds latency.
The key challenge: gRPC clients open a single HTTP/2 connection and multiplex many RPCs over it. If you do round-robin at the TCP level (L4), all RPCs go to the same server (the one connected). You must load-balance at the RPC level, not the connection level.
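A small simulation makes the difference between connection-level and RPC-level balancing concrete. The backend names and the `simulate` function are hypothetical; the sketch only counts where RPCs would land and does no networking.

```javascript
// Count how many RPCs each backend receives under two balancing modes:
// 'connection' picks a backend once per client connection (L4 behavior),
// 'rpc' picks a backend again for every call (L7 or client-side behavior).
function simulate(backends, clients, rpcsPerClient, mode) {
  const load = Object.fromEntries(backends.map((b) => [b, 0]))
  let next = 0
  const pick = () => backends[next++ % backends.length] // round robin
  for (let c = 0; c < clients; c++) {
    const pinned = mode === 'connection' ? pick() : null
    for (let r = 0; r < rpcsPerClient; r++) {
      const target = mode === 'connection' ? pinned : pick()
      load[target] += 1
    }
  }
  return load
}

const backends = ['a', 'b', 'c']
console.log(simulate(backends, 1, 300, 'connection')) // { a: 300, b: 0, c: 0 }
console.log(simulate(backends, 1, 300, 'rpc'))        // { a: 100, b: 100, c: 100 }
```

With a single long-lived connection, connection-level round robin sends every RPC to one backend, while RPC-level round robin spreads the same traffic evenly.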
gRPC reflection allows clients to discover services and methods at runtime without the .proto file. This is essential for tools like grpcurl, grpc_cli, and debugging consoles.
# grpcurl with reflection
grpcurl -plaintext localhost:50051 list
# Output:
# grpc.health.v1.Health
# UserService
grpcurl -plaintext localhost:50051 describe UserService.GetUser
# Output:
# UserService.GetUser is a unary RPC
# Input: GetUserRequest
# Output: User
grpcurl -plaintext -d '{"id": 42}' localhost:50051 UserService.GetUser
# Output:
# {
# "id": 42,
# "name": "Alice",
# "email": "alice@example.com"
# }
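Enable reflection by registering the reflection service in your server code. The service is defined in grpc/reflection/v1/reflection.proto, and most gRPC implementations ship a ready-made helper for it; importing the proto file alone is not enough.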
gRPC health checking uses a standard protocol (defined in grpc.health.v1.Health) to report service health. Kubernetes, Envoy, and other infrastructure components use this to determine if a service is ready to receive traffic.
grpcurl -plaintext localhost:50051 grpc.health.v1.Health/Check
# Output:
# {
# "status": "SERVING"
# }
A service reports one of these statuses:
- SERVING: Ready to handle requests
- NOT_SERVING: Alive but not accepting requests (e.g., warming up, draining)
- SERVICE_UNKNOWN: The requested service name is not registered with the health service (returned by the streaming Watch method)
Health checking and reflection together form the operational foundation for running gRPC services at scale.
gRPC is a modern, high-performance RPC framework that combines Protocol Buffers for serialization with HTTP/2 for transport. Its four call patterns (unary, server-streaming, client-streaming, bidirectional) cover most communication patterns a distributed system needs, from simple request-reply to full-duplex real-time messaging.
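The key takeaways:
- A .proto file is the single contract: it defines the messages and services, and protoc generates the client and server code.
- The binary wire format is compact, and field numbers make schemas safe to evolve.
- HTTP/2 multiplexing means load balancing has to happen per RPC, not per connection.
- Deadlines and interceptors handle timeouts and cross-cutting concerns outside the business logic.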
When you need to move data between services efficiently, with strong contracts and native streaming, gRPC is the tool that gets out of your way and lets you focus on what matters: the business logic.