REST vs gRPC vs GraphQL

Three dominant API paradigms used at FAANG — each optimized for a different problem. Knowing when to reach for each is a system design interview staple.

At a Glance

| | REST | gRPC | GraphQL |
|---|---|---|---|
| Transport | HTTP/1.1 or HTTP/2 | HTTP/2 (required) | HTTP/1.1 or HTTP/2 |
| Wire format | JSON (text) | Protocol Buffers (binary) | JSON (text) |
| Schema | OpenAPI (optional) | `.proto` (required) | SDL (required) |
| Payload size | Large (verbose JSON) | Small (binary, ~3–10× smaller) | Variable (client-specified) |
| Latency | Moderate | Low | Moderate to high (resolver fan-out) |
| Streaming | SSE / chunked | ✅ Native (4 patterns) | ✅ Subscriptions (WebSocket) |
| Caching | ✅ Easy (HTTP cache headers) | ❌ Hard (POST-only, no URL) | ❌ Hard (POST-only, dynamic queries) |
| Browser support | ✅ Native | ⚠️ Requires gRPC-Web proxy | ✅ Native |
| Type safety | Optional (OpenAPI codegen) | ✅ Enforced at compile time | ✅ Schema-validated at runtime |
| Versioning | URL path (`/v2/`) or header | Field addition (backward compat) | Schema evolution with `@deprecated` |

REST

REST (Representational State Transfer) maps operations to HTTP verbs on resource URLs. The server is stateless — no session state between requests.

HTTP verb semantics:

| Verb | Semantics | Idempotent | Safe |
|---|---|---|---|
| GET | Read | ✅ | ✅ |
| POST | Create / trigger | ❌ | ❌ |
| PUT | Replace (full update) | ✅ | ❌ |
| PATCH | Partial update | ❌ (unless designed so) | ❌ |
| DELETE | Delete | ✅ | ❌ |

Idempotency matters at scale — retrying an idempotent operation never changes the result, so clients, proxies, and retry middleware can repeat it freely. Retrying a POST (non-idempotent) may create duplicate records. Use idempotency keys (`Idempotency-Key: <uuid>`) for POST endpoints that must be retry-safe (payment APIs, order creation).
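A minimal in-memory sketch of the idempotency-key pattern (names are illustrative; production services persist keys in Redis or a database with a TTL so retries are safe across instances):

```go
package main

import (
	"fmt"
	"sync"
)

// IdempotentExecutor runs an operation at most once per client-supplied key
// and replays the stored result on retries.
type IdempotentExecutor struct {
	mu      sync.Mutex
	results map[string]string
}

func NewIdempotentExecutor() *IdempotentExecutor {
	return &IdempotentExecutor{results: map[string]string{}}
}

// Do executes op only the first time key is seen; later calls with the same
// key return the cached result without re-running the operation.
func (e *IdempotentExecutor) Do(key string, op func() string) string {
	e.mu.Lock()
	defer e.mu.Unlock()
	if res, ok := e.results[key]; ok {
		return res
	}
	res := op()
	e.results[key] = res
	return res
}

func main() {
	e := NewIdempotentExecutor()
	charges := 0
	charge := func() string { charges++; return fmt.Sprintf("charge_%d", charges) }
	fmt.Println(e.Do("key-123", charge)) // first call runs the operation
	fmt.Println(e.Do("key-123", charge)) // retry replays the stored result
	fmt.Println("charges:", charges)     // the charge ran exactly once
}
```

The key is chosen by the client (typically a UUID per logical operation), which is what makes a network-level retry of the same request indistinguishable from the original.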

HTTP caching is REST’s biggest advantage — GET responses are cacheable by default. Reverse proxies (Nginx, Varnish) and CDNs cache based on URL + headers automatically. ETag and Last-Modified enable conditional requests to avoid sending unchanged data.

Over-fetching and under-fetching: A single REST endpoint returns a fixed shape. A mobile client asking for a user’s name gets the full user object (over-fetch). Assembling a feed requires multiple round trips to /users, /posts, /comments (under-fetch). This is the problem GraphQL was designed to solve.

Versioning trap: URL versioning (/v1/, /v2/) duplicates controller logic and is hard to deprecate. Header versioning (Accept: application/vnd.api.v2+json) is cleaner but less visible. Additive changes (new optional fields) avoid versioning entirely — prefer this when possible.

gRPC

gRPC uses HTTP/2 as the transport and Protocol Buffers as the serialization format. The API contract lives in a .proto file; client and server code is generated from it.

Proto definition:

```protobuf
service OrderService {
  rpc GetOrder (GetOrderRequest) returns (Order);          // Unary
  rpc StreamOrders (StreamRequest) returns (stream Order); // Server streaming
  rpc UploadItems (stream Item) returns (UploadResult);    // Client streaming
  rpc Chat (stream Message) returns (stream Message);      // Bidirectional
}

message Order {
  string order_id = 1;
  int64  created_at = 2;
  repeated LineItem items = 3;
}
```

Four communication patterns:

| Pattern | Client sends | Server sends | Use case |
|---|---|---|---|
| Unary | 1 request | 1 response | Standard RPC call |
| Server streaming | 1 request | stream of responses | Live feed, large result set |
| Client streaming | stream of requests | 1 response | File upload, telemetry ingest |
| Bidirectional | stream | stream | Chat, collaborative editing |

Why binary matters at scale: A JSON payload of 1 KB becomes ~100–300 bytes in Protobuf. At 100k RPS, that’s 70–90 MB/s of bandwidth saved. More importantly, binary parsing is significantly faster than JSON — fewer CPU cycles per request.

Deadlines and cancellation: gRPC has first-class deadline propagation. A client sets a deadline; the server checks ctx.Done() and cancels in-progress work. Deadlines cascade through service calls — if the root request deadline expires, all downstream gRPC calls are cancelled. This prevents cascading slow-drain failures.

```go
ctx, cancel := context.WithTimeout(context.Background(), 500*time.Millisecond)
defer cancel()
resp, err := client.GetOrder(ctx, &pb.GetOrderRequest{OrderId: "123"})
```

Browser limitation: Browsers cannot speak HTTP/2 trailers, which gRPC requires. gRPC-Web solves this with a JavaScript client that communicates with an Envoy proxy (or Nginx module) that translates to native gRPC. This adds an extra hop and loses bidirectional streaming.

GraphQL

GraphQL exposes a single endpoint. The client sends a query specifying exactly which fields it needs. The server returns only those fields.

```graphql
# Client query — asks only for what it needs
query {
  user(id: "u_123") {
    name
    avatar
    recentOrders(limit: 3) {
      id
      total
      status
    }
  }
}
```

The server resolves each field via a resolver function. Fields can be resolved from different data sources (databases, microservices, caches).

N+1 problem: A naive resolver for recentOrders fires one DB query per user. Fetching a feed of 100 users triggers 1 (users) + 100 (orders) = 101 queries.

```text
users query        → 1 SQL query    → returns 100 users
orders resolver    → 100 SQL queries (one per user)   ← N+1
```

Fix: DataLoader batching. DataLoader collects resolver calls within a single event loop tick, batches them into one query, then distributes results.

```text
orders resolver    → batched: SELECT * FROM orders WHERE user_id IN (u1, u2, ..., u100)
                   → 1 SQL query regardless of the number of users
```
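The collect-then-batch behavior can be sketched in a few lines (hypothetical types; real DataLoader implementations also dedupe keys and cache per request):

```go
package main

import "fmt"

// Order is a minimal stand-in for a row in the orders table.
type Order struct{ UserID string }

// OrderLoader mimics DataLoader: resolvers queue keys with Load, and
// Dispatch resolves all of them with one batched query.
type OrderLoader struct {
	pending []string
	Queries int // counts simulated SQL round trips
}

// Load queues a key instead of querying immediately.
func (l *OrderLoader) Load(userID string) { l.pending = append(l.pending, userID) }

// Dispatch issues a single `... WHERE user_id IN (...)` query for every
// queued key and distributes the rows back by key.
func (l *OrderLoader) Dispatch() map[string][]Order {
	l.Queries++ // one query, however many users were queued
	out := make(map[string][]Order, len(l.pending))
	for _, id := range l.pending {
		out[id] = []Order{{UserID: id}} // fake rows standing in for DB results
	}
	l.pending = nil
	return out
}

func main() {
	loader := &OrderLoader{}
	for i := 1; i <= 100; i++ {
		loader.Load(fmt.Sprintf("u%d", i)) // each resolver queues its key
	}
	orders := loader.Dispatch()
	fmt.Println(len(orders), loader.Queries) // 100 users resolved with 1 query
}
```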

Caching is hard: REST uses URL-based HTTP caching naturally. GraphQL queries are POST requests — no URL to cache on. Solutions:

| Approach | How it works |
|---|---|
| Persisted queries | Client registers query hash; server stores query by hash. `GET /graphql?queryId=abc123` — now cacheable by CDN |
| Response cache (Apollo) | Cache full query responses by hash in Redis |
| CDN caching | Only works with persisted queries over GET; dynamic queries cannot be CDN-cached |
| Fragment caching | Cache individual resolver results, not full responses |
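The persisted-query approach reduces to a hash-to-query registry. A sketch (illustrative scheme; Apollo's automatic persisted queries use a SHA-256 hash similarly):

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// queryRegistry stores each pre-registered GraphQL query under its content
// hash, so clients send a short, CDN-cacheable GET (?queryId=<hash>)
// instead of POSTing the full query text.
type queryRegistry struct {
	byHash map[string]string
}

func newQueryRegistry() *queryRegistry {
	return &queryRegistry{byHash: map[string]string{}}
}

// Register stores the query at build/deploy time and returns the hash the
// client will ship.
func (r *queryRegistry) Register(query string) string {
	h := fmt.Sprintf("%x", sha256.Sum256([]byte(query)))
	r.byHash[h] = query
	return h
}

// Lookup resolves a client-supplied queryId back to the full query text;
// unknown hashes are rejected, which also blocks arbitrary dynamic queries.
func (r *queryRegistry) Lookup(hash string) (string, bool) {
	q, ok := r.byHash[hash]
	return q, ok
}

func main() {
	reg := newQueryRegistry()
	id := reg.Register(`query { user(id: "u_123") { name } }`)
	q, ok := reg.Lookup(id)
	fmt.Println(ok, q) // the GET handler executes the stored query
}
```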

Schema federation (Apollo Federation): Large orgs split the schema across teams. Each service owns its subgraph. The gateway stitches subgraphs into a unified schema at query time. Netflix, Shopify, and Twitter use this pattern to let product teams own their own GraphQL types independently.

⚠️ GraphQL introspection — the ability to query the schema itself — is useful in development but should be disabled in production for public APIs. It exposes your full data model and can be used to map attack surface.

Decision Guide

| Scenario | Recommendation | Reason |
|---|---|---|
| Public-facing API (third-party developers) | REST | Familiar, widely tooled, easy to cache, works in every HTTP client |
| Internal microservice communication | gRPC | Binary efficiency, compile-time contracts, deadline propagation, streaming |
| Mobile BFF (Backend for Frontend) | GraphQL | Mobile clients fetch exactly the fields they render — avoids over-fetch on metered connections |
| Multi-client (web, iOS, Android, TV) | GraphQL | One schema serves all clients; each client queries its own shape |
| High-throughput data pipeline | gRPC | Binary payload, streaming, low CPU overhead |
| Real-time data (live scores, collaborative) | gRPC bidirectional or WebSocket over REST | Native streaming patterns |
| Startup / small team | REST | Simplest to build, debug, and evolve; no codegen step |
| Service mesh (Istio, Linkerd) | gRPC | Sidecar proxies speak HTTP/2 natively; richer observability (per-RPC metrics) |
ℹ️ These are not mutually exclusive. A common pattern at FAANG: gRPC between internal microservices, REST for the public API gateway, and GraphQL as the BFF layer that aggregates internal gRPC calls into client-optimized responses.