Sammy Henningsson

September 17, 2021

Another REST vs GraphQL comparison

Why yet another article comparing REST and GraphQL when there are already thousands of articles describing this topic? Plus they all pretty much agree that GraphQL is in and REST is out.


TL;DR All systems have different requirements, so as always: it depends. YMMV.
To compare REST and GraphQL fairly, it helps to have some background on what they are as well as their strengths and weaknesses. Saying that one of them is always better than the other is rather pointless.
GraphQL is extremely flexible and efficient when it comes to minimizing round trips, at the expense of HTTP caching and of where the business logic lives. REST plays well with HTTP caching and keeps business logic on the server side, but is usually more chatty than GraphQL.
 

A quick network background

Both REST and GraphQL use the same network stack, meaning HTTP, TCP etc. For every new connection that a client initiates, it has to perform a TCP handshake and a TLS handshake (and possibly also a DNS lookup). Performing a request on a new connection therefore requires at minimum 3 round trip times (RTTs): one for the TCP handshake, one for the TLS handshake and one for the request itself. Each RTT adds response time, and the only way to improve response time is to keep the RTTs as few as possible.
Also noteworthy is that with an initial congestion window of 10 segments (the number of TCP packets that may be sent before an ACK is returned), the server (or client) can send 10 x 1460 bytes in one go, i.e. a bit more than 14 kB. This is what we push onto the wire, so when using compression the actual data size is more likely to be somewhere around 100 kB. The number 1460 comes from the MTU, the largest payload an Ethernet frame can carry, which is 1500 bytes, minus the size of the IP header and the TCP header (typically 20 bytes each).
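As a rough check of the arithmetic above, here is a minimal Python sketch. It assumes IPv4 and a TCP header without options (20 bytes each); the constant and function names are made up for illustration.

```python
# Sketch: payload bytes that fit in the first flight of a new TCP connection.
# Assumes IPv4 (20-byte header) and a TCP header without options (20 bytes).
MTU = 1500        # bytes, typical Ethernet MTU
IP_HEADER = 20    # bytes, IPv4 without options (assumption)
TCP_HEADER = 20   # bytes, TCP without options (assumption)


def initial_flight_bytes(initial_window_segments: int = 10) -> int:
    """Return the payload bytes that can be sent before the first ACK."""
    mss = MTU - IP_HEADER - TCP_HEADER  # maximum segment size: 1460 bytes
    return initial_window_segments * mss


print(initial_flight_bytes())  # 14600
```

With header options (e.g. TCP timestamps) the segment size shrinks a little, so the real number is usually slightly lower.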
Previously with HTTP/1.1, fetching multiple resources had to be done either sequentially or by opening new connections at the cost of more handshakes. Nowadays with HTTP/2, we can perform parallel requests on a single connection (without the cost of new handshakes). And with HTTP/3 around the corner, the connection setup is improved further and a new connection may then be created with only 2 RTTs.


GraphQL

GraphQL was invented by Facebook and released in 2015. It allows clients to choose exactly what they want using a single request/response. Here is an example query from graphql.org:

{
  hero {
    name
    friends {
      name
      homeWorld {
        name
        climate
      }
      species {
        name
        lifespan
        origin {
          name
        }
      }
    }
  }
}

This lets clients fetch a hero, the hero's friends and, for each friend, their home world and their species, plus some other attributes. All this is done in a single request/response cycle, i.e. 3 RTTs.
If this happens to be something that should always be grouped together, then the REST equivalent is an easy fix: just expose all this as a resource. However, if different clients want different data, things start to get more complicated.
Facebook has hundreds of API clients and each of them has different needs. Many (probably most) of their API clients are developed by third parties. So it's impractical, to say the least, for Facebook to have full knowledge of the specific needs of all those clients. In this case it's great that each client can be in charge of what data to get.


REST

REST, which is an abbreviation of Representational State Transfer, was coined by Roy Fielding in his doctoral dissertation. The dissertation describes the architecture of the web, and the result is REST, which defines the constraints that make up this architecture. These constraints cover caching and hypermedia, which are two things that many developers forget about when designing REST (or RESTful) APIs. This is so misunderstood that some people started calling REST APIs "Hypermedia APIs" to distinguish them from APIs not using hypermedia.
Hypermedia simply means that we need to have links. This means that we must use a format that supports links, e.g. application/hal+json or application/vnd.siren+json. (The most common format, application/json, has no support for links and won't do.)
Links allow clients to go from one state to another, such as from one resource to another resource, or to modify (update/delete) a resource. A benefit of having links is that clients do not have to know how to construct them, and the server can change them when needed. But more importantly, links allow the server to let clients know which transitions are available.
Taking the example with the hero and friends from above, this would typically be done with multiple requests using REST.
Client requests a hero:

GET /hero

{
  "name": "Luke Skywalker",
  "_links": {
    "self": { "href": "https://rest.example.api/hero" },
    "canonical": { "href": "https://rest.example.api/luke" },
    "friends": { "href": "https://rest.example.api/luke/friends" }
  }
}

Then the client gets the friends:

GET /luke/friends

[
  {
    "name": "Han Solo",
    "_links": {
      "self": { "href": "https://rest.example.api/han" },
      "homeworld": { "href": "https://rest.example.api/homeworld/7" },
      "species": { "href": "https://rest.example.api/han/species" }
    }
  },
  {
    "name": "Leia Organa",
    "_links": {
      "self": { "href": "https://rest.example.api/leia" },
      "homeworld": { "href": "https://rest.example.api/homeworld/3" },
      "species": { "href": "https://rest.example.api/leia/species" }
    }
  }
]

Then for each friend we get their home world and their species. Note that back when GraphQL was designed, which was pre-HTTP/2, each of these requests had to be done sequentially or in parallel on new connections, requiring new TCP and TLS handshakes. But now that we have HTTP/2 we can make requests in parallel on the same connection. Counting the round trips for this flow, we get 3 (the hero, on a new connection) + 1 (the friends) + 1 (all home worlds and species in parallel) = 5 RTTs. This is of course less performant than the 3 RTTs with GraphQL, but it's probably a lot better than what people might think. (HTTP/1.1 requires way more RTTs, especially in the more realistic case where Luke has a lot more than just two friends.)
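The round trip counting above can be sketched in a few lines of Python. This is a simplified model, not a measurement: it assumes 2 RTTs of handshakes per new connection, that all requests in one step run in parallel on a single HTTP/2 connection, and that the HTTP/1.1 case uses one connection sequentially. The function names are made up for illustration.

```python
# Sketch: count round trips for a chain of dependent fetches.
# Assumes a new connection costs 2 RTTs (TCP + TLS handshake).
HANDSHAKE_RTTS = 2


def rtts_http2(dependent_steps: int) -> int:
    """Each step waits for the previous one, so every step costs one RTT."""
    return HANDSHAKE_RTTS + dependent_steps


def rtts_http1_sequential(requests_per_step: list[int]) -> int:
    """HTTP/1.1 on a single connection: every request costs its own RTT."""
    return HANDSHAKE_RTTS + sum(requests_per_step)


graphql = rtts_http2(1)                         # one query
rest_http2 = rtts_http2(3)                      # hero -> friends -> details
rest_http1 = rtts_http1_sequential([1, 1, 4])   # 2 friends x (home world + species)
print(graphql, rest_http2, rest_http1)          # 3 5 8
```

The HTTP/2 number stays at 5 no matter how many friends Luke has, while the sequential HTTP/1.1 number grows with every friend.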

Why would anyone settle for 5 RTTs when we can get away with 3 RTTs?
There are two advantages with this solution.
- The first is caching. If these resources are not frequently changed we can leverage HTTP caching. Assuming the resources are stored in the cache, all these requests could be served directly from a proxy cache and we don't even need to touch the API server.
- The second reason is that the server is controlling the business logic. Say that clients of this API are only allowed to see the home world of characters if they have liked those characters and that clients are only allowed to see the species of a character if they have liked a friend of that character. (My apologies for not coming up with a better real-world example.) That might have resulted in a response like this:

GET /luke/friends

[
  {
    "name": "Han Solo",
    "_links": {
      "self": { "href": "https://rest.example.api/han" },
      "homeworld": { "href": "https://rest.example.api/homeworld/7" }
    }
  },
  {
    "name": "Leia Organa",
    "_links": {
      "self": { "href": "https://rest.example.api/leia" },
      "species": { "href": "https://rest.example.api/leia/species" }
    }
  }
]

IMHO this is the biggest benefit of using hypermedia. Now clients would know that they cannot fetch the species of Han Solo nor the home world of Leia. This means that the business logic can be kept in one place (in the server) and clients only need to check for which links are present.
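To make the "clients only need to check which links are present" point concrete, here is a minimal Python sketch. The function name is hypothetical, and the dict simply mirrors the HAL-style "_links" object for Han Solo from the response above.

```python
# Sketch: a client that only checks which links the server included.
# It contains no permission rules of its own.
def can_follow(resource: dict, rel: str) -> bool:
    """Return True if the server exposed a link with this relation."""
    return rel in resource.get("_links", {})


han = {
    "name": "Han Solo",
    "_links": {
        "self": {"href": "https://rest.example.api/han"},
        "homeworld": {"href": "https://rest.example.api/homeworld/7"},
    },
}

print(can_follow(han, "homeworld"))  # True
print(can_follow(han, "species"))    # False
```

If the server later changes the rule for when "species" is exposed, this client code stays the same.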
In GraphQL we would need to put this logic in all clients instead (as well as in the server). Keeping all this duplicated code in sync is cumbersome, and the rules become much more difficult to change. Changing them would typically involve publishing new app releases in the App Store and on the Play Store and waiting for users to update their apps. With REST we can simply update the backend with new rules for when different links should be present, and all clients start getting the new behavior.
Note: this does complicate caching a fair bit, since "Client A" and "Client B" might not have the same privileges and should thus see different views of the same resources. I would say that removing the business logic duplication is such an important design decision that it's more important to keep business logic away from clients than to be able to store these resources in a shared cache. (However, I think there are ways around it, e.g. by putting the authentication/authorization in front of the proxy cache, adding role-based HTTP headers to the requests and having the server list those headers in the Vary header. Though I haven't tried this nor seen anyone use that pattern. And not to forget, standard client-side HTTP caching is still available and useful.)
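As a rough illustration of the Vary idea, the Python sketch below shows how a shared cache could build its cache key from the URL plus every request header named in Vary. It is an untested pattern, as noted above; "X-Role" is a hypothetical header that an auth layer in front of the cache would add, and the function is made up for illustration.

```python
# Sketch: a cache key that includes the request headers listed in Vary.
# "X-Role" is a hypothetical header set by an auth layer in front of the cache.
def cache_key(method: str, url: str, request_headers: dict, vary: str) -> tuple:
    """Build a key from method, URL and the values of all headers named in Vary."""
    lowered = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(h.strip().lower() for h in vary.split(",") if h.strip())
    varied = tuple((name, lowered.get(name, "")) for name in names)
    return (method, url, varied)


url = "https://rest.example.api/luke/friends"
key_a = cache_key("GET", url, {"X-Role": "liked-han"}, vary="X-Role")
key_b = cache_key("GET", url, {"X-Role": "liked-leia"}, vary="X-Role")
key_c = cache_key("GET", url, {"x-role": "liked-han"}, vary="X-Role")

print(key_a == key_b)  # False: different roles get separate cache entries
print(key_a == key_c)  # True: the same role shares one entry
```

The more distinct role values there are, the more the cache fragments, so this only pays off when many clients share the same role.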


Overfetching and Underfetching

Two things that people tend to highlight about GraphQL are that it fixes the problems of overfetching and underfetching.
Imagine that we are creating an API that exposes invoices, where each invoice has a bunch of attributes, e.g. amount, tax, invoice date, due date, paid status etc. This API will have three clients where client A only cares about the invoicing amount and tax, client B only cares about due date and paid status and client C wants to know about amount, invoice date, due date and paid status.
If we build a REST API then that would typically mean that all clients would always fetch the full resource even though many of the attributes are ignored. This is known as overfetching.
If we instead build a GraphQL API, then each client can decide for itself which fields it wants, and the response will only include those fields.
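The difference can be sketched in Python with the three clients from the invoice example. The field names and values are invented for illustration, and select_fields only imitates what a GraphQL field selection does; it is not how a GraphQL server is implemented.

```python
# Sketch: the three invoice clients, each selecting only the fields it needs.
# Field names and values are invented for illustration.
FULL_INVOICE = {
    "amount": 100.0,
    "tax": 25.0,
    "invoice_date": "2021-09-01",
    "due_date": "2021-09-30",
    "paid": False,
}


def select_fields(resource: dict, fields: list[str]) -> dict:
    """Return only the requested fields, roughly like a GraphQL selection."""
    return {name: resource[name] for name in fields}


client_a = select_fields(FULL_INVOICE, ["amount", "tax"])
client_b = select_fields(FULL_INVOICE, ["due_date", "paid"])
client_c = select_fields(FULL_INVOICE, ["amount", "invoice_date", "due_date", "paid"])

print(client_a)                           # {'amount': 100.0, 'tax': 25.0}
print(len(FULL_INVOICE) - len(client_b))  # 3 fields a REST client B would ignore
```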

It clearly sounds better to only fetch what's needed, and often the analysis stops here. But we also need to think about what the "problem" of overfetching entails. As we saw earlier, we can send up to ~100 kB of data before needing extra RTTs. This is plenty when it comes to sending some JSON. And pushing some extra bytes down the pipe will hardly make much difference now that we normally have connections that can push down millions of bytes per second.
However, using "full" resources means that we can cache and reuse them. So if client A fetches an invoice and shortly after client B or client C fetches the same invoice, then the request can be served directly from an HTTP cache, which means the API server doesn't need to waste any CPU cycles on those requests.
Of course not all APIs have resources that can be cached, but most systems tend to be heavier on reads compared to writes and normally profit from caching.
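A toy Python model of that reuse is sketched below. The CachingProxy class, the origin function and the invoice URL are all hypothetical, and the sketch ignores expiry and Cache-Control entirely; it only shows that three clients asking for the same full resource cause a single request to the API server.

```python
# Sketch: a shared cache in front of an origin server, counting origin hits.
# Ignores expiry and Cache-Control; the names and the URL are hypothetical.
class CachingProxy:
    def __init__(self, origin):
        self._origin = origin       # callable: url -> response body
        self._store = {}            # url -> cached response body
        self.origin_requests = 0    # how often the API server was hit

    def get(self, url: str) -> dict:
        if url not in self._store:
            self.origin_requests += 1
            self._store[url] = self._origin(url)
        return self._store[url]


def origin(url: str) -> dict:
    # Stand-in for the API server; returns the full invoice resource.
    return {"url": url, "amount": 100.0, "tax": 25.0, "paid": False}


proxy = CachingProxy(origin)
for _client in ("A", "B", "C"):
    proxy.get("https://rest.example.api/invoices/1")

print(proxy.origin_requests)  # 1
```

If each client instead requested its own field selection, every response would differ and a URL-keyed cache like this would not help.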

Now let's have a look at the second area, which is underfetching. Continuing with our API for invoices, now imagine that each invoice is connected to a customer and a shopping cart. In turn, each shopping cart has a list of items.
The usual REST approach to this is to create a few new types of resources, e.g. customer, shopping cart and item. The invoice would then have links to the customer and the shopping cart, which in turn would have links to each item that it contains. This could mean that clients would need to make one request for the invoice, one request for the customer, one request for the shopping cart and N requests for the items in the shopping cart.
Compare this to GraphQL where clients can fetch all the data they want in one request.
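The request count for the REST case can be sketched in Python by following links through a small in-memory stand-in for the API. The paths and the "links" structure are invented for illustration, with a cart of N = 3 items.

```python
# Sketch: count the requests needed to follow all links from an invoice.
# The in-memory "API" and its paths are invented for illustration.
API = {
    "/invoices/1": {"links": ["/customers/7", "/carts/3"]},
    "/customers/7": {"links": []},
    "/carts/3": {"links": ["/items/1", "/items/2", "/items/3"]},
    "/items/1": {"links": []},
    "/items/2": {"links": []},
    "/items/3": {"links": []},
}


def count_requests(start: str) -> int:
    """Follow every link once and return how many resources were fetched."""
    seen = set()
    pending = [start]
    while pending:
        path = pending.pop()
        if path in seen:
            continue
        seen.add(path)
        pending.extend(API[path]["links"])
    return len(seen)


print(count_requests("/invoices/1"))  # 6, i.e. 1 + 1 + 1 + N with N = 3
```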
There's no denying that the GraphQL approach is hugely favorable. However, in some cases we could get the same behavior with REST. Say that invoices are always shown in a context where users are shown the invoice, the customer, the shopping cart and each item. Then perhaps it makes more sense to create a larger resource that includes everything the client should have. Then we have the same performance as with GraphQL, i.e. clients only need to make one request to get all the data they need. (Note: this is often the way the web works. Since REST is really just a description of the web, I do think it's good to use the web as inspiration when building a REST API.)
However, if your API is similar to those of Netflix or Facebook and you have hundreds of clients (many of which you have little control over), then this approach is off the table. It wouldn't be possible to maintain all the different resources that they would require, meaning GraphQL is probably the way to go. But a lot of companies do have the luxury of full control over their typical three clients, e.g. an Android app, an iOS app and a web app. For them, it might make sense to think about merging resources into larger resources.

A cool thing with HTTP/2 is that we could get similar behavior without merging resources. With HTTP/2 server push, the server can send resources that it thinks clients should have. This way, when a client requests an invoice, the server could respond with the invoice and then push out the customer, the shopping cart as well as the items. By keeping the resources separate, we can cache each resource separately and we don't have to invalidate the "full" resource when a single resource is stale.


Fixing GraphQL's problem with duplicated business logic

What about adding GraphQL fields that let clients know the available actions? It's definitely possible to add a bunch of Boolean fields, like canViewSpecies and canViewHomeWorld etc. This does work, but it's far from being as elegant as having links. Normally this means that clients need to know how to perform the corresponding action (though this is somewhat similar to the required knowledge about link relations). And for unsafe actions: which mutation do they correspond to? Basically we are back to putting business logic in clients instead of keeping it on the server.
Marc-André Giroux wrote a very interesting article about how to get some hypermedia benefits into GraphQL. I think this approach is pretty slick, but since he himself seems a bit hesitant about it and IIRC he doesn't even mention it in his book, "Production Ready GraphQL", I get the feeling that he doesn't encourage this pattern.

There are also other areas where REST and GraphQL have advantages and disadvantages relative to each other. But IMHO, caching and placement of business logic are the most important. Plus, a lot of other things are actually not that different between them (though in the REST case you typically need to come up with some new conventions). For example, versioning, deprecations and type systems are pretty similar, though GraphQL comes with batteries included while REST requires adopting solutions like JSON Schema and deprecation headers (which aren't nearly as good as deprecations in GraphQL). The strategy for versioning IS the same. Which is: don't.

Conclusion

When choosing between REST and GraphQL, it's important to look at the consumers of the API. If you have many clients with different needs then perhaps GraphQL is a better choice. If you have few clients and you know what they want then perhaps REST is a better choice for you.
So basically, if you know that you can compose an awesome burger that your clients like, then just keep it simple and serve that. But if your clients need to put all kinds of weird stuff into their burgers, then perhaps you need a query language.