This is my notes on graphQL.
If this is not where you want to be, head on over to the Readme.
This mainly teaches us the history of GraphQL and what is going on. Honestly I don’t think I quite understand it unless I touch it or see more examples of it being used.
-
GraphQL is a new API standard that was invented & open-sourced by Facebook.
-
In its core, graphQL enables declarative data fetching where a client can specify exactly what data it needs from the API.
-
So instead of multiple endpoints that return fixed data like REST, graphQL server exposes single endpoint and responds with precisely the data that the client asked for.
So most information are stored remotely in a database that is accessible over the internet which is what servers do. So whenever a client needs an information that it wants displayed to a user, it will send a request to the server. The server is programmed in a way that it can respond to the request by retrieving the information from the database and then sends it back to the client.
REST was a popular way to expose data from a server. It was a good fit for many applications, however, with the changes in technology, it challenged how APIs are designed. For example:
-
Increased mobile usage (those with slower internet) creates a need for efficient data loading
With graphQL it limits the data that needs to be transferred over the network.
-
Variety of different frontend frameworks and platforms on the client-side
Front-end frameworks and platforms that run client applications make it difficult to build and maintain one API that would fit all the requirements of all. With GraphQL, each client can access precisely the data it needs.
-
Fast development speed & expectation for rapid feature development.
Continuous deployment has become a standard for many companies. Now data is constantly iterating and updating. With REST API, it needed to be modified to account for specific requirements and there needed to be design changes on the client side. This hurts fast development practices and product iterations.
GraphQL has been around since 2012 but was later publicized in 2015 when they mentioned they were going to opensource. They as in Facebook, of course. GraphQL is not limited to only ReactJs but can be used with any programming language and framework that communicates with an API. Coursea and Netflix also tried to make a similar thing to GraphQL. After graphql was publicized, coursera ditched theirs and went with graphQL.
GraphQL was developedt o cope with the need for more flexibility and efficiency in client-server communication. It solves many of the shortcomings and inefficiences that developers experienced with REST API.
In the example, there are 3 API endpoints to get information from a user with REST
-
/users/<id>
-
/users/<id>/posts
-
/users/<id>/followers
We want displayed, name, title of posts, and followers
The issue with REST is that when we input the information /users/<id>
we will end up getting additional information that is unecessary.
example of only interested in name but getting more information
{ "user" : { "id": "352353235" "name" : "Mary", "address" : {...} } }
With the same thing with REST, you will send a single request instad of 3 like REST did just to get name, title of posts and followers.
For GraphQL, we will send a post request to the server. In the query it will describe all the data requirements of the client.
query { User(id: "235532532") { name posts { title } followers(last: 3) { name } } }
The request is self explanatory. With that ID, we want the name of the user, the title of posts, and the last 3 follower’s names.
The server will then process and fetch exactly the requested information and read it from the database. It will then package it up into a JSON object that will return to the client. The response will look like this:
{ "data": { "User" : { "name" : "Mary", "posts" : [ { title: "Learn GraphQL today"}, { title: "React & GraphQL - A declarative love story" }, { title: "Why GraphQL is better than REST" }, { title: "Relay vs Apollo - GraphQL Clients" } ], "followers" : [ {name: "John"}, { name: "Alice"}, { name: "Sarah"} ] } } }
This JSON information has the root field called data. Data returns to us all of the information that was requested in the query.
-
Elimite Over and Under fetching data.
-
Overfetching means downloading unnecessary data.
-
Underfetching means an endpoint doesn’t return enough information or you need to send multiple requests (n + 1- requests problem)
-
-
great for rapid Product Iterations. changes in the client side can be made without any extra work on the server. This is good for faster feedback cycles and product iterations.
-
Insightful Analytics.
-
Its Schema & Types.
-
GraphQL uses strong type system to define capabilities of an API. All of the types that are exposed in an API are written down in a schema using the graphQL’s schema definition language.
-
The same schema is how a client can access the data. When the schema is defined, the team’s working on the front end and the back end can do their work without further communication.
-
Question: What is a benefit of the GraphQL schema and strong type system?
-
They work well with the javascript type system
-
It solves the n+1 request problem
-
Once the schema is defined, frontend and backend teams can work independently from one another
-
Trick question: GraphQL doesn’t havea type system.
Answer: Once the schema is defined, frontend and backend teams can work independently from one another.
In this lesson, we are going to learn about GraphQL’s language. We are going to learn about how graphQL define types as well as sending queries and mutations.
The syntax for writing schema is called SDL. Here’s an example of how to use SDL with these two simple types.
type Person { name: String! age: Int! }
-
This one has two fields. key name and age and the values string and int.
-
The ! with it means that it is required.
type Post { title: String! }
-
This one only has one field with key title and value String.
-
Again, ! means it’s required.
We can define relationships between the two. GraphQL calls that relations. To add a relation to this example between Person and Post, we need to express that one person can be the author of many posts.
-
We add the author field to the Post type like this:
type Post { title: String! author: Person! }
Now, every post that is created needs to be associated with a person who is the author of it.
-
We add the posts field to the Person type to show that a person can write multiple posts.
type Person { name: String! age: Int! posts: [Post!]! }
Note
|
the [] bracket tells it that it will be a list. |
After the connection has been made, now there is a one-to-many-relationship between Person and Post. They say one-to-many because of the expected array inside of Person.
Rest’s fetching is encoded in the URL that it connects to. GraphQL’s method is a whole lot more different. GraphQL doesn’t have multiple endpoints like REST does, but just one single endpoint. This works because the data is not fixed, but actually more flexible and let’s the client decide on the requested data. The downside to that is that the client needs to send more information to the server to get what the client wants.
the process is called a Query. Here is an example of a query:
{ allPersons { name } }
-
The root of the query is named allPersons.
-
Anything that follows after the root field is called the payload of the query.
-
This query will return a list of all the persons that are stored in the database and will only show their name.
Here’s it’s output example in JSON:
{ "allPersons": [ {"name": "Johnny"}, {"name": "Sarah"}, {"name": "Alice"} ] }
If you wanted to add allPerson’s age as well as their name, all you need to do is add age into the payload like this:
{ allPerson { name age } }
and it will output this:
{ "allPersons": [ {"name": "Johnny", "age": 23}, {"name": "Sarah", "age": 20}, {"name": "Alice", "age": 20} ] }
You can actually control how much is output for the requested query. Such as if you want to limit to only the last 2 names instead of giving all of the results.
{ allPersons(last: 2) { name age } }
This will list the last two names that were stored in the database. This will be great to show the newest users or something.
{ "allPersons": [ {"name": "Sarah", "age": 20}, {"name": "Alice", "age": 20} ] }
Basically using the last 2 took out poor Johnny.
You can also query using nested information.
In the example, we want to load all the posts that each person has written. Here is how the query would look like:
{ allPersons { name posts { title } } }
Here is how the server is going to output the query request in JSON:
{ "allPerson": [ { "name": "Johnny", "posts": [ {"title": "graphQL is awesome"}, {"title": "Relay is a powerful graphql client"} ] }, { "name": "Sarah", "posts": [ {"title": "How to get started with React & GraphQL"} ] } ] }
There we see that it only shows the title of the posts nested with the user’s name.
We learned about requesting data, now we will learn about making changes to the data that is stored. With GraphQL, we make these changes using so called mutations. There are three kinds of mutations:
-
creating new data
-
updating existing data
-
deleting existing data
To modify the data, it is actually written the same way you would when making a request except that you need to start it with the mutation keyword. Example:
mutation { createPerson(name: "Bob", age: 36) { name age } }
-
Just like the request, mutation also has a root field, and this root field is called createPerson.
-
Inside the paranthesis are arguments that will be placed into the payload. The arguments are: name: "Bob", age: 36.
-
The payload’s key will then look at the argument’s values to replace with it.
the output for the mutation will look like this:
{ "createPerson": { "name": "Bob", "age": 36 } }
GraphQL types have unique IDs that are generated by the server:
type Person { id: ID! name: String! age: Int! }
When we were making a mutation, we can also request an ID when we were creating a person.
mutation { createPerson(name: "Bob", age: 36) { id } }
One absolute necessity is to have a real-time connection to the server to get immediate information about changes.
For this use case, graphql offers the concept of subscriptions. When a client subscribes to an event, it will initiate and hold a steady connection to the server.
In this example, the client subscribes on the server to get informed about new users being created.
subscription { newPerson { name age } }
So now when a new user is created, the server will push the data to the client. Unlike queries and mutations that follow a typical response cycle, subscriptions will have a continuous stream of data sent over to the client.
Again, so with that subscription setup with the root newPerson, the client will receive a user’s name and age every single time a new user is created.
-
The schema defines the capabilities of the API by specifying how a client can fetch, mutate, and subscribe the data.
-
It is also a contract between the client and the server.
-
It is also a collection of GraphQL types with special root types.
The schema will have different root types that define its entry points for the API.
-
type Query sends requests for data to be read.
-
type Mutation modifies the data.
-
type Subscription is like query except it sets up a constant connection for when changes happen.
Query type:
{ allPersons { name } }
type Query { allPersons(last: Int): [Person!]! }
-
(last: Int) is to limit the result. You can even use first..
Mutation type:
mutation { createPerson(name: "Bob", age: 36) { id } }
type Mutation { createPerson(name: String!, age: Int!: Person!) }
Subscription type:
subscription { newPerson { name age } }
type Subscription { newPerson: Person! }
The full schema of what it’s using:
type Person { id: ID! name: String! age: Int! posts: [Post!]! }
type Post { title: String! author: Person! }
-
When you see Person! , it is linked to Person and is being sent there.
-
When you see [Person!], it is also linked but is receiving it.
Right now it doesn’t really do anything, but we can define a CRUD style API by adding a new fields to the schemas’s root types. CRUD is short for create, read, update, delete which actually refers to the type of operations we will use with GraphQL.
So right now we want to add the new field to the query type to retrieve all of the stored posts from the server. The field is called allPosts and looks very similar to the allPersons field.
type Query { allPersons(last: Int): [Person!]! allPosts(last: Int): [Post!]! }
All this does is ask for all the posts and all of the persons that are stored in the database. However, this way is not possible to query individual posts or person objects.
To complete the CRUD API for the person type, we can do so by adding two more mutations. One for updating and one for deleting person objects.
type Mutation { createPerson(name: String!, age: String!): Person! updatePerson(id: ID!, name: String!, age: String!): Person! deletePerson(id: ID!): Person! }
-
updatePerson has to look up the ID which will allow us to specify which person that needs to be updated, and also supplied the new arguments to change.
-
deletePerson only needs the ID to tell the server which person to delete.
Now, the same must be done for posts.
createPost(title: String!): Post! updatePost(id: ID!, title: String!): Post! deletePost(id: ID!): Post!
-
For the update and delete mutation to work, we need to specify an ID.
Now to update the subscription, we need to add all the persons and posts type that were created and existing.
type Subscription { newPerson: Person! updatedPerson: Person! deletedPerson: Person! newPost: Post! updatedPost: Post! deletedPost: Post! }
Question: What are GraphQL subscriptions used for?
-
Event-based realtime functionality
-
You use them to subscribe to the GraphQL Weekly newsletter
-
Schema-based realtime functionality
-
They combine Queries and Mutations and allow you to read and write data.
Answer: Event-based realtime functionality
An important thing to note is that graphQL was only released as a specification. Which means it tells how the graphql server has to behave. We can read the specification here: graphql github.
If we want to use graphql in a project, we have to build the graphql server ouselves. We can do that with any programming language.
We can also use a service graphql that provides a powerful server out of the box.
There are three main architectural use cases.
-
GraphQL server with a connected database
-
this is a simple setup that the lesson will cover later (hopefully.. this is what I’m interested in).
-
this is more common with greenfield projects.
-
the setup is: you have 1 web server that implements the graphQL specification. Query arrives at graphQL’s server, the server reads it and fetches the data. This is called resolving the query.
-
GraphQL is actually transport layer agnostic.
-
GraphQL can be used with any available network protocol, so you can implement a graphql server on TCP, websockets, or any transport.
-
-
GraphQL doesn’t care about the format that the database stores its data. We can use SQL database like AWS, aurora, or even a noSQL database like MongoDB.
-
-
GraphQL server to integrate existing system.
-
a bunch of other third party systems will get computed to a single graphQL API.
-
great for companies with legacy infrastructures and many different APIs.
-
GraphQL can be used to unify those existing systems and hide the messy backend.
-
GraphQL will communicate with the existing systems and packages it up to the graphQL’s response format.
-
GraphQL doesn’t care about the data sources are, such as databases, web services, 3rd party APIs, etc).
-
You can also combine one with graphql and still communicate with the others.
-
GraphQL handles this with Resolver functions. The purpose of a resolver function is to retrieve the data for its corresponding field.
-
-
A hybrid approach with a connected database and integration of existing system.
GraphQL is great for frontend developers as data fetching complexity can be pushed to the server-side. The server machine will be dealing with the heavy work.
-
Imperative data fetching : With REST, you used to have to create a fetch request and then parse the server response and then would have to be stored locally. Only after then, it will be displayed in the UI.
-
Declarative data fetching : It has to describe its data requirements with query, mutation, or subscription. Then it can display information in the UI. The in between steps are handled by the GraphQL client. This is what graphQL client libraries will handle such as Relay or Apollo.
Question: What is a resolver function?
-
A function that a GraphQL client uses to resolve a query on the frontend.
-
It’s a synonym for serverless functions.
-
A function that solves all your problems.
-
A function on GraphQL server that’s responsible for fetching the data for a single field.
-
Answer: A function on a GraphQL server that’s responsible for fetching the data for a single field.
Some infrastructure feature that is needed for our app is:
-
directly sending queries and mutations without constructing HTTP requests.
-
view-layer integration.
-
caching
-
validating and optimizing queries based on the schema.
With graphQL’s client we no need to do an HTTP fetch and parse anymore. But how do we get it to show up in the UI? If we use ReactJS, the graphQL will fetch the needed data under the hood and make it available in the props of our component. The declarative nature ties well with functional reactive programming techniques.
It’s good practice to store the data that was retrieved somewhere with the user to save load times and whatnot. A good way to do this is to normalize the data beforehand. Which means that the query result gets flattened and the store will only contain individual records that can be referenced with a globally unique ID.
When the build environment has access to the schema, it can parse all of the graphQL code that’s located in the project and compare it against the information from the schema. This catches any typos and other errors before it is put in production where the users can see this error.
The great thing about GraphQL is that it has a split view. One is the written data requirements and the other is the UI code. This split view gives you a better visualization of the data and will give a better understanding of its data.
The query is traversed field by field, executing "resolvers" for each field. So, here’s an example of a schema:
type Query { author(id: ID!): Author } type Author { posts: [Post] } type Post { title: String content: String }
And then here is an actual query that we will send to the server with the schema above:
query { author(id: "abc") { posts { title content } } }
You can also set the query up with the associated type.
query: Query { author(id: "abc"): Author { posts: [Post] { title: String content: String } } }
What will then happen is the resolvers will run for every field sequentually like a list. So this is how it will be executed:
Query.author(root, { id: 'abc' }, context) -> author Author.posts(author, null, context) -> posts for each post in posts Post.title(post, null, context) -> title Post.content(post, null, context) -> content
So after that, the system will put everything together into the correct shape for the result and returns a JSON format.
So there’s a bit of an issue if we wanted to retrive data like this:
query { posts { title author { name avatar } } }
Well, if it was a blog, it’s most likely that most of the posts will have the same authors. So if we need to make an API call to get each author obect, we might accidentally make multiple requests for the same one… so for example:
fetch('/authors/1') fetch('/authors/2') fetch('/authors/1') fetch('/authors/2') fetch('/authors/1') fetch('/authors/2')
To avoid this problem, we should wrap our fetching function in a utility that will wait for all of the resolvers to run, then make sure to only fetch each item once.
authorLoader = new AuthorLoader() // Queue up a bunch of fetches authorLoader.load(1); authorLoader.load(2); authorLoader.load(1); authorLoader.load(2); // Then, the loader only does the minimal amount of work fetch('/authors/1'); fetch('/authors/2');
If the API supports batched requests, we can just do one fetch to the backend, like this:
fetch('/authors?ids=1,2')
In JavaScript, the above methods can be implemented using a utility called dataloader.
Which statement about GraphQL servers is correct?
-
A single record is never fetched more than once for a given query.
-
Resolvers can be defined on a field graularity
-
Query fields are resolved depth-first
-
A GraphQL server that doesn’t use DataLoader isn’t a GraphQL server.
Answer: Resolvers can be defined on a field granularity
In this lesson we will find out why GraphQL is great for developing tools.
GraphQL’s Type system allows us to quickly define the surface area of our API and it allows developers to define the capabilities of an API. GraphQL allows clients to ask a server for information about its schema. It is called introspection.
To let the client know what is accessible through a graphQL API, we ask graphQL for this information by querying the \__schema
meta-field which
is available on the root type of a Query.
query { __schema { types { name } } }
So here’s this schema definition:
type Query { author(id: ID!): Author } type Author { posts: [Post!]! } type Post { title: String! }
With the introspection query that we wrote out and used it with the schema above, we would get this result:
{ "data": { "__schema": { "types": [ { "name": "Query" }, { "name": "Author" }, { "name": "Post" }, { "name": "ID" }, { "name": "String" }, { "name": "__Schema" }, { "name": "__Type" }, { "name": "__TypeKind" }, { "name": "__Field" }, { "name": "__InputValue" }, { "name": "__EnumValue" }, { "name": "__Directive" }, { "name": "__DirectiveLocation" } ] } } }
What we did was we queried for all types on the schema. We get both the object types that were defined like "Author"
and scalar types.
There’s more than to just name for introspection types. Here’s an example:
{ __type(name: "Author") { name description } }
This example we query a single type using the \__type
meta-field and we ask for its name and description.
{ "data": { "__type": { "name": "Author", "description": "The author of a post.", } } }
The great thing about Introspection is that it’s great for autocomplete, code generating, etc. I wish they would actually talk more about this, actually. What a short chapter.
There’s a special IDE for playing with GraphQL called GraphQL Playground. The IDE you can make queries, mutations, subscriptions,
autocomplete, and validation. There’s even a documenation explorer to refer to which is powered by introspection. It also has query history so it lets you work with multiple graphQL api’s side by side.
It seamlessly integrates with graphql-config
. This tool is great to work with a graphql server without writing plain graphQL queries over curl.
With the client’s freedom, GraphQL’s server must be ready to handle those that want to abuse the power or those who submit very large queries. These cases can take a GraphQL server down.
There are ways to avoid this issue.
Using a timeout is the easiest way to handle overload queries. For example we can configure with a 5 second timeout would stop the execution of any query that is taking more than 5 seconds to execute.
-
Pros:
-
Simple to implement.
-
Most strategies will still use a timeout as a final protection.
-
-
Cons:
-
Damage can already be done even when the timeout kicks in.
-
Sometimes hard to implement. Cutting connections after a certain time may result in strange behaviors.
-
GraphQL will let us craft any complex query which means we can nest within nest within nest. So there can be a way to limit a query depth. GraphQL can reject or accept a request based on its depth.
Let’s say the server was configured witha maximum Query Depth of 3. Everything within the red maker of this image is considered too deep and the query is invalid.
-
Pros:
-
Since the AST of the document is analyzed statically, the query does not even execute, which adds no load on the graphql server.
-
-
Cons:
-
Depth alone is often not enough to cover all abusive queries. For example a query requesting an enormous amount of nodes on the root will be very expensive but unlikely to be blocked by a query depth analyzer.
-
To minimize load, try to be specific with the request to get a singular result rather than a list of items.
Here’s an example of a query and its complexity:
query { author(id: "abc") { # complexity: 1 posts(first: 5) { # complexity: 5 title # complexity: 1 } } }
-
Pros:
-
Covers more cases than a simple query depth.
-
Reject queries before executing them by statically analyzing the complexity.
-
-
Cons:
-
Hard to implement perfectly.
-
If complexity is estimated by developers, how do we keep it up to date? How do we find the costs in the first place?
-
Mutations are hard to estimate. what if they have a side effect that is hard to measure like queuing a background job?
-
We can throttle a request based on time.
In the example, our maximum server time (Bucket Size) allowed is set to 1000ms
, the clients gain 100ms
of server time per second (leak rate) and this mutation:
mutation { createPost(input: { title: "GraphQL Security" }) { post { title } } }
This mutation will take an average 200ms
to complete. The client will operate the request more than 5 times within 1 second until more available server time is added to the
client.
After two seconds (200ms
), our client could call the createPost
a single time. Throttling will be good because we can use it less often and smaller queries may be
called more often since they will be very fast to compute.
This query complexity example:
query { author(id: "abc") { # complexity: 1 posts { # complexity: 1 title # complexity: 1 } } }
Has a total of 3
query cost based on its complexity. We can set a maximum cost of 9
and our client could run this query only three times before the leak
rate forbids them to query more.
The client can calculate the costs of their queries themselves without needing to estimate server time.
There’s a github public API that uses this approach to throttle their clients. Here’s how they did it: https://developer.github.com/v4/guides/resource-limitations/.
Question: How does a GraphQL server deal with failures?
-
A GraphQL server never fails
-
It returns a dedicated errors object in the server response
-
It uses HTTP status codes to inidcate what went wrong
-
It starts crying and hides under the bed
Answer: It returns a dedicated errors object in the server response