Introduction
At HackSoft, we work with GraphQL on a daily basis, which means dealing with both building queries and executing them.
In this blog post, we will focus on building GraphQL queries with Python. I will guide you through the steps for building a complete GraphQL query, along with pagination and filtering (using Relay mechanisms).
Final goal
In the end, we will build a query that looks like this:
query {
  movies (directorName: "Vince Gilligan", first: 3) {
    edges {
      node {
        id
        name
        rating
        releaseDate
      }
    }
    pageInfo {
      endCursor
      startCursor
      hasNextPage
      hasPreviousPage
    }
  }
}
Let’s say we have a movies query, which can be filtered by directorName and paginated by specific constraints.
Along with that, we’re taking into consideration the nodes and edges idea from Relay, where each node is a single entry in our query, so a single node corresponds to a single movie.
Generally, this query translates to:
“I want to retrieve the first 3 movies directed by Vince Gilligan and see their name, rating, and release date, along with some page metadata that I can use further on.”
So let’s get our hands dirty!
Library choice
There are a few libraries that you can use for this purpose. We’ve chosen sgqlc as our tool of the trade because it lets us construct our schema and query with predefined classes, which makes it more flexible than the other choices.
Although it is easy to use, there are a few pitfalls that you can step into. The library is hosted on GitHub and can be installed via pip.
It contains a few modules that we will mainly be using:
- sgqlc.types – Primitive data types (Int, String, list_of, etc.).
- sgqlc.types.datetime – Date & time types (Date, Time, DateTime).
- sgqlc.types.relay – Relay types, related to pagination and ordering (Node, Connection).
- sgqlc.operation – The main entry point, which wraps the whole query (Operation).
- sgqlc.endpoint.http – Wrapper over the HTTP protocol (HTTPEndpoint).
Simple query
To begin with, we will build our query without any filters or pagination. It will be as simple as this:
query {
  movies {
    name
    rating
    releaseDate
  }
}
This query means:
“Give me the name, rating & release date of all available movies.”
Movie
Let’s create the prototype for our Movie:
from sgqlc.types.datetime import Date
from sgqlc.types import String, Float, Type

class MovieNode(Type):
    name = String
    rating = Float
    release_date = Date

As our MovieNode class is a new custom type of ours, it should extend the Type class from the types module. If we do not do that, we cannot build our query correctly. This class contains all the fields that we want to have in the query: name, rating, and release_date.
You can (and should) name your properties in the Pythonic snake_case; they are later converted to the camelCase required by GraphQL.
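To make the naming rule concrete, here is a minimal sketch of a snake_case to camelCase conversion. The to_camel_case helper is hypothetical and written only for illustration; it is not sgqlc's own implementation, which may treat edge cases differently.

```python
def to_camel_case(snake_name: str) -> str:
    """Convert a snake_case identifier to camelCase (illustrative helper)."""
    first, *rest = snake_name.split("_")
    # Keep the first word as-is and upper-case the first letter of each later word.
    return first + "".join(word[:1].upper() + word[1:] for word in rest)


for field_name in ("name", "rating", "release_date"):
    print(field_name, "->", to_camel_case(field_name))
```

This is why the Python attribute release_date shows up as releaseDate in the generated query.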
By now, we have constructed the innermost part of our query:
query {
  movies {
    name
    rating
    releaseDate
  }
}
Query
There is one more thing that we should do in order to complete our schema: the Query.
This class is again a new custom type. Each property in this class is a separate query name that we can use. That’s why there must be only one Query class in our project, which wraps all the structures that we’ve implemented. In this example, we have only one query: movies.
from sgqlc.types.datetime import Date
from sgqlc.types import String, Float, Type, Field

class MovieNode(Type):
    name = String
    rating = Float
    release_date = Date

class Query(Type):
    movies = Field(MovieNode)

The sgqlc library requires you to name your query class exactly Query (for example, you cannot name it MoviesQuery). Otherwise, you’ll receive an exception when you execute the script.
When we want to reference one custom type in another custom type, as is the case with Query and MovieNode, we can use the Field class from the types module, which does exactly that (see the movies attribute of the Query class).
Building the query
After we have implemented the structure, we need to build the query.
We can start by passing our Query to the Operation class, which comes from the operation module of the library:
from sgqlc.operation import Operation
from .schema import Query

query = Operation(Query)

This is the class that does the heavy lifting. It translates our classes into valid GraphQL notation with the correct values, and it kicks off the construction process of our query.
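To picture what this translation step produces, the following is a rough standard-library sketch that renders a nested selection into GraphQL text. The render_selection function and the dictionary layout are assumptions made for illustration; sgqlc builds its output from the schema classes and does not work this way internally.

```python
def render_selection(selection: dict, indent: int = 0) -> str:
    """Render a nested dict of field names as a GraphQL selection set.

    A value of None marks a leaf field; a dict marks a field with sub-fields.
    """
    pad = "  " * indent
    lines = []
    for field, children in selection.items():
        if children is None:
            lines.append(f"{pad}{field}")
        else:
            lines.append(f"{pad}{field} {{")
            lines.append(render_selection(children, indent + 1))
            lines.append(f"{pad}}}")
    return "\n".join(lines)


# The simple movies query from this section, described as plain data.
simple_query = {"query": {"movies": {"name": None, "rating": None, "releaseDate": None}}}
query_text = render_selection(simple_query)
print(query_text)
```

Writing this by hand gets tedious as queries grow, which is the motivation for letting a library derive the text from typed classes instead.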
Then we need to invoke the __call__ method of our query property (movies) in order to attach it. If we do so, all the fields defined in our MovieNode class will be added to the query (name, rating, and releaseDate):
from sgqlc.operation import Operation
from .schema import Query

query = Operation(Query)
query.movies()  # Attaches all fields

If you want to attach only a specific field, you can do it like this:
from sgqlc.operation import Operation
from .schema import Query

query = Operation(Query)
query.movies.name()  # Attaches only the name field

The __call__ method directly modifies the Operation instance.
With this, the query looks exactly like we wanted:
query {
  movies {
    name
    rating
    releaseDate
  }
}
The last thing that we need to do is call an actual GraphQL API with the already constructed query:
from sgqlc.operation import Operation
from sgqlc.endpoint.http import HTTPEndpoint
from .schema import Query
query = Operation(Query)
query.movies()
endpoint = HTTPEndpoint(url='https://.../graphql')
result = endpoint(query=query)
This can be done by using the HTTPEndpoint class, which wraps our whole query into an HTTP POST request to a given url.
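For intuition about what travels over the wire, here is a minimal sketch that builds (without sending) such a POST request using only the standard library. It follows the common GraphQL-over-HTTP convention of a JSON body with a "query" key. The build_graphql_request helper and the example.com URL are hypothetical, and this is not a description of sgqlc's internals.

```python
import json
import urllib.request


def build_graphql_request(url: str, query_text: str) -> urllib.request.Request:
    """Build (but do not send) an HTTP POST request carrying a GraphQL query."""
    body = json.dumps({"query": query_text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint URL, used only for illustration.
request = build_graphql_request(
    "https://example.com/graphql",
    "query { movies { name rating releaseDate } }",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

A real client also has to deal with variables, authentication headers, timeouts, and error handling, which is the work a wrapper class takes off your hands.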
The query variable holds our actual GraphQL query and endpoint is the client bound to the given url, while result contains the response, which should have the following structure:
{
  "data": {
    "movies": [
      {
        "name": "Breaking Bad",
        "rating": 9.5,
        "releaseDate": "20-01-2008"
      },
      ...
    ]
  }
}
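A response of this shape can be consumed as a plain dictionary. The sketch below assumes the response has already been decoded from JSON, uses the sample entry from above as its only data, and introduces a hypothetical extract_movies helper. Checking the top-level "errors" key follows the standard GraphQL response format.

```python
from datetime import datetime

# Sample response shaped like the one above (the "..." entries are left out).
result = {
    "data": {
        "movies": [
            {"name": "Breaking Bad", "rating": 9.5, "releaseDate": "20-01-2008"},
        ]
    }
}


def extract_movies(response: dict) -> list:
    """Return the movies from a response, failing loudly on GraphQL errors."""
    if response.get("errors"):
        raise RuntimeError(f"GraphQL errors: {response['errors']}")
    return response["data"]["movies"]


for movie in extract_movies(result):
    # The sample date uses day-month-year order, so parse it accordingly.
    released = datetime.strptime(movie["releaseDate"], "%d-%m-%Y").date()
    print(movie["name"], movie["rating"], released.isoformat())
```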
By now, you have learned how to build a simple GraphQL query using the sgqlc library. In the next section, we will look at how to build more complex queries with filters and pagination abilities, using Relay.
Complex query with pagination and filters
In order to build complex queries that can be filtered and paginated by specific constraints, it is a good idea to use the mechanisms of Relay.
As said in the GraphQL docs:
To ensure a consistent implementation of this pattern, the Relay project has a formal specification you can follow for building GraphQL APIs which use a cursor based connection pattern.
We want to be able to filter our movies by their director’s name and be able to limit them to a certain count. This is our final goal.
Relay overview
Relay’s model borrows heavily from graph theory. We have nodes, where each node is a single element of the main Query. Each Node is connected to the query via an Edge. The query has many edges.
One of the core assumptions of Relay is a description of how to page through connections. The linkage between the query and the edges is called a connection.
By using Relay, we benefit from out-of-the-box support for pagination:
first: <number>
last: <number>
Each Node (Movie) in the context of Relay has a cursor. This property is a unique string identifier of each node, which is related only to pagination. Having cursors, we can use arguments in the query in order to fetch elements before or after a certain cursor ID:
before: <cursor-id>
after: <cursor-id>
Important: In the context of sgqlc, the cursor is named id.
Along with that, we can also benefit from a subquery called pageInfo. It contains meta-information about our current pagination state:
- endCursor – The cursor id of the last node from the result query
- startCursor – The same as the above, but for the first node
- hasNextPage – A boolean which tells whether there is a next available page to iterate
- hasPreviousPage – The same as the above, but for the previous page
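To see how first, after, and the pageInfo fields fit together, here is a small in-memory simulation of a cursor-based connection. It is a sketch under stated assumptions: the paginate, encode_cursor, and decode_cursor helpers are hypothetical, the cursor format (a base64-encoded list position) is invented for the example, and the cursor is attached to each edge as in the Relay connection specification. A real server decides its own cursor encoding.

```python
import base64
from typing import Optional


def encode_cursor(index: int) -> str:
    """Turn a list position into an opaque cursor string."""
    return base64.b64encode(f"cursor:{index}".encode()).decode()


def decode_cursor(cursor: str) -> int:
    """Recover the list position from a cursor made by encode_cursor."""
    return int(base64.b64decode(cursor).decode().split(":")[1])


def paginate(items: list, first: int, after: Optional[str] = None) -> dict:
    """Return a Relay-style page: up to `first` items following cursor `after`."""
    start = decode_cursor(after) + 1 if after is not None else 0
    window = items[start:start + first]
    edges = [
        {"cursor": encode_cursor(start + offset), "node": item}
        for offset, item in enumerate(window)
    ]
    return {
        "edges": edges,
        "pageInfo": {
            "startCursor": edges[0]["cursor"] if edges else None,
            "endCursor": edges[-1]["cursor"] if edges else None,
            "hasNextPage": start + len(window) < len(items),
            "hasPreviousPage": start > 0,
        },
    }


movies = ["Movie A", "Movie B", "Movie C", "Movie D", "Movie E"]  # placeholder data
first_page = paginate(movies, first=3)
second_page = paginate(movies, first=3, after=first_page["pageInfo"]["endCursor"])
print([edge["node"] for edge in second_page["edges"]])
```

The client never interprets the cursor; it only passes the endCursor of one page back as the after argument of the next request.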
All these functionalities are shipped with the sgqlc library, and by using the correct classes we can build more complex queries with less effort.
Schema
If we want to enhance the abilities of our Query, we need to tweak our schema from the previous section a bit.
Node
Let’s tweak the MovieNode from the previous example ⬆️ and inherit from the Node class (which comes from the relay module) instead of the generic Type:
from sgqlc.types.relay import Node
from sgqlc.types.datetime import Date
from sgqlc.types import String, Float

class MovieNode(Node):
    name = String
    rating = Float
    release_date = Date

The reason for changing the parent class is that Node gives us some fields out of the box, like the cursor id, which we can use later on for pagination. Besides that, Node is again a Type behind the scenes.
Edge
As we’ve built our Node, we need to “link” it to the query via the “edges”:
from sgqlc.types.relay import Node
from sgqlc.types.datetime import Date
from sgqlc.types import String, Float, Type, Field

class MovieNode(Node):
    name = String
    rating = Float
    release_date = Date

class MovieEdge(Type):
    node = Field(MovieNode)

- There is no dedicated class for Relay’s Edge here, so we define our own type using Type.
- We connect our already defined MovieNode to the newly created MovieEdge with the already known approach: using Field.
Connection
Having the Node and Edge defined, we need to glue things together and link the edges and nodes, using the Connection class from the relay module:
from sgqlc.types.relay import Node, Connection
from sgqlc.types.datetime import Date
from sgqlc.types import String, Float, Type, Field, list_of

class MovieNode(Node):
    name = String
    rating = Float
    release_date = Date

class MovieEdge(Type):
    node = Field(MovieNode)

class MovieConnection(Connection):
    edges = list_of(MovieEdge)

- Our MovieConnection contains a list_of (the sgqlc wrapper of Python’s list data structure) edges.
- The Connection class, same as the Node, adds some fields for us out of the box. In this particular case, we get the pageInfo subquery.
Having this, we’re benefiting from the already mentioned features that come with Relay.
Query
In order to tie together everything that we’ve already implemented, we need to define our main entry point, the Query class:
from sgqlc.types.relay import Node, Connection
from sgqlc.types.datetime import Date
from sgqlc.types import Int, String, Float, Type, Field, list_of

class MovieNode(Node):
    name = String
    rating = Float
    release_date = Date

class MovieEdge(Type):
    node = Field(MovieNode)

class MovieConnection(Connection):
    edges = list_of(MovieEdge)

class Query(Type):
    movies = Field(
        MovieConnection,
        args={
            'director_name': String,
            'first': Int
        }
    )

- Instead of MovieNode, we now provide the MovieConnection to the Field (its first argument in the Query class).
- Notice that the Field class accepts an args keyword argument. In this case, args is a mapping of our filters: director_name (which is a String) and first (which is an Int).
Important: Even though we benefit from the Relay pagination mechanisms, we still have to add first to the args mapping. This applies to all the filters that you want to have in the query. There’s an already exposed mapping of all the Relay pagination filters that I’ve mentioned above ⬆️. It is called connection_args and you can use it like this:
from sgqlc.types.relay import Node, Connection, connection_args
...
class Query(Type):
    movies = Field(
        MovieConnection,
        args={
            'director_name': String,
            **connection_args()
        }
    )
With this, our schema is done.
Building the query
The beginning of the query building process is the same as the one we’ve already done for the simpler query:
from sgqlc.operation import Operation
from sgqlc.endpoint.http import HTTPEndpoint
from .schema import Query
query = Operation(Query)
query.movies(director_name='Vince Gilligan', first=3)
But now we are able to provide values for the filters that we’ve already mapped in the Query, by calling the movies property and passing the values as keyword arguments.
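As a rough illustration of how such keyword arguments end up in the query text, the sketch below renders them as a GraphQL argument list. The render_arguments helper is hypothetical and only covers strings and numbers; it is not how sgqlc serializes arguments, since the library uses the declared argument types to do so.

```python
import json


def render_arguments(**kwargs) -> str:
    """Render Python keyword arguments as a GraphQL argument list."""
    rendered = []
    for name, value in kwargs.items():
        first, *rest = name.split("_")
        graphql_name = first + "".join(part.capitalize() for part in rest)
        # json.dumps quotes strings and leaves numbers bare, which matches
        # GraphQL literal syntax for these two simple cases.
        rendered.append(f"{graphql_name}: {json.dumps(value)}")
    return "(" + ", ".join(rendered) + ")"


arguments = render_arguments(director_name="Vince Gilligan", first=3)
print("movies " + arguments)
```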
By now, we have constructed the movies field and its arguments, which is the outer part of our target query:
query {
  movies (directorName: "Vince Gilligan", first: 3) {
    edges {
      node {
        id
        name
        rating
        releaseDate
      }
    }
    pageInfo {
      endCursor
      startCursor
      hasNextPage
      hasPreviousPage
    }
  }
}
Along with the filters, we need to attach our edges to the query. The idea here is the same as the previous example:
You can attach all edges:
from sgqlc.operation import Operation
from sgqlc.endpoint.http import HTTPEndpoint
from .schema import Query
query = Operation(Query)
query.movies(director_name='Vince Gilligan', first=3)
query.movies.edges() # Attaches all fields (along with nodes and edges)
Or explicitly choose specific ones:
from sgqlc.operation import Operation
from sgqlc.endpoint.http import HTTPEndpoint
from .schema import Query
query = Operation(Query)
query.movies(director_name='Vince Gilligan', first=3)
query.movies.edges.node.name() # Attaches only the name field from the node
With this, the edges and their nodes are attached, and we’re almost ready with the query:
query {
  movies (directorName: "Vince Gilligan", first: 3) {
    edges {
      node {
        id
        name
        rating
        releaseDate
      }
    }
    pageInfo {
      endCursor
      startCursor
      hasNextPage
      hasPreviousPage
    }
  }
}
The last step is to add the pageInfo subquery (which comes by default, but should be explicitly attached):
from sgqlc.operation import Operation
from sgqlc.endpoint.http import HTTPEndpoint
from .schema import Query
query = Operation(Query)
query.movies(director_name='Vince Gilligan', first=3)
query.movies.edges()
query.movies.page_info()
Having this, we’ve successfully constructed our query with filtering & pagination:
query {
  movies (directorName: "Vince Gilligan", first: 3) {
    edges {
      node {
        id
        name
        rating
        releaseDate
      }
    }
    pageInfo {
      endCursor
      startCursor
      hasNextPage
      hasPreviousPage
    }
  }
}
Same as the previous example, the last step of the process is to call a GraphQL endpoint which will execute the query that you’ve built:
from sgqlc.operation import Operation
from sgqlc.endpoint.http import HTTPEndpoint
from .schema import Query
query = Operation(Query)
query.movies(director_name='Vince Gilligan', first=3)
query.movies.edges()
query.movies.page_info()
endpoint = HTTPEndpoint(url='https://.../graphql')
result = endpoint(query=query)
The result variable contains the actual JSON response from the endpoint that executed our query:
{
  "data": {
    "movies": {
      "edges": [
        {
          "node": {
            "id": "Qm9va2luZ05vZGU6ZTllMWUyNzQtYTE2NS00ZmE1LTg3MzctZDNiYWI4YzJmMWM2",
            "name": "Breaking Bad",
            "rating": 9.5,
            "releaseDate": "20-01-2008"
          }
        },
        ...
      ],
      "pageInfo": {
        "hasNextPage": true,
        ...
      }
    }
  }
}
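The page metadata becomes useful once you want more than one page. The sketch below follows endCursor until hasNextPage is false. It assumes the query also accepts an after argument (as it would with connection_args), and both fetch_all_movies and the canned fake_fetch_page stand-in are hypothetical; in practice fetch_page would build the operation and call the endpoint.

```python
def fetch_all_movies(fetch_page, page_size: int = 3) -> list:
    """Collect every movie node by following endCursor until hasNextPage is false.

    `fetch_page(first, after)` is a hypothetical callable that runs the query
    and returns the decoded JSON response as a dict.
    """
    movies, cursor = [], None
    while True:
        connection = fetch_page(first=page_size, after=cursor)["data"]["movies"]
        movies.extend(edge["node"] for edge in connection["edges"])
        page_info = connection["pageInfo"]
        if not page_info["hasNextPage"]:
            return movies
        cursor = page_info["endCursor"]


# A stand-in for a real endpoint call, serving two canned pages.
def fake_fetch_page(first, after=None):
    pages = {
        None: (["Movie A", "Movie B", "Movie C"], "cursor-3", True),
        "cursor-3": (["Movie D"], "cursor-4", False),
    }
    names, end_cursor, has_next = pages[after]
    return {
        "data": {
            "movies": {
                "edges": [{"node": {"name": name}} for name in names],
                "pageInfo": {"endCursor": end_cursor, "hasNextPage": has_next},
            }
        }
    }


all_movies = fetch_all_movies(fake_fetch_page)
print([movie["name"] for movie in all_movies])
```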
We got it, but with Python! 🌟
Resources
Building GraphQL queries by hand can be a tough job once the queries start to grow bigger. By using Python to do the job, we can achieve the end result in a lot less time and have a scalable codebase that we can extend further on.
Here are the main resources that I’ve used to create this article:
- The sgqlc library implementation and documentation. It is worthwhile to dig through the implementation of the library and see how they’ve approached certain things.
- https://graphql.org/learn/pagination/
- https://relay.dev/graphql/connections.htm
- https://relay.dev/docs/guides/graphql-server-specification/
- An additional resource that may be interesting for you is a video of a presentation that I’ve given on this topic in a Django meetup in Bulgaria, held on the 16th of January 2020. Here are the slides from the presentation.
Thanks for reading and I hope this article helped you! 🙌🏼