Building GraphQL queries with Python

Ventsislav Tashev

Jan 6, 2020

Categories:Python

Introduction

At HackSoft, we deal with GraphQL on a daily basis. Since we use this technology a lot, we handle both query building and query execution.

In this blog post, we will be focusing on building GraphQL queries with Python. I will guide you through the steps for building a complete GraphQL query, along with pagination and filtering (using Relay mechanisms).

Final goal

In the end, we will build a query that looks like this:

query {
    movies (directorName: "Vince Gilligan", first: 3) {
        edges {
            node {
                id
                name
                rating
                releaseDate
            }
        },
        pageInfo {
            endCursor
            startCursor
            hasNextPage
            hasPreviousPage
        }
    }
}

Let’s say we have a movies query, which can be filtered by directorName and paginated by specific constraints.

Along with that, we’re taking into consideration the nodes and edges idea from Relay, where each node is a single entry in our query – thus a single node corresponds to a single movie.

Generally, this query translates to:

“I want to retrieve the first 3 movies directed by Vince Gilligan and see their name, rating, and release date, along with some page metadata that I can use further on.”

So let’s get our hands dirty!

Library choice

There are a few libraries that you can use for this purpose.

We’ve chosen sgqlc as our tool of the trade because it lets us construct our schema and query from predefined classes, which makes it more flexible than the other choices.

Although it is easy to use, there are a few pitfalls that you can step into. The library is hosted on GitHub and can be installed via pip.

It contains a few modules that we will be mainly using:

sgqlc.types – Primitive data types (Int, String, list_of, etc.).
sgqlc.types.datetime – Date & time types (Date, Time, DateTime).
sgqlc.types.relay – Relay types, related to pagination and ordering (Node, Connection).
sgqlc.operation – The main entry point, which wraps the whole query (Operation).
sgqlc.endpoint.http – Wrapper over the HTTP protocol (HTTPEndpoint).

Simple query

To begin with, we will build our query without any filters or pagination. It will be as simple as that:

query {
  movies {
    name
    rating
    releaseDate
  }
}

This query means:

“Give me the name, rating & release date of all available movies.”

Movie

Let’s create the prototype for our Movie:

from sgqlc.types.datetime import Date
from sgqlc.types import String, Float, Type


class MovieNode(Type):
    name = String
    rating = Float
    release_date = Date

As our MovieNode class is a new custom type of ours, we should extend the Type class from the types module. If we do not do that, we cannot build our query correctly. This class contains all the fields that we want to have in the query – name, rating, and release_date.

You can (and should) name your properties in the Pythonic snake_case. They are later converted to the camelCase that GraphQL requires.
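To make the naming rule concrete, here is a minimal, standard-library sketch of a snake_case to camelCase conversion. The to_camel_case helper is a hypothetical name of ours and is not part of sgqlc; the library does its own conversion internally and may treat edge cases (such as leading underscores) differently.

```python
def to_camel_case(snake_name: str) -> str:
    """Convert a snake_case identifier to camelCase (illustrative helper)."""
    head, *tail = snake_name.split("_")
    # The first word stays lowercase, every following word is capitalized.
    return head + "".join(part.capitalize() for part in tail)


for python_name in ("name", "rating", "release_date", "has_next_page"):
    print(python_name, "->", to_camel_case(python_name))
```

This is why release_date in the Python class shows up as releaseDate in the generated query.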

By now, we have constructed the innermost part of our query:

query {
  movies {
    name
    rating
    releaseDate
  }
}

Query

There is one more thing that we should do in order to complete our schema – the Query.

This class is again a new custom type. Each property in this class is a separate query name that we can use. That’s why there must be only one Query class in our project which wraps all the structures that we’ve implemented. In this example, we have only one query – movies.

from sgqlc.types.datetime import Date
from sgqlc.types import String, Float, Type, Field


class MovieNode(Type):
    name = String
    rating = Float
    release_date = Date


class Query(Type):
    movies = Field(MovieNode)

The sgqlc library requires you to name your query class exactly Query (for example – you cannot name it MoviesQuery). Otherwise, you’ll receive an exception after you execute the script.

When we want to reference one custom type in another custom type, as is the case with Query and MovieNode, we can use the Field class from the types module, which does exactly that [line 12].

Building the query

After we have implemented the structure, we need to build the query.

We can start by initializing our Query by providing it to the Operation class, which comes from the operation module of the library:

from sgqlc.operation import Operation

from .schema import Query


query = Operation(Query)

This is the class that does the whole magic. It can translate our classes to a valid GraphQL notation with the correct values. It basically kicks off the construction process of our query.
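To get a feeling for what "translating classes to GraphQL notation" amounts to, here is a rough toy renderer that turns a nested selection into query text. It only illustrates the idea under our own assumptions; the render function and the tuple layout are hypothetical and are not how sgqlc is implemented.

```python
def render(name, children=(), indent=0):
    """Render one field and its nested selection as GraphQL text."""
    pad = "  " * indent
    if not children:
        return f"{pad}{name}"
    lines = [f"{pad}{name} {{"]
    for child in children:
        # A child is either a leaf field name or a (name, children) pair.
        if isinstance(child, str):
            lines.append(render(child, indent=indent + 1))
        else:
            child_name, grandchildren = child
            lines.append(render(child_name, grandchildren, indent + 1))
    lines.append(f"{pad}}}")
    return "\n".join(lines)


text = render("query", [("movies", ["name", "rating", "releaseDate"])])
print(text)
```

The printed text is the simple movies query from this section. The benefit of a library is that we never have to assemble this string by hand.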

Then we need to invoke the __call__ method of our query property (movies) in order to attach it. If we do so, all the fields defined in our MovieNode class will be added to the query (name, rating, and releaseDate):

from sgqlc.operation import Operation

from .schema import Query


query = Operation(Query)

query.movies()  # Attaches all fields

If you want to attach only a specific field, you can do it like this:

from sgqlc.operation import Operation

from .schema import Query


query = Operation(Query)

query.movies.name()  # Attaches only the name field

The __call__ method directly modifies the Operation instance.

With this, the query looks exactly like we wanted:

query {
  movies {
    name
    rating
    releaseDate
  }
}

The last thing that we need to do is call an actual GraphQL API with the already constructed query:

from sgqlc.operation import Operation
from sgqlc.endpoint.http import HTTPEndpoint

from .schema import Query


query = Operation(Query)

query.movies()

endpoint = HTTPEndpoint(url='https://.../graphql')
result = endpoint(query=query)

This can be done by using the HTTPEndpoint class that wraps our whole query into an HTTP POST request to a given url.
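For intuition, a GraphQL call over HTTP is commonly a POST request with a JSON body that holds the query text. The sketch below builds such a request with the standard library only and does not send it. The build_graphql_request helper and the example.invalid URL are placeholders of ours, and the exact headers that HTTPEndpoint sends may differ.

```python
import json
import urllib.request


def build_graphql_request(url: str, query_text: str) -> urllib.request.Request:
    """Wrap a GraphQL query string in an HTTP POST request (not sent here)."""
    payload = json.dumps({"query": query_text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


QUERY = "query { movies { name rating releaseDate } }"
# Placeholder URL: replace it with the address of a real GraphQL API.
request = build_graphql_request("https://example.invalid/graphql", QUERY)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```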

The query variable contains our actual GraphQL query and endpoint is the client that sends it, while result contains the response, which should have the following structure:

{
  "data": {
    "movies": [
      {
        "name": "Breaking Bad",
        "rating": 9.5,
        "releaseDate": "20-01-2008"
      },
      ...
    ]
  }
}
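Once we have the response as a dictionary, reading it is plain Python. The sketch below works on a hard-coded copy of the single sample entry shown above. It assumes the day-month-year date format of that sample and checks for the optional "errors" key that GraphQL servers may return.

```python
import json
from datetime import datetime

RESPONSE_TEXT = """
{
  "data": {
    "movies": [
      {"name": "Breaking Bad", "rating": 9.5, "releaseDate": "20-01-2008"}
    ]
  }
}
"""

response = json.loads(RESPONSE_TEXT)

# A GraphQL response may carry an "errors" list alongside (or instead of) "data".
if response.get("errors"):
    raise RuntimeError(response["errors"])

movies = [
    {
        "name": item["name"],
        "rating": item["rating"],
        # Assumed format: day-month-year, as in the sample value.
        "release_date": datetime.strptime(item["releaseDate"], "%d-%m-%Y").date(),
    }
    for item in response["data"]["movies"]
]
print(movies)
```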

By now, you have learned how to build a simple GraphQL query using the sgqlc library. In the next section, we will look at how to build more complex queries with filters and pagination abilities, using Relay.

Complex query with pagination and filters

In order to build complex queries that can be filtered and paginated by specific constraints, it is a good idea to use the mechanisms of Relay.

As said in the GraphQL docs:

To ensure a consistent implementation of this pattern, the Relay project has a formal specification you can follow for building GraphQL APIs which use a cursor based connection pattern.

We want to be able to filter our movies by their director’s name and be able to limit them to a certain count. This is our final goal.

Relay overview

Relay’s model borrows heavily from graph theory. We have nodes, where each node is a single element of the main Query. Each Node is connected to the query via an Edge, and the query has many edges.

One of the core assumptions of Relay is a description of how to page through connections. The linkage between the query and the edges is called a connection.

By using Relay, we benefit from out-of-the-box support for pagination:

first: <number>
last: <number>

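In the Relay specification, each edge carries a cursor: an opaque string that marks the position of its node (a Movie) in the result set and is used only for pagination. Having cursors, we can use arguments in the query in order to fetch elements before or after a certain cursor:

before: <cursor-id>
after: <cursor-id>

Important: Do not confuse the cursor with id. The id field that the Node class of sgqlc provides is the identifier of the node itself, while cursors belong to the edges. In this article’s schema, we read the cursors from pageInfo.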

Along with that, we can also benefit from a subquery called pageInfo. It contains meta-information about our current pagination state:

endCursor – The cursor id of the last node from the result query
startCursor – The same as the above, but for the first node
hasNextPage – A boolean which tells whether there is a next available page to iterate
hasPreviousPage – The same as the above, but for the previous page
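To see how first, after, and pageInfo fit together, here is a small in-memory sketch of what a server might do when it answers a paginated request. Everything in it is an assumption made for illustration: the paginate function, the list of placeholder titles, and the toy cursor (simply the position in the list). Real servers use opaque cursors and may compute hasPreviousPage differently.

```python
def paginate(items, first=None, after=None):
    """Return a Relay-style connection for a slice of `items`."""
    # Toy cursor: the position of the item in the full list, as a string.
    start = int(after) + 1 if after is not None else 0
    end = len(items) if first is None else min(start + first, len(items))
    edges = [
        {"cursor": str(index), "node": items[index]}
        for index in range(start, end)
    ]
    return {
        "edges": edges,
        "pageInfo": {
            "startCursor": edges[0]["cursor"] if edges else None,
            "endCursor": edges[-1]["cursor"] if edges else None,
            "hasNextPage": end < len(items),
            "hasPreviousPage": start > 0,
        },
    }


# Hypothetical data, five placeholder titles.
MOVIES = ["Movie 1", "Movie 2", "Movie 3", "Movie 4", "Movie 5"]

page_one = paginate(MOVIES, first=3)
page_two = paginate(MOVIES, first=3, after=page_one["pageInfo"]["endCursor"])
print(page_one["pageInfo"])
print(page_two["pageInfo"])
```

The second call shows the typical pattern: the endCursor of one page becomes the after argument of the next request.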

All of this ships with the sgqlc library. By using the correct classes, we can build more complex queries with less effort.

Schema

If we want to enhance the abilities of our Query, we need to tweak our schema from the previous section a bit.

Node

Let’s tweak the MovieNode from the previous example ⬆️ and inherit from the Node class (which comes from the Relay module) instead of the generic Type:

from sgqlc.types.relay import Node
from sgqlc.types.datetime import Date
from sgqlc.types import String, Float


class MovieNode(Node):
    name = String
    rating = Float
    release_date = Date

The reason for changing the parent class is that Node gives us some fields out of the box, like id, which uniquely identifies each movie in the results. Besides that, Node is again a Type behind the scenes.
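The id values that a server returns are usually opaque strings. As an illustration, the sample id that appears in the response at the end of this article happens to be Base64 text, and the sketch below decodes it. The "type name, colon, primary key" layout it reveals is a common server-side convention and not something sgqlc requires, so client code should still treat ids as opaque.

```python
import base64

# The sample id copied from the response shown later in this article.
SAMPLE_ID = "Qm9va2luZ05vZGU6ZTllMWUyNzQtYTE2NS00ZmE1LTg3MzctZDNiYWI4YzJmMWM2"

decoded = base64.b64decode(SAMPLE_ID).decode("utf-8")
# Split on the first colon: the left part is a type name, the right part a key.
type_name, _, raw_id = decoded.partition(":")
print(type_name, raw_id)
```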

Edge

As we’ve built our Node, we need to “link” it to the query via the “edges”:

from sgqlc.types.relay import Node
from sgqlc.types.datetime import Date
from sgqlc.types import String, Float, Type, Field


class MovieNode(Node):
    name = String
    rating = Float
    release_date = Date


class MovieEdge(Type):
    node = Field(MovieNode)

There is no dedicated class for Relay’s Edge here, so we define our own type using Type.
We connect our already defined MovieNode to the newly created MovieEdge with the already known approach – using Field.

Connection

Having the Node and Edge defined, we need to glue things up and link the edges and nodes, using the Connection class from the Relay module:

from sgqlc.types.relay import Node, Connection
from sgqlc.types.datetime import Date
from sgqlc.types import String, Float, Type, Field, list_of


class MovieNode(Node):
    name = String
    rating = Float
    release_date = Date


class MovieEdge(Type):
    node = Field(MovieNode)


class MovieConnection(Connection):
    edges = list_of(MovieEdge)

Our MovieConnection contains a list_of edges (list_of is the sgqlc wrapper for list types).
The Connection class, like Node, adds some fields for us out of the box. In this particular case, we get the pageInfo subquery.

Having this, we’re benefiting from the already mentioned features that come with Relay.

Query

In order to build everything that we’ve already implemented, we need to define our main entry point – the Query class:

from sgqlc.types.relay import Node, Connection
from sgqlc.types.datetime import Date
from sgqlc.types import Int, String, Float, Type, Field, list_of


class MovieNode(Node):
    name = String
    rating = Float
    release_date = Date


class MovieEdge(Type):
    node = Field(MovieNode)


class MovieConnection(Connection):
    edges = list_of(MovieEdge)


class Query(Type):
    movies = Field(
        MovieConnection,
        args={
            'director_name': String,
            'first': Int
        }
    )

Instead of MovieNode, now we provide the MovieConnection to the Field [line 22].
Notice that the Field class accepts an args keyword argument. In this case, args is a mapping of our filters – director_name (which is a String) and first (which is an Int). [lines 23-26]

Important: Even though we benefit from the Relay pagination mechanisms, we still have to add first to the args mapping. The same applies to every argument that you want to use in the query. The library already exposes a mapping of all the Relay pagination arguments mentioned earlier ⬆️ (first, last, before, and after). It is called connection_args and you can use it like this:

from sgqlc.types.relay import Node, Connection, connection_args

...

class Query(Type):
    movies = Field(
        MovieConnection,
        args={
            'director_name': String,
            **connection_args()
        }
    )
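Why do arguments have to be declared at all? A declared mapping lets the query builder reject typos and wrong value types before anything is sent. The sketch below imitates that idea with plain dictionaries. PAGINATION_ARGS is our stand-in for what connection_args contributes (the four pagination arguments listed earlier), and check_args is a hypothetical helper; the real validation in sgqlc works on its own type classes and may raise different errors.

```python
# Stand-in for the Relay pagination arguments (an assumption for illustration).
PAGINATION_ARGS = {"after": str, "before": str, "first": int, "last": int}

# Our own filter plus the shared pagination arguments, merged by unpacking.
DECLARED_ARGS = {"director_name": str, **PAGINATION_ARGS}


def check_args(declared, **provided):
    """Reject arguments that were not declared or have the wrong type."""
    for name, value in provided.items():
        if name not in declared:
            raise KeyError(f"unknown argument: {name}")
        if not isinstance(value, declared[name]):
            raise TypeError(f"{name} expects {declared[name].__name__}")
    return provided


accepted = check_args(DECLARED_ARGS, director_name="Vince Gilligan", first=3)
print(accepted)
```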

With this, our schema is done.

Building the query

The beginning of the query building process is the same as the one we’ve already done for the simpler query:

from sgqlc.operation import Operation
from sgqlc.endpoint.http import HTTPEndpoint

from .schema import Query


query = Operation(Query)

query.movies(director_name='Vince Gilligan', first=3)

But now we are able to provide values to the filters that we’ve already mapped in the Query by calling the movies property and providing the values as keyword arguments.
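The step from Python keyword arguments to the argument list in the query can be sketched with the standard library. Both helpers below are hypothetical and only approximate what a query builder does. In particular, JSON encoding matches GraphQL literal syntax for strings, numbers, and booleans, but not for enums or input objects.

```python
import json


def to_camel_case(snake_name):
    """Convert snake_case to camelCase (illustrative helper)."""
    head, *tail = snake_name.split("_")
    return head + "".join(part.capitalize() for part in tail)


def render_arguments(**kwargs):
    """Render keyword arguments as a GraphQL argument list."""
    rendered = ", ".join(
        f"{to_camel_case(name)}: {json.dumps(value)}"
        for name, value in kwargs.items()
    )
    return f"({rendered})" if rendered else ""


print("movies " + render_arguments(director_name="Vince Gilligan", first=3))
```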

By now, we have constructed the movies field with its arguments, which is the first part of our target query:

query {
    movies (directorName: "Vince Gilligan", first: 3) {
        edges {
            node {
                id
                name
                rating
                releaseDate
            }
        },
        pageInfo {
            endCursor
            startCursor
            hasNextPage
            hasPreviousPage
        }
    }
}

Along with the filters, we need to attach our edges to the query. The idea here is the same as the previous example:

You can attach all edges:

from sgqlc.operation import Operation
from sgqlc.endpoint.http import HTTPEndpoint

from .schema import Query


query = Operation(Query)

query.movies(director_name='Vince Gilligan', first=3)

query.movies.edges()  # Attaches all fields (along with nodes and edges)

Or explicitly choose specific ones:

from sgqlc.operation import Operation
from sgqlc.endpoint.http import HTTPEndpoint

from .schema import Query


query = Operation(Query)

query.movies(director_name='Vince Gilligan', first=3)

query.movies.edges.node.name()  # Attaches only the name field from the node

With this, we’re almost ready with the query:

query {
    movies (directorName: "Vince Gilligan", first: 3) {
        edges {
            node {
                id
                name
                rating
                releaseDate
            }
        },
        pageInfo {
            endCursor
            startCursor
            hasNextPage
            hasPreviousPage
        }
    }
}

The last step is to add the pageInfo subquery (which comes by default, but should be explicitly attached):

from sgqlc.operation import Operation
from sgqlc.endpoint.http import HTTPEndpoint

from .schema import Query


query = Operation(Query)

query.movies(director_name='Vince Gilligan', first=3)

query.movies.edges()
query.movies.page_info()

Having this, we’ve successfully constructed our query with filtering & pagination:

query {
    movies (directorName: "Vince Gilligan", first: 3) {
        edges {
            node {
                id
                name
                rating
                releaseDate
            }
        },
        pageInfo {
            endCursor
            startCursor
            hasNextPage
            hasPreviousPage
        }
    }
}

Same as the previous example, the last step of the process is to call a GraphQL endpoint which will execute the query that you’ve built:

from sgqlc.operation import Operation
from sgqlc.endpoint.http import HTTPEndpoint

from .schema import Query


query = Operation(Query)

query.movies(director_name='Vince Gilligan', first=3)

query.movies.edges()
query.movies.page_info()

endpoint = HTTPEndpoint(url='https://.../graphql')
result = endpoint(query=query)

The result variable contains the actual JSON response from the endpoint that executed our query:

{
  "data": {
    "movies": {
      "edges": [
        {
          "node": {
            "id": "Qm9va2luZ05vZGU6ZTllMWUyNzQtYTE2NS00ZmE1LTg3MzctZDNiYWI4YzJmMWM2",
            "name": "Breaking Bad",
            "rating": 9.5,
            "releaseDate": "20-01-2008"
          }
        },
        ...
      ],
      "pageInfo": {
        "hasNextPage": true,
        ...
      }
    }
  }
}
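With a response of this shape, fetching every page is a matter of following endCursor while hasNextPage is true. The sketch below shows that loop against canned pages instead of a real API. The collect_nodes and fake_fetch_page functions, the cursor strings, and the placeholder titles are all hypothetical. A real fetch_page would rebuild the query with after set to the cursor, which requires after to be declared among the arguments of the field (for example through connection_args).

```python
def collect_nodes(fetch_page):
    """Gather the nodes of every page by following endCursor."""
    nodes, cursor = [], None
    while True:
        connection = fetch_page(after=cursor)
        nodes.extend(edge["node"] for edge in connection["edges"])
        page_info = connection["pageInfo"]
        if not page_info["hasNextPage"]:
            return nodes
        cursor = page_info["endCursor"]


# Canned pages standing in for the "movies" part of real responses.
PAGES = {
    None: {
        "edges": [{"node": {"name": "Movie 1"}}, {"node": {"name": "Movie 2"}}],
        "pageInfo": {"hasNextPage": True, "endCursor": "cursor-2"},
    },
    "cursor-2": {
        "edges": [{"node": {"name": "Movie 3"}}],
        "pageInfo": {"hasNextPage": False, "endCursor": "cursor-3"},
    },
}


def fake_fetch_page(after=None):
    """Return the canned page that starts after the given cursor."""
    return PAGES[after]


names = [node["name"] for node in collect_nodes(fake_fetch_page)]
print(names)
```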

We got it, but with Python! 🌟

Resources

Building GraphQL queries by hand can be a tough job once the queries start to grow bigger. By using Python to do the job, we can achieve the end result in a lot less time and have a scalable codebase that we can extend further on.

Here are the main resources that I’ve used to create this article:

The sgqlc library implementation and documentation. It is worthwhile to dig through the implementation of the library and see how they’ve approached certain things.
https://graphql.org/learn/pagination/
https://relay.dev/graphql/connections.htm
https://relay.dev/docs/guides/graphql-server-specification/
An additional resource that may be interesting for you is a video of a presentation that I gave on this topic at a Django meetup in Bulgaria, held on the 16th of January 2020. Here are the slides from the presentation.

Thanks for reading and I hope this article helped you! 🙌🏼
