Using Dataloader.KV to call APIs lazily

2019/11/04

Dataloader is a library for loading data in batches. It integrates very well with Absinthe, a GraphQL library for Elixir.

My background with Absinthe & Dataloader

I’ve been working a bit with GraphQL (Absinthe/Elixir) microservices in the past few months. For the most part, those services have been able to exist in their realm and not need to query outside of it. However, from time to time, there’s a need to incorporate a field - an association - within a schema which calls APIs outside of that realm. Sometimes those APIs are friendly to batch queries, and sometimes they’re not. Either way, I’ve found that Dataloader.KV can be used to effectively manage batching requests to those services.

When I first came across the problem of calling other APIs efficiently in Absinthe, I only knew of Dataloader.KV. I read through some of the documentation and asked around the community for more information. Oddly, there seemed to be few resources on getting it going in a simple case. I hope this post helps other people get started with it.

Getting started

Let’s skip ahead to the result first. I find myself making use of Dataloader.KV in two ways.

Pretend we have a User with an association, posts, which is hosted on an external service we’ll call postal:

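# Both usages rely on `import Absinthe.Resolution.Helpers`,
# which provides `dataloader/1` and `on_load/2`.

# 1. With the `dataloader` helper
object :user do
  field :posts, list(:post), resolve: dataloader(:postal)
end

# 2. Loading manually inside a resolver
field :posts, list(:post) do
  resolve fn parent, _args, %{context: %{loader: loader}} ->
    loader
    |> Dataloader.load(:postal, :posts, parent)
    |> on_load(fn loader ->
      loader
      |> Dataloader.get(:postal, :posts, parent)
      |> do_something_with_the_result()
    end)
  end
end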

For those using Dataloader already, both of those usages should look familiar.

The question now is how to make it work… First, we’re going to create a module that will house the code for calling the external API.

We need to define a function to give to Dataloader.KV that takes the association and the parent records and resolves the data. Here it is called dataloader_postal_loader (arbitrarily named). I factor the actual resolving of data out of this function and just let it manage the incoming options.

Note: One thing that’s important to get right is that you receive the parent records as a MapSet, and Dataloader expects you to return a map from each of those records to its result. The keys of that map must be exactly the records as they came in. If these records are Ecto structs that are missing some associations you need to load before you can resolve the external data, you will need to hold onto the original record alongside the one with loaded associations, so that you can use the original as the key in the result map.
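As a minimal sketch of that key-preserving rule, the module below pairs each original record with an enriched copy, fetches using the enriched copies, and keys the result by the originals. The module name KeyPreservingSketch and the preload and fetch functions are hypothetical stand-ins (for something like an Ecto preload and the external API call), not part of Dataloader.

```elixir
defmodule KeyPreservingSketch do
  # Hypothetical helper, not part of Dataloader.
  # `preload` takes one record and returns an enriched copy.
  # `fetch` takes a list of enriched records and returns a map of id => data.
  def build_results(parents, preload, fetch) do
    # Keep each original record next to its enriched copy.
    pairs = Enum.map(parents, fn original -> {original, preload.(original)} end)

    # Resolve the external data using the enriched copies only.
    fetched = fetch.(Enum.map(pairs, fn {_original, enriched} -> enriched end))

    # Key the result map by the untouched originals.
    Map.new(pairs, fn {original, enriched} ->
      {original, {:ok, Map.get(fetched, enriched.id)}}
    end)
  end
end
```

The module below follows the same rule in its simplest form: no preloading is needed, so the incoming users are used directly as keys.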

defmodule UserAppWeb.Sources.Postal do
  @spec dataloader_postal_loader() :: ({:posts, map} | :posts, MapSet.t(User.t()) -> map)
  def dataloader_postal_loader do
    fn
      # Signature for use with `dataloader` helper function
      {:posts, _opts}, users ->
        load_posts_for_users(users)
        
      # Signature for use with manual dataloading (without opts)
      :posts, users ->
        load_posts_for_users(users)
    end
  end
  
  # Example - just call an API which accepts bulk arguments
  @spec load_posts_for_users(MapSet.t(User.t())) :: map
  defp load_posts_for_users(users) do
    # Build a comma-separated list of ids for the bulk endpoint
    user_ids = Enum.map_join(users, ",", & &1.id)
    
    # Assumes the API returns a JSON object shaped like %{"<user_id>" => [string]}
    # and that a JSON decoder such as Jason is available.
    # Don't hard pattern match {:ok, ...} in the real world.
    # You could also use Flow to concurrently make external calls for non-bulk APIs
    {:ok, %HTTPoison.Response{body: body}} =
      HTTPoison.get("http://example.com/posts?user_ids=#{user_ids}")

    posts_by_user_id = Jason.decode!(body)

    # Wrapping result in ok tuple will result in `nil` being returned for the field
    # if no result was found for the user.
    # JSON object keys decode as strings, hence to_string/1.
    Map.new(users, fn original_user ->
      {original_user, {:ok, Map.get(posts_by_user_id, to_string(original_user.id))}}
    end)
  end
end
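For APIs that don't accept bulk arguments, the comment above mentions Flow. As a rough sketch of the same idea using only the standard library, Task.async_stream can make one call per user concurrently and still return a map keyed by the original records. This swaps Flow for Task.async_stream; the module name ConcurrentFetchSketch, the fetch_one function, and the default limits are assumptions for illustration.

```elixir
defmodule ConcurrentFetchSketch do
  # Hypothetical helper, not part of Dataloader.
  # `fetch_one` takes a single user and returns {:ok, posts} or {:error, reason}.
  def load_for_users(users, fetch_one, opts \\ []) do
    # Fix the order once so results can be paired back with their users.
    users = Enum.to_list(users)

    results =
      users
      |> Task.async_stream(fetch_one,
        max_concurrency: Keyword.get(opts, :max_concurrency, 8),
        timeout: Keyword.get(opts, :timeout, 5_000),
        # Kill a slow task instead of exiting the caller
        on_timeout: :kill_task
      )
      |> Enum.to_list()

    # async_stream keeps input order by default, so zipping is safe.
    users
    |> Enum.zip(results)
    |> Map.new(fn
      {user, {:ok, result}} -> {user, result}
      {user, {:exit, reason}} -> {user, {:error, reason}}
    end)
  end
end
```

A call that times out ends up as an error tuple for that user only, while the other users keep their results. A fetch_one that raises will still crash the caller, since the tasks are linked to it.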

Finally, you will need to add a Dataloader source to make use of this functionality, probably somewhere like your main schema.ex:

def context(ctx) do
  postal_loader =
    UserAppWeb.Sources.Postal.dataloader_postal_loader()
    |> Dataloader.KV.new()

  loader =
    Dataloader.new()
    |> Dataloader.add_source(..., ...) # Your other sources
    |> Dataloader.add_source(:postal, postal_loader)

  Map.put(ctx, :loader, loader)
end

You should now be able to make GraphQL queries which lazily call the external API.

Further considerations

The above example covers only the very simple case. Managing errors from your external APIs takes a bit more work. Instead of returning {:ok, value} or {:ok, nil} for every result, you can in fact return an error tuple, but you will need to handle it specially in your field resolver. I found that I couldn’t get at the ok or error tuple without first changing the Dataloader get_policy option to :tuples (see the Dataloader options). This looks a bit like:

field :posts, list(:post) do
  resolve fn parent, _args, %{context: %{loader: loader}} ->
    loader
    |> Dataloader.load(:postal, :posts, parent)
    |> on_load(fn loader ->
      loader
      |> Map.put(:options, Keyword.put(loader.options, :get_policy, :tuples)) # So that we can get errors out
      |> Dataloader.get(:postal, :posts, parent)
      |> case do
        {:ok, {:ok, _value} = success} ->
          success
        {:ok, nil} ->
          {:ok, nil}
        {:error, _error} = error ->
          error
      end
    end)
  end
end
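If several fields need the same handling, one option is to pull the case expression out into a small function. The sketch below mirrors the three shapes matched above and additionally turns non-string error reasons into a readable message. The module name PostalResultSketch, the function name, and the message wording are hypothetical.

```elixir
defmodule PostalResultSketch do
  # Hypothetical helper mirroring the `case` in the resolver above.
  # Input is what Dataloader.get returned under the :tuples policy.
  def to_resolution({:ok, {:ok, _value} = success}), do: success
  def to_resolution({:ok, nil}), do: {:ok, nil}
  def to_resolution({:error, reason}) when is_binary(reason), do: {:error, reason}

  # Format any other reason as a string so the error message stays readable.
  def to_resolution({:error, reason}) do
    {:error, "postal lookup failed: #{inspect(reason)}"}
  end
end
```

In the resolver, the case block would then become a single `|> PostalResultSketch.to_resolution()` step after Dataloader.get.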

If you have any better ways of managing that issue, I’d love to hear them - please get in touch!

- Andrew