This repository has been archived by the owner on Sep 28, 2022. It is now read-only.

aloelabs/tap-thegraph

 
 


tap-thegraph

tap-thegraph is a Singer tap for The Graph built with the Meltano Tap SDK.

Quickstart

# 1. Install our packages for extracting subgraph data.
npm install -g graphql-api-to-json-schema
pipx install git+https://github.com/superkeyio/tap-thegraph.git

# 2. Install a Singer target for loading the data to a destination (for example, CSV).
pipx install target-csv

# 3. Configure which subgraphs and entities to extract (for example, all markets on Compound V2).
echo '{"subgraphs":[{"url":"https://api.thegraph.com/subgraphs/name/graphprotocol/compound-v2","entities":[{"name":"Market"}]}]}' > config.json

# 4. Run the pipeline!
tap-thegraph --config config.json | target-csv

Installation

npm install -g graphql-api-to-json-schema
pipx install git+https://github.com/superkeyio/tap-thegraph.git

Help?!?

tap-thegraph --help

Configuration

You must pass in a JSON file following this format:

{
  "subgraphs": [
    {
      "url": "<SUBGRAPH_URL>",
      "entities": [
        { "name": "<ENTITY_NAME>" }, 
        { "name": "<ENTITY_NAME>", "created_at": "<TIMESTAMP_OR_BLOCK_NUMBER_FIELD>" }, 
        ...
      ]
    },
    ...
  ]
}

See the examples/ directory for example config files.
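If escaping JSON in the shell gets awkward, the config can also be generated from a short script. The following is a minimal sketch, not part of the tap itself: the helper name build_config is hypothetical, and the created_at field name used in the test call is only an illustration. The subgraph URL and the Market entity come from the Quickstart above.

```python
import json


def build_config(subgraph_url, entities):
    """Return a config dict in the format described above.

    `entities` is a list of (name, created_at) pairs; pass None as
    created_at for entities without a creation field.
    This helper is hypothetical and not shipped with tap-thegraph.
    """
    entity_configs = []
    for name, created_at in entities:
        entity = {"name": name}
        if created_at is not None:
            entity["created_at"] = created_at
        entity_configs.append(entity)
    return {"subgraphs": [{"url": subgraph_url, "entities": entity_configs}]}


config = build_config(
    "https://api.thegraph.com/subgraphs/name/graphprotocol/compound-v2",
    [("Market", None)],
)

if __name__ == "__main__":
    # Print the JSON so it can be redirected into config.json.
    print(json.dumps(config, indent=2))
```

Saved under a file name of your choice (for example, make_config.py), this could be run as `python make_config.py > config.json`.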

Entities

For each entity that you want to extract, you must specify its name (e.g., Market). Optionally, you can also specify created_at, the name of a timestamp or block number field that records when the entity was created.

Specifying created_at for an entity enables "incremental" replication, which means that we can re-run the tap and resume where we left off instead of replicating everything again ("full table" replication).
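The difference between the two modes can be illustrated with a small, self-contained sketch. This is a conceptual model only and makes no claim about how the tap stores its state internally: the function replicate, the bookmark variable, and the blockNumber field are all assumptions made for illustration.

```python
def replicate(records, created_at=None, bookmark=None):
    """Return (records_to_emit, new_bookmark).

    Without created_at, every record is emitted on every run
    ("full table"). With created_at, only records newer than the
    bookmark are emitted ("incremental").
    Hypothetical helper, not the tap's actual implementation.
    """
    if created_at is None:
        return list(records), None

    ordered = sorted(records, key=lambda r: r[created_at])
    if bookmark is not None:
        # Simplification: records that share the bookmark value are skipped.
        ordered = [r for r in ordered if r[created_at] > bookmark]
    new_bookmark = ordered[-1][created_at] if ordered else bookmark
    return ordered, new_bookmark


# Illustrative data; blockNumber is an assumed field name.
markets = [
    {"id": "a", "blockNumber": 10},
    {"id": "b", "blockNumber": 20},
]

first_run, bookmark = replicate(markets, created_at="blockNumber")

# A new entity appears before the second run.
markets.append({"id": "c", "blockNumber": 30})
second_run, bookmark = replicate(markets, created_at="blockNumber", bookmark=bookmark)

# Full-table mode re-emits everything, regardless of earlier runs.
full_run, _ = replicate(markets)
```

In this sketch the second incremental run emits only the newly created record, while the full-table run emits all three.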

Batch size

By default, the tap extracts 1000 entities at a time. You can change that by specifying a batch_size at the root level of the configuration JSON.
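As a sketch of where the setting lives, the config below places batch_size next to subgraphs at the root of the JSON. The value 500 is only an example, and the helpers effective_batch_size and batches are hypothetical: they illustrate the documented default and the idea of extracting in fixed-size chunks, not the tap's own code.

```python
DEFAULT_BATCH_SIZE = 1000  # default stated in this README

config = {
    "batch_size": 500,  # root-level setting; 500 is an illustrative value
    "subgraphs": [
        {
            "url": "https://api.thegraph.com/subgraphs/name/graphprotocol/compound-v2",
            "entities": [{"name": "Market"}],
        }
    ],
}


def effective_batch_size(config):
    """Return the configured batch size, falling back to the default."""
    return config.get("batch_size", DEFAULT_BATCH_SIZE)


def batches(items, size):
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# 1,200 stand-in entities split into batches of 500, 500, and 200.
batch_lengths = [len(b) for b in batches(list(range(1200)), effective_batch_size(config))]
```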

Usage

Meltano

tap-thegraph was built with the Meltano SDK, so it integrates with Meltano's open-source data stack out of the box.

Executing the Tap Directly

You can also run the tap directly from the command line:

tap-thegraph --config config.json

Developer Resources

Initialize your Development Environment

pipx install poetry
poetry install

Create and Run Tests

Create tests within the tap_thegraph/tests subfolder and then run:

poetry run pytest

You can also test the tap-thegraph CLI directly using poetry run:

poetry run tap-thegraph --help

Testing with Meltano

Note: This tap will work in any Singer environment and does not require Meltano. Examples here are for convenience and to streamline end-to-end orchestration scenarios.

Your project comes with a custom meltano.yml project file already created. Open the meltano.yml and follow any "TODO" items listed in the file.

Next, install Meltano (if you haven't already) and any needed plugins:

# Install meltano
pipx install meltano
# Initialize meltano within this directory
cd tap-thegraph
meltano install

Now you can test and orchestrate using Meltano:

# Test invocation:
meltano invoke tap-thegraph --version
# OR run a test `elt` pipeline:
meltano elt tap-thegraph target-jsonl

SDK Dev Guide

See the dev guide for more instructions on how to use the SDK to develop your own taps and targets.
