Metaphor Connectors

This repository contains a collection of Python-based "connectors" that extract metadata from various sources to ingest into the Metaphor platform.

Installation

This package requires Python 3.9 or later. You can verify the version on your system by running the following command:

python -V  # or python3 on some systems

Once verified, you can install the package using pip:

pip install "metaphor-connectors[all]"  # or pip3 on some systems

This installs all connectors and their required dependencies. To install only a subset of the dependencies, specify the corresponding extra, e.g.

pip install "metaphor-connectors[snowflake]"
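If you need more than one extra, pip accepts a comma-separated list inside the brackets. The following is a minimal sketch of a helper that builds such a requirement string; the function name `requirement_with_extras` is hypothetical, and `bigquery` is only assumed to be a valid extra name (this README shows `all` and `snowflake`).

```python
from typing import Iterable


def requirement_with_extras(
    extras: Iterable[str], package: str = "metaphor-connectors"
) -> str:
    """Build a pip requirement string such as 'metaphor-connectors[snowflake]'."""
    # Drop blanks and duplicates; sort so the result is deterministic.
    names = sorted({e.strip() for e in extras if e.strip()})
    if not names:
        return package
    return f"{package}[{','.join(names)}]"


if __name__ == "__main__":
    # "bigquery" is assumed to be a valid extra name for illustration.
    print(requirement_with_extras(["snowflake", "bigquery"]))
```

The resulting string can be passed, quoted, to `pip install`.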

Similarly, you can declare the package as a dependency in requirements.txt or pyproject.toml.
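To double-check the setup from Python itself, the sketch below tests the interpreter version against the 3.9 minimum and looks up the installed package version using the standard library's `importlib.metadata`. The helper names `python_is_supported` and `installed_version` are hypothetical and not part of this package.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 9)  # minimum version stated in the Installation section


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(package: str = "metaphor-connectors"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("metaphor-connectors:", installed_version() or "not installed")
```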

Docker

We automatically push a Docker image to Docker Hub as part of our CI/CD pipeline. See this page for more details.

GitHub Action

You can also run the connectors in your CI/CD pipeline using the Metaphor Connectors GitHub Action.

Connectors

Each connector lives in its own directory under metaphor and extends the metaphor.common.BaseExtractor class.

| Connector Name | Metadata |
| --- | --- |
| athena | Schema, description, queries |
| azure_data_factory | Lineage, pipeline |
| bigquery | Schema, description, statistics, queries |
| bigquery.lineage | Lineage |
| bigquery.profile | Data profile |
| confluence | Document embeddings |
| custom.data_quality | Data quality |
| custom.governance | Ownership, tags, description |
| custom.lineage | Lineage |
| custom.metadata | Custom metadata |
| custom.query_attributions | Query attributions |
| datahub | Description, tag, ownership |
| dbt | dbt model, test, lineage |
| dbt.cloud | dbt model, test, lineage |
| fivetran | Lineage, pipeline |
| glue | Schema, description |
| great_expectations | Data monitor |
| informatica | Lineage, pipeline |
| looker | Looker view, explore, dashboard, lineage |
| kafka | Schema, description |
| metabase | Dashboard, lineage |
| mongodb | Schema, statistics |
| monte_carlo | Data monitor |
| mssql | Schema |
| mysql | Schema, description |
| openapi | API, description |
| oracle | Schema, description, queries |
| notion | Document embeddings |
| postgresql | Schema, description, statistics |
| postgresql.profile | Data profile |
| power_bi | Dashboard, lineage |
| quick_sight | Dashboard, lineage |
| redshift | Schema, description, statistics, queries |
| redshift.profile | Data profile |
| s3 | Schema, description |
| sharepoint | Document embeddings |
| snowflake | Schema, description, statistics, queries |
| snowflake.profile | Data profile |
| static_web | Document embeddings |
| synapse | Schema, queries |
| tableau | Dashboard, lineage |
| thought_spot | Dashboard, lineage |
| trino | Schema, description, queries |
| unity_catalog | Schema, description |
| unity_catalog.profile | Data profile, statistics |
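The pattern of extending a common base class can be illustrated with a self-contained sketch. Note that the `BaseExtractor` below is a stand-in defined locally so the example runs without the package installed; its `extract` method, the return type, and the `InMemoryExtractor` class are assumptions for illustration and do not reproduce the real interface of metaphor.common.BaseExtractor, which is not specified in this README.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class BaseExtractor(ABC):
    """Local stand-in for metaphor.common.BaseExtractor (assumed shape only)."""

    @abstractmethod
    def extract(self) -> List[Dict[str, str]]:
        """Return the metadata entities found in the source."""


class InMemoryExtractor(BaseExtractor):
    """Hypothetical connector that reads table descriptions from a dict."""

    def __init__(self, tables: Dict[str, str]) -> None:
        self._tables = tables

    def extract(self) -> List[Dict[str, str]]:
        # Emit one entity per table, sorted by name for stable output.
        return [
            {"name": name, "description": description}
            for name, description in sorted(self._tables.items())
        ]


if __name__ == "__main__":
    extractor = InMemoryExtractor({"orders": "Customer orders"})
    for entity in extractor.extract():
        print(entity)
```

Keeping the source-specific logic inside a subclass lets every connector be driven the same way, regardless of which system it reads from.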

Development

See Development Environment for instructions on setting up your local development environment.

Custom Connectors

See Adding a Custom Connector for instructions and a full example of creating your own custom connector.