An overview of the Western Pennsylvania Regional Data Center's data API services
The Western Pennsylvania Regional Data Center's data portal (where you can get lots of open data) runs on CKAN. CKAN has an API that allows the user to do things like get a list of all datasets on the WPRDC data portal, get a list of all datasets with resources and dataset metadata, download the data from any data table on the portal, or even run SQL queries.
Click this link to get a JSON-formatted response with all dogs licensed in Allegheny County whose names start with "MR.".
We've done a workshop on using web APIs in general and the WPRDC's APIs in particular. The linked GitHub repository includes examples and documention on using our data API.
We've also written query-slinger - a bare-bones interactive query editor/launcher for running SQL queries against a CKAN data portal, written in Python and run in the Python REPL for easy query debugging. This includes documentation and scripts with comments to get you statred.
The CKAN API documentation is here. Documenation for the datastore_search and datastore_search_sql endpoints (for querying data tables) are here and here.
The number one most asked question: Why is my CKAN SQL query not working?
Usually the answer is because the query is not formatted to match the requirements of Postgres (which is the database behind our data portal). Specifically, it's best to surround all 1) field names and table names with double quotes and 2) string values with single quotes, as in this example:
Click it to see it in action!
If you use Python or R, there are API wrappers that allow you to simplify the syntax needed to make API calls:
- query-slinger - A tool we wrote specifically to make it easier to query WPRDC data tables.
- Jupyter notebook showing how to get and use WPRDC data in Python - Page down to the "Using SQL queries" section of the notebook.
- Using the CKAN API wrapper + converting string fields to integers in SQL queries - Addressing a common pitfall when running SQL queries, this R script shows how to convert a string field to an integer and then use it in the WHERE clause of a SQL query. This also gives a simple example of using the ckanr wrapper package to more easily use the CKAN API.
- How to build and run very simple datastore queries using Python - Again the "datastore" is a database that stores tables of data on the WPRDC data portal (like uploaded CSV files).