Common Core Data processing pipeline

Prerequisites

- Bash v4 or higher
- csvkit (optional): needed only for the optional tasks that generate schemas from CSV headers; the main task sequence runs without it
- Postgres, installed and runnable as the current user
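
A quick way to confirm the prerequisites before running anything is to check what is visible on PATH. The following is a minimal sketch, not part of the pipeline: the helper names (have_cmd, bash_major, version_ok) are made up for illustration, and csvcut is used only because it is one of the commands csvkit installs. It checks that psql exists, not that the Postgres server accepts connections from the current user.

```shell
# have_cmd NAME: succeed if NAME resolves to a command on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# bash_major: print the major version of the bash found on PATH.
bash_major() {
  bash -c 'echo "${BASH_VERSINFO[0]}"'
}

# version_ok MAJOR: succeed if MAJOR is 4 or higher.
version_ok() {
  [ "$1" -ge 4 ]
}

if have_cmd bash && version_ok "$(bash_major)"; then
  echo "bash: ok"
else
  echo "bash: version 4 or higher not found"
fi

# csvcut is one of the commands that csvkit installs.
if have_cmd csvcut; then
  echo "csvkit: ok"
else
  echo "csvkit: not found (only the optional schema tasks need it)"
fi

if have_cmd psql; then
  echo "psql: ok"
else
  echo "psql: not found"
fi
```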

Data Sources

This is a simplified workflow that imports school-level Common Core Data enrollment figures and joins them to district-level directory attributes.

It uses:

Public Elementary/Secondary School Universe Survey Data (v.1a) for the 2021-22 school year:
- Membership (school-level enrollment)
- Directory (school-level directory that contains fuller district associations)

This data is available through the CCD Data Files bulk downloader on the NCES website, under the nonfiscal -> school -> 2021-22 selection in the search menu.

The header extraction and import processes adapt easily to additional Common Core Data files, but incorporating a new dataset requires writing fresh filter and join files.
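
To give a sense of what header extraction involves, here is a minimal sketch that lists the column names of a CSV file using only standard tools. It is an illustration, not the pipeline's own code: csv_headers is a hypothetical helper, the sample rows are made up, and the split is naive (it assumes no commas inside quoted column names).

```shell
# csv_headers FILE: print each column name from the first line of FILE,
# one per line. Strips carriage returns and double quotes, then splits
# on commas, so it assumes no header contains a comma.
csv_headers() {
  head -n 1 "$1" | tr -d '\r"' | tr ',' '\n'
}

# Made-up sample file with a quoted header row and Windows line endings.
sample="$(mktemp)"
printf '"SCHOOL_YEAR","ST","LEAID"\r\n"2021-2022","AL","0100005"\r\n' > "$sample"
csv_headers "$sample"
rm -f "$sample"
```

With csvkit installed, `csvcut -n file.csv` lists column names with proper quote handling, and `csvsql -i postgresql file.csv` prints a CREATE TABLE statement inferred from the file. Whether the optional schema tasks use these particular commands is not stated here.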

Instructions

To run

Execute ./runme.sh to step through the scripts in order. The file is commented, so you can see the individual processing steps and re-run any of them once the pipeline is initialized.

To run a single task, pass its name as an argument: ./runme.sh task_name
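
For readers unfamiliar with this pattern, the sketch below shows one common way a script can run either every step or a single named step. It is not the contents of runme.sh: the task names (import, filter) and the task_ prefix are hypothetical, chosen only to illustrate the dispatch.

```shell
# Hypothetical tasks; the real runme.sh defines its own steps.
task_import() { echo "importing"; }
task_filter() { echo "filtering"; }

# run_task NAME: call task_NAME if such a function exists.
run_task() {
  if command -v "task_$1" >/dev/null 2>&1; then
    "task_$1"
  else
    echo "unknown task: $1" >&2
    return 1
  fi
}

# main [NAME]: with no argument run every task in order,
# otherwise run only the named task.
main() {
  if [ "$#" -eq 0 ]; then
    task_import && task_filter
  else
    run_task "$1"
  fi
}

main "$@"
```

Keeping each step in its own function means a failed step can be re-run alone without repeating the earlier ones.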

Sample query

To check that the import succeeded, connect to the database and run a query:

psql ccd_stats

select distinct statename from districts;
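
The same check can be run without opening an interactive session. This is a minimal sketch under the assumption that the database is named ccd_stats and has a districts table with a statename column, as in the query above; count_states is a hypothetical helper name.

```shell
# count_states [DB]: print how many distinct statename values the
# districts table holds. DB defaults to ccd_stats.
count_states() {
  if ! command -v psql >/dev/null 2>&1; then
    echo "psql not found on PATH" >&2
    return 127
  fi
  # -A: unaligned output, -t: rows only, -c: run one command and exit
  psql -d "${1:-ccd_stats}" -At -c 'select count(distinct statename) from districts;'
}

# Usage (requires a populated database):
# count_states
```

A nonzero exit status means psql is missing, the database could not be reached, or the districts table does not exist yet.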

Troubleshooting

If you get a permissions error when running runme.sh even though you own the file or otherwise have access to it, check whether the file is marked executable:

ls -alh

To make it executable:

chmod 755 runme.sh
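
The check and the fix can be combined so that chmod only runs when needed. A minimal sketch, where ensure_executable is a hypothetical helper name:

```shell
# ensure_executable FILE: set mode 755 on FILE if the current user
# cannot execute it; report what was done.
ensure_executable() {
  if [ ! -e "$1" ]; then
    echo "$1: no such file" >&2
    return 1
  fi
  if [ -x "$1" ]; then
    echo "$1: already executable"
  else
    chmod 755 "$1" && echo "$1: made executable"
  fi
}

# Usage, from the repository directory:
# ensure_executable runme.sh
```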

Chalkbeat/2021_ccd_processing
