Merge branch 'main' into public_main
Showing 156 changed files with 17,989 additions and 50 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,48 @@ | ||
# ddhcb-release | ||
Public Release of DDHC-B | ||
This repository contains source code for the SafeTab-H disclosure | ||
avoidance application. SafeTab-H was used by the Census Bureau for the | ||
protection of individual 2020 Census responses in the tabulation and | ||
publication of the Detailed Demographic and Housing Characteristics | ||
File B (DDHC-B). Previously, the Census Bureau has released the source | ||
code for SafeTab-P, the application used to protect the Detailed | ||
Demographic and Housing Characteristics File A (DDHC-A). | ||
|
||
Using the mathematical principles of formal privacy, SafeTab-H infused | ||
noise into Census survey results to create *privacy-protected | ||
microdata* which were used by Bureau subject matter experts to | ||
tabulate the 2020 DDHC-B product. SafeTab-H was built on Tumult's | ||
"Analytics" and "Core" platforms. Both SafeTab-H and the underlying | ||
platforms are implemented in Python. The latest version of the | ||
platforms can be found at <https://tmlt.dev/>. | ||
|
||
In the interests of both transparency and scientific advancement, the | ||
Census Bureau committed to releasing any source code used in creation | ||
of products protected by formal privacy guarantees. In the case of the | ||
Detailed Demographic & Housing Characteristics publications, this | ||
includes code developed under contract by Tumult Software (tmlt.io) | ||
and the MITRE Corporation. Tumult's underlying platform is evolving, and | ||
the code in the repository is a snapshot of the code used for the | ||
production of the DDHC-B product. | ||
|
||
The Bureau has already separately released the internally developed | ||
software for the Top Down Algorithm (TDA) used in production of the | ||
2020 Redistricting and the 2020 Demographic & Housing Characteristics | ||
products. | ||
|
||
The software in this repository is divided across multiple | ||
sub-directories, including: | ||
* `configs` contains the specific configuration files used for the | ||
production DDHC-B runs, including privacy loss budget (PLB) allocations | ||
and the rules for adaptive table generation. These configurations reflect | ||
decisions by the Bureau's DSEP (Data Stewardship Executive Policy) committee | ||
based on experiments conducted by Census Bureau staff. | ||
* `safetab-h/safetab_h` contains the source code for the application itself as used | ||
to generate the protected microdata used in production. | ||
* `safetab-h/safetab_utils` contains utilities common among the SafeTab products | ||
developed by Tumult for the Census Bureau. | ||
* `mitre/cef_readers` contains code by MITRE to read the Census input | ||
files used by the SafeTab applications. | ||
* `tumult` contains the Tumult Analytics platform. This is divided | ||
into `common`, `analytics`, and `core` directories. The `core` directory | ||
also includes a pre-packaged Python *wheel* for the core library. | ||
* `ctools` contains Python utility libraries developed by the Census | ||
Bureau's DAS team and used by the MITRE CEF readers. |
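
The noise infusion that the README above attributes to SafeTab-H can be illustrated with a minimal sketch. This is not SafeTab-H code and it does not use the Tumult Analytics API. The function names, the example counts, and the choice of a two-sided geometric (discrete Laplace) mechanism are all assumptions made for illustration, using only the Python standard library. The mechanism and parameters actually used in production are defined by the code and `configs` in the repository.

```python
from __future__ import annotations

import math
import random


def two_sided_geometric(epsilon: float, rng: random.Random) -> int:
    """Draw integer noise k with probability proportional to exp(-epsilon * |k|)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")

    def geometric() -> int:
        # Inverse-CDF sampling of a geometric variable on {0, 1, 2, ...}
        # with failure probability exp(-epsilon).
        u = 1.0 - rng.random()  # u lies in (0, 1], which avoids log(0)
        return int(math.floor(-math.log(u) / epsilon))

    # The difference of two i.i.d. geometric draws is two-sided geometric.
    return geometric() - geometric()


def noisy_counts(
    counts: dict[str, int], epsilon: float, seed: int | None = None
) -> dict[str, int]:
    """Add independent integer noise to every cell of a table of counts.

    Assumes (hypothetically) that each record contributes to exactly one
    cell, so adding or removing one record changes one count by 1.
    """
    rng = random.Random(seed)
    return {cell: n + two_sided_geometric(epsilon, rng) for cell, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical household counts for three made-up categories.
    true_counts = {"owner": 120, "renter": 45, "other": 3}
    print(noisy_counts(true_counts, epsilon=1.0, seed=2020))
```

Smaller values of `epsilon` spread the noise more widely and give stronger protection at the cost of accuracy. Noisy counts can be negative, so a real pipeline also has to decide how to post-process them.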
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,16 +17,6 @@ Copyright 2024 Tumult Labs | |
|
||
This repository contains SafeTab-H and its supporting Tumult-developed libraries. For instructions on running SafeTab-H, see its [README](safetab_h/README.md). | ||
|
||
### Access to the Deliverable | ||
|
||
The source code and documentation for this deliverable can be accessed by executing the following command at the command line (or entering the URL into the clone window of a client, e.g., GitHub Desktop): | ||
|
||
``` | ||
git clone https://decennial-census:[email protected]/tumult-labs/safetab-h-release.git | ||
``` | ||
|
||
In the URL above, `AaY8XLQ8_zanZhSiKtJf` is a GitLab deploy token associated with the username `decennial-census`. This grants read access to this repository. | ||
|
||
### Contents | ||
|
||
In the repository there are six folders, each of which contains a component of the release: | ||
|
@@ -42,6 +32,7 @@ SafeTab-H also requires a CEF reader module for reading data from Census' file f | |
|
||
For details, consult each library's `README` within its respective subfolder. To see which new features have been added since the previous versions, consult their respective `CHANGELOG`s. | ||
|
||
<<<<<<< HEAD | ||
### Synthetic Data | ||
|
||
This release also comes with a set of synthetic data files that can be used to test SafeTab-H. The ZIP file containing the sample files is hosted on Amazon Simple Storage Service (Amazon S3). Please note that the download link will be valid until 2024-04-09 at 12:00 pm Eastern. | ||
|
@@ -80,3 +71,5 @@ The download file is `safetab-h-full-size-synthetic-data.zip` will contain the f | |
- `pop-group-totals.txt`: The T1 output file from a SafeTab-P run on a 300 million record synthetic dataset. | ||
|
||
See [SafeTab-H Spec Doc](safetab_h/SafeTab_H_Documentation.pdf) for a description of each file. See the [SafeTab-H Library `README`](safetab_h/README.md) for more input directory setup notes. | ||
======= | ||
>>>>>>> main |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
*.xlsx binary | ||
*.docx binary | ||
*.py text eol=auto | ||
*.ini text eol=auto | ||
*.md text eol=auto | ||
*.tex text eol=auto | ||
*.txt text eol=auto | ||
*.bat text eol=auto | ||
*.log text eol=auto | ||
.gitattributes text eol=auto | ||
.gitignore text eol=auto |
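
How the attribute rules added above apply to individual paths can be approximated with a short sketch. It is a simplification and an assumption, not Git's implementation: patterns are matched against the file name only, using `fnmatchcase`, later matching lines override earlier ones per attribute, and the helper names `parse_rules` and `attributes_for` are hypothetical. For a real checkout, `git check-attr` gives the authoritative answer.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A subset of the rules from the diff above, enough for illustration.
GITATTRIBUTES = """\
*.xlsx binary
*.docx binary
*.py text eol=auto
*.md text eol=auto
.gitattributes text eol=auto
"""


def parse_rules(text):
    """Return (pattern, [attribute, ...]) pairs, skipping blanks and comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        pattern, *attrs = line.split()
        rules.append((pattern, attrs))
    return rules


def attributes_for(path, rules):
    """Collect the attributes of every rule whose pattern matches the file name."""
    name = PurePosixPath(path).name
    result = {}
    for pattern, attrs in rules:
        if fnmatchcase(name, pattern):
            for attr in attrs:
                key, _, value = attr.partition("=")
                # Attributes without a value (e.g. `text`, `binary`) are recorded
                # as "set". In Git, `binary` is a built-in macro; this sketch
                # does not expand it.
                result[key] = value if value else "set"
    return result


if __name__ == "__main__":
    rules = parse_rules(GITATTRIBUTES)
    for example in ("safetab_h/main.py", "docs/tables.xlsx", "figure.png"):
        print(example, attributes_for(example, rules))
```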
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,101 @@ | ||
|
||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
# Byte-compiled / optimized / DLL files | ||
# C extensions | ||
# Distribution / packaging | ||
# Django stuff: | ||
# Environments | ||
# Flask stuff: | ||
# Installer logs | ||
# Jupyter Notebook | ||
# PyBuilder | ||
# PyInstaller | ||
# Rope project settings | ||
# SageMath parsed files | ||
# Scrapy stuff: | ||
# Sphinx documentation | ||
# Spyder project settings | ||
# Translations | ||
# Unit test / coverage reports | ||
# celery beat schedule file | ||
# mkdocs documentation | ||
# mypy | ||
# pyenv | ||
*$py.class | ||
*.cover | ||
*.egg | ||
*.egg-info/ | ||
*.log | ||
*.manifest | ||
*.mo | ||
*.pot | ||
*.py[cod] | ||
*.sage.py | ||
*.so | ||
*.spec | ||
*~ | ||
.DS_Store | ||
.Python | ||
.cache | ||
.coverage | ||
.coverage.* | ||
.eggs/ | ||
.env | ||
.hypothesis/ | ||
.installed.cfg | ||
.ipynb_checkpoints | ||
.mypy_cache/ | ||
.pytest_cache/ | ||
.python-version | ||
.ropeproject | ||
.scrapy | ||
.spyderproject | ||
.spyproject | ||
.tox/ | ||
.venv | ||
.webassets-cache | ||
/site | ||
ENV/ | ||
MANIFEST | ||
__pycache__/ | ||
build/ | ||
celerybeat-schedule | ||
coverage.xml | ||
db.sqlite3 | ||
develop-eggs/ | ||
dist/ | ||
docs/_build/ | ||
downloads/ | ||
eggs/ | ||
env.bak/ | ||
env/ | ||
htmlcov/ | ||
instance/ | ||
lib/ | ||
lib64/ | ||
local_settings.py | ||
nosetests.xml | ||
parts/ | ||
pip-delete-this-directory.txt | ||
pip-log.txt | ||
sdist/ | ||
target/ | ||
var/ | ||
venv.bak/ | ||
venv/ | ||
wheels/ | ||
output.* | ||
*.aux | ||
*.tex | ||
demo?.html | ||
demo?.md | ||
demo*.png | ||
*.png | ||
test.gz | ||
.idea | ||
.idea/ | ||
|
||
TAGS | ||
tydoc_awsome_demo.html |