Ensure file system ACLs of all files/directories are consistent with the type and state of data. #595

Open
pdblood opened this issue Jun 2, 2022 · 3 comments
Labels
I Infrastructure 🛑 ! Blocked

Comments

@pdblood
Member

pdblood commented Jun 2, 2022

Once data are uploaded, they pass through various states (upload, ingest, processing, QA, publication) and move to various locations on the file system. We want to ensure that file access controls are consistent with HuBMAP policy and best practices each time a change of state or location occurs.

See here for current policy:
https://docs.google.com/document/d/1pE-XxWVWhUHMQapTzX_VbUXHmYm7gRanMWG6BGEiXks/edit

@pdblood pdblood added the I Infrastructure label Jun 3, 2022
@Daghis
Contributor

Daghis commented Jun 17, 2022

I had been considering a Python script, but it appears that Puppet can handle this far more easily.

For reference, I'm looking at posix_acl.

@Daghis
Contributor

Daghis commented Jun 30, 2022

I've written Puppet code to manage the ACLs.

I'm building a replica data tree (same directory layout, same permissions, all data replaced with random 4K data files). It will be used to test validation and correction and to measure performance. (On the current filesystem this is a very expensive operation, so measuring the "cost" will let us propose a run frequency.)
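
For anyone following along, here is a minimal Python sketch of how such a replica tree could be built. It is not the code actually used here. The function name `build_replica` and its parameters are hypothetical, and it copies only the permission bits; ownership and POSIX ACL entries would need a separate step.

```python
import os
import stat


def build_replica(src_root: str, dst_root: str, file_size: int = 4096) -> int:
    """Mirror src_root's layout under dst_root, filling files with random bytes.

    Only the permission bits (mode) are copied. Ownership and POSIX ACL
    entries are NOT copied by this sketch. Returns the number of files created.
    """
    created = 0
    dir_modes = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        dst_dir = dst_root if rel == "." else os.path.join(dst_root, rel)
        os.makedirs(dst_dir, exist_ok=True)
        # Remember the source directory's mode; apply it after the walk.
        dir_modes.append((dst_dir, stat.S_IMODE(os.lstat(dirpath).st_mode)))
        for name in filenames:
            src_file = os.path.join(dirpath, name)
            st = os.lstat(src_file)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, and other special files
            dst_file = os.path.join(dst_dir, name)
            with open(dst_file, "wb") as fh:
                fh.write(os.urandom(file_size))  # random filler, no real data
            os.chmod(dst_file, stat.S_IMODE(st.st_mode))
            created += 1
    # Apply directory modes children-first, so a restrictive parent mode
    # cannot block changes to the directories beneath it.
    for path, mode in reversed(dir_modes):
        os.chmod(path, mode)
    return created
```

Timing a validation pass over the resulting tree would give a rough estimate of the cost, though the numbers depend heavily on the underlying filesystem.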

@Daghis
Contributor

Daghis commented Aug 11, 2022

I passed along a proposed ACL setup (promoting hubmap and hubseq to Unix GID permissions, plus other minor tweaks) as well as a script that should be run daily or weekly, depending on how long it takes to complete. Because the data include protected information, this is a security consideration, so running the script regularly is recommended as a backup.

This information was passed to Bill for review before anything is run. I also talked with Tod Pike about setting up a repository for the configuration files and a cron job to run the script periodically.

Notes were collected (including copy-and-pastes of the source files) in this doc.
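
As a rough illustration of what a periodic validation pass might look like, the Python sketch below compares a path's ACL entries against an expected set. It is not the script referenced above. The helper names (`parse_getfacl`, `acl_diff`, `read_acl`) and the entries in `EXPECTED` are hypothetical; the real expected values would come from the HuBMAP policy document. It assumes the `getfacl` command-line tool is installed.

```python
import subprocess


def parse_getfacl(text: str) -> set[str]:
    """Return the ACL entries from getfacl's text output, ignoring comments.

    Header lines such as "# file: ..." and blank lines are dropped, and a
    trailing "#effective:..." comment after an entry is stripped.
    """
    entries = set()
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()
        if entry:
            entries.add(entry)
    return entries


def acl_diff(actual: set[str], expected: set[str]) -> dict[str, list[str]]:
    """Compare two sets of ACL entries and report what is missing or extra."""
    return {
        "missing": sorted(expected - actual),
        "unexpected": sorted(actual - expected),
    }


def read_acl(path: str) -> set[str]:
    """Read a path's ACL by running the getfacl command-line tool."""
    result = subprocess.run(
        ["getfacl", path], capture_output=True, text=True, check=True
    )
    return parse_getfacl(result.stdout)


# Hypothetical expected entries, for illustration only. The real values
# come from the HuBMAP policy document, not from this sketch.
EXPECTED = {
    "user::rwx",
    "group::r-x",
    "group:hubseq:r-x",
    "mask::r-x",
    "other::---",
}
```

A periodic job could call `acl_diff(read_acl(path), EXPECTED)` for each path and log any non-empty result. Correcting a mismatch is deliberately left out of this sketch.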

@jpuerto-psc jpuerto-psc assigned jpuerto-psc and unassigned Daghis and jpuerto-psc Aug 23, 2022
@jpuerto-psc jpuerto-psc added the 🛑 ! Blocked label Aug 25, 2022
@shirey shirey added this to Pitt HIVE Jun 7, 2024
@shirey shirey moved this to Ready in Pitt HIVE Jun 7, 2024
@shirey shirey removed this from Pitt HIVE Jun 7, 2024