import jade-lta data into FC #221

Open
dsschult opened this issue Nov 29, 2021 · 11 comments

@dsschult
Contributor

Import old archive data from jade-lta into the FC, in the proper format to be retrieved.

@ric-evans
Member

Sounds like this involves appending to the FC data schema. Can this be reflected in https://github.com/WIPACrepo/file_catalog/blob/master/file_catalog/schema/types.py#L163?

@dsschult
Contributor Author

I'm not sure if it does or not. Depends exactly how much is optional, and how much info @blinkdog has in the jade-lta DB.

@ric-evans
Member

Including optional fields, it's nice to have a centralized reference of everything that could possibly be in an FC record, even though the schema is not prescriptive and records aren't fully validated (as of now).

@jnbellinger
Contributor

BTW, some checksums are NULL. Not a huge number, but that's another corner case for us.

@ric-evans
Member

> BTW, some checksums are NULL. Not a huge number, but that's another corner case for us.

What's an instance of this?

@ric-evans
Member

A checksum with a sha512 subfield is mandatory in the validation step (https://github.com/WIPACrepo/file_catalog/blob/master/file_catalog/schema/validation.py#L47). NULL/None values will fail to POST.
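
For illustration, a minimal sketch of a pre-check that could be run on records before POSTing them. `has_sha512` is a hypothetical helper, not the validator linked above, and treating any non-empty string as acceptable is an assumption:

```python
from typing import Any, Dict


def has_sha512(metadata: Dict[str, Any]) -> bool:
    """Return True if the record carries a non-empty checksum.sha512 string.

    Hypothetical pre-check; the real FC validation may apply other rules.
    """
    checksum = metadata.get("checksum")
    # A NULL/None checksum (or any non-dict value) cannot hold a sha512 subfield.
    if not isinstance(checksum, dict):
        return False
    sha512 = checksum.get("sha512")
    # Reject None, empty strings, and non-string values.
    return isinstance(sha512, str) and len(sha512) > 0
```

Records for which this returns False would be the NULL-checksum corner cases that need a checksum computed before import.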

@jnbellinger
Contributor

Of course the file isn't there right now, but
/data/exp/IceCube/2016/unbiased/PFRaw/1020/PFRaw_PhysicsFiltering_Run00128628_Subrun00000000_00000193.tar.gz

@ric-evans
Member

Looks like a regular tar. Would it be difficult to get checksums for these?

@jnbellinger
Contributor

We can

  1. pull the bundles back from NERSC and unpack them

  2. read from the pole disk

Option 2 is obviously somewhat faster, but there's a small chance that the file got corrupted before bundling, and it would be good to verify it.

@blinkdog
Contributor

blinkdog commented Dec 2, 2021

There should be checksums for those files in the regular JADE database too.
It may be possible to restore a backup and run a query for it.
I have a bunch of these backups in my /data/user directory.

As for the files at NERSC, there's no help for it. We'll need to bring the bundle back in order to check that the file in there matches our checksum (regardless of whether we get it from pole disk or a JADE database).
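
Once a bundle is back and unpacked, the comparison itself could look roughly like the sketch below. It assumes the recorded checksum is a SHA-512 hex digest; `sha512_of_file` and `matches_checksum` are hypothetical names, not part of JADE or the FC:

```python
import hashlib
from pathlib import Path
from typing import Union


def sha512_of_file(path: Union[str, Path], chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-512 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archive files do not fill memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: Union[str, Path], expected_sha512: str) -> bool:
    """Compare a file's digest with the recorded one (case-insensitive hex)."""
    return sha512_of_file(path) == expected_sha512.strip().lower()
```

The same function works on the unpacked file from NERSC and on the copy from the pole disk, so both sources can be checked against one recorded value.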

@dsschult
Contributor Author

dsschult commented Dec 3, 2021

If we need to bring all the files back, that sounds like a great thing to do next year for running pass3.
