[WIP] Coffea2023 #14
base: main
ewkcoffea/modules/selection_wwz.py
Outdated
mask = (lep_collection.idx == ak.flatten(ll_pairs_idx.l0[sfos_pair_closest_to_z_idx]))
mask = (mask | (lep_collection.idx == ak.flatten(ll_pairs_idx.l1[sfos_pair_closest_to_z_idx])))
flat_pair_idxs = ak.flatten(ll_pairs_idx[sfos_pair_closest_to_z_idx])
mask = ((lep_local_idx == flat_pair_idxs.l0) | (lep_local_idx == flat_pair_idxs.l1))
can you ... unfix ... this for now - some other people are looking at the over-touching issue in your code in-situ :-)
Unfixed : )
…argument's steps must be an iterable of integer offsets or start-stop pairs
@@ -43,7 +43,6 @@
"/store/user/kdownham/skimOutput/3LepTau_4Lep/DoubleEG_Run2016D-HIPM_UL2016_MiniAODv2_NanoAODv9-v2_NANOAOD_3LepTau_4Lep/output_1.root",
"/store/user/kdownham/skimOutput/3LepTau_4Lep/DoubleEG_Run2016D-HIPM_UL2016_MiniAODv2_NanoAODv9-v2_NANOAOD_3LepTau_4Lep/output_45.root",
"/store/user/kdownham/skimOutput/3LepTau_4Lep/DoubleEG_Run2016D-HIPM_UL2016_MiniAODv2_NanoAODv9-v2_NANOAOD_3LepTau_4Lep/output_38.root",
"/store/user/kdownham/skimOutput/3LepTau_4Lep/DoubleEG_Run2016D-HIPM_UL2016_MiniAODv2_NanoAODv9-v2_NANOAOD_3LepTau_4Lep/output_15.root",
so this file is opening, accessing the tree but the tree is completely empty upon checking by hand?
preprocess should just remove files like that when skip_bad_files is turned on.
Checking by hand, it seems the file has zero events. I have since found a few others that are empty as well and will probably remove those too (and I'll talk with the person who did the skimming to try to understand why their skimming code is producing some empty files).
The reason I noticed this now is that these empty files seem to cause a crash in apply_to_fileset like this [1]. (I guess previously they'd just been getting essentially skipped, but not causing a crash.)
I would hesitate to use skip_bad_files as a solution, as generally I would want to process exactly the same static set of input samples every run. This is to avoid getting unpredictably different results in different runs (due to instances where e.g. some transient xrd error causes one file to not be opened properly).
[1]
Traceback (most recent call last):
  File "/home/k.mohrman/coffea_dir/migrate_to_coffea2023_repo/ewkcoffea/analysis/wwz/run_wwz4l.py", line 356, in <module>
    histos_to_compute, reports = apply_to_fileset(
  File "/home/k.mohrman/coffea_dir/migrate_to_coffea2023_repo/ewkcoffea/coffea_dir/coffea/src/coffea/dataset_tools/apply_processor.py", line 125, in apply_to_fileset
    dataset_out = apply_to_dataset(
  File "/home/k.mohrman/coffea_dir/migrate_to_coffea2023_repo/ewkcoffea/coffea_dir/coffea/src/coffea/dataset_tools/apply_processor.py", line 67, in apply_to_dataset
    events = NanoEventsFactory.from_root(
  File "/home/k.mohrman/coffea_dir/migrate_to_coffea2023_repo/ewkcoffea/coffea_dir/coffea/src/coffea/nanoevents/factory.py", line 674, in events
    events = self._mapping(form_mapping=self._schema)
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffea2023_env00/lib/python3.9/site-packages/uproot/_dask.py", line 167, in dask
    files = uproot._util.regularize_files(files, steps_allowed=True, **options)
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffea2023_env00/lib/python3.9/site-packages/uproot/_util.py", line 927, in regularize_files
    for file_path, object_path, maybe_steps in _regularize_files_inner(
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffea2023_env00/lib/python3.9/site-packages/uproot/_util.py", line 888, in _regularize_files_inner
    maybe_steps = regularize_steps(maybe_steps)
  File "/blue/p.chang/k.mohrman/dir_for_miniconda/miniconda3/envs/coffea2023_env00/lib/python3.9/site-packages/uproot/_util.py", line 786, in regularize_steps
    raise TypeError(
TypeError: 'files' argument's steps must be an iterable of integer offsets or start-stop pairs.
OK - then a better solution is to return [[0,0]] for the steps in those cases and supply a dataset transformation to remove those files.
It would be fairly straightforward to implement filter_files that takes some function you define and just drops files when that function returns True.
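For illustration, here is a rough sketch of what such a filter_files helper could look like in plain Python. This is not coffea's implementation. The fileset layout (dataset name, then a "files" dict keyed by path, each entry carrying "object_path" and "steps") is an assumption modeled on the preprocessed filesets discussed here, and the dataset name, shortened file names, event counts and the has_no_events predicate are all hypothetical.

```python
import copy


def filter_files(fileset, should_drop):
    """Return a copy of `fileset` without the files for which should_drop(file_info) is True.

    Hypothetical helper: the nested layout {dataset: {"files": {path: info}}}
    is an assumption, not a documented coffea interface.
    """
    filtered = {}
    for dataset, info in fileset.items():
        # Copy everything except "files" unchanged (e.g. metadata).
        new_info = {
            key: copy.deepcopy(value) for key, value in info.items() if key != "files"
        }
        # Keep only the files the user-supplied predicate does not reject.
        new_info["files"] = {
            path: copy.deepcopy(file_info)
            for path, file_info in info["files"].items()
            if not should_drop(file_info)
        }
        filtered[dataset] = new_info
    return filtered


def has_no_events(file_info):
    """True when steps are present and every [start, stop] pair is empty, e.g. [[0, 0]].

    Files without any recorded steps are kept, since nothing marks them as empty.
    """
    steps = file_info.get("steps")
    return bool(steps) and all(stop <= start for start, stop in steps)


# Made-up example fileset with one normal file and one empty file.
fileset = {
    "DoubleEG_2016D": {
        "files": {
            "output_1.root": {"object_path": "Events", "steps": [[0, 1000], [1000, 1800]]},
            "output_15.root": {"object_path": "Events", "steps": [[0, 0]]},
        },
        "metadata": {"year": "2016"},
    }
}

cleaned = filter_files(fileset, has_no_events)
print(sorted(cleaned["DoubleEG_2016D"]["files"]))  # ['output_1.root']
```

Because the predicate is deterministic and depends only on the recorded steps, the same files are dropped on every run, which keeps the input set static in the way requested above.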