Discussion for v5+ of skims #4

Open
sam-may opened this issue Feb 5, 2021 · 5 comments

sam-may commented Feb 5, 2021

Thread for discussing details on v5+ of the skims

sam-may commented Feb 5, 2021

Content: it would be useful to have two versions of the next skims:

  1. Diphoton preselection + 1 lep/tau
  2. Diphoton preselection

Skims with only the diphoton preselection will be useful for studying data-driven background descriptions, and will also allow us to explore other things like boosted ggbb. @leonardogiannini, is this much more work than commenting out a few lines for the tau/lepton requirement and resubmitting?

Samples:

  • ggbb: to explore boosted ggbb and also to perform a regular ggbb analysis as a cross-check for ggTauTau. If our expected sensitivity from a simplified ggbb analysis is consistent with the results from HIG-19-018, that gives us more confidence in our expected results for ggTauTau
  • ggWW, ggZZ: check signal yields to see if adding WW->at least 1 hadronic tau or ZZ->anything can add to our sensitivity

@mhl0116 says he will submit jobs to make nanoAODs for these.

Technical:

  • I get errors when loading the gHidx branch with awkward. I don't entirely understand why this is the case; maybe someone else has a better understanding?

mhl0116 commented Feb 5, 2021

@sam-may did you try something like this:

```python
import uproot

def get_gHidx(args):
    '''
    This needs to be inserted in prepare_inputs.
    Consider making the indexing flat; otherwise it doesn't work.
    '''
    fname, entrystart, entrystop = args
    f = uproot.open(fname)
    t = f["Events"]
    # Pick out only the gHidx branch, then read the requested entry range
    idx_keys = t.keys(filter_name="gHidx")
    gHidx = t.arrays(idx_keys, entry_start=entrystart, entry_stop=entrystop, library="ak", how="zip")
    return gHidx
```

sam-may commented Feb 5, 2021

@mhl0116 No, have not tried that. Why do we need the entry_start and entry_stop arguments (and how do we know what to set them to)?

Is it the case that gHidx is not valid for all events and these arguments select only the events for which it has valid values?

mhl0116 commented Feb 5, 2021

@sam-may

there could be two uses for it:
1, if a file is too large (containing too many events, say >1e6), we can break it into smaller chunks, and the start and stop entries tell uproot which range of events to read. I stole this from Nick:

def get_chunking(filelist, chunksize, treename="Events", workers=12, skip_bad_files=False, xrootd=False, client=None, use_dask=False):

2, if I want to inspect something like 100 events, I can set entry_start=0 and entry_stop=100, so it's also handy for quick exploration of the event content
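As a rough illustration of the first use, here is a minimal sketch of how a file could be split into entry ranges. This is not the `get_chunking` function quoted above; `make_chunks` is a hypothetical helper, and it assumes the total number of entries is already known (for example, read from the tree beforehand).

```python
def make_chunks(n_entries, chunksize):
    """Split the range [0, n_entries) into (entry_start, entry_stop) pairs.

    Hypothetical helper for illustration only; not the get_chunking
    function referenced in the thread.
    """
    if chunksize <= 0:
        raise ValueError("chunksize must be positive")
    return [
        (start, min(start + chunksize, n_entries))
        for start in range(0, n_entries, chunksize)
    ]

# A file with 2.5M events, read in chunks of at most 1M events
chunks = make_chunks(2_500_000, 1_000_000)
# chunks == [(0, 1000000), (1000000, 2000000), (2000000, 2500000)]
```

Each `(entry_start, entry_stop)` pair could then be bundled with the file name and passed as `args` to something like `get_gHidx`. The second use is just the single chunk `(0, 100)`.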

this is one feature I was hoping skim v5 can have: turn this index into

selectedPhotons_*, so that each event has only two selectedPhotons, namely the ones that passed the diphoton preselection
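To make the idea concrete, here is a small sketch of what turning the index into `selectedPhotons_*` branches could look like. It assumes that `gHidx` holds, per event, the two positions of the preselected photons within the event's photon collection; that interpretation, the toy values, and the helper name `select_photons` are assumptions for illustration, written with plain Python lists rather than awkward arrays.

```python
def select_photons(photon_branch, gHidx):
    """Pick, per event, the entries of one photon branch that gHidx points to."""
    return [
        [values[i] for i in indices]
        for values, indices in zip(photon_branch, gHidx)
    ]

# Toy input: three events with differing numbers of photons
Photon_pt = [[62.1, 35.4, 12.0], [48.3, 40.2], [15.5, 80.7, 33.9, 9.1]]
# Assumed meaning: indices of the two photons passing the diphoton preselection
gHidx = [[0, 1], [0, 1], [1, 2]]

selectedPhotons_pt = select_photons(Photon_pt, gHidx)
# selectedPhotons_pt == [[62.1, 35.4], [48.3, 40.2], [80.7, 33.9]]
```

The same helper would be applied to every photon branch (eta, phi, ID variables, ...), so that downstream code never needs the index itself.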

mhl0116 commented Feb 9, 2021

wish list for skim v5:

1, drop gHidx and instead use a format like selectedPhotons (each event saved in the skim then has two selectedPhotons)

  • save all photon related branches from nanoAOD (pt/eta/phi/photonID...)

2, for the diphoton preselection, add pT/mgg > 0.3 for the leading photon, pT/mgg > 0.25 for the subleading photon, and mgg > 100 GeV

3, for leptons and taus, use same format (selectedTaus, selectedMuons, selectedElectrons)

  • save all existing branches in nanoAOD
  • clean against the selected photons using a dR cone size of 0.2 (same as in HIG-19-013)
  • taus and leptons must pass the loose requirements as in the bbtautau AN
  • add a branch for each tau/lepton: passedTight, 1 if it passes the tight requirement (from the bbtautau AN), else 0

4, save events if passing diphoton preselection + 1 extra loose tau/lepton (e/mu)

5, pair for SVFit calculation

  • if there are 2 loose taus, 1 loose tau + 1 loose lepton (e/mu), or 0 loose taus + 2 loose leptons (e/mu), calculate the SVFit score and save it in the event

6, other per-event quantities:

  • mvis, dR(tau,tau) or dR(tau,lep), and others, saved in the event when it passes the same requirement as in 5
  • diphoton p4, dR(diphoton, ditau candidate), dPhi(diphoton, ditau candidate)
  • ditau candidate, can use two versions, before/after SVFit
  • other quantities...
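Items 2 and 3 of the wish list can be sketched roughly as below. This assumes per-event flat arrays for the leading and subleading photon pT and for mgg (both in GeV), and that cleaning means requiring dR > 0.2 with respect to each selected photon. The function names are hypothetical, and the actual loose/tight definitions from the bbtautau AN and the details of HIG-19-013 are not reproduced here.

```python
import numpy as np

def diphoton_preselection(lead_pt, sublead_pt, mgg):
    """Boolean mask for the cuts in item 2 (pT and mgg assumed to be in GeV)."""
    lead_pt, sublead_pt, mgg = map(np.asarray, (lead_pt, sublead_pt, mgg))
    return (lead_pt / mgg > 0.3) & (sublead_pt / mgg > 0.25) & (mgg > 100.0)

def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance dR, with dphi wrapped into [-pi, pi)."""
    dphi = (np.asarray(phi1) - np.asarray(phi2) + np.pi) % (2 * np.pi) - np.pi
    deta = np.asarray(eta1) - np.asarray(eta2)
    return np.hypot(deta, dphi)

def is_clean(obj_eta, obj_phi, pho_eta, pho_phi, cone=0.2):
    """True if the object lies outside the cone around every selected photon."""
    dr = delta_r(obj_eta, obj_phi, pho_eta, pho_phi)
    return bool(np.all(dr > cone))

# Three toy events: only the first passes all three cuts
mask = diphoton_preselection(
    lead_pt=[60.0, 30.0, 50.0],
    sublead_pt=[40.0, 28.0, 20.0],
    mgg=[125.0, 110.0, 125.0],
)

# A lepton close to the first photon across the phi = +/-pi boundary is rejected
overlapping = is_clean(0.5, 3.1, pho_eta=[0.5, -1.0], pho_phi=[-3.1, 0.2])
isolated = is_clean(2.0, 0.0, pho_eta=[0.5, -1.0], pho_phi=[-3.1, 0.2])
```

The same `delta_r` helper would also cover the per-event quantities in item 6, such as dR(tau,tau) or dR(diphoton, ditau candidate), once the candidate four-vectors are built.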
