Reimplement ATLAS Z0 7TEV 46FB Dataset #2237

Merged on Jan 22, 2025 (20 commits)

Collaborator

@ecole41 ecole41 commented Dec 4, 2024

This pull request merges the implementations of the CC and CF datasets into a single filter.py script.
Functions have been added to produce the central data, kinematics and uncertainties yaml files for both of these datasets.

Old vs New Data

CC Dataset: https://vp.nnpdf.science/SpiOHitcS-2WggfpKcLW3Q==/

CF Dataset: https://vp.nnpdf.science/D3121rJ6RWG3IgyqViXJMw==/

Compatibility Check:

from validphys.api import API
import numpy as np
 
inp1 = {"dataset_input": {"dataset": "ATLASZRAP11CC"}, "theoryid": 40_000_000, "use_cuts": "internal", "t0pdfset": "NNPDF40_nnlo_as_01180", "use_t0": True}
inp2 = {"dataset_input": {"dataset": "ATLASZRAP11CC", "variant": "legacy"}, "theoryid": 40_000_000, "use_cuts": "internal", "t0pdfset": "NNPDF40_nnlo_as_01180", "use_t0": True}
 
covmat1 = API.covmat_from_systematics(**inp1)
covmat2 = API.covmat_from_systematics(**inp2)
 
t0_covmat1 = API.t0_covmat_from_systematics(**inp1)
t0_covmat2 = API.t0_covmat_from_systematics(**inp2)
 
result = np.all(np.isclose(covmat1, covmat2))
result_2 = np.all(np.isclose(t0_covmat1, t0_covmat2))

print('covmat', result)
print('t0_covmat', result_2)

inp3 = {"dataset_input": {"dataset": "ATLASWZRAP11CF"}, "theoryid": 40_000_000, "use_cuts": "internal", "t0pdfset": "NNPDF40_nnlo_as_01180", "use_t0": True}
inp4 = {"dataset_input": {"dataset": "ATLASWZRAP11CF", "variant": "legacy"}, "theoryid": 40_000_000, "use_cuts": "internal", "t0pdfset": "NNPDF40_nnlo_as_01180", "use_t0": True}
 
covmat3 = API.covmat_from_systematics(**inp3)
covmat4 = API.covmat_from_systematics(**inp4)
 
t0_covmat3 = API.t0_covmat_from_systematics(**inp3)
t0_covmat4 = API.t0_covmat_from_systematics(**inp4)
 
result3 = np.all(np.isclose(covmat3, covmat4))
result_4 = np.all(np.isclose(t0_covmat3, t0_covmat4))

print('covmat', result3)
print('t0_covmat', result_4)

[Out]:

covmat True
t0_covmat True
covmat True
t0_covmat True
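
The check above calls np.isclose with its default tolerances (rtol=1e-05, atol=1e-08), so the comparison is |a - b| <= atol + rtol * |b|. A minimal sketch of what that implies when entries are very small, using made-up numbers rather than values from these datasets: the absolute term can hide a difference that a purely relative comparison would flag.

```python
import numpy as np

# Made-up values for illustration only, not taken from the datasets.
small_a = np.array([1e-10])
small_b = np.array([2e-10])  # differs from small_a by a factor of two

# Default tolerances: |a - b| <= atol + rtol * |b| with rtol=1e-5, atol=1e-8.
default_check = bool(np.all(np.isclose(small_a, small_b)))

# Purely relative comparison: drop the absolute tolerance.
relative_check = bool(np.all(np.isclose(small_a, small_b, rtol=1e-5, atol=0.0)))

print("default tolerances:", default_check)
print("relative only:", relative_check)
```

Whether this matters depends on the typical size of the covariance entries; if they are far above 1e-8 the default atol is negligible.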

@scarlehoff
Member

Hi @ecole41, this is not ready for review yet, right? I see that the variables are still called k1/k2/k_i etc., for instance.

@ecole41
Collaborator Author

ecole41 commented Dec 4, 2024

> Hi @ecole41, this is not ready for review yet, right? I see that the variables are still called k1/k2/k_i etc., for instance.

No, this is not ready yet. I will keep working on it

@scarlehoff
Member

Ok! No problem! Are any of the PRs finished? There are now many and I got a bit lost. I am asking so that I can review (and hopefully merge) the ones that are.

@ecole41
Collaborator Author

ecole41 commented Dec 5, 2024

> Ok! No problem! Are any of the PRs finished? There are now many and I got a bit lost. I am asking so that I can review (and hopefully merge) the ones that are.

Yes, #2178 and #2202 should be ready unless there is something that I have missed.

@ecole41
Collaborator Author

ecole41 commented Dec 11, 2024

This branch should be complete, but it is failing the checks. I have merged in master, so I am not sure why this is happening.

@RoyStegeman RoyStegeman requested review from jacoterh and removed request for jacoterh December 11, 2024 15:31
@jacoterh
Collaborator

jacoterh commented Dec 18, 2024

> This branch should be complete, but it is failing the checks. I have merged in master, so I am not sure why this is happening.

Indeed, it's really odd - I don't see what's wrong with your metadata. The yaml file cannot be parsed for some reason. I'm looking into it.

variants:
  legacy:
    data_uncertainties:
    - uncertainties_legacy_CC-Y.yaml
    data_central: data_legacy_CC-Y.yaml
data_central:
Collaborator

I noticed that if you remove data_central from variants: legacy, the tests pass again.

Collaborator Author

Ok, I will just check if changing the formatting of this helps

Collaborator

The tests pass now, excellent!

@ecole41 ecole41 changed the title [WIP] Reimplement ATLAS Z0 7TEV 46FB Dataset Reimplement ATLAS Z0 7TEV 46FB Dataset Dec 18, 2024
Collaborator

@jacoterh jacoterh left a comment

Looks like this is good to go after the minor comments I left are implemented. Thanks!

@scarlehoff
Member

This can be merged, right? (I see it is approved, but just checking!)

Member

@scarlehoff scarlehoff left a comment

Just the comment on the labels, for both datasets (like in #2223). I see the check was done with the old names though; here's a link to the check with the new names: https://vp.nnpdf.science/eS7CgNDEQMmflyqxTbBmdQ==

  label: k1
abs_eta:
  description: Absolute dilepton rapidity
  label: abs_eta
Member

Eventually this will be the label to be used in plots, so it would be better to use |\eta| or m_{ll}^2 (below) etc.

Collaborator Author

Ok, I have changed the format in the metadata. Is this format correct?

Member

Yes, that works. Since it is rapidity, like in the other PR it might make sense to change the labels to "y" already.

Also, for sqrts, I think you are missing the \ in sqrt but these are minor points.

Please have a look at the conflict and then this can be merged I believe.

  label: k1
abs_eta:
  description: Absolute dilepton rapidity
  label: "|y|"
Collaborator Author

@scarlehoff Just wanted to check if this is now the correct labelling.

Member

I think in the one below (m_ll) you also need to wrap the label in dollar signs, i.e. $m_{ll}^2$, for matplotlib to render it correctly. Not sure about the |. But other than that, yes, looks good.

In the xlabel, though, you might want to remove the \eta part (both here and in the other observable) so that it is only the y.

Collaborator Author

Ok, I have just changed this
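
The labelling points raised in this thread (dollar signs for math mode, a backslash before sqrt, whether | renders) can be checked quickly outside validphys. The following is only a sketch: the dictionary keys and the way the labels are attached to the axes are hypothetical and do not reflect how validphys reads the metadata. It uses matplotlib's own mathtext with the headless Agg backend.

```python
import io

import matplotlib

matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt

# Hypothetical label set; raw strings keep the backslashes intact.
labels = {
    "y": r"$|y|$",
    "m_ll2": r"$m_{ll}^2$",
    "sqrts": r"$\sqrt{s}$",
}


def render_labels(labels):
    """Draw an empty figure carrying the labels and return it as PNG bytes.

    Saving the figure forces matplotlib to parse every label, so a
    malformed mathtext expression raises an error here.
    """
    fig, ax = plt.subplots()
    ax.set_xlabel(labels["y"])
    ax.set_ylabel(labels["m_ll2"])
    ax.set_title(labels["sqrts"])
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png")
    plt.close(fig)
    return buffer.getvalue()


png_bytes = render_labels(labels)
print(f"rendered {len(png_bytes)} bytes")
```

Writing the PNG to a file instead of a buffer makes it easy to inspect the rendered labels by eye.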

@ecole41
Collaborator Author

ecole41 commented Jan 21, 2025

After updating the compatibility test to use the new dataset names, I find that the matrices agree within 0.1%, except for the covariance matrix of the CF set, which agrees only within 1%. So there are some changes in the new implementation:

from validphys.api import API
import numpy as np
 
inp1 = {"dataset_input": {"dataset": "ATLAS_Z0_7TEV_46FB_CC-Y"}, "theoryid": 40_000_000, "use_cuts": "internal", "t0pdfset": "NNPDF40_nnlo_as_01180", "use_t0": True}
inp2 = {"dataset_input": {"dataset": "ATLASZRAP11CC", "variant": "legacy"}, "theoryid": 40_000_000, "use_cuts": "internal", "t0pdfset": "NNPDF40_nnlo_as_01180", "use_t0": True}
 
covmat1 = API.covmat_from_systematics(**inp1)
covmat2 = API.covmat_from_systematics(**inp2)
 
t0_covmat1 = API.t0_covmat_from_systematics(**inp1)
t0_covmat2 = API.t0_covmat_from_systematics(**inp2)

result = np.all(np.isclose(covmat1, covmat2,rtol=1e-3))
result_2 = np.all(np.isclose(t0_covmat1, t0_covmat2,rtol=1e-3))

print('covmat', result)
print('t0_covmat', result_2)

inp3 = {"dataset_input": {"dataset": "ATLAS_Z0_7TEV_46FB_CF-Y"}, "theoryid": 40_000_000, "use_cuts": "internal", "t0pdfset": "NNPDF40_nnlo_as_01180", "use_t0": True}
inp4 = {"dataset_input": {"dataset": "ATLASWZRAP11CF", "variant": "legacy"}, "theoryid": 40_000_000, "use_cuts": "internal", "t0pdfset": "NNPDF40_nnlo_as_01180", "use_t0": True}
 
covmat3 = API.covmat_from_systematics(**inp3)
covmat4 = API.covmat_from_systematics(**inp4)
 
t0_covmat3 = API.t0_covmat_from_systematics(**inp3)
t0_covmat4 = API.t0_covmat_from_systematics(**inp4)
 
result3 = np.all(np.isclose(covmat3, covmat4,rtol=1e-3))
result3_2 = np.all(np.isclose(t0_covmat3, t0_covmat4,rtol=1e-2))
result_4 = np.all(np.isclose(t0_covmat3, t0_covmat4,rtol=1e-3))

print('covmat 0.001', result3)
print('covmat 0.01', result3_2)
print('t0_covmat', result_4)

[OUT]:
covmat True
t0_covmat True
covmat 0.001 False
covmat 0.01 True
t0_covmat True
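
Rather than testing one tolerance at a time, the size of the disagreement can be reported directly. A minimal sketch, assuming the two covariance matrices are already available as NumPy arrays: the helper names max_relative_deviation and compatibility_table are hypothetical, and the toy matrices below are made up and stand in for the API output. Each tolerance is applied to the same pair of matrices.

```python
import numpy as np


def max_relative_deviation(new, ref):
    """Largest |new - ref| / |ref| over the non-zero entries of ref."""
    new = np.asarray(new, dtype=float)
    ref = np.asarray(ref, dtype=float)
    nonzero = ref != 0  # a relative difference is undefined where ref == 0
    if not nonzero.any():
        raise ValueError("reference matrix has no non-zero entries")
    return float(np.max(np.abs(new[nonzero] - ref[nonzero]) / np.abs(ref[nonzero])))


def compatibility_table(new, ref, rtols=(1e-3, 1e-2)):
    """Map each relative tolerance to the result of np.allclose(new, ref)."""
    return {rtol: bool(np.allclose(new, ref, rtol=rtol)) for rtol in rtols}


# Toy matrices for illustration only, not real covariance matrices.
ref = np.array([[4.0, 1.0], [1.0, 9.0]])
new = ref * np.array([[1.0005, 1.0], [1.0, 1.008]])

deviation = max_relative_deviation(new, ref)
table = compatibility_table(new, ref)
print(f"max relative deviation: {deviation:.2%}")
print(table)
```

In the thread's setting, new and ref would be the covmat (or t0_covmat) arrays of the new and legacy implementations.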

Here are the updated old vs new reports:

@scarlehoff
Member

Thanks. If you solve the conflict I think this can be merged.

@scarlehoff scarlehoff merged commit bf1a67a into master Jan 22, 2025
9 checks passed
@scarlehoff scarlehoff deleted the reimplement_ATLAS_Z0_7TEV_46FB_CC_AND_CF branch January 22, 2025 04:15
3 participants