[WIP] Implement ATLAS WPWM 13TEV DIF (Future test) #2382
Conversation
@enocera I am not certain on the treatment of the uncertainties here. Do you know if any more should be added to the
@enocera:
Whereas I have implemented these as:
I have also added some changes to the uncertainty treatments after discussing this with an ATLAS experimentalist. She suggested that all unfolding systematics should be treated as uncorrelated and all normalisation systematics as multiplicative and uncorrelated - I have set this in this branch.
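As a minimal sketch of this prescription, here is how the treatments could be encoded when building the uncertainty definitions in a filter script. The function and source names are hypothetical, and the ADD/MULT and CORR/UNCORR labels are assumed to follow the usual commondata conventions; the ADD treatment for unfolding systematics is an assumption, since only their (de)correlation was specified above:

```python
def build_error_definitions(unfolding_sources, normalisation_sources):
    """Sketch of the suggested prescription (hypothetical helper):
    unfolding systematics -> uncorrelated between bins,
    normalisation systematics -> multiplicative and uncorrelated."""
    definitions = {}
    for name in unfolding_sources:
        definitions[name] = {
            "description": f"Unfolding systematic {name}",
            "treatment": "ADD",   # additive treatment assumed here
            "type": "UNCORR",     # uncorrelated between bins, as suggested
        }
    for name in normalisation_sources:
        definitions[name] = {
            "description": f"Normalisation systematic {name}",
            "treatment": "MULT",  # multiplicative, as suggested
            "type": "UNCORR",     # uncorrelated
        }
    return definitions
```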
@ecole41 is this (de)correlation prescription approved by ATLAS in some manner? They always get quite nervous if we start to play with their correlation model, so having some kind of official endorsement always helps.
Hi @ecole41 @jekoorn thanks for the work. Maybe @enocera has other ideas but my two cents are the following:
so to me the preferred structure would be what @ecole41 has done but removing the muon datasets, if this is clear. In any case it should be easy for @jekoorn to adopt Ella's implementation, and then you can cross-check each other concerning the implementation of systematic errors.
In any case as I mentioned above it is important to document our choice of correlation model, and make sure we can back it up with some official ATLAS recommendation |
Once @enocera signs off the dataset implementation, we will move to the generation of NNLO grids using NNLOJET, which will also be a non-trivial amount of work, especially the first time it is done.
hi @juanrojochacon thanks a lot for the comments! I agree with everything, also in terms of fool-proofing so that there does not need to be any confusion about which dataset is to be fitted, and which is not. With your proposed structure one should enter one dataset to the runcard at a time. Just to be sure:
So that means we should put them sequentially in the same file, as Ella already did? In any case I will make these changes to my implementation. Clear, thanks!
yes indeed, we put one after the other. It is the exact same analysis, so there will never be a reason why we choose to fit W+ but not W-. This is the same as what is done for similar datasets. So yes, follow Ella's implementation and then you can compare the two and check that they are the same.
For your reference, I paste here what I recommended to @ecole41 in a private conversation. I would implement the cross section single differential in m_W^T separately for positive and negative leptons (so only Tabs. 38 and 39). I would also implement the cross section double differential in m_W^T and eta, again separately for positive and negative leptons (Tabs. 44-53). Two remarks.
I would implement four different observables in the same data set, as follows:
Good point @enocera I agree. We can check that results based on the muon dataset are consistent with those of the combined dataset. In any case for this measurement I expect that we are limited by systematics, so actually it may be better to stick to the muon dataset to have a better grasp of the systematics. So we have a plan.
This is one point on which I don't agree completely, for reasons related to the interpretation of systematic uncertainties, which can become more ambiguous (especially w.r.t. correlations) in the combined case, as I explained above. Theoretical predictions will remain the same, therefore I recommend the implementation of both the muon cross sections and the combined cross sections in the commondata framework.
This is another point on which I (partly) disagree. Our commondata implementation is flexible enough to have multiple observables for the same data set. In other words: the data set is one, and it incorporates both the 1D and the 2D distributions. But they are two mutually exclusive observables (in the same data set), of course, because we don't know correlations. We can elegantly implement them in a single data set, and call only a subset of observables (1D or 2D) in our fit runcard. I have listed above the preferred clustering.
sure @enocera I meant separated as in a different file, but we can keep them as subsets of the same dataset, as we do for many other datasets. So I agree with your remarks |
OK, we are on the same page, then.
Dear @ecole41 I (finally!) had the chance to look at your implementation of the data set. I would say that most of it is very nicely done. I suggest to use your implementation as a baseline w.r.t. that of @jekoorn. I have some suggestions about the treatment of uncertainties, though.
Dear @ecole41, @enocera, and @juanrojochacon, I have updated my implementation following Emanuele's request in #2380, and cross-checked with Ella's numbers, which should all add up perfectly. I
Great! I understand that your numbers and those from Ella are identical?
If so, yes; while @enocera completes his review I would start with the grid generation.
For the NNLO grid implementation, as we agreed I would suggest that @ecole41 and @jekoorn proceed in parallel with the implementation of the PineFarm cards etc., produce a low-stats grid with NNLOJET, and check that they get consistent numbers. Then, for the final high-stats grids, we only need to do it once.
at least this is the plan we made with @enocera and @scarlehoff at Morimondo, and I still think it is a good idea which saves time in the long run.
Whereas I initially thought yes, it seems there is some deviation in the numbers for the <only muon, double differential> set. Interestingly, the other double differential, which is generated using the same function, does seem to be correct.
ok, this is precisely why benchmarks are useful ;) With the help of the benchmark, it should be possible to understand where the problem is. Then we move to the NNLO grid generation.
Hi all, I have looked a bit closer at the difference between my numbers and Ella's (why some were swapped around). To be more precise, it seems that in my implementation and your implementation of the DDIF sets (I checked the data and kinematic tables), we have the following structure in the data file in terms of the HEPData tables: But for the muon data, from what I understand, your filter swaps them around in the following way: which makes sense given this line in your code, where you append the data to your tables either "first all plus then all minus" or "alternating plus/minus". But in the kinematics file for DDIF MUON they are not swapped around and instead follow the structure of "first all plus, then all minus". I assume that we would like to do the former and have first all plus tables, and then all minus. So the numbers are correct in the end, just misaligned. I guess this is an easy fix ;-)

Nice! Then we are set.
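The two orderings discussed above can be sketched as follows. The concrete table numbers and their assignment to charge channels are illustrative only (the actual HEPData table split is not reproduced here):

```python
# Hypothetical l+ / l- HEPData table numbers, for illustration only.
plus_tables = [44, 46, 48, 50, 52]
minus_tables = [45, 47, 49, 51, 53]

# "Blocked" ordering: first all plus tables, then all minus.
blocked = plus_tables + minus_tables

# "Alternating" ordering: plus/minus pairs interleaved.
alternating = [t for pair in zip(plus_tables, minus_tables) for t in pair]

def to_blocked(order):
    """Realign an alternating plus/minus list into blocked order, so the
    data file and the kinematics file follow the same convention."""
    return order[0::2] + order[1::2]

assert to_blocked(alternating) == blocked
```

The point is simply that both files must use the same permutation; as noted above, the numbers themselves were correct, only misaligned.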
@jekoorn Thanks for figuring that out. I have now adjusted my filter.py script so that this structure should now match the structure in your PR. Let me know if there are still any inconsistencies. I have also changed the uncertainty treatments to match @enocera's suggestions. I just wanted to check that
Yes, thanks. |
@juanrojochacon Grids will be generated with NNLOjet. I understand that production has been automatised as much as possible, relying on the information contained in the commondata. According to our established workflow I expect:
Discussion can occur in either PR, though it should ideally be cross-referenced.
ok clear @enocera. Yes, indeed, grid production should be automated with pinefarm, but as usual the proof is in the pudding. I suggest that @ecole41 and @jekoorn try independently to generate the NNLO grid and then they both cross-check each other's results. Once the low-stats grid is produced, we can proceed to the high-stats grid generation and then produce the FK tables etc.
@enocera looking closely at this dataset, this is a W+J at Leading Order (which is a few orders of magnitude more expensive to compute than just W). So perhaps we want to do a first NLO check before moving to NNLO? |
description: Combined Born-level single-differential cross-section in the $l^+$ and $l^-$ channels in sequence
label: ATLAS $W$ 13 TeV Differential
units: 'pb/GeV'
process_type: DY_CC_PT
I am not sure if this process type is correct; I have chosen DY_CC_PT as the description given in process_options.py matches DY W + j.
It is correct, but in process_options.py you might need to add a condition in order to compute the x-Q kinematic coverage from transverse mass instead of from pT.
Btw, please use WJ instead of WPWMJ, see https://docs.nnpdf.science/data/dataset-naming-convention.html#nnpdf-s-dataset-naming-convention
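The suggested condition could look something like the sketch below: a leading-order (x, Q²) estimate that uses the transverse mass as the scale instead of pT. The function name and signature are hypothetical and do not reproduce the actual process_options.py API; only the kinematic mapping Q ~ m_T, x ~ (Q/√s)·e^(±η) is assumed:

```python
import math

SQRT_S = 13_000.0  # GeV, centre-of-mass energy of this 13 TeV measurement

def xq2_from_mt(m_t, eta):
    """Hypothetical sketch: leading-order (x, Q^2) coverage estimate for
    W production using the transverse mass m_T as the scale instead of pT."""
    q2 = m_t ** 2                          # Q^2 ~ m_T^2
    x1 = (m_t / SQRT_S) * math.exp(eta)    # momentum fraction, forward proton
    x2 = (m_t / SQRT_S) * math.exp(-eta)   # momentum fraction, backward proton
    return x1, x2, q2
```

By construction x1·x2·s = m_T², so the mapping is consistent with the usual DY kinematics when m_T plays the role of the invariant scale.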
Ok, the reason I used WPWMJ was that this process would be handled correctly when generating pinecards, as it splits the observable into WP and WM. But I will just alter the pinefarm interface to also treat WJ like this.
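The splitting described here could be sketched as below. Both the function and the naming scheme are hypothetical illustrations, not the actual pinefarm interface; only the idea that one combined WJ observable maps to one card per charge channel is taken from the discussion:

```python
def split_w_channels(dataset_name):
    """Hypothetical sketch: map a combined 'WJ' dataset name to one
    per-charge variant each for W+ and W- pinecard generation."""
    # Replace only the first occurrence of the WJ process tag.
    return [dataset_name.replace("WJ", tag, 1) for tag in ("WPJ", "WMJ")]
```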
Make sure to use this branch https://github.com/NNPDF/pinefarm/pull/107/files#diff-22d68a6023c028591ce66c2a6240ac2df65408ccd1736fb8f41c4e2bb038b389 and push the changes there directly.
It's what I was using for NNPDF/pinecards#187
Of course. The idea is to first make a cheap run (perhaps you can even limit statistics a little?)
This branch includes an implementation of the ATLAS 13TEV WPWM Differential measurements for future test data. Another version of this implementation has been added in PR #2380.
Still To Do:
- `metadata.yaml` file