CondDBESSource.cc added dump method to JSON file #43374
Conversation
|
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43374/37852
|
|
A new Pull Request was created by @PonIlya for master. It involves the following packages:
@consuegs, @cmsbuild, @saumyaphor4252, @francescobrivio, @perrotta can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
|
@PonIlya this is going to break the ability to dump the twiki format from the text logs. That is nothing the scripts in AlCaTools cannot fix later, but I would keep the option to dump either to the text log or to the JSON file, controlled by a configuration parameter. Moreover, the output JSON files are numbered incrementally, and their names depend on what is already in the repository where you run them. I think those names should also be made configurable, so that scripts in AlCaTools/ConditionsConsumed (for example runCMSDrivers_data2023D.sh) can give them the same name as the Python config being run. |
|
this is an extremely useful feature! many thanks |
|
@perrotta I propose it this way: As for the file names, I will change the naming so that it is taken from the arguments and matches the name of the .log file (output_step1_L1.json, etc.) |
Yes, #43374 (comment) must be addressed before we can proceed (in particular the second part, which cannot be cured by an additional script in AlCaTools) |
|
how about dumping the info both in the log file (the old way) and in a json (new way) on the side, systematically, when |
|
IMO the name of the json file should be fixed and unambiguous like "CondDBESSource_stats.json" or "CondDBESSource.json" |
I am going to add a configuration parameter that takes the name of the JSON file, like --j [filename].json, and by default |
|
However, I have a problem with how to pass the JSON file name and the start flag from cmsDriver.py to CondDBESSource.cc. For now, I chose this way, with customise_commands: |
|
@PonIlya , I would go with #43374 (comment) and just give the file a static (non-configurable) name. |
|
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43374/38218
|
|
Pull request #43374 was updated. @francescobrivio, @consuegs, @saumyaphor4252, @perrotta, @cmsbuild can you please check and sign again. |
|
@perrotta I changed the code so that you can select the dump method and file name through custom commands. |
|
@malbouis In my opinion a single fixed file name is not convenient, because if you run several dumps in a row the file will be overwritten. It would then not be possible to collect the JSON information for GEN, SIM, etc. in one pass with a single .sh script. I have already added a new approach, which I think is more convenient. |
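The overwriting concern above can be avoided by deriving the JSON name from the step name, mirroring the output_step1_GEN.log convention used in the validation commands. A minimal sketch, assuming the `DumpStat` and `JsonDumpFileName` parameters from this PR; the helper name `customise_for_step` and the list of steps are hypothetical and only illustrate the idea:

```python
def customise_for_step(step, dump_to_log=False):
    """Build a --customise_commands string for one cmsDriver.py step.

    The JSON file is named after the step, so successive steps
    (GEN, SIM, ...) do not overwrite each other's dump.
    """
    commands = []
    if dump_to_log:
        # Keep the legacy text dump in the log as well.
        commands.append("process.GlobalTag.DumpStat = cms.untracked.bool(True)")
    json_name = f"output_step1_{step}.json"  # hypothetical naming scheme
    commands.append(
        f'process.GlobalTag.JsonDumpFileName = cms.untracked.string("{json_name}")'
    )
    # The PR's example separates commands with a literal backslash-n.
    return r" \n ".join(commands)


for step in ("GEN", "SIM"):
    print(customise_for_step(step, dump_to_log=True))
```

The returned string would be passed as the value of `--customise_commands`, so a wrapper script can loop over steps and obtain one JSON file per step.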
|
please test |
|
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c35264/36550/summary.html
|
|
is this good to go ? |
|
please test |
|
+1
|
|
This pull request is fully signed and it will be integrated in one of the next master IBs after it passes the integration tests. This pull request will now be reviewed by the release team before it's merged. @antoniovilela, @sextonkennedy, @rappoccio (and backports should be raised in the release meeting by the corresponding L2) |
|
+1
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c35264/36802/summary.html
|
|
+1 |
        recordData["timeLookupPayloadIds"].push_back(payloadIdData);
      }

      jsonData[recName].push_back(recordData);
Out of curiosity: does it make sense to limit the JSON output to the payloads that are actually consumed in the job (i.e. if !pids.empty()), or is there a particular reason to print all the tags that would be consumed but actually are not?
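The same restriction could also be applied after the fact, on the dumped file. A minimal sketch, assuming the dump is a JSON object keyed by record name whose values are lists of entries carrying a "timeLookupPayloadIds" array, as the diff excerpt suggests; the layout of each payload-ID item and the sample content are hypothetical:

```python
import json


def consumed_only(dump):
    """Keep only record entries whose payload-ID list is non-empty."""
    filtered = {}
    for rec_name, entries in dump.items():
        used = [e for e in entries if e.get("timeLookupPayloadIds")]
        if used:
            filtered[rec_name] = used
    return filtered


# Hypothetical dump content, shaped like the jsonData object in the diff.
sample = {
    "BeamSpotObjectsRcd": [{"timeLookupPayloadIds": [[1, "abc123"]]}],
    "EcalPedestalsRcd": [{"timeLookupPayloadIds": []}],
}
print(json.dumps(consumed_only(sample), indent=2))
```

Filtering downstream keeps the full dump available for anyone who does want the list of tags that were requested but never used.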
PR description:
The purpose of this PR is to improve the CondDBESSource.cc dump method.
Records and their consumption will be written to a JSON file rather than to a log file or the console, because they are usually too long to inspect by eye.
This is more convenient for further use, for example for generating consumption tables; previously a log-parsing script was used to fill them out.
Previously, this issue was raised in the following PR but was not approved
Taking the comment into account, I kept the old dump method but added the option to write a JSON dump, specifying the file name via the command:
process.GlobalTag.JsonDumpFileName = cms.untracked.string("CondDBESSource.json")
The command above triggers the creation of a JSON dump file if the file name or a path to it is specified.
PR validation:
The generated .json file should have the same contents as the previous version of dumpstat after running.
cmsDriver.py TTbar_13TeV_TuneCUETP8M1_cfi --conditions auto:phase1_2023_realistic_postBPix -n 5 --era Run3_2023 --geometry DB:Extended -s GEN --fileout output_step1_GEN.root --beamspot Realistic25ns13p6TeVEarly2023Collision --customise_commands='process.GlobalTag.DumpStat =cms.untracked.bool(True) \n process.GlobalTag.JsonDumpFileName =cms.untracked.string("CondDBESSource.json")' |& tee output_step1_GEN.log
OR (without the dump into the .log)
cmsDriver.py TTbar_13TeV_TuneCUETP8M1_cfi --conditions auto:phase1_2023_realistic_postBPix -n 5 --era Run3_2023 --geometry DB:Extended -s GEN --fileout output_step1_GEN.root --beamspot Realistic25ns13p6TeVEarly2023Collision --customise_commands='process.GlobalTag.JsonDumpFileName =cms.untracked.string("CondDBESSource.json")' |& tee output_step1_GEN.log
Not a backport