To Do
Zeynep Demiragli edited this page Aug 3, 2015 · 36 revisions
- Private Data Transfers
- Write tools to make fixing the bookkeeping easier.
- Update the transfer service to respond to the TERM signal more gracefully.
- Implement the code necessary to split the folders, to limit the number of files in a given run for both the mergers and the transfers.
- For files in the bad area, also move the lock files. Here is an example:
  /store/lustre/mergeMacro/run238685/bad/run238685_ls0111_streamNanoDST_StorageManager.lock
  /store/lustre/transfer/run238685/bad/run238685_ls0111_streamNanoDST_StorageManager.dat
  /store/lustre/transfer/run238685/bad/run238685_ls0111_streamNanoDST_StorageManager.jsn
- Put back the 60 s cool-off time before the bookkeeper callback in eor.py. It was removed in commit 5eb315.
- Update the filename of the log file used for inject worker notifications to follow the pattern <date>-<hostname>-<instance>.log; see https://github.com/smpro/transfer-scripts/blob/master/inject/compat/closeFile.pl#L110
- Make sure that the stream rates are filled in WBM even for runs with the TIER0_TRANSFER_OFF run key(?).
- Use WatchedFileHandler + logrotate for log-file rotation: http://stackoverflow.com/questions/10235220/python-logging-logrotate-options
- Clean up successfully transferred and handled files. The old production setup has a cleanup cron job that also updates the database and marks the files as deleted.
- Python / Perl transfer RPM and configs in Puppet
- For eor, remove the timeout (minutes to hours) since the last added metadata JSON before brute-force closing.
- DQM delivery from merger to transfers
  - Update streams_to_dqm in smhookd.conf
  - Update the dqm path in smhookd.conf
  - Update the /dqmburam mount definition to point to RAM /fff/input
- Update /etc/init.d/functions-storagemanager to check for stale lock files during service start.
- Copy the Event Display to the nfs and eos areas.
- Use srv to transfer old minidaq runs with the Minidaq setup label. The setup label is a configuration parameter.
- Implement some more functionality in eos file info:
  - Calculate the adler checksum after copying if the jsn checksum and the destination checksum don't match.
  - If the checksums are not the same, print out "FAIL" and retry just once ...
- Log a WARNING if the trigger rate latency for WBM is too long (> 20s ?).
- Castor -> EOS
- Only transfer runs with some data
- Update the view of SM instances so that it also works for mrg* machines in addition to srv-c2c* machines
- Separate the transfers for DQM from the rest of the streams to achieve low latency
- Require that the number of MiniEoR files is the same as the number of ini files in macroeor
- Veto eventsRunTotal < eventsInputStreams in macroeor
- Use Python standard library for transfer logging
- Honor transfer flag
- Exclude runs with run number > 10^9
- Reduce logging verbosity of watchAndInject
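
For the WatchedFileHandler + logrotate item and the <date>-<hostname>-<instance>.log filename pattern, a minimal sketch using only the Python standard library might look as follows. The function names `build_log_path` and `make_transfer_logger`, the `YYYYMMDD` date format, and the log line format are assumptions made for illustration; they are not taken from the transfer scripts.

```python
import logging
import logging.handlers
import os
import socket
import time


def build_log_path(log_dir, instance, hostname=None, now=None):
    """Return <log_dir>/<date>-<hostname>-<instance>.log.

    The YYYYMMDD date format is an assumption; the to-do item only
    gives the overall <date>-<hostname>-<instance>.log pattern.
    """
    hostname = hostname or socket.gethostname()
    date = time.strftime("%Y%m%d", time.localtime(now))
    return os.path.join(log_dir, "%s-%s-%s.log" % (date, hostname, instance))


def make_transfer_logger(log_path, name="transfer"):
    """Create a logger whose handler reopens the file after rotation."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # WatchedFileHandler checks the file on every emit and reopens it if
    # logrotate has moved or replaced it, so no service restart is needed.
    handler = logging.handlers.WatchedFileHandler(log_path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger
```

With this approach logrotate can simply rename the file; the next log call creates a fresh file at the original path, so neither `copytruncate` nor a post-rotate signal should be needed.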
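
The checksum item under "eos file info" (compare checksums after copying, print "FAIL" on a mismatch, retry just once) can be sketched roughly as below. This is a simplified illustration, not the existing tool: it recomputes the Adler-32 of the destination file directly, takes the expected value as an integer (how the checksum is stored in the jsn file is not assumed here), and uses a local `shutil.copyfile` as a stand-in for the real copy to EOS. The names `adler32_of_file` and `copy_with_checksum` are hypothetical.

```python
import shutil
import zlib


def adler32_of_file(path, chunk_size=1024 * 1024):
    """Compute the Adler-32 checksum of a file, reading it in chunks."""
    checksum = 1  # Adler-32 starting value
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            checksum = zlib.adler32(chunk, checksum)
    return checksum & 0xFFFFFFFF


def copy_with_checksum(source, destination, expected, copy=shutil.copyfile):
    """Copy and verify; on a mismatch print "FAIL" and retry exactly once.

    `copy` is a placeholder for the real transfer command.
    """
    for _attempt in range(2):  # first attempt plus a single retry
        copy(source, destination)
        if adler32_of_file(destination) == expected:
            return True
        print("FAIL")
    return False
```

A caller would pass the checksum read from the metadata file as `expected` and treat a `False` return value as a transfer that still needs manual attention.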