A storage and version control tool for WITCH result files.
Add the gdxstore/source directory to your PATH.
Using the setup.py file would require using a virtual environment on HPC. I prefer to avoid this for now.
Then copy the configuration file to your WITCH installation folder. For example:
cp config.ini-example ../witch/config.ini
and replace <your-username>, or change the whole path if you prefer.
This is the path where all the files will be stored. A new subfolder will be created
for each commit with files to store.
Alternatively, config.ini can be copied to $XDG_CONFIG_HOME/gdxstore/ or ~/.config/gdxstore/.
The script looks for the configuration file in the current directory first, then in those two folders,
which act as a global default.
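
The lookup order described above can be sketched as follows. This is a minimal illustration, not the actual gdxstore code: the function name find_config and its parameters are hypothetical, and the only details taken from this README are the three locations and their order.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_config(cwd: Path, env: Mapping[str, str], home: Path,
                name: str = "config.ini") -> Optional[Path]:
    """Return the first existing config file, or None if none is found.

    Hypothetical helper: checks the current directory first, then
    $XDG_CONFIG_HOME/gdxstore/, then ~/.config/gdxstore/.
    """
    candidates = [cwd / name]
    xdg = env.get("XDG_CONFIG_HOME")
    if xdg:  # skip this location when the variable is unset or empty
        candidates.append(Path(xdg) / "gdxstore" / name)
    candidates.append(home / ".config" / "gdxstore" / name)
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_config(Path.cwd(), os.environ, Path.home()))
```

Passing the directory, environment and home folder as arguments keeps the function easy to test without touching the real configuration.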
The plan is to add more global options to config.ini, such as default gdxdiff tolerances
and the log history start date.
On juno or cassandra, to let other people use your directory for storage, you need to:
- let them traverse your main folder in /data:
cd /data/cmcc
chmod g+x <your-username>
(this allows all seme users to access and list directories in your folder);
- grant permissions on the witch_results folder to the specific users you want to collaborate with:
cd <your-username>
setfacl -m u:<a-trusted-persons-username>:rwx witch_results/
You can check that it worked by listing the permissions with getfacl witch_results/.
- Store a file:
gdxstore.py -s results_ssp2_bau.gdx
The code checks that the file is a makefile target. If it is not, it asks the user to provide
the script used to produce it, so that the script can be stored along with the file for reproducibility.
It then checks that there are no uncommitted changes to the code. If there are any, it asks the
user whether a patch with the changes should be created. In this case, the time and date of the run
are appended to the file name and to the patch name.
Finally, the code checks that the computation started after the last source file change. If it did not,
it raises an error and does not store the file.
If you are sure that the result was produced with the current version of the code,
even though the latest change is later than the run start time, you can skip this check by adding the
no-timing-validation flag to the command. This can happen if, for example, you change
the code while a run is in progress and then revert the changes.
Wildcards work, so you can store several files at once:
gdxstore.py -s results_ssp2_*.gdx
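
The timing check described above can be illustrated with a short sketch. It assumes the run start time is already known as a Unix timestamp (how gdxstore determines it is not covered here) and uses file modification times as the measure of "last source file change". The names latest_source_change and validate_timing are hypothetical, and the skip argument stands in for the no-timing-validation flag.

```python
from pathlib import Path
from typing import Iterable, Optional


def latest_source_change(source_files: Iterable[Path]) -> Optional[float]:
    """Return the newest modification time among the files, or None if there are none."""
    return max((path.stat().st_mtime for path in source_files), default=None)


def validate_timing(run_start: float, source_files: Iterable[Path],
                    skip: bool = False) -> None:
    """Raise RuntimeError unless the run started after the last source change.

    Hypothetical sketch: `skip` plays the role of the no-timing-validation flag.
    """
    if skip:
        return
    last_change = latest_source_change(source_files)
    if last_change is not None and run_start <= last_change:
        raise RuntimeError(
            "source files changed after the run started; result not stored"
        )
```

Raising an exception rather than returning a boolean mirrors the behaviour described above, where a failed check stops the file from being stored.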
- Display the git log including the list of stored files for each commit:
gdxstore.py --log
- Compare the current version of a result file with one from a previous commit:
gdxstore.py -d results_ssp2_ctax_200.gdx --commit 353d204b05029d94cf4d82
This calls gdxdiff. TODO: add gdxdiff options.
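
As a rough sketch of how such a comparison could be wired up, the snippet below builds a gdxdiff command line (two GDX files as positional arguments) and runs it with subprocess. The storage layout used here, one subfolder per commit named after the commit hash, is an assumption made for illustration, and build_gdxdiff_command and compare_with_stored are hypothetical names rather than functions of gdxstore.

```python
import subprocess
from pathlib import Path
from typing import List


def build_gdxdiff_command(current: Path, stored: Path) -> List[str]:
    """Build the gdxdiff invocation comparing the current file with a stored one."""
    return ["gdxdiff", str(current), str(stored)]


def compare_with_stored(current: Path, storage_root: Path, commit: str) -> int:
    """Run gdxdiff against the copy stored for `commit` and return its exit code.

    Assumed layout (hypothetical): <storage_root>/<commit>/<file name>.
    """
    stored = storage_root / commit / current.name
    if not stored.is_file():
        raise FileNotFoundError(f"no stored copy found at {stored}")
    return subprocess.run(build_gdxdiff_command(current, stored)).returncode
```

Keeping the command construction separate from the call makes it straightforward to append gdxdiff options later.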
- Override the default storage directory:
gdxstore.py -s results_ssp2_bau.gdx --storage-folder ../temp_results