Document/Document_main.pdf is the latest version of the thesis. Document_main.html is the same document rendered as a webpage by automatic conversion, so it may contain errors.
script/thesis.py can be run as-is. It contains the relevant calculations for the thesis. Interactive execution as IPython cells is recommended.
The remaining files and folders are best ignored.
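For interactive execution, editors such as VS Code and Spyder treat `# %%` comment lines as cell boundaries, so each cell can be sent to an IPython console on its own. The sketch below only illustrates that layout. The cell contents and the names `load_values` and `summarize` are hypothetical and are not taken from thesis.py.

```python
# %% Imports (first cell)
import statistics


# %% Load data (hypothetical placeholder for the real loading step)
def load_values():
    # Stand-in data; the real script reads its inputs from disk.
    return [1.0, 2.0, 3.0, 4.0]


values = load_values()


# %% Compute a summary (hypothetical placeholder for a calculation)
def summarize(xs):
    # Return the mean and the number of observations.
    return {"mean": statistics.mean(xs), "n": len(xs)}


summary = summarize(values)
print(summary)
```

Running the file top to bottom gives the same result as running the cells one after another, which is why a script structured this way can also be run as-is.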
(May be outdated)
- Document folder contains TeX files, figures and tables. The TeX source is rendered to HTML using pandoc.
- main.tex: root TeX file, includes all others.
- Document_*: pandoc conversions of main.tex.
- chapters: chapters to include.
- resources: contains figures, tables, fonts, bibliography etc.
- Script folder contains the Python script and its dependencies. prepare.py and thesis.py write to Document/resources, as these outputs are required by the TeX document.
- thesis.py: main script.
- *scorecard: contains the scorecard and some evaluation scripts.
- prepare.py: prepares dataset for processing by thesis.py.
  - requires data-raw.csv
  - outputs data.parquet
- models.py: trains the models and writes them to models/*.
- Methods.py: helper functions factored out of the main script, especially visualization-related ones.
- environment.yaml: conda environment dump.
- models: Pickle files of the ML models and scorecard predictions. Required by thesis.py.
- depreceated: files that are no longer required, kept here so that changes can be reverted easily.
- OnlineSources folder contains PDF printouts of online sources.
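
As a rough illustration of how the pickled objects in the models folder might be read back, here is a minimal sketch using only the standard library. The helper `load_pickles`, the file name `example_model.pkl`, the `.pkl` extension, and the stand-in dictionary are all assumptions made for this example; the actual file names and object types written by models.py are not documented here. Only unpickle files you trust, since unpickling can execute arbitrary code.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickles(folder):
    """Load every *.pkl file in `folder` into a dict keyed by file stem."""
    loaded = {}
    for path in sorted(Path(folder).glob("*.pkl")):
        with path.open("rb") as fh:
            loaded[path.stem] = pickle.load(fh)
    return loaded


# Demonstration with a temporary folder standing in for models/.
with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical object; a real file would hold a trained model.
    stand_in = {"name": "example", "coefficients": [0.1, 0.2]}
    with open(Path(tmp) / "example_model.pkl", "wb") as fh:
        pickle.dump(stand_in, fh)
    models = load_pickles(tmp)

print(sorted(models))
```

Pickles of trained models generally need the same library versions that created them, which is presumably what the environment.yaml dump is for.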