config.py: contains all the models; needed by almost all the scripts.
create_one_hot_seq.py: creates one-hot encoded sequences for all subsequent analyses
The output file is named "cv_free_MPRA.csv" because the predictions were not made with cross_val_predict; instead, the entire "bpnet_bottleneck_feat" was used as input, with the top layer as the model.
Snakefile_new_top_layer (requires util_cross_val.py)
Snakefile_reconstruction (requires util_cross_val.py and reconstruction_util.py)
plot.py (to get Spearman's correlation)
Snakefile_modisco: I modified plot_weights() in modisco/visualization/viz_sequence.py. Specifically, I added a name parameter (the filename of the generated image) and replaced plt.show() with plt.savefig(name) so that the image is saved to disk instead of displayed.
modisco_pattern_generator.py
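
The plot_weights() change noted under Snakefile_modisco can be illustrated with a minimal sketch. This is not the actual modisco code: plot_weights_sketch is a hypothetical stand-in, the line-plot drawing is a placeholder for the real sequence-logo rendering, and the figsize default is an assumption. Only the added name parameter and the plt.show() to savefig swap reflect the modification described above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np


def plot_weights_sketch(array, name, figsize=(20, 2)):
    """Hypothetical stand-in for the modified plot_weights().

    array: (length, 4) importance scores, columns assumed to be A, C, G, T.
    name:  filename of the image to write (the added parameter).
    """
    fig, ax = plt.subplots(figsize=figsize)
    # Placeholder drawing: one line per base instead of logo letters.
    for i, base in enumerate("ACGT"):
        ax.plot(array[:, i], label=base)
    ax.legend()
    fig.savefig(name)  # the unmodified function displayed the figure with plt.show()
    plt.close(fig)     # release the figure when many images are generated in a loop


if __name__ == "__main__":
    scores = np.random.RandomState(0).randn(50, 4)
    plot_weights_sketch(scores, name="example_weights.png")
```

Saving rather than showing the figure lets the function run inside a Snakemake job on a machine with no display, with one image file written per call.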