Collaborator

@ntalluri ntalluri commented Aug 12, 2025

This tutorial and the conference registration are due September 28, 2025.

Event: https://co.mbine.org/events/
Example tutorials: https://co.mbine.org/author/combine-2023/


read-the-docs-community bot commented Aug 12, 2025

Documentation build overview

📚 spras | 🛠️ Build #29926215 | 📁 Comparing 220954d against latest (fea328d)


🔍 Preview build

Show files changed (14 files in total): 📝 7 modified | ➕ 4 added | ➖ 3 deleted
File Status
genindex.html 📝 modified
htcondor.html ➖ deleted
index.html 📝 modified
py-modindex.html 📝 modified
contributing/patching.html ➖ deleted
fordevs/modules.html 📝 modified
fordevs/spras.config.html ➖ deleted
fordevs/spras.html 📝 modified
prms/domino.html 📝 modified
prms/prms.html 📝 modified
tutorial/advanced.html ➕ added
tutorial/beginner.html ➕ added
tutorial/intermediate.html ➕ added
tutorial/introduction.html ➕ added

@tristan-f-r tristan-f-r added the documentation Improvements or additions to documentation label Aug 13, 2025
@ntalluri ntalluri requested a review from agitter August 13, 2025 16:24
Collaborator

@agitter agitter left a comment

This is a good outline. I added some detailed thoughts.

@ntalluri
Collaborator Author

We will also need to write an abstract

Pathway reconstruction is a computational approach used in biology to rebuild biological pathways (such as signaling pathways) from high-throughput experimental data.

Curated pathway databases provide references to pathways, but they are often generalized and may not capture the context-specific details relevant to a particular disease or experimental condition.
To address this, pathway reconstruction algorithms (PRAs) help map molecules of interest (such as proteins, genes, or metabolites identified in omics experiments or that are known as points of reference) onto large-scale interaction networks, called interactomes (maps of molecular interactions in a cell).
Collaborator

I prefer to avoid the PRA acronym because it is not generally used

- Bow Tie Builder
- ResponseNet

- Each algorithm has an include flag (true/false) to turn it on or off.
Collaborator

Link to our docs about these here?

- data_dir: the path to where the input dataset files live
- other_files: a placeholder for potential future development

4. Gold Standards
Collaborator

Skip gold standards for the basic intro and introduce in medium?

- Defines the filepath where reconstructed networks are saved (output directory by default)
- Basic housekeeping for how SPRAS organizes and stores results.

6. Analysis
Collaborator

How much of this do we cover here versus skip until medium? We may not need to explain everything that goes in the config file all at once.

- egfr
- one algorithm
- three different preset combos
- have them make the configuration file?
Collaborator

I don't think that is necessary. The basic tutorial can have them start with a premade config, maybe modify it trivially, and make sure they understand what it did. A powerful example would be to run it, add one extra parameter, and run it again to see how much is cached.

Collaborator

@agitter agitter left a comment

I really like this style of explaining what each command does and showing the file tree produced.

Once beginner is done, try practicing it live. This first tutorial may end up being mostly beginner content, which is okay.

- mention parameter tuning
- say that parameters are not preset and need to be tuned for each dataset

CHTC integration
Collaborator

CHTC is local to our university. The way to say it may be Snakemake integration with cloud and high-throughput computing resources, which we've prototyped in our local cluster. If we start testing in OSG that would be different because many people are eligible for accounts.


Stores all results generated by SPRAS. Subfolders are created automatically for each run, and their structure can be controlled through the configuration file.

By default, the directories are config/, input/, and output/, and they can be placed anywhere within the SPRAS repository. The input/ and output/ locations can be updated in the configuration file, and the configuration file itself is located by providing its path when running SPRAS.
Collaborator

Can we place these directories anywhere? They don't have to be subdirectories anymore, do they? Do absolute paths work?

Collaborator Author

I have never tried an absolute path to put them anywhere. I thought it was forced to be within spras, but I'm not sure anymore. I'll test it out.

@tristan-f-r tristan-f-r mentioned this pull request Sep 6, 2025

4. Organizing results with parameter hashes

Each dataset–algorithm–parameter combination is placed in its own folder named like egfr-pathlinker-params-D4TUKMX/. D4TUKMX is a hash that uniquely identifies the specific parameter combination (k = 10 here). A matching log file in logs/parameters-pathlinker-params-D4TUKMX.yaml records the exact parameter values.
Collaborator Author

Suggested change
Each dataset–algorithm–parameter combination is placed in its own folder named like egfr-pathlinker-params-D4TUKMX/. D4TUKMX is a hash that uniquely identifies the specific parameter combination (k = 10 here). A matching log file in logs/parameters-pathlinker-params-D4TUKMX.yaml records the exact parameter values.
Each dataset–algorithm–parameter combination is placed in its own folder named like egfr-pathlinker-params-D4TUKMX/. D4TUKMX is a hash that uniquely identifies the specific parameter combination (k = 10 here). A matching log file in basic/logs/parameters-pathlinker-params-D4TUKMX.yaml records the exact parameter values.
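
For readers wondering how a short identifier like the one in the folder name can be tied to a parameter combination, here is a minimal sketch. It assumes a SHA-1 digest of the sorted, JSON-serialized parameters, base32 encoded and truncated to 7 characters. The function names `params_hash` and `output_folder` are hypothetical, and the hashing SPRAS actually uses may differ, so the printed value is not expected to match the hash shown above.

```python
import base64
import hashlib
import json


def params_hash(params: dict, length: int = 7) -> str:
    """Return a short, deterministic identifier for a parameter combination.

    Assumption: SHA-1 + base32 + truncation is illustrative only and is not
    confirmed to be the scheme SPRAS uses.
    """
    # Sorting keys makes the hash independent of the order parameters were written in.
    encoded = json.dumps(params, sort_keys=True).encode("utf-8")
    digest = hashlib.sha1(encoded).digest()
    # Base32 uses only uppercase letters and the digits 2-7, so it is filename safe.
    return base64.b32encode(digest).decode("ascii")[:length]


def output_folder(dataset: str, algorithm: str, params: dict) -> str:
    """Build a folder name in the dataset-algorithm-params-HASH pattern."""
    return f"{dataset}-{algorithm}-params-{params_hash(params)}"


if __name__ == "__main__":
    print(output_folder("egfr", "pathlinker", {"k": 10}))
```

Because the identifier depends only on the parameter values, rerunning with the same parameters maps to the same folder, while any changed value maps to a new one.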


2. Organizing outputs per parameter combination

Each new dataset–algorithm–parameter combination gets its own folder (e.g., egfr-pathlinker-params-7S4SLU6/ and egfr-pathlinker-params-VQL7BDZ/).
Collaborator Author

add the parameters that ran
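
One way to picture which parameters ran: a list-valued setting such as `k: [10, 100]` can be thought of as expanding into one parameter combination per output folder. The sketch below is an assumption about that expansion, not SPRAS's actual code, and `expand_params` is a hypothetical helper name.

```python
from itertools import product


def expand_params(run: dict) -> list[dict]:
    """Expand a run whose values may be lists into one dict per combination.

    Hypothetical helper for illustration; SPRAS's own expansion may differ.
    """
    names = sorted(run)
    # Wrap scalars in a list so every parameter contributes at least one value.
    value_lists = [run[n] if isinstance(run[n], list) else [run[n]] for n in names]
    # The Cartesian product yields every combination of the listed values.
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


if __name__ == "__main__":
    for combo in expand_params({"k": [10, 100]}):
        print(combo)
```

Under this reading, `k: [10, 100]` yields two combinations, which would correspond to the two folders named above. Which hash belongs to which value of k is not stated here.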


2. Running the summary analysis

SPRAS aggregates the pathway.txt files from all selected parameter combinations into a single summary table. This table reports key graph-based statistics for each pathway, including:
Collaborator Author

@ntalluri ntalluri Oct 8, 2025

Suggested change
SPRAS aggregates the pathway.txt files from all selected parameter combinations into a single summary table. This table reports key graph-based statistics for each pathway, including:
SPRAS aggregates the pathway.txt files from all selected parameter combinations per dataset into a single summary table. This table reports key graph topological statistics for each pathway, including:
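
To illustrate what "graph topological statistics" could mean for a single pathway, here is a small sketch that treats a pathway as an undirected edge list. The statistics chosen (node count, edge count, connected components, maximum degree) and the function name `summarize_pathway` are assumptions for illustration. The actual columns of the SPRAS summary table and the pathway.txt file layout are not specified in this thread.

```python
from collections import defaultdict


def summarize_pathway(edges: list[tuple[str, str]]) -> dict:
    """Compute simple topological statistics for an undirected edge list."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    # Count connected components with an iterative depth-first search.
    seen = set()
    components = 0
    for start in adjacency:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)

    return {
        "nodes": len(adjacency),
        # frozenset ignores direction, so (A, B) and (B, A) count as one edge.
        "edges": len({frozenset(edge) for edge in edges}),
        "connected_components": components,
        "max_degree": max((len(n) for n in adjacency.values()), default=0),
    }


if __name__ == "__main__":
    print(summarize_pathway([("A", "B"), ("B", "C"), ("D", "E")]))
```

Running one such summary per parameter combination and stacking the rows gives a table of the kind described above, one row per pathway.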


3. Running the Cytoscape analysis

All pathway.txt files from the chosen parameter combinations are collected and passed into the Cytoscape Docker image. A Cytoscape session file is then generated, containing visualizations for each pathway. This file is saved as egfr-cytoscape.cys and can be opened in Cytoscape for interactive exploration.
Collaborator Author

Suggested change
All pathway.txt files from the chosen parameter combinations are collected and passed into the Cytoscape Docker image. A Cytoscape session file is then generated, containing visualizations for each pathway. This file is saved as egfr-cytoscape.cys and can be opened in Cytoscape for interactive exploration.
All pathway.txt files from the given parameter combinations for a specific dataset are collected and passed into the Cytoscape Docker image. A Cytoscape session file is then generated, containing visualizations for each pathway. This file is saved as egfr-cytoscape.cys and can be opened in Cytoscape for interactive exploration.

- Domino
- Source-Targets Random Walk with Restarts
- Random Walk with Restarts
- BowTieBuilder
Collaborator Author

BowTieBuilder on the EGFR data takes a very long time. I just ran it, and after 2 hours it still has not finished.

Collaborator

@agitter agitter left a comment

Our goal is to have a version of this merged very soon so that Neha can test it on Monday. If there isn't time to address all of my comments, they can be addressed in a follow up pull request.

k: 1
# run2: # uncomment for step 3.2
# k: [10, 100] # uncomment for step 3.2

Collaborator

Is this still indented too much? It looks unaligned in GitHub.

Collaborator

Do we need to move these planning notes before merging? Or keep them here temporarily and delete them in the pre-COMBINE follow up pull request?


Required knowledge:

- Basic Python skills
Collaborator

Will participants be coding in Python? Or is it the ability to edit the yaml files and run command line tools?

Collaborator Author

Ability to edit the yaml files and run command line tools

What Happens When You Run This Command
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

SPRAS will run more slowly than the beginner.yaml configuration.
Collaborator

Will take longer to run?

Collaborator Author

Yes

Advanced Capabilities and Features
======================================

This section is more of a list of everything SPRAS can do that we will not be demonstrating.
Collaborator

Yes, this can be a list more or less of things SPRAS can do. The beginner and intermediate steps will already take plenty of time.

===============================
Pathway reconstruction algorithms allow researchers to systematically find context-specific subnetworks without performing exhaustive experiments. Different algorithms use distinct computational strategies and parameters, giving researchers the flexibility to highlight different aspects of the underlying biology, identify subnetworks specific to their experimental conditions, and generate new, testable hypotheses.

What is SPRAS?
Collaborator Author

Explain this better: do we need to state somewhere what SPRAS does at a high level? It takes input node information and a network, runs one or more algorithms with one or more parameter combinations, etc. A newcomer may not know yet what we mean by algorithms and datasets in this sentence.


5. Running the algorithm

SPRAS launches the PathLinker Docker image, sending it the prepared files and parameter settings.
Collaborator Author

Link to the docker image on docker hub

Co-authored-by: Anthony Gitter <agitter@users.noreply.github.com>
Collaborator

@agitter agitter left a comment

Approving the temporary version for the demo

@agitter agitter merged commit 4f97f33 into Reed-CompBio:main Oct 13, 2025
19 checks passed

Labels

documentation Improvements or additions to documentation


3 participants