- SSH into your Wynton account and then into a dev node. If you do not have a Wynton account and need to request one, see this link. If you're unfamiliar with Wynton, there is very good documentation; you can start here. Read at least the first 4 tabs under the "Get Started" header before continuing.
- Clone the GitHub repo to a Wynton working directory: `git clone https://github.com/kroganlab/af3.template.git`
- Move into `af3.template`; this will be your project working directory: `cd af3.template`
- The first time you run this on Wynton, make sure you have the R packages you need (only necessary for some of the post-processing after AlphaFold completes): `bash installPackages.sh`
- Make a new `AlphaFoldJobList.csv` file (match the format of `topten.jobTable.txt`)
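As a rough sketch, a job table row appears to be an index followed by the two protein names in the pair; this layout is inferred from the remaining-runs one-liner later in this document, and `BAIT1`/`PREY1` etc. are made-up names, so defer to `topten.jobTable.txt` for the authoritative format:

```csv
0,BAIT1,PREY1
1,BAIT1,PREY2
2,BAIT1,PREY3
```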
- Make a new `masterFasta.fasta` file (match the format of `topten_preys.fa`)
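For reference, FASTA format alternates a `>` header line naming the sequence with the amino-acid sequence itself; the sequence names should presumably match the names used in `AlphaFoldJobList.csv`. The names and sequences below are invented, so match yours against `topten_preys.fa`:

```
>PREY1
MSTNPKPQRKTKRNTNRRPQDVKFPGG
>PREY2
MKVLWAALLVTFLAGCQAKVEQAVETE
```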
- Edit the submission script `af.jobs.sh`: most importantly the number of tasks, but also new file names or job names if desired
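As a rough sketch, the lines you would typically touch in a standard SGE array-job script look like the following; the directive values here are placeholders, and the real `af.jobs.sh` will have its own settings:

```sh
#$ -N af3_batch          # job name (optional to change)
#$ -t 1-10               # task array: set the range to the number of rows in AlphaFoldJobList.csv
#$ -l h_rt=02:00:00      # per-task runtime limit; extend this for large complexes
```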
- Submit the job with `qsub af.jobs.sh`
- View queue and job status with `qstat`
- Once the job is finished, check your output folder. Each PPI should have its own subdirectory. Within the output directory, the key output files are `{PPI_name}_summaryScores.csv`, `{PPI_name}.msa.png` and `{PPI_name}.pae.png`
- Sometimes AF jobs fail to complete, often due to timeout. To check for incomplete runs, you can use the following bash one-liner. Just `cd` into the output directory and run:

  ```bash
  for dir in *; do if [[ ! -e "./$dir/${dir}_model.cif" ]]; then echo "$dir"; fi; done
  ```

  This checks for the existence of the top-scoring model and prints the name of each directory where it isn't found (remove the `!` to print the names of completed runs instead).
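The check above can be exercised against a throwaway directory tree; the directory and protein names here are invented, and a real output directory will of course differ:

```shell
# Build a mock output tree: one "completed" run, one "incomplete" run.
mkdir -p demo_output/protA__protB demo_output/protC__protD
touch demo_output/protA__protB/protA__protB_model.cif   # simulate a completed run
cd demo_output
# Print directories missing their top-scoring model file.
for dir in *; do if [[ ! -e "./$dir/${dir}_model.cif" ]]; then echo "$dir"; fi; done
cd ..
```

Only `protC__protD` is printed, since it lacks a `_model.cif` file.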
- You can capture these incomplete runs and convert them to a new `AlphaFoldJobList_remainingRuns.csv` for a fresh submission using the following one-liner (this assumes you are running it from the parent directory of `outputDir`, i.e. the layout is `./outputDir/protein1__protein2/outputFiles`):

  ```bash
  find ./outputDir -mindepth 1 -maxdepth 1 -type d '!' -exec test -e "{}/ranking_scores.csv" ';' -print | cut -d '/' -f3 | awk 'BEGIN{FS="__"; OFS=","} {print NR-1, toupper($1), toupper($2)}' > ./AlphaFoldJobList_remainingRuns.csv
  ```

  Be sure to edit the `af.jobs.sh` script and extend the job runtime beyond two hours to avoid another timeout!
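The remaining-runs one-liner can be sanity-checked against a mock tree; the directory names below are invented. Note that `-mindepth 1` keeps `outputDir` itself out of the results, and setting `FS` in a `BEGIN` block ensures the first line is split on `__` correctly:

```shell
# Mock tree: one finished run (has ranking_scores.csv), one unfinished.
mkdir -p mockdir/outputDir/bait1__prey1 mockdir/outputDir/bait1__prey2
touch mockdir/outputDir/bait1__prey1/ranking_scores.csv   # simulate a finished run
cd mockdir
# List subdirectories missing ranking_scores.csv and rewrite them as job-table rows.
find ./outputDir -mindepth 1 -maxdepth 1 -type d '!' -exec test -e "{}/ranking_scores.csv" ';' -print \
  | cut -d '/' -f3 \
  | awk 'BEGIN{FS="__"; OFS=","} {print NR-1, toupper($1), toupper($2)}' \
  > ./AlphaFoldJobList_remainingRuns.csv
cat ./AlphaFoldJobList_remainingRuns.csv   # -> 0,BAIT1,PREY2
cd ..
```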
- A handy one-liner for collating all `summaryScores.csv` files into a single file with only a single header:

  ```bash
  awk 'BEGIN{FS=","} NR == 1 || $1 != "model" { print }' output/*/*_summaryScores.csv > AllSummaryScores.csv
  ```
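The collation one-liner keeps the very first line (`NR == 1`) and then drops every later line whose first field is `model`, i.e. the repeated headers. A quick check on two fabricated score files (real `summaryScores.csv` files will have different columns; only the `model` header field is taken from the one-liner itself):

```shell
# Two mock per-PPI score files, each with its own header line.
mkdir -p output/a__b output/c__d
printf 'model,score\nm0,0.9\n' > output/a__b/a__b_summaryScores.csv
printf 'model,score\nm0,0.4\n' > output/c__d/c__d_summaryScores.csv
# Collate: keep the first header, skip repeated headers, keep all data rows.
awk 'BEGIN{FS=","} NR == 1 || $1 != "model" { print }' output/*/*_summaryScores.csv > AllSummaryScores.csv
cat AllSummaryScores.csv   # one header line followed by the two data rows
```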