148 commits
2ed8115
initial compatibility changes for upgrading multimer
jnwei Jan 12, 2024
201eafd
np type update in openfold.np.relax
jnwei Jan 16, 2024
e71c1b1
initial compatibility changes for upgrading multimer
jnwei Jan 12, 2024
91776cd
np type update in openfold.np.relax
jnwei Jan 16, 2024
427a6ee
update deprecated jax.numpy.DeviceArray to jax.Array
jnwei Jan 23, 2024
9c94078
Merge branch 'multimer-pytorch-update' of https://github.com/aqlabora…
jnwei Jan 23, 2024
e813bb5
Additional fix for multimer deepspeed test
christinaflo Jan 23, 2024
df4dfac
first pass changes to run with pl 2.1
jnwei Jan 24, 2024
456103d
initial compatibility changes for upgrading multimer
jnwei Jan 12, 2024
ff36800
first pass changes to run with pl 2.1
jnwei Jan 24, 2024
f0fc7d9
merging changes from main
jnwei Feb 19, 2024
49ab053
Merge pull request #407 from jnwei/pl_upgrades
jnwei Feb 19, 2024
17f24bd
Added custom template folder
rostro36 Feb 20, 2024
10b6838
Fix always is_custom_template.
rostro36 Mar 1, 2024
e9bacd8
Less dependent on input sequnece, use template length instead
rostro36 Mar 2, 2024
50f8617
Support for multiple custom templates.
rostro36 Mar 2, 2024
5f5a79a
initial compatibility changes for upgrading multimer
jnwei Jan 12, 2024
6dc34d7
first pass changes to run with pl 2.1
jnwei Jan 24, 2024
36cd9eb
Merge branch 'pl_upgrades' of https://github.com/jnwei/openfold into …
jnwei Mar 19, 2024
295d0d5
Fixed documentation according to comments.
rostro36 Mar 20, 2024
a317ad2
superimposition fix from Aymen
jnwei Mar 21, 2024
cfd2e71
seed workers fix and validation_epoch_end extra argument
jnwei Mar 25, 2024
6b4f167
Added alignment method.
rostro36 Mar 25, 2024
0c3435c
add metric logging to progress bar.
jnwei Mar 25, 2024
5ff5177
more logging changes
jnwei Mar 27, 2024
8626358
add paren to save_hyperparameters
jnwei Apr 2, 2024
577219c
Removes OF copy of zero_to_fp32.py favoring deepspeed.util version
jnwei Apr 2, 2024
523adaf
adds reload_dataloaders_every_n_epochs flag
jnwei Apr 11, 2024
80e6341
change message for test_model.py compare
jnwei Apr 17, 2024
1ae833b
Updates low_precision check to use current precision settings.
jnwei Apr 19, 2024
ea142a0
fixes deepspeed function definition.
jnwei Apr 19, 2024
5ccb7de
updates Dockerfile
jnwei Apr 19, 2024
3cab807
fix mkl version to 2024.0.0
jnwei Apr 19, 2024
866477a
Update gpg keys for Docker build
jnwei Apr 20, 2024
addb80a
changes to Dockerfile ane pin mkl to 2024
jnwei Apr 20, 2024
793eb96
adjust pytorch version number
jnwei Apr 20, 2024
ad34fc3
updates Bio.PDBData call and environment.yml
jnwei Apr 22, 2024
ed5261f
Split cuda install commands in Dockerfile
jnwei Apr 22, 2024
0b11ced
change mamba version
jnwei Apr 22, 2024
1d22373
upgrading hmmer hhsuite and kalign2 packages
jnwei Apr 22, 2024
cf0cc8b
small edit to Dockerfile
jnwei Apr 22, 2024
12eb81b
Reset miniforge version to 23.3.1-1
jnwei Apr 22, 2024
4ee9943
Remove nvcc compute capability 37 which caused kernel build issues
jnwei Apr 23, 2024
76fb7ce
remove test print statements
jnwei May 6, 2024
a51b08c
initial compatibility changes for upgrading multimer
jnwei Jan 12, 2024
7de0ab0
first pass changes to run with pl 2.1
jnwei Jan 24, 2024
4f2f069
Additional fix for multimer deepspeed test
christinaflo Jan 23, 2024
53cdb24
add metric logging to progress bar.
jnwei Mar 25, 2024
3eed6cb
more logging changes
jnwei Mar 27, 2024
19c8158
adds reload_dataloaders_every_n_epochs flag
jnwei Apr 11, 2024
cdf6039
change message for test_model.py compare
jnwei Apr 17, 2024
0a8ae6a
Updates low_precision check to use current precision settings.
jnwei Apr 19, 2024
ee502a2
updates Dockerfile
jnwei Apr 19, 2024
4631b54
Update gpg keys for Docker build
jnwei Apr 20, 2024
1b7f8f4
changes to Dockerfile ane pin mkl to 2024
jnwei Apr 20, 2024
8d816e3
adjust pytorch version number
jnwei Apr 20, 2024
571ae26
updates Bio.PDBData call and environment.yml
jnwei Apr 22, 2024
6b9c61c
Split cuda install commands in Dockerfile
jnwei Apr 22, 2024
435ec2f
change mamba version
jnwei Apr 22, 2024
2b2f7b7
upgrading hmmer hhsuite and kalign2 packages
jnwei Apr 22, 2024
52d4bb8
small edit to Dockerfile
jnwei Apr 22, 2024
4172f34
Reset miniforge version to 23.3.1-1
jnwei Apr 22, 2024
e3e09c4
Remove nvcc compute capability 37 which caused kernel build issues
jnwei Apr 23, 2024
c715b13
Merge remote-tracking branch 'refs/remotes/jnwei/pl_upgrades' into pl…
jnwei May 6, 2024
f10f662
pins mkl version to 2022 to avoid conda environment conflict
jnwei May 6, 2024
ed69f06
make space for docker CI
jnwei May 6, 2024
12eae13
Update docker-image.yml
jnwei May 11, 2024
0eaf08a
make sure padded asym_id won't affect permutation steps
dingquanyu Feb 15, 2024
54ec5c4
fixed bugs in unittests for multi-chain permutation. now working on e…
dingquanyu Feb 15, 2024
b542701
remove unnecessary lines
dingquanyu Feb 15, 2024
2669287
restore to the verison on main
dingquanyu Feb 15, 2024
9f964fe
added typing hints and fixed some comments
dingquanyu Feb 16, 2024
939fd0a
make sure no padded features are going to be selected as anchors
dingquanyu Feb 20, 2024
9597368
fixed typing errors; added more comments
dingquanyu Mar 21, 2024
515b082
added comments
dingquanyu Mar 21, 2024
eb262d2
update comments;fixed typos
dingquanyu May 10, 2024
ad414ec
Update tests and comments
dingquanyu May 10, 2024
dc2da1f
fixed typing error of anchor_gt_residue
dingquanyu May 10, 2024
5221ed4
Update test_permutation.py
jnwei May 11, 2024
26f8761
Initial commit for sphinx documentation.
jnwei Mar 20, 2024
4873c02
Rough draft dump of docs and readthedocs build
jnwei May 8, 2024
0b724be
fix typo in readthedocs.yaml
jnwei May 8, 2024
d4a14b6
replace doc environment pip dependencies with conda builds
jnwei May 8, 2024
9f1e0a8
cleanup makefiles and original readme
jnwei May 8, 2024
9c98e57
updates to Inference.md
jnwei May 8, 2024
44e5733
Add addtional inference pages
jnwei May 8, 2024
6a52cc4
add convert v1 weights instructions
jnwei May 8, 2024
d64dffd
Adds FAQ section
jnwei May 8, 2024
f1175ab
creates link to FAQ in documentation
jnwei May 8, 2024
6261c95
small edits to main page
jnwei May 8, 2024
55c1e0e
minor language edits
jnwei May 8, 2024
a744abe
Adds mkl version to environment.yml
jnwei Apr 30, 2024
89c756d
make space for docker CI
jnwei May 6, 2024
78644cd
Shorten README.md main page.
jnwei May 9, 2024
e338f20
adds mmseqs2 to environment.yml for clustering
jnwei May 9, 2024
0b30bb8
Update training OpenFold docs with correct paths.
jnwei May 10, 2024
cc565fd
Adds example directory
jnwei May 10, 2024
3ed09c6
Update docker-image.yml
jnwei May 13, 2024
e5ce219
Update docker-image.yml
jnwei May 13, 2024
a8c61c6
Merge branch 'main' into pl_upgrades
jnwei May 13, 2024
1647ec9
fix typo in environment.yml
jnwei May 13, 2024
dc93d33
remove mpipy from pip install requirements
jnwei May 13, 2024
c07075c
in scripts/utils.py account for case where no conda environment is sp…
jnwei May 13, 2024
3bec3e9
Merge pull request #438 from jnwei/pl_upgrades
jnwei May 13, 2024
3c1fd31
Merge pull request #443 from jnwei/main
jnwei May 17, 2024
734ebc4
Update Aux_seq_files.md
jnwei May 23, 2024
f6c875b
Merge pull request #448 from aqlaboratory/aux_seq_files_update
jnwei May 23, 2024
d3c89fc
Fix 3-to-1 letter conversion to use extended mapping
ljarosch Jul 6, 2024
f37d0d9
Merge pull request #464 from ljarosch/main
jnwei Jul 8, 2024
8b5212d
Remove unnecessary double-load of config_json.
ryan-attunely Jul 17, 2024
0dafd62
Update documentation.
rostro36 Jul 18, 2024
b38b607
Change path of Inference.md
rostro36 Jul 18, 2024
7d22739
Merge branch 'main' into main
rostro36 Jul 18, 2024
c48f850
Merge pull request #408 from rostro36/main
jnwei Jul 18, 2024
6f63267
Merge pull request #470 from rkosai/main
jnwei Jul 18, 2024
e6ce9c9
Update Installation.md - fix pl_upgrades clone instructions
vaclavhanzl Aug 24, 2024
6bdbd48
Fix pl_upgrades enwironment (numpy, cuda, gcc)
vaclavhanzl Oct 22, 2024
23cf2f6
Merge pull request #496 from vaclavhanzl/vh-fix-pl_upgrades-env
jnwei Nov 7, 2024
b79ca29
Merge pull request #479 from vaclavhanzl/patch-3
jnwei Nov 7, 2024
cde001f
add environment variables
etowahadams Nov 13, 2024
8ece4f3
Merge pull request #502 from etowahadams/etowahadams/install-docs
jnwei Nov 14, 2024
625ade9
Revert "docs: Add env var instructions to install guide "
jnwei Nov 14, 2024
e605ec8
Merge pull request #504 from aqlaboratory/revert-502-etowahadams/inst…
ljarosch Nov 14, 2024
a01a60f
docs: env variable
etowahadams Nov 14, 2024
9f09442
fix: formatting
etowahadams Nov 14, 2024
6a43510
fix: formatting
etowahadams Nov 14, 2024
f1d0ae7
try quotes
etowahadams Nov 14, 2024
c05a354
fix: formatting
etowahadams Nov 14, 2024
56277ea
fix: formatting
etowahadams Nov 14, 2024
a1192c8
Merge pull request #505 from etowahadams/update-install
jnwei Dec 4, 2024
a364def
Minor typo fix in Installation.md
nenuadrian Dec 26, 2024
e8d3558
updated script
etowahadams Feb 23, 2025
26d1a5d
Merge pull request #520 from etowahadams/etowahadams/update-script
jnwei Feb 24, 2025
815a042
Merge pull request #516 from nenuadrian/patch-1
jnwei Feb 24, 2025
100a309
Maintainance to pl_upgrades
jnwei Apr 23, 2025
ab4a245
Merge branch 'main' into pl_upgrades
jnwei Apr 23, 2025
0c2d455
fix environment to support tests
jnwei Apr 23, 2025
9caf30a
Change casting for deepspeed compare model test to fp32
jnwei Apr 24, 2025
7e06ed9
support openmm>8 and fix tolerance units in amber minimization
jnwei Apr 24, 2025
cb899a5
Merge pull request #2 from aqlaboratory/pl_upgrades
jnwei Apr 24, 2025
da37880
Allow numpy>2 and support compute capability >9
jnwei Apr 24, 2025
4312aec
Add link to issue for deepspeed_evo_attention test.
jnwei Apr 25, 2025
0672517
Update installation docs to build CUDA12 version
jnwei Apr 25, 2025
16af434
fix inference documentation
jnwei Apr 25, 2025
fe10216
update version number.
jnwei Apr 25, 2025
a5433c3
Update config.py
jnwei Apr 25, 2025
620a54f
Update amber_minimize.py
jnwei Apr 25, 2025
dbc2d30
Update OpenFold notebook to updated pytorch2 commit
jnwei Apr 26, 2025
17 changes: 10 additions & 7 deletions Dockerfile
@@ -1,17 +1,20 @@
FROM nvidia/cuda:11.3.1-cudnn8-devel-ubuntu18.04
FROM nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04

# metainformation
LABEL org.opencontainers.image.version = "1.0.0"
LABEL org.opencontainers.image.authors = "Gustaf Ahdritz"
LABEL org.opencontainers.image.version = "2.0.0"
LABEL org.opencontainers.image.authors = "OpenFold Team"
LABEL org.opencontainers.image.source = "https://github.com/aqlaboratory/openfold"
LABEL org.opencontainers.image.licenses = "Apache License 2.0"
LABEL org.opencontainers.image.base.name="docker.io/nvidia/cuda:10.2-cudnn8-runtime-ubuntu18.04"
LABEL org.opencontainers.image.base.name="docker.io/nvidia/cuda:12.4.1-devel-ubuntu22.04"

RUN apt-get update && apt-get install -y wget

RUN apt-key del 7fa2af80
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
RUN apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/3bf863cc.pub
RUN wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
RUN dpkg -i cuda-keyring_1.0-1_all.deb

RUN apt-get install -y libxml2 cuda-minimal-build-12-1 libcusparse-dev-12-1 libcublas-dev-12-1 libcusolver-dev-12-1 git

RUN apt-get update && apt-get install -y wget libxml2 cuda-minimal-build-11-3 libcusparse-dev-11-3 libcublas-dev-11-3 libcusolver-dev-11-3 git
RUN wget -P /tmp \
"https://github.com/conda-forge/miniforge/releases/download/23.3.1-1/Miniforge3-Linux-x86_64.sh" \
&& bash /tmp/Miniforge3-Linux-x86_64.sh -b -p /opt/conda \
6 changes: 3 additions & 3 deletions docs/source/Aux_seq_files.md
@@ -68,9 +68,9 @@ All together, the file directory would look like:
└── 6kwc.cif
└── alignment_db
├── alignment_db_0.db
├── alignment_db_1.db
...
├── alignment_db_9.db
├── alignment_db_1.db
...
├── alignment_db_9.db
└── alignment_db.index
```

15 changes: 8 additions & 7 deletions docs/source/Inference.md
@@ -62,7 +62,7 @@ python3 run_pretrained_openfold.py \
$TEMPLATE_MMCIF_DIR
--output_dir $OUTPUT_DIR \
--config_preset model_1_ptm \
--uniref90_database_path $BASE_DATA_DIR/uniref90 \
--uniref90_database_path $BASE_DATA_DIR/uniref90/uniref90.fasta \
--mgnify_database_path $BASE_DATA_DIR/mgnify/mgy_clusters_2018_12.fa \
--pdb70_database_path $BASE_DATA_DIR/pdb70 \
--uniclust30_database_path $BASE_DATA_DIR/uniclust30/uniclust30_2018_08/uniclust30_2018_08 \
@@ -138,6 +138,7 @@ Some commonly used command line flags are here. A full list of flags can be view
- `--data_random_seed`: Specifies a random seed to use.
- `--save_outputs`: Saves a copy of all outputs from the model, e.g. the output of the msa track, ptm heads.
- `--experiment_config_json`: Specify configuration settings using a json file. For example, passing a json with `{globals.relax.max_iterations = 10}` specifies 10 as the maximum number of relaxation iterations. See for [`openfold/config.py`](https://github.com/aqlaboratory/openfold/blob/main/openfold/config.py#L283) the full dictionary of configuration settings. Any parameters that are not manually set in these configuration settings will refer to the defaults specified by your `config_preset`.
- `--use_custom_template`: Uses all .cif files in `template_mmcif_dir` as template input. Make sure the chains of interest have the identifier _A_ and have the same length as the input sequence. The same templates will be read for all sequences that are passed for inference.
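
A minimal sketch of preparing a file for `--experiment_config_json`. The fragment `{globals.relax.max_iterations = 10}` quoted in the flag description is not valid JSON as written. The sketch assumes the script accepts a flat JSON object keyed by dotted config paths; that assumption, and the exact key path, should be checked against `openfold/config.py`. The file name `experiment_config.json` and the helper `write_overrides` are hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Assumption: overrides are a flat JSON object keyed by dotted config paths.
# The key below is copied from the flag description, not verified against config.py.
overrides = {"globals.relax.max_iterations": 10}


def write_overrides(path: Path, values: dict) -> Path:
    """Serialize the override values as JSON and return the path written."""
    path.write_text(json.dumps(values, indent=2))
    return path


# Hypothetical file name; any writable location works.
config_path = write_overrides(
    Path(tempfile.gettempdir()) / "experiment_config.json", overrides
)
print(config_path.read_text())
```

The resulting path would then be passed as `--experiment_config_json <path>`; settings not listed in the file fall back to the defaults of the chosen `config_preset`, as the flag description states.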


### Advanced Options for Increasing Efficiency
@@ -159,12 +160,12 @@ Note that chunking (as defined in section 1.11.8 of the AlphaFold 2 supplement)
#### Long sequence inference
To minimize memory usage during inference on long sequences, consider the following changes:

- As noted in the AlphaFold-Multimer paper, the AlphaFold/OpenFold template stack is a major memory bottleneck for inference on long sequences. OpenFold supports two mutually exclusive inference modes to address this issue. One, `average_templates` in the `template` section of the config, is similar to the solution offered by AlphaFold-Multimer, which is simply to average individual template representations. Our version is modified slightly to accommodate weights trained using the standard template algorithm. Using said weights, we notice no significant difference in performance between our averaged template embeddings and the standard ones. The second, `offload_templates`, temporarily offloads individual template embeddings into CPU memory. The former is an approximation while the latter is slightly slower; both are memory-efficient and allow the model to utilize arbitrarily many templates across sequence lengths. Both are disabled by default, and it is up to the user to determine which best suits their needs, if either.
- Inference-time low-memory attention (LMA) can be enabled in the model config. This setting trades off speed for vastly improved memory usage. By default, LMA is run with query and key chunk sizes of 1024 and 4096, respectively. These represent a favorable tradeoff in most memory-constrained cases. Powerusers can choose to tweak these settings in `openfold/model/primitives.py`. For more information on the LMA algorithm, see the aforementioned Staats & Rabe preprint.
- Disable `tune_chunk_size` for long sequences. Past a certain point, it only wastes time.
- As a last resort, consider enabling `offload_inference`. This enables more extensive CPU offloading at various bottlenecks throughout the model.
- As noted in the AlphaFold-Multimer paper, the AlphaFold/OpenFold template stack is a major memory bottleneck for inference on long sequences. OpenFold supports two mutually exclusive inference modes to address this issue. One, `average_templates` in the `template` section of the config, is similar to the solution offered by AlphaFold-Multimer, which is simply to average individual template representations. Our version is modified slightly to accommodate weights trained using the standard template algorithm. Using said weights, we notice no significant difference in performance between our averaged template embeddings and the standard ones. The second, `offload_templates`, temporarily offloads individual template embeddings into CPU memory. The former is an approximation while the latter is slightly slower; both are memory-efficient and allow the model to utilize arbitrarily many templates across sequence lengths. Both are disabled by default, and it is up to the user to determine which best suits their needs, if either.
- Inference-time low-memory attention (LMA) can be enabled in the model config. This setting trades off speed for vastly improved memory usage. By default, LMA is run with query and key chunk sizes of 1024 and 4096, respectively. These represent a favorable tradeoff in most memory-constrained cases. Powerusers can choose to tweak these settings in `openfold/model/primitives.py`. For more information on the LMA algorithm, see the aforementioned Staats & Rabe preprint.
- Disable `tune_chunk_size` for long sequences. Past a certain point, it only wastes time.
- As a last resort, consider enabling `offload_inference`. This enables more extensive CPU offloading at various bottlenecks throughout the model.
- Disable FlashAttention, which seems unstable on long sequences.

Using the most conservative settings, we were able to run inference on a 4600-residue complex with a single A100. Compared to AlphaFold's own memory offloading mode, ours is considerably faster; the same complex takes the more efficent AlphaFold-Multimer more than double the time. Use the `long_sequence_inference` config option to enable all of these interventions at once. The `run_pretrained_openfold.py` script can enable this config option with the `--long_sequence_inference` command line option
Using the most conservative settings, we were able to run inference on a 4600-residue complex with a single A100. Compared to AlphaFold's own memory offloading mode, ours is considerably faster; the same complex takes the more efficent AlphaFold-Multimer more than double the time. Use the `long_sequence_inference` config option to enable all of these interventions at once. The `run_pretrained_openfold.py` script can enable this config option with the `--long_sequence_inference` command line option

Input FASTA files containing multiple sequences are treated as complexes. In this case, the inference script runs AlphaFold-Gap, a hack proposed [here](https://twitter.com/minkbaek/status/1417538291709071362?lang=en), using the specified stock AlphaFold/OpenFold parameters (NOT AlphaFold-Multimer).
Input FASTA files containing multiple sequences are treated as complexes. In this case, the inference script runs AlphaFold-Gap, a hack proposed [here](https://twitter.com/minkbaek/status/1417538291709071362?lang=en), using the specified stock AlphaFold/OpenFold parameters (NOT AlphaFold-Multimer).
23 changes: 12 additions & 11 deletions docs/source/Installation.md
@@ -4,7 +4,7 @@ In this guide, we will OpenFold and its dependencies.

**Pre-requisites**

This package is currently supported for CUDA 11 and Pytorch 1.12. All dependencies are listed in the [`environment.yml`](https://github.com/aqlaboratory/openfold/blob/main/environment.yml)
This package is currently supported for CUDA 12 and Pytorch 2. All dependencies are listed in the [`environment.yml`](https://github.com/aqlaboratory/openfold/blob/main/environment.yml).

At this time, only Linux systems are supported.

@@ -19,9 +19,16 @@ At this time, only Linux systems are supported.
Mamba is recommended as the dependencies required by OpenFold are quite large and mamba can speed up the process.
- Activate the environment, e.g `conda activate openfold_env`
1. Run the setup script to configure kernels and folding resources.
> scripts/install_third_party_dependencies.sh`
1. Prepend the conda environment to the `$LD_LIBRARY_PATH`., e.g.
`export $LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH`. You may optionally set this as a conda environment variable according to the [conda docs](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#saving-environment-variables) to activate each time the environment is used.
> scripts/install_third_party_dependencies.sh
1. Prepend the conda environment to the `$LD_LIBRARY_PATH` and `$LIBRARY_PATH`., e.g.

```
export LIBRARY_PATH=$CONDA_PREFIX/lib:$LIBRARY_PATH
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
```

You may optionally set this as a conda environment variable according to the [conda docs](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#saving-environment-variables) to activate each time the environment is used.

1. Download parameters. We recommend using a destination as `openfold/resources` as our unittests will look for the weights there.
- For AlphaFold2 weights, use
> ./scripts/download_alphafold_params.sh <dest>
@@ -46,12 +53,6 @@ Certain tests perform equivalence comparisons with the AlphaFold implementation.

## Environment specific modifications

### CUDA 12
To use OpenFold on CUDA 12 environment rather than a CUDA 11 environment.
In step 1, use the branch [`pl_upgrades`](https://github.com/aqlaboratory/openfold/tree/pl_upgrades) rather than the main branch, i.e. replace the URL in step 1 with https://github.com/aqlaboratory/openfold/tree/pl_upgrades
Follow the rest of the steps of [Installation Guide](#Installation)


### MPI
To use OpenFold with MPI support, you will need to add the package [`mpi4py`](https://pypi.org/project/mpi4py/). This can be done with pip in your OpenFold environment, e.g. `$ pip install mpi4py`.

Expand All @@ -64,4 +65,4 @@ If you don't have access to `aws` on your system, you can use a different downlo

### Docker setup

A [`Dockerfile`] is provided to build an OpenFold Docker image. Additional notes for setting up a docker container for OpenFold and running inference can be found [here](original_readme.md#building-and-using-the-docker-container).
A [`Dockerfile`](https://github.com/aqlaboratory/openfold/blob/main/Dockerfile) is provided to build an OpenFold Docker image. Additional notes for setting up a docker container for OpenFold and running inference can be found [here](original_readme.md#building-and-using-the-docker-container).
32 changes: 17 additions & 15 deletions environment.yml
@@ -3,36 +3,38 @@ channels:
- conda-forge
- bioconda
- pytorch
- nvidia
dependencies:
- python=3.9
- libgcc=7.2
- cuda
- gcc=12.4
- python=3.10
- setuptools=59.5.0
- pip
- openmm=7.7
- openmm
- pdbfixer
- pytorch-lightning
- biopython
- numpy
- pandas
- PyYAML==5.4.1
- PyYAML
- requests
- scipy==1.7
- tqdm==4.62.2
- typing-extensions==4.0
- scipy
- tqdm
- typing-extensions
- wandb
- modelcif==0.7
- awscli
- ml-collections
- aria2
- mkl=2024.0
- mkl
- git
- bioconda::hmmer==3.3.2
- bioconda::hhsuite==3.3.0
- bioconda::kalign2==2.04
- bioconda::mmseqs2
- pytorch::pytorch=1.12.*
- bioconda::hmmer
- bioconda::hhsuite
- bioconda::kalign2
- pytorch::pytorch=2.5
- pytorch::pytorch-cuda=12.4
- pip:
- deepspeed==0.12.4
- deepspeed==0.14.5
- dm-tree==0.1.6
- git+https://github.com/NVIDIA/dllogger.git
- git+https://github.com/Dao-AILab/flash-attention.git@5b838a8
- flash-attn
15 changes: 7 additions & 8 deletions notebooks/OpenFold.ipynb
@@ -107,11 +107,11 @@
"\n",
"python_version = f\"{version_info.major}.{version_info.minor}\"\n",
"\n",
"\n",
"os.system(\"wget -qnc https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh\")\n",
"os.system(\"bash Mambaforge-Linux-x86_64.sh -bfp /usr/local\")\n",
"os.system(\"wget -qnc https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh\")\n",
"os.system(\"bash Miniforge3-Linux-x86_64.sh -bfp /usr/local\")\n",
"os.environ[\"PATH\"] = \"/usr/local/bin:\" + os.environ[\"PATH\"]\n",
"os.system(\"mamba config --set auto_update_conda false\")\n",
"os.system(f\"mamba install -y -c conda-forge -c bioconda kalign2=2.04 hhsuite=3.3.0 openmm=7.7.0 python={python_version} pdbfixer biopython=1.83\")\n",
"os.system(f\"mamba install -y -c conda-forge -c bioconda kalign2=2.04 hhsuite=3.3.0 openmm=8.2.0 python={python_version} pdbfixer biopython=1.83\")\n",
"os.system(\"pip install -q torch ml_collections py3Dmol modelcif\")\n",
"\n",
"try:\n",
@@ -127,7 +127,7 @@
"\n",
" %shell mkdir -p /content/openfold/openfold/resources\n",
"\n",
" commit = \"3bec3e9b2d1e8bdb83887899102eff7d42dc2ba9\"\n",
" commit = \"1ffd197489aa5f35a5fbce1f00d7dd49bce1bd2f\"\n",
" os.system(f\"pip install -q git+https://github.com/aqlaboratory/openfold.git@{commit}\")\n",
"\n",
" os.system(f\"cp -f -p /content/stereo_chemical_props.txt /usr/local/lib/python{python_version}/site-packages/openfold/resources/\")\n",
@@ -893,8 +893,7 @@
"metadata": {
"colab": {
"provenance": [],
"gpuType": "T4",
"toc_visible": true
"gpuType": "T4"
},
"kernelspec": {
"display_name": "Python 3",
@@ -907,4 +906,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}
2 changes: 1 addition & 1 deletion openfold/config.py
@@ -660,7 +660,7 @@ def model_config(
},
"relax": {
"max_iterations": 0, # no max
"tolerance": 2.39,
"tolerance": 10.0,
"stiffness": 10.0,
"max_outer_iterations": 20,
"exclude_residues": [],
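
The change of `tolerance` from `2.39` to `10.0` looks like a unit conversion rather than a looser setting: the commit list includes "support openmm>8 and fix tolerance units in amber minimization", and 2.39 kcal/mol is about 10 kJ/mol. The sketch below only checks that arithmetic. Reading the old value as kcal/mol and the new one as kJ/mol is an assumption; the diff itself does not state the units.

```python
# Thermochemical calorie: 1 kcal = 4.184 kJ (exact by definition).
KJ_PER_KCAL = 4.184


def kcal_to_kj(value_kcal: float) -> float:
    """Convert an energy from kcal/mol to kJ/mol."""
    return value_kcal * KJ_PER_KCAL


old_tolerance = 2.39  # previous config value, assumed to be in kcal/mol
new_tolerance = 10.0  # new config value, assumed to be in kJ/mol

converted = kcal_to_kj(old_tolerance)
print(f"{old_tolerance} kcal/mol is about {converted:.3f} kJ/mol")
```

Under that reading, the two values describe nearly the same convergence threshold, expressed in the unit the minimizer expects.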
19 changes: 16 additions & 3 deletions openfold/data/data_pipeline.py
@@ -23,8 +23,19 @@
from typing import Mapping, Optional, Sequence, Any, MutableMapping, Union
import numpy as np
import torch
from openfold.data import templates, parsers, mmcif_parsing, msa_identifiers, msa_pairing, feature_processing_multimer
from openfold.data.templates import get_custom_template_features, empty_template_feats
from openfold.data import (
templates,
parsers,
mmcif_parsing,
msa_identifiers,
msa_pairing,
feature_processing_multimer,
)
from openfold.data.templates import (
get_custom_template_features,
empty_template_feats,
CustomHitFeaturizer,
)
from openfold.data.tools import jackhmmer, hhblits, hhsearch, hmmsearch
from openfold.np import residue_constants, protein

@@ -38,7 +49,9 @@ def make_template_features(
template_featurizer: Any,
) -> FeatureDict:
hits_cat = sum(hits.values(), [])
if(len(hits_cat) == 0 or template_featurizer is None):
if template_featurizer is None or (
len(hits_cat) == 0 and not isinstance(template_featurizer, CustomHitFeaturizer)
):
template_features = empty_template_feats(len(input_sequence))
else:
templates_result = template_featurizer.get_templates(
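
The rewritten condition can be read as a small truth table: empty template features are used when there is no featurizer at all, or when there are no hits and the featurizer is not a `CustomHitFeaturizer`. A sketch with stand-in classes follows. The two classes and the helper `use_empty_template_features` are hypothetical and only mirror the boolean logic of the hunk, not OpenFold's actual featurizers.

```python
from typing import Any, List, Mapping, Optional


class HitFeaturizer:
    """Stand-in for a featurizer that depends on template search hits."""


class CustomHitFeaturizer:
    """Stand-in sharing only its name with the class imported in the diff."""


def use_empty_template_features(
    hits: Mapping[str, List[Any]],
    template_featurizer: Optional[Any],
) -> bool:
    """Return True when the pipeline would fall back to empty template features."""
    # Concatenate the per-database hit lists, as in the original function.
    hits_cat = sum(hits.values(), [])
    return template_featurizer is None or (
        len(hits_cat) == 0
        and not isinstance(template_featurizer, CustomHitFeaturizer)
    )


# Enumerate the four featurizer/hit combinations of interest.
for featurizer in (None, HitFeaturizer(), CustomHitFeaturizer()):
    for hits in ({}, {"pdb70": ["hit"]}):
        name = type(featurizer).__name__
        print(name, bool(hits), use_empty_template_features(hits, featurizer))
```

The case that changes relative to the old condition is a custom featurizer with no search hits: it previously fell back to empty features and now proceeds to `get_templates`.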
2 changes: 1 addition & 1 deletion openfold/data/mmcif_parsing.py
@@ -283,7 +283,7 @@ def parse(
author_chain = mmcif_to_author_chain_id[chain_id]
seq = []
for monomer in seq_info:
code = PDBData.protein_letters_3to1.get(monomer.id, "X")
code = PDBData.protein_letters_3to1_extended.get(monomer.id, "X")
seq.append(code if len(code) == 1 else "X")
seq = "".join(seq)
author_chain_to_sequence[author_chain] = seq
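
The effect of switching to the extended mapping can be sketched with two toy dictionaries. The entries below are illustrative stand-ins, not Biopython's actual tables. The assumption is that the extended table covers modified residues (here selenomethionine, `MSE`) that the standard table lacks. The helper `to_sequence` is hypothetical and mirrors the loop in the hunk.

```python
from typing import Iterable, Mapping

# Toy tables for illustration only; the real ones live in Biopython's PDBData.
STANDARD_3TO1 = {"ALA": "A", "MET": "M", "GLY": "G"}
EXTENDED_3TO1 = {**STANDARD_3TO1, "MSE": "M"}  # assumed extra entry


def to_sequence(monomer_ids: Iterable[str], mapping: Mapping[str, str]) -> str:
    """Map three-letter residue ids to one-letter codes, using 'X' for unknowns."""
    seq = []
    for monomer_id in monomer_ids:
        code = mapping.get(monomer_id, "X")
        # Guard against multi-character codes, as the original loop does.
        seq.append(code if len(code) == 1 else "X")
    return "".join(seq)


residues = ["ALA", "MSE", "GLY", "UNK"]
standard_seq = to_sequence(residues, STANDARD_3TO1)
extended_seq = to_sequence(residues, EXTENDED_3TO1)
print(standard_seq, extended_seq)
```

With the standard table the modified residue collapses to `X`, while the extended table recovers its parent amino acid; residues missing from both tables still map to `X`.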