
Setting up kernels

Highlight:
RNAPro's kernel setup (custom CUDA kernels, triangle attention, triangle multiplicative, etc.) follows the configuration used by Protenix.
Most acceleration options and installation steps are identical or compatible with Protenix.
⚡ The guide below is adapted from the Protenix documentation.

  • Custom CUDA layernorm kernels, modified from FastFold and OneFlow, speed up different training stages by roughly 30%-50%. To enable this feature, run:

    export LAYERNORM_TYPE=fast_layernorm

    If the environment variable LAYERNORM_TYPE is set to fast_layernorm, the model uses the custom layernorm; otherwise, it falls back to the native PyTorch layernorm. The kernels are compiled the first time fast_layernorm is called.
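The switch above can be sketched as a simple environment-variable dispatch. Note that select_layernorm_impl is a hypothetical helper for illustration only; the actual selection logic lives inside RNAPro/Protenix:

```python
import os

def select_layernorm_impl() -> str:
    """Return which layernorm implementation the model would use,
    mirroring the documented LAYERNORM_TYPE switch (illustrative sketch)."""
    if os.environ.get("LAYERNORM_TYPE") == "fast_layernorm":
        # Custom fused CUDA layernorm, JIT-compiled on first call.
        return "fast_layernorm"
    # Default: naive PyTorch nn.LayerNorm.
    return "torch_layernorm"

# Example: enabling the fast kernel for the current process.
os.environ["LAYERNORM_TYPE"] = "fast_layernorm"
assert select_layernorm_impl() == "fast_layernorm"
```

Because the variable is read from the process environment, it must be exported before the training or inference script starts.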

  • Triangle attention kernel options

    The model supports four implementations for triangle attention, configurable in configs_base.py:

    triangle_attention = "cuequivariance"  # or "triattention"/"deepspeed"/"torch"
    1. cuEquivariance kernel (default): an optimized implementation using NVIDIA's cuEquivariance library.

    2. TriAttention kernel: a custom kernel implementation from rnapro/model/tri_attention/.

    3. DeepSpeed DS4Sci_EvoformerAttention kernel: a memory-efficient attention kernel developed in a collaboration between OpenFold and the DeepSpeed4Science initiative.

      DS4Sci_EvoformerAttention is implemented on top of CUTLASS. To use this feature, you need to clone the CUTLASS repository and point the CUTLASS_PATH environment variable at it. The Dockerfile already includes this setting:

      RUN git clone -b v3.5.1 https://github.com/NVIDIA/cutlass.git  /opt/cutlass
      ENV CUTLASS_PATH=/opt/cutlass

      If you set up RNAPro via pip, you can set the CUTLASS_PATH environment variable as follows:

      git clone -b v3.5.1 https://github.com/NVIDIA/cutlass.git  /path/to/cutlass
      export CUTLASS_PATH=/path/to/cutlass

      The kernels will be compiled when DS4Sci_EvoformerAttention is called for the first time.

    4. Torch native: the standard PyTorch implementation.
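Since a missing or wrong CUTLASS_PATH only surfaces as a compile error on the first kernel call, it can help to verify the path up front. The following check_cutlass_path helper is a hypothetical sketch, not part of RNAPro; it only checks for the include/cutlass directory that a CUTLASS checkout contains:

```python
import os
from pathlib import Path

def check_cutlass_path() -> Path:
    """Sanity-check that CUTLASS_PATH points at a CUTLASS source tree
    before the DeepSpeed kernel is JIT-compiled on first use (illustrative)."""
    path = os.environ.get("CUTLASS_PATH")
    if not path:
        raise RuntimeError(
            "CUTLASS_PATH is not set; clone CUTLASS v3.5.1 and export CUTLASS_PATH"
        )
    root = Path(path)
    # A CUTLASS checkout ships its headers under include/cutlass.
    if not (root / "include" / "cutlass").is_dir():
        raise RuntimeError(f"{root} does not look like a CUTLASS source tree")
    return root
```

Running such a check at startup turns a confusing first-call compilation failure into an immediate, actionable error message.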

  • Triangle multiplicative kernel options

    The triangle multiplicative operation supports two implementations, configurable in configs_base.py:

    triangle_multiplicative = "cuequivariance"  # or "torch"
    1. cuEquivariance kernel (default): an optimized implementation using NVIDIA's cuEquivariance library.

    2. Torch native: the standard PyTorch implementation.
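The two option sets above can be collected into a small validator. This validate_kernel_config function is a hypothetical sketch (the real settings live in configs_base.py); only the option names come from the guide:

```python
# Option names as documented for configs_base.py.
TRIANGLE_ATTENTION_OPTIONS = ("cuequivariance", "triattention", "deepspeed", "torch")
TRIANGLE_MULTIPLICATIVE_OPTIONS = ("cuequivariance", "torch")

def validate_kernel_config(triangle_attention: str = "cuequivariance",
                           triangle_multiplicative: str = "cuequivariance") -> dict:
    """Illustrative validator: reject unknown kernel names early,
    mirroring the documented defaults and option sets."""
    if triangle_attention not in TRIANGLE_ATTENTION_OPTIONS:
        raise ValueError(
            f"triangle_attention must be one of {TRIANGLE_ATTENTION_OPTIONS}, "
            f"got {triangle_attention!r}"
        )
    if triangle_multiplicative not in TRIANGLE_MULTIPLICATIVE_OPTIONS:
        raise ValueError(
            f"triangle_multiplicative must be one of {TRIANGLE_MULTIPLICATIVE_OPTIONS}, "
            f"got {triangle_multiplicative!r}"
        )
    return {"triangle_attention": triangle_attention,
            "triangle_multiplicative": triangle_multiplicative}

# Defaults correspond to the cuEquivariance kernels.
config = validate_kernel_config()
```

Validating kernel names before model construction avoids failures deep inside the first forward pass.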