# CS231N
## Overview

This project implements a self-supervised approach to modeling local folding patterns in small protein fragments (30-50 amino acids). By treating protein distance maps as images and training a masked autoencoder on them, we learn the underlying principles of protein structure without relying on labeled data. Our model uses a Vision Transformer (ViT) architecture that learns to reconstruct masked portions of protein distance maps, forcing it to capture the complex spatial relationships and constraints that govern protein folding.
## Approach

### Self-Supervised Learning Strategy

Rather than relying on labeled data, we employ a self-supervised approach in which the model learns to predict masked regions of a distance map from the visible regions (a sketch of this objective follows the list below). This strategy has several advantages:
- It doesn't require manual annotation
- It can leverage large amounts of unlabeled protein structure data
- It encourages the model to learn intrinsic properties of protein folding
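As a concrete illustration, here is a minimal sketch of the masked-reconstruction loss this strategy implies. It assumes PyTorch; the function name and tensor shapes are our own illustrative choices, not the project's exact implementation.

```python
# A minimal sketch of an MAE-style masked-reconstruction loss.
# Shapes and names here are illustrative assumptions.
import torch

def masked_reconstruction_loss(pred, target, mask):
    """MSE computed only over masked patches.

    pred, target: (batch, num_patches, patch_dim) patch values
    mask:         (batch, num_patches), 1 for masked patches, 0 for visible
    """
    per_patch = ((pred - target) ** 2).mean(dim=-1)   # (batch, num_patches)
    return (per_patch * mask).sum() / mask.sum()      # average over masked only
```

Restricting the loss to masked patches is what forces the model to infer hidden structure from the visible context rather than simply copying its input.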
### Data Representation

Protein structures are represented as distance maps, where each pixel (i, j) holds the distance between the alpha-carbon atoms of residues i and j (a sketch of constructing such a map follows the list below). These distance maps:
- Capture the complete 3D structure in a 2D format
- Show characteristic patterns associated with secondary structures
- Reveal long-range interactions critical for folding
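The construction itself is a pairwise distance computation. Below is a minimal sketch, assuming a hypothetical `ca_coords` array of shape (L, 3) holding the alpha-carbon coordinates of an L-residue fragment.

```python
# A minimal sketch of building a distance map from alpha-carbon coordinates.
# `ca_coords` is a hypothetical (L, 3) array of CA positions in angstroms.
import numpy as np

def distance_map(ca_coords: np.ndarray) -> np.ndarray:
    """Pairwise Euclidean distances between residues i and j."""
    diff = ca_coords[:, None, :] - ca_coords[None, :, :]  # (L, L, 3)
    return np.linalg.norm(diff, axis=-1)                  # (L, L), symmetric

# Example: a 40-residue fragment yields a 40x40 symmetric map with
# zeros on the diagonal (random coordinates used here for illustration).
rng = np.random.default_rng(0)
dmap = distance_map(rng.normal(size=(40, 3)))
assert np.allclose(dmap, dmap.T) and np.allclose(np.diag(dmap), 0)
```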
### Vision Transformer Architecture

We implement a ViT-based masked autoencoder, sketched in code after the list below, that:
- Divides the distance map into patches
- Randomly masks a high percentage (75%) of these patches
- Processes the visible patches through an encoder
- Reconstructs the full distance map using a decoder
- Learns, in doing so, to predict the masked regions
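The following is a simplified sketch of that pipeline, assuming PyTorch. The map size, patch size, embedding dimension, layer counts, and class name (`DistanceMapMAE`) are illustrative assumptions, not the project's actual configuration.

```python
# A simplified MAE-style forward pass over distance maps; all
# hyperparameters and names here are illustrative assumptions.
import torch
import torch.nn as nn

class DistanceMapMAE(nn.Module):
    def __init__(self, map_size=48, patch=4, dim=128, mask_ratio=0.75):
        super().__init__()
        self.patch, self.mask_ratio = patch, mask_ratio
        self.num_patches = (map_size // patch) ** 2
        self.embed = nn.Linear(patch * patch, dim)            # patchify + project
        self.pos = nn.Parameter(torch.zeros(1, self.num_patches, dim))
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), 4)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True), 2)
        self.head = nn.Linear(dim, patch * patch)             # back to pixel values

    def patchify(self, x):
        """(B, H, W) distance maps -> (B, num_patches, patch*patch)."""
        p = self.patch
        B, H, W = x.shape
        x = x.reshape(B, H // p, p, W // p, p).permute(0, 1, 3, 2, 4)
        return x.reshape(B, self.num_patches, p * p)

    def forward(self, dmap):
        patches = self.patchify(dmap)
        tokens = self.embed(patches) + self.pos
        B, N, D = tokens.shape

        # Randomly keep 25% of patches (mask ratio 0.75).
        n_keep = int(N * (1 - self.mask_ratio))
        perm = torch.rand(B, N, device=dmap.device).argsort(dim=1)
        keep = perm[:, :n_keep]
        visible = torch.gather(tokens, 1, keep[..., None].expand(-1, -1, D))

        # Encode only the visible patches.
        latent = self.encoder(visible)

        # Scatter encoded tokens back; masked slots receive the mask token,
        # then the decoder reconstructs pixel values for every patch.
        full = self.mask_token.expand(B, N, D).clone()
        full = full.scatter(1, keep[..., None].expand(-1, -1, D), latent)
        pred = self.head(self.decoder(full + self.pos))       # (B, N, p*p)

        mask = torch.ones(B, N, device=dmap.device)
        mask.scatter_(1, keep, 0.0)                           # 1 = masked patch
        return pred, patches, mask
```

Combined with the loss sketch above, a training step would look like `pred, target, mask = model(dmap)` followed by `loss = masked_reconstruction_loss(pred, target, mask)`, so the gradient signal comes only from the patches the model never saw.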