
serenafrancisco/DRARDT


DRARDT - Drug Repurposing Assessment for Rare Disease Targets

License: MIT | Python 3.9+ | DOI | FreeSASA

✨ Features

  • 🔬 Comprehensive target analysis from multiple databases (UniProt, PubMed, STRING-DB, KEGG, PDB, AlphaFold)
  • 💊 DRARDT scoring system (0-3) indicating drug repurposing potential
  • 🧬 Optional mutation analysis: maps each mutation onto the PDB and AlphaFold structures of the target and predicts its structural impact using FreeSASA and the SimBa-NI method

📁 Core Files

  1. drardt.py - Main command-line interface
  2. drardt_core.py - DRARDT scoring algorithm
  3. drardt_mutations.py - 3D structural analysis (B-factor, pLDDT, RSA, ΔΔG)
  4. simba.tsv - Amino acid properties database
  5. environment.yml - Conda environment specification

📦 Installation

Requirements

  • Conda or Miniconda
  • Linux or macOS (recommended)

🎯 Quick Install

# 1. Clone repository
git clone https://github.com/serenafrancisco/drardt.git
cd drardt

# 2. Create environment from file
conda env create -f environment.yml

# 3. Activate environment
conda activate drardt

The environment.yml file automatically installs:

  • Python 3.10
  • Biopython 1.81
  • Requests 2.31.0
  • FreeSASA (from conda-forge)

🚀 Usage

Basic Analysis

Analyze a gene target without mutations:

python drardt.py --gene GENE_NAME --email your@email.com

🧬 Analysis with Mutations

Include missense mutation analysis:

python drardt.py --gene GENE_NAME --email your@email.com \
  --mutations R367Q,F409S

📄 Save Results

Save output to a file:

python drardt.py --gene CFTR --email your@email.com --output results.txt

⚙️ Parameters

  • --gene (required): Gene name (e.g., BRCA1, TP53, CFTR)
  • --email (required): Valid email address (required by NCBI Entrez API)
  • --mutations (optional): One or more missense mutations in the format A123B (wild-type residue, position, mutant residue), comma-separated without spaces (e.g., R367Q,F409S)
  • --output (optional): Output file path (default: prints to stdout)
  • --help: Print help and exit
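
The parameter list above maps naturally onto Python's argparse module. The following is a minimal sketch of such a parser, not the actual contents of drardt.py; the function name build_parser and the help strings are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the real drardt.py)."""
    parser = argparse.ArgumentParser(
        prog="drardt.py",
        description="DRARDT analysis (illustrative sketch)",
    )
    parser.add_argument("--gene", required=True, help="Gene name, e.g. CFTR")
    parser.add_argument("--email", required=True,
                        help="Email address passed to the NCBI Entrez API")
    parser.add_argument("--mutations", default=None,
                        help="Comma-separated mutations, e.g. R367Q,F409S")
    parser.add_argument("--output", default=None,
                        help="Output file path (prints to stdout if omitted)")
    return parser  # argparse adds --help automatically


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--gene", "AP4M1", "--email", "your@email.com", "--mutations", "R367Q,F409S"]
)
mutations = args.mutations.split(",") if args.mutations else []
print(args.gene, mutations, args.output)
```

Leaving --mutations and --output as None by default keeps the "no mutations" and "print to stdout" cases easy to detect.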

🔢 DRARDT Scoring

The tool evaluates 5 parameters to calculate a final DRARDT score:

Individual Parameters

  1. 📚 Publications (2000-present)

    • Counts PubMed articles mentioning the gene
    • Score: 1-4 points
  2. 🔗 Protein Interactors (STRING-DB)

    • Number of experimentally validated protein-protein interactions
    • Score: 1-4 points
  3. 🔀 KEGG Pathways

    • Number of metabolic/signaling pathways the gene is involved in
    • Score: 1-3 points
  4. ⚛️ PDB Structures

    • Number of available experimental 3D structures
    • Score: 1-4 points
  5. AlphaFold2 Prediction

    • Availability of AI-predicted structure
    • Score: 1-2 points

⭐ Final DRARDT Score

| Score | Label     | Interpretation                      |
|-------|-----------|-------------------------------------|
| 0     | Very Low  | Minimal drug repurposing potential  |
| 1     | Low       | Limited drug repurposing potential  |
| 2     | High      | Good drug repurposing potential     |
| 3     | Very High | Excellent drug repurposing potential |

🧬 Mutation Analysis

When analyzing mutations with a PDB structure, the tool provides:

PDB Coverage

Checks if the mutation position is present in the provided experimental structure.

Relative Solvent Accessibility (RSA)

Calculated using FreeSASA:

  • RSA > 20%: Residue is solvent-exposed
  • RSA ≤ 20%: Residue is buried in the protein core
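
The 20% cut-off can be expressed as a one-line classifier. This is a minimal sketch that assumes the RSA value has already been computed as a percentage; the function name classify_rsa is hypothetical and not taken from the DRARDT source.

```python
def classify_rsa(rsa_percent: float, threshold: float = 20.0) -> str:
    """Label a residue from its relative solvent accessibility (in percent).

    Values strictly above the threshold count as exposed; values at or
    below it count as buried, matching the rule stated above.
    """
    return "Solvent-exposed" if rsa_percent > threshold else "Buried"


# RSA values taken from the example output for R367Q and F409S (structure 3L81).
for rsa in (62.47, 2.53):
    print(f"{rsa:.2f}% -> {classify_rsa(rsa)}")
```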

ΔΔG Prediction

Stability change prediction using the SimBa-NI method:

  • ΔΔG > -1.5 kcal/mol: Mutation is stabilizing or neutral (unlikely to cause unfolding)
  • ΔΔG ≤ -1.5 kcal/mol: Mutation is destabilizing (likely to cause protein unfolding)

Formula: ΔΔG = -1.64 + 1.9(RSA/100) + 0.49(Vdiff/100) - 0.12(Hdiff)

Where:

  • Vdiff = Volume difference between mutant and wild-type amino acids
  • Hdiff = Polarity difference between mutant and wild-type amino acids
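
The formula and threshold above can be sketched directly in Python. The function names below are hypothetical, and the Vdiff and Hdiff inputs in the usage lines are made-up numbers chosen only to show the arithmetic; in DRARDT these differences would come from the amino acid properties in simba.tsv, whose units are not stated here.

```python
def simba_ni_ddg(rsa_percent: float, vdiff: float, hdiff: float) -> float:
    """Evaluate ΔΔG = -1.64 + 1.9(RSA/100) + 0.49(Vdiff/100) - 0.12(Hdiff).

    rsa_percent: relative solvent accessibility in percent (0-100)
    vdiff: volume difference, mutant minus wild type
    hdiff: polarity difference, mutant minus wild type
    """
    return -1.64 + 1.9 * (rsa_percent / 100) + 0.49 * (vdiff / 100) - 0.12 * hdiff


def classify_ddg(ddg: float, threshold: float = -1.5) -> str:
    """Apply the documented cut-off: above -1.5 is stable, otherwise destabilizing."""
    return "Stable" if ddg > threshold else "Destabilizing"


# Illustrative inputs only (not values from simba.tsv).
ddg = simba_ni_ddg(rsa_percent=50.0, vdiff=-20.0, hdiff=1.0)
print(f"{ddg:.2f} kcal/mol -> {classify_ddg(ddg)}")
```

Because the RSA term is the only positive contribution that depends on structure, a fully buried residue (RSA near 0) starts at -1.64 kcal/mol, already below the -1.5 threshold before the volume and polarity terms are applied.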

Example Output

======================================================================
DRARDT Analysis for AP4M1
======================================================================

GENE INFORMATION
----------------------------------------------------------------------
UniProt ID: O00189
Protein length: 453 amino acids

Disease associations:
  • Spastic paraplegia 50, autosomal recessive: [description]

DRARDT PARAMETERS
----------------------------------------------------------------------

Publications (2000-present): 55
  Score: 2/4

STRING-DB Interactors: 2
  Interactors: AP4B1, AP4E1
  Score: 2/4

KEGG Pathways: 1
  Score: 2/3

PDB Structures: 2
  • PDB ID: 3L81, Method: X-ray, Resolution: 1.60 A, Chains: A=160-453
  • PDB ID: 4MDR, Method: X-ray, Resolution: 1.85 A, Chains: A=160-453
  Score: 2/4

AlphaFold2 Prediction: https://alphafold.ebi.ac.uk/entry/O00189
  Score: 2/2


DRARDT FINAL SCORE
======================================================================
Score: 1/3
Assessment: Low

Interpretation: This target may have limited potential for drug repurposing.

======================================================================
Mutation Analysis
======================================================================


Fetching PDB structures from RCSB...
  Downloaded PDB structure: 3L81
  Downloaded PDB structure: 4MDR
Found 2 PDB structure(s)

Analyzing PDB structure: 3L81

Analyzing PDB structure: 4MDR

Fetching AlphaFold structure...
  Attempting to download from: https://alphafold.ebi.ac.uk/files/AF-O00189-F1-model_v6.pdb
  Response status code: 200
  Successfully saved to: /tmp/tmpl86yh53d.pdb
AlphaFold structure downloaded

Mutation: R367Q
  PDB Coverage: YES
    • 3L81, Chain A
    • 4MDR, Chain A
  3D Structural Quality: PDB (3L81): Medium | PDB (4MDR): Medium | AlphaFold: High
    • PDB 3L81 B-factor (5Å radius): 32.49 Ų (10 residues, 47 atoms)
    • PDB 4MDR B-factor (5Å radius): 43.20 Ų (7 residues, 44 atoms)
    • AlphaFold pLDDT (5Å radius): 95.83 (6 residues, 46 atoms)
  RSA (Relative Solvent Accessibility):
    • 3L81: 62.47% (Solvent-exposed)
    • 4MDR: 47.94% (Solvent-exposed)
    • AlphaFold: 50.24% (Solvent-exposed)
  ΔΔG (Stability prediction):
    • 3L81: -0.60 kcal/mol → Stable
    • 4MDR: -0.87 kcal/mol → Stable
    • AlphaFold: -0.83 kcal/mol → Stable

Mutation: F409S
  PDB Coverage: YES
    • 3L81, Chain A
    • 4MDR, Chain A
  3D Structural Quality: PDB (3L81): High | PDB (4MDR): High | AlphaFold: High
    • PDB 3L81 B-factor (5Å radius): 23.87 Ų (9 residues, 62 atoms)
    • PDB 4MDR B-factor (5Å radius): 28.93 Ų (8 residues, 56 atoms)
    • AlphaFold pLDDT (5Å radius): 96.55 (7 residues, 55 atoms)
  RSA (Relative Solvent Accessibility):
    • 3L81: 2.53% (Buried)
    • 4MDR: 2.80% (Buried)
    • AlphaFold: 2.28% (Buried)
  ΔΔG (Stability prediction):
    • 3L81: -2.57 kcal/mol → Destabilizing
    • 4MDR: -2.56 kcal/mol → Destabilizing
    • AlphaFold: -2.57 kcal/mol → Destabilizing

🗄️ Data Sources

  • UniProt: Protein information, disease associations, length
  • PubMed/NCBI: Publication counts
  • STRING-DB: Protein-protein interactions
  • KEGG: Metabolic and signaling pathways
  • RCSB PDB: Experimental 3D structures
  • AlphaFold: AI-predicted protein structures
  • FreeSASA: Solvent accessibility calculations
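
The AlphaFold download line in the example output suggests a simple URL pattern keyed on the UniProt ID. The helper below is a sketch inferred from that single log line; the function name is hypothetical, and the fragment label (F1) and model version (v6) are assumptions that may not hold for every entry or database release.

```python
def alphafold_pdb_url(uniprot_id: str, version: int = 6) -> str:
    """Build an AlphaFold PDB file URL following the pattern seen in the example log.

    Assumes a single-fragment entry (F1) and the model version shown in the log.
    """
    return (
        "https://alphafold.ebi.ac.uk/files/"
        f"AF-{uniprot_id}-F1-model_v{version}.pdb"
    )


# UniProt ID for AP4M1, as listed in the example output.
print(alphafold_pdb_url("O00189"))
```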

⚙️ Technical Details

3D Structural Quality Assessment

  • Selection method: Any atom within 5 Å of the mutated residue's Cα atom (Chimera-compatible)
  • B-factor thresholds (Ų): <30 (High), 30 to <60 (Medium), ≥60 (Low)
  • pLDDT thresholds: >70 (High), >50 to 70 (Medium), ≤50 (Low)
  • Averaging: Atom values are first averaged within each residue, then the per-residue averages are averaged
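
The two-stage averaging and the quality thresholds can be sketched as follows. This assumes the atoms within 5 Å have already been selected and grouped by residue; the function names and the residue labels in the usage lines are hypothetical, and the boundary handling (30 and 60 for B-factor, 50 and 70 for pLDDT) follows the inequalities listed above.

```python
from statistics import mean


def neighbourhood_average(values_by_residue: dict[str, list[float]]) -> float:
    """Average per-atom values within each residue, then average those residue means."""
    per_residue = [mean(values) for values in values_by_residue.values() if values]
    return mean(per_residue)


def bfactor_quality(avg_bfactor: float) -> str:
    """Lower B-factor means a better-defined region."""
    if avg_bfactor < 30:
        return "High"
    if avg_bfactor < 60:
        return "Medium"
    return "Low"


def plddt_quality(avg_plddt: float) -> str:
    """Higher pLDDT means a more confident prediction."""
    if avg_plddt > 70:
        return "High"
    if avg_plddt > 50:
        return "Medium"
    return "Low"


# Made-up atom B-factors for two residues near a mutation site.
selection = {"A366": [20.0, 40.0], "A367": [10.0]}
avg = neighbourhood_average(selection)
print(avg, bfactor_quality(avg))
```

Averaging per residue first means a residue with many selected atoms does not dominate the result: the flat mean of the three values above would be about 23.3, whereas the residue-weighted value is 20.0.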

RSA Calculation

  • Uses FreeSASA with Lee-Richards algorithm
  • Normalized to percentage (0-100%)
  • Threshold: >20% = exposed, ≤20% = buried

ΔΔG Prediction

  • SimBa-NI algorithm (Structure-informed Bayesian model)
  • Formula: ΔΔG = -1.64 + 1.9(RSA/100) + 0.49(Vdiff/100) - 0.12(Hdiff)
  • Threshold: > -1.5 = Stable, ≤ -1.5 = Destabilizing

📁 Files

drardt/
├── drardt.py              # Main CLI script
├── drardt_core.py         # Core analysis functions
├── drardt_mutations.py    # Mutation analysis functions
├── simba.tsv              # Amino acid properties database
├── environment.yml        # Conda environment specification
└── README.md             # This file

❌ Invalid mutation format

Mutations must be in format A123B:

  • A = single letter amino acid code (wild-type)
  • 123 = position number
  • B = single letter amino acid code (mutant)

Examples: R367Q, F409S, A123V

Separate multiple mutations with commas and no spaces (e.g., 'R367Q,F409S').
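
A regular expression is one way to check this format before running an analysis. The sketch below is not the validation code used by DRARDT; the names MUTATION_RE and parse_mutations are hypothetical, and restricting the letters to the 20 standard amino acids is an assumption.

```python
import re

# One-letter codes of the 20 standard amino acids (assumed to be the accepted set).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MUTATION_RE = re.compile(rf"([{AMINO_ACIDS}])([1-9]\d*)([{AMINO_ACIDS}])")


def parse_mutations(text: str) -> list[tuple[str, int, str]]:
    """Split a comma-separated string into (wild type, position, mutant) tuples.

    Raises ValueError for any token that is not exactly of the form A123B,
    which also rejects tokens containing spaces.
    """
    parsed = []
    for token in text.split(","):
        match = MUTATION_RE.fullmatch(token)
        if match is None:
            raise ValueError(f"Invalid mutation format: {token!r} (expected e.g. R367Q)")
        wild_type, position, mutant = match.groups()
        parsed.append((wild_type, int(position), mutant))
    return parsed


print(parse_mutations("R367Q,F409S"))
```

Using fullmatch rather than match ensures that stray characters, such as the leading space in "R367Q, F409S", are reported instead of silently ignored.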

📧 Contact

🙏 Acknowledgments

📖 Citation

If you use DRARDT in your research, please cite: Francisco, Serena et al. "Restoring adapter protein complex 4 function with small molecules: an in silico approach to spastic paraplegia 50." Protein Science: a publication of the Protein Society, vol. 34, no. 1 (2025): e70006. doi:10.1002/pro.70006
