Decrypting Nature's Multi-Trait Encoding System Through AI-Powered Genomic Analysis
DNA Sequence → Cryptanalysis → Multi-Trait Detection
- DNA Sequence: ATCG...
- Cryptanalysis: Frequency Analysis + CUDA GPU + NeuroDNA
- Multi-Trait Detection: Trait 1: Metabolism, Trait 2: Resistance, Trait 3: Virulence, ... and more
Pleiotropy transforms genomic analysis by treating DNA as encrypted messages waiting to be decoded. Our platform combines cryptanalytic algorithms, GPU acceleration, and AI swarm intelligence to unlock the secrets of how single genes influence multiple traits.
- Genome = Ciphertext: DNA sequences as encrypted multi-trait messages
- Genes = Polyalphabetic Ciphers: Each gene encodes multiple trait "messages"
- Codons = Cipher Symbols: 64 codons mapped like substitution ciphers
- Context = Decryption Keys: Environmental factors unlock trait expression
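To make the codon-as-cipher-symbol analogy concrete, here is a minimal Python sketch of frequency analysis over the 64 codons. It is an illustration only, not the project's Rust implementation; the function name `codon_frequencies` and the choice to read only reading frame 0 are assumptions made for this example.

```python
from collections import Counter
from itertools import product
from typing import Dict

# All 64 possible codons over the DNA alphabet, in a fixed order.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_frequencies(sequence: str) -> Dict[str, float]:
    """Return relative frequencies of the 64 codons in reading frame 0.

    Hypothetical helper for illustration; the real analyzer may differ.
    """
    seq = sequence.upper()
    # Split into non-overlapping triplets, dropping a trailing partial codon.
    triplets = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    # Skip triplets containing ambiguous bases such as N.
    counts = Counter(t for t in triplets if t in CODON_SET)
    total = sum(counts.values())
    return {c: (counts[c] / total if total else 0.0) for c in CODONS}


freqs = codon_frequencies("ATGATGAAATTTN")
top = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top)
```

As in classical frequency analysis of a substitution cipher, the resulting distribution is the raw material for any later comparison between genes or regions.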
# Clone the repository
git clone https://github.com/murr2k/pleiotropy.git
cd pleiotropy
# Start the web interface
./run_local_server.sh
# Access at http://localhost:8001/projects/pleiotropy/
# Or build and run with CUDA support
cd rust_impl
cargo build --release --features cuda
cargo run --example ecoli_analysis
Traditional genomic analysis treats genes as simple one-to-one mappings to traits. But nature is far more sophisticated: single genes often influence multiple, seemingly unrelated traits through a complex encoding system. We've discovered that by applying cryptanalytic techniques to genomic sequences, we can decode these hidden multi-trait patterns with unprecedented accuracy.
- High-Performance Rust Core: Parallel processing of genomic sequences
- NeuroDNA Integration: Neural network-inspired trait detection (v0.0.2)
- CUDA GPU Acceleration: 10-50x speedup with GPU kernels
- Composite Number Factorizer: GPU-accelerated semiprime factorization (42-digit limit in 10 minutes)
- Swarm Intelligence Seeker: Multi-agent system for optimal parameter search
- Cryptanalytic Algorithms: Frequency analysis, pattern detection, context-aware decryption
- Statistical Analysis: Chi-squared tests, mutual information, PCA
- Interactive Visualizations: Heatmaps, networks, Sankey diagrams
- E. coli Model System: Validated against known pleiotropic genes
| Achievement | Impact | Status |
|---|---|---|
| Genomic Analysis | 18 genomes • 86.1% QA score • 100% success | Validated |
| CUDA Acceleration | 10-50x speedup • RTX 2070 optimized | Production |
| Cryptographic Demo | 42-digit semiprimes in 10 minutes | Achieved |
| Swarm Intelligence | Multi-agent parameter optimization | Deployed |
| Web Platform | React + FastAPI + Systemd | Live |
pleiotropy/
├── genome_research/          # Research findings and data
│   ├── pleiotropy_overview.md
│   ├── ecoli_pleiotropic_genes.json
│   └── crypto_parallels.md
├── crypto_framework/         # Cryptanalysis algorithm design
│   └── algorithm_design.md
├── rust_impl/                # High-performance Rust implementation
│   ├── src/
│   │   ├── lib.rs            # Main API
│   │   ├── main.rs           # CLI interface
│   │   ├── types.rs          # Data structures
│   │   ├── sequence_parser.rs
│   │   ├── frequency_analyzer.rs
│   │   ├── crypto_engine.rs
│   │   ├── trait_extractor.rs
│   │   ├── neurodna_trait_detector.rs  # NeuroDNA integration
│   │   ├── compute_backend.rs          # Unified CPU/GPU backend
│   │   └── cuda/                       # CUDA acceleration
│   │       ├── mod.rs
│   │       ├── accelerator.rs
│   │       └── kernels/
│   ├── docs/                 # CUDA documentation
│   └── Cargo.toml
├── python_analysis/          # Python visualization and analysis
│   ├── trait_visualizer.py
│   ├── statistical_analyzer.py
│   ├── rust_interface.py
│   ├── analysis_notebook.ipynb
│   └── requirements.txt
├── trial_database/           # Trial tracking system (NEW)
│   ├── database/             # SQLite database and migrations
│   ├── api/                  # FastAPI backend
│   ├── ui/                   # React dashboard
│   └── swarm/                # Swarm agent coordination
├── examples/                 # Example workflows
│   └── ecoli_workflow.sh
└── README.md
- Docker 20.10+ and Docker Compose 1.29+
- Git
- Optional (for local development): Rust 1.70+, Python 3.8+, Node.js 16+
Using Docker Swarm (Production Ready)
# Clone the repository
git clone https://github.com/murr2k/pleiotropy.git
cd pleiotropy
# Start the complete system (CPU only)
./start_system.sh --docker -d
# OR start with GPU/CUDA support (recommended for factorization)
./start_system.sh --gpu -d
# Verify deployment
./start_system.sh --status
Access Points After Deployment:
- Swarm Coordinator API: http://localhost:8080
- Dashboard UI: http://localhost:3000
- Monitoring (Grafana): http://localhost:3001 (admin/admin)
- Metrics (Prometheus): http://localhost:9090
- Redis: localhost:6379
# Clone the repository
git clone https://github.com/murr2k/pleiotropy.git
cd pleiotropy
# Build Rust components
cd rust_impl
cargo build --release
# Install Python dependencies
cd ../python_analysis
pip install -r requirements.txt
# Install trial database dependencies
cd ../trial_database/api
pip install -r requirements.txt
# Install UI dependencies
cd ../ui
npm install
# Start local services
cd ../..
./start_system.sh --local
The system deploys as a microservices architecture:
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│     Web UI      │    │   Coordinator   │    │   Redis Cache   │
│   (React/TS)    │    │  (Python/API)   │    │  (Shared Mem)   │
│   Port: 3000    │◄──►│   Port: 8080    │◄──►│   Port: 6379    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         ▲                      ▲                      ▲
         │                      │                      │
         └──────────────────────┼──────────────────────┘
                                │
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Rust Analyzer  │    │Python Visualizer│    │   Monitoring    │
│      Agent      │    │      Agent      │    │ (Grafana+Prom)  │
│  (Background)   │    │  (Background)   │    │ Port: 3001/9090 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
# Start all services
./start_system.sh --docker -d
# Check service status
./start_system.sh --status
# View logs
./start_system.sh --logs
# Stop all services
./start_system.sh --stop
# Restart a specific service
docker-compose restart coordinator
# Scale agents (if needed)
docker-compose up -d --scale rust_analyzer=2
Automated Health Checks:
- Redis: Ping every 5 seconds
- Coordinator: HTTP health endpoint every 10 seconds
- Agents: Heartbeat via Redis every 30 seconds
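The HTTP part of these checks can also be scripted. The following Python sketch polls the two coordinator URLs that appear in this README; the helper names `http_status` and `check_services`, and the assumption that a healthy endpoint answers with HTTP 200, are illustrative and not taken from the project code.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict

# URLs as listed in this README; adjust if your deployment differs.
ENDPOINTS = {
    "coordinator": "http://localhost:8080/health",
    "agents": "http://localhost:8080/api/agents/status",
}


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, and similar.
        return 0


def check_services(
    endpoints: Dict[str, str],
    fetch: Callable[[str], int] = http_status,
) -> Dict[str, bool]:
    """Map each service name to True if its endpoint returned HTTP 200.

    Assumption: 200 means healthy. `fetch` is injectable for testing.
    """
    return {name: fetch(url) == 200 for name, url in endpoints.items()}
```

Calling `check_services(ENDPOINTS)` after `./start_system.sh --status` gives a quick pass/fail dictionary; passing a stub as `fetch` lets the logic be exercised without a running stack.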
Manual Health Verification:
# Check all containers
docker ps
# Test Redis
docker exec pleiotropy-redis redis-cli ping
# Test Coordinator API
curl http://localhost:8080/health
# Check agent status
curl http://localhost:8080/api/agents/status
Volumes:
- redis_data: Redis persistence
- prometheus_data: Metrics storage
- grafana_data: Dashboard configurations
- ./reports: Analysis outputs (host-mounted)
# Start the web interface
./run_local_server.sh
# Access at http://localhost:8001/projects/pleiotropy/
# Or manually:
python3 start_web_server.py 8001
# Run with sudo in terminal
sudo ./deploy_with_sudo.sh
# Fix any issues
sudo ./fix_deployment.sh
- Local: http://localhost:8001/projects/pleiotropy/
- Production: https://murraykopit.com/projects/pleiotropy/ (after deployment)
- API Health: http://localhost:8080/health
- React frontend built and configured
- FastAPI backend with SQLite database
- Systemd service for API persistence
- Python web server for local testing
- Production scripts ready for Apache/Nginx
Backup Strategy:
# Backup all data
docker run --rm -v pleiotropy_redis_data:/data -v $(pwd):/backup alpine tar czf /backup/redis-backup.tar.gz /data
docker run --rm -v pleiotropy_prometheus_data:/data -v $(pwd):/backup alpine tar czf /backup/prometheus-backup.tar.gz /data
# Restore data
docker run --rm -v pleiotropy_redis_data:/data -v $(pwd):/backup alpine tar xzf /backup/redis-backup.tar.gz -C /
# Analyze a genome file
./rust_impl/target/release/genomic_cryptanalysis \
--input genome.fasta \
--traits known_traits.json \
--output results/ \
--min-traits 2
# Run example E. coli workflow
./examples/ecoli_workflow.sh
from trait_visualizer import TraitVisualizer
from statistical_analyzer import StatisticalAnalyzer
# Load results
viz = TraitVisualizer()
data = viz.load_trait_data("results/analysis_results.json")
# Create visualizations
viz.plot_trait_correlation_heatmap(data)
viz.create_trait_network(data)
Open python_analysis/analysis_notebook.ipynb for an interactive analysis workflow.
The project now includes comprehensive CUDA support for 10-50x performance improvements using the cudarc crate:
# Check CUDA availability
./rust_impl/target/release/genomic_cryptanalysis --cuda-info
# Build with CUDA support
cd rust_impl
cargo build --release --features cuda
# GPU acceleration is automatic!
./target/release/genomic_cryptanalysis analyze genome.fasta traits.json
- Codon Counting: 20-40x speedup with warp-level optimizations
- Frequency Calculation: 15-30x speedup using shared memory
- Pattern Matching: 25-50x speedup with multiple similarity metrics
- Matrix Operations: 10-20x speedup for eigenanalysis
- E. coli genome: ~7s → ~0.3s (23x speedup)
- Automatic fallback: Seamlessly uses CPU if GPU unavailable
- Transparent Integration: No code changes required - GPU acceleration is automatic
- RTX 2070 Optimized: Tuned for 8GB VRAM and 2304 CUDA cores
- Real-time Monitoring: Built-in performance statistics and GPU utilization tracking
- Graceful Degradation: Automatic CPU fallback if GPU operations fail
- NeuroDNA Compatible: Enhanced integration with trait detection system
Complete CUDA documentation is available in the rust_impl/docs/ directory:
- Quick Start - Get running in 5 minutes
- Full Documentation - Comprehensive guide
- API Reference - Complete API documentation
- Examples - 35+ code examples and tutorials
- Benchmarks - Detailed performance analysis
- Troubleshooting - Common issues and solutions
- Codon frequency analysis using neural network-inspired patterns
- Trait-specific pattern matching with configurable thresholds
- Multi-factor confidence scoring system
- Fast performance: ~7 seconds for full E. coli genome
- Global codon usage patterns
- Trait-specific codon bias detection
- Synonymous codon preference analysis
- Sliding window analysis (300bp windows)
- Eigenanalysis for trait pattern detection
- Regulatory motif identification
- Promoter strength assessment
- Enhancer/silencer mapping
- Expression condition inference
- Overlapping region deconvolution
- Confidence scoring based on multiple factors
- Pleiotropic pattern identification
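The sliding-window and codon-bias steps listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Rust analyzer: the 300 bp window comes from this README, while the 150 bp step, the use of a chi-squared statistic against genome-wide codon frequencies as the bias score, and all function names are assumptions chosen for the example.

```python
from collections import Counter
from typing import Dict, Iterator, List, Tuple

WINDOW = 300  # window size in base pairs, as stated in this README
STEP = 150    # assumed step size (a multiple of 3 keeps the reading frame)


def codon_counts(seq: str) -> Counter:
    """Count non-overlapping triplets in reading frame 0."""
    return Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))


def windows(seq: str, size: int = WINDOW, step: int = STEP) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for each full-length window."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


def chi_squared(observed: Counter, expected_freq: Dict[str, float]) -> float:
    """Chi-squared statistic of window counts against expected frequencies."""
    total = sum(observed.values())
    stat = 0.0
    for codon, freq in expected_freq.items():
        expected = freq * total
        if expected > 0:
            stat += (observed[codon] - expected) ** 2 / expected
    return stat


def score_windows(seq: str) -> List[Tuple[int, float]]:
    """Score each window by how far its codon usage departs from the whole sequence."""
    seq = seq.upper()
    global_counts = codon_counts(seq)
    n = sum(global_counts.values())
    global_freq = {codon: k / n for codon, k in global_counts.items()}
    return [(start, chi_squared(codon_counts(w), global_freq))
            for start, w in windows(seq)]


# Toy sequence: the first half uses only ATG, the second half only CCC.
scores = score_windows("ATG" * 100 + "CCC" * 100)
print(scores)
```

Windows with a high score use codons very differently from the sequence as a whole, which is the kind of local bias a trait-specific detector would then examine more closely.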
Using E. coli K-12 as a model:
- Identified key pleiotropic genes (crp, fis, rpoS, hns)
- Detected trait-specific codon usage patterns
- Mapped regulatory contexts to trait expression
- Achieved >70% confidence in trait predictions
- NeuroDNA integration successfully detects pleiotropic patterns in both synthetic and real genomic data
- Synthetic test data: 100% detection rate (3/3 genes)
- Real E. coli genome: Successfully identifies stress response and regulatory traits
- Unit Tests: >80% code coverage across all components
- Integration Tests: Full system workflow validation
- Performance Tests: Benchmarked for 1000+ concurrent trials
- CI/CD Pipeline: GitHub Actions for continuous testing
- Known E. coli pleiotropic genes detected with >70% confidence
- Published trait-gene associations confirmed
- Codon usage patterns match established databases
- Swarm coordination tested with 5+ concurrent agents
# Run all tests with coverage
pytest --cov=python_analysis --cov=trial_database
# Run performance benchmarks
pytest tests/performance --benchmark-only
# Check code quality
cd rust_impl && cargo clippy
python -m black python_analysis/ trial_database/
We welcome contributions! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
If you use this software in your research, please cite:
Genomic Pleiotropy Cryptanalysis
Murray Kopit (2025)
https://github.com/murr2k/pleiotropy
MIT License - See LICENSE file for details
The project includes a comprehensive trial tracking system for managing cryptanalysis experiments:
- SQLite Database: Stores trial proposals, test results, and metadata
- FastAPI Backend: RESTful API for database operations
- React Dashboard: Real-time UI showing swarm progress
- Swarm Coordination: Agent communication and task distribution
- Track proposed trials with parameters and hypotheses
- Store test results with confidence scores and visualizations
- Real-time progress monitoring with WebSocket updates
- Tabular and graphical reporting capabilities
- Agent memory system for knowledge sharing
- Trials Table: Stores experiment configurations and hypotheses
- Results Table: Contains analysis outcomes with confidence scores
- Agents Table: Tracks AI agent status and workload
- Progress Table: Real-time updates on analysis progress
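For orientation, the four tables could be laid out roughly as follows. This is a hypothetical sketch using Python's built-in `sqlite3` module: only the table names and their roles come from the list above, while every column name, type, and constraint is an assumption and may not match the project's actual migrations.

```python
import sqlite3

# Hypothetical schema: column names are illustrative, not the project's own.
SCHEMA = """
CREATE TABLE trials (
    id          INTEGER PRIMARY KEY,
    organism    TEXT NOT NULL,
    parameters  TEXT NOT NULL,   -- JSON-encoded experiment configuration
    hypothesis  TEXT
);
CREATE TABLE results (
    id          INTEGER PRIMARY KEY,
    trial_id    INTEGER NOT NULL REFERENCES trials(id),
    confidence  REAL NOT NULL CHECK (confidence BETWEEN 0 AND 1),
    traits      TEXT NOT NULL    -- JSON-encoded list of detected traits
);
CREATE TABLE agents (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL UNIQUE,
    status       TEXT NOT NULL,
    active_tasks INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE progress (
    id               INTEGER PRIMARY KEY,
    trial_id         INTEGER NOT NULL REFERENCES trials(id),
    percent_complete REAL NOT NULL,
    updated_at       TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def create_database(path: str = ":memory:") -> sqlite3.Connection:
    """Create the sketch schema and return an open connection."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # enforce trial_id references
    conn.executescript(SCHEMA)
    return conn


conn = create_database()
cur = conn.execute(
    "INSERT INTO trials (organism, parameters, hypothesis) VALUES (?, ?, ?)",
    ("E. coli K-12", "{}", "crp shows pleiotropic codon usage"),
)
conn.execute(
    "INSERT INTO results (trial_id, confidence, traits) VALUES (?, ?, ?)",
    (cur.lastrowid, 0.75, '["regulatory", "stress_response"]'),
)
row = conn.execute(
    "SELECT t.organism, r.confidence FROM results r "
    "JOIN trials t ON t.id = r.trial_id"
).fetchone()
print(row)
```

Keeping results in their own table keyed by `trial_id` lets one trial accumulate several result rows over time.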
- RESTful endpoints for all CRUD operations
- WebSocket support for live progress updates
- JWT-based authentication for agents
- Batch operations for efficient data handling
- OpenAPI documentation at /docs
- Real-time dashboard with WebSocket integration
- Interactive charts using Chart.js
- Tabular views with filtering and sorting
- Data export in CSV and JSON formats
- Responsive Material-UI design
- Redis-based agent communication
- Automatic task distribution and failover
- Shared memory system for knowledge transfer
- Performance-based agent selection
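As a rough picture of how failover and performance-based selection might fit together, the sketch below keeps agent state in memory rather than in Redis. The `Agent` class, the `select_agent` function, and the 90-second timeout (three missed 30-second heartbeats) are all assumptions made for illustration; the actual coordinator logic may differ.

```python
import time
from dataclasses import dataclass
from typing import Iterable, Optional

HEARTBEAT_TIMEOUT = 90.0  # assumed: three missed 30-second heartbeats


@dataclass
class Agent:
    """Hypothetical in-memory view of one swarm agent."""
    name: str
    last_heartbeat: float  # Unix timestamp of the most recent heartbeat
    completed: int         # tasks finished successfully
    failed: int            # tasks that ended in error

    def success_rate(self) -> float:
        total = self.completed + self.failed
        return self.completed / total if total else 0.0


def select_agent(agents: Iterable[Agent], now: Optional[float] = None) -> Optional[Agent]:
    """Pick the live agent with the best success rate, or None if none are live."""
    now = time.time() if now is None else now
    # Failover: agents whose heartbeat is too old are skipped entirely.
    alive = [a for a in agents if now - a.last_heartbeat <= HEARTBEAT_TIMEOUT]
    return max(alive, key=lambda a: a.success_rate(), default=None)


pool = [
    Agent("rust_analyzer_1", last_heartbeat=100.0, completed=9, failed=1),
    Agent("rust_analyzer_2", last_heartbeat=0.0, completed=10, failed=0),
    Agent("python_visualizer", last_heartbeat=95.0, completed=5, failed=5),
]
chosen = select_agent(pool, now=120.0)
print(chosen.name if chosen else "no live agents")
```

In this toy pool the agent with the perfect record is skipped because its heartbeat is stale, so the task goes to the best of the remaining live agents.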
Pre-Deployment:
- Docker and Docker Compose installed
- Firewall configured (ports 3000, 8080, 3001, 9090)
- SSL certificates ready (for production)
- Backup strategy in place
Deployment Steps:
# 1. Clone and deploy
git clone https://github.com/murr2k/pleiotropy.git
cd pleiotropy
./start_system.sh --docker -d
# 2. Verify services
./start_system.sh --status
# 3. Run system tests
python trial_database/tests/test_integration.py
# 4. Check monitoring
curl http://localhost:3001 # Grafana
curl http://localhost:9090 # Prometheus
Post-Deployment:
- All services healthy
- Monitoring dashboards accessible
- Test analysis workflow
- Configure log rotation
- Set up alerting (optional)
# Clone the repository
git clone https://github.com/murr2k/pleiotropy.git
cd pleiotropy
# Start with Docker (recommended)
./start_system.sh --docker -d
# Access the services
# - Dashboard: http://localhost:3000
# - Coordinator API: http://localhost:8080
# - API Documentation: http://localhost:8080/docs
# - Monitoring: http://localhost:3001 (admin/admin)
# - Metrics: http://localhost:9090
# Complete development environment
# Database setup
cd trial_database/database
pip install -r requirements.txt
python init_db.py
# API setup
cd ../api
pip install -r requirements.txt
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
# UI setup
cd ../ui
npm install
npm run dev
# Swarm setup
cd ../swarm
pip install -r requirements.txt
python coordinator.py
# Run comprehensive tests
pytest tests/ --cov --cov-report=html
cd ../../rust_impl && cargo test
cd ../trial_database/ui && npm test
cd .. && python -m pytest swarm/tests/
Production Environment Variables:
# Redis Configuration
REDIS_HOST=redis
REDIS_PORT=6379
REDIS_PASSWORD= # Set in production
# API Configuration
API_HOST=0.0.0.0
API_PORT=8080
CORS_ORIGINS=* # Restrict in production
# Database Configuration
DATABASE_URL=sqlite:///./trial_database.db
# Monitoring
PROMETHEUS_ENABLED=true
GRAFANA_ADMIN_PASSWORD=admin # Change in production
# Logging
LOG_LEVEL=INFO
LOG_FORMAT=json
Development Environment:
# Copy environment template
cp .env.example .env
# Edit configuration
vim .env
# Load environment
source .env
Services Won't Start:
# Check Docker status
docker ps -a
docker-compose logs
# Clean restart
./start_system.sh --stop
docker system prune -f
./start_system.sh --docker -d
Redis Connection Errors:
# Check Redis connectivity
docker exec pleiotropy-redis redis-cli ping
# Restart Redis
docker-compose restart redis
Agent Communication Issues:
# Check agent status
curl http://localhost:8080/api/agents/status
# Restart agents
docker-compose restart rust_analyzer python_visualizer
Performance Issues:
# Monitor resource usage
docker stats
# Check logs for bottlenecks
docker-compose logs --tail=100 coordinator
# Scale agents if needed
docker-compose up -d --scale rust_analyzer=2
# View all logs
./start_system.sh --logs
# Filter specific service
docker-compose logs coordinator
# Follow logs in real-time
docker-compose logs -f rust_analyzer
# Export logs for analysis
docker-compose logs > system_logs.txt
Memory Optimization:
# Adjust container memory limits
# Edit docker-compose.yml:
services:
  coordinator:
    mem_limit: 512m
    memswap_limit: 1g
CPU Optimization:
# Set CPU limits
services:
  rust_analyzer:
    cpus: '2.0'
    cpu_shares: 1024
- Machine learning integration for pattern recognition
- Extension to other model organisms (yeast, C. elegans)
- Real-time streaming analysis with Kafka
- GPU acceleration for large genomes (CUDA support)
- Kubernetes deployment for cloud scalability
- Advanced monitoring with custom metrics
- Expanded trial database with ML experiment tracking
- Multi-tenant support for shared environments
For Questions or Collaborations:
- Open an issue on GitHub
- Email: murr2k@gmail.com
System Administration:
- Monitor system health at http://localhost:3001
- Check API status at http://localhost:8080/health
- Review logs with ./start_system.sh --logs
Emergency Procedures:
# System emergency restart
./start_system.sh --stop
docker system prune -f
./start_system.sh --docker -d
# Data recovery
# See backup/restore procedures in Docker Deployment section
Critical: This project contains both REAL experimental data and SIMULATED test data. Always verify data source before use:
- Real Data: data/real_experiments/ - Use for scientific analysis
- Test Data: data/test_data/ - SIMULATED, for regression testing only
- Archived: data/simulated_archive/ - Historical simulated data, do not use
See data/DATA_PROVENANCE.md for complete data documentation.
The project has undergone extensive validation with comprehensive experiments on diverse bacterial genomes:
- E. coli K-12 (commensal)
  - Detected traits: regulatory, stress_response
  - Confidence: 75.0%
  - Analysis time: 7.0s
- Salmonella enterica Typhimurium (pathogen)
  - Detected traits: regulatory, stress_response
  - Confidence: 77.5%
  - Analysis time: 1.0s
- Pseudomonas aeruginosa PAO1 (opportunistic pathogen)
  - Detected traits: regulatory, stress_response, carbon_metabolism, motility, structural
  - Confidence: 75.0%
  - Analysis time: 1.0s
Based on actual genomic analyses of real bacterial genomes:
Verified Results:
- 3 Real Experiments: E. coli K-12, S. enterica, P. aeruginosa
- 100% Success Rate: All real experiments completed successfully
- 75.8% Average Confidence: Consistent detection confidence
- 3.0 Average Traits per Genome: Regulatory and stress response common
- 3.0s Average Analysis Time: Including 7s initial E. coli run
Note: Previous reports included 20 simulated genomes. Statistics above reflect ONLY real experimental data.
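The averages above follow directly from the three per-genome results listed earlier in this section. The short Python check below recomputes them; the record layout and the `average` helper are just a convenient way to hold the numbers, not a format used by the project.

```python
# Per-genome figures copied from the three real experiments in this README.
experiments = [
    {"organism": "E. coli K-12", "traits": 2, "confidence": 75.0, "seconds": 7.0},
    {"organism": "Salmonella enterica Typhimurium", "traits": 2, "confidence": 77.5, "seconds": 1.0},
    {"organism": "Pseudomonas aeruginosa PAO1", "traits": 5, "confidence": 75.0, "seconds": 1.0},
]


def average(key: str) -> float:
    """Arithmetic mean of one field across all experiments, rounded to 0.1."""
    values = [e[key] for e in experiments]
    return round(sum(values) / len(values), 1)


summary = {
    "avg_confidence_pct": average("confidence"),
    "avg_traits": average("traits"),
    "avg_seconds": average("seconds"),
}
print(summary)
```

The recomputed values agree with the reported 75.8% average confidence, 3.0 traits per genome, and 3.0 s analysis time. Note that the time average is dominated by the 7 s E. coli run.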
Significant Findings:
- Stress response and regulatory traits show universal pleiotropy across all bacteria
- Lifestyle complexity correlates with pleiotropic diversity
- Genome size correlates only very weakly with the number of pleiotropic traits (correlation: 0.083)
- CUDA acceleration provides 10-50x performance improvement
Statistical Validation:
- 78.3% of detections had high confidence scores (≥0.7)
- Reproducible detection of universal traits across experiments
- Successfully differentiates bacterial lifestyles based on pleiotropic patterns
Full statistical report and visualizations available in batch_experiment_20_genomes_20250712_181857/
System Overview Dashboard:
- Agent health and workload distribution
- Task completion rates and success metrics
- System resource utilization
- Error rates and alert thresholds
Analysis Dashboard:
- Trial success rates by organism
- Confidence score distributions
- Processing time metrics
- Data quality indicators
Access Grafana:
- Navigate to http://localhost:3001
- Login: admin/admin
- Navigate to "Swarm Dashboard"
Key Metrics Collected:
# Agent metrics
agent_heartbeat_last_seen
agent_task_completion_rate
agent_error_count
# System metrics
redis_connection_count
api_request_duration
trial_processing_time
# Analysis metrics
confidence_score_distribution
trait_detection_accuracy
Daily:
- Check service health status
- Review error logs
- Monitor disk space usage
Weekly:
- Backup database and configurations
- Update system metrics baseline
- Review performance trends
Monthly:
- Update Docker images
- Archive old trial data
- Performance optimization review
# Install alertmanager
docker run -d --name alertmanager \
-p 9093:9093 \
prom/alertmanager
# Configure alerts
vim monitoring/alerts.yml
Sample Alert Rules:
groups:
  - name: pleiotropy
    rules:
      - alert: AgentDown
        expr: agent_heartbeat_last_seen > 300
        labels:
          severity: critical
        annotations:
          summary: "Agent {{ $labels.agent }} is down"
      - alert: HighErrorRate
        expr: rate(agent_error_count[5m]) > 0.1
        labels:
          severity: warning
        annotations:
          summary: "High error rate detected"
We successfully analyzed 18 bacterial genomes from NCBI, 17 of which were verified as authentic, with comprehensive quality assurance validation:
Key Metrics:
- 100% Success Rate: All 18 genomes analyzed successfully
- 94.4% Data Authenticity: 17/18 verified NCBI genomes
- 73.7% Average Confidence: Strong detection reliability
- 100% Reproducibility: Fully reproducible methodology
- 86.1% Overall QA Score: HIGH scientific veracity
Genomes Analyzed:
- Mycobacterium tuberculosis H37Rv (NC_000962.3)
- Helicobacter pylori 26695 (CP003904.1)
- Bacillus subtilis 168 (NZ_CP053102.1)
- Clostridium difficile 630 (NZ_CP010905.2)
- Caulobacter crescentus CB15 (AE005673.1)
- Enterococcus faecalis V583 (AE016830.1)
- Neisseria gonorrhoeae FA1090 (AE004969.1)
- And 11 more diverse bacterial species
Biological Findings:
- Detected 3-21 pleiotropic genes per genome (mean: 4.5)
- Regulatory and stress response traits dominate (53.1% each)
- Carbon metabolism shows expected pleiotropic patterns (18.5%)
- Pathogen-specific signatures successfully identified
Validation Reports:
- Full experiment data: experiments_20_genomes/results_20250713_231039/
- QA evaluation: experiments_20_genomes/qa_evaluation_report.json
- Scientific veracity: experiments_20_genomes/SCIENTIFIC_VERACITY_REPORT.md
This represents a significant validation of the genomic pleiotropy cryptanalysis approach, demonstrating its effectiveness on real-world genomic data across diverse bacterial species.
Built by Murray Kopit