A curated list of software packages, databases, methods, and resources for transposable element research.
Transposable elements (TEs) are repetitive DNA sequences that make up a large share of most eukaryotic genomes (e.g., >45% in human, >85% in maize). They shape genome evolution and gene regulation, and they contribute to genetic diversity and disease.
This will begin as a human-centered list.
- Background
- TE Detection and Annotation
- TE Activity Analysis
- Population and Comparative Studies
- Experimental Methods
- Specialized Computational Approaches
- Visualization
- Quality Control and Validation
- Community
- Contributing
- License
- Acknowledgments
Essential tutorials and educational materials for understanding transposable elements.
Databases containing TE sequences, annotations, and classifications.
- DFAM: Open database of repetitive DNA families, organized around multiple sequence alignments of TE families. See also the DFAM demo by Jessica Storer.
Organizing and naming transposable elements across different communities.
Tools for quickly estimating TE content without comprehensive annotation - useful for initial genome assessment.
Software for identifying novel TEs without relying on existing TE libraries, e.g. for new species or for finding species-specific TEs.
Complete workflows that combine multiple tools to annotate TEs genome-wide, from initial detection to final classification.
Software for categorizing identified TEs into families and superfamilies based on structural features and sequence similarity.
Tools for analyzing TE activity in RNA sequencing data.
Methods for quantifying TE expression from RNA-seq experiments, handling multi-mapping reads and ambiguous assignments.
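One simple way to handle multi-mapping reads is fractional counting, where a read that aligns to several TE families contributes an equal share to each. The sketch below illustrates the idea on toy data only; the function name `fractional_te_counts`, the input layout, and the read assignments are assumptions made for illustration, and it is not the algorithm of any particular tool in this list.

```python
from collections import defaultdict


def fractional_te_counts(read_hits):
    """Split each read equally across the distinct TE families it aligns to.

    read_hits: dict mapping a read ID to the list of TE family names hit by
    that read's alignments (hypothetical input layout, for illustration).
    """
    counts = defaultdict(float)
    for families in read_hits.values():
        unique = set(families)  # several alignments to one family count once
        if not unique:
            continue
        weight = 1.0 / len(unique)
        for family in unique:
            counts[family] += weight
    return dict(counts)


# Toy data: read IDs and assignments are invented for this example.
read_hits = {
    "read1": ["L1HS"],                             # uniquely mapped
    "read2": ["L1HS", "L1PA2"],                    # ambiguous between 2 families
    "read3": ["AluY", "AluY"],                     # 2 alignments, same family
    "read4": ["AluY", "AluSx", "L1PA2", "L1HS"],   # ambiguous between 4 families
}
counts = fractional_te_counts(read_hits)
for family, value in sorted(counts.items()):
    print(family, value)
```

Fractional counting keeps the total equal to the number of assigned reads, but it ignores how likely each placement is; approaches that reassign reads iteratively may distribute ambiguous reads differently.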
Specialized approaches for detecting TE expression at single-cell resolution, revealing cell-type-specific TE activity.
Tools for studying TE regulation through epigenetic modifications.
Software for analyzing TE methylation patterns from bisulfite sequencing data (WGBS, RRBS).
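As a rough illustration of what such an analysis computes, the sketch below derives a read-weighted methylation level for each TE copy from per-CpG counts (methylated reads divided by total reads over all CpGs inside the element). The function name, the tuple layouts, and every coordinate and count are hypothetical toy values, not the output format of any specific bisulfite pipeline.

```python
def te_methylation(te_intervals, cpg_calls):
    """Return the read-weighted methylation level for each TE interval.

    te_intervals: list of (name, chrom, start, end), 0-based half-open.
    cpg_calls: list of (chrom, pos, methylated_reads, total_reads).
    Both layouts are assumptions made for this example.
    """
    levels = {}
    for name, chrom, start, end in te_intervals:
        methylated = total = 0
        for cpg_chrom, pos, m_reads, t_reads in cpg_calls:
            if cpg_chrom == chrom and start <= pos < end:
                methylated += m_reads
                total += t_reads
        # None marks elements with no covered CpG instead of guessing a value.
        levels[name] = methylated / total if total else None
    return levels


# Toy data, invented for illustration.
te_intervals = [
    ("L1_copy", "chr1", 100, 200),
    ("Alu_copy", "chr1", 500, 800),
    ("uncovered_copy", "chr2", 0, 50),
]
cpg_calls = [
    ("chr1", 110, 9, 10),
    ("chr1", 150, 8, 10),
    ("chr1", 600, 1, 10),
    ("chr1", 700, 3, 10),
    ("chr1", 900, 5, 10),  # outside every TE interval
]
levels = te_methylation(te_intervals, cpg_calls)
print(levels)
```

The nested loop is fine for a toy example; real annotations with millions of CpGs would need an interval index or sorted sweep.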
Methods for assessing TE chromatin states using ATAC-seq, DNase-seq, and similar assays.
Tools for analyzing histone marks at TE loci from ChIP-seq data.
Software for detecting TE-derived peptides and proteins in mass spectrometry data.
Tools that integrate multiple data types to provide comprehensive views of TE activity and regulation.
Tools for discovering TE insertions that vary between individuals, populations, or strains.
Software for comparing TE landscapes across multiple genomes, studying TE evolution and dynamics.
Experimental techniques designed to specifically capture and sequence transposable elements or their insertion sites.
Laboratory methods for enriching TE-containing DNA or RNA fragments, including immunoprecipitation and capture approaches.
Specialized tools that leverage PacBio or Oxford Nanopore long reads to resolve complex TE structures and insertions.
Software for creating publication-quality figures of TE annotations, distributions, and comparative analyses.
Methods and metrics for evaluating the quality, completeness, and accuracy of TE annotations.
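Two commonly used base-level metrics are sensitivity (the fraction of reference TE bases that a prediction recovers) and precision (the fraction of predicted TE bases that are truly TE). The sketch below computes both for a single sequence; the function names and the intervals are hypothetical, and it assumes 0-based half-open coordinates on one chromosome.

```python
def covered_bases(intervals):
    """Expand (start, end) half-open intervals into a set of positions."""
    bases = set()
    for start, end in intervals:
        bases.update(range(start, end))
    return bases


def base_level_metrics(predicted, reference):
    """Return (sensitivity, precision) at single-base resolution."""
    pred = covered_bases(predicted)
    ref = covered_bases(reference)
    true_positive = len(pred & ref)
    sensitivity = true_positive / len(ref) if ref else 0.0
    precision = true_positive / len(pred) if pred else 0.0
    return sensitivity, precision


# Toy intervals, invented for illustration.
predicted = [(50, 120), (200, 250)]
reference = [(0, 100), (200, 300)]
sensitivity, precision = base_level_metrics(predicted, reference)
print(f"sensitivity={sensitivity:.3f} precision={precision:.3f}")
```

Using sets of positions keeps the example short and handles overlapping intervals automatically, at the cost of memory; for whole genomes, merging sorted intervals would be the more practical choice.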
Software interfaces for expert review and refinement of automated TE predictions.
Benchmark datasets, simulated data, and gold standards for testing and comparing TE detection methods.
- BioStars TE Tag - Q&A for TE analysis
- Transposons Worldwide Slack Channel - Active research community
- Transposable elements labs: a list of labs studying TEs by the Mobile DNA journal
- [International Conference on Transposable Elements](link needed) - e.g. Biennial conference
- [Mobile DNA Conference](link needed) - e.g. Regional meetings
- Bourque et al. 2018 - Ten things you should know about transposable elements
- Wells & Feschotte 2020 - A field guide to eukaryotic transposable elements
- Goubert et al. 2022 - TE detection benchmarking
Please see our contributing guidelines. We welcome additions of:
- Well-maintained software with clear documentation
- Databases actively used by the community
- Experimental protocols
- Educational resources that have proven helpful
- Fork the repository
- Add your tool/resource with appropriate description
- Ensure you follow the format
- Submit a pull request
See here for visual instructions.
This work is licensed under a Creative Commons Zero v1.0 Universal License.
Inspired by other awesome bioinformatics lists. Special thanks to all contributors and the TE research community.
Maintainers: @nixonlab, @hreypar, @mlbendall
