Releases: streetslab/dimelo-toolkit

v0.2.1

25 Nov 22:58
c02ce86

v0.2.1 contains several small updates so that the release reflects dimelo-toolkit as it stood when the dimelo-toolkit manuscript was submitted: https://www.biorxiv.org/content/10.1101/2025.11.09.687458v1.full. The main changes since v0.2.0 are as follows:

environment.yml: change environment name to dimelo-toolkit

load_processed.read_vectors_from_hdf5: updates to support new single read browser plots that better show data

  • Added span_full_window option. If True, only reads that start before region_start and end after region_end are returned, i.e. they must span the whole region and so be at least as long as it. If False, the old behavior is kept: all reads that end after region_start and start before region_end are returned.
  • Added read_length field for sorting
  • Added a choice of ascending or descending order for each of the sequential sorting operations
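The two filtering modes can be illustrated with a minimal sketch. This is not the toolkit's implementation: the function name select_reads, the tuple layout, and the inclusive boundary handling in the spanning case are all assumptions made for illustration. Only the span_full_window, region_start, region_end, and read_length names come from the notes above.

```python
def select_reads(reads, region_start, region_end, span_full_window=False):
    """Filter reads against a region.

    reads: iterable of (name, read_start, read_end) tuples (hypothetical layout).
    """
    selected = []
    for name, read_start, read_end in reads:
        if span_full_window:
            # Read must cover the whole region (boundary handling is an assumption).
            keep = read_start <= region_start and read_end >= region_end
        else:
            # Old behavior: any overlap with the region is enough.
            keep = read_end > region_start and read_start < region_end
        if keep:
            selected.append((name, read_start, read_end))
    return selected


reads = [("a", 50, 250), ("b", 120, 180), ("c", 90, 150), ("d", 300, 400)]

spanning = select_reads(reads, 100, 200, span_full_window=True)
overlapping = select_reads(reads, 100, 200, span_full_window=False)

# Sorting by read length, descending; reverse=False would give ascending order.
by_length_desc = sorted(overlapping, key=lambda r: r[2] - r[1], reverse=True)
```

Here only read "a" spans the full 100-200 window, while "a", "b", and "c" all overlap it.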

parse_bam::read_by_base_txt_to_hdf5: updates to support single reads with long gaps, e.g. RNA with splicing

  • When finding reads in the .txt file, the read end position is now re-calculated each time a new base is encountered for that read, instead of being calculated once when the read is first encountered (based on current position in genome + current position in read + read length). Because the .txt file is ordered ascending along the genome, the last base encountered for a read is the one closest to the end of that read. With megalodon bam files, for example, this always gives results identical to the old read end calculation method.

The new method still misses any read gaps that show up after the last potential modification site: this information will only be available with the modkit upgrade described here: nanoporetech/modkit#270
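The difference between the two read end calculations can be sketched as follows. This is a simplified illustration, not the code in parse_bam: the row layout (read_name, ref_position, read_position, read_length), the helper names, and the forward-strand end estimate ref_position + (read_length - read_position) are all assumptions.

```python
def estimate_end(ref_position, read_position, read_length):
    # Assumed estimate: bases remaining in the read, added to the current genome position.
    return ref_position + (read_length - read_position)


def read_ends_first_row(rows):
    """Old approach: estimate the end once, from the first row seen for each read."""
    ends = {}
    for name, ref_position, read_position, read_length in rows:
        if name not in ends:
            ends[name] = estimate_end(ref_position, read_position, read_length)
    return ends


def read_ends_last_row(rows):
    """New approach: re-estimate on every row, so the last row for each read wins."""
    ends = {}
    for name, ref_position, read_position, read_length in rows:
        ends[name] = estimate_end(ref_position, read_position, read_length)
    return ends


# Rows ordered ascending along the genome. "spliced" starts at 1000, is 100 bases
# long, and has a 5000-base gap before its second site; "ungapped" starts at 2000.
rows = [
    ("spliced", 1010, 10, 100),
    ("ungapped", 2005, 5, 80),
    ("ungapped", 2070, 70, 80),
    ("spliced", 6060, 60, 100),
]

old_ends = read_ends_first_row(rows)
new_ends = read_ends_last_row(rows)
```

In this toy input the ungapped read gets the same end from both methods, while the spliced read's end moves from 1100 to 6100 once the gap is accounted for. A gap after the last row for a read would still go unnoticed, as noted above.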

plot_enrichment_profile::plot_enrichment_profile: update to support profile plots with absolute rather than relative coordinates

  • plot_enrichment_profile methods can take a relative flag: True means the x-axis is centered on the region centers, False means the x-axis shows absolute genome positions
  • make_enrichment_profile_plot can take offset_center, which gives a position offset to apply to the plot x-axis (e.g., when plotting absolute genome positions)
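As a rough sketch of how a relative flag and a center offset could relate, the x-axis can be built around zero and then shifted. The function profile_x_axis, its region_center argument, and the assumption that a profile covers window_size positions on each side of the center are hypothetical and not taken from the toolkit.

```python
def profile_x_axis(window_size, relative=True, region_center=0):
    """Return x-axis positions for a profile of 2 * window_size points (assumed length)."""
    # Relative: no shift, so the axis is centered on 0.
    # Absolute: shift every position by the region center.
    offset_center = 0 if relative else region_center
    return [offset_center + i for i in range(-window_size, window_size)]


relative_x = profile_x_axis(3)
absolute_x = profile_x_axis(3, relative=False, region_center=1000)
```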

plot_read_browser::plot_read_browser: update to allow new plot customizations and to work a bit better with gapped reads, e.g. RNA

  • markers/lines/color palette settings
  • pass down the span_full_window parameter to the loader
  • read_extent_df now keeps the longest version of each read name. Until the new modkit version is available, each read has a separate entry for each mod type, and the lengths of those entries, calculated by the method above, are not guaranteed to match between mods
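The idea of keeping the longest version of each read name can be sketched in plain Python. The toolkit works on a dataframe (read_extent_df); the tuple layout and the function keep_longest_per_read here are hypothetical stand-ins used only to show the selection logic.

```python
def keep_longest_per_read(entries):
    """Keep the longest (start, end) extent seen for each read name.

    entries: iterable of (read_name, read_start, read_end) tuples (hypothetical layout).
    """
    longest = {}
    for read_name, read_start, read_end in entries:
        current = longest.get(read_name)
        if current is None or (read_end - read_start) > (current[1] - current[0]):
            longest[read_name] = (read_start, read_end)
    return longest


entries = [
    ("read_1", 100, 900),   # entry derived from one mod type
    ("read_1", 100, 1200),  # same read, another mod type, longer extent
    ("read_2", 500, 700),
]
extents = keep_longest_per_read(entries)
```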

utils::DEFAULT_COLORS: add new ones

utils::ParsedMotif: fix mod code handling for mod codes more than one character long. Given a string, the set() function returns a set of the characters in the string, whereas in this case we want a set containing the whole string, e.g. "17802" for pseudouridine
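The underlying Python behavior is easy to reproduce in isolation. The variable names below are illustrative only and do not come from ParsedMotif.

```python
mod_code = "17802"  # multi-character mod code, as in the pseudouridine example

# set() iterates over its argument, so a string is split into characters.
as_characters = set(mod_code)

# A set literal (or set([mod_code])) keeps the string whole.
as_single_code = {mod_code}

# Single-character codes hide the problem because both forms coincide.
single_char_same = set("a") == {"a"}
```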

v0.2.0

13 Jun 19:04
be929fa

v0.2.0 is a major overhaul compared to v0.1.0. It supports the same core pileup and single read extraction operations as the original dimelo v0.1.0 package, but focuses on a number of new objectives:

  • Support multicolor data / any base modification context (GpC, CpG, etc.)
  • Vector extraction for all data types
  • Enhanced speed and reliability, enabling e.g. whole genome processing
  • Maintainability -> using a small number of standard dependencies, outsourcing as much as possible to well-maintained third-party packages (e.g. modkit, pysam, h5py, and a few others)
  • Modularity in both architecture and operation
  • Ease of use, especially for multiplatform installation
  • More powerful plotting, e.g. bam files from different basecallers, single read sorting, rapid iteration

v0.1.0

16 Sep 23:00
7ff4632

First release