Open
111 commits
6ee600e
add README
taliaferrojm Nov 9, 2021
df7207e
add ascii art
taliaferrojm Nov 10, 2021
5674e8d
update ascii again
taliaferrojm Nov 10, 2021
b8b7ab1
one more time for the ascii
taliaferrojm Nov 10, 2021
9e668ad
add look for g_t or g_c independently
taliaferrojm Jan 12, 2022
c904161
Add GLM for bacon
taliaferrojm May 10, 2022
356ecfc
Delete bacon.py
taliaferrojm May 10, 2022
2f432f3
Delete glm.py
taliaferrojm May 10, 2022
f7cbf77
update readme for bacon
taliaferrojm May 10, 2022
449358d
Merge branch 'master' of https://github.com/TaliaferroLab/OINC-seq in…
taliaferrojm May 10, 2022
219d386
change multiconv option to nconv integer
taliaferrojm May 12, 2022
17816df
fix bug in quality score reading
taliaferrojm May 13, 2022
56cb13a
add script for parsing output of bam-readcount
taliaferrojm May 23, 2022
ca058e7
add ability to manually mask positions
taliaferrojm May 24, 2022
09fe216
fix qual score bug and add use read1 and/or read2
taliaferrojm May 25, 2022
27d58ae
fix read1/2 bug in pigpen argparse
taliaferrojm Jun 6, 2022
20e01e1
add alignment and quantification script
taliaferrojm Jul 5, 2022
10e1ed8
typos in alignAndQuant
taliaferrojm Jul 5, 2022
da59501
another stupid typo in alignAndQuant
taliaferrojm Jul 5, 2022
3a4477e
update default snp variant freq
taliaferrojm Jul 7, 2022
aed55d8
write conversions as pickled dictionary
taliaferrojm Jul 7, 2022
71eb0ed
index STAR bam after creating it
taliaferrojm Jul 7, 2022
5c900b5
add assignreads_salmon
taliaferrojm Jul 7, 2022
7f584f3
do not write conversions to pickled dict
taliaferrojm Jul 7, 2022
e4e0d59
remove looking for an existing merged.vcf
taliaferrojm Jul 8, 2022
68493bb
update README
taliaferrojm Jul 8, 2022
cf65516
major reorg including incorporation of salmon
taliaferrojm Jul 8, 2022
40c7672
fix bug where convs per tx were being overwritten
taliaferrojm Jul 15, 2022
abdd808
remove salmonbam after creating it
taliaferrojm Jul 15, 2022
cd4ec28
update python version in README
taliaferrojm Jul 15, 2022
cf14995
add parameters to output and create outputDir
taliaferrojm Jul 15, 2022
c6aa3c0
format convG in output
taliaferrojm Jul 15, 2022
6b122da
deal inelegantly with tx versions in salmon output
taliaferrojm Jul 27, 2022
2cd23ab
add look for convs in defined regions --ROIbed
taliaferrojm Jul 29, 2022
e14a719
fixed tx verison issue
vaethk Jul 29, 2022
8d2bd37
Fixed error when reading in .txt output
vaethk Aug 4, 2022
6a02f76
add min overlap length
taliaferrojm Aug 26, 2022
c31f30f
add min mapping quality
taliaferrojm Aug 26, 2022
632ceb8
add alignandquant2
taliaferrojm Aug 26, 2022
cb6f0f5
add ability to handle comments in pigpen output
taliaferrojm Aug 31, 2022
d16ffda
add minMappingQual to ROIbed mode
taliaferrojm Sep 16, 2022
b846278
change bacon to consider specific conversions
taliaferrojm Jan 4, 2023
a2d2aee
in bacon make deltas specific to metrics
taliaferrojm Jan 6, 2023
be5c607
minor update to bacon for float formatting
taliaferrojm Jan 6, 2023
6d6abe5
update alignandquant scripts
taliaferrojm Jan 26, 2023
5525fdf
dedupUMI
goeringr Feb 6, 2023
841373c
Revert "dedupUMI"
goeringr Feb 6, 2023
783a68e
dedupUMI
goeringr Feb 6, 2023
e7264fe
more UMI tools updates
goeringr Feb 7, 2023
b74aa97
added --libType parameter
goeringr Feb 13, 2023
b02cc29
Merge pull request #1 from TaliaferroLab/IncludingUMIs
taliaferrojm Feb 13, 2023
e30dc81
SnakeMakeWorkflow1.0
goeringr Feb 16, 2023
5ebaa59
Merge pull request #2 from TaliaferroLab/SnakemakeWorkflow
taliaferrojm Feb 17, 2023
4f62035
readme update
taliaferrojm Mar 2, 2023
16ca1a0
update alignandquants to handle single-end data
taliaferrojm Mar 2, 2023
0128cdd
small readme update
taliaferrojm Mar 6, 2023
0a8eeec
unique mappers only actually recognized in alignandquant
taliaferrojm Mar 7, 2023
5ad8859
add pigpen support for single end reads
taliaferrojm Mar 8, 2023
58c0a4d
minor updates for UMI extraction and quant
taliaferrojm Apr 13, 2023
9b32deb
add datatype param to use with ROIbed
taliaferrojm Apr 13, 2023
7de6de1
optionally allow multimappers with alignAndQuant
taliaferrojm Jul 27, 2023
618ced1
update readme image
taliaferrojm Jul 27, 2023
7718087
change required tx overlap length in assignreads.py
taliaferrojm Nov 1, 2023
bc7a2ae
add maxmap parameter to alignAndQuant.py
taliaferrojm Nov 1, 2023
c23432d
removed necessity of dedup UMI in pigpen.py
taliaferrojm Dec 6, 2023
8d0026d
alignandquant unfiltered bam is no longer kept
taliaferrojm Dec 6, 2023
77af14d
alignUMIquant allow filtering by num of alignments
taliaferrojm Dec 6, 2023
a54a403
fix typo in bacon_glm
taliaferrojm Feb 23, 2024
1cf6b8e
add ability to quantify deletions in query
taliaferrojm Feb 26, 2024
9fad66f
update bacon for use with G deletions
taliaferrojm Mar 22, 2024
4c1e571
fix typo in pigpen.py
taliaferrojm Mar 22, 2024
35e50f7
add try/except for deletions at end of read
taliaferrojm Mar 22, 2024
4c8790b
add mismatch code for MPRA data
taliaferrojm Feb 3, 2025
aa8e42c
add maxmap to alignandquant
taliaferrojm Feb 15, 2025
ee75f48
add argument for source of GFF
taliaferrojm Feb 15, 2025
46b8c6d
small update to pigpen.py
taliaferrojm Feb 15, 2025
1f1b014
update readme and add setup.py
taliaferrojm Feb 17, 2025
96a0162
update setup.py with package excluding
taliaferrojm Feb 17, 2025
ec9d46a
change from distutils to setuptools
taliaferrojm Feb 17, 2025
16a38a0
add license
taliaferrojm Feb 17, 2025
38c5eb4
add shebang to pigpen.py
taliaferrojm Feb 17, 2025
baf037a
reorganize into src directory
taliaferrojm Feb 18, 2025
ec86a88
change setup.py
taliaferrojm Feb 18, 2025
6f9596d
update setup.py again
taliaferrojm Feb 18, 2025
5756282
add package_dir to setup
taliaferrojm Feb 18, 2025
6a5c4f6
another setup.py
taliaferrojm Feb 18, 2025
f6afc3b
add init.py
taliaferrojm Feb 18, 2025
f37ecce
update setup.py
taliaferrojm Feb 19, 2025
2d2d48b
change main script name
taliaferrojm Feb 19, 2025
6507f0f
change setup.py
taliaferrojm Feb 19, 2025
6264c9e
update module names in import
taliaferrojm Feb 19, 2025
7d148e0
update import statements on getmismatches
taliaferrojm Feb 19, 2025
7b6ef0d
now conda installable (I think)
taliaferrojm Feb 19, 2025
6528cf9
update readme
taliaferrojm Feb 28, 2025
ef89db8
stupid space in readme
taliaferrojm Feb 28, 2025
d40159c
fix assignreads import error
taliaferrojm Mar 1, 2025
60cb91e
fix iteratereads_singleend import, single end still not supported
taliaferrojm Mar 1, 2025
1f503d0
add bacon_glm to setup.py
taliaferrojm Mar 5, 2025
d6eb564
reorganize alignandquant for accessibility from installed package
taliaferrojm Mar 21, 2025
a61b3d6
fix single end capability in getmismatches
taliaferrojm Mar 21, 2025
25a7100
change setup.py to make alignAndQuant accessible after installation
taliaferrojm Mar 21, 2025
65ce352
add yaml for manual conda environment creation
taliaferrojm Mar 21, 2025
a5a13da
change location of pigpen_env.yaml
taliaferrojm Mar 21, 2025
d5be642
add shebang to alignandquant
taliaferrojm Mar 21, 2025
f3a3d96
update README
taliaferrojm Mar 21, 2025
9c7f318
update setup.py to 0.0.7
taliaferrojm Mar 31, 2025
46f4c47
fix bug in iterratereads_singleend call with 1 proc
taliaferrojm Apr 7, 2025
ffd8f1d
add consideration of 'ncRNA_gene' to assignreads_salmon_ensembl
taliaferrojm Apr 10, 2025
c808def
add cutadapt script
taliaferrojm Apr 28, 2025
d9959b9
change entry_points in setup.py
taliaferrojm Apr 28, 2025
4b5457d
fix bug in call to iterratereads_singleend with 1 proc
taliaferrojm May 15, 2025
14 changes: 14 additions & 0 deletions .gitignore
@@ -0,0 +1,14 @@

.DS_Store
.spyproject/config/codestyle.ini
.spyproject/config/defaults/defaults-codestyle-0.2.0.ini
.spyproject/config/defaults/defaults-encoding-0.2.0.ini
.spyproject/config/defaults/defaults-vcs-0.2.0.ini
.spyproject/config/defaults/defaults-workspace-0.2.0.ini
.spyproject/config/workspace.ini
.spyproject/config/vcs.ini
.spyproject/config/encoding.ini
.spyproject/config/backups/codestyle.ini.bak
.spyproject/config/backups/encoding.ini.bak
.spyproject/config/backups/vcs.ini.bak
.spyproject/config/backups/workspace.ini.bak
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2025 Taliaferro lab

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
212 changes: 210 additions & 2 deletions README.md
@@ -1,2 +1,210 @@
# OINC-seq
Detecting oxidative marks on RNA through high-throughput sequencing
# OINC-seq <br/> <br/>Detecting oxidative marks on RNA using high-throughput sequencing

,-,-----,
PIGPEN **** \ \ ),)`-'
<`--'> \ \`
/. . `-----,
OINC! > ('') , @~
`-._, ___ /
-|-|-|-|-|-|-|-| (( / (( / -|-|-|
|-|-|-|-|-|-|-|- ''' ''' -|-|-|-
-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|

Pipeline for Identification
Of Guanosine Positions
Erroneously Notated

## Overview

[OINC-seq](https://www.biorxiv.org/content/10.1101/2024.11.12.623278v1.abstract) (Oxidation-Induced Nucleotide Conversion sequencing) is a sequencing technology that allows the detection of oxidative marks on RNA molecules. Because guanosine has the lowest redox potential of any of the ribonucleosides, it is the one most likely to be affected by oxidation. When this occurs, guanosine is turned into 8-oxoguanosine (8-OG) or further oxidized products. When reverse transcriptase encounters these products, it makes predictable errors in the resulting cDNA (see [here](https://pmc.ncbi.nlm.nih.gov/articles/PMC5623583/)). OINC-seq employs spatially restricted singlet oxygen radicals to oxidize RNAs at specific subcellular locations. The level of RNA oxidation detected for each RNA species is therefore a readout of the amount of that RNA species at that subcellular location.

To detect and quantify these conversions, we have created software called **PIGPEN** (Pipeline for Identification of Guanosine Positions Erroneously Notated).

PIGPEN starts with RNAseq fastq files. These files are aligned to the genome using [STAR](https://github.com/alexdobin/STAR). Single and paired-end reads are supported, although paired-end reads are preferred (for reasons that will become clear later). To minimize the contribution of positions that appear as mutations due to non-ideal alignments, PIGPEN only considers uniquely aligned reads (mapping quality == 255). For now, it is required that paired-end reads be stranded, and that read 1 correspond to the sense strand. This is true for most, but not all, modern RNAseq library preparation protocols.

Uniquely aligned reads are then extracted and used to quantify transcript abundances using [salmon](https://combine-lab.github.io/salmon/). Posterior probabilities of transcript assignments are then derived using [postmaster](https://github.com/COMBINE-lab/postmaster). `STAR`, `salmon`, and `postmaster` must be in the user's `$PATH`. All three of these preparatory steps can be easily and automatically done using `alignAndQuant.py`.

Once `STAR` and `postmaster` have produced alignment files and `salmon` has produced transcript quantifications, `pigpen.py` uses these files to identify nucleotide conversions, assign them to transcripts and genes, and quantify the number of conversions in each gene. A graphical overview of the flow of `PIGPEN` is shown below.

![alt text](https://images.squarespace-cdn.com/content/v1/591d9c8cbebafbf01b1e28f9/77d2062a-a31e-41b9-90ad-5963f618c6a6/updatedPIGPENscheme.png?format=1000w "PIGPEN overview")

## Requirements

PIGPEN has the following prerequisites:

- python >= 3.8
- samtools >= 1.15
- varscan >= 2.4.4
- bcftools >= 1.15
- pysam >= 0.19
- numpy >= 1.21
- pybedtools >= 0.9.0
- pandas >= 1.3.5
- bamtools >= 2.5.2
- salmon >= 1.9.0
- STAR >= 2.7.10
- gffutils >= 0.11.0
- umi_tools >= 1.1.0 (if UMI collapsing is desired)
- [postmaster](https://github.com/COMBINE-lab/postmaster)
>Note: postmaster is a [rust](https://www.rust-lang.org/) package. Installing it requires rust (which itself is installable using [conda](https://anaconda.org/conda-forge/rust)). Once rust is installed, use `cargo install --git https://github.com/COMBINE-lab/postmaster` to install postmaster.

BACON has the following prerequisites:

- python >= 3.6
- statsmodels >= 0.13.2
- numpy >= 1.21
- rpy2 >= 3.4.5
- R >= 4.1

## Installation

### Option 1: conda

PIGPEN can be installed from [bioconda](https://bioconda.github.io/) with `conda install -c bioconda pigpen`. After installation with `conda`, PIGPEN is accessible by calling `pigpen`, e.g. `pigpen -h`. Currently, `postmaster` must be installed separately afterward using `cargo install --git https://github.com/COMBINE-lab/postmaster`. If PIGPEN was installed via `conda`, make sure to install postmaster in the same environment. #TODO: make bacon and alignAndQuant easily accessible after installation.

### Option 2: manual installation

Alternatively, you can download PIGPEN directly from this repository. PIGPEN is python-based, but requires a number of extra modules as well as some R and Rust libraries. These are most easily installed with [conda](https://docs.conda.io/projects/conda/en/stable/index.html). The necessary software is listed in `pigpen_env.yaml`. This configuration file can be provided to conda and has all the information needed to set up a PIGPEN-ready environment.

`conda env create -f pigpen_env.yaml`

This will create an environment called `pigpen_env` that contains all the necessary modules. To activate the environment, type

`source activate pigpen_env`

Uncompress the repository and move into the uncompressed directory. Install PIGPEN using

`python setup.py install`

Then to make sure you are ready to go, ask for the help options in the PIGPEN, BACON, and alignAndQuant scripts using

`pigpen -h`

`bacon -h`

`alignAndQuant -h`

If there are errors, one or more of the modules likely did not install properly. In that case, using an alternative package manager like pip may help. If you see no errors, you are good to go.

## Preparing alignment files

`pigpen` expects a particular directory structure for organization of `STAR`, `salmon`, and `postmaster` outputs. This is represented below.

```
workingdir
└───sample1
│ │
│ └───STAR
│ │ │ sample1Aligned.sortedByCoord.out.bam
│ │ │ sample1Aligned.sortedByCoord.out.bam.bai
│ │ │ ...
│ │
│ └───salmon
│ │ │ sample1.quant.sf
│ │ │ sample1.salmon.bam
│ │ │ ...
│ │
│ └───postmaster
│ │ │ sample1.postmaster.bam
│ │ │ sample1.postmaster.bam.bai
│ │ │ ...
└───sample2
│ │
│ └───STAR
│ │ │ sample2Aligned.sortedByCoord.out.bam
│ │ │ sample2Aligned.sortedByCoord.out.bam.bai
│ │ │ ...
│ │
│ └───salmon
│ │ │ sample2.quant.sf
│ │ │ sample2.salmon.bam
│ │ │ ...
│ │
│ └───postmaster
│ │ │ sample2.postmaster.bam
│ │ │ sample2.postmaster.bam.bai
│ │ │ ...
...
```

This structure can be automatically achieved by running `alignAndQuant` in `workingdir` once for each sample. Following this, the samples are ready to be analyzed with `pigpen`.

For example:

`alignAndQuant --forwardreads reads.r1.fq.gz --reversereads reads.r2.fq.gz --nthreads 32 --STARindex <STARindex> --salmonindex <salmonindex> --samplename sample1`

`STARindex` and `salmonindex` should be created according to the instructions for creating them found [here](https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf) and [here](https://salmon.readthedocs.io/en/latest/).
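
Before running `pigpen`, it can be useful to confirm that each sample directory matches the layout shown above. The following is a minimal sketch, not part of PIGPEN: the helper names `expected_files` and `missing_files` are hypothetical, and the file list is simply copied from the directory tree in this section.

```python
from pathlib import Path


def expected_files(workingdir, sample):
    """Return the per-sample paths listed in the directory tree above."""
    base = Path(workingdir) / sample
    return [
        base / "STAR" / f"{sample}Aligned.sortedByCoord.out.bam",
        base / "STAR" / f"{sample}Aligned.sortedByCoord.out.bam.bai",
        base / "salmon" / f"{sample}.quant.sf",
        base / "salmon" / f"{sample}.salmon.bam",
        base / "postmaster" / f"{sample}.postmaster.bam",
        base / "postmaster" / f"{sample}.postmaster.bam.bai",
    ]


def missing_files(workingdir, samples):
    """List every expected file that does not exist for the given samples."""
    return [
        path
        for sample in samples
        for path in expected_files(workingdir, sample)
        if not path.exists()
    ]


if __name__ == "__main__":
    # Sample names follow the example tree; adjust to your own samples.
    for path in missing_files(".", ["sample1", "sample2"]):
        print(f"missing: {path}")
```

Running this from `workingdir` prints one line per absent file and nothing if the layout is complete.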

## Running PIGPEN

Samples are then ready for analysis with `pigpen.py`. From `workingdir`, a comma-separated list of samples is supplied to `--samplenames`. In the example above, this would be `--samplenames sample1,sample2`. Optionally, a list of control samples is provided to `--controlsamples`. These should correspond to samples in which nucleotide conversions were not intentionally induced. They serve as controls for SNP identification (see below). They may be a subset of the samples provided to `--samplenames`.

## SNPs

8-OG-induced conversions are rare, and this rarity makes it imperative that contributions from conversions that are not due to oxidation are minimized. A major source of apparent conversions is SNPs. It is therefore advantageous to find and mask SNPs in the data.

PIGPEN does this by using [varscan](http://varscan.sourceforge.net/using-varscan.html) to find SNP positions. These locations are then excluded from all subsequent analyses. The PIGPEN parameters `--SNPcoverage` and `--SNPfreq` set the read depth and variant frequency that varscan requires to call a SNP. We recommend being aggressive with these parameters. We often set them to 20 and 0.2, respectively.

PIGPEN performs this SNP calling on control samples (`--controlsamples`) in which the intended oxidation did not occur. PIGPEN will use the union of all SNPs found in these files for masking. Whether or not to call SNPs at all (you probably should) is controlled by `--useSNPs`.
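
The thresholding and union logic described above can be illustrated with a small sketch. This is not how PIGPEN or varscan is implemented: the pileup dictionaries, the function names `call_snps` and `union_snps`, and the use of inclusive (`>=`) comparisons are all assumptions made for illustration. The defaults mirror the values of 20 and 0.2 mentioned above.

```python
def call_snps(pileup, min_coverage=20, min_freq=0.2):
    """Return sites that pass both the depth and variant-frequency thresholds.

    pileup maps (chrom, pos) -> (depth, variant_read_count); this input
    format is hypothetical.
    """
    snps = set()
    for site, (depth, variant_reads) in pileup.items():
        if depth > 0 and depth >= min_coverage and variant_reads / depth >= min_freq:
            snps.add(site)
    return snps


def union_snps(control_pileups, min_coverage=20, min_freq=0.2):
    """Union of SNP sites called in every control sample."""
    masked = set()
    for pileup in control_pileups:
        masked |= call_snps(pileup, min_coverage, min_freq)
    return masked


if __name__ == "__main__":
    control1 = {("chr1", 100): (40, 12), ("chr1", 200): (10, 9), ("chr2", 5): (50, 2)}
    control2 = {("chr2", 5): (30, 9)}
    # chr1:200 fails the depth threshold; chr2:5 passes only in control2.
    print(sorted(union_snps([control1, control2])))
```

Because the union is taken across control samples, a site needs to pass the thresholds in only one control to be masked everywhere.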


## GFFtype

PIGPEN uses genome annotations to relate transcripts and genes. There are peculiarities to these annotations based on where they came from. Generally, PIGPEN prefers annotation files from either [GENCODE](https://www.gencodegenes.org) or [Ensembl](https://www.ensembl.org/index.html). Tell PIGPEN where your annotation came from using the `--gfftype` flag.

## Quantifying conversions

PIGPEN then identifies conversions in reads. This can be done using multiple processors (`--nproc`). In order to minimize the effect of sequencing error, PIGPEN only considers positions for which the sequencing quality was at least 30. There are two important flags to consider here.

First, `--onlyConsiderOverlap` requires that the same conversion be observed in both reads of a mate pair. Positions interrogated by only one read are not considered. This can improve accuracy. True oxidation-induced conversions are rare. Rare enough that sequencing errors can cause a problem. Requiring that a conversion be present in both reads minimizes the effect of sequencing errors. If the fragment sizes for a library are especially large relative to the read length, the number of positions interrogated by both mates will be small.
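
As a rough illustration of the overlap requirement (not PIGPEN's actual code), the sketch below keeps only conversions that both mates report identically at the same reference position. The input format and the function name `overlap_conversions` are hypothetical, and the sketch assumes the bases from both mates have already been expressed relative to the same reference strand.

```python
def overlap_conversions(read1_calls, read2_calls):
    """Return conversions observed identically in both mates of a pair.

    Each argument maps reference position -> (ref_base, read_base) for one
    mate. Positions covered by only one mate are ignored.
    """
    confirmed = {}
    for pos, (ref, base) in read1_calls.items():
        if ref != base and read2_calls.get(pos) == (ref, base):
            confirmed[pos] = (ref, base)
    return confirmed


if __name__ == "__main__":
    read1 = {10: ("G", "T"), 11: ("A", "A"), 12: ("G", "C")}
    read2 = {10: ("G", "T"), 12: ("G", "G"), 13: ("G", "T")}
    # Position 12 disagrees between mates and position 13 is covered by one
    # mate only, so only position 10 survives.
    print(overlap_conversions(read1, read2))
```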

Second, `nConv` sets the minimum number of G -> C / G -> T conversions in a read pair in order for those conversions to be recorded. The rationale here is again to reduce the contribution of background, non-oxidation-related conversions. Background conversions should be distributed relatively randomly across reads. However, due to the spatial nature of the oxidation reaction, oxidation-induced conversions should be more clustered into specific reads. Therefore, requiring at least two conversions can increase specificity. In practice, this works well if the data is very deep or concentrated on a small number of targets. When dealing with transcriptome-scale data, this flag often reduces the number of observed conversions to an unacceptably low level.
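
The per-read-pair threshold can be sketched as a simple filter. This is an illustration under stated assumptions rather than PIGPEN's implementation: the input format and the name `filter_by_nconv` are hypothetical, and the sketch returns only the G -> T and G -> C conversions of read pairs that meet the threshold, without saying anything about how PIGPEN treats other conversions in those reads.

```python
RELEVANT = {("G", "T"), ("G", "C")}


def filter_by_nconv(read_pairs, n_conv=2):
    """Keep relevant conversions only from read pairs with at least n_conv of them.

    read_pairs maps a read-pair ID -> list of (ref_base, read_base) tuples.
    """
    kept = {}
    for read_id, conversions in read_pairs.items():
        relevant = [c for c in conversions if c in RELEVANT]
        if len(relevant) >= n_conv:
            kept[read_id] = relevant
    return kept


if __name__ == "__main__":
    pairs = {
        "pairA": [("G", "T"), ("G", "C"), ("A", "G")],  # two relevant conversions
        "pairB": [("G", "T")],                          # only one, dropped at n_conv=2
    }
    print(filter_by_nconv(pairs, n_conv=2))
```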

## Assigning reads to genes

After PIGPEN calculates the number of converted and nonconverted nucleotides in each read pair, it intersects that data with the probabilistic transcript assignment for each read performed by `salmon` and `postmaster`. Conversions within read pair X are assigned proportionally to transcript Y according to the `salmon`/`postmaster`-calculated probability that read pair X originated from transcript Y. This transcript-level data is then collapsed to gene-level data according to the transcript/gene relationships found in `--gff`. Transcript IDs in `--gff` should match those in the fasta file used to make `--salmonindex`. The use of [GENCODE](https://www.gencodegenes.org) annotations is recommended if possible. Alternatively, Ensembl annotations can be used. The source of the annotations should be supplied using the `--gfftype` flag.
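
The proportional assignment can be sketched as follows. This is a minimal illustration, not PIGPEN's code: the dictionaries stand in for data that PIGPEN derives from its own conversion counting, the `postmaster` output, and the annotation, and the function name `assign_to_genes` is hypothetical.

```python
from collections import defaultdict


def assign_to_genes(read_convs, read_tx_probs, tx_to_gene):
    """Distribute each read pair's conversions over genes by transcript probability.

    read_convs:    read-pair ID -> number of conversions in that pair
    read_tx_probs: read-pair ID -> {transcript ID: assignment probability}
    tx_to_gene:    transcript ID -> gene ID
    """
    gene_counts = defaultdict(float)
    for read_id, n_conv in read_convs.items():
        for tx, prob in read_tx_probs.get(read_id, {}).items():
            gene_counts[tx_to_gene[tx]] += n_conv * prob
    return dict(gene_counts)


if __name__ == "__main__":
    read_convs = {"r1": 2, "r2": 1}
    read_tx_probs = {"r1": {"txA": 0.75, "txB": 0.25}, "r2": {"txB": 1.0}}
    tx_to_gene = {"txA": "gene1", "txB": "gene2"}
    # r1 contributes 1.5 conversions to gene1 and 0.5 to gene2;
    # r2 contributes 1.0 to gene2.
    print(assign_to_genes(read_convs, read_tx_probs, tx_to_gene))
```

Gene-level counts are fractional under this scheme, because a read pair that is ambiguous between transcripts splits its conversions among them.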

## Calculating the number of conversions per gene

We have observed that the overall rate of conversions (not just G -> T + G -> C, but all conversions) can vary significantly from sample to sample, presumably due to a technical effect in library preparation. For this reason, PIGPEN calculates **PORC** (Proportion of Relevant Conversions) values. This is the log2 ratio of the relevant conversion rate ([G -> T + G -> C] / total number of reference G encountered) to the overall conversion rate (total number of all conversions / total number of positions interrogated). PORC therefore normalizes to the overall rate of conversions, removing this technical effect.
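
The PORC definition above translates directly into a short calculation. The sketch below follows that definition; the function name `porc` and the choice to return NaN when a rate is zero or undefined are assumptions, since this section does not say how PIGPEN handles those cases.

```python
import math


def porc(g_t, g_c, ref_g, all_conversions, all_positions):
    """log2 of (relevant conversion rate / overall conversion rate).

    Returns NaN when either rate is zero or undefined (an assumption of
    this sketch, not documented PIGPEN behavior).
    """
    if ref_g == 0 or all_positions == 0:
        return math.nan
    relevant_rate = (g_t + g_c) / ref_g
    overall_rate = all_conversions / all_positions
    if relevant_rate == 0 or overall_rate == 0:
        return math.nan
    return math.log2(relevant_rate / overall_rate)


if __name__ == "__main__":
    # 8 relevant conversions over 1000 reference Gs (rate 0.008) versus
    # 40 conversions over 10000 positions (rate 0.004): PORC is about 1.
    print(porc(g_t=6, g_c=2, ref_g=1000, all_conversions=40, all_positions=10000))
```

A PORC of about 1 therefore means that relevant conversions occur at roughly twice the overall conversion rate.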

PIGPEN can use G -> T conversions, G -> C conversions, G deletions, or any combination when calculating PORC values. This behavior is controlled by supplying some or all of the options `--use_g_t`, `--use_g_c`, and `--use_g_x`, respectively.

## Using one read of a paired end sample

The use of one read in a paired-end sample for conversion quantification can be controlled using `--use_read1` and `--use_read2`. To use both reads, supply both flags. `--onlyConsiderOverlap` requires the use of both reads. Importantly, both reads can still be used for genomic alignment and transcript quantification.

## Mask specific positions

To prevent specific genomic locations from being considered during conversion quantification, supply a bed file of these locations to `--maskbed`.

## Output

Output files are named `<samplename>.pigpen.txt`. These files contain the number of observed conversions for each gene as well as derived values like conversion rates and PORC values.

## Statistical framework for comparing gene-level PORC values across conditions

We could simply compare PORC values across conditions, but with that approach we lose information about the number of counts (conversions) that went into the PORC calculation.

For each gene, PIGPEN calculates the number of relevant conversions (G -> T + G -> C) as well as all other conversions encountered. Each gene therefore ends up with a 2x2 contingency table of the following form:

| | converted | not converted |
| --- | --- | --- |
| G -> C or G -> T | a | b |
| other conversion | c | d |

We then want to compare groups (replicates) of contingency tables across conditions. BACON (Bioinformatic Analysis of the Conversion of Nucleotides) performs this comparison using a binomial linear mixed-effects model. Replicates are modeled as random effects.

`full model = conversions ~ nucleotide + condition + nucleotide:condition + (1 | replicate)`

`null model = conversions ~ nucleotide + condition + (1 | replicate)`

The two models are then compared using a likelihood ratio test.
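
For readers unfamiliar with the likelihood ratio test, the sketch below shows the final comparison step only. It is not BACON's implementation, and it does not fit the mixed-effects models. It assumes the two fitted log-likelihoods are already available and that the full model has exactly one more parameter than the null model (the `nucleotide:condition` interaction with two nucleotide classes and two conditions), so the statistic is compared against a chi-squared distribution with 1 degree of freedom.

```python
import math


def likelihood_ratio_test(loglik_null, loglik_full):
    """Return (statistic, p-value) for nested models differing by one parameter.

    For 1 degree of freedom, the chi-squared survival function equals
    erfc(sqrt(x / 2)), so no external statistics library is needed.
    """
    statistic = max(2.0 * (loglik_full - loglik_null), 0.0)
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical log-likelihoods for one gene.
    stat, p = likelihood_ratio_test(loglik_null=-105.2, loglik_full=-101.7)
    print(f"LR statistic = {stat:.3f}, p = {p:.4g}")
```

A small p-value indicates that adding the interaction term improves the fit, i.e. that the relevant conversion rate changes with condition beyond what the overall conversion rate explains.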

As input, BACON takes a tab-delimited, headered file of the following form with one row per sample:

| file | sample | condition |
| -----|--------|-----------|
| /path/to/pigpen/output | sample name | condition ID|
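
A file of this form can be written with Python's `csv` module. The following is a minimal sketch: the paths, sample names, condition labels, and output filename are placeholders, and the function names `format_bacon_table` and `write_bacon_table` are hypothetical. Only the three column names come from the table above.

```python
import csv
import io


def format_bacon_table(rows):
    """Return tab-delimited text with the header shown in the table above.

    rows is an iterable of (pigpen_output_path, sample_name, condition_id).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["file", "sample", "condition"])
    writer.writerows(rows)
    return buffer.getvalue()


def write_bacon_table(rows, path):
    """Write the sample table to path."""
    with open(path, "w", newline="") as handle:
        handle.write(format_bacon_table(rows))


if __name__ == "__main__":
    # Placeholder paths and labels; replace with your own PIGPEN outputs.
    rows = [
        ("/path/to/sample1.pigpen.txt", "sample1", "control"),
        ("/path/to/sample2.pigpen.txt", "sample2", "treated"),
    ]
    print(format_bacon_table(rows), end="")
```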