
Outputs

RenzoTale88 edited this page Aug 12, 2021 · 4 revisions

nf-LO saves the outputs of the analyses in the folder specified with --outdir PREFIX. Inside this folder, you will find the subfolders described below. The most important is PREFIX/chainnet, which contains the chain/net files that you can use to lift over positions.

Folder structure

The workflow generates an output folder with the following structure:

[CWD]
└───${params.outdir}/
    └───genome2bit/
    |   |   source.2bit
    |   |   source.sizes
    |   |   target.2bit
    |   |   target.sizes
    |
    └───singlechains/
    |   |   src0.tgt0.chain
    |   |   src0.tgt1.chain
    |   |   src1.tgt1.chain
    |   |   ...
    |   |   srcN.tgtM.chain
    |
    └───rawchain/
    |   |   rawchain.chain
    |
    └───chainnet/
    |   |   liftover.chain
    |   |   netfile.net
    |
    └───maf/
    |   |   ${params.chain_name}.maf
    |   |   ${params.chain_name}.fixed.maf
    |
    └───lifted/
    |   |   ${params.chain_name}.bed
    |   |   ${params.chain_name}.unmapped.bed
    |
    └───stats/
    |   |   mafCoverage.out
    |   |   mafStats.out
    |   |   mafIdentity.out
    |   |   features.txt
    |
    └───reports/
        |   chainMetrics.html
        |   execution_timeline.html
        |   execution_report.html
        |   execution_trace.txt
        |   pipeline_dag.svg

Here, params.chain_name and params.outdir are set by the user at runtime with the options --chain_name and --outdir, respectively. The folders contain the following data:

  1. genome2bit: the input genomes converted to 2bit format, each with its respective sizes file.
  2. singlechains: single chain files generated from the raw alignments; if the alignment is performed in chunks, this folder contains one chain file per alignment.
  3. rawchain: one chain file, generated by combining the single chain files in the singlechains folder.
  4. chainnet: the refined chain/net files, which can be used for liftover.
  5. maf: the MAF (multiple alignment format) file generated from the chain file; optional.
  6. lifted: the lifted and unlifted positions (present only when a file to lift is provided, see here for details on the accepted inputs).
  7. reports: the HTML reports for the workflow (CPU times, task-specific times, etc.), as well as the metrics for the chain (if the --maf flag is specified and mafTools is installed) and for the lifted annotation (if an annotation is provided).
  8. stats: the metrics for the chain file (if the --maf flag is specified and mafTools is installed) and for the lifted annotation (if an annotation is provided).
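As a quick sanity check after a run, the layout above can be verified with a few lines of Python. This is only a minimal sketch: the function names are hypothetical, and the split between folders that are always expected and folders that are optional is an assumption based on the descriptions above, not something nf-LO itself defines.

```python
from pathlib import Path

# Assumption: these subfolders are produced by every run (based on the list above).
ALWAYS_EXPECTED = ["genome2bit", "singlechains", "rawchain", "chainnet", "reports"]
# Assumption: these appear only when --maf and/or a file to lift are provided.
OPTIONAL = ["maf", "lifted", "stats"]


def summarize_outdir(outdir):
    """Map each documented subfolder name to True if it exists under outdir."""
    root = Path(outdir)
    return {name: (root / name).is_dir() for name in ALWAYS_EXPECTED + OPTIONAL}


def missing_required(outdir):
    """List the always-expected subfolders that are absent from outdir."""
    status = summarize_outdir(outdir)
    return [name for name in ALWAYS_EXPECTED if not status[name]]
```

For example, `missing_required("PREFIX")` returns an empty list when all the always-expected subfolders are in place, where PREFIX is the value passed to --outdir.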

Main output files

The workflow generates a large number of files, several of which are retained so that the user can inspect them freely. The primary outputs of the workflow are the following:

  1. ${params.outdir}/chainnet/liftover.chain: the final chain file, which can be used as input for liftover, crossmap and other tools to perform the actual liftover.
  2. ${params.outdir}/chainnet/netfile.net: the net file associated with the chain file generated to perform the liftover.
  3. ${params.outdir}/stats/mafCoverage.out: number of bases in the two genomes covered by the chain file, calculated using mafTools (when available).
  4. ${params.outdir}/stats/mafIdentity.out: number of identical bases in the two genomes covered by the chain file, calculated using mafTools (when available).
  5. ${params.outdir}/stats/mafStats.out: generic metrics of the chain file, calculated using mafTools (when available).
  6. ${params.outdir}/stats/features.txt: number of features lifted (when a feature set is provided).
  7. ${params.outdir}/lifted/${params.chain_name}.*: the set of features lifted using either crossmap or liftover.
