Commit d6569ed

updated busco db and docs
1 parent 3a3a712 commit d6569ed

2 files changed: 9 additions and 7 deletions
README.md

Lines changed: 6 additions & 4 deletions
@@ -19,7 +19,7 @@ The aim of colora is to produce complete, chromosome-scale primary or phased ass
 - [Hifiasm](https://github.com/chhylp123/hifiasm): used to extract contigs from raw PacBio HiFi reads. Hifiasm operates either with HiFi reads exclusively or in conjunction with Oxford Nanopore reads to generate primary assemblies. Colora supports also the hifiasm 'Hi-C mode', used to create phased assemblies with distinct haplotypes.
 - [FCS-GX](https://github.com/ncbi/fcs-gx): this pipeline is employed to eliminate contaminants from genome assemblies. This step is optional.
 - [purge_dups](https://github.com/dfguan/purge_dups): applied to remove haplotypic duplications and overlaps from primary assemblies. This step is optional and must be skipped in case of phased assembly (Hi-C mode).
-- [Arima Genomics Mapping Pipeline](https://github.com/ArimaGenomics/mapping_pipeline): used to map Hi-C reads to contigs. It has been adapted to Snakemake within Colora with minor modifications (`-M` flag added to `bwa mem` commands).
+- [Arima Genomics Mapping Pipeline](https://github.com/ArimaGenomics/mapping_pipeline): used to map Hi-C reads to contigs. It has been adapted to Snakemake within Colora with minor modifications (`-5SPM` flag added to `bwa mem` commands).
 - [YaHS](https://github.com/c-zhou/yahs): used for scaffolding the assemblies.
 
 In addition, Colora executes the following tasks:
@@ -88,9 +88,11 @@ Even in this case, the following is just an example:
 cd colora/resources
 mkdir busco_db
 cd busco_db
-wget https://busco-data.ezlab.org/v5/data/lineages/saccharomycetes_odb10.2024-01-08.tar.gz
+wget https://busco-data.ezlab.org/v5/data/lineages/saccharomycetes_odb12.2025-07-01.tar.gz
 ```
 
+Double check that you are using the latest version of you lineage of interest as they are often updated.
+
 - **ncbi FCS-gx database (optional)**
 
 You can avoid this if you are not planning to automatically remove contaminants from your assembly.
@@ -152,8 +154,8 @@ cd ..
 ```
 mkdir busco_db
 cd busco_db
-wget https://busco-data.ezlab.org/v5/data/lineages/saccharomycetes_odb10.2024-01-08.tar.gz
-tar -xzf saccharomycetes_odb10.2024-01-08.tar.gz
+wget https://busco-data.ezlab.org/v5/data/lineages/saccharomycetaceae_odb12.2025-07-01.tar.gz
+tar -xzf saccharomycetaceae_odb12.2025-07-01.tar.gz
 cd ..
 ```
 

config/config_test.yaml

Lines changed: 3 additions & 3 deletions
@@ -90,6 +90,6 @@ quast:
 
 # Customisable parameters for busco
 busco:
-  lineage: "test_data/busco_db/saccharomycetes_odb10" # lineage to be used for busco analysis
-  optional_params:
-    "--metaeuk": True
+  lineage: "test_data/busco_db/saccharomycetaceae_odb12" # lineage to be used for busco analysis
+  optional_params: {}
+    # "--metaeuk": True
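
The README hunks download a dated tarball (`saccharomycetaceae_odb12.2025-07-01.tar.gz`), while the `lineage` key in `config/config_test.yaml` points to an undated directory (`saccharomycetaceae_odb12`). The sketch below shows one way to derive the directory name from the tarball name so the two can be kept in sync. It assumes that BUSCO lineage tarballs are named `<lineage>_odb<N>.<YYYY-MM-DD>.tar.gz` and unpack into `<lineage>_odb<N>`. The function `lineage_dir_from_tarball` is a hypothetical helper and is not part of colora.

```shell
#!/bin/sh
# Hypothetical helper (not part of colora): derive the directory name that a
# BUSCO lineage tarball is assumed to unpack into, by stripping the
# ".tar.gz" extension and the ".YYYY-MM-DD" date stamp from the file name.
lineage_dir_from_tarball() {
    _name=$(basename "$1")      # drop any leading path
    _name=${_name%.tar.gz}      # drop the archive extension
    _name=${_name%.????-??-??}  # drop the date stamp
    printf '%s\n' "$_name"
}

lineage_dir_from_tarball "saccharomycetaceae_odb12.2025-07-01.tar.gz"
# prints: saccharomycetaceae_odb12
```

The printed name can then be compared with the last path component of `lineage` in the config before running the test workflow.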
