Add Hypericum elodes #216
|
Hi @ldemirdj, thanks for sending the EAR of Hypericum elodes. |
|
ok |
|
|
Hi @jesgomez, do you agree to review this assembly? |
|
Yes |
|
Thanks for agreeing! |
|
Hi @ldemirdj, let's do what we can to make this as good as it can be without Hi-C. There are quite a few chromosome-level assemblies from |
|
I’m pasting below an exchange we had with the sample ambassador about the Hypericum genome assembly: Please note that, for example, Hypericum perforatum also shows a high rate of duplicated BUSCOs, which may indicate polyploidization events. Be careful when removing suggested allelic contigs or regions, as they might actually correspond to polyploidization events! Jean-Marc |
|
Hi all, I just remembered that you may find it interesting to try Inspector here, since Hi-C data is not available. Just a comment :) |
|
Thanks @auryjm. I will give my "philosophical" opinion, and then others can let me know what they think. If the species is genuinely tetraploid, we should try to either include the contigs from the 2X set of the chromosomes, or reduce it to a 1X representation of each chromosome. You will probably have a better idea of whether it is auto-/allo-tetraploid and what is feasible from genomescope and smudgeplots, but looking at the spectra it almost looks like it would be better to put sequences back, to account for the missing kmers under what I'm assuming is the homozygous peak. Are you able to pair the sequences you already have, via something like Dgenies, to get an idea of how many "copies" of the genome are currently included? I note the coverage is also on the lower end for such an assembly, but I would like us to push for an assembly that is as good, or as "complete", as possible. Before I forget as well, were you able to identify mito and chloroplast sequences? I love OatK for this. |
|
Thank you @tbrown91, I believe it is important to submit a 1n genome, but we cannot exclude the possibility that in the 1n genome all chromosomes are duplicated. If you then look at the 2n genome, you should observe four copies of each region. For example, Brassica napus contains two genomes, A (B. rapa) and C (B. oleracea). The 1n genome to be submitted is the one composed of 10 A chromosomes and 9 C chromosomes, and obviously the duplicated BUSCO score is high, but this is expected. We will have a look, as another plant genome for BGE is in the same situation, except that for that one we do have Hi-C data. |
|
Ping @tbrown91, |
|
Just a quick update: the sample ambassador was able to generate Omni-C data. Maybe we can take a look at their data and see if it can improve the genome assembly. |
|
Thanks @auryjm, sounds good |
|
Ping @tbrown91, |
Assembly review request