Hi!
The Tengeler2020 dataset contains some families and genera in the rowData with numeric suffixes (for example, Ruminococcus_1). They look like they were added to make feature names unique, but they probably should not be part of the taxonomic ranks. I checked the original biom file and the suffixes are present there too, so this does not appear to be an import issue.
Example:
library(mia)
data("Tengeler2020", package = "mia")
tse <- Tengeler2020
apply(rowData(tse), 2L, function(col) col[grepl("_\\d$", col)])
# $Kingdom
# named character(0)
#
# $Phylum
# named character(0)
#
# $Class
# named character(0)
#
# $Order
# named character(0)
#
# $Family
# Clostridium_sensu_stricto_1
# "Clostridiaceae_1"
#
# $Genus
# Ruminococcus_1 Coprococcus_2 Ruminococcus_2
# "Ruminococcus_1" "Coprococcus_2" "Ruminococcus_2"
# Ruminiclostridium_5 Clostridium_sensu_stricto_1 Ruminiclostridium_9
# "Ruminiclostridium_5" "Clostridium_sensu_stricto_1" "Ruminiclostridium_9"
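One possible workaround, sketched here only as an illustration, is to strip the trailing suffix with a regular expression. The helper name strip_numeric_suffix is hypothetical (it is not part of mia), and the input vector is copied from the Genus output above so the snippet runs without the dataset. Whether the suffixes should really be dropped is an assumption; note that Ruminococcus_1 and Ruminococcus_2 collapse to the same name afterwards.

```r
# Hypothetical helper (not part of mia): remove a trailing "_<digits>"
# suffix from taxon names. NA values are passed through unchanged by sub().
strip_numeric_suffix <- function(x) {
  sub("_\\d+$", "", x)
}

# Genus names copied from the output above.
genera <- c("Ruminococcus_1", "Coprococcus_2", "Ruminococcus_2",
            "Ruminiclostridium_5", "Clostridium_sensu_stricto_1",
            "Ruminiclostridium_9")

cleaned <- strip_numeric_suffix(genera)
print(cleaned)

# Applied to the dataset, this could look like the following
# (assumes `tse` was created as in the example above):
# rowData(tse)$Family <- strip_numeric_suffix(rowData(tse)$Family)
# rowData(tse)$Genus  <- strip_numeric_suffix(rowData(tse)$Genus)
```

The pattern uses \\d+ rather than \\d so that multi-digit suffixes would also be removed, which is slightly broader than the check in the example above.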