Improve input validation to detect typos & synonymous genes #51

@bhavyaac

(Based on discussions from issue #1, which has been closed.)

To detect typos, Selin suggested using a fuzzy string matching tool to provide the user with messages such as the following:

  • "Cxb6" was not found in the gene annotation. Did you mean "Cbx6"? (typo)
  • "CBx6" was not found in the gene annotation. Did you mean "Cbx6"? (capitalization issue)
  • " Cbx6 " was not found in the gene annotation. Did you mean "Cbx6"? (whitespace issue)
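
A minimal sketch of how the three messages above could be produced, written in Python with the standard-library `difflib` module purely for illustration. The function name `suggest_gene`, the example gene list, and the `cutoff` value are assumptions and not part of the app; the same idea should carry over to whatever fuzzy matching tool the app ends up using.

```python
import difflib

def suggest_gene(query, known_genes, cutoff=0.6):
    """Return (gene, message).

    gene is the query itself when it matches the annotation exactly,
    otherwise None. message is None on an exact match, otherwise a
    "not found" text, with a suggestion when one is close enough.
    """
    if query in known_genes:
        return query, None

    # Whitespace and capitalization issues: compare on a trimmed,
    # lower-cased form, but report the name as spelled in the annotation.
    cleaned = query.strip().lower()
    by_lower = {gene.lower(): gene for gene in known_genes}

    if cleaned in by_lower:
        suggestion = by_lower[cleaned]
    else:
        # Typos: take the single closest name above the similarity cutoff.
        close = difflib.get_close_matches(cleaned, list(by_lower), n=1, cutoff=cutoff)
        suggestion = by_lower[close[0]] if close else None

    message = f'"{query}" was not found in the gene annotation.'
    if suggestion is not None:
        message += f' Did you mean "{suggestion}"?'
    return None, message


# Hypothetical annotation used only for this example.
genes = ["Cbx6", "Cbx7", "Sox2"]
for user_input in ["Cxb6", "CBx6", " Cbx6 "]:
    print(suggest_gene(user_input, genes)[1])
```

Handling whitespace and capitalization before the fuzzy step keeps those two cases deterministic, so the similarity cutoff only has to be tuned for genuine typos.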

Example of an app that implements this: https://apps.kaessmannlab.org/evodevoapp/

For synonymous gene names, I suppose the app should suggest the gene name that is used in the dataset. However, I'm not sure whether the app should treat the synonym as equivalent to the corresponding gene in the dataset and display the plots anyway, or block the plots and force the user to change their input (e.g. with a validate()).
Regarding the synonyms themselves, I'm not sure where to obtain this data. For now, I will check whether I can quickly obtain the synonym lists stored in this tool that I came across, for the genes present in the dataset.
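
To make the two options for synonyms concrete, here is a rough Python sketch in which a single flag switches between "accept the synonym and plot anyway" and "block and ask the user to change their input". Everything in it is hypothetical: the `resolve_gene` function, the `Resolution` type, the `accept_synonyms` flag, and the synonym table, whose entry `OldName1` is a placeholder and not a real gene synonym. Where the synonym data would come from is still open, as noted above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

@dataclass
class Resolution:
    gene: Optional[str]     # dataset gene to plot, or None to block the plots
    message: Optional[str]  # text to show the user, if any

def resolve_gene(query: str, dataset_genes: Set[str],
                 synonyms: Dict[str, str],
                 accept_synonyms: bool = True) -> Resolution:
    name = query.strip()
    if name in dataset_genes:
        return Resolution(name, None)

    # Look up the name used in the dataset for this synonym, if any.
    canonical = synonyms.get(name)
    if canonical is not None and canonical in dataset_genes:
        if accept_synonyms:
            # Option 1: treat the synonym as equivalent and plot anyway.
            return Resolution(
                canonical,
                f'"{name}" is listed in the dataset as "{canonical}"; showing "{canonical}".')
        # Option 2: block the plots and ask the user to change the input.
        return Resolution(
            None,
            f'"{name}" is listed in the dataset as "{canonical}". Please enter "{canonical}" instead.')

    return Resolution(None, f'"{name}" was not found in the gene annotation.')


# Placeholder data, for illustration only.
dataset = {"Cbx6", "Sox2"}
table = {"OldName1": "Cbx6"}
print(resolve_gene("OldName1", dataset, table, accept_synonyms=True))
print(resolve_gene("OldName1", dataset, table, accept_synonyms=False))
```

In both modes the user is told which name the dataset uses, so the only difference between the options is whether the plots are drawn for the resolved gene.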
