Investigate Whether label_number is Used or Obsolete in ChEBI Dataset Class #47

@aditya0by0

Description

We need to determine whether the label_number property of the ChEBIOverX class and its subclasses (ChEBIOver100, ChEBIOver50) is actually used anywhere in the dataset creation process or the wider pipeline.

  1. Is label_number referenced or used in any part of the codebase, including dataset creation, processing, or any downstream tasks?
    • If used, what purpose does it serve?
  2. If label_number is not used, can it be considered obsolete, and should it be removed to clean up the codebase?
  3. Does retaining label_number serve any potential future purpose, or should it be refactored?
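
One possible way to approach question 1 is a static scan of the source tree for definitions of and references to the name. The sketch below is only an illustration, not part of the repository: the helper names find_name_usages and scan_tree are hypothetical, and the root path passed to scan_tree is an assumption about where the package lives. It uses only the standard-library ast and pathlib modules.

```python
import ast
from pathlib import Path
from typing import Iterator, List, Tuple


def find_name_usages(source: str, name: str) -> Iterator[Tuple[int, str]]:
    """Yield (line, kind) for each definition of or reference to `name`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            yield node.lineno, "definition"
        elif isinstance(node, ast.Attribute) and node.attr == name:
            yield node.lineno, "attribute access"
        elif isinstance(node, ast.Name) and node.id == name:
            yield node.lineno, "name reference"


def scan_tree(root: str, name: str = "label_number") -> List[Tuple[str, int, str]]:
    """Collect (file, line, kind) hits for `name` in every .py file under `root`."""
    results = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            source = path.read_text(encoding="utf-8")
            hits = sorted(find_name_usages(source, name))
        except (SyntaxError, UnicodeDecodeError):
            # Skip files that cannot be read or parsed as Python.
            continue
        results.extend((str(path), line, kind) for line, kind in hits)
    return results
```

If scan_tree reports only "definition" hits, that would suggest the property is never read from Python code. A scan like this cannot see string-based lookups such as getattr(obj, "label_number") or references in configuration files, so a plain text search is still worth running alongside it.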

Relevant Code

class ChEBIOverX(_ChEBIDataExtractor):
    LABEL_INDEX: int = 3
    SMILES_INDEX: int = 2
    READER: dr.ChemDataReader = dr.ChemDataReader
    THRESHOLD: int = None

    @property
    def label_number(self) -> int:
        return 854

    def select_classes(self, g: nx.Graph, split_name: str, *args, **kwargs) -> List:
        # Implementation...
        return nodes


class ChEBIOver100(ChEBIOverX):
    THRESHOLD: int = 100

    def label_number(self) -> int:
        return 854


class ChEBIOver50(ChEBIOverX):
    THRESHOLD: int = 50

    def label_number(self) -> int:
        return 1332
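
A side observation that may matter for question 3: in the code as quoted above, the base class declares label_number with @property while the two subclasses override it without the decorator. The following minimal sketch uses hypothetical stand-in classes (Base, WithoutDecorator, WithDecorator), not the repository's classes, to show what that difference means at the call site in plain Python.

```python
class Base:
    @property
    def label_number(self) -> int:
        return 854


class WithoutDecorator(Base):
    # Overrides the inherited property with a plain method,
    # mirroring the shape of the quoted subclasses.
    def label_number(self) -> int:
        return 1332


class WithDecorator(Base):
    # Keeps the property interface of the base class.
    @property
    def label_number(self) -> int:
        return 1332


base_value = Base().label_number            # an int
shadowed = WithoutDecorator().label_number  # a bound method, not an int
called = WithoutDecorator().label_number()  # an int, but only when called
kept = WithDecorator().label_number         # an int
```

Under this assumption, any caller that reads label_number as an attribute would get a method object from the subclasses rather than a number, so the answer to whether the attribute is used also determines whether this inconsistency has any effect.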

Metadata

Labels

documentation: Improvements or additions to documentation
understanding needed: Need more explanation/knowledge
