Improve efficiency of __getitem__ #101

@MarcAntoineSchmidtQC

Currently, our approach for some of the __getitem__ methods is inefficient. For example, column subsetting for CategoricalMatrix converts the full matrix to a csc_matrix.

Here's a list to update with potential improvements:

  • DenseMatrix: nothing to do. Already optimized with np.ndarray
  • SparseMatrix: nothing to do. Already optimized with sps.csc_matrix
  • CategoricalMatrix:
    • row: nothing to do, trivial
    • column: create a SparseMatrix with only the subset of columns/rows selected
  • SplitMatrix:
    • Test thoroughly all the potential ways to index
  • StandardizedMatrix:
    • Not sure whether column subsetting works when only one row is selected
  • Write docstrings for expected behavior
  • Write tests covering all expected behavior
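
One possible way to approach the CategoricalMatrix column item is sketched below. It is not the library's implementation: the function name `categorical_column_subset` and the inputs `codes` (one integer category code per row) and `n_categories` are hypothetical, and only NumPy and SciPy are assumed. The idea is to assemble the CSC arrays for the requested columns directly, so the full one-hot matrix is never materialized.

```python
import numpy as np
import scipy.sparse as sps


def categorical_column_subset(codes, n_categories, cols):
    """Build the one-hot columns `cols` of a categorical vector as a CSC matrix.

    Hypothetical helper: `codes[i]` is the integer category of row i.
    """
    codes = np.asarray(codes, dtype=np.intp)
    cols = np.asarray(cols, dtype=np.intp)
    n_rows = codes.shape[0]

    # Group row indices by category. A stable sort keeps the row indices
    # ascending within each category, which is what CSC expects.
    order = np.argsort(codes, kind="stable")
    counts = np.bincount(codes, minlength=n_categories)
    starts = np.concatenate(([0], np.cumsum(counts)))

    # Gather row indices for the requested columns only.
    pieces = [order[starts[c]:starts[c + 1]] for c in cols]
    indices = np.concatenate(pieces) if pieces else np.empty(0, dtype=np.intp)

    # Column pointers for the subset, in the order the columns were requested.
    indptr = np.concatenate(([0], np.cumsum(counts[cols])))
    data = np.ones(indices.shape[0])

    return sps.csc_matrix((data, indices, indptr), shape=(n_rows, len(cols)))
```

The sort and the per-category counts depend only on `codes`, so a real implementation could cache them across calls. A dense comparison such as `np.eye(n_categories)[codes][:, cols]` is a convenient reference for the tests mentioned above.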
