Get schema by id to use constant time lookup #162

@mpkocher

The core method get_schema_by_id does an O(N) linear scan over SCHEMAS on every call.

class ESSE(object):
    """
    Exabyte Source of Schemas and Examples class.
    """

    def __init__(self):
        self.schemas = SCHEMAS
        self.examples = EXAMPLES

    def get_schema_by_id(self, schemaId):
        return next((s for s in SCHEMAS if s.get("schemaId") == schemaId), None)

Parsing in libs like Exabyte's express is probably limited by file parsing IO, and N is small here (~200). However, get_schema_by_id is called from serialize_and_validate on every property, so the cost is paid repeatedly. The call can be converted to an O(1) lookup with a minor change: build a dict keyed by schemaId once in __init__ and look ids up in it.

class ESSE(object):
    def __init__(self):
        self.schemas = SCHEMAS
        self._schemas = {s['schemaId']: s for s in self.schemas if s.get('schemaId') is not None}
        self.examples = EXAMPLES

    def get_schema_by_id(self, schemaId):
        return self._schemas.get(schemaId)
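
One behavioral detail may be worth checking before swapping the implementation: the original next(...) returns the first schema with a matching id, while a dict comprehension keeps the last one if an id appears twice. Whether SCHEMAS contains duplicate ids is not stated here, so the following is only a sketch with a made-up SCHEMAS list and hypothetical helper names (linear_lookup, build_index, indexed_lookup) showing one way to keep first-match semantics in the index:

```python
# Hypothetical stand-in for the real SCHEMAS list; ids and titles are made up.
SCHEMAS = [
    {"schemaId": "material", "title": "first"},
    {"schemaId": "workflow", "title": "workflow"},
    {"title": "no id"},                            # no schemaId: not indexed
    {"schemaId": "material", "title": "second"},   # duplicate id
]


def linear_lookup(schema_id):
    """Original behaviour: scan the list and return the first match, or None."""
    return next((s for s in SCHEMAS if s.get("schemaId") == schema_id), None)


def build_index(schemas):
    """Map schemaId -> schema, keeping the first schema seen for each id."""
    index = {}
    for schema in schemas:
        schema_id = schema.get("schemaId")
        if schema_id is not None:
            # setdefault leaves an existing entry untouched, so the first wins.
            index.setdefault(schema_id, schema)
    return index


# Built once, e.g. in __init__; each lookup afterwards is a single dict access.
INDEX = build_index(SCHEMAS)


def indexed_lookup(schema_id):
    """Proposed behaviour: constant-time lookup, None when the id is unknown."""
    return INDEX.get(schema_id)
```

If ids are known to be unique, the dict comprehension above is equivalent and shorter. Note also that the index is a snapshot: schemas appended to self.schemas after __init__ would not be found unless the index is rebuilt.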
