measurement_date crash in biology #75

@PPPinson

Description

prepare_measurement_table raises the following error:

MissingConceptError: The DataFrame is missing some columns, namely:
- measurement_date

There are often issues with date columns when Spark is used together with pandas. We should only use the measurement_datetime column.

Solution: remove measurement_date from the variable "_measurement_required_columns" in utils.check_data.check_data_and_select_columns_measurement.
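To illustrate the effect of the proposed change, here is a minimal sketch, assuming the check simply compares a list of required names against the columns of the table. The names `find_missing_columns`, `check_required_columns` and `MissingColumnsError`, as well as the column lists, are hypothetical stand-ins and not the actual eds_scikit code.

```python
class MissingColumnsError(Exception):
    """Hypothetical stand-in for the MissingConceptError seen in the traceback."""


def find_missing_columns(available, required):
    """Return the required column names absent from `available`, in order."""
    available = set(available)
    return [name for name in required if name not in available]


def check_required_columns(available, required):
    """Raise if any required column is missing; otherwise do nothing."""
    missing = find_missing_columns(available, required)
    if missing:
        listing = "\n".join(f"- {name}" for name in missing)
        raise MissingColumnsError(
            "The DataFrame is missing some columns, namely:\n" + listing
        )


# Illustrative column lists (assumed, not taken from the library).
columns_in_table = ["measurement_id", "person_id", "measurement_datetime"]
required_before = ["measurement_id", "measurement_datetime", "measurement_date"]

# The proposed fix: stop requiring measurement_date.
required_after = [c for c in required_before if c != "measurement_date"]

# Passes silently once measurement_date is no longer required.
check_required_columns(columns_in_table, required_after)
```

With `required_before`, the same call would raise the error listing measurement_date; with `required_after`, a table that only carries measurement_datetime passes the check.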

How to reproduce the bug

prepare_measurement_table issue

import eds_scikit
from eds_scikit.biology import prepare_measurement_table, ConceptsSet
from eds_scikit.io import HiveData
data = HiveData(MyDB)  # MyDB: placeholder for your Hive database

leukocytes_set = ConceptsSet("Leukocytes_Blood_Count")
measurement = prepare_measurement_table(
    data,
    start_date="2022-01-01",
    end_date="2022-05-01",
    concept_sets=[leukocytes_set],
    convert_units=False,
    get_all_terminologies=True,
)

date columns issue

sql("SELECT measurement_date FROM measurement limit 10").toPandas()

This raises: "AttributeError: Can only use .dt accessor with datetimelike values"
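The same AttributeError can be reproduced in pandas alone. This is a minimal sketch, assuming the date column arrives as an object-dtype column of `datetime.date` values (an assumption about the Spark conversion, not something confirmed in this issue). The sample values are made up. It also shows one possible way to derive a date from measurement_datetime instead.

```python
import datetime

import pandas as pd

# Made-up sample rows mimicking the two columns discussed above.
df = pd.DataFrame(
    {
        "measurement_date": [
            datetime.date(2022, 1, 3),
            datetime.date(2022, 2, 14),
        ],
        "measurement_datetime": pd.to_datetime(
            ["2022-01-03 08:15:00", "2022-02-14 17:40:00"]
        ),
    }
)

# datetime.date values are stored with object dtype, so .dt is unavailable.
try:
    df["measurement_date"].dt.year
    dt_accessor_failed = False
except AttributeError:
    dt_accessor_failed = True

# The datetime column has a proper datetime64 dtype; a date can be derived
# from it by truncating the time part to midnight.
derived_date = df["measurement_datetime"].dt.normalize()
```

Relying on measurement_datetime alone avoids depending on how the date column is converted.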
