defining a concept as non-standard if it is not included in the CDM? #250

@lucarri

Describe the bug
I was reading a concept set from an ATLAS-generated JSON using codesFromConceptSet with type = "codelist_with_details". Even though most of these concepts are standard in ATHENA, they appear as non-standard in the resulting data frame. Is this because these concepts are not included in the CDM? If so, perhaps a different name could be used to avoid confusion with the ATHENA notion of "standard", e.g., standard_in_CDM?

I dug into the addDetails function and found the following: standard_concept = ifelse(is.na(.data$standard_concept), "non-standard", .data$standard_concept). Couldn't this information be taken from what is available in the JSON when the concept is not included in the CDM?
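
To illustrate what I think is happening, here is a minimal sketch in base R with toy data. The concept IDs 1, 2 and 3 and both data frames are made up for illustration, and the join is my assumption about how the details get attached, not the actual CodelistGenerator code. Only the ifelse() recoding is taken from addDetails.

```r
# Toy stand-in for cdm$concept (hypothetical IDs and values).
# Concept 2 is present but non-standard (NA); concept 3 is absent from the CDM.
cdm_concept <- data.frame(
  concept_id = c(1L, 2L),
  standard_concept = c("S", NA),
  stringsAsFactors = FALSE
)

# Concept IDs as they would be read from the JSON concept set.
codes <- data.frame(concept_id = c(1L, 2L, 3L))

# Left join: concepts that are absent from the CDM get NA in standard_concept.
details <- merge(codes, cdm_concept, by = "concept_id", all.x = TRUE)

# Same recoding as the line quoted from addDetails: every NA becomes "non-standard".
details$standard_concept <- ifelse(
  is.na(details$standard_concept), "non-standard", details$standard_concept
)

print(details)
```

In this sketch, concept 2 (present in the CDM, truly non-standard) and concept 3 (simply missing from the CDM) end up with the same label, which is the ambiguity I am describing.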

To Reproduce
Trying to reproduce it with a mock CDM, I got the following:

This reproduces it:

  cdm <- omock::mockCdmFromDataset(datasetName = "GiBleed")
  conceptSet <- CodelistGenerator::codesFromConceptSet("temp/smoke.json", cdm, type = "codelist_with_details")

This does not:

  cdm <- omock::mockCdmFromDataset(datasetName = "synpuf-1k_5.4")
  conceptSet <- CodelistGenerator::codesFromConceptSet("temp/smoke.json", cdm, type = "codelist_with_details")

I checked that the concepts in my JSON are included in the "synpuf-1k_5.4" CDM but not in the "GiBleed" CDM.

A shorter version of the JSON I'm using is:
{
  "items": [
    {
      "concept": {
        "CONCEPT_CLASS_ID": "Clinical Finding",
        "CONCEPT_CODE": "77176002",
        "CONCEPT_ID": 4298794,
        "CONCEPT_NAME": "Smoker",
        "DOMAIN_ID": "Observation",
        "INVALID_REASON": "V",
        "INVALID_REASON_CAPTION": "Valid",
        "STANDARD_CONCEPT": "N",
        "STANDARD_CONCEPT_CAPTION": "Non-Standard",
        "VOCABULARY_ID": "SNOMED",
        "VALID_START_DATE": "",
        "VALID_END_DATE": ""
      },
      "isExcluded": false,
      "includeDescendants": false,
      "includeMapped": false
    },
    {
      "concept": {
        "CONCEPT_CLASS_ID": "Clinical Observation",
        "CONCEPT_CODE": "72166-2",
        "CONCEPT_ID": 43054909,
        "CONCEPT_NAME": "Tobacco smoking status",
        "DOMAIN_ID": "Observation",
        "INVALID_REASON": "V",
        "INVALID_REASON_CAPTION": "Valid",
        "STANDARD_CONCEPT": "S",
        "STANDARD_CONCEPT_CAPTION": "Standard",
        "VOCABULARY_ID": "LOINC",
        "VALID_START_DATE": "",
        "VALID_END_DATE": ""
      },
      "isExcluded": false,
      "includeDescendants": false,
      "includeMapped": false
    },
    {
      "concept": {
        "CONCEPT_CLASS_ID": "Clinical Finding",
        "CONCEPT_CODE": "8517006",
        "CONCEPT_ID": 4310250,
        "CONCEPT_NAME": "Ex-smoker",
        "DOMAIN_ID": "Observation",
        "INVALID_REASON": "V",
        "INVALID_REASON_CAPTION": "Valid",
        "STANDARD_CONCEPT": "N",
        "STANDARD_CONCEPT_CAPTION": "Non-Standard",
        "VOCABULARY_ID": "SNOMED",
        "VALID_START_DATE": "",
        "VALID_END_DATE": ""
      },
      "isExcluded": false,
      "includeDescendants": false,
      "includeMapped": false
    },
    {
      "concept": {
        "CONCEPT_CLASS_ID": "Clinical Finding",
        "CONCEPT_CODE": "65568007",
        "CONCEPT_ID": 4276526,
        "CONCEPT_NAME": "Cigarette smoker",
        "DOMAIN_ID": "Observation",
        "INVALID_REASON": "V",
        "INVALID_REASON_CAPTION": "Valid",
        "STANDARD_CONCEPT": "N",
        "STANDARD_CONCEPT_CAPTION": "Non-Standard",
        "VOCABULARY_ID": "SNOMED",
        "VALID_START_DATE": "",
        "VALID_END_DATE": ""
      },
      "isExcluded": false,
      "includeDescendants": false,
      "includeMapped": false
    }
  ]
}

Expected behavior
I would have expected both the domain_id and standard_concept columns in the details to be populated with the same information that is available in the JSON/ATHENA.
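
A rough sketch of the kind of fallback I have in mind, in base R. The data frames, the toy CDM contents and the extra source column are all hypothetical and only for illustration; this is not how CodelistGenerator is implemented. The JSON values are copied from the first two items of the example above, and I am assuming a toy CDM that contains only the second concept.

```r
# Details as recorded in the JSON (first two items of the example above).
json_details <- data.frame(
  concept_id = c(4298794L, 43054909L),
  domain_id = c("Observation", "Observation"),
  standard_concept = c("N", "S"),
  stringsAsFactors = FALSE
)

# Toy stand-in for cdm$concept (assumption): only the second concept is present.
cdm_concept <- data.frame(
  concept_id = 43054909L,
  domain_id = "Observation",
  standard_concept = "S",
  stringsAsFactors = FALSE
)

# Position of each JSON concept in the CDM concept table (NA when absent).
idx <- match(json_details$concept_id, cdm_concept$concept_id)
in_cdm <- !is.na(idx)

# Prefer the CDM value when the concept exists there, else keep the JSON value.
details <- json_details
for (col in c("domain_id", "standard_concept")) {
  details[[col]] <- ifelse(in_cdm, cdm_concept[[col]][idx], json_details[[col]])
}

# Record where each row's details came from, so the fallback is visible.
details$source <- ifelse(in_cdm, "cdm", "json")

print(details)
```

Keeping a column such as source would make it explicit whether a value reflects the CDM vocabulary or only the JSON, instead of folding both cases into "non-standard".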

Screenshots
With GiBleed:

Image

With synpuf:

Image

Desktop (please complete the following information):

  • OS: Windows
  • CodelistGenerator version: 3.5.0
