Fixed thresholds at the Field level (measureValueCompletenessThreshold) and at the Table level (measurePersonCompletenessThreshold) #464

Merged
katy-sadowski merged 7 commits into OHDSI:develop on Jul 18, 2023


dimshitc
Collaborator

For the field-level measureValueCompletenessThreshold, the following logic was used:
set measurevaluecompletenessthreshold = 0 where isrequired = 'Yes';
set measurevaluecompletenessthreshold = 100 where isrequired = 'No';
The table-level measurePersonCompletenessThreshold was set the same as for v5.3, and additionally set to 95 for the EPISODE table.
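The field-level rule above can be sketched in Python as follows. This is only an illustration, not the script actually used for this PR: the sample rows are hypothetical, and the `cdmTableName` and `cdmFieldName` column names are assumptions (only `isRequired` and `measureValueCompletenessThreshold` appear in the description).

```python
import csv
import io

# Hypothetical sample rows; the table/field column names are assumptions.
SAMPLE = """cdmTableName,cdmFieldName,isRequired,measureValueCompletenessThreshold
DEATH,person_id,Yes,100
DEATH,death_date,Yes,
DEATH,cause_concept_id,No,0
"""


def fix_value_completeness_thresholds(rows):
    """Apply the rule from the description: required -> 0, optional -> 100."""
    fixed = []
    for row in rows:
        updated = dict(row)  # copy so the input rows are left untouched
        is_required = updated["isRequired"].strip().lower()
        if is_required == "yes":
            updated["measureValueCompletenessThreshold"] = "0"
        elif is_required == "no":
            updated["measureValueCompletenessThreshold"] = "100"
        fixed.append(updated)
    return fixed


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
fixed_rows = fix_value_completeness_thresholds(rows)
for row in fixed_rows:
    print(row["cdmFieldName"], row["isRequired"],
          row["measureValueCompletenessThreshold"])
```

Rows whose `isRequired` value is neither Yes nor No are passed through unchanged, so an unexpected value is not silently overwritten.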

codecov bot commented Jun 21, 2023

Codecov Report

Merging #464 (c4863ca) into develop (162e709) will increase coverage by 0.65%.
The diff coverage is 82.63%.

❗ Current head c4863ca differs from pull request most recent head e43fab9. Consider uploading reports for the commit e43fab9 to get more accurate results

@@             Coverage Diff             @@
##           develop     #464      +/-   ##
===========================================
+ Coverage    84.82%   85.47%   +0.65%     
===========================================
  Files           13       15       +2     
  Lines          725      847     +122     
===========================================
+ Hits           615      724     +109     
- Misses         110      123      +13     
| Impacted Files | Coverage Δ |
|---|---|
| R/view.R | 30.43% <ø> (ø) |
| R/sqlOnly.R | 72.64% <72.64%> (ø) |
| R/convertResultsCase.R | 100.00% <100.00%> (ø) |
| R/executeDqChecks.R | 89.65% <100.00%> (-0.05%) ⬇️ |
| R/listChecks.R | 100.00% <100.00%> (+25.53%) ⬆️ |
| R/runCheck.R | 96.55% <100.00%> (+14.73%) ⬆️ |

... and 1 file with indirect coverage changes

Collaborator

katy-sadowski left a comment

Thanks so much for fixing this!

In checking against the older files I noticed that measureValueCompleteness for the DEATH table does not match isRequired in the v5.2 and v5.3 threshold files (i.e. should be 0 where the field is required, as you've done in the 5.4 file). Could you please also add that change to the other 2 threshold files?

DEVICE_EXPOSURE,CDM,No,DEVICE_,Yes,100,,No,,,,"The Device domain captures information about a person's exposure to a foreign physical object or instrument which is used for diagnostic or therapeutic purposes through a mechanism beyond chemical action. Devices include implantable objects (e.g. pacemakers, stents, artificial joints), medical equipment and supplies (e.g. bandages, crutches, syringes), other instruments used in medical procedures (e.g. sutures, defibrillators) and material used in clinical care (e.g. adhesives, body material, dental material, surgical material).","The distinction between Devices or supplies and Procedures are sometimes blurry, but the former are physical objects while the latter are actions, often to apply a Device or supply.",Source codes and source text fields mapped to Standard Concepts of the Device Domain have to be recorded here.
MEASUREMENT,CDM,No,MEASUREMENT_,Yes,95,,No,,,,"The MEASUREMENT table contains records of Measurements, i.e. structured values (numerical or categorical) obtained through systematic and standardized examination or testing of a Person or Person's sample. The MEASUREMENT table contains both orders and results of such Measurements as laboratory tests, vital signs, quantitative findings from pathology reports, etc. Measurements are stored as attribute value pairs, with the attribute as the Measurement Concept and the value representing the result. The value can be a Concept (stored in VALUE_AS_CONCEPT), or a numerical value (VALUE_AS_NUMBER) with a Unit (UNIT_CONCEPT_ID). The Procedure for obtaining the sample is housed in the PROCEDURE_OCCURRENCE table, though it is unnecessary to create a PROCEDURE_OCCURRENCE record for each measurement if one does not exist in the source data. Measurements differ from Observations in that they require a standardized test or some other activity to generate a quantitative or qualitative result. If there is no result, it is assumed that the lab test was conducted but the result was not captured.","Measurements are predominately lab tests with a few exceptions, like blood pressure or function tests. Results are given in the form of a value and unit combination. When investigating measurements, look for operator_concept_ids (<, >, etc.).","Only records where the source value maps to a Concept in the measurement domain should be included in this table. Even though each Measurement always has a result, the fields VALUE_AS_NUMBER and VALUE_AS_CONCEPT_ID are not mandatory as often the result is not given in the source data. When the result is not known, the Measurement record represents just the fact that the corresponding Measurement was carried out, which in itself is already useful information for some use cases. For some Measurement Concepts, the result is included in the test. 
For example, ICD10 CONCEPT_ID [45548980](https://athena.ohdsi.org/search-terms/terms/45548980) 'Abnormal level of unspecified serum enzyme' indicates a Measurement and the result (abnormal). In those situations, the CONCEPT_RELATIONSHIP table in addition to the 'Maps to' record contains a second record with the relationship_id set to 'Maps to value'. In this example, the 'Maps to' relationship directs to [4046263](https://athena.ohdsi.org/search-terms/terms/4046263) 'Enzyme measurement' as well as a 'Maps to value' record to [4135493](https://athena.ohdsi.org/search-terms/terms/4135493) 'Abnormal'."
OBSERVATION,CDM,No,OBSERVATION_,Yes,95,,No,,,,"The OBSERVATION table captures clinical facts about a Person obtained in the context of examination, questioning or a procedure. Any data that cannot be represented by any other domains, such as social and lifestyle facts, medical history, family history, etc. are recorded here.","Observations differ from Measurements in that they do not require a standardized test or some other activity to generate clinical fact. Typical observations are medical history, family history, the stated need for certain treatment, social circumstances, lifestyle choices, healthcare utilization patterns, etc. If the generation clinical facts requires a standardized testing such as lab testing or imaging and leads to a standardized result, the data item is recorded in the MEASUREMENT table. If the clinical fact observed determines a sign, symptom, diagnosis of a disease or other medical condition, it is recorded in the CONDITION_OCCURRENCE table. Valid Observation Concepts are not enforced to be from any domain though they still should be Standard Concepts.","Records whose Source Values map to any domain besides Condition, Procedure, Drug, Measurement or Device should be stored in the Observation table. Observations can be stored as attribute value pairs, with the attribute as the Observation Concept and the value representing the clinical fact. This fact can be a Concept (stored in VALUE_AS_CONCEPT), a numerical value (VALUE_AS_NUMBER), a verbatim string (VALUE_AS_STRING), or a datetime (VALUE_AS_DATETIME). Even though Observations do not have an explicit result, the clinical fact can be stated separately from the type of Observation in the VALUE_AS_* fields. It is recommended for Observations that are suggestive statements of positive assertion should have a value of 'Yes' (concept_id=4188539), recorded, even though the null value is the equivalent. "
DEATH,CDM,No,,No,95,,No,,,,"The death domain contains the clinical event for how and when a Person dies. A person can have up to one record if the source system contains evidence about the Death, such as: Condition in an administrative claim, status of enrollment into a health plan, or explicit record in EHR data.",,
Collaborator

I think this one should be left blank, since the measurePersonCompleteness column is No and it's also blank in v5.3.

Collaborator Author

My bad, I didn't take the measurePersonCompleteness value into account. Not sure why I put 95 there. I should have put 100 there with measurePersonCompleteness = 'Yes', so the status will always be 'Pass' but users can still assess the Death table using DQD.
But what is the behaviour of this check if the Death table is not populated?

Collaborator

I believe if it's Yes + 100, and the death table is empty, the check will pass (for non-zero thresholds you need to exceed the threshold in order to fail). If the death table is not present in the CDM then the check will be marked as NOT_APPLICABLE. I'm fine with either Yes+100 or No+null here.
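The behaviour described in the comment above can be sketched as a small Python function. This is a hedged illustration of the stated rule, not DQD's actual implementation. The function name and arguments are hypothetical, the zero-threshold branch is an assumption, and it is also assumed that an empty table means every person lacks a record (100% violated).

```python
def person_completeness_status(table_present, pct_persons_without_records, threshold):
    """Return a check status following the rule described in the comment.

    pct_persons_without_records is a percentage in [0, 100];
    threshold is the measurePersonCompletenessThreshold value.
    """
    if not table_present:
        # Table missing from the CDM: the check cannot be evaluated.
        return "NOT_APPLICABLE"
    if threshold == 0:
        # Assumption: with a zero threshold, any violation fails.
        return "FAIL" if pct_persons_without_records > 0 else "PASS"
    # Non-zero threshold: fail only when the threshold is exceeded.
    return "FAIL" if pct_persons_without_records > threshold else "PASS"


# Empty DEATH table with Yes + 100: 100 does not exceed 100, so the check passes.
print(person_completeness_status(True, 100.0, 100))
# DEATH table absent from the CDM.
print(person_completeness_status(False, 0.0, 100))
# A threshold of 95 with 97% of persons lacking records.
print(person_completeness_status(True, 97.0, 95))
```

With a threshold of 100 the percentage can never exceed the threshold, which is why Yes + 100 always passes whenever the table exists.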

Collaborator Author

Yes + 100 it is.


- It allows aggregation of chronic conditions that require frequent ongoing care, instead of treating each Condition Occurrence as an independent event.
- It allows aggregation of multiple, closely timed doctor visits for the same Condition to avoid double-counting the Condition Occurrences.
For example, consider a Person who visits her Primary Care Physician (PCP) and who is referred to a specialist. At a later time, the Person visits the specialist, who confirms the PCP's original diagnosis and provides the appropriate treatment to resolve the condition. These two independent doctor visits should be aggregated into one Condition Era.",,"Each Condition Era corresponds to one or many Condition Occurrence records that form a continuous interval.
The condition_concept_id field contains Concepts that are identical to those of the CONDITION_OCCURRENCE table records that make up the Condition Era. In contrast to Drug Eras, Condition Eras are not aggregated to contain Conditions of different hierarchical layers. The SQl Script for generating CONDITION_ERA records can be found [here](https://ohdsi.github.io/CommonDataModel/sqlScripts.html#condition_eras)
The Condition Era Start Date is the start date of the first Condition Occurrence.
The Condition Era End Date is the end date of the last Condition Occurrence. Condition Eras are built with a Persistence Window of 30 days, meaning, if no occurrence of the same condition_concept_id happens within 30 days of any one occurrence, it will be considered the condition_era_end_date."
EPISODE,CDM,No,,No,,,No,,,,"The EPISODE table aggregates lower-level clinical events (VISIT_OCCURRENCE, DRUG_EXPOSURE, PROCEDURE_OCCURRENCE, DEVICE_EXPOSURE) into a higher-level abstraction representing clinically and analytically relevant disease phases,outcomes and treatments. The EPISODE_EVENT table connects qualifying clinical events (VISIT_OCCURRENCE, DRUG_EXPOSURE, PROCEDURE_OCCURRENCE, DEVICE_EXPOSURE) to the appropriate EPISODE entry. For example cancers including their development over time, their treatment, and final resolution. ","Valid Episode Concepts belong to the 'Episode' domain. For cancer episodes please see [article], for non-cancer episodes please see [article]. If your source data does not have all episodes that are relevant to the therapeutic area, write only those you can easily derive from the data. It is understood that that table is not currently expected to be comprehensive. ",
EPISODE,CDM,No,,No,95,,No,,,,"The EPISODE table aggregates lower-level clinical events (VISIT_OCCURRENCE, DRUG_EXPOSURE, PROCEDURE_OCCURRENCE, DEVICE_EXPOSURE) into a higher-level abstraction representing clinically and analytically relevant disease phases,outcomes and treatments. The EPISODE_EVENT table connects qualifying clinical events (VISIT_OCCURRENCE, DRUG_EXPOSURE, PROCEDURE_OCCURRENCE, DEVICE_EXPOSURE) to the appropriate EPISODE entry. For example cancers including their development over time, their treatment, and final resolution. ","Valid Episode Concepts belong to the 'Episode' domain. For cancer episodes please see [article], for non-cancer episodes please see [article]. If your source data does not have all episodes that are relevant to the therapeutic area, write only those you can easily derive from the data. It is understood that that table is not currently expected to be comprehensive. ",
Collaborator

did you mean to add the 95 here, since the check is toggled to No?

Collaborator Author

Same idea as with Death. The only difference is that the Episode table is still rarely used.

Collaborator

In this case since it's rarely used maybe it makes the most sense to keep this one as No and remove the threshold.

Collaborator Author

No + NULL it is.

Dmitry Dymshyts added 2 commits June 26, 2023 06:19
… square meter) was replaced with 720870 (milliliter per minute per 1.73 square meter)" issue
@@ -471,7 +471,7 @@ MEASUREMENT,MEASUREMENT_CONCEPT_ID,3047107,Calcium [Mass/volume] corrected for a
MEASUREMENT,MEASUREMENT_CONCEPT_ID,3050166,Trypsinogen I Free [Mass/volume] in DBS,,,,,,,,,,,,,,,,,,,,,"8636,8713,8725,8748,8751,8817,8820,8837,8840,8842,8845,8859,8861,8950,9028,9503,9514,9530,9532,9560,9564,9625,32964,32965,44777535,44777592,44777638,45956701",5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
MEASUREMENT,MEASUREMENT_CONCEPT_ID,21493466,Plesiomonas shigelloides DNA [Presence] in Stool by NAA with non-probe detection,,,,,,,,,,,,,,,,,,,,,,5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
MEASUREMENT,MEASUREMENT_CONCEPT_ID,21493469,Vibrio cholerae DNA [Presence] in Stool by NAA with non-probe detection,,,,,,,,,,,,,,,,,,,,,,5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
MEASUREMENT,MEASUREMENT_CONCEPT_ID,36306178,"Glomerular filtration rate/1.73 sq M.predicted among blacks [Volume Rate/Area] in Serum, Plasma or Blood by Creatinine-based formula (CKD-EPI)",,,,,,,,,,,,,,,,,,,,,9117,5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
MEASUREMENT,MEASUREMENT_CONCEPT_ID,36306178,"Glomerular filtration rate/1.73 sq M.predicted among blacks [Volume Rate/Area] in Serum, Plasma or Blood by Creatinine-based formula (CKD-EPI)",,,,,,,,,,,,,,,,,,,,,720870,5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
Collaborator

per my comment on #420, maybe we should actually allow both unit concept IDs here, to accommodate both vocab versions?
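One way to allow both unit concept IDs would be to merge the old and new values into a single comma-separated list, matching the format of the multi-unit rows shown in the diff. The sketch below is a hypothetical helper for illustration only, not code from this PR.

```python
def merge_unit_concept_ids(*id_lists):
    """Merge comma-separated unit concept ID strings.

    Duplicates are dropped and the result is sorted numerically.
    """
    merged = set()
    for id_list in id_lists:
        for token in id_list.split(","):
            token = token.strip()
            if token:  # skip empty entries such as a trailing comma
                merged.add(int(token))
    return ",".join(str(concept_id) for concept_id in sorted(merged))


old_units = "9117"    # unit concept ID before this change
new_units = "720870"  # unit concept ID introduced by this change
both_units = merge_unit_concept_ids(old_units, new_units)
print(both_units)  # prints "9117,720870"
```

Because the merged value contains a comma, it would need to be quoted in the CSV, as the existing multi-unit rows are.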

@@ -852,7 +852,7 @@ MEASUREMENT,MEASUREMENT_CONCEPT_ID,3050920,Urate [Mass/volume] in Body fluid,,,,
MEASUREMENT,MEASUREMENT_CONCEPT_ID,3051923,Hemoglobin disorders newborn screen interpretation,,,,,,,,,,,,,,,,,,,,,,5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
MEASUREMENT,MEASUREMENT_CONCEPT_ID,21493338,Parainfluenza virus 2 RNA [Presence] in Nasopharynx by NAA with non-probe detection,,,,,,,,,,,,,,,,,,,,,,5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
MEASUREMENT,MEASUREMENT_CONCEPT_ID,36203231,Streptococcus pneumoniae Danish serotype 12F IgG Ab [Mass/volume] in Serum,,,,,,,,,,,,,,,,,,,,,"8636,8713,8725,8748,8751,8817,8820,8837,8840,8842,8845,8859,8861,8950,9028,9503,9514,9530,9532,9560,9564,9625,32964,32965,44777535,44777592,44777638,45956701",5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
MEASUREMENT,MEASUREMENT_CONCEPT_ID,36303797,"Glomerular filtration rate/1.73 sq M.predicted among non-blacks [Volume Rate/Area] in Serum, Plasma or Blood by Creatinine-based formula (CKD-EPI)",,,,,,,,,,,,,,,,,,,,,9117,5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
MEASUREMENT,MEASUREMENT_CONCEPT_ID,36303797,"Glomerular filtration rate/1.73 sq M.predicted among non-blacks [Volume Rate/Area] in Serum, Plasma or Blood by Creatinine-based formula (CKD-EPI)",,,,,,,,,,,,,,,,,,,,,720870,5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
Collaborator

katy-sadowski Jun 28, 2023

same here, and all the ones below and in 5.2 and 5.3 :)

dimshitc changed the base branch from main to develop July 1, 2023 09:31
@@ -1177,7 +1177,7 @@ MEASUREMENT,MEASUREMENT_CONCEPT_ID,3001612,Ticarcillin+Clavulanate [Susceptibili
MEASUREMENT,MEASUREMENT_CONCEPT_ID,3002030,Lymphocytes/100 leukocytes in Blood,,,,,,,,,,,,,,,,,,,,,8554,5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
MEASUREMENT,MEASUREMENT_CONCEPT_ID,3003311,Nucleated erythrocytes/100 leukocytes [Ratio] in Blood by Manual count,,,,,,,,,,,,,,,,,,,,,8554,5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
MEASUREMENT,MEASUREMENT_CONCEPT_ID,3005044,Rubella virus IgG Ab [Interpretation] in Serum,,,,,,,,,,,,,,,,,,,,,,5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
MEASUREMENT,MEASUREMENT_CONCEPT_ID,3005424,Body surface area,,,,,,,,,,,,,,,,,,,,,"8617,9258,9259,9284,9401,9403,9404,9406,9408,9411,9417,9453,9456,9483,9572",5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
MEASUREMENT,MEASUREMENT_CONCEPT_ID,3005424,Body surface area,,,,,,,,,,,,,,,,,,,,,"8617,9284,9401,9403,9404,9406,9408,9411,9417,9453,9456,9483,9572",5,Approved UNIT_CONCEPT_IDs for the given MEASUREMENT_CONCEPT_ID
Collaborator

@dimshitc were these removed concepts ever the correct units for this measurement concept (i.e. in a previous vocab version)? or were they just incorrectly listed here?

katy-sadowski mentioned this pull request Jul 11, 2023
Collaborator

katy-sadowski left a comment

@dimshitc I ended up fixing the failing tests in a separate PR: #468 (I tried pushing to your fork but got a permission denied error, so I just pushed my changes in a PR on the main repo). As such, I think we can go ahead and merge this PR despite the failing tests, and then merge my PR #468 right afterwards.

katy-sadowski merged commit c62929d into OHDSI:develop Jul 18, 2023
1 of 4 checks passed
Development

Successfully merging this pull request may close these issues.

measureValueCompleteness errors for optional fields
measureValueCompleteness thresholds changed for v5.4
2 participants