Primary and secondary care code sets for Electronic Health Record research. The code sets were developed primarily for use with UK Biobank data.
🧑🎓 Please cite this work if you use it!
Clinical event codes are provided using Read v2 and Clinical Terms Version 3 (CTV3) classifications.
Variable | Value | Level | Description |
---|---|---|---|
angina | diagnosis | stable | Stable angina |
angina | diagnosis | unstable | Unstable angina |
bipolar | diagnosis | - | Bipolar disorder |
diabetes | diagnosis | - | Diabetes (type unknown) |
diabetes | diagnosis | type1 | Type 1 diabetes |
diabetes | diagnosis | type2 | Type 2 diabetes |
diabetes | diagnosis | gestational | Gestational diabetes |
diabetes | diagnosis | secondary | Secondary diabetes |
diabetes | diagnosis | remission | Diabetes remission |
diabetes | diagnosis | resolved | Diabetes resolution |
diabetes | family_history | - | Family history of diabetes |
hypertension | diagnosis | - | Hypertension |
learning_disabilities | diagnosis | - | Learning disabilities |
mi | diagnosis | - | Myocardial infarction/heart attack |
pcos | diagnosis | - | Polycystic ovarian syndrome |
schizophrenia | diagnosis | - | Schizophrenia |
stroke | diagnosis | haemorrhagic | Haemorrhagic stroke |
stroke | diagnosis | ischaemic | Ischaemic stroke |
tia | diagnosis | - | Transient ischaemic attack |
Variable | Value | Level | Description |
---|---|---|---|
blood_glucose | fpg | - | Fasting plasma glucose |
blood_glucose | hba1c | - | Glycated hemoglobin |
blood_glucose | ogtt | 2hour | 2 hour oral glucose tolerance test |
blood_glucose | random | - | Random blood sugar |
blood_glucose | unknown | - | Glucose test (unknown type) |
anthropometric | bmi | - | Body mass index |
anthropometric | height | - | Height |
anthropometric | weight | - | Weight |
anthropometric | waist | - | Waist circumference |
Variable | Value | Level | Description |
---|---|---|---|
smoking | current | trivial | Current trivial smoker |
smoking | current | light | Current light smoker |
smoking | current | moderate | Current moderate smoker |
smoking | current | heavy | Current heavy smoker |
smoking | current | very_heavy | Current very heavy smoker |
smoking | current | - | Current smoker (level unknown) |
smoking | former | trivial | Former trivial smoker |
smoking | former | light | Former light smoker |
smoking | former | moderate | Former moderate smoker |
smoking | former | heavy | Former heavy smoker |
smoking | former | very_heavy | Former very heavy smoker |
smoking | former | - | Former smoker (level unknown) |
smoking | never | - | Never smoked |
smoking | non | - | Non-smoker (assumed current) |
smoking | passive | - | Passive smoker (assumed current) |
smoking | consumption | - | Cigarette consumption |
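The Variable, Value and Level columns above act as a three-part key for each clinical code. As a minimal sketch of how a code set keyed this way might be queried, the Python below filters rows by any combination of the three. The `CodeRow` type, the `select_codes` helper and the row layout are assumptions made for illustration, and the codes are illustrative placeholders rather than rows copied from this repository.

```python
from typing import NamedTuple, Optional


class CodeRow(NamedTuple):
    """One code set entry (hypothetical layout, not the repository's file format)."""
    code: str
    variable: str
    value: str
    level: Optional[str]  # None stands for "-" (no level) in the tables


# Illustrative placeholder rows; check the actual code set files before use.
ROWS = [
    CodeRow("C10..", "diabetes", "diagnosis", None),
    CodeRow("C10E.", "diabetes", "diagnosis", "type1"),
    CodeRow("C10F.", "diabetes", "diagnosis", "type2"),
    CodeRow("137R.", "smoking", "current", None),
    CodeRow("1371.", "smoking", "never", None),
]

_ANY = object()  # sentinel meaning "do not filter on this column"


def select_codes(rows, variable, value=_ANY, level=_ANY):
    """Return the set of codes matching a variable and, optionally, a value and level."""
    selected = set()
    for row in rows:
        if row.variable != variable:
            continue
        if value is not _ANY and row.value != value:
            continue
        if level is not _ANY and row.level != level:
            continue
        selected.add(row.code)
    return selected
```

Calling `select_codes(ROWS, "diabetes")` collects every diabetes code regardless of type, while passing `level=None` restricts the result to rows whose level is "-" (for example, diabetes of unknown type).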
Around 76% of UK Biobank prescription records have a BNF code. 99.7% of records have a BNF and/or Read v2 code. Prescription codes are therefore provided using British National Formulary (BNF) and Read v2 classifications.
prescriptions.rds is a named "list of lists" for the following drug categories:
Drug category | Name |
---|---|
Anti-diabetes drugs | diabetes |
Anti-hypertensives | hypertension |
Atypical anti-psychotics | antipsychotic |
Steroids | steroids |
Statins | statins |
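In R, an `.rds` file such as this one is loaded with `readRDS()`. The Python sketch below only mirrors the "list of lists" shape as a dictionary of dictionaries, to show how a prescription record with a BNF code, a Read v2 code, or both could be assigned to a drug category. The inner names `bnf` and `read_v2`, the helper functions, the section-level BNF prefixes and the Read v2 placeholder codes are all assumptions for illustration and are not taken from the file itself.

```python
# Hypothetical stand-in for the named "list of lists"; the real file's inner
# structure and codes may differ.
PRESCRIPTIONS = {
    "diabetes": {"bnf": ["0601"], "read_v2": ["rx01."]},  # placeholder Read code
    "statins": {"bnf": ["0212"], "read_v2": ["rx02."]},   # placeholder Read code
}


def normalise_bnf(code):
    """Strip dots and whitespace so '06.01.02.02.00' and '0601020200' compare equal."""
    return code.replace(".", "").strip().upper()


def drug_categories(bnf_code=None, read_code=None, code_sets=PRESCRIPTIONS):
    """Return category names matched by the BNF code prefix or the exact Read v2 code."""
    matches = []
    bnf = normalise_bnf(bnf_code) if bnf_code else ""
    for name, codes in code_sets.items():
        bnf_hit = bool(bnf) and any(bnf.startswith(prefix) for prefix in codes["bnf"])
        read_hit = bool(read_code) and read_code in codes["read_v2"]
        if bnf_hit or read_hit:
            matches.append(name)
    return matches
```

Checking both code types reflects the coverage figures above: a record lacking a BNF code can still be matched through its Read v2 code. The section-level prefixes used here are deliberately coarse and would need replacing with the codes the file actually contains.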
Further details are provided here.
Secondary care diagnoses are provided using ICD-9 and ICD-10 coding classifications. Procedures are provided using OPCS-3 and OPCS-4 classifications.
Variable | Value | Level | Description |
---|---|---|---|
diabetes |
diagnosis |
- |
Diabetes (type unknown) |
diabetes |
diagnosis |
type1 |
Type 1 diabetes |
diabetes |
diagnosis |
type2 |
Type 2 diabetes |
diabetes |
diagnosis |
gestational |
Gestational diabetes |
diabetes |
diagnosis |
secondary |
Secondary diabetes |
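As a rough sketch of how secondary care diagnoses could be mapped to the levels in the table above, the Python below matches ICD-10 codes by prefix. The prefix mapping is an assumption based on the general ICD-10 chapter layout, not the code list shipped in this repository, and it omits secondary diabetes and ICD-9; `ICD10_DIABETES` and `diabetes_level` are hypothetical names.

```python
# Assumed ICD-10 prefixes for illustration only; consult the repository's
# code sets for the definitive lists.
ICD10_DIABETES = {
    "type1": ("E10",),
    "type2": ("E11",),
    "gestational": ("O244",),
    "-": ("E14",),  # "-" mirrors the table's level for diabetes of unknown type
}


def diabetes_level(icd10_code):
    """Return the diabetes level for an ICD-10 code, or None if it is not matched."""
    code = icd10_code.replace(".", "").strip().upper()  # accept 'E11.9' or 'E119'
    for level, prefixes in ICD10_DIABETES.items():
        if code.startswith(prefixes):  # str.startswith accepts a tuple of prefixes
            return level
    return None
```

Removing the dot before matching lets the same function handle codes written as `E11.9` and codes stored without punctuation as `E119`.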
The majority of diagnosis records in the interim EHR data release use the CTV3 coding classification. The code set repositories below typically only cover Read v2 diagnostic codes and limited prescription coding.
- https://phenotypes.healthdatagateway.org/
- https://www.opencodelists.org/
- https://clinicalcodes.rss.mhs.man.ac.uk/
- https://caliberresearch.org/portal is no longer updated
Kuan et al. (2019) includes a map of 308 physical and mental health conditions. Read v2 codes are available at CALIBER and https://github.com/spiros/chronological-map-phenotypes.
- https://openprescribing.net/bnf/ includes a browsable BNF with high-level prescribing trends
- https://www.thedatalab.org/blog/161/prescribing-data-bnf-codes/ summarises the BNF coding structure
If you use this work, please cite it as below:
@article{10.1093/jamia/ocab260,
author = {Darke, Philip and Cassidy, Sophie and Catt, Michael and Taylor, Roy and Missier, Paolo and Bacardit, Jaume},
title = "{Curating a longitudinal research resource using linked primary care EHR data - a UK Biobank case study}",
journal = {Journal of the American Medical Informatics Association},
volume = {29},
number = {3},
pages = {546-552},
year = {2021},
month = {12},
issn = {1527-974X},
doi = {10.1093/jamia/ocab260},
url = {https://doi.org/10.1093/jamia/ocab260},
eprint = {https://academic.oup.com/jamia/article-pdf/29/3/546/42333190/ocab260.pdf},
}
Made available under a Creative Commons Attribution 4.0 International License.