A machine learning model trained on large, deidentified healthcare encounter datasets successfully identified patients likely to have familial hypercholesterolemia (FH), according to a study published in Lancet Digital Health.

The machine learning model, FIND FH, was trained using deidentified electronic health record data from academic health systems (ie, procedure and diagnostic codes, prescriptions, and laboratory findings) from patients with an FH diagnosis (n=939) and individuals without FH (n=83,136).

Researchers applied the model to a national healthcare encounter database comprising 170 million people, as well as to a dataset from an integrated, tertiary academic healthcare system that included up to 174,000 people. People were included in model training and evaluation only if they had ≥1 cardiovascular disease risk factor.

The model had a positive predictive value of 0.85, a sensitivity of 0.45, an area under the precision–recall curve of 0.55, and an area under the receiver operating characteristic curve of 0.89. There were 1,331,759 of 170,416,201 patients in the national database and 866 of 173,733 people in the integrated healthcare delivery system dataset identified as likely to have FH.
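To put the flagged counts in proportion, the arithmetic can be sketched as below. This is a minimal illustration that uses only the counts reported above; the function name `flagged_fraction` is a hypothetical helper and not part of the FIND FH model or the study's code.

```python
def flagged_fraction(flagged: int, total: int) -> float:
    """Return the share of a population that the model flagged as likely FH."""
    if total <= 0:
        raise ValueError("total must be a positive count")
    return flagged / total


# Counts as reported in the article for each dataset.
national = flagged_fraction(1_331_759, 170_416_201)
health_system = flagged_fraction(866, 173_733)

print(f"National database: {national:.2%} of patients flagged")
print(f"Integrated healthcare delivery system: {health_system:.2%} of patients flagged")
```

Under these counts, fewer than 1 in 100 people in either dataset were flagged. These are flag rates only, not estimates of how many people truly have FH.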

For the second external validation, FH experts reviewed “flagged” individuals and applied diagnostic criteria to a sample of 103 patients from the integrated healthcare delivery system and 45 from the national database. A total of 77% of patients in the integrated healthcare delivery system dataset (95% CI, 68–86) and 87% of patients in the national database (95% CI, 73–100) were deemed to be at high clinical suspicion of having FH, a level high enough to warrant guideline-based assessment and therapy.

Study limitations include the fact that physician review was conducted on a small subset of flagged individuals.

“A crucial hurdle will be engaging providers to become familiar with machine learning approaches designed to reconnect them with their patients regarding new diagnoses not presented in previous medical encounters,” noted the study authors. “This new tool carries the promise of finding new individuals with [FH] at scale and leading to more effective preventive therapy for them and newly identified family members.”

Disclosure: Several study authors declared affiliations with the pharmaceutical industry. Please see the original reference for a full list of authors’ disclosures.


Myers KD, Knowles JW, Staszak D, et al. Precision screening for familial hypercholesterolaemia: a machine learning study applied to electronic health encounter data. Lancet Digit Health. 2019;1(8):e393-e402. doi:10.1016/S2589-7500(19)30150-5