Machine-learning algorithms can be successfully used to identify heart failure (HF) in real-time in hospitalized patients, according to research published in JAMA Cardiology.
Researchers from the New York University (NYU) School of Medicine in New York City conducted a retrospective study of hospitalizations at NYU’s Langone Medical Center, with the aim of developing algorithms to identify hospitalized patients with HF. They noted that “problem lists,” one of the main tools for improving quality of care, can be useful for case identification but are often inaccurate or incomplete.2-6
Data were collected from electronic health records (EHRs) and covered 47,119 hospitalizations of adult patients between January 1, 2013, and February 28, 2015 (mean age: 60.9 years; 50.8% female; 11.2% African American; 7.8% Hispanic or Latino). Three hundred fifteen records were excluded for physician record review; of the remaining 46,804 hospitalizations, 75% were randomly selected for model development (n=35,114), while the other 25% were used for model validation (n=11,690).
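The cohort arithmetic above can be checked with a short sketch. The counts come from the article; the `split_ids` helper, its seed, and the use of sequential integers as stand-in record IDs are assumptions made for illustration and are not the investigators' procedure.

```python
import random

# Counts reported in the article.
TOTAL = 47_119      # all hospitalizations
EXCLUDED = 315      # excluded for physician record review
DEV_N = 35_114      # model development set
VAL_N = 11_690      # model validation set

def split_ids(ids, dev_fraction=0.75, seed=0):
    """Hypothetical helper: shuffle IDs and cut them into two disjoint sets."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * dev_fraction)
    return shuffled[:cut], shuffled[cut:]

remaining = TOTAL - EXCLUDED
# The two reported sets account for every remaining hospitalization.
assert remaining == DEV_N + VAL_N

# Integers stand in for real record IDs (an assumption).
dev, val = split_ids(range(remaining))
print(remaining, len(dev), len(val))
```

An exact 75% cut of the remaining records gives a development set slightly smaller than the reported 35,114, which would be consistent with each record having been assigned at random rather than the list being cut at a fixed index; the article does not say which was done.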
The researchers developed 5 algorithms of increasing complexity to identify patients with HF at the second midnight of hospitalization. Algorithms were implemented using EHR data and are as follows:
- The presence of HF on the problem list, which also included acute myocardial infarction and atherosclerosis.
- The presence of at least 1 of the following characteristics: HF on the problem list, inpatient oral or intravenous loop diuretic use, or brain natriuretic peptide level of 500 pg/mL or greater.
- Logistic regression of 30 clinically relevant structured data elements.
- Machine-learning approach using unstructured data.
- Machine-learning approach using structured and unstructured data.
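The two rule-based approaches in the list above (algorithms 1 and 2) are simple enough to sketch directly. The following is a minimal illustration, not the study's implementation: the `Hospitalization` record and its field names are hypothetical, and only the three criteria and the 500 pg/mL threshold come from the article.

```python
from dataclasses import dataclass
from typing import Optional

BNP_THRESHOLD_PG_ML = 500.0  # threshold stated in the article

@dataclass
class Hospitalization:
    """Hypothetical record; field names are assumptions for illustration."""
    hf_on_problem_list: bool
    loop_diuretic_given: bool          # inpatient oral or intravenous loop diuretic
    bnp_pg_ml: Optional[float] = None  # None when no BNP level was measured

def algorithm_1_flags_hf(h: Hospitalization) -> bool:
    """Algorithm 1: HF appears on the problem list."""
    return h.hf_on_problem_list

def algorithm_2_flags_hf(h: Hospitalization) -> bool:
    """Algorithm 2: at least 1 of the 3 listed characteristics is present."""
    bnp_high = h.bnp_pg_ml is not None and h.bnp_pg_ml >= BNP_THRESHOLD_PG_ML
    return h.hf_on_problem_list or h.loop_diuretic_given or bnp_high

# A patient with no HF on the problem list but an elevated BNP level.
patient = Hospitalization(hf_on_problem_list=False,
                          loop_diuretic_given=False,
                          bnp_pg_ml=620.0)
print(algorithm_1_flags_hf(patient), algorithm_2_flags_hf(patient))
```

Because algorithm 2 flags a patient when any single criterion is met, it would be expected to catch more true cases than algorithm 1 while also flagging more patients who do not have HF.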
The primary outcome measure for this study was an HF diagnosis based on discharge diagnosis and physician review of sampled medical records.
Of all hospitalizations examined, 13.9% had a diagnosis of HF “in any position,” while 2.6% had a principal diagnosis of HF. Algorithm 1 had a sensitivity of 0.52 and a positive predictive value (PPV) of 0.96 for HF identification; HF on the problem list had a sensitivity of 0.40 and a PPV of 0.96 in the validation set. Algorithm 2 was associated with a sensitivity of 0.84 and a PPV of 0.58 for discharge diagnosis, compared to a sensitivity of 0.77 and a PPV of 0.68 for physician review criteria standards.
For algorithm 3, the investigators reported an area under the receiver operating characteristic curve (AUC) of 0.953, a sensitivity of 0.76, and a PPV of 0.80. When validated against the physician review criterion standard, algorithm 3 had a sensitivity of 0.68 and a PPV of 0.90.
For the 2 machine-learning algorithms (algorithms 4 and 5), AUCs were 0.969 and 0.974, respectively; sensitivities were 0.84 and 0.86, respectively; and PPVs were 0.80 for both, using the discharge diagnosis criterion standard. Algorithm 5 had a sensitivity of 0.83 and a PPV of 0.90 using the physician review criterion standard.
More than 1600 hospitalizations for principal or secondary HF were noted; of these, 12% had not had a prior echocardiogram. For algorithms 1 through 5, the PPVs for identification of HF among patients who had not had an echocardiogram were 0.92, 0.30, 0.71, 0.71, and 0.67, respectively.
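For readers less familiar with the metrics reported above, sensitivity is the share of true HF cases that an algorithm flags, and PPV is the share of flagged cases that truly have HF. The sketch below shows the two standard formulas; the counts are invented for illustration only and are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of actual HF cases that the algorithm identified."""
    return true_pos / (true_pos + false_neg)

def ppv(true_pos: int, false_pos: int) -> float:
    """Fraction of algorithm-flagged cases that actually had HF."""
    return true_pos / (true_pos + false_pos)

# Hypothetical counts (NOT study data): 100 true HF cases, of which
# 80 are flagged, plus 20 flagged patients who do not have HF.
tp, fn, fp = 80, 20, 20
print(f"sensitivity={sensitivity(tp, fn):.2f}, PPV={ppv(tp, fp):.2f}")
```

The trade-off is visible in the formulas: loosening the criteria tends to move cases from false negative to true positive, raising sensitivity, while also adding false positives, lowering PPV.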
“Identification of patients with heart failure in real time can be particularly challenging,” the researchers wrote. “[H]eart failure is a clinical diagnosis with no biometric criterion standard and no medications that are specific to this disease.”
Overall, the researchers found that algorithms 3, 4, and 5 offered the greatest improvement in accuracy for identifying HF. In particular, algorithm 3 is considered easy to implement; algorithms 4 and 5 are more difficult to implement, but the investigators felt that the resulting cost “may be worth the improved performance, depending on clinical needs.”
“[E]arly identification of disease during hospitalization has become paramount to initiate transitional care,” they continued. “Our findings suggest that the problem list, which identified only half of hospitalized patients with [HF], is insufficient for real-time identification of this population.”
Limitations:
- The discharge diagnosis codes used are subject to misclassification, including false positive results.
- The physician medical record review criterion is subject to imperfect reliability among physicians, potentially resulting in sampling bias.
- The researchers may have missed potential contraindications to quality metrics.
- The study took place at a single institution, limiting generalizability to other institutions.
- The algorithms were not developed to identify outpatients with HF, and are applicable only to patients who are hospitalized.
Disclosures: Drs Sontag and Blecker report a pending patent for a machine-learning algorithm to predict diabetes. No other disclosures were reported.
- Blecker S, Katz SD, Horwitz LI, et al. Comparison of approaches for heart failure case identification from electronic health record data. JAMA Cardiol. 2016. doi: 10.1001/jamacardio.2016.3236
- Hartung DM, Hunt J, Siemienczuk J, Miller H, Touchette DR. Clinical implications of an accurate problem list on heart failure treatment. J Gen Intern Med. 2005;20(2):143-147.
- Samal L, Linder JA, Bates DW, Wright A. Electronic problem list documentation of chronic kidney disease and quality of care. BMC Nephrol. 2014;15:70.
- Holmes C, Brown M, Hilaire DS, Wright A. Healthcare provider attitudes towards the problem list in an electronic health record: a mixed-methods qualitative study. BMC Med Inform Decis Mak. 2012;12:127.
- Szeto HC, Coleman RK, Gholami P, Hoffman BB, Goldstein MK. Accuracy of computerized outpatient diagnoses in a Veterans Affairs general medicine clinic. Am J Manag Care. 2002;8(1):37-43.