Heart Failure Scoring Systems Reliable Diagnostically; Prognostic Quality Questionable

HFA-PEFF and H2FPEF scores are reliable tools for diagnosis of heart failure with preserved ejection fraction (HFpEF); however, using HFpEF scores for prognosis should be further examined in larger, rigorously phenotyped populations.

Two scoring systems for heart failure with preserved ejection fraction (HFpEF), the HFA-PEFF and the H2FPEF, have demonstrated reliability as diagnostic tools. Their prognostic utility, though, requires further validation, according to research published in ESC Heart Failure.

Using participant data from 2 clinical trials, TOPCAT (ClinicalTrials.gov identifier NCT00094302) and RELAX (ClinicalTrials.gov identifier NCT00763867), researchers sought to evaluate the generalizability of the HFA-PEFF and H2FPEF scoring systems, as well as their comparative prognostic utility.

Investigators also compared the diagnostic utility of these scores in age-, sex-, and race-matched participants from the Atherosclerosis Risk in Communities (ARIC) study who had no history of cardiovascular disease.

The HFA-PEFF is measured on a 0-6 scale and incorporates 3 domains: functional, morphological, and biomarker; scores ≥5 are considered diagnostic for HFpEF. The H2FPEF score, measured on a 0-9 scale, awards points for body mass index, use of ≥2 anti-hypertensive medications, presence of atrial fibrillation, pulmonary artery systolic pressure, age ≥60 years, and Doppler echocardiographic E/e’ ≥9.

H2FPEF scores of 6 or higher are considered high. Both HFA-PEFF and H2FPEF scores were available in 264 TOPCAT participants, 188 RELAX participants, and 362 ARIC participants.
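
The score cut-offs described above can be sketched as a small helper. This is an illustration only, not the investigators' code: the function names are hypothetical, the high thresholds (≥5 for HFA-PEFF, ≥6 for H2FPEF) come from the text, and the boundary between low (0-1) and intermediate is an assumption based on the commonly published score definitions rather than on this article.

```python
def hfa_peff_category(score: int) -> str:
    """Map an HFA-PEFF total (0-6) to a likelihood category."""
    if not 0 <= score <= 6:
        raise ValueError("HFA-PEFF score must be between 0 and 6")
    if score >= 5:   # from the article: scores >=5 are considered diagnostic
        return "high"
    if score >= 2:   # assumption: 2-4 is treated as intermediate
        return "intermediate"
    return "low"     # assumption: 0-1 is treated as low


def h2fpef_category(score: int) -> str:
    """Map an H2FPEF total (0-9) to a likelihood category."""
    if not 0 <= score <= 9:
        raise ValueError("H2FPEF score must be between 0 and 9")
    if score >= 6:   # from the article: scores of 6 and over are high
        return "high"
    if score >= 2:   # assumption: 2-5 is treated as intermediate
        return "intermediate"
    return "low"     # assumption: 0-1 is treated as low
```

Under these assumed cut-offs, a participant with a total of 5 would be high on the HFA-PEFF but intermediate on the H2FPEF, which shows how the 2 scores can disagree for the same person.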

HFA-PEFF scores were available in 356 participants from the TOPCAT trial, 216 participants from the RELAX trial, and 362 participants from ARIC.

Median HFA-PEFF score was 5 (interquartile range [IQR], 5-6), 4 (IQR, 2-4), and 3 (IQR, 2-4) in the TOPCAT, RELAX, and ARIC groups, respectively. Among TOPCAT and RELAX participants, these scores were similar across sex, race, and treatment strategy compared with placebo.

A high score (≥5) was found in 80.3% of TOPCAT participants and 20.4% of RELAX participants. Within the ARIC population, scores varied by sex but not race, and HFA-PEFF scores were primarily intermediate (68.5%).

H2FPEF scores were available in 214, 188, and 362 participants from the TOPCAT, RELAX, and ARIC groups, respectively.

Median H2FPEF scores were 5.5 (IQR, 4-7), 6 (IQR, 4-7), and 3 (IQR, 2-4) in the TOPCAT, RELAX, and ARIC groups, respectively. In the TOPCAT population, scores differed by sex and race but were similar among participants enrolled via natriuretic peptide vs hospitalization criteria. RELAX scores were similar by sex, gender, and treatment strategy. ARIC participant scores were similar across sex and race; this group primarily had intermediate H2FPEF scores (84.5%).

Among participants who had both scores calculated (TOPCAT: n=264; RELAX: n=188), applying the HFA-PEFF categories to participants already categorized by H2FPEF score reclassified 52.7% and 50.5% of participants in TOPCAT and RELAX, respectively.
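
Reclassification in this sense means a participant falls into a different category under one score than under the other. A minimal sketch of that calculation follows; the function name and the 4-participant example are hypothetical and are not trial data.

```python
def reclassified_fraction(categories_a, categories_b):
    """Fraction of participants whose category differs between two scores.

    Both arguments are equal-length sequences of category labels,
    aligned so that index i refers to the same participant in each.
    """
    if len(categories_a) != len(categories_b):
        raise ValueError("both sequences must cover the same participants")
    if not categories_a:
        raise ValueError("at least one participant is required")
    changed = sum(1 for a, b in zip(categories_a, categories_b) if a != b)
    return changed / len(categories_a)


# Hypothetical example: 2 of 4 participants change category.
h2fpef = ["high", "high", "intermediate", "low"]
hfa_peff = ["high", "intermediate", "intermediate", "intermediate"]
fraction = reclassified_fraction(h2fpef, hfa_peff)
```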

According to the researchers, both scores generally performed well for diagnostic purposes in the study populations. Specificity and positive predictive value (PPV) were 78.2% and 76.2%, respectively, for the HFA-PEFF score and 92.5% and 85.6%, respectively, for the H2FPEF score.
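
For readers less familiar with these metrics, specificity is the share of people without the condition who test negative, and PPV is the share of positive results that are true positives. The sketch below uses the standard definitions; the function names and counts are hypothetical and are not the study's data.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """TN / (TN + FP): proportion of unaffected people correctly ruled out."""
    return true_negatives / (true_negatives + false_positives)


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """TP / (TP + FP): proportion of positive results that are correct."""
    return true_positives / (true_positives + false_positives)


# Hypothetical counts for illustration only.
tp, fp, tn, fn = 80, 20, 80, 20
spec = specificity(tn, fp)
ppv = positive_predictive_value(tp, fp)
```

Unlike specificity, PPV depends on how common the condition is in the group tested, so values from a matched case-control comparison may not carry over to populations with a different prevalence.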

For each 1-point increase in HFA-PEFF score, the hazard of adverse cardiovascular events increased by 26% (hazard ratio [HR], 1.26; 95% CI, 0.98-1.63), although the confidence interval included 1; the HR for each 1-point increase in H2FPEF score was 1.01 (95% CI, 0.88-1.15). Researchers noted similar associations when scores were assessed as categorical measures.
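
A per-point HR can be read as a multiplier that compounds across points, provided the score enters the model as a continuous linear term (an assumption here, since the model specification is not described in this article). The sketch below, with hypothetical helper names, applies that reading to the reported figures and checks whether each confidence interval excludes the null value of 1.

```python
def hr_for_increase(hr_per_point: float, points: int) -> float:
    """Implied hazard ratio for a multi-point increase, assuming a
    log-linear (continuous) effect of the score."""
    return hr_per_point ** points


def ci_excludes_null(lower: float, upper: float) -> bool:
    """True when a ratio's confidence interval does not contain 1."""
    return lower > 1.0 or upper < 1.0


# Figures reported in the article.
hfa_peff_two_point_hr = hr_for_increase(1.26, 2)       # about 1.59
hfa_peff_significant = ci_excludes_null(0.98, 1.63)    # interval spans 1
h2fpef_significant = ci_excludes_null(0.88, 1.15)      # interval spans 1
```

Both intervals contain 1, which is consistent with the article's caution about the scores' prognostic utility.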

After multivariable-adjusted logistic regression, the odds ratios for the primary outcome and HF hospitalizations were 2.06 and 2.07, respectively, for HFA-PEFF scores. For H2FPEF scores, the odds ratios were 1.12 and 1.08, respectively.

Higher H2FPEF scores were also associated with lower maximal oxygen consumption (VO2 max) in both unadjusted and adjusted models (β, -0.51 and -0.026, respectively); the β estimates for the association of HFA-PEFF with VO2 max were -0.1 and -0.02 in unadjusted and adjusted models, respectively.

Study limitations included the presumption that all participants had HFpEF even though inclusion criteria differed between the 2 trials, the potential for selection bias, and an inability to estimate the results of step 3 of the HFA-PEFF algorithm because of the study design.

“Despite recent advances in the understanding of the pathophysiology of HFpEF, the diagnosis of HFpEF remains challenging,” the researchers concluded. “Both the HFA-PEFF and H2FPEF scoring systems are reliable diagnostic tools for HFpEF patients. Further research in large, diverse, and well-phenotyped population-based cohorts is needed to substantiate the validity of the scoring systems and to simplify the diagnosis in those with intermediate-likelihood of HFpEF based on these scores.”


Parcha V, Malla G, Kalra R, et al. Diagnostic and prognostic implications of heart failure with preserved ejection fraction scoring systems. ESC Heart Fail. Published online March 11, 2021. doi:10.1002/ehf2.13288