A minimal clinically important difference measured by the Cambridge Pulmonary Hypertension Outcome Review for patients with idiopathic pulmonary arterial hypertension

Several patient-reported outcome measures have been developed to assess health status in pulmonary arterial hypertension. However, the change in instrument scores required to be seen as meaningful to the individual remains unknown. We sought to identify minimal clinically important differences in the Cambridge Pulmonary Hypertension Outcome Review (CAMPHOR) and to validate these against objective markers of functional capacity. Minimal clinically important differences were established from a discovery cohort (n = 129) of consecutive incident cases of idiopathic pulmonary arterial hypertension with CAMPHOR scores recorded at treatment-naïve baseline and 4–12 months following pulmonary arterial hypertension therapy. An independent validation cohort (n = 87) was used to verify minimal clinically important differences. Concurrent measures of functional capacity relative to CAMPHOR scores were collected. Minimal clinically important differences were derived using anchor- and distributional-based approaches. In the discovery cohort, mean (SD) age was 54.4 (16.4) years and 64% were female. Most patients (63%) were treated with sequential pulmonary arterial hypertension therapy. Baseline CAMPHOR scores were: Symptoms, 12 (7); Activity, 12 (7) and quality of life, 10 (7). Pulmonary arterial hypertension treatment resulted in significant improvements in CAMPHOR scores (p < 0.05). CAMPHOR minimal clinically important differences averaged across methods for health status improvement were: Symptoms, –4 points; Activity, –4 points and quality of life, –3 points. CAMPHOR Activity score change ≥minimal clinically important difference was associated with significantly greater improvement in six-minute walk distance, in both discovery and validation populations. In conclusion, CAMPHOR scores are responsive to pulmonary arterial hypertension treatment. Minimal clinically important differences in pulmonary hypertension-specific scales may provide useful insights into treatment response in future clinical trials.


Introduction
Pulmonary arterial hypertension (PAH) is a rare disorder characterised by a progressive rise in mean pulmonary artery pressure and pulmonary vascular resistance, ultimately resulting in right heart failure and death. 1 PAH may present with a range of non-specific, yet debilitating symptoms which can affect health-related quality of life (HRQoL). [1][2][3] As there is presently no cure for PAH, pharmacotherapy remains the mainstay of treatment with the aim of slowing disease progression and alleviating symptoms. 1 Despite recent treatment advancements improving PAH survival, symptomatic burden remains high. [4][5][6][7][8] Conventionally, establishing drug efficacy in PAH clinical trials has relied upon observed changes in functional status and capacity. 4 It was not until relatively recently that composite morbidity and mortality end-points were employed in event-driven trials, as highlighted by Sitbon et al. 9 Whether selected trial end-points are relevant to the individual, however, remains unknown. As a result, there is an increasing awareness of the need for patient-reported outcomes (PROs) to be incorporated as secondary end-points in PAH clinical trials. 10 Despite this, PROs remain under-utilised. This, in part, is because changes in HRQoL measures have been more modest than those in objective end-points such as six-minute walk distance 11,12 or pulmonary haemodynamics. 4 Generic measures of HRQoL (e.g. SF-36) conventionally used in such trials, however, may lack sensitivity to detect change in specific disease processes such as PAH. 13 To address this, a number of pulmonary hypertension-specific HRQoL instruments have been developed and validated: Cambridge Pulmonary Hypertension Outcome Review (CAMPHOR), 14 EmPHasis-10, 2 Living with Pulmonary Hypertension 15 and Pulmonary Arterial Hypertension-Symptoms and Impact (PAH-SYMPACT). 16 Such disease-specific PROs have been shown to track functional status, clinical deterioration and prognosis in PAH. 12,17
The magnitude of improvement in these measures needed to be seen as meaningful by the individual, or the Minimal Clinically Important Difference (MCID), however, is unknown. This is of importance as, even in the absence of statistically significant changes in PRO end-points, interventions may still be of relevance to the patient. Furthermore, knowledge of a measure's MCID provides useful information regarding longitudinal changes in PROs and the monitoring of individual patients' clinical progression.
Although the MCID has become a standard approach in the interpretation of the clinical relevance of changes in PROs, there remains no 'gold-standard' for MCID estimation and methodological approaches remain much debated. 18 Broadly, MCIDs may be estimated using distributional- or anchor-based approaches. 19 Distributional methods rely on the statistical characteristics of scores around the mean (e.g. standard deviation (SD)) whilst anchor-based methods link changes in PRO scores to a second external measure of change, or the anchor, and are therefore presumed to be sample independent. Global assessments of health change are most frequently employed as anchors in MCID estimations and enable the direct association of PRO score change to a patient's preferences and values. 20 They are, however, subject to recall bias. 21 Given limitations in both distributional and anchor-based MCID methods, conventionally both methods are employed with MCIDs typically reported as the mean of combined estimates. 22 We provide the first estimation of an MCID in a pulmonary hypertension-specific PRO measure (CAMPHOR) using both distributional- and anchor-based methods. Furthermore, we demonstrate validation of these estimates using objective markers of PAH severity.

Methods
All incident and prevalent cases of idiopathic pulmonary arterial hypertension (IPAH) between January 2006 and June 2018 were eligible for inclusion. Follow-up was included until 1 June 2019. All patients were aged >18 years at the time of diagnostic right heart catheterisation. IPAH diagnosis and treatment were as per international guideline recommendations at the time of diagnosis. 1,23,24 Clinical data were collected prospectively into a dedicated pulmonary vascular diseases database.
The discovery cohort comprised all incident cases of IPAH with CAMPHOR scores available at treatment-naïve baseline and within 4-12 months following the initiation of PAH-specific therapy. A minimum follow-up of four months was chosen as this reflects the upper limit of the 12-16 week end-point historically employed for outcome assessment in PAH clinical trials. 9 A maximum follow-up interval of 12 months was chosen as, beyond this, it is increasingly difficult to attribute changes directly to the initiation of drug therapy.
The validation dataset was formed of incident and prevalent cases of IPAH not included in the discovery cohort with at least two serial CAMPHOR scores recorded at any time point until end of follow-up. In the main, individuals in the validation cohort were prevalent cases diagnosed before the routine clinical use of CAMPHOR in 2006. Baseline pre-treatment PRO scores were therefore unavailable for this group. A minimum time period of six weeks between CAMPHOR scores was set to limit the testing effect of repeated measures completed within a short time-frame. No maximum time limit between CAMPHOR completion dates was set for the validation cohort. Prevalent cases underwent clinical review every six months as standard with CAMPHOR questionnaires completed at each visit. New York Heart Association (NYHA) functional class, six-minute walk distance (6MWD) and N-Terminal pro-Brain Natriuretic Peptide (NT-proBNP) levels concurrent to CAMPHOR completion were recorded where available.

CAMPHOR questionnaire
The CAMPHOR questionnaire contains 65 items measuring Symptoms (25 questions), Activity (15 questions) and quality of life (25 questions). Symptoms and quality of life are both scored out of 25, and Activity out of 30. Scores are negatively weighted so that a higher score reflects worse quality of life and greater functional limitation. 14 At the beginning of the CAMPHOR questionnaire, patients are also prompted to provide responses to two global ratings of health status. One question assesses current health status with available responses of poor/fair/good/excellent, and the other, change in health status relative to last clinical review with available responses ranging from 'significantly worse' to 'significantly better' on a seven-point scale.

Statistical analysis
Statistical analysis was performed using R version 3.6.1. 25 Data averages for continuous variables were reported as mean (SD) and categorical variables as n (percentage of total). As CAMPHOR scores are ordinal, values were rounded to the nearest whole number. Paired t-tests were used to compare CAMPHOR scores at baseline and post-treatment. Reported p-values were adjusted for multiple comparisons by false discovery rate at 5%, where necessary. Survival was calculated using a censoring date of last clinic visit or 1 June 2019, whichever was later. The National Health Service summary care record tracking system was used to ascertain survival status (searched 1 June 2019). Cox proportional hazards models were used to assess associations between baseline characteristics and five-year survival.
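The false discovery rate adjustment mentioned above can be illustrated with a short sketch. The text does not name the procedure used; the Benjamini-Hochberg method is assumed here because it is the usual meaning of an "FDR" adjustment, and the function name `bh_adjust` and the raw p-values are hypothetical and do not come from the study.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: the study's 'false discovery rate' adjustment is taken to be
    Benjamini-Hochberg; the source does not state the exact procedure.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for three subscale comparisons (not study data).
raw = [0.001, 0.020, 0.030]
adjusted = bh_adjust(raw)
# A comparison is declared significant at a 5% false discovery rate.
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values in the same order as the raw ones.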

MCID estimation
In the absence of gold-standard methodology, MCID estimates were based upon prevailing methodological approaches reported in systematic review 22 and expert opinion. [18][19][20][21]
Distributional-based MCID estimation. MCIDs were estimated using three distributional-based approaches:
SD: the SD represents the variation among a group of scores. As 0.5-SD is widely accepted as corresponding to the MCID, 26 this statistic was adopted for the purposes of MCID estimation in this study. The SD of all scores for each of the three CAMPHOR subdomains was divided by 2 to derive 0.5 SD.
Effect size (ES): ES is a standardised measure of change which can be expressed mathematically as 27,28
$$\mathrm{ES} = \frac{\bar{x}_{\mathrm{post}} - \bar{x}_{\mathrm{baseline}}}{\sigma_x}$$
where $\sigma_x$ = the standard deviation at baseline. MCIDs expressed as ES reduce bias which mainly results from dependency on baseline scores. 22 As the MCID of a scale is generally considered to correspond to a small ES (0.2), the above formula was re-arranged so that the ES MCID was attained by multiplying the SD of baseline scores by 0.2. 29,30
Standard error of measurement (SEM): the SEM for each of the three CAMPHOR subscales was derived using the calculation 28
$$\mathrm{SEM} = \sigma_x \sqrt{1 - r_{xx}}$$
where $\sigma_x$ = the standard deviation at baseline and $r_{xx}$ = the reliability of the HRQoL measure. Test-retest reliability coefficients for each of the three CAMPHOR subscales have been established and validated: Symptoms, 0.92; Activity, 0.86 and quality of life (QoL), 0.92. 14 A number of thresholds have been suggested (1-, 1.96- and 2-SEM) when employing the SEM in MCID estimation. [31][32][33] As the most widely validated is 1-SEM, this was used for the purposes of this study. 31
Anchor-based MCID estimation. Anchor-based MCID estimations were attained using within-person and sensitivity-based approaches using methods similar to Van Der Roer et al.: 34
Within-person global ratings change: this is the first and most widely used of the anchor-based MCID approaches. 20,22 It defines the MCID as 'the change in PRO scores of a group of patients selected according to their answers to a global assessment scale' which serves as the anchor. 20 A seven-point global rating of health status change was utilised as an anchor for this study with available ratings of 'very much worse' (-3) to 'very much better' (+3). The MCID was calculated as the mean score change from baseline to post-treatment of those who reported they were 'moderately better' (+2) compared to initial baseline review.
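As an illustration of the three distributional estimates described above (0.5 SD, an effect size of 0.2, and 1-SEM), the arithmetic can be sketched as follows. This is a minimal sketch and not the authors' analysis code: the function names are hypothetical, the SD of 7 points is an illustrative input assumed for both the SD of all scores and the baseline SD, and only the reliability of 0.92 is taken from the text.

```python
import math


def distributional_mcids(sd_all, sd_baseline, reliability):
    """Return the three distributional MCID estimates, unrounded.

    sd_all      -- SD of all scores for the subscale (used for 0.5 SD)
    sd_baseline -- SD of baseline scores (used for ES and SEM)
    reliability -- test-retest reliability coefficient of the subscale
    """
    return {
        "half_sd": 0.5 * sd_all,
        "effect_size": 0.2 * sd_baseline,
        "sem": sd_baseline * math.sqrt(1.0 - reliability),
    }


def round_half_up(x):
    """Round a non-negative value to the nearest whole point (0.5 rounds up)."""
    return math.floor(x + 0.5)


# Illustrative inputs only: an SD of 7 points is assumed for both SDs; 0.92 is
# the Symptoms test-retest reliability quoted in the text.
estimates = distributional_mcids(sd_all=7.0, sd_baseline=7.0, reliability=0.92)
rounded = {name: round_half_up(value) for name, value in estimates.items()}
print(estimates)
print(rounded)
```

Rounding is done explicitly with a half-up rule because scores are reported in whole points and Python's built-in `round` rounds halves to the nearest even number.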
Sensitivity and specificity analysis: Receiver operating characteristic (ROC) curves were used to determine the score change from treatment-naïve baseline to post-treatment with equal sensitivity and specificity to discriminate between 'improved' and 'unchanged' patients. Improved patients were those reporting a health status change of 'moderately better' (+2) or 'very much better' (+3) compared to treatment-naïve baseline. Unchanged patients were those who reported a health status change of 'a little worse/better' (±1) or 'no change' (0) from baseline.
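The equal sensitivity and specificity criterion can be sketched as a search over candidate cut-offs. This is a simplified illustration, not the ROC software used in the study: the function name and all score changes below are hypothetical, and a patient is classed as improved when the score change is at or below the cut-off, since lower CAMPHOR scores indicate better health.

```python
def equal_sens_spec_cutoff(changes, improved):
    """Find the score-change cut-off where sensitivity and specificity are closest.

    changes  -- score change per patient (post-treatment minus baseline)
    improved -- True if the anchor rating classed the patient as 'improved',
                False if 'unchanged'
    Returns (cutoff, sensitivity, specificity).
    """
    n_improved = sum(improved)
    n_unchanged = len(improved) - n_improved
    if n_improved == 0 or n_unchanged == 0:
        raise ValueError("both improved and unchanged patients are required")

    best = None
    for cutoff in sorted(set(changes)):
        # True positives: improved patients with a change at or below the cut-off.
        tp = sum(1 for c, imp in zip(changes, improved) if imp and c <= cutoff)
        # True negatives: unchanged patients with a change above the cut-off.
        tn = sum(1 for c, imp in zip(changes, improved) if not imp and c > cutoff)
        sens = tp / n_improved
        spec = tn / n_unchanged
        gap = abs(sens - spec)
        if best is None or gap < best[0]:
            best = (gap, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical score changes (not study data): four improved, five unchanged.
changes = [-8, -6, -5, -3, -4, -2, 0, 1, 2]
improved = [True, True, True, True, False, False, False, False, False]
cutoff, sens, spec = equal_sens_spec_cutoff(changes, improved)
print(cutoff, sens, spec)
```

With small samples sensitivity and specificity are rarely exactly equal, so the sketch returns the cut-off at which the two are closest.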

Results
There were 184 consecutive incident cases of IPAH during the study period, of whom 129 patients had available pre- and post-PAH treatment CAMPHOR scores and formed the discovery cohort. The characteristics of incident IPAH patients included and excluded from the discovery cohort did not differ (online Appendix Table 1).
Discovery cohort patient demographics and characteristics at treatment-naive baseline and post PAH treatment are outlined in Table 1. Mean (SD) age was 54.4 (16.4) years and 64% were female. Three individuals were vasoresponders to nitric oxide. The majority of patients (63%) were treated following PAH diagnosis with oral monotherapy (sequential therapy). Twenty-four patients (22%) were treated with upfront parenteral prostanoid therapy. In the five years following IPAH diagnosis, 32 patients in the discovery cohort died and one underwent lung transplantation.
A total of 117 individuals provided global ratings of health change following PAH treatment. The seven potential responses of 'very much worse' to 'very much better' were discriminated by CAMPHOR subscale scores (p < 0.001, all; Table 3). Frequencies of reported change are displayed in online Appendix Figure 1. Thirty-five individuals (30%) reported a change in health status of at least 'moderately better' with PAH treatment and were considered to have 'improved'.

Anchor and distributional MCID estimates
Distributional MCIDs were calculated using 0.5 SD, the ES and SEM as described above. For the Symptoms domain, distributional MCIDs were 4 points, 1 point and 2 points, respectively; for the Activity domain: 4 points, 1 point and 3 points, respectively and for QoL: 4 points, 2 points and 2 points (Table 4).
Anchor-based MCIDs generated from the mean change in CAMPHOR score for those who reported feeling 'moderately better' post PAH treatment were: Symptoms, Final MCID estimates derived by taking the mean of distributional- and anchor-based results were: Symptoms, 4 points; Activity, 4 points and QoL, 3 points. MCIDs in CAMPHOR subscale scores between treatment-naïve baseline and post-PAH treatment were achieved by 41 patients (32%) for Symptoms, 39 patients (30%) for Activity and 47 patients (36%) for QoL. Seventeen patients (13%) achieved the MCID across all three scales.

MCID validation
MCIDs were first compared against objective markers of PAH severity in the discovery cohort. The CAMPHOR Activity domain had the strongest correlation with 6MWD (Pearson r = 0.70, p < 0.001). Those attaining the Activity MCID had a greater change in 6MWD (82.3 (80.8) m versus 38.8 (75.3) m; p = 0.030) from a lower baseline 6MWD (250 (108) m versus 311 (125) m; p = 0.015) than those who did not. Activity MCID achievement was also associated with a greater fall in NT-proBNP level (-1094 (1948) versus -448 (1736) pg/ml) and an increased frequency of NYHA functional class improvement (42% vs. 33%), although these did not reach statistical significance (p = 0.106 and p = 0.522, respectively).
MCID estimates were further verified in a validation dataset comprising 1008 CAMPHOR observations with contemporaneous 6MWD measurements across 87 incident and prevalent cases of IPAH. Mean interval between observations was 188 days. Of the three CAMPHOR subdomains, the Activity scale again had the highest correlation with 6MWD (r = -0.58, p < 0.001). There were 94 instances of Activity MCID (-4 points) attainment between serial CAMPHOR observations during the study period. Change in 6MWD associated with Activity MCID attainment was 31.4 (64) m compared to -4.6 (35) m for episodes of Activity MCID non-attainment (p < 0.001).
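The validation step above, in which episodes are grouped by Activity MCID attainment and the accompanying change in 6MWD is compared, can be sketched as follows. This is a minimal illustration only: the function name and the paired changes are hypothetical, and only the -4 point threshold comes from the text. No significance test is included.

```python
import statistics

ACTIVITY_MCID = -4  # points; improvement threshold for the Activity scale


def split_by_mcid(pairs, mcid=ACTIVITY_MCID):
    """Group 6MWD changes by whether the Activity MCID was attained.

    pairs -- list of (activity_score_change, walk_distance_change_m) between
             two serial observations
    A change at or below the (negative) threshold counts as attainment,
    because lower CAMPHOR scores indicate less limitation.
    """
    attained = [walk for activity, walk in pairs if activity <= mcid]
    not_attained = [walk for activity, walk in pairs if activity > mcid]
    return attained, not_attained


# Hypothetical serial changes (not study data).
pairs = [(-6, 60), (-4, 30), (-1, 5), (0, -10), (2, -20), (-5, 45)]
attained, not_attained = split_by_mcid(pairs)
mean_attained = statistics.mean(attained)
mean_not_attained = statistics.mean(not_attained)
print(len(attained), mean_attained, len(not_attained), mean_not_attained)
```

In practice the two groups would then be compared with an appropriate statistical test, which is omitted here because the text does not specify the one used for this comparison.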

Discussion
The symptomatic burden of PAH and its effects on HRQoL are widely known. [4][5][6][7][8]35 Changes in HRQoL in response to PAH treatment in the 'real world' setting, however, remain poorly understood. This is the first systematic study to directly assess the impact of PAH therapy on HRQoL outside of the clinical trial setting using a PH-specific PRO measure, CAMPHOR. We demonstrate significant improvements in each of the three CAMPHOR subdomains with PAH therapy, alongside improvements in objective measures of PAH treatment response: functional class, exercise capacity and NT-proBNP level. Using distributional and anchor-based methods, we propose minimum thresholds of CAMPHOR score change deemed clinically relevant to individuals with PAH, or the MCID. Attainment of the Activity scale MCID (four-point change) was associated with a significantly greater increase in exercise capacity in both incident and prevalent populations compared to those who failed to achieve the required change.
As with previous studies utilising CAMPHOR, we demonstrate moderate correlations between CAMPHOR subdomains and exercise capacity; 14 the Activity subscale having the strongest relationship with six-minute walk distance (r = 0.70). Correlations compare favourably with those derived using generic (SF-36, r = 0.40-0.60) and PH-specific (PAH-SYMPACT, r = 0.14-0.57) PRO measures and reinforce the CAMPHOR as an excellent surrogate of functional limitation. [36][37][38] Treatment-naive CAMPHOR scores were weakly correlated with diagnostic haemodynamics (r = 0.04-0.33), suggesting that factors beyond PAH haemodynamic severity measured at rest influence symptomatic burden. To our knowledge, this is the first study to relate haemodynamics to a pulmonary hypertension-specific PRO and reinforces similar findings from the use of generic HRQoL measures. 37
Whilst anecdotally there is the perception that PAH therapies improve patients' HRQoL, there are limited 'real world' data to support subjective physician experience. Although PROs have been incorporated as secondary endpoints in PAH clinical trials, these have relied upon generic HRQoL measures which may be less sensitive to change in specific disease processes such as PAH. This may at least partly explain why only modest changes in PRO endpoints have been observed to date. 13 In this study, we demonstrate significant improvement in CAMPHOR scores following initiation of PAH therapy (Symptoms, p = 0.001; Activity, p = 0.041; QoL, p = 0.009). Furthermore, improvements in CAMPHOR score tracked changes in objective measures of treatment response including six-minute walk distance, NYHA functional class and plasma NT-proBNP level, reflecting the ability of CAMPHOR scores to detect change and be responsive over time.
Survival from PAH diagnosis was associated with lower baseline CAMPHOR Activity score, younger age at diagnosis, female sex and greater exercise capacity. Whilst our limited sample size precluded a robust evaluation of the additional contribution of PROs to prognostication using proposed risk stratification tools, the significance of baseline CAMPHOR scores in predicting long-term survival highlights the insights that can be gained simply from patient perceptions alone.
In chronic diseases such as PAH where there is no 'cure', understanding clinically important change to patients becomes more relevant. To our knowledge, this is the first study to quantify an MCID in IPAH for a disease-specific PRO measure. Standardised methodology for MCID estimation has yet to be determined. Both distributional and anchor-based approaches have their limitations which have been extensively discussed elsewhere. 18,21 Methodology in this study was based on prevailing consensus opinion. As global ratings of health change are the most commonly used measure when attempting to identify within-patient change, this approach was adopted for anchor-based estimations. 18,21,22 Across five methods (three distributional and two anchor-based), MCIDs for improvement in CAMPHOR were: Symptoms, -4 points; Activity, -4 points and QoL, -3 points. A third of incident patients achieved an MCID in at least one of the three CAMPHOR subdomains with PAH therapy. Change in CAMPHOR Activity scale score that was equal to, or greater than, the MCID was associated with a significantly greater increase in six-minute walk distance from diagnostic baseline compared to those who did not attain the MCID threshold (82.3 m vs. 38.8 m; p = 0.03). Greater improvements in NT-proBNP levels and NYHA functional class were also seen in those who attained the MCID although these did not reach statistical significance. As change in CAMPHOR score was only weakly associated with baseline scores, MCID thresholds should be relevant irrespective of the initial degree of HRQoL impairment.
The longitudinal relevance of CAMPHOR MCID estimates was demonstrated in an extended validation cohort comprising 1008 instances of CAMPHOR score completion in 87 incident and prevalent cases of IPAH. Once again, a change in CAMPHOR Activity score at least equivalent to the MCID of -4 points between serial CAMPHOR measures was associated with greater gains in six-minute walk distance compared to individuals who did not report an MCID threshold change (31.4 (64) m versus -4.6 (35) m). This distance of 31 m associated with CAMPHOR Activity MCID attainment compares favourably to direct estimates of the MCID in 6MWD for PAH of 33 m, and reinforces the utility of these values in determining relevant change at the level of the individual. 37 The validation of MCID estimates in both incident and prevalent populations not only enables a better understanding of the effects of intervention at the cohort level (e.g. when submitted to clinical trials) but also provides useful insights when monitoring individual patients' progress.
Whilst this study has a number of strengths, we acknowledge that although patient characteristics were similar to those of well-published PAH cohorts, data from this study originate from a single pulmonary hypertension centre and may therefore be subject to bias. CAMPHOR has also received criticism for being more time intensive than other available pulmonary hypertension QoL measures; however, it remains the most validated, with adaptation for use in, but not limited to, the United States, 39 Australia/New Zealand, 40 Portugal, 41 Germany, 42 the Netherlands, 43 Poland, 44 Brazil, 45 Croatia, 46 French/English Canada 47 and Colombia. 48 This provides plentiful opportunity for the external validation of MCID estimates to assess their robustness, which is not afforded by other measures at present. Moreover, CAMPHOR is the only clinically utilised measure inclusive of a global assessment of health. As global assessments of health change enable the direct association of PRO score change to a patient's preferences and values, the inclusion of anchor-based methodology in any MCID estimation is generally considered mandatory. 22 One further limitation to our study is the absence of a gold-standard methodology for MCID estimation. We have, however, aligned our methodological approach to prevailing consensus opinion and indeed have derived estimates using approaches in excess of those seen in the majority of other published MCID works, refining the accuracy of our estimates to the best available. 22 Further work is required to determine universally accepted methods for MCID estimation which may be of relevance to our work in the future.
In conclusion, we have established MCIDs for patient-relevant clinical improvement in CAMPHOR subscale scores and demonstrate the correlation of the CAMPHOR Activity subscale to functional capacity. MCIDs in a pulmonary hypertension-specific PRO measure provide useful insights when monitoring individual patients' progress and allow for a better understanding of the effects of intervention at the cohort level (e.g. when submitted to clinical trials).

Contributorship
K.B. had full access to all of the data in the study and takes responsibility for the integrity of the data and accuracy of the data analysis. A.M., N.A., S.A., N.D., J.C., K.S., N.S., D.T., M.T. and J.P-Z. contributed substantially to the study design, data analysis and interpretation and review/revisions of the manuscript. J.P-Z. provided overall supervisory support and approved the final version of the manuscript.

Ethical approval
This project was deemed to be Research without Requiring Ethics by Royal Papworth Hospital Research and Development department (Reference No: P02368).