Clinical trial protocol for TRANSFORM-UK: A therapeutic open-label study of tocilizumab in the treatment of pulmonary arterial hypertension

Our aim is to assess the safety and potential efficacy of a novel treatment paradigm in pulmonary arterial hypertension (PAH), immunomodulation by blocking interleukin-6 (IL6) signaling with the IL6 receptor antagonist, tocilizumab. Inflammation and autoimmunity are established as important in PAH pathophysiology. One of the most robust observations across multiple cohorts in PAH has been an increase in IL6, both in the lung and systemically. Tocilizumab is an IL-6 receptor antagonist established as safe and effective, primarily in rheumatoid arthritis, and has shown promise in scleroderma. In case reports where the underlying cause of PAH is an inflammatory process such as systemic lupus erythematosus, mixed connective tissue disease (MCTD), and Castleman’s disease, there have been case reports of regression of PAH with tocilizumab. TRANSFORM-UK is an open-label study of intravenous (IV) tocilizumab in patients with group 1 PAH. The co-primary outcome measures will be safety and the change in resting pulmonary vascular resistance (PVR). Clinically relevant secondary outcome measurements include 6-minute walk distance, WHO functional class, quality of life score, and N-terminal pro-brain natriuretic peptide (NT-proBNP). If the data support a potentially useful therapeutic effect with an acceptable risk profile, the study will be used to power a Phase III study to properly address efficacy.


Introduction
Pulmonary arterial hypertension (PAH) comprises a group of orphan diseases historically associated with a poor prognosis. In the last 20 years, four classes of drug therapy targeting vasoactive pathways have been studied in randomized controlled trials (RCTs) and licensed for the treatment of predominantly group 1 PAH. These therapies have demonstrated moderate success, with meta-analyses of all RCT data suggesting a short-term improvement in mortality at 14 weeks. 1 Despite this, PAH in the UK still carries a fiveyear survival in idiopathic PAH (IPAH) of 61% 2 and as low as 49% for PAH associated with connective tissue diseases. 3,4 Therefore, there remains an urgent need for the development of new treatments, particularly as the results from combination studies of these different classes of vasoactive therapies has been, to date, mixed and disappointing. 5 The strong association of PAH with dysregulated immunity and inflammation has been long established with autoimmune diseases, 6 most prominently scleroderma, but also notably rheumatoid arthritis (RA), systemic lupus erythematosus (SLE), mixed connective tissue disease (MCTD), and Sjogren's syndrome. 7 Auto-immune diseases, in particular, are therefore recognized as causally associated with PAH, but there is also an association of the idiopathic form of PAH (IPAH) with auto-immune thyroid disease, links to HLA subtypes 8 and the presence of auto-antibodies in up to 93% of patients. 9,10 IPAH has previously been speculated historically to be an autoimmune disease. 11 More locally, within the pulmonary vascular lesions, there is accumulation of inflammatory cells including T and B lymphocytes 12 with altered T regulatory cell function 13,14 and changes in B cell gene expression. 15 It is clear, therefore, that inflammation and dysregulated immunity play a significant role in a spectrum of causes of PAH. From the perspective of identifying pathways that are targetable, IL-6 has emerged as a strong candidate. IL-6 has been well-characterized as raised in peripheral blood and within the lung in PAH 12,16 and is an independent marker of prognosis outperforming traditional markers of cardiac function such as NT-proBNP. 17 Over-expression of IL-6 in animal models using transgenic mice leads to pulmonary hypertension 18 and in hypoxia, IL-6 deficient mice are protected. 19 Administration of recombinant IL-6 to rats also recapitulates a PAH phenotype. 20 Tocilizumab is an IL-6 receptor antagonist established as safe, well tolerated, and effective, primarily in RA, 21 and has shown promise in scleroderma. 22 In uncommon cases, where the underlying cause of PAH is an established inflammatory process such as SLE, MCTD, and Castleman's disease, there have been case reports of regression of PAH with tocilizumab. [23][24][25] We therefore propose a phase II open-label proof of concept study of tocilizumab in group I PAH.

Hypothesis
Immunomodulation utilizing interleukin-6 (IL6) receptor antagonism is a novel treatment strategy for patients with group 1 PAH and will improve pulmonary hemodynamic parameters.

Study participants
Patients will be recruited from seven adult specialist PH centers in the UK: Papworth Hospital, Cambridge; Golden Jubilee Hospital, Glasgow; Freeman Hospital, Newcastle; Royal Hallamshire, Sheffield; Hammersmith Hospital, London; Royal Brompton Hospital, London; Royal Free Hospital, London; and Imperial College, London. We aim to recruit 21 patients with a 15% drop-out rate and with provision for replacement. Study entry criteria (Table 1) and exclusion criteria (

Study design rationale
We have taken a conservative safety-led open-label trial design approach (Fig. 1). This has been driven by a safety-first approach but also from previous trial experience in both open-label and RCTs of the stability of PVR as a primary endpoint. Delta change in PVR over a six-month period in phase 2 and 3 trials demonstrate little evidence of a significant placebo response with the majority of trials reporting a deterioration over four to six months (Fig. 2). The timeframe of six months was informed by the duration of clinical response in other autoimmune indications and, in particular, RA and scleroderma.

Recruitment
Participants will be identified using data collected during their routine outpatient appointment at each of the pulmonary hypertension centers.

Investigational product
IV therapy will be administered at a dose of 8 mg/kg once monthly for six months. Other PAH therapies may not be added unless an individual has experienced a clinical failure event. A clinical failure event is defined as the following: worsening of PAH; initiation of treatment with intravenous or subcutaneous prostanoids; lung transplantation; or atrial septostomy or death from any cause up to the end of treatment. Worsening of PAH is defined by the occurrence of all three of the following: . a decrease in the 6-minute walk distance (6MWD) of at least 15% from baseline, confirmed by a second 6-minute walk test (6MWT) performed on a different day within two weeks; . the need for additional treatment for PAH; . worsening of symptoms of PAH includes at least one of the following: a change from baseline to a higher WHO functional class (or no change in patients who were in WHO functional class IV at baseline) and the appearance or worsening of signs of right heart failure that did not respond to oral diuretic therapy.

Study assessments and procedures
Safety assessment, primary and secondary outcome data will be undertaken as per the assessment schedule (data supplement). Primary and secondary endpoints are listed below.

Primary endpoints
. Safety as defined by the incidence and severity of adverse events . Pulmonary vascular resistance (dynes.s/cm 5 ) measured using invasive hemodynamic assessment by right heart catheter Clinical secondary endpoints

Statistical considerations
This is a proof-of-concept study and the sample size has been determined with respect to safety (in terms of exposure 3. Participant has severe hepatic impairment (Child-Pugh class C with or withoutcirrhosis) at the screening visit. 4. Patient with ALT or AST > 5Â upper limit of normal. Hematology and bleeding disorders 1. Individual has clinically significant anemia in the opinion of the investigator, in particular from pyruvate kinase and G6PD deficiencies. 2. Participants with bleeding disorders or significant active peptic ulceration in the opinion of the investigator. 3. Individual has peripheral blood platelets < 100 Â 10 9 /L. 4. Participant has a neutrophil count < 2 Â 10 9 /L. Cardiovascular 1. Individual has had an acute myocardial infarction within the last 90 days before screening. NB: Participants must not have three or more of the following left ventricular disease/dysfunction risk factors: 2. Body mass index (BMI) ! 35 3. Historical evidence of significant coronary disease established by any one of: history of myocardial infarction; history of percutaneous intervention; angiographic evidence of CAD (>50% stenosis in at least one vessel), either by invasive angiography or by CT angiography; positive stress test with imaging (either pharmacologic or with exercise); previous coronary artery surgery; chronic stable angina. General medical conditions 1. Individual with cardiovascular, liver, renal, hematologic, gastrointestinal, immunologic, endocrine, metabolic, or central nervous system disease that, in the opinion of the investigator, may adversely affect the safety of the participant and/or efficacy of the investigational product or severely limit the lifespan of the individual other than the condition being studied. 2. Participant has a history of malignancies within the past five years, except for an individual with localized, non-metastatic basal cell carcinoma of the skin, in situ carcinoma of the cervix, or prostate cancer who is not currently or expected, during the study, to undergo radiation therapy, chemotherapy, and/or surgical intervention, or to initiate hormonal treatment. to the drug and investigations) and feasibility (patient population). We have intentionally only powered the study to pick up large effect sizes. The primary outcome is PVR fold change from baseline after six months of treatment, i.e. PVR 6months /PVR baseline . Fold change is positively skewed although assumed to be normally distributed after a log transformation. It would be clinically significant if PVR 6months decreased by 30% from PVR baseline , i.e. PVR 6months ¼ (1-0.3) PVR baseline . Hence, the expected fold change is PVR 6months /PVR baseline ¼ 0.7, or Àlog(0.7) ¼ 0.15 in the log scale (the sign is to make the number positive but has no effect on sample size). The standard deviation of log fold change after three months was 0.42. Therefore, the sample size (n) required to detect the aforementioned log fold change in PVR with 90% power and 5% statistical significance was 17. Accounting for approximately 20% of drop-outs, the final n was 21. N.B. The pwr package in R was used with the following command: pwr.t.test Classical hypothesis testing pioneered by RA Fisher hinges heavily on P values and rejection of a null hypothesis (H 0 ). Statistically, a P value is the area of a theoretical distribution of a statistic under H 0 beyond an observed value given data. Informally, a P value measures how compatible are the observed data with H 0 . Traditionally, a P value of 0.05 has been used as statistical evidence against H 0 , and as a consequence, of prove of an effect. However, Fisher never intended for it to be fixed at 5% and recommended each trial to gauge an appropriate value given for example the possible consequences of false positive findings.
In rare diseases, RCTs are always of relatively small size. The distribution of P values under the alternative hypothesis (H 1 ) in trials with low power, e.g. 80%, is practically uniform regardless of the actual biological effect. 26 This means that a P value of 0.05 or 0.0005, i.e. 100 times lower, are equally likely for the same effect. P values are therefore equivocal indicators of how strong effects are unless the power exceeds 90%. RCTs in rare diseases suffer from low statistical power given the limitations of finding enough patients. Under these circumstances, the Bayesian paradigm offers an additional advantage over the frequentist one: informative priors. In a Bayesian analysis, additional information not contained in the data can be brought in to enlighten the results and reduce uncertainty, unlike the classical statistical paradigm. This additional information is called the prior and refers to the possible distributional properties of any parameter that we may be interested in before considering the data.

Primary Bayesian analysis
Primary outcome analysis will utilize a Bayesian analysis with a flat prior distribution. It is reasonable to assume that the log of PVR fold change (logPVRfc ¼ log PVR 6months /PVR baseline ) follows a normal distribution with mean m and precision (¼ s À2 , which is the inverse of the variance of the parameter). A 95% credible interval, i.e. highest posterior density interval, for logPVRfc will be obtained using a flat uniform prior for m ranging from À10 (fc ¼ 0.00005, i.e. tocilizumab removes PVR completely) to 10 (fc ¼ 22026, i.e. tocilizumab strongly aggravates PVR) and a vague prior for $Gamma(0.001,0.001).
In the absence of useful prior information, this choice of priors will ensure that the data defines the posterior distribution of logPVRfc.
We simulated data to show the benefits of Bayesian analysis in this small size RCT (programs provided in Appendix). PVR before treatment was assumed to be distributed as a log-normal with mean log(300) and standard deviation log(3), whereas PVR after treatment was assumed to be distributed as another log-normal with mean log(200) and sd log(2) (Fig. 3). Thirty-three percent of patients have PVR > 500 before intervention compared to just 9% after intervention. The average fold change was 0.94, or equivalently in the Àlog scale, 0.51. Table 3 shows Bayesian results for log-FC given the aforementioned flat prior (equivalent to a frequentist analysis) and an informative prior.
Here, the informative prior is obtained from the results observed with the flat prior. This serves to illustrate an  advantage of incorporating good additional information in the form of informative priors: the 95% credibility interval (similar to the frequentist confidence interval) is narrower when using good informative priors indicating much more certainty around the true effect of tocilizumab. In practice, priors cannot be obtained from a prior Bayesian analysis under informative priors but from independent sources. Likewise, if prior and data disagree, the credibility interval can be larger than the confidence interval in a frequentist analysis. This is nevertheless not a disadvantage but a realistic picture of the effect of combining conflicting sources of information.
In comparison, the Wilcoxon test outputs a statistic V ¼ 183, which is not intuitive, and a P value of 0.018. It has detected a difference at 5% significance but it does not provide any further information into the effect of tocilizumab in PVR. In contrast, the most conservative Bayesian analysis with flat priors informs us that tocilizumab renders on average a fold change of 0.60 (simulated value 0.67) but that given 18 individuals and the distributions in Fig. 3 the true effect is between 0.93 and 0.38 with 95% probability (the range is 0.45-0.81 when using good informative priors or a 65% reduction in the credible interval). Note that this probability statement on intervals does not apply to the frequentist confidence interval. There, an interval either contains the true parameter value or does not, and is understood in terms of what proportion of those intervals would contain the true parameter in the long run after many hypothetical experiments. Fig. 4 shows all the possible results expected after completing the TRANSFORM project (www.transform-uk.com). If we assume 21 patients completed treatment successfully and had PVR measures at baseline and after 24 weeks, a patient will be scored 1 if PVR reduced by at least 30% from baseline (success), otherwise he/she will be scored as 0. The question is: what is the probability of a successful treatment for a random individual with group 1 PAH? The Bayesian solution to this problem is to combine prior information summarizing all available knowledge of the effect of tocilizumab in PVR among PAH patients and new data from a trial to update our knowledge of the probability of success (P). Unlike frequentist analysis, the Bayesian solution is a distribution of possibilities for the parameter P, i.e. the posterior distribution. For example, the top left plot in Fig. 4 shows all the potential posterior distributions expected at the end of the trial (when data and prior are combined) given a flat prior, i.e. equivalent to complete uncertainty about the effect of tocilizumab on PVR. The leftmost curve corresponds to the distribution given 0 success and 21 failures in the study, the next curve to the right corresponds to observing 1 success and 20 failures in the study, and so on until the last curve to the right which corresponds to having observed 21 successes and 0 failures. Those results would also be obtained in a frequentist analysis where the prior is always uninformative, i.e. only the data carry information. The added advantage of Bayesian analysis is that a bona-fide prior enlightened the results by reducing bias after repositioning the most likely effect estimate (the mode) and reducing the error or uncertainty around the most likely estimate (reducing the width of the distribution). With a flat prior (no knowledge about the effect of tocilizumab) and 0 successes, the most likely probability of success is 0 (technically, the mode does not exist but it is 0 asymptotically) and the standard error is 0.04. However, when using the expert elicited prior and given the same experimental outcome (0 successes), the most likely value for the probability of success of tocilizumab is no longer 0 but 0.02 and the standard error is 0.05. Even with no observed successes, there was still a non-zero probability that tocilizumab may work. This statement takes into account both a relatively vague expert prior with mode at 0.2 and a small dataset (n ¼ 21).

Expert prior elicitation
For comparison, paired changes in the primary endpoint PVR will be assessed using the Wilcoxon signed-rank test.
(significance set at P value < 0.05). A conservative approach to secondary endpoints will be presented with median and confidence intervals. The actual prior elicitation produced results shown in Fig. 5. It reveals an overall belief among experts that tocilizumab ought to work in 20-40% of the cases but with a non-negligible probability of not working at all (about 10% of the overall weight falls onto no effect at  all), and a low probability (20%) that it will work more than 50% of the time.

Discussion
A trial of immunosuppression in PAH is now warranted. Over the last 20-30 years, evidence has accumulated to suggest that inflammation and autoimmunity play a role not just in CTD but also in IPAH, though the exact role is unclear and a significant question remains about how much is pathogenic and how much is inflammation related to chronic disease. The concept of immunosuppression in other vascular diseases is gaining momentum, even in the absence of autoimmunity. Examples of this include blocking IL-1 in acute coronary syndromes 27 and in atherosclerosis (CANTOS study: NCT01327846). This is not to downplay the implications. We acknowledge that to commit PAH patients to immunosuppression is a significant undertaking and we would not underestimate the potential effects of this. The side effect profile of most immunosuppressive therapy is a new area for PAH and cannot be ignored. For this reason, we feel that immunosupression should not be considered unless it is going to be transformative to patient outcomes. Our study is powered for large effects only in recognition of this. We have paradoxically excluded patients with SLE, MCTD, and Castleman's disease to minimize heterogeneity  and the chance of a positive study being driven by response in rarer diseases. Additionally, the majority of these patients are already immunosuppressed. We believe a trial specifically looking at immunosuppression in CTD-PAH is overdue but the trial design would have to be different reflecting the availability of licensed and established immunotherapies. A trial only looking at these rare causes would require many more centers. The open-label nature of our study may be viewed as contentious. We have examined available PVR data from recent meta-analyses 1,28 and PVR is not placebo responsive with the majority of trials reporting increases at four to six months. Given the lack of placebo effect on PVR, we feel the only likely impact of this therefore will be missing signal related to participants deteriorating, and there is a risk our study may underestimate effects. As the first trial of a new approach, we are also aware that a valid criticism is that we may not be enriching our trial for patients likely to respond. Given the novelty of this approach, we have chosen to start with patients who are relatively stable, predominantly on dual therapy and therefore a ''prevalent'' not ''incident'' population. In most autoimmune diseases, therapy works best in patients ''flaring'' or ''active'' and at present this is not a traditional way of viewing the progress of pulmonary hypertension and we have no good biomarkers to help guide us. This may affect the power of the study. In part to address the low power issue we have adopted a Bayesian statistical approach. Classical hypothesis testing hinges heavily on P values and rejection of a null hypothesis (H 0 ). Informally, a P value measures how compatible are the observed data with H 0 . In rare diseases, RCTs are always of small size. Halsey et al. 26 showed that the distribution of P values under the alternative hypothesis (H 1 ) in trials with low power, e.g. 80% or less, is practically uniform regardless of the actual biological effect. RCTs in rare diseases suffer from low statistical power, and as we stratify our patients more carefully this will worsen. Given these circumstances, it may be wiser to move away from frequentist statistics and one option is to change over to the Bayesian paradigm. What is appealing in a Bayesian analysis is that additional information not contained in the data (the classical statistical stand) can be brought in to enlighten the results. This approach may be of particular importance when considering mixed populations as we have in this trial of CTD, predominantly scleroderma, and IPAH where we may have significant variation in treatment responses. The authors of this manuscript are not frequentist or Bayesian but practical scientists that can work within either paradigm. Nevertheless, we recognize the potential benefits of using a Bayesian approach here.
In summary, we believe that the time is right for a trial of immunosuppression in PAH, that IL6 is an excellent starting candidate, with a well-established and well-tolerated therapy in tocilizumab. Immunosuppression is uncontroversial in our opinion in CTD but more controversial in IPAH; however, the preclinical evidence is compelling and it is now time to test hypotheses other than vasodilation. We have designed a small open-label investigator-led study utilizing Bayesian statistical methods to analyze and we think this trial design is potentially useful to the field to consider.