Estimating and evaluating personalized treatment recommendations from randomized trials with ptr

One of the targets of personalized medicine is to provide treatment recommendations using patient characteristics. We present the command ptr, which both predicts a personalized treatment recommendation algorithm and evaluates its effectiveness versus an alternative regime, using randomized trial data. The command allows for multiple (continuous or categorical) biomarkers and a binary or continuous outcome. Confidence intervals for the evaluation parameter are provided using bootstrap resampling.


Introduction
One of the goals of the modern paradigm of "personalized medicine" is to use data collected on a patient to recommend treatment, rather than treating all patients with the same therapy. A personalized treatment recommendation (PTR) formalizes the process of recommending treatment based on clinically measurable patient traits, termed biomarkers. A PTR is an algorithm that maps single or multiple biomarker inputs onto a decision to treat. For example, breast cancer patients with a low Oncotype DX recurrence score, constructed from a combination of 21 genes, were found to be unlikely to benefit from chemotherapy following surgery (Paik et al. 2006).
As with other kinds of prediction models, developing a PTR requires an estimation and an evaluation step. However, estimating a PTR differs from diagnostic or prognostic model development because the object of the inference (whether a patient benefited from treatment) remains strictly unobserved. Several methods have been developed to tackle this prediction problem, and perhaps the most commonly used is to fit a regression model with a treatment-by-covariate interaction. Once a PTR has been estimated, it is of interest to know whether it improves on a conservative policy, for example, one where everybody is treated. This potential improvement can be estimated prospectively using a new trial, or it can be done retrospectively using a separate dataset from the one used to develop the model.
In this article, we demonstrate the community-contributed ptr command to first estimate a PTR using regression modeling and then to separately estimate the benefit of a PTR compared with an alternative policy, using retrospectively collected data. The command is implemented in the context of a two-armed, randomized controlled trial with a single treatment decision, a binary treatment, and continuous or binary outcomes.

Notation and preliminaries
Let A{0, 1} represent a binary treatment variable; X a vector of biomarkers, measured at a single baseline visit; and Y the outcome. A PTR is an algorithm that takes the baseline biomarkers as input and outputs a recommendation to treatment or control: PTR(X) : X → A{0, 1}. Using counterfactual notation (Rubin 1974), we denote the outcome if a subject was treated as Y * (1) and if he or she was given control as Y * (0). Following a randomized trial, the observed outcome for the ith subject is Analogously, the outcome that would be observed under a given PTR is Note that Y * i (PTR) is a potential outcome because it is only partially observed. By contrasting (2) with (1), we clearly see that Y * i (PTR) is equal to the observed Y when A i = PTR(X i ) (that is, the subject is randomized to the treatment he or she is recommended). We seek a PTR that recommends treatment for values of X where the expected benefit of treatment is above a given threshold. When higher values of the outcome are considered beneficial, this can be written as where I[·] is the indicator function and δ is a number for the minimum improvement necessary for treatment to be recommended over the control. In many applications, this value is zero, although it could take another value if there are economic costs or treatment side effects to account for (see VanderWeele et al. [2019] for a summary).

Estimating a PTR using regression
It is natural to separate biomarkers into those that are prognostic (X prog ) and predict the treatment-free outcome and those that are effect modifiers (X mod ) and predict deviations from the average effect of treatment. For a continuous outcome, a model for the expected outcome might therefore be parameterized by This can be estimated by fitting a linear regression model that includes main effects for the estimation of the α parameters and biomarker by treatment interactions for estimation of the β parameters. In a randomized trial, E{Y * (a)|X} = E(Y |A, X), and the parameters of (4) can be substituted into (3) to get an estimate of the PTR: Note that this depends only on the parameters relating to the average treatment effect (β 0 ) and the effect modifiers (β). However, for unbiased estimation of the β parameters, the main effect of each modifier must be included (that is, all variables in X mod should also be in X prog ). Also, inclusion of other variables in X prog will generally improve the efficiency in estimating the β parameters.

Improvement under a PTR
After one constructs a PTR, it is of interest to evaluate whether there has been an improvement compared with an alternative policy. Natural alternatives might be either a policy where everybody is treated or a policy where everybody is given the control condition. Writing the expected outcome under a policy as µ(·), we see the improvements compared with treatment and control are, respectively, θ T is the obvious comparator if the active treatment would be recommended in absence of a PTR, and θ C is the comparator if the control treatment would be preferred. In the following two sections, we describe two estimates of θ: a nonparametric inverse probability weighted (IPW) estimate and the more efficient augmented inverse probability weighted (AIPW) estimate.

IPW estimate
As noted above, the outcome under a PTR is observed only in subjects who receive the treatment they were recommended: Y obs The outcome remains missing for the remainder. Using standard missingdata theory (Seaman and White 2013), we can make inferences from µ obs (PTR) to µ(PTR) by reweighting the observed sample by the probability that a subject received the treatment he or she was given. This is given by p(A) = π if A = 1 and p(A) = 1 − π if A = 0, where π is the propensity score, fixed by design in a randomized trial. This results in the IPW estimate of the outcome under a PTR: Similarly, the outcome under treatment can be estimated as and the outcome under control can be estimated as

AIPW estimate
We can improve the efficiency of the IPW estimate of the outcome under a treatment policy by borrowing information from the regression of baseline parameters on the outcome (Zhang et al. 2012). This results in the AIPW estimate of the outcome under a PTR: This is equal to the IPW estimate minus an augmentation term that improves the asymptotic efficiency. m(A = a, X i ) is a regression estimate of the expected outcome under treatment or control, conditional on baseline covariates. These models include variables that are predictive of outcome, regardless of treatment assignment. The outcome under treatment is estimated as and the outcome under control is estimated as AIPW estimates are doubly robust, which means that they are unbiased even when the model for E(Y |A = a, X) is misspecified (Bang and Robins 2005).

Inference for improvement parameter
Inference for θ comes from resampling the data using the bootstrap procedure and reestimating the statistic on each sample (θ bs ). The distribution of θ bs can be translated into confidence intervals using one of three methods. The first estimates the standard error using the standard deviation of θ bs and then calculates the confidence intervals by referring this to a normal distribution. The second, the percentile method, calculates the confidence intervals as the 2.5th and 97.5th percentiles of the distribution of θ bs . The final method uses a bias-corrected and accelerated method to account for the fact that the distribution of θ bs may be skewed away from the normal distribution.

Estimating and evaluating a PTR for a binary outcome
The process for estimating and evaluating a PTR when the outcome is binary is similar to that outlined above, except that in two instances a logistic regression model is used instead of a linear regression model. The first is in the estimation of the parameters used to construct a PTR. The second is when estimating the expected outcome under treatment or control used in the AIPW estimation of θ. In the binary outcome example, the parameter θ is interpreted as the absolute change in event rate under PTR compared with a policy where everybody or nobody is treated. binary higher lower outcome can be continuous (the default) or binary (specify the option binary), and treatment is a binary variable coded 0 for the control group and 1 for the treatment group. The command can be used for both estimation and evaluation of a PTR or for the evaluation of a PTR only. If estimating a PTR, then one must specify the modifiers(varlist) option. If evaluating only, then one must specify the ptr(varname) option. Note that none of the varlists used in this command support the i. prefix or the use of # for interaction variables. Therefore, interaction and categorical indicator variables must be created beforehand.

Options
modifiers(varlist) specifies the modifier variables used as treatment interaction terms in a regression model, used to estimate the PTR. Their main effect will automatically be included.
covariates(varlist) specifies the variables to be used as main effects in the regression model that estimates the PTR.
delta(#) sets the minimum threshold for improvement to recommend treatment over control. It can take any real value, and the default is delta(0). ptr(varname) identifies the variable name that indicates whether a subject is recommended treatment under a PTR. This is to be coded 1 if the subject is recommended treatment and 0 if the subject is recommended control. If this option is specified, then the command will evaluate the PTR only and ignore any estimation options.
contrast(string) specifies the condition to contrast the PTR with. string can be either treat for the contrast to be with a policy where everybody is treated or control for the contrast to be with a policy where everybody is given control. The default is contrast(treat).
bsamples(#) specifies the number of bootstrap samples to be used for inference for the theta parameter. The default is bsamples (1000).
bootstrap all specifies that all bootstrap confidence intervals be reported. This includes normal, percentile, and bias corrected and accelerated. The default is to display the normal confidence intervals only. See [R] bootstrap.
seed(#) specifies the seed number for consistent bootstrap inference. The default is seed(0).
augmented specifies that theta be calculated using the efficient and double-robust AIPW estimate.
outcome vars(varlist) specifies the variables to be used in the outcome model part of the AIPW estimate of theta. Not specifying this option will result in the empirical mean being used as input.
tx vars(varlist) specifies the variables to be used in the treatment prediction part of the inverse probability weighted estimate or AIPW estimate of theta. Not specifying this option will result in the empirical mean being used as input.
test(varname) specifies the variable that indicates the part of the dataset that is to be used in evaluation of the PTR, indicating that the remainder is to be used in estimating the PTR. The default is to use all the data for both estimation and evaluation of the PTR.
binary specifies that the outcome variable be binary. The default is to assume the outcome is continuous.
higher specifies that higher values of the outcome are better. This is the default option for a continuous outcome.
lower specifies that lower values of the outcome are better. This is the default option for a binary outcome.

Stored results
ptr stores the following in r(): Scalars r(theta) estimate of θ r(theta se) standard error of θ r(mu ptr) expected outcome under PTR r(mu contrast) expected outcome under contrasted group ptr also creates the variable ptr, which indicates treatment under the PTR, coded 1 if the subject is recommended treatment and 0 if not.
6 Demonstration of command using simulated data 6.1 Continuous outcome The following demonstrates ptr on a simulated dataset with a continuous outcome. First, 300 observations were created, and a binary treatment variable (A) was randomly generated with treatment assignment probability of 0.5. The following variables were generated: one continuous prognostic variable (P 1, standard normal); two continuous modifier variables (M 1 and M 2, standard normal); one three-level categorical variable, signified by two indicator variables (M 3 l2 and M 3 l3 , prevalence 20 and 40%, respectively); and an error term (e, standard normal). The dataset was randomly split 50:50 between training and test datasets. A continuous outcome was generated using the following specification: The following syntax estimates and evaluates a PTR using three modifiers and one covariate: The output shows both estimation and evaluation steps. The first table provides the weights for each variable used in the estimation of the PTR. These are the estimated average treatment effect and the modifier by treatment interaction terms, as per the linear model (4). These are used to generate the PTR algorithm [provided in text below the first table, according to (1) above]. The standard errors, p-values, and confidence intervals of the weights are derived from Wald statistics and are the same as those displayed after the regress (see [R] regress) command.
The second table shows the estimates of the outcome under the PTR (mu ptr), the outcome under the contrast scenario (mu contrast; everybody receives treatment as the default), and the expected improvement under the PTR (theta). These are all estimated on the test dataset, using the default IPW approach. The standard errors, p-values, and confidence intervals are from the bootstrap procedure.
To demonstrate how to evaluate a PTR using the AIPW method, we simulated an arbitrary PTR on the same dataset: The following syntax evaluates this using the AIPW method, using the covariate P 1 in the linear model for m(A, X i ):

Binary outcome
The following demonstration uses the same covariates defined above, except with a binary outcome variable that has probability where logit −1 is the inverse logit function. This probability is referred to as a random Bernoulli distribution to simulate the binary outcome.
The estimation and evaluation of the PTR under this distribution is given by the following output. This time, we display all bootstrap standard errors using the option bootstrap all and contrast the outcome under the PTR with the outcome under the control.

Discussion
In this article, we explained how a PTR can be estimated and evaluated using data retrospectively collected from two-armed randomized controlled trials. We implemented this theory in a package: ptr. This command allows for potentially multiple biomarkers and binary or continuous outcomes. The command also allows users to specify a minimum threshold for improvement. For the evaluation step, the user can specify a doubly robust method that has efficiency gains over the standard inverse probability method. The user can also contrast the outcome under a PTR with several different policies (for example, treat everybody). This is the first available command in Stata; for similar programs in R, see the packages TreatmentSelection (Janes et al. 2014) and PTE (Kapelner et al. 2014).
Ideally, to avoid overoptimism in the evaluation step, one should use data in the estimation and evaluation steps from two separate randomized controlled trials. There are two further problems with estimating a PTR using a regression model. First, the model may be misspecified in that it omits higher-order, transformed, or interaction terms between the biomarkers that might better explain the data. This can be corrected by specifying a rich model with many parameters. However, this may result in the second problem: overfitting of the model, where the model predicts well in a training dataset but not well in any other datasets. This may be solved by extending the ptr command to incorporate penalized regression, for example, the lasso estimator recently added to Stata 16 (see [LASSO] lasso).
We recommend that the ptr command be used for exploratory analysis, to understand what form a PTR might take and how well it might perform compared with other regimes. A PTR estimated from retrospectively collected trial data will likely not be fit for direct use in clinical practice. We recommend that a prospective trial should be used to see how well ptr performs versus alternative regimes, potentially allowing for the PTR to be adapted as data from the trial are collected (see http://www.bigted.org/ for details of trial designs to be used for this purpose).

Acknowledgments
We acknowledge the guidance and support of our departed mentor Professor Graham Dunn, and we dedicate this to his memory. This work was funded by the MRC North West Hub for Trials Methodology (grant reference: MR/K025635/1).

Programs and supplemental materials
To install a snapshot of the corresponding software files as they existed at the time of publication of this article, type . net sj 21-2 . net install st0639 (to install program files, if available) . net get st0639 (to install ancillary files, if available)