Local continual reassessment methods for dose finding and optimization in drug-combination trials

Due to the limited sample size and large dose exploration space, obtaining a desirable dose combination is a challenging task in the early development of combination treatments for cancer patients. Most existing designs for optimizing the dose combination are model-based, requiring significant effort to elicit parameters or prior distributions. Model-based designs also rely on intensive model calibration and may yield unstable performance in the case of model misspecification or sparse data. We propose to employ local, underparameterized models for dose exploration to reduce the hurdle of model calibration and enhance design robustness. Building upon the framework of the partial ordering continual reassessment method, we develop local data-based continual reassessment method designs for identifying the maximum tolerated dose combination, using toxicity only, and the optimal biological dose combination, using both toxicity and efficacy. The local data-based continual reassessment method designs only model the local data from neighboring dose combinations. Therefore, they are flexible in estimating the local space and circumvent unstable characterization of the entire dose-exploration surface. Our simulation studies show that our approach has competitive performance compared with widely used methods for finding the maximum tolerated dose combination, and that it has advantages over existing model-based methods for identifying the optimal biological dose combination.


1 Introduction
For treating cancer, the strategy of combination therapy provides an efficient way to increase patients' responses by inducing drug-drug synergistic treatment effects, targeting multiple sensitive sites and disease-related pathways, and increasing dose intensity without overlapping toxicities. 1 Finding a desirable dose combination is critically important for the late-stage development of combination treatments. However, early-phase dose finding/optimization for multiple drugs faces several challenges, such as a large dose exploration space, unclear drug-drug interactions, partially unknown toxicity order among some dose combination pairs, and the small-scale nature of early-phase trials.
In the last decade, numerous designs have been proposed to explore the maximum tolerated dose combination (MTDC) for drug combinations. Most of these designs use parametric models to depict the dose-toxicity relationship and adopt a dose exploration strategy similar to the well-known, single agent-based continual reassessment method (CRM). 2 Thall et al. 3 proposed a two-stage design for identifying three pairs of MTDCs by fitting a six-parameter model. Yin and Yuan 4 employed a copula-type model to account for the synergistic effect between drugs. Braun and Wang 5 estimated the dose-limiting toxicity (DLT) rate for drug combinations using a Bayesian hierarchical model and explored the dose matrix using an adaptive Bayesian algorithm. Wages et al. 6 proposed the partial ordering continual reassessment method (POCRM), transforming the dose-combination matrix into one dimension and determining the treatment of the next cohort based on the CRM and Bayesian model selection. Riviere et al. 7 proposed a Bayesian dose finding design for drug-combination trials based on logistic regression. To circumvent a lack of robustness when using parametric models, several non-parametric or model-free designs have also been proposed. Lin and Yin 8 developed a two-dimensional Bayesian optimal interval (BOIN) design to identify the MTDC. Mozgunov et al. 9 proposed a surface-free design for phase I dual-agent combination trials. Mander and Sweeting 10 employed a product of independent beta probabilities escalation (PIPE) strategy to identify the MTDC contour. Clertant et al. 11 proposed to identify the MTDC or the MTDC contour using a semiparametric method.
Due to the small-scale nature of early phase clinical trials, model-based designs attempt to strike a balance between the comprehensiveness of the dose-toxicity model and the robustness of inference. As a result, parsimonious models are generally proposed to capture the dose-toxicity relationship of the entire dose exploration space in an approximate sense. Despite using parsimonious models to enhance the robustness of inference, model-based designs tend to have unstable performance when the model assumption is misspecified or the observed data are sparse. 12 This issue becomes more profound for designs of drug-combination trials, where the dimension of the dose-exploration space is large; the difficulty arises because more parameters must be introduced to quantify the effects of drug-drug interactions. Due to the increased model complexity, prior elicitation or model calibration becomes another hurdle that may affect the operating characteristics.
The goal of dose-finding designs is rarely to estimate the entire dose-toxicity relationship. 13 Instead, the primary objective is to accurately estimate the local region of the target dose by concentrating as many patients as possible on doses close to the target. To this end, the CRM uses a simple, underparameterized model (such as the empiric model) to conduct dose finding. Although it may produce biased estimates for doses far from the target dose, the CRM design converges almost surely to the MTD. 14 Following the main idea of the CRM, we propose to use local models to stabilize the dose finding procedure. Specifically, we implement POCRM locally in the adjacent region of the current dose, leading to the local data-based CRM (LOCRM). POCRM aims to characterize the entire dose-toxicity surface by specifying a limited number (typically six to eight) of orderings for the dose exploration space. In contrast, LOCRM evaluates up to five neighboring dose combinations, allowing for the use of all possible toxicity orderings within a local region and bypassing the need for preselection in POCRM. Simulation studies demonstrate that the LOCRM design is effective in determining the MTDC and compares favorably with other model-based or model-assisted designs.
As a step further, we extend the LOCRM approach to optimize the dose combination based on toxicity and efficacy simultaneously. In the modern era of precision oncology, the conventional "more-is-better" paradigm that works for chemotherapies is no longer suitable for targeted therapy and immunotherapy. 15,16 Project Optimus, a recent U.S. FDA initiative, also highlights the need for innovative designs that can handle scenarios where "less is more." When both toxicity and efficacy are considered simultaneously for decision making, finding an optimal biological dose combination (OBDC) that maximizes the risk-benefit tradeoff in a multi-dimensional dose exploration space becomes even more challenging. In addition to the challenges in finding the MTDC, other major obstacles for OBDC finding include the lack of flexible and robust models to account for possible plateau dose-efficacy relationships and the difficulty in effectively assigning patients across the dose-exploration space.
Trial designs for single-agent dose optimization are abundant; see references 17 to 23. However, because of the aforementioned challenges, research on dose optimization methods in dose-combination trials is rather limited. Mandrekar et al. 24 developed a continuation ratio model for dose optimization in drug-combination trials. Yuan and Yin 25 constructed a Bayesian copula-type model for toxicity and a Bayesian hierarchical model for efficacy. Cai et al. 26 developed a change point model to identify a possible toxicity plateau at higher dose combinations and employed a five-parameter logistic regression for efficacy estimation. Wages and Conaway 27 proposed a Bayesian adaptive design under the assumption of monotone dose-toxicity and dose-efficacy relationships within single agents. Guo and Li 28 proposed dose finding designs based on partial stochastic ordering assumptions. The two-stage design proposed by Shimamura et al. 29 includes a zone-finding stage to evaluate toxicity on prespecified partitions and a dose-finding stage to explore the efficacy of the dose space. Yada and Hamada 30 extended the method of Yuan and Yin 25 using a Bayesian hierarchical model to share information between doses. As shown in the simulation results, many existing drug-combination dose-optimization designs may suffer from robustness problems due to the use of parametric models to quantify the entire dose-exploration space. To address this issue, this article introduces the LOCRM12 design, which builds on the proposed LOCRM approach for OBDC identification. We utilize LOCRM as the toxicity model and employ the robit regression model 31 locally for efficacy. The combination of constructing local models for interim decision-making and using the robit regression makes the trial design more robust and easier to execute in practice.
The remainder of this article is organized as follows: In Section 2, we introduce the local method for modeling toxicity and propose the LOCRM design for MTDC identification. In Section 3, we describe the local efficacy model and introduce the trial design for OBDC identification. In Sections 4 and 5, we conduct extensive simulation studies to evaluate the operating characteristics of LOCRM and LOCRM12. Section 6 provides a brief discussion.
2 Dose finding based on toxicity only

Partial toxicity orderings
Assume that an early-phase drug-combination trial is being conducted to determine the MTDC for a combination therapy with J dose levels of drug A and K dose levels of drug B. Let p_{j,k} be the toxicity probability of dose combination (j, k), j = 1, …, J, k = 1, …, K. A key assumption usually made for the toxicity of the dose-exploration space is the partial toxicity ordering 6 ; that is, the toxicity rate increases with the dose of one drug when the other drug's dose is fixed at a certain level. We restrict the model and interim decision making within the local space, rather than considering the entire space. Specifically, suppose the current dose is (j, k); its local dose set is

𝒩 = {(j, k), (j − 1, k), (j + 1, k), (j, k − 1), (j, k + 1)},

truncated to the J × K dose matrix where necessary. In other words, dose set 𝒩 contains the current combination and all adjacent dose combinations that differ from (j, k) by one dose level of one drug. Because we only focus on the local space, we can enumerate all possible toxicity orderings of the dose combinations in 𝒩. Under the partial monotonicity assumption, the lower neighbors (j − 1, k) and (j, k − 1) are no more toxic than (j, k), and the higher neighbors (j + 1, k) and (j, k + 1) are no less toxic; only the orders within each pair are unknown. This gives up to I = 4 possible toxicity orderings for the local dose combinations in 𝒩:

𝒪_1: p_{j−1,k} ≤ p_{j,k−1} ≤ p_{j,k} ≤ p_{j+1,k} ≤ p_{j,k+1},
𝒪_2: p_{j−1,k} ≤ p_{j,k−1} ≤ p_{j,k} ≤ p_{j,k+1} ≤ p_{j+1,k},
𝒪_3: p_{j,k−1} ≤ p_{j−1,k} ≤ p_{j,k} ≤ p_{j+1,k} ≤ p_{j,k+1},
𝒪_4: p_{j,k−1} ≤ p_{j−1,k} ≤ p_{j,k} ≤ p_{j,k+1} ≤ p_{j+1,k}.

Under a specific ordering 𝒪_i, i = 1, …, I, let r_i(j, k) = 1, …, |𝒩| denote the rank of the toxicity rate of dose combination (j, k) (from the smallest to the largest) in the local space 𝒩, where |𝒩| is the cardinality (i.e. the total number of dose combinations) of 𝒩. For example, when the current dose is (j, k), under 𝒪_1, we have r_1(j − 1, k) = 1, r_1(j, k − 1) = 2, r_1(j, k) = 3, r_1(j + 1, k) = 4, and r_1(j, k + 1) = 5.
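As a concrete check of this enumeration, the local set and its admissible orderings can be listed programmatically. The following Python sketch (function names are ours, not from the paper) builds the local set for a current dose (j, k) on a J × K grid and enumerates the orderings consistent with partial monotonicity:

```python
from itertools import permutations

def local_set(j, k, J, K):
    """The current dose plus adjacent combinations that differ by one
    dose level of one drug, clipped to the J x K dose matrix."""
    cands = [(j, k), (j - 1, k), (j + 1, k), (j, k - 1), (j, k + 1)]
    return [(a, b) for (a, b) in cands if 1 <= a <= J and 1 <= b <= K]

def local_orderings(j, k, J, K):
    """All toxicity orderings of the local set consistent with partial
    monotonicity: lower neighbors rank below (j, k), higher neighbors
    rank above; only the within-pair orders are free."""
    members = local_set(j, k, J, K)
    lower = [d for d in [(j - 1, k), (j, k - 1)] if d in members]
    upper = [d for d in [(j + 1, k), (j, k + 1)] if d in members]
    return [list(lo) + [(j, k)] + list(up)
            for lo in permutations(lower) for up in permutations(upper)]
```

For an interior dose the function returns the four orderings above; at the corner (1, 1) only two orderings remain, since both lower neighbors fall outside the dose matrix.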

Toxicity model
When I ≥ 2 possible toxicity orderings exist in the local dose combination space 𝒩, following the ideas of POCRM 6 and the Bayesian model averaging CRM, 12 we treat each of the I possible orderings as a probability model and estimate the toxicity rates based on the Bayesian model averaging method. 12 Because the toxicity orders of the local doses are fully specified under model 𝒪_i, i = 1, …, I, the two-dimensional local space can be converted to a one-dimensional searching line. In other words, a variety of single-agent toxicity models could be employed for an ordering 𝒪_i. Here we employ the commonly used single-parameter empiric CRM model, under which the toxicity probability of dose combination (j, k) ∈ 𝒩 is expressed as

p_{j,k} = π{r_i(j, k)}^{exp(a_i)},

where a_i ∈ (−∞, ∞) is the unknown parameter associated with ordering 𝒪_i, and π(l) is the prior guess of the toxicity probability of the lth dose level; that is, π(1) ≤ ⋯ ≤ π(|𝒩|) is the skeleton of the CRM. While other toxicity models like the two-parameter logistic model can also be used, research has shown that these more complex models often perform worse than the simpler one-parameter empiric model in accurately determining the correct dose. 32 Let y^T_{j,k} and n_{j,k} be the number of observed toxicities and the number of patients at combination (j, k), respectively. Based on the local toxicity data D^T_𝒩 = {(y^T_{j,k}, n_{j,k}) : (j, k) ∈ 𝒩}, the likelihood function corresponding to ordering 𝒪_i is

L(D^T_𝒩 | a_i, 𝒪_i) = ∏_{(j,k)∈𝒩} [π{r_i(j, k)}^{exp(a_i)}]^{y^T_{j,k}} [1 − π{r_i(j, k)}^{exp(a_i)}]^{n_{j,k} − y^T_{j,k}}.

Let f(a_i | 𝒪_i) denote the prior distribution of a_i specified under 𝒪_i. As a_i takes values from −∞ to ∞, a normal distribution N(0, σ_a²) with mean 0 and variance σ_a² is often used for f(a_i | 𝒪_i) in the literature, with σ_a² typically ranging from 1.34 to 4. 33,12 Alternatively, an exponential distribution such as Exp(1) can be used as the prior of exp(a_i). 6 Our simulation study has demonstrated that the proposed design is not sensitive to these prior specifications of a_i.
The posterior distribution of a_i under 𝒪_i can be expressed as

f(a_i | D^T_𝒩, 𝒪_i) ∝ L(D^T_𝒩 | a_i, 𝒪_i) f(a_i | 𝒪_i),

and the posterior mean of the toxicity rate at dose combination (j, k) is given by

p̂^{(i)}_{j,k} = ∫ π{r_i(j, k)}^{exp(a_i)} f(a_i | D^T_𝒩, 𝒪_i) da_i.

The Bayesian model averaging procedure is conducted to average the posterior estimates of the multiple models. Denote the prior probability of each model (ordering) being true as Pr(𝒪_i). We choose equal prior model probabilities, Pr(𝒪_1) = ⋯ = Pr(𝒪_I) = 1/I. According to Yin and Yuan, 12 the posterior model probability is

Pr(𝒪_i | D^T_𝒩) = Pr(𝒪_i) ∫ L(D^T_𝒩 | a_i, 𝒪_i) f(a_i | 𝒪_i) da_i / Σ_{i′=1}^{I} Pr(𝒪_{i′}) ∫ L(D^T_𝒩 | a_{i′}, 𝒪_{i′}) f(a_{i′} | 𝒪_{i′}) da_{i′}.

The posterior mean of the toxicity probability of dose (j, k) in 𝒩 is then calculated as a weighted average of the estimates under each model, that is,

p̂_{j,k} = Σ_{i=1}^{I} p̂^{(i)}_{j,k} Pr(𝒪_i | D^T_𝒩).
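The model-averaging calculation can be sketched numerically. The following Python code is a simplified illustration (not the authors' implementation) that approximates the one-dimensional integrals over a_i on a grid, using the empiric model π^{exp(a)} and a normal prior on a_i:

```python
import numpy as np

def bma_estimates(orderings, skeleton, y, n, sigma2=2.0):
    """Bayesian model averaging over local toxicity orderings.

    orderings: list of orderings, each a list of dose tuples ranked from
               least to most toxic (all orderings contain the same doses);
    skeleton:  prior guesses pi(1) <= ... <= pi(|N|);
    y, n:      dicts mapping dose tuple -> DLT count / patient count."""
    a = np.linspace(-10.0, 10.0, 2001)              # grid for the parameter a_i
    prior = np.exp(-a ** 2 / (2.0 * sigma2))        # N(0, sigma2), up to a constant
    doses = orderings[0]
    evidence, post_means = [], []
    for order in orderings:
        rank = {d: r for r, d in enumerate(order)}  # 0-based rank under this ordering
        like = np.ones_like(a)
        for d in doses:
            p = skeleton[rank[d]] ** np.exp(a)      # empiric CRM model pi(r)^exp(a)
            like *= p ** y[d] * (1.0 - p) ** (n[d] - y[d])
        w = like * prior
        evidence.append(w.sum())                    # proportional to the model evidence
        post_means.append({d: (skeleton[rank[d]] ** np.exp(a) * w).sum() / w.sum()
                           for d in doses})
    pm = np.array(evidence) / sum(evidence)         # posterior model probabilities
    return {d: sum(pm[i] * post_means[i][d] for i in range(len(orderings)))
            for d in doses}
```

Because the skeleton is increasing and the exponent exp(a_i) is positive, each model's posterior means respect the ranks of its ordering, and the averaged estimates inherit any ordering shared by all models.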

Trial design
Suppose the current dose combination is (j, k). We treat the next cohort of patients at a dose from the local space, (j*, k*), defined as the dose combination in 𝒩 whose estimated toxicity probability p̂_{j*,k*} is closest to the target toxicity probability φ_T, that is,

(j*, k*) = arg min_{(j,k)∈𝒩} |p̂_{j,k} − φ_T|.   (1)

If multiple dose combinations meet the criterion, we select one randomly. When no toxicity outcome has yet been observed at the beginning of the trial, dose escalation by LOCRM is equivalent to randomly escalating one dose level of one drug with the level of the other drug fixed, because we specify all possible toxicity orderings with equal prior probabilities. During the trial, we add a safety monitoring rule for overdose control. Given a prespecified probability cutoff c_T, if p_{j,k} satisfies

Pr(p_{j,k} > φ_T | y^T_{j,k}, n_{j,k}) > c_T,   (2)

then dose combination (j, k) and its higher dose combinations are deemed overly toxic and are eliminated from the trial. In many cases, using c_T = 0.95 provides satisfactory overdose control, but c_T can be further calibrated by evaluating various values, such as 0.85, 0.90, and 0.95, through simulations. If the lowest dose combination satisfies the above condition, then all dose combinations are unacceptably toxic, and we should therefore terminate the trial early. If the trial is terminated because of safety, we do not select any dose as the MTDC.
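As an illustration of the overdose control rule, the posterior probability Pr(p_{j,k} > φ_T) can be evaluated under a beta-binomial model; here we assume a Beta(1, 1) prior for p_{j,k} as one common default (the text does not state the prior parameters), and integrate the beta density numerically:

```python
import math

def prob_overly_toxic(y, n, phi_t, a0=1.0, b0=1.0, grid=4000):
    """Posterior Pr(p > phi_t) under a beta-binomial model with a
    Beta(a0, b0) prior, evaluated by trapezoidal integration."""
    a, b = a0 + y, b0 + n - y
    logc = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = (1.0 - phi_t) / grid
    total = 0.0
    for i in range(grid + 1):
        p = phi_t + i * h
        if p <= 0.0 or p >= 1.0:
            dens = 0.0                       # guard against log(0) at the endpoints
        else:
            dens = math.exp(logc + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))
        total += dens * (h if 0 < i < grid else h / 2)
    return total

def eliminate(y, n, phi_t=0.30, c_t=0.95):
    """Safety rule: eliminate (j, k) and all higher combinations when True."""
    return prob_overly_toxic(y, n, phi_t) > c_t
```

For example, 3 DLTs in 3 patients trigger elimination at φ_T = 0.3 and c_T = 0.95, while 1 DLT in 3 patients does not.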
If this overdose control rule is never activated, the trial continues to recruit and treat patients until the maximum sample size is exhausted. The use of local data and local models can stabilize dose exploration in the drug-combination space, especially when the amount of observed information is limited and the dose exploration space is large. At the end of the trial, we conduct bivariate isotonic regression 34 on the matrix of observed toxicity rates (function biviso() in R package Iso). By doing so, the observed information can be shared across all explored dose combinations, leading to a non-decreasing matrix of estimated toxicity rates and thus an efficient estimate of the final MTDC. After excluding untried dose combinations, the dose combination whose isotonic estimated toxicity rate is closest to φ_T is selected as the MTDC. If multiple dose combinations satisfy the criterion, the one with lower doses is chosen. Instead of bivariate isotonic regression, other model-based approaches (such as a logistic regression model) can also be used to estimate the toxicity rates of the entire space.
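A minimal sketch of the final selection step, assuming the isotonically estimated toxicity rates are already in hand; the lower-dose tie-break is implemented here as "smaller total dose level", which is one plausible reading of the rule:

```python
def select_mtdc(iso_est, tried, phi_t):
    """Pick the tried combination whose isotonic toxicity estimate is
    closest to phi_t; ties are broken toward lower doses (here: smaller
    total dose level, then smaller level of drug A -- an assumption)."""
    best, best_key = None, None
    for (j, k) in sorted(tried):
        dist = abs(iso_est[(j, k)] - phi_t)
        key = (round(dist, 10), j + k, j)   # closeness first, then lower doses
        if best_key is None or key < best_key:
            best, best_key = (j, k), key
    return best
```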

3 Dose optimization based on toxicity and efficacy
Based on the LOCRM design proposed in Section 2, we further develop a dose optimization design (i.e. a phase I/II design) for dose-combination trials in which dose escalation/de-escalation decisions are made based on toxicity and efficacy jointly. This trial design employs LOCRM as the toxicity model together with an efficacy model built on the same local modeling idea (see Section 3.1). We name this dose-optimization design LOCRM12.

Efficacy model
If we were to construct the efficacy model in the same way as the toxicity model, we would need to enumerate all possible efficacy orderings of the candidate doses. However, the partial ordering assumption does not necessarily hold for efficacy, and the efficacy surface is much more complicated when various dose-efficacy relationships are taken into account. Consider the case where the current combination (j, k) is the most efficacious dose combination in the local space: the other four candidate doses cannot be ordered, leading to 4! = 24 possible orderings. Since any of the five doses could be the most efficacious, there are up to 5 × 24 = 120 possible orderings, too many for the model averaging procedure described in Section 2.2 to be practical. Instead, we propose to use a local regression model for efficacy. Specifically, we model efficacy over the nine local dose combinations using the robit model, 31 a simple robust alternative to the logistic and probit models.
Formally, let q_{j,k} be the efficacy rate at combination (j, k), and denote

ℰ = {(j′, k′) : |j′ − j| ≤ 1, |k′ − k| ≤ 1},

which includes all dose combinations adjacent to the current combination (j, k), up to nine in total. The robit model for efficacy is given by

q_{j,k} = F_v(β_0 + β_1 d^A_j + β_2 (d^A_j)² + γ_1 d^B_k + γ_2 (d^B_k)²),   (3)

where d^A_j and d^B_k are the standardized doses of drugs A and B, and F_v(⋅) is the cumulative distribution function of Student's t distribution with v degrees of freedom. The incorporation of the quadratic terms (d^A_j)² and (d^B_k)² increases the model flexibility; for example, it enables us to capture a non-monotone dose-response surface. Simulation studies have demonstrated (results omitted) that excluding the two quadratic terms can result in poor operating characteristics for the proposed design. As discussed by Liu, 31 the robit regression model covers a rich class of flexible models for the analysis of binary data, including the logistic (when v ≈ 7) and probit (when v → ∞) models as special cases. Note that the efficacy model incorporates more local data, i.e. a larger local area, than the toxicity model because the dose-efficacy surface is more varied and unpredictable. Section 5.3 demonstrates that utilizing slightly more data improves parameter estimation; specifically, the design utilizing ℰ for the efficacy model outperforms the design based on the five-dose set 𝒩. Furthermore, we omit the interaction term d^A_j d^B_k from the efficacy model since its inclusion does not improve the performance of LOCRM12 in identifying the OBDC in small-sized studies (simulation results omitted). This finding is consistent with the results of Cai et al., 26 Iasonos et al., 32 and Mozgunov et al. 35

Let θ = (β_0, β_1, β_2, γ_1, γ_2) be the vector of parameters that characterizes the efficacy robit model (3). Denote the number of responders at combination (j, k) as y^E_{j,k}. Based on the locally observed efficacy data D^E_ℰ = {(y^E_{j,k}, n_{j,k}) : (j, k) ∈ ℰ}, the likelihood function for efficacy can be written as

L(D^E_ℰ | θ) = ∏_{(j,k)∈ℰ} q_{j,k}^{y^E_{j,k}} (1 − q_{j,k})^{n_{j,k} − y^E_{j,k}},

where the expression of q_{j,k} is given by the robit model (3). Let f(θ) denote the prior distribution of θ. Then the posterior distribution of θ is given by

f(θ | D^E_ℰ) ∝ L(D^E_ℰ | θ) f(θ).

The posterior mean of the efficacy probability q_{j,k}, denoted as q̂_{j,k}, is

q̂_{j,k} = ∫ q_{j,k} f(θ | D^E_ℰ) dθ.

Based on our investigation, we recommend independent normal priors for the components of θ in the efficacy model, namely β_0 ∼ N(μ_0, σ_0²), β_1, γ_1 ∼ N(μ_1, σ_1²), and β_2, γ_2 ∼ N(μ_2, σ_2²), together with a prior on the degrees of freedom v governed by hyperparameters a_df and b_df. In general, the elicitation of the hyperparameters depends on the availability of prior information and on the operating characteristics through calibration. If there are historical data on monotherapy or combination efficacy, the hyperparameters can be determined using the posterior distribution with a carefully inflated variance. Prior information is often limited in phase I studies, so it is recommended to use large values for σ_0², σ_1², and σ_2² to ensure a non-informative prior on q_{j,k}. Our investigation finds that using σ_0² = σ_1² = σ_2² = 1.3 or 3 results in similar performance for the proposed design. Our design is not affected by the value of μ_0, and thus we recommend μ_0 = 0 for general use. A positive value for μ_1 is recommended to ensure a monotonically increasing dose-response relationship at lower dose combinations, which facilitates exploration of the dose combination space at the beginning of the trial. In the absence of prior information on the dose-efficacy shape, setting μ_2 = 0 can accommodate a wide range of different shapes. Regarding the prior distribution on the degrees of freedom, we generally recommend a_df = 2 and b_df = 10 so that the robit model can balance robustness and a good approximation to the logistic or probit model. 31 In practice, specifying hyperparameters in Bayesian dose-finding trials is challenging and requires close collaboration between statisticians and clinical investigators, who should undertake comprehensive simulation investigations and exploit existing resources.

Algorithm 1 Dose-optimization algorithm of the LOCRM12 design

Startup stage
1. Treat the first cohort at dose combination (1, 1) or the physician-specified combination.
2. Suppose that the current dose combination is (j, k).
(a) If no toxicity event is observed at (j, k), randomly escalate to (j + 1, k) or (j, k + 1). If j = J and k < K, treat the next cohort at (J, k + 1). If j < J and k = K, treat the next cohort at (j + 1, K).
(b) If a toxicity event is observed at (j, k), the trial enters the main stage.

Main stage
1. Fit the toxicity and efficacy models as described in Sections 2.2 and 3.1.
2. Determine the local MTDC, (j*, k*), among the local doses using equation (1) and identify the set of acceptable dose combinations 𝒜 using equation (4).
3. Find the locally most efficacious dose combination (j†, k†) in 𝒜. If (j†, k†) is untried or all dose combinations in 𝒜 have been tried, allocate the next cohort of patients to (j†, k†).
4. If neither of the two conditions in step 3 is satisfied, consider the following two cases.
(a) If q̂_{j†,k†} > ((N − n)/N)^z, allocate the next cohort of patients to (j†, k†).
(b) Otherwise, allocate the next cohort of patients to an untried dose combination in 𝒜.
A more systematic way for hyperparameter elicitation can be found in Lin et al., 36 where the notion of prior effective sample size (PESS) is adopted to elicit an interpretable and relatively non-informative prior. The choices of 1.3 and 3 for the prior variances both result in a PESS of less than 2.
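To illustrate the robit link F_v used in the efficacy model (3), the sketch below evaluates the Student t CDF by numerically integrating its density and shows how a negative quadratic coefficient can produce an interior efficacy peak. All parameter values are hypothetical, chosen only for illustration:

```python
import math

def t_cdf(x, v, grid=2000):
    """CDF of Student's t with v degrees of freedom, by integrating the
    density from 0 to |x| and using symmetry (adequate for moderate x)."""
    c = math.exp(math.lgamma((v + 1) / 2) - math.lgamma(v / 2)) / math.sqrt(v * math.pi)
    h = abs(x) / grid if x != 0 else 0.0
    area = 0.0
    for i in range(grid + 1):
        t = i * h
        dens = c * (1 + t * t / v) ** (-(v + 1) / 2)
        area += dens * (h if 0 < i < grid else h / 2)
    return 0.5 + area if x >= 0 else 0.5 - area

def robit_efficacy(dA, dB, beta0, beta1, beta2, gamma1, gamma2, v=7):
    """Efficacy rate under a robit link with linear and quadratic terms
    in the standardized doses of each drug, as in model (3)."""
    eta = beta0 + beta1 * dA + beta2 * dA ** 2 + gamma1 * dB + gamma2 * dB ** 2
    return t_cdf(eta, v)
```

With v = 7 the link is close to logistic; a negative β_2 bends the linear predictor downward at the extremes, so efficacy can peak at an intermediate dose rather than the highest one.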
The proposed designs can be extended straightforwardly to explore combinations of more than two drugs. Suppose we aim to treat patients with p drugs. For the LOCRM design, the local space is expanded to include 2p + 1 dose combinations: the current combination, p higher dose combinations obtained by increasing the dose level of one drug while keeping the dose levels of the other p − 1 drugs fixed, and p lower dose combinations obtained analogously. The dose combinations higher and lower than the current one admit p! possible orders each, so we can enumerate the p! × p! complete orders in the local space and conduct a trial using the proposed design. For instance, when p = 3, this leads to a space of 36 complete orders. If the number of orders becomes too large, we can consider using the idea of POCRM to specify a subset of the complete orders as candidate models. The efficacy model can be easily extended to more than two drugs by adding linear and quadratic terms for the additional drugs, and the extension of its local dose space follows a process similar to that for the toxicity model. While it is straightforward to apply the proposed designs to combinations of more than two drugs, there may be other practical considerations when investigating multiple drugs in real-world trials. In practice, due to budget limitations and the consideration that many standard treatments have approved or labeled dose levels, it is more practical to vary the dose levels of at most two or three drugs.
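The p! × p! count of complete local orders can be checked directly; the sketch below (the neighbor labels such as "low0" are ours) enumerates them for p drugs:

```python
from itertools import permutations

def local_orders_multidrug(p):
    """Complete toxicity orders of the local space for p drugs: the p
    one-level-lower neighbors can be permuted among themselves, as can
    the p one-level-higher neighbors, with the current dose in between."""
    lowers = [f"low{m}" for m in range(p)]   # dose lowered in drug m
    uppers = [f"up{m}" for m in range(p)]    # dose raised in drug m
    return [list(lo) + ["current"] + list(up)
            for lo in permutations(lowers) for up in permutations(uppers)]
```

For p = 2 this reproduces the four orderings of Section 2.1, and for p = 3 it gives the 36 complete orders mentioned above.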

Trial design
We employ the same safety control rule as in the LOCRM design (Section 2.3). However, to optimize the toxicity-efficacy trade-off and accommodate the situation where a relatively toxic dose may yield much higher efficacy, we recommend a relatively loose target φ_T that may be slightly larger than the value specified for finding the MTDC. The LOCRM12 design consists of two stages; see the dose-finding steps in Algorithm 1. The first stage, that is, the startup stage, is a fast escalation stage to collect preliminary data in the dose matrix. As data are sparse at the start of the trial, we make escalation decisions independently of the toxicity and efficacy models, as long as no toxicity event is observed among the initial cohorts of patients. Once a toxicity event is observed in the most recent cohort during the safety assessment period, the trial seamlessly enters the main stage.
In the second stage (main stage) of LOCRM12, we use the proposed toxicity and efficacy models to inform escalation/de-escalation decisions. Generally, we treat the next cohort of patients at the locally most efficacious and safe dose combination, while accounting for proper safety control and balanced patient assignment between doses. Specifically, we follow a two-step rule to decide the locally optimal dose combination for the next cohort of patients. In the first step, we identify a set of safe dose combinations based on the estimated local MTDC (j*, k*). Specifically, let 𝒜 denote the admissible set that contains the doses with a posterior estimate of the toxicity probability no higher than that of dose combination (j*, k*), that is,

𝒜 = {(j, k) ∈ 𝒩 : p̂_{j,k} ≤ p̂_{j*,k*}},   (4)

where 𝒩 is the local dose set defined in Section 2. In the next step, we identify the locally most efficacious dose combination in 𝒜, denoted as (j†, k†), which achieves the highest posterior efficacy rate under model (3), that is, (j†, k†) = arg max_{(j,k)∈𝒜} q̂_{j,k}. Rather than using a simple pick-the-winner rule, we enhance the exploration-exploitation trade-off by balancing the patient allocation among the admissible set. In particular, if the current most efficacious dose (j†, k†) has never been tried, or if all dose combinations in 𝒜 have been tested already, we assign the next cohort of patients to (j†, k†). Otherwise, if (j†, k†) has been tried while there are still untried dose combinations in 𝒜, we further compare the estimated efficacy probability q̂_{j†,k†} with a sample size-dependent cutoff, ((N − n)/N)^z, where N is the maximum sample size of the main stage, n is the number of patients enrolled in the main stage, and z is a tuning parameter that controls the degree of balance of patient allocation. If q̂_{j†,k†} is larger than ((N − n)/N)^z, we are confident that treating the next cohort of patients at the optimal dose elicited by the efficacy model is more beneficial.
Hence, we allocate the next cohort of patients to (j†, k†). Otherwise, we explore untried doses. A larger value of z means the investigators have more interest in exploring untried doses, whereas a smaller z indicates more intent to treat patients at (j†, k†). To strike a balance between exploration and exploitation, we recommend z = 2. The sensitivity of the design to different values of z is investigated in Section 5.3.
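The adaptive cutoff ((N − n)/N)^z can be encoded in a few lines; the function below is an illustrative sketch of this allocation logic (the function name and return labels are ours, not the authors' code):

```python
def assign_next(q_best, best_tried, untried_in_A, n, N, z=2):
    """Allocation between exploitation and exploration in the main stage.

    q_best:       posterior efficacy estimate at (j-dagger, k-dagger);
    best_tried:   whether that combination has already been tried;
    untried_in_A: list of untried combinations in the admissible set;
    n, N:         patients enrolled so far / maximum for the main stage."""
    if not best_tried or not untried_in_A:
        return "best"                      # allocate directly to (j-dagger, k-dagger)
    if q_best > ((N - n) / N) ** z:
        return "best"                      # confident enough to exploit
    return "explore"                       # treat an untried admissible dose instead
```

Early in the trial (small n) the cutoff is close to 1, so exploration of untried doses is favored; as n approaches N the cutoff shrinks toward 0 and the design increasingly exploits the estimated best dose.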
In addition to the safety stopping rule used in LOCRM, we also continuously monitor the efficacy of the considered dose combinations. If the posterior efficacy probability satisfies

Pr(q_{j,k} < φ_E | D^E_ℰ) > c_E,   (5)

then the combination (j, k) is deemed futile and is eliminated from the trial. Here, φ_E is a prespecified lower limit on the efficacy rate, and c_E is a probability cutoff; φ_E can be chosen based on, for example, the response rate of the standard of care. At the end of the trial, we determine the OBDC by finding the most efficacious dose combination among a safe dose set, which is the collection of dose combinations whose isotonically estimated toxicity probability is no greater than that of the selected MTDC. For dose optimization trials that focus on balancing the risk-benefit tradeoff, we recommend using a larger value for φ_T than in trials finding the MTDC only, in order to accommodate scenarios where a slightly over-toxic dose combination may yield much higher efficacy than the lower dose combinations. Therefore, the MTDC selected in a dose optimization trial is usually higher than the MTDC picked in a conventional dose-finding trial. The selected OBDC is the dose combination that has been tested in the trial and has the highest estimated efficacy probability among the safe dose set. Dose combinations excluded as overly toxic or futile are not considered as candidates for the OBDC.

4 Simulation studies for finding MTDC
To evaluate the operating characteristics of the proposed LOCRM design, we choose a model-based design (POCRM 6 ) and a model-assisted design (BOIN Comb 8 ) as competing methods. Six representative scenarios are presented in Table 1. Two MTDCs exist in Scenarios 1 and 6, whereas three MTDCs exist in Scenarios 2 to 5. The MTDCs are located in the relatively lower part of the dose space in Scenario 1, and they move towards higher regions of the dose exploration space as we go from Scenarios 1 to 6. The target toxicity rate is set as φ_T = 0.3, and the cutoff value is c_T = 0.95. The maximum sample size is 51, and patients are treated in cohorts of size three. Under each design, we simulate 5000 trials for each scenario.
For the LOCRM design, we generate skeletons using the algorithm proposed by Lee and Cheung 33 (function getprior() in R package dfcrm). To implement the getprior() function, one needs to prespecify the target rate (φ_T = 0.3), the number of candidate doses (3, 4, or 5), the prior guess of the MTD (level 2, 3, or 4), and the halfwidth of the indifference interval (0.05). The indifference interval refers to a range of toxicity rates such that a dose level whose toxicity rate falls within this interval can be considered equivalent to the MTD. The prior distribution of parameter a_i is set as N(0, 2), i = 1, …, I. Assuming six toxicity orderings, 37 we simulate the POCRM design by modifying the R package pocrm. Specifically, in the startup stage, we randomly increase the dose of drug A or B while fixing the dose of the other drug, and the trial moves to the next stage when toxicity events are observed. This new startup rule makes the POCRM design more comparable with the other two designs. The skeleton used in POCRM is chosen using the getprior() function with the prior MTD guess at the seventh dose and a halfwidth of the indifference interval of 0.05. Results of BOIN Comb are generated by the R package BOIN using the default configuration. For a fair comparison between the competing methods, we use the same overdose control rule (2), based on the beta-binomial model, for all designs. The performance of LOCRM and the other methods is assessed based on four metrics: (a) the percentage of trials that ultimately choose one of the true MTDCs; (b) the number of patients assigned to any of the true MTDCs; (c) the percentage of trials that end up selecting a dose combination with a toxicity rate greater than φ_T; and (d) the number of patients treated at overly toxic doses. The first two metrics evaluate the design's ability to identify the target dose combinations, so high values are desirable. The latter two metrics assess safety, so smaller values indicate better safety control.
The operating characteristics of LOCRM and the competing methods are summarized in Table 2. On average, the proposed LOCRM design is comparable to the POCRM and BOIN Comb designs in terms of the four performance metrics. The difference in the percentage of selecting the true MTDCs between methods is no more than 11 percentage points across the various scenarios. The proposed design exhibits better safety control on average, especially in Scenarios 1 to 4, where the MTDCs are in the middle part of the dose matrix.

Simulation configuration
To demonstrate the desirable performance of the proposed LOCRM12 design, we compare it with two competing methods, Cai et al. 26 (referred to as Cai) and Yuan and Yin 25 (referred to as YY). Ten representative scenarios with five doses for drug A and three doses for drug B are presented in Table 3. The 10 scenarios incorporate various dose-toxicity and dose-efficacy relationships. Using rule (2) for overdose control, the cutoff value is set as c_T = 0.85. The parameter specification of the toxicity model used for the LOCRM12 design is the same as that described in Section 4 for the LOCRM design. The lower limit for the efficacy rate is set as E = 0.2, and, using equation (5) to monitor efficacy, the cutoff value is set as c_E = 0.9. The hyperparameters in the efficacy model in equation (3) are set as follows: … The impact of various prior distributions is investigated in Section 5.3.

For the LOCRM12 and Cai designs, we use the standardized doses d_A = (−1.26, −0.63, 0, 0.63, 1.26) and d_B = (−1, 0, 1) for drug A and drug B, respectively, so that the standardized doses are centered at 0 with unit standard deviation. We also take z = 2 for the LOCRM12 and Cai designs. For the YY design, we set the lowest acceptable response rate to 0.2, consistent with the value of E used in LOCRM12; the numbers of patients in the first and second stages are 21 and 30, respectively.

The operating characteristics of the proposed LOCRM12 design and the competing methods are summarized in Table 4. In Scenarios 7 and 8, where the OBDCs are located off the diagonal of the dose matrix, LOCRM12 performs best in OBDC identification. In Scenario 7 in particular, LOCRM12 selects the correct OBDC with a probability 7.5 percentage points higher than the Cai design (59.0% vs. 51.5%) and 21.4 percentage points higher than the YY design (59.0% vs. 37.6%). LOCRM12 also allocates 7.0 and 12.9 more patients to the OBDC than Cai (24.0 vs. 17.0) and YY (24.0 vs. 11.1), respectively.
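The standardized doses quoted above can be reproduced by centering the raw dose levels and scaling them to unit sample standard deviation. A minimal sketch, assuming the levels are simply coded 1, …, J (an assumption; the actual dosages are not given in this excerpt):

```python
from statistics import mean, stdev

def standardize(levels):
    """Center dose levels at 0 and scale to unit (sample) standard deviation."""
    m, s = mean(levels), stdev(levels)
    return [round((x - m) / s, 2) for x in levels]

# Five levels for drug A and three levels for drug B
d_A = standardize([1, 2, 3, 4, 5])   # → [-1.26, -0.63, 0.0, 0.63, 1.26]
d_B = standardize([1, 2, 3])         # → [-1.0, 0.0, 1.0]
```

Note that the sample (not population) standard deviation is what reproduces the quoted values for the five-level drug.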
In terms of safety, the LOCRM12 design exhibits a better safety profile than the Cai design: in Scenarios 7 and 8, the Cai design allocates 9.3 and 15.0 more patients than LOCRM12 to overly toxic dose combinations, respectively. In Scenarios 9 and 14, LOCRM12 yields comparable performance to the Cai design in terms of OBDC identification, while showing an advantage in safety control. In Scenarios 10 to 13, the LOCRM12 design is the most efficient for OBDC identification. In Scenario 11, the YY design shows the best safety control but a limited ability to correctly select the OBDC(s) and TDC(s). In Scenario 15, the YY design performs best in selecting the OBDC and TDCs; however, the LOCRM12 design retains an advantage in patient allocation, assigning 5.7 and 11.3 more patients than YY to the OBDC and TDCs, respectively. Moreover, the YY design is very sensitive to the scenario specification: in Scenario 13, it selects the OBDC(s) with a probability of only 0.7% and selects overly toxic dose combinations with a probability of 50.6%. In Scenario 16, where all dose combinations are overly toxic, LOCRM12 performs similarly to the Cai and YY designs. In Scenarios 9-11 and 15, where the TDCs include more dose combinations than the OBDCs, the LOCRM12 design exhibits promising operating characteristics in terms of both the selection percentage of TDCs and the number of patients treated at TDCs. For example, in Scenario 10, the selection percentage of TDCs under LOCRM12 is as high as 86.7%, while those of Cai and YY are 81.6% and 3.9%, respectively; the numbers of patients treated at TDCs under LOCRM12, Cai, and YY are 32.2, 29.1, and 4.6, respectively. In Scenarios 9 and 15, the selection percentages of TDCs are comparable between LOCRM12 and Cai, but the LOCRM12 design allocates more patients to TDCs.
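The efficacy futility rule of equation (5) is not reproduced in this excerpt. A common form of such a rule, sketched here purely as an assumption, drops a dose combination when the posterior probability that its efficacy rate falls below the lower limit E = 0.2 exceeds the cutoff c_E = 0.9, under a beta-binomial model with an illustrative Beta(1, 1) prior:

```python
from math import comb

def prob_eff_below(r, m, psi, a0=1, b0=1):
    """Posterior Pr(p_E < psi) given r responses in m patients under a
    Beta(a0, b0) prior; closed form for integer shape parameters."""
    a, b = a0 + r, b0 + m - r          # posterior is Beta(a, b)
    n = a + b - 1
    # For integer a, b: Pr(p < psi) = sum_{j=a}^{n} C(n, j) psi^j (1-psi)^(n-j)
    return sum(comb(n, j) * psi**j * (1 - psi) ** (n - j) for j in range(a, n + 1))

def futile(r, m, psi=0.2, cutoff=0.9):
    """Drop a dose combination whose efficacy rate is likely below psi."""
    return prob_eff_below(r, m, psi) > cutoff

# Example: 0 responses in 10 patients gives posterior Beta(1, 11), so
# Pr(p_E < 0.2) = 1 - 0.8^11 ≈ 0.914 > 0.9 and the combination is dropped.
```

With even a few responses (e.g. 3 of 10) the posterior probability drops well below the cutoff and the combination remains admissible.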

Sensitivity analysis
In this section, we investigate the robustness of the proposed design under various configurations of the prior distributions and design parameters. Specifically, we study the impact of the settings for toxicity modeling (i.e. the halfwidth of the indifference interval), the impact of the parameters for efficacy modeling (including the hyperparameters of the priors in equation (3)) and of the value of z (which determines the aggressiveness of dose exploration, as shown in Algorithm 1), and the impact of the sample size and cohort size. In Setting (a), we investigate the performance of LOCRM12 using a smaller halfwidth of the toxicity indifference interval; a smaller halfwidth places the prior toxicity probabilities closer together, which to some extent promotes escalation. In Setting (b), we use a smaller (more informative) prior variance for the unknown parameter in the CRM model. In Setting (c), we similarly examine larger prior variances in the efficacy model. In Settings (d) and (e), we study the performance of LOCRM12 under different values of z. In Setting (f), we investigate whether incorporating data from only the smaller set of five doses is sufficient for the efficacy model. Settings (g) and (h) investigate the performance of the design under different cohort sizes and sample sizes, respectively.

Simulation results for Scenarios 7 to 11 are shown in Figure 1. The proposed method is relatively robust under Settings (a) to (e), with a fluctuation in the correct selection percentage of no more than 6 percentage points and a fluctuation in the percentage of patients allocated to the OBDCs of no more than 7 percentage points in each scenario. This indicates that LOCRM12 is fairly robust to the configurations considered in Settings (a) to (e). Comparing (f) and (i) in Figure 1, the design that restricts the efficacy model to the smaller dose set is, on average, inferior to the design that uses the larger set.
As a result, incorporating more local data for efficacy monitoring may be more efficient and accurate for finding the OBDCs. Comparing (g) and (i), we find that LOCRM12 is fairly robust to the cohort size, although a cohort size of three slightly outperforms a cohort size of one in most scenarios. A cohort size of three is also generally preferred because a cohort size of one adds complexity to implementation and logistics. Furthermore, according to (h) and (i), LOCRM12 performs better as the sample size increases. Hence, if time and budget permit, a larger sample size can improve the design's accuracy in identifying the optimal dose combination.

Discussion
This article presents two designs, LOCRM and LOCRM12, for early-phase exploratory dose-combination trials, aimed at determining the MTDC and the OBDC, respectively. The LOCRM design applies the CRM locally and makes escalation and de-escalation decisions among the doses adjacent to the current dose. Local modeling and decision-making make LOCRM efficient and robust across various scenarios. Simulations show that LOCRM is comparable to the POCRM and BOIN Comb designs in identifying the MTDC and, owing to the local model, has a better safety profile than the other methods. The LOCRM12 design uses LOCRM as its toxicity model and a robit regression model for efficacy. The extra parameter (the degrees of freedom) in the robit regression enhances the robustness of the efficacy estimates. Extensive simulations show that LOCRM12 outperforms other model-based methods in most of the scenarios considered. The simulation results also suggest that the proposed designs perform well when the prior distributions are reasonable; if clinicians have prior knowledge of the investigational drug's toxicity or efficacy, they can incorporate this information through appropriate informative priors. Setting appropriate prior distributions and design parameters remains a nontrivial task for model-based designs: although our designs use simpler models than many others, proper calibration of priors and parameters is still required, and effective collaboration between biostatisticians and clinicians is crucial to ensure the design's efficiency and robustness. R code implementing the LOCRM and LOCRM12 designs is available at the GitHub repository https://github.com/ruitaolin/LOCRM.
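The robit link mentioned above replaces the probit's normal CDF with the CDF of a Student-t distribution, whose degrees-of-freedom parameter controls how heavy the tails are (and hence how strongly outlying responses influence the fit). A sketch of the link function by numerical integration of the t density; ν = 7 is an illustrative choice here, not the paper's setting:

```python
from math import gamma, pi, sqrt

def t_pdf(x, nu):
    """Density of the Student-t distribution with nu degrees of freedom."""
    c = gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))
    return c * (1 + x * x / nu) ** (-(nu + 1) / 2)

def robit_link(eta, nu=7, lo=-40.0, steps=40000):
    """Robit link: Pr(response) = T_nu(eta), the t CDF at the linear
    predictor eta, approximated by the trapezoidal rule on [lo, eta]."""
    h = (eta - lo) / steps
    s = 0.5 * (t_pdf(lo, nu) + t_pdf(eta, nu))
    s += sum(t_pdf(lo + k * h, nu) for k in range(1, steps))
    return s * h

# By symmetry the t CDF at 0 is 0.5; smaller nu gives heavier tails,
# which is what makes the robit fit more robust than the probit.
```

In practice one would use a library CDF rather than this quadrature; the point is only that the link is a t CDF indexed by ν.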