The staircase cluster randomised trial design: A pragmatic alternative to the stepped wedge

This article introduces the ‘staircase’ design, derived from the zigzag pattern of steps along the diagonal of a stepped wedge design schematic where clusters switch from control to intervention conditions. Unlike a complete stepped wedge design where all participating clusters must collect and provide data for the entire trial duration, clusters in a staircase design are only required to be involved and collect data for a limited number of pre- and post-switch periods. This could alleviate some of the burden on participating clusters, encouraging involvement in the trial and reducing the likelihood of attrition. Staircase designs are already being implemented, although in the absence of a dedicated methodology, approaches to sample size and power calculations have been inconsistent. We provide expressions for the variance of the treatment effect estimator when a linear mixed model for an outcome is assumed for the analysis of staircase designs in order to enable appropriate sample size and power calculations. These include explicit variance expressions for basic staircase designs with one pre- and one post-switch measurement period. We show how the variance of the treatment effect estimator is related to key design parameters and demonstrate power calculations for examples based on a real trial.

We have Then using the definition of the inverse of a partitioned matrix, the bottom-right element of (Z .

B.1.2 Approach 2: Matrix inversion
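Another way to solve the system of Lagrange multiplier equations is by expressing the problem in vector and matrix form. Then the vector of weights $w = (w_1, w_2, w_3)^\top$ can be written as a scalar multiple of $\Sigma^{-1}\mathbf{1}$, where $\mathbf{1}$ is a column vector of ones. Since the weights must also satisfy the condition $w_1 + w_2 + w_3 = 1$, that is, $\mathbf{1}^\top w = 1$, the scalar must equal $1/(\mathbf{1}^\top \Sigma^{-1} \mathbf{1})$. Then the weights can be represented as

$$w_i = \frac{s_i}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}}, \quad i = 1, 2, 3,$$

where $s_i$ is the sum of the elements in row $i$ of $\Sigma^{-1}$. Since $\mathbf{1}^\top \Sigma^{-1} \mathbf{1}$ is the sum of all elements of $\Sigma^{-1}$, the optimal weights can be obtained using only the row sums and the total sum of the elements of $\Sigma^{-1}$.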
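As a worked illustration of the row-sum result, the following sketch assumes that $\Sigma$ is the $3 \times 3$ tridiagonal matrix with $2a$ on the diagonal and $-b = -a\psi$ on the off-diagonals, which is the form described for the S-sequence case in Appendix C.1. This form is an assumption about the four-sequence setting, and the algebra below is our own rather than quoted from the source.

```latex
\Sigma = \begin{pmatrix} 2a & -b & 0 \\ -b & 2a & -b \\ 0 & -b & 2a \end{pmatrix},
\qquad
\Sigma^{-1} = \frac{1}{4a(2a^2 - b^2)}
\begin{pmatrix} 4a^2 - b^2 & 2ab & b^2 \\ 2ab & 4a^2 & 2ab \\ b^2 & 2ab & 4a^2 - b^2 \end{pmatrix}

% Row sums and total sum of \Sigma^{-1}
s_1 = s_3 = \frac{2a + b}{2(2a^2 - b^2)}, \qquad
s_2 = \frac{a + b}{2a^2 - b^2}, \qquad
\mathbf{1}^\top \Sigma^{-1} \mathbf{1} = \frac{3a + 2b}{2a^2 - b^2}

% Optimal weights, w_i = s_i / (1' \Sigma^{-1} 1), with b = a\psi
w_1 = w_3 = \frac{2a + b}{2(3a + 2b)} = \frac{2 + \psi}{2(3 + 2\psi)}, \qquad
w_2 = \frac{a + b}{3a + 2b} = \frac{1 + \psi}{3 + 2\psi}
```

Under this assumption the weights sum to one, and the middle weight exceeds the outer two whenever $\psi > 0$.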
Note that unlike when categorical period effects are assumed, all cluster-period means contribute to the treatment effect estimate when linear period effects are assumed.
Then the variance of the treatment effect estimator can be represented as a quadratic form in the weights. We can find the optimal weights by using the method of Lagrange multipliers to minimise the expression $a\left[w_1^2 + w_2^2 + w_3^2 + w_4^2 - 2\psi(w_1 w_4 + w_2 w_3)\right]$ subject to two constraints, $g_1(w_1, w_2, w_3, w_4) = 0$ and $g_2(w_1, w_2, w_3, w_4) = 0$, where $g_2(w_1, w_2, w_3, w_4) = w_1 + w_2 + w_3 + w_4 - 1$. Then we solve $\nabla_{w_1, w_2, w_3, w_4, \lambda_1, \lambda_2} L(w_1, w_2, w_3, w_4, \lambda_1, \lambda_2) = 0$, which yields a system of six equations in the four weights and the two multipliers. Plugging the previous results into (5) gives $w_4 = -\frac{1}{a(1+\psi)}\lambda_1$, and plugging them into (6) gives a second expression. Equating these results determines $\lambda_1$. Then plugging this result into each of the forms for each $w_i$, we obtain the weights.

Unlike when categorical period effects are assumed, these weights do not depend on the correlation between cluster-period means, $\psi$. In the resulting treatment effect estimator, each of the cluster-period means under a particular treatment condition has a different weight: the weights on the cluster-period means under the intervention condition have decreasing magnitude, and the weights on the cluster-period means under the control condition have increasing magnitude (and opposite sign to those under the intervention condition), with progression in time across the periods of the trial.
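The behaviour of these weights can be checked numerically. The sketch below is illustrative only and rests on assumptions not stated in the surviving text: a four-sequence basic staircase design with one cluster per sequence, sequence $s$ observed in period $s$ (control) and period $s+1$ (intervention), a linear period effect, and cluster-period means with variance $a$ and within-cluster covariance $a\psi$. The function name is hypothetical, and generalised least squares is used in place of the Lagrange multiplier algebra; both give the best linear unbiased estimator.

```python
import numpy as np


def linear_period_gls_weights(a, psi, n_seq=4):
    """GLS weights on the cluster-period means for the treatment effect.

    Assumed design: sequence s is observed in period s (control) and
    period s + 1 (intervention), one cluster per sequence, with a linear
    period effect. Returns (weights, variance); the weights are ordered
    (control, intervention) within each sequence.
    """
    b = a * psi  # covariance between the two means of the same cluster
    rows = []
    for s in range(1, n_seq + 1):
        rows.append([1.0, s, 0.0])      # control period: intercept, time, X = 0
        rows.append([1.0, s + 1, 1.0])  # intervention period: X = 1
    design = np.array(rows)
    # Block-diagonal covariance: clusters are independent of one another.
    V = np.kron(np.eye(n_seq), np.array([[a, b], [b, a]]))
    V_inv = np.linalg.inv(V)
    cov = np.linalg.inv(design.T @ V_inv @ design)
    weights = (cov @ design.T @ V_inv)[2]  # row for the treatment effect
    return weights, cov[2, 2]


for psi in (0.1, 0.5, 0.9):
    w, var = linear_period_gls_weights(a=1.0, psi=psi)
    print(psi, np.round(w[1::2], 3), np.round(w[0::2], 3), round(var, 3))
```

Under these assumptions the intervention weights are 0.4, 0.3, 0.2 and 0.1 in period order and the control weights are -0.1, -0.2, -0.3 and -0.4, whatever the value of $\psi$, while the variance, $a(3 - 2\psi)/5$, does depend on $\psi$.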
If $K$ clusters were randomised to each sequence, then the variance would simply be multiplied by a factor of $1/K$.

C Derivation of the variance of the treatment effect estimator, S-sequence basic staircase design

C.1 Categorical period effects
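Following the approach outlined in Section 3.2 but for an $S$-sequence basic staircase design, we can write the treatment effect estimator as a weighted sum of the cluster-period means,

$$\hat{\theta} = \sum_{s=1}^{S} \left( w_{s,s}\,\bar{Y}_{s,s} + w_{s,s+1}\,\bar{Y}_{s,s+1} \right),$$

where $\bar{Y}_{s,t}$ denotes the mean for the cluster in sequence $s$ in period $t$ and $w_{s,t}$ its weight. Since we want $\hat{\theta}$ to be an unbiased estimator of $\theta$, the following conditions must hold: (i) $w_{1,1} = 0$ and $w_{S,S+1} = 0$; (ii) $\sum_{s=1}^{S-1} w_{s,s+1} = 1$. As seen in the four-sequence case, the weights within each period must sum to zero so that the period effects cancel. Hence the outcomes from the cluster-periods in the first and last periods have weights of zero, as in (i), and therefore do not contribute to the treatment effect estimate, and the cluster-periods measured in the same period receive weights that are equal in magnitude but opposite in sign.

Then letting $w_s = w_{s,s+1} = -w_{s+1,s+1}$ for all $s = 1, \ldots, S-1$, the treatment effect estimator $\hat{\theta}$ can be written in simplified form as

$$\hat{\theta} = \sum_{s=1}^{S-1} w_s \left( \bar{Y}_{s,s+1} - \bar{Y}_{s+1,s+1} \right),$$

with the constraint that $\sum_{s=1}^{S-1} w_s = 1$, and the variance of the treatment effect estimator can be represented as

$$\operatorname{var}(\hat{\theta}) = 2a \sum_{s=1}^{S-1} w_s^2 - 2a\psi \sum_{s=1}^{S-2} w_s w_{s+1}.$$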
The optimal weights $w_s$ that give the lowest variance can be found by solving the system of Lagrange multiplier equations. A general analytical form for the weights can be obtained by representing the system of equations as a matrix equation. Then the vector of weights can be expressed in terms of the elements of the inverse of an $(S-1) \times (S-1)$ symmetric tridiagonal Toeplitz matrix $\Sigma$, where, with $b = a\psi$,

$$\Sigma = \begin{pmatrix} 2a & -b & & \\ -b & 2a & \ddots & \\ & \ddots & \ddots & -b \\ & & -b & 2a \end{pmatrix}.$$
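The matrix formulation can be sketched numerically as follows. This is a minimal illustration, not the authors' code: the function name is hypothetical, $\Sigma$ is assumed to have $2a$ on the diagonal and $-a\psi$ on the first off-diagonals, one cluster per sequence is assumed, and the weights are taken to be the row sums of $\Sigma^{-1}$ divided by the sum of all of its elements (the standard solution to minimising $w^\top \Sigma w$ subject to the weights summing to one).

```python
import numpy as np


def optimal_weights(S, a, psi):
    """Optimal weights and variance for an S-sequence basic staircase design.

    Assumes categorical period effects, one cluster per sequence, and a
    tridiagonal Toeplitz Sigma with 2a on the diagonal and -a*psi on the
    first off-diagonals.
    """
    b = a * psi
    n = S - 1
    Sigma = 2 * a * np.eye(n) - b * (np.eye(n, k=1) + np.eye(n, k=-1))
    # Solving Sigma x = 1 gives the row sums of Sigma^{-1} without inverting.
    row_sums = np.linalg.solve(Sigma, np.ones(n))
    weights = row_sums / row_sums.sum()
    variance = weights @ Sigma @ weights  # quadratic form w' Sigma w
    return weights, variance


weights, variance = optimal_weights(S=4, a=1.0, psi=0.3)
print(np.round(weights, 4), round(variance, 4))
```

With these weights the quadratic form equals the reciprocal of the sum of all elements of $\Sigma^{-1}$, so the variance can also be read off without forming the weights.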
The matrix $\Sigma$ represents the covariance matrix of "one-dependent random variables", i.e. the differences in cluster-period means, $\bar{Y}_{s,s+1} - \bar{Y}_{s+1,s+1}$, each of which has variance $2a$ and covariance $-a\psi$ $(= -b)$ with the differences in adjacent periods (da Fonseca and Petronilho, 2001). Then the optimal weights can be represented as

$$w_i = \frac{s_i}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}}, \quad i = 1, \ldots, S-1,$$

where $s_i = \sum_{j=1}^{S-1} \sigma_{ij}$ is the sum of row $i$ of $\Sigma^{-1}$ and $\mathbf{1}^\top \Sigma^{-1} \mathbf{1}$ is the sum of all elements of $\Sigma^{-1}$, with $\sigma_{ij} = [\Sigma^{-1}]_{ij}$. The weights can then be expressed solely in terms of the elements of the first row of $\Sigma^{-1}$.

While the variance of the treatment effect estimator can be obtained by plugging these weights back into the earlier variance expression and simplifying, since we are dealing with "one-dependent random variables", we can exploit the more direct result that the precision of the treatment effect estimator is simply the sum of all the elements of $\Sigma^{-1}$ (da Fonseca and Petronilho, 2001):

$$\operatorname{var}(\hat{\theta}) = \frac{1}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}}.$$

If $K$ clusters were randomised to each of the $S$ sequences of the design, then the variance of the treatment effect estimator would simply be multiplied by a factor of $1/K$, so that

$$\operatorname{var}(\hat{\theta}) = \frac{1}{K\,\mathbf{1}^\top \Sigma^{-1} \mathbf{1}}.$$

C.2 Categorical period effects, using expression (3)

C.2.1 Setup
We seek to derive an analytical variance expression for the basic staircase design, with $R_0 = R_1 = 1$. Then $X^\top = (0, 1)$ and the covariance matrix for a single cluster, assumed common across clusters, is of the form

$$\begin{pmatrix} a & b \\ b & a \end{pmatrix}, \quad \text{and so its inverse is} \quad \frac{1}{a^2 - b^2} \begin{pmatrix} a & -b \\ -b & a \end{pmatrix}.$$

Note that with just two periods of measurement per cluster, the block-exchangeable and discrete-time correlation decay intracluster correlation structures are equivalent.

C.2.2 Categorical period effects
Consider a basic staircase design with $S$ unique treatment sequences. Assuming categorical period effects, the matrices encoding the time effects, $Z_s$, $s = 1, \ldots, S$, are $2 \times (S+1)$-dimensional matrices consisting entirely of zeroes except for a $2 \times 2$ identity matrix starting in column $s$:

$$Z_s = \begin{pmatrix} 0_{2 \times (s-1)} & I_2 & 0_{2 \times (S-s)} \end{pmatrix}.$$

Tan (2019) provided some useful results pertaining to the inverse of real symmetric $n \times n$ tridiagonal matrices in which the first and last diagonal elements differ from the remaining diagonal elements. The results we will use below hold when the inner diagonal element $c \neq 2$. In our case, $d = a/b$, $b = 1$ and $c = 2d = 2a/b$, and so $c \neq 2$ since $a \neq b$. For simplicity of notation, we will let $d = a/b$ and substitute $a$ and $b$ back in at the end of the derivation.
The elements of $Q^{-1}$, the inverse of the $(S+1) \times (S+1)$ tridiagonal matrix $Q$, can be represented in terms of the quantities $\kappa_i$, $\kappa$ and $\kappa_0$. The sum of the elements of the $i$th row of $Q^{-1}$ can be written in the same terms. We can then represent $\Omega$, the sum of the elements of the inner rows of $Q^{-1}$, from which the variance of the treatment effect estimator follows. Then the variance of the treatment effect estimator, denoted $\operatorname{var}(\hat{\theta})_{SC(S,K,1,1),\mathrm{lin}}$, is obtained.

C.4 Specific cases, S ∈ {2, 3, 4}
We present the analytical variance expressions for some specific cases: for S = 2, S = 3 and S = 4 unique treatment sequences.
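As a numerical companion to these specific cases, the sketch below evaluates the variance under categorical period effects for a given $S$. It is illustrative only and rests on stated assumptions: $X = (0, 1)^\top$, a within-cluster covariance matrix with $a$ on the diagonal and $b$ off the diagonal, the matrices $Z_s$ as described in C.2.2, and $Q$ taken to be $\sum_s Z_s^\top V^{-1} Z_s$. The function name is hypothetical, and the variance is computed from the standard partitioned-matrix form of the generalised least squares information, which may not match expression (3) term for term.

```python
import numpy as np


def staircase_variance_categorical(S, a, b, K=1):
    """Variance of the GLS treatment effect estimator, categorical periods.

    Assumes a basic staircase design with S sequences, K clusters per
    sequence, X = (0, 1)' and cluster covariance matrix [[a, b], [b, a]].
    """
    V_inv = np.linalg.inv(np.array([[a, b], [b, a]]))
    X = np.array([[0.0], [1.0]])
    xvx = 0.0
    zvx = np.zeros((S + 1, 1))
    Q = np.zeros((S + 1, S + 1))
    for s in range(1, S + 1):
        Z = np.zeros((2, S + 1))
        Z[:, s - 1:s + 1] = np.eye(2)  # identity block starting in column s
        xvx += (X.T @ V_inv @ X).item()
        zvx += Z.T @ V_inv @ X
        Q += Z.T @ V_inv @ Z
    # Information for the treatment effect after adjusting for period effects.
    precision = xvx - (zvx.T @ np.linalg.solve(Q, zvx)).item()
    return 1.0 / (K * precision)


for S in (2, 3, 4):
    print(S, round(staircase_variance_categorical(S, a=1.0, b=0.3), 4))
```

Under these assumptions and with $K = 1$, the computed values agree with $2a$ for $S = 2$, $a - b/2$ for $S = 3$ and $(2a^2 - b^2)/(3a + 2b)$ for $S = 4$. These closed forms are our own algebra, obtained from the sum of the elements of $\Sigma^{-1}$ in C.1, and are not quoted from the source.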