Causal mediation and sensitivity analysis for mixed-scale data

The goal of causal mediation analysis, often described within the potential outcomes framework, is to decompose the effect of an exposure on an outcome of interest along different causal pathways. Using the assumption of sequential ignorability to attain non-parametric identification, Imai et al. (2010) proposed a flexible approach to measuring mediation effects, focusing on parametric and semiparametric normal/Bernoulli models for the outcome and mediator. Less attention has been paid to the case where the outcome and/or mediator models are mixed-scale, ordinal, or otherwise fall outside the normal/Bernoulli setting. We develop a simple but flexible parametric modeling framework to accommodate the common situation where the responses are mixed continuous and binary, and apply it to a zero-one inflated beta model for the outcome and mediator. Applying our proposed methods to the publicly available JOBS II dataset, we (i) argue for the need for non-normal models, (ii) show how to estimate both average and quantile mediation effects for boundary-censored data, and (iii) show how to conduct a meaningful sensitivity analysis by introducing unidentified, scientifically meaningful sensitivity parameters.


Not for Publication Supplementary Material

SI2C. Conditional on X_i, M_i(0), and M_i(1), the mean of Y_i(a, m) is given by

The following result establishes that SI2C identifies the average causal mediation effects.
Proposition 2. Suppose that SI1, SI2C, and SI3 hold. Then we have

Proof. As in the proof of Proposition 1, iterated expectation gives

Plugging this expression for E_θ[r_y{M_i(a′), a, X_i}] into (S.9) finishes the proof.
SI2C has several advantages over SI2A-SI2B. First, we feel that shifts directly on the scale of the mean are more easily interpreted than shifts on the logit scale. Like the shift on the logit scale developed in Section 5.1, λ can be interpreted as shifting the causal effect of m into an association with M_i(a). For example, if we had used the linear model

To elicit a default range of λ's we can use the approach outlined in Section 5.1 by fitting a linear regression of Y_i on (M_i, A_i, X_i).
The second benefit of SI2C is that the linearity removes the need to specify the correlation ρ between M_i(a) and M_i(a′), so that the sensitivity analysis only requires eliciting a range of plausible λ's. This is very helpful, as ρ is more difficult to interpret than λ.
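The regression step used above to elicit a default range for λ can be sketched as follows. This is only an illustration on synthetic data: the helper name `fit_linear_regression`, the simulated coefficients, and the sample size are assumptions made for the example, and the rule from Section 5.1 that turns the fit into a range of λ's is not reproduced here. The sketch simply reports the fitted coefficient on M_i and its standard error, which is the kind of quantity such a rule would start from.

```python
import numpy as np

def fit_linear_regression(y, m, a, x):
    """Ordinary least squares of y on (1, m, a, x).

    Returns the coefficient vector and classical standard errors,
    ordered as (intercept, m, a, x columns...).
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    design = np.column_stack([np.ones(len(y)), m, a, x])
    coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    dof = design.shape[0] - design.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coef, np.sqrt(np.diag(cov))

# Synthetic stand-in for (Y_i, M_i, A_i, X_i); NOT the JOBS II data.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
a = rng.integers(0, 2, size=n)
m = 0.5 * a + 0.3 * x + rng.normal(size=n)
y = 0.4 * m + 0.2 * a - 0.1 * x + rng.normal(scale=0.5, size=n)

coef, se = fit_linear_regression(y, m, a, x)
beta_m, se_m = coef[1], se[1]
print(f"coefficient on M: {beta_m:.3f} (s.e. {se_m:.3f})")
```

With real data, `y`, `m`, `a`, and `x` would be replaced by the observed outcome, mediator, treatment, and covariates.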
Using Proposition 2 we can again develop a Monte Carlo implementation of the g- .1 gives a sensitivity analysis under SI2C for the JOBS II data, with a reasonable range for λ now obtained from a linear regression of Y_i on (A_i, M_i, X_i). The results are substantively in agreement with the logit-scaled sensitivity analysis: there are no values of λ that lead to evidence of either direct or indirect effects of the treatment.
For the linear model, it is easy to see why λ does not greatly influence our conclusions: as shown in Proposition 2, the influence of λ is through the average treatment effect on the . Because the sign of ϖ is uncertain, the direction by which λ shifts E_θ[Y_i{a, M_i(a′)}] is itself uncertain, so that increasing |λ| has more effect on the uncertainty of the mediation effects than it does on the point estimates.

Sample U ∼ Uniform(0, 1)
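The claim that |λ| mainly affects uncertainty rather than point estimates can be illustrated with a toy simulation. The sketch below assumes, purely for illustration, that the shift enters additively as λ·ϖ; the exact expression from Proposition 2 is not reproduced here. The posterior draws `base` and `varpi` are hypothetical normal samples, with ϖ centred at zero to mimic an uncertain sign. None of the numbers come from the JOBS II analysis.

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws = 4000

# Hypothetical posterior draws (assumptions for illustration only):
# `base` is the unshifted quantity, `varpi` has an uncertain sign.
base = rng.normal(loc=0.02, scale=0.01, size=n_draws)
varpi = rng.normal(loc=0.0, scale=0.05, size=n_draws)

def summarize(lam):
    """Posterior mean and 95% interval width of base + lam * varpi."""
    shifted = base + lam * varpi
    lo, hi = np.quantile(shifted, [0.025, 0.975])
    return float(shifted.mean()), float(hi - lo)

results = {lam: summarize(lam) for lam in (0.0, 0.5, 1.0)}
for lam, (mean, width) in results.items():
    print(f"lambda={lam:.1f}  mean={mean:.4f}  interval width={width:.4f}")
```

In this toy setting the posterior mean barely moves as λ grows, because the draws of ϖ average out near zero, while the interval widens steadily.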

Figure S.2 and Figure S.3 give the sensitivity analyses for the values of ρ = 0 and ρ = 0.5, respectively, under SI2B.

Figure S.4 shows traceplots of the causal mediation effects, used to assess mixing of the Markov chains fit to the JOBS II data. Figure S.5 shows the same for the log posterior density. In both cases, the Markov chains mix rapidly, and there is no evidence of any problems.
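The mixing assessment above is visual. A common numerical complement, which is not something the text says was used here, is the Gelman-Rubin potential scale reduction factor computed across chains. The sketch below implements the basic (non-split) version on simulated chains; the function name `gelman_rubin_rhat` and the simulated draws are assumptions for illustration.

```python
import numpy as np

def gelman_rubin_rhat(chains):
    """Basic Gelman-Rubin R-hat for an array of shape (n_chains, n_iter)."""
    chains = np.asarray(chains, dtype=float)
    n_chains, n_iter = chains.shape
    chain_means = chains.mean(axis=1)
    chain_vars = chains.var(axis=1, ddof=1)
    between = n_iter * chain_means.var(ddof=1)   # between-chain variance B
    within = chain_vars.mean()                   # within-chain variance W
    var_hat = (n_iter - 1) / n_iter * within + between / n_iter
    return float(np.sqrt(var_hat / within))

rng = np.random.default_rng(2)

# Four simulated chains that agree with one another.
well_mixed = rng.normal(size=(4, 1000))

# The same chains, with the last one shifted to mimic a stuck chain.
stuck = well_mixed + np.array([0.0, 0.0, 0.0, 3.0])[:, None]

rhat_good = gelman_rubin_rhat(well_mixed)
rhat_bad = gelman_rubin_rhat(stuck)
print(f"R-hat, agreeing chains: {rhat_good:.3f}")
print(f"R-hat, one shifted chain: {rhat_bad:.3f}")
```

Values close to 1 are consistent with the chains exploring the same distribution, while values well above 1 flag disagreement between chains.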
for 15: Approximate δ(a), ζ(a), τ with

Figure S.3: Sensitivity of inferences about δ(a) and ζ(a) to changes in the sensitivity parameter λ under assumptions SI2A and SI2B when ρ = 0.5. The dashed line is the posterior mean, and the bands delimit a pointwise 95% credible interval.
Figure S.5: Mixing of the log-posterior density across the four chains.