Multiple imputation (MI) is one of the principled methods for dealing with missing data. In addition, multilevel models have become a standard tool for analyzing the nested data structures that result when lower level units (e.g., employees) are nested within higher level collectives (e.g., work groups). When applying MI to multilevel data, it is important that the imputation model takes the multilevel structure into account. In the present paper, based on theoretical arguments and computer simulations, we provide guidance for using MI in the context of several classes of multilevel models, including models with random intercepts, random slopes, cross-level interactions (CLIs), and missing data in categorical and group-level variables. Our findings suggest that, in many cases, several approaches to MI provide an effective treatment of missing data in multilevel research. Yet we also note that the current implementations of MI still have room for improvement when handling missing data in explanatory variables in models with random slopes and CLIs. We identify areas for future research and provide recommendations for research practice along with a number of step-by-step examples for the statistical software R.
