Abstract
We build on the existing literature to formulate a class of models for multivariate mixtures of response types: Gaussian responses, ordered or unordered categorical responses, and continuous responses that are not Gaussian. Each response can be defined at any level of a multilevel data hierarchy. We describe a Markov chain Monte Carlo algorithm for fitting such models. We show how this framework unifies a number of disparate problems, including partially observed data and missing data in generalized linear modelling. The two-level model is considered in detail, with worked examples of applications to a prediction problem and to multiple imputation for missing data. We conclude with a discussion outlining possible extensions and connections in the literature. Software for estimating the models is freely available.
