Estimation of Latent Regression Item Response Theory Models Using a Second-Order Laplace Approximation

The estimation of high-dimensional latent regression item response theory (IRT) models is difficult because of the need to approximate integrals in the likelihood function. Proposed solutions in the literature include stochastic approximations, adaptive quadrature, and Laplace approximations. We propose using a second-order Laplace approximation of the likelihood to estimate IRT latent regression models with categorical observed variables and fixed covariates, where all parameters are estimated simultaneously. The method applies when the IRT model has a simple structure, meaning that each observed variable loads on only one latent variable. Through simulations using a latent regression model with binary and ordinal observed variables, we show that the proposed method is a substantial improvement over the first-order Laplace approximation with respect to bias. In addition, the approach is as precise as or more precise than alternative methods for estimating multidimensional IRT models when the number of items per dimension is moderately high. At the same time, the method is highly computationally efficient in the high-dimensional settings investigated. The results imply that estimation of simple-structure IRT models with very high dimensions is feasible in practice and that the direct estimation of high-dimensional latent regression IRT models is tractable even with large sample sizes and large numbers of items.


Introduction
Item response theory (IRT) is a class of models for categorical observed variables where an underlying latent variable is assumed to generate the observed variables. Through different formulations of the probability, conditional on the latent variable, for the response in each category of each observed variable, different IRT models are obtained. The primary areas of application of IRT models are in education and psychology, where the models are used to estimate the individual latent variable, to assess the properties of scales and tests, and to infer population characteristics.
Maximum likelihood estimation of IRT models involves the calculation of integrals which, for most IRT models, have no tractable analytical solution and have to be approximated. With three latent dimensions or fewer, computationally efficient implementations using fixed numerical quadrature rules are the most commonly used approaches (Bock & Aitkin, 1981). However, as the dimension increases, the computational expense grows exponentially, which makes fixed quadrature methods excessively demanding in computation and memory when the dimension is higher than three. A second approach is to replace the fixed quadrature rules with adaptive quadrature, which means that each unique integral is approximated with a unique set of quadrature points (Cagnone & Monari, 2013; Rabe-Hesketh et al., 2002; Schilling & Bock, 2005). The adaptive quadrature methods reduce the required number of quadrature points per dimension, but the computational expense still increases exponentially with higher dimensions, making the approach prohibitively demanding in very high dimensions. A third approach uses stochastic approximations such as the Metropolis-Hastings Robbins-Monro (MH-RM) method (Cai, 2010a, 2010b; Chalmers, 2015). The stochastic methods are slower than the alternatives with few dimensions, but the computational expense grows only linearly with an increased number of dimensions. A fourth option is the Laplace approximation (Huber et al., 2004; Joe, 2008; Pinheiro & Bates, 1995; Shun, 1997), which uses asymptotic expansions to approximate the required integral. The first-order Laplace approximation is formally equivalent to adaptive quadrature with only one quadrature point per dimension.
Hence, the computational demand of the first-order Laplace approximation grows only linearly with increasing dimensionality; as a result, in high-dimensional models, the first-order Laplace approximation is by far the most computationally efficient of the four referenced methods. The downside, however, is the inaccuracy of the approximation, especially with few observed variables per dimension and with dichotomous observed variables (Joe, 2008). To improve the accuracy, higher order Laplace approximations can be pursued, which has been done for generalized linear models (Raudenbush et al., 2000), generalized linear latent variable models with ordinal data (Bianconcini & Cagnone, 2012), and confirmatory factor analysis with ordinal data and a probit link (Jin et al., 2017). However, a higher order Laplace approximation requires a substantial number of higher order derivatives and greatly increases the computational expense, especially for high-dimensional models (Bianconcini, 2014).
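To make the scaling contrast concrete, the following sketch (an illustration, not code from the article) counts likelihood-evaluation points per person for a fixed quadrature grid with q points per dimension versus the single point per dimension used by the first-order Laplace approximation; q = 20 matches the EM settings used later in the simulations.

```python
def evaluation_counts(p, q=20):
    """Evaluation points per person in p latent dimensions: a fixed
    quadrature grid needs q**p points (exponential growth), while the
    first-order Laplace approximation uses one point per dimension
    (linear growth)."""
    return {"quadrature": q ** p, "laplace": p}

for p in (1, 2, 3, 6, 12):
    counts = evaluation_counts(p)
    print(p, counts["quadrature"], counts["laplace"])
```

Already at six dimensions the grid has 64 million points per person, which is why fixed quadrature becomes infeasible beyond three dimensions.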

Andersson and Xin
An extension of the regular IRT model that incorporates a latent regression component is used in many areas, for example, in large-scale educational assessment programs such as the Programme for International Student Assessment, the National Assessment of Educational Progress, and the National Assessment of Basic Education Quality (NAEQ; Jiang et al., 2019). With a latent regression IRT model, the usual assumption of a fixed mean vector is replaced with the assumption that each individual has a mean vector defined by observed covariates and regression parameters. These models have also been called explanatory IRT models (De Boeck & Wilson, 2013) in the literature and can, for many IRT models, be formulated as nonlinear mixed models (Rijmen et al., 2003). The estimation of latent regression models again requires the approximation of integrals. In current operational procedures in large-scale educational assessment programs, estimation is done in two steps (von Davier & Sinharay, 2013). First, the item parameters are estimated using a unidimensional model without considering the covariates. Second, assuming fixed item parameters, the latent regression and covariance parameters are estimated. The estimation of the latent regression and covariance parameters can be done with an expectation-maximization (EM) algorithm using a second-order Laplace approximation (Thomas, 1993) or using stochastic approximation (von Davier & Sinharay, 2010). It is also possible to utilize adaptive quadrature, stochastic approximations, or a Laplace approximation and estimate the item and regression parameters simultaneously (Chalmers, 2015; Harrell, 2015; Raudenbush et al., 2000), but such methods have typically not been applied to large-scale assessment programs, partly because of the large computational requirements and partly because a second-order Laplace approximation has not been available for many IRT models.
The purpose of this article is to introduce an estimation method for latent regression IRT models using second-order Laplace approximations of the integrals in the likelihood function. As a special case of the approach, a second-order Laplace estimation method for regular IRT models without covariates is obtained. The method is introduced for multidimensional simple-structure IRT models (also called between-item IRT models), where each item can be individually modeled with any of the nominal response, graded response, generalized partial credit, or three-parameter logistic (3-PL) models. Previous implementations of the higher order Laplace approximation with item response models have focused on estimation of only the latent regression and covariance parameters (Thomas, 1993), Rasch-type models (Raudenbush et al., 2000), or particular models for ordinal data without a latent regression (Jin et al., 2017). The approach in this article extends previous research using the second-order Laplace approximation by providing a general method that applies to any item response model and by applying the method to several common IRT models for which the second-order Laplace method was not previously available.

Latent Regression IRT Models
Let Y denote the J × N matrix of item responses and let X denote the K × N matrix of fixed covariates, including a constant equal to 1 for each individual. Also, let y_i and x_i denote the J × 1 item response vector and K × 1 covariate vector for individual i, respectively. Define θ as the p × 1 latent variable vector and let P(y_i | θ, α) be the likelihood of the response vector y_i conditional on θ, where α denotes the p_α × 1 vector of all item parameters. In a latent regression model, the latent variable vector is related to the covariates by the regression model θ | x_i = βx_i + ε, where ε ~ N(0, Σ) and β is a p × K matrix of regression coefficients. The marginal likelihood of a response vector y_i is then

P(y_i) = ∫ P(y_i | θ, α) φ(θ; βx_i, Σ) dθ,

where φ(·; μ, Σ) denotes the multivariate normal density with mean vector μ and covariance matrix Σ, and the marginal log-likelihood for all n individuals is

l(α, β, Σ) = ∑_{i=1}^{n} log P(y_i).
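As a sketch of the quantity being approximated, the following example evaluates the marginal likelihood above by plain Monte Carlo for a single individual under a hypothetical unidimensional 2-PL latent regression model. The function names and parameter values are illustrative assumptions, not the article's implementation; the article replaces this integral with a Laplace expansion rather than simulation.

```python
import math
import random

def p2pl(theta, a, b):
    """2-PL probability of a correct response (coded y = 2)."""
    return 1.0 / (1.0 + math.exp(-(a * theta + b)))

def marginal_loglik_mc(y, a, b, mean, sd, draws=20000, seed=1):
    """Monte Carlo approximation of log P(y_i) for one individual in a
    unidimensional latent regression model, where `mean` plays the role
    of beta'x_i and `sd` the residual standard deviation.  Toy sketch;
    hypothetical parameter values, not the article's method."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        theta = rng.gauss(mean, sd)       # draw from theta | x_i
        lik = 1.0
        for yj, aj, bj in zip(y, a, b):   # local independence: product over items
            p = p2pl(theta, aj, bj)
            lik *= p if yj == 2 else (1.0 - p)
        total += lik
    return math.log(total / draws)
```

For an all-correct response pattern, the marginal log-likelihood is higher when the covariate-implied mean βx_i is larger, as expected.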

IRT Models
The IRT model is defined by the likelihood P(y_i | θ, α), which due to the local independence assumption is equal to the product ∏_{j=1}^{J} P(y_ij | θ, α_j), where y_ij denotes the item response to the jth item, with y_ij taking values k ∈ {1, ..., m_j}, and α_j is a p_{α_j} × 1 vector of item parameters for item j. There are numerous models for the item responses y_i. We consider one model for nominal data, namely the nominal response model, with probabilities defined by (Bock, 1972)

P(k | θ, α_j) = exp(a′_jk θ + b_jk) / ∑_{c=1}^{m_j} exp(a′_jc θ + b_jc),

where b_j1 = 0 and a_jc and a_jk are p × 1 vectors such that a_j1 = 0. Let a_j be a p × 1 vector of discrimination parameters. For ordinal data, we consider the graded response model, with probabilities defined by (Samejima, 1969)

P(k | θ, α_j) = P*(k | θ, α_j) − P*(k + 1 | θ, α_j),

where

P*(k | θ, α_j) = 1 / [1 + exp(−(a′_j θ + b_jk))],

with P*(1 | θ, α_j) = 1 and P*(m_j + 1 | θ, α_j) = 0. We also consider the generalized partial credit model, a special case of the nominal response model with a_jk = (k − 1)a_j, with probabilities defined by (Muraki, 1992)

P(k | θ, α_j) = exp((k − 1)a′_j θ + b_jk) / ∑_{c=1}^{m_j} exp((c − 1)a′_j θ + b_jc).

For dichotomous items, with y_ij = 2 indicating a correct response, these three models coincide, resulting in the two-parameter logistic (2-PL) model

P(2 | θ, α_j) = 1 / [1 + exp(−(a′_j θ + b_j))].

With dichotomous data, we will also consider the 3-PL model, with probabilities defined by (Birnbaum, 1968)

P(2 | θ, α_j) = c_j + (1 − c_j) / [1 + exp(−(a′_j θ + b_j))],

where c_j is the guessing parameter. Note that, for numerical stability in estimation, a reparametrization of the guessing parameter is used when estimating the unknown parameters with the 3-PL model. Depending on the exact model formulation, IRT models may be part of other modeling frameworks. For example, the Rasch model can be formulated as a generalized linear mixed model (Clayton, 1996), the graded response model with a probit link as a confirmatory factor analysis model with categorical observed variables (Bartholomew, 1980), and the generalized partial credit model and nominal response model as generalized linear latent variable models (Bartholomew et al., 2011) or as nonlinear mixed models (Rijmen et al., 2003). Meanwhile, the 3-PL model can be viewed as a type of latent class model (Vermunt, 2001).
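As an illustration of the graded response model just defined, the sketch below (a toy with assumed parameter values, not code from the article) computes the category probabilities P(k | θ, α_j) = P*(k) − P*(k + 1) for one item. By construction the probabilities telescope to 1, and with two categories the model reduces to the 2-PL.

```python
import math

def grm_probs(theta, a, b_thresholds):
    """Graded response model category probabilities for one item with a
    unidimensional theta.  b_thresholds holds b_j2, ..., b_jm (so there
    are m - 1 entries); P*(1) = 1 and P*(m + 1) = 0 by convention."""
    def pstar(bjk):
        return 1.0 / (1.0 + math.exp(-(a * theta + bjk)))
    cum = [1.0] + [pstar(b) for b in b_thresholds] + [0.0]
    # Adjacent differences of the cumulative probabilities.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]
```

Decreasing thresholds (e.g., b_j2 > b_j3) keep all category probabilities positive; the telescoping sum is exactly 1 regardless of θ.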

Parameter Estimation Using a Second-Order Laplace Approximation
The Laplace approximation has been proposed to approximate integrals of the form

I(J) = ∫ e^{−J h(x)} dx,

with 0 < J < ∞ and h(x) a smooth function with a unique minimum at x_0. We can then write the integral as (Shun, 1997)

I(J) = (2π/J)^{p/2} |H_0|^{−1/2} e^{−J h(x_0)} (1 + R/J + ...),

where H_0 denotes the Hessian matrix of h at x_0, R is a correction term built from the third- and fourth-order derivatives of h at x_0, and {b_jk} correspond to the entries in H_0^{−1}. Consider now the latent regression model defined by θ_i | x_i ~ N(βx_i, Σ) and a simple-structure item response model. Let h_i(θ) = −log[P(y_i | θ, α) φ(θ; βx_i, Σ)] and let θ̂_i denote the minimizer of h_i(θ). For parameter identification, we impose the restrictions diag(Σ) = (1, ..., 1) and ∑_{i=1}^{n} βx_i = 0. Let the p_β × 1 vector β* and the p_Σ × 1 vector Σ* denote the vectors of free parameters in β and Σ, respectively. The proposed estimation method directly maximizes an approximation to the marginal likelihood function with a quasi-Newton method. The second-order Laplace approximation of the likelihood for an individual i takes the form

P(y_i) ≈ (2π)^{p/2} |H_i|^{−1/2} exp[−h_i(θ̂_i)] (1 + e_0),

where H_i denotes the Hessian of h_i at θ̂_i, with h^{jk}_i denoting the kth column and jth row entry of the inverse of H_i, and the Laplace log-likelihood is the sum of the logarithms of these approximations over individuals. The term e_0 is, for the simple structure assumed here, a correction built from the third- and fourth-order derivatives of h_i at θ̂_i, and its form simplifies considerably since all third-order and higher cross-derivatives are 0. We then maximize this expression with respect to the unknown parameters to obtain the second-order Laplace approximation maximum likelihood estimator of α, β*, and Σ*. To implement the estimation method, the gradient of the Laplace-approximated log-likelihood is needed; the gradient components for each parameter block r ∈ {α, β*, Σ*} are expressed compactly through an auxiliary function v of column vectors x, y, and z, and the full expressions are lengthy. Maximization of the approximated marginal likelihood is done by employing a modified Newton-Raphson method. In iteration m, the parameter estimates are updated by a quasi-Newton step of the form

ψ_m = ψ_{m−1} + l_m B_{m−1}^{−1} ∇_{m−1},

where ψ denotes the vector of all unknown parameters, B_{m−1} is the approximation to the Hessian matrix with the parameter estimates from iteration m − 1, ∇_{m−1} is the corresponding gradient, and l_m is the step length for iteration m.
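The behavior of the expansion I(J) = (2π/J)^{p/2} |H_0|^{−1/2} e^{−J h(x_0)} (1 + R/J + ...) can be illustrated in one dimension. The sketch below is a toy with an assumed h (not the article's h_i): it compares the first-order approximation and a second-order version, using the standard one-dimensional correction built from h''' and h'''' (cf. Shun, 1997), against an accurate numerical value.

```python
import math

# Toy integrand: h has its unique minimum at x0 = 0, h(0) = 0, h''(0) = 1.
def h(x):  return x**2 / 2 + x**4 / 4
def h2(x): return 1 + 3 * x**2   # h''
def h3(x): return 6 * x          # h'''
def h4(x): return 6.0            # h''''

def laplace(J, order=1, x0=0.0):
    """First- or second-order Laplace approximation of
    I(J) = integral of exp(-J*h(x)) dx over the real line."""
    base = math.sqrt(2 * math.pi / (J * h2(x0))) * math.exp(-J * h(x0))
    if order == 1:
        return base
    # Standard O(1/J) correction from the third and fourth derivatives.
    corr = (5 * h3(x0)**2 / (24 * h2(x0)**3) - h4(x0) / (8 * h2(x0)**2)) / J
    return base * (1 + corr)

def truth(J, lo=-4.0, hi=4.0, n=200001):
    """Accurate trapezoidal reference value (tails are negligible)."""
    step = (hi - lo) / (n - 1)
    s = sum(math.exp(-J * h(lo + i * step)) for i in range(n))
    s -= 0.5 * (math.exp(-J * h(lo)) + math.exp(-J * h(hi)))
    return s * step
```

For this h, the second-order factor at x_0 = 0 is 1 − 0.75/J, and the relative error of the second-order approximation is several times smaller than that of the first-order one, mirroring the faster error decay discussed later in the article.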
The matrix B_{m−1} can be set to ∑_{i=1}^{n} ∇_i^{(m−1)} ∇_i^{(m−1)′}, the approximation used in the Berndt, Hall, Hall, and Hausman (BHHH; Berndt et al., 1974) algorithm, or to the approximation from the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm. In our experience, using the BFGS approximation results in better convergence rates for small sample sizes but requires considerably more iterations to converge compared to using the BHHH approximation. Hence, BFGS is recommended for small sample sizes, while BHHH is recommended for large sample sizes. The step length l_m is by default set to 0.5 for the first 25 iterations and to 1 for the remaining iterations in order to stabilize the algorithm during the initial steps. A line search (Nocedal & Wright, 2006) may also be employed to find an optimal value of l_m, but since the evaluation of the gradient and the likelihood requires the minimization of h_i(θ) for each individual, it is typically more efficient to set the step length to a fixed value, even though this requires slightly more iterations of the algorithm. The required derivatives for the nominal response model, graded response model, generalized partial credit model, and 3-PL model are provided as Supplementary Material in the online version of the journal.
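To illustrate the BHHH update in the simplest possible setting, the toy sketch below (an illustrative assumption, not the LaplaceIRT implementation) fits an intercept-only logistic model by maximizing the log-likelihood with BHHH steps and the fixed step-length scheme described above (0.5 for the first 25 iterations, 1.0 afterward).

```python
import math

def bhhh_fit(y, max_iter=200, tol=1e-4):
    """Maximize the Bernoulli log-likelihood for P(y=1) = 1/(1+exp(-b))
    with BHHH steps: b_m = b_{m-1} + l_m * B^{-1} * grad, where B is the
    sum of squared per-observation score contributions.  y is coded 0/1.
    Toy example only."""
    b = 0.0
    for m in range(1, max_iter + 1):
        p = 1.0 / (1.0 + math.exp(-b))
        scores = [yi - p for yi in y]           # per-observation gradient
        g = sum(scores)
        B = sum(s * s for s in scores) or 1e-12  # BHHH outer-product (scalar here)
        step = 0.5 if m <= 25 else 1.0           # fixed step-length scheme
        new_b = b + step * g / B
        if abs(new_b - b) < tol:                 # stopping criterion on the change
            return new_b
        b = new_b
    return b
```

With 75% successes, the estimate converges to logit(0.75) = log 3, and at the maximum the BHHH matrix approximates the Fisher information, which is what makes the update behave like a Newton step near convergence.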
To estimate the asymptotic covariance matrix of the parameter estimator, one of three estimators is proposed: the inverse of the observed information matrix, the inverse of the empirical cross-product matrix (Σ̂_XPD), and the sandwich estimator (Σ̂_Sand). In practice, the calculation of the observed information matrix is done by numerically differentiating the observed gradient from Equation 18 with respect to the unknown parameters. This calculation can be time-consuming, since the gradient evaluation involves the minimization of h_i(θ) for each individual. For correctly specified models and with large sample sizes, the matrix Σ̂_XPD is appropriate, while Σ̂_Sand is needed when the model is misspecified. It can be noted that the matrix Σ̂_XPD is a by-product of the estimation method and is thus easily obtained within the proposed approach. In this article, the finite-sample properties of the estimators based on the observed information matrix and the empirical cross-product matrix are evaluated through simulations.

Numerical Results
To explore the properties of the proposed estimation method, three simulation studies were performed. In the simulations, the R package mirt (v. 1.30; Chalmers, 2012) was used for the fixed quadrature EM and MH-RM estimation methods, while an alpha version of the R package LaplaceIRT (Andersson & Jin, 2020) was used for the Laplace estimation. Both R packages utilize C++ code for the heavy computations needed. In addition, the MH-RM implementation in the commercial software flexMIRT (Cai & Wirth, 2013) was used. In the remainder of the article, we refer to the MH-RM implementation in mirt as MH-RM1 and the MH-RM implementation in flexMIRT as MH-RM2. The first simulation study considered simple-structure dichotomous item response models up to a dimension of six, and the second simulation study considered simple-structure polytomous item response models up to a dimension of 12. The covariance matrices used for these two studies were based on the 12 × 12 covariance matrix given in Table 1, which consists of combinations of the covariance terms 0.4, 0.5, and 0.6 and is similar to the setting used in Cai (2010b). The third simulation study used a five-dimensional latent regression model with 60 binary and ordinal items and 100 continuous covariates, designed to mimic a large-scale educational assessment.
The estimation methods EM, first-order Laplace (Lap1), second-order Laplace (Lap2), MH-RM1, and MH-RM2 were considered in the study, but the EM method was only used for models with a dimension of three since estimation in reasonable time was not possible for the higher dimensional models. All estimation methods used the absolute difference in the parameter estimates between successive iterations as the stopping criterion, with a tolerance of 0.0001. The EM algorithm with fixed quadrature used 20 quadrature points; the MH-RM1 method used five draws per iteration, the gain function (0.1/t)^0.75, with t denoting the iteration number, and a maximum number of iterations in the third stage equal to 4,000; the MH-RM2 method used five draws per iteration, the default gain function, and a maximum number of iterations in the third stage equal to 4,000. The settings for the MH-RM1 method were similar to those used in Yang and Cai (2014) and Chalmers and Flora (2014) and worked better than the settings used in Cai (2010b) based on initial testing. The standard errors were estimated with the Oakes method (Oakes, 1999) for the EM algorithm, with the post-convergence approximation (Yang & Cai, 2014; denoted FMHRM in mirt) using 10,000 draws for the MH-RM1 method, and with the recursive approximation using the default tolerance level in flexMIRT for the MH-RM2 method. Numerical differentiation of the observed gradient with finite differences was used to calculate the observed information matrix with the Laplace estimation methods. The starting values for the EM, MH-RM1, and MH-RM2 estimation methods were set to the default values used in the respective software implementations. The starting values for the Laplace estimation methods were set to 0.5 for the covariance parameters, to a random number uniformly selected between 0.8 and 1.2 for the discrimination parameters, and for the intercept parameters to 0, (1, −1), (2, 0, −2), (3, 1, −1, −3), and (4, 2, 0, −2, −4), depending on the number of categories (2, 3, 4, 5, and 6). In this study, nonconvergence was identified by the failure to converge within the maximum number of iterations or by obtaining a nonpositive definite observed information matrix after convergence. The number of replications was 200 for each of the simulation settings used.
The evaluation criteria were, for a parameter x with estimate x̂_i in replication i, the absolute bias |∑_{i=1}^{n} (x̂_i − x)| / n and the root mean square error (RMSE) √(∑_{i=1}^{n} (x̂_i − x)² / n), where n here denotes the number of replications. The evaluation criteria were averaged for each type of parameter in each setting. For each simulation setting, we only report the bias and RMSE from the estimation methods that attained a convergence rate of 50% or higher.
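The two evaluation criteria can be computed as follows (a direct transcription of the formulas above, with n denoting the number of replications):

```python
import math

def abs_bias(estimates, true_value):
    """Absolute bias: |sum(xhat_i - x)| / n over replications."""
    n = len(estimates)
    return abs(sum(e - true_value for e in estimates)) / n

def rmse(estimates, true_value):
    """Root mean square error: sqrt(sum((xhat_i - x)^2) / n)."""
    n = len(estimates)
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
```

Note that the absolute bias can be zero while the RMSE is not, since over- and underestimates cancel in the bias but not in the squared errors.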

Simulation Study 1: Multidimensional 2-PL Models
To evaluate the properties of the first- and second-order Laplace approximation estimation methods with dichotomous items, three- and six-dimensional 2-PL models were used, with 3 and 6 dichotomous items per dimension. The covariance matrices for the models were set to the corresponding upper submatrices of the matrix in Table 1. Sample sizes 500, 1,000, and 2,000 were used for each condition. The nonconvergence rates for each simulation setting are given in Tables 2 and 3. Due to high nonconvergence rates, the results for Lap1 were excluded for all models except the six-dimensional model with 6 items per dimension. For the six-dimensional model with only 3 items per dimension, MH-RM1 also had excessively high nonconvergence rates, whereas Lap2 had moderate nonconvergence rates (Table 3). MH-RM2 had no issues with nonconvergence. Note that only the replications for which all compared estimators converged are included in the presented results. The full results can be obtained from the first author upon request.
The results for the multidimensional 2-PL models can be found in Tables 4 through 7. Concerning the parameter estimates, the three-dimensional model with 3 items per dimension (Table 4) shows higher bias overall with Lap2 compared to EM, MH-RM1, and MH-RM2, while the Lap2 method has lower RMSE, except for the covariance parameters. The bias is not much reduced for Lap2 with an increased sample size. With 6 items per dimension (Table 5), the differences between EM, MH-RM1, MH-RM2, and Lap2 decrease, and there are no substantial differences between the estimation methods. With six dimensions and 3 items per dimension (Table 6), Lap2 is uniformly better than MH-RM2 with regard to bias and RMSE. Furthermore, the bias is reduced with a larger sample size. For the case of six dimensions and 6 items per dimension (Table 7), the performance of Lap2 is overall somewhat better compared to MH-RM1 and MH-RM2. Meanwhile, both the bias and RMSE are substantially improved with Lap2 compared to Lap1 for all parameters.

Simulation Study 2: High-Dimensional Graded Response Model
To evaluate the proposed estimation method with a combination of dichotomous and polytomous observed variables, the three-dimensional simple-structure graded response model in Cai (2010b) was used as the basis for a simulation study in which, in addition to the three-dimensional case, six- and 12-dimensional models were also considered. As in Simulation Study 1, the covariance matrices for the three- and six-dimensional models were set to the corresponding upper submatrices of the matrix in Table 1. The 12-dimensional model used the full covariance matrix. For the three-dimensional model with a sample size of 500, the second-order Laplace estimation took an average of approximately 3 seconds using a 3.3GHz CPU (Core i5-4590) with 16GB RAM, which can be compared to the time with MH-RM reported in Cai (2010b), which was 20 seconds with a 2.0GHz CPU and 2GB RAM. The estimation methods EM, Lap1, Lap2, MH-RM1, and MH-RM2 were considered in the study, but EM was only used with the three-dimensional models since estimation in reasonable time was not possible for the higher dimensional models. In addition, it was not possible to use MH-RM1 in the simulation with the 12-dimensional model due to the excessively long time required for estimation and for calculation of the observed information matrix (with sample size 500, one replication took more than 150 minutes to finish). For all conditions considered, the nonconvergence rates were below 1%.
The results for the three-dimensional model can be seen in Table 8. Overall, Lap1 has higher bias than the other estimation methods. However, Lap1 has lower RMSE for the item parameters compared to the other estimation methods with the smaller sample sizes. For the parameter estimates, EM, Lap2, MH-RM1, and MH-RM2 are virtually identical across all settings and evaluation criteria, with the exception of higher bias and RMSE for the covariance parameters with MH-RM2. The six-dimensional results can be found in Table 9. The results mirror those for the three-dimensional case, with Lap1 having higher bias than the alternatives but maintaining a lower RMSE for the small sample case. Lap2 has the overall lowest bias across all estimation methods for each setting, although the differences are small with the largest sample size. The standard errors with the three- and six-dimensional models are very close in accuracy and precision between the different estimation methods, with the exceptions that MH-RM1 has slightly higher RMSE overall and that MH-RM2 has higher bias and RMSE overall. With the 12-dimensional model, the Laplace and MH-RM2 estimation methods were the only ones possible to use, and the results can be seen in Table 10. The results are similar to the lower dimensional cases in that Lap1 has higher bias but lower variance, resulting in a lower RMSE for the small sample sizes, and in that MH-RM2 has higher bias and RMSE with respect to the covariance parameters. Lap2 has bias that is slightly lower on average than with the lower dimensional models and has the lowest overall bias among the considered estimation methods. The standard errors are highly similar with respect to bias and RMSE between Lap1 and Lap2, while the bias and RMSE are slightly higher with MH-RM2.

Simulation Study 3: Latent Regression in Large-Scale Assessment
A five-dimensional model based on the NAEQ 2015 mathematics assessment was also used, where the correlations between the latent variables were all set to .8. The model had 60 generalized partial credit model items (12 in each dimension, with 6 dichotomous and 6 polytomous three-category items in each dimension) and 100 uncorrelated covariates, yielding a total of 690 parameters. A sample size of 60,000 was used, similar to the real Grade 8 NAEQ assessment, where each individual answered on average 20 items in total, meaning that on average 40 item responses per individual were missing due to the matrix sampling design. To obtain reasonable starting values for the item and regression parameters, an initial analysis of the data using a one-dimensional latent regression model was conducted with a random sample of 10,000 from the simulated data in each replication. The five-dimensional model estimation then used starting values equal to the estimated item and regression parameters from the one-dimensional model. Only the first- and second-order Laplace approximation estimation methods were possible to use due to the large computational and memory requirements. With the first- and second-order Laplace approximation methods, each replication took approximately 30 minutes to finish, and for this reason, only one simulation condition was considered and the observed information matrix was not calculated.
The simulation results are given in Table 11. The bias is substantially lower with Lap2 for the item and covariance parameters, while the improvement is smaller for the regression parameters. The same applies with respect to the RMSE, except that the differences between Lap1 and Lap2 are somewhat smaller. The standard errors were estimated with the cross-product approximation due to the large computational resources required to calculate the observed information matrix. The standard errors are approximately equally accurate and precise for either of the two estimation methods.

Conclusions
In this article, a second-order Laplace approximation was introduced for the estimation of multidimensional simple-structure item response models with a latent regression component. Through numerical illustrations using realistic settings in education and psychology, it was shown that the estimation method gave a substantial improvement over the first-order Laplace approximation for the estimation of latent regression models and multidimensional IRT models with both binary and ordinal data. Furthermore, the estimation method was typically equally or more precise in estimating multidimensional IRT models when compared to alternatives such as the EM algorithm with fixed quadrature and two implementations of the MH-RM, while considerably reducing the time required for convergence with high-dimensional models. The second-order Laplace approximation provides a fast yet accurate method for estimating high-dimensional simple-structure IRT models and enables the direct estimation of high-dimensional simple-structure latent regression IRT models in situations with large numbers of items, covariates, and individuals, such as in large-scale educational assessment programs.
In previous research, the first-order Laplace approximation has been shown to function poorly with a low number of observed dichotomous variables (Joe, 2008). The results of this study echo this finding but suggest that the second-order Laplace approximation can substantially improve over the first-order approximation with dichotomous observed variables. Note that with as few as three observed variables per dimension, the second-order Laplace approximation estimation method can still exhibit bias that is larger than that of the alternatives. However, when the number of observed dichotomous variables per dimension was increased to six, the second-order Laplace approximation was equally good or better than any of the alternatives considered with respect to the RMSE. The practical performance of the Laplace approximation-based estimation methods is related to the theoretical error rates of the approximations used (Shun, 1997). With an increased number of items, the error of the approximations decreases, and with a higher order approximation, the error decreases at a faster rate compared to a lower order approximation.
Besides the reduced time needed for estimation, the Laplace approximation has several other advantages compared to the stochastic approximation methods. The convergence diagnostics are more straightforward, since the observed-likelihood gradient provides an inexpensive convergence check to supplement the stopping criterion. Although the second-order test via the observed information matrix can be expensive to obtain, this test is also required with the stochastic approximation methods, for which the matrix has to be approximated via simulation, often requiring a substantial number of replicates for the accuracy to be sufficient. Another advantage is the fast calculation of the observed log-likelihood, used for model comparisons and hypothesis tests. With stochastic approximation methods, obtaining an accurate approximation of the log-likelihood requires heavy computations, which for large sample sizes can take even longer to finish than the actual estimation process. A further advantage of the Laplace estimation method is that it is not heavily dependent on the estimation settings used. With the MH-RM method, choices have to be made regarding how many samples to draw at each iteration, how to select the gain constants, and how accurate the approximations to the observed information matrix and observed log-likelihood should be. The optimal choices for these settings can vary greatly across problems, which makes the application of the MH-RM somewhat difficult from the user perspective. It can be noted that the mirt implementation of the MH-RM (denoted MH-RM1 in the article) exhibited convergence problems with small sample sizes and with few dichotomous items. We used MH-RM estimation settings that are highly similar to those in previous studies (Chalmers & Flora, 2014; Yang & Cai, 2014) and found that other settings, such as those in Cai (2010b), did not improve the performance. However, it is possible that tuning the settings further could have improved the convergence rates, and we also note that the flexMIRT implementation (denoted MH-RM2 in the article) did not exhibit the same nonconvergence problems. The performance of MH-RM1 and MH-RM2 is dependent on the specific settings used in this study. By changing settings such as the maximum number of iterations, the stopping criterion, or the type of standard error estimation, different performance could be obtained. The settings in the Laplace approximation require less fine-tuning, but it may be necessary to try multiple optimizers or step lengths during estimation, particularly for small sample sizes. To exemplify the impact of the specific settings used, we can note that the standard errors from the recursive approximation (used with the MH-RM2 method) have a tendency to underestimate the true standard errors, while the cross-product matrix used in Simulation Study 3 has a tendency to overestimate the standard errors with small sample sizes.
In the current study, we only investigated the computational efficiency of the different methods using a single CPU core. With the Laplace approximation methods, multiple cores are easily utilized through parallel computation of each individual's contribution to the log-likelihood and gradient. We also note that estimation methods using MH-RM easily utilize multiple cores, which will improve their computational efficiency when many cores are available. In addition, using one draw instead of the five draws used here will increase the computational efficiency of MH-RM. Comparing the computational efficiency of estimation methods that incorporate parallel computing is a useful area for future research.
A deficiency of estimation with a higher order Laplace approximation is that the computational requirement for models with a more complicated structure increases greatly in higher dimensions. For example, with 12 dimensions and items that load on all latent variables, the summations in Equation 12 contain up to 12^6 nonzero elements, instead of only up to 12^2 as in the simple-structure case given in Equation 16. Hence, it is unlikely that the second-order Laplace approximation will be more efficient than the stochastic approximation methods for the most general models, although it may provide a benefit for models that are slightly more complex than simple-structure models (e.g., when each item loads on only two out of many latent variables), provided that the computer implementation takes the specific structure into account. It is straightforward to implement a general estimation method, and general model structures are supported in the software package LaplaceIRT. Additional research is needed to evaluate the properties of the higher order Laplace estimation method for models with a more complex structure.

The proposed method is general, but for each new item response model, derivatives up to the fifth order with respect to the latent variables need to be derived, and, in addition, derivatives with respect to the item parameters are required for the latent-variable derivatives up to the fourth order. In this article, support for the nominal response, graded response, generalized partial credit, and 3-PL models has been provided, and any combination of these models can be used.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.