This paper focuses on hypothesis testing in lasso regression, where the goal is to judge the statistical significance of regression coefficients in models involving many covariates. To obtain reliable p-values, we propose a new lasso-type estimator based on the idea of induced smoothing, which makes it relatively easy to derive an appropriate covariance matrix and the corresponding Wald statistic. Simulation experiments show that our approach performs well compared with recent inferential tools for the lasso. Two real-data analyses illustrate the proposed framework in practice.
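To give a concrete sense of the idea, the following is a minimal sketch (not the authors' exact estimator; the smoothing scale `s`, the tuning value `lam`, and the sandwich covariance form are illustrative assumptions). Induced smoothing replaces the non-differentiable penalty term |b| with E|b + sZ|, Z ~ N(0,1), which is smooth in b, so a Wald-type statistic can be formed from the Hessian of the smoothed objective:

```python
# Illustrative induced-smoothing lasso sketch; `s`, `lam`, and the
# sandwich covariance below are assumptions for demonstration only.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def smooth_abs(b, s=0.1):
    # E|b + s*Z| = b*(2*Phi(b/s) - 1) + 2*s*phi(b/s); smooth surrogate of |b|
    return b * (2 * norm.cdf(b / s) - 1) + 2 * s * norm.pdf(b / s)

def smoothed_lasso_loss(beta, X, y, lam, s=0.1):
    # Least squares plus the smoothed l1 penalty
    resid = y - X @ beta
    return 0.5 * np.sum(resid ** 2) + lam * np.sum(smooth_abs(beta, s))

rng = np.random.default_rng(0)
n, p, s, lam = 100, 5, 0.1, 5.0
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0])
y = X @ beta_true + rng.normal(size=n)

# Minimize the smooth objective with a gradient-based method
res = minimize(smoothed_lasso_loss, np.zeros(p), args=(X, y, lam, s),
               method="BFGS")
beta_hat = res.x

# Hessian of the smoothed objective: X'X + lam * diag(2*phi(b/s)/s)
H = X.T @ X + lam * np.diag(2 * norm.pdf(beta_hat / s) / s)
H_inv = np.linalg.inv(H)
sigma2 = np.sum((y - X @ beta_hat) ** 2) / (n - p)
# Sandwich-type covariance and Wald z-statistics
cov = sigma2 * H_inv @ (X.T @ X) @ H_inv
z = beta_hat / np.sqrt(np.diag(cov))
p_values = 2 * (1 - norm.cdf(np.abs(z)))
```

Because the smoothed penalty is differentiable everywhere, standard optimizers apply and the covariance matrix falls out of the usual sandwich construction, which is the practical convenience the abstract alludes to.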
