The inverse probability weighted Cox proportional hazards model can be used to estimate the marginal hazard ratio. In multi-site studies, it may be infeasible to pool individual-level datasets due to privacy and other considerations. We propose three methods for making inference on hazard ratios without the need for pooling individual-level datasets across sites. The first method requires a summary-level eight-column risk-set table to produce the same hazard ratio estimate and robust sandwich variance estimate as those from the corresponding pooled individual-level data analysis (reference analysis). The second and third methods, which are based on two bootstrap re-sampling strategies, require a summary-level four-column risk-set table and bootstrap-based risk-set tables from each site to produce the same hazard ratio and bootstrap variance estimates as those from their reference analyses. All three methods require only one file transfer between the data-contributing sites and the analysis center. We justify these methods theoretically, illustrate their use, and demonstrate their statistical performance using both simulated and real-world data.
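To make the risk-set idea concrete, the sketch below shows why summary-level tables can suffice for the point estimate when the exposure is binary. It is not the eight-column or four-column table of the proposed methods: the table layout (one row per distinct observed time with four weighted sums) and the function names `site_risk_set_table` and `pooled_log_hr` are assumptions made for illustration. It also assumes the inverse probability weights have already been computed at each site, uses the Breslow convention for ties, and returns only the log hazard ratio, not the robust sandwich or bootstrap variance.

```python
import numpy as np


def site_risk_set_table(time, event, treated, weight):
    """Summarise one site's data by distinct observed time.

    Hypothetical layout: [time, weighted treated events, weighted untreated
    events, weighted treated exits, weighted untreated exits]. An "exit" is
    any subject whose follow-up ends at that time (event or censoring).
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    treated = np.asarray(treated, dtype=float)
    weight = np.asarray(weight, dtype=float)
    rows = []
    for t in np.unique(time):
        at_t = time == t
        w, e, x = weight[at_t], event[at_t], treated[at_t]
        rows.append((t,
                     np.sum(w * e * x),
                     np.sum(w * e * (1 - x)),
                     np.sum(w * x),
                     np.sum(w * (1 - x))))
    return np.array(rows)


def pooled_log_hr(tables, n_iter=50, tol=1e-10):
    """Maximise the weighted Cox partial likelihood from stacked site tables."""
    stacked = np.vstack(tables)
    times, idx = np.unique(stacked[:, 0], return_inverse=True)
    agg = np.zeros((len(times), 4))
    np.add.at(agg, idx, stacked[:, 1:])  # sum rows that share an observed time
    d1, d0 = agg[:, 0], agg[:, 1]
    d = d1 + d0
    # Weighted number at risk at time t = total weight with observed time >= t.
    r1 = np.cumsum(agg[::-1, 2])[::-1]
    r0 = np.cumsum(agg[::-1, 3])[::-1]
    beta = 0.0
    for _ in range(n_iter):  # Newton-Raphson on the one-parameter score
        p = r1 * np.exp(beta) / (r1 * np.exp(beta) + r0)
        score = np.sum(d1 - d * p)
        info = np.sum(d * p * (1 - p))
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta
```

Under these assumptions, each site would run `site_risk_set_table` locally and send only the resulting table, and the analysis center would call `pooled_log_hr` on the list of tables. The estimate matches the pooled individual-level fit because, with a binary exposure, the weighted partial likelihood depends on the data only through these sums at each observed time.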
