Abstract
Bland and Altman’s limits of agreement have traditionally been used in clinical research to assess the agreement between different methods of measurement for quantitative variables. However, when the variances of the measurement errors of the two methods differ, Bland and Altman’s plot may be misleading: there are settings where the regression line shows an upward or downward trend although there is no bias, and others where it has a zero slope although there is a bias. The goal of this paper is therefore to illustrate clearly why and when a bias arises, particularly when heteroscedastic measurement errors are expected, and to propose two new plots, the “bias plot” and the “precision plot,” to help the investigator visually and clinically appraise the performance of the new method. These plots do not have the above-mentioned defect and are still easy to interpret, in the spirit of Bland and Altman’s limits of agreement. To achieve this goal, we rely on the modeling framework recently developed by Nawarathna and Choudhary, which allows the measurement errors to be heteroscedastic and to depend on the underlying latent trait. Their estimation procedure, however, is complex and rather daunting to implement. We have therefore developed a new estimation procedure, which is much simpler to implement and yet performs very well, as illustrated by our simulations. The methodology requires several measurements with the reference standard and possibly only one with the new method for each individual.
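The spurious trend described above can be illustrated with a small simulation. The sketch below is not the estimation procedure of the paper. It assumes a deliberately simple setting in which both methods are unbiased and their error variances are constant but unequal, and it regresses the differences on the means as in a Bland and Altman plot. The function names, the parameter values, and the use of NumPy are illustrative assumptions.

```python
import numpy as np


def bland_altman_slope(n=20000, sd_true=10.0, sd_new=8.0, sd_ref=2.0, seed=0):
    """Simulate two unbiased methods and return the slope of the
    regression of their differences on their means.

    All parameter values are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(100.0, sd_true, n)        # latent true value
    y_new = x + rng.normal(0.0, sd_new, n)   # new method, no bias
    y_ref = x + rng.normal(0.0, sd_ref, n)   # reference method, no bias
    diff = y_new - y_ref
    mean = (y_new + y_ref) / 2.0
    slope, _intercept = np.polyfit(mean, diff, 1)
    return slope


def theoretical_slope(sd_true, sd_new, sd_ref):
    """Population slope Cov(diff, mean) / Var(mean) under the same assumptions."""
    cov = (sd_new**2 - sd_ref**2) / 2.0
    var = sd_true**2 + (sd_new**2 + sd_ref**2) / 4.0
    return cov / var


if __name__ == "__main__":
    print("unequal error variances:", bland_altman_slope())
    print("equal error variances:  ", bland_altman_slope(sd_new=2.0, sd_ref=2.0))
    print("theoretical (unequal):  ", theoretical_slope(10.0, 8.0, 2.0))
```

Under these assumptions the covariance between the differences and the means equals half the difference of the two error variances, so the fitted line slopes upward whenever the new method is noisier than the reference, even though neither method is biased. With equal error variances the slope is close to zero.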
References
| 1. | Bland, JM, Altman, DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986; 1: 307–310. |
| 2. | Bland, JM, Altman, DG. Measuring agreement in method comparison studies. Stat Meth Med Res 1999; 8: 135–160. |
| 3. | Ludbrook, J. Confidence in Altman-Bland plots: a critical review of the method of differences. Clin Exp Pharmacol P 2010; 37: 143–149. |
| 4. | Hopkins, WG. Bias in Bland-Altman but not regression validity analyses. Sportscience 2004; 8: 42–46. |
| 5. | Krouwer, JS. Why Bland-Altman plots should use X, not (Y+X)/2 when X is a reference method. Stat Med 2008; 27: 778–780. |
| 6. | Ludbrook, J. Linear regression analysis for comparing two measurers or methods of measurement: but which regression? Clin Exp Pharmacol P 2010; 37: 692–699. |
| 7. | Carstensen, B, Simpson, J, Gurrin, LC. Statistical models for assessing agreement in method comparison studies with replicate measurements. Int J Biostat 2008; 4: Article 16. |
| 8. | Carstensen, B. Comparing methods of measurement: extending the LoA by regression. Stat Med 2010; 29: 401–410. |
| 9. | Nawarathna, LS, Choudhary, PK. A heteroscedastic measurement error model for method comparison data with replicate measurements. Stat Med 2015; 34: 1242–1258. |
| 10. | Dunn, G. Statistical evaluation of measurement errors: design and analysis of reliability studies, 2nd ed. London: Arnold, 2004. |
| 11. | Verbeke, G, Molenberghs, G. Linear mixed models in practice (Lecture Notes in Statistics 126). New York: Springer, 1997. |
| 12. | Robinson, JK. That BLUP is a good thing: the estimation of random effects. Stat Sci 1991; 6: 15–32. |
| 13. | Royston, P, Altman, DG. Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling. J Roy Stat Soc C-App 1994; 43: 429–467. |
| 14. | Jiang, J. Asymptotic properties of the empirical BLUP and BLUE in mixed linear models. Stat Sinica 1998; 8: 861–885. |
