Compositional count data are discrete vectors recording the number of outcomes that fall into each of several mutually exclusive categories. Compositional techniques based on the log-ratio methodology are appropriate when the total sum of the vector elements is not of interest. Such data sets can contain zero values, which are often the result of insufficiently large samples; that is, they correspond to positive values that went unobserved but might have been observed with a larger number of trials or a different sampling design. Because log-ratio transformations require strictly positive data, any statistical analysis of count compositions must be preceded by a proper replacement of the zeros. A Bayesian-multiplicative treatment has been proposed for addressing this count zero problem in several case studies. This treatment combines a Dirichlet prior distribution, the conjugate prior of the multinomial distribution, with a multiplicative modification of the non-zero values. Different parameterizations of the prior distribution yield different zero replacement results, whose coherence with the vector space structure of the simplex is stated. Their performance is evaluated from both the theoretical and the computational points of view.
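
As an illustration only, the following Python sketch shows one common way a Bayesian-multiplicative replacement can be written. The abstract does not give the formulas, so the details here are assumptions: each zero count is replaced by the Dirichlet-multinomial posterior mean s·t_j / (n + s), where n is the total count, s is the prior strength, and t_j is the prior expectation of part j; the non-zero parts are then rescaled multiplicatively so that the result sums to one and the ratios among them are unchanged. The function name `bayes_mult_replace` and its parameters are hypothetical, and the default (s equal to the number of parts, uniform t) corresponds to a uniform Dirichlet prior chosen purely for the example.

```python
def bayes_mult_replace(counts, strength=None, prior=None):
    """Replace count zeros in a single count vector (illustrative sketch).

    counts   : non-negative integer counts, one per category
    strength : prior strength s (assumed default: number of categories)
    prior    : prior expectations t_j summing to 1 (assumed default: uniform)
    Returns a list of strictly positive proportions summing to 1.
    """
    d = len(counts)
    n = sum(counts)
    if n <= 0:
        raise ValueError("the count vector must contain at least one observation")
    if prior is None:
        prior = [1.0 / d] * d          # uniform prior expectation (assumption)
    if strength is None:
        strength = float(d)            # uniform Dirichlet(1, ..., 1) (assumption)

    replaced = [0.0] * d

    # Step 1: zeros receive the posterior mean (0 + s * t_j) / (n + s).
    zero_mass = 0.0
    for j, x in enumerate(counts):
        if x == 0:
            replaced[j] = strength * prior[j] / (n + strength)
            zero_mass += replaced[j]

    # Step 2: non-zero proportions are shrunk by a common factor, which keeps
    # their mutual ratios intact and makes the whole vector sum to 1.
    for j, x in enumerate(counts):
        if x > 0:
            replaced[j] = (x / n) * (1.0 - zero_mass)

    return replaced


if __name__ == "__main__":
    print(bayes_mult_replace([0, 3, 7]))
```

Changing `strength` and `prior` in this sketch is how different prior parameterizations would lead to different replacement values; because the non-zero parts share a single scaling factor, their log-ratios are the same before and after replacement.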
