This paper considers some appropriate and inappropriate uses of coefficient kappa and alternative kappa-like statistics. Discussion is restricted to the descriptive characteristics of these statistics for measuring agreement with categorical data in studies of reliability and validity. Special consideration is given to assumptions about whether marginals are fixed a priori, or free to vary. In reliability studies, when marginals are fixed, coefficient kappa is found to be appropriate. When either or both of the marginals are free to vary, however, it is suggested that the "chance" term in kappa be replaced by 1/n, where n is the number of categories. In validity studies, we suggest considering whether one wants an index of improvement beyond "chance" or beyond the best a priori strategy employing base rates. In the former case, considerations are similar to those in reliability studies with the marginals for the criterion measure considered as fixed. In the latter case, it is suggested that the largest marginal proportion for the criterion measure be used in place of the "chance" term in kappa. Similarities and differences among these statistics are discussed and illustrated with synthetic data.
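The three "chance" terms described above can be compared on a single agreement table. The following is a minimal sketch, not the paper's own computation: it assumes the usual kappa form (p_o - p_c) / (1 - p_c), where p_o is the observed proportion of agreement and p_c is the term being substituted. The function name `agreement_indices`, the dictionary keys, and the 2 x 2 table are hypothetical and chosen only for illustration. Columns are treated as the criterion measure.

```python
from typing import Dict, Sequence


def agreement_indices(table: Sequence[Sequence[float]]) -> Dict[str, float]:
    """Return three kappa-like indices for a square agreement table.

    table[i][j] is the count of cases placed in category i by the first
    measure (rows) and in category j by the criterion measure (columns).
    Assumes the general form (p_o - p_c) / (1 - p_c); undefined if p_c == 1.
    """
    n_categories = len(table)
    total = sum(sum(row) for row in table)

    # Observed proportion of agreement: the diagonal of the table.
    p_o = sum(table[i][i] for i in range(n_categories)) / total

    # Marginal proportions for rows and for the criterion (columns).
    row_marginals = [sum(row) / total for row in table]
    col_marginals = [
        sum(table[i][j] for i in range(n_categories)) / total
        for j in range(n_categories)
    ]

    def beyond(p_c: float) -> float:
        # Improvement over p_c, scaled by the maximum possible improvement.
        return (p_o - p_c) / (1.0 - p_c)

    # Cohen's kappa: chance agreement from the product of the marginals.
    chance_fixed = sum(r * c for r, c in zip(row_marginals, col_marginals))
    # Marginals free to vary: chance term replaced by 1/n.
    chance_free = 1.0 / n_categories
    # Best a priori strategy: largest criterion marginal proportion.
    chance_base_rate = max(col_marginals)

    return {
        "kappa": beyond(chance_fixed),
        "kappa_1_over_n": beyond(chance_free),
        "kappa_base_rate": beyond(chance_base_rate),
    }


# Synthetic 2 x 2 table (illustrative counts only, not data from the paper).
example_table = [[40, 10], [5, 45]]
result = agreement_indices(example_table)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With two categories and evenly split row marginals, the first two indices coincide, while the base-rate version is lower because always guessing the larger criterion category already yields agreement above 1/n.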
