Abstract
When examinees copy answers to test questions from other examinees, the validity of the test is compromised. Most available statistical procedures for detecting copying were developed out of classical test theory (CTT); hence, they suffer from sample-dependent score and item statistics, and biased estimates of the expected number of answer matches between a pair of examinees. Item response theory (IRT) based procedures alleviate these problems; however, because they fail to compare the similarity of responses between neighboring examinees, they have relatively poor power for detecting copiers. A new IRT-based test statistic, ω, was compared with the best CTT-based index, g2, under various copying conditions, amounts of copying, test lengths, and sample sizes. ω consistently held the Type I error rate at or below the nominal level; g2 yielded substantially inflated Type I error rates. The power of ω varied as a function of both test length and the percentage of items copied. ω demonstrated good power to detect copiers, provided that at least 20% of the items were copied on an 80-item test and at least 30% were copied on a 40-item test. Based on these results, with regard to both Type I error rate and power, ω appears to be more useful than g2 as a copying index.
