This commentary will take an historical perspective on the Kaufman Test of Educational Achievement (KTEA) error analysis, discussing where it started, where it is today, and where it may be headed in the future. In addition, the commentary will compare and contrast the KTEA error analysis procedures, which are rooted in psychometric methodology, with the process approach to error analysis, which is derived primarily from cognitive neuropsychology.
In preface to a discussion of the specifics of error analysis, it is important to acknowledge the high level of scholarship that is exhibited in each of the studies in this special edition. The authors’ reviews of the literature were exceptional and appropriately covered a wide range of perspectives from psychometrically based correlational studies to imaging studies of brain function. Just reading the introductory sections of each of these articles would be a valuable exercise for any clinician interested in learning more about the state of the art in research into the cognitive factors involved in the acquisition of reading, writing, and mathematics skills. The discussion sections were equally informative, with excellent integration of study findings with the previous findings reported in the literature and thoughtful considerations of the implications of results. The articles in this issue provide a strong foundation that will surely encourage further exploration of the topic of academic skills error analysis.
The Kaufman Test of Educational Achievement (KTEA) error analysis procedures were developed during the final poststandardization stages of development in 1984 to 1985 to enhance interpretation of test performance (McCloskey, Kaufman, Kaufman, & McCloskey, 1985). Clinicians using the KTEA were encouraged to code errors based on error categories and use this information to inform interventions. The KTEA error analysis offered a number of unique features: (a) the provision of specific error category frameworks for the KTEA reading, spelling, and math subtests; (b) reading decoding and spelling error categorizations based on Orton–Gillingham and related multisensory models of reading and writing that are now considered to be the strongest evidence-based approaches to teaching reading and spelling; and (c) norm-referencing of the number of errors made in each error category.
The original error analysis was conceptualized as a way to identify errors within categories that were indicative of specific skill deficits. As a result, the error categories involved discrete errors in skill performance that could be objectively identified in the form of the final response.
As the project director of the development of the KTEA and one of the researchers responsible for coding the errors of the standardization cases, I sometimes encountered responses that represented an error in getting to the correct response much more than a lack of knowledge of the correct response. In the case of math computation, for example, the performance of a student who was asked to add 3 + 2 and provided a response of 1 was coded as an error in basic addition. Although this was accurate from the standpoint of product (an incorrect response to a basic addition problem), the error in this specific instance could be viewed as representing a particular kind of thought processing error involving a misreading of the operation sign whereby the student subtracted 2 from 3 to get 1 instead of adding 2 to 3 to get 5. Similarly, in the case of word recognition, a student who said “strike” when asked to pronounce the word “straight” was coded as making a vowel digraph error and a single & double consonant (consonant cluster) error. Again, although accurate from the standpoint of mispronunciation of letter cluster patterns within words, this response also reflected a jump to a word with a similar beginning (the onset) and a lack of attention to the remaining letters in the word (the rime). In both of these examples, the incorrect response reflected a specific kind of thought process that resulted in the error rather than a lack of knowledge of the correct answer.
In such instances, effective error analysis depended not on the straightforward observation that an incorrect response was provided, but on the more nuanced interpretation of what the specific elements of that incorrect response represented in the way of inaccurate thought processing. Although the error could be categorized and recorded without much knowledge or effort, the more nuanced interpretation of that error required the acumen of someone with knowledge of how specific thought processes could result in specific errors.
I knew that the error category coding systems we had developed could be empirically verified easily and therefore provided a reliable way to code errors. In coding thousands of individual errors, however, it occurred to me that the most effective use of error analysis would not always be based strictly on the identification of errors by category, but oftentimes on the identification of the thought processes that led to the error. Although our error analysis system could accurately identify the number of errors in each category, it could not provide a rationale for understanding the inaccurate thought processing represented by many of the specific errors that students made. This insight provided the foundation for my subsequent efforts to study and apply error analysis in clinical practice.
Now in its third edition, the KTEA (Kaufman, 2014) continues to utilize error analysis procedures involving categories very similar to the ones developed for the first edition of the KTEA in 1984 to 1985 (McCloskey et al., 1985), and procedures and categories that I helped to develop in 2003 to 2004 for the second edition of the KTEA (Kaufman, 2004). The stability of the error analysis procedures over these many decades is a tribute to the original error analysis framework, and data derived from KTEA error analysis continue to provide clinicians with valuable information that can enhance interpretation of test performance. The research studies presented in this special issue represent a new direction in the exploration of the KTEA-3 categorical error analysis system. This new direction focuses on analyses of large group data, with purposes such as those stated by O’Brien et al. (2017) in the lead study in Part 1:
The current study has three main goals. The first goal is to examine the underlying relationship between students’ errors on selected KTEA-3 language and math subtests. The second goal is to identify those error categories that are more salient than others in the selected language and math subtests. Lastly, the current study aims to reduce data to a smaller set of summary variables, which will serve as the foundation for other articles in this special issue.
The studies presented in Part 2 build on this theme, analyzing the data from the standardization sample to investigate possible gender differences in types of errors (Stewart et al., 2017), types of errors associated with phonological processing (Choi et al., 2017), and errors in reading and writing comprehension and expression (Hatcher et al., 2017). The studies in Part 3 examine the patterns of errors made by specific clinical samples, including students identified as gifted learning disabled (Ottone-Cross et al., 2017), mildly intellectually disabled (Root et al., 2017), or reading and math disordered (Avitia et al., 2017); students with attention deficit hyperactivity disorder (ADHD) with and without reading disorder (Pagirsky et al., 2017); and students exhibiting language comprehension and fluency difficulties (Koriakin et al., 2017). Part 4 examines in more detail the relationship between processing strengths and weaknesses identified with cognitive measures and types of errors (Breaux et al., 2017; Koriakin et al., 2017; Liu et al., 2017).
Collectively, these studies represent an impressive effort to understand the types of errors made by various samples and the relationships of these error types with mental constructs that underlie the acquisition of academic skills. Although each of these studies explores the relationship of error types to a different set of variables, they do so in similar ways, using sophisticated correlational techniques that emphasize a strong psychometric perspective. Like the original error analysis procedures, the statistical analyses used in these studies are tied to an empirical approach that focuses solely on what errors are being made. There is no question that the results of these studies will be beneficial to future researchers and perhaps in a more limited way to clinicians as well.
But as important as these studies are for understanding what types of errors are being made by various groups, they are equally important for understanding that there is more to be learned that will always remain hidden when using only error types as the basis of analysis. These studies tell us much about what errors are made, but not about why these errors are being made. Without a doubt, these studies are blazing a very new trail. The path they are clearing, however, is very different from the path I have been trying to clear since my initial insights about the thought processing errors I observed during the development of the original KTEA error analysis procedures.
There is no question in my mind that much can be gained from a systematic investigation of error analysis procedures. For research in this area to have its maximum impact on clinical practice, however, it must help to clarify not just the types of errors being made by students but also the reasons why these errors are being made. Table 1 offers an overarching conceptualization of the field of error analysis that could be used to guide future research. This conceptualization specifies the dual nature of error analysis and compares and contrasts the two approaches that can be used for further study.
Table 1. The Two Paths of Error Analysis.

As noted in Table 1 and in the previous section, category error analysis makes use of the categorical error analysis procedures that have been applied with the KTEA since its inception. The studies reported here all focused on the number of errors within specific error categories as the basic unit of measurement for analysis. These studies collectively represent the initial efforts to identify patterns of errors across categories that may represent important relationships to specific demographic variables, disability types, and patterns of cognitive strengths and weaknesses. The analyses conducted utilized correlational techniques such as exploratory factor analysis that are well suited to finding relationships among sets of variables and offering empirical support for the collapse of specific error categories into more general error categories. As this research is carried forward, it is likely to lead to a more systematic, hierarchical organization of error categories. The structure that emerges may be similar to the Cattell–Horn–Carroll (CHC) model that has been used to characterize a structure of cognitive abilities. Indeed, the studies in Part 4 hint at the likelihood that error categories could be subsumed within the CHC model to further enhance the model’s ability to characterize the relationship between cognition and academic achievement.
The category approach to error analysis used in these studies is rooted in a classical psychometric approach to defining and understanding cognition. This approach to the study of error analysis shows great promise and is likely to offer many important insights into how error types are related to demographic variables, disability types, and patterns of cognitive strengths and weaknesses. What is less clear is the extent to which these insights will enhance clinical practice and decision making on a case-by-case basis. Although reliance on the psychometric approach can be seen as the category approach to error analysis’ greatest strength, it is also its greatest limitation.
In the classical psychometric approach, observations are quantified and analyzed using statistical methods. These methods typically involve correlational analyses such as exploratory factor analysis. When error analysis focuses only on counting the number of errors in each category, information regarding why the errors were made is lost. When the number of errors in each category is subjected to correlational analysis procedures, the meaning of specific errors and their impact on performance also is lost. When an empirical model is built on analyses that leave out information about the specific kinds of errors that occur and why these errors occur, it loses its explanatory power at the individual case level.
Some basic assumptions about measurement that are embodied in the classical psychometric approach also place limitations on the usefulness of categorical error analysis. The most central of these is how the classical psychometric approach conceptualizes the components of test performance. In the classical psychometric approach, an observed test score represents a combination of the true ability of the person plus error in the measurement of that ability. Even when measurement techniques are highly reliable, the obtained score usually does not match a person’s true ability. This is because some amount of variability in performance that is not due to variability in ability is present in the obtained score. Sources of variability in scores that are not due to true variability in ability are unwanted and are termed error variance. Although, technically speaking, error variance simply means that the source of variance is not known, error variance is often referred to as random error. The term random error suggests that the source of this variability is attributed to random fluctuations that are neither explainable nor predictable.
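The decomposition described in this paragraph is conventionally written as shown below. The symbols X (observed score), T (true score), E (error), and the reliability coefficient are the standard notation of classical test theory; they are supplied here for illustration and do not appear in the original commentary.

```latex
\begin{align}
X &= T + E, \\
\sigma_X^2 &= \sigma_T^2 + \sigma_E^2
  \quad \text{(assuming $T$ and $E$ are uncorrelated)}, \\
\rho_{XX'} &= \frac{\sigma_T^2}{\sigma_X^2}
  = 1 - \frac{\sigma_E^2}{\sigma_X^2}.
\end{align}
```

Under this formulation, any source of variability that is not modeled as true ability falls into E, whatever its actual cause.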
When correlational analyses are performed on quantitative scores with large groups of subjects, the cumulative effect of unexplained variability (error variance) in each individual’s test score is thought to be minimized so that the correlational patterns that emerge are likely to be indicating relationships mostly based on variability due to variation in true ability. But if the observed scores included in the analyses actually are accounting for much less of the variability than they could be accounting for, then the results of the analyses can be very misleading. In the analyses reported in these studies, factor solutions often accounted for less than 50% of the total variance in error category scores. It is possible that these factor solutions would account for much more of the total variance if sources of variability due to how errors are made were quantified as well as what errors are made.
Another factor that must be taken into account when conducting large-scale studies of the normative standardization sample is the source of the errors that have been categorized. Over the past 32 years, I have had the opportunity to visually inspect several standardization data sets, many of them involving academic skill assessments. When looking at the data from nonclassified normative samples, it was apparent that the majority of the incorrect responses provided on tests of academic skills occurred in the ceiling item set for the majority of subjects in these samples. When reviewing the data from clinical samples with diagnoses such as reading, writing, and math disorders and ADHD, errors prior to the ceiling item set were more prominent. These observations could be empirically verified or disconfirmed for the KTEA-3 data set (a data set that I have not yet seen) and that is a study that certainly should be forthcoming. If in fact this pattern of observations holds for the KTEA-3 data set, then the large-scale data set that was factor analyzed emphasized errors at the ceiling of the test. If that is the case, then the factors derived reflect patterns of errors that are the result of students reaching the limitations of their knowledge rather than errors reflecting weaknesses in specific knowledge strands or errors due to other factors. In the case of students in the clinical samples, however, errors at the ceiling level will be accompanied by many more errors prior to the ceiling level than would be the case in the general population sample. As a result, the factors derived for the clinical samples will be somewhat different than the factors derived from the general population.
The traditional psychometric approach emphasizes analysis of quantitative scores and assumes that obtained scores are the product of variation in true ability and error variance, and that error variance represents variation that is unexplainable and unpredictable. But what if much of the variance that is now termed error variance could be explained and predicted on a case-by-case basis? What if error analysis could be a technique used to provide that explanation and prediction?
In contrast to category error analysis that focuses on what specific errors were made, process error analysis focuses on how and why the errors were made and the contexts in which they are made. Although the results of process analyses could be quantified for large-scale studies, the technique is best suited to individual case analysis by a clinician who is well schooled in the approach. Although some tasks lend themselves to post hoc process error analysis (e.g., math computation), process error analysis should be initiated during the administration of the test items. This constraint makes it difficult to apply the technique consistently to large samples. Initiating the error analysis during item administration ensures that the clinician knows the contexts of performance and is able to directly observe how the individual is performing each item of a task. Reflection on the results of performance and on what was observed during administration can occur after administration of the test items is complete. Ideally, the clinician would conduct a follow-up interview with the examinee to obtain greater insight into the examinee’s perspective on how and why errors occurred. When using standardized tests, these interviews occur after all tasks have been completed in accordance with standardized directions.
Process error analysis relies heavily on the clinical skills of the examiner and the examiner’s knowledge of how students think about academic tasks and the strategies that they use to perform tasks. Because process error analysis emphasizes an understanding of how errors are made, it is rooted in fields of inquiry such as developmental psychology, neuropsychology, and cognitive neuroscience that study how individuals think and how brains function.
Careful observation of how a student performs test items leads to the generation of inferences about how thinking is occurring and involves hypothesis testing to verify these inferences. Hypothesis testing sometimes necessitates the administration of other tests to confirm or refute hypotheses.
The math error example provided at the beginning of this commentary illustrates an important aspect of process error analysis. When considered as a single event, this error could easily be attributed to random fluctuations in thinking due to any number of variables; that is, a random error. But when taken in the context of performance on other math items, its meaning becomes more evident. This same student made a similar error with operations signs on two other items. In addition, the student demonstrated in his performance with other test items that he was capable of producing the correct responses to the items that he got wrong when he used the wrong operation sign (e.g., he correctly added 3 and 2 en route to the answer to a more difficult addition item involving adding multidigit numbers).
This example shows how the use of process error analysis can help to explain variations in test performance that are not due to variation in ability but rather are due to a lack of consistent application of that ability. When the pattern of errors is frequent, it can even lead to accurate prediction of when variation in performance will be due to a source other than variation in ability. Thus, process error analysis applied on an individual basis can improve the reliability of assessment results in that it can explain additional sources of variability on a case-by-case basis in a manner that reduces the amount of variability due to the random occurrence of unexplainable and unpredictable phenomena.
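The kind of check described in the math example can be sketched in code. What follows is a minimal illustration only, not a KTEA scoring procedure: the names `ItemResponse`, `substituted_operation`, and `sign_misreading_pattern`, the restriction to three operations, and the threshold of two occurrences are all assumptions introduced here. The sketch flags an incorrect response that equals the result of applying a different operation to the same operands, and it reports a pattern only when that happens repeatedly.

```python
import operator
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical operation table; a real item set may need more operations.
OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


@dataclass(frozen=True)
class ItemResponse:
    """One single-operation computation item and the student's answer."""
    left: int
    sign: str
    right: int
    response: int


def substituted_operation(item: ItemResponse) -> Optional[str]:
    """Return the sign the student appears to have used instead of the
    printed one, or None if the response is correct or matches no
    alternative operation."""
    if OPERATIONS[item.sign](item.left, item.right) == item.response:
        return None  # correct response, nothing to explain
    for sign, apply_op in OPERATIONS.items():
        if sign != item.sign and apply_op(item.left, item.right) == item.response:
            return sign
    return None


def sign_misreading_pattern(
    items: List[ItemResponse], min_count: int = 2
) -> List[Tuple[ItemResponse, str]]:
    """Return the flagged items only if the substitution recurs at least
    min_count times (an assumed threshold); a single occurrence is
    treated as uninterpretable on its own."""
    flagged = []
    for item in items:
        used = substituted_operation(item)
        if used is not None:
            flagged.append((item, used))
    return flagged if len(flagged) >= min_count else []


if __name__ == "__main__":
    protocol = [
        ItemResponse(3, "+", 2, 1),   # the example from the text
        ItemResponse(7, "-", 4, 11),
        ItemResponse(6, "+", 3, 9),
    ]
    for item, used in sign_misreading_pattern(protocol):
        print(f"{item.left} {item.sign} {item.right} = {item.response}: "
              f"consistent with '{used}'")
```

A flag from a sketch like this is only a hypothesis. As the text emphasizes, it would still need to be checked against what the clinician observed during administration and against the student's correct performance on other items.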
During the frequent use of process error analysis in my clinical work, an important question kept arising: Why do some students make errors on items that are relatively easy while correctly performing items that are much harder and that demonstrate a knowledge of the skill that was performed incorrectly? These errors clearly were not reflecting a lack of knowledge; they were reflecting a lack of ability to consistently act on that knowledge. For me, answers to questions such as this about students who perform inconsistently came through my research in neuropsychology and cognitive neuroscience on the mental construct that is now referred to as executive functions, thanks in great part to the mentoring of Edith Kaplan.
Most process errors can be viewed as the inefficient or inadequate direction or use of neural networks that are activated when performing academic skills. In the majority of these cases, the portion of the neural network that was not functioning effectively is the one involving the use of executive functions and executive skills. These mental capacities cue, direct, prompt, integrate, and coordinate the use of all other mental constructs, including reasoning, language, visuospatial, working memory, and retrieval from long-term storage and all academic skills involving reading, writing, and mathematical calculating and problem solving (McCloskey, Perkins, & Van Diviner, 2008).
When I began the use of error analysis, the construct of executive functions was relatively new to the field of school psychology and had not yet been integrated into standard assessment practices. Even today, executive functions are not assessed on a regular basis (McCloskey & Perkins, 2012). This should not come as a surprise, as a full understanding of how executive functions are involved with other mental constructs has eluded the best minds in psychometrics for more than a century. No less a luminary than Charles Spearman was grappling with the influence of frontal lobe executive functions on the performance of standardized intelligence tests in the 1920s. In The Abilities of Man: Their Nature and Measurement, Spearman (1927) offered this observation about factors that emerged in his studies of mental abilities:
Still another great functional unity has revealed its existence; this, although not in itself of cognitive nature, yet has a dominating influence upon all exercise or even estimation of cognitive ability. On trying to express it by any current name, perhaps the least unsatisfactory—though still seriously misleading—would be “self-control.” It has shown itself to be chiefly responsible for the fact of one person’s ability seeming to be more “profound” or more inclined to “common sense” than that of persons otherwise equally capable. (p. 413)
It is interesting to note that Spearman did not conceive of this construct he labeled self-control as being cognitive.
Wechsler also struggled with what he called conative and nonintellective factors that greatly affected an individual’s expression of intellect. He published an article that opened with the following remark:
. . . general intelligence cannot be equated with intellectual ability however broadly defined, but must be regarded as a manifestation of the personality as a whole. (Wechsler, 1943, p. 78)
Although the conceptions of personality are numerous, it is interesting to note that 37 of the 44 items of the Big Five Personality Inventory are also items on the adult version of the Behavior Rating Inventory of Executive Functions, making a strong case for the argument that personality is the expression of frontal lobe development in the form of the construct of executive functions (McCloskey & Perkins, 2012).
In 1993, John Carroll published Human Cognitive Abilities: A Survey of Factor-Analytic Studies, a book that ultimately led to the construction of the CHC model of cognitive abilities. In this compendium of factor-analytic analyses of a myriad of cognitive measures, Carroll makes no reference to the frontal lobes of the human brain or the executive functions that reside within them and cue and direct all forms of perceiving, feeling, thinking, and acting. It seems that for more than a century now, executive functions have occupied a blind spot that remains inaccessible to the machinations of traditional psychometric methods, even those now being used to examine error analysis.
After these many years of utilizing process error analysis, it is clear to me that most of the unexplained and unpredictable variability in test scores that in a traditional psychometric model is attributed to error variance or random error can be attributed to variations in the degree to which an individual can engage executive functions to direct test-taking behavior in an efficient and effective manner. In this respect, at least in the case of my clinical experience, process error analysis has shed light on the most frequently overlooked mental capacity—executive functions—and in doing so has cleared the path to a deeper and more effective understanding of test performance. It has also convinced me that the construct of executive functions needs to be more fully understood and integrated into the daily practice of psychoeducational assessment.
In the future of error analysis, it would be good to see a merging of the two approaches such that the psychometric approach to categorical error analysis could be effectively utilized to study the data derived from the process approach, in other words, a hybrid approach that truly enables quantifying the qualitative. The fact that many of the articles in this issue acknowledge the role of executive functions in academic skill production, even though they were not able to quantify it in their analyses, gives me great optimism for what can be accomplished in the future through an integration of the two error analysis approaches.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
References
Avitia, M., DeBiase, E., Pagirsky, M., Root, M. M., Howell, M., Pan, X., … Liu, X. (2017). Achievement error differences of students with reading versus math disorders. Journal of Psychoeducational Assessment, 35(1-2), 111-123.
Breaux, K. C., Avitia, M., Koriakin, T. A., Bray, M. A., DeBiase, E., Courville, T., … Grossman, S. (2017). Patterns of strengths and weaknesses (PSW) on the WISC-V, DAS-II, and KABC-II and their relationship to students’ errors in oral language, reading, writing, spelling, and math. Journal of Psychoeducational Assessment, 35(1-2), 168-185.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. Cambridge, UK: Cambridge University Press.
Choi, D., Hatcher, C., Langley, S. D., Liu, X., Bray, M. A., Courville, T., O’Brien, R., DeBiase, E. (2017). What do phonological processing errors tell about students’ skills in reading, writing, and oral language? Journal of Psychoeducational Assessment, 35(1-2), 24-46.
Hatcher, C., Breaux, K. C., Liu, X., Bray, M. A., Ottone-Cross, K. L., Courville, T., … Langley, S. D. (2017). Analysis of children’s errors in comprehension and expression. Journal of Psychoeducational Assessment, 35(1-2), 57-73.
Kaufman, A. S. (2004). K-TEA II: Kaufman Test of Educational Achievement: Comprehensive form. Circle Pines, MN: American Guidance Service.
Kaufman, A. S. (2014). K-TEA-3: Kaufman Test of Educational Achievement (3rd ed.). San Antonio, TX: Pearson Education.
Koriakin, T. A., White, E., Breaux, K. C., DeBiase, E., O’Brien, R., Howell, M., … Courville, T. (2017). Patterns of cognitive strengths and weaknesses and relationships to math errors. Journal of Psychoeducational Assessment, 35(1-2), 155-167.
Liu, X., Marchis, L., DeBiase, E., Breaux, K. C., Courville, T., Pan, X., … Kaufman, A. S. (2017). Do cognitive patterns of strengths and weaknesses differentially predict the errors on reading, writing, and spelling? Journal of Psychoeducational Assessment, 35(1-2), 186-205.
McCloskey, G., Kaufman, A. S., Kaufman, N. L., McCloskey, L. K. (1985). Clinical analysis of errors. In Kaufman, A. S., Kaufman, N. L. (Eds.), Kaufman Test of Educational Achievement: Comprehensive form manual (pp. 85-161). Circle Pines, MN: American Guidance Service.
McCloskey, G., Perkins, L. A. (2012). Essentials of executive functions assessment (Vol. 68). Hoboken, NJ: John Wiley.
McCloskey, G., Perkins, L. A., Van Diviner, B. (2008). Assessment and intervention for executive function difficulties. New York, NY: Taylor & Francis.
O’Brien, R., Pan, X., Courville, T., Bray, M. A., Breaux, K. C., Avitia, M., & Choi, D. (2017). Exploratory factor analysis of reading, spelling, and math errors. Journal of Psychoeducational Assessment, 35(1-2), 7-23.
Pagirsky, M., Koriakin, T. A., Avitia, M., Costa, M., Marchis, L., Maykel, C., … Pan, X. (2017). Do the kinds of achievement errors made by students diagnosed with ADHD vary as a function of their reading ability? Journal of Psychoeducational Assessment, 35(1-2), 124-137.
Root, M. M., Marchis, L., White, E., Courville, T., Choi, D., Bray, M. A., … Wayte, J. (2017). How achievement error patterns of students with mild intellectual disability differ from low IQ and low achievement students without diagnoses. Journal of Psychoeducational Assessment, 35(1-2), 94-110.
Spearman, C. (1927). The abilities of man: Their nature and measurement. London, England: Macmillan.
Stewart, C., Root, M. M., Koriakin, T., Choi, D., Luria, S. R., Bray, M. A., … Courville, T. (2017). Biological gender differences in students’ errors on mathematics achievement tests. Journal of Psychoeducational Assessment, 35(1-2), 47-56.
Wechsler, D. (1943). Non-intellective factors in general intelligence. The Journal of Abnormal and Social Psychology, 38, 101-103.

