Abstract
Objective:
To investigate validity and reliability of a new measure of case complexity, the Oxford Case Complexity Assessment Measure (OCCAM).
Design:
Data collection on inpatients and outpatients attending for rehabilitation. In subsets, repeat assessments were undertaken two weeks apart, by clinicians unaware of initial data, and on admission and on discharge from inpatient rehabilitation.
Measures:
OCCAM, the INTERMED, Rehabilitation Complexity Scale – Extended (RCS-E), clinical judgement of complexity (0–10 numerical rating scale), length of stay and discharge destination (for inpatients).
Results:
For the OCCAM, the Cronbach’s α coefficient was 0.69 and item-total correlations were moderate to high except for pathology and time. The correlation coefficients with OCCAM were: INTERMED (ρ = 0.694), RCS-E (ρ = 0.736), and team judgement (ρ = 0.796). Inter-rater agreement was excellent (weighted κ = 0.95). Correlation between admission and discharge scores was ρ = 0.917. Test–retest agreement was good (intraclass correlation coefficient 0.86). Higher mean admission scores were associated with prolonged stays (38.6 ± 12.2 versus 32.9 ± 13.7, P = 0.04) and failure to return home (48.0 ± 13.7 versus 32.1 ± 10.7, P < 0.001). The optimal cut-off of OCCAM to detect patients not discharged home was ≥ 34, with corresponding sensitivity and specificity of 84.6% and 62.8%, respectively.
Introduction
Healthcare is becoming increasingly expensive. Several factors influence cost, including more treatment at higher cost, and patients having increasingly complex health problems. Currently in most healthcare systems resources are allocated on the basis of the diagnosis and/or the specific treatment given. However, a major part of healthcare costs relates to disability, through the need for more staff time, often from more specialized staff, and because disability often leads to a longer length of stay. An alternative approach to the current reimbursement system is needed, one that takes into account the complexity of patients, especially but not only in rehabilitation services.1–5 This requires a suitable measure of complexity.
Complexity relates to the number of different factors that affect the illness and its management.6–14 There are only a few measures and they have limitations. For example, the INTERMED, based on a biopsychosocial model of illness, is subjective, has no guidance on scoring, and seems mainly directed at psychiatric diseases.10,15,16 The Rehabilitation Complexity Scale (RCS) was developed to distinguish clinical need for specialized services from need for more local services; it focuses on clinical features only and does not cover the many other contextual factors that influence rehabilitation and resource use.14,17–19
Thus there is an opportunity to develop a better and easier method for measuring case complexity. This should be based on a holistic, biopsychosocial model of illness and healthcare in order to capture as many factors that might influence resource use and outcome as possible.20–23
As part of a service development process aiming to improve costing, and to understand outcome better we developed the Oxford Case Complexity Assessment Measure (OCCAM). It covers 27 items assessing nine different domains from the holistic model of healthcare. The items fall into two main categories: patient-centred items which cover disease diagnosis, impairments and activities; and context-centred items which include the physical and social environment, social roles, personal expectations, past history, prognosis and healthcare involvement. This measurement uses the simple scoring system used by the INTERMED15 with scores for each item ranging from 0 to 3, with higher scores indicating a more complex problem in that area. The score range is 0–81 points. There is guidance on scoring.
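The scoring arithmetic described above (27 items, each graded 0 to 3, giving a total of 0–81) can be sketched as follows. This is only an illustration of the summation: the function and constant names are hypothetical, and the actual item definitions and scoring guidance are those in the Appendix.

```python
# Illustrative sketch only: the item count and the 0-3 grading come from the
# text; the names used here are hypothetical.
N_ITEMS = 27
MAX_ITEM_SCORE = 3


def occam_total(item_scores):
    """Sum 27 item scores, each graded 0 (no problem) to 3 (most complex)."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    for score in item_scores:
        if score not in (0, 1, 2, 3):
            raise ValueError(f"item score {score!r} is outside 0-{MAX_ITEM_SCORE}")
    return sum(item_scores)


print(occam_total([0] * N_ITEMS))               # lowest possible total: 0
print(occam_total([MAX_ITEM_SCORE] * N_ITEMS))  # highest possible total: 81
```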
The objective of this study was to undertake an initial investigation of the OCCAM in patients managed in a neurological rehabilitation unit.
Methods
This study was part of a service development aimed at improving costing and prognosis. It analysed data prospectively collected on patients from the Oxford Centre for Enablement (OCE) from January to August 2012. This specialized rehabilitation unit manages patients with various neurological diseases including stroke, traumatic brain injury, spinal disorders, multiple sclerosis and cerebral hypoxia. Patients may be inpatients (25 bedded ward), or outpatients (including day hospital patients).
Data were collected on adult (≥16 years old) patients admitted to the OCE or attending as outpatients. Patients and families were informed about the development process and could decline study data analysis. If the patient could be reliably assessed, data were collected with a face-to-face interview. If not, the patient’s visiting family or carers were interviewed and the medical and nurse charts were reviewed to complete missing data if necessary.
Demographic details were collected for each patient including age, gender, diagnosis, usual housing, and both length of stay (LOS) and discharge destination for inpatients.
After a pilot phase, which culminated in its use on 30 patients to detect potential ambiguities in the grading system, to clarify wording, and to check what items seemed redundant or were missing, a final version of the OCCAM was developed (see Appendix on the journal website).
Information was collected or collated to complete four measures:
the OCCAM, shown in the Appendix (on the journal website),
the INTERMED,15
the Rehabilitation Complexity Scale – extended (RCS-E)18 and
a subjective numerical rating of rehabilitation case complexity based on the judgement of team members involved in a patient’s care. This score ranged from 0 (straightforward case, no complicating factors) to 10 (most complex case conceivable).
Information for the OCCAM and INTERMED was collected by face-to-face interview, and information for the RCS-E and the team’s rating of complexity was collected in routine multidisciplinary team meetings.
Our study comprised three parts. The first investigated the feasibility, internal consistency and validity of OCCAM. The OCCAM rating was performed by a single rater on 110 consecutive patients at admission for inpatients, and at first examination for outpatients. Concomitantly, INTERMED, the RCS-E and team judgement scales were performed by the same rater. The time taken to complete the OCCAM data set was systematically measured.
The Shapiro–Wilk test was used to test for normality of distribution of scores obtained for the different scales. As the null hypothesis for normality was rejected for the majority of these scales, non-parametric statistical tests were applied.
The internal consistency of OCCAM was examined using Cronbach’s α test. In absence of a clear ‘gold standard’ against which to test criterion validity, concurrent convergent and discriminant validity were assessed through Spearman correlations with INTERMED, the RCS-E and team judgement scale. To account for multiple tests, a Bonferroni procedure was used.
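For readers unfamiliar with Cronbach’s α, a minimal sketch of the standard formula, α = k/(k − 1) × (1 − Σ item variances / variance of total scores), is given below. It is not the software used in this study; the function name and the toy scores are hypothetical and are not study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per patient (all lists the same length)."""
    k = len(rows[0])  # number of items
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two patients")
    # Sample variance of each item across patients.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the patients' total scores.
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Made-up scores for four patients on three items (not study data).
toy = [[0, 1, 1], [1, 1, 2], [2, 3, 2], [3, 3, 3]]
print(round(cronbach_alpha(toy), 3))
```

Values close to 1 indicate that the items vary together; the coefficient is undefined when all patients have the same total score.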
The second part investigated inter-rater consistency. Thirty patients had the OCCAM performed by two clinicians blinded to the other’s score. Spearman correlation (ρ) was used to measure the correlation between the OCCAM scores assessed by the two raters, and the inter-rater agreement was evaluated for total scores using quadratic-weighted Cohen κ statistics.
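The quadratic-weighted κ statistic can be sketched as below, assuming ratings coded as integers from 0 to n_categories − 1 (as for a single item graded 0 to 3). This is an illustrative implementation of the textbook definition, not the routine used for the analysis; the function name is hypothetical.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """Cohen's kappa with quadratic weights for ordinal ratings 0..n_categories-1."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(rater_a)
    # Observed joint proportions for each pair of categories.
    observed = [[0.0] * n_categories for _ in range(n_categories)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1 / n
    # Marginal proportions for each rater.
    marg_a = [sum(row) for row in observed]
    marg_b = [sum(observed[i][j] for i in range(n_categories))
              for j in range(n_categories)]
    disagree_obs = disagree_exp = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            # Disagreement weight grows with the squared distance between grades.
            weight = (i - j) ** 2 / (n_categories - 1) ** 2
            disagree_obs += weight * observed[i][j]
            disagree_exp += weight * marg_a[i] * marg_b[j]
    if disagree_exp == 0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return 1 - disagree_obs / disagree_exp


# Two hypothetical raters grading six patients on one 0-3 item (not study data).
print(quadratic_weighted_kappa([0, 1, 2, 3, 3, 1], [0, 1, 2, 3, 2, 1], 4))
```

Because the weights are quadratic, a disagreement of one grade is penalized far less than a disagreement of three grades, which suits ordinal items of this kind.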
The stability of OCCAM, and its relationship to length of stay and discharge home, was tested in 38 of the 56 inpatients for whom scores were measured at both admission and discharge by the same rater, without looking at the admission score. Comparison between admission and discharge scores was performed using the Wilcoxon signed rank test. Test–retest correlation was estimated using Spearman correlation coefficients, and test–retest agreement was assessed using the intraclass correlation coefficient.
We compared OCCAM scores between patients with short length of stay and those with long length of stay (cut-off defined as the median of the distribution of length of stay, i.e. 80 days), and between patients discharged and those not discharged home using the Wilcoxon Mann–Whitney test. Receiver operating characteristic (ROC) analysis was performed, and the c-statistic representing the area under the ROC curve for the models was evaluated. The optimal cut-off point representing the highest product of sensitivity and specificity was determined. Sensitivity, specificity, positive predictive value and negative predictive value were calculated for the OCCAM to detect long length of stay, and failure to be discharged home. Observed and predicted rates of prolonged length of stay and no home discharge by OCCAM score were compared with the Hosmer–Lemeshow goodness-of-fit test.
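The Spearman coefficient used here is the Pearson correlation of the ranks of the two sets of scores, with tied scores given the mean of their ranks. A minimal sketch under that definition follows; the function names and the toy scores are hypothetical and are not study data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Made-up admission and discharge totals for five patients (not study data).
print(spearman_rho([31, 42, 26, 50, 38], [27, 38, 18, 44, 40]))
```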
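The cut-off rule described above (highest product of sensitivity and specificity) can be sketched as follows, assuming a patient is classed as test-positive when the score is at or above the cut-off. This is an illustration of the rule rather than the software used; the function name and the toy data are hypothetical.

```python
def best_cutoff(scores, outcomes):
    """Return (cutoff, sensitivity, specificity) maximising sens * spec.

    scores: numeric test scores; outcomes: 1 if the event occurred, else 0.
    A patient is classed as test-positive when score >= cutoff.
    """
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one patient with and one without the event")
    best = None
    # Every distinct observed score is tried as a candidate cut-off.
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, o in zip(scores, outcomes) if o == 1 and s >= cutoff)
        tn = sum(1 for s, o in zip(scores, outcomes) if o == 0 and s < cutoff)
        sens, spec = tp / positives, tn / negatives
        if best is None or sens * spec > best[1] * best[2]:
            best = (cutoff, sens, spec)
    return best


# Made-up scores and outcomes for six patients (not study data).
print(best_cutoff([22, 28, 31, 36, 45, 52], [0, 0, 0, 1, 0, 1]))
```

When two cut-offs give the same product, this sketch keeps the lower one; other tie-breaking rules are equally defensible.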
Results
All patients asked agreed to participate. Characteristics of these patients are shown in Table 1. Mean (SD) time to complete OCCAM questionnaire was 14.6 ± 5.9 minutes (median 15, interquartile range (IQR) 10–15 minutes). For inpatients, the mean delay between admission and assessment was 4.9 ± 4.6 days (median 3, IQR 1–8 days).
Table 1. Characteristics of patients (n = 110).

Data were difficult to record from the direct interview of the patient in 40/110 cases: 5 (4.5%) were in a vegetative state or minimally conscious state, 10 (9%) had severe language impairment and 25 (22.7%) had severe cognitive impairment, making the assessment difficult. For these patients, carers or family were asked for additional information and the patient’s chart was reviewed to help complete the evaluation. All items could be completed using the most impaired (‘worst’) value if not known (for example emotional assessment in vegetative state). This situation concerned only 16 patients and a limited number of items. Of note, the presented results were not significantly altered when the best value rather than the worst was used.
Scores for the OCCAM, INTERMED, RCS-E, and team judgement are shown in Table 2. The distribution of patients according to OCCAM score is shown in Figure 1. The normality assumption for distribution of scores was not met for OCCAM and INTERMED, whereas it was for the RCS-E and team judgement.
Table 2. Mean and median scores of patients included in part 1 of the study (n = 110) for OCCAM, INTERMED, RCS-E, and team judgement scales.

Internal consistency was moderate for the overall OCCAM scale (Cronbach’s α coefficient 0.69). Subscale total correlations were moderate to high (impairment 0.89, activities 0.88, social role 0.56, physical context 0.42, social context 0.45, personal context 0.49, healthcare 0.58) except for two items (pathology 0.26, time 0.23).
Good correlations were found between OCCAM and INTERMED (ρ = 0.694, Figure 2), RCS-E (ρ = 0.736, Figure 3), and the team judgement scale (ρ = 0.796, Figure 4) (P < 0.001 for each). There were also good correlations between team judgement and INTERMED (ρ = 0.634, P < 0.001) (Figure 5) and between RCS-E and team judgement (ρ = 0.726, P < 0.001) (Figure 6), but the correlation between INTERMED and RCS-E was low (ρ = 0.366) (Figure 7) despite being statistically significant (P < 0.001).
An excellent correlation between OCCAM scores obtained by each rater was found (ρ = 0.958, P < 0.001) (Figure 8). Inter-rater agreement was also excellent (weighted κ = 0.95, P < 0.001). Weighted κ-values for agreement between two raters by individual items are reported in Table 3.
Table 3. Inter-rater agreement for OCCAM total scores and OCCAM individual items (quadratic-weighted Cohen κ statistics).

Only 56 of the inpatients had been discharged at the time of analysis. Among these patients, the discharge destination was home in 43 patients (77%), nursing home in 8 patients (14%), hospital (acute unit following complication, other rehabilitation centre) in 5 patients (9%). The mean (SD) length of stay for all discharged patients was 82 ± 54.5 (median 81, IQR 39–109) days.
Among these 56 patients, 38 underwent OCCAM both at admission and at discharge. An excellent correlation between admission score and discharge score was observed (ρ = 0.917, P < 0.001). At admission, the mean OCCAM was 34.8 ± 13.4, (median 31, IQR: 26–42). At discharge, it was 29.2 ± 15.4 (median OCCAM 27, IQR: 18–38) (P < 0.001). Test–retest agreement was good (intraclass correlation coefficient 0.86, 95% confidence interval (CI): 0.75–0.93).
Patients with prolonged length of stay (>80 days) had higher admission OCCAM scores than those with short stay (mean (SD) 38.6 ± 12.2, median 35 (30–44) versus mean (SD) 32.9 ± 13.7, median 29 (24–42), P = 0.04) (Figure 9, on the journal website). The receiver operating characteristic curve of the OCCAM to predict a prolonged stay of >80 days is shown in Figure 10, on the journal website. The model showed poor discrimination (c-statistic = 0.657; 95% CI: 0.508–0.806). The optimal cut-off of OCCAM to detect patients with prolonged stays was ≥31. For this cut-off, sensitivity was 75.0%, specificity was 60.7%, positive predictive value was 65.6% and negative predictive value was 70.8%. No better discrimination was found when using either upper tertile or upper quartile as definition of prolonged length of stay. The Hosmer–Lemeshow goodness-of-fit test did not show any significant difference between predicted and observed rates (P = 0.25).
Patients not discharged home had higher admission OCCAM scores than those discharged home (mean (SD) 48.0 ± 13.7, median 49 (35–61) versus mean (SD) 32.1 ± 10.7, median 30 (25–39), P < 0.001) (Figure 11, on the journal website). The receiver operating characteristic curve of the OCCAM to predict failure to return home is shown in Figure 12, on the journal website. The model showed good discrimination (c-statistic = 0.815; 95% CI: 0.680–0.950). The optimal cut-off of the OCCAM to detect patients not discharged home was ≥34. For this cut-off, sensitivity was 84.6%, specificity was 62.8%, positive predictive value was 40.7%, and negative predictive value was 93.1%. We also identified two other cut-offs for higher sensitivity or higher specificity (Table 4). The Hosmer–Lemeshow goodness-of-fit test did not show any significant difference between predicted and observed rates (P = 0.46).
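The four accuracy figures for the ≥34 cut-off can be cross-checked from a 2 × 2 table. The cell counts below are not reported in the text: they are an inference, assuming all 56 discharged inpatients were analysed (13 not discharged home, 43 discharged home), and were chosen because they reproduce the reported percentages. The function name is hypothetical.

```python
def diagnostic_summary(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV and NPV as percentages (1 d.p.)."""
    return {
        "sensitivity": round(100 * tp / (tp + fn), 1),
        "specificity": round(100 * tn / (tn + fp), 1),
        "ppv": round(100 * tp / (tp + fp), 1),
        "npv": round(100 * tn / (tn + fn), 1),
    }


# Inferred counts (assumption, not reported data): of 13 patients not
# discharged home, 11 scored >= 34; of 43 discharged home, 27 scored < 34.
not_home = diagnostic_summary(tp=11, fn=2, tn=27, fp=16)
print(not_home)
# {'sensitivity': 84.6, 'specificity': 62.8, 'ppv': 40.7, 'npv': 93.1}
```

The low positive predictive value alongside a high negative predictive value reflects the fact that only about a quarter of these patients were not discharged home.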
Table 4. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) of OCCAM to predict no discharge home, according to several cut-offs.

Discussion
This study found the OCCAM scale to have a good feasibility, being rapidly and easily performed with minimal training, a reasonable correlation with other measures of complexity, good inter-rater reliability, a weak relationship to length of stay but a stronger relationship to discharge destination (home versus not home), and anticipated relationships in scores over time. We conclude that the OCCAM warrants further study in other settings to investigate its utility. We also feel that it may act as an aide-memoire to the range of factors that need to be considered when rehabilitating patients with complex problems, and that this aspect also warrants investigation.
Our results suggest that a good clinical history and examination enables completion of most of the objective items (especially concerning impairments and activities) without needing to ask many specific questions, which means that completing the scale requires little additional time. We noted that raters also reported that this systematic questionnaire seemed to reflect the patients’ therapy needs and the potential difficulties at discharge. Thus, we feel that completing the OCCAM could not only be fitted within routine practice but also might improve the information available.
Complexity is not easily defined or measured, but the good correlation with other scales that are aiming to measure complexity (INTERMED, the Rehabilitation Complexity Scale, and clinical judgement) suggests that the OCCAM has some validity. The reduction over time in the clinical parts (impairment and disability) but stability in contextual factors would again support its validity as an overall measure of complexity within the biopsychosocial framework. The good correlation between the specialist team’s consensus judgement and the OCCAM score was particularly heartening.
The inter-rater reliability was surprisingly high, given the wide range of items. Only three items showed low agreement (pathology, social role, time). The item ‘pathology’ probably needs better specification because the ‘secondary pathology’ (complications) sub-item is sometimes confusing. For example, spasticity could be classified as a complication of a stroke; it is actually an impairment covered in the item movement and posture. The item ‘social roles’ can also cause some problems for coding, especially for inpatients in the first week of admission. It is sometimes difficult to evaluate objectively the visits received by the patient at this stage of the illness.
The item ‘time’ has a low intra-observer reliability, probably because it concerns prognosis, which is difficult to know and subject to individual opinion especially at the early stage. A similar problem affects the INTERMED.
At present we would recommend that some further guidance and specification might reduce variability but other changes should only be undertaken if studied in comparison with this version. For example, additional items could be taken into account in complexity assessment, such as management of a tracheostomy, the socio-educational level or the foreign native language when patients are not English native speakers.
The simple scoring system offers many advantages such as consistency and ease of use but it fails to acknowledge that some items may be more important than others. However it is unlikely that item weights will be comparable across groups or even between individuals. We feel that ease of use is much more important and would not recommend differential weighting.
The ability to predict length of stay or discharge destination will need much further investigation in a much wider range of settings and patients. Our limited data suggest that it may help predict people who will not return to the community. It might also predict people with a length of stay prolonged due to contextual factors.
This study has many limitations. It was devised and developed by one author (DW) but in fact most of the data were collected by people not actively involved in its development, suggesting that its use can be easily learned. The population studied was small, and all patients were in a single centre and all had neurological conditions. The data were collected largely within normal service delivery, as this was a service development study, and a more structured research protocol might show stronger results. On the other hand its goal was to be useful clinically and in service management; the results are likely to be more representative of routine use.
We feel that the OCCAM warrants further study by other groups, in other settings and countries, and in a much wider range of patients covering other diseases, a broader age range, a wider range of severity, etc. Much further work is needed to investigate whether it adds information over and above scales such as the Barthel ADL Index.
One particular outcome from using the OCCAM in routine clinical practice might be to raise awareness of the whole range of factors that may affect the delivery and use of healthcare services. It might also facilitate teamwork by standardizing the model used within a healthcare team.
The OCCAM is a measure of complexity derived from a theoretical basis; it is a secondary development of the INTERMED and is, we hope, a useful further development of a measure that has already been shown to be valid but is not widely used. We hope that the OCCAM is more comprehensive and simpler. We hope that further research will investigate whether it is useful across a range of diseases and settings and across a range of clinical and managerial problems.
The Oxford Case Complexity Assessment Measure is a new measure derived entirely from the biopsychosocial model of illness.
This pilot study suggests sufficient intra-observer reliability, simplicity and validity to warrant further studies.
Conflict of interest
Outcome measurement is a specific research interest of our centre. None of the authors has any personal financial interests in the work undertaken or the findings reported.
Funding
This study was supported by the Société Française de Médecine Physique et de Réadaptation (SOFMER) and IPSEN.
References
1. Department of Health. Review of commissioning arrangements for specialised services. The Warner Report. London: Department of Health, 2006.
2. Department of Health. Reforming NHS financial flows: payment by results. London: Department of Health, 2002.
3. Department of Health. Specialised Services National Definition Set. London: Department of Health, 2002.
4. Department of Health. Specialised Services National Definition Set number 7: Complex specialised rehabilitation for brain injury and complex disability (adult). London: Department of Health, 2002.
5. Turner-Stokes, L, Scott, H, Williams, H, Siegert, R. The Rehabilitation Complexity Scale – extended version: detection of patients with highly complex needs. Disabil Rehabil 2012; 34: 715–720.
6. Petryshen, P, Pallas, LL, Shamian, J. Outcomes monitoring: adjusting for risk factors, severity of illness, and complexity of care. J Am Med Inform Assoc 1995; 2: 243–249.
7. Plsek, PE, Greenhalgh, T. Complexity science: The challenge of complexity in health care. BMJ 2001; 323: 625–628.
8. Plsek, PE, Wilson, T. Complexity, leadership, and management in healthcare organisations. BMJ 2001; 323: 746–749.
9. Gilmer, TP, Philis-Tsimikas, A, Walker, C. Outcomes of Project Dulce: a culturally specific diabetes management program. Ann Pharmacother 2005; 39: 817–822.
10. de Jonge, P, Huyse, FJ, Stiefel, FC. Case and care complexity in the medically ill. Med Clin North Am 2006; 90: 679–692.
11. Huyse, FJ, Stiefel, FC, de Jonge, P. Identifiers, or ‘red flags,’ of complexity and need for integrated care. Med Clin North Am 2006; 90: 703–712.
12. Shiell, A, Hawe, P, Gold, L. Complex interventions or complex systems? Implications for health economic evaluation. BMJ 2008; 336: 1281–1283.
13. Greenhalgh, T, Plsek, P, Wilson, T, Fraser, S, Holt, T. Response to ‘The appropriation of complexity theory in health care’. J Health Serv Res Policy 2010; 15: 115–117.
14. Wade, D. Measuring case complexity in neurological rehabilitation. J Neurol Neurosurg Psychiatry 2010; 81: 127.
15. Huyse, FJ, Lyons, JS, Stiefel, F, Slaets, J, de Jonge, P, Latour, C. Operationalizing the biopsychosocial model: the INTERMED. Psychosomatics 2001; 42: 5–13.
16. de Jonge, P, Bauer, I, Huyse, FJ, Latour, CH. Medical inpatients at risk of extended hospital stay and poor discharge health status: detection with COMPRI and INTERMED. Psychosom Med 2003; 65: 534–541.
17. Turner-Stokes, L, Disler, R, Williams, H. The Rehabilitation Complexity Scale: a simple, practical tool to identify ‘complex specialised’ services in neurological rehabilitation. Clin Med 2007; 7: 593–599.
18. Turner-Stokes, L, Williams, H, Siegert, RJ. The Rehabilitation Complexity Scale version 2: a clinimetric evaluation in patients with severe complex neurodisability. J Neurol Neurosurg Psychiatry 2010; 81: 146–153.
19. Hoffman, K, West, A, Nott, P. Measuring acute rehabilitation needs in trauma: Preliminary evaluation of the Rehabilitation Complexity Scale. Injury 2013; 44: 104–109.
20. Turner-Stokes, L, Sutch, S, Dredge, R. Healthcare tariffs for specialist inpatient neurorehabilitation services: rationale and development of a UK casemix and costing methodology. Clin Rehabil 2012; 26: 264–279.
21. Wade, DT, Halligan, PW. Do biomedical models of illness make for good healthcare systems? BMJ 2004; 329: 1398–1401.
22. Engel, GL. The need for a new medical model: a challenge for biomedicine. Science 1977; 196: 129–136.
23. Wade, DT. Holistic health care. What is it, and how can we achieve it? 2009. http://www.noc.nhs.uk/oce/research-education/documents/HolisticHealthCare09-11-15.pdf.









