An Adaptation of the Profile of Mood States for Use in Adults With Phenylketonuria

Adults with phenylketonuria (PKU) experience disturbances in mood. This study used qualitative and quantitative techniques to adapt the 65-item Profile of Mood States (POMS) for the assessment of key mood domains in adults with PKU. First, cognitive interviews on 58 POMS items (excluding 7 Friendliness domain items) among 15 adults and adolescents (age ≥ 16 years) with PKU were conducted to eliminate items poorly understood or considered irrelevant to PKU; 17 items were removed. Next, the remaining POMS items were quantitatively examined (Mokken scaling and Rasch analysis) in 115 adult patients with PKU. An additional 21 items were removed iteratively, resulting in the 20-item draft PKU-POMS. Finally, the psychometric properties of the draft PKU-POMS were examined. The instrument displayed strong psychometric properties (reliability, validity, and responsiveness) over 6 domains (Anxiety, Depression, Anger, Activity, Tiredness, and Confusion) and all items were well understood in the final cognitive interviews with 10 adults with PKU.


Introduction
Phenylketonuria (PKU; Online Mendelian Inheritance in Man 261600 and 261630) is a rare autosomal recessive inborn error of metabolism in which the body is deficient in the phenylalanine (Phe) hydroxylase enzyme and as a result is unable to process the amino acid Phe. If untreated, Phe will accumulate in the body and cause mental retardation (in children), microcephaly, delayed speech, delayed social skills, psychiatric symptoms, and behavioral abnormalities. 1 Uncontrolled blood Phe levels in adulthood are also associated with executive dysfunction; lack of concentration; anxiety; depression; and a variety of behavioral, psychiatric, and mood problems. [2][3][4][5] ten Hoedt and colleagues 6 conducted a randomized double-blind, placebo-controlled cross-over study of Dutch patients with PKU to determine the effect of high Phe levels on mood using a revised version of the Profile of Mood States (POMSr) instrument designed to provide data on 5 categories of mood states: tension, depression, anger, vigor, and fatigue. 7 Results indicated that when patients were in the high Phe intake period, their overall mood was worse (p = .017), and they were more fatigued (p = .021) and less vigorous (p = .006) than when in the placebo period, providing evidence of the potential importance of the POMS and mood symptoms in understanding the effects of Phe intake in adults with PKU. However, no patient-reported outcome (PRO) measure of mood symptoms has been developed specifically for patients with PKU.
The original 65-item POMS PRO, developed in 1971, has been used across various therapeutic areas for the assessment of a respondent's transient and variable moods; however, it has never been used with patients with PKU. 8 The 65-item POMS is a self-administered evaluation that uses a 5-point Likert scale: 0 (Not at all), 1 (A little), 2 (Moderately), 3 (Quite a bit), and 4 (Extremely). Mood states are interpreted through 6 mood domains: tension-anxiety (9 items), depression-dejection (15 items), anger-hostility (12 items), vigor-activity (8 items), fatigue-inertia (7 items), and confusion-bewilderment (7 items). The POMS user manual contains the full list of these items. 8 The remaining 7 items relate to the friendliness domain; however, subsequent analyses by the POMS developers indicated that the friendliness domain is too weak to be scored. 9 Although widely used, the POMS instrument has been subjected to various revisions in order to reduce the number of items or to better adapt the instrument for the assessment of mood symptoms in a specific target patient population [10][11][12][13][14] or specific culture. 7,[15][16][17][18] These studies used either qualitative or quantitative research techniques to modify the instrument, resulting in a reduced number of items, the addition of items, or the reconfiguration of the domain structure. Limited validation research has been conducted to assess the psychometric properties of some of the reduced or adapted POMS instruments; no extensive validation research has been conducted on any one version. [19][20][21]
Aims
The aim of the current study was to use qualitative and quantitative analytic techniques to adapt the 65-item POMS for the assessment of key mood states in adults with PKU and to assess the psychometric properties of this revised instrument, as the original instrument was not developed or validated specifically for use in patients with PKU.

Methods
A multistep process that incorporated both qualitative and quantitative analyses was used to develop the PKU-POMS. This process consisted of 3 phases: (1) qualitative assessment of item comprehensibility, acceptability, and relevance to adults with PKU; (2) quantitative assessment of item and domain performance and item reduction; and (3) quantitative and qualitative assessment of the draft items.
Phase 1: Qualitative Assessment of Item Comprehensibility, Acceptability, and Relevance to Adults With PKU
In phase 1, one-on-one qualitative (mixture of in person and telephone) interviews were conducted in the United States among individuals with PKU in order to determine if any of the 58 POMS items (after removal of the seven friendliness items) should be removed due to the lack of comprehension, acceptability, and relevance to patients with PKU.
Patient population. Participants were eligible to participate if they were English speaking, 16 years of age or older, diagnosed with PKU, and were not currently taking or had not taken any medication intended to treat PKU (eg, levodopa, pegvaliase, Kuvan, BH4, and neutral amino acid) in the past 4 weeks.
Participants were recruited through study fliers that were distributed in person or via e-mail to patients attending the 2014 National PKU Alliance Conference and posted to a study Web site. This study was approved by the Ethical and Independent Review Services Institutional Review Board (http://www.eandireview.com/).
Procedures and analysis. Study staff trained in conducting patient interviews carried out one-on-one cognitive debriefing interviews with participating patients using a semi-structured interview guide, which involved asking the participants questions about the 58 POMS items to address their comprehensibility, acceptability, and relevance to patients with PKU. Following completion of the interviews, 2 study staff members independently reviewed the cognitive debriefing portion of the interview transcripts to determine if any of the POMS items should be removed based on established thresholds. Items were considered for removal if >25% of respondents did not understand the item, did not think the item was acceptable because it was not a "good word to use," or did not find the item relevant to PKU. Only the items retained through this analysis were included in the analyses in phase 2 (quantitative assessment).
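The removal rule described above can be illustrated with a minimal Python sketch. The function name, argument names, and the example counts are hypothetical and are not taken from the study; only the >25% threshold and the three criteria come from the text.

```python
def flag_for_removal(n_respondents, n_not_understood, n_not_acceptable,
                     n_not_relevant, threshold=0.25):
    """Return the criteria (possibly none) on which an item exceeds the threshold.

    An item is a removal candidate when MORE than `threshold` of respondents
    raised a concern on at least one criterion.
    """
    counts = {
        "comprehension": n_not_understood,
        "acceptability": n_not_acceptable,
        "relevance": n_not_relevant,
    }
    return [name for name, n in counts.items() if n / n_respondents > threshold]


# Hypothetical counts for a sample of 15 interviewees:
# 4 of 15 (26.7%) exceeds 25%, whereas 3 of 15 (20.0%) does not.
print(flag_for_removal(15, 4, 0, 0))  # ['comprehension']
print(flag_for_removal(15, 3, 3, 3))  # []
```

With 15 interviewees, the threshold is therefore crossed as soon as 4 respondents raise the same concern about an item.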

Phase 2: Quantitative Assessment of Item and Domain Performance and Item Reduction
In phase 2, a series of iterative quantitative analyses was conducted in order to assess item and domain performance of the POMS items retained from phase 1, and to remove poorly performing items, as needed, to achieve acceptable measurement properties in patients with PKU. This phase consisted of 3 steps: (1) examination of item-level descriptives and domain structure, (2) item reduction using Mokken scale assessment (MSA), and (3) final item evaluation using Rasch analysis.
Data source. The pegvaliase (BMN 165) clinical development program was designed to demonstrate the safety and efficacy of pegvaliase in the treatment of adult patients with PKU. Data were drawn from the PRISM-1 study (formerly referred to as BMN 165-301), a phase 3, open-label, randomized study designed to further characterize the safety of pegvaliase administered in induction, titration, and maintenance dose regimens in adults with PKU who have not had previous exposure to pegvaliase (ClinicalTrials.gov: NCT01819727). Study participants were 18 to 70 years old and recruited from 32 study sites in the United States. Individuals were required to have a diagnosis of PKU with blood Phe concentration >600 μmol/L at screening and average blood Phe concentration of >600 μmol/L over the past 6 months per medical history. The analytic population for the current analysis included all enrolled patients who completed a POMS assessment at day 1.
Step 1: Examination of item-level descriptive and domain structure. The frequency and percentage of each response category were assessed, as well as the mean, the percentages of minimum and maximum responses for floor and ceiling effects 22 (ie, >50% of participants selecting a response of 0 or 4), and the percentage missing. An item was flagged for further investigation and possible removal if it exhibited floor/ceiling effects or excessive missingness (>5% missing), as either finding indicates that the range of available response options is not appropriate for the patient population.
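As an illustration of these item-level checks, the sketch below computes the descriptive statistics and flags for a single item. It is not the study's code: the function name and example data are hypothetical, and computing the floor and ceiling percentages among non-missing responses is an assumption, since the text does not state the denominator.

```python
def describe_item(responses, floor_value=0, ceiling_value=4,
                  effect_cutoff=0.50, missing_cutoff=0.05):
    """Summarize one item scored 0-4; `None` marks a missing response."""
    n_total = len(responses)
    observed = [r for r in responses if r is not None]
    if not observed:
        raise ValueError("item has no observed responses")
    # Assumption: floor/ceiling percentages use observed responses only.
    pct_floor = observed.count(floor_value) / len(observed)
    pct_ceiling = observed.count(ceiling_value) / len(observed)
    pct_missing = (n_total - len(observed)) / n_total
    return {
        "mean": sum(observed) / len(observed),
        "pct_floor": pct_floor,
        "pct_ceiling": pct_ceiling,
        "pct_missing": pct_missing,
        "floor_effect": pct_floor > effect_cutoff,
        "ceiling_effect": pct_ceiling > effect_cutoff,
        "excess_missing": pct_missing > missing_cutoff,
    }


# Hypothetical item: 6 of 9 observed responses are 0, and 1 of 10 is missing.
summary = describe_item([0, 0, 0, 0, 0, 0, 1, 2, 3, None])
print(summary["floor_effect"], summary["ceiling_effect"], summary["excess_missing"])
```

In this sketch an item can be flagged on more than one criterion at once; how flagged items are then handled is a judgment made in the later steps.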
To understand the domain structure of the reduced POMS prior to further quantitative analyses, confirmatory factor analysis (CFA) was conducted. The CFA modeled the fit of the items to each of the 6 POMS domains separately. The following statistics and threshold values were used to evaluate model fit: comparative fit index (CFI) ≥0.90, root mean square error of approximation (RMSEA) ≤0.10, and weighted root mean square residual (WRMR) ≤0.05. Acceptable item fit was demonstrated if items had a factor loading ≥0.40. 23,24
Step 2: Item reduction using MSA. The MSA is a nonparametric method of data reduction that uses a probabilistic, hierarchical search procedure to identify the best subset of similar items within a measurement scale or domain of items. 25 The model assumes that the data are defined by unidimensionality (ie, 1 latent trait is being measured), local monotonicity (ie, as a person's mood decreases, the chance of giving a response indicating improved mood should never increase), and local independence (ie, responses to any 2 items should be independent and not depend on any other aspects of the respondent or items). Loevinger's coefficient H 26 was used to assess the scalability of the POMS items within each domain, and of the 6 domains within the overall instrument. Mokken 27 proposed an interpretation of this scalability coefficient: H > 0.5 indicates a strong scale, 0.4 < H < 0.5 indicates a medium scale, and 0.3 < H < 0.4 indicates a weak scale. Items with item-level scalability coefficients (loadings) <0.4 were considered candidates for item removal. Items were removed within a domain one at a time, and the scalability coefficients were rechecked after each removal. This process continued until no items had loadings below 0.4 and the scalability coefficient for each domain was considered medium or strong.
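To make the scalability coefficient concrete, the following sketch computes Loevinger's H for a set of items as the ratio of the summed observed inter-item covariances to the summed maximum covariances attainable given each item's marginal distribution (obtained by pairing the sorted scores of both items). This is one standard formulation for polytomous items and is offered only as an illustration; the study does not report which software or formulation was used. All function names and data are hypothetical, and treating values exactly on a boundary (eg, H = 0.5) as belonging to the higher category is an assumption.

```python
from itertools import combinations


def _cov(x, y):
    """Population covariance of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def _max_cov(x, y):
    """Largest covariance possible with the same marginals (sorted pairing)."""
    return _cov(sorted(x), sorted(y))


def scalability(items):
    """Return (scale H, {item name: item H}) for complete item score lists.

    `items` maps an item name to its scores, with respondents in the same
    order for every item. Items with zero variance are not handled here.
    """
    names = list(items)
    num = {name: 0.0 for name in names}
    den = {name: 0.0 for name in names}
    total_num = total_den = 0.0
    for a, b in combinations(names, 2):
        observed = _cov(items[a], items[b])
        maximum = _max_cov(items[a], items[b])
        for name in (a, b):
            num[name] += observed
            den[name] += maximum
        total_num += observed
        total_den += maximum
    return total_num / total_den, {name: num[name] / den[name] for name in names}


def interpret(h):
    """Label H using the cutoffs attributed to Mokken in the text."""
    if h >= 0.5:
        return "strong"
    if h >= 0.4:
        return "medium"
    if h >= 0.3:
        return "weak"
    return "unscalable"


# Hypothetical, perfectly ordered responses from 5 people to 3 items.
scale_h, item_h = scalability({
    "item_a": [0, 1, 2, 3, 4],
    "item_b": [0, 1, 1, 2, 4],
    "item_c": [0, 0, 1, 2, 3],
})
print(scale_h, interpret(scale_h))
```

An iterative reduction of the kind described above would repeatedly drop the item with the lowest item-level H below 0.4 and recompute the coefficients for the remaining items.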
Step 3: Final item evaluation using Rasch analysis. Following the Mokken scaling analysis, separate Rasch analyses were conducted using the reduced sets of items within each of the 6 POMS domains. Rasch analysis was selected to determine if each item and domain fit the Rasch measurement model, 28 which has scaling properties of linear and interval measurement. 29 With the Rasch model, the probability of a positive response is modeled as a logistic function of the simultaneous difference between a patient's PKU severity and the severity that is measured by that item response. The following parameters were examined to determine acceptable Rasch model fit: (1) threshold ordering (ie, determine if each item's response category probability distributions indicate the proper ordering of response category shifts across all categories); (2) residual fit statistics to assess item redundancy and multidimensionality (ie, evidence that an item is being affected by some other dimension other than the latent trait the scale is measuring); and (3) χ² item fit of the observed data to the model. Individual item test of fit showed the χ² probability for each item, where items with P < .05 were considered ill fitting. Items with high negative residuals (<−3.0) indicated an overfitting item, wherein the information provided by this item did not add additional value to the measure. High positive residuals (>3.0) indicated that the item was underfitting, indicating that the item had a poor model fit and the response categories were underdiscriminating or not discriminating differences in severity. 30 Failure to meet these standard thresholds can be indicative of a flawed item in need of revision or removal.
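The logistic relationship and the fit criteria described above can be sketched as follows. The probability function shows the dichotomous Rasch model only; the 5-category POMS items would require a polytomous extension, which the text does not specify. The function names, the example threshold values, and the flag labels are hypothetical, while the cutoffs (fit residual beyond ±3.0, χ² P < .05) are those stated in the text.

```python
import math


def rasch_probability(theta, b):
    """Dichotomous Rasch model: P(positive response) for person level
    `theta` and item location `b`, both on the same logit scale."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def thresholds_ordered(thresholds):
    """True when each successive category threshold is higher than the last."""
    return all(lower < upper for lower, upper in zip(thresholds, thresholds[1:]))


def classify_item(fit_residual, chi2_p, thresholds):
    """List the fit problems an item shows under the stated criteria."""
    flags = []
    if not thresholds_ordered(thresholds):
        flags.append("disordered thresholds")
    if fit_residual < -3.0:
        flags.append("overfit")
    elif fit_residual > 3.0:
        flags.append("underfit")
    if chi2_p < 0.05:
        flags.append("chi-square misfit")
    return flags


# A person located exactly at the item's location has a 50% chance
# of a positive response.
print(rasch_probability(0.0, 0.0))  # 0.5

# Hypothetical item whose last threshold falls below the one before it.
print(classify_item(0.4, 0.31, [-1.2, -0.3, 0.8, 0.5]))
```

An item returning an empty list meets all three criteria in this sketch; any non-empty result would mark it for revision, category collapsing, or removal.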
An iterative series of Rasch models was fit and examined in relation to the thresholds above, such that within each iteration, the previous iteration's most problematic item(s) was removed, continuing as long as there was poor fit of the Rasch model. However, to increase domain reliability, all efforts were made to retain at least 3 items per domain. 31 The remaining items were deemed the final items to compose the draft POMS for use in patients with PKU (ie, PKU-POMS), which was then subjected to a final quantitative and qualitative confirmatory assessment in phase 3.

Phase 3: Quantitative and Qualitative Assessment of the Draft Items
In phase 3, the psychometric properties of the draft PKU-POMS were assessed using data from the PRISM-1 study. These assessments included internal consistency reliability using Cronbach's α (day 1); convergent validity using Pearson's correlations between scores on each PKU-POMS domain and scores on the Attention-Deficit Hyperactivity Disorder Rating Scale-IV (ADHD RS-IV) and plasma Phe values (day 1); and responsiveness using Pearson's correlations between change from day 1 to end of study (EOS) on each PKU-POMS domain and change in the ADHD RS-IV and plasma Phe. The validation analyses using plasma Phe were treated as exploratory because, although plasma Phe is an objective measure of health in this patient population, it was unknown if changes in mood states would correspond to changes in Phe.
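For reference, Cronbach's α for a domain of k items is k/(k − 1) multiplied by 1 minus the ratio of the summed item variances to the variance of the total score. The sketch below applies that standard formula; the function name and the example scores are hypothetical and are not study data.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of per-item score lists.

    Every inner list holds one item's scores, with respondents in the
    same order across items and no missing values.
    """
    k = len(item_scores)
    totals = [sum(person) for person in zip(*item_scores)]
    summed_item_variance = sum(variance(scores) for scores in item_scores)
    return k / (k - 1) * (1 - summed_item_variance / variance(totals))


# Hypothetical 3-item domain answered by 5 respondents.
domain = [
    [0, 1, 2, 3, 4],
    [1, 0, 2, 4, 3],
    [0, 2, 1, 3, 4],
]
print(round(cronbach_alpha(domain), 3))
```

Items that rank respondents identically give α = 1, and α falls as the items agree less with one another.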
Finally, one-on-one cognitive interviews were conducted via telephone to investigate the acceptability of the draft PKU-POMS in adults with PKU, including the relevance of all retained items as well as the understandability of the instructions, recall period, response options, item content, and domains. Participants were drawn from a separate sample that was highly similar to the phase 1 interview sample (with the same inclusion/exclusion criteria required for participation), in order to ensure that the final draft instrument was understood in the same population in which it was created and validated.

Results
Phase 1: Qualitative Assessment of Item Comprehensibility, Acceptability, and Relevance to Patients With PKU
Interviews were conducted with 9 women (60%) and 6 men (40%) with PKU (mean age: 30.4 ± 12.9 years). All participants were diagnosed with PKU during the first year of life, and the majority were currently following a Phe-restricted diet (93.3%). Only 2 participants (13.3%) reported a formal diagnosis of anxiety, and 2 (13.3%) reported a formal diagnosis of depression.
Seventeen POMS items were identified that were not well understood (eg, Muddled, Shaky), not acceptable as a good word to use (eg, Bushed, Bewildered), or not relevant to patients with PKU (eg, Unworthy, Guilty). These 17 items were removed from the proposed PKU-POMS, as detailed in Table 1, resulting in 41 remaining items.

Phase 2: Quantitative Assessment of Item and Domain Performance and Item Reduction
Data from a total of 115 patients with PKU were available for the analysis from the PRISM-1 study. Patient demographic characteristics are presented in Table 2.
Step 1: Examination of item-level descriptive and domain structure. Descriptive statistics for the 41 remaining POMS items are shown in Table 3. At day 1, 14 of the 41 items displayed floor effects (ie, item where ≥50% of sample responded "not at all"), while no items displayed ceiling effects (ie, item where ≥50% of sample responded "extremely"). No items exhibited missingness >2%.
The CFA was conducted using the retained 41 items, yielding estimates of global model fit for each individual POMS domain. Using thresholds for acceptable fit, global model fit was found to be unacceptable for all 6 POMS domains (Table 4). In addition, the items weary, alert, and efficient were found to have low item loadings (ie, <0.40). The findings of lack of domain and item-level fit indicated that these 41 remaining POMS items did not conform to the 6-domain structure of the original POMS in this patient population, and informed the item analyses that followed.
Step 2: Item reduction using MSA. The MSA was conducted to assess item fit to each domain as well as the scalability of each of the 6 domains to the overall model. Using the H coefficient as an indicator of scalability, the depression-dejection and fatigue-inertia domains were found to be strong, tension-anxiety and anger-hostility were found to be medium, and vigor-activity and confusion-bewilderment were found to be weak (column 1 of Table 5).
In addition, each domain included at least 1 item that had a low item loading (ie, H < 0.40). Items with low item scalability coefficients were removed from each domain, one at a time, and the fit of the domain reassessed. This iterative process continued within each domain separately until all items demonstrated a loading above 0.4 and each domain had an H value indicative of a strong scale (column 2 of Table 5). Specifically, in the first iteration of the MSA analysis, the following items were removed by domain: alert (vigor-activity), relaxed (tension-anxiety), efficient and uncertain about things (confusion-bewilderment), weary (fatigue-inertia), rebellious (anger-hostility), and terrified  (depression-dejection). In the second iteration of the MSA analysis, cheerful (vigor-activity) and resentful (anger-hostility) were removed. Finally, in the third iteration of the MSA analysis, only bitter (anger-hostility) was removed. A total of 10 items were removed in step 2, resulting in 31 remaining items.
Step 3: Final item evaluation using Rasch analysis. The fit of the remaining 31 items to the Rasch measurement model was then assessed, separately within each POMS domain (Table 6). This analysis demonstrated that the fit of each item within the vigor-activity and tension-anxiety domains met the established thresholds for the Rasch measurement model. For the confusion-bewilderment domain, the fit of each item was acceptable according to the fit residuals and χ² values; however, the response thresholds for forgetful were found to be disordered. Specifically, moving from a response of 0 (Not at all) to a 1 (A little) and from 1 (A little) to 2 (Moderately) corresponded to a linear increase in the latent trait of experiencing a confused mood state, but moving from a 3 (Quite a bit) to 4 (Extremely) was actually associated with a lesser experience of a confused mood state. By collapsing response options of Extremely with Quite a bit, the thresholds for this item were ordered, and the fit of all items in this domain was found acceptable. For the fatigue-inertia domain, fatigued was also found to have disordered thresholds, but the removal of this item rather than collapsing item response thresholds increased the performance of this domain overall.
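Collapsing two adjacent response options, as was done for forgetful, amounts to a simple recoding of the item before the model is refit. The sketch below shows that recoding only; the function name and example responses are hypothetical, and refitting the Rasch model on the recoded item is outside its scope.

```python
def collapse_top_categories(responses, merge_from=4, merge_into=3):
    """Recode `merge_from` (Extremely) as `merge_into` (Quite a bit).

    All other values, including None for missing, are left unchanged.
    """
    return [merge_into if r == merge_from else r for r in responses]


# Hypothetical responses to one item on the 0-4 scale.
print(collapse_top_categories([0, 1, 2, 3, 4, 4, None]))
# [0, 1, 2, 3, 3, 3, None]
```

After recoding, the item has 4 response categories rather than 5, so one fewer threshold is estimated for it.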
For the remaining 2 domains (anger-hostility and depression-dejection), a more iterative series of Rasch analyses was conducted to arrive at the best fit because of the large number of items within these domains that did not fit the Rasch measurement model (Table 6). The selection of items to remove was based on the performance of each item in relation to established thresholds, the possibility of collapsing response options, and the distribution of each item (eg, presence of high floor effects [>50%]; Table 3). In the anger-hostility domain, 4 items demonstrated disordered thresholds in the initial Rasch analysis. To improve fit, all items were removed except the 3 without floor effects (angry, grouchy, and annoyed), but in the resulting model, grouchy remained disordered. After collapsing the response options of quite a bit and extremely in grouchy, an acceptable fit was found for this domain. For depression-dejection, 6 items displayed threshold disordering in the initial Rasch analysis, and 7 displayed high floor effects. The set of 3 items without floor effects (unhappy, sad, and discouraged) fit the Rasch model well with no threshold disordering. In order to retain at least 4 items, all removed items were reintroduced one by one with the 3 remaining items, and the fit of the items and domain reassessed. Only the reincorporation of lonely performed adequately, and this item was included in the final depression-dejection domain.
Overall, 11 items were removed using Rasch analysis; the Rasch model parameters and item fit statistics of the remaining 20 items are provided in Table 6. A final MSA within each domain of the remaining 20 items was also conducted (column 3 of Table 5); all item loadings were acceptable and the scalability coefficient indicated that all items and scales were medium to strong. Given acceptable scalability and fit to the Rasch model, these remaining 20 items were used to compose the draft PKU-POMS. The 6 domains were renamed to reflect the remaining items after removal of items from phases 1 and 2 as follows: anxiety, depression, anger, activity, tiredness, and confusion (formerly tension-anxiety, depression-dejection, anger-hostility, vigor-activity, fatigue-inertia, and confusion-bewilderment, respectively).
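The item counts reported across the phases can be reconciled with a short tally. The sketch below uses only the counts stated in the text; the function name and step labels are illustrative.

```python
def item_flow(start, removals):
    """Return the running item count after each removal step."""
    remaining = start
    trail = [("original POMS", remaining)]
    for label, n_removed in removals:
        remaining -= n_removed
        trail.append((label, remaining))
    return trail


steps = [
    ("friendliness domain excluded", 7),
    ("phase 1 cognitive interviews", 17),
    ("phase 2 Mokken scale assessment", 10),
    ("phase 2 Rasch analysis", 11),
]
for label, remaining in item_flow(65, steps):
    print(f"{label}: {remaining} items")
```

The tally runs 65, 58, 41, 31, and 20 items, and the 2 quantitative steps together account for the 21 items removed in phase 2.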

Phase 3: Quantitative and Qualitative Assessment of Draft Items
Psychometric assessment of the draft 20-item PKU-POMS indicated that each domain demonstrated high internal consistency (Cronbach's α = 0.75-0.87). Pearson's correlations between each domain and the ADHD RS-IV (except activity and depression) were statistically significant. Correlations with plasma Phe levels were not statistically significant (Table 7). Finally, when using correlations with the ADHD RS-IV inattentiveness scores as the primary analysis of responsiveness, all PKU-POMS domains (except depression) exhibited responsiveness to change from day 1 to EOS (Table 7). Correlations with plasma Phe were treated as exploratory and demonstrated support for the responsiveness of anxiety, depression, and confusion domains.
Finally, cognitive interviews were conducted with 5 women and 5 men with PKU (mean age: 27.2 ± 4.6 years).

Discussion
The original 65-item POMS is a widely used instrument for the assessment of mood states, and multiple revised versions of the measure have been created in order to adapt the measure to a specific patient population [10][11][12][13][14] or specific culture. 7,15-18 However, no known PRO for the assessment of mood states, or revised version of the POMS, has been created for patients with PKU. Additionally, the content validity and psychometric properties of the POMS in patients with PKU have not been investigated previously. Thus, in the current study, a 3-phase, mixed qualitative and quantitative approach was used to assess the content validity, revise the original POMS for use specifically in patients with PKU, and assess the psychometric properties of the new PKU-POMS. In phase 1, cognitive interviews with adults with PKU were conducted on the POMS to identify and remove items that were poorly understood, thought to be unacceptable as appropriate words to describe the patient's mood state, or considered irrelevant to patients with PKU. In total, 17 items were removed across all 6 domains from the 58 original POMS items (excluding the 7 friendliness domain items). As noted above, previous investigations conducted on multiple patient populations have reduced the original POMS by removing poorly performing items. All 17 items removed in the present study correspond to other independent item reduction studies where the same POMS items have been removed, such as sorry for things done, 7,12,16,32 shaky, 7,12,14,17 unworthy, 12-15,17,18 muddled, 7,13,18,33 and carefree. 7,[12][13][14]18 These findings provide further support for the removal of the selected items in the current investigation using input from adults and adolescents (≥16 years) with PKU.
In phase 2, the remaining items were assessed quantitatively first using MSA and then Rasch analysis. Using MSA, an iterative process was used to remove the worst items within weak domains until all domains and all items achieved at least a moderate scalability level, and an additional 10 POMS items were removed. Following the MSA analysis, Rasch analyses were conducted on the reduced 31-item POMS, again using an iterative process within each of the 6 domains, to remove all misfitting items. Much like in phase 1, all items removed through the quantitative analysis had previously been identified as poorly performing items and removed through other revisions of the POMS, such as ready to fight, 7,12,13,33 miserable, 12,13,18 and helpless. 13,14,16 The MSA technique is a useful nonparametric approach for the development of questionnaires to measure health constructs, 34 which relies on less stringent statistical assumptions compared to Rasch analysis. 35 By coupling these 2 approaches in the development of the PKU-POMS, the current study was able to first identify the most problematic items within each domain and then more specifically assess the fit of the remaining items and domains to the stricter Rasch measurement model. The fit of the draft 20-item PKU-POMS to the Rasch measurement model and scalability was found acceptable.
In phase 3, the psychometric properties of the draft 20-item PKU-POMS were examined, and cognitive interviews among adults with PKU were conducted to investigate the acceptability of the draft instrument in the target population. These analyses demonstrated that the revised domains were internally consistent and provided preliminary support for the validity and responsiveness of the instrument.
Various research studies in patients with PKU have demonstrated that even with early initiation of treatment, there is still an increased prevalence of various psychiatric, neurocognitive, and behavioral problems, including problems with mood, [2][3][4][5] and that these psychiatric problems are associated with increases in Phe. Using the Brief Symptom Inventory (BSI), Bilder and colleagues 36 assessed the psychiatric symptom patterns in 64 patients with PKU, finding that 6 of the 7 subscales of the BSI were elevated in patients with PKU. Further, in a systematic review of 10 published intervention and case reports, there was a clear association between reductions in Phe and marked reductions in psychiatric symptoms in all studies. 37 However, no instruments of psychiatric symptoms used in any of the clinical studies identified were developed specifically for use in adults with PKU. The revised PKU-POMS developed through 3 phases of qualitative and quantitative research with adults having PKU can help more clearly elucidate the relationships between Phe and mood symptoms and serve as a valid, reliable, and responsive measure of changes in mood symptoms in clinical studies of PKU.
The strengths of this study included the use of both qualitative and quantitative methods to adapt the POMS for use in patients with PKU. However, certain limitations of the study should be noted. Specifically, the participants recruited for phase 1 included only patients who took part in the National PKU Alliance Conference or who saw the recruitment flier online via a study Web site. Thus, it is unknown if these findings generalize to a broader sample of adults with PKU. Additionally, the sample size in the PRISM-1 study at the EOS was limited (n = 65), and the psychometric validation analyses utilized the same data with which the quantitative data reduction analyses were conducted. Thus, a follow-up psychometric validation study on the PKU-POMS in a separate sample of patients with PKU will be used to further inform the usefulness and psychometric properties of this instrument.

Conclusion
A detailed, 3-phase item reduction process incorporating qualitative and quantitative techniques yielded the 20-item PKU-POMS instrument. This new instrument is designed for the assessment of treatment efficacy on change in mood states in clinical trials of adult patients with PKU.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Elizabeth Bacci, Kathleen Wyrwich, and Katharine Gries are employees of Evidera, which received financial support from BioMarin Pharmaceuticals in connection with the study development and execution as well as the manuscript development.