Do you mind a closer look? A jingle-jangle fallacy perspective on mindfulness

Mindfulness is defined inconsistently, and its various measures resemble established personality self-report scales. Therefore, jingle and jangle fallacies are likely to undermine the construct’s utility. To address these issues, we conducted two studies to test three hurdles of validity: 1) a sound definition and measurement model, 2) empirical distinctiveness, and 3) incremental criterion validity. We established an overarching and inclusive mindfulness definition covering twelve aspects. Based on this definition, we used an item sampling algorithm to select items from eight mindfulness scales. We established an eclectic bi-factor and a single-factor model, both fitting the data well. Bivariate latent variable correlations between a single mindfulness factor and big-five/six personality factors reached up to .68. Although 50% of mindfulness' variance was unaccounted for by the personality factors, it provided no meaningful incremental criterion validity over personality factors. Our results indicate that mindfulness has little or no incremental utility above established personality factors.


Introduction
Mindfulness, often referred to as one of the core concepts of Buddha's teachings (Hạnh, 1999), has gained increasing attention in psychological research. Mindfulness-based interventions are used to reduce psychological distress and treat mental disorders in clinical research because mindfulness is assumed to enhance mental health (Grossman & van Dam, 2011; Purser & Milillo, 2015). However, to study the effectiveness of such interventions, their target, mindfulness, must be clearly defined and validly measurable. This requirement implies a sound measurement model consistent with a theory-based definition. However, there is no consensus on how to define and measure mindfulness (Hanley & Garland, 2017a). Furthermore, mindfulness needs to be distinct from established psychological constructs (Geiger et al., 2018). Here, we consider the distinction between mindfulness and personality factors essential because they are the best-understood typical behavior constructs in psychology. Personality factors relate to a range of life outcomes, such as divorce, mortality, and occupational attainment (Ozer & Benet-Martinez, 2006; Roberts et al., 2007). Mindfulness also needs to provide incremental value in predicting such criterion variables (Sechrest, 1963) above personality factors.
Therefore, the main goal of the current study is to thoroughly test the validity of mindfulness with regard to three hurdles: (1) an adequate definition and measurement model, (2) distinctiveness from personality constructs, and (3) incremental criterion validity over and above personality constructs (M. Geiger et al., 2018).

Hurdle 1 - Definition and Measurement Model for Mindfulness
Before endorsing a novel trait as valid and useful, an adequate definition and a corresponding sound measurement model for the construct are necessary.
In Buddhism, mindfulness is understood as the Pali term "sati," best translated as the infinitive phrase "to be mindful." As such, Buddhists primarily refer to mindfulness as a practice or process with different phases and not as a mental function or trait (Grossman & van Dam, 2011). They describe mindfulness practice as (1) deliberate, open-hearted awareness of moment-to-moment perceptible experience; (2) a process held and sustained by such qualities as kindness, tolerance, patience, and courage (as underpinnings of a stance of non-judgementalness and acceptance); (3) a practice of non-discursive, non-analytic investigation of ongoing experience; (4) an awareness markedly different from everyday modes of attention; and (5) in general, a necessity of systematic practice for its gradual refinement (Grossman & van Dam, 2011, p. 221).
Western psychologists often claim that their mindfulness definitions are based on this Buddhistic perspective; however, most exclude ethical behavior and the four immeasurables from their definition (van Dam et al., 2018). Furthermore, research groups vary with respect to the number and content of the remaining mindfulness aspects to be considered. This results in various self-report scales. Contrary to Buddhist definitions that consider mindfulness to be a practice or state, most mindfulness questionnaires refer to the construct as a "trait-like" disposition. For instance, instructions usually include phrases such as "your opinion of what is generally true for you" (Five Facets of Mindfulness Questionnaire, FFMQ; Baer et al., 2006) or "a collection of statements about your everyday experience" (Mindful Attention Awareness Scale, MAAS; Brown & Ryan, 2003). Furthermore, the items themselves include phrases such as "in everyday life" (Comprehensive Inventory of Mindful Experiences, CHIME; Bergomi et al., 2014). We are aware of only two scales, the Toronto Mindfulness Scale (TMS; Lau et al., 2006) and the State Mindfulness Scale (SMS; Tanay & Bernstein, 2013), that refer to "state-like" experiences during meditation- or mindfulness-based training sessions. Although this conceptualization seems closer to Buddhistic understanding, these state mindfulness scales are rarely used in psychological research (van Dam et al., 2018). We therefore exclude the TMS and SMS from the present manuscript and will refer to trait mindfulness unless stated otherwise.
To represent all hitherto empirically studied aspects of the trait-like mindfulness construct captured by the different definitions and scales, we introduce a working definition: Mindfulness can be eclectically described as an aggregate of twelve aspects. Eight of these aspects represent trait-like dispositions, which determine how one deals with and evaluates one's own thoughts and emotions: (1) observing, (2) acting with awareness, (3) non-judging, (4) non-reacting, (5) insightful understanding, (6) describing, (7) considering relativity, and (8) being open. The remaining four aspects determine one's attitudes toward other people and can be summarized as pro-social tendencies: (9) being loving and kind, (10) being compassionate, (11) showing empathetic joy, and (12) behaving ethically.
Table 1 provides an overview of all twelve mindfulness aspects, their definitions, and exemplary self-report items, if available. The twelve aspects cover both the Western aspects and the Buddhist pro-social aspects of mindfulness.
Although the most popular mindfulness scales all consider mindfulness a trait, no single scale captures all twelve aspects, not even all eight "Western" aspects (excluding the four Buddhistic aspects, see Table 2).
As a consequence, mindfulness scales diverge in their underlying measurement models. Some authors take a one-dimensional perspective on the construct, for example, as represented by the MAAS, the Cognitive and Affective Mindfulness Scale Revised (CAMS-R; Feldman et al., 2007), the Freiburg Mindfulness Inventory (FMI; Walach et al., 2006), and the Southampton Mindfulness Questionnaire (SMQ; Chadwick et al., 2008). Others see mindfulness as a multidimensional construct, represented by, for example, the Kentucky Inventory of Mindfulness Skills (KIMS; Baer et al., 2004) or the Philadelphia Mindfulness Scale (PHLMS; Cardaciotto et al., 2008). By testing higher order factor models, the authors of the FFMQ or CHIME combine multidimensionality and one-dimensionality. Just as the various mindfulness definitions require a unified working definition, we must attempt to unify the diverging measurement models. A purely multidimensional perspective does not seem plausible, considering the high communality among mindfulness aspects (Baer et al., 2004; Cardaciotto et al., 2008; Siegling & Petrides, 2014; Walach et al., 2006). Instead, a unidimensional structure or a combination of uni- and multidimensionality is plausible. According to Ockham's razor, the more parsimonious model is superior among equally well-fitting models. Therefore, we must compare a parsimonious single-factor model to plausible competitors that combine uni- and multidimensionality.
Note. We did not include equanimity (being undisturbed by present experiences) because it is conceptually similar to the aspect of non-reactivity.
Different yet psychometrically related latent variable models may be considered competitors, for example, a higher order factor model or a bi-factor model. In both models, a general factor captures communality among mindfulness aspects. Nevertheless, the models differ in how aspect-specific variance is represented: either by residuals of first-order factors (also known as disturbance terms) in a higher order model or by orthogonal specific factors in a bi-factor model (Brunner et al., 2012). Although both models can be considered nested (Mulaik & Quartetti, 1997; Schmid & Leiman, 1957; Yung et al., 1999), the general factor in a higher order factor model has no direct relation to the indicators, making its interpretation more complicated. Therefore, we compared a single-factor model for mindfulness to a bi-factor model.
Note. OB = observe, AW = act with awareness, NJ = non-judgement, NR = non-reactivity, OP = openness, RE = relativity, IU = insightful understanding, DE = describe, LK = loving kindness, CO = compassion, EJ = empathetic joy, EB = ethical behavior. ■ indicates that an aspect was covered by the scale, whereas □ indicates that it was not.
Hurdle 2 - Distinctiveness of Mindfulness From Personality Factors
The second hurdle of validity is empirical distinctiveness or divergent validity. Constructs like mindfulness, which are mainly operationalized via self-descriptive statements about typical behaviors, emotions, and thoughts, should demonstrate empirical and theoretical distinctiveness from similar constructs, such as established personality factors like the big-five (neuroticism or emotionality, extraversion, openness, agreeableness, and conscientiousness) or big-six (big-five plus honesty-humility). These personality factors represent the most widespread framework in trait research and are thus likely to draw on the most extensive item universe in psychology (L. R. Goldberg et al., 2006).
Using the big-five or big-six as a benchmark for testing mindfulness' empirical distinctiveness may suggest that personality factors have high validity. However, although this is a common assumption, it is not necessarily true. For example, measurement models of personality factors often do not pass model fit thresholds (Booth & Hughes, 2014; Hopwood & Donnellan, 2010). Therefore, we do not assume the big-five or big-six to be the "measure of all things" or the "gold standard." Nevertheless, among economically measured typical behavior constructs, they prove to be comparably high in criterion validity and therefore have high acceptance among researchers (L. R. Goldberg, 1990; Ozer & Benet-Martinez, 2006). Consequently, these factors are an integral part of differential psychology, and thus any emerging construct, including mindfulness, should prove its distinctiveness from these dispositions.
According to a multi-trait-multi-method approach (Campbell & Fiske, 1959), empirical distinctiveness should result in smaller between-construct correlations than within-construct correlations. This means the correlations of mindfulness measures with personality factors should be smaller than those among different mindfulness measures and subscales. In addition, those correlations should also be smaller than the correlations between similar personality factors. Meta-analyses have shown the highest correlations between neuroticism and conscientiousness (−.29 to −.32) and neuroticism and extraversion (−.26 to −.34) (Thielmann et al., 2021; van der Linden et al., 2010). These correlations are reported on the manifest level and are thus not corrected for reliability. Based on this, we propose to consider manifest correlations up to r = |.34| as acceptable for constructs assumed to be distinct from each other, whereas correlations exceeding |.34| indicate construct overlap. In the following, we will evaluate the plausibility of theoretical and empirical construct overlap between mindfulness and the big-five or big-six personality factors.
Conscientiousness shares conceptual overlap with mindfulness because it covers aspects of perseverance or deliberateness which seem comparable to mindfulness aspects of being non-reactive or non-judgemental. Meta-analytic correlations support this assumption, ranging between r = .29 and .33 (Banfi & Randall, 2022; Giluk, 2009; Hanley & Garland, 2017b).
Depending on its definition, openness (sometimes labeled as or extended by the component intellect) can also be considered conceptually close to mindfulness. Although the meta-analytic correlations ranging between r = .15 and r = .19 (Banfi & Randall, 2022; Giluk, 2009; Hanley & Garland, 2017b) do not necessarily support this relationship, the magnitude of the correlation varies depending on the mindfulness or personality measure. For example, when assessing personality factors with the NEO-PI-R and mindfulness with the FFMQ, the manifest correlation between mindfulness and openness is r = .35 (Hollis-Walker & Colosimo, 2011). In contrast, the correlation is much smaller (r = .09) when using the BFI and the MAAS (Latzman & Masuda, 2013).
Agreeableness should be related to Buddhist aspects of mindfulness as they encompass pro-social tendencies and the aspect of ethical behavior. However, popular mindfulness questionnaires do not explicitly cover such aspects (e.g., the four immeasurables). Therefore, agreeableness and mindfulness might appear empirically more distinct than they are conceptually. Meta-analytic findings underline this with small correlations of r = .19 to .26 (Banfi & Randall, 2022; Giluk, 2009; Hanley & Garland, 2017b).
Comparable to agreeableness, the conceptual link between mindfulness and honesty-humility is reduced to the pro-social aspects of mindfulness as expressed in the four immeasurables. Again, as popular mindfulness scales do not explicitly cover those aspects, the empirical relation between honesty-humility and mindfulness should be smaller than the conceptual relationship indicates. Unfortunately, meta-analyses do not provide empirical estimates for the correlation between both constructs, as they only include measures of the big-five. Research examining the relation between honesty-humility and mindfulness focuses on a sub-construct called social mindfulness. Being socially mindful can be described as "safeguard[ing] other people's control over their own behavioral options in situations of interdependence" (van Doesum et al., 2013, p. 86). A widespread tool for measuring social mindfulness is a paradigm in which participants play with a hypothetical other. They freely choose one out of three objects, one of which is unique. Choosing the unique item is deemed socially unmindful because the hypothetical other then has no real choice left. Correlations between mindfulness and honesty-humility facets then range from r = .15 to .22 (van Doesum et al., 2019). However, before embedding such findings into an overarching mindfulness model, social mindfulness measures must be studied more thoroughly.
Based on these meta-analytic correlations, mindfulness cannot be deemed entirely redundant to any personality factor. However, most reported correlations are attenuated, so we expect latent variable correlations to exceed the reported meta-analytic correlation coefficients. Furthermore, reported coefficients might be affected by poor and diverging mindfulness measures, which in some cases leads to large differences in effect sizes. For example, the correlation of mindfulness with neuroticism (measured with the NEO-FFI) ranges from r = .37 (KIMS) to .63 (CAMS-R) (Baer et al., 2006). As a consequence, the divergent validity of mindfulness is unclear, despite existing meta-analyses. To deliver unequivocal results concerning divergent validity, an eclectic measurement model for mindfulness is instrumental. Hurdle 2 will be passed only if one or multiple mindfulness factors from an overarching and valid measurement model prove to be empirically distinct from personality factors. In particular, latent variable correlations of mindfulness with any personality factor should not exceed the highest latent variable correlation observed between established personality factors.
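To illustrate why attenuated manifest correlations understate latent variable correlations, Spearman's classical correction for attenuation can be sketched as follows. This is only an illustration: the reliability of .80 is a hypothetical assumption, not an estimate from any cited study, and the function name is likewise illustrative. The manifest values are the upper bounds of the meta-analytic ranges cited above.

```python
import math

def disattenuate(r_observed: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction: divide the manifest correlation by the
    square root of the product of the two reliabilities."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_observed / math.sqrt(rel_x * rel_y)

# Upper bounds of the meta-analytic manifest correlations cited in the text.
manifest = {"conscientiousness": 0.33, "openness": 0.19, "agreeableness": 0.26}

# Hypothetical reliability assumed for both measures (illustration only).
ASSUMED_RELIABILITY = 0.80

for factor, r in manifest.items():
    corrected = disattenuate(r, ASSUMED_RELIABILITY, ASSUMED_RELIABILITY)
    print(f"{factor}: manifest r = {r:.2f}, corrected r = {corrected:.2f}")
```

Under any reliability below 1, the corrected coefficient is larger in absolute value than the manifest one, which is why latent variable correlations are expected to exceed the meta-analytic estimates.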

Hurdle 3 - Incremental Validity of Mindfulness Over Personality
Even if a construct demonstrates divergent validity with respect to the mentioned criteria, it still needs to prove its incremental validity (Sechrest, 1963). As a typical behavior trait measured via self-report, mindfulness must compete against the most established self-report measures of typical behavior traits, that is, personality factors. Assuming mindfulness is at least somewhat related to personality factors, their shared variance could explain any relation between mindfulness and a criterion as well.
Since mindfulness interventions are of particular interest in clinical psychology, an apparent criterion for mindfulness is mental health (Kabat-Zinn, 2003; Segal et al., 2002). As mental health is a broad construct, operationalizations and assessments vary across studies. Mindfulness researchers commonly use depression or anxiety scales, as well as general questionnaires about psychological well-being, as indicators of mental health (e.g., Brown & Ryan, 2003; Carpenter et al., 2019; Tran et al., 2020). Criterion validity is then usually examined using correlations between mindfulness and such mental health variables. Meta-analyses, for example, report correlations between the FFMQ and depression or anxiety scales in the range of r = −.35 to −.71 (Carpenter et al., 2019). When controlling for neuroticism, the MAAS demonstrated some incremental criterion validity for different mental health indices (Brown & Ryan, 2003). However, Tran et al. (2020) found conflicting results: when predicting scores on a mental health questionnaire, the FFMQ scores did not demonstrate incremental validity when controlling for all big-five factors.
These findings, and those of similarly reported indices for (incremental) criterion validity of mindfulness, must be interpreted cautiously. Most personality inventories explicitly conceptualize neuroticism using adjectives such as depressive, stressed, or anxious, and many conceptualizations of neuroticism include identically or similarly labeled facets. Comparing items of mental health scales with neuroticism scales reveals substantial overlap (e.g., Geiger et al., 2018), hinting toward predictor-criterion contamination. Disentangling this contamination is almost impossible among self-report scales of typical behavior. In order to accurately estimate the incremental criterion validity of mindfulness, assessing outcomes with a more distinct methodology is necessary. For instance, biographical information (L-data; Cattell, 1957), such as the number of psychological treatments received, could be a suitable alternative.
Other popular criterion constructs for mindfulness are satisfaction with life (Brown & Ryan, 2003; Christopher & Gilbert, 2010); a healthy lifestyle, indicated by nutritional or exercise habits, or sustainable living, indicated by different sustainable consumption scales (S. M. Geiger et al., 2019; Lentz et al., 2019; Soriano-Ayala et al., 2020); relationship quality (McGill et al., 2016); and spirituality (Carmody et al., 2008; Greeson et al., 2011). The instruments used to examine the relationship between mindfulness and the criteria mentioned above differ from study to study. Furthermore, these studies rarely investigate the incremental validity of mindfulness over personality factors. If they do, evidence does not support any incremental validity of mindfulness (Buchanan, 2019).
To prove incremental validity over and above common personality factors, the construct mindfulness must show a sufficient increase in explained variance in criteria. The interpretation of an increment strongly depends on the context. Therefore, we suggest considering not only the size of the increment but also the total amount of explained variance in the criterion. For instance, a 1% increase in explained variance (ΔR² = .01) is relatively high if the overall explained variance in a criterion is R² = .05, but it is low if the overall explained variance is much higher, for example, R² = .30.
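The arithmetic behind this example can be made explicit with a short Python sketch. It assumes, for illustration, that the overall R² is the total explained variance including the increment and that the increment is judged as a share of that total; the function name is hypothetical.

```python
def relative_increment(delta_r2: float, total_r2: float) -> float:
    """Share of the total explained variance attributable to the increment."""
    if not 0 < delta_r2 <= total_r2 <= 1:
        raise ValueError("expected 0 < delta_r2 <= total_r2 <= 1")
    return delta_r2 / total_r2

# The two scenarios from the text: the same increment, different totals.
for total in (0.05, 0.30):
    share = relative_increment(0.01, total)
    print(f"Delta R2 = .01 of R2 = {total:.2f}: {share:.1%} of explained variance")
```

The same absolute increment accounts for a fifth of the explained variance in the first scenario but only about a thirtieth in the second.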

Current Studies
In this manuscript, we seek to put the validity of mindfulness to a critical test against the three aforementioned hurdles. After applying metaheuristic item sampling, we tested two competing measurement models for mindfulness in Study 1. Next, we assessed the divergent validity of mindfulness to personality factors by estimating bivariate latent variable correlations, and by modeling mindfulness as a personality facet. With the latter, we aimed to test whether mindfulness is a linear combination of established personality factors. Finally, in the third step, we estimated the incremental criterion validity of mindfulness.
In Study 2, we pursued the same goals but used a different personality measure. In contrast to Study 1, the personality measure in Study 2 already had an adequate model fit (no within-study optimization) and measured the big six.

Methods
Sample. The study was conducted following the standards of the Declaration of Helsinki. For Study 1, we recruited an online community sample (n = 508) via Prolific. We report a sample size rationale in the preregistration (https://aspredicted.org/X31_LKF), and applied several preregistered data cleaning steps. First, n = 4 participants who failed one of the four attention checks were excluded. Second, univariate outliers were defined as values deviating more than 3 SD from the mean on a single item. Such outliers and all remaining values on the same scale were set to missing. Third, multivariate outliers, defined as falling outside the 99th percentile of the squared Mahalanobis distance distribution within either the mindfulness or personality items, were also excluded. Accordingly, all mindfulness items or all personality factor items were set to missing for those participants. Finally, after excluding participants with >20% missing values (n = 32), the final sample in Study 1 consisted of n = 472 participants (178 female, 7 other), with an average age of 27.23 (SD = 9.47) years.
Since no restrictions were set on Prolific, the resulting sample was diverse in terms of participants' country of residence. The three most common countries of residence were the UK (21.8%), Poland (21.8%), and Portugal (15.8%). A total of 118 participants listed English as their primary language. The remaining participants were assumed to be sufficiently proficient in English because they participated in the English Prolific panel. The sample was also diverse in educational attainment (30.1% high school, 18% some college, and 29.4% bachelor or more than a bachelor's degree).
Design. Study 1 consisted of three blocks. In the first block, self-report items from personality and mindfulness scales were presented in a mixed order to participants. This block was followed by the criteria instruments and demographic information. In a third block, participants responded to a new supernormality measure. The results of the third block are not presented in this manuscript.
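The outlier and missingness rules described above can be sketched in Python with NumPy. This is not the authors' R syntax; the function names, the simulated data, and the use of an empirical (rather than chi-square based) 99th percentile for the squared Mahalanobis distance are assumptions made for illustration.

```python
import numpy as np

def mask_univariate_outliers(scale_items, cutoff=3.0):
    """Set a participant's whole scale to NaN if any item lies more than
    `cutoff` SDs from that item's mean."""
    x = np.asarray(scale_items, dtype=float)
    z = (x - np.nanmean(x, axis=0)) / np.nanstd(x, axis=0, ddof=1)
    flagged = (np.abs(z) > cutoff).any(axis=1)
    cleaned = x.copy()
    cleaned[flagged, :] = np.nan
    return cleaned

def mask_multivariate_outliers(item_block, percentile=99.0):
    """Set all items of a block to NaN for rows whose squared Mahalanobis
    distance exceeds the empirical percentile (complete rows only)."""
    x = np.asarray(item_block, dtype=float)
    complete = ~np.isnan(x).any(axis=1)
    rows = x[complete]
    centered = rows - rows.mean(axis=0)
    cov_inv = np.linalg.pinv(np.cov(rows, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    cleaned = x.copy()
    cleaned[np.flatnonzero(complete)[d2 > np.percentile(d2, percentile)], :] = np.nan
    return cleaned

def retained(all_items, max_missing=0.20):
    """Boolean mask of participants with at most `max_missing` missing values."""
    return np.isnan(np.asarray(all_items, dtype=float)).mean(axis=1) <= max_missing

# Simulated demo data (not study data): 200 participants, one 6-item scale.
rng = np.random.default_rng(0)
demo = rng.normal(loc=4.0, scale=1.0, size=(200, 6))
demo[0, 0] = 50.0  # planted univariate outlier

step1 = mask_univariate_outliers(demo)
step2 = mask_multivariate_outliers(step1)
keep = retained(step2)
print(int(keep.sum()), "of", len(keep), "simulated participants retained")
```

In the study, the univariate rule was applied per scale and the multivariate rule separately to the mindfulness and the personality item blocks, so a participant could lose one block and still be retained if the overall share of missing values stayed at or below 20%.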

Measures
Mindfulness. For mindfulness, the eight most commonly used mindfulness self-report scales were selected: the CAMS-R, the KIMS, the FFMQ, the MAAS, the SMQ, the CHIME, the PHLMS, and the FMI. Some items appear in two or more of these scales; such items were presented only once. For Study 1, a set of n = 173 non-redundant self-report mindfulness items was taken from the eight mindfulness scales. Participants rated the mindfulness items on a 7-point Likert scale from 1 = "totally disagree" to 7 = "totally agree." Reliability estimates from the final measurement models are reported in the Results section.
Personality. Personality was assessed using 240 items of the NEO-PI-R (Costa & McCrae, 1995). The NEO-PI-R covers 30 facets with 8 items each, representing the big-five personality factors. The NEO-PI-R was selected for Study 1 because it is one of the most commonly used personality measures in psychological research that also provides a broad representation of all factors and enough items to estimate facets. As the personality items were presented in a mixed order with the mindfulness items, participants rated the short self-descriptive statements from the personality scales on the same 7-point Likert scale from 1 = "totally disagree" to 7 = "totally agree." Reliability parameters from the final personality factor models are reported in the Results section.
Income. Participants were asked to report their annual gross income in their preferred currency, which was transformed into US dollars using the exchange rate on October 17th, 2021.
Regular Exercise. We used a single item to assess the number of days per week participants did at least 30 minutes of exercise. Participants rated this item on an 8-point Likert scale ranging from 0 to 7.
Amount of Vegetables. We used a single item to assess participants' daily vegetable servings. Participants rated the question on a 5-point Likert scale ranging from 0 = "no" to 4 = "all."
Amount of Fast Food Meals. A single item assessed the number of fast food meals consumed per week. Participants rated the question on a 5-point Likert scale ranging from 0 = "no" to 4 = "all."
Relationship Quality. Relationship quality was measured with four items. Two items focused on the current level of satisfaction toward one's partner and relationship and were rated on a 7-point Likert scale ranging from 1 = "completely unsatisfied" to 7 = "completely satisfied." The other two items focused on the commitment toward one's partner and relationship and were rated on a 7-point Likert scale ranging from 1 = "completely uncommitted" to 7 = "completely committed." We established a higher order factor model with two first-order factors, representing satisfaction and commitment, respectively. Factor saturation for the higher order factor was acceptable in both studies (ω = .75/.80). When obtaining the total variance explained by all latent factors within this higher order factor model, the reliability estimator was good as well (ω total = .94/.95).
Sustainability. Sustainability was measured with 16 items from the Short Impact Based Pro-environmental Behavior Scale (SIBS; S. M. Geiger et al., 2019), which covers statements about sustainable behavior in everyday life. Participants rated these 16 items on a 5-point Likert scale ranging from 1 = "never" to 5 = "always." Factor saturation for a single factor was acceptable (ω = .72).
Satisfaction with Life. Satisfaction with life was measured with the Satisfaction with Life Scale (SWLS; Diener et al., 1999). Participants rated five items on a 7-point Likert scale ranging from 1 = "totally disagree" to 7 = "totally agree." Factor saturation for a single factor was good in both studies (ω = .91/.92).
Spirituality. Spirituality was measured using 12 items in total. Five items were taken from the Religious Background and Behavior Questionnaire (RBBQ; Connors et al., 1996), which assesses the frequency with which participants perform religious practices or have spiritual experiences. These questions were complemented by an additional seven items developed to assess the frequency with which participants consume media that focuses on topics such as yoga, meditation, religion, spirituality, and/or self-love. Participants rated each of these 12 items on a 7-point Likert scale ranging from 1 = "never tried" to 7 = "daily." We established a higher order factor model with two first-order factors, representing the RBBQ and media consumption, respectively. Factor saturation for the higher order factor was below acceptable in both studies (ω = .54/.55). However, when obtaining the total variance explained by all latent factors within this higher order factor model, the reliability estimator was good (ω total = .89/.91).

Analytical Approach
All statistical analyses were run in R version 4.0.3 with RStudio (RStudio Team, 2020). For latent variable analyses, the packages lavaan (Rosseel, 2012) and semTools (Jorgensen et al., 2021) were used. Our analyses were conceptually summarized in the preregistration, but not all analyses presented here have been preregistered in full detail. Unless otherwise stated, effects coding was used for factor identification. When using effects coding for identification, as described by Little et al. (2006), the sum of indicator intercepts is constrained to zero and the mean of loadings is constrained to 1.0 for each latent factor.
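The two effects-coding constraints can be illustrated numerically. The following Python sketch is not the lavaan specification used in the study; it only shows, under assumed parameter values, how a marker-variable solution for one factor can be rescaled so that the loadings average 1.0 and the intercepts sum to zero while the model-implied covariances and means stay the same. All names and numbers are hypothetical.

```python
import numpy as np

def to_effects_coding(loadings, intercepts, factor_var, factor_mean):
    """Rescale one factor's parameters to the effects-coding metric."""
    lam = np.asarray(loadings, dtype=float)
    tau = np.asarray(intercepts, dtype=float)
    scale = lam.mean()
    lam_e = lam / scale                # mean loading becomes 1.0
    psi_e = factor_var * scale ** 2    # keeps lam * psi * lam' unchanged
    implied_means = tau + lam * factor_mean
    alpha_e = implied_means.mean()     # forces the new intercepts to sum to 0
    tau_e = implied_means - lam_e * alpha_e
    return lam_e, tau_e, psi_e, alpha_e

# Hypothetical marker-variable solution (first loading fixed to 1,
# first intercept fixed to 0).
lam = np.array([1.0, 0.8, 1.2, 0.6])
tau = np.array([0.0, 0.5, -0.2, 1.0])
psi, alpha = 0.9, 3.0

lam_e, tau_e, psi_e, alpha_e = to_effects_coding(lam, tau, psi, alpha)
print("mean loading:", lam_e.mean())
print("sum of intercepts:", tau_e.sum())
```

Because only the scale of the latent variable changes, model fit is identical under both identification schemes; effects coding simply expresses the factor in the average metric of its indicators.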
Hurdle 1: Measurement Model. Before establishing our own eclectic measurement model of mindfulness, we tested measurement models for each of the eight commonly used mindfulness scales. Given that the authors of the CAMS-R, MAAS, FMI, and SMQ call for calculating summed scores when using the measurements, a general factor model was tested for these four scales (Brown & Ryan, 2003; Chadwick et al., 2008; Feldman et al., 2007; Walach et al., 2006). For the PHLMS and the KIMS, a correlated factor model with two and four factors, respectively, was estimated, corresponding to the authors' suggestions (Baer et al., 2004; Cardaciotto et al., 2008). For the FFMQ, a higher order factor model with five first-order factors (Baer et al., 2006) was tested; for the CHIME, a higher order factor model with six first-order factors was tested. Of those six first-order factors, one factor was further divided into two lower facets, following Bergomi and colleagues (2013). Please see the supplemental material for graphical representations of all eight scale models. After these scale models, two alternative measurement models were also tested.
For our eclectic measurement model of mindfulness, we evaluated whether a single factor is sufficient to describe mindfulness or whether additional, specific factors in a bi-factor model are needed. After compiling the mindfulness item pool from the most commonly used mindfulness scales, we needed to select those mindfulness indicators which would build a psychometrically sound measurement model for mindfulness. To do so, we used Ant Colony Optimization (ACO), a metaheuristic algorithm (Schroeders et al., 2016) which selected subsets of mindfulness indicators and tested whether they were suitable to represent either the bi-factor or the single-factor model. Since the bi-factor model is rather complex, the more parsimonious single-factor model should be preferred if it fits the data equally well. As higher order models or correlated group-factor models with the same group/first-order/specific factors can be considered transformations of a bi-factor model (Schmid & Leiman, 1957; Yung et al., 1999), we did not test them.
The specific factors in the bi-factor model represent the eight mindfulness aspects covered by the Western perspective, as listed in Table 1. They should each have at least three indicators for local identification. Therefore, the algorithm was set to sample 24 items for both models. Indicators were selected from a mindfulness item pool covering n = 173 non-redundant items from the eight mindfulness scales. Because some mindfulness questionnaires do not distinguish aspects and aspect labels vary somewhat across scales, a mindfulness expert assigned each of these 173 items to one of the eight Western mindfulness aspects prior to selection. The expert was blind to the original scale of each item. ACO was set to optimize model fit (CFI and RMSEA) and factor saturation of the general or single factor of mindfulness (McDonald's Omega (ω)) simultaneously. Weighted CFI and RMSEA were averaged into one fit optimizer, which in turn was averaged with the weighted McDonald's Omega. See Olaru et al. (2015) for a more detailed description of ACO. The ACO optimization function we used can be extracted from the R-syntax uploaded to the OSF. Both models are represented in Figure 1.
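The structure of such a combined optimization criterion can be sketched in Python. This is a hedged illustration, not the function from the OSF syntax: the logistic weighting and its midpoints and steepness values are assumptions, and the omega formula shown applies to a single-factor model with the factor variance fixed to 1 (for the bi-factor model, saturation of the general factor would be computed differently).

```python
import math

def mcdonald_omega(loadings, residual_variances):
    """Omega for a single-factor model with factor variance fixed to 1:
    (sum of loadings)^2 / ((sum of loadings)^2 + sum of residual variances)."""
    common = sum(loadings) ** 2
    return common / (common + sum(residual_variances))

def logistic_weight(value, midpoint, steepness):
    """Map an index onto (0, 1); hypothetical weighting, assumed here."""
    return 1.0 / (1.0 + math.exp(-steepness * (value - midpoint)))

def combined_criterion(cfi, rmsea, omega):
    """Average weighted CFI and RMSEA into one fit term, then average
    that term with weighted omega, as described in the text."""
    w_cfi = logistic_weight(cfi, midpoint=0.95, steepness=100)            # assumed
    w_rmsea = 1.0 - logistic_weight(rmsea, midpoint=0.05, steepness=100)  # assumed
    w_omega = logistic_weight(omega, midpoint=0.80, steepness=50)         # assumed
    fit = (w_cfi + w_rmsea) / 2
    return (fit + w_omega) / 2

# Illustrative standardized solution: three indicators loading .70 each.
omega = mcdonald_omega([0.7, 0.7, 0.7], [0.51, 0.51, 0.51])
print(f"omega = {omega:.3f}")
print(f"criterion = {combined_criterion(cfi=0.97, rmsea=0.04, omega=0.88):.3f}")
```

An item subset scores highly only if both model fit and factor saturation are good, so the search cannot trade a well-fitting but unreliable factor against a reliable but poorly fitting one.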
Hurdle 2: Divergent Validity.Before testing the divergent validity of mindfulness to the big-five personality factors, each personality factor was modeled as a latent variable itself.Although the NEO-PI-R is a well-established personality measure, psychometric problems such as poor model fit (Hopwood & Donnellan, 2010;Parker et al., 1993) or high factor inter-correlations (Ostendorf & Angleitner, 2004) are reported.Therefore, we applied ACO to align factor correlations with meta-analytic estimates and to maximize model fit within each factor model.After selecting three items per personality facet, structural equation modeling was employed to calculate bivariate latent variable correlations between mindfulness and the five higher order personality factors.
Inspecting only the bivariate correlations between the single mindfulness factor and the personality factors neglects the correlated factor structure of the personality factors. Therefore, a correlated factor model, referred to as the parcel model, was calculated as well. In this model (Figure 2), each personality factor was indicated by six manifest personality scores, which were based on the three selected items per facet. In addition to these 30 manifest personality scores, one manifest mindfulness score representing the 24 selected mindfulness items in the single-factor model was included. Statistically, this model represents a multiple regression in which the mindfulness composite is regressed onto five correlated personality factors. With this parcel model, we can inspect the residual variance (1 − R²) of mindfulness after controlling for five correlated factors.
In order to be deemed distinct from established personality factors, the mindfulness residual should show stronger uniqueness than residuals of personality facets. Residuals can include both measurement error and content-related uniqueness. Larger residual variances can therefore be due to less dependable and consistent measurement or to unique variance that is not already in the sphere of the big-five factors. Given that we have already shown considerable saturation of an overarching mindfulness factor, a substantial amount of variance not accounted for by the big-five factors would therefore endorse the uniqueness of mindfulness. All other facet scores should possess some uniqueness, too; within the distribution of the uniqueness values of all 30 big-five facets, the mindfulness residual should stand out as salient in magnitude.
Hurdle 3: Incremental Criterion Validity. After regressing mindfulness onto the five correlated personality factors in the parcel model, the incremental validity of mindfulness was assessed. A latent phantom variable, which captured the residual variance of mindfulness after controlling for the five correlated personality factors, was created (see Feng & Hancock (2021) for details on the parametrization of such models and the online supplement for a graphical representation). When using this phantom variable as a predictor in multiple regression analyses, the standardized regression coefficient (β) statistically corresponds to the correlation of the mindfulness residual and the criterion. Therefore, squaring the regression coefficient of the phantom variable indicated the increase in explained variance (ΔR²) when mindfulness was added to the regression. For ease of interpretation, the overall amount of explained variance in the criterion was set in relation to the increase in variance explanation.
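The identity behind this step, that the squared correlation between a residualized predictor and the criterion equals ΔR², can be checked numerically. The following Python sketch uses simulated manifest scores and ordinary least squares rather than the latent phantom-variable parametrization used in the study. The sample size, the regression weights, and the helper names `r_squared` and `residualize` are assumptions for illustration only.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS regression of y on X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

def residualize(X, m):
    """Part of m that is linearly unrelated to the columns of X."""
    X1 = np.column_stack([np.ones(len(m)), X])
    beta, *_ = np.linalg.lstsq(X1, m, rcond=None)
    return m - X1 @ beta

rng = np.random.default_rng(0)
n = 5000
personality = rng.normal(size=(n, 5))  # simulated scores on five factors
mindfulness = personality @ np.array([0.5, 0.3, -0.3, 0.1, 0.1]) + rng.normal(size=n)
criterion = (personality @ np.array([0.4, 0.0, -0.2, 0.1, 0.0])
             + 0.15 * mindfulness + rng.normal(size=n))

# Increase in explained variance when mindfulness is added to personality
r2_reduced = r_squared(personality, criterion)
r2_full = r_squared(np.column_stack([personality, mindfulness]), criterion)
delta_r2 = r2_full - r2_reduced

# Correlation of the mindfulness residual with the criterion
m_resid = residualize(personality, mindfulness)
semipartial = np.corrcoef(m_resid, criterion)[0, 1]
```

In this simulation `delta_r2` and `semipartial ** 2` agree up to floating-point error, which is why squaring the standardized weight of the residual variable can be read directly as ΔR².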

Results
Hurdle 1: Measurement Model. First, the measurement models that were derived from prior publications of instruments were tested for each of the eight mindfulness scales. As shown in Table 3, none of the measurement models had acceptable fit in Study 1. Consequently, item sampling was used to establish an adequate measurement model, beginning with the proposed bi-factor model (eight specific factors representing the eight Western mindfulness aspects and one general factor). Details about the final selected bi-factor model for mindfulness with 24 indicators are reported in the Appendix (Table A2). Indices indicated good model fit (χ²(228) = 294, p < .01, CFI = .97, TLI = .97, RMSEA = .03, SRMR = .04), and the general factor for mindfulness had good reliability (ω = .87). Factor saturation for specific factors varied (ω = .10 to .59) and was primarily weak. Because ACO is a probabilistic algorithm, repeating it eight times resulted in somewhat different item sets. Fit and factor saturation were good for all of the runs. We chose the best-fitting model for subsequent analyses. The additional models are reported in the supplemental material. Due to the poor reliability of the specific factors in general and good factor saturation for the general factor for mindfulness, the more parsimonious single-factor model was evaluated next. Details about the final single-factor model for mindfulness can also be found in the Appendix (Table A2). As for the bi-factor model, the algorithm selected eight somewhat different indicator sets, all meeting the optimization criteria equally well (see OSF). We selected the best-fitting model for subsequent analyses. Fit indices for this model indicated good model fit (χ²(252) = 333, p < .01, CFI = .96, TLI = .96, RMSEA = .03, SRMR = .04), and factor saturation for the single factor was good (ω = .81).
Figure 1 provides an overview of both the bi-factor and the single-factor model. Although some of the selected items had small and/or nonsignificant loadings on the general or single factor, the single-factor model can be considered acceptable for representing the construct of mindfulness based on fit and factor saturation.
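To make the notion of factor saturation concrete, the following Python sketch computes McDonald's omega for a single-factor model from standardized loadings, assuming uncorrelated residuals. The function name `mcdonalds_omega` and the uniform loading of .40 are hypothetical; the loadings actually estimated in the study are reported in the Appendix.

```python
def mcdonalds_omega(loadings):
    """Omega for a single-factor model with standardized loadings.

    Assumes uncorrelated residuals, so each residual variance is 1 - loading^2.
    """
    common = sum(loadings) ** 2                          # variance due to the factor
    unique = sum(1 - loading ** 2 for loading in loadings)  # summed residual variances
    return common / (common + unique)

# Hypothetical example: 24 items that all load .40 on one factor
omega_24_items = mcdonalds_omega([0.40] * 24)
```

With these hypothetical values omega is roughly .82. This illustrates that a 24-item composite can reach good saturation even when individual loadings are modest, which is consistent with some selected items having small loadings.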
Hurdle 2: Divergent Validity. To test the divergent validity of mindfulness from personality factors, we established measurement models for the personality factors. Modeling higher order factor models for each of the five personality factors of the NEO-PI-R separately resulted in poor model fit and factor correlations higher than meta-analytic factor correlations (Thielmann et al., 2021; van der Linden et al., 2010). Please see the supplemental material for model fit indices and the manifest factor correlations in our study as compared to meta-analytic correlations.
Thus, we also applied item sampling to optimize the NEO-PI-R. First, six items per facet were selected, resulting in 36 items per personality dimension. Those 36 items were selected so that their manifest score would show correlations with other personality dimension scores comparable to meta-analytic findings. Second, similar to mindfulness, fit and factor saturation were optimized by selecting three out of these six items per facet. More details are reported in the supplement. Fit indices of the measurement models for each personality factor after optimization via item sampling are summarized in Table 4.
To test divergent validity, bivariate latent variable correlations between the single factor of mindfulness and each personality factor were calculated in separate structural equation models. Results are summarized in Table 5. The largest correlations were found between the single mindfulness factor and conscientiousness (r = −.68) and openness (r = −.59). In absolute size, these correlations also exceeded the highest latent variable correlation observed between the personality factors (please see Appendix Table A3). These findings point toward empirical construct overlap among mindfulness, conscientiousness, and openness. The bivariate correlations of the single mindfulness factor with the personality factors prior to optimizing them are available in the supplemental material. A correlated factor model was established to further account for inter-correlations of personality factors. In this parcel model, each personality factor was indicated by six manifest facet scores and one manifest mindfulness composite. See Figure 2 for a graphical representation. Model fit was below acceptable (χ²(420) = 1792, p < .05, CFI = .71, TLI = .67, RMSEA = .08, SRMR = .10) because cross-loadings of the personality facet scores were not permitted (as indicated by the modification indices). However, the modification indices showed that the manifest mindfulness composite score did not deteriorate model fit. Please see Table A5 in the Appendix for detailed information about the parcel model.
Among the 31 residual variances for each manifest score, the residual variance for mindfulness was at 1 − R² = .47. The latent personality factors could explain about half of the variance in the manifest mindfulness score. The residual variances for the 30 NEO-PI-R personality scores were higher, with an average M_res = .66 (SD_res = .19), indicating that the big-five accounted for about one-third of the variance of the 30 big-five facets. Therefore, relative to the big-five facets, mindfulness had substantially less unique variance.
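One way to express where the mindfulness residual falls within the distribution of facet residuals is a standardized distance from the facet mean. The short Python sketch below uses the Study 1 values reported above. The z-score framing and the function name `residual_position` are an illustrative assumption, not an analysis reported in the study.

```python
def residual_position(mindfulness_residual, facet_mean, facet_sd):
    """Distance of the mindfulness residual from the facet mean in SD units."""
    return (mindfulness_residual - facet_mean) / facet_sd

# Study 1 values: 1 - R^2 = .47 for mindfulness; facets M_res = .66, SD_res = .19
z_study1 = residual_position(0.47, 0.66, 0.19)
```

The result is about −1.0, so the mindfulness residual lies roughly one standard deviation below the average facet residual rather than standing out at the upper end of the distribution.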
Hurdle 3: Incremental Criterion Validity. Overall, our results show substantial overlap of mindfulness with openness and conscientiousness. However, whether the unique variance of mindfulness (i.e., the part of mindfulness which is not explained by personality factors) has any incremental value for predicting relevant outcomes must still be clarified. Accordingly, a test was performed to verify whether the residual of the manifest mindfulness composite had significant regression weights for the criterion variables included. For Study 1, the mindfulness residual's regression weights (β) were close to zero and did not reach statistical significance for any criterion variable (Table 6).

Conclusion
Concerning the first hurdle, results show that none of the measurement models proposed by the eight mindfulness scales fit sufficiently well. We therefore established two overarching measurement models including all scales. Psychometric indices were optimized by applying metaheuristic item sampling techniques. Both the bi-factor and the single-factor model fit our data well, and factor saturation for both the general and single factor was good. We chose to proceed with the more parsimonious single-factor model because several of the nested factors in the bi-factor model had insufficient factor saturation. We conclude that mindfulness can pass hurdle 1 when item sampling is applied.
Note. Only data for participants who reported being in some kind of romantic relationship were used (n_Study1 = 264, n_Study2 = 439).
Next, divergent validity was evaluated concerning the big-five personality factors (hurdle 2). Two steps of item sampling were applied to the personality indicators to approach population correlations of personality factors and to optimize fit. Bivariate latent variable correlations of mindfulness with conscientiousness and openness exceeded the highest latent variable correlation observed between personality factors themselves. In a correlated factor model, in which mindfulness was embedded as an additional manifest personality indicator, mindfulness retained 47% of unique variance. Thus, personality factors could not explain half of its variance, and mindfulness passed hurdle 2. However, when testing for the incremental validity of this mindfulness residual compared to the big-five personality factors, no incremental validity for different criterion variables was found, and thus hurdle 3 was not passed. To replicate and extend the results from Study 1, we conducted a second study that included honesty-humility as an additional personality factor.

Methods
Sample. For Study 2, we recruited another online community sample (n = 687). We chose a larger sample size than in Study 1 to further increase power. This time we used respondi, and participants from this sample were based in the UK. All listed English as their primary language. We further chose the weighted sampling option given by respondi for sex and age. We applied the same data cleaning procedures as in Study 1, and the final sample of Study 2 consisted of n = 657 participants (343 female, 1 other) who were on average 48 (SD = 16) years old. This sample was also diverse in educational attainment (33.5% no formal education, 25% GCSE, 29.7% BTEC/A-level, and 11.7% postgraduate degree).
Design. Study 2 followed the same study design as Study 1 but only included block one (respond to personality and mindfulness items) and block two (respond to criterion measures and demographic questions).

Measures
Mindfulness. In Study 2, we presented a slightly revised version of the item set from Study 1 with n = 172 mindfulness items. Please see the supplemental material for an overview of all mindfulness items included in each study.
Participants rated the mindfulness items on a 7-point Likert scale from 1 = totally disagree to 7 = totally agree.
Personality. In Study 2, we chose an optimized version of the Trait Self-Description Inventory (TSDI; Christal, 1994) with 42 items plus 9 honesty-humility items from the HEXACO-PI-R (Lee & Ashton, 2018) for personality assessment. In this optimized version, personality factors are represented by three facets with three items each, except conscientiousness, which only covers two facets. For a more detailed description of the optimized version of the TSDI, see Olaru et al. (2015). Participants rated the short self-descriptive statements from the personality scales on a 7-point Likert scale from 1 = totally disagree to 7 = totally agree.
Criterion Variables. In general, we used the same criterion variables as in Study 1. Two minor revisions were made: annual gross income was requested in British pounds, and we did not include the SIBS but instead slightly extended the measurement of life satisfaction. To this end, we additionally included two items commonly implemented in panel studies to measure overall happiness and satisfaction as well as eight affect items also used in the context of life satisfaction research. However, in accordance with Study 1, we only include the SWLS items to represent life satisfaction.

Analytical Approach
All statistical analyses from Study 1 were repeated in Study 2, with a few minor revisions. For the mindfulness measurement model, we repeated ACO in a slightly revised mindfulness item universe, so that the algorithm selected 24 items from n = 172 items. Because we used the version of the TSDI for personality assessment which was already optimized for model fit and factor saturation, there was no need to apply item sampling for personality. Since we added honesty-humility items to the optimized version, we modeled six personality factor models instead of five. Therefore, the parcel model included one additional personality factor compared to Study 1. Data and supplemental materials are available in an OSF repository (https://osf.io/agprm/).

Results
Hurdle 1: Measurement Model. Before testing the two different measurement model approaches to mindfulness that we proposed in Study 1, we tested measurement models for each of the eight mindfulness scales. As shown in Table 3, none of these scale-specific measurement models fit acceptably well. Graphical representations of all models can be found in the supplemental material.
Comparable to Study 1, we applied item sampling to find an adequate bi-factor model with eight specific factors. Details for the final bi-factor model for mindfulness with 24 indicators as selected in Study 2 are reported in the Appendix (Table A1). Fit indices indicated good model fit (χ²(228) = 383, p < .01, CFI = .96, TLI = .95, RMSEA = .03, SRMR = .04) for the best model, and the general factor for mindfulness had good saturation (ω = .89). Factor saturation for specific factors was in a range comparable to Study 1 (ω = .11 to .52) and was mostly weak. As for Study 1, we report all bi-factor models from other ACO runs in the supplemental material. Subsequently, we tested the more parsimonious single-factor model. Details about the selected single-factor model for mindfulness from Study 2 are reported in the Appendix (Table A2), and selected models from other runs are presented in the supplemental material. Fit indices indicated acceptable model fit for the best single-factor model (χ²(252) = 520, p < .01, CFI = .94, TLI = .93, RMSEA = .04, SRMR = .04), and factor saturation for the single mindfulness factor was good (ω = .88). See Figure 1 for an overview of both models in both studies. As in Study 1, some of the selected items had small and/or nonsignificant loadings on the general or single factor, but based on fit and factor saturation we consider the best-fitting single-factor model acceptable for further analyses.
Hurdle 2: Divergent Validity. We established measurement models for the personality factors to test the divergent validity of mindfulness from personality factors. To do so, we separately modeled higher order factor models for each of the six personality factors. Since we used an optimized version of the TSDI with additional honesty-humility items, we had three indicators per facet. Fit indices of measurement models per personality factor are presented in Table 4.
Comparable to Study 1, we used the single-factor model for mindfulness to calculate bivariate latent variable correlations between mindfulness and the six higher order personality factors. The correlation between mindfulness and conscientiousness was still the highest (r = .55). See Table 5 for all bivariate latent variable correlations. As for Study 1, we proceeded with the parcel model (see Figure 3). Again, model fit was below acceptable (χ²(115) = 724, p < .01, CFI = .87, TLI = .83, RMSEA = .09, SRMR = .09) because cross-loadings of the personality facet scores were not permitted (as indicated by the modification indices). When inspecting the residual variances of the 18 indicators, again the mindfulness residual (.42) did not exceed the size of the personality facet residuals, which had an average residual variance of M_res = .48 (SD_res = .20).
Hurdle 3: Incremental Criterion Validity. Finally, we calculated multiple regression analyses and had the six latent personality factors plus the residual of the manifest mindfulness score predict criterion variables. Comparable to Study 1, the mindfulness residual (modeled as a dummy variable) did not have incremental value compared to personality when predicting income, psychological health, regular exercise, healthy nutrition, relationship quality, and spirituality. On the other hand, satisfaction with life was predicted significantly by the mindfulness residual. However, the incremental size of this contribution to variance explanation seems small (ΔR² = 2%), especially considering the total amount of explained variance for this criterion (R²_total = 34%).
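The relation between the two figures for satisfaction with life can be made explicit with a short calculation, sketched here in Python. Because ΔR² is reported rounded to whole percent, the derived values are approximations, and the variable names are illustrative only.

```python
import math

delta_r2 = 0.02   # incremental variance explained by the mindfulness residual
r2_total = 0.34   # total variance explained in satisfaction with life

# Share of the explained variance that is attributable to mindfulness alone
relative_share = delta_r2 / r2_total

# Absolute standardized weight implied by delta_r2 (beta squared = delta R^2)
implied_beta = math.sqrt(delta_r2)
```

Under these rounded inputs, the mindfulness residual contributes about 6% of the explained variance, corresponding to a standardized weight of roughly .14 in absolute value.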

Conclusion
In Study 2, we conceptually replicated and extended the findings from Study 1. The analytical strategy was the same as in Study 1: establish meaningful measurement, show nomological uniqueness, and demonstrate incremental validity. Comparable to Study 1, scale-wise measurement models for mindfulness did not fit sufficiently well. However, the established bi-factor and the established single-factor model showed acceptable to good fit. The single-factor model was, just as in Study 1, more satisfactory from a psychometric stance, and we proceeded with the single-factor model (hurdle 1 passed). Bivariate latent variable correlations of the single mindfulness factor and the six higher order personality factors were moderate to large. In a correlated factor model for the six personality factors with manifest facet scores and a mindfulness composite as indicators, residual variance for the mindfulness score did not exceed residual variances of the personality facets. These results indicate little but sufficient empirical distinctiveness of mindfulness from personality in Study 2 as well (hurdle 2 passed). When testing multiple regressions, the mindfulness residual had insufficient incremental validity over personality (hurdle 3 not passed). Notably, bivariate latent variable correlations differed between Study 1 and Study 2, as did regression weights onto the manifest mindfulness scores and the amount of total explained variance for criterion variables. This points to potential differences in item or person sampling across the two studies. Regardless, the key message does not change, and we were able to conceptually replicate and extend our findings from Study 1 in Study 2.

Discussion
We presented two studies investigating mindfulness's validity by testing three hurdles that any new construct should pass. Based on an overarching construct definition, sound measurement models for mindfulness were established (hurdle 1), divergent validity in terms of correlational overlap with personality factors was investigated (hurdle 2), and the incremental validity of mindfulness over personality factors was tested (hurdle 3). Model fit indices for a single-factor and a bi-factor model with eight specific factors were acceptable in both studies when using mindfulness item samples selected via Ant Colony Optimization. Due to good factor saturation of the general or single factor in both model types, the more parsimonious single-factor model for mindfulness was retained. Latent variable correlations between this single mindfulness factor and higher order personality factors were substantial and in line with theoretical assumptions. Personality factors accounted for around 50% of the variance in a manifest mindfulness score. Although half of the variance in mindfulness could be considered unique, there was no conclusive evidence for incremental validity of mindfulness over the big-five/big-six personality factors for a broad set of criterion variables.
Hurdle 1 -Is There a Sound Definition and Measurement Model for Mindfulness?
Diverging theoretical concepts of mindfulness have led to various self-report scales and measurement models.
However, we found that none of the proposed models for these instruments demonstrated construct validity. Thus, we identified a need to improve definitions and measurement models. We first introduced a unifying definition of mindfulness and found that no existing measurement model captures the full scope of mindfulness. Nevertheless, we sought to establish an acceptable measurement model for what is considered mindfulness in the literature. Therefore, we combined all mindfulness items in a joint item pool and developed eclectic measurement models via item sampling. We compared eclectic bi-factor and single-factor models which go beyond the eight individual instruments and earlier unifying approaches (Baer et al., 2006; Bergomi et al., 2014; Siegling & Petrides, 2014; Walach et al., 2006). The more parsimonious single-factor model also showed sufficient construct validity (Borsboom et al., 2004). We recommend that future mindfulness research rely on the presented unified definition and employ the item sets we compiled because both improve the status quo in mindfulness research. When these recommendations are followed, mindfulness can overcome hurdle 1.
Hurdle 2 -Is Mindfulness Distinct From Personality Factors?
When regressing an improved mindfulness score on multiple latent personality factors, about half of the variance in mindfulness was found to be explained by these personality factors. The regression weights of the personality factors on mindfulness in the parcel model allow us to locate the construct within the big-five or big-six (Bainbridge et al., 2022). Considering these weights, mindfulness seems closer to conscientiousness, neuroticism, and openness, and more distal to extraversion, agreeableness, and honesty-humility. Notably, given the comparable size of the mindfulness and personality facet residuals, mindfulness should be located at the facet level rather than at the higher order factor level. However, mindfulness should not be seen as a personality facet. Unlike personality facets, which are clearly assigned to only one personality dimension, mindfulness resides somewhere between multiple higher order personality dimensions.
Overall, locating mindfulness somewhere between conscientiousness, neuroticism, and openness fits earlier empirical work and some conceptual considerations, although not all of them. Considering the four immeasurables of mindfulness (loving-kindness, compassion, equanimity, and empathetic joy), one would have expected much higher correlations with agreeableness and honesty-humility. Our findings again highlight that the definitions and measures of mindfulness commonly used in psychological research are likely to have inadequate coverage of the construct. Yet, if the immeasurables were included in mindfulness instruments, a stronger empirical overlap of mindfulness and the big-five/big-six would have to be expected, presumably making mindfulness a construct fully redundant with personality factors.
Our mindfulness factor, however, had some unique variance and is therefore not completely a linear function of established personality factors. This is remarkable, considering that other positive psychology constructs, such as self-compassion (M. Geiger et al., 2018; Pfattheicher et al., 2017) or grit (Schmidt et al., 2018), have been shown to be (nearly) redundant with factors of personality. However, mindfulness measures established in the literature (such as the FFMQ) show very high correlations with neuroticism, for example (see supplemental material). This finding must lead to reconsidering any study findings using such questionnaires. Mindfulness studies relying on such mindfulness measures do not provide findings about mindfulness but rather about personality factors; in this case, neuroticism. However, by providing an overarching definition and measurement model we were able to help mindfulness pass hurdle 2. Yet, demonstrating some uniqueness of mindfulness does not establish its utility; that requires demonstrating incremental validity.
Hurdle 3 -Does Mindfulness Have Incremental Validity Over Personality Factors?
Our selection of criteria allowed a fair test of the incremental validity of mindfulness by covering both generally established criteria (i.e., income, life satisfaction, and relationship quality) and criteria typically focused on in the mindfulness literature (i.e., healthy lifestyle, mental health, and spirituality). No conclusive or substantial incremental validity from mindfulness over personality factors was found for any of these criteria. Therefore, we conclude that mindfulness has failed to pass the third hurdle.
Given the abundance of mindfulness-based interventions, the lack of incremental criterion validity of mindfulness for such criteria is alarming. As research on mindfulness-based interventions has been criticized for suffering from severe methodological problems (S. B. Goldberg et al., 2018; Schindler & Pfattheicher, 2021), and existing research following a more elaborate study design (e.g., active control groups) could not find convincing evidence for the effectiveness of mindfulness (Kaplan et al., 2022)

Limitations
Although our studies provide sound empirical evidence based on two large samples using partly different item sets, some limitations must be addressed. First, using online samples might reduce data quality due to an unsupervised test setting and is therefore a limitation. Attention checks and pattern recognition algorithms with subsequent exclusion of invalid observations were applied to mitigate this limitation. Conceptual replications in which data collection modalities are varied could show whether or not the conclusions drawn here hold.
Second, a major limitation of both studies is the single-method design. Concerning a multi-trait-multi-method approach to validation, other assessment approaches could provide further evidence on the validity of mindfulness. For example, the breath-counting task offers an economical and supposedly more objective (although very specific) way of measuring mindfulness. Yet, its convergence with self-report measures is low (r = .16; Wong et al., 2018). Other than that, informant-report measures could be used to assess mono-trait-hetero-method validity. However, correlations between self-report and observer-report forms of mindfulness seem to fluctuate around r = .30 (Bartlett et al., 2022; May & Reinhardt, 2018), again indicating no strong convergence. Therefore, it is unclear whether or not observer-report forms of mindfulness can capture relevant trait variance. After all, mindfulness is inherently defined as an internal process; therefore, it seems counterintuitive to use observer reports. Thus, and in line with the current practice, in our study we chose to focus on self-report. By applying latent variable modeling, we could at least control for measurement error.
Third, the diversity among mindfulness factor loading strengths is problematic; whereas some items showed strong loadings, others did not. Indeed, we found the opposite of expected loading patterns for some items. For example, the item "I tend to evaluate whether my perceptions are right or wrong" had a substantial negative loading on the general factor of mindfulness (after it was reverse coded). While the item's authors consider evaluating one's perceptions as right or wrong to not be mindful, the loading indicates the opposite. We leave it to the reader to question such items; their overall weight for a single mindfulness factor is limited.
Other limitations result from using the NEO-PI-R in Study 1. The NEO-PI-R is known for poor model fit (Hopwood & Donnellan, 2010; Parker et al., 1993), which also occurred in our sample. In particular, the openness facets are highly debatable as the openness factor itself is considered a compound variable and no clear factor structure emerges among the facets (DeYoung et al., 2014; Johnson, 1994). Additionally, the inter-correlations for the factors reported in the manual of the NEO-PI-R (Ostendorf & Angleitner, 2004) exceed meta-analytic estimates (Thielmann et al., 2021; van der Linden et al., 2010) drastically (e.g., correlation between neuroticism and conscientiousness at r = .53 in a US sample), which was also the case in our sample. To overcome those limitations, we applied item sampling. Yet, although applying an item-sampling algorithm for improving model fit and for pushing factor correlations toward meta-analytic estimates seems a fair approach, it also has some limitations.
On the one hand, the reduction of items presumably comes at the cost of content breadth. With only three instead of eight indicators per facet, it remains questionable to what extent the facets and factors represent the initial NEO-PI-R facets and factors. Therefore, in Study 2 we used an ACO-optimized version of the TSDI, which covers facet-level domains and has relatively good model fit. Furthermore, a test of the divergent validity between mindfulness and personality factors would be inconsistent if well-fitting measurement models were established only for mindfulness (but not for personality factors). Applying ACO to both allowed us to apply fair standards for mindfulness and personality factors alike.
On the other hand, the item sampling algorithm we chose proceeds meta-heuristically and does not necessarily find the global maximum in each run. Indeed, the algorithm selected eight slightly different item sets in each study for each mindfulness model. This stochasticity could mean our results hinge upon peculiarities of item sets. Furthermore, indicator sets selected via an item sampling algorithm might be prone to overfitting, meaning that a selected set might produce good fit indices in the sample in which it was selected but, when presented to a different sample, fit indices might be substantially worse (Yarkoni & Westfall, 2017). Such problems are severe when a selected set is proposed as an outstanding set of indicators for assessing a construct.
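Why repeated runs can return different item sets is easiest to see in a toy version of the procedure. The Python sketch below is a deliberately simplified stand-in and not the implementation used in the study: the objective is omega computed from hypothetical, known standardized loadings instead of fit indices from an estimated structural equation model, and the pool size, number of ants, number of iterations, and evaporation rate are arbitrary assumptions.

```python
import numpy as np

def omega(loadings):
    """Single-factor omega from standardized loadings (uncorrelated residuals)."""
    common = loadings.sum() ** 2
    return common / (common + (1 - loadings ** 2).sum())

def toy_aco(loadings, k, n_ants=20, n_iter=50, evaporation=0.1, seed=0):
    """Select k items by pheromone-weighted sampling; returns (items, value)."""
    rng = np.random.default_rng(seed)
    pheromone = np.ones(len(loadings))       # all items start equally attractive
    best_set, best_value = None, -np.inf
    for _ in range(n_iter):
        for _ in range(n_ants):
            p = pheromone / pheromone.sum()  # sampling probabilities
            items = rng.choice(len(loadings), size=k, replace=False, p=p)
            value = omega(loadings[items])
            if value > best_value:           # keep the best set found so far
                best_set, best_value = np.sort(items), value
        pheromone *= 1 - evaporation         # forget old information
        pheromone[best_set] += best_value    # reinforce items of the best set
    return best_set, best_value

# Hypothetical pool of 60 items with standardized loadings between .1 and .7
pool = np.random.default_rng(1).uniform(0.1, 0.7, size=60)
set_a, value_a = toy_aco(pool, k=24, seed=1)
set_b, value_b = toy_aco(pool, k=24, seed=2)
```

Because item selection is random and merely biased by the pheromone weights, runs with different seeds need not converge on the same 24 items, even if the resulting objective values are similar. This mirrors the situation described above, where several item sets met the criteria comparably well.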
In contrast, in the present manuscript, we applied item sampling to test whether it is possible at all to extract a set of indicators representing a bi-factor or a single-factor model for mindfulness. Running the sampling process multiple times yielded multiple indicator sets that met our selection criteria comparably well. Furthermore, the final conclusions concerning hurdle 3 are the same for all item sets (see supplemental material). More specifically, all sets had very similar relations with external variables. Therefore, we consider all sets to capture similar trait variance. As such, our conclusions are not limited to the peculiarities of one specific item set (which might represent an overfitted mindfulness measurement model) but concern mindfulness as a construct.

Conclusion
The present work contributes to mindfulness research in several ways. We provided an exhaustive definition covering all Western and Buddhist aspects of the construct.
Furthermore, despite the prevailing disagreement on the structure of the construct, we developed an overarching, eclectic measurement model. Unlike most previous work on mindfulness, we applied latent variable modeling, which allows us to study disattenuated results. Since the unique variance of mindfulness was unrelated to criterion variables, our results substantially diminish the perceived relevance of the construct. This conclusion is based on strong empirical evidence, as our results are based on two studies. Furthermore, as we used differing item sets, our findings are not bound to the peculiarities of a specific item set.
To conclude, we suggest embedding concepts and findings on mindfulness into the nomological net of personality research instead of continuously developing new definitions, devising new measurement tools, and creating novel interventions to improve mindfulness. Reservations against an idea that has been adapted and neutered from Buddhist philosophy without attending to well-known scientific standards seem appropriate.

Appendix:

Figure 2. Correlated factor model for five personality factors. Each personality factor was indicated by six manifest scores for personality facets and one manifest score for mindfulness. For better readability, only indicators with minimal and maximal loadings are presented here plus the loading of the mindfulness score. Loadings with p > .05 are printed dashed.

Figure 3. Correlated factor model for six personality factors. Each personality factor was indicated by three manifest scores for personality facets and one manifest score for mindfulness. Conscientiousness was indicated by two manifest facet scores plus the mindfulness score. For better readability, only indicators with minimal and maximal loadings are presented here plus the loading of the mindfulness score. Loadings with p > .05 are printed dashed.

Note. Non-significant loadings with p ≥ .05 are printed in light grey. Items that have been selected in both studies are printed in bold and surrounded by a box. AW = act with awareness, NJ = non-judgement, NR = non-reactivity, IU = insightful understanding, DE = describing, OB = observe, RE = relativity, OP = openness.

Table 1. Mindfulness aspects included in the working definition of this manuscript with exemplary self-report items from common mindfulness scales.

Table 2. Overview of eight commonly used mindfulness scales, their subscales, their number of items, and the defined aspects of mindfulness covered by each scale.

Table 3. Fit indices for measurement models per mindfulness scale in Study 1/Study 2.
Note. *Degrees of freedom differ between studies because two MAAS items were not presented to participants in Study 1.

Table 4. Model fit for optimized measurement models for each personality factor (higher-order factor models) in Study 1/Study 2.

Table 5. Bivariate latent variable correlations between the single mindfulness factor and the big-five/big-six in both studies.
Note. Values are reported as Study 1/Study 2.

Table 6. Residual relations of the manifest mindfulness composite in a multiple regression after controlling for the correlated latent big-five/big-six personality factors.
Note. β_mind_resid is the regression weight of the mindfulness residual as described in the parcel model. Beta weights are fully standardized and indicate incremental criterion validity. ΔR²_mind_resid is the increase in variance explanation for the criterion by the mindfulness residual. □ = manifest variable, s = latent variable.
a Data winsorized at the 99th percentile. b Model based on logistic regression with a dichotomous endogenous variable. c

[...], better designed intervention studies are needed. Implementing active control groups, assessing other variables potentially affected by mindfulness-based interventions, randomly assigning participants to groups, studying change at a latent variable level, and implementing multiple follow-up measurements could all contribute to understanding how and to what extent mindfulness-based interventions are an effective tool to improve mental health. Furthermore, mindfulness interventions must also be compared to similar interventions focused on personality change. Considering our findings, odds are that interventions focusing on change in personality factors could be superior to interventions focusing on mindfulness.
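The logic of incremental criterion validity summarized in Table 6 can be illustrated with a simplified sketch on simulated manifest scores. It does not reproduce the latent variable models of this manuscript: the single personality predictor, the effect sizes, the sample size, and all variable names are assumptions chosen for illustration. The sketch relies on the identity that the gain in explained variance from adding a predictor equals the squared semipartial correlation between the criterion and that predictor's residual.

```python
import random

def mean(xs):
    return sum(xs) / len(xs)

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residualize(target, predictor):
    """Residuals of target after a simple OLS regression on predictor."""
    mt, mp = mean(target), mean(predictor)
    slope = (sum((p - mp) * (t - mt) for p, t in zip(predictor, target))
             / sum((p - mp) ** 2 for p in predictor))
    return [t - mt - slope * (p - mp) for p, t in zip(predictor, target)]

rng = random.Random(42)
n = 5000  # hypothetical sample size

# Assumed data-generating model: half of the mindfulness variance is
# shared with personality; the criterion depends on personality only.
personality = [rng.gauss(0, 1) for _ in range(n)]
mindfulness = [0.5 ** 0.5 * p + 0.5 ** 0.5 * rng.gauss(0, 1)
               for p in personality]
criterion = [0.4 * p + rng.gauss(0, 1) for p in personality]

# Step 1: variance in the criterion explained by personality alone.
r2_personality = pearson(criterion, personality) ** 2

# Step 2: mindfulness residual, i.e. the part not shared with personality.
mind_resid = residualize(mindfulness, personality)

# Step 3: squared semipartial correlation = increase in R-squared.
delta_r2 = pearson(criterion, mind_resid) ** 2

print(round(r2_personality, 3), round(delta_r2, 4))
```

Under these assumptions, the residual carries about half of the mindfulness variance yet adds almost nothing to the prediction of the criterion, which is the pattern that a near-zero ΔR² for the mindfulness residual expresses.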

Table A3. Latent variable correlations between the higher-order personality factors in Study 1/Study 2.

Table A2. Standardized loadings for the single-factor model in Studies 1 (n = 472) and 2 (n = 657).
Note. Non-significant loadings with p ≥ .05 are printed in light grey. Items that have been selected in both studies are printed in bold and surrounded by a box.

Table A4. Parcel model: Standardized loadings of the manifest facet scores and the manifest mindfulness score on the big-five personality factors and their corresponding residual variances in Study 1 (n = 472).
Note. Standardized loadings with p > .05 are printed in light grey.