Development of minimum reporting sets of patient characteristics in epidemiological research: A methodological systematic review

Background Core patient characteristic sets (CPCSs) are increasingly developed to identify the variables that should be reported to describe the target population of epidemiological studies within a given medical area, while keeping the additional data-collection burden acceptable. Methods We conduct a systematic review of primary studies and protocols aiming to develop a CPCS, using the PubMed database. We extract information on study design and on the characteristics of the proposed CPCS. The quality of Delphi studies is assessed with a tool proposed in the literature. All results are reported descriptively. Results Among 23 eligible studies, the Delphi survey is the most frequently used consensus technique in CPCS development (69.6%, n = 16). Most studies do not include patients as stakeholders. The final CPCS rarely includes socioeconomic factors (26.1%, n = 6). Definitions and measurement methods for items in the CPCS are provided by 60.9% (n = 14) and 26.1% (n = 6) of the studies, respectively. Conclusion This review identifies considerable variation and suboptimal practice in many methodological aspects of CPCS studies. Guidance on the conduct and reporting of CPCS studies should be established to address these shortcomings.


Introduction
In epidemiological research, collecting and reporting patient characteristics are of great importance. These data allow researchers to assess the generalizability (or external validity) of research findings to settings that are different from those originally examined. 1 When comprehensive patient characteristic data are available, the difference between a study sample and a clinically relevant patient population can even be statistically accounted for. 2 Patient characteristic data are also crucial for improving internal validity. For instance, by assessing the balance of important outcome prognostic factors across treatment groups in a randomized controlled trial, one can assess whether randomization might have been imperfect. This aspect is pivotal when trials have a specific design (such as cluster randomization) or a small sample size (such as in oncology, where complex algorithms are often used to determine each patient's treatment assignment based on their characteristics). 3,4 In pragmatic trials, detailed patient characteristic data are also strongly needed to account for adherence and drop-out, especially when the aim is to estimate per-protocol treatment effects or to handle missing data. 5 Likewise, in observational studies, assessing the balance between exposure and non-exposure groups after propensity score-based stratification or matching, for instance, requires extensive data on patient characteristics. 6 In systematic reviews and evidence synthesis, when the eligible studies collect and report data on a common set of patient characteristics, the assessment of the target population (factor P in the PICO criteria) across studies is facilitated. A more insightful evaluation of the heterogeneity observed among trial results is also possible. 7,8 [10][11] These frameworks also rely on having a rich set of (prognostic) patient characteristics collected across individual studies.
Despite its importance in practice, the collection and reporting of patient characteristic data remain inconsistent and suboptimal. Cahan et al (2017) recently showed that among 186,941 trials on ClinicalTrials.gov, only 8.9% reported baseline participant measures, and up to 85% of those measures were reported only once in the entire registry. 12 The lack of adequate reporting of important prognostic factors was also highlighted by Wertli et al (2013), who assessed 84 low back pain trials and found that almost half of them incompletely reported variables of prognostic importance, even easily obtainable ones such as age or comorbidities. 13 [16][17][18] In recent years, significant efforts have been made to standardize the collection and reporting of patient characteristics in epidemiological research. Across many therapeutic areas, a so-called core patient characteristic set (CPCS) is specifically developed to identify all key prognostic factors that should be commonly collected and reported (among studies and databases evaluating a target medical condition), while keeping the additional implementation burden acceptable (Figure 1). Beyond the variables proposed in the core set, researchers are free to measure and report additional patient characteristics relevant to their topic. This CPCS concept is inspired by (and hence closely related to) the concept of a core outcome set (COS) proposed in clinical research. 19 However, while the methodology for COS development is increasingly enriched in the literature, little attention has been given so far to the CPCS and how to develop one in practice.
In this paper, we aim to describe the methodology of studies establishing a core set of patient characteristics that should be commonly measured and reported in epidemiological studies and/or in large medical cohorts. By shedding light on current practice and challenges in CPCS development, this review could pave the way for future recommendations and guidelines on methodological standards for CPCSs, thus enhancing the adoption of this concept in epidemiological research.

Study design
We conduct a methodological systematic review conforming to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 statement. 20

Eligibility criteria
We include primary studies or study protocols aiming to establish a core set of patient characteristics that should be commonly measured and reported in epidemiological studies and/or databases of a pre-specified medical condition, published between 01/01/2001 and 11/08/2022. We exclude studies that establish patient characteristic sets for other purposes, such as guiding therapeutic decision-making in clinical practice. Conference abstracts, editorials, commentaries, letters to the editor, non-English publications and articles without full-text accessibility are also excluded from our review.

Search strategy
A structured search of the PubMed database is undertaken by P.H.T.T on 12/08/2022. The full search strategy is available in Appendix S1. This search strategy is first developed by two reviewers (P.H.T.T and K.L.D), then further optimized by a senior researcher (T.T.V) and a librarian specialized in epidemiological systematic reviews. We also manually screen the reference lists of the eligible articles to identify additional eligible studies.

Study selection
The search results are downloaded into EndNote and imported into the Rayyan web-based software. 21 Duplicates are removed by the duplicate search function in EndNote and by manually reviewing the records list. Three reviewers (P.H.T.T, K.L.D, M.L.V) and one research assistant independently screen titles and abstracts of retrieved records to select eligible papers based on the inclusion criteria. Each researcher screens 25% of the total number of records and double-checks 20% of the work of another researcher. Disagreements are resolved by discussion among the four reviewers.

Data extraction and assessment
The data extraction form is constructed by M.L.V and P.H.T.T, then pilot-tested and refined by K.L.D and T.T.V (Appendix S2). Our data extraction form is inspired by a form previously developed by Boulkedid et al (2011), who described the reporting of the Delphi consensus method in developing healthcare quality indicators. 22 We adapt this form following recent methodological recommendations for Delphi studies. 22,23 In addition, we develop additional items to extract data from non-Delphi studies. Data extraction is performed by M.L.V, P.H.T.T and K.L.D. Each reviewer extracts 33% and double-checks 33% of the total number of records. Any discrepancy is resolved by discussion among the three reviewers.
The following information is collected from the eligible studies: (1) publication year, (2) target medical conditions, (3) purpose of the developed CPCS (for use in epidemiological studies or in registry settings), (4) study design (consensus-reaching or non-consensus methods), and (5) geographical scope of the study (international or nationwide).
We then evaluate the methodological and reporting quality of eligible studies in detail. For Delphi studies, the following characteristics are additionally collected: (1) study participants (number, response rate, types, selection criteria, and whether authors report how the representativeness of participants is ensured); (2) type of Delphi, i.e., traditional Delphi (which only involves asking questions to experts via questionnaires or interviews) or modified Delphi (which includes extra meeting rounds for experts' interaction) 24,25 ; (3) method used to establish the primary list of items before the Delphi rounds; (4.1) questionnaire round characteristics: number of rounds, purpose of each round, question formulation (rating scale or open questions), whether the rating scale (if used) is well-defined (i.e., the number and meaning of levels in the scale are specified), whether the questionnaire's content is publicly available and piloted in advance, summary information sent to respondents after each round, and methods used to encourage participants to complete the questionnaires; (4.2) characteristics of in-person meetings or teleconferences (for modified Delphi studies): number of meetings and their purposes, form of rating scale (if used) and whether it is well-defined, whether all participants from questionnaire rounds are invited to the meetings or only a selection, the timing of meetings, and whether new items may be added between questionnaire and/or meeting rounds; and (5) how consensus is defined and attained, and how the Delphi process is terminated. Owing to the lack of a standardized, validated quality assessment tool for Delphi studies, we evaluate these studies using the checklist proposed by Diamond et al (2014). 23 Although this tool has not been validated, it serves as a reasonable initial approach for examining the conduct and reporting of Delphi studies in the absence of other metrics.
The four items in the tool are: (i) the reproducibility of criteria for participant selection, and whether (ii) the number of Delphi rounds, (iii) the criteria for dropping items at each round and (iv) the criteria for stopping the Delphi process are stated and prespecified. The number of items satisfied by each Delphi study is then reported as a quality score. Three reviewers (M.L.V, P.H.T.T and K.L.D) independently assess the quality of all Delphi studies with this tool and reach a final consensus.
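The scoring described above amounts to counting the satisfied checklist items, yielding a score from 0 to 4 per Delphi study. A minimal sketch of this tally (the example study data are hypothetical, not taken from the review):

```python
# The four criteria of the Diamond et al (2014) checklist, as listed above.
CRITERIA = [
    "reproducible participant selection criteria",
    "number of rounds stated and prespecified",
    "criteria for dropping items stated and prespecified",
    "criteria to stop the Delphi process stated and prespecified",
]

def quality_score(satisfied):
    """Count satisfied checklist items for one study.

    satisfied: dict mapping criterion -> bool; missing keys count as False.
    """
    return sum(bool(satisfied.get(c, False)) for c in CRITERIA)

# A made-up study satisfying three of the four criteria
example_study = {c: True for c in CRITERIA[:3]}
print(quality_score(example_study))  # -> 3
```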
When methods other than Delphi are used to establish the CPCS (i.e., the so-called non-Delphi methods), we describe the study design and extract detailed information on the number and types of experts participating in the CPCS development, the method used to establish the primary and final lists of items, and how consensus between participants is attained when required.
Finally, we extract details of the obtained CPCS. These include (1) whether a description of the item flow is reported, (2) whether only the final set or also intermediate results are reported, (3) whether the items in the final set are ranked and how, (4) the number of items in the final set, (5) whether definitions and measurement methods for the included items are provided, and (6) the domains of items in the CPCS (demographic, clinical, patient history, socioeconomic, or healthcare setting factors).

Data synthesis
Continuous variables are presented as median and interquartile range. Categorical variables are summarized with frequencies and percentages. To investigate the content pattern of the final lists of items across eligible studies, we perform a hierarchical, complete-linkage clustering analysis. 26 For this, we first calculate the percentage of each of the five domains in each CPCS. The domain profiles are then used to calculate the matrix of between-study Euclidean distances. The final result is visualized as a tree-structure graphic.
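The authors performed this analysis in R (see below); as an illustration of the same pipeline, here is a minimal Python sketch using SciPy, with made-up domain profiles rather than the review's actual data:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Hypothetical domain profiles: each row is one CPCS, columns are the
# percentages of items in the five domains (demographic, clinical,
# patient history, socioeconomic, healthcare setting); rows sum to 100.
profiles = np.array([
    [25.0, 50.0, 25.0, 0.0, 0.0],
    [20.0, 55.0, 20.0, 5.0, 0.0],
    [25.0, 0.0, 75.0, 0.0, 0.0],
    [10.0, 60.0, 20.0, 5.0, 5.0],
])

# Pairwise Euclidean distances between domain profiles (condensed form)
dist = pdist(profiles, metric="euclidean")

# Complete-linkage hierarchical clustering; the resulting linkage matrix
# can be drawn as a tree with scipy.cluster.hierarchy.dendrogram
tree = linkage(dist, method="complete")

# Cut the tree into two clusters for illustration
labels = fcluster(tree, t=2, criterion="maxclust")
print(labels)
```

With these toy profiles, the third CPCS (dominated by patient history items) separates from the other three, mirroring how the review's dendrogram groups sets with similar domain profiles.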
Data analysis is performed using Microsoft Excel 365 and R version 4.1.1.

Study selection
The PRISMA flow diagram summarizing the screening process is presented in Figure 2. Of the 5819 references identified, 23 articles meet the inclusion criteria. All 23 articles are primary studies; none is a study protocol.

Methodological characteristics of Delphi studies
The methodological characteristics of the 16 eligible Delphi studies are provided in Table 2 and Appendix S3. Remarkably, almost all studies involve healthcare professionals (93.8%, n = 15) or researchers (81.3%, n = 13), whereas only one study (6.3%) involves patients or patient representatives. The criteria for selecting participants are quite diverse across studies, but are most commonly based on scientific renown and/or level of expertise (56.3%, n = 9). Although the acceptance rate in these eligible studies is relatively low (a median of 30 participants versus 62 invitations), only 37.5% of the studies (n = 6) report how they ensure the representativeness of participants.
Across all studies, rating scales are used to judge the importance of items during the questionnaire rounds (100%, n = 16). These scales range from two-point to ten-point, with five-point scales being the most commonly used (31.3%, n = 5). The scale is deemed well-defined in 87.5% of the studies (n = 14). Apart from item rating, open-ended questions are also included in 62.5% of the studies (n = 10), mostly to collect qualitative feedback from participants (62.5%, n = 10). In addition, 43.8% of the studies (n = 7) report the use of a specific method to encourage participants to complete the questionnaires (e.g., sending them reminders or vouchers).
Finally, 12.5% of the studies (n = 2) do not report the criteria for selecting or dropping an item (Appendix S3). In 87.5% of the studies (n = 14), the Delphi process is terminated when the preplanned rounds are completed, regardless of the stability of responses or whether consensus has been reached for all items. In one study, the reason for termination is unclear. As stopping the Delphi process without regard to response stability or consensus attainment is deemed suboptimal, 23 all studies are penalized for this in the subsequent quality assessment. More precisely, 56.3% (n = 9) of the studies have a quality score of three, and 43.8% (n = 7) have a quality score of one or two, on the four-point quality assessment tool proposed by Diamond et al (2014) 23 (Appendix S3).

Methodological characteristics of non-Delphi studies
The methodological characteristics of the seven non-Delphi studies are provided in Table 3. Overall, only one study (14.3%) reports the types of stakeholders participating in the construction of the CPCS, and no study reports the number or proportion of the different types of stakeholders. Similarly, no study reports the criteria for selecting/dropping each item, nor how consensus is reached after each round and at the end.

Characteristics of the final lists of patient characteristics
The reporting of results and the characteristics of the final CPCSs are provided in Table 4 and Figure 3. Almost all studies (91.3%, n = 21) report the final CPCS. A CPCS developed for registries often has more items than a CPCS developed for epidemiological studies (26 [10-31] vs 17 [10-23]) (Table 4). Most CPCSs contain demographic factors (e.g., age, gender, race), clinical factors (e.g., disease severity, presence of a symptom, laboratory tests), and patient history factors (e.g., lifestyle, comorbidities, family history). In contrast, socioeconomic factors (e.g., level of education) and healthcare setting factors (e.g., standard inpatient care, ambulatory or intensive care) are absent from most final lists (Figure 3).
The included items are defined in 60.9% of the CPCSs (n = 14). In addition, 26.1% (n = 6) of the CPCSs provide specific recommendations on the measurement of complex items, i.e., variables that must be measured with a subjective or complex instrument, such as quality of life or laboratory tests (Table 4).

Discussion
The call for better collection and reporting of patient characteristics in epidemiological research is not new. The Consolidated Standards of Reporting Trials (CONSORT) 2010 statement is one of the first initiatives aiming to improve the reporting of trials, including the selection criteria (item 4a) and the description of the resulting samples (item 15). 46 A table showing baseline demographic and clinical characteristics for each treatment group, including the baseline measurement of the outcome, is required. 47 However, the CONSORT statement provides no further indication of which patient characteristics to report. Extensions of the CONSORT statement specify that information on socioeconomic variables should be added, and that all relevant prognostic variables should be reported, but only one CONSORT extension explicitly asks to include comorbidity. 48,49 Another initiative is the Food and Drug Administration Amendments Act (FDAAA) mandates, which require all covered studies to report results (including participants' age, gender, race or ethnicity, and the baseline measures of the primary outcome) within 1 year of completion. 50 Constructing core patient characteristic sets is increasingly considered a new method to further improve the collection and reporting of patient characteristics. Most CPCSs have been developed within the last 10 years, not only to improve the internal and external validity of epidemiological studies, but also to increase the quality of patient characteristic data in registries. 44,45 This is essential because registries are becoming important data sources for recent epidemiological research.
In this review, we identify many different methods to construct a CPCS. Among these, consensus-reaching techniques such as the Delphi survey are the most frequently used. Indeed, Delphi is one of the ideal methods for collecting expert-based judgements when the available knowledge is incomplete, which is often the case in CPCS or core outcome set development. 51 Most Delphi studies in our review do not include patients as stakeholders. This is probably because CPCS development requires specialized knowledge of the prognostic factors of a certain disease; involving patients would therefore bring little benefit to the process. However, embracing patients' perspectives on certain variables in the final set could be helpful, especially when these variables concern patients' private information, such as socioeconomic status, income, or family history. Methods for patient engagement have recently been proposed for core outcome sets, which could be further adjusted for the development of CPCSs. 52,53 In addition, many CPCS studies do not report how the representativeness of participants is ensured. Such information is important for determining the quality of the obtained CPCS and its uptake, and hence should be better reported in future practice.
Our review has identified a wide range of consensus definitions employed by Delphi studies, with the most common definitions based on pre-defined cut-offs for the percentage of participants voting certain rating levels. This is in line with findings from previous reviews. 23,54 Earlier studies also acknowledged the difficulty of ascertaining the validity of consensus definitions, and there has been no specific guidance on methods to define consensus, which could explain the variability observed in our study. 23 However, the minimum standard is to report carefully how consensus is defined and achieved throughout the process. This is not satisfied by one-sixth of the eligible studies, which renders these studies susceptible to bias and arbitrariness during data collection, analysis, and interpretation. 54 Most of the studies stop the Delphi process after completing a pre-specified number of rounds, regardless of the consensus attainment status. Considering the scarcity and/or divergence of evidence for each item, perfect consensus for 100% of items may not be achievable. Indeed, it has been shown that the evidence for many prognostic variables greatly suffers from a high risk of publication bias, selective reporting bias, poor statistical analyses, and so forth. 55 To address this issue, many CPCS studies group items into different sets with different priorities (based on level of evidence and/or consensus), so that researchers are also informed about the quality of the variables in the final set. On the other hand, it is important to update the CPCS over time as further evidence for new (and current) prognostic factors becomes available in the literature.
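As a concrete illustration of such a cut-off-based consensus definition, consider a rule declaring "consensus in" when a given share of panellists rate an item in the top band of the scale. The 70% threshold and the 7-9 band below are hypothetical choices for the sketch, not taken from any reviewed study:

```python
def consensus_in(ratings, threshold=0.70, band=(7, 9)):
    """Hypothetical cut-off-based consensus rule for one Delphi item.

    An item reaches consensus for inclusion when at least `threshold`
    (a proportion) of panellists rate it within `band` (inclusive) on
    a 9-point importance scale.
    """
    high = sum(1 for r in ratings if band[0] <= r <= band[1])
    return high / len(ratings) >= threshold

# Ten hypothetical panellists' ratings for one candidate item:
panel_ratings = [8, 9, 7, 6, 8, 9, 7, 5, 8, 9]
print(consensus_in(panel_ratings))  # 8 of 10 ratings fall in 7-9 -> True
```

Reporting the chosen threshold and band explicitly, as this function makes unavoidable, is exactly the minimum standard of transparency discussed above.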
Regarding non-Delphi studies, the reporting quality is relatively weak. Many important elements, such as the characteristics of study participants, the method used to establish the final list, and consensus attainment, are often not reported. This raises concerns about the rigor of the CPCSs obtained from these studies.
Our review also provides several important remarks on the final core sets across studies. First, while demographic, clinical and patient history factors are dominant in all final sets, socioeconomic and healthcare setting factors are often overlooked. This is suboptimal. 58 Meanwhile, describing the healthcare setting is important to assess the applicability of any epidemiological findings in practice. Thus, these factors are as important as the other clinical factors often included in the CPCSs.
Second, the number of (final) items in CPCSs developed for registries is often higher than in CPCSs developed for epidemiological studies. This could be because registries operate at a large scale and have more (financial and human) resources for data collection than traditional epidemiological research. 44 The disparity between CPCSs for registries and for epidemiological studies, however, could pose a challenge to the interoperability between these two settings, and to the adoption, within one medical field, of a CPCS developed in one setting in the other. Finally, apart from a list of important patient characteristics to collect and report, many CPCSs also provide recommendations on the measurement methods for complex or subjective items. Doing so could further reduce heterogeneity and inconsistency in data collection. However, when the recommended measurement method is uncommon or costly, its applicability in practice could be undermined. These practical concerns should be considered when making recommendations on the measurement of items in the CPCS.
It is important to acknowledge some limitations of our study. First, given the already large number of records that we identify from PubMed, we decide not to search a second database. In addition, we limit the eligibility criteria to articles published in English. Therefore, relevant studies that are not indexed in PubMed or not published in English might be missed. Second, the great difference between the number of records identified from the literature and the number of eligible studies may arise from the fact that the specificity and coverage of our search strategy are not optimal. This challenge stems from the absence of a standardized terminology for CPCSs, as opposed to core outcome sets. We mitigate the above-mentioned issues by consulting a librarian specialized in epidemiological systematic reviews to optimize the search strategy, and by manually searching the reference lists of identified eligible studies for additional eligible studies. Finally, we are not able to conduct a formal quality assessment of Delphi studies or of CPCS studies in general, because specific tools for this purpose are not yet available in the literature.

Conclusion
This methodological systematic review has identified deficiencies in the conduct and reporting of CPCS studies. A guideline on the conduct and reporting of CPCS studies is thus necessary to further enhance the quality of CPCSs and to promote the adoption of this concept in epidemiological research.

Figure 1 .
Figure 1. Core patient characteristic set (CPCS) in epidemiological research. The arrow from one box to the next reflects the generation and synthesis of clinical evidence in research practice. A CPCS could be particularly helpful in multiple steps.

Figure 2 .
Figure 2. Study selection PRISMA flowchart. PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses.

Figure 3 .
Figure 3. Hierarchical clustering of 21/23 CPCSs based on five variable domains, namely [1] demographic factors (age, gender, race), [2] clinical factors (e.g., disease severity, signs and symptoms, laboratory test), [3] patient history factors (e.g., lifestyle factors, comorbidities, family history), [4] socioeconomic factors (e.g., level of education, income, occupation), and [5] healthcare setting factors (e.g., standard inpatient care, ambulatory or intensive care). Each slice of the chart represents one CPCS. The sectors in each chart indicate which types of variables are included in each CPCS, with the area of each sector corresponding to the proportion of that variable type within the CPCS. For instance, the CPCS developed by Khalil et al. (2019) consists of two variable domains: demographic factors and patient history factors, which make up 25% and 75% of the CPCS, respectively. The blue lines starting from the center of the chart define how the tools are divided into the six clusters. Clusters #3 and #4, and #5 and #6, are grouped as sub-nodes of two major nodes, meaning that the tools in these sub-nodes have more similar domain profiles than the tools in other clusters.
a Including burns, chronic fatigue syndrome, rehabilitation, hemophilia, and substance use disorder.
a Each study may be classified in more than one category; IQR: interquartile range.

Table 4 .
The reporting of results among all eligible studies (N = 23).
IQR: interquartile range; N: number of studies.