How Do Women Interpret the NHS Information Leaflet about Cervical Cancer Screening?

Background. Organized screening programs often rely on written materials to inform the public. In the United Kingdom, women invited for cervical cancer screening receive a leaflet from the National Health Service (NHS) to support screening decisions. However, information about screening may be too complex for people to understand, potentially hindering informed decision making. Objectives. We aimed to identify women’s difficulties in interpreting the leaflet used in England and negative and positive responses to the leaflet. Methods. We used a sequential mixed-methods design involving 2 steps: cognitive think-aloud interviews (n = 20), followed by an England-wide survey (n = 602). Data were collected between June 2017 and December 2018, and participants included women aged 25 to 64 y with varying sociodemographics. Results. Interview results revealed misunderstandings concerning screening results, benefits, and additional tests and treatment, although participants tended to react positively to numerical information. Participants were often unfamiliar with the potential harms associated with screening (i.e., screening risks), key aspects of human papillomavirus, and complex terms (e.g., dyskaryosis). Survey results indicated that interpretation difficulties were common (M correct items = 12.5 of 23). Lower understanding was associated with lower educational level (β’s >0.15, P’s <0.001), lower numeracy scores (β = 0.36, P < 0.001), and nonwhite ethnicity (β = 0.10, P = 0.007). The leaflet was evaluated positively overall. Conclusions. Despite previous user testing of the leaflet, key information may be too complex for some recipients. As a consequence, they may struggle to make informed decisions about screening participation based on the information provided. We discuss implications for the improvement of communications about screening and decision support.

and every 5 y to women aged 50 to 64 y. Eligible women are mailed an invitation letter and a leaflet containing information about cervical cancer, its causes, what screening involves, possible results, as well as screening benefits and risks. Benefits include reduction of cervical cancer incidence and mortality. Risks include potential detection and treatment of abnormal cells that would have cleared up on their own 7,8 and increased risk of preterm birth among women who are treated to remove abnormal cells. [9][10][11] Initial screening results are communicated by letter, and women invited for further tests (i.e., a colposcopy) receive an additional leaflet describing the procedure, possible results, and risks of treatment.
Besides raising awareness of cervical screening, a key aim of the invitation leaflet for England i is to support informed choices about participation. 12 However, communications about screening often involve quantitative information that can be complex, even for educated audiences. 13,14 Concepts such as overdiagnosis and overtreatment are unfamiliar and counterintuitive to most people. 15,16 Even NHS materials that have been user tested may include complex numerical information or terminology. 17 Screening communications that are not well understood may cause undue concern, reduce recipients' beliefs about their capability to participate in screening (i.e., self-efficacy), and undermine informed uptake. 13,18 Individuals with lower levels of educational attainment or numeracy may be particularly affected, contributing to socioeconomic inequalities in screening participation. 19,20 Here, we aimed to assess women's difficulties in interpreting the NHS cervical screening leaflet for England. We also sought to explore women's responses to the leaflet, including its numerical information and infographics. These aims were of relevance because the leaflet was being revised to reflect the move to HPV primary screening in England, whereby samples will first be tested for HPV. 21 A better understanding of the weaknesses and strengths of the current leaflet can help to inform new versions and point to specific aspects requiring attention.
We used a sequential mixed-methods design involving 2 steps. 22,23 First, qualitative cognitive think-aloud interviews aimed to identify women's responses to and potential difficulties with the leaflet. Second, a quantitative survey aimed to examine the generalizability of interview findings by assessing the prevalence of difficulties and responses in the population. The survey also explored whether difficulties and responses varied with participant characteristics, including sociodemographics, screening experience, and numeracy. Participants in both steps were recruited from England, because the leaflet we tested focused on England. Ethical approval for both steps was obtained from the ethics committee of the University of Leeds (AREA 16-071 and AREA 17-002). All materials and survey data are available from the Open Science Framework (https://doi.org/10.17605/OSF.IO/8WQZV). 24
i. Each of the 4 countries of the United Kingdom (England, Scotland, Wales, and Northern Ireland) has its own leaflet. Specific content varies across leaflets. Here we focused on the leaflet for England.

Step 1 Methods: Cognitive Think-Aloud Interviews
In cognitive think-aloud interviews, women were asked to vocalize their thoughts while reading the leaflet. This method provides access to the cognitive processes that occur during a task and is often used to identify potential usability problems. 17,25,26

Participants
Women were recruited in June 2017 via Luto Research Ltd. in Leeds, England. Our sample size (n = 20) was based on related think-aloud research 17 and evidence that 10 to 15 interviews are typically enough to identify most usability issues or themes. 27,28 Four pilot interviews were undertaken before the main 20. Luto telephoned potential participants from their database. Women were eligible if they were aged 25 to 64 y and had not had cervical cancer. Purposive sampling ensured diversity in age and education. 
Following Luto's standard procedures, we excluded people taking medication for opioid addiction (due to potentially impaired cognitive function), current or retired health care professionals, and others routinely working with medical information.

Procedure
Interviews were conducted in university meeting rooms by the first author. After giving informed consent, participants received standardized instructions about the think-aloud task. We used a marked protocol that instructed participants to read out the leaflet and think aloud every time they encountered a red asterisk in the text. Asterisks were placed at the end of bullet points, short paragraphs (i.e., 2 short sentences), and long sentences (i.e., more than 25 words). 17,25,ii Following recommended procedures, participants first practiced with a leaflet about an unrelated topic. 26 After 3 successful utterances, they received the cervical screening leaflet. Following the think-aloud task, participants answered questions about the leaflet, including how much they liked it, its numerical information, and the infographic of possible screening results (Figure 1a). Finally, they completed a questionnaire assessing participant characteristics, including cervical screening experience, previous abnormal results, knowledge of someone diagnosed with cervical cancer, first language (English or other), and ethnicity (Table 1). Participants also completed Schwartz et al.'s 3-item numeracy measure, 14 which can provide good discriminability in samples of the general population. Details on age, education, and employment status were obtained from Luto.

Analysis
Interviews were audio-recorded, transcribed verbatim, and analyzed in QSR NVivo12. We used thematic analysis, a qualitative approach for identifying relevant patterns of meaning, independently of quantifiable frequency measures. [29][30][31] All transcripts were read by 2 researchers (Y.O. and D.P.). Y.O. generated initial codes and searched for initial themes and subthemes. Y.O. and D.P. reviewed themes and subthemes as needed and agreed on definitions and names. The thematic map was discussed iteratively with the remaining authors, who indicated whether the themes were adequately represented by the quotes and suggested alternative themes where relevant, until a final thematic map was defined. 17,iii

Step 1 Results

Sample Characteristics
The sample (n = 20) was diverse in age, educational level, and numeracy, though all participants were of white ethnicity and had previously participated in cervical screening (Table 1).

Themes
We identified 6 themes: 2 reflecting difficulties in interpretation, 2 reflecting negative reactions, and 2 reflecting positive reactions. Illustrative quotes are provided in Box 1.
Misunderstandings and self-reported confusion. This theme reflects aspects of the leaflet that were either not interpreted as intended or resulted in confusion. It included 3 subthemes.
iii. This process reflects an established thematic analysis procedure involving 6 iterative phases 31 : 1) familiarization with the data, including repeated reading and noting initial ideas, 2) generating initial codes by systematically identifying patterns, 3) searching for themes by combining the initial codes into potential themes and subthemes (i.e., specific topics within a theme), 4) reviewing themes by checking that the coded quotes form a coherent pattern and that the thematic map reflects the meanings in the whole data set, 5) defining and naming themes, and 6) final analysis and write-up.
ii. Marked protocols encourage reports of misunderstandings and confusion that may otherwise go unnoticed 25 and have previously been used to examine comprehension of patient information leaflets. 17 This procedure also proved more effective in our pilot interviews than unmarked protocols.
Screening results. Numerical information about possible screening results caused confusion. For instance, the leaflet states that of 100 women who have cervical screening, about 94 will have a normal result, 6 will have abnormal cells, and 4 will be invited for a colposcopy. The leaflet further states that ''about half the women who have colposcopy are found to have abnormal cells that need to be removed.'' Thus, the leaflet implies that about 2 in 100 women will need treatment for abnormal cells. Instead, participants appeared to infer that half of the women who have screening may have abnormal cells.
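The inference the leaflet expects of readers can be written out as a short calculation. The sketch below is illustrative only: it uses the per-100 figures quoted above, and the helper name `rescale` and the conversion to a denominator of 1000 are assumptions added here for illustration, not part of the leaflet.

```python
# Illustrative arithmetic only, using the per-100 figures quoted from the leaflet.
colposcopy_per_100 = 4          # women invited for a colposcopy, per 100 screened
share_needing_treatment = 0.5   # "about half" of those who have a colposcopy

# Treatment applies to half of the colposcopy group, not half of all screened women.
treated_per_100 = colposcopy_per_100 * share_needing_treatment


def rescale(count, from_base, to_base):
    """Re-express a frequency given per `from_base` people as per `to_base` people."""
    return count * to_base / from_base


treated_per_1000 = rescale(treated_per_100, 100, 1000)
abnormal_per_1000 = rescale(6, 100, 1000)  # 6 in 100 have abnormal cells

print(treated_per_100, treated_per_1000, abnormal_per_1000)
```

The common misreading described above corresponds to applying the one-half share to all 100 screened women rather than to the 4 who have a colposcopy.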
Screening benefits. Participants also misunderstood numerical information about screening benefits. Specifically, the leaflet explains the reduction in the risk of getting cervical cancer by stating that ''screening stops about 1 woman getting cervical cancer for every 100 women who have screening.'' Some participants incorrectly inferred that this implies that 1 out of 100 women who have screening will be diagnosed with cancer.
Additional tests and treatment. Participants expressed confusion about the purpose of additional tests and when these may be offered. The leaflet explains that if slightly abnormal cells are detected, the sample will be tested for the HPV types that can cause cervical cancer. Some participants incorrectly inferred that samples would be tested for cancer if abnormal cells are detected. Others incorrectly inferred that treatment for abnormal cells is offered to women who test positive for HPV or abnormal cells, independently of colposcopy results.
Knowledge gaps and unfamiliar concepts. This theme focuses on concepts that were unfamiliar to participants and in some cases were seen as concerning or scary. Participants often noted that additional clarifications about these concepts would be useful. This theme included 4 subthemes.
HPV. Participants often noted that they were not previously aware of HPV, its link to cervical cancer, how it is transmitted, or the fact that it can regress without treatment. Some participants wondered how HPV might affect men.
Screening risks. Some participants expressed concern about the risk of premature labor associated with treatment for abnormal cells and noted that it would be good to quantify the risk.
Complex terms (colposcopy, dyskaryosis). Participants often struggled to pronounce these terms and highlighted their complexity. Some questioned what a colposcopy would involve and whether it would hurt, particularly after reading initial sections of the leaflet about this.
Cervical screening (versus smear test). Some participants noted that they were more familiar with the term smear test to describe cervical screening.
Concern about speculum and pain. Participants noted that the procedure might be uncomfortable or painful. Some mentioned their own unpleasant experiences with the speculum. Several also found the image of how the speculum is inserted off-putting (Figure 1b).

Disagreement with screening eligibility and frequency.
Participants generally questioned the current age range for screening and felt that screening should start earlier or end later. Some also noted that screening should be more frequent. Participants' views on these issues were generally strong, despite a seemingly limited awareness of the rationale behind the current recommendations.

Notes to Table 1: Employment status but not social grade was recorded in step 1 (n = 14 employed, n = 4 unemployed or students, n = 2 retired). c Numeracy was assessed using the measure by Schwartz et al. 14 (skew step 1: -0.08; step 2: -0.12). d Includes n = 5 who were overdue for screening in step 1 and n = 110 in step 2. e In step 2, these questions were displayed only to participants who previously reported having screening experience.

Box 1 Themes Identified in Think-Aloud Interviews and Illustrative Quotes
Positive reactions to statistical information about screening results and screening benefits. Despite some misunderstandings, participants tended to react positively to statistical information about screening benefits. For instance, the leaflet also mentions that cervical screening saves 5000 lives from cervical cancer each year in the United Kingdom. This information was often viewed as encouraging. Participants noted that it highlighted the importance of screening. In addition, information about screening results was often viewed as reassuring.
Liking of information about the procedure. Participants noted that the information about the procedure and specific advice on how to prepare for the test were useful. They emphasized that it was good to be informed of the expected length of the appointment, waiting time to receive initial results, and of the option to ask that a woman performs the test.

Leaflet Evaluations
The leaflet was evaluated positively, with a mean rating of 5.9 (SD = 1.0) on a scale ranging from 1 to 7.

Step 2 Methods: Survey

Participants
Survey respondents were recruited through research company Norstat in December 2018. Norstat e-mailed invitations to potentially eligible individuals in their database who could speak English to a native standard. Women were eligible if they were aged 25 to 64 y, lived in England, and had not had cervical cancer or a hysterectomy. 32, 33 We excluded those who reported being registered with a general practitioner in a location where HPV primary screening was piloted at the time, because that experience could potentially interfere with interpretations of the leaflet (see Supplementary Table S1 for a list of pilot sites). We set quotas for age, education, and ethnicity, taking into account distributions in the target population of English women aged 25 to 64 y (Supplementary Table S2). The survey was first piloted with 20 participants. The target sample size for the main survey (n = 601) was set to estimate the prevalence in the target population (n = 14 133 497), 34 with a confidence level of 95% and a margin of error of 4%. Following standard practice, Norstat overrecruited to meet the target sample size after removing inattentive participants who completed the survey in less than half of the median completion time (median time = 18 min 11 s).
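The target sample size can be reproduced with a standard calculation for estimating a proportion. The sketch below is an assumption about how such a figure is typically derived, not a description of the authors' exact procedure: it assumes the conservative proportion p = 0.5, a z value of 1.96 for 95% confidence, and a finite population correction. The function name `required_sample_size` is hypothetical.

```python
import math


def required_sample_size(population, margin, z=1.96, p=0.5):
    """Sample size for estimating a proportion within +/- `margin`.

    Assumptions (not stated in the text): p = 0.5 as the most conservative
    proportion, z = 1.96 for a 95% confidence level, and a finite
    population correction applied to the infinite-population estimate.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2    # infinite-population estimate
    n = n0 / (1 + (n0 - 1) / population)         # finite population correction
    return math.ceil(n)                          # round up to whole participants


# Target population and margin of error as reported in the text.
print(required_sample_size(14_133_497, 0.04))
```

Under these assumptions the calculation returns 601, which matches the reported target. With a population this large, the finite population correction has almost no effect on the result.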

Leaflet
The leaflet was the same as in step 1, with the exception that we removed 3 sections that were not linked to interpretation difficulties in step 1 to avoid excessive respondent burden: 1) the procedure and specific advice on how to prepare, 2) the symptoms of cervical cancer, and 3) storage of samples after screening.

Survey Items
Items assessing interpretations were built on the first 2 themes identified in step 1 (Box 1). We also developed items for each of the remaining themes, except for the sixth theme (i.e., liking of information about the procedure), as the corresponding information was removed from the leaflet (see above).
Interpretations. We developed items for each subtheme under ''misunderstanding and self-reported confusion'' and ''knowledge gaps and unfamiliar concepts''. We also developed items assessing understanding of other aspects relevant for screening decisions, including additional screening risks (overtreatment, false positives, false negatives) and the main goal of cervical screening. [35][36][37] Items were pretested iteratively using 3 rounds of cognitive interviews conducted by the first author (n = 4 per round, 12 in total). Participants thought aloud while answering each item and were probed for further details where relevant. They also suggested alternative wording for items that were unclear or confusing. 38 Following each pilot round, items were revised with all authors to reduce reading barriers and ensure that they were interpreted as intended.
The final set of items included 19 true/false items (10 true and 9 false) and 4 open-ended items. iv All items are shown in Table 2. For each item, participants expressed their confidence in their answers on a scale ranging from 50% (just guessing) to 100% (absolutely sure). 42
iv. We used the true/false format for 3 reasons. First, true/false items do not require considering multiple alternatives at a time and hence are less cognitively demanding than multiple-choice items. 39 Second, true/false items are less likely to artificially increase the scores of test-wise respondents, who may use cues in the set of answer options in multiple-choice items (e.g., excess specificity of some options, length, or order of options). 40 Third, true/false items can help to detect instances of mixed or partial understanding of a given concept. 41
Evaluations of image depicting speculum. We assessed evaluations of this image (Figure 1b) in relation to the theme ''concern about speculum and pain.'' We adapted 3 items from previous work 43 (e.g., ''How much do you like or dislike this image?'') using a response scale ranging from 1 to 7 (e.g., 1 = do not like it at all, 7 = like it a lot). We averaged across items to produce an overall evaluation score (Cronbach's α = 0.70). We also asked participants to indicate how the image affected their motivation to attend screening when next invited using 3 response options: ''it decreases/increases/does not affect my motivation.''
Views on screening eligibility and frequency. We developed 3 items assessing views on the current starting age, ending age, and frequency (e.g., ''I think screening should start . . . at 25/before 25/after 25''). Participants who expressed disagreement with current policy (e.g., who selected ''before 25'') were also asked to specify their preference (e.g., ''At what age do you think screening should start?'').

Evaluations of infographic showing screening results.
We assessed evaluations of this infographic (Figure 1a) in relation to the theme ''positive reactions to statistical information about screening results and screening benefits.'' Items were analogous to those assessing evaluations of the image depicting the speculum, described above (Cronbach's α = 0.88). Participants also indicated how the infographic affected their motivation to attend screening.
Overall leaflet evaluations and familiarity. We developed 3 items to assess overall evaluations of the leaflet (Cronbach's α = 0.83). In addition, we included an item to assess participants' familiarity with the leaflet (i.e., whether they had read it before). 44 Results for all individual evaluation items in the survey are presented in the supplement (Supplementary Table S3).
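For readers unfamiliar with the reliability coefficient reported for the 3-item evaluation scales, the sketch below shows one common way to compute a mean scale score and Cronbach's α. It is a minimal illustration: the ratings are made-up values on the 1 to 7 scale, not study data, and the function name `cronbach_alpha` is hypothetical. The analyses reported in the text were run in SPSS.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item ratings.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(rows[0])                                        # number of items
    item_variances = [variance(col) for col in zip(*rows)]  # one variance per item
    total_variance = variance([sum(row) for row in rows])   # variance of summed scores
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings (1 to 7) from 5 respondents on 3 evaluation items.
ratings = [
    [6, 5, 6],
    [4, 4, 5],
    [7, 6, 6],
    [3, 4, 3],
    [5, 5, 6],
]

scale_scores = [mean(row) for row in ratings]  # overall evaluation per respondent
alpha = cronbach_alpha(ratings)
print(scale_scores, round(alpha, 2))
```

Higher values of α indicate that the items vary together across respondents, which supports averaging them into a single evaluation score.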

Procedure
The survey was implemented in Qualtrics. Participants first read an online consent form. Those who agreed to proceed were then presented with questions assessing eligibility. In addition to the sociodemographics recorded in step 1, step 2 also assessed participants' social grade according to the National Readership Survey system. 45 Categories represented the occupation of the chief income earner of the household (Table 1). Next, participants viewed the leaflet and answered items assessing interpretations. The different pages of the leaflet appeared on separate screens, accompanied by the corresponding interpretation items immediately below. Next, they completed items assessing leaflet evaluations, familiarity with the leaflet, and views on screening eligibility and frequency. They were then presented with the image depicting the speculum (Figure 1b), the infographic showing screening results (Figure 1a), and associated items in each case. Finally, they completed questions assessing participant characteristics analogous to those in step 1, including the same numeracy measure. v

Analysis
We computed overall accuracy scores for each participant by adding the number of correct responses to all items assessing interpretations. Missing responses were coded as incorrect. We performed multiple (univariate) linear regression analyses to examine whether accuracy scores, mean confidence ratings, and leaflet evaluations varied as a function of participant characteristics. Predictors consisted of sociodemographics (age, education, ethnicity, and social grade), cervical screening experience, numeracy, and English as a first language. vi The lowest educational level and social grade were used as the reference class, and age and numeracy scores were entered as continuous variables. Analyses were conducted using SPSS 23 for Windows. Full regression results are presented below, and mean accuracy, confidence, and evaluations corresponding to the different levels of all predictors are presented in the supplement (Supplementary Table S4).
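The scoring rule described above (one point per correct response, with missing responses counted as incorrect) can be sketched as follows. The item identifiers and answer key are hypothetical placeholders, not the study's actual items, and `accuracy_score` is an illustrative name.

```python
def accuracy_score(responses, answer_key):
    """Count correct responses; an item missing from `responses` scores as incorrect."""
    return sum(
        1
        for item, correct_answer in answer_key.items()
        if responses.get(item) == correct_answer  # a missing item yields None, never correct
    )


# Hypothetical answer key for three true/false items.
answer_key = {"item_1": True, "item_2": False, "item_3": True}

# This respondent skipped item_2 and answered item_3 incorrectly.
responses = {"item_1": True, "item_3": False}

print(accuracy_score(responses, answer_key))
```

Coding missing responses as incorrect keeps every participant on the same 0 to 23 scale, at the cost of treating skipped items and wrong answers alike.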
Step 2 Results

Sample Characteristics
The survey was accessed by 1953 participants, of whom 37% were eligible (Figure 2). The final sample (n = 602) included 12% of participants of nonwhite ethnicity and 9% with no screening experience (Table 1). In the population, 14% were nonwhite and 11% had no screening experience (Supplementary Table S2).

Interpretations
Participants answered on average 12.5 items correctly out of 23 (SD = 3.06; range, 5-21), indicating relatively common interpretation difficulties. Regression results revealed that the strongest predictor of accuracy was numeracy, followed by education (Table 3). Scores were lower among participants with GCSE/O-level grade or less, relative to participants with A-levels and to those with higher education. Accuracy was also lower among nonwhites than among whites and among participants from the lowest social grades (D and E) relative to those from grades C1 and C2.
Analyses of individual items revealed that performance was particularly poor for items assessing screening results (Table 2). Only 10% of participants accurately estimated the number of women expected to have possible cancer cells, and only 15% accurately estimated the number expected to need treatment for abnormal cells. Inspection of the distribution of responses revealed that participants often overestimated the likelihood of these adverse results (Supplementary Table S5a-c; Figure S1). For instance, 32% of participants inferred that 40 in 1000 women would have possible cancer cells (correct answer = 1 in 1000), and 19% inferred that 500 in 1000 women who have screening would need treatment for abnormal cells (correct answer = 20 in 1000). A different pattern emerged for the item concerning the number of women expected to have an abnormal result, where 43% of estimates were accurate and 48% were lower than the correct answer (60 in 1000). The fact that the correct answer for this item is higher than that of the previous 2 items implies that there was more room for underestimation. In addition, some incorrect responses likely reflect a failure to transform the estimate provided in the leaflet as required by the question. Whereas the leaflet stated that 6 out of 100 women will have an abnormal result, participants had to indicate how many out of 1000 would have an abnormal result. The most common incorrect response (seen in 31% of participants) was 6, which likely reflects direct extraction of the information from the leaflet. Performance was also poor for items assessing understanding of additional tests and treatment. The most common misunderstandings were that treatment would be offered to women with abnormal cells (70% of participants) or those who test positive for HPV (57%). Instead, the leaflet explains that a colposcopy is offered in both cases to determine whether treatment is needed.
v. The survey also included an alternative image depicting the speculum and items unrelated to the current research questions (i.e., perceptions of the risk of developing cervical cancer and screening intentions), which are available at the Open Science Framework.
vi. We did not include previous abnormal results as a predictor because this question was answered only by participants who reported having screening experience. Because of an error in survey flow, this was also the case for the item assessing knowledge of someone diagnosed with cervical cancer, which was also not included as a predictor.
Information about screening benefits and risks was also misunderstood frequently, although performance varied substantially across individual items. Whereas most participants (94%) understood the concept that screening lowers the risk of getting cervical cancer, two-thirds (67%) misinterpreted the risk reduction information provided (''screening stops about 1 woman getting cervical cancer for every 100 women who have screening''). Only 35% of participants accurately estimated the effect of screening on the risk of getting cervical cancer, with 18% assuming that the risk would be equal in groups of unscreened versus screened individuals (Supplementary Table S5d). Moreover, 26% of participants assumed that the main goal of cervical screening was diagnosis rather than prevention. Concerning screening risks, one of the most common misunderstandings was that the screening test itself increased risk of premature labor (75% of participants). In addition, 76% were unaware of the possibility of false-negative results, and 47% did not understand that screening can lead to unnecessary treatment.
Specific aspects of HPV were also misinterpreted. Although most participants (91%) understood that HPV can be passed on during sexual intercourse, 50% incorrectly inferred that condoms do not lower the risk of infection. More than half (57%) also failed to understand that HPV usually does not need any treatment.

Self-Reported Confidence
Despite participants' misunderstandings, their self-reported confidence was relatively high (Table 2). Mean confidence ranged between 73.9 and 90.1 for items assessing screening results and additional tests and treatment, despite poor performance. Mean confidence across all items was weakly correlated with the total number of accurate responses (r = 0.21, P < 0.001), suggesting that participants who had better understanding tended to express more confidence. Regression results revealed that confidence ratings were higher among more numerate participants and among those with higher education, relative to those with GCSE/O levels or less. Confidence was also higher among participants from social grades C1 and C2, relative to those from the lowest grades (D and E). Older age and cervical screening experience were also associated with higher confidence.

Image Depicting Speculum
This image (Figure 1b) was on average evaluated positively, with a mean rating of 5.2 (SD = 1.3) on a 1 to 7 scale. Most participants (72%) noted that their motivation to attend screening would not be affected by this image, although 14% noted that it would decrease their motivation, with the remaining 14% saying that it would increase it.

Screening Eligibility and Frequency
Agreement with the current screening starting age (i.e., 25 y) and ending age (i.e., 64 y) was low (24% and 33% of participants, respectively). Most participants (72%) indicated that screening should start before age 25, of whom 43% noted that it should start at 18 y. Most (64%) also indicated that screening should end after age 64, of whom 47% stated that it should end at age 70. In addition, 35% of participants indicated that screening should be offered more frequently, although 62% agreed with the current screening interval.

Infographic Showing Screening Results
The infographic (Figure 1a) received very positive evaluations, with an average rating of 6.0 (SD = 1.1) on a 1 to 7 scale. A total of 31% participants noted that the

Overall Leaflet Evaluations and Familiarity
The leaflet overall was also evaluated positively, with a mean rating of 5.8 (SD = 1.1) out of 7. The regression predicting evaluations explained a small amount of variance (Table 2). Participants who had cervical screening experience and whose first language was English evaluated the leaflet more positively. Most participants (64%) reported having read at least some of the leaflet the last time they were invited for screening, although 18% reported not having read it, and the remaining 18% did not remember previously seeing a leaflet.

Discussion
Our findings suggest that the NHS leaflet about cervical screening may be too complex for some recipients. Even though the leaflet underwent extensive user testing 12 and was evaluated positively in our study, we documented common misunderstandings about key aspects, including screening benefits, risks, and results. Despite these misunderstandings, participants' self-reported confidence in their answers was relatively high. This echoes previous findings on overconfidence in one's own knowledge 42,47,48 (but see Olsson 49 ). We also found that leaflet interpretations were less accurate among participants with lower education, those with lower numeracy, and those from ethnic minorities. These findings suggest that some recipients may struggle to make informed decisions about screening participation based on the information provided and highlight the challenges in developing communications that are effective for diverse audiences. In addition to hindering informed decision making, specific misunderstandings may have other unintended effects. Although information about screening results was often viewed as reassuring by interviewees, survey respondents overestimated the likelihood of some adverse results. Relatedly, about a quarter of survey respondents failed to understand the preventive purpose of cervical screening, converging with recent findings. 50 Misunderstanding of the main goal of cervical screening coupled with overestimations of adverse results may lead to undue worry about what the test might find. This in turn may potentially lead to avoidance of screening, particularly among women with high cancer fear. 50,51 The misunderstanding that the screening test increases the risk of preterm labor could have a similar effect, particularly among those planning to get pregnant. On the other hand, we also found that almost half of the respondents failed to infer that cervical screening can lead to unnecessary treatment. 
The failure to understand the risk of overtreatment may lead to a more positive attitude about screening, at the expense of informed decision making.
Despite their misunderstandings, participants evaluated statistical information relatively positively. Indeed, there is evidence that numbers are often trusted and preferred over verbal quantifiers alone to communicate health risks. 52-54 The finding that the infographic showing screening results (Figure 1a) was evaluated positively also converges with research showing that simple visual aids are often liked by diverse audiences. 55,56 However, our findings also suggest that it may be beneficial to consider alternative numerical formats to support understanding. For instance, the leaflet did not provide information about the risks of developing cervical cancer and dying of cervical cancer with and without screening, contrasting with recommendations from the risk communication literature and International Patient Decision Aids Standards. 57 Such information could be communicated in an accessible way using fact boxes and/or visual aids, 58-60 which could facilitate evaluations of the effectiveness of screening. It could also be beneficial to add numerical estimates about screening risks, which are currently lacking in the leaflet. The use of verbal quantifiers without numbers to express risks is generally discouraged, as this can lead to diverse interpretations, including overestimations of risk. 61-63
Our findings also show that unfamiliar concepts may not be fully understood based on the information in the leaflet. Misunderstandings about HPV are of particular concern considering the move to HPV primary screening. Our findings support work that has identified similar gaps in HPV knowledge, 64,65 and provide the first evidence that some misunderstandings may persist despite the explanations provided in the leaflet. Hence, our findings highlight the importance of further clarifying key aspects of HPV, such as its link with cervical cancer, its transmission, and how it can clear without treatment. In addition, the leaflet could potentially be simplified by removing other unfamiliar concepts that are arguably not essential for informed screening decisions at the invitation stage, such as "dyskaryosis" or specific aspects concerning colposcopies. Simplifying communication materials can increase understanding among diverse audiences without negatively affecting evaluations or intentions to participate in the advertised programs. 66

Limitations and Future Research
Our work has limitations. First, the marked think-aloud procedure may have introduced some bias as it encouraged comments at specific points in the text. Although prompts to think aloud were very frequent, they may have focused participants more on aspects immediately preceding each prompt. Second, our survey sample was recruited from an online panel, which may not have been representative of the population. Although we used quotas considering distributions of key demographics in the population, we were unable to recruit enough participants with no qualifications (8% in our sample v. 16% in the population; Supplementary Table S2). Relatedly, all think-aloud interviewees had previously participated in cervical screening at some point, which may have facilitated interpretations of information they could relate to their own experience. Some of this information (e.g., details about the procedure) was not tested in the survey because interviewees showed no confusion. Previous screening experience could potentially also result in more positive reactions to such information.
Third, although we took measures to remove inattentive survey participants, others may not have read the leaflet carefully either. However, any misunderstandings attributable to inattention may also be present among actual leaflet recipients, who often do not read the full leaflet. 47 In addition, performance may have been negatively affected by specific item wordings. Although we pretested all items, some may not have been interpreted as intended. For instance, the item concerning the risk of preterm labor may have been interpreted by some as referring to screening participation generally, rather than to the screening test itself. Relatedly, the item assessing estimates of the cervical cancer risk reduction associated with screening did not provide a time interval (e.g., lifetime risk). While a time interval was also lacking in the leaflet, this may have contributed to interpretation difficulties. The chances of correct responses due to guessing should also be considered when interpreting our results. The high confidence ratings suggest that participants did not report guessing in most cases. However, some may have been reluctant to admit doing so.
Future work could examine the impact of cultural differences and prior beliefs about cancer or screening on interpretations of cervical screening communications. Prior beliefs about the effectiveness of screening in general or strong fear of cancer could interfere with comprehension or its relationship to screening intentions. 59,67 Similarly, low perceived cancer risk or cancer fatalism (e.g., the belief that cancer is incurable) could bias processing of information about screening, leading to misinterpretations. Such beliefs are more prevalent among ethnic minorities than among white British women, independently of other sociodemographic factors. 68,69 This could help to explain our finding that nonwhite ethnicity was linked with lower leaflet understanding after controlling for other sociodemographics and native language.

Conclusions
Our work points to strengths and weaknesses in the NHS cervical screening leaflet for England, which constitutes a central communication tool of the screening program. Addressing the weaknesses may help to reduce screening inequalities and support understanding for wider audiences. While we focused on the leaflet for England, our findings are also relevant for the design of other leaflets that may be revised to reflect the move to HPV primary screening (e.g., the Scottish leaflet). Our findings also have implications for improving other communications about cervical screening (e.g., websites) and potentially communications about other screening programs internationally.