Does misclassification of former tobacco smokers explain the ‘smoker’s paradox’ in the risk of COVID-19? Insights from the Stockholm Public Health Cohort

Background: The association between tobacco smoking and the risk of COVID-19 and its adverse outcomes is controversial, as studies have reported contrasting findings. Bias due to misclassification of the exposure in analyses of current versus non-current smoking could be a possible explanation, because former smokers may have higher background risks of the disease due to co-morbidity. The aim of the study was to investigate the extent of this potential bias by separating non-, former, and current smokers when assessing the risk or prognosis of diseases. Methods: We analysed data from 43,400 participants in the Stockholm Public Health Cohort, Sweden, with information on smoking obtained prior to the pandemic. We estimated adjusted risk ratios (aRR) of COVID-19, hospital admissions and death for (a) former and current smokers relative to non-smokers, and (b) current smokers relative to non-current smokers, that is, including former smokers, adjusting for potential confounders. Results: Compared with non-smokers, former smokers had an elevated risk of a COVID-19 diagnosis (aRR = 1.07; 95% confidence interval (CI) = 1.00–1.15), of hospital admission with any COVID-19 diagnosis (aRR = 1.23; 95% CI = 1.03–1.48) or with COVID-19 as the main diagnosis (aRR = 1.23; 95% CI = 1.01–1.49), and of death within 30 days with COVID-19 as the main or a contributory cause (aRR = 1.40; 95% CI = 1.00–1.95). Current smoking was negatively associated with the risk of COVID-19 (aRR = 0.79; 95% CI = 0.68–0.91). Conclusions: Separating non-smokers from former smokers when assessing disease risk or prognosis is essential to avoid bias. However, the negative association between current smoking and the risk of COVID-19 could not be entirely explained by misclassification.


Introduction
The coronavirus disease (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), had resulted in more than 605 million confirmed cases of COVID-19 and about 6.5 million deaths around the world by 12 September 2022 [1]. Tobacco smoking has been investigated as a potential risk factor for the disease. The association between tobacco smoking and SARS-CoV-2 infection or the disease's adverse outcomes remains controversial, as studies have reported mixed findings. Some studies have reported a negative association between smoking and COVID-19 infection [2] or hospitalisations due to COVID-19 [3,4], yet others have shown a positive association [5,6].
It has been suggested that the observations of a negative association between tobacco smoking and COVID-19 may result from biases, including confounding, selection or bias due to misclassification of the exposure. This latter bias may be present, for instance, when the exposure is defined as current versus non-current smokers (therefore including former and never smokers in the same reference category). Smokers who develop smoking-related chronic diseases are more likely to quit smoking than smokers without these conditions [7]. For instance, smokers with smoking-related diseases were found to have an increased motivation to quit smoking due to the direct negative effect of smoking, such as worsening symptoms or reduced quality of life [8]. Moreover, these individuals often receive feedback from healthcare providers about their lung function, biomarker feedback related to their smoking behaviour, and the impact of smoking on their disease progression [9]. Healthcare providers can also help smokers understand the benefits of smoking cessation, such as improved disease management and lower risk of disease progression or complications [10].
The presence of such morbidity may, in turn, increase the risk of a SARS-CoV-2 infection and/or its adverse prognosis. The risk of infection among smokers may be enhanced by several biological or behavioural factors. First, smoking-related co-morbidity may imply accelerated lung function decline [11], and this decline is associated with infection susceptibility, inflammation and impaired immunity [11,12]. Also, some tobacco-related diseases, such as chronic obstructive pulmonary disease (COPD), do not reverse after smoking cessation [11]. Studies have shown that former smokers with COPD tend to have a similar number and type of inflammatory cells as current smokers, and this indicates that inflammation is ongoing even after smoking cessation [13]. In fact, a recent systematic review found that former smokers have a somewhat increased risk of hospitalisation with COVID-19 as well as an enhanced disease severity [2].
In Sweden, the transition to smokeless tobacco use ('snus') has been very common among men quitting smoking [14], and it can be hypothesised that snus users also have a high risk of infections of the airways due to frequent contact between the hands and the oral mucosa. Also, unlike smokers, snus users are not subjected to restrictions on tobacco use in proximity to others, which can have unfavourable consequences for social distancing during the pandemic.
For the reasons above, analyses that include 'former' smokers in the same reference group as 'never' smokers when estimating the relative risk of COVID-19 among 'current smokers' may yield downward-biased estimates of the association, such as those reported in previous studies based on this comparison [15-17].
The aim of this study was to assess the presence and extent of this potential bias within the same study, drawing on longitudinal data from the Stockholm Public Health Cohort (SPHC). For this purpose, we explored the relative risk of COVID-19 and its adverse outcomes comparing (a) current and former smokers with non-smokers; (b) current smokers with non-current smokers, where the reference group includes both never and former smokers.

Study design and data sources
The study was based on longitudinal data collected within the framework of the SPHC. The SPHC is a population-based cohort established within the region of Stockholm, with participants recruited in subsequent surveys conducted in 2002, 2006, 2010 and 2014. For the purpose of this study, we used information collected during the 2010 and 2014 surveys. Data were collected using postal or web-based questionnaires covering health-related and lifestyle information, including tobacco use. Self-reported information has been complemented by information from healthcare and socio-demographic registries.
The data collection was managed by Statistics Sweden in collaboration with the Department of Public Health Sciences at Karolinska Institute [18]. The cohort profile, including the questionnaire, has been described in detail elsewhere [18]. Sociodemographic information (sex, age, achieved education, occupational risk for infection, income, cohabitation and country of birth) was extracted through record-linkage with the register of the total population of the region of Stockholm held by Statistics Sweden. We used the national personal identification number assigned to every resident in Sweden at birth or at immigration to obtain information on diagnoses of COVID-19 among individuals in this cohort through record-linkage with the regional database of inpatient and outpatient health care (VAL database).

Participants
The derivation of the study sample is shown in Figure 1.

Exposure
Participants were initially grouped according to their self-reported tobacco smoking status in 2014, whereby they were required to answer the following questions: A. Have you ever smoked tobacco daily, or almost daily, for at least six months? (1: Yes, 2: No); B. Do you currently smoke tobacco daily or almost daily? (1: Yes, 2: No). For the purpose of this analysis, we first grouped the answers into three categories: non-smoker (if A=2); current smoker (if A=1, B=1); former smoker (if A=1, B=2). Smoking status in 2010 was then used to refine this latter category so that it also included, as former smokers, those who reported not smoking in 2014 but reported former or current daily smoking in 2010. To assess the potential bias induced by the inclusion of former smokers in the category of non-current smokers, we then re-categorised the answers into a dichotomous variable: 'Current non-smoker' (if A=2 or if A=1, B=2) and 'Current smoker' (if A=1, B=1).
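
The grouping rules above can be written out as a few small functions. This is a minimal sketch, not the authors' code: the function and label names are hypothetical, answers are coded 1 = Yes and 2 = No as in the questionnaire, and the 2010 refinement follows one reading of the text (a participant classified as a non-smoker in 2014 who reported former or current daily smoking in 2010 is moved to the former-smoker group).

```python
from typing import Optional

YES, NO = 1, 2  # questionnaire coding: 1 = Yes, 2 = No


def status_2014(a: int, b: Optional[int]) -> str:
    """Three-category status from question A (ever daily) and B (currently daily)."""
    if a == NO:
        return "non-smoker"
    if a == YES and b == YES:
        return "current smoker"
    if a == YES and b == NO:
        return "former smoker"
    raise ValueError("unexpected or missing answers")


def refine_with_2010(status: str, status_2010: Optional[str]) -> str:
    """Assumed refinement: a 2014 non-smoker with a daily smoking history
    reported in 2010 is reclassified as a former smoker."""
    if status == "non-smoker" and status_2010 in ("former smoker", "current smoker"):
        return "former smoker"
    return status


def dichotomise(status: str) -> str:
    """Two-category variable in which former smokers join the reference group."""
    return "current smoker" if status == "current smoker" else "current non-smoker"


# A participant answering A = No in 2014 who was a current smoker in 2010
refined = refine_with_2010(status_2014(NO, None), "current smoker")
print(refined, "|", dichotomise(refined))
```

In the dichotomous coding the former smoker in this example ends up in the same reference category as a never smoker, which is the pooling whose effect the study sets out to quantify.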

Outcome
Diagnoses were included between 1 March 2020 and 31 August 2021 (date of last update of record linkage). The rollout of COVID-19 vaccinations started in Sweden in January 2021. Five binary outcomes (Yes/No) were included in the analysis:
1. Any diagnosis of COVID-19, whether in a hospital or outside, consisting of at least a positive polymerase chain reaction (PCR) test reported by the laboratories to Sweden's national electronic surveillance system for communicable diseases, SmiNet.
2. Hospital admission with a diagnosis of COVID-19 (ICD-10 codes U071 and U072) registered either as main or as a concomitant diagnosis.
3. Admission to an intensive care unit (ICU) with a diagnosis of COVID-19 (ICD-10 codes as above).
4. Death by COVID-19, established using the Swedish Cause of Death Registry, which is based on the death certificate filled in by physicians. All deaths occurring during the follow-up period with COVID-19 registered as the main cause were included. The restriction to the main cause of death was done to maximise the specificity of the diagnosis [19].
5. Death within 30 days from a diagnosis of COVID-19, listed as either the main or a contributory cause. This latter analysis was conducted for sensitivity purposes.

Confounders
Adjustments were made for potential confounders, selected using a directed acyclic graph (Supplemental material Figure S1 online). Socio-demographic information (sex, age, achieved education, occupational risk for infection, income, cohabitation and country of birth) was extracted through record-linkage with the register of the total population of the region of Stockholm held by Statistics Sweden. These covariates were categorised as follows: Sex (male/female); Age in years (continuous); Education (compulsory school, i.e. nine years of schooling; high school, i.e. two or three years of schooling after the compulsory education; university; unknown); Disposable yearly income in Swedish crowns (continuous); Cohabitation with others (Yes/No); Country of birth (Sweden; Other Nordic country; Other country). Occupational risk for infection with SARS-CoV-2 was categorised as 'high', 'moderate' or 'low' based on a priori knowledge of exposure to the transmission of the virus. Three variables were derived from self-reports in the SPHC in 2010 and 2014: a diagnosis of sexually transmitted diseases in the past 12 months; diagnoses of chronic diseases; and the use of snus.
We used 'sexually transmitted diseases' as a proxy for risky social contact exposure to SARS-CoV-2 transmission. Risky sexual behaviours, such as unprotected sex and multi-partnership, are causally linked to sexually transmitted diseases and correlate heavily with smoking [20,21]. This information and the presence of chronic diseases were self-reported as yes/no. Snus use, categorised as non-use, former use or current use, was derived in the same way as smoking.

Statistical analysis
Risk ratios (RRs) and their corresponding 95% confidence intervals (CIs) for COVID-19 infection, hospitalisation, ICU admission or death due to COVID-19 were estimated through generalised linear models. We specified the Poisson family for the binary health outcomes, with robust standard errors, estimated using maximum likelihood [22]. We ran two sets of analyses. In the main analysis, we derived the risk of the selected outcomes for current and former smokers relative to non-smokers. Additionally, we compared the risk of the selected outcomes between current and non-current smokers, where this latter category included former smokers. We expected that the inclusion of former smokers in the reference group would increase the absolute risk in this group, resulting in an RR for current smokers further from the null in the case of a negative association and closer to the null in the case of a positive association, compared with the estimates obtained in the main analysis [23].
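
The expected direction of this bias can be illustrated with crude (unadjusted) risk ratios. The sketch below uses entirely hypothetical counts, chosen only so that former smokers carry a higher risk than never smokers; it does not reproduce the cohort data or the adjusted Poisson models used in the study, and all names are illustrative.

```python
def risk_ratio(cases_exposed, n_exposed, cases_ref, n_ref):
    """Crude risk ratio: risk in the exposed group divided by risk in the reference group."""
    return (cases_exposed / n_exposed) / (cases_ref / n_ref)


# Hypothetical (cases, group size) pairs, not taken from the study
never = (900, 10_000)     # 9.0% risk
former = (1_000, 10_000)  # 10.0% risk, higher than never smokers

# Pooling former smokers into the reference group raises its risk to 9.5%
non_current = (never[0] + former[0], never[1] + former[1])

# Scenario 1: current smokers at lower risk (negative association)
current_low = (140, 2_000)  # 7.0% risk
rr_neg_never = risk_ratio(*current_low, *never)         # about 0.78
rr_neg_pooled = risk_ratio(*current_low, *non_current)  # about 0.74, further from 1

# Scenario 2: current smokers at higher risk (positive association)
current_high = (240, 2_000)  # 12.0% risk
rr_pos_never = risk_ratio(*current_high, *never)         # about 1.33
rr_pos_pooled = risk_ratio(*current_high, *non_current)  # about 1.26, closer to 1

print(f"negative association: {rr_neg_never:.2f} vs never, {rr_neg_pooled:.2f} vs non-current")
print(f"positive association: {rr_pos_never:.2f} vs never, {rr_pos_pooled:.2f} vs non-current")
```

In both scenarios the pooled reference group pushes the estimate for current smokers downwards, which exaggerates a protective association and attenuates a harmful one.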
Adjustments were made for the putative confounders listed above: sex, age, cohabitation, education, income, occupational risk for infection, country of birth, snus use, and the existence of sexually transmitted or chronic diseases. Missing data in any covariate (less than 10%) were considered ignorable (complete case analysis). All analyses were conducted using STATA®, version 17 (StataCorp LP, College Station, Texas, USA).

Results
A total of 43,400 participants (39,425 with complete data) were included in the study. The derivation of the study sample is shown in a flowchart (Figure 1). Table I shows the baseline socio-demographic characteristics and incident diagnoses of COVID-19 among the cohort participants, separately by categories of smoking behaviour.
The proportion of current smokers in this sample was 7.5%, and that of former smokers was 38.9%. Former smokers were older than both non-smokers and current smokers. The proportion self-reporting chronic diseases was similar among current and former smokers (35.8%) and higher than among non-smokers (22.0%). The proportion reporting sexually transmitted diseases in the past 12 months was similar in all groups of smoking behaviour. Current snus use was more frequent among former smokers (12.2%) compared with non-smokers and current smokers (4.8% and 8.1%, respectively). The cumulative incidence of COVID-19 diagnoses was 9.1%, and that of death was 0.43%. Table II reports the distribution of incident COVID-19 diagnoses and related outcomes (hospital admissions, admission to ICU, and death) across categories of smoking status.
The risk of being diagnosed with COVID-19 between 1 March 2020 and 31 August 2021 was higher among former smokers than among non-smokers (adjusted RR 1.07, 95% CI = 1.00-1.15) (Table III).
Table IV shows the risk of COVID-19 and related disease outcomes among current smokers compared with current non-smokers, that is, a reference group including both non- and former smokers. The risk of infection was significantly lower among current smokers (adjusted RR 0.76, 95% CI = 0.67-0.87) than among non-current smokers, while again, the estimated relative risk of hospitalisation, ICU and death was compatible with no association.

Discussion
In this population-based cohort, we examined the prospective association between tobacco smoking and SARS-CoV-2 infection risk and its adverse outcomes six years later, looking for evidence of potential bias introduced by the comparison between current and non-current smoking. The data allowed a refined definition of non-, former and current daily smoking, thus enabling a comparison between results obtained with a misclassified reference group, and therefore potentially biased, and results obtained with a correctly specified reference group. Former smoking was associated with a higher risk of COVID-19 infection, hospitalisation, and death compared with non-smoking, even after considering the presence of chronic diseases. On the other hand, current tobacco smoking was associated with a lower risk of COVID-19 infection after controlling for potential confounders. Our study showed that the results based on the misclassified comparison (categorising former smokers as non-smokers) were similar, but the negative association between current smoking and infection appeared to be stronger, suggesting a moderate amount of downward bias (about 4%).
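
The text does not state how the figure of about 4% was derived. One plausible reading, shown here only as an assumption, is the relative difference between the adjusted RR for current smokers against non-smokers (0.79, as reported in the abstract) and against non-current smokers (0.76, Table IV). Variable names are illustrative.

```python
# Adjusted risk ratios for current smokers, as reported in the text
rr_vs_non_smokers = 0.79   # reference group: non-smokers only
rr_vs_non_current = 0.76   # reference group: non-smokers plus former smokers

# Assumed definition of the bias: relative shift of the estimate
# when former smokers are pooled into the reference group
relative_shift = (rr_vs_non_smokers - rr_vs_non_current) / rr_vs_non_smokers

print(f"{relative_shift:.1%}")  # 3.8%
```

Under this reading, pooling lowers the estimate by roughly 3.8%, which rounds to the 4% quoted above.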
The results from this study are in line with previous reports of a higher risk of COVID-19 infection and adverse outcomes among former smokers when compared with non-smokers [3,24]. Also, they are in line with those studies showing a higher risk of COVID-19 disease severity among former but not among current smokers [2,25] compared with non-smokers.
Previous studies indicate that former smokers are more likely to be older and to have smoked for a longer time than current smokers, or to suffer more comorbidities when compared with non- or current smokers [26-28]. This may indicate that former smokers are more likely to suffer from COVID-19 infection and adverse disease outcomes, given their pre-existing tobacco-related comorbidities. These observed differences in outcomes confirm the relevance of accurately separating former from non-smokers when investigating the association between smoking and the risk or prognosis of COVID-19 and potentially also of other diseases. Results from previous studies that misclassified former smokers as current non-smokers due to incomplete data or inconsistent assessment of smoking status should be handled with caution. If former smokers are at higher risk due to co-morbidity, including them in the same category as non-smokers may spuriously inflate the risk of this latter group and result in biased estimates of the association between current smoking and the disease, as this study confirms. However, the moderate amount of bias estimated in this sample cannot be assumed to apply to other studies based on samples with different behavioural or socio-demographic characteristics.
Current smokers were at lower risk of infection with SARS-CoV-2 in this study, irrespective of smoking categorisation, while the prognostic outcomes of a diagnosis of COVID-19 among smokers in this study were not different from those of non-smokers. The negative association between current smoking and the risk of COVID-19 in our study could not be entirely explained by the misclassification of the exposure. Other studies also reported a lower risk of infection among smokers [2], in contrast with the expected deleterious effect of smoking due to its well-established causal role in respiratory tract diseases and infections [29]. A convincing explanation of this puzzling association is still lacking. Hypotheses have been forwarded, for instance, regarding the potential protective role of nicotine either as an anti-inflammatory agent or due to its ability to bind to the ACE2 cell-membrane protein, the entry point for the SARS-CoV-2 virus, hence blocking the virus from binding to the protein [30,31]. However, recent population-based studies conducted in Sweden and Finland showed that the users of smokeless tobacco snus are at a higher risk of infection compared with non-users [32,33]. Because snus delivers high amounts of nicotine, these results did not support a protective role of this substance. It is also possible that our study may not have accounted for all confounding factors that could influence the association between smoking and SARS-CoV-2 infection risk.
Smokers and non-smokers may differ in their health-seeking behaviour, including their likelihood of seeking medical care and testing for SARS-CoV-2. This could lead to a spurious negative association between smoking and the risk of infection. It is crucial to emphasise that the findings of this study should not be taken as an encouragement to start or continue smoking. Smoking is a well-established risk factor for numerous health issues, including lung cancer, heart disease and COPD. The low SARS-CoV-2 infection risk observed in this and other studies, even if causal, does not outweigh the negative health effects of smoking. Additional studies are needed to better understand the relationship between smoking and SARS-CoV-2 infection.
Our study has several strengths. First, the longitudinal study design allowed the study of the association of interest in a truly prospective fashion, avoiding the risk of reverse causality. Second, this study is population-based, hence minimising the risk of selection bias [34] present in hospital samples. Third, the rich database allowed a comprehensive adjustment for several potential confounders selected a priori according to causal pathways. Fourth, outcome assessment was based on RT-PCR and registry-linked data, therefore using information of high quality. Finally, the use of two data points to refine the exposure definition increased its sensitivity.
Some limitations, however, should be noted. First, most information was based on self-reports. Second, the exposure assessment was conducted in 2014, long before the start of the COVID-19 pandemic. While this completely avoids reverse causation due to behavioural changes during the pandemic, it is likely that some individuals classified as current smokers in 2014 had subsequently quit smoking. In fact, the prevalence of daily smoking in the Swedish population declined from 10% in 2014 to 7% in 2018-2020 [35]. However, if these misclassified former smokers had the same risk profile as the former smokers identified as such in this analysis, this would bias the negative association between current smoking and infection with SARS-CoV-2 towards the null.
Third, cohort participants, and particularly those retained between 2010 and 2014, may be a selected group with a different behavioural and risk profile from those originally sampled. Caution should therefore be employed in extrapolating the results to the underlying source population. Also, we did not have information on occasional smoking; therefore occasional smokers are included in the 'non-smoking' group. Depending on assumptions about the risk profile of these smokers, the estimates of the relative risks of COVID-19 for former and current smokers may have been biased in either direction.
Former smokers in this study had a higher risk of COVID-19 infection and adverse disease outcomes. These findings indirectly support public health efforts to curb smoking, given the effect of tobacco-related comorbidities. On methodological grounds, the results suggest that non-, former, and current smokers should be separated when assessing the risk or the prognosis of diseases with an impact on the respiratory system.