The Long-Term Impact of Different Offline Population Inclusion Strategies in Probability-Based Online Panels: Evidence From the German Internet Panel and the GESIS Panel

While online panels offer numerous advantages, they are often criticized for excluding the offline population. Therefore, some probability-based online panels have developed offline population inclusion strategies. Two dominant approaches prevail: providing internet equipment and offering an alternative survey participation mode. We investigate the impact of these approaches on two probability-based online panels in Germany: the German Internet Panel, which provides members of the offline population with internet equipment, and the GESIS Panel, which offers members of the offline population the option of participating via postal mail surveys. In addition, we explore the impact of offering an alternative mode only to non-internet users versus also offering the alternative mode to internet users who are unwilling to provide survey data online. Despite lower recruitment and/or panel wave participation probabilities among offliners than onliners, we find that including the offline population has a positive long-term impact on sample accuracy in both panels. In the GESIS Panel, the positive impact is particularly strong when offering the alternative participation mode to non-internet users and internet users who are unwilling to provide survey data online.

Many researchers aim to draw inferences from survey data to the general population. In many cases, this requires that all parts of the population have a chance of being included (Groves & Lyberg, 2010). However, this requirement is not always fulfilled. Indeed, systematic exclusion of parts of the general population can occur in all survey modes. For example, people who do not own a telephone are systematically excluded from telephone surveys (Blumberg & Luke, 2007; Keeter et al., 2007). However, inclusiveness is particularly difficult to achieve in online surveys. This is because, in most countries across the world, some parts of the general population do not use the internet (Organization for Economic Co-operation and Development, 2019).
For some people, not using the internet is a choice, for example, because they are afraid of online data protection violations or they are unwilling to acquire the necessary digital skills (van Deursen & Helsper, 2015). For other people, not using the internet is a matter of circumstance, for example, because they do not have the financial means to purchase internet equipment or they live in a remote area where an internet connection is as yet unavailable (Helsper & Reisdorf, 2016). Although the size of the so-called "offline population" is decreasing (Eurostat, 2018), it has been found to be systematically different from the rest of the general population. For example, people who are older, have low educational degrees, or who are unemployed are more likely to be offline than people who are younger, have high educational degrees, or who are employed (Callegaro et al., 2014; Helsper & Reisdorf, 2016; Ragnedda & Muschert, 2013).
Online panels apply different strategies for dealing with the offline population. Some online panels ignore that there is a part of the general population that does not use the internet. This strategy is particularly prevalent in nonprobability online panels, which rely on convenience samples of internet users (Callegaro et al., 2014). Such nonprobability online panels have repeatedly been shown to lead to invalid inferences from the survey data to the general population, especially with regard to univariate statistics (see Cornesse et al., 2020, for an overview; Baker et al., 2013; Coppock et al., 2018; Litman & Robinson, 2020; Mullinix et al., 2015, for opposing findings regarding experimental research and "fit-for-purpose" approaches). However, because they can be recruited fast and at low cost, nonprobability online panels dominate the current survey landscape (Callegaro et al., 2014).
More accurate results can be achieved using probability-based online panels, which rely on traditional probability sampling procedures (see Schaurer, 2017, for an overview of probability-based online panels). To recruit probability-based online panels, survey agencies draw a random sample of the general population from sampling frames such as population registers (e.g., Bosnjak et al., 2018) or address lists (Blom et al., 2015). In a next step, the survey agencies contact the sample units offline via the contact details available on the sampling frame (e.g., addresses or telephone numbers). During or after the initial offline contact, sample units are asked to participate in further surveys online. At this stage, some probability-based online panels implement an offline population inclusion strategy to ensure that every sample unit willing to participate in the panel has the chance to do so.
While a number of probability-based online panels strive to include the offline population, they apply different strategies to reach this goal. A common inclusion strategy is to provide internet equipment (as applied, e.g., in the American Trends Panel, L'Étude Longitudinale par Internet Pour les Sciences Sociales [ELIPSS] Panel, Ipsos' KnowledgePanel, German Internet Panel [GIP], Longitudinal Internet studies for the Social Sciences [LISS] Panel, and Understanding America Study; see CentERdata, 2020; Ipsos, 2020; Pew Research Center, 2019; Sciences Po, 2016; University of Mannheim, 2020; University of Southern California, 2017). Another common inclusion strategy is to offer an alternative survey mode. The alternative survey modes offered in practice are postal mail (as applied, e.g., in the GESIS Panel, see GESIS, 2020a) and telephone (as applied, e.g., in the AmeriSpeak Panel, NatCen Panel, and Probit Panel; see NatCen Social Research, 2020; National Opinion Research Center at the University of Chicago, 2019; Probit Inc., 2020).
Although these different approaches to offline population inclusion are used in practice, little research has been conducted on their impact on the probability-based online panels, in particular from a longitudinal perspective. In addition, no research has yet examined whether the existing approaches to offline population inclusion differ in their impact on the online panel data. Furthermore, nearly no evidence is available as to whether internet users who are reluctant to provide survey data online should be included in offline population inclusion strategies. The observational study presented in this article contributes to filling these gaps in the literature by assessing the effect of different offline population inclusion strategies on panel sample quality.

Previous Studies and Research Questions
As yet, little research has been conducted on the impact of offline population inclusion strategies in probability-based online panels. The scarce existing literature usually focuses on the panel recruitment stage. A common finding from this literature is that panel recruitment rates are significantly lower in the offline population than in the online population. For example, in the German Internet Panel (GIP) 2014 recruitment, only 18.0% of eligible offline population members could be recruited to join the panel, whereas 50.4% of eligible online population members could be recruited (Blom et al., 2017). Similar conclusions can be reached when examining the recruitment outcomes of other probability-based online panels such as the LISS Panel (Leenheer & Scherpenzeel, 2013) and ELIPSS (Revilla et al., 2016).
While these findings clearly indicate lower recruitment probabilities among members of the offline population, it remains unclear whether members of the offline population also participate in the regular panel surveys after the recruitment at a lower rate than members of the online population. On the one hand, Jessop (2017) found that, in the NatCen Panel, members of the offline population indeed participate in regular panel waves at a lower rate than members of the online population. On the other hand, Toepoel and Hendriks (2016) found that, in the LISS Panel, offline population members drop out of the panel at significantly lower rates than online population members, indicating that members of the offline population, once recruited, are the more loyal panel members.
In addition to examining panel recruitment, participation, and/or retention, some studies have also assessed the impact of offline population inclusion on the accuracy with which a freshly recruited online panel sample represents the general population regarding sociodemographic characteristics. A common finding from this literature is that offline population inclusion has a positive impact. For example, in the GIP recruitment rounds of 2012 and 2014, offline population inclusion significantly increased sample accuracy on age, education, and household size (Blom et al., 2017). In addition, recruited members of the offline population have been found to differ from recruited members of the online population on a number of substantive characteristics. For example, in the GESIS Panel, recruited offline population members differed from online population members with regard to political attitudes (Pforr & Dannwolf, 2017). Similar differences regarding political attitudes and other substantive topics were found in the LISS Panel (e.g., on health, personality, and religion; see Toepoel & Hendriks, 2016).
Although most studies found a positive impact of offline population inclusion strategies on probability-based online panels, some studies showed that not every examined sociodemographic variable was positively affected. For example, Leenheer and Scherpenzeel (2013) found that offline population inclusion increased panel accuracy on age, household size, household composition, and homeownership but not on urbanity, migration status, and voter turnout. Similarly, Bosnjak et al. (2018) found that offline population inclusion increased panel accuracy on education and household income but not on age, household size, citizenship, marital status, and place of birth. Rookey et al. (2008) even found that offline population inclusion had a negative impact on panel accuracy with regard to age and gender. Furthermore, a study on the LISS Panel showed that including the offline population did not change conclusions drawn regarding a range of substantive research questions, including family, politics, and employment (Eckman, 2016; also see Toepoel & Hendriks, 2016). This might be because not enough members of the offline population could be recruited to the panel to make a difference in substantive analyses or because members of the offline population do not differ from members of the online population with regard to the examined characteristics, thus making offline population inclusion futile regarding the respective substantive research questions.
While the studies discussed above examine the success of offline population inclusion strategies on recruiting non-internet users to a probability-based online panel, nearly no evidence is available on whether it would be beneficial to extend the inclusion strategies to internet users who are reluctant to provide survey data online. Notable exceptions are studies by Bretschi et al. (in press) and Bosnjak et al. (2018), which suggest that there is indeed a significant subgroup of the general population that generally uses the internet but is reluctant to provide survey data online.
Overall, a number of gaps in the literature on offline population inclusion strategies can be identified. First and foremost, no research has yet compared the impact of different approaches to offline population inclusion. In addition, existing research on offline population inclusion strategies largely focuses on the online panel recruitment stage rather than applying a longitudinal panel perspective. Furthermore, no recent studies have examined the impact of extending offline population inclusion strategies beyond non-internet users. We contribute to filling these research gaps by examining the impact of different inclusion strategies across the panel survey waves of the GIP and GESIS Panel. Our research questions are: (1) To what extent does including the offline population have a lasting positive impact across the survey waves of probability-based online panels? (2) Is the impact of including the offline population different when providing internet equipment than when offering an offline participation mode? (3) Is the impact of offering an alternative participation mode different when extending the alternative mode offer to reluctant internet users than when only making the offer to non-internet users?

Data
For our analyses, we use data from two probability-based online panels in Germany: the GIP (see GESIS, 2020b, for data access) and the GESIS Panel (see GESIS, 2018, for data access). Generally, the GIP and GESIS Panel have a high number of similarities but also differ from each other in some crucial aspects.

Similarities
The key similarity between the GIP and GESIS Panel for the purpose of our study is that both panels apply offline population inclusion strategies. However, the GIP and GESIS Panel also share a number of other similarities. For example, the GIP and GESIS Panel are multitopic panel studies with a social scientific focus, and they have the same target population (i.e., the general population in Germany). In addition, the GIP and GESIS Panel samples we use in our analyses were recruited during approximately the same time period: The GIP was recruited in two independent recruitment rounds, of which the first took place in 2012 and the second took place in 2014. The GESIS Panel sample was recruited in 2013. Further similarities in the recruitment of the GIP and GESIS Panel include the reliance on traditional multistage probability sampling procedures and the application of a multistep recruitment process, including face-to-face recruitment interviews with subsequent self-administered panel registration surveys. The GIP and GESIS Panel even commissioned the same fieldwork organization with largely the same pool of interviewers to conduct the face-to-face recruitment interviews. Since their recruitment, the GIP and GESIS Panel have conducted bimonthly survey waves of approximately the same length (i.e., 20-25 min).

Differences
Despite the similarities, the GIP and GESIS Panel differ from each other in two key aspects: the offline population inclusion strategy and the sampling design. For the purpose of our study, the key difference between the GIP and GESIS Panel is the offline population inclusion strategy. In the GIP, participants without an internet-enabled device and/or a sufficiently fast internet connection at their homes were provided with the necessary equipment to participate in the panel survey waves (i.e., an internet connection and an internet-enabled device). In the initial GIP recruitment of 2012, formerly offline households received a PC. In the second recruitment round of the GIP in 2014, formerly offline households received a tablet.
In the GESIS Panel, respondents to the face-to-face recruitment interview were first asked whether they generally used the internet and then whether they would be willing to participate in self-administered surveys in the future. Respondents who reported that they used the internet and were willing to participate in future surveys were then asked whether they would be willing to participate online. If they were willing to participate online, they were subsequently surveyed via the internet. If they were not willing to participate online, they were subsequently surveyed using postal mail surveys. In addition, recruitment survey respondents who reported that they did not use the internet but were willing to participate in future surveys were subsequently surveyed via postal mail, too.
In sum, the GIP and GESIS Panel offline population inclusion strategies differ in two key aspects: participation mode and treatment assignment. The participation mode differs because in the GIP, every panel member is surveyed online, whereas in the GESIS Panel, members of the offline population are surveyed via postal mail surveys. The treatment assignment differs because in the GIP, only people without sufficient internet access are assigned to the offline inclusion treatment, whereas in the GESIS Panel, people are also assigned to the treatment if they have access to the internet but do not want to use it to complete surveys.
The differences in offline population inclusion strategies between the two panels are the focus of this article. However, it should be noted that the GIP and GESIS Panel also have different sampling designs. The GIP sampling design is a three-stage area probability sampling procedure with complete listing of households in each primary sampling unit and complete listing of age-eligible individuals in each household (see Blom et al., 2015, for more information). The GESIS Panel sampling design is a two-stage register-based probability sampling procedure, where individuals are drawn from local registers held by municipalities (see Bosnjak et al., 2018, for more information).
As a consequence of these different approaches to sampling, the GIP face-to-face recruitment interviews were conducted with one (non-random) household member above the age of 16 per sampled household. This household member provided proxy information on all other household members during the interview. Subsequently, all household members of ages 16-75 were invited to become GIP members. In the GESIS Panel, face-to-face recruitment interviews were conducted with the named, prespecified individuals aged between 18 and 70 years drawn from the local population registers. Only these prespecified individuals were eligible to become GESIS Panel members.
Apart from the sampling design differences and their consequences, the GIP and GESIS Panel also differ in a number of other characteristics. For example, they apply different incentive schemes. In the GIP, participants receive a conditional €4 incentive for each survey wave they complete plus a €10 bonus if they complete all survey waves they are invited to in a year. The incentives are credited toward respondents' panel accounts and paid out twice a year as online vouchers, bank transfers, or charitable donations according to the panelists' preferences. In the GESIS Panel, all participants receive an unconditional €5 incentive with each survey wave invitation via postal mail.
Generally, the different sampling designs, incentive schemes, and other design choices of the GIP and GESIS Panel should not influence our findings with regard to the extent to which their offline population inclusion strategies have a lasting positive impact. In addition, we made the GIP and GESIS Panel more comparable by restricting the GIP sample to the GESIS Panel age range (18-70 years). We also pooled the GIP samples that were recruited in 2012 and 2014 for the analyses in this article because sampling, recruitment, and offline population inclusion strategies were essentially the same. Table 1 summarizes the recruitment outcomes across the GIP and GESIS Panel recruitment steps (see Blom et al., 2015, 2016, 2017; Bosnjak et al., 2018, for detailed discussions of the GIP and GESIS Panel recruitment processes).

Recruitment Outcomes
As described above, the GIP recruitment interviews were conducted at the household level, while the GESIS Panel recruitment interviews were conducted at the individual level. The subsequent panel registration survey and panel survey waves were conducted at the individual level in both panels. Because in our analyses, we examine individual-level data, such as respondents' survey wave participation and their sociodemographic characteristics, the starting point for our analyses is the set of people who registered for the GIP and GESIS Panel after the initial recruitment survey. In our analyses, we use the first 12 waves of survey data collection in the GIP and GESIS Panel, covering a period of 2 years of bimonthly survey waves each.

Offliner Status
Since we aim to assess the impact of offline population inclusion strategies, the key variable in our analyses is whether panel members are considered to be members of the offline population (i.e., "offliners") rather than members of the online population (i.e., "onliners"). In general, we define panel members as offliners if they receive an offline population inclusion treatment. With regard to the GIP, this means that panel members are defined as offliners if they were provided with an internet connection and/or an internet-enabled device because they did not have (sufficient) internet access at their homes prior to their panel participation. With regard to the GESIS Panel, this means that panel members are defined as offliners if they participate via postal mail surveys rather than being surveyed online for one of two reasons: (1) they did not use the internet prior to their panel recruitment or (2) they were not willing to participate in the panel via the internet although they used the internet for other purposes.
Since two groups of people are addressed by the GESIS Panel offline population inclusion strategy, we apply two definitions of the offliner status in all of our analyses. If they are surveyed via postal mail, we define GESIS Panel participants as offliners following a panel operations definition. The panel operations definition does not differentiate between the different reasons why a GESIS Panel participant might receive the treatment. An advantage of the panel operations definition is that it allows us to explore the success of the offline population inclusion strategy as applied in the GESIS Panel in practice. However, it does not provide any information on whether a potential impact of the offline population inclusion strategy is due to actual non-internet users or due to internet users who are unwilling to be surveyed online. Therefore, if people report that they never used the internet for private purposes at the time of the panel recruitment, we define them as offliners following an actual internet usage definition. Following the actual internet usage definition, people who are surveyed via postal mail although they generally use the internet for private purposes are treated as onliners.
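The two GESIS Panel offliner definitions can be made concrete with a small sketch. The class and field names below (`GesisPanelMember`, `uses_internet`, `survey_mode`) are hypothetical and chosen for illustration only; they are not variable names from the GESIS Panel data.

```python
from dataclasses import dataclass


@dataclass
class GesisPanelMember:
    # Hypothetical fields, not actual GESIS Panel variable names.
    uses_internet: bool  # reported private internet use at recruitment
    survey_mode: str     # "online" or "mail"


def is_offliner_panel_operations(member: GesisPanelMember) -> bool:
    """Panel operations definition: everyone surveyed via postal mail."""
    return member.survey_mode == "mail"


def is_offliner_actual_usage(member: GesisPanelMember) -> bool:
    """Actual internet usage definition: only reported non-internet users."""
    return not member.uses_internet


# An internet user who is unwilling to be surveyed online:
reluctant_user = GesisPanelMember(uses_internet=True, survey_mode="mail")
# A person who does not use the internet at all:
non_user = GesisPanelMember(uses_internet=False, survey_mode="mail")
```

Under this sketch, the reluctant internet user counts as an offliner on the panel operations definition but as an onliner on the actual internet usage definition, whereas the non-user is an offliner on both.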
It should be noted that in the GIP and in the GESIS Panel, people are treated as offliners or onliners based on the information gathered in the initial panel recruitment interview. During the time period we examine in this article, panel members could not switch their treatment, for example, from being an offliner to becoming an onliner, except for nine people in the GESIS Panel who switched modes for panel administrative reasons and are therefore excluded from our analyses. Table 2 shows the share of offliners across the GIP and GESIS Panel samples at the starting point of our analyses. As Table 2 shows, the GIP and GESIS Panel differ vastly with regard to how many panel members are treated with an offline population inclusion strategy (GIP: 7.8% of the sample and GESIS Panel: 37.8% of the sample). In addition, the share of actual non-internet users is much lower in the GESIS Panel than the overall share of people surveyed via postal mail (13.2% non-internet users vs. 37.8% overall assigned to postal mail).

Analytical Strategy
In our analyses, we assess the impact of including the offline population in the GIP and GESIS Panel across the first 12 panel survey waves with regard to two panel quality indicators: survey participation and sample accuracy.

Survey Participation
To assess the impact of offline population inclusion on survey participation across panel survey waves, we compute survey completion rates in line with the American Association for Public Opinion Research (AAPOR) standards (AAPOR, 2016) for each of the first 12 panel survey waves of the GIP and GESIS Panel. According to AAPOR standards, completion rates "can be computed for response to a particular survey invitation [emphasis in original] sent to eligible [online] panel members" (AAPOR, 2016, p. 48) and are given by:

$$\mathrm{COMR} = \frac{I + P}{(I + P) + (R + NC + O)}$$

where $I$ is the number of responses to a particular survey wave invitation, $P$ is the number of partial responses, $R$ is the number of refusals, $NC$ is the number of noncontacts, and $O$ is the number of other nonresponses based on the recruited panel sample. To examine differences between offliners and onliners in terms of survey participation, we distinguish between completion rates for the offliners among all registered panel members and the onliners among all registered panel members. To assess differences between offliners and onliners in terms of participation at each panel survey wave, we compute 95% confidence intervals (CI) around the survey completion rates based on standard errors.
It should be noted that the denominator of the completion rate formula refers to the sample of registered panel members and is therefore the same across all panel waves of the GIP and GESIS Panel. Even people who de-register from the panel are still counted in the denominator at each panel survey wave.
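As a minimal sketch of this computation, the following example calculates a completion rate for one group at one wave together with a 95% CI. The disposition counts are made up for illustration, and the normal-approximation (Wald) interval is an assumption; the article only states that the CIs are based on standard errors.

```python
import math


def completion_rate(i, p, r, nc, o):
    """Completion rate: (I + P) / ((I + P) + (R + NC + O))."""
    responded = i + p
    denominator = responded + r + nc + o  # all registered panel members
    if denominator == 0:
        raise ValueError("no registered panel members")
    return responded / denominator


def wald_ci(rate, n, z=1.96):
    """95% CI from the standard error of a proportion (assumed method)."""
    se = math.sqrt(rate * (1 - rate) / n)
    return max(0.0, rate - z * se), min(1.0, rate + z * se)


# Hypothetical wave dispositions for the offliners (not figures from the panels).
offliner_counts = {"i": 380, "p": 20, "r": 60, "nc": 30, "o": 10}

rate = completion_rate(**offliner_counts)
n = sum(offliner_counts.values())  # fixed denominator: registered offliners
low, high = wald_ci(rate, n)
print(f"completion rate = {rate:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

Because every registered panel member falls into exactly one disposition at each wave, the sum of the counts stays constant across waves, which mirrors the fixed denominator described above.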
To investigate whether potential differences in survey completion rates between offliners and onliners are statistically significant even when controlling for potential moderators (e.g., sociodemographics), we also fit random-effects regression models for structured longitudinal data (see Plewis, 2009). The dependent variable in these models is a repeated binary measure of survey participation across the 12 panel survey waves (0 = non-participation, 1 = participation). The repeated measures of survey participation are nested within panel members with one measurement for each panel member at each panel survey wave. We fit logistic random-effects regression models using offliner status as an independent variable for each of the two panels. For the GESIS Panel, we also fit separate models for each of the two offliner status definitions described above (panel operations definition and actual usage definition).
To examine whether the effect of including the offline population in the GIP and GESIS Panel is stable over time, we also include dummy variables identifying the panel wave (from Wave 1 to Wave 12) as independent variables in the models. In addition, we include terms for the interaction between the survey wave and the offliner status. These interaction terms indicate whether participation decreases or increases more steeply for the offliners compared to the onliners across panel survey waves. Furthermore, we include control variables in our models (sociodemographic characteristics: gender, age, education, household size, and citizenship) as well as indicators of whether participants were invited to so-called "mock survey waves" before the panels had finished the recruitment of their participant samples to pass time before the first regular panel survey wave. For the GIP, we also included a binary indicator of whether panel participants were recruited in 2012 or 2014.
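One way to write down a model of this kind is sketched below. The notation is ours rather than the article's, and two details are assumptions: Wave 1 serves as the reference category for the wave dummies, and the panel member effect is a normally distributed random intercept.

```latex
\operatorname{logit} \Pr(y_{iw} = 1)
  = \beta_0
  + \beta_1 \,\mathrm{Off}_i
  + \sum_{v=2}^{12} \gamma_v \, D_{vw}
  + \sum_{v=2}^{12} \delta_v \, \bigl( D_{vw} \times \mathrm{Off}_i \bigr)
  + \mathbf{x}_i^{\top} \boldsymbol{\theta}
  + u_i,
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2)
```

Here $y_{iw}$ is the participation of panel member $i$ at wave $w$, $\mathrm{Off}_i$ is the offliner indicator, $D_{vw}$ equals 1 if $w = v$ and 0 otherwise, $\mathbf{x}_i$ collects the control variables, and $u_i$ captures the nesting of waves within panel members. In this formulation, a negative $\delta_v$ would indicate that participation at wave $v$ has fallen further below its Wave 1 level among offliners than among onliners.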

Sample Accuracy
To assess the impact of offline population inclusion on sample accuracy, we calculated the average absolute relative bias (AARB; see Groves, 2006) at each of the first 12 panel survey waves in the GIP and GESIS Panel. As a reference survey, we used the German Microcensus of the year 2013 (https://www.gesis.org/gml/mikrozensus/). The German Microcensus is a mandatory survey of 1% of the German population conducted each year by the German Federal Statistics Office (Forschungsdatenzentren der Statistischen Ämter des Bundes und der Länder, 2020).
We calculate the AARB as the deviation of a set of sociodemographic variables (gender, age, education, citizenship, and household size) from the German Microcensus with

$$\mathrm{AARB}_s = \frac{1}{K} \sum_{k=1}^{K} \left| \frac{y_{sk} - y_{bk}}{y_{bk}} \right|$$

where $y_{sk}$ is the value for each category $k$ for sample $s$, $y_{bk}$ is the corresponding value of the benchmark statistic $b$, and $K$ is the total number of categories. We chose the AARB as a measure of accuracy because it aggregates the bias assessments across a set of variables, thereby providing an overview bias statistic. Furthermore, the AARB has the desirable property of being proportional to the size of the benchmark statistic. The proportionality takes into account that a small percentage point deviation of a survey estimate from a small benchmark value should have a higher impact on the bias measure than small percentage point deviations of a survey estimate from a high benchmark value. Regarding the interpretation of the AARB, it should be noted that low AARBs indicate high sample accuracy and high AARBs indicate low sample accuracy.
To assess the impact of including the offline population, we calculate AARBs for the GIP and GESIS Panel full samples (i.e., including both offliners and onliners) and the onliner-only samples (i.e., excluding the offliners) at each of the first 12 panel survey waves. For each AARB, we calculate 95% CIs based on bootstrapped standard errors.
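The comparison of full and onliner-only samples can be illustrated with a small, self-contained sketch. All data below are invented: the benchmark shares are not Microcensus figures, the two variables are simplified to two categories each, and the bootstrap settings (500 resamples, normal-approximation interval) are assumptions rather than the settings used in the article.

```python
import math
import random


def category_shares(respondents, categories):
    """Share of respondents falling into each (variable, category) pair."""
    n = len(respondents)
    return {
        (var, cat): sum(1 for r in respondents if r[var] == cat) / n
        for var, cat in categories
    }


def aarb(sample_shares, benchmark_shares):
    """Average absolute relative bias over all K benchmark categories."""
    k = len(benchmark_shares)
    return sum(
        abs(sample_shares[key] - bench) / bench
        for key, bench in benchmark_shares.items()
    ) / k


def aarb_with_ci(respondents, benchmark_shares, n_boot=500, seed=1):
    """Point estimate and 95% CI based on a bootstrapped standard error."""
    rng = random.Random(seed)
    categories = list(benchmark_shares)
    point = aarb(category_shares(respondents, categories), benchmark_shares)
    draws = []
    for _ in range(n_boot):
        resample = rng.choices(respondents, k=len(respondents))
        draws.append(aarb(category_shares(resample, categories), benchmark_shares))
    mean = sum(draws) / n_boot
    se = math.sqrt(sum((d - mean) ** 2 for d in draws) / (n_boot - 1))
    return point, point - 1.96 * se, point + 1.96 * se


def person(gender, age):
    return {"gender": gender, "age": age}


# Hypothetical benchmark shares (NOT Microcensus values).
BENCHMARK = {
    ("gender", "female"): 0.5, ("gender", "male"): 0.5,
    ("age", "under_40"): 0.4, ("age", "40_plus"): 0.6,
}

# Hypothetical wave respondents: the offliners are all older panel members.
onliners = (
    [person("female", "under_40")] * 30 + [person("female", "40_plus")] * 20
    + [person("male", "under_40")] * 20 + [person("male", "40_plus")] * 10
)
offliners = [person("female", "40_plus")] * 10 + [person("male", "40_plus")] * 10
full_sample = onliners + offliners

full_point, full_low, full_high = aarb_with_ci(full_sample, BENCHMARK)
onliner_point, onliner_low, onliner_high = aarb_with_ci(onliners, BENCHMARK)
```

In this invented example, adding the offliners moves the age distribution closer to the benchmark, so the full-sample AARB is lower than the onliner-only AARB.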
In addition to assessing sample accuracy at an aggregated level using AARBs, we also examine relative biases on each of the sociodemographic characteristics used in the calculation of the AARB averaged across panel survey waves with

$$\overline{\mathrm{RB}}_{sk} = \frac{1}{W} \sum_{w=1}^{W} \frac{y_{skw} - y_{bk}}{y_{bk}}$$

where $y_{skw}$ is the value for each category $k$ for sample $s$ at each survey wave $w$, $y_{bk}$ is the corresponding value of the benchmark statistic $b$, and $W$ is the total number of all panel survey waves. We compute the relative biases for the GIP and GESIS Panel full samples and onliner-only samples at each of the first 12 survey waves. We then average the relative biases of each sociodemographic characteristic across the panel survey waves and calculate bootstrapped standard errors.

Results
In the following, we present our results regarding the impact of offline population inclusion in the GIP, which provides its offliners with the necessary equipment to participate online, and the GESIS Panel, which provides its offliners with a mail-mode alternative, on survey participation and sample accuracy across panel survey waves. Figure 1 shows the development of survey completion rates across panel waves in the GIP and GESIS Panel full samples and for the offliners and onliners separately. Generally, we find that survey completion rates decrease across panel survey waves (GIP: from 93.7% in Wave 1 to 60.2% in Wave 12; GESIS Panel: from 87.0% in Wave 1 to 71.5% in Wave 12). While in the GIP, survey completion rates are essentially the same among offliners and onliners, survey completion rates in the GESIS Panel are significantly lower among offliners than onliners.

Survey Participation
As can be seen in Figure 1, there is no significant difference in survey completion rates between offliners and onliners in the GIP (with the exception of Wave 2; see left pane of Figure 1). In the GESIS Panel, however, offliners have significantly lower survey completion rates than onliners from the start (at Wave 1: 81.9% among offliners vs. 90.1% among onliners following the panel operations definition of the offliner status; see middle pane of Figure 1). The difference in survey completion rates is slightly less pronounced when examining the actual internet usage definition (completion rates at Wave 1: 81.3% among offliners vs. 87.9% among onliners). This suggests that it is both the actual non-internet users and, in particular, the unwilling onliners who participate in the GESIS Panel surveys at lower rates than the actual onliners.
In addition to being lower from the start, GESIS Panel survey completion rates among offliners consistently continue to be significantly lower than among onliners on both definitions across the examined panel survey waves (survey completion rates at Wave 12: 62.1% among offliners vs. 77.0% among onliners following the panel operations definition and 63.5% among offliners vs. 72.8% among onliners following the actual internet usage definition). The evidence from the longitudinal regression models of participation (Table A1 in the Online Appendix) confirms the descriptive results and provides additional insights: Controlling for other potentially relevant differences, offliners are equally likely to participate across panel survey waves in the GIP, while offliners are less likely to participate across panel survey waves in the GESIS Panel on both offliner definitions. In addition, participation in the GIP decreases essentially at the same rate across panel waves among offliners and onliners with the exception of Wave 2, at which participation among offliners has decreased significantly more steeply than participation among onliners.
In the GESIS Panel, following the panel operations definition, participation decreases at essentially the same rate among offliners and onliners until the turn of Waves 5-6, where participation decreases significantly more steeply among the offliners than the onliners, leading to a wider gap in participation between offliners and onliners from then on. Following the actual internet usage definition, the steeper decrease in participation at the turn of Waves 5-6 is not statistically significant, indicating that the widening of the gap in participation between GESIS Panel offliners and onliners is mostly attributable to the internet users who are unwilling to participate in panel surveys online.

Sample Accuracy
Figure 2 shows the development of sample accuracy (as measured using AARBs) across panel survey waves in the GIP and GESIS Panel full samples as well as the GIP and GESIS Panel samples excluding the offliners (i.e., the onliner-only samples; see also Table A2 in the Online Appendix). Generally, we find that the bias increases across panel survey waves in both panels (GIP: AARB increases from 12.7% in Wave 1 to 18.6% in Wave 12; GESIS Panel: AARB increases from 11.9% in Wave 1 to 17.1% in Wave 12). Including the offline population generally has a positive impact on sample accuracy, meaning that it reduces the bias in the data.
In the GIP, including the offline population has a significant positive impact on sample accuracy at Waves 3-11 (e.g., AARBs at Wave 6: 15.4% in the full sample and 16.7% in the onliner-only sample). However, the positive impact is not statistically significant at Waves 1, 2, and 12.
One reason why the positive impact of offline population inclusion on sample accuracy in the GIP is comparatively small might be that offliners have much lower recruitment probabilities than onliners (18.0% vs. 50.4%; see Blom et al., 2017). We therefore also calculate propensity-weighted AARBs for the GIP full sample and onliner-only sample, which account for the different recruitment probabilities of offliners and onliners in the GIP. The results from the weighted analyses show that, indeed, the positive impact of offline population inclusion on sample accuracy is statistically significant at all panel survey waves when weighting by the inverse recruitment probability of offliners and onliners (see Figure A1 in the Online Appendix).
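The weighting logic described above can be illustrated with a minimal sketch. It assumes that each respondent simply receives the inverse of the recruitment probability of his or her group (18.0% for offliners, 50.4% for onliners, as reported in the text); the actual propensity weights used for the GIP may be constructed differently. The toy sample, the variable names, and the helper functions are hypothetical and serve illustration only.

```python
# Recruitment probabilities reported in the text (see Blom et al., 2017).
RECRUITMENT_PROB = {"offliner": 0.180, "onliner": 0.504}


def inverse_probability_weights(groups):
    """Return one weight per respondent: 1 / recruitment probability of the group.

    Assumption: weights depend only on offliner/onliner status.
    """
    return [1.0 / RECRUITMENT_PROB[g] for g in groups]


def weighted_share(indicator, weights):
    """Weighted proportion of respondents for whom the indicator is True."""
    total = sum(weights)
    return sum(w for flag, w in zip(indicator, weights) if flag) / total


# Hypothetical toy sample (not real panel data): 2 offliners, 8 onliners.
groups = ["offliner"] * 2 + ["onliner"] * 8
# Hypothetical indicator: both offliners and one onliner have low education.
low_education = [True, True, True] + [False] * 7

weights = inverse_probability_weights(groups)
unweighted = sum(low_education) / len(low_education)
weighted = weighted_share(low_education, weights)

print(f"unweighted share with low education: {unweighted:.3f}")
print(f"weighted share with low education:   {weighted:.3f}")
```

Because offliners are recruited at a much lower rate, each recruited offliner receives a larger weight than each onliner, so characteristics that are concentrated among offliners gain influence in the weighted estimates.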
In the GESIS Panel, including the offline population has a significantly positive impact from the start (AARBs at the first panel survey wave: 11.9% in the full sample compared to 16.4% in the onliner-only sample following the panel operations definition and 13.9% following the actual usage definition). In addition, including the offline population continues to have a significantly positive impact until Wave 12 (AARBs at Wave 12: 17.1% in the full sample, 22.3% in the onliner-only sample following the panel operations definition, and 19.3% in the onliner-only sample following the actual usage definition). Moreover, the comparison between offliner status definitions shows that both the actual non-internet users and the internet users who are unwilling to participate in panel surveys online have a significantly positive impact on the GESIS Panel sample accuracy.
A deeper understanding of sample accuracy can be reached by examining the relative bias of each socio-demographic characteristic used in the calculation of the AARB. These relative biases provide information on which characteristics are responsible for the increase in sample accuracy caused by offline population inclusion. Table 3 shows the results for the relative biases on gender, age, education, household size, and citizenship averaged across the first 12 panel survey waves (see also Table A3 in the Online Appendix for descriptive comparisons between the samples and the German Microcensus).
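The relation between the relative biases and the AARB can be sketched as follows. The sketch assumes the common definition in which the relative bias of a category is the percentage deviation of the sample share from the benchmark share, and the AARB is the mean of the absolute relative biases across categories; the exact specification used in this article is given in its methods section. All numbers and category names below are hypothetical and are not taken from the GIP, the GESIS Panel, or the German Microcensus.

```python
def relative_bias(sample_pct, benchmark_pct):
    """Signed relative bias in percent: negative values mean underrepresentation."""
    return 100.0 * (sample_pct - benchmark_pct) / benchmark_pct


def aarb(sample, benchmark):
    """Average absolute relative bias across all benchmark categories."""
    biases = [abs(relative_bias(sample[k], benchmark[k])) for k in benchmark]
    return sum(biases) / len(biases)


# Hypothetical category shares in percent (illustration only).
benchmark = {"low_edu": 40.0, "mid_edu": 30.0, "high_edu": 30.0}
onliner_only = {"low_edu": 28.0, "mid_edu": 33.0, "high_edu": 39.0}
full_sample = {"low_edu": 32.0, "mid_edu": 33.0, "high_edu": 35.0}

for name, sample in [("onliner-only", onliner_only), ("full sample", full_sample)]:
    per_category = {k: round(relative_bias(sample[k], benchmark[k]), 1) for k in benchmark}
    print(name, per_category, f"AARB = {aarb(sample, benchmark):.1f}%")
```

In this toy example, adding offliners moves the share of people with low education closer to the benchmark, which lowers both the category-specific relative bias and the AARB.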
The results from the relative bias analyses show that in the GIP and in the GESIS Panel, the main reason why offline population inclusion has a positive impact on the samples is that it reduces the underrepresentation of people with low education and the overrepresentation of people with high education. Furthermore, including the offline population in both panels significantly reduces the underrepresentation of people who live alone.
Overall, the impact of including the offline population is greater in the GESIS Panel than in the GIP. For example, while in both the GIP and GESIS Panel, the underrepresentation of people with low education is significantly reduced by including the offline population, the underrepresentation is reduced to a greater extent in the GESIS Panel. However, the greater impact of including the offline population in the GESIS Panel also leads to some bias increases that are greater than in the GIP. For example, while offline population inclusion increases the underrepresentation of people aged 31-40 years in both panels, the underrepresentation is increased to a greater extent in the GESIS Panel than in the GIP (GIP: from −6.5% [onliner-only] [...]).
Regarding our sample accuracy analyses, it should be noted that they do not offer any answers to the question of whether the bias in the GIP and GESIS Panel is generally high or low. No universally acknowledged benchmark is available that we could compare the AARBs in the GIP and GESIS Panel to. The interpretation of our results is, therefore, limited to comparing the GIP and GESIS Panel to each other, in particular with regard to whether or not including the offline population has a positive impact on the two panels' accuracy.
Furthermore, while our results show that sample accuracy decreases in both panels over time, as indicated by the increase in AARBs across panel waves, our analyses cannot explain why this happens. It is likely due to systematic attrition of certain population subgroups, especially people with low education, over time. However, this needs to be explored in more detail in future research.

Summary and Discussion
In this article, we examined the impact of two approaches to offline population inclusion in probability-based online panels: providing members of the offline population with the necessary equipment to participate in surveys online and offering postal mail surveys as an alternative survey participation mode. In our analyses, we focused on determining to what extent including the offline population had a lasting impact across the first 12 survey waves of two probability-based online panels in Germany: the GIP, which provides internet equipment, and the GESIS Panel, which offers postal mail surveys.
We found that, even though recruitment and/or panel wave participation probabilities were lower among members of the offline population than among members of the online population, including the offline population had a positive long-term effect on panel sample accuracy in both panels. This improvement in sample accuracy across the two panels was largely driven by the success of the offline inclusion strategies in reducing the underrepresentation of people with low education.
Regarding differences between the examined approaches to offline population inclusion, the findings from this article suggest that both approaches have some advantages and some disadvantages. An advantage of providing the offline population with internet equipment is that offliners and onliners participate in the same survey mode, which eliminates potential mode effects. In addition, providing internet equipment also has the advantage that all panel members can receive technologically elaborate treatments including video and audio features as well as real-time experimental splits and extensive filter conditions. However, a disadvantage of this approach is that the hurdle of agreeing to receive the internet equipment seems to be high, leading to low recruitment probabilities of offline population members, which, consequently, leads to a comparatively small impact of offline population inclusion on panel sample accuracy.
An advantage of offering an alternative panel participation mode is that it has a comparatively large positive impact on panel sample accuracy, in particular with regard to reducing the bias in education. This is likely due to the fact that the hurdle of agreeing to receive postal mail survey questionnaires is relatively low. A related advantage is that the alternative offline mode can also be offered to people who generally use the internet but are unwilling to use it to participate in online surveys, which increases sample accuracy even more than offering the alternative mode to non-internet users only. However, a disadvantage of this approach is that offering an alternative mode might introduce mode effects into the data. In addition, offering an alternative survey mode to the offline population leads to differences in treatment between the panel members. For example, due to administrative reasons and the higher costs of the mail mode compared to the online surveys, the GESIS Panel sends no reminder letters to the panel members who are surveyed via postal mail, while it sends an email reminder to the people who are surveyed online. This difference in treatment likely contributes to the lower participation rates among members of the offline population that we found in the GESIS Panel data.
Generally, from the results of our study, we conclude that offline population inclusion strategies can lead to increased data quality in probability-based online panels. However, the substantial costs of the offline inclusion strategies also need to be weighed when deciding whether or not to implement an offline inclusion strategy in a probability-based online panel. For studying some phenomena, especially those related to education and/or digital affinity, including the offline population is certainly more important than for studying other topics. In addition, to reduce the costs of offline population inclusion strategies and to increase their gains, more research into best practices is necessary, especially with regard to the question of how to apply offline population inclusion strategies in an increasingly digital society. Regarding best practices, experimental research is needed to examine how the different approaches to offline population inclusion perform relative to each other. In particular, future research should aim to apply a total survey error perspective to offline population inclusion that, for example, factors in potential mode effects on measurements caused by mixed-mode strategies. Other aspects that should be examined in future research include comparing the costs of different offline population inclusion strategies as well as their logistical complexity and fieldwork timelines.
Regarding the necessity to adapt offline population inclusion strategies to our changing digital societies, conceptual research into how to define the offline population is imperative. For example, it might no longer be sufficient to differentiate between non-internet users and internet users in a binary way; instead, it might be necessary to treat internet usage as a continuous or multidimensional characteristic that also takes into account how often people use the internet, what people use the internet for, which devices they use to connect to the internet, and how confident they are in their digital skills (see, e.g., Couper et al., 2018; Herzing & Blom, 2019). Such conceptual considerations should particularly factor in the potential influence that the increase in smartphone usage might have in the context of online panels (see Weiß et al., 2019). For example, offering people smartphones and mobile internet might be a way to further reduce the underrepresentation of people with low education, who seem to adopt this technology particularly well (see, e.g., Antoun, 2015). Overall, for probability-based online panels to remain inclusive in our (digitally) changing societies, the panels will need to review and adapt continuously to the state of digitalization.