Measuring Smartphone Use: Survey Versus Digital Behavioral Data

While digital technology use and skills have typically been measured with surveys, digital behavioral data that are passively collected from individuals' digital devices have recently emerged as an alternative method of measuring technology usage patterns in a more unobtrusive and detailed way. In this paper, we evaluate how passively collected smartphone usage data compare to self-reported measures of smartphone use, considering the three usage dimensions amount of use, variety of use, and activities of use. Based on a sample of smartphone users in Germany who completed a survey and had a tracking app installed on their smartphone, we find that the alignment between the survey and digital behavioral data varies by dimension of smartphone use. Whereas amount of use is considerably overreported in the survey data, variety of use aligns more closely across the two data sources. For activities of use, the alignment differs by type of activity. The results also show that the alignment between survey and digital behavioral data is systematically related to individuals' sociodemographic characteristics, including age, gender, and educational attainment. Finally, latent class analyses conducted separately for the survey and digital behavioral data suggest similar typologies of smartphone use, although the overlap between the typologies on the individual level is rather small.


Introduction
Over the last decade, smartphones have become one of the most frequently used devices for accessing the Internet (Rice et al., 2023), but inequalities in the use of smartphone technologies and in smartphone-related skills persist in the general population (Wenz & Keusch, 2023). In research about the digital divide, the vast majority of studies have relied on surveys for measuring digital technology use and skills (e.g., Büchi et al., 2016; Hargittai, 2005; van Deursen et al., 2016). Surveys are a relatively cost-effective approach for collecting data on population-representative samples and are still the predominant method of data collection in the social sciences (Sturgis & Luff, 2021). However, surveys also have limitations with regard to representation and measurement. For example, self-reports of behaviors, such as digital technology use, are prone to measurement error due to recall error, social desirability, and other prevalent errors in self-reports, potentially leading to biased results (Tourangeau et al., 2000).
Digital behavioral data (DBD) that are passively collected from individuals' digital devices have recently emerged as an alternative method for collecting data about attitudes and behaviors (Keusch & Kreuter, 2022). On smartphones, for example, the activities carried out on the device, including voice calls, text messages, and app usage, are continuously captured through smartphone usage logs by the operating system (Harari et al., 2016). Compared to surveys, DBD allow researchers to measure smartphone usage patterns not only unobtrusively, that is, without relying on respondents' self-reports, but also in a more detailed way and over a longer period of time.
In this paper, we evaluate how smartphone usage data that are passively collected from smartphones compare to self-reported measures of smartphone use based on a sample of smartphone users in Germany. First, we examine to what extent the DBD- and survey-based usage measures align with each other and how the alignment varies by sociodemographic characteristics. Second, we use latent class analysis to identify smartphone usage types separately for the DBD- and survey-based measures and compare the results across the two methods of data collection.

Background
In digital divide research, surveys have been the most frequently employed approach for measuring Internet and digital technology use. Three dimensions of use have typically been distinguished (Blank & Groselj, 2014): amount of use, variety of use, and activities of use. While amount of use has been measured with survey questions about the frequency of going online (e.g., number of hours spent online per day), the frequency of engaging in different online activities (e.g., number of hours spent on social media platforms per day), or total years of Internet use (e.g., van Deursen & van Dijk, 2014), variety of use has typically been measured as the number of different online activities that individuals report (e.g., Reisdorf & Groselj, 2017). Activities of use, in turn, have been measured with survey questions about which types of online activities individuals engage in, such as information seeking, communication, or entertainment (e.g., Zillien & Hargittai, 2009).
To conceptualize and classify the large variation in Internet and digital technology use, usage typologies have been constructed, typically based on latent class analysis or other types of cluster analyses (Blank & Groselj, 2014). In a study of German smartphone owners, for example, Wenz and Keusch (2023) identified six distinct types of users who vary strongly in their self-reported frequency of smartphone use, self-rated smartphone skills, and activities carried out on their smartphone: (1) advanced users (who use their smartphone frequently, for all examined activities, and have advanced skills), (2) broad non-social media users (who use their smartphone frequently, for a large variety of activities except for social media, and have intermediate or advanced skills), (3) broad non-commercial users (who use their smartphone frequently, for a large variety of activities except for online banking and purchases, and have beginner-level, intermediate, or advanced skills), (4) social media and information users (who use their smartphone frequently, mainly for browsing websites, reading or writing emails, and using social media, and have beginner-level skills), (5) basic general users (who use their smartphone frequently, for taking photos, browsing websites, using GPS/location-aware apps, and reading or writing emails, and have beginner-level or intermediate skills), and (6) camera users (who use their smartphone less frequently, mainly for taking photos, and have beginner-level skills). The smartphone usage types also differed significantly by sociodemographic characteristics, with the types reflecting more frequent and diverse usage patterns being younger and having higher levels of educational attainment.
More recently, DBD that are collected as a by-product of individuals' everyday interactions with digital devices have emerged as an alternative approach for collecting data about Internet and digital technology use. While DBD have so far received little attention in digital divide research, they are increasingly used in related disciplines, such as psychology and communication research, for measuring Internet and smartphone use (e.g., Bach & Wenz, 2020; Ellis, 2019; Festic et al., 2021; Harari et al., 2020; Schoedel et al., 2020), social media use (e.g., Guess et al., 2019; Haenschen, 2020; Mahalingham et al., 2023), or news consumption (e.g., Barthel et al., 2020; Möller et al., 2020; Stier et al., 2022). Various techniques can be applied for collecting individual-level DBD on digital technology use (Ohme et al., 2023). Many studies follow a tracking approach in which participants are asked to download a research app on their mobile devices that passively captures prospective data about the apps used and the websites visited on the device (e.g., Araujo et al., 2017; Festic et al., 2021). Recent studies have also suggested a data donation approach in which participants are requested to donate their retrospective app usage data by manually recording usage reports that appear on the smartphone, such as the iOS Screen Time feature (e.g., Baumgartner et al., 2023; Jones-Jang et al., 2020; Ohme et al., 2021).
A small number of typologies of Internet and digital media use have been constructed based on DBD, mostly in the area of marketing research (Chen et al., 2019; Hamka et al., 2014; Lee et al., 2018). For example, Hamka et al. (2014) conducted a latent class analysis based on log files from 129 smartphone users who installed a tracking app on their device for at least two weeks. They identified six types of smartphone users who vary in the number of URLs visited, the number of apps used, and the number of new apps installed: (1) application ignorant users (who visit a small number of URLs in the mobile browser and use a small number of apps per day), (2) basic application users (who visit a small number of URLs but use a medium number of apps), (3) average application users (who visit a medium number of URLs and use a medium number of apps), (4) information seekers (who visit a larger number of URLs but use a small number of apps), (5) app savvy users (who visit a large number of URLs, use an extensive number of apps, and install a large number of new apps), and (6) high utility users (who visit an extensive number of URLs, use an extensive number of apps, but install a small number of new apps).
A growing body of research has examined to what extent DBD- and survey-based measures of Internet and digital technology use align with each other, mostly focusing on amount of use and treating DBD as the gold standard. Generally, the correlations between self-reported and tracked measures of digital technology use were found to be small to moderate (e.g., Andrews et al., 2015; Araujo et al., 2017; Boase & Ling, 2013; Deng et al., 2019; Ellis et al., 2019; Parry et al., 2021; Revilla et al., 2017; Scharkow, 2016). However, the findings about the direction of the difference, that is, whether individuals over- or underreport their digital technology use in surveys compared to their tracked behavior, are rather mixed.
In this paper, we study how DBD compare to survey data for measuring smartphone use and expand upon existing research by examining the three usage dimensions that are typically distinguished in digital divide research: amount of use, variety of use, and activities of use. We refrain from treating DBD as the gold standard since these data were also shown to have errors (Bosch & Revilla, 2022) and instead investigate the level of alignment between the survey and the DBD, relying on three indicators: absolute error (any differences in the measured behavior between the survey and the DBD), underreporting (behavior is observed in the DBD but not indicated in the survey), and overreporting (behavior is indicated in the survey but not observed in the DBD). In addition, we investigate whether DBD and survey data collected from the same individuals lead to different user typologies.

Sample
The DBD and survey data were collected as part of the project "Political Identities and News Consumption in Election Times" (PINCET; Bach et al., 2023). Members from an opt-in online panel in Germany provided by Respondi/Bilendi were invited through a survey-router system to participate in several waves of a web survey between August 30 and December 16, 2021. Panel members were eligible for the study if they were aged 18 years and older, lived in Germany, and were eligible to vote in the 2021 German federal election. Quotas for gender, age, and state were employed to generate a sample that aligned in these variables with the German general population.
To collect DBD from their devices, all panel members in this study had agreed to install a browser plug-in on their personal computers and/or download a research app on their mobile devices. The tracking technology, developed by Wakoopa, was provided by the online panel, and all participants had the technology already installed prior to their participation in this study. Each time a panel member visited a website, the browser plug-in captured the URL of the website, the domain, and the date, time, and duration of the visit. On mobile devices, the research app captured similar information about website visits from the device's native browser, that is, Chrome on Android devices and Safari on Apple devices. In addition, the research app captured the names of the apps that panel members used on their smartphone, including the date, time, and duration of any instance of use. Of all panel members who were invited to the DBD collection by the panel provider, approximately 30% allowed data collection on at least one device. In addition to incentives for survey participation, panel members who provided DBD received an incentive of €1 per month for data collection on a personal computer and €2 per month for data collection on a mobile device. If no longer interested in participating in the DBD collection, panel members could opt out of the data collection or pause it temporarily at any time. In accordance with the EU General Data Protection Regulation (GDPR), panel members could also ask the sample provider to delete all their records. All DBD and survey data were provided to us in pseudonymized and deidentified form.
In this paper, we use data from N = 1204 smartphone users who completed the wave 1 survey of the PINCET project between August 30 and September 7, 2021, and had downloaded the research app on their smartphone. Participants have a median age of 45.5 years; 48.5% are female or diverse and 51.5% are male; and 23.9% have a college degree, 22.5% have a high school degree, and 53.6% have no high school degree. For each participant, we use the app tracking data that were collected by the panel provider prior to their wave 1 survey interview between July 15 and September 7, 2021. During this period, participants provided DBD for a median of 47 days, with a range of 1-55 days. A total of 1153 participants (95.8%) installed the app on one smartphone and 51 participants (4.2%) on two smartphones, usually on one personal and one work-related device. The research app was available for Android and iOS operating systems, but the large majority of tracked devices are Android smartphones (88.9%; n = 1116) as opposed to iPhones (11.1%; n = 139).

Measures
For both the DBD and survey data, we create indicators of the amount, the variety, and the activities of smartphone use, reflecting the three dimensions of digital technology use that are typically distinguished in digital divide research (Blank & Groselj, 2014). To measure amount of use in the survey, participants were asked to self-report the time spent using the smartphone on an ordinary day in an open answer box (in hours and minutes). To measure activities of use, participants were asked to indicate the types of activities carried out on their smartphone in a check-all-that-apply question. The activities include (1) making and receiving phone calls, (2) using messenger services, (3) visiting websites, (4) sending and/or reading emails, (5) taking photos, (6) using social media, (7) shopping, (8) online banking, (9) using location-based apps, (10) playing games, (11) listening to music or watching videos, and (12) health and/or fitness tracking. In a separate question, they were also asked to self-report the frequency of (13) reading, listening to, or watching the news on their smartphone. For each of the 13 activities, we create variables indicating whether the participants engage in the respective activities (coded as yes, no). Finally, to measure variety of use, the number of different activities carried out on the smartphone was summed up for each participant, ranging from 1 to 13. The original and translated questions are shown in Table S1 in the Online Appendix.
For the DBD, we create measures of smartphone use that mirror the survey-based measures. To measure amount of use, the time spent on all apps across all tracked smartphones during the data collection period was aggregated for each participant. The aggregated time was then divided by the number of days for which the participants' devices were tracked to create a measure on the day level. To measure activities of use, 13 variables were created indicating whether the participants engage in the smartphone activities that were also measured in the survey at least once during the data collection period (coded as yes, no). For the classification of apps into types of activities, the original categories provided by the app stores were used as the starting point and were refined through manual coding. Finally, to measure variety of use, the number of different activities carried out on the smartphone was summed up for each participant, ranging from 0 to 13.
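The construction of the DBD-based measures can be illustrated with a minimal Python sketch. It is not the authors' code (the analysis was conducted in R); the record layout, the category labels, the `tracked_days` input, and the function name `summarize_dbd` are hypothetical and serve only to show the aggregation logic described above.

```python
from collections import defaultdict

# Hypothetical app-session records: (participant_id, activity_category, duration_minutes).
# In practice, each app would first be mapped to one of the 13 activity categories.
sessions = [
    ("p1", "messenger", 30.0),
    ("p1", "social_media", 45.0),
    ("p1", "messenger", 15.0),
    ("p2", "games", 60.0),
]

# Hypothetical number of days on which each participant's devices were tracked.
tracked_days = {"p1": 2, "p2": 1}


def summarize_dbd(sessions, tracked_days):
    """Return amount (minutes per day), activities, and variety per participant."""
    total_minutes = defaultdict(float)
    activities = defaultdict(set)
    for pid, category, minutes in sessions:
        total_minutes[pid] += minutes   # time summed over all apps and devices
        activities[pid].add(category)   # activity observed at least once
    return {
        pid: {
            "amount_per_day": total_minutes[pid] / tracked_days[pid],
            "activities": sorted(activities[pid]),
            "variety": len(activities[pid]),  # number of distinct activities
        }
        for pid in total_minutes
    }


summary = summarize_dbd(sessions, tracked_days)
```

Participants who were tracked but never used an app would not appear in `sessions`; covering them (variety of 0) would require iterating over `tracked_days` instead, which the sketch omits for brevity.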
Descriptive statistics for the survey- and DBD-based measures are shown in Table 1. On the aggregate level, amount of use is considerably higher in the survey data (mean: 229.1 min; median: 180.0 min) than in the DBD (mean: 112.1 min; median: 84.8 min), whereas variety of use is almost completely aligned between the survey data (mean: 9.1 activities; median: 10.0 activities) and the DBD (mean: 8.4 activities; median: 9.0 activities). For activities of use, the level of alignment varies by type of activity. The majority of activities are closely aligned between the two data sources, including messenger services, emails, social media, online banking, GPS, and games. Other activities are either reported by substantially more respondents in the survey than observed in the DBD (phone calls, browsing websites, photos, and news) or reported by fewer respondents (shopping, music or videos, and health or fitness).
As correlates of smartphone use, sociodemographic variables were collected in the survey, including age (in years), gender (male vs. female, diverse), and educational attainment (no high school degree, high school degree, college degree).

Analysis Strategy
To compare the alignment of DBD- and survey-based measures of smartphone use on the individual level, we create three types of variables following Araujo et al. (2017): absolute error, underreporting, and overreporting.
Underreporting. Participants are considered to underreport a behavior if the behavior is observed in the DBD but not indicated in the survey. For the continuous variables amount of use and variety of use, we subtract the survey-based values from the DBD-based values for each participant. Instances in which participants report correctly or overreport their behavior in the survey are set to 0 for that variable. For the dichotomous variables reflecting activities of use, we code whether the behavior is observed in the DBD but not indicated in the survey (1 = underreporting and 0 = no underreporting).
Overreporting. Participants are considered to overreport a behavior if the behavior is indicated in the survey but not observed in the DBD. For the continuous variables amount of use and variety of use, we subtract the DBD-based values from the survey-based values for each participant. Instances in which participants report correctly or underreport their behavior in the survey are set to 0 for that variable. For the dichotomous variables reflecting activities of use, we code whether the behavior is indicated in the survey but not observed in the DBD (1 = overreporting and 0 = no overreporting).
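The coding of these indicators can be sketched in a few lines of Python. The sketch is illustrative rather than the authors' implementation: the function names are hypothetical, and it assumes that absolute error is the absolute difference between the survey and DBD values (for dichotomous variables, any mismatch), which the text describes only as "any differences in the measured behavior."

```python
def alignment_continuous(survey, dbd):
    """Indicators for continuous measures (amount of use, variety of use)."""
    diff = survey - dbd
    return {
        "absolute_error": abs(diff),        # assumed definition: |survey - DBD|
        "overreporting": max(diff, 0.0),    # survey minus DBD, floored at 0
        "underreporting": max(-diff, 0.0),  # DBD minus survey, floored at 0
    }


def alignment_binary(survey, dbd):
    """Indicators for dichotomous measures (activities of use)."""
    survey, dbd = bool(survey), bool(dbd)
    return {
        "absolute_error": int(survey != dbd),        # any mismatch
        "overreporting": int(survey and not dbd),    # reported but not observed
        "underreporting": int(dbd and not survey),   # observed but not reported
    }


# Hypothetical participant: reports 180 minutes per day, 85 minutes are tracked.
amount = alignment_continuous(180.0, 85.0)
# Hypothetical participant: shopping observed in the DBD but not reported.
shopping = alignment_binary(survey=False, dbd=True)
```

Under these definitions, each participant's absolute error equals the sum of their overreporting and underreporting values, because at most one of the two is nonzero.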
To examine whether absolute error, underreporting, and overreporting are systematically related to sociodemographic characteristics, we estimate a series of regression models on age, gender, and educational attainment. We fit OLS regressions for the continuous variables amount of use and variety of use, and logistic regressions for the dichotomous variables reflecting activities of use. For the visualization of the regression results, we use the R ggplot2 package, version 3.4.1 (Wickham, 2016).
Finally, to create typologies of smartphone use based on the DBD and survey data, we conduct a latent class analysis (LCA) separately for each method of data collection. LCA is a clustering method for identifying unobserved classes in a population from a set of observed categorical indicators (McCutcheon, 1987). Participants are assigned to the different classes based on their similarity in the indicator variables. We use the 15 variables reflecting the three usage dimensions (amount of use, variety of use, and 13 variables reflecting activities of use) in the models. For the continuous variables amount of use and variety of use, the values were coded into categories (below median vs. equal or above median), separately for the DBD and survey data. To estimate the latent class models, we use the R poLCA package, version 1.6.0.1 (Linzer & Lewis, 2011). For each model, we vary the number of classes from two to 10 and compute model fit criteria, including the log likelihood (LL), the Akaike information criterion (AIC), and the Bayesian information criterion (BIC), to select the best-fitting model, with lower values indicating a better model fit (Nylund et al., 2007). We also report the size and percentage of the smallest class. The data preparation and analysis were conducted in R, version 4.2.2 (R Core Team, 2022).
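Two preparatory steps of the LCA can be made concrete with a short Python sketch: the median split of the continuous variables and the computation of AIC and BIC from a model's log likelihood. The sketch does not reproduce the poLCA estimation used in the paper; the function names and the example log likelihood are hypothetical, and the parameter count assumes that all indicators are dichotomous, as they are after the median split.

```python
import math
from statistics import median


def median_split(values):
    """Code values as 0 (below median) or 1 (equal or above median)."""
    cut = median(values)
    return [int(v >= cut) for v in values]


def lca_n_params(n_classes, n_binary_indicators):
    """Free parameters of a latent class model with dichotomous indicators:
    (C - 1) class proportions plus one response probability per indicator and class."""
    return (n_classes - 1) + n_classes * n_binary_indicators


def information_criteria(log_lik, n_params, n_obs):
    """Return (AIC, BIC); lower values indicate a better trade-off of fit and parsimony."""
    aic = -2.0 * log_lik + 2.0 * n_params
    bic = -2.0 * log_lik + n_params * math.log(n_obs)
    return aic, bic


# Hypothetical example: a three-class model with 15 dichotomous indicators
# fitted to 1204 participants; the log likelihood below is made up.
split = median_split([1, 2, 3, 4])
k = lca_n_params(n_classes=3, n_binary_indicators=15)
aic, bic = information_criteria(log_lik=-100.0, n_params=k, n_obs=1204)
```

Because the BIC penalty grows with the logarithm of the sample size, it favors more parsimonious solutions than the AIC in samples of this size, which is one reason it is commonly used to choose the number of classes.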

Results
We first examine to what extent the DBD- and survey-based measures of smartphone use align with each other on the individual level (Table 2). Generally, the individual-level alignment mirrors the patterns found on the aggregate level. For amount of use, the mean absolute error (AE) is 157.3 min (SD = 191.0), which is mostly due to overreporting. On average, participants overreport their daily smartphone use by 137.1 min (SD = 198.0) and only underreport their use by 20.1 min (SD = 52.8). For variety of use, the mean AE is 3.1 activities (SD = 2.6), which is due to both under- and overreporting. On average, participants underreport their variety of use by 1.1 activities (SD = 1.9) and overreport their use by 1.9 activities (SD = 2.7). For activities of use, the individual-level alignment varies considerably by type of activity. The largest differences between DBD- and survey-based measures are found for browsing websites (AE: 67.8%), news (AE: 59.7%), and health or fitness (AE: 46.5%). For browsing websites, the differences are mostly due to overreporting (65.8%) rather than underreporting (2.0%). Similarly, the differences for news are largely driven by overreporting (59.1%) as opposed to underreporting (0.7%). For health or fitness, in turn, underreporting is more prevalent (36.0%) than overreporting (10.5%). The closest alignment between DBD- and survey-based measures can be found for messenger services (AE: 13.5%), social media (AE: 22.3%), and emails (AE: 23.3%); any differences in these activities are driven by both under- and overreporting. Among the remaining activities with a medium level of alignment between DBD and survey data, overreporting is more prevalent for phone calls and photos whereas underreporting is more prevalent for shopping and music or videos. Finally, the differences for online banking, GPS, and games are evenly driven by under- and overreporting.
We next study to what extent the individual-level alignment is systematically related to participants' sociodemographic characteristics. Figures 1-5 show the results from regression models of absolute error, underreporting, and overreporting for amount of use, variety of use, and activities of use, with detailed model results shown in Tables S2-S6 in the Online Appendix. Predictors for which the horizontal lines do not cross the dashed zero line are statistically significant at the 5% level. We find that age, educational attainment, and gender are significantly correlated with the alignment between DBD and survey data for several dimensions of smartphone use. For amount of use, the absolute error and the level of overreporting significantly decrease with age. Similarly, for variety of use, the absolute error and the level of overreporting significantly decrease with age, although the level of underreporting significantly increases with age. For activities of use, the effect of age on the alignment between DBD and survey data varies by type of activity. Age has a significant negative effect on the absolute error and the level of overreporting for browsing websites, the level of overreporting for GPS, and the absolute error and the level of underreporting for phone calls. However, age has a significant positive effect on the absolute error and the level of underreporting for messenger services, shopping, online banking, and music or videos. The findings are mixed for emails and news, where the level of overreporting significantly decreases with age, but the level of underreporting significantly increases with age.
We also find that educational attainment is significantly correlated with the alignment between DBD and survey data for all three dimensions of smartphone use. For amount of use, participants with a high school or college degree have a significantly smaller average error and lower level of overreporting than those without a high school degree. However, participants with a high school degree have a significantly higher level of underreporting for amount of use than participants without a high school degree. For variety of use, the alignment between the data sources decreases with the level of educational attainment. Participants with a college degree have a significantly higher level of overreporting than those without a high school degree. For activities of use, educational attainment has a significant but mixed effect on a few of the activities. Participants with a higher level of educational attainment are significantly more likely to overreport photos but significantly less likely to underreport GPS. In addition, participants with a college degree are significantly less likely to underreport shopping and also more likely to overreport shopping than those without a high school degree.
Gender is significantly correlated with the alignment between the data sources for two of the usage dimensions. For variety of use, the level of underreporting is significantly lower for female or diverse participants compared to male participants. For activities of use, female or diverse participants have a significantly lower level of underreporting for photos, social media, shopping, and GPS, and a significantly lower level of overreporting for social media. However, female or diverse participants have a significantly higher absolute error and level of overreporting for news than male participants.
Finally, we investigate which types of smartphone users can be identified in the DBD and survey data by conducting LCAs. Varying the number of classes from two to 10 shows that the BIC reaches a minimum at the three-class model for the survey data, with a LL and AIC that do not decrease substantially as more classes are included in the model (Table 3). The three-class model also results in classes of a reasonable size, with the smallest class containing 156 participants (13% of the overall sample). We therefore select the three-class solution for the survey data. For the DBD, in turn, varying the number of classes from two to 10 shows that the BIC reaches a minimum at the four-class model. This model similarly results in classes of a reasonable size, with the smallest class containing 142 participants (12% of the overall sample), and we therefore select the four-class solution for the DBD.
We next examine the composition of the latent classes. Table 4 shows the predictor variables by smartphone usage class for the survey data and the DBD. For the survey data, we describe the three usage types as follows. Heavy users report using their smartphone for a large amount of time (equal or above median: 67%) and a large variety of activities (equal or above median: 100%). The majority engage in each of the 13 activities on their smartphone (each activity used by at least 55%). Heavy users constitute the largest usage group, with more than half of the sample (53%) categorized into this group. Intermediate users report using their smartphone for a medium amount of time (below median: 59%) and a smaller variety of activities (below median: 100%). The majority engage in most of the 13 activities, but they are considerably less likely than heavy users to use their smartphone for activities such as health or fitness tracking (14%), shopping (24%), listening to music or watching videos (31%), and playing games (32%). Intermediate users constitute the second largest usage group, with one third of the sample (34%) categorized into this group. Light users report using their smartphone for a shorter amount of time (below median: 86%) and a smaller variety of activities (below median: 100%). They mainly use their device for news consumption (82%), making and receiving phone calls (73%), using messenger services (61%), and taking photos (41%). Light users constitute the smallest usage group, with 13% of the sample categorized into this group.
For the DBD, in turn, we describe the four usage types as follows. Heavy users use their smartphone for a large amount of time (equal or above median: 68%) and a large variety of activities (equal or above median: 100%). The majority engage in most of the 13 activities on their smartphone (each activity used by at least 52%, except for browsing websites with only 22%). Heavy users constitute the largest usage group, with more than half of the sample (56%) categorized into this group. Intermediate social media users use their smartphone for a medium amount of time (below median: 57%) and a smaller variety of activities (below median: 100%). The most popular activities are using messenger services (97%), social media (92%), and shopping (83%). These users also commonly use their device for sending or reading emails (73%), listening to music or watching videos (72%), and online banking (53%). Intermediate social media users constitute 16% of the sample. Intermediate phone call users use their smartphone for a medium amount of time (below median: 72%) and a smaller variety of activities (below median: 100%). The most popular activities are using messenger services (93%), making and receiving phone calls (84%), and taking photos (84%). These users also commonly use their device for sending or reading emails (70%), listening to music or watching videos (62%), and shopping (62%). Intermediate phone call users constitute 16% of the sample. Light users use their smartphone for a shorter amount of time (below median: 98%) and a smaller variety of activities (below median: 100%). They mainly use their device for using messenger services (54%) and sending or reading emails (30%). Light users constitute 12% of the sample.
Overall, the LCAs based on survey data and DBD result in similar smartphone usage classes, but the DBD reveal more nuanced patterns of smartphone use. Heavy users and light users have a similar composition and size across both data sources, although light users differ in the types of

Discussion
While digital technology use and skills have mostly been measured with surveys, DBD that are passively collected from individuals' digital devices might serve as an additional method of measuring technology usage patterns in a more unobtrusive and detailed way. In this paper, we contribute to the growing body of research about differences in DBD and survey data for measuring smartphone use. Based on a sample of smartphone users in Germany who completed a survey and had a tracking app installed on their smartphone, we examine to what extent the DBD- and survey-based usage measures align with each other. In addition, we investigate whether the DBD and survey data lead to different typologies of smartphone use.
The results show that the level of alignment between DBD and survey data for measuring smartphone use varies by usage dimension. Whereas amount of use is considerably overreported in the survey data compared to the DBD, variety of use aligns more closely across the two data sources. For activities of use, the level of alignment differs by type of activity. On the one hand, there are activities that align relatively closely across DBD and survey data, such as using messenger services, social media, and sending or reading emails. On the other hand, there are activities that are either considerably overreported in the survey, such as browsing websites, news consumption, and making or receiving phone calls, or underreported, such as shopping, listening to music or watching videos, and health or fitness tracking. Overall, these findings are in line with previous research showing that the correlations between self-reported and tracked measures of digital technology use are generally small to moderate (e.g., Parry et al., 2021).
In addition, we find that the level of alignment is systematically related to participants' sociodemographic characteristics. Differences in amount of use decrease with age whereas differences in variety of use decrease with age and educational attainment and are smaller for female or diverse participants compared to male participants. Differences in activities of use are also smaller for female or diverse participants for several types of activities, except for news consumption where differences are smaller for male participants. The effects of age and educational attainment on the alignment between DBD and survey data in activities of use are rather mixed. Finally, the DBD and survey data lead to similar typologies of smartphone use, although the DBD-based typology is more nuanced, with a larger number of classes. Whereas the classes are similar in size and composition on the aggregate level, the overlap between the two typologies on the individual level is rather small.
There are several potential explanations for the differences between DBD and survey data in measuring smartphone use. On the one hand, the differences could have arisen due to measurement error in the survey-based self-reports (Tourangeau et al., 2000). Participants might not have been able to recall all activities that they usually carry out on their smartphone, or the phrasing of the survey questions might not have prompted them to think of all instances of use, leading to an underreporting of certain types of smartphone activities. For example, the question about news consumption does not further specify the type of news: While recalling the use of news apps on their smartphone, respondents might not have considered news consumption within other apps, such as mobile browser or social media apps. An additional limitation of the questions about activities of smartphone use is that they lack a specific time frame, which might have helped respondents to recall the types of activities carried out on their smartphone. In addition, social desirability might have been at play, leading to a misreporting of certain activities of use, such as an overreporting of news consumption. Participants might have also had difficulties recalling their amount of smartphone use
on an ordinary day and reporting a continuous number in hours and minutes.We welcome future research that replicates our analysis with ordinal measures of amount of use, such as those employed in Blank and Groselj (2014).
On the other hand, errors in the DBD could have contributed to the observed differences (Bosch & Revilla, 2022; Keusch et al., 2023; Scharkow, 2016). Since only the name of the app is recorded and not the type of activity carried out within the app, the classification of apps into activities of use might have been erroneous, in particular for apps that can potentially serve numerous activities. For example, social media apps might be used for various activities, such as news consumption, messaging, or playing games. Future research might consider data collection approaches that allow monitoring within-app activities, such as data donation of social media data (Ohme et al., 2023). In addition, participants might not have installed the research app on all their smartphones or might have turned off the tracking due to privacy concerns, resulting in an underestimation of smartphone use. The participants' awareness of being observed might also have changed their smartphone usage behavior in the course of the study. Furthermore, it cannot be ruled out that the tracked smartphones were used by multiple people, in which case the participants' smartphone use is overestimated. Finally, although the research app passively collected DBD through the smartphone operating system log files, these data are not perfectly accurate and might also contain technical errors.
Another explanation for why we see relatively little alignment between the self-reports and the DBD could be that the two data sources conceptually measure different dimensions of smartphone behavior. While self-reports measure the perception of how frequently someone uses their smartphone and for what purpose, the DBD allow for a direct technical measure of actual behavior. Both aspects of the measurement could be relevant in a study, and researchers have to decide on a case-by-case basis which measure best operationalizes the concept of interest. For activities of use, for example, survey data could be better at measuring smartphone activities that can be carried out across multiple apps, such as news consumption or watching videos, whereas DBD could be better for activities carried out within distinct apps, such as making or receiving phone calls or taking photos. Given that both self-reports and DBD come with measurement error, approaches that incorporate both measures without assuming that either one is the gold standard might be the best fit for studies going forward, providing a more holistic view of smartphone use.
Absolute Error. For the continuous variables amount of use and variety of use, we subtract the DBD-based values from the survey-based values for each respondent and take the absolute value of the difference. For the dichotomous variables reflecting activities of use, we code whether the DBD-based and survey-based values are different (1 = different and 0 = not different).
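The absolute error definition above can be illustrated with a minimal Python sketch. The function names, the unit (minutes per day), and the toy values are assumptions made for illustration only; they are not the variable names, units, or data of the study.

```python
def absolute_error_continuous(survey_value: float, dbd_value: float) -> float:
    """Absolute error for continuous measures (amount of use, variety of use):
    subtract the DBD-based value from the survey-based value and take the
    absolute value of the difference."""
    return abs(survey_value - dbd_value)


def absolute_error_dichotomous(survey_value: int, dbd_value: int) -> int:
    """Absolute error for dichotomous activity indicators:
    1 = the two sources differ, 0 = they do not differ."""
    return 1 if survey_value != dbd_value else 0


# Hypothetical respondents: (self-reported minutes per day, tracked minutes per day)
toy_amount_of_use = [(180.0, 120.0), (60.0, 95.0), (240.0, 240.0)]
amount_errors = [absolute_error_continuous(s, d) for s, d in toy_amount_of_use]

# Hypothetical activity indicators: (reported in survey, observed in DBD)
toy_news_use = [(1, 0), (1, 1), (0, 1)]
news_errors = [absolute_error_dichotomous(s, d) for s, d in toy_news_use]

print(amount_errors)  # [60.0, 35.0, 0.0]
print(news_errors)    # [1, 0, 1]
```

Because the absolute value discards the sign of the difference, this measure captures the size of the discrepancy but not its direction.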

Figure 1. Coefficients (points) and 95% confidence intervals (lines) from regression models of absolute error (AE), underreporting (UR), and overreporting (OR) of smartphone use (amount of use, variety of use, and phone calls) on sociodemographic characteristics.

Figure 5. Coefficients (points) and 95% confidence intervals (lines) from regression models of absolute error (AE), underreporting (UR), and overreporting (OR) of smartphone use (music or videos, health or fitness, and news) on sociodemographic characteristics.

Table 2. Comparison of Survey-Based and DBD-Based Measures of Smartphone Use.

Table 3. Model Fit and Diagnostic Criteria for Two to Ten Classes of Smartphone Use.
AIC = Akaike information criterion; BIC = Bayesian information criterion. The rows in bold indicate the latent class models chosen for the analysis.

activities that are most frequently used on their smartphones. Intermediate users identified in the survey data, however, are reflected by two classes of smartphone users in the DBD that differ in their activities of use. While intermediate phone call users engage in more classical types of mobile phone activities, such as making and receiving phone calls and taking photos, intermediate social media users are more likely to engage in Internet-based activities, such as social media, shopping, and online banking. Finally, we investigate to what extent the classification into smartphone usage groups aligns on the individual level (Table 5). Whereas the aggregate-level comparison shows that the groups have

Table 4. Predictor Variables by Class of Smartphone Use.

, the results from the individual-level analysis suggest that there is a rather small overlap between the usage groups identified in the two data sources. Less than half (45%) of the participants are classified into the same group of smartphone users in both survey data and DBD. The alignment is much closer for heavy users than for the other usage groups: 32% of participants are classified as heavy users in both data sources, although an additional 11% are classified as heavy users in the survey data and as intermediate social media users in the DBD. In contrast, only 11% of participants are classified as intermediate users in the survey data and as either intermediate social media users or intermediate phone call users in the DBD. Light users have the smallest alignment: Only 2% of participants are classified as light users in both data sources while an additional 10% are classified as light users in the survey data but in a different usage group in the DBD.
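
The individual-level comparison described above amounts to cross-classifying each participant's survey-based class against their DBD-based class and counting the share that fall into matching groups. The following Python sketch shows one way this could be computed. The mapping of the two DBD-based intermediate classes onto the single survey-based intermediate class follows the description in the text, but the class labels as strings, the function names, and the toy data are assumptions for illustration and do not reproduce the study's data.

```python
from collections import Counter

# Assumed mapping from DBD-based classes to survey-based classes:
# both DBD intermediate classes count as a match for "intermediate".
DBD_TO_SURVEY = {
    "heavy": "heavy",
    "intermediate social media": "intermediate",
    "intermediate phone call": "intermediate",
    "light": "light",
}


def cross_tabulate(pairs):
    """Count participants for each (survey class, DBD class) combination."""
    return Counter(pairs)


def share_same_group(pairs):
    """Share of participants whose DBD class maps onto their survey class."""
    if not pairs:
        raise ValueError("at least one participant is required")
    matches = sum(
        1 for survey_class, dbd_class in pairs
        if DBD_TO_SURVEY[dbd_class] == survey_class
    )
    return matches / len(pairs)


# Hypothetical participants: (survey-based class, DBD-based class)
toy_pairs = [
    ("heavy", "heavy"),
    ("heavy", "intermediate social media"),
    ("intermediate", "intermediate phone call"),
    ("light", "heavy"),
    ("light", "light"),
]

table = cross_tabulate(toy_pairs)
print(table[("heavy", "heavy")])    # 1
print(share_same_group(toy_pairs))  # 0.6
```

In this toy example three of five participants fall into matching groups, so the share is 0.6.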

Table 5. Alignment Between Survey-Based and DBD-Based Classes of Smartphone Use.