Abstract
A growing social science literature has used Twitter and Facebook to study political and social phenomena, including election forecasting and tracking political conversations. This research note uses a nationally representative probability sample of the British population to examine how Twitter and Facebook users differ from the general population in terms of demographics, political attitudes and political behaviour. We find that Twitter and Facebook users differ substantially from the general population on many politically relevant dimensions including vote choice, turnout, age, gender, and education. On average social media users are younger and better educated than non-users, and they are more liberal and pay more attention to politics. Despite paying more attention to politics, social media users are less likely to vote than non-users, but they are more likely to support the left-leaning Labour Party when they do vote. However, we show that these apparent differences mostly arise due to the demographic composition of social media users. After controlling for age, gender, and education, no statistically significant differences arise between social media users and non-users on political attention, values or political behaviour.
Political research using social media data has expanded rapidly, with studies using data from Facebook and Twitter to forecast elections (Tumasjan et al., 2010; Sang and Bos, 2012; McKelvey et al., 2014; Burnap et al., 2015) and to study online deliberation (Larsson and Moe, 2011), political mobilization (Carlisle and Patton, 2013; Vissers and Stolle, 2014), and political ideology (Barbera, 2014; Bond and Messing, 2015). More generally, political actors and the media often pay attention to issues brought up on social media. For both academic and democratic reasons it is important to know how wide a slice of society these platforms represent.
Many of these studies focus only on the platform itself, but several (particularly those using social media to forecast elections) focus on wider trends in public opinion and political behaviour. As with other types of non-probability samples (for the challenges facing non-probability Internet panel surveys, see Baker et al., 2013), inferences from social media data run the risk of error if there are non-ignorable confounding relationships between the probability of self-selection into samples and outcome variables of interest.1
Survey analyses of social media demographics in the US have shown that Facebook and Twitter users tend to be younger and more educated than the general population, with Twitter having a more skewed distribution (Duggan, 2015; Greenwood et al., 2016). Studies using geotagged US Twitter data have found that Twitter users are more commonly found in urban areas (Mislove et al., 2011) and particularly wealthier areas with younger populations (Malik et al., 2015). Looking more specifically at the political attitudes of Twitter users, a study of the 2011 Spanish elections and the 2012 US presidential election showed that politically active Twitter users skew male, urban and politically extreme (Barbera and Rivero, 2014). Similarly, a survey of politically active Italian Twitter users showed that they were younger, better educated, male and left wing (Vaccari et al., 2013).
It is clear from previous research that social media users are not demographically representative of the population. However, the question remains whether, when controlling for demographic variables, there remain unobserved, non-ignorable differences between social media users and non-users. If there are non-demographic differences in the data, adjusting the data with weights to appear demographically representative could lead to large errors (Mellon and Prosser, 2017) and would require more sophisticated adjustment methods (e.g. Wang et al., 2014).
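The condition at stake can be stated compactly. As a sketch in illustrative notation (the symbols are ours and are not used elsewhere in this note), let S_i indicate whether person i uses a platform, Y_i be the political outcome of interest and X_i the observed demographics. Selection into the platform is ignorable given demographics when:

```latex
\Pr(S_i = 1 \mid Y_i, X_i) = \Pr(S_i = 1 \mid X_i)
```

If this equality holds, demographic adjustment can in principle recover population quantities from users alone; if the probability of use still depends on Y_i within demographic groups, it cannot.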
This research note examines the representativeness of social media users using representative survey data. We make two contributions. First, we look at the demographic representativeness of social media users in Great Britain. Second, we examine the political attitudes and behaviours of British social media users. We find that, although social media users are far from demographically representative of the population, after controlling for these differences there are no statistically significant differences on key political outcome variables such as ideology and vote choice. However, Twitter users are more likely to pay attention to politics and may be more likely to misreport having voted in a recent election.2
Data
This paper used the 2015 British Election Study (BES) face-to-face survey (Fieldhouse et al., 2015): a random probability sample of eligible voters in Great Britain. The survey had a response rate of 55.9% using American Association for Public Opinion Research response rate 3 (Smith et al., 2011).3
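For readers unfamiliar with the measure, AAPOR response rate 3 is conventionally defined as below. The formula follows the AAPOR standard definitions rather than the BES documentation, and the disposition counts for this survey are not reproduced here.

```latex
\mathrm{RR3} = \frac{I}{(I + P) + (R + NC + O) + e\,(UH + UO)}
```

Here I and P are complete and partial interviews, R refusals, NC non-contacts, O other non-interviews, UH and UO cases of unknown eligibility, and e the estimated share of unknown-eligibility cases that are in fact eligible.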
Results
We examined the distribution of demographic and attitudinal variables of Facebook and Twitter users. Facebook was by far the more popular social network (55.4% usage), with Twitter substantially less popular (18.6% usage). However, neither Facebook nor Twitter users were representative of the general UK population.4
Demographics
Age
Both Facebook and Twitter users were considerably younger than the general population (Figure 1). The mean age of Facebook users was 40, whereas the mean age of Twitter users was 34. This compares to an overall population mean age of 48 in the survey. The representativeness of each social network varied considerably by age: 85% of people aged between 18 and 30 used Facebook, whereas only a minority (40%) of respondents over 40 did. Consequently, Facebook users were closer to being representative of younger age groups. Twitter users were a minority in every age group.
Gender
Facebook was also slightly more representative of gender than Twitter. Figure 2 shows Twitter users were slightly more male than the general population, while Facebook was slightly more female (although the latter difference was not statistically significant).
Education
Facebook and Twitter users were more educated than non-users and the general population. Figure 2 shows both Facebook and Twitter users were much less likely to have no qualifications than non-users and the general population, and more likely to have A-levels, an undergraduate degree or a postgraduate degree.5 Although neither social network was representative in terms of education, Twitter was less educationally representative than Facebook.
Political attitudes
Political attention
Facebook and Twitter users differed from the general population in terms of political engagement. Interestingly, these differences were in opposite directions. Using the 11-point attention to politics self-placement scale, Facebook users were actually less politically attentive than non-users (0.47 points). By contrast, Twitter users were more politically attentive than non-users (0.40 points). These differences reflect the uses of the two platforms. Facebook is regarded as a primarily social platform, with news consumption forming a secondary function, whereas Twitter is seen as a place to follow current events.
Lower political attention among Facebook users results primarily from the younger age distribution of the social network compared with the general population. The regression model in Table 1 shows that after controlling for demographic factors, Facebook usage no longer predicts lower political attention. By contrast, the relationship between Twitter usage and political attention is stronger after demographic controls.
Table 1. Predictors of political attention (ordinary least squares (OLS) regression).
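The logic of a composition effect can be shown with a toy example. The sketch below uses made-up cell counts (not BES data) in which attention depends only on age group, yet users appear less attentive because they are younger; the variable names and the two-group age coding are assumptions for illustration, and the small OLS solver stands in for whatever software was actually used.

```python
# Toy illustration with made-up data: attention depends only on age group,
# but users are younger, so the raw comparison shows a spurious gap.

def ols(X, y):
    """Return OLS coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for idx in range(k):
            if idx != col:
                f = A[idx][col]
                A[idx] = [a - f * b for a, b in zip(A[idx], A[col])]
    return [A[i][k] for i in range(k)]


def mean(values):
    return sum(values) / len(values)


# (user, older, count): hypothetical cell counts, not BES figures.
cells = [(1, 0, 8), (1, 1, 2), (0, 0, 3), (0, 1, 7)]
rows = [(u, o) for u, o, n in cells for _ in range(n)]
attention = [4.0 + 2.0 * o for _, o in rows]  # usage has no direct effect

# Raw difference in mean attention: users minus non-users.
raw_gap = (mean([a for (u, _), a in zip(rows, attention) if u == 1])
           - mean([a for (u, _), a in zip(rows, attention) if u == 0]))

# Regression of attention on an intercept, the usage dummy and the age dummy.
X = [[1.0, float(u), float(o)] for u, o in rows]
intercept, b_user, b_older = ols(X, attention)

print(f"raw gap (users minus non-users): {raw_gap:.2f}")
print(f"usage coefficient with age control: {abs(b_user):.2f}")
```

In this constructed case the raw gap is entirely compositional, so the usage coefficient vanishes once age enters the model. Real data would include noise, more covariates and survey weights.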

Political values
Next we examined the ideological positioning of Facebook and Twitter users using the left–right and authoritarian–libertarian scales (Evans and Heath, 1995). The scales were standardized before analysis. Twitter and Facebook users were both slightly more liberal (0.28 and 0.22 standard deviations, respectively) than non-users. The results differed across the platforms in terms of left–right: Facebook users were more left wing (0.14 standard deviations) but there was no significant difference for Twitter users (0.03 standard deviations more right wing).
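As a minimal sketch of what standardizing a scale involves, the example below converts hypothetical raw scores to z scores and expresses a group difference in standard deviation units. The scores and usage flags are invented, and the use of the population standard deviation is an assumption; the note does not say which variant was used.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert raw scale scores to z scores (mean 0, standard deviation 1)."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]


# Hypothetical raw scale scores and usage flags, not BES data.
raw = [2.0, 4.0, 4.0, 6.0, 8.0, 6.0]
is_user = [True, True, True, False, False, False]

z = standardize(raw)

# Difference between users and non-users, in standard deviation units.
gap_sd = (mean([v for v, u in zip(z, is_user) if u])
          - mean([v for v, u in zip(z, is_user) if not u]))

print(f"gap in standard deviations: {gap_sd:.2f}")
```

Reporting differences in standard deviations makes the two scales comparable even though their raw ranges differ.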
To identify the source of these differences (Table 2), we ran linear regressions predicting the attitudinal scales, controlling for demographics. The results of these models show that the apparent differences in political attitudes on the left–right and libertarian–authoritarian scales appear to be driven by demographic differences. After controlling for age, gender and education, neither Facebook nor Twitter usage was a statistically significant predictor of political values, and the largest remaining difference was 0.11 standard deviations in authoritarian–libertarian values between Twitter users and non-users.6
Table 2. Ordinary least squares (OLS) models of social media predicting political values with demographic controls, dependent variable = z scores of ideology scales.

Political behaviour
We also compared the political behaviour of social media users and other respondents.
Turnout
Although Twitter is considered a politically active platform, Twitter users were less likely to report having voted than other BES respondents (Table 3), though this difference was significant only at the 10% level. Facebook users were also less likely to report having voted than the general population.
Table 3. Self-reported turnout of social media users compared to non-users and the full sample.

Next we examined whether social media users were more likely to misreport having voted than non-users using the vote validation data in the BES. Table 4 shows that both Twitter and Facebook users who report voting are more likely to misreport voting than non-users (though this is only significant at the p < 0.1 level for Twitter users).
Table 4. Vote validation outcome of those who report having voted.
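To make the comparison concrete, the sketch below computes a misreporting rate among self-reported voters and a simple two-proportion z statistic. The counts are hypothetical and the calculation ignores survey weights and design effects, so it is a simplification rather than the test used for Table 4.

```python
import math


def misreport_rate(validated_nonvoters, reported_voters):
    """Share of self-reported voters whose validated record shows no vote."""
    return validated_nonvoters / reported_voters


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Hypothetical counts, not BES figures: (validated non-voters, reported voters).
users = (30, 200)
non_users = (80, 800)

rate_users = misreport_rate(*users)
rate_non_users = misreport_rate(*non_users)
z_stat = two_proportion_z(users[0], users[1], non_users[0], non_users[1])

print(f"users: {rate_users:.1%}, non-users: {rate_non_users:.1%}, z = {z_stat:.2f}")
```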

Table 5 shows the differences in turnout controlling for the composition of users. The turnout differences largely result from Facebook’s and Twitter’s younger age profile, as social network usage no longer significantly predicts lower turnout after controlling for age, with Twitter use actually becoming a positive predictor of turnout. This last result is somewhat suspect, as the coefficient is only substantial when using reported turnout and does not appear when modelling turnout with the validated measure. The findings suggest Twitter users consider themselves to be politically engaged but do not necessarily actually vote at higher rates.
Table 5. Logit models predicting turnout (self-reported and validated) on the basis of social media usage and demographics.
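A logit model of the kind Table 5 describes can be sketched as follows. The exact covariate coding, and whether both platforms enter the same model, are assumptions here, since the table itself is not reproduced.

```latex
\Pr(\text{vote}_i = 1) = \operatorname{logit}^{-1}\!\left(\beta_0 + \beta_1\,\text{Twitter}_i + \beta_2\,\text{Facebook}_i + \boldsymbol{\gamma}^{\top}\mathbf{x}_i\right),
\qquad \operatorname{logit}^{-1}(z) = \frac{1}{1 + e^{-z}}
```

Here x_i collects the demographic controls (age, gender and education), and the outcome is either self-reported or validated turnout. The composition argument amounts to the platform coefficients moving towards zero once x_i is included.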

Party support
We also compared social media users in terms of party support: the most important factor for assessing social media data’s use in election forecasting. Table 6 shows that Twitter and Facebook over-represent Labour voters, with Twitter particularly unrepresentative of the general population. The balance of support is substantially worse than the error in the 2015 pre-election polls, meaning Twitter polls would have performed even worse than the pre-election surveys.
Table 6. Vote choice by social media usage.

To understand whether this difference could be explained by composition we ran vote choice models controlling for demographics and attitudes (Table 7). Social media measures did not remain significant after controlling for demographics, suggesting that while Twitter and Facebook are unrepresentative of voting initially, this is largely explainable in terms of composition.
Table 7. Multinomial logit predictors of vote choice including social media usage and demographics.

Conclusions
Our results show that neither Twitter nor Facebook is demographically representative of the population. Social media users are younger and better educated; Facebook users are more female and Twitter users more male. Social media users are also more liberal (and Facebook users more left wing). These differences corroborate those found in previous social media studies (e.g. Mislove et al., 2011; Vaccari et al., 2013; Barbera and Rivero, 2014; Duggan, 2015; Greenwood et al., 2016).
Both platforms also have markedly different political compositions to the general population. On average, social media users pay more attention to politics. Despite paying more attention to politics, social media users vote less (and may be more likely to misreport having voted), but lean more towards Labour when they do vote.
Importantly, our conclusions relate to social media users in general. Studies sampling only politically vocal social media users are likely to have even less representative samples.
This note also suggests that Twitter users are less representative along most demographic and political variables than Facebook users. Twitter studies are therefore particularly likely to have problems of representativeness if they are used without adjustment. However, with appropriate adjustments, samples of Facebook users could be a useful source of survey respondents.
Studies using Twitter or Facebook data to study public opinion or forecast elections should explain why the demographic and attitudinal differences in social media data will not affect their results. Our study provides some reason for hope: after controlling for demographics, social media’s association with vote choice and political attitudes greatly declines. This suggests social media data could be used for studying public opinion and forecasting if the data is appropriately weighted using demographics (as some work has already begun to do; Filho et al., 2015) and political attitudes, or adjusted using an approach such as multilevel regression and post-stratification (Wang et al., 2014).7 Although social media data provides numerous opportunities for political science, it is vital to remember that Twitter and Facebook are not representative of the general population.
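The basic idea behind demographic adjustment can be sketched with simple post-stratification, which is a much cruder technique than the multilevel approach cited above. All numbers below are hypothetical, and the two age cells are an assumption chosen for brevity.

```python
# Simple post-stratification on one demographic variable.
# Hypothetical cells, not BES estimates.
cells = {
    "18-39": {"pop_share": 0.3, "sample_share": 0.7, "labour_share": 0.5},
    "40+":   {"pop_share": 0.7, "sample_share": 0.3, "labour_share": 0.3},
}


def estimate(cells, share_key):
    """Average the cell-level outcome using the given cell shares."""
    return sum(c[share_key] * c["labour_share"] for c in cells.values())


# Estimate taken straight from the unrepresentative sample of users.
unweighted = estimate(cells, "sample_share")

# Estimate after reweighting cells to their population shares.
poststratified = estimate(cells, "pop_share")

# Equivalent per-respondent weights: population share over sample share.
weights = {name: c["pop_share"] / c["sample_share"] for name, c in cells.items()}

print(f"unweighted: {unweighted:.2f}, post-stratified: {poststratified:.2f}")
```

The adjustment is only as good as the assumption that users and non-users within the same cell hold similar views, which is the ignorability condition this note examines.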
Declaration of conflicting interest
The authors declare that there is no conflict of interest.
Funding
The British Election Study is funded by the Economic and Social Research Council [ES/K005294/1 to Edward Fieldhouse]. This study also benefited from a registration matching project funded by the Electoral Commission.
Notes
1. As summarized by Rivers (2013), self-selection into a sample is ‘ignorable’ when the probability of inclusion in a sample is conditionally independent of the outcome variable.
2. A full set of replication materials is available at http://dx.doi.org/10.7910/DVN/AZHTBT.
3. For more details on the sample, see the documentation at www.britishelectionstudy.com.
4. The BES face-to-face survey used very short social media questions: ‘Do you use Twitter?’ and ‘Do you use Facebook?’, with response options ‘Yes’, ‘No’ and the possibility for respondents to spontaneously respond ‘Don’t know’.
5. These education levels are the most common qualifications in that category and also include other qualifications at the equivalent level (e.g. Scottish Highers).
6. The Twitter difference in authoritarian–libertarian values becomes significant at the 5% level, and the Facebook difference is significant at the 10% level if we use unweighted rather than weighted data in the OLS models. The effect sizes do not substantially change and are not significantly different from each other. No other analyses in this note vary substantially between weighted and unweighted analyses.
7. The lack of detailed UK exit poll data somewhat limits this approach compared to the US, as does the absence of party registration information.
Carnegie Corporation of New York Grant
This publication was made possible (in part) by a grant from Carnegie Corporation of New York. The statements made and views expressed are solely the responsibility of the author.
References
Baker, R, Brick, MJ, Bates, NA. (2013) Summary report of the AAPOR task force on non-probability sampling. Journal of Survey Statistics and Methodology 1(2): 90–105.
Barbera, P (2014) Birds of the same feather tweet together: Bayesian ideal point estimation using Twitter data. Political Analysis 23(1): 76–91.
Barbera, P, Rivero, G (2014) Understanding the political representativeness of Twitter users. Social Science Computer Review 33(6): 712–29.
Bond, R, Messing, S (2015) Quantifying social media’s political space: Estimating ideology from publicly revealed preferences on Facebook. American Political Science Review 109(1): 62–78.
Burnap, P, Gibson, R, Sloan, L. (2015) 140 Characters to victory?: Using Twitter to predict the UK 2015 general election. Electoral Studies 41: 230–233.
Carlisle, JE, Patton, RC (2013) Is social media changing how we understand political engagement? An analysis of Facebook and the 2008 presidential election. Political Research Quarterly 66(4): 883–95.
Duggan, M (2015) Mobile messaging and social media 2015. Pew Research Center. Available at: http://www.pewinternet.org/files/2015/08/Social-Media-Update-2015-FINAL2.pdf.
Evans, GA, Heath, AF (1995) The measurement of left–right and libertarian–authoritarian values: A comparison of balanced and unbalanced scales. Quality & Quantity 29(2): 191–206.
Fieldhouse, E, Green, J, Evans, G. (2015) British election study 2015 face-to-face post-election survey. UK Data Service. SN: 7972.
Filho, RM, Almeida, JM, Pappa, GL (2015) Twitter population sample bias and its impact on predictive outcomes. In: Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015, 25–28 August 2015, pp.1254–1261. New York, US: ACM Press.
Greenwood, S, Perrin, A, Duggan, M (2016) Social Media Update 2016. Pew Research Center. Available at: http://assets.pewresearch.org/wp-content/uploads/sites/14/2016/11/10132827/PI_2016.11.11_Social-Media-Update_FINAL.pdf.
Larsson, AO, Moe, H (2011) Studying political microblogging: Twitter users in the 2010 Swedish election campaign. New Media & Society 14(5): 729–47.
Malik, MM, Lamba, H, Nakos, C. (2015) Population bias in geotagged tweets. Ninth International AAAI Conference on Weblogs and Social Media, 26–29 May 2015, pp.18–27.
McKelvey, K, DiGrazia, J, Rojas, F (2014) Twitter publics: How online political communities signaled electoral outcomes in the 2010 US house election. Information, Communication & Society 17(4): 436–50.
Mellon, J, Prosser, C (2017) Missing non-voters and misweighted samples: Explaining the 2015 great British polling miss. Public Opinion Quarterly. https://doi.org/10.1093/poq/nfx015.
Mislove, A, Lehmann, S, Ahn, Y-Y. (2011) Understanding the demographics of Twitter users. In: Proceedings of the Fifth International AAAI Conference on Weblogs and Social Media, pp. 554–57. Available at: http://www.aaai.org/ocs/index.php/ICWSM/ICWSM11/paper/viewFile/2816/3234.
Rivers, D (2013) Comment on Task Force Report. Journal of Survey Statistics and Methodology 1(2): 111–17.
Sang, ETK, Bos, J (2012) Predicting the 2011 Dutch senate election results with Twitter. In: Proceedings of SASN 2012, pp. 53–60.
Smith, TW, Bailar, B, Couper, M. (2011) Standard definitions – Final dispositions of case codes and outcome rates for surveys. The American Association for Public Opinion Research, 61.
Tumasjan, A, Sprenger, T, Sandner, PG. (2010) Predicting elections with Twitter: What 140 characters reveal about political sentiment. In: Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media, pp. 178–85.
Vaccari, C, Valeriani, A, Barberá, P. (2013) Social media and political communication: A survey of Twitter users during the 2013 Italian general election. Rivista Italiana Di Scienza Politica 43(3): 381–410.
Vissers, S, Stolle, D (2014) Spill-over effects between Facebook and on/offline political participation? Evidence from a two-wave panel study. Journal of Information Technology & Politics 11(3): 259–75.
Wang, W, Rothschild, D, Goel, S. (2014) Forecasting elections with non-representative polls. International Journal of Forecasting 31(3): 980–91.




