Collective Emotions and Social Resilience in the Digital Traces After a Terrorist Attack

After collective traumas such as natural disasters and terrorist attacks, members of concerned communities experience intense emotions and talk profusely about them. Although these exchanges resemble simple emotional venting, Durkheim’s theory of collective effervescence postulates that these collective emotions lead to higher levels of solidarity in the affected community. We present the first large-scale test of this theory through the analysis of digital traces of 62,114 Twitter users after the Paris terrorist attacks of November 2015. We found a collective negative emotional response followed by a marked long-term increase in the use of lexical indicators related to solidarity. Expressions of social processes, prosocial behavior, and positive affect were higher in the months after the attacks for the individuals who participated to a higher degree in the collective emotion. Our findings support the conclusion that collective emotions after a disaster are associated with higher solidarity, revealing the social resilience of a community.



Twitter data
For a month after the attacks, we used the Twitter search API on a daily basis to retrieve tweets that contained any of the hashtags included in Table 1. We removed all tweets that were not in French, as specified by Twitter's metadata and language detection software (Shuyo, 2010). Out of a total of 492,994 users found in that tweet sample, 287,996 (58%) had a non-empty user location field on their profile, which we used to select users who disclose living in France. More precisely, we selected users who met the following criteria:
1. Produced fewer than 3,200 tweets in our observation period
2. Had a ratio of followers to friends between 0.1 and 10
3. Provided a non-empty location field in their Twitter profile
4. Had location field text that could be mapped to a European France location through Google Maps Geocode API
The first two rules exclude bots (high number of friends compared to followers or high activity) and mass media or corporate accounts (high number of followers compared to friends or high activity). Rules 3 and 4 ensure that all users have self-disclosed a location that places them in France. This geocoding of user profiles differs from geolocation by tweet GPS coordinates, as we do not require users to actively disclose their location in the metadata of their tweets. This leaves a sample of 62,114 user accounts in the country affected by the attacks. Some of these users reported their location in a general way (e.g. "France") that did not lead to a precise location estimate (longitude and latitude) in the geocoding step, but many had more precise locations that gave us an insight into the spatial distribution of the data, shown on the map of Figure 1.
GARCIA & RIMÉ, 2018, SUPPLEMENTARY MATERIALS
Table 1 Hashtags used to sample users after the terrorist attacks of November 13th.
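The four selection rules above can be sketched as a simple filter. This is a minimal illustration, not the code used in the study: the `User` record, its field names, and the `in_european_france` flag (standing in for the outcome of the geocoding step) are hypothetical, and treating the ratio bounds as inclusive and excluding accounts with zero friends are assumptions.

```python
from dataclasses import dataclass


@dataclass
class User:
    """Hypothetical record for one account; field names are illustrative."""
    n_tweets: int              # tweets produced in the observation period
    followers: int
    friends: int
    location: str              # free-text profile location field
    in_european_france: bool   # assumed result of a geocoding step


def keep_user(u: User) -> bool:
    """Apply selection rules 1 to 4 to a single account."""
    # Rule 1: fewer than 3,200 tweets in the observation period.
    if u.n_tweets >= 3200:
        return False
    # Rule 2: followers/friends ratio between 0.1 and 10.
    # Assumption: accounts with zero friends have no defined ratio and are dropped.
    if u.friends == 0:
        return False
    ratio = u.followers / u.friends
    if not (0.1 <= ratio <= 10):
        return False
    # Rule 3: non-empty location field.
    if not u.location.strip():
        return False
    # Rule 4: location text maps to European France.
    return u.in_european_france


users = [
    User(500, 200, 180, "Lyon", True),        # kept
    User(5000, 200, 180, "Lyon", True),       # too active (rule 1)
    User(500, 5, 1000, "Paris", True),        # bot-like ratio (rule 2)
    User(500, 200, 180, "Montréal", False),   # not in France (rule 4)
]
sample = [u for u in users if keep_user(u)]
```

Keeping the geocoding result as a precomputed flag separates the network call from the filtering logic, so the rules can be checked without access to any external service.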
In early 2017, we retrieved the timeline of tweets of the users in our sample, removing all retweets and all tweets not detected to be in French. We limit our analysis to the period between April 1st 2015 and June 30th 2016, including 17,899,591 original tweets in French.
Some users in our sample (447 users, less than 1%) had only produced retweets and tweets not in French during that period, and thus are excluded from our subsequent analyses.

Ethical considerations
All data used in this article is publicly available archival data that was produced and posted by the self-selected users. No private information of any user was retrieved at any point.
The study does not include any manipulation or intervention and constitutes a purely observational study. In line with the growing consensus in ethics (Metcalf & Crawford, 2016), our work is excluded from ethics review. Minimal risk is associated with our research, and the inclusion through hashtags constitutes sufficiently informed consent and respects user expectations. Before carrying out this study, we assessed its potential to shed light on important social and emotional processes, bringing opportunities to understand social resilience and the impact of terrorism. These benefits greatly outweigh the risks of analyzing this kind of public archival data, in particular given the self-selection of users and the observational nature of this study. The research presented here was considered exempt from ethics review by the Ethics

Validating French LIWC dictionaries for Tweets
We validated the emotion classes of LIWC in French against a dataset of manually annotated tweets from the DEFT 2015 competition (Fraisse, Grouin, Hamon, Paroubek, & Zweigenbaum, 2015). Among various tasks, this competition included a sentiment analysis task in which participants classified tweets as positive (+), negative (-), or neutral (0). It also included an emotion detection task to classify emotions from tweet text. Among the emotion classes, it included "Colere", which literally translates to "anger" in English, "Tristesse", literally translated as "sadness", and "Peur", which literally means "fear" but whose coding instructions give "anxiety" as an example (see https://deft.limsi.fr/2015/descriptionTaches.fr.php?lang=fr). For our validation exercise, we merged the training and test datasets of each task, since the LIWC method is unsupervised and can be validated against the whole dataset. The organizers of DEFT could only share the ids of tweets, which need to be looked up through the Twitter API to recover their text. In this way we reconstructed 10,250 tweets for the first task and 4,123 tweets for the second task.
To test the validity of the French adaptation of LIWC when applied to tweets, we measured the frequencies of PA, NA, anger, sadness, and anxiety LIWC terms in each tweet of the dataset. The results are shown in Table 2, evidencing significant and sizable differences in the frequency of each term class when comparing tweets with the corresponding annotation versus the rest of the dataset. This is especially notable for the negative affect classes: for example, the text of tweets annotated as negative contains 5.2 times as many negative terms as the text of tweets annotated as neutral or positive.
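A dictionary-based frequency comparison of this kind can be sketched as follows. The tokenizer, the three-entry lexicon, and the example tweets are hypothetical and serve only to show the computation; the trailing `*` wildcard follows the stem convention used in LIWC dictionaries, and pooling token counts across tweets before dividing is an assumption about how the frequencies are aggregated.

```python
import re


def tokenize(text):
    """Lowercase and split on non-word characters (a deliberately simple tokenizer)."""
    return re.findall(r"\w+", text.lower())


def matches(token, entry):
    """An entry ending in '*' matches any token starting with that stem."""
    if entry.endswith("*"):
        return token.startswith(entry[:-1])
    return token == entry


def count_terms(text, lexicon):
    """Return (number of lexicon hits, number of tokens) for one text."""
    tokens = tokenize(text)
    hits = sum(any(matches(t, e) for e in lexicon) for t in tokens)
    return hits, len(tokens)


def pooled_frequency(texts, lexicon):
    """Share of all tokens in `texts` that belong to the lexicon."""
    hits = total = 0
    for text in texts:
        h, n = count_terms(text, lexicon)
        hits += h
        total += n
    return hits / total if total else 0.0


# Hypothetical mini-lexicon and tweets, for illustration only.
negative_lexicon = {"peur*", "triste", "horr*"}
annotated_negative = ["J'ai peur et je suis triste", "Quelle horreur ce soir"]
rest = ["Belle journée à Paris", "Je suis triste de partir mais content du voyage"]

ratio = (pooled_frequency(annotated_negative, negative_lexicon)
         / pooled_frequency(rest, negative_lexicon))
```

A ratio well above 1 for the matching annotation class is the pattern that the validation looks for.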
Table 2 LIWC emotion validation results against annotations reconstructed from Fraisse et al. (2015).
We also evaluate the French translation of the prosocial terms dictionary of Frimer, Schaefer, and Oakes (2014)
UNICEF_france, restosducoeur, CroixRouge, caritasfrance) and retrieved their timelines up to the last 3,200 tweets in March 2017, gathering a total of 22,392 tweets. We applied the prosocial terms dictionary and compared the frequencies against the 6-month baseline of tweets in our dataset. Prosocial term frequency in prosocial accounts is 3.35%, while in the baseline dataset it is 1.31% (χ² = 14357, p < 10⁻¹⁰). This means that prosocial terms in these accounts have a frequency ratio of 2.56 compared to the text of tweets in the baseline dataset, a significant and sizable difference that supports the validity of the lexicon and its French translation.
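The frequency ratio follows directly from the two reported percentages, and a chi-square statistic for a difference in term frequency can be computed from a 2×2 table of term and non-term token counts. The sketch below assumes a Pearson chi-square without continuity correction, which the text does not specify, and the counts in the example call are hypothetical since the underlying token counts are not given here.

```python
def frequency_ratio(freq_a, freq_b):
    """Ratio of two term frequencies expressed in the same units."""
    return freq_a / freq_b


def chi_square_2x2(hits_a, total_a, hits_b, total_b):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table.

    Rows are the two corpora; columns are dictionary tokens and other tokens.
    """
    a, b = hits_a, total_a - hits_a
    c, d = hits_b, total_b - hits_b
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Reported frequencies: 3.35% in prosocial accounts, 1.31% in the baseline.
ratio = frequency_ratio(3.35, 1.31)   # rounds to 2.56

# Hypothetical counts, only to show the call signature.
stat = chi_square_2x2(hits_a=335, total_a=10_000, hits_b=1_310, total_b=100_000)
```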

Analysis of collective behavior
For each lexical indicator we calculated its daily mean X(t), and then we computed a normalized score Z_X(t) = log(X(t)/X_b), where X_b is the baseline value of the corresponding weekday. This logarithmic transformation reduces the skewness of the ratio between term frequency and its baseline. A negative score indicates that the daily frequency is below the baseline and a positive score that the daily frequency is above the baseline.
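The normalization can be sketched in a few lines. Taking the baseline as the mean of the daily values over a set of baseline days sharing the same weekday is an assumption, since the text only states that the baseline is weekday-specific; the function names and example values are illustrative.

```python
import math
from collections import defaultdict
from datetime import date


def weekday_baseline(daily_means, baseline_days):
    """Mean of X(t) over the baseline days, computed separately per weekday."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day in baseline_days:
        wd = day.weekday()
        sums[wd] += daily_means[day]
        counts[wd] += 1
    return {wd: sums[wd] / counts[wd] for wd in sums}


def normalized_score(daily_means, baseline):
    """Z_X(t) = log(X(t) / X_b), with X_b the baseline of the weekday of t."""
    return {day: math.log(x / baseline[day.weekday()])
            for day, x in daily_means.items()}


# Illustrative values: three consecutive Mondays.
daily = {
    date(2015, 11, 2): 0.02,
    date(2015, 11, 9): 0.04,
    date(2015, 11, 16): 0.06,
}
baseline = weekday_baseline(daily, [date(2015, 11, 2), date(2015, 11, 9)])
z = normalized_score(daily, baseline)
```

With these values the Monday baseline is 0.03, so a daily mean of 0.06 gives a score of log(2), and a daily mean of 0.02 gives a negative score.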
We fitted the temporal evolution of each normalized score Z_X(t) of each linguistic variable X through a time series model of the form:
In the above equation, c is a time-independent intercept. D_1 and D_2 measure the size of the shock due to the attacks happening during the night of t_1 (Nov 13th) and being widely reported during day t_2 (Nov 14th). The parameter ϕ quantifies the memory of the time series after the attacks as the linear relationship between Z_X(t − 1) and Z_X(t). Positive values of ϕ can be generated by synchronized behavior, as explained by the agent-based model described below.
The parameter ϕ_pre measures the memory of Z_X(t) before the attacks, which should be much smaller than ϕ and very close to zero if the attacks triggered emotion sharing feedback loops that were not active before.
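One functional form consistent with the parameter descriptions above, offered here as an assumption rather than as the exact equation of the study, is Z_X(t) = c + D_1·1[t = t_1] + D_2·1[t = t_2] + ϕ·1[t > t_2]·Z_X(t − 1) + ϕ_pre·1[t < t_1]·Z_X(t − 1) + ε(t), where 1[·] is an indicator function. The sketch below simulates a series under that assumed form and recovers the parameters with ordinary least squares in NumPy. Least squares is used purely for illustration and is not a Bayesian fit with weakly informative priors; all function names and simulation settings are hypothetical.

```python
import numpy as np

NAMES = ["c", "D1", "D2", "phi", "phi_pre"]


def design_matrix(z, t1, t2):
    """Regressors for t = 1..T-1 under the assumed memory model."""
    t = np.arange(1, len(z))
    lag = z[:-1]
    X = np.column_stack([
        np.ones(len(t)),              # intercept c
        (t == t1).astype(float),      # shock D1 on the day of the attacks
        (t == t2).astype(float),      # shock D2 on the following day
        np.where(t > t2, lag, 0.0),   # memory after the attacks (phi)
        np.where(t < t1, lag, 0.0),   # memory before the attacks (phi_pre)
    ])
    return X, z[1:]


def fit_memory_model(z, t1, t2):
    """Least-squares estimates of the assumed model's parameters."""
    X, y = design_matrix(np.asarray(z, dtype=float), t1, t2)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(NAMES, coef))


def simulate(T, t1, t2, c, D1, D2, phi, phi_pre, sigma, seed=0):
    """Generate a series from the assumed model with Gaussian noise."""
    rng = np.random.default_rng(seed)
    z = np.zeros(T)
    for t in range(1, T):
        z[t] = c + sigma * rng.normal()
        if t == t1:
            z[t] += D1
        elif t == t2:
            z[t] += D2
        elif t > t2:
            z[t] += phi * z[t - 1]
        elif t < t1:
            z[t] += phi_pre * z[t - 1]
    return z


z = simulate(T=400, t1=200, t2=201, c=0.0, D1=1.0, D2=1.5,
             phi=0.8, phi_pre=0.0, sigma=0.05)
estimates = fit_memory_model(z, t1=200, t2=201)
```

In a series generated this way, the estimate of ϕ should land near the simulated value while ϕ_pre stays near zero, which is the contrast the text describes.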
We fitted models with the bayesglm function of the arm R package (Gelman, Jakulin, Pittau, & Su, 2008), taking weakly informative priors for all parameters. We report the median value and the 95% CI of the posterior distribution of each parameter, along with the p-values of standard statistical tests. After fitting, we ran regression diagnostics to validate the assumptions of the above model.
behavior (Smith & Conrey, 2007). We simulated a society composed of agents that have an observable variable x(t) that represents a state such as affect or prosocial term use in a day.
Agent states are driven by three mechanisms:
1. Internal dynamics: a combination of random effects sampled from a zero-mean normal distribution and a relaxation tendency γ = 0.005/min that drives the state x(t) towards its baseline x = 0. These internal dynamics are based on empirical results on emotion dynamics obtained through self-reports (Kuppens, Oravecz, & Tuerlinckx, 2010).
2. Reaction to the attack: a fixed impulse of size D = 1 that is active during the day of the attacks. This impulse is perceived by all agents at the same time and marks the onset of any possible collective behavior. This principle is based on previous models of collective emotions in online social media (Schweitzer & Garcia, 2010).
3. Synchronization of behavior: Agents react to each other at a rate α that is directly proportional to the absolute value of their internal state, x(t). As a simplification, we sample the peer of this synchronization at random and add a delay in the reception of an online message that we sample from an exponential distribution of parameter λ = 0.001.
After that time the agent changes its state to the state of the other agent, as an event of emotional or behavioral synchronization. This effect follows the results found in experiments of emotional discussions in online media (Garcia, Kappas, Küster, & Schweitzer, 2016).
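The three mechanisms can be sketched as a discrete-time simulation with one-minute steps. Several details are assumptions because the description leaves them open: the noise scale `sigma`, the number of agents, applying the impulse as a single kick at the onset of the attack day, treating α·|x(t)| as a per-minute probability of starting a synchronization event, and copying the peer's state at the moment the delayed message arrives. The function and parameter names are hypothetical.

```python
import numpy as np


def simulate_society(n_agents=100, n_days=3, attack_day=1, alpha=0.01,
                     gamma=0.005, lam=0.001, impulse=1.0, sigma=0.01, seed=0):
    """Return the daily mean of the agents' states x(t)."""
    rng = np.random.default_rng(seed)
    minutes_per_day = 1440
    x = np.zeros(n_agents)
    arrival = np.full(n_agents, np.inf)    # minute at which a pending message arrives
    peer = np.zeros(n_agents, dtype=int)   # sender of the pending message
    daily = np.zeros(n_days)

    for t in range(n_days * minutes_per_day):
        # 1. Internal dynamics: relaxation towards 0 plus Gaussian noise.
        x += -gamma * x + sigma * rng.normal(size=n_agents)

        # 2. Reaction to the attack: one kick of size `impulse` for all agents
        #    (assumption: applied once, at the first minute of the attack day).
        if t == attack_day * minutes_per_day:
            x += impulse

        # 3. Synchronization: idle agents start an event with probability
        #    alpha * |x| per minute, pick a random peer, and wait for a delay
        #    drawn from an exponential distribution with rate lam.
        idle = np.isinf(arrival)
        start = idle & (rng.random(n_agents) < np.minimum(1.0, alpha * np.abs(x)))
        n_start = int(start.sum())
        peer[start] = rng.integers(0, n_agents, size=n_start)
        arrival[start] = t + rng.exponential(1.0 / lam, size=n_start)

        # When the delay has elapsed, the agent adopts the peer's current state.
        due = arrival <= t
        x[due] = x[peer[due]]
        arrival[due] = np.inf

        daily[t // minutes_per_day] += x.mean() / minutes_per_day

    return daily


daily_mean = simulate_society()
```

Running the function for several values of `alpha` and fitting a memory model to the resulting daily means is one way to explore how synchronization strength relates to the persistence of the aggregate response.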
The above design allows us to calibrate the intensity of behavior synchronization through the parameter α. The relationship between this strength of synchronization and the internal relaxation tendency captured by γ defines the shape of the trend of the aggregate response, which we quantify in our empirical analysis through parameter ϕ.
Table 3 Regression results of the memory model for a simulation of collective responses. Increases in the synchronization parameter α lead to collective responses with ϕ values significantly above zero. This shows the relationship between the agent synchronization and the memory pattern of the collective response X(t).
Table 4 Regression results of the memory model for affective terms.
The fit results of the PA and NA models are reported in Table 4.
Table 5 Regression results of the memory model for negative affect classes.
The fit results of the three negative emotion models are reported in Table 5.
Table 6 Regression results of the memory model for terms related to social resilience.
The fit results of the three social resilience term models are reported in Table 6.
Table 7 Self-selection model based on personality correlates measured during the baseline period.
Table 7 presents the self-selection model of participating in the collective emotion as a function of personality-related lexical indicators and activity levels. Users with high emotionality in the two weeks after the attacks also had high emotionality in the three months before, higher use of first person singular terms, and slightly lower activity levels.
Table 8 Results of mixed models for tweet-level analysis.
In this case, we must note that there is an interaction effect in this last model that attenuates the effect of NA when there is also a social process term in the previous tweet, i.e. that NA terms predict changes from using no social process terms in a tweet to using at least one in the following tweet.
Table 9 Mediation analysis results.
Table 9 shows the results of mediation analysis using the mediation R package (Imai, Keele, Tingley, & Yamamoto, 2010),
Table 10 Regression results of the zero-inflated negative binomial model of the count of attack references in the three-month period. The model had a significant estimate of log(Θ) of -0.7297.
Table 10 shows the results of a zero-inflated negative binomial model fitted with the zeroinfl function of the pscl R package (Zeileis, Kleiber, & Jackman, 2008). Beyond those results, the fraction of tweets with a reference to the attacks over all tweets of each user in the three-month period is negatively correlated with affect and social process terms in the