Accountability and alternation: How wholesale and partial alternation condition retrospective voting

Holding the government accountable is a crucial function of elections. The extent to which voters can actually do so depends on the political system. One element that may influence the likelihood that voters hold the government accountable is the difference between wholesale and partial alternation. Prominent political scientists such as Mair, Bergman and Strøm, and Pellegata and Quaranta propose that in countries with wholesale alternation voters are better able to hold governments accountable because, in essence, voters have the choice to keep their current government or 'throw the rascals out'. However, this relationship has not been tested. We examine the relationship between partial and wholesale alternation and retrospective voting in a large-N cross-country study. We show that the association between government satisfaction and vote choice is stronger in countries with wholesale alternation than in systems with partial alternation.


Introduction
Holding the government accountable is a crucial function of elections. The extent to which voters can actually do so depends on the political system (Anderson, 2000;Hobolt et al., 2013). One element that may influence the likelihood that voters hold the government accountable is the difference between wholesale and partial alternation (Bergman and Strøm, 2011;Mair, 2008;Pellegata and Quaranta, 2018). In countries such as the United Kingdom, where party competition is characterised by wholesale alternation, it becomes obvious quickly after the votes are counted whether the current government party holds its majority and can continue to govern or whether the opposition takes over. The same is true for Denmark, except that here two blocs of parties instead of two individual parties vie for power. In countries with partial alternation like Belgium and Germany, election results say little about the upcoming government. In these systems, backroom negotiations rather than the election results form the crucial forum that determines who gets into government (Mair, 2011). In partial alternation systems, after elections, some parties leave the government, some parties enter government, and usually at least one party stays in government. The outcome of this process is not yet clear on election night. Bergman and Strøm (2011) propose that democratic accountability is greater in systems with wholesale alternation. We understand accountability as the extent to which voters are able to sanction the government if its actions are contrary to their interests (Manin et al., 1999). In systems with wholesale alternation, voters have a clear choice: they can support the government or they can discharge it if they do not like what it has done. The idea that electoral accountability is greater in systems with wholesale alternation has not been tested empirically. 
Therefore, we set out to examine to what extent voters in systems characterised by wholesale alternation are more likely to hold their government accountable than voters in systems with partial alternation.
In the political science literature, electoral accountability is often understood as retrospective voting (Fiorina, 1978; Key, 1966; Lewis-Beck and Stegmaier, 2000). The underlying idea is that voters who are satisfied with the government's accomplishments will support the governing party or parties in the elections. There is increasing attention to how political institutions condition retrospective voting (Silva and Whitten, 2017). The crucial idea is that the alternative options voters can choose from determine the likelihood that they will punish the incumbent (Anderson, 2000). With our study, we contribute to the literature on clarity of political responsibility, i.e. the literature finding that the clarity of responsibility of a political system affects voters' ability to attribute responsibility for the country's state of affairs to one party or the other (Powell and Whitten, 1993). Initially, authors like Powell and Whitten (1993), Schwindt-Bayer and Tavits (2016), and Whitten and Palmer (1999) tended to focus strongly on formal institutional rules when examining clarity of responsibility. More recently, Hobolt et al. (2013) moved the perspective away from static institutional rules to the composition of government. They argue that the composition and cohesiveness of the ruling government, rather than stable institutional characteristics, are important to understand why retrospective voting is more common in some countries than in others. They label this government clarity. Since then, a number of researchers have employed more dynamic and more specific indicators of government clarity (e.g., Stiers and Dassonneville, 2020). With our present study, we aim to contribute to this development in the literature by arguing that the nature of party competition in terms of wholesale and partial alternation may shape the level of retrospective voting. We suggest that it is important for voters to be able to see the effects of their vote (Otjes and Willumsen, 2019).
On the one hand, in a wholesale alternation system, voters are more aware of their options to either support the incumbent government or to vote against it and replace it entirely. In other words, voters' motivation is higher since their vote may have a considerable effect on the composition of the next government. On the other hand, in partial alternation systems, voters are less certain that their vote will influence the composition of the government. One of our main contributions is hence that we do not just look at the cohesiveness of the incumbent government; we also look at the extent to which voters can expect, on the basis of previous government alternation, that the government will not return if enough voters cast a vote against it, as they do themselves. This clarity of the options that are presented to the voters should make them more likely to make the effort to monitor incumbent performance and vote accordingly. Mair (1997, 2001, 2011) provided a theoretical framework on the concept of wholesale and partial alternation, which has been used by a number of authors to understand policy outcomes (Horvárth, 2012a, 2012b; Green-Pedersen, 2002; Horowitz et al., 2009; Milanovic et al., 2010) and patterns of political cooperation within the political elite (Anthonsen and Lindvall, 2009; Green-Pedersen, 2004; Louwerse et al., 2017; Meyer-Sahling and Veen, 2012; Otjes and Rasmussen, 2017). Yet, this distinction has not played a major role in the study of party-vote relations (but see Otjes and Willumsen, 2019 and Pellegata and Quaranta, 2018). The purpose of this study is to assess the effect of wholesale and partial alternation on retrospective voting. We do not claim to offer a completely new theory of retrospective voting.
Rather, we believe that within the existing literature wholesale and partial alternation can be a small but significant addition that can help to understand how contextual variables impact the relationship between satisfaction with the government and voting behaviour. This article has the following structure: firstly, we discuss what we know about wholesale and partial alternation and retrospective voting in greater detail. Consequently, we bring those two literatures together. Next, we look at our data and our modelling strategy. We use the Comparative Study of Electoral Systems which covers 55 elections in 30 countries. From there we move on to discuss our results. The final section will draw a number of conclusions about the effect of the patterns of party competition on retrospective voting.

Wholesale and partial alternation
The idea that there is a difference between party systems where voters have a choice between 'alternative teams of governors' and can force a 'clear and abrupt' change in government composition, and systems where the relationship between the election results and government composition is weaker, dates back to the 1970s (Finer, 1975: 31; Rokkan, 1970: 93). Despite prominent political scientists acknowledging this difference (Ieraci, 2012; Lundell, 2011; Mair, 1997), it has not been a prominent element in political science theory about the functioning of democracy. Yet, what sets a party system apart from a collection of parties is the way in which these parties interact when competing for government (Sartori, 1976: 39). This in turn shapes the ability of voters to influence the composition of the government and hold it accountable.
Wholesale alternation (sometimes called perfect government turnover) has often been observed in two-party systems. It is common in Commonwealth systems such as the United Kingdom. Sartori (1976: 165) even considers the alternation of government as the definitive feature of 'twopartism'. Yet, wholesale alternation can also occur in multiparty systems. For example, in Denmark a left-wing and a right-wing bloc compete for power. Another example is Spain, where the major party of either the left or the right often governs as a minority government supported by smaller parties. In other countries, government formation can be characterised as partial alternation or limited government turnover. This occurs in Germany and countries that have a similar party system in Western Europe, such as Luxembourg, Austria, the Netherlands and Belgium, but also in countries like Finland and Estonia. During the French Fourth Republic, government alternation was also partial; when it moved to the Fifth Republic, one of the most remarkable changes was away from partial alternation to wholesale alternation. A similar change can be observed in Italy when comparing the 'First' and 'Second' Republic (De Giorgi and Marangoni, 2015;Verzichelli and Cotta, 1999). Wholesale alternation occurs in two-party systems but also in multiparty systems, in particular in those that are strongly polarised. Partial alternation tends to occur in systems with a large number of parties but a low level of polarisation (Otjes, 2018).
Government alternation is closely related to the quality of democracy and democratic stability (Casal Bértoa and Enyedi, 2016;Cheibub et al., 2009;Kaiser et al., 2002;Lundell, 2011;Otjes and Willumsen, 2019). A key aspect of democracy as 'government of the people' is the ability of voters to hold the governing parties accountable (Manin et al., 1999). In a system with wholesale alternation voters have the ability to 'throw the rascals out' (Mair, 2008). In a system where partial alternation is the norm, voters may punish the governing parties at the polls, but these may still return to government through the byzantine process of government formation as a result of partial alternation (Mair, 2011). For instance, in 2002 the Austrian Freedom Party lost almost two-thirds of its seats but still stayed in government, and in 2010 the Dutch Christian-Democrats lost nearly half of their seats but continued to govern. The weak link between election outcomes and government formation can be an important source of voters' frustration with the political system (Irwin and Van Holsteyn, 2011).
We know surprisingly little about how this difference between wholesale and partial alternation shapes the relationship between voters and parties. Pellegata and Quaranta (2018) find indeed that during periods of economic downturn, wholesale alternation is more likely. Otjes and Willumsen (2019) show that the difference between wholesale and partial alternation affects turnout, boosting turnout in systems with proportional electoral systems but decreasing turnout in systems with disproportional electoral systems.
However, these two studies do not rely on individual-level data but look at aggregate election results instead.

Retrospective voting
In essence, the theory of retrospective voting proposes that whether or not voters vote for the government parties depends on their assessment of the government's performance (Fiorina, 1978;Key, 1966;Lewis-Beck and Stegmaier, 2000;Strøm, 2000). The more satisfied voters are with the government, the more likely they are to vote for one of the incumbent parties. Authors like Fiorina (1981), Healy and Malhotra (2013) and Lewis-Beck and Stegmaier (2007) have investigated retrospective voting and convincingly demonstrated that the assessment of government performance plays a role in voting behaviour. Most of these studies focused on the state of the economy (Lewis-Beck and Stegmaier, 2013;Nannestad and Paldam, 1994;Stubager et al., 2014). Yet, other policy domains are important to voters as well (Singer, 2011). This is supported by other studies that have found evidence of retrospective voting in a variety of domains (Crisp et al., 2014;Ecker et al., 2016;Fowler and Hall, 2018;Gasper and Reeves, 2011;Karol and Miguel, 2007). De Vries and Giger (2014) and Stiers (2019) move away from government specific performance evaluations and instead look at voters' assessment of government performance 'in general'.
While the relationship between government performance evaluations and the vote is quite strong, there are both individual and country-level factors that influence voters' ability to hold their government accountable (Anderson, 2007a). At the country level, institutional complexity may blur the lines of responsibility and may therefore weaken the role of performance evaluation in voting (Silva and Whitten, 2017). If voters are convinced that a particular party has total control over policy-making, they are more likely to assign responsibility for policy outcomes to that party (Powell and Whitten, 1993: 398). Hobolt and colleagues (2013) distinguish between two different dimensions of the political system: the more static 'institutional clarity', and the more dynamic 'government clarity'. Institutional clarity, on the one hand, is the extent to which government power is concentrated according to the formal institutional rules. Research has examined the effect of power concentration on the relationship between government performance and the attribution of responsibility. Studies have examined the 'vertical' distribution of power between different levels of government and the 'horizontal' distribution of power between the executive and the legislature (Bengtsson, 2004; Cutler, 2008; Nadeau et al., 2002; Whitten and Palmer, 1999). Government clarity, on the other hand, is the extent to which the government is cohesive. Hobolt et al. (2013) propose that voters will be inclined to punish or reward a political party for the state of the country. If one party or a cohesive coalition holds power, it is easier to assign blame than if the government consists of many ideologically distinct parties. Overall, they indicate that retrospective voting is more common when government clarity is high.
What all these studies have in common is that they look at the extent to which voters are able to attribute responsibility to the government. What they fail to take into account, however, is the extent to which citizens have a meaningful choice at the elections. Anderson (2000) proposes that the clarity of available alternatives shapes the extent to which voters are able to effectively hold the government accountable. He argues that the larger the number of parties that run in an election, the less clear it is which coalition can be formed after the election. If there are only two alternatives, the choice voters have is between continuing the current government or choosing a new team (Anderson, 2000: 156). Anderson (2000) and Bengtsson (2004) provide empirical support for the link between retrospective voting and the fractionalisation of parliament. Recently, Stiers and Dassonneville (2020) found that increasing polarisation between the government and the opposition helps voters to distinguish parties, which in turn positively moderates retrospective voting. Hence, if it is clear to voters that they can alter government policies in the future by voting for an ideologically distinct party, the accountability function of elections is strengthened (Hellwig, 2010). All these studies emphasise the important role of dynamic contextual factors in retrospective voting.

Bringing retrospective voting and government alternation together
The literature on wholesale and partial alternation proposes that different countries have different norms and expectations about how government composition changes after elections. On the one hand, there are countries where the expectation is that if the government loses its majority, the opposition will come to power. On the other hand, there are countries where at least one of the governing parties stays in office after a months-long formation, even if the government loses its majority.
Citizens are likely to use the latest government formation as a heuristic, a judgmental shortcut to predict future government formations (Sniderman et al., 1991). Crucial here is that citizens' experiences with the political system, more than theoretical possibilities, shape their expectations about the system they live in (Anderson, 2007b; Mishler and Rose, 2002). Specifically, citizens' expectations about the functioning of their democratic system are shaped by their own experience of the system (Heyne, 2018: 58). That is, people learn the rules of the game by seeing how it is played.
From the perspective of retrospective voting, the main distinction in the political landscape is between the government and the opposition (Key, 1966; Stiers, 2019). The extent to which voters' perceptions match this schema is crucial. If the latest government was formed by the opposition displacing the sitting government entirely, voters are more likely to expect that the current election will offer a similar choice, and hence that they can vote the incumbent government out of office if they want to do so. If, in contrast, the latest government was formed by some parties from the previous government and some from the opposition, voters are less likely to expect that they can punish the incumbent very strongly, because government composition may not change completely. These expectations may feed into voter behaviour (Otjes and Willumsen, 2019): the motivation to hold the government accountable is possibly stronger when citizens expect, on the basis of previous experience, that wholesale alternation will occur. For citizens who have not experienced the possibility to 'send the rascals out', the notion that by voting they have an effect on government formation is more foreign.
We expect that a previous experience with wholesale alternation boosts the probability that citizens take government evaluations into account. In summary:
Wholesale-Partial Retrospective Voting Hypothesis: If the most recent government alternation after elections was wholesale alternation, the effect of retrospective performance evaluations on voting for government parties is stronger than after partial alternation.

Data and methods
To test our hypothesis empirically, we use the data of the Comparative Study of Electoral Systems (CSES). The CSES data consist of a collection of national election surveys including a common set of questions. Hence, using these data, we employ electoral surveys from a variety of different countries and contexts. This makes the CSES data well suited for our purposes here, as we need sufficient variation in government alternation patterns.
Our hypothesis brings forward two variables of interest. First, as we investigate the extent to which voters hold incumbents accountable in elections, we need a measure of the voter's retrospective evaluation of the government's performance. Performance indicators from different domains have been used to investigate retrospective voting, and the most commonly investigated indicator is a voter's evaluation of the evolution of the economy (Lewis-Beck and Stegmaier, 2013; Nannestad and Paldam, 1994). However, as Singer (2011) points out, other policy domains are important to voters as well. We therefore rely on voters' evaluation of the government's performance in general (De Vries and Giger, 2014; Stiers, 2019; Stiers and Dassonneville, 2020). This question is available in the second and third modules of the CSES (CSES 2015a, 2015b). However, given the prevalence of economic voting specifically, we will report a robustness test using retrospective evaluations of the economy as well, which we derive from CSES modules 1 and 4.
The second variable of interest is an indicator of government alternation. Different indicators can be used to measure alternation. While we use these indicators to investigate the same characteristic of political systems, they imply a different understanding of politics by its citizens (Otjes and Willumsen, 2019). Therefore, we will test the conditioning effect of two different measures of alternation. Our first measure (Latest Government Change) is a dichotomous indicator of whether the most recent change of the government composition after elections was wholesale (code 1) or partial (code 0). We expect that the underlying mechanism is that when voters experience one kind of government alternation, they will expect the same kind of government alternation in the future (Otjes and Willumsen, 2019). Therefore, we will focus on the effects of recent alternation on voters' willingness to hold incumbents accountable for their performance. It seems reasonable that voters refer to the latest change of government as a heuristic: voters have relatively short memories (Achen and Bartels, 2016). We additionally test the robustness of the results using two alternative operationalisations of this measure. The first alternative operationalisation looks at the share of government changes in the previous 10 years that were wholesale rather than partial (Share Government Changes), instead of just the latest change in government. This measure differs from the main measure in how it assumes voters develop expectations about the future: in the main measure we expect that the latest government change overrides all previous experiences, whereas in this alternative we expect that voters take a number of different events into account. The second alternative specification looks at the latest change but only when it occurred within a period of 10 years (Latest Government Change Limited). The main measure looks at the latest change in government even if it occurred a long time ago; e.g. before the 1997 UK election, the previous government change was in 1979. It is rather unlikely that voters take events from 18 years earlier into account when voting. These three measures all conceptualise partial and wholesale alternation as a dichotomy.
Our second main measure is the Government Turnover Index (Ieraci, 2012), which is the share of the government made up of parties that are new in government. In the case of wholesale alternation, this is always 100 per cent. The Government Turnover Index differentiates between small and large parties entering government. For example, the CDU and CSU joining the SPD in the German government as senior partner after the 2005 elections was a larger change in government composition than when the FDP joined the CDU and CSU in government after the 2009 election. The Government Turnover Index takes this into account by looking at the share of seats taken by parties that were previously not in government. This was 50 per cent in 2005 (the CDU/CSU had 226 seats compared to the 222 of the SPD that continued in government) and 28 per cent in 2009 (the FDP had 93 seats compared to the 239 of the CDU/CSU). The Government Turnover Index implies a different understanding of the political system by voters. It assumes that the feeling that they can send the rascals out scales with the share of the government that was sent home the previous time a government changed. Consequently, a two-thirds change of government affects voters twice as much as a one-third change, and if the entire government changes, the effect is three times as large.
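The seat-share calculation behind the two German examples can be reproduced in a few lines of Python. This is a minimal sketch that assumes the index is simply the seats of newly entering government parties divided by all government seats; the function name and input format are hypothetical, and the seat figures are the ones quoted above.

```python
def turnover_index(government_seats, previous_government):
    """Seat share of the new government held by parties that were not part of
    the previous government.

    government_seats:    dict mapping party name to seats (new government).
    previous_government: set of party names in the outgoing government.
    """
    total = sum(government_seats.values())
    if total == 0:
        raise ValueError("government holds no seats")
    new_seats = sum(
        seats
        for party, seats in government_seats.items()
        if party not in previous_government
    )
    return new_seats / total


# 2005: CDU/CSU (226 seats) joins the SPD (222 seats), which stays in office.
germany_2005 = turnover_index({"CDU/CSU": 226, "SPD": 222}, {"SPD", "Greens"})

# 2009: FDP (93 seats) joins the CDU/CSU (239 seats), which stays in office.
germany_2009 = turnover_index({"CDU/CSU": 239, "FDP": 93}, {"CDU/CSU", "SPD"})

print(round(germany_2005, 2), round(germany_2009, 2))
```

Under wholesale alternation no party of the new government was in the previous one, so the function returns 1.0.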
Similar to the measures of the wholesale-partial dichotomy, we first look at the latest change in government. We also report robustness tests for the average of government changes in the previous 10 years (Average Government Turnover Index), and the most recent government change if it was within a period of 10 years (Government Turnover Index Limited). The combination of these results will provide a robustness test of the effects of government alternation on retrospective voting, and will indicate which aspects of alternation are most important.
Besides these variables of interest, we include some additional variables to control for their impact (Stiers and Dassonneville, 2020). More specifically, we include the standard socio-demographic variables. Gender binary self-identification is included with male respondents as reference category. Age is included as the age of the respondent in the election year. Educational level is a categorical variable distinguishing four categories: (1) voters that had no, primary, or incomplete secondary education (reference category); (2) voters with secondary education; (3) voters with post-secondary training but no completed university training; (4) voters with university training. We also include the ideological position of the respondent as self-placement on the ideological continuum ranging from 'left' (value 0) to 'right' (value 10). Descriptive statistics of these variables are included in Appendix A.
Besides these covariates, we conduct an additional robustness test in Appendix E. As our argument strongly builds on the literature on clarity of responsibility, it is important to test whether our results are determined by government alternation or a mere reflection of clarity of responsibility. To test this, we estimate additional models, including an indicator of the clarity of responsibility of the election under investigation. To do so, we replicate the measure of 'government clarity' as operationalised by Hobolt et al. (2013) -as this was the measure they found to matter most for retrospective voting. Their measure of clarity of responsibility includes indicators of (1) coalition government or not; (2) cohabitation in semi-presidential regimes or not; (3) ideological cohesion of the government; (4) dominance of the main governing party. As clarity of responsibility is expected to moderate retrospective voting, in our robustness test, we include this indicator in interaction with performance evaluations.
As the dependent variable is binary, distinguishing a vote for an incumbent party from an opposition vote, we estimate logistic regression models. However, using data from different election studies, the observations within the respective election studies cannot be assumed to be independent of each other. To account for this clustering, we estimate multilevel models with random intercepts (Gelman and Hill, 2007). We are interested in the conditioning effect of government alternation for retrospective voting, and consequently include interactions in our models (Brambor et al., 2006). As these are cross-level interactions, we include random slopes for the individual-level performance evaluations (Heisig and Schaeffer, 2019). Finally, all continuous individual-level variables are group-mean centred so that we are left with a pure estimate of their effects (Enders and Tofighi, 2007).
Table 1 summarises the results of the models explaining incumbent voting. Models 1 and 2 focus on the wholesale or partial alternation of the latest government; Models 3 and 4 focus on the Government Turnover Index for the latest change in government.
Note: Entries are log-odds coefficients; standard errors are reported in parentheses. Data: CSES modules 2 and 3. Significance levels: +: p < 0.10; *: p < 0.05; **: p < 0.01; ***: p < 0.001.
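The modelling strategy described in this section can be written compactly as a two-level logit. The notation below is an assumption introduced for illustration and does not appear in the original: y_ij is the incumbent vote of respondent i in election study j, x_ij the performance evaluation (group-mean centred), A_j the alternation measure, z_ij the vector of individual-level controls, and u_0j and u_1j the random intercept and random slope. The block assumes the `amsmath` package.

```latex
\begin{align}
\operatorname{logit}\,\Pr(y_{ij}=1)
  &= \beta_{0j} + \beta_{1j}\,(x_{ij}-\bar{x}_{j})
     + \boldsymbol{\delta}^{\top}\mathbf{z}_{ij} \\
\beta_{0j} &= \gamma_{00} + \gamma_{01}\,A_{j} + u_{0j} \\
\beta_{1j} &= \gamma_{10} + \gamma_{11}\,A_{j} + u_{1j}
\end{align}
```

In this notation the hypothesis corresponds to a positive cross-level interaction, i.e. a positive value of \gamma_{11}: the slope of performance evaluations is steeper where the latest alternation was wholesale or the turnover index is higher.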

Results
First, the results presented in Table 1 support the traditional retrospective voting theory. As the coefficients of performance evaluations in Model 1 and Model 3 show, the more positive a voter's evaluation of the government's performance, the higher their probability of supporting an incumbent party. By itself, wholesale alternation is negatively related to the likelihood of voting for the government (Model 1). Our interest, however, lies in the extent to which the effect of performance evaluations is conditioned by past experiences with government alternation. To test our hypothesis, we include interactions between performance evaluations and our respective measures of government alternation. First, in Model 2, we interact voter perceptions of performance with the dummy indicator of whether the previous government alternation was wholesale or partial. The coefficient of the interaction is positive and significant, indicating that the coefficient of retrospective evaluations is larger in countries with wholesale government alternation than in countries with partial alternation. However, as interaction coefficients in logistic regression are hard to interpret, we calculate the predicted probability of voting for an incumbent party at different levels of satisfaction for partial and wholesale alternation systems respectively. These are displayed in Figure 1.
The results in Figure 1 confirm the finding of Table 1: wholesale government alternation positively moderates retrospective voting. At lower levels of satisfaction, voters living in systems with wholesale alternation are less likely to support an incumbent than dissatisfied voters in systems with partial alternation. For very satisfied voters, this is reversed, as satisfied voters in wholesale alternation systems are more likely to support the incumbent than voters in partial alternation systems. Hence, overall the effect of satisfaction on incumbent voting is larger in systems with wholesale alternation. This also becomes clear when looking at the average marginal effects (Mood, 2010), which we display in Figure D.1 in Appendix D: when the previous government alternation was partial, a one-unit increase in performance evaluation increases the probability of voting for an incumbent party by 24.87 percentage points. For voters having a wholesale alternation in mind, the average marginal effect is 31.51 percentage points. Hence, previous wholesale alternation increases the effect of performance evaluations on the vote by more than 25% compared to partial alternation.
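The 'more than 25%' figure follows directly from the two average marginal effects reported above. A short Python check of that arithmetic, using only those two reported numbers (the variable names are illustrative):

```python
# Average marginal effects in percentage points, as reported in the text.
ame_partial = 24.87
ame_wholesale = 31.51

# Absolute difference in percentage points.
difference = ame_wholesale - ame_partial

# Relative increase of the effect under wholesale alternation.
relative_increase = difference / ame_partial

print(f"difference: {difference:.2f} percentage points")
print(f"relative increase: {relative_increase:.1%}")
```

The relative increase is roughly 27%, consistent with the statement that the effect is more than a quarter larger after wholesale alternation.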
Our second indicator of government alternation is the Government Turnover Index. The interaction coefficient, presented in Model 4 of Table 1, is also positive and significant, showing that this indicator also positively moderates retrospective voting. To get a better view of the effect, we plot the predicted probability of voting for an incumbent party at different levels of satisfaction at the first and third quartile of the Government Turnover Index respectively in Figure 2.
The results in Figure 2 are fully in line with those in Figure 1. Again, dissatisfied voters are less likely to vote for an incumbent party in wholesale alternation systems, while satisfied voters are more likely to support the incumbent in these systems. The average marginal effects (Figure D.2 in Appendix D) also show that retrospective performance evaluations have a stronger effect on voting behaviour when a larger share of the government has changed in the latest election. More specifically, the effect is almost 70% higher in full alternation systems compared to systems with minimal alternation. This moderation effect is comparable to that found in other studies (see, for instance, Stiers and Dassonneville, 2020).
Figure 2 shows predicted probabilities and 95% confidence intervals of voting for an incumbent party at different levels of government satisfaction for systems with a Turnover Index at the first quartile (0.52; black line) and systems with a Turnover Index at the third quartile (1; grey line) respectively.

Robustness test
To test the robustness of the findings presented here, we estimated a series of alternative models. Firstly, it is possible that at least a part of this result could be explained by the difference in clarity of responsibility between these different systems. To test for this, we estimate the models again, this time including an interaction between performance evaluations and the indicator of government clarity designed by Hobolt et al. (2013). The results are summarised in Table E.1 in Appendix E. When we control for the interaction between performance evaluations and clarity of responsibility, the conclusion is that government alternation significantly conditions retrospective voting. Furthermore, the interaction between performance evaluations and clarity of responsibility is no longer significant. However, it needs to be noted that there is a substantial correlation between government alternation and government clarity, so these results should be interpreted with caution.
Secondly, while our focus is on government alternation at the election preceding the election under investigation, we also tested alternative indicators spanning the 10 years before the election. These indicators measure the share of full government alternations in the 10 years before the election and the average Turnover Index over the same period respectively. The results of these tests are included in Table B.1 in Appendix B, and they support the hypothesis and the main findings under this specification. That is, the interactions between performance evaluations and the indicators of government alternation are significant at the 0.1 level. These results support our conclusions, but indicate that recent experiences with government alternation play a greater role in the minds of voters than earlier elections. Besides these measures of the averages of our indicators over the previous 10 years, we also estimated the models using the same measures as those presented in Table 1, but with a 10-year time limit. These measures take into account that it might be unrealistic to expect a government formation from more than 10 years ago to influence current elections. The results of these models are summarised in Table B.2 in Appendix B. We still find a significant effect of the interaction between performance evaluations and the binary indicator, that is, whether or not the previous election resulted in wholesale alternation. The interaction between performance evaluations and the Government Turnover Index, however, is not supported.
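To make the difference between the latest-election measure and the two 10-year indicators concrete, the following minimal Python sketch computes all three for a single election. It is an illustration under stated assumptions, not the code used for the analyses: the `Formation` record, the function names, the example values, and the treatment of a Turnover Index of 1 as wholesale alternation are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Formation:
    """One post-election government formation (hypothetical record layout)."""
    year: int
    turnover: float  # share of the government that changed, between 0 and 1


def is_wholesale(f: Formation) -> bool:
    # Assumption: a Turnover Index of 1 is treated as wholesale alternation.
    return f.turnover >= 1.0 - 1e-9


def latest_election_measure(history, election_year):
    """Turnover Index of the most recent formation before `election_year`."""
    prior = [f for f in history if f.year < election_year]
    if not prior:
        return None
    return max(prior, key=lambda f: f.year).turnover


def ten_year_indicators(history, election_year, window=10):
    """Share of wholesale alternations and mean Turnover Index in the window."""
    recent = [f for f in history
              if election_year - window <= f.year < election_year]
    if not recent:
        return None
    share_wholesale = mean([1.0 if is_wholesale(f) else 0.0 for f in recent])
    average_turnover = mean([f.turnover for f in recent])
    return share_wholesale, average_turnover


# Purely illustrative values, not taken from the data used in the paper.
history = [Formation(2002, 0.5), Formation(2006, 1.0), Formation(2010, 0.25)]
latest = latest_election_measure(history, 2012)
share, average = ten_year_indicators(history, 2012)
print(latest, round(share, 3), round(average, 3))
```

In this illustrative history the single wholesale alternation in 2006 still contributes to both 10-year indicators at the 2012 election, whereas the latest-election measure reflects only the 2010 formation.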
Thirdly, our measure of retrospective performance evaluations differs from the indicator used by a number of earlier studies, that is, evaluations of the state of the economy. While we think it is preferable not to use evaluations from one domain in particular, it is useful to test how our results relate to the economic voting literature. We use the data of the first and fourth CSES modules to supplement our analyses with economic evaluations, as the second and third modules of the CSES do not include evaluations of the economy. We specify these models to be as similar as possible to the models presented here. The results, summarised in Appendix C, show further support for our findings. While there is strong evidence of a positive moderation effect for the first, binary indicator of alternation, the moderation by share of government change only reaches the 0.1 level of significance.

Conclusion
Central to this paper is the question of whether democratic accountability is greater in systems characterised by wholesale alternation than in systems characterised by partial alternation. We find that this is indeed the case: voters are more likely to consider government performance while voting if, the last time the government changed, all government parties were new. Depending on the indicator employed, the effect of government evaluations on voting increases by between 25% and 70%. We find that this effect persists when looking at different operationalisations of wholesale alternation, a different time horizon, and an alternative measure of government performance.
Our results speak to two literatures: the literature on retrospective voting and the literature on wholesale and partial alternation. On the one hand, this study contributes to the growing interest in the effect of institutional factors in the study of retrospective voting (Anderson, 2000;Hobolt et al., 2013;Silva and Whitten, 2017). We show that party system characteristics, in our case the way party competition for government office is structured, affect the extent to which evaluations of the government influence how citizens vote. If the most recent elections resulted in a wholesale government change, citizens are more likely to take government performance into account while voting. If the most recent government was the result of backroom talks rather than the election results, these considerations are considerably weaker.
On the other hand, our results support a long-held belief among those who are interested in differences between wholesale and partial alternation systems (Bergman and Strøm, 2011;Mair, 1997): in systems with wholesale alternation, democratic accountability is significantly greater than in systems with partial alternation. Citizens are more likely to 'vote the rascals out'. Systems with wholesale alternation reflect more closely the model of democracy that Schumpeter (1943) envisioned: citizens can choose between different teams of governors. If they are happy with the current government, they can back one of the government parties; otherwise they can support the opposition. The electoral feedback mechanism that is crucial in traditional models of democratic functioning, such as Easton's (1965), appears to be somewhat stronger in systems with wholesale alternation.
Democratic accountability seems to be greater in systems with wholesale alternation. However, this may come at the expense of other desired aspects of a well-functioning democratic system. In particular, greater democratic accountability may weaken the quality of policy representation (Quinn, 2016). Voters who take policy performance into account may be less likely to vote for the party that is closest to their policy preferences. Future research may want to delve deeper into the role of policy considerations in voting, comparing systems characterised by wholesale and partial alternation.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Dieter Stiers was supported by the Research Foundation - Flanders.

Supplemental material
Supplemental material for this article is available online.

Notes
1. In our view, when voters consider what their vote can affect, they are likely to think about what previous elections have affected. To us, it seems unlikely that they take into account changes that occurred at other times (that is, when a government changed without elections). Those cases were unaffected by what voters did.
2. There is also a mathematical reason why a latest-election measure works better than an average of the previous decade of elections. Consider the situation where there is a deviation from the norm in 1 year: say, there is wholesale alternation in a system where there have been decades of partial alternation. Suppose that citizens' expectations are not affected by this aberration and voters still expect partial alternation. This deviation will then provide an 'incorrect' measure for only 1 year using the latest-election measure, but it will distort the values for a decade of elections for the other measure.
3. There is one alternative measure that we did not employ, and that is a measure of ideological alternation (e.g. Horowitz et al., 2009;Pellegata, 2016;Tsebelis, 2002;Tsebelis and Chang, 2004;Zucchini, 2011). This is a measure of the ideological distance between successive governments. This brings an element of left-right voting into the analysis. We do not think that this is relevant to our argument because we look at retrospective voting in terms of a respondent's assessment of the job the government did (i.e., a valence consideration), not the ideological proximity of the policy output (i.e., a spatial consideration).