Measuring MPs’ Responsiveness: How to Do it and Stay Out of Trouble

This article reviews the issues raised by the reaction to an audit experiment, studying the responsiveness of British MPs to their constituents, in November and December 2020. The experiment was part of a wider comparative project investigating the linkage between legislators and their constituents. We sent two short emails to all MPs asking how they and their party were going to respond to the economic impact of the COVID-19 pandemic. We were required by our ethics committee to debrief the subjects, providing the opportunity to withdraw from the analysis. The scale of the reaction to the debriefing email was neither desired nor anticipated (https://www.bbc.co.uk/news/uk-politics-56196967). We explain how we got ourselves into such difficulty, how we might have stayed out of it and the wider implications of our experience for experimental research on politicians. We reflect on the ethical issues raised by the reaction to our research, alongside the role that communications with legislators, the wider parliamentary community and the media should play in research design when conducting experiments with politicians as subjects.

The subject of this article is the public reaction to the UK case study of 'The Nature of Political Representation in Times of Dealignment' (NAPRE) project. The project aims to understand better the nature of the relationship between legislators and their constituents, a fundamental concern of representative democracy. The motivation for our international collaboration is to examine the connection between citizens and their elected representatives in the context of partisan dealignment (Dalton, 2016; Dalton and Wattenberg, 2000; Mair, 2013), where the linkage role played by political parties is diminished. Our comparative research project involves audit experiments of the British, German and Dutch legislatures and quantitative text analysis of parliamentary speech. The direct link between citizens and their elected representatives is apparent in first-past-the-post single-member constituency systems (Hanretty et al., 2017), such as the one that operates in British General Elections, but it is a feature to some extent of all representative democracies (Carey and Shugart, 1995). The three case study countries are arrayed across the spectrum of district sizes (number of legislators that represent each district), from a district size of 1 in the United Kingdom to 150 in the Netherlands, where all of the legislators represent one national 'district'. The German case sits in the middle of these two poles with a mixed-member system combining 299 single-member districts with 16 multimember districts (with between 4 and 134 members per district). The comparative approach enables us to explore the extent to which dyadic representation (individual-level linkage between legislators and citizens) provides a viable supplement to partisan mechanisms of representation in European contexts and to understand how electoral contexts (such as district size) either facilitate or hinder the responsiveness of individual MPs towards citizen-initiated contacts (Heitshusen et al., 2005).
In our view, audit studies of MPs' responsiveness using email experiments are justified if a strong case can be made regarding the scientific and social importance of the research; the inadequacy of other methods to obtain robust evidence; and a minimal burden on MPs and their staff (for a full discussion, see Zittel et al., 2021). In this article instead of reflecting on the ethics of correspondence study field experiments with political elites generally we narrow down and focus on the specific case of our experiment and the lessons we believe can be learnt from the public fallout.
The public criticism of our study largely focused on four issues: that we were wasting the valuable time of MPs and their staff; that we were doing so in the middle of a global pandemic; that we were using deception; and that our methods were unlikely to yield significant results. Linked to these concerns were questions as to whether, and how, we had received ethical approval for the research.

Research Design
The email experimental design was replicated across the three country studies. Instead of altering one constituent characteristic at a time in each email, we used a factorial design that allowed us to manipulate four characteristics at once (specifically partisanship, occupational class, gender and ethnicity). The emails were sent out in two waves. The topic of the email remained constant across waves, but the wording was altered to reduce the risk of detection. Thus, each MP received a total of two short emails from a fictitious constituent with a random combination of the listed characteristics (ethnicity and gender were signalled through variations in the constituent's name) and a random allocation of the version of the topic email. We completed a pre-analysis plan that fully describes the research design (Baumann et al., 2020).
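The assignment procedure described above can be illustrated with a minimal sketch. It is not the project's actual code, and the pre-analysis plan remains the authoritative description. The factor levels, the wording labels, the seed and the rule that each MP receives both wordings in random order are all assumptions made for illustration; the article names the four factors but does not list their levels.

```python
import itertools
import random

# Hypothetical factor levels: the article names the four factors, not their levels.
FACTORS = {
    "partisanship": ["co-partisan", "no party cue"],
    "occupational_class": ["working class", "middle class"],
    "gender": ["female", "male"],
    "ethnicity": ["ethnic minority", "white British"],
}
WORDINGS = ["version A", "version B"]  # hypothetical labels for the two email wordings

# Every cell of the full factorial design (2 x 2 x 2 x 2 = 16 profiles here).
PROFILES = [
    dict(zip(FACTORS, combo)) for combo in itertools.product(*FACTORS.values())
]


def assign(mp_ids, seed=2020):
    """Draw one random profile per wave for each MP, with the wording order shuffled."""
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced
    plan = {}
    for mp in mp_ids:
        wordings = rng.sample(WORDINGS, k=2)  # each wording used once, in random order
        plan[mp] = [
            {"wave": wave, "wording": wordings[wave - 1], **rng.choice(PROFILES)}
            for wave in (1, 2)
        ]
    return plan


plan = assign(range(650))  # one entry per MP, two emails each
```

Because every email carries a level of all four factors, each email contributes to the comparison for every characteristic, which is what allows a factorial design to test several hypotheses without sending additional emails.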

Debriefing
Our original ethics application was declined and the committee's preferred approach was that we send all MPs a pre-experimental briefing email with the option to withdraw before the study took place. Our concern was that this would undermine the experiment by reducing the sample size and introducing bias, as the MPs who withdraw may differ systematically in some way from those who agree to participate. Equally we feared that a pre-brief might have primed MPs to anticipate emails from fictitious constituents, which might have altered their behaviour. We felt that there was a strong public interest argument for observing MPs undertaking their work representing constituents in a minimally invasive way that outweighed the requirement to gain informed consent.
The KCL Ethics Committee responded to our concerns and agreed that deception is sometimes an acceptable part of research and that, given our participants were a powerful elite, an alternative approach to pre-briefing would be justifiable. After a face-to-face discussion, a post-experimental debrief was agreed as a compromise solution. Our view, having surveyed the extant literature (Zittel et al., 2021), was that informed consent was not required for us to conduct an audit experiment of elected officials' correspondence. There have been two other audit experiments involving MPs in recent years in the UK (Habel and Birch, 2019; McKee, 2019) demonstrating high responsiveness overall and a level of bias against working class and ethnic minority constituents. Our aim was to contribute to this field with a comparative analysis seeking to understand the impact of electoral context on MPs' responsiveness. The two earlier UK studies provided us with hard evidence that the methods uncover findings unavailable through alternative approaches. They did not require a debrief and were not publicly exposed. The same is true of the two other country studies in the NAPRE project, Germany and the Netherlands, neither of which was required to pre-brief or debrief MPs. A study in Denmark which was due to take place alongside our study (although not funded through the same scheme) was cancelled at the last minute in late 2020 because Aarhus University required a debrief with an opt-in mode. The Aarhus ethics committee also ruled out any comparison of background statistics for those opting in and those not opting in, making it impossible to assess any obvious differences between participants and non-participants. The lead researcher of the Danish study believed that these restrictions would undermine the research.
We had serious doubts about employing a debrief, but these doubts related to loss of cases from the study rather than risk of exposure. The risk of exposure is something we considered in planning and we had prepared statements in the case of exposure (MPs discovering the experiment by accident). In hindsight we should have prepared for the debrief as if it were an exposure. As all data could be withdrawn, we did not see the debrief as exposure per se but instead as the means to achieve informed consent. But the debriefing email that arrived in MPs' inboxes was a self-detonated public exposure primed to go off and potentially undermine the study and damage relationships between MPs, their staff, and the research community. In hindsight the UK Principal Investigator (Rosie Campbell) should have seen that a debrief would have had exactly the same effect as a pre-brief in terms of withdrawals from the study. In addition, a debrief ensures public exposure, and we should have prepared for a potential backlash against the study and the use of experimental methods to study political elites, or alternatively decided not to go ahead with the research.
In our view, informed consent should not be required in audit studies of elected officials, but where consent procedures such as a debrief are employed then considerable attention should be given to the communications surrounding the study. MPs' staff were not the subjects of the research, and no data were collected relating to them as individuals. However, this is not how the debrief was interpreted by many. Our strategy focused too much on the potential harm to the subjects (MPs) and not enough on communication. We did not fully consider the feelings of MPs' case workers when opening the debrief. The negative impact on MPs' staff is a matter of regret and something for which the Principal Investigator has publicly apologised. We adapted a standard debriefing to our study; we should also have included a concise plain English summary of exactly what data we were collecting, from whom and why. We should have focused more attention on describing the experimental method and explaining that we were comparing control and treatment groups and not rating the responsiveness of individual MPs' offices. This primer would also have been extremely beneficial in aiding communications with the news media.
Lesson 1: A lesson we have learned and wish to share is that planning and managing communications should be a core activity for audit studies of political elites.

Scientific and Social Importance
The question of MPs' responsiveness to their constituents is clearly a matter of public interest and one that warrants research. A principle of representative democracy is equality of representation, and every citizen's vote is of equal weight in electoral terms. An extension of the logic of equal representation is that there ought not to be systematic bias against citizens from particular social backgrounds when we measure MPs' responsiveness; my gender, ethnicity or social class ought not to be a factor explaining whether my concerns are likely to be responded to by my elected representative. The most reliable way to measure MPs' responsiveness, and assess any bias, is through experimental methods such as randomised controlled trials or natural experiments. Randomised controlled trials involve the random allocation of subjects to control and treatment groups. Randomisation controls for the impact of extraneous factors such as the size or relative prosperity of a constituency, MPs' party and case load. Using these methods, we can avoid the effects of social desirability bias, whereby participants tend to answer questions in a manner that will be viewed favourably by others. This bias limits the usefulness of approaches that simply ask MPs how responsive they are, through surveys or interviews.
We had strong grounds to expect that our research would yield findings of scientific and social importance. First, from the extant literature, we had good reason to anticipate sufficient variation in the levels of MPs' responsiveness to their constituents to ensure that the experiment yielded useful data. While in the UK the majority of MPs respond to email enquiries with a request for a residential address, a sizable proportion either do not answer or provide a substantive reply. Second, previous research had shown a relationship between constituents' backgrounds and MP responsiveness. Research from the United States finds lower rates of responsiveness among legislators to constituents from ethnic minorities (Butler and Broockman, 2011; Costa, 2017) and those from economically disadvantaged backgrounds (Butler, 2014). In the UK previous experimental studies have found very high rates of responsiveness overall among British MPs, but some bias against ethnic minority and working-class constituents (Habel and Birch, 2019; McKee, 2019).

Deception
Audit studies are now commonplace across much of the public and private sector and have revealed multiple biases that would be extremely hard to access with informed consent. Traditional methods such as interviews may not provide crucial evidence because research subjects may not be conscious of their own biases and they may wish to deny them for social desirability reasons if they are aware (see Butler and Desposato's contribution to this special section). Without robust evidence that members of particular social groups (such as ethnic minorities) receive fewer or lower quality responses from public officials and companies, we lack the evidence required to create equality of access and opportunity. For example, a recent email experiment study has shown that gay men receive lower quality responses from foster care agencies (Mackenzie-Liu et al., 2021); this information is now in the public domain and has galvanised support for remedial action. There are multiple examples of audit studies informing government policy. One very notable case is former Prime Minister David Cameron's citation of an audit study conducted by the National Centre for Social Research on behalf of the Department for Work and Pensions (Wood et al., 2009). The study involved sending applications from three fictitious candidates to formally advertised job vacancies. The response rate to applicants with white-sounding names was 10.7% compared with 6.2% for applicants with ethnic-minority-sounding names. David Cameron cited this study in his 2015 conference speech and used it to promote name-blind recruitment processes. Given the central role politicians play in policy making and political representation, political elites should not be exempt from this kind of scrutiny.
In fact, given the crucial public role elected politicians play in our society, we consider it imperative that their behaviour should be the subject of rigorous research, and when measuring bias in responsiveness, audit studies are the most rigorous method available.
The deception deployed is another common cost associated with audit studies and one that ideally should be avoided. Some innovative audit experiments have been designed using real individuals (Kessler et al., 2019). We considered using real constituents but concluded that it was not the optimal option for our study. To do so, we would have needed to recruit two residents in every constituency in the UK (1300 real people). Then to test our hypotheses, regarding the impact of constituents' backgrounds on responsiveness, we would have had to ask those real people to write letters pretending to be fictitious other people with the requisite background characteristics. This approach would add another layer of deception and we felt that the additional cost of the participants' time would not be warranted. Another alternative would be to recruit individuals and use their actual characteristics in the experimental design. Again, this would have been a very challenging undertaking to ensure that we had, for example, sufficient participants from ethnic minority backgrounds from a diversity of constituencies. To ensure sufficient variation across characteristics, the sample size would need to be significantly larger than in our fictitious email experiment. Thus, the costs would be escalated still further by the addition of the contribution of constituents' time and a multiplication in the number of emails reaching MPs from constituents as part of the experiment.

Time and Timing
Researching MPs' responsiveness requires a contribution of some kind from MPs and/or their constituents, for example, in the form of interviews or taking part in surveys. These methods provide useful insights, but they have clear limitations: namely, they depend on politicians' own accounts of their interactions with constituents or constituents' recall of their interactions with their MP. Experimental methods provide the only robust means of gaining a reliable measure of MPs' responsiveness and are increasingly commonly used. However, there are costs of employing experiments on MPs' responsiveness. One is that they take up the time of MPs and their staff, which would otherwise be spent responding to genuine constituents' concerns. This issue was at the forefront of our minds when designing the experiment and we ran power calculations to ensure that we sent out the absolute minimum number of emails required to test our hypotheses. We calculated that two short emails would be sufficient to test our main hypotheses. We opted to keep these two emails very brief and they were designed to elicit a general response and take up minimal response time (see sample email below). We did not follow up with responses of any kind. The King's College London College Research Ethics Committee were of the opinion that, on balance, the potential benefits of the research outweighed the burden of time spent responding to two emails sent to each MP. However, the ethical review process did not consider the timing of the research. Instead, we, the researchers, were responsible for the planning and conduct of the approved project. In hindsight, it is clear to us that the timing of the project was problematic. When we originally received funding, the COVID-19 pandemic had not yet struck. We delayed the experiment and deliberated with the international team about how best to proceed.
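The kind of power calculation mentioned above can be sketched as follows. This is a generic textbook approximation for comparing two response rates, not a reproduction of our own calculations. The response rates of 80% and 72%, the 5% significance level, the 80% power target and the treatment of each characteristic as a two-level factor are all hypothetical values chosen only for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p_control, p_treatment, alpha=0.05, power=0.80):
    """Emails needed per arm for a two-sided test of two proportions.

    Uses the standard normal-approximation formula for comparing two
    independent proportions with equal group sizes.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p_control + p_treatment) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(
            p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
        )
    ) ** 2
    return ceil(numerator / (p_control - p_treatment) ** 2)


# Hypothetical response rates, for illustration only.
needed = n_per_group(0.80, 0.72)

# 650 MPs x 2 emails, split evenly across the two levels of one (assumed binary) factor.
available = 650 * 2 // 2
```

Comparing `needed` with `available` shows whether a given number of emails per MP could detect a difference of the assumed size. Smaller assumed differences require more emails, which is the trade-off between statistical power and the burden placed on MPs' offices.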
We started the experiment on 2 November 2020, after concluding that these issues remained of public importance, and that two basic emails without follow-up would not be too burdensome. Given the stress that many MPs and their staff were under, responding to the urgent needs of their constituents during the crisis, we can now imagine how the discovery, through the debriefing email, that they had been sent emails from fake constituents caused distress. While we sincerely regret the timing of the study, we strongly defend the original research design and crucially the necessity of other researchers being able to deploy experimental methods when studying political elites.
Lesson 2: Clearly one lesson of the spectacular backfiring of our project is to integrate the question of timing into planning and to reassess the costs and benefits of the study in light of contemporary events through a formal process.

Impacts on Researchers
Having worked with MPs extensively over the years and found them to be supportive of academic research, the principal investigator seriously underestimated the reaction to this study. The emails that were sent to MPs should have all been sent from the PI's email account and not from junior members of the team. It is common in political science for junior researchers (including PhD students) to have frequent contact with MPs and it simply did not occur to the PI that this would prove problematic. In fact, one of the two other MP email experiments that have been undertaken in the UK was conducted by a PhD student. The overwhelming majority of email replies from MPs were courteous but some were more vitriolic and should have been directed at the Principal Investigator responsible for the research (example excerpts below).
'Dear Mesdames, Thank you for your email about your dubious "research" exercise. I consider it wholly disreputable'.
'There is a great deal I could say in response to your disgraceful conduct, but I am not prepared to waste any more of my staff's or my time on it'.
'You have brought shame on your university'.
'As a former XXXXX I am disgusted at your conduct'.
Although the overwhelming majority of the emails we received were complaints regarding the study, we did receive a small number of more positive emails (example excerpts below).
'Thanks for coming back to me and for your statement. To my mind, the study seemed a perfectly reasonable one'.
'I would like to say I am surprised at the level of distress your study has caused and don't think the vitriol is deserved'.
Lesson 3: We would recommend that audit studies of elected officials include email correspondence only from the principal investigator leading the project.

The Impact on Response Rates
Overall, 161 of the 650 MPs (25%) in the British House of Commons requested that their data be withdrawn from the study. Within the remaining dataset, 74% of MPs responded to the emails in some form and 26% did not respond at all. In total, 54% of MPs replied requesting a postal address (we did not send any follow-up emails in response) and 21% replied with a substantive email addressing the issues raised. Among MPs who responded, 29% followed up with a substantive email. Thus, we have sufficient variation in the remaining data for subsequent analysis but have lost sample size and therefore statistical power, and there may be a relationship between MPs' responsiveness and their decision to withdraw their data that we are unable to investigate.
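The figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported in this section and assumes that the percentages refer to the MPs remaining in the dataset after withdrawals. Because the inputs are rounded percentages, the recomputed share can differ from the reported one by up to a percentage point.

```python
total_mps = 650
withdrawn = 161
remaining = total_mps - withdrawn  # MPs left in the dataset after withdrawals

withdrawal_rate = withdrawn / total_mps  # reported as 25%

# Reported shares of the remaining MPs (rounded percentages from the text).
responded = 0.74
substantive = 0.21

# Share of responding MPs who sent a substantive reply (reported as 29%),
# recomputed here from the rounded inputs.
substantive_among_responders = substantive / responded

print(f"Remaining MPs: {remaining}")
print(f"Withdrawal rate: {withdrawal_rate:.1%}")
print(f"Substantive replies among responders: {substantive_among_responders:.1%}")
```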

Conclusion
It is our hope that researchers will continue to conduct audit studies of political elites and that our experience will not dissuade others from conducting rigorous research into representative democracy. With more careful consideration of timing and communications, this project might have been received very differently.
In our view, there is a serious question as to whom these data belong: MPs or the public. The only personal information any of the UK studies have collected is all in the public domain, such as MPs' names, email addresses and constituencies. The ethics of researching powerful elites is a developing field and there is more work to do. But overall, our view is that by pre-briefing or debriefing we are potentially placing an emphasis on mitigating harms to the subjects (MPs) rather than promoting the public interest.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship and/or publication of this article: The project is financially supported by DFG, ESRC (ES/S015728/1) and NWO under the Open Research Area (ORA).