Advantages, Challenges and Limitations of Audit Experiments with Constituents

Audit experiments examining the responsiveness of public officials have become an increasingly popular tool among political scientists. While these studies have brought significant insight into how public officials respond to different types of constituents, particularly those from minority and disadvantaged backgrounds, audit studies have also been controversial due to their frequent use of deception. Scholars have justified the use of deception by arguing that the benefits of audit studies ultimately outweigh the costs of deceptive practices. Do all audit experiments require the use of deception? This article reviews audit study designs differing in their amount of deception. It then discusses the organizational and logistical challenges of a UK study design where all letters were solicited from MPs’ actual constituents (so-called confederates) and reflected those constituents’ genuine opinions. We call on researchers to avoid deception unless necessary and to engage in ethical design innovation in their audit experiments, on ethics review boards to raise the level of justification needed for studies involving fake identities and misrepresentation, and on journal editors and reviewers to require researchers to justify in detail which forms of deception were unavoidable.


Introduction
Audit studies are a valuable tool in political science research for studying the preferences, biases and behaviours of powerful actors, offering robust evidence on important topics (e.g. discrimination, accountability, representation) under real-world conditions at reasonable costs. However, given the nature of their focus and the variation in key variables required for an audit study to work, most audit study designs require the use of deception and the violation of the core ethical principle of informed consent. Given the negative sentiment among publics and academics towards being a participant in research involving deception (Desposato, 2018), it is not surprising that a recent audit study involving deception received negative reactions from the elected officials involved, who argued that such studies redirect their attention away from real constituency or policy work (Campbell and Bolet, 2021; Guardian, 2021).
The rise of digital communication channels and experimental research within political science has rendered audit studies an attractive, relatively cheap and widely employed approach for political scientists. As a result, concerns around the use of deception in audit studies have gained prominence, especially for audit studies involving legislative elites.
With an ever-increasing number of audit studies targeting a small and finite number of political elites, there are legitimate concerns regarding the aggregate costs of these studies on the general public (Desposato, 2021), the responsible management of the common pool resource of 'public elites' for both future experimental (Butler and Desposato, 2021) and non-experimental research (Cowley, 2021) and the heterogeneity of ethical standards towards this type of research across Western European countries (Pedersen et al., 2021).
This article's contribution to the debate is twofold. First, it briefly outlines the discipline's understanding of deception and reviews the use of deception in different audit study designs. Second, we discuss a confederate-based MP communication experiment that we designed to try to minimize deception and the practical challenges this presented. We highlight how, in trying to eliminate all forms of deception from our design while also trying to run an informative experiment, we were forced to innovate and gather relevant contextual information, which ultimately improved the research. We conclude by calling upon researchers to think carefully about how deception can be eliminated from audit studies and for ethical review boards to require researchers to put more thought into alternative approaches, including pre-tests of various design elements and qualitative background research, before allowing audit studies involving fake identities and misinformation. Moreover, journal editors and reviewers should demand more detailed justifications for, and more detailed descriptions of attempts made to minimize, the use of deception.

Different Forms of Deception and Audit Study Designs
Over the past years, political scientists have become increasingly interested in defining what deception in research is and what forms it can take. While deception was mentioned only once in the American Political Science Association's (APSA, 2012) Guide to Professional Ethics, Rights and Freedoms, APSA's 2020 Principles and Guidance for Human Subjects Research has a dedicated section on deception and offers practical guidance and strategies for respecting participants' autonomy when deception is involved. The document distinguishes four forms of deception (APSA, 2020: 7):
1. Identity deception: Deception about who you are (a researcher in political science) or who you are working with.
2. Activity deception: Deception of what you are doing (e.g. research for social science) or the situation confronting research participants.
3. Motivation deception: Deception about the reasons for the research or the use to which the research or data will be put.
4. Misinformation deception: Providing false information about the state of the world, for example, by providing unreliable or inaccurate information about political candidates.
While the definitions of identity and misinformation deception are uncontroversial and commonly well understood, the distinction between activity and motivation deception is less clear. Activity deception is a binary question: Do subjects know that they are being studied? If no, then activity deception is used. Motivation deception is less clear, as it depends on how broadly or narrowly we circumscribe the subject's knowledge about the study's purpose and the uses to which collected data will be put. A broad definition of motivation deception would require subjects to broadly know what the study is about (e.g. MP constituency communication or MPs' group biases). A narrow definition of motivation deception, on the contrary, would require subjects to know the exact hypotheses being studied (e.g. MPs are more responsive and credit-claiming towards co-partisans compared to non-co-partisans, or MPs discriminate against minority constituents). Avoiding narrow motivation deception is a problem for even non-experimental research, as participants in large established surveys, such as the British Election Studies (BES), will often merely know that their responses will be used to study political behaviour, but not the specific hypotheses that will be tested using their responses. Moreover, experiments and research on sensitive topics at times require activity and motivation deception to avoid subjects changing their behaviour (known as the Hawthorne effect) or offering socially desirable answers, both of which would undermine the validity of the research findings. Hence, while motivation deception is probably the most common and benign form of deception, followed by activity deception, identity and misinformation deception are often perceived to be normatively most problematic. The extent to which the harms of deception can be justified depends on, among other things, the subjects studied. 
Public officials and other people who seek, hold or wield power in the political sphere are accountable to the public in different ways from ordinary citizens. Consequently, under certain conditions, the need to protect these subjects from certain harms may be less stringent. The greater the obligations and duties towards the public and the greater an official's role in designing, influencing or implementing public policy, the greater the degree to which harms related to their public duties are permissible (APSA, 2020: 3). Hence, regarding UK MPs, who are elected to represent constituents and have some influence on the legislative process, certain harms may be ethically permissible.
Audit experiments vary widely regarding the extent and forms of deception used. The classic audit study designs (e.g. Butler and Broockman, 2011; McClendon, 2016) all involve at least identity, activity and motivation deception. Depending on the research topic and design, audit studies might also involve misinformation deception, such as fictional information about a constituent's political preferences or concerns.
Confederate designs (e.g. Butler et al., 2012; De Vries et al., 2016; Grose et al., 2015) aim to reduce deception by getting actual constituents to contact their legislative representatives. This eliminates identity deception, although the extent depends on the research team's ability to recruit a sufficiently large and diverse confederate population. Moreover, while some studies take great care to match their confederates' political preferences to treatment letters (e.g. Butler et al., 2012), others merely assign volunteers to send prewritten letters that may or may not match the writer's views (e.g. De Vries et al., 2016; Grose et al., 2015), which involves some misrepresentation deception.
More recently, Landgrave (2020) proposed an audit design that eliminates all but motivational deception. He invited legislators to participate in a research survey in exchange for a donation to a college scholarship fund for either non-descript or Hispanic students, randomizing both the target and whether the donation would be made in the name of the legislator or anonymously. Although this design is innovative and involves low levels of deception while still allowing researchers to detect potential anti-Hispanic discrimination, it suffers with regard to realism, which is a key benefit of audit designs involving constituency letters: legislators are likely to engage differently with requests by researchers to participate in a survey than with constituency requests.
While it is difficult to avoid activity and motivation deception without a loss in realism that undermines the validity of an audit study, we believe that identity and misrepresentation deception can, and ideally should, be avoided. These forms of deception are mostly used for feasibility and cost reasons and may introduce rather than eliminate bias if MPs detect that they are not dealing with actual constituents (e.g. by choosing not to respond or providing socially expected responses). Below we outline a confederate design in the UK context, highlighting how it is possible to minimize the most severe forms of deception while maintaining a rigorous research design, but also how doing so involves practical challenges that increase researcher costs and necessitate careful planning.

A UK Confederate Design to Minimize Deception
Building on the work of Grose et al. (2015) and De Vries et al. (2016), we aimed to understand whether MPs in the UK vary in their responsiveness to emails on identical issues sent by constituents who vary in their policy positions. Our aim was to develop a well-powered within-MP experiment in the UK context, without the use of identity or misrepresentation deception. This created considerable organizational and logistical challenges. In particular, the design required us to find a large enough sample of pairs of confederates who resided in the same constituency but held genuinely different opinions on a political issue and who were both willing to send a standardized letter to their MP expressing their opinion. We refer to this as a 'constituency-matched' sample. Given that there was no survey organization that could provide us with such a sample of confederates, we opted to recruit confederates starting from the students at our various universities, which together offered a politically and geographically diverse pool. To figure out how to recruit a constituency-matched sample of constituents with differing views on a specific political topic most effectively, we pre-tested recruitment. This resulted in the following insights and practical solutions:
• Ambassador recruitment: Rather than recruiting confederates ourselves, we found it to be much more effective to predominantly rely on 'student ambassadors'. Student ambassadors had better access to student networks and possessed better information about the potential opinions and constituency location of fellow students. Selecting a geographically and politically diverse set of ambassadors improved recruitment efficiency. Moreover, having peers, rather than lecturers, approach students about sending a letter to their MP helps ensure that participation is truly voluntary.
• Training, equipping and paying student ambassadors: This specific recruitment approach required us to properly train ambassadors on the background of the project and the ethical considerations involved so that they could answer student questions. We provided each ambassador with a link to an online survey that was used for data collection and to record potential letter writers' informed consent to us opening an email account in their name and sending the pre-approved letter on their behalf. To incentivize ambassadors to recruit as many participants as possible from within their wider network, their compensation was linked to the number of participants they recruited. To avoid peer pressure and ensure voluntary participation, ambassador pay was kept deliberately low (£2 per completed survey) and not linked to participants' agreement to sending a letter on their behalf. Each ambassador was given a unique access code required to start the survey so that the research team could calculate pay and monitor progress and entry patterns.
• Wider recruitment: While recruiting confederates from different universities across the UK, we noticed that students are more homogeneous in their political opinions than anticipated, which is why we allowed ambassadors to recruit confederates outside the university network and encouraged them to tap into their family and social networks within their hometown and neighbouring constituencies. Choosing geographically and politically diverse ambassadors helped this wider recruitment to achieve more matched constituency pairs.
• Multiple topics: The pre-test also revealed that we should recruit on multiple topics rather than just one, as this would increase our chances of achieving constituency-matched pairs of confederates with differing opinions on an issue and improve the study's statistical power.
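The bookkeeping implied by this recruitment approach can be illustrated with a minimal Python sketch. It is not the authors' actual procedure: the record fields (`id`, `constituency`, `topic`, `position`, `access_code`), the two-valued coding of positions and the function names are all hypothetical, and the sketch assumes that every record represents one completed survey. Only the £2 rate per completed survey comes from the text.

```python
from collections import Counter, defaultdict

PAY_PER_SURVEY_GBP = 2  # stated in the text: £2 per completed survey


def matched_pairs(responses):
    """Pair respondents from the same constituency who hold opposing
    positions on the same topic (hypothetical 'support'/'oppose' coding)."""
    by_cell = defaultdict(lambda: {"support": [], "oppose": []})
    for r in responses:
        if r["position"] in ("support", "oppose"):
            cell = by_cell[(r["constituency"], r["topic"])]
            cell[r["position"]].append(r["id"])

    pairs = []
    for (constituency, topic), sides in sorted(by_cell.items()):
        # zip stops at the shorter side, so unmatched respondents are dropped
        for supporter, opponent in zip(sides["support"], sides["oppose"]):
            pairs.append((constituency, topic, supporter, opponent))
    return pairs


def ambassador_pay(responses):
    """Pay per ambassador access code, based on completed surveys only
    (not on whether the participant agreed to send a letter)."""
    counts = Counter(r["access_code"] for r in responses)
    return {code: n * PAY_PER_SURVEY_GBP for code, n in counts.items()}


if __name__ == "__main__":
    # Invented example records, for illustration only
    example = [
        {"id": "p1", "constituency": "A", "topic": "t1",
         "position": "support", "access_code": "AMB1"},
        {"id": "p2", "constituency": "A", "topic": "t1",
         "position": "oppose", "access_code": "AMB1"},
        {"id": "p3", "constituency": "B", "topic": "t1",
         "position": "support", "access_code": "AMB2"},
    ]
    print(matched_pairs(example))
    print(ambassador_pay(example))
```

Grouping by constituency and topic makes it visible why recruiting on multiple topics helps: each additional topic adds further cells in which a supporter and an opponent from the same constituency may be found.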
To identify and resolve logistical issues, we also pre-tested the sending of a small number of letters and collecting responses. This led to the following insights:
• Automated and staggered sending: Conscious of our experiment's use of precious MP time and resources and because of our within-MP design, we asked for confederate permission to potentially send each previewed letter to the MP on their behalf. For each matched pair, we then set up (with the confederates' consent) confederate-specific email accounts to send the letters. This was to ensure that we had control over the timing and nature of the messages sent to MPs. We staggered the sending of letters across a month to avoid detection (which could have induced a Hawthorne effect), to ensure non-interference and to distribute the response burden, especially towards those MPs receiving more than two letters. All email responses were automatically forwarded from the confederate-specific accounts to a central project account for record-keeping and analysis and were also forwarded to the confederates, who were free to follow up if they wished to do so.
• Variation in response form and time: The pre-test revealed important variation in response form and time. While most MPs responded by email, around 10% responded by sending physical letters through the post to the confederate's home address, highlighting that we needed to have a system in place to capture those responses. To collect those physical responses, we contacted participants multiple times throughout the response period by email to ask whether they had received a response. If so, they could submit a photo of the letter to the project's email account or to a designated WhatsApp account. Moreover, the pre-test also allowed us to get a better sense of the response rate, time and variability of response length, which informed our analytical approach and power calculations.
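One way to stagger letters so that no MP receives two on the same day can be sketched as follows. This is an illustrative assumption rather than the scheduling rule actually used: the 28-day window stands in for 'a month', and the function name, the input format and the random placement within slots are hypothetical.

```python
import random
from datetime import date, timedelta


def stagger_schedule(letters, start, window_days=28, seed=0):
    """Assign a send date to each letter.

    letters: list of (letter_id, mp_id) tuples (hypothetical format).
    The window is cut into one slot per letter addressed to the same MP,
    and each letter gets a random day inside its own slot, so letters to
    one MP never share a day and are spread over the whole window.
    """
    rng = random.Random(seed)  # seeded so the schedule is reproducible
    by_mp = {}
    for letter_id, mp_id in letters:
        by_mp.setdefault(mp_id, []).append(letter_id)

    schedule = {}
    for mp_id, ids in by_mp.items():
        n = len(ids)
        if n > window_days:
            raise ValueError(f"more letters than days for MP {mp_id}")
        rng.shuffle(ids)  # randomize which letter goes first
        for k, letter_id in enumerate(ids):
            first_day = (k * window_days) // n
            last_day = ((k + 1) * window_days) // n - 1
            offset = rng.randint(first_day, last_day)
            schedule[letter_id] = start + timedelta(days=offset)
    return schedule


if __name__ == "__main__":
    # Invented example: three letters to one MP, two to another
    demo = [("L1", "mp1"), ("L2", "mp1"), ("L3", "mp1"),
            ("L4", "mp2"), ("L5", "mp2")]
    for letter_id, send_date in sorted(stagger_schedule(demo, date(2020, 1, 6)).items()):
        print(letter_id, send_date.isoformat())
```

Shuffling the order within each MP also randomizes which treatment arrives first, which matters in a within-MP design where the order of contact could otherwise be confounded with the treatment.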
Finally, because of the cost associated with running the experiment with confederates, we also conducted initial qualitative research, speaking to students who had interned with MPs on how constituency communication is typically handled in an MP's office. This led to further insights:
• Volume and form of constituency correspondence: MPs in regular times receive a large volume of correspondence: roughly 100 items per day. Most correspondence arrives in the form of emails or letters.
• MP involvement: MPs' involvement in responding to the letters varies. In some cases, we were told that MPs dictate all responses, whereas in others there are processes in place for staff to respond to select types of queries, while MPs deal personally with others.
• Need to include addresses and avoiding petition format: Finally, the interviews revealed the need to include the confederate's home address in any letter or email sent to an MP. This is because MPs only send substantive responses after verifying that a correspondent is one of their constituents. In fact, in our study, in many cases we initially received automated replies from MPs simply saying that they would only respond fully once they had verified the address of the sender as being located in their constituency, and that they would not respond if no address had been supplied. We were able to isolate these non-substantive responses from subsequent substantive responses.
Based on the pre-test information, we adapted and pre-registered our design and sought final ethical approval from all four universities from which we recruited confederates. Our final design included nine potential policy topics to which we recruited confederates via a diverse set of student ambassadors. Our final sample includes 102 MPs (52 Labour, 47 Conservatives, 3 other), who received between 2 and 10 confederate emails across 9 issues, resulting in a total of 624 emails from 291 voters. The unevenness of correspondence across MPs to achieve the necessary statistical power is a clear downside of our approach, which could raise ethical concerns with regard to the equitable treatment of MPs. However, we believe this is ethically defensible and preferable to designs involving identity and misinformation deception, given that all the response time of MPs or their staff was directed towards correspondence from genuine constituents expressing their genuinely held opinions.

Conclusion
Deception is probably impossible to fully eliminate from audit studies without introducing bias in the estimator and making the results of the study considerably less useful. Yet there is a risk that political scientists have become too quick to adopt identity or misinformation deception to keep organizational and logistical costs down. Identity and misinformation deception violate core ethics principles and are not strictly necessary design elements of audit studies. We suggest that political scientists try to think more creatively on how to design audit studies involving the minimal deception possible, by drawing upon confederates in their research. We understand that doing so comes at considerable organizational and logistical costs but feel that this increase in costs has three important positive effects. First, the increased costs provide incentives for researchers to perform the necessary background research and pre-testing (including the running of power analysis), which should result in better experimental designs. Second, the increased costs incentivise creativity and innovation in research design. Third, the increased costs provide an effective natural barrier against the overuse of a limited pool of subjects. We believe that the adoption of higher ethical standards for studies engaging in identity or misinformation deception can have a similar effect to the compensation requirement suggested by Butler and Desposato (2021). Higher ethical standards would foster ethically oriented research design development (an underprovided public good) while less directly strengthening existing inequalities between resource-rich and resource-poor research institutions. 
We therefore call on ethics review committees, journal editors and reviewers to become more exacting in terms of the ethical standards against which they evaluate audit experiments, asking whether researchers have convincingly established that an audit study must involve identity or misinformation deception to be successful.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The authors gratefully acknowledge funding from the BA/Leverhulme Small Research Grants scheme (SG 163019), which enabled the UK confederate design discussed here. Daniel acknowledges funding from the Swiss National Science Foundation (SNF Ambizione Grant, NO. 179938).
Neil Visalvanich is an Associate Professor in Comparative Politics in the School of Government and International Affairs (SGIA) at Durham University. He holds an MA and PhD from the University of California, San Diego. His research focuses on American Politics, particularly the politics of campaigns and elections, as well as the influence of racial and ethnic diversity on political institutions, mass political behaviour and public opinion. His research has been published or is forthcoming in the American Journal of Political Science, Political Behavior, Political Science Research & Methods and other leading peer-reviewed journals.
Nick Vivyan is a Professor of Politics in the School of Government and International Affairs (SGIA) at Durham University. He holds an MRes and PhD from the London School of Economics and Political Science (LSE). He is a co-investigator on the joint ESRC/AHRC-funded project 'Causes and Consequences of Election Violence: Evidence from England and Wales, 1832-1914'. His research investigates public opinion, representation and accountability, with a particular focus on British Politics. His research has been published or is forthcoming in the American Journal of Political Science, the American Political Science Review, the Journal of Politics, Political Analysis and other leading peer-reviewed journals.