Foreign Accents in the Early Hiring Process: A Field Experiment on Accent-Related Ethnic Discrimination in Germany

Based on a field experiment conducted in Germany between October 2014 and October 2015, this article focuses on the disadvantages associated with the presence of a foreign accent in the early hiring process, when applicants call in response to a job advertisement to ask whether the position is still available. We examine whether a foreign accent influences employers’ behaviors via productivity considerations and/or whether foreign-accented speech is related to statistical discrimination or tastes among employers or customers that translate into differential treatment. To address these processes, we supplement our field-experimental data with information on job and firm characteristics from the texts of vacancy announcements and advertising companies’ homepages, on labor supply from the Federal Employment Agency, and on anti-immigrant attitudes from the German General Social Survey. Results suggest that while calling with a Turkish name did not result in a lower rate of positive replies, this rate was reduced for candidates who called with a Turkish accent. Turkish-accented applicants were told more often that the advertised position was already filled. Our findings suggest that the difference in response rates was not due to productivity considerations related to how well individuals understood foreign-accented speakers. Instead, results support the notion that the observed disadvantages were linked to discrimination based on employers’ ethnic tastes. While we found no indications pointing to the relevance of customer tastes or statistical discrimination, we cannot rule out these processes altogether. Our findings demonstrate that language cues can be more relevant than applicants’ names in shaping employers’ initial responses. They thereby highlight the need to consider multiple ethnic cues and different stages of the hiring process.


Introduction
Immigrants who acquire the destination country's language as a second language frequently display a foreign accent. In this article, we use the term "foreign accent" to refer to the carryover of phonology and intonation of the origin country's language into the destination country's language (Lippi-Green 1994). Foreign accents can persist many years after immigration, even when individuals achieve near-perfect control over other features of this second language (Dollmann, Kogan, and Weißmann 2019). A foreign accent is a salient aspect of speech (Derwing and Munro 2009; Rakić, Steffens, and Mummendey 2011) and a strong cue for ethnic group membership (Rödin and Özcan 2011; Paladino and Mazzurega 2020). A distinct manner of pronunciation can also serve as an important element of impression formation (Fuertes et al. 2012) and seems to be more influential in evaluation processes than other ethnic cues, such as appearances (Rakić, Steffens, and Mummendey 2011; Rödin and Özcan 2011).
The presence of a foreign accent can affect individuals' chances on the labor market: findings from laboratory experiments, for example, reveal that foreign-accented speech can lead to negative evaluations that impact perceptions of job suitability, hiring recommendations, and the likelihood of promotion (Carlson and McHenry 2006; Segrest Purkiss et al. 2006; Hosoda and Stone-Romero 2010; Hosoda, Nguyen, and Stone-Romero 2012; Deprez-Sims and Morris 2013; Hansen, Rakić, and Steffens 2014; Timming 2017). While laboratory experiments are well suited for establishing whether accented speech yields such outcomes, findings from lab-based experiments are difficult to generalize to real-world employment contexts (Jackson and Cox 2013). An additional major challenge is identifying the processes that bring about these adverse effects in the first place (e.g., Deprez-Sims and Morris 2013). Regarding the processes that generate adverse effects of foreign-accented speech in employment contexts, one strand of arguments suggests that immigrants with foreign accents are more difficult to understand than individuals who speak with a standard accent (Munro and Derwing 1995; Deprez-Sims and Morris 2013). In these instances, productivity considerations could be responsible for less favorable assessments. Another strand of arguments addressing the adverse consequences of foreign-accented speech in employment contexts focuses on discrimination (e.g., Timming 2017). In this line of reasoning, a foreign accent is seen as a salient characteristic that signals a person's ethnic origin and influences how an individual is perceived by others (Fuertes et al. 2012). Foreign-accented speech can trigger group-specific assumptions or tastes and result in a penalty (Gluszek and Dovidio 2010; Horr, Hunkler, and Kroneberg 2018).
Based on a field experiment, this article contributes to current research on the relationship between foreign-accented speech and hiring decisions (e.g., Timming 2017). In addition to analyzing potential disadvantages associated with the presence of a foreign accent in the early hiring process, we take a closer look at the mechanisms that may generate these disadvantages. Accordingly, we assess whether the presence of a foreign accent influences employers' behaviors via productivity considerations and/or whether foreign accents shape statistical or taste-based discrimination.
The field experiment on which our analysis is based was conducted in Germany between October 2014 and October 2015 and targeted the initial stage of the hiring process when applicants first enter into contact with employers. In Germany, this initial contact can consist of a phone call in which applicants inquire about an advertised position, as confirmed by a preliminary study that we conducted prior to our field experiment and discuss later in this article. Since this initial phone contact is entirely speech based, a wide range of potentially confounding factors that accompany face-to-face interactions, such as appearance, are excluded from the scenario. In addition, in this purely verbal form of contact, an easily perceptible foreign accent may serve as a prominent signal which can influence initial assessments of applicants. As a result, a caller might be turned down immediately, or, if the candidate is encouraged to submit an application, the initial evaluation may still have a lasting impact on the subsequent hiring process. Finally, because this form of first exchange is short, it involves little effort on the part of employers, and the risk of turning down a real applicant due to the prospect of hiring one of the experimental testers seems, at this early stage, negligible. These considerations are important since the field experiment should not interfere with or alter the ongoing selection process.
In our experiment, trained individuals, so-called testers, inquired about a position that had recently been announced online or in a newspaper, asking whether the job was still available. The response to this inquiry is the key outcome of our analysis. In these short queries, we manipulated two signals of ethnic origin: accent (Turkish accent/standard German) and name (Turkish/German). As we employed a matched-guise technique (Lambert et al. 1960), the same tester placed three phone calls to each advertisement. Insofar as we succeeded in selecting and training testers who differed exclusively in these manipulated attributes, variation in employers' responses can be traced to differences in these stimuli (Pager 2007).
We enriched the data collected during the field experiment with information on job and firm characteristics from the texts of vacancy announcements and advertising companies' homepages and on labor supply from the Federal Employment Agency (FEA). In addition, we employed data from the German General Social Survey (ALLBUS) on anti-immigrant attitudes of the population residing in the region of the advertised workplace. Based on these sources, we examine potential reasons underlying the impact of job seekers' foreign-accented speech in the early hiring process.
We focus on a Turkish accent for various reasons. First, immigrants from Turkey constitute one of the largest migrant groups in Western Europe (Guveli et al. 2016). Today, of the approximately 5 million people of Turkish descent in Western Europe, around 3.5 million are in Germany (ibid.). In most of these destination countries, including Germany, Turks struggle in the labor market and have lower levels of destination-language proficiency than other migrant groups (Kalter 2006; Heath, Rothon, and Kilpi 2008). A foreign accent, as one component of speaking characteristics, may contribute to this group's disadvantage. Second, debates about the alleged unwillingness of Turkish, as well as other Muslim, groups to adapt (Diehl, Fischer-Neumann, and Mühlau 2016, 243) attest to the bright boundaries that characterize a European context (Alba 2005). Unfavorable stereotypes, as well as social distance between the majority populations and Turks and Muslims, are pronounced throughout Western Europe (Hagendoorn 1995; Kahraman and Knoblich 2000; Strabac and Listhaug 2008; Degner and Wentura 2010; Froehlich and Schulte 2019; Bonefeld and Karst 2020). Third, individuals of Turkish origin have been repeatedly shown to experience discrimination in a European context (Diehl, Fischer-Neumann, and Mühlau 2016; Zschirnt and Ruedin 2016). Against this backdrop, both accent-based productivity considerations and different forms of accent-based discrimination seem plausible for this group.
To explore these possibilities, this article proceeds in the following way. We first present the state of experimental research on foreign-accented speech in the workplace (section "Previous Research on the Role of Foreign Accents in Employment Contexts"). We then discuss how the presence of a foreign accent may affect employers' decisions in the early hiring process (section "Theoretical Considerations"). In this respect, we refer to employers' productivity considerations, to statistical discrimination, and to employer tastes, as well as to customer preferences. We address how each line of argument relates to Turkish-accented applicants' chances of successfully passing through the initial stage of the hiring process. In section "Methods," we describe the design of our field experiment, additional data sources, and further methodological considerations. Results are presented in section "Results" and addressed in light of our theoretical framework. We conclude with a discussion of our article's limitations and contributions (section "Discussion").

Previous Research on the Role of Foreign Accents in Employment Contexts
Most research on the effects of foreign-accented speech in the workplace consists of laboratory experiments conducted in the United States (Carlson and McHenry 2006; Hosoda and Stone-Romero 2010; Deprez-Sims and Morris 2013; Segrest Purkiss et al. 2006; Timming 2017; Hosoda, Nguyen, and Stone-Romero 2012), with a handful of laboratory studies in European destinations (e.g., Rödin and Özcan 2011; Hansen, Rakić, and Steffens 2014). Collectively, this research has shown that speakers with foreign accents are often perceived more negatively: they are rated as less pleasant to listen to, less intelligent, less competent, and as being of lower social status than individuals who speak the majority language with a standard accent (Gluszek and Dovidio 2010; Fuertes et al. 2012). Such negative evaluations impact ratings of job suitability, assessments of job-relevant attributes, hiring recommendations and decisions, and the likelihood of promotion (Carlson and McHenry 2006; Segrest Purkiss et al. 2006; Hosoda and Stone-Romero 2010; Deprez-Sims and Morris 2013; Hosoda, Nguyen, and Stone-Romero 2012; Hansen, Rakić, and Steffens 2014; Timming 2017). Although many studies highlight the negative consequences of foreign-accented speech, there is also evidence suggesting a more nuanced picture. For example, some studies identify unfavorable evaluations in the workplace only for certain ethnic groups (e.g., Hosoda and Stone-Romero 2010; Deprez-Sims and Morris 2013). There are also indications that, in some cases, foreign accents can counteract negative evaluations (Hopkins 2015). By examining the processes that may underlie such findings, we hope to contribute to a better understanding of the conditions under which foreign accents shape work-related outcomes.
Evidence on accent-related labor market outcomes stemming from field experiments is scarce. In contrast to the numerous correspondence tests 1 on ethnic discrimination in hiring decisions, which signal an applicant's ethnicity by presenting distinct names (Zschirnt and Ruedin 2016; Quillian et al. 2019), the consequences of speaking with a foreign accent have rarely been examined in real-life workplace settings. In addition, the research designs of the few available field experiments do not allow for a separation of speech influences from those associated with other attributes, such as names or appearances (United States General Accounting Office 1990; Cediey and Foroni 2008; Larja et al. 2012). Our field experimental design, in contrast, allows us to isolate the effect of foreign-accented speech from that of other ethnic cues and may, therefore, offer a more accurate assessment of the consequences of speaking with a foreign accent in the early hiring process.

Theoretical Considerations
To address the question of why foreign-accented speech might reduce the likelihood of receiving a positive response in the early hiring process, we discuss two strands of arguments. The first is rooted in human capital theory and centers on employers' productivity considerations, while the second refers to different forms of discrimination. Building on these two lines of scholarship, we propose five main hypotheses.

Foreign Accents and Productivity
Language skills can be viewed as a form of human capital (Chiswick and Miller 1995, 248): they are embodied in the person, their acquisition requires investments that involve costs, and they are productive in that they can be used to increase future benefits, such as returns on the labor market (ibid.). According to this reasoning, applicants who are fluent in the dominant language should be more productive than applicants who are not fully proficient.
Pronunciation enters the picture here as one component of language proficiency (Canale and Swain 1980). Individuals who speak with a foreign accent can be more difficult to understand than individuals who communicate with a standard accent. Empirical studies show that non-native accents affect the degree to which a listener understands a message (Van Wijngaarden 2001; Floccia et al. 2009; Deprez-Sims and Morris 2013). The presence of non-native accents can lead to delays in word identification (Floccia et al. 2009), promote the misidentification of syllables, words, and sentences (Van Wijngaarden 2001), and prolong the time it takes a listener to process the utterances they hear (Munro and Derwing 1995; Adank et al. 2009). In a similar manner to economists, linguists discuss the "costs" of speaking with a foreign accent (Munro and Derwing 1995; Adank et al. 2009). In the hiring process, employers may consider these perceived costs. As a result, they may reject applicants with foreign accents more often than they reject candidates who speak with standard accents.
Jobs typically differ in the linguistic skills they require (Berman, Lang, and Siniver 2003). While for some occupations, communication demands are high (e.g., teachers or call-center agents), other positions involve tasks in which language proficiency is less relevant (e.g., cleaners or bus drivers). In line with the productivity argument, we expect that employers will turn down Turkish-accented applicants more often when the advertised job requires strong communication skills than when it involves lower communication demands (Hypothesis 1).

Foreign Accents and Statistical Discrimination
Foreign accents, at the same time, may elicit statistical discrimination or discrimination based on employer or customer tastes. Models of statistical discrimination assume that in situations in which employers have limited information about an applicant, they rely on easily accessible information they have about a group to which the applicant is presumed to belong (Phelps 1972; Arrow 1973; Aigner and Cain 1977). When potential candidates phone to inquire about an advertised position, employers do not yet have any knowledge about the caller. According to these models of statistical discrimination, they base their evaluation on what they know about the average productivity of the social group to which the caller, in their view, belongs (Arrow 1973, 24). A foreign accent in these instances provides a signal that employers can use for their initial assessment.
Foreign-accented speech can indicate a recent immigration status and a higher age at immigration because foreign accents are less prevalent in later generations and among individuals who immigrated during childhood or in their early teens (Dollmann, Kogan, and Weißmann 2019). A first-generation status also often points to lower levels of destination-language proficiency (Kristen 2019) and can imply that individuals attended school in a different country and, therefore, lack host-country-specific knowledge and competences (Friedberg 2000). Turkish-accented speech may, thus, trigger assumptions about lower levels of language fluency and other skills. Considerations of this kind should be more relevant for higher-level positions, which require greater competences and knowledge. In line with the statistical discrimination argument, we, therefore, expect employers to turn down callers with Turkish accents more often when the advertised position demands a higher qualification than when it requires a lower qualification (Hypothesis 2).
Models of statistical discrimination are based on the premise that employers can adequately assess a group's average productivity (Aigner and Cain 1977, 177). However, this premise has been challenged, as individuals may hold mistaken beliefs (England and Lewin 1989). While some studies find that employers eventually learn and update group-specific assumptions in view of their experiences (Birkelund et al. 2020), others suggest that negative beliefs remain mostly untouched by such experiences (Pager and Karafin 2009; Midtbøen 2014). Since findings on stereotype content that relate to competences in a German context suggest that individuals of Turkish origin are perceived less favorably (Kahraman and Knoblich 2000; Froehlich and Schulte 2019; Bonefeld and Karst 2020), it is conceivable that employers also base their assessments on mistaken assumptions.

Foreign Accents and Taste-Based Discrimination
Models of taste-based discrimination build upon the idea that members of certain groups are preferred over members of other groups (Becker 1971). In the labor market, employers factor in the psychological costs of having to associate with someone or some group they dislike: they either pay members of this group less or avoid hiring them altogether (ibid.). In a German context, as in other Western European countries, distaste toward individuals of Turkish origin seems to be widespread and pronounced (Hagendoorn 1995; Degner and Wentura 2010). A Turkish accent may, thus, trigger distaste among employers in Germany.
In contrast to the case of statistical discrimination, discrimination related to employer tastes reflects individual preferences, which are independent of a given job's characteristics, such as its qualification level (Becker 1971). At the same time, certain features of a firm may restrict the extent to which employers can act upon their tastes. In the following, we consider a series of moderating conditions under which employers with discriminatory preferences can be expected to be less/more likely to act in accordance with their tastes. The reasoning underlying the discussion of these conditions is twofold: (1) if possibilities for rejecting candidates of certain origins are restricted or (2) if the pursuit of discriminatory preferences incurs substantive costs, rational employers will adapt their behavior accordingly. Consequently, taste-based discrimination should be less severe in these situations.
Opportunities and Taste-Based Discrimination. Opportunities for discrimination are reduced in workplace settings in which the recruitment process follows predefined rules because standardization leaves less scope for individual tastes to influence applicant selection and, thus, limits the opportunity for preferences to become effective (Petersen and Saporta 2004; Kaas and Manger 2012; Midtbøen 2015). Typical measures of standardization include the implementation of tests, the appointment of specialized human resource personnel, and the separation of the search for suitable candidates from the final hiring decision (Schmidt, Ones, and Hunter 1992). Usually, larger firms with more vacancies have established application processes that follow such predefined procedures (Kaas and Manger 2012). Therefore, recruiters in these firms should be less prone to taste-based discrimination. 2 A similar reasoning can be applied to external recruitment agencies, which search for employees on behalf of other companies. These agencies benefit from providing their customers with candidates who perform well on the job. Prior research has shown that standardized procedures aimed at predicting job performance are effective tools in personnel recruitment (Schmidt and Hunter 1998). Consequently, we expect external recruitment agencies to rely on standardized methods on a regular basis. As argued above, the implementation of such predefined procedures restricts opportunities for the pursuit of individual preferences for or against certain ethnic groups.
Moreover, companies that promote diversity may implement strategies that aim to reduce discrimination. Empirical evidence suggests that certain diversity strategies, such as appointing diversity managers and task forces, can improve opportunities for minorities (Dobbin and Kalev 2014). In addition, diversity education can improve overall attitudes toward diversity (Choi-Pearson, Castillo, and Maples 2004), although the results are less consistent when addressing specific origin groups (Kulik and Roberson 2008). In general, then, certain diversity strategies seem to increase sensitivity to equal treatment and to provide measures which reduce the scope for discrimination in hiring situations.
Following this reasoning on opportunities, we expect that a greater degree of standardization reduces the room for taste-based discrimination of speakers with a Turkish accent in the early hiring stage. Accordingly, applicants with a Turkish accent should be turned down less often in larger firms (Hypothesis 3a), in firms that rely on external recruitment agencies (Hypothesis 3b), and in firms that implement diversity strategies (Hypothesis 3c).
The Costs of Taste-Based Discrimination. In addition to opportunities, the costs of discrimination can vary across contexts. Labor supply may serve as an important example. If the supply of applicants for a certain occupation is limited in a country or region, the costs of hiring someone who, in addition to possessing the required qualification, also matches group-specific tastes are greater than in a better-supplied labor market (Baert et al. 2015). In other words, in times of labor shortage, employers have less leeway for discrimination (Becker 1971; Midtbøen 2015). In these instances, the penalties for a foreign accent should be smaller than in contexts without recruitment difficulties.
Regional differences may also come into play in environments in which prejudice and negative attitudes toward certain immigrant groups are widespread. In these contexts, employers are more likely to hold such views themselves and to assume that they will not be sanctioned when following their preferences. Evidence from a Swedish field experiment underlines this reasoning: in regions with more negative attitudes toward ethnic minorities, there is more ethnic hiring discrimination (Carlsson and Eriksson 2017).
In Germany, anti-immigrant attitudes, particularly toward individuals of Turkish origin, are much more widespread in the East than in the West (Boehnke, Hagen, and Hefler 1998; Wagner et al. 2003). In addition, there is substantive variation in these sentiments within the East and the West, for example, between urban and rural areas (Gorodzeisky and Semyonov 2016). In general, then, the penalties of a foreign accent should be greater in German regions with more pronounced anti-immigrant attitudes than in German regions in which negative sentiments are less severe. In line with arguments on the costs of taste-based discrimination, we expect that employers are more likely to turn down Turkish-accented callers in times of sufficient labor supply (Hypothesis 4a) and in regions where anti-immigrant attitudes are more prevalent (Hypothesis 4b).
Customer Preferences. In addition to acting upon their own preferences, employers might factor in their customers' tastes when hiring new employees (Becker 1971). If the current or prospective clientele is known to have preferences against certain groups, this clientele may be inclined to avoid companies which force them to engage with members of these groups. In these instances, employers might pay these workers less or not hire them at all.
A foreign accent is a strong signal of group membership and, following Becker's (1971) considerations on customer preferences, may be relevant to employers' perceptions of their clientele's preferences. Findings from numerous studies conducted in the United States reveal that listeners give higher ratings to products advertised by a standard-accented salesperson (Hill and Tombs 2011; Hendriks, van Meurs, and van der Meij 2014). Individuals with foreign accents were also rated higher in jobs without customer contact than in customer-facing jobs (Timming 2017).
In addition to accounting for customer tastes, employers might follow productivity-related considerations when filling positions with customer contact. As noted earlier, it may be more challenging to understand speakers with a foreign accent. Therefore, customers who have difficulties in understanding the person in question may be less likely to buy or recommend a product or service (Mai and Hoffmann 2014). If employers consider how well their clientele can understand and process a message presented by a foreign-accented employee, a negative response to such speakers becomes more likely. For jobs with customer contact, employers' potentially dismissive reactions to applicants who speak with foreign accents may, thus, represent both perceived customer preferences and considerations of how well customers will understand the applicant. We, therefore, expect that employers will turn down callers speaking with a Turkish accent more often when the job in question involves customer contact (Hypothesis 5).

Field Experiment
With a field experiment conducted in Germany, we addressed the consequences of speaking with a Turkish accent in the early hiring process. In a short phone conversation, testers inquired about a job that had just been advertised. The key outcome in our analysis is the response to the question of whether the position was still available. For each advertisement and within two weeks after the position was published, the same tester placed three phone calls in which the applicant's name (Turkish/German) and accent (Turkish accent/standard German) were manipulated. As the simultaneous presence of a German name and a Turkish accent is an unlikely combination (Horr, Hunkler, and Kroneberg 2018), calling with these attributes might produce an awkward situation. We, therefore, excluded this condition from the design. We used male and female testers, and our analyses controlled for this attribute.
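The treatment design described above can be illustrated with a minimal Python sketch. It enumerates the name and accent combinations and drops the excluded one. The function names and the per-advertisement shuffling of call order are hypothetical illustrations; they are assumptions of this sketch, not a description of the procedure actually used in the study.

```python
from itertools import product
import random

NAMES = ("Turkish", "German")
ACCENTS = ("Turkish accent", "standard German")


def treatment_conditions():
    """Return the name/accent combinations retained in the design."""
    return [
        (name, accent)
        for name, accent in product(NAMES, ACCENTS)
        # A German name with a Turkish accent is excluded as an unlikely combination.
        if not (name == "German" and accent == "Turkish accent")
    ]


def call_schedule(ad_id, seed=0):
    """Hypothetical helper (assumption): shuffle the order of the calls for one ad."""
    rng = random.Random(f"{seed}-{ad_id}")
    order = treatment_conditions()
    rng.shuffle(order)
    return order


conditions = treatment_conditions()
for name, accent in conditions:
    print(f"{name} name / {accent}")
```

The full 2 x 2 crossing would yield four conditions; removing the implausible one leaves the three calls placed per advertisement.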
To ensure that calling in response to an advertisement before sending an application is common in Germany, we conducted a small preliminary study. We used job announcements from different online job sites and national newspapers and conducted 90 semi-structured interviews with employers. Seventy-two percent of employers indicated they had received at least one call in response to their announcement. On average, they reported receiving 11 calls per offer. We, therefore, concluded that this way of making contact is rather common in the German labor market. In addition, these interviews revealed that the primary reason for calling is to establish whether the job is still available.

Manipulations.
A foreign accent primarily manifests itself through a difference in phonetic characteristics, such as articulation, intonation, prosody, or stress (Lippi-Green 1994). In line with this notion, we restricted our manipulation to phonetic features, while other linguistic aspects, such as grammar, syntax, or vocabulary, remained intact. To ensure that all elements of speech other than those that constituted a Turkish accent were comparable across the two experimental conditions, we used testers who were able to speak German with a standard accent and with a Turkish accent. This matched-guise technique (Lambert et al. 1960) allows other speech elements, which also contribute to listeners' perceptions of a speaker, such as frequency or pitch, to be held constant.
Almost all testers were actors or acting students. Each was asked to read the same short text, once with a Turkish accent and once with a standard German accent. We recorded these audio samples and presented them to 276 students attending a sociology class at either the bachelor or master level at the University of Bamberg. These students were asked to assess the accents as well as the vocal, personal, and professional appeal of the voices. Based on these ratings, we preselected a set of candidates. In a second step, we presented their speech samples to linguistic experts who study foreign accents in a German context. They assessed the genuineness and credibility of both speech variants and classified the accent strength. Based on their evaluations, we eventually chose two female and two male testers, 3 although initially, we had planned for more testers. In the course of the selection process, however, it became evident that suitable candidates who could speak both accents in a credible manner were scarce. The testers we eventually chose were able to produce both accents at the same speed in a natural and credible way. They received comparable ratings for their vocal and personal attractiveness and were considered professionally competent by the student sample. Whereas students perceived the final testers as heavily accented and provided rather similar ratings regarding their accent strength, expert ratings indicated a medium accent and some variation across testers. We assume that employers' perceptions of the testers' accents likely resembled those of the student sample, as both groups are untrained listeners.
Our second signal of ethnic origin, name, is the characteristic most commonly used in field experiments on hiring discrimination (e.g., Gaddis 2017). A person's name is an easily accessible cue, especially in written applications. On the phone, conversations in Germany also start by providing one's name. By randomly assigning different combinations of the two key attributes, accent and name, it is possible to distinguish between influences that are due to the presence of a Turkish accent and influences that result from the signals given by the presence of either a Turkish or a German name.
To identify suitable names, we used available lists of common Turkish (Rodríguez 2010) and German (Rudolph, Böhm, and Lummer 2007; Digitales Familienwörterbuch Deutschlands 2018) first names and surnames present in the German population. From these lists and for each language, we took six first names for females, six first names for males, and eight surnames. We then asked 71 students to rate these names according to different attributes and to write down the associations they had for each name. Based on these assessments, we selected names that allowed for an unequivocal identification of the two ethnic groups, that were perceived as timeless, and that were associated with average levels of attributes such as intelligence or attractiveness. 4

Sampling and Response Rates.
Our sample of job advertisements was based on vacancies from all over Germany for any occupation and for any firm that offered a vacancy between October 2014 and October 2015. We covered two of the most common job advertising channels: online job sites and newspapers. First, we included three prominent and frequently visited online job sites. 5 In addition, we considered two national newspapers with high circulation and well-known job advertisement sections. 6 Finally, we took a random sample of regional newspapers with high and low coverage from 15 regions. 7 With this latter selection, we aimed to cover advertisements from smaller firms, such as trade companies, which frequently announce jobs in the local media.
Acknowledging that the number of advertisements in online media far exceeds the number of advertisements in regional newspapers (Weitzel et al. 2013), which in turn exceeds the number of advertisements in national newspapers, we sampled from these channels with a ratio of 6:3:1. We divided online advertisements into two strata: positions which required high versus low qualifications and positions which demanded high versus low communication skills. From each stratum, we sampled an equal number of advertisements. This step was necessary because some online sites focused on certain job categories and presented more advertisements for them than for other job types. From the newspapers, we drew random samples.
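The 6:3:1 allocation across channels can be illustrated with a short sketch. The weekly draw of 30 advertisements and the function name `allocate` are hypothetical and chosen only for illustration; the article does not report weekly sample sizes, and the largest-remainder rounding is an assumption about how non-integer shares might be handled.

```python
from fractions import Fraction


def allocate(total, weights):
    """Split `total` across channels in proportion to `weights`.

    Uses largest-remainder rounding so that the counts always sum to `total`.
    """
    weight_sum = sum(weights.values())
    exact = {k: Fraction(total * w, weight_sum) for k, w in weights.items()}
    counts = {k: int(v) for k, v in exact.items()}  # floor for positive values
    leftover = total - sum(counts.values())
    # Give the remaining units to the channels with the largest fractional parts
    by_remainder = sorted(exact, key=lambda k: exact[k] - counts[k], reverse=True)
    for k in by_remainder[:leftover]:
        counts[k] += 1
    return counts


# Ratio taken from the text: online : regional newspapers : national newspapers
channel_weights = {"online": 6, "regional_newspapers": 3, "national_newspapers": 1}

# Hypothetical weekly draw of 30 advertisements
weekly_allocation = allocate(30, channel_weights)
print(weekly_allocation)
```

With these weights, a draw of 30 advertisements splits into 18 online, 9 regional, and 3 national advertisements.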
The field experiment was carried out between October 2014 and October 2015. During this period, each week, we identified all new advertisements that were published in any of the three channels described above. From these lists, we drew our weekly sample. In total, the sampling frame consisted of 8,690,791 online advertisements, 59,881 advertisements from regional newspapers, and 5,750 advertisements from the two national newspapers. To detect small effect sizes when including several independent variables (Faul et al. 2009), we targeted a net sample of 1,000 advertisements. Based on the response rate of our preliminary study (i.e., 76 percent), in which we followed the same procedure, we sampled 1,385 job announcements and were successful in contacting 702 of the advertising firms (50.7 percent). As we planned three conversations for each offer, the number of phone calls that, in the ideal case, could have been achieved amounted to 2,106 calls. We realized 1,804 (85.7 percent) of these phone calls.
For 59.4 percent of the job advertisements, testers were able to make the intended number of calls for each of our three treatments: (1) Turkish name/Turkish accent; (2) Turkish name/standard German; (3) German name/standard German (n = 417 advertisements; n = 1,251 calls). For 38.5 percent of advertisements, they managed to complete only one or two calls (n = 270 advertisements; n = 490 calls). The small remaining proportion (2.1 percent) consists of advertisements that testers called more than three times by mistake (n = 15 advertisements; n = 63 calls, of which n = 45 calls covered the three treatments as intended and n = 18 calls were misplaced). Because incomplete sets and misplaced calls were randomly distributed across testers and treatments, we include all calls in our analyses.
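The call accounting reported above can be checked with a few lines of arithmetic. The sketch below uses only figures stated in the text; the variable names are illustrative.

```python
# Figures as reported in the text
sampled_ads = 1385
contacted_ads = 702
planned_calls_per_ad = 3
realized_calls = 1804

contact_rate = contacted_ads / sampled_ads            # share of firms reached
planned_calls = contacted_ads * planned_calls_per_ad  # ideal number of calls
realization_rate = realized_calls / planned_calls     # share of planned calls made

# Advertisements by completeness of the call set: (advertisements, calls)
call_sets = {
    "complete set of three calls": (417, 1251),
    "one or two calls": (270, 490),
    "more than three calls": (15, 63),
}
total_ads = sum(ads for ads, _ in call_sets.values())
total_calls = sum(calls for _, calls in call_sets.values())

print(f"contact rate: {contact_rate:.1%}, realized calls: {realization_rate:.1%}")
for label, (ads, calls) in call_sets.items():
    print(f"{label}: {ads / contacted_ads:.1%} of advertisements, {calls} calls")
```

The three groups add up to the 702 contacted advertisements and the 1,804 realized calls.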
To enable comparisons between cases where the set of calls was incomplete, we employed restricted randomization and allocated the accented caller (i.e., Turkish name/Turkish accent) to the second position. In this way, we ensured that the third call did not receive the key treatment. Preliminary trials showed that this call had the highest risk of not being realized. As at least two calls are necessary for a meaningful comparison of the different treatment conditions, we chose the second call for the key treatment (but, in principle, could have also selected the first). We randomly assigned the other treatment conditions (i.e., Turkish name/standard German and German name/standard German) to the remaining first and third calling positions. We conducted additional checks to rule out the possibility that the implemented calling order biased our analyses. These findings are included in the Online Appendix (Tables A.3 and A.4) and reveal that restricted randomization did not bias results in a substantial manner.
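A minimal sketch of this restricted randomization, assuming a simple per-advertisement draw: the accented caller is fixed in the second position, and the two standard-German treatments are shuffled across the first and third positions. The function name and the seed are illustrative and not taken from the study.

```python
import random

ACCENT = "Turkish name/Turkish accent"
OTHERS = ["Turkish name/standard German", "German name/standard German"]


def assign_call_order(rng):
    """Return the treatment order for one advertisement.

    The key treatment is always placed second; the remaining two
    treatments are randomly assigned to positions one and three.
    """
    first, third = rng.sample(OTHERS, k=2)
    return [first, ACCENT, third]


rng = random.Random(2014)  # fixed seed only to make the illustration reproducible
orders = [assign_call_order(rng) for _ in range(1000)]
print(orders[0])
```

Because the accented call is never third, an advertisement with only two realized calls still contains the key treatment and one comparison call.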

Additional Data Sources
In addition to the data collected in the field experiment, we used information from three other sources to examine the reasons underlying the potentially adverse effects of calling with a Turkish accent. First, we considered the texts of job announcements and the homepages of advertising firms. While the advertised texts provided information on job requirements and position characteristics, the homepages provided information on the firms. In some cases, when we could not find the relevant information on the website, we called and asked for the missing features. Second, we merged data on regional labor supply and occupational vacancy rates with our experimental data, using information from the FEA. Third, we added data on anti-immigrant attitudes present in the population residing in the region of the advertised workplace from the German General Social Survey (ALLBUS).
Job and Firm Characteristics: Advertisements and Homepages. Information on positions' communication demands was obtained from the descriptions included in the selected job announcements. Based on the texts of all advertisements, two coders compiled a list of 276 words which indicated a communication requirement (e.g., negotiating or consulting). In a second step, these words were rated by 217 students on an 11-point scale as to how strongly they signaled the need for communication skills. We, then, weighted each word with the standardized median value of student ratings so that words which were perceived as powerful signals of communication demands obtained a higher value. In the final step, for each advertisement, we calculated the weighted share of these words. A higher value on this variable indicates that the position involved more communication tasks.
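The weighting step can be sketched as follows. The ratings below are invented, and two details are assumptions, since the text does not spell them out: "standardized" is read here as rescaling the median rating to the 0 to 1 range, and the "weighted share" is read as the sum of word weights divided by the number of words in the advertisement.

```python
from statistics import median

# Hypothetical ratings on an 11-point scale (0-10); not the study's data
ratings = {
    "negotiating": [9, 10, 8, 9],
    "consulting": [8, 7, 9, 8],
    "coordinating": [5, 6, 4, 5],
}
SCALE_MAX = 10

# Assumption: standardize each word's median rating to the range 0..1
weights = {word: median(values) / SCALE_MAX for word, values in ratings.items()}


def weighted_share(ad_text, weights):
    """Sum of weights of listed words, divided by the total number of words."""
    tokens = ad_text.lower().split()
    if not tokens:
        return 0.0
    total = sum(weights.get(token.strip(".,;:!?"), 0.0) for token in tokens)
    return total / len(tokens)


ad = "Tasks include negotiating contracts and consulting clients."
score = weighted_share(ad, weights)
print(round(score, 3))
```

Dividing by the total word count is one way to keep longer advertisements from scoring higher simply because they contain more text.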
To capture a job's required qualification level, we assigned to each advertised position the necessary educational degree, using the International Standard Classification of Education. For the analyses, we distinguished between three levels: low (ISCED 0-2), medium (ISCED 3-4), and high (ISCED 5-6). Alternatively, we used the requirement levels of the Classification of Occupations (KldB 2010) and distinguished between unskilled/semiskilled, specialist, complex specialist, and highly complex tasks (Paulus and Matthes 2013).
With three additional measures, we considered important aspects related to the standardization of the recruitment process. Firm size refers to the number of employees. We used a shortened version of the staff headcount criteria for small and medium-sized enterprises proposed by the European Commission (as defined in the EU recommendation 2003/361) and distinguished between small (fewer than 50 employees) and larger (50 or more employees) firms. In addition, we took into account whether the job advertisement was published by an external recruitment agency. Finally, we measured whether the firm had a diversity orientation, assigning a value of 1 if the firm mentioned diversity strategies or programs on its homepage.
Customer contact was measured by applying a similar procedure to that of the communication tasks. That is, two coders first compiled a list of 72 expressions from the advertisements which indicated customer contact. Each word was then rated on an 11-point scale by 217 students, who indicated how strongly the expression signaled that interactions with customers were part of the job. For each advertisement, we used these ratings to calculate the weighted share of words pointing to customer contact. A higher value on this measure implied that the position in question involved more contact with customers.
Labor Supply: FEA. Furthermore, we merged data from the Federal Employment Agency (FEA) with our experimental data set to portray the region's labor supply. For each federal state, and on a monthly basis, we calculated the share of vacant positions in the employed population, considering four levels of vocational qualification requirements which captured an occupation's degree of complexity, for each of 37 occupational main groups (Paulus and Matthes 2013). We, then, assigned to each advertised position the relevant vacancy rate at the time of the experimental call. Higher vacancy rates pointed to a larger number of open positions and, consequently, to labor shortages.
High unemployment rates, in contrast, implied that there was more labor supply than the market could absorb. Information on unemployment was available on a monthly basis at the district level. Accordingly, we assigned the regional unemployment rate that was present at the time of the phone call to each job advertisement included in our analysis. As the FEA considers 401 districts in total, the unemployment measure is regionally finer grained than the vacancy rate, which is documented at the federal state level. However, in contrast to vacancy rates, unemployment rates are not specific to certain occupational categories or vocational requirements.
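The two labor supply measures attach to each call through different keys, which a small lookup sketch can make concrete. All figures, codes, and field names below are hypothetical; only the key structure (federal state, month, occupational main group, and requirement level for vacancies; district and month for unemployment) follows the description above.

```python
# Hypothetical lookup tables; all numbers are invented for illustration
vacancies = {("Bayern", "2015-03", 62, 2): 1200}    # open positions
employed = {("Bayern", "2015-03", 62, 2): 48000}    # employed population
unemployment_rate = {("09162", "2015-03"): 4.9}     # percent, by district and month


def vacancy_rate(key):
    """Share of vacant positions in the employed population for one cell."""
    return vacancies[key] / employed[key]


def labor_supply_for_call(call):
    """Attach both labor supply measures to a single experimental call."""
    vacancy_key = (call["state"], call["month"], call["occ_group"], call["req_level"])
    unemployment_key = (call["district"], call["month"])
    return {
        "vacancy_rate": vacancy_rate(vacancy_key),
        "unemployment_rate": unemployment_rate[unemployment_key],
    }


# Hypothetical call record
call = {"state": "Bayern", "district": "09162", "month": "2015-03",
        "occ_group": 62, "req_level": 2}
measures = labor_supply_for_call(call)
print(measures)
```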
Adverse Tastes: ALLBUS. Finally, to capture differences in the regional distribution of tastes, we relied on a measure of anti-immigrant attitudes that is part of the German General Social Survey (ALLBUS). With four items in 2010 and 2012 and eight items in 2014, the surveys recorded sentiments toward ethnic minorities in Germany. For each respondent, we calculated the standardized mean over these items. We, then, aggregated the individual values for the three survey years at the level of federal states. The resulting average captured how strong anti-immigrant attitudes were in the respective region. Higher values pointed to increasingly negative feelings toward ethnic minorities. Table A.1 in the Online Appendix illustrates the distributions of our model variables. It also covers the controls, such as the tester's gender, the gender of the person named in the advertisement (i.e., male, female, no contact person), the ethnicity of the person named in the advertisement (i.e., name signaled immigrant origin, name signaled German origin, no contact person/unclear origin), and the number of words in the advertisement.
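The construction of the regional attitude measure can be sketched in two steps: a standardized mean per respondent, then an average per federal state. The respondents below are invented, and the standardization is an assumption (z-scores per item within each survey year); the text does not state the exact procedure.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical respondents: (survey year, federal state, item answers)
respondents = [
    (2010, "A", [2, 3, 2, 3]),
    (2010, "B", [6, 5, 6, 7]),
    (2014, "A", [1, 2, 2, 1, 2, 3, 2, 1]),
    (2014, "B", [5, 6, 6, 5, 7, 6, 5, 6]),
]


def respondent_scores(rows):
    """z-standardize each item within survey year, then average per respondent."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row[0]].append(row)
    scored = []
    for group in by_year.values():
        n_items = len(group[0][2])
        item_stats = []
        for i in range(n_items):
            values = [row[2][i] for row in group]
            item_stats.append((mean(values), pstdev(values)))
        for _, state, answers in group:
            # Items without variation carry no information and are skipped
            z = [(a - m) / s for a, (m, s) in zip(answers, item_stats) if s > 0]
            scored.append((state, mean(z)))
    return scored


def state_index(scored):
    """Average respondent scores within each federal state, pooling survey years."""
    by_state = defaultdict(list)
    for state, score in scored:
        by_state[state].append(score)
    return {state: mean(scores) for state, scores in by_state.items()}


index = state_index(respondent_scores(respondents))
print(index)
```

Standardizing within year is one way to make the four-item and eight-item waves comparable before pooling them.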

Analytic Strategy
Our field experiment's design generated clustered data, with calls (level 1) nested in job advertisements (level 2) and advertisements nested in federal states (level 3). We consider this data structure by estimating multilevel logistic regression models. After analyzing potential penalties associated with foreign-accented speech, we address the theoretical arguments via interaction effects. In this way, we investigated whether the extent to which a Turkish accent affected employers' response in an initial phone conversation was conditional upon characteristics which pointed to productivity considerations and/or which related to statistical or taste-based discrimination.
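One common way to write such a model is shown below. This is a sketch under assumptions rather than the article's exact specification: it assumes random intercepts at the advertisement and federal state levels, dummy coding for accent and name, and a single moderator $z_{jk}$ measured at the advertisement level (a state-level moderator would carry only the index $k$). Calls are indexed by $i$, advertisements by $j$, and federal states by $k$.

```latex
\[
\operatorname{logit}\Pr(y_{ijk}=1)
  = \beta_0
  + \beta_1\,\mathrm{accent}_{ijk}
  + \beta_2\,\mathrm{name}_{ijk}
  + \beta_3\,z_{jk}
  + \beta_4\,(\mathrm{accent}_{ijk}\times z_{jk})
  + u_{jk} + v_{k},
\qquad
u_{jk}\sim N(0,\sigma_u^2),\quad v_{k}\sim N(0,\sigma_v^2)
\]
```

In this notation, $\beta_4$ captures whether the accent penalty varies with the moderator, which is the quantity the interaction models are designed to test.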
Except for diversity and firm size (with missing values reaching 24 and 28 percent), we did not encounter missing information of more than one percent (Table A.1 in the Online Appendix). We dealt with missing values by employing multiple imputation (Allison 2001) based on all model variables and relevant auxiliaries, such as the advertising channel or sampling strata. We ran regression models for each of the 25 imputed datasets and then combined the estimates, following Rubin's (1987) rules.
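Rubin's rules combine the per-dataset estimates into one pooled estimate and a standard error that reflects both within-imputation and between-imputation variance. The sketch below uses three invented estimates for brevity, whereas the analysis described above pools across 25 imputed datasets; the function name is illustrative.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool point estimates and standard errors across imputed datasets."""
    m = len(estimates)
    pooled = mean(estimates)                      # pooled point estimate
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # variance across imputations (m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, sqrt(total)


# Hypothetical coefficient estimates from three imputed datasets
estimates = [-0.08, -0.07, -0.06]
std_errors = [0.02, 0.02, 0.02]
pooled_estimate, pooled_se = pool_rubin(estimates, std_errors)
print(round(pooled_estimate, 3), round(pooled_se, 4))
```

The pooled standard error exceeds the average per-dataset standard error whenever the estimates differ across imputations, which is how the uncertainty due to missing data enters the results.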

Results
To address the question of how foreign-accented applicants fared in the initial stage of the hiring process, Table 1 tabulates average response rates to the question of whether an advertised position was still available, by name and accent. The first row shows that individuals from the majority population who called with a German-sounding name and standard German accent had an 80 percent chance of being told that the job in question was not yet filled. The response rate for a caller with a Turkish-sounding name and standard German speech was, at 80 percent, identical, indicating that the manipulation of the name did not elicit a differential reaction. Only candidates with a Turkish accent experienced a disadvantage. Their positive response rate amounted to 73 percent. This statistically significant difference of 7 percentage points between applicants speaking with a standard German accent and applicants speaking with a Turkish accent can be attributed solely to the accent manipulation. Put differently, an applicant with standard German pronunciation could expect eight positive responses for every 10 advertisements s/he called about, while an applicant with a Turkish accent would need to place phone calls in response to 11 advertisements to achieve the same result (Table A.2 in the Online Appendix).
Notes (Table 1): The table reports the positive response rates (i.e., the job is still available) for the three different treatments. The ratios and percent differences in the two last columns refer to the comparison to the majority population (i.e., German name/standard German); *p < .001.
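The comparison of response rates can be reproduced from the reported percentages. The sketch below uses the rounded rates given in the text (80, 80, and 73 percent); the variable names are illustrative.

```python
from math import ceil

# Positive response rates as reported in the text (rounded)
rates = {
    "German name/standard German": 0.80,
    "Turkish name/standard German": 0.80,
    "Turkish name/Turkish accent": 0.73,
}

baseline = rates["German name/standard German"]
accented = rates["Turkish name/Turkish accent"]

difference_points = (baseline - accented) * 100  # gap in percentage points
ratio = baseline / accented                      # relative advantage of the baseline

# Positive replies a standard-German caller can expect from 10 calls,
# and the number of calls an accented caller needs for the same count
positives_per_10_baseline = 10 * baseline
calls_needed = positives_per_10_baseline / accented
calls_needed_rounded = ceil(calls_needed)

print(round(difference_points), round(ratio, 2), calls_needed_rounded)
```

With these rates, a standard-German caller expects eight positive replies from 10 calls, and an accented caller needs about 11 calls to reach the same number.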
In a second step, we studied this difference in response rates in light of considerations about productivity and statistical and taste-based discrimination. To address these factors, we estimated a set of multilevel logistic regression models, presented in Table 2. We report the average marginal effects and, in brackets, the standard errors for a positive response to the question of whether a position was still available. The first model is included for comparison purposes. Our substantive interest rests upon the subsequent models, in which we examined the different arguments one after another, based on interaction terms.
Model 2 takes up the notion that individuals with foreign accents could be more difficult to understand. Turkish-accented applicants who called to inquire about a job that required strong communication skills should be turned down more often than when calling for a position for which communication skills were less relevant (Hypothesis 1). The results, however, are not in line with this expectation. At this early point in the selection process, employers' responses on the phone do not seem to follow productivity-related considerations.
With the subsequent set of analyses, we addressed the reasoning on statistical discrimination, according to which Turkish-accented applicants should receive fewer positive replies when calling for a position that demands a higher qualification (Hypothesis 2). In Models 3a and 3b, we, thus, included the advertised position's educational and task requirements. Concerning occupations that demanded more qualifications, there were no indications that the penalty associated with the presence of a Turkish accent was significantly greater than for positions requiring lower qualifications.
With a series of moderating conditions, we then examined taste-based discrimination. If adverse tastes were present among employers, these moderating conditions should restrict the opportunities for or increase the costs of discrimination (see section "Foreign Accents and Discrimination"). Models 4a, 4b, and 4c take up the notion that, as recruitment processes become increasingly standardized, employers' means to act upon group-based tastes are restricted (Hypotheses 3a-c). The findings provide support for this reasoning. Model 4a indicates that calling with a Turkish accent in smaller firms reduced the probability of receiving a positive reply by eleven percentage points, compared to individuals who called with a German accent. This disadvantage, however, almost completely disappeared in larger firms. Similarly, the penalties associated with an inquiry that involved a Turkish accent were absent in companies that emphasized diversity (Model 4b) and in firms that relied on external recruitment agencies (Model 4c).
Models 5 and 6 address the costs of employers' ethnic tastes. The relatively higher costs that discrimination incurs in tight labor markets do not seem to matter for employers' replies. This finding contradicts Hypothesis 4a and holds true for both indicators of labor supply (Models 5a and 5b). In contrast, regional differences in anti-immigrant attitudes made a difference. In line with Hypothesis 4b, Model 6 indicates that in regions in which the population held more negative attitudes toward immigrants, the presence of a Turkish accent reduced the probability of receiving a positive response, while foreign-accented candidates were less likely to be penalized in regions in which negative attitudes toward immigrants were less prevalent.
In a final model, we investigated whether employers factored in their customers' preferences (Hypothesis 5). Results presented in Model 7 do not provide support for this line of reasoning, as the disadvantage for applicants who called with Turkish accents remains unaltered, irrespective of whether the job in question involved more or less customer contact.

Discussion
Based on a field experiment, this article focused on employers' responses to foreign-accented speakers who called to inquire about an advertised position in Germany. While calling with a Turkish name did not result in a lower rate of positive replies, this rate was reduced for candidates who called with a Turkish accent. The difference in response rates did not seem to be related to productivity considerations centering on concerns about how well employers, colleagues, or customers understand foreign-accented speakers; at least, we were unable to find such evidence based on the measures implemented in our analysis. Instead, several findings supported the notion that foreign accents trigger discrimination. In line with the reasoning on employers' tastes, we found that companies' organizational characteristics moderated the relationship between accented speech and employers' responses, indicating that in more standardized settings, there is less leeway for employers' tastes. In addition, applicants with Turkish accents faced greater disadvantages in regions where anti-immigrant attitudes were more prevalent. This result might suggest that in environments that are more hostile toward immigrants, employers are more likely to hold such attitudes themselves and can expect not to be sanctioned when acting upon their ethnic preferences. At the same time, there were no indications that customer tastes matter in employers' responses to foreign-accented speakers. There was also no evidence pointing to statistical discrimination, as foreign-accented speech did not elicit greater penalties for positions that require higher qualifications.
However, we cannot rule out statistical discrimination altogether. The evidence on standardization, for instance, can also be interpreted in the context of statistical discrimination. If this form of discrimination is driven by ethnic stereotypes, the implementation of standardized proceedings means that there is also less room for the inclusion of such stereotypes. Accordingly, standardization may reduce the opportunities not only for taste-based discrimination but also for statistical discrimination.
In addition, employers with a distaste for Turks should discriminate against Turkish applicants, regardless of whether the applicant's ethnicity was revealed by the name or the accent. The employers in our analysis, however, did not seem to have a problem with applicants with a Turkish name who spoke standard German. The absence of discrimination in this case could suggest that statistical discrimination is an issue. While we cannot rule out that this reasoning applies, we consider it to be less convincing than the alternative interpretation of taste-based discrimination because the phone setting renders a foreign accent more salient than a foreign name. In numerous correspondence tests conducted in recent years (Kaas and Manger 2012; Zschirnt and Ruedin 2016; Quillian et al. 2019), applicants' names have been presented in writing. In a written application, a candidate's ethnicity is easier to detect than on the phone, where the name is only briefly mentioned. In our case, employers may not have had the same opportunity to process the "foreignness" or specific ethnic content of the name that they would have if they had seen it in writing. It has also been shown that verbal cues can override other ethnic cues (Rödin and Özcan 2011). In our analysis, processes of this kind could have rendered the briefly mentioned Turkish names irrelevant, as applicants who were only identifiable by Turkish names spoke standard German. A recent field experiment on ethnic discrimination in the German housing market points to a similar finding: while callers with Turkish accents were invited less often to view an apartment, callers with Turkish names were not rejected more frequently (Horr, Hunkler, and Kroneberg 2018). This discussion about the adequacy of competing interpretations of our findings attests to the difficulty of disentangling the processes underlying employers' responses to individuals who speak with a foreign accent.
Our analysis also faced a number of additional limitations. First, we studied the consequences of a foreign accent only for individuals of Turkish origin, rather than for different ethnic groups. With the current research design, it is, thus, impossible to assess whether the adverse effect of a Turkish accent was due to the accent itself or to the particular ethnic group in question. Second, we were unable to assess whether ethnic discrimination varied with the applicant's gender (Bursell 2014). Initially, we had planned to address differences in employers' reactions to female and male applicants. In view of our difficulties in attracting suitable testers in sufficient numbers, however, we could not pursue this route. Third, there are potential problems with the credibility of the speech of actors who spoke standard German and German with a Turkish accent. Staged pronunciation can diverge from natural speech and may sound stereotypical, and stereotypical speech can evoke suspicion among employers and may lead to overestimating a foreign accent's effect. We tried to rule out this possibility in different ways in selecting our testers (see section "Field Experiment") but cannot be sure to have fully succeeded. Fourth, as foreign accents are more prevalent among recent immigrants than among later generations (Dollmann, Kogan, and Weißmann 2019), it is likely that employers interpreted callers with Turkish names who spoke standard German as German born. If employers drew this conclusion, our analysis measures differences in how first- and second-generation job-seekers of Turkish origin are treated when calling employers, rather than the effect of a Turkish accent.
Fifth and finally, some of our measures were less detailed than we would have liked. The measures of customer contact and communication demands, for example, were based on the contents of job advertisements and homepages. Given that the information presented varied across these sources, we cannot be sure that our proceedings allowed for an appropriate identification of these constructs. A more detailed job description, for instance, increased the likelihood that expressions which signaled communication demands or customer contact were included, while a less detailed advertisement for a similar job implied that these expressions would come up less often. This problem cannot be fully tackled by considering the number of words included. A similar reasoning applies to certain other measurements implemented in our analysis. For instance, the diversity measure was based on whether firms mentioned diversity on their homepages. However, mentioning diversity does not necessarily mean that the company in question had indeed implemented such strategies (Kang et al. 2016). Similarly, our measures of the required qualification level may not have captured the full set of skills relevant for the advertised position. In addition, for our estimate of anti-immigrant attitudes, we had to aggregate information at the federal state level to obtain sufficient numbers of observations per regional unit, while a more detailed geographical distinction might have been more suitable for capturing our reasoning.
Notwithstanding these shortcomings, this article demonstrates that foreign accents can alter the chances of entering the hiring process. Even during brief initial contact between applicants and employers, where little is at stake and positive replies are non-committal and fairly inconsequential, differential treatment occurs. The penalty associated with the presence of a foreign accent that we found for this early point in the hiring process could contribute to eventual hiring gaps, which develop throughout the various stages of candidate selection. However, as we do not know how the difference in response rates for speakers with Turkish accents translates into such gaps, we are left speculating about the extent to which losing opportunities at this early point might shape labor market outcomes overall. Future studies could start addressing those stages of applicant selection that have received less attention in a literature which has focused mainly on the selection of candidates for an interview based on written applications (e.g., Zschirnt and Ruedin 2016; Quillian et al. 2019). With our examination of the initial contact between applicants and employers and, thus, of an earlier stage in the overall process, we took the first step in this direction. Future studies might also extend the investigation beyond a single stage of applicant selection and examine consecutive stages of the hiring process, as additional discrimination seems to occur in later stages of the hiring process (Quillian, Lee, and Oliver 2020). These studies could consider a range of outcomes, such as placement decisions that allocate employees to differentially valued positions within a company (e.g., Pager, Bonikowski, and Western 2009).
In addition, our article highlights the importance of examining different cues of ethnic origin. While we argue that a foreign accent is an important signal of ethnicity in hiring scenarios that take place over the phone, other cues could be more relevant in other stages of the hiring process. In a written application, for example, the name likely matters most, and in a face-to-face interview, it might be an amalgam of accent, name, and appearance. Therefore, to assess ethnic discrimination in hiring decisions and to make broader claims about discrimination in the labor market, it is necessary to consider the full set of ethnic cues that are typically present in the respective stages. In our field experiment, we were able to include two important cues of ethnic origin that are detectable in a phone call: name and accent. Future research could broaden this perspective, possibly by considering the interplay between different ethnic cues and other characteristics, such as gender, age, or socioeconomic status (Lahey and Oxley 2018).
Overall, our field experiment attests to one of the major challenges of studying discrimination, namely, the empirical identification of the processes underlying it. Carefully designed and conducted empirical studies that allow for rigorous tests of the theoretical considerations on the origins of discrimination are key to making progress in this regard.