Dealing With Conflicting Values in Policy Experiments: A New Pragmatist Approach

Despite the “turn to values” in Public Administration, there is still a lack of empirical research in situ that investigates how various stakeholders in interaction develop strategies to deal with conflicting values over time. By using a new pragmatist approach, this article fills in this gap by investigating policy experiments in Dutch healthcare. The results show how professionals, citizens, and policymakers differently valued the worth of policy experiments, which manifested itself in multiple value conflicts. To deal with these conflicts, stakeholders adopted different strategies: colonization, compromising, prioritization, short-cutting, organizational enmeshing, and pilotification. The results show a shift from exclusive top-down strategies to inclusive multi-value strategies over time.


Introduction
Despite the "turn to values" in the field of Public Administration (West & Davis, 2011, p. 226), the majority of studies still focus on philosophical and theoretical discussions of values (for a critique, see de Graaf, 2015; de Graaf et al., 2016; Paanakker, 2019, 2020; Wagenaar, 2014; West & Davis, 2011). This conceptual interest in values has not been equally matched by empirical research. As a consequence, we do not exactly know "how values shape practical action in situations" (West & Davis, 2011, p. 230) and how actors "grapple with and make judgements about value conflicts when making policy decisions" (Spicer, 2009, p. 539; see also de Graaf et al., 2016). Given this gap in Public Values research, there is a need for empirical research that investigates how multiple stakeholders (for example, public professionals, policymakers, and citizens) in interaction deal with contested value questions on the ground. A key question on this agenda is how different stakeholders frame, change, and transform values over time when dealing with difficult and controversial social issues (Boltanski & Thévenot, 2006; Kornberger, 2017; Van der Wal, 2016). This question can also provide new insights into the various strategies that multiple stakeholders pragmatically develop to deal with conflicting values in practices.
This paper aims to contribute to the empirical research agenda on values by particularly focusing on conflicting values in policy experiments that join multiple stakeholders in the co-design, experimentation, implementation, and evaluation of innovative policy ideas in local settings. We argue that a focus on policy experiments is productive for gaining new insights into how value conflicts are constituted and experienced by multiple stakeholders for various reasons. First, an important goal of policy experiments is to move beyond the status quo of existing routines, thereby potentially disrupting institutionalized values and renegotiating new value settlements. This makes policy experiments an ideal case to investigate how stakeholders critique, reshuffle, renegotiate, and reinvent values. Second, as a consequence of this disruptive focus, policy experiments are known to be characterized by a high degree of ambiguity and conflict (Nair & Howlett, 2016). Because there is no blueprint for action, multiple stakeholders need to negotiate the meaning, purpose, direction, and operationalization of experiments under conditions of uncertainty and power inequality in networks (Ettelt et al., 2015; Felder et al., 2018; Nair & Howlett, 2016). It is therefore likely that value conflicts will be part and parcel of policy experiments, thereby providing ample empirical opportunities for investigating how stakeholders differently perceive which values (no longer) count. Third, in policy experiments stakeholders are encouraged to develop joint ways of problem-solving that partially transcend existing accountability regimes in which stakeholders are embedded. An empirical investigation of value conflicts in policy experiments can therefore reveal how stakeholders balance between demands from existing accountability regimes (Benish & Mattei, 2020; Nair & Howlett, 2016) and new ways of working. Fourth, due to the political pressure, reputational stakes, and prestige attached to policy experiments, stakeholders are under a lot of pressure to deliver results (Ettelt et al., 2015). They therefore need to develop pragmatic responses to conflicting values to avoid paralysis. By zooming in on policy experiments, we are thus able to gain insights into the conflicting valuations of the purpose of the policy experiment as well as the strategies that stakeholders emergently and in interaction develop to deal with conflicting values.
The empirical analysis of this paper is based on a qualitative study of a Dutch experimental healthcare program (National Program for Elderly Care: NPEC) in the period 2008 to 2016 (with a total budget of 80 million euros funded by the Dutch Ministry of Health, Welfare and Sports). The goal of this experimental program was to enable innovation in care for older persons by setting up regional networks in which various stakeholders co-designed policy transition experiments to integrate medical and social care. In the regional networks, stakeholders from different organizations worked together on this: physicians, nurses, care workers, medical researchers from University Medical Centres, national and local policymakers, patients, insurers, and local government representatives. The program was explicitly experimental in that it tried to establish new connections between actors in the health sector, overcome existing institutional barriers in integrating care and welfare, give older persons more "voice" in the design of new interventions, and test interventions with the purpose of stimulating policy learning. Due to conflicting interpretations by stakeholders of the added value of NPEC as an improvement program and of the purpose of the policy experiments, NPEC is an excellent case to study conflicting values in situ over a longer period of time. The central research questions we aim to answer in this article are: how do various stakeholders differently value the main worth of NPEC, which conflicting values do they experience during the course of the program (2008-2016), and which strategies do they develop to deal with conflicting values over time? We answer these questions based on semi-structured interviews with key stakeholders in NPEC (n = 53 interviews with 63 respondents), document analysis, and additional in situ observations of meetings of regional networks, the steering committee, and conferences.
In the theory section, we first introduce a new pragmatist approach to studying values in practices that has rarely been used in the field of Public Administration. As noted by West and Davis (2011), however, a new pragmatist approach can make an important contribution to PA literature as it can bring in a more practice-based understanding of values. This provides insights into how multiple stakeholders in daily practices constitute the value of an experimental improvement program, thereby challenging or defending institutionalized routines, behavior, and accountability criteria. Second, we reflect on the issue of conflicting values in experimental policy programs. After discussing our qualitative methods, we discuss the findings in two parts: the main value conflicts and the strategies developed to deal with these conflicts. In the discussion and conclusion, we reflect on the theoretical implications of our findings.

Introducing a New Pragmatist Approach to Studying Value Conflicts and Strategies
We build on an alternative body of work called "new pragmatism of values" (West & Davis, 2011) that joins together insights from Science and Technology Studies (Latour, 2004, 2005; Law & Mol, 2006; Mol, 2007), pragmatist sociology (Boltanski & Thévenot, 2006; Jagd, 2011; Lamont, 2012; Patriotta et al., 2011), and valuation studies (Dussauge et al., 2015a; Helgesson & Muniesa, 2013; Kornberger, 2017; Schuurmans et al., 2020; Vatin, 2013). Although this approach is rarely used in the field of Public Administration, according to West and Davis (2011) it can make an important contribution to public values research as it brings in a practice-based view of values. This approach is sensitive to how values are embedded in institutional routines and accountability regimes over a longer period of time, yet at the same time pays attention to how values and value conflicts can be reconfigured in practices by means of pragmatically developing strategies. Below, we further outline three core insights derived from a new pragmatist approach about values.
First, a new pragmatist approach places values firmly inside mundane practices, thereby building on previous pragmatist thought as developed by Dewey (1913, 1939). As Dussauge et al. (2015a, p. 19) note, "values should be seen as always already constituted in practices, not as static entities which exist outside of action." As such, values can be defined as emergent qualities rooted in the demands of concrete practices (Boltanski & Thévenot, 2006), such as the practice of care giving (Mol, 2008; Pols, 2006) or strategy making (Kornberger, 2017). However, this does not mean that practices merely bring to the fore already existing values. Practices can also be constitutive of value making (Helgesson & Muniesa, 2013; Kornberger, 2017; Vatin, 2013). It is in daily practices of stakeholders that value (or "worth") is created in situ. This implies that values are not a priori given but come into being in acts of valuing (Helgesson & Muniesa, 2013; Kornberger, 2017; Vatin, 2013). This practice-based understanding of values thus moves beyond current understanding of values as either cognitive beliefs of certain actors (professionals/citizens/managers) or as stable qualities of specific institutions (e.g., competition belongs solely to the institution of the market), placing values within practices themselves.
Second, while building on previous pragmatist thinking, a new pragmatist approach develops new insights by researching how multiple actors, when in disagreement about which values count most, use collective discretionary space to pragmatically develop different strategies for dealing with conflicting values (Boltanski & Thévenot, 2006; Rutz & De Bont, 2020). Empirical studies can thus provide insights into how certain things come to be considered valuable and desirable, as well as how certain registers of value are ordered and displaced (Dussauge et al., 2015b, p. 268). For example, Rutz et al.'s (2018) study of a collaboration between healthcare inspectors and service users showed how inspectors developed different strategies to deal with conflicting valuations of good care, ranging from creating a value hierarchy to delegating value conflicts to professionals and separating conflicting values in daily work. Moreover, Oldenhof et al.'s (2014) study showed how care managers, in negotiations with professionals and clients, constructed compromises between conflicting values (e.g., civic and domestic values), which were materialized into rhetoric, behavior, protocols, and objects in small-scale care facilities.
Third, a new pragmatist approach has the potential to explicitly "bring in" the role of organizational objects that play a role in how something is valued. In line with Asdal (2015) and Vatin (2013) we especially consider organizational objects such as vision documents, performance indicators, accountability criteria, and protocols, as "value agents" constitutive of value-making (Boltanski & Thévenot, 2006). They do not only dictate who receives what (e.g., organizational resources), but also what values come to matter in the first place (Roscoe, 2013). As such, material value agents can thus be thought of as "matters of concern" (Latour, 2004): spurring controversy about which matters should be addressed or about the correct way of evaluating performance. By taking into account the role of objects, we are able to ameliorate the anthropocentric focus of public values research that tends to place humans center stage and overlooks the interaction with non-human objects (Latour, 2005; Ustek-Spilda, 2020; West & Davis, 2011). Moreover, attention for organizational objects like guidelines can also explain how certain values become more institutionalized than others, thereby being more difficult to displace or re-order.
Hence, by applying a new pragmatist approach, we are able to "place" values firmly inside practices and "bring in" objects and strategies into the study of conflicting values, thereby making a valuable contribution to the field of Public Administration. The empirical terrain for studying value questions up close is vast given that value conflicts are part and parcel of administrative and organizational life (de Graaf et al., 2016; Dussauge et al., 2015a; Spicer, 2009; Stark, 2009; Stewart, 2006; Thacher & Rein, 2004; Wagenaar, 2014). However, we argue in this paper that value conflicts become particularly visible in experimentalist contexts that join up multiple stakeholders in the ambition to move beyond the status quo, thereby potentially disrupting existing values and generating new ones. This makes the practice of policy experiments a fruitful research ground to study value struggles and challenges.

Conflicts at Stake in Policy Experiments
Despite the widespread use of policy experiments in the public sector (Ettelt et al., 2015), the literature on policy experiments has been sparse (Martin & Sanderson, 1999; Sanderson, 2002, 2009) and only recently gained traction (Bailey et al., 2017; Ettelt et al., 2015; Felder et al., 2018; Hodgson et al., 2019; Nair & Howlett, 2016). In this paper we build on this growing body of literature and aim to further develop it by particularly focusing on how conflicts in experiments can be analyzed as value conflicts. Before doing so, we first briefly describe the aim and scope of policy experiments and how conflicts have been discussed in existing studies to date.
Policy experiments have become an increasingly popular mode of policymaking to address wicked problems in diverse fields such as health, social care, education, employment, and justice (Bailey et al., 2017; Ettelt et al., 2015; Felder et al., 2018; Hodgson et al., 2019; Nair & Howlett, 2016). Recent examples of policy experiments range from new partnerships for preventive care for older persons (Ettelt et al., 2015) to innovative urban housing policies (Mei & Liu, 2014) and the development and use of assistive technologies (Bailey et al., 2017). What is common to these diverse policy experiments is the acknowledgment that top-down approaches to policy innovation do not work due to uncertainty about cause and effect of policy measures in complex multi-governance settings (Nair & Howlett, 2016). Therefore, policy experiments aim to join national and local stakeholders in the quest to collectively "work with" uncertainty by co-developing experiments that can be adjusted to specific settings and incorporate emerging insights and unexpected events along the way.
So far, existing studies of policy experiments have addressed the topic of conflict mainly through the lens of (a) politics of power and (b) conflicting goals.
With regard to politics, Nair and Howlett (2016) have discussed various political factors that can impede the design, development, and evaluation of policy experiments. To begin with, there are recurring power struggles between national and local policy actors in policy experiments. When national governments feel reputational risks are involved (in case of potential failure or lack of tangible progress), they may decide to increase their central grip on the design and implementation of the experiments. This may clash with expectations of local stakeholders who operated under the assumption that they would have plenty of discretionary space to tailor the policy experiments to their specific needs (see also Bate & Robert, 2003; Zuiderent-Jerak et al., 2009). In addition, existing institutional arrangements and policies may clash with new policy experiments that aim to replace existing arrangements. Moreover, in institutionally layered environments (Van de Bovenkamp et al., 2014), policymakers may sometimes instrumentally use pilot experiments to delay or avoid more fundamental large-scale policy reform until "the political mood is ripe for a more enduring course of action" (Jann and Wegrich, cited in Nair & Howlett, 2016, p. 71). As Nair and Howlett put this, "instead of evidence-based policymaking, pilot projects may be used as tools for conflict avoidance" (p. 71). Finally, in the evaluation phase of experiments, politicians and policymakers can exercise political pressure on evaluators of policy experiments to deliver positive evidence for pre-determined policy measures, thereby creating integrity conflicts (Nair & Howlett, 2016).
A slightly different take on conflict is given by Ettelt et al. (2015). Their study of policy experiments in social care primarily defines conflicts in terms of divergent goals of the experiment itself. Based on their case study research in the UK, they outline four main goals of policy experiments: (1) experimentation (testing via RCT design whether pilots are cost-effective), (2) implementation (rolling out best practices), (3) demonstration (best-practice sites showing to others how to do it), and (4) learning (learning how to overcome barriers). Although on paper it seems possible to combine multiple goals of policy experiments, in practice these goals often conflict or are differently prioritized in time. For example, stakeholders may still be in the phase of learning to overcome barriers when national policymakers already want to identify and scale up best practices across the country. In addition, conflicts may also arise because goals may shift in time, depending on shifting agendas of stakeholders.
Although the discussed studies have firmly put conflict on the research agenda, these conflicts have not been analyzed in terms of value conflicts, thereby missing out on the underlying value dynamics at stake in policy experiments. Moreover, it is unclear how value conflicts change over time as a consequence of strategies developed by multiple stakeholders.

Methods
Our analysis is based on qualitative research of the National Program for Elderly Care (NPEC) that ran from 2008 until 2016. With a total budget of over 80 million euros, it was one of the largest care improvement programs in the Netherlands. The Ministry of Health, Welfare and Sports commissioned the Netherlands Organisation for Health Research and Development (ZonMw, a funding body for health research and innovation) to develop and monitor the program. A national program committee was established by the funding body to oversee the operational governance of the program. Members of this committee had a high public profile and were viewed as experts in research, policy, and healthcare. As part of the program, eight regional networks were established with various stakeholders in each network (medical research departments, elderly associations, regional care and welfare organizations, and patient associations). The number of organizational actors in the regional networks varied from 19 to more than 100. The involved organizations came from different sectors. Most commonly they included municipalities, Public Health Services, mental healthcare organizations, applied universities, University Medical Centres, care and welfare organizations, and, on some occasions, health insurers.
Each network applied for one large "transition experiment" (to reorganize care) and could apply for additional research projects. In each network, a target group panel consisting of older persons was established, whose task consisted of evaluating the research proposals and providing input for decisions about the focus of the regional network. Figure 1 provides a schematic overview of the program.
As researchers, we were commissioned by ZonMw to evaluate the NPEC. In this role, we conducted 53 semi-structured interviews with 63 respondents, ranging from 50 to 150 minutes per interview, with an average of 90 minutes. We interviewed different stakeholders, both on national and regional levels. Respondents included older persons of the target group panels, national representatives of elderly associations, medical researchers, representatives of medical associations, national policymakers at the Ministry of Health, Welfare and Sports and the funding body, network coordinators, and other stakeholders. We interviewed at least three actors in each regional network: the network leader, the coordinator, and one or more members of the target group panels. We selected three networks for an in-depth study, conducting 22 interviews with 24 persons. In these networks we also interviewed members involved in the network governance (advisory committee, strategic committee) and actors from different sectors (see above). All interviews were transcribed verbatim. To supplement the interview data, we conducted observations of meetings of the target group panels in networks (n = 3) and analyzed various documents (such as program texts, review criteria, progress reports, newsletters, vision documents, and minutes of meetings) to better understand how the program was presented "front stage" and which values were publicly enacted in different moments of the program (cf. Wehrens et al., 2021).
In our interviews, we tried to avoid bias by not a priori defining a set of values. Rather, we adopted the strategy of asking questions that explored how respondents valued something "in context" (see also Paanakker, 2020;Rutgers, 2008): for example, what do you see as an important overall worth of the NPEC? How do you value a "good" experiment in your regional network? Did this change throughout the course of the program? When respondents mentioned conflicts, we probed with further questions to reveal controversies about conflicting values and different strategies to respond to this. In addition, during observations we paid attention to how respondents interacted with each other when "matters of concern" (Latour, 2004) were raised.
In our analysis of the data, the widely divergent and conflicting valuations of the program's worth were a surprising result. Through an abductive process (Tavory & Timmermans, 2014) of moving back and forth between empirical results and theoretical notions of value strategies and (conflicting) values, we categorized our data in terms of values (how do actors value the overall "worth" of the experimental program?), value conflicts (frictions between different valuations of the program), and different strategies to deal with conflicting values. In line with Boltanski and Thévenot (2006), we consider value-making not just as a human activity but a material endeavor too. We therefore also focused on the role of evaluative material devices by analyzing the role of various documents and instruments in value-making.
Our analysis resulted in the identification of four valuation schemes and six strategies for dealing with tensions between these valuation schemes (detailed below), which form the core of our results. While these schemes and strategies function as ideal-typical abstractions, they are based on the emic accounts of various actors we interviewed (i.e., the actions and processes they described in their accounts of the program). The term "scheme" signifies that values are not free floating, but are explicated in accountability criteria, operationalized in devices, and institutionalized in organizations and projects. These aspects are internally consistent and reinforce the main values: the notion of a valuation "scheme" thus helps to point towards this internal coherence and reinforcement.
For each valuation scheme, we identified the core values that were enacted in practice, the core accountability criteria, devices, level of institutionalization, and main concern. These categories were developed through our familiarity with the theoretical literature in valuation studies and public administration. We identified the core values by coding what respondents mentioned as worthy, valuable, or desirable. For the category "accountability criteria," we coded actors' accounts of what they identified as key criteria to evaluate "success" or "good practice." For the category "devices," we drew inspiration from Asdal's (2015) and Vatin's (2013) recognition of documents and devices as lively "value agents" constitutive of value-making. Thus, accountability criteria and values become operationalized in devices. We specifically coded tools, instruments, methods, and formats that are used in order to achieve the values deemed important. The category of "institutionalization" refers to whether (and to what extent) the valuation scheme is structurally embedded in an organizational context. This is based on the assumption that the more institutionalized certain valuation schemes are, the more powerful they become. The category of "main concern" is based on the value conflicts that arise. It is within such value conflicts that the main concerns of actors become visible. We particularly coded situations that actors sought to avoid (as this says something about the stakes involved and the risks experienced).
The analysis subsequently focuses on the identification of tensions between different valuation schemes and the strategies that actors developed over time in order to deal with emerging conflicts. The specific labels of the six strategies (e.g., compromises, colonization, prioritization, etc.) are partially inspired by literature (Boltanski & Thévenot, 2006; Habermas, 1987; Stewart, 2006; Thacher & Rein, 2004). We used "sensitizing concepts" from the existing literature on value strategies, such as hybridization, bias (Stewart, 2006), and compromises (Boltanski & Thévenot, 2006), to provide a general sense of direction, while still keeping sufficient space for inductive findings that are grounded in the data and views of the respondents (Blumer, 1954; Bowen, 2006). During the data analysis it became clear that most strategies from existing literature operated on a macro system level (hybridization, bias, etc.), whereas our strategies seemed more attuned to organizational realities of the policy experiment on a meso level. While having similar dynamics, the strategies found are thus more concrete and embedded in inter-organizational practices. During the interviews respondents reflected on difficult choices and dilemmas within the program. We asked them how they dealt with such dilemmas practically. Such practical examples were subsequently coded in terms of value conflicts and strategies for dealing with value conflicts, thereby redeveloping the sensitizing concepts on a meso level. The analysis is strengthened via data triangulation of interview transcripts, field notes of observations, and documents to increase the validity of our findings. The data analysis was conducted by the first two researchers, who independently coded the data and subsequently discussed and compared the coding.

Results
In this section we describe the various valuation practices of actors engaged in the program, unraveling which values come to count as important and which values are discounted in the process of experimentation. We also demonstrate how actors deal with conflicting valuation schemes by developing various strategies. A typology is provided of the main valuation schemes and strategies to deal with conflicting valuation schemes. Additionally, we show the consequences of these strategies for experimental learning in terms of what can be experimentally learned and by whom.

Different Valuations of the Worth of an Experimental Improvement Program
Throughout our research we identified four distinct valuation schemes that each value the worth of the National Program for Elderly Care (NPEC) in different ways: an evidence-based, participatory, collaborative, and experience-based valuation scheme. These valuation schemes constitute specific values, are enacted by different actors, become reified through different devices and instruments, come with different accountability criteria, vary in their degree of institutionalization, and distinguish different main concerns, as shown in Table 1.
Evidence-based valuation scheme. The evidence-based valuation scheme was primarily enacted in the practice of medical researchers that also worked in the frontline as physicians. It was also supported by important policy actors and the Netherlands Federation of University Medical Centres (NFU). Actors in this valuation scheme argued that a "good" experimental program should lead to evidence-based outcomes. Core values embraced by these actors were "effectiveness" of the developed projects and objectivity of the research. The success of projects was primarily defined in terms of statistical significance. Core instruments through which this valuation scheme was enacted were particular research designs (cost-effectiveness analysis, Randomized Controlled Trials). The valuation scheme is characterized by its homogeneous accountability criteria used to demarcate "proper" research about elderly care. A tight "fit" exists between these criteria and the set of instruments. Actors utilizing this valuation scheme often contrasted "hard evidence" with "soft" forms of research or "common sense," practice-based ways of working. The evidence-based valuation scheme is also characterized by a high level of institutionalization. The NFU is an association with a long history and strong policy position. Furthermore, Dutch health policy has a strong historical context of using medical evidence. The primary concern within this valuation scheme is the complex and unpredictable character of elderly care, potentially corrupting study designs.
Participatory valuation scheme. In the participatory valuation scheme, the worth of the experimental improvement program was constituted in participation of older persons. This valuation scheme was primarily enacted by older persons in the regional networks, representatives from elderly associations, and some representatives in the national program committee committed to include the "voice" of older persons. It also materialized in the program documentation as an important ideal of the funding agency. The main values enacted in this scheme are "empowerment" and "inclusiveness." The inclusion of older persons in the design and evaluation of research proposals was expected to lead to research better aligned with their needs. Empowerment of older persons ("talking with" instead of "talking about") and ideals about shared decision-making also gained prominence. The value of inclusiveness was mainly enacted through the formulation of specific program requirements by the funding agency, making participation of older persons in "target group panels" a mandatory condition for receiving funding. This scheme is characterized by more heterogeneous accountability criteria, in which relevance for the lifeworld of older persons and representativeness of the research results were most important. Compared to the evidence-based valuation scheme, the participatory valuation scheme is less institutionalized. Despite a long history of umbrella associations, the national organization of elderly associations is fragmented into different "pillarized" organizations. A main concern in this valuation scheme is tokenism: the pro-forma routine of including older persons without meaningful engagement.
Collaborative valuation scheme. The collaborative valuation scheme was mainly enacted by program committee members and actors involved in the regional networks. In this valuation scheme, a "good" experimental improvement program was perceived as a collaborative effort in which different organizations prioritized values such as "integration of care," "customization," and "inter-organizational collaboration." On the regional level, this related to customized services and integrated care (discursively contrasted with "fragmentized" care in a market-based setting). On the national level, program officials initially viewed collaboration as a means, but gradually framed collaboration as a goal in itself. Within this valuation scheme, the regional networks and transition experiments aimed at re-organizing care were seen as important instruments. Accountability criteria of "successful" collaboration were a growing number of network partners from different organizations, impactful transition experiments, and learning from each other in and between networks. This valuation scheme is less coherent and less institutionalized than the previous two, as its institutionalization depends on the sustainability of the regional networks. Given that the networks and transition experiments were dependent on temporary funding, the institutionalization of collaboration was weak. The lack of sustainability of the collaboration is also one of the main concerns experienced by actors. Another concern relates to red tape that hinders successful collaboration, for instance misaligned financial structures that made inter-organizational collaboration between nursing homes and hospitals more complicated.
Experience-based valuation scheme. The experience-based valuation scheme was mainly enacted by persons involved in concrete regional improvement projects (e.g., healthcare professionals and older persons who were the "target" of the interventions). These actors valued the worth of the experimental improvement program as an experiential improvement. This implied that the projects to improve elderly care were not valued for their statistical significance, but on an experiential basis: the projects led to better perceived wellbeing and practical improvements in daily life. Whereas the other valuation schemes became materialized in particular devices and instruments, the experience-based valuation scheme is less coherently reified. Its main instruments are user narratives presented in documentation about the program (personal stories serving as "intermezzo" to more formal descriptions of the program's results) and the work of ambassadors (older persons with a large network in managerial or policy functions), who share positive experiences with the program at various meetings. In terms of accountability criteria, this valuation scheme emphasizes direct practical value: interventions should resonate with the lifeworld of professionals and older persons and need to be perceived as usable and meaningful. This valuation scheme is poorly institutionalized, as its institutionalization depends on older participants and professionals that are only temporarily involved in the regional networks. The main concern in this valuation scheme is "ivory tower science": scientifically interesting research projects that lack relevance for the lived experiences of older persons.

Frictions Between Various Valuation Schemes
Although the valuation schemes are presented separately, in the day-to-day practice of the regional networks and the overall program they are entangled, sometimes strengthening each other, but in other situations clashing and creating dissonances. We identify three recurring frictions between valuation schemes.
Friction 1: Inclusiveness of elderly perspectives versus generalized, non-idiosyncratic knowledge. The first friction manifested itself between the participatory and evidence-based valuation schemes. In the participatory scheme, empowerment and inclusiveness of older participants were core values. These were enacted through the formulation of target group panels and the representation of older persons in program governance. In each regional network, a target group panel of older participants was involved in evaluating the quality of research proposals before they were sent to the funding organization for review. However, in the evidence-based valuation scheme, much emphasis was placed on epistemic criteria of rigor and objectivity. As outlined, this led to specific requirements for research projects and a particular view on suitable methods, which were often not in line with the ideas older participants developed in the panels. The project leader facilitating participation of older persons reflects on these differences: "If you are going to ask older persons what they find most important in the projects, they want the projects [in the NPEC] to be conducted in a way that it leads to direct, very concrete, benefits for older persons [. . .]. That they receive more attention, that something is done to reduce loneliness, that mobility is improved. All these themes were not the focus of the NPEC" (Interview project leader older participation NPEC).
Older participants in the target group panels thus struggled to have their values and accountability criteria reflected in the research projects. Especially in the first period of the program, this difficulty tied into the concern of tokenism. Feelings of exclusion were strengthened because research proposals initiated bottom-up by target group panels often were not funded by the funding body that used scientific evaluation criteria to assess the worth of proposals.
Friction 2: Lack of scientific results versus experienced improvements. A second friction, between the experience-based and evidence-based valuation schemes, arose at the end of the program in debates about how to value the program's overall results. Although the regional networks were growing and new research was conducted, many medical researchers argued that the effectiveness and statistical significance of interventions were small to non-existent. When researchers claimed that projects failed to render results, however, older participants and professionals felt indignant because they experienced worth in different ways: "There is still no agreement about what the NPEC has delivered. On what scale do you weigh this? [. . .] If you look at the projects of early detection [of risk factors], that was about less unnecessary care, more self-reliance and better quality of life. Well, if you look at it scientifically, the projects didn't deliver that. But what did they deliver? That the older persons who have received care and support in these projects are much happier with the support they received. The professionals like this way of working much more and have the feeling that they are able to connect [with other care providers] much faster in case an older person is at risk of becoming vulnerable" (Interview project leader older participation NPEC).
The perceived improvements by older persons and professionals, however, often could not be substantiated in epistemic criteria that mattered in the evidence-based valuation scheme. The experiential value was hard to capture in credible numbers.
Friction 3: Flexibility in experimenting versus exclusion of uncertainties. A third friction manifested between core values of the collaborative valuation scheme (freely experimenting and collaborating in order to reorganize care) and the main concerns within the evidence-based valuation scheme (the need to control the process of experimentation in order to avoid unpredictable events corrupting the study design). The core values in the collaborative valuation scheme required a flexible approach of researchers, who had to engage with various stakeholders, collaborate, intervene, and make adjustments in changing organizational settings. Not knowing what to expect was a crucial element in transition projects that aimed to radically transform the healthcare system by challenging common ways of organizing. These ideals conflicted with the methodological devices and accountability criteria in the evidence-based valuation scheme, as a medical researcher also recognized: "With a clinical trial one describes in advance in a very detailed manner what the intervention will be and that has nothing to do with the fact whether you sufficiently keep into contact with practitioners and consider how practices can adapt [. . .]. A transition project has a much more open scenario in the sense of learning to change [. . .]. They are two different worlds" (Interview medical researcher).
As becomes clear from this quote, the attempt to exclude uncertainty via the RCT format does not align well with the collaborative values of flexible experimentation.
Notably, the different frictions all related to the evidence-based valuation scheme in different constellations. This can be explained through the high level of institutionalization and internal consistency of this valuation scheme, which renders it less easily reconcilable with values embedded in the other valuation schemes. This is not to argue that the tensions between valuation schemes are a priori given: they arise in particular situations and can also be (temporarily) resolved or negotiated into new alignments. In the following section, we focus on different strategies for dealing with conflicting valuation schemes.

Strategies for Dealing With Different Valuation Schemes
We identified six strategies that were used by stakeholders in order to coordinate and align various valuation schemes in practice (see Table 2). The strategies differ in the extent to which they lead to value alignment, that is, the combination of multiple values. The degree of value alignment was determined by taking into account the following aspects: time (e.g., whether values were combined temporarily or more permanently), power (e.g., whether values were imposed by powerful others or combined based on a shared understanding), and institutionalization (e.g., whether values were institutionalized in separate silos or combined in new organizational bodies/formats). The strategies are developed in different relational contexts and come with different concerns, as is shown in Table 2.
Colonization. The strategy with the lowest level of value alignment is colonization: the durable imposition of a dominant valuation scheme on other valuation practices. This strategy denies equality between valuation schemes by advantaging one particular set of values for a longer period of time. Usually, colonization occurs when a particular valuation scheme has been embedded and materialized in detailed evaluation criteria, professional formats, and organizational infrastructures. This was the case for the evidence-based valuation scheme in the first years of the NPEC. An important consequence was that actors representing different values had to align their activities toward the values and criteria of the dominant valuation scheme. As a result of colonization, certain actors increasingly sought to resemble or mimic the dominant valuation scheme by adopting the same language and metrics: "The project proposals as developed by medical researchers, were written in a format provided by the funding organization and need to be scientifically convincing. But if you want older persons to evaluate these, then you ask them to place themselves in the world of researchers and the scientific discourse to be able to come to an appropriate judgement. [. . .] So for older persons this was very complicated" (Interview coordinator elderly participation).
Within the program, the national program committee developed formats through which older participants were further "professionalized" via training and handbooks to understand the scientific discourse and get used to the specific requirements for evaluating and submitting research.
Prioritization. Another strategy to deal with conflicting valuation schemes without attempting value alignment is the strategy of prioritization: the choice to make certain valuation schemes temporarily more important than others, thereby ordering values hierarchically. In contrast to colonization, prioritization is more temporary and can be the result of consensus or bargaining of stakeholders. Whereas colonization leads to mimicking the procedures of the dominant valuation scheme, prioritization can lead to broader reflection on the advantages and disadvantages of various valuation schemes. This entails the weighing of pros and cons, although the extent to which various groups can prioritize values depends on the relative strength of their position vis-à-vis others. What is not being learned in this strategy is how valuation schemes can be productively combined.
An example of this strategy was the gradual shift toward prioritizing the participatory valuation scheme over the evidence-based valuation scheme. On a national level, the program committee conducted an interim evaluation and concluded that the voice of older participants, which on paper was one of the core goals of the program, remained limited because narrowly defined medical research projects dominated the program (Source: Interim overview of the NPEC, letter to secretary of state, 31-08-2010). Consequently, the program committee developed several specific calls for proposals in which criteria reflecting the participatory and experience-based valuation schemes (relevance for older persons, more empowerment) became explicitly prioritized: "[We] started as a regular program, with research calls that are evaluated. And many proposals were about generating knowledge about older persons. And halfway we [the program committee] discovered that the focus should not only be on medical care, but especially also well-being. So [. . .] there was a development in which we said: these older persons need more support" (Interview member program committee).
The values that mattered in the lifeworld of older persons (well-being, empowerment) gradually became prioritized through specific calls for projects.
Another example of prioritization was the attempted move to prioritize values of collaboration in order to highlight the program's outcomes. As most projects did not lead to improved health outcomes as valued in the evidence-based scheme, and the program still needed to be legitimized to high-level policymakers, the program committee became more open to other values that captured some of the program's results. By prioritizing the collaborative valuation scheme in accounting for the results, the program was presented by the program committee as a first step in a longer process of collaboration and knowledge exchange in the domain of elderly care. Although the program committee organized several attempts to establish a broader coalition of key policy actors and national associations that would express their commitment to the collaborative change agenda, other actors were reluctant to align with this prioritization strategy.

Pilotification. A third strategy to deal with various valuation schemes is pilotification: attempts to circumvent tensions between valuation schemes by creating a niche space outside the system. The extent of value alignment can be characterized as low. Where the strategies of colonization and prioritization establish a clear hierarchical view on accountability (some criteria are considered more important than others), in the pilotification strategy accountability criteria are treated as "siloed" (i.e., the differences are circumvented by temporary solutions).
This strategy can be seen in how the program committee formulated the purpose of the large-scale transition experiments: as distinct from traditional research projects. Their aim was to be bolder and broader: a full reorganization of care for older persons facing multiple conditions. This bold sense of experimentation was greatly appreciated by actors in regional networks. However, such experiments did not align well with existing institutional, financial, and organizational logics. The program committee therefore sought ways to create a temporary niche space in order to create room for innovative projects to start: "Where necessary we offer, in consultation with the Ministry and other relevant stakeholders, (temporary) extra space to be able to conduct the intended experiments. It can for instance be necessary to combine financing from different systems or domains over the course of the project, to expand existing regulations, etc." (Source: ZonMw, 2008).
While the temporary space created for experimentation did enable the establishment of transition experiments, this strategy also made it more difficult to implement the outcomes of projects in existing institutional logics and structures. It turned out to be particularly difficult to embed pilots, even those widely considered to be valuable, within existing financial structures, especially if this involved multiple budgets of different stakeholders. Pilotification can thus lead to a lack of sustainable solutions and the persistence of value tensions.
Short-cutting. A fourth strategy in aligning various valuation schemes is short-cutting: aligning valuation schemes by strategically selecting the "right" actors to create more convergence between various perspectives. The extent of value alignment in this strategy can be characterized as low to medium: although the inclusion of actors from other valuation schemes leads to a convergence of perspectives, this convergence is only possible because the actors included are already partially aligned with the other valuation scheme.
An example of short-cutting can be seen in the strategies used by researchers to include older persons in target group panels when applying for network funding. As a member of the program committee recalls, network coordinators mainly used their own personal network to select older participants to become involved in the target group panels: "The regional networks had received the assignment from the program committee to involve older persons. Part of the older participants spoke the medical language well, like a retired professor (. . .). So [the network coordinators] had complied with the assignment: you have to organize elderly participation. But the older persons were, for a part, personal relations that could easily participate in that culture and way of thinking" (Interview member program committee).
Because of the biased selection of academically oriented and policy-savvy older persons already more affiliated with the values considered important in the evidence-based valuation scheme, it is questionable whether participants in the target group panels were a good representation of frail older persons in general. Rather than leading toward a more substantive alignment of perspectives and values, this strategy therefore seeks to short-cut some of the tensions. Its main risk is the superficial character of any consensus reached in this way, due to the lack of representativeness.
Compromising. A fifth strategy used to deal with different valuations is compromising: attempts to reach a concession that leads to a (temporary) settlement between multiple valuation schemes. With this strategy, actors water down certain valuation schemes to enable the inclusion of alternative values. Accountability becomes more diverse, as this strategy acknowledges the need for (mutual) adjustment of accountability criteria over more hierarchical or siloed views. Learning is conceived as a substantive and mutual development of insights into each other's values and concerns.
A good example of this strategy is linked to the shift in research calls toward participatory and experience-based valuation schemes (discussed above). Whilst this shift is an example of prioritization, the way in which the funding organization evaluated the submitted research proposals is an example of a concession in order to reach a temporary value alignment. According to one of the members of the program committee, it was important at that point in the program to evaluate the research proposals more flexibly, in order to be able to include proposals focusing on welfare that did not fully meet the scientific review criteria: "In the meeting where [relevant proposals] were being evaluated, we [would say]: just let them have another look at it [. . .], rather than rejecting proposals immediately. So in case of an application, you could say: we cannot grant your (proposal) funding, but if you take into account this and this and you come back, we will have another look at it" (Interview program committee member).
This is an example of compromising between the evidence-based valuation scheme, in which rigor is highly valued as an important epistemic criterion, and the experience-based valuation scheme, in which projects with a broader focus on well-being were considered to be valuable.
Enmeshing. The last strategy used to deal with conflicting valuation schemes is enmeshing: setting up hybrid organizational bodies representing different valuation schemes. This strategy attempts the highest degree of value alignment. By bringing together actors from different valuation schemes in the same organizational body, actors need to face up to the multiplicity of values. Similar to the strategy of "compromising," accountability here becomes viewed as multiple. Whereas other strategies avoid confrontation with the multiplicity of value distinctions (either through prioritization or by circumvention), this strategy sees confrontation with the multiplicity of values as part of a mutual learning process of recognizing each other's perspective. There is a fine line, however, between the strategies of enmeshing and short-cutting: an organizational body can cosmetically bring together actors from various valuation schemes yet in practice only select "favorable" candidates already more aligned toward particular values.
An important example of enmeshing is the establishment of the hybrid NPEC program committee: "So that commission was composed of many more disciplines than just the medical-biological one (. . .). That's why we were able to better judge applications. Otherwise we could have said: those 'soft' issues, we can't take that into account" (Interview member program committee).
By seeking a balance between actors from different backgrounds (medical researchers, welfare representatives, elderly representatives), the committee explicitly sought to include key actors that represent different valuation schemes. At the same time, previous examples showed that the existence of such a hybrid organizational body per se does not automatically lead to value alignment: it took the committee several years to shift from the evidence-based valuation scheme toward a more harmonized approach in which other valuation schemes were gradually considered equally important.

Conclusion and Discussion
With this paper we contributed to recent calls in Public Administration for more empirical studies in situ of how different stakeholders struggle with multiple conflicting values and develop pragmatic strategies to deal with this (de Graaf et al., 2016;Hartley et al., 2017;West & Davis, 2011). By adopting a new pragmatist approach to studying value questions in the context of policy experiments, we were able to investigate how values are actively constituted in practices of different stakeholders, thereby moving beyond current accounts of values as either abstract entities "out there" or cognitive preferences in individual minds (see for a critique Kornberger, 2017;West & Davis, 2011).
Based on a qualitative analysis of a Dutch experimental improvement program in healthcare, we reveal how stakeholders differently valued and operationalized the worth of the program, which manifested itself in three value conflicts: inclusive empowerment of older persons versus non-idiosyncratic knowledge; evidence-based versus personally experienced improvements; and flexibility in experimenting versus exclusion of uncertainties. To deal with these conflicts, stakeholders developed various strategies depending on their institutionalized position in the field of elderly care. We identified six strategies to deal with these value conflicts: colonization (imposition of a dominant value scheme), short-cutting (aligning values by strategically selecting "the right" actors), compromising (temporary settlement of values), prioritization (temporarily making certain values more important than others), organizational enmeshing (creating hybrid organizational bodies), and pilotification (circumventing value conflicts by creating a niche space). The results show how the choice of strategies shifts over time from more exclusive top-down value strategies to more inclusive multi-value strategies due to pressures of target group panels and new priorities of the funding agency.
The empirical analysis demonstrates that stakeholders cannot freely pick and choose strategies to deal with conflicting values as they see fit. For example, the strategy of colonization is only a viable option when stakeholders have a dominant position in the field. This position is not so much a matter of individual power attributed to stakeholders but the result of a high level of institutionalization of particular values in work routines, valuation devices, and accountability criteria. This was evidently the case with the evidence-based valuation scheme that was firmly embedded in daily practices of healthcare: not only in the dominant RCT design that medical researchers used, but also in terms of criteria for funding and the format of submitting research proposals. However, this does not automatically imply that less institutionalized valuation schemes do not stand a chance of disrupting the status quo in experimental improvement programs. Due to increasing critique and pressure of the target group (i.e., frail older persons in need of care) and disappointing results of RCTs, the funding body gradually started to prioritize other values in subsequent funding calls, including values of empowerment of older persons and inter-organizational collaboration in networks. In addition, medical researchers and policymakers at the funding body were more willing to apply the strategy of compromising. Rather than sticking rigidly to criteria of scientific rigor, they evaluated proposals for transition experiments with a lenient eye and focused more on values such as experience-based impact for older persons. Moreover, after a number of years the strategy of organizational enmeshing seemed to pay off because stakeholders with different backgrounds got increasingly acquainted with each other's perspectives, thereby opening up space for value alignments rather than value colonization.
These unfolding shifts in type of strategies demonstrate the key importance of taking into account the temporal dimension in researching value conflicts and strategies (Hydle, 2015). Rather than in situ value research just focusing on the here and now, it can adopt a longitudinal time frame that is able to capture how practices of value-making change as time unfolds, how value conflicts are (temporarily) settled, and how different valuations become more or less dominant.
Our empirical analysis also contributes to existing studies in PA that previously identified strategies to deal with conflicting values in various policy domains (de Graaf et al., 2016;Stewart, 2006;Thacher & Rein, 2004). Rather than adding new strategies to an increasing list, it makes more sense to assess how various strategies relate to each other and on what level: micro (individual frontline), meso (organizational), macro (system). While some strategies previously identified are individual micro level strategies, for example, casuistry (Thacher & Rein, 2004) and escalating (de Graaf et al., 2016), most strategies appear to function on a macro system level (Stewart, 2006;Thacher & Rein, 2004), such as hybridization (the co-existence of multiple values as a consequence of new policies being layered on top of existing ones), bias (the exclusion of alternative values by the development of dominant policy paradigms or technicization), and firewalls (creating different institutions that focus on different values, thereby distributing responsibility for conflicting values). The strategies that we identified in this study seem to have similar underlying dynamics but operate on a less abstract organizational meso-level. When applied frequently and over time, however, meso-level strategies can amount to macro level impact. For example, the strategies of colonization (imposition of a dominant value scheme) and shortcutting (aligning values by strategically selecting "the right" actors) can lead to bias on a system level. Moreover, meso-level strategies such as organizational enmeshing (creating hybrid organizational bodies and committees) and organizational compromising (temporary value settlements) can potentially lead to hybridization on a system level. For future research, the interplay between strategies on different levels (micro, meso, macro) could further be explored, thereby developing a layered and interactive approach to values strategies.
Such an approach could also explain how interactions between different strategies lead to institutional change and value convergence or divergence on different levels (see also Paanakker, 2020).
The identification of various strategies in this study also reveals how dealing with conflicting values is not just an individual process of balancing values, striking trade-offs, or psychological coping behavior (e.g., Lipsky, 1980;Tummers et al., 2015), as is often suggested in existing studies of street-level bureaucrats (see for a critique Thacher & Rein, 2004). Rather than putting the individual center stage, a new pragmatist approach to value questions and strategies foregrounds everyday practices and devices through which valuing and strategizing is being done (West & Davis, 2011). By adopting the viewpoint of valuation practices, it becomes possible to analyze the struggles between different valuation devices that determine what comes to count as valuable in the first place (Kornberger, 2017). As our findings indicate, the RCT design made visible and valued certain quantifiable health outcomes, while not counting less tangible experiences of older persons which could be better captured by qualitative valuation devices such as user narratives. Although we did not fully explore the in-depth struggles between valuation devices, this focus could further enrich existing studies in PA on conflicting values by going beyond the anthropocentric focus on human actors (Boltanski & Thévenot, 2006;Latour, 2005;West & Davis, 2011). Future studies could take as a starting point the rise of evaluation society (Dahler-Larsen, 2011) and empirically research the proliferation of (e)valuation devices such as rankings, benchmarks, societal cost-benefit analysis, and key performance indicators and show how these devices shape our current ideas of what is of value.
Finally, in line with existing literature on policy experiments (Bailey et al., 2017;Ettelt et al., 2015;Felder et al., 2018;Hodgson et al., 2019;Nair & Howlett, 2016), our analysis demonstrates that experimental improvement programs are far from protected spaces where stakeholders can freely experiment and learn. Conflicts are part and parcel of experimental policy programs and the daily reality of stakeholders participating in these programs. So far, authors have primarily analyzed conflicts in terms of power struggles between central and local actors (Nair & Howlett, 2016) and conflicting goals of experimental programs, for example, learning and implementation (Ettelt et al., 2015). However, our analysis reveals the underlying value dynamics of these conflicts. The experimental improvement program was a strategic arena in which different values clashed and were re-articulated, discussed, and molded to create new settlements. Experimentation, our analysis shows, is thus not a value-free search for "what works" but entails different values that are supported by different groups and embedded in different valuation instruments that shape "matters of concern" (Latour, 2004). For Public Administration scholars, it is therefore necessary to study policy experiments not as the bringer of facts or evidence but as strategic sites in which values are forged.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Netherlands Organisation for Health Research and Development (Grant No. 633300004).

Supplemental Material
Supplemental material for this article is available online.