The rise of the reflexive expert? Epistemic, care-ful and instrumental reflexivity in global public policy

The production of data and numbers has become the key mechanism of both knowing and governing global public policy. And yet, processes of quantification are inherently paradoxical: from expectations of technocratic rationality and political usability of producing ‘global’ numbers that count for ‘local’ politics and needs to the practical limitations of measurement and the necessity to work with ‘good enough’ data. This raises a question – how do these competing epistemic, political and value orders manifest themselves through the work that experts do? In this article, we explore the problem by focussing on reflexivity as a way for experts (primarily those working in key International Organisations) to make sense of and tame the tensions inherent in their work. Through rich qualitative exploration of over 80 semi-structured interviews with experts working in the areas of poverty, education and statistical capacity development, we contribute to debates in the social studies of quantification by arguing that reflexivity is not just a mental process that experts engage in but rather an important resource allowing them to make sense of the contradictions inherent in their work and to mobilise political and ethical considerations in the technocratic process of producing numbers. We identify three types of reflexivity: (1) epistemic reflexivity – regarding the quality of data and its epistemic status as reflecting reality; (2) care-ful reflexivity – regarding values embedded in data and the duty of care to the populations affected by the measurement and (3) instrumental reflexivity – regarding political rationality and the necessary trade-offs required to realise political goals. Overall, the article argues that reflexivity is becoming an increasingly central expert practice, allowing the transformation of the process of quantification into one of qualification and enabling experts to attach political attributes and values to data and measurement.


Introduction
The production of data and numbers has become a key mechanism of both knowing and governing global public policy.¹ Complex statistical systems, such as the Sustainable Development Goals (SDGs) introduced in 2015, emerged as tools for both monitoring and steering global action. At the same time, quantification is as powerful as it is paradoxical: measurement is not a neutral activity but is located at the intersection of diverse (and often competing) epistemic and value orders. On one hand, quantification requires a balance between political and technocratic considerations. Historically, the power of numbers stemmed from their ability to represent – and construct – governing problems (Scott, 2008), underpinned by the technocratic legitimacy of seemingly apolitical statistical method (Bandola-Gill, 2021; Grek, 2010). Nevertheless, it is now widely acknowledged that the power of numbers is equally derived from their political value – and this value has been increasingly foregrounded in the global arena. For instance, country participation in decision-making around the global goals has been a core premise of the SDGs, which, from the outset, invested in the adoption of a 'country-led' approach.
Thus, the push for depoliticisation and technicisation of numbers has more recently been counter-balanced by a re-politicisation of metrics, particularly as a result of the new participatory paradigm in global monitoring systems (Bandola-Gill et al., 2022). Indeed, with varying levels of active engagement, there has been more diverse participation of actors in number-making, including representatives from countries from the Global South (Fukuda-Parr and McNeill, 2019), with the aspiration to create opportunities for more democratised statistical systems (Milan and Treré, 2019). Increasingly, the production of numbers is expected to go beyond 'global' numbers and instead to account for 'local' politics and needs – or at least to give them equal weight, in that no global numbers can be produced without the active co-option of local actors and their needs.
On the other hand, quantification is a practice of 'sufficing' rather than finding perfect methodological solutions. As the global agendas become more and more extensive (partly due to the aforementioned growing number of actors involved in the monitoring process, compared, for example, with the Millennium Development Goals [MDGs]), they stray further from 'ideal' statistical environments. Many countries do not produce data about all the required items. In a stark example, as of October 2020, 5 years into the SDG Agenda, there were 52 indicators of 243 for which no African country was producing data (Ilboudo, 2020). Despite the aspiration to collect quantitative information that is as solid as possible, experts often work with 'good enough' data and 'workable' solutions.
A key question emerges in this context – how do these competing epistemic, political and value orders manifest themselves through the work that experts do? By looking at the case study of the SDGs, this article aims to answer this question by mobilising a theoretical lens of reflexivity and its multiple enactments. Throughout the article, we take reflexivity to be a practice, on the part of experts, of attending to the conditions by which governing knowledge is created (and thereby its limits), as well as an attention to the effects that governing knowledge has on those it is meant to serve. The starting point of our analysis is an observation that in the context of quantification, reflexivity is not only a thought process that is necessary for the work of experts to continuously make sense of what their work involves, to justify its failings, or to explain their moral predicament. Rather, reflexivity is also a key resource: experts use reflexivity as a political tool in their efforts to construct consensus and mobilise the participation of countries and their representatives in monitoring agendas and frameworks.
Reflexivity, as seen from this perspective, is not a monolith but rather a diverse expert practice. We argue that understanding expert practices in the highly metricised context of global governance requires a closer focus on three discrete yet intersecting forms of reflexivity that global quantification experts practise: epistemic reflexivity (regarding the quality of data and its epistemic status as reflecting reality), care-ful reflexivity (regarding values embedded in data and the duty of care to populations affected by their measurements) and instrumental reflexivity (regarding political rationality and the necessary trade-offs required to realise political goals). Experts mix and match these different forms, depending on their preferences, strategic goals or even personal characteristics. Hence, we posit reflexivity as a bridging concept between large statistical systems (or 'infrastructures' – Tichenor et al., 2022) and the agency of experts involved in measurement projects.
Indeed, reflexivity is an essential tool for scholars of global public policy to address head on both perspectivity (i.e. how researchers' social position, political aims and relationships to the study field impact on the ways they view these fields) and performativity (i.e. how our research directly affects and shapes such fields). However, we are interested not only in the ways that reflexivity is a necessary tool for those analysing the production and implementation of global public policy, but also in the ways that those working in international organisations (IOs), non-governmental organisations and governmental organisations in these transnational spaces habitually make use of various modes of reflexivity in their work, highlighting the perspectivity and performativity of those working in these spaces. How might these modes of reflexivity intersect with and depart from those more established ways of understanding and writing about reflexivity? In this article – grounded in the Science and Technology Studies focus on qualification (vs quantification) (Reinicke, 2015) – we explore reflexivity as a professional practice of experts working in IOs who employ their agency to respond to the pressure and limitations of their environments. Here a key distinction is the one between technical rationality and reflection-in-action identified by Donald Schön (1991 [1983]). As he famously stated:
There are those who choose the swampy lowlands. They deliberately involve themselves in messy but crucially important problems and, when asked to describe their methods of inquiry, they speak of experience, trial and error, intuition and muddling through.
Other professionals opt for the high ground. Hungry for technical rigor, devoted to an image of solid professional competence, or fearful of entering a world in which they feel they do not know what they are doing, they choose to confine themselves to a narrowly technical practice. (Schön, 1991: 43)
We explore the production of numbers by IOs as a process of constant navigation between the 'lowlands' and the 'high ground'. As we argue, experts in global governance are now required to mobilise virtues of technical rationality while practising reflection-in-action. This complex interplay of different orders is enacted at the level of expert practices. Hence, this practice entails reflexively mobilising ideals of technical rationality to achieve political and institutional goals. This conceptual approach allows us to explore these actors' decisions where such ideas as objectivity and performativity of numbers are in fact instrumentally mobilised to achieve specific outcomes.

Reflexivity in academic and policy work
Some have argued that reflexivity directly opposes positivist knowledge (Neufeld, 1991), and that the 'routine reflexivity' of knowledge production practices in the social sciences often calls objectivity into question (Bourdieu and Wacquant, 1992; Strathern, 1987). This reflexivity of sociology 'invites other sciences to address the question of their social foundations' (Bourdieu and Wacquant, 1992: 4), but also calls upon social scientists to interrogate the origins of our own research. In sociology and anthropology, scholars have been particularly spurred to attend to the social foundations of their own knowledge production by Foucault's (1988, 1994) meditations on the ways that psychiatric and medical knowledge were tied to the production and maintenance of power, Asad's (1973) frank assessment of anthropology's complicity with colonialism, and Rosaldo and Lamphere's (1974) identification of anthropology's dominant male perspective.
In his critique of the lack of reflexivity among International Relations scholars studying large-scale historical changes, Neufeld (1991: 54) argues for a 'theoretical reflexivity' – or 'reflection on the process of theorising' – that has three elements. First, it requires being 'aware of the underlying premises of one's theorising' (Neufeld, 1991: 55). Second, it requires the 'recognition of the inherently politico-normative content of paradigms and the normal science traditions they generate' (Neufeld, 1991: 55). In other words, reflexivity requires an attention to 'the active and vital role played by the community of researchers in the production and validation of knowledge' (Neufeld, 1991: 55). Finally, a reflexive stance to the research we produce must allow for 'the possibility of reasoned judgements in the absence of objective standards' (Neufeld, 1991: 58). In other words, taking a reflexive stance requires assessing one's assumptions, acknowledging the conditions by which one produces evidence and the frameworks used to give meaning to that evidence, and being comfortable with truth claims that might not adhere to strict objective standards. Throughout our study, we centred our analysis on the practices of the experts creating quantified global public policy in the form of the SDGs, and we found that many of these experts took reflexive stances on their own work in all three of these ways.
That statisticians and other producers and custodians of quantified knowledge are aware 'of what measures do' is certainly not a new concept in the anthropology and sociology of quantification. Mugler's (2015) work showed how South African prosecutors were very aware of the limits of quantified knowledge and actively thought about the implications of the indicators they used, what she terms 'numerical reflexivity'. As she put it,
it was clear to them that it is difficult to capture the actual complexity of a specific case, court prosecutors' skill levels, or the unpredictable and uncertain multi-actor environment in which they are managing with performance indicators. As a result these managers would not, for example [...], treat 'stats' as self-evident. They relied instead on certain experienced prosecutors' views on the data to decide how statistics should be read and interpreted. (Mugler, 2015: 94–95)
In this article, we take as a given that many statisticians and global public policy makers are reflexive about their practices, and we are interested in understanding both the frontstage and backstage work that this reflexivity does.
Neufeld's 'theoretical reflexivity' and Mugler's 'numerical reflexivity' are similar to our first type of reflexivity – what we call 'epistemic reflexivity' below – which forms the foundation for the broader forms of reflexivity that shape these experts' words and actions. Following Beck et al. (1994), we are also interested in the kind of reflexive loop created by these experts' epistemic reflexivity. Beck and his colleagues' 'reflexive modernization' was a shorthand for the 'modernization of modernization' – the analysis of what happens when 'modernization, understanding its own excesses and vicious spiral of destructive subjugation (of inner, outer and social nature) begins to take itself as object of reflection' (Beck et al., 1994: 112). For our work, we are interested in the reflexive loop instigated by experts' contemplation of the limits of quantified knowledge, which opens up to contemplating the impacts of these limited knowledges on the communities they are targeted to help (section 'Care-ful reflexivity') while also wielding both of these reflexivities for political purpose ('instrumental reflexivity').

Theoretical underpinnings
To explore the production of reflexive expertise in global public policy, we develop our analysis on the theoretical underpinnings of the concept of 'qualification' (Reinicke, 2015), the process via which actors make value judgements on the basis of the decisions and choices they are confronted with; the latter might not necessarily take into account pre-conceived categorisations, classifications or even other expert advice. At least in the field of economic sociology, such value judgements are seen as being made continuously, given the infinite world of commodities and services available: selecting a lawyer is, for example, a decision perhaps not only based on the value of the services that may be on offer, or on the ranking of the local solicitors' performance, but on other values, too, such as trust, personal acquaintance, fame or respect.
In other words, decisions on many aspects of everyday life are not only dependent on statistical knowledge (which tends to standardise, reducing multiple values to a specific value: the process of quantification). Rather, they are based on judgement of the decision's (or the good's) values (the process of qualification): this is a process that, instead of standardisation, requires 'individualisation' (Callon, 2002: 267). Despite our focus on analyses of quantitative expertise as a process of commensuration and standardisation, in reality experts in the field are continuously confronted with the very specific ('individualised') challenges and values of the constituency at hand. Although we know that making judgements is an inherent aspect of the production of quantification, we argue that the process of qualification denotes more than that: it is the process whereby certain measurable and standardised values (in the statistical sense) are being consciously opened up to assigning certain political values to the good in question (or they establish new 'orders of worth', following Boltanski and Thévenot, 2006).
In the increasingly dispersed governing space of global public policy, such a distinction between quantification and qualification, albeit thin and transient, is crucial to understanding the ways experts negotiate their epistemic capital with the political values on the ground, as well as their own personal ones as they go about their day-to-day work. To clarify, our analysis is not confined to the tensions of Cochoy's (2008) 'qualculation' that all quantification practices involve, that is, 'calculation, whether arithmetical in form or not, as the manipulation of objects within a single spatiotemporal frame' (Callon and Law, 2005: 719); that is, judgement, materially enacted. Calculation does not grow on trees, as Callon and Law (2005) assert: it requires time, money and effort, and the sociology of quantification has given us persuasive accounts of the judgements inherent in all quantitative practices (Strathern, 1987). Rather, our focus here is on experts' reflexive practices on the ground, in terms of making technical decisions (epistemic reflexivity), feeling that they have a duty of care towards those whose lives they want to improve (care-ful reflexivity), as well as strategising best practices to create consensus and persuade (instrumental reflexivity). This, in some ways, is the reverse process of quantification, via which values are ascribed to value: it is a key component of consensus-building and of increasingly participatory and inclusive decision-making practices in global governance, the complex task IO experts are asked to deliver, trying always to match global processes of commensuration with local struggles over priorities and political ideas. The concept of qualification, as we will show in our empirical analysis, helps establish and analyse the role of personal and collective values in the struggle over establishing conventions of worth: reflexivity is a key resource here, both in identifying and codifying values at the level of the individual expert/actor (values that make their work, however utopian, worth doing), and at the level of working through local political values and agendas and trying to 'marry' them with the more top-down global goal-setting.

Methods
This article is based on research funded by the European Research Council, 'International Organisations and the Rise of a Global Metrological Field' (or METRO for short). METRO's research design was grounded in a comparative case study of different policy fields, examining the SDGs as a whole, but also focussing deeper on the cases of education (in particular, SDG 4), poverty (SDG 1) and statistical capacity development (cutting across all the SDGs).
Our research included over 80 interviews with key experts in IOs, including: the World Bank; the United Nations Educational, Scientific and Cultural Organization (UNESCO); the UN Children's Fund (UNICEF); the UN Development Programme (UNDP); the World Health Organization (WHO); the UN Statistical Division (UNSD); and the Partnership in Statistics for Development in the 21st Century (PARIS21). The interviews were digitally recorded, transcribed and coded in NVivo. We also draw on the careful analysis of official documents produced by this epistemic community, including flagship reports, policy and strategic documents (such as declarations, position papers and action plans), internal documents produced by IOs (including meeting agendas, open consultations and PowerPoint presentations) and research articles published by actors in these networks. The central analytical approach was inspired by grounded theory, entailing multiple rounds of coding (including descriptive, focused and theoretical coding) (Charmaz, 2006).

Experts' reflexivity in global public policy and the SDGs
In the context of the quantified data production and global public policy-making project of the SDGs, we found that experts working in IOs had a shared faith in numbers to bring transformative change, and they were acutely aware that their work is mostly political (Bandola-Gill, 2021). More importantly, they were happy to discuss and share the challenges of their work. Providing expert advice was seen as a process that required a specific set of qualities: an understanding of data but also of the local contexts; humility and perseverance in the face of limited funding and the diversity of interests and value-systems; an ability to foresee change and place themselves at the best possible place to tame it; and finally, the skill to transform a perceived obstacle (the lack of perfect data) into a valuable instrument for advocacy and consensus-making (the concept of 'good enough' data) (Grek, 2020).
Quantified knowledge production in the context of the SDGs' global public policy is expertise of a particular sort, ultimately shaped by official decisions discussed by various stakeholders at, and ratified by voting members of, the United Nations Statistical Commission (UNSC). The 'global statistical community' represented at the UNSC includes members of national statistical offices, of statistical divisions of IOs and of civil society organisations (Bandola-Gill et al., 2022; Tichenor, 2022). Data production for monitoring the SDGs also demanded that UN agencies do the work of 'harmonisation' between 'local' and 'global' statistics: 'a process through which a variety and diversity of national statistics become translated into one global number' (Bandola-Gill et al., 2022: 42). While statistics and development data are produced at the national and subnational levels by various governmental (e.g. the ministry of health alongside the national statistical office) and non-governmental (e.g. the Bill and Melinda Gates Foundation alongside the ministry of health) entities, these statistics and data are then harmonised both to lock together the standards produced by the UNSC to monitor progress on the SDGs with a country's standards to monitor its own domestic goals, and to compare different countries' progress.
Across these different epistemic orders (related to the quality of data and the politics of measurement), political orders (related to bringing actors 'on board' and producing contextualised measures) and value orders (related to different ethical priorities and cross-cultural ways of working), the work of the experts went beyond just 'producing numbers'. This multifaceted navigation between different priorities required them to mobilise and navigate between different styles of knowing. Reflexivity, as we will show, emerges as one of the new skills in the expert arsenal. As such, it is not only an epistemic practice (as traditionally discussed in the literature on the topic) but also a practical and strategic tool that can be mobilised in the context of complexity. Approaching reflexivity as a practice allows for unpacking its core elements. In our data, we identified three types of reflexivity practices: epistemic reflexivity, care-ful reflexivity and instrumental reflexivity. In the remaining part of this section, we unpack these reflexive practices in detail.

Epistemic reflexivity
The first type of reflexivity identified in our data is epistemic reflexivity, which experts have shown when engaging in the 'core' of their quantifying work – producing and mobilising numbers. The experts were well aware that there are limitations to the numbers they are producing: data collection might be flawed, statistical instruments have inherent biases and – most importantly – achieving the goals requires that IOs produce 'global' numbers or country rankings through constant negotiation between the robustness of the process and the availability of data. Attention to the limitations of data has also increased with the proliferating number of actors in the field of development data, as national statisticians now feel they must compete with the more rapid producers of 'Big Data' and other forms of alternative data. Here, interviewees were reflexive over which tools to use, how to communicate numerical uncertainty and how to produce numbers that are 'good enough' to be fit for purpose. For example:
We're now trying to put these technical notes with every update [of the measure]. In the past all these things [necessary to update a global number] happened too and it was just that it wasn't documented very well. Now I think you can't avoid the revisions, because we have new data, countries change, sometimes you make mistakes and sometimes, well most of the time it's just the result of new data. Sometimes countries make mistakes, and they change their view about the CPI [Consumer Prices Index] or whatever or they've revised their national accounts data, that's also I find a huge issue. So, yeah, of course it's a problem. So, I think that's why we need to do a better job at communicating the uncertainty because we don't want people to think that 700 to 710 million poor is a huge change. That's well within the margin of error – [...] last time we made a mistake in Ethiopia and that shifted 4 million or something like that. (World Bank, 4)
In fact, a central engine of the SDG framework is the Inter-Agency and Expert Group on SDG Indicators (IAEG-SDGs, 2022), where representatives of national statistical offices have the authority to refine and validate methodologies for collecting and harmonising data to monitor global progress on the goals. In this space, UN agencies and other IOs are observers but are ultimately responsible for validating and harmonising these data (Bandola-Gill et al., 2022). This is a space of constant reflection on the limits of numbers, as this member of the IAEG-SDGs expressed:
So [the representatives of the UN agency] were like, 'look we've done all this work', and we were like, 'yeah, that is fantastic, but we also need a number', and they were like, 'ooh, a number, no, don't think so', and we were like, 'well, you see, [...] you can't say that you have a follow-up unless you have some type of assessment that we can actually work on. Because otherwise it'll be a nice analysis, [...] and if you want that then that can be part of some other follow-up. But if it's supposed to be an indicator for [SDG framework], then it actually has to be produced as a number'. And gradually and with a lot of pain they accepted this and said 'OK, we will bring our analysis all the way there, even though we don't really feel comfortable about it [...], but we realise that this has to be part of the final product [of the SDG framework], in order for you to make it a Tier II²,  we actually have to provide something that is a number'. (National Statistician, 1)
One component of this type of reflexivity involves assessing the potential consequences of numbers. The interviewees described their work in terms that Desrosières (1998) and Espeland and Stevens (1998) would define as 'metrological realism': this is the quality of numbers to inherently be seen as true and representative of reality. The interviewees were not only aware of but also accounted for the power of numbers and the influence they might exert once they are published. This was particularly evident in the accounts from experts from the statistical capacity development community, while responding to increased calls to measure the impact of their work in improving country statistical systems. For example, a representative of the UNSD described this performativity of measures as the 'distortion' of numerical work, and that was because the global statistics community was aware of 'what measures do to people':
What I'm trying to say is measures also sometimes lead to decisions and maybe distorted decisions, especially when you have an index. I'm always very careful with indices, because [an] index is always only a selection from a very complex reality and then you always have the dimension of the weighing of the components of the index, which are by definition arbitrary. So, if you get one-third of the way to every component, or one-tenth or 40% to this and [30%] to that, there is no scientific measures to determine that. So, I think the hesitance of this measurement community to define a measurement for themselves [in defining an indicator for statistical capacity development] is also because they are very aware of what measures do to people. (UNSD, 1)
This form of reflexive process might lead to diverse outcomes. On one hand – and most common in our data – experts were advocating for the production of numbers, despite possible distortions, since the production of numbers was seen as the only route towards envisioned policy change (further discussed below). On the other hand, the experts were reflexive about the power of some numbers once they are released, and consequently, they were advocating for withdrawing from producing a number. An example here is a practice in the World Bank, where the organisation takes a cautious approach to producing the 'global' poverty number: despite some pressures, it did not want to present its global poverty number in a 'real-time' manner.

Care-ful reflexivity
The second type of reflexivity – care-ful reflexivity – refers to the modes by which experts assert their relationship of care to those impacted by their work and assess the implications of their decisions on these populations. In this way, experts both 'care about' and (aim to) 'take care of' the proposed beneficiaries of their numbers' work and the social problems seen to impact them (Puig de la Bellacasa, 2017: 4; Tronto, 1993). For example, one interviewee justified the assessment of different measuring approaches by responding: 'we have a duty of care' (OPHI, 1). This type of reflexivity is best exemplified by the types of justifications positioning experts as active participants in the process of solving challenges and advocating for disadvantaged groups (e.g. children, the poor, and the sick). This form of reflexivity was mobilised as a powerful motivator for action, as experts made references to 'caring' for poverty, education or health as being the key motivation for choosing careers in IOs. As exemplified in the following quote:
There are people who just really care about measurement for its own sake. And so, they just will focus on principles of measurement, and it's just a particular way of looking [...]. So, I think that that tension definitely is there. And I think what I always come down to is: the point of this is for it [measurement] to be used. (UNICEF, 3)
As actors in this space position themselves into a relationship of care with the beneficiaries of their expert work – forging care through the production of numbers – these statisticians, demographers and development data scientists embed calls for reflection and additional data production for assessing the 'effectiveness' and 'impact' of their work. Within this framing, these actors frequently attribute affect to the cold rationality of numbers: since their numbers are key in constituting and knowing populations like 'the poor' and 'those with insufficient healthcare access', vulnerable communities' needs are projected as guiding the production of numbers and thus the entire global public policy machine. There are, of course, questions to be asked about how reflexive these acts are, and how willing these actors might be to engage with the ways that bureaucratic care or institutional care can instead become 'something uncaring, even murderous' for those on the receiving end (Stevenson, 2014: 4).
Thus, our interviewees were aware of the limitations of the numbers to achieve policy change. They pointed out that measurement, just like any other form of evidence (see, for example, Parkhurst, 2017; Smith, 2013), is not sufficient to lead to policy change on its own. For example, as explained by one interviewee from the World Bank:
The policy's really hard because even if we think of a country with just a well-functioning data architecture and system in place [for example the US], I'm not sure I could point to specific examples of data that has directly influenced policy. (World Bank, 7)
In the context of the SDGs, the experts reflected on the fact that the 17 goals were in all likelihood not achievable and that the data and indicators will not achieve the kinds of positive social change they were after. So how do experts make sense of and justify this reality in which quantification becomes ever-expanding without being transformative? Here, the experts turned to care-ful reflexivity to explain this discrepancy: according to the majority of them, measurement has to be done because there is a 'duty of care' to monitor levels of poverty and different forms of inequalities, even if it is to acknowledge the problem and expose it rather than solve it; even the slightest gap in data collection was seen by most of them as betrayal and as undoing the valuable work of measurement taking place over such a long time.
As a result, we found that experts often mobilised a reflexivity of care, instead of other epistemic virtues associated with measurement. There were cases in which such values as objectivity were backgrounded, and instead, the justification for measurement was brought to the foreground. Thus, we see that often 'care-ful' reflexivity was taking the place of objectivity as the guiding principle of quantification. This points to the changing nature of quantification in global public policy. Traditionally, objectivity was seen as the value that was justifying measurement as a governing tool, as it was enabling numbers to be seen as the apolitical 'view from nowhere' (Jasanoff, 2011). Nonetheless, our experts' accounts portray something close to disillusionment with a detached, technocratic expert view and a much more pragmatic approach to numbers, guided by an ethics of care, rather than merely epistemic principles. This form of reflexivity opened up not only a more nuanced – but also a more explicit – engagement with politics of numbers where the objectivity of 'what we know' was in some cases replaced by the objectivity of 'what is right'. Thus, what we see in our analysis is reflexivity discussed as a process of assessing the performative effects of numbers and mobilising them to achieve specific goals (Bandola-Gill et al., 2022). In other words, although quantification has dominated global governance as the new unequivocal doxa of planning the future, it is precisely those same numbers that are mobilised to conceive, as Bourdieu calls it, the 'improbable possibles' (Bourdieu, 2000: 134) whereby the quantitative aspirations (such as 'no poverty' or 'education for all') become realised. According to one of our interviewees:
I love numbers . . . For me it's the joy of looking at some – let's take a massive amount of data – and getting a message out of it, right? Look at something, and then you find out what can this teach us. And then also beyond that of course is then the mission of the UN, which I strongly believe in. It's to improve everyone's lives, and to help Member States, and let's say UNESCO now has this vision of world peace. It's kind of how to link that all to your daily work, right, but in the end that you really want to help improve people's lives [. . .]. So, that maybe sounds – I'm not saying that to please you or something, but it is really something that I and my colleagues believe in. And it's something that's difficult to see for others. People don't care . . . they go nine to five, it's their job that they're doing, they don't care. But there are a lot of people that really believe in what they do, and they hope that what they do contributes to the work that is making the world a better place. And in my case, I was lucky to do something that I love, and that I also can feel that maybe it will help somebody in the end. (UNESCO, 1)
Actors, like the interviewee above, reflect on their different, interlinked identities; they are prophets, using their expertise to forecast the future; they are saviours, working for organisations that proclaim their very existence as servants of those in need; and they are saints, since it is their duty of care that keeps them in a game which, despite being futile, is the only one to play (Grek, 2020).

Instrumental reflexivity
Previous sections have discussed how expert actors use epistemic and care-ful reflexivities to help justify (to themselves and others) the sometimes futile effort of using numerical knowledge in the quest for a better world. Here, we will move this analysis one step further, to explain how reflexivity has become a key resource, not only in the self-affirmative, legitimating work that numbers require, but also as a political instrument: it is being foregrounded and used as the main means of constructing and maintaining relationships of trust between experts and countries. In this way, experts instrumentalise epistemic and care-ful reflexivity for political action.
To illustrate the ways in which experts mobilised instrumental reflexivity, we explore specifically the work of the UNESCO Institute for Statistics (UIS) as an expert broker of choice, not despite but because of their explicit and intentional reflexive accounting of the challenges of producing quantification for the benefit of countries in the global South.
The history of the construction of the SDG4 is one of struggle. The two main opposing camps were the 'Education For All' (EFA) movement, and the process of work undertaken as part of the MDG education indicator. In the interest of brevity, we won't outline this history here except to state that the two groupings had very conflictual views about the best measurement approach in education to be undertaken: EFA pushed for a diverse set of goals that would acknowledge a broader, humanistic approach to education, whereas the MDGs education experts pushed for finding a much more specific and measurable set of instruments, favouring a utilitarian view of education and focussing on key metrics such as literacy and numeracy. Facing the threat of being excluded from the SDGs due to a lack of common ground, the two groups found a solution and the worst was avoided: the compromise led to the production of the SDG4. Nonetheless, even if the contestation seemed to temporarily abate, it never really went away. On the contrary, the continued challenges of meeting the SDG4 targets and constructing a solid set of indicators to do so have intensified the struggle and conflict in the field. It is in this space of clash that UIS has emerged as the reflexive, trusted actor: UIS became the expert IO with long-standing links and relationships with the countries of the Global South as well as the ability to use the data failings (often of their own making – UIS had had some serious measurement project failings in the past; for detailed analysis see Fontdevila, 2021) to advocate for the notion of 'good-enough' data and the political (rather than purely technocratic) uses of target-setting for coalition-building and agenda-setting.
First, UIS adopted a practical rather than purist, zero-sum, approach to the production of global learning data; instead of advocating for a single measurement tool, they focussed their efforts on accommodating the use of different assessments and harmonisation methods. In contrast to other actors, such as the World Bank or the Organisation for Economic Co-operation and Development (OECD), that would have been much stricter in the choice of method (with a preference for their own instruments), the UIS developed a patchwork approach: they recombined several already available and legitimate models, recognising openly the limitations of each and emphasising the potential for complementarity. Due to these efforts, many interviewees recognised the UIS as perhaps not a data superpower, but as the trustworthy actor that recognised the unequal character of the data production market and thus the difficulties of creating an inclusive space, with the emphasis on the principle of country ownership.
Second, perhaps more importantly, UIS, primarily through its outspoken Director, Silvia Montoya, publicly discussed the imperfect character of global learning data, as well as the political nature of the indicator process. Under her leadership, UIS nurtured types of approaches for the collection of data that are hybrid and brought together different types of assessments, insisting that alternatives are not mutually exclusive but reinforce one another; more importantly, this data 'portfolio approach' went against selecting one specific method as technically superior to others, and thus was politically much more in tune with countries and their specificities. Thus, not only was a middle way forward found, but countries also felt respected for their context-specificities and were not side-lined:
There has been significant growth and improvement in the field of learning assessment across the world. Yet today, it is impossible to provide a global perspective of what children are learning . . . We must be pragmatic. As explained in previous blogs, the best measures and methodologies in the world will amount to little if countries cannot produce them. We must therefore take a pragmatic approach, which may mean mixing the options. This stepping-stone approach was widely endorsed by stakeholders attending the June meeting. They understand the political stakes, the technical issues and the need to find a balance between pragmatism and accuracy . . . We need to recognize that SDG 4 indicators are barometers – showing which countries (and, for equity's sake, ideally which segments of which countries) are making progress and which countries need help. Instead of aiming for the most technically rigorous methodologies, we may better serve the world by taking a pragmatic approach to producing the global measures while helping countries improve the quality and use of their national data. (Montoya, 2017)
As is apparent in the above quotation, UIS used reflexivity instrumentally to re-affirm and strengthen its authority in the education measurement realm as the only trustworthy, ethical and transparent expert broker. Instead of approaching the construction of indicators as a purely technical exercise (or emphasising expert knowledge as the organisation's most relevant asset), the UIS openly discussed the political nature of the debate as well as the vested interests that shaped it (e.g. its director exposed the inefficiencies of the 'learning assessment market' in two influential blogs in 2019³). UIS openly admitted that there is no perfect way of doing this kind of work and that technical rigour would have to go hand in hand with a more pragmatic approach: this way, reflexivity became the prime instrument for the organisation to bolster its credibility and create minimum consensus in the field. Consequently, the notion of 'good-enough' data became central, as the political choices and judgements were not hidden but, in fact, displayed publicly.
Therefore, instrumental reflexivity refers to the types of considerations in which experts engaged in the cases where the epistemic qualities of quantification (objectivity, de-contextualisation, universality) were in tension with the political goals of measurement. Here, not only did experts not avoid exposing the political nature of numbers, but they even went as far as to mobilise and instrumentalise this political nature to achieve their goals of building consensus or securing 'buy-in' into statistical projects (such as the SDGs). Of course, one has to take into account the interdependencies, competitions and collaborations between IOs to get a fuller picture of how IOs interact and assume different, complementary identities as they work collaboratively: while some take the high ground and defend their authority by sticking closely to its objectivity and trustworthiness, others choose to benefit from getting their hands dirty and dive into the 'swampy lowlands' of muddling through political contestations and 'good enough' numbers.

Discussion
Based on interviews with over 80 experts working in IOs in the fields of education, poverty and statistical capacity development, this article has focussed on the reflexive accounts of these actors' day-to-day business, as they went about describing and justifying their work to us. Reflecting on our own expectations of what these accounts might entail, we anticipated that interviewing them would require more intensive probing to get them to explain the limits and challenges of quantification and the types of political work required to successfully implement metrics. We did not underestimate them and never thought that their technical expertise would not allow them space to be analytical; our surprise did not relate to the fact that they were thoughtful and eager to reflect on what their work involves. What did surprise us was the extent to which, time and again, many of these actors treated the interview space as a cathartic zone, where they would freely share their exasperations at being asked to achieve the unachievable, but also a space where they would share their conviction that measuring inequality was the only available means not only to know, but crucially to raise awareness of the injustices communities – in the global South in particular – have to endure and overcome. Reflexivity was not a thought process experts exercised as part of the encounter of the interview; instead, it was a tool in their day-to-day job, as they were tasked to assign meaning to their work of numbers, to persuade and to build relationships of trust and reciprocity.
In other words, apart from a focus on experts as the ones holding the epistemic – or even epistemological – capital to know global public policy by naming and measuring it, we explored the kinds of ontologies that emerge from experts' interactions with data, through which they ponder and reflect on what their work is and how to navigate the epistemic, political and practical tensions inherent to it. Statisticians, IO experts, national and local decision-makers are not unconscious actors, but instead, they reflect on their practices that produce the monitoring system and wonder – and at times despair – over how much or how little real-world effect their work has. Unsurprisingly, these active agents are reflexive about their practices and their effects, and this reflexivity works on multiple levels.
Thus, it was in this context that we came to conceptualise reflexivity as doing a lot more of the heavy lifting of quantification than the literature has so far discussed. As this article has shown, reflexivity is not merely a process of self-appraisal by experts, as they make sense of their work in an internal dialogue between their personal values and aspirations and their activities on the ground. More than merely self-reflexivity, the article showed the ways actors used the process of opening up the black box of number-making not only to us as researchers, but also with those in the field – including colleagues, collaborators and even policymakers. As we showed, they purposefully put reflexivity to work, on the one hand, to explain and justify choices as they muddle through trying to establish some order in the messy realities of quantifying complex problems, and on the other, as they actively attempt to imbue the assemblage of data with the political values of inclusivity. Thus, they purposefully apply processes of qualification, as almost the reverse process of quantification: in their efforts to engage and co-opt communities, they need to – momentarily, at least – move away from the rationality and objectivity of commensurability, to open up these numbers to contextualisation and even contestation.
Although seemingly antithetical to the production of quantification as the process through which multiple values come together and are expressed through their representation by a single value (the one that can then represent multiple realities and thus be commensurable), qualification is a sine qua non to quantification. This is not simply because judgement is inherent in every single decision, no matter how large or small, over the making of numbers (i.e. what the concept of qualculation denotes). Although such considerations are important, what this article attempted to show was the ways that expert actors, through thinking and practicing numbers, are happy to go as far as opening up the numbers debate to include the political and personal values that they and their field participants may share: in casting light on the ways that reflexivity becomes an essential element of the performativity of quantification, this article follows Skeggs (2014), in her formulation that 'values will always haunt value' (p. 1).
What experts' use of epistemic, care-ful and instrumental reflexivity shows (a separation that is of course much more fluid than what our schema has made it out to be) is that to make quantification work, experts need to re-attach political values to numbers, and thus allow them to take on new meaning and be translated in 'useful' ways in the field. Hence, qualification becomes the socio-material process via which new qualities are attributed to measured values (be they gender equity, multi-dimensional poverty or educational outcomes) to become locally malleable and stabilised, pre-arranged and re-arranged to suit local needs. This is the process of attributing new qualities to standardised values that have already been commonly accepted. As the article showed, calling one's data practices purer than another's (epistemic reflexivity), or promoting data collection as a 'duty of care' towards communities (care-ful reflexivity), or even assembling different data sources to suit local preferences and needs in a bid to look more democratic and ethical (instrumental reflexivity) do no less than politically (ear-)mark numbers as a lot more than simply numbers, representing a reality as is. Thus, we would argue that reflexivity becomes a useful instrument in the everyday political struggles that experts fight not only to collect data 'values' from the field, but crucially 'to establish what value is' (Graeber, 2001: 88).
Therefore, unlike the predominant focus in the literature posing such qualification practices as almost 'hidden' and happening on the level of the institutional discourses (Porter, 2020; Scott, 2008), we showed how they are mobilised daily on the micro-level of expert practices. Thus, reflexivity is a key resource in processes of 'qualification', especially in a context of increased emphasis on democratisation and decolonisation: reflexivity allows the assignment of political values (values with 'heart and soul') back to the measurement of statistical values, to enlist participation, facilitate inclusion and thus further enhance quantification as the only available means to know and govern global public policy.
Even though our exploration focusses on an arguably unique case of the SDGs with their ambition to democratise and open the processes of measurement and monitoring to a wide variety of actors, the insights presented in this article are more broadly applicable outside of this localised case study. The turn towards bottom-up decision-making and participation in addressing complex technical problems (from climate change – Jacquet and Jamieson, 2016 – to development finance – Best, 2014) points to a broader paradigm change in global governance. As such, the problem of reflexivity of experts is increasingly central to the problem of measurement of global issues.