Different Methods, Different Standards? A Comparison of Two Finnish Reference Budgets

According to Principle 14 of the European Pillar of Social Rights, everyone should have the right to adequate minimum income benefits that ensure a life in dignity. Reference budgets have been proposed to monitor this principle. Reference budgets are priced baskets of goods and services that represent a given living standard. At the moment, no common methodology for constructing reference budgets exists; instead, different methods are used to construct them. This study sought to compare the approaches and results of two Finnish reference budgets: one created by the Centre for Consumer Society Research (CCSR), and the second by the ImPRovE project. The purpose of the article is to respond to a gap in existing literature around how different methods for constructing reference budgets impact their outcomes. The two reference budgets offer a strong basis for comparison because they both sought to capture the same living standard in the same context for similar household types (single woman, single man, heterosexual couple, and heterosexual couple with two children), while using different approaches. The results suggest that the two reference budgets arrive at different estimates of what is needed for social participation. Ultimately, we found that the most significant differences between the budgets were housing and mobility costs for the couple with two children due to differences in information bases, selection criteria, evaluators, and pricing. The study makes a significant contribution to the literature because it is one of the first to explore how different approaches to constructing reference budgets affect their outcomes. The results suggest that clear criteria for constructing reference budgets are needed to monitor Principle 14 of the European Pillar of Social Rights.


Introduction
The European Union is currently under pressure to make a decisive contribution in safeguarding social protection systems. This movement has given rise to the European Pillar of Social Rights (EPSR), an initiative that strongly emphasises adequate income. According to the EPSR's fourteenth principle, everyone should have the right to adequate minimum income benefits to ensure a life in dignity. To date, different indicators have been used to monitor the degree to which this principle is fulfilled. Scholars suggest that indicators of the adequacy of minimum income benefits should be attentive not only to income, but also to societal circumstances, such as the relative degree of service provision in a particular society (Penne et al., 2019). Currently, the principle is monitored using only the at-risk-of-poverty (AROP) indicator, which sets the poverty threshold at 60 per cent of the national median equivalised disposable income. However, there are widely acknowledged problems with this indicator, such as its inability to capture the determinants of the adequacy of a minimum income (e.g., Callan and Nolan, 1991; Ringen, 1987).
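As a minimal sketch of how the AROP threshold described above works, the following Python snippet computes 60 per cent of a median income and checks whether a given income falls below it. The income figures are hypothetical and serve only as illustration; they are not Finnish data, and the function names are our own.

```python
from statistics import median


def arop_threshold(equivalised_incomes, share=0.60):
    """Return the poverty threshold: a share (60 per cent by default)
    of the median equivalised disposable income."""
    return share * median(equivalised_incomes)


def at_risk_of_poverty(income, equivalised_incomes, share=0.60):
    """True if an income lies below the AROP threshold."""
    return income < arop_threshold(equivalised_incomes, share)


# Hypothetical annual equivalised disposable incomes (EUR), illustration only.
incomes = [9000, 14000, 20000, 24000, 31000]

print(arop_threshold(incomes))             # 60 per cent of the median (20000)
print(at_risk_of_poverty(9000, incomes))   # below the threshold
print(at_risk_of_poverty(14000, incomes))  # above the threshold
```

Because the threshold depends only on the income distribution, it says nothing about what the income can buy, which is the gap that reference budgets are meant to address.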
To address the problems with the AROP indicator, reference budgets have been proposed to assess the adequacy of social security (e.g., Penne et al., 2019). 'Reference budgets' (RBs) are priced baskets of goods and services that can be used to represent a particular living standard (Bradshaw, 1993). RBs are most frequently used to define the items that citizens need to participate in a particular institutional, cultural, and social context (Goedemé et al., 2015a). Thus, RBs consider the costs of public services and the impact of public provisions on households' livelihoods. In addition, RBs can be used to account for variation in housing costs across different geographical locations. While RBs are based on needs, RBs are relative indicators because, as noted above, they aim to capture the requirements for social participation in a certain time and place.
Currently, RBs are constructed in almost all European Union Member States; however, no common methodology exists (Storms et al., 2014). RB approaches differ not only by the methods used to construct them, but often by their theoretical backgrounds, too (Deeming, 2017). In Finland, two RBs with different methods have been constructed. The first was formed by the Centre for Consumer Society Research (CCSR) based on focus groups who consensually drew the budget. The second was created by the ImPRovE 1 project and relies mainly on guidelines and expert knowledge; however, focus groups were also used to validate the baskets. The CCSR's budget has been used to assess the adequacy of the minimum social security amount in Finland (The Second Expert Group for Evaluation of the Adequacy of Basic Social Security, 2015) and to measure poverty (e.g., Mäkinen, 2017). Along these same lines, ImPRovE's Finnish budget has also been used for poverty research from a comparative perspective (Goedemé et al., 2019a; Penne et al., 2016).
1. In the ImPRovE project (a project funded under the 7th Framework Programme of the European Commission), cross-comparable reference budgets were constructed for seven cities: Antwerp, Athens, Barcelona, Budapest, Helsinki, Luxembourg, and Milan (Goedemé et al., 2019a, 2019b).
Against this background, the current study sought to: (1) analyse whether these two Finnish RBs produce similar estimates of acceptable living standards; (2) inspect whether differences in the RBs' research design or implementation produce any differences in the RBs' estimates; (3) compare the RBs to the AROP threshold; and (4) assess the adequacy of the Finnish minimum income benefit against these RBs. This approach stems from the rationale that because different methods follow different procedures, it is plausible these RBs may produce different standards (Deeming, 2017; Stropnik, 2020). Little work has been done on RB comparisons (e.g., Gilles et al., 2014) 2 or the mechanisms that produce differences in RBs. This is perhaps because RBs have historically been constructed based on different populations and families in different countries and at different times (Deeming, 2017). This study responds to this existing gap in the literature by presenting a detailed analysis of the results of two different Finnish RBs. These two RBs offer meaningful grounds for comparison because they utilise similar theoretical frameworks, were formed at the same time, and target a living standard that enables social participation. In addition, the RBs' purpose is similar, i.e., to evaluate the adequacy of minimum social security. Analysis of the consistency of results across different types of RBs is necessary to ensure they can offer a reliable indicator for monitoring the minimum income principle of the EPSR. Although it is reasonable to assume that different RBs will produce dissimilar results, observing similarities in their impacts would nevertheless increase the validity of the RB approach.
Common principles for constructing RBs are necessary for meeting the demand for RBs that are comparable across countries and can be used to monitor the reliability and validity of the minimum income principle across the European Union at large. This study makes use of Goedemé et al.'s (2015b) framework of the constituent components of RBs (that is, the 'building blocks' of RBs), which includes the theoretical perspectives and methods underpinning RBs. This framework was chosen for this study because it enables the inspection of the budgets' conceptual and theoretical bases and methods. Ultimately, this study provides new information about the meaningfulness of the RB indicator in assessing the adequacy of social security and measuring poverty.
The article is structured as follows. First, RBs' constituting characteristics are presented. This is followed by a presentation of the different methods for constructing RBs. Subsequently, the two RBs' construction is elaborated. The two RBs' estimates are then compared and their differences explored. Finally, the two RBs are contrasted with the AROP indicator and the minimum income benefit. The article concludes with a discussion of the two RBs' suitability for assessing the adequacy of the minimum income benefit and the implications of the present study.

An overview of the methodologies of reference budgets
The construction of reference budgets (RBs), poverty indicators, or any other living standard indicator is often guided by the following question: 'How much income is enough?' According to Dubnoff (1985), this question should be complemented with considerations of: (1) 'How much income is enough to do what?' ('enough to do what'); (2) 'How much income is enough for whom?' ('enough for whom'); and (3) 'How much income is enough according to whom?' ('enough according to whom'). These questions should also be considered constitutive for RBs.
In the context of reference budget studies, 'enough to do what' refers to the targeted living standard, 'enough for whom' to the target population, and 'enough according to whom' to the information base and evaluators (Goedemé et al., 2015b). The information base is a key factor distinguishing RBs (Deeming, 2017; Hårvik Augstulen and Borgeraas, 2018). An 'information base' is used to translate the targeted living standard into items in the budget (Goedemé et al., 2015b). However, RBs typically use many different information bases simultaneously, which can blur the boundaries between different information bases (Deeming, 2017); for example, expert knowledge may be brought into focus-group discussions, and focus groups may accordingly base their decisions on this information. Nevertheless, RBs can be separated by the relative weight allocated to different information bases (Hårvik Augstulen and Borgeraas, 2018). 'Evaluators' determine the living standards, information base(s), and selection criteria. Typically, several different actors, such as researchers, experts, and focus-group participants, act as evaluators during the process. Meanwhile, 'selection criteria' determine the information that is kept from the information base. For consensual RBs, selection criteria should establish rules regarding what constitutes a consensus. For social surveys, this would mean setting thresholds for the share of people who consider some items necessary for the targeted living standard (Goedemé et al., 2015b). Deeming (2020) identified three approaches for constructing reference budgets, which differ according to the information base used: (1) an 'expert-led approach' makes use of expert knowledge; (2) a 'survey-led approach' is based on survey data; and (3) a 'public-led approach' brings citizens together to discuss and draw a budget. The choice of method is related to fundamental differences in how minimum living standards should be captured (Deeming, 2017).
The two Finnish RBs in this study have distinctive features and can be categorised into two approaches: while the Centre for Consumer Society Research (CCSR) takes a public-led approach, the ImPRovE project's approach is an expert-led one. However, both methods also utilise additional information sources.
In the public-led approach, groups of citizens are brought together to form RBs. Often, RBs that rely on citizen input are described as 'consensual RBs', as some sort of agreement on the content of the RB is needed. Proponents of the public-led approach suggest that minimum living standards are socially and culturally specific (Walker, 1987); therefore, needs are socially constructed, as are the items that satisfy these needs (Valadez and Hirsch, 2014). In the public-led approach, discussions are presumed to provide information about everyday needs. This approach considers citizens as the best experts for defining the minimum living standard.
The most notable example of consensual RBs is the Minimum Income Standard (MIS) method developed by Bradshaw et al. (2008). The MIS approach involves sequential focus groups of different people. The process starts with the first focus groups defining the targeted living standard. Next, a sequence of focus groups check and revise the budget (Davis et al., 2015). It has been argued that the MIS approach reflects the 'values and social norms of [a particular] society' (Valadez-Martinez, 2018). However, Deeming (2010) argues that a focus-group consensus is unlikely to reflect a universal attitude in society, since focus groups comprise only a limited number of people who are not necessarily representative of their context. This, in turn, may lead to unreliable results. Along these same lines, it is important to note that the focus-group method is sensitive to small aspects, such as the composition of the focus group and how the moderator conducts the discussions; thus, previous research argues that consensual RBs might not produce consistent results. The results of a Dutch study suggest that the RBs defined by different focus groups differ to a great extent (Hoff et al., 2010), even though the same procedures were used. Similar inconsistencies between family types have also been found in MIS studies in England (Deeming, 2010).
Meanwhile, expert-led RBs follow a different logic, drafting content based on scientific knowledge, guidelines, and recommendations. The expert-led approach relies on the assumption that certain human needs must be met, irrespective of time and place. Although the living standard is considered constant, this does not imply that the items needed to meet this living standard are similar. Instead, the expert-led approach assumes that necessary items are likely to differ over time and typically defines a targeted living standard according to which items are required to meet given needs in a particular context (Deeming, 2017; Goedemé et al., 2015a).
Generally, critics of expert-led RBs focus on how the roles of experts and researchers may lead to arbitrary decisions and opinions about budget content (Hårvik Augstulen and Borgeraas, 2018). For example, it has been suggested that RBs based on expert knowledge do not accurately identify items that fulfil social needs, as they lack information about customary and acceptable practices. Other critics argue that the social dimensions of customary lifestyles can only be grasped by looking at individual experiences (Gilles et al., 2014). To grasp social dimensions, expert-led RBs typically incorporate focus groups to review their budgets; however, as focus groups are used to review the lists drawn by experts, this may confine discussions in focus groups and thus skew the items that are included (Mac Mahon, 2015).
In sum, these different information bases provide evidence of a clear methodological distinction between the expert-led and public-led approaches. In the expert-led approach, needs are defined based on scientific knowledge and are assumed to be universal; meanwhile, the public-led approach typically bases needs on the opinions of study participants. Observing this difference, Bradshaw (1994) classifies the expert-led and public-led approaches as 'top-down' and 'bottom-up' approaches, respectively. Applying Bradshaw's (1994) framework to our case, the approach of the ImPRovE project is a 'top-down' one, while the CCSR's approach is a 'bottom-up' one.
The next section outlines the current study's research design and the characteristics of the two RBs.

Research design
This study analysed whether two Finnish reference budgets (RBs), constructed by the Centre for Consumer Society Research (CCSR) and the ImPRovE project respectively, produce similar results. It also explored how the adequacy of minimum income benefits differed depending on the RB used. Another aim of the study was to inspect the elements in each RB's design and implementation that cause any differences observed. Special focus was put on the impacts of differences in information bases, selection criteria, and evaluators. These are crucial elements that distinguish different RBs.
The CCSR and ImPRovE RBs offer a meaningful basis for comparison because they both target the same living standard and are constructed for the same target population and time. The CCSR budget was constructed for 2013 and the ImPRovE budget for 2014. 3 The study focused on the comparable RBs by the CCSR and the ImPRovE project for four household types: a single man, a single woman, a heterosexual couple, 4 and a heterosexual couple with two children (a four-year-old boy and a ten-year-old girl). Both RBs assume that the hypothetical households live in Helsinki and have access to public transportation. They likewise assume that the families are in good health, self-reliant, 5 and know how to lead a healthy life. Lastly, both RBs target living standards that enable social participation. The ImPRovE budget describes 'social participation' as being able to assume the roles that one should be able to play as a member of society. This is further defined as the fulfilment of physical, psychological, and social needs (Goedemé et al., 2019b).
3. The CCSR's RB was priced in May 2013 and rechecked in November 2013. ImPRovE's RB was priced in the first half of 2014. For this article, the RBs were not adjusted by the consumer price index, since this did not move materially between November 2013 and May 2014 (November 2013: 108.13 (2010 = 100); May 2014: 108.87) (Official Statistics of Finland, 2021).
4. There is some difference between the couples in the RBs. In the CCSR's RB, the woman in the couple is assumed to be of working age while the man is retired. In ImPRovE's RB, both members of the couple are of working age. Even though the assumptions somewhat differ, the targets of the RBs are the same.
5. In the CCSR's RB, the hypothetical households are assumed to be able to cook and conduct small repairs.
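The claim in the pricing footnote that the consumer price index did not move materially can be checked with simple arithmetic. The sketch below uses the two index values reported there (108.13 and 108.87, 2010 = 100); the function name is our own.

```python
def relative_change_pct(old_index, new_index):
    """Percentage change from old_index to new_index."""
    return (new_index - old_index) / old_index * 100


# Index values as reported for November 2013 and May 2014 (2010 = 100).
change = relative_change_pct(108.13, 108.87)
print(f"{change:.2f} per cent")  # well under one per cent
```

A change of this size is small relative to the differences between the two RBs discussed later, which supports leaving the budgets unadjusted.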
The CCSR's budget defines the living standard as a life comprising dignity, sustainable health, and all the items that enable a household to function in everyday life; in other words, the 'fulfilment of physical and psychological needs and the possibility for social participation' (Lehtinen et al., 2010: 16). Moreover, both the CCSR's RB and the ImPRovE project's RB are founded on the theory of human need (Doyal and Gough, 1991) and the capability approach (Sen, 1983). Even though their wording is not identical, both RBs target social participation and identify the same needs to be fulfilled. Additionally, both RBs were designed to assist in the evaluation of the adequacy of minimum social security.
At the beginning of the ImPRovE project, the Finnish team used some information from the CCSR's 2010 budget. Since the purpose was to construct cross-national comparable RBs, differences in the compositions of the baskets used by the CCSR versus ImPRovE were problematic. 6 Therefore, for the purposes of this article and to ensure comparability, the ImPRovE baskets were restructured to accord with those used by the CCSR. Each basket was examined and the items in the ImPRovE RB were moved to a corresponding CCSR basket.
The data used in this study consist of reports based on the RBs (for the CCSR: Lehtinen et al., 2010 and Aalto, 2014; for ImPRovE: Goedemé et al., 2015a). For ImPRovE, detailed item lists were also available, including the number of items, their lifespans, and their prices. The ImPRovE budget was constructed for Helsinki, whereas the CCSR budget offered separate calculations for Helsinki and other parts of Finland. For the sake of comparability, the analysis focused on the RBs constructed for Helsinki. As noted above, the two RBs were constructed using different methods: while the CCSR took a consensual (or public-led) approach, ImPRovE took an expert-led approach (however, ImPRovE's method was really a hybrid approach because it used focus groups to validate the experts' results).
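To illustrate how an item list with quantities, lifespans, and prices can be turned into a monthly basket cost, the following sketch spreads each item's purchase cost evenly over its lifespan. This straight-line rule is an assumption made for illustration, not a documented ImPRovE or CCSR procedure, and the items and prices are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    unit_price: float       # EUR per unit (hypothetical)
    quantity: int           # number of units in the basket
    lifespan_months: float  # assumed time until replacement

    def monthly_cost(self):
        # Assumption: purchase cost is spread evenly over the lifespan.
        return self.unit_price * self.quantity / self.lifespan_months


def basket_monthly_cost(items):
    """Sum the monthly costs of all items in one basket."""
    return sum(item.monthly_cost() for item in items)


# Hypothetical clothing basket, illustration only.
clothing = [
    Item("winter coat", 120.0, 1, 48),
    Item("socks (pair)", 3.0, 10, 12),
]
print(basket_monthly_cost(clothing))  # 2.5 + 2.5 EUR per month
```

Under this rule, a basket's cost responds to three separate choices (price level, quantity, and lifespan), so two budgets can list the same item and still arrive at different monthly amounts.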

The Centre for Consumer Society Research's consensual approach
The Centre for Consumer Society Research (CCSR) has constructed consensual reference budgets (RBs) since 2010. Notably, its 2010 report established a methodology for consensual RBs (Lehtinen et al., 2010). The 2010 budget was uprated with price changes in 2013 and the entire content of the RB was rebased in 2018. For the purpose of this article, the CCSR report on the 2010 budget is included, as it provides detailed information about its methodology. More specifically, the contents of the CCSR's budget were consensually determined by focus groups, comprising three different groups for couples, single adults, and families with children over three rounds. Therefore, the CCSR's budget has similarities to the Minimum Income Standard (MIS) approach (although, in contrast to the MIS approach, the sequential focus groups in Finland comprised the same people).
In the first round, participants were asked to name items for each pre-specified category representing the targeted living standard. As the targeted living standard is predefined and not based on the participants' views, needs were not 'socially perceived', but instead based on expert knowledge (cf. Doyal and Gough, 1991). After the first round of focus groups, the researchers drafted a list of commodities based on focus-group discussions as well as previous reference budgets, scientific knowledge, and expert opinions. These were further discussed in the next rounds. In preparation for the second round, the participants were asked to complete 'homework' by determining whether items were essential, necessary, unnecessary, or not applicable to their households, defining the number of items for inclusion, and determining the lifespans of items. In the second round, participants discussed the necessity of the items and budget contents of the hypothetical households. As part of this work, the households reviewed food menus drafted in 2010 and drafted the food budget, taking into consideration nutritional guidelines. Items that all or nearly all participants defined as 'necessary' were included in the RB. (Notably, the report does not explicitly state the cut-off points for including the items, which makes the selection criteria ambiguous; this does not mean that the researchers did not follow a procedure, but that the procedure was not stated in the report.) The commodity lists were assessed by an expert group. In the last round, participants assessed the budget contents and were able to add or remove items from the RB (Lehtinen et al., 2014). In essence, the CCSR's approach situates focus groups in a deliberative role to reach a consensus on budget content (Saunders and Bedford, 2017). In 2010, seven focus groups comprising 49 people in total were organised across three cities. In 2013, prices were updated, with average prices taken into consideration in the pricing of items. In addition, more households were included. Thus, in the 2013 round, 18 participants took part in focus-group discussions for single-parent families and families with teenagers.
6. The ten baskets in ImPRovE's RB were food, kitchen equipment, clothing, personal care, health care, rest and leisure, maintaining social relations, safe childhood, mobility, and housing. The ten baskets in the CCSR's RB were food, clothing, personal care, health care, leisure and hobbies, household items, vehicles, electricity and insurances, mobility, and housing.

ImPRovE's expert-led approach
In the ImPRovE project, cross-nationally comparable reference budgets (RBs) were constructed for seven cities. Cross-comparability was pursued using a common theoretical and methodological framework. In constructing the RBs, the ImPRovE project team made use of national and international guidelines, recommendations, and expert knowledge. All countries started from a Belgian list of goods and services and adapted them to their national contexts. For adaptation, different information sources, such as survey data and national guidelines, were used. Differences between countries were limited to differences in institutional settings, climate and geographical conditions, culture, and the availability, quality, and prices of items. The pricing was conducted so that a target price level was set for each basket. The purpose was to set prices so that they were justifiable for the targeted living standard, while allowing some freedom of choice and autonomy for the hypothetical households.
Focus groups reviewed the goods list. One round of focus groups was held, comprising three groups with 16 participants altogether. Focus groups were used quite differently in the two RBs: while the Centre for Consumer Society Research's focus groups were more deliberative, ImPRovE's focus groups were more advisory. In ImPRovE's approach, revisions to the budget suggested by the focus groups were checked by the coordinating team to ensure cross-country comparability (Goedemé et al., 2015a). Although comparability was given precedence, differences between countries were permitted if compelling evidence could be provided. Essentially, researchers acted as evaluators, deciding which information to retain from focus-group discussions based on the budget's theoretical background. Thus, although ImPRovE's budget is not fully based on expert knowledge, it places relatively more weight on experts' knowledge. While ImPRovE's budget is described as consensual, 'consensus' here means that the resulting budget is based on 'observable social norms' and that focus-group participants 'largely agree' on the proposed items (Goedemé et al., 2015a). Notably, it has been suggested that focus groups are prone to accepting predefined lists (Mac Mahon, 2015). Indeed, only a limited number of changes were made to the content of ImPRovE's RB following the focus groups' review. Whether the lack of change was due to procedural constraints on participants' views or participant affirmation of the content is beyond the scope of this study.

Different methods, different results?
To summarise, the two reference budgets (RBs) on which this study focuses have similar characteristics. Both were constructed using a hybrid approach. Both relied on focus-group information (although the CCSR placed focus-group discussions higher in the information bases' hierarchy, while ImPRovE emphasised expert knowledge). In addition, both used guidelines and expert knowledge, even if ImPRovE employed these tools more than the CCSR. Despite these similarities, the two RBs generally represent different traditions in RB research, with different views on the best way to gather information.
For the reliability of RBs (or any other social indicator), small differences in how the indicator is constructed should not produce radically different results. This section analyses whether the CCSR's RB and ImPRovE's RB produce similar results. Table 1 presents each RB without housing costs for comparable households and household members. 7 Housing costs are excluded here because the analysis also takes place at the household-member level. Moreover, assigning housing costs to household members is not meaningful. The last column refers to how much the two RBs differ in percentage points. Table 1 illustrates that the two RBs differ in estimating the costs of social participation. In four of the six comparisons, the difference was at least 10 per cent, and in two comparisons, the difference exceeded 30 per cent. For single persons (both men and women), the CCSR's budget presents higher costs for social participation, although the differences for single adults vary from 4 to 10 per cent, depending on gender. Conversely, for children, the ImPRovE budget presents higher costs for social participation. This difference is especially significant for the four-year-old boy: the ImPRovE budget's cost for this household member was over 40 per cent higher than the comparable cost in the CCSR's budget. However, for a couple, the two reference budgets produce almost identical results, although the difference increases significantly for couples with two children, as the CCSR's budget is 30 per cent higher (EUR 504 in absolute terms). Thus, the CCSR's RB varies from 70 to 130 per cent compared with the ImPRovE RB. The comparison suggests that, at least in some cases, the RBs arrive at quite different estimates. The difference in estimates is especially large for the four-year-old boy and for the couple with two children.
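The percentages above use two different bases, which can be confusing: the CCSR budget at 70 per cent of the ImPRovE budget and the ImPRovE budget being over 40 per cent higher than the CCSR budget describe the same gap. The sketch below makes this explicit. The monthly amounts are hypothetical values chosen only to reproduce the reported ratios and the reported EUR 504 difference; they are not the figures in Table 1.

```python
def pct_higher(a, b):
    """How much higher a is than b, as a percentage of b."""
    return (a - b) / b * 100


def ratio_pct(a, b):
    """a expressed as a percentage of b."""
    return a / b * 100


# Hypothetical monthly amounts (EUR), not taken from Table 1.
improve_child, ccsr_child = 100.0, 70.0
print(ratio_pct(ccsr_child, improve_child))   # CCSR as a share of ImPRovE
print(pct_higher(improve_child, ccsr_child))  # ImPRovE relative to CCSR

# Hypothetical amounts consistent with a 30 per cent, EUR 504 gap.
improve_family, ccsr_family = 1680.0, 2184.0
print(ccsr_family - improve_family)             # absolute difference
print(pct_higher(ccsr_family, improve_family))  # CCSR relative to ImPRovE
```

Stating the base of each percentage alongside the absolute difference avoids reading a 30-point shortfall and a 43 per cent excess as two different findings.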
The results are not consistently higher in one RB, but vary between household types and household members. It has been suggested that the expert-led RBs produce lower standards than consensual RBs (Gilles et al., 2014). Ultimately, this study's comparison of these two Finnish RBs does not offer conclusive results. For adults, the results are slightly higher in the CCSR's consensual budget; and, for children, the results are higher in the ImPRovE budgets.
To obtain a clearer picture of the differences, Table 2 breaks down the RBs. The results suggest that there are baskets for which differences between the two RBs exceed 10 per cent. However, not all percentage differences indicate significant absolute differences, and some baskets have similar estimates. Table 1 shows that the CCSR's RB was somewhat higher for single men and single women. However, the CCSR's budget is not consistently higher across baskets, as leisure and mobility expenditures are higher in the ImPRovE project's budget. The higher CCSR budget is mainly due to the higher amounts it reserves for food and personal and health care. The higher food cost (circa EUR 25/month) is almost entirely explained by the pricing of lunch.
For children, the opposite is true: the higher cost presented in the ImPRovE budget is due to the higher amount reserved for food and personal and health care. The differences in food costs are mostly related to different assumptions. The CCSR's budget assumes that the four-year-old boy spends the whole day in day care, 8 whereas the ImPRovE budget assumes that he spends half a day in day care, which increases food expenditure in the home by approximately EUR 15 per month. The differences in food costs also likely reflect the budgets' different pricing procedures.
While differences between the two RBs are small or non-existent for many baskets (for example, the two RBs present almost the same costs for clothing and household items), differences in mobility costs for a couple with two children are considerable. In fact, this difference in mobility costs almost entirely explains the total difference between the two RBs for this household type. For couples with children, the almost five-times higher mobility costs in the CCSR's RB (EUR 545 in absolute terms) reflect the inclusion of a car in the budget. Conversely, the ImPRovE project's budget assumes the family will use public transportation. Notably, the method for the inclusion of cars here matters. As noted above, the CCSR's report declared that items considered necessary by all or almost all participants were included in the budget, although it did not offer explicit thresholds for inclusion. In 2010, 60 per cent of families with children perceived a car as necessary, whereas for couples, the share was 80 per cent (Lehtinen et al., 2010). Nonetheless, cars were included only for families with children. A car was not included for single-parent families, who were presumed to have the same mobility needs as two-adult households, perhaps because participants in the single-parent focus group did not consider a car as necessary. The inclusion of cars for only some households, despite similar or even higher perceived necessity for a car, illustrates that an unambiguous selection criterion for item inclusion was not established.
8. This is partly explained by the different assumptions regarding employment. The CCSR's budget assumes that household members are working except for the retired man. Meanwhile, the ImPRovE budget does not make this assumption. Despite these different assumptions, the RBs can be seen as comparable, as they target the same living standard and are meant to be used for the same purpose.
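The point about selection criteria can be made concrete. The sketch below assumes, purely for illustration, that inclusion follows a single rule applied to all household types: an item enters the budget when the share of participants calling it necessary reaches a threshold. The CCSR report states no such threshold, so the rule and the function names are hypothetical. Using the reported shares (60 per cent of families with children, 80 per cent of couples), no threshold reproduces the observed outcome.

```python
def include_item(share_necessary, threshold):
    """Hypothetical rule: include when the share reaches the threshold."""
    return share_necessary >= threshold


def consistent_thresholds(shares, observed, candidates):
    """Return the thresholds under which the rule reproduces every
    observed inclusion decision."""
    return [
        t for t in candidates
        if all(include_item(shares[h], t) == observed[h] for h in shares)
    ]


# Shares perceiving a car as necessary in 2010, as reported above.
shares = {"couple": 0.80, "couple_with_children": 0.60}
# Observed outcome: a car was included only for families with children.
observed = {"couple": False, "couple_with_children": True}

candidates = [i / 100 for i in range(0, 101)]
print(consistent_thresholds(shares, observed, candidates))  # no threshold fits
```

Any threshold low enough to admit the car for families with children also admits it for couples, so the observed decisions cannot stem from one uniform share-based criterion.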
Moreover, this approach to the inclusion of cars may also indicate problems in the validity of the RB approach, as different family types may not achieve the same living standard even though the same need for mobility was put forward. Overall, a single decision (the inclusion of a car) produced a drastic difference between the two RBs for the couple with two children.
Differences exist between the RBs for leisure and personal and health care, too. Available data demonstrate a considerable difference in the number of items: in the ImPRovE budget, health care consists of almost 30 different items, whereas in the CCSR's budget, it consists of only 11 items. However, the cost of the CCSR's basket is higher. One reason for this is the CCSR's inclusion of glasses, which were not part of the ImPRovE project's health care basket. Some of the differences between the personal and health care baskets of the two RBs can be traced to their theoretical frameworks. For instance, the ImPRovE project's budget includes fewer hair products for women, such as hairsprays, conditioners, and hair dyes, than the CCSR's budget. The ImPRovE project's focus-group participants considered that conditioner should be a part of the personal care basket. However, the rationale for not including this and other hair care products relates to the theoretical framework, namely, the theory of human needs (Doyal and Gough, 1991). It is argued that some hair care products may have potential side-effects on health; adding these products to the basket would therefore harm the basic need of good health. Thus, when the suggestions by the focus groups collided with the theoretical framework, the ImPRovE project's RB gave precedence to guidelines over social conventions in establishing criteria about which information should be retained. However, one could question whether this decision means that the RB misses some key aspects of everyday life and accepted norms in Finland.
In addition, leisure costs appear to vary between the two RBs, despite the general acceptance of these estimates in both sets of focus groups. The rather different amounts of leisure expenditure highlight that there is no exact method for determining the items that satisfy leisure or social needs. Both RBs rely heavily on focus-group input, which is unlikely to be robust. Therefore, social norms or expectations of leisure expenditure are not unified.

Housing costs: different standards?
The preceding analyses have focused on reference budgets (RBs) without housing costs; yet housing is the largest single basket in RBs. This section focuses on housing baskets, especially housing costs and apartment sizes (Table 3). The two RBs in focus both used private-sector rents. However, they took different approaches to estimating housing costs, and these approaches significantly impacted their results. As Table 3 makes clear, housing costs in the Centre for Consumer Society Research's (CCSR) budget are considerably higher for all household types. This results firstly from the assumed sizes of the apartments, which are larger in the CCSR's RB. Notably, the differences in apartment sizes increase with the number of household members. Secondly, and most importantly, the mean rent in the CCSR's budget is higher. Together, these two factors result in almost 50 per cent higher housing costs for single adults and over 100 per cent higher housing costs for couples and couples with children in the CCSR's budget. Some differences in housing costs are to be expected due to considerable variation in real housing costs between households; yet, given that the purpose was to construct minimum budgets, these differences seem remarkably high.
What makes these differences so great? First, the information base plays a role in determining apartment size. In the CCSR's budget, size was determined in focus groups. The participants mostly discussed the number of rooms needed for each family type, which the researchers then converted into square metres. In the ImPRovE project, the size of the apartment was determined by using expert knowledge, based on recommendations from the UK (Department for Communities and Local Government, 2013, 2015). Second, the information base affects pricing. The prices in the CCSR's RB are based on the mean rents of privately financed apartments, a pricing basis chosen by the researchers. The focus groups suggested quality criteria, emphasising that the dwelling should be warm, safe, and healthy. However, these criteria were not explicitly transferred to pricing. In the ImPRovE project, housing costs were based on EU-SILC data, so that some minimum quality requirements¹⁰ were met (Van den Bosch et al., 2016). Adequate housing costs were derived by observing the 30th percentile of all housing costs. In setting this threshold, the researchers focused on a minimum budget that households need for adequate housing while also arguing that, at this level, a considerable housing stock (i.e., 30 per cent of the market) is available. As the ImPRovE budgets give a higher weight to comparability than to acceptability in each country, the housing basket was not discussed in the ImPRovE focus groups, so its acceptability in Finnish society was not tested. In sum, these decisions have led to significantly different estimates of the proper housing costs.

Total reference budgets in focus
Table 4 presents the total budgets of the two reference budgets (RBs). The total budget includes the housing basket presented in Table 3 and all other baskets presented in Table 1. When all baskets are combined, the differences between the RBs become even more pronounced. The RB devised by the Centre for Consumer Society Research (CCSR) is higher for all household types, with the smallest difference for single-adult households at 25 per cent. For couples with no children, the CCSR's RB is over 40 per cent higher (EUR 700 in absolute terms). The biggest difference, however, is for couples with children, as the CCSR's RB is over 50 per cent higher than the ImPRovE project's RB. For this family type, the difference in absolute terms is staggering: according to the CCSR's RB, the couple with two children would need over EUR 1,300 per month more for adequate social participation than assumed in the ImPRovE RB.
RBs have been used as poverty indicators in previous research. Therefore, it is meaningful to contrast the two RBs against the widely used at-risk-of-poverty (AROP) indicator and its associated threshold (Table 5). For all family types, the ImPRovE project's RB is somewhat lower than the AROP threshold, ranging from 93 per cent to 99 per cent of it. It seems, then, that the ImPRovE project's RB follows the AROP threshold quite closely. By contrast, the CCSR's RB is significantly higher than the AROP threshold for all family types, and the difference increases with family size. For single-adult households, the CCSR's budget is almost 20 per cent higher than the AROP threshold; for couples, it is over 30 per cent higher; and for couples with two children, it is over 50 per cent higher. Another way to look at this is to compare the RBs with the median income of a couple with two children. Specifically, a couple with two children would need 91 per cent of the median income to achieve the CCSR's estimate. Considering that the aim was to construct a standard that reflects the minimum resources needed for social participation, the CCSR's estimate seems high.
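The two comparisons in this section (a budget relative to the AROP threshold, and a budget relative to median income) are linked by the definition of the AROP threshold as 60 per cent of the national median equivalent disposable income. The following is a minimal sketch of that conversion, assuming that the budget and the median income refer to the same household type on the same equivalence scale. The function and variable names are illustrative and are not taken from either RB.

```python
# Illustrative sketch: converting between "share of the AROP threshold"
# and "share of median income". Names are hypothetical.

AROP_SHARE = 0.60  # AROP threshold as a share of median equivalent disposable income


def share_of_median(budget_over_arop: float) -> float:
    """Convert budget / AROP threshold into budget / median income."""
    return budget_over_arop * AROP_SHARE


def ratio_to_arop(budget_over_median: float) -> float:
    """Convert budget / median income into budget / AROP threshold."""
    return budget_over_median / AROP_SHARE


# ImPRovE RB: reported at 93 to 99 per cent of the AROP threshold.
improve_low = share_of_median(0.93)
improve_high = share_of_median(0.99)

# CCSR RB, couple with two children: reported at 91 per cent of median income.
ccsr_family_ratio = ratio_to_arop(0.91)

print(f"ImPRovE RB as share of median income: {improve_low:.1%} to {improve_high:.1%}")
print(f"CCSR RB relative to AROP threshold: {ccsr_family_ratio:.2f}")
```

Under these assumptions, 91 per cent of median income corresponds to a budget about 52 per cent above the AROP threshold, which is consistent with the 'over 50 per cent' figure reported for the couple with two children, while the ImPRovE RB lies at roughly 56 to 59 per cent of median income.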

Reference budgets in comparison with Finland's minimum income benefit
This section analyses whether Finland's minimum income benefit is adequate in light of the reference budgets (RBs) developed by the Centre for Consumer Society Research (CCSR) and the ImPRovE project. The Finnish minimum income benefit is a means-tested, last-resort form of social assistance, which covers the basic necessities of life. Here, 'social assistance' signifies an income transfer paid to households that do not have sufficient resources to cover their expenditures. The amount of social assistance in 2014 after housing costs (which are assumed to be fully covered by the state) was EUR 477.3 per month for a single person, EUR 811 for a couple, and EUR 1,422 for a family with children (Table 6). The comparison suggests that, according to both RBs, existing Finnish social assistance is insufficient. Even though the two RBs arrive at different estimates, the general conclusion does not change: the amount of the minimum income benefit is significantly below both RBs. The previously observed fluctuation in the CCSR's budget is also evident in the comparison with social assistance, as the adequacy of social assistance varies between family types and is lowest for the couple with two children. Similar fluctuations do not occur in the ImPRovE budget: social assistance ranges from 79 to 83 per cent of the ImPRovE project's RB, depending on family type. The variation in the CCSR's budget is clearly affected by the inclusion of a car for families with children, which considerably increases the RB for couples with children. As housing costs are omitted from the analysis, the differences appear moderate, except for the couple with two children. Nonetheless, the degree of insufficiency depends on the RB used. According to the ImPRovE project's RB, an increase of roughly 20 per cent in social assistance would suffice, but a larger increase is needed if one bases the analysis on the CCSR's RB.
This result again highlights the clear differences between the two RBs' estimates of the income needed for social participation. It is evident that the choices made in the RBs' design and implementation significantly impact their estimates.
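The relationship between an adequacy ratio (the benefit as a share of the RB) and the benefit increase needed to close the gap can be made explicit. The sketch below is illustrative only: it uses the 79 to 83 per cent range reported in this section for the ImPRovE project's RB, and the function name is hypothetical.

```python
# Illustrative sketch: benefit increase implied by an adequacy ratio.
# The function name is hypothetical; the ratios are those reported in the text.


def required_increase(adequacy_ratio: float) -> float:
    """Relative increase in the benefit needed for it to equal the reference budget.

    adequacy_ratio is benefit / reference budget and must be positive.
    """
    if adequacy_ratio <= 0:
        raise ValueError("adequacy_ratio must be positive")
    return 1.0 / adequacy_ratio - 1.0


for ratio in (0.79, 0.83):
    print(f"adequacy {ratio:.0%}: increase needed {required_increase(ratio):.1%}")
```

Because the increase is expressed relative to the lower benefit level, a shortfall of about 17 to 21 per cent of the RB translates into an increase of roughly 20 to 27 per cent of the benefit.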

Discussion and conclusion
Reference budgets (RBs) are a key indicator for monitoring the adequacy of minimum income benefits in Europe. Currently, comparable RBs do not exist in all European countries; instead, RBs are constructed using different methods across countries. Previous research has noted that RBs produce different results, but little work has been done on the mechanisms that produce these different results. The purpose of this study was to 'open the black box' of RBs by examining elements of their design and implementation responsible for producing different results across RBs that use different methods, but have similar frameworks and theoretical backgrounds. More specifically, this study examined whether two Finnish RBs, produced by the Centre for Consumer Society Research (CCSR) and the ImPRovE project respectively, yielded similar and consistent results across different baskets and household types. It also examined how the two RBs compare to the at-risk-of-poverty (AROP) threshold and the Finnish minimum income scheme. The two RBs offer a strong basis for comparison, as they apply the same theoretical framework, target the same living standard and population, and seek to assess the level of minimum income benefits. They do, however, use different methods. The CCSR's budget applies a consensual approach, using mainly focus groups to form budgets. Meanwhile, the ImPRovE budget applies an expert-led approach: although it utilises focus groups, it places relatively more weight on guidelines, recommendations, and existing scientific knowledge. The results of this study indicate that, when compared with both RBs, Finland's social assistance does not ensure that recipients can reach a living standard that enables social participation. However, the insufficiency of the Finnish minimum income benefit differs greatly, depending on the RB used for comparison. This study has sought to better understand why the RBs produce different results.
First, the results suggest that when housing costs are excluded from the analysis, the RBs typically arrive at relatively similar estimates of the resources needed for social participation for single adults, 10-year-old girls, and couples; however, they still arrive at different estimates of the resources needed for the social participation of 4-year-old boys and couples with two children. The biggest differences observed in this comparison were for couples with two children. For this family type, the RBs estimated different mobility costs, which made the overall costs significantly different. Differences in mobility costs were attributable to the CCSR's inclusion of a car for some family types, but not others, based on focus-group discussions. Second, when housing costs are included, the differences between the two RBs increase, with vast differences evident for couples and couples with two children, as well as very different minimum housing costs across the two RBs. This suggests that the two RBs produce very different standards of what is needed for social participation in Finland even though they identify the same needs. When looking at the total RBs (including housing costs), the results indicate that public-led RBs might produce higher costs than expert-led ones, which affirms previous research (Gilles et al., 2014). These differences were traced back to the constitutive characteristics of the RBs, namely their methods: the information bases, selection criteria, evaluators, and pricing. For example, the inclusion of a car suggests that the selection criteria used for determining which information is retained from the focus-group discussion were not clear in the CCSR's RB; conversely, in the ImPRovE approach, more weight was attached to consistency across countries and household types.
Reference budgets should be scrutinised to assess their suitability as social indicators. The Indicators Sub-Group has set up quality criteria for EU social indicators (ISG, 2015). First, RBs must be valid: that is, they must be proven to correspond to the targeted living standard. The two RBs presented here produce very different estimates for the same living standard. Small differences between the two indicators would be acceptable, but such a great difference between the two RBs raises the question: do the two RBs represent the same living standard? In particular, the high estimates of the CCSR's RB cast doubt on whether the targeted living standard has been achieved. To illustrate: whereas the ImPRovE project's RB follows the AROP threshold of 60 per cent of the national median equivalent disposable income, the CCSR's budget is significantly higher than this AROP threshold. The results of this study suggest that this occurred due to the choices made in the RBs' design and implementation. The roles of the focus groups as the principal information base, the selection criteria, and the researchers who chose the pricing procedures are significant. An indication of this is the choice of average rent prices assumed in the CCSR's budget, which might have inflated the RB and therefore overestimated the resources needed.
Second, the indicator should produce 'meaningful' results; that is, it should not inflate the number of people who cannot reach the living standard. The indicator must produce robust results and be statistically validated, i.e., the data should not be based on arbitrary adjustments and should be considered statistically reliable. It is evident that robustness is an issue, as there [...] discussions. Moreover, the ImPRovE approach, in which the focus groups were merely given an advisory role, may not provide a sufficient platform for people to discuss the habits, conventions, and social norms of the given society (Pereirinha et al., 2020). Thus, future studies should explore the possibility of using random sample surveys, which could help increase robustness while giving people the opportunity to express their opinions. This would require international collaboration, but it would likely provide valid, transparent, and robust RBs for Europe. Indeed, this work is necessary if RBs are to be used to monitor Principle 14 of the EPSR.