Creating a Culture of Meaningful Evaluation in Public Libraries

The current state of practice sees public libraries, like all public institutions, enduring funding challenges within the dominant political-economic environment, which is shaped by the tenets of new public management and the neoliberal audit society. Libraries, feeling threatened and unsure about their future stability, seek new ways to demonstrate their value. However, they face institutional cultural constraints when attempting to introduce new assessment methods to meet this challenge. The new dynamics require them to go beyond output measures (counts). With research findings supported by survey and interview data from Ontario public libraries, and in agreement with the current literature on the subject, we propose a new model to address this phenomenon, serving two purposes: demonstrating a library’s present state of cultural readiness to introduce new systems of outcome assessment and charting a path toward creating a culture of meaningful evaluation.

The societal context in which public libraries operate is rapidly changing, presenting them daily with various challenges: digitization, changing usage patterns, and evolving patron expectations. At the same time, Ontario public libraries, as creatures of their respective municipalities, exist within a predominantly neoliberal audit environment that constantly challenges them to demonstrate their value and relevance. As the prevailing ideological positioning of many decision makers tends toward a new public management (NPM) approach, in which private sector principles and practices are applied in public sector organizations (Howlett, Ramesh, & Perl, 2009; McDavid, Huse, & Hawthorn, 2013; Pal, 2010), the valuation of libraries is framed in economic terms. As such, public libraries compete with other municipal services for resources and priority.
In this article, we take a traditional public service perspective to explore the challenges inherent in introducing an outcome-based evaluation system to a public library, rejecting the currently in-vogue neoliberal mind-set. As we delved into this issue, we discovered that the current preference for NPM alone could not satisfactorily explain public libraries' evaluation preferences. Organizational culture plays an equal, if not more significant, role in maintaining the current state of affairs. This article unpacks our findings through five sections. First, we define the issue of evaluation through a classic public service lens. We then present results from a survey of the current state of outcome measurement in Ontario public libraries. Third, we analyze the influence of organizational culture in maintaining the status quo. Fourth, we present a theoretical model integrating organizational culture and evaluation, focused on moving libraries from a culture of resistance to meaningful evaluation, through to accepting and embedding the practice in library work flows. Finally, we suggest steps for moving beyond theory and implementing the model in library practice. We believe that establishing more effective forms of evaluation will help public libraries to better demonstrate the impact they have on their communities.

Defining the Issues
Public libraries tend to evaluate their activities either entirely or in the main through outputs: circulation statistics, program attendance, and visits (in person or online). While these indicators demonstrate a certain level of activity, they do little to express the true value of the library experience. Outputs demonstrate narrow program productivity, that measure being merely equated to a count of a single transaction or event, without demonstrating any of the outcomes deriving from that count. Impact, which is the real story behind the output count, is lost, and as a result the true worth of public library activity is either discounted or misunderstood. Brophy and Coulling (1996), in their analysis of evaluation methods in academic libraries, note that, "all too often performance measures are based on the philosophy of 'measuring the measureable'" (p. 157). An apt analogy to describe libraries' evaluation strategies is that they are cutting the suit given the bolt of available cloth, as opposed to seeking the required material to ensure the suit actually fits and provides adequate cover.
The reporting of only traditional library program outputs is becoming less convincing to outside stakeholders. Libraries can no longer coast on their reputation as a public good whose value to society is self-evident. This notion may have held true previously, but it is a saintly self-perception that is not shared by legislators or administrators who allocate library budgets (Debono, 2002; Lakos & Phipps, 2004). As noted by Newcomer (2004) with respect to building evaluation capacity in U.S. federal agencies, the "current environment emphasizes performance reporting and evidence-based policy." This holds equally well for all publicly funded agencies and nonprofits, including libraries (Oakleaf, 2010) and other human services organizations (Hatry, van Houten, Plantz, & Taylor Greenway, 1996).
There are both internal and external motivators for an organization to engage in program assessment. Externally, accountability to funders and the community, as well as accreditation pressure, provide powerful incentives. Internally, evaluation can help an organization measure its achievements, improve program and service delivery, and identify the need or feasibility of new programs (Hatry et al., 1996; Hiller, Kyrillidou, & Self, 2008; Hodges & Hernandez, 1999). However, there exists the danger that the adoption of new evaluation systems will be perceived by workers as a weapon of managerial control, rather than as a means of "demonstrating success or learning of the need to change strategies" (Lakos & Phipps, 2004, p. 353).
The present review draws heavily from studies of academic libraries, given that most evaluation studies deal with issues in academic rather than public libraries. This is due to a number of factors. Academic libraries are part of larger institutions that actively promote the use of outcome measures, particularly as a result of the student learning outcomes movement in higher education (Lakos & Phipps, 2004; Oakleaf, 2010). In addition, academic groups such as the Association of Research Libraries (ARL) have been encouraging and educating their members about assessment for some time, whereas public library associations have been slower to engage with the topic. However, this is changing, as is evident from the strong interest that several public library groups have shown in the present research. Finally, due to requirements for promotion and tenure, academic librarians are often more motivated to conduct and publish research than are public librarians, and more likely to receive release time to do it. Nevertheless, despite differing reward systems and administrative reporting structures, the similarities between types of library are significant enough that the lessons to be learned are transferable. Oakleaf (2010) draws from the literature on school, public, and special libraries to paint a picture of the value of academic libraries to their communities. Hatry and colleagues (1996), in the United Way manual Measuring Program Outcomes: A Practical Approach, assert that outcome-based evaluation is appropriate for all types of human services organizations.

Current State of Public Library Evaluation
The key goal of our research is to examine existing evaluation systems in use by public libraries, in an effort to replace the current reliance on predominantly output-based performance measurement models with new forms of evaluation based on capturing outcomes.
Outcome models assist policy makers, both within and outside organizations, to address fundamental questions about resource allocation, delivery methods, and agency design and purpose (McDavid et al., 2013). Ontario public libraries, although governed by provincial legislation, are in reality creatures of their respective municipalities, given that the vast majority of their funding comes from local governments. As such, given the practices of municipal governments, they exist within a predominantly traditional output-based measurement environment. In this environment, they are challenged to constantly demonstrate their value and relevance.
Our research goal is to investigate the potential for establishing model public library evaluation systems based on sociological frames of reference: a model that will examine long-term outcomes and impacts (sometimes referred to as "quality of life issues"), in a similar fashion to how qualitative measures have been developed and implemented for public health interventions and recreation programs and services. We approach this issue utilizing a threefold method, systematically unpacking the issue commencing with an understanding of the current state of practice: (a) exploring the limitations and challenges inherent in the current quantitative evaluation system; (b) creating a better understanding of the cultural impacts and policy implications of using a sociological performance framework on library organizations, their constituencies, and other stakeholders; and (c) capturing changes in the Practice-Program-Policy continuum and the downstream implications for how libraries are viewed and valued upon the introduction of new systems of performance measurement.
Little previous research has been conducted on the topic of outcome evaluation in public libraries. Huysmans and Oomes (2013) note that "most of the research is aimed at university libraries and research libraries" (p. 169). The challenge, they conclude, even with this emphasis on academic libraries, is that authors seldom move beyond the theoretical. We have reached a similar conclusion through our own review of the contemporary literature (Farkas, 2013; Hiller et al., 2008; Linn, 2008; Nussbaumer & Merkley, 2010).
A good starting point for a current review of the state of evaluation practice can be found in the work of the Ontario Public Library Association (OPLA) in its review of teen programming assessment in public libraries. In 2012, OPLA undertook a province-wide survey to better understand the services and programs offered to teens by public libraries, and issued its findings in the Teen Services, Benchmarking and Statistical Report, 2013. This was the first time that OPLA collected benchmark data focused on teen services. Of 94 libraries reporting that they offer teen services (147 of 320 OPLA member libraries responding), fewer than half indicated that they measure outcomes or impacts. The rationale for OPLA undertaking this research centered on public libraries' increased focus on and resource allocation for teen services. For many libraries, this degree of attention to teen services was relatively new, driven in part by libraries adopting the OPLA Teen Rights in the Public Library, a charter that formally recognizes the need for specialized library services unique to the developmental needs of teens, and in part by a focused effort to ensure that a new generation of patrons is introduced to public libraries. One particular survey response contained in the report piqued our interest, leading us to a closer examination of the data.
The OPLA report uses a traditional approach in its methodology: a quantitative assessment of performance metrics, tallying such indicators as program attendance, per capita dollars expended, floor space utilized, and Internet usage. A single question in the report relates specifically to outcome evaluation, and this constituted the starting point for the present research. When asked whether they measure outcomes or impacts of teen programming, fewer than 50% of libraries reported doing so (Table 1). Given that teen services are seen as a relatively new stand-alone service in many libraries, and that there is a growing awareness of the need for outcome evaluation (the assumption being that it is easier to implement new evaluation practices for newer vs. traditional services), we were intrigued by this response. We were also interested in the demographic breakdown of the libraries that were not engaged in outcome evaluation.
From this response, it appears that larger organizations, presumably those with greater resources to expend on newer practices, were less inclined to conduct outcome evaluation than were smaller organizations. Discounting public libraries serving populations smaller than 5,000 people (given their state of resource poverty), the data show that the larger the population served, the less likely a library is to engage in outcome evaluation. The question for us then became, "What is happening here, and why are larger organizations engaging in outcome evaluation less often than smaller ones?" Given the three foci of our research (the limitations and challenges inherent in the current system, understanding the cultural impacts and policy implications, and capturing changes in the Practice-Program-Policy continuum), we felt that a better understanding of this phenomenon could reveal important information in terms of our broader research agenda.

Our Research
The Teen Services Report seemed like a good launching point for a closer examination of evaluation practices. The OPLA leadership was eager to collaborate, and provided us with research funding and the report's raw data, including staff contact information for the 94 libraries that had previously indicated offering teen programs. They also sent an email message to each library, endorsing our efforts and encouraging them to work with us. We conducted a short survey using the Interceptum online survey platform. The questionnaire included branching logic to investigate whether outcome-using respondents define outcomes and impacts in the same way as those who use different evaluation methods. Nine questions, including both fixed and free choice options, were included (Figure 1). The survey instrument was approved by the Huron University College Committee for Ethical Review of Research Projects Involving Human Participants.
After the survey was closed and a coarse thematic analysis of responses conducted, seven libraries were selected for more in-depth telephone interviews. Small, medium, and large library systems were chosen to participate, with the rationale that libraries of different sizes face different challenges with respect to stakeholder expectations, bureaucratic complexity, number and size of programs offered, as well as other possible unanticipated factors.

Survey Results
All 94 libraries from the OPLA survey that had indicated conducting teen programs were recruited for the present study. Invitations were sent including a brief description of the research, an informed consent form, and a link to the online survey. Participants were given 8 weeks to complete it, and three reminder messages were sent over the course of that period.
Respondents were asked for their job title (Figure 2), given that their position in the library might affect the nature of responses. Senior administrators responsible for public relations are likely to give different answers than librarians responsible for youth programming, evaluation, or other activities or constituencies. Youth services librarians constituted the largest group (n = 16), and directors/CEOs (n = 8) and other senior administrators (n = 8) were next. Other respondents included child services librarians (notably for libraries without a dedicated youth services staff), adult services librarians, and paraprofessional workers. The nature of responses from these different groups was not closely examined and remains an area for further study. It can be speculated that administrators' global perspective on library functioning, including strategic objectives, funding agency demands, and a general understanding of organizational effectiveness, gives them a different perspective from staff lower in the hierarchy, who, lacking the executive perspective, nevertheless have a strong understanding of the organization due to their active participation on the library front lines.
The relatively small pool of respondents, as well as the possibility of participant self-selection bias, precluded the use of rigorous statistical analysis, so the limitations inherent in qualitative analysis of text responses apply to this study. There are also limits to the generalizability of the present results. Respondents were, in the main, middle- or upper-level managers in their respective libraries, typically those tasked with evaluation activities. While these are likely to be the best informed people about library evaluation activities, they represent a particular administrative viewpoint that may not be shared by staff members in both the lower and higher ranks of their libraries.
Most respondents indicated that they use output measures in the evaluation of teen services (Figure 3): All but one record attendance and more than half (51%) keep track of the number of sessions offered, and all but three libraries reported using more than one type of measure. Twenty-seven libraries (63%) indicated that they use outcomes or impacts in their program evaluation activities. However, given the apparent rarity of outcome measurement reported in the literature, this seemed like an inordinately high number. We were skeptical that participants completely understood the meaning of the concept of outcomes, so the definitions supplied by participants were analyzed to determine whether this was indeed the case.
Respondents were asked for their own definition of "outcomes/impacts." Due to the survey's branching structure, it was possible to separate the responses of participants who did from those who did not indicate the use of outcomes. Thirty-five responses were given, by 23 outcome users and 12 nonusers. It was anticipated that the two groups would understand the term differently, with those using outcomes likely to have a better grasp of the concept. Various definitions ranging in accuracy were received, and these can be roughly divided into four categories. "Strong" definitions emphasized changes in behavior, knowledge, or social integration, and were user-centric rather than library-centric. Several noted that outcomes are qualitative results that can be difficult to measure, and the best definitions also referred to evaluation by triangulation using multiple qualitative outcome and quantitative output measures. Strong responses were also highly varied, demonstrating the large number of community benefits potentially attributable to libraries. "Weak" outcome definitions described the usual array of feedback measures, from anecdotes, to customer satisfaction surveys, to focus groups. While these methods engage program participants at least minimally in the evaluative process, few respondents explained how these qualitative data are used as evidence in decision making (this was not directly asked for in the survey, and could be a question for further investigation).
"Conflating outcomes with outputs" is the third response category, with a majority of participants including attendance and/or the number of sessions offered as part of their definition. Finally, a small number of definitions included "benefits accruing to the library" rather than to library patrons, such as unanticipated spinoffs or relationships formed between the library and some external stakeholder. The two most commonly observed themes were weak outcomes and the conflation of outcomes with outputs, while strong outcomes were less common. There was relatively little difference between the two groups. Generally speaking, respondents who do not use outcomes provided vaguer definitions, such as "beyond the numbers to more social, environmental effects." Somewhat surprisingly, they were also less likely to conflate outcomes with outputs. This suggests that libraries that report using outcomes may be overconfident in their evaluation systems.
Twenty-seven participants reported using outcome/impact evaluation to assess teen programs. When queried about the outcomes they assessed, there was evidence of definitional confusion. A majority (n = 17) reported using written, verbal, or anecdotal feedback. While this is an outcome measure, its dependence on patron self-reporting makes it of limited utility, as patrons may wish to spare librarians' feelings when a program was poorly delivered. One librarian reported that feedback from teens has led to "our programs constantly changing due to their recommendations." Another indicated that a program was successful if it resulted in new spinoff activities, program expansion, or new partnerships with the community. While this is a valid measure, it focuses internally on outcomes for the library rather than benefits to patrons. Overall, a lack of clarity about what is meant by outcomes was quite obvious, with 15 of 27 libraries claiming that program attendance, and four the number of sessions offered, are types of outcome measure.
Sixteen respondents indicated not using outcomes, and supplied a variety of reasons, including unfamiliarity with the concept. Inadequate staff capacity, including the skills needed to conduct effective evaluation and sufficient time to allocate to the task, was mentioned repeatedly. Two libraries indicated that the small number of programs offered and few participants did not warrant the effort of conducting rigorous evaluation. At a higher organizational level, several respondents noted that outcome evaluation was not deemed a priority by the library board or management. Two libraries noted that they were actively in the process of transitioning toward outcome-based measurement systems, and two others that they intended to do so in the future. These reasons are in agreement with other research, especially the observation that librarians lack adequate training in social science research methods, including research design, qualitative and quantitative data collection, analysis, and application to decision making (Hiller et al., 2008).
As for outcomes not being a priority for management, this may be due to any of several reasons, which we explore in greater detail in the next section of this article.
Respondents who do not use outcomes for teen services were asked whether outcomes are used elsewhere in the library. It was hypothesized that due to the relative novelty of teen services, either of two possibilities might hold true: On one hand, services with longer histories might have better developed evaluation systems. On the other hand, evaluation practices, whether good or bad, might remain consistent across units in the library. Seven respondents replied that they use outcomes in evaluating children's services, and five replied, without providing any further explanation, that outcomes are not used elsewhere. Given these limited data, we cannot draw any firm conclusions on this topic.
Respondents were asked about their level of satisfaction with the evaluation methods currently in use. Overall, approximately half indicated that they were satisfied. However, those who reported using outcome measures were far more likely to be satisfied than those who do not (Table 2). When asked to elaborate, a number of patterns are evident:
Satisfied outcome users (n = 18): Eight respondents provided in-depth responses. They acknowledged the challenges involved in conducting meaningful evaluation, especially with teens who can be difficult to communicate with. However, they found that the feedback obtained "showed great results for both teens and staff," "provides an excellent picture into impacts," and allows for rapid identification and modification of unsuccessful programs, which in turn permits improved allocation of staff and financial resources.
Satisfied nonusers (n = 4): This small group indicated that because they offer few programs, the complexity of outcome evaluation would be overkill, but that "as we increase programming, the method of evaluation will necessarily change."
Unsatisfied outcome users (n = 9): All nine of these respondents answered this question, indicating some frustration. They complained that outcomes are difficult to measure, and that many staff "get caught up in the numbers game" and do not accept qualitative outcomes as valid measures. There is a lack of shared vision, leading to inconsistent practice among staff who are not provided with a clear set of institutional guidelines to follow.
Unsatisfied nonusers (n = 12): All 12 of these respondents expressed frustration with their dependence on output measures such as attendance and circulation numbers. They realize that outputs do not provide a meaningful understanding of patrons, and that qualitative measures such as interviews could be helpful in explaining things such as low program participation rates.
Most also noted that better evaluation methods are needed not just to measure program success, but to help improve future planning and execution. They understand that questions such as "How does volunteering at the library impact teens?" "What can the library do to get teens to participate?" or "Who should attend? What do they learn? Why do we need programs?" cannot be meaningfully answered using quantitative metrics. Several also expressed an interest in the results of the present research, with the belief that it might help them modernize their methods. As one librarian noted, "we are currently very traditional and are moving toward more dynamic methods of evaluation."
Several libraries noted that their teen programs do not receive any special funding, making this question irrelevant. Of the 27 responses received, only 6 indicated the need to report program results. Two libraries report outcomes for funding received through provincial government agencies or corporate sponsorships, one supplies attendance figures to its external funder, and two report output measures to the board of directors. The small sample precludes generalization but suggests that external pressure from funders is not yet an important driver for adopting better evaluation practices. These findings suggest further possible research questions:
Research Question 1: How does your library use patron feedback in decision making?
Research Question 2: How do administrators and line librarians see evaluation differently? (A demonstration of organizational effectiveness or of managerial control over worker behavior?)
Most libraries do not report outcomes, or else they conflate outputs with outcomes. It is felt that it is too much work to develop evaluations for small programs with few participants. There is a lack of recognition that in this case evaluation can be used to modify programs to increase participation.
Evaluation systems are geared more to program improvement than to budgetary decision making. Some libraries are actively examining their evaluation systems. Initial experiments with outcomes give real knowledge of program quality, which helps create staff buy-in: success breeds enthusiasm. Staff recognize the limited utility of quantitative data for decision making, but are concerned that more robust measures would create extra work, be used to enact austerity measures, or be used to evaluate staff performance rather than program success.
While the frustration with quantitative measures suggests a readiness for change, a lack of understanding about how to improve the evaluation system is a major barrier. Skills deficits, organizational priorities, and lack of time all impede the use of outcomes, and top-down driven change without adequate staff consultation or buy-in generates resistance, a cultural problem for libraries, which tend to be hierarchically structured.
Culture appears to play a significant role in preventing the successful implementation of program evaluation in libraries. We felt that the role of culture as a gatekeeper of evaluative practice required greater investigation.

The Role of Organizational Culture
Through the responses to our survey's open questions and subsequent follow-up interviews with key informants, it became increasingly apparent that for public libraries to integrate effective evaluation practices into everyday work flows, there is a need to address issues of institutional culture. Organizational culture has been defined as the social or normative glue that holds an organization together (Siehl & Martin, 1981). It expresses the social ideals, values, and beliefs that members of an organization come to share (Louis, 1980). These values or patterns of belief are manifested by symbolic devices such as myths (Boje, Fedor, & Rowland, 1982), rituals (Deal & Kennedy, 1982), stories (Mitroff & Kilmann, 1976), legends (Wilkins & Martin, 1980), and specialized language (Andrews & Hirsh, 1983). These studies assert that culture can have enduring consequences, and can have powerful positive or negative effects on individual and organizational performance (Kotter & Heskett, 1992; Lim, 1995; Wilkins & Ouchi, 1983).
In his seminal work on organizational culture, Schein (1990) asserts that its impact on organizations is the critical factor that can either advance or stymie innovation. He defines culture as a pattern of basic assumptions that works well enough to deal on a daily basis with emerging issues and challenges (Schein, 1985). These patterns of operational shortcuts (heuristics) work "well enough," and are taught to new institutional members as correct ways to perceive and react to emerging issues. Linn (2008) reviews the influence of organizational culture on the ability of academic library administrators to propose change, emphatically stating,
[Culture] is something that can easily make the difference between an administrator's proposed change succeeding or failing. The obvious problem for a manager trying to take organizational culture into account during decision making is the wildly different ideas of what organizational culture is, why it is important, how it should be measured, under what conditions it should be changed, and how one might be able to change it. (p. 92)
From the responses to our open-ended survey questions, we observed two main rationales for not implementing a fuller system of outcome evaluation: staff perceived it as a direct challenge to held cultural beliefs regarding the "true" work of librarianship; and managers felt that it was too difficult to implement because it would "fly in the face of" existing cultural norms. In the first case, common responses ranged from evaluation seen as "busy work" that detracts from the real work of staff, to it being viewed as additional and unnecessary work. In the second case, common responses from management stated that staff would not want to undertake anything more onerous than what they are currently engaged in, and that it would challenge the staff's perceived sense of professional autonomy.
These responses are in line with the observation made by Hodges and Hernandez (1999) that "culture in organizations can be thought of as the beliefs, values and meanings shared by members in the organization" (p. 185). Schein's (1990) theory of culture postulates that its impact on organizations can be viewed on two levels: the official, formal culture as evident in devices such as mission statements and public pronouncements of values; and the unofficial subculture, where the underlying principles and hard-held truisms of the organization reside. Like an iceberg, much of the true weight, impact and influence of culture operate below the waterline. Mixing metaphors, the flimsy paper boat of formal culture is no match for the whale of subculture, the most important driver of organizational behavior (Figure 4).
Cultures, and particularly subcultures, can be recognized as one of the greatest inculcators of organizational beliefs and practice, so it is perhaps not surprising that some of our survey respondents tended to question the worth and purpose of evaluations, postulated that evaluation work has a negative influence over other real priorities, and viewed new forms of evaluation as simply "busy work." Outcome-based models of evaluation in public libraries require significantly more time and effort to design and implement than more traditional output-based quantitative models, if for no other reason than that they represent the road less traveled. Challenges noted in early efforts to introduce new models of evaluation into Ontario public libraries are captured in the following comments from the interview stage of our research. One interviewee, a manager in a large urban library system, stated that a common response from staff asked to introduce new evaluation methods was "we are so busy, why do we need to do this? It is just busy work." Similarly, a manager in a midsized urban library system, describing staff's response to the introduction of some new evaluation approaches, stated that "we need a change in temperament, a change in perception. Some (staff) don't like too much work." The cultural indoctrination that new staff members receive can be seen as a primary contributor to the resistance toward new and different evaluation models; it can create a disconnect between outcome evaluation and its relationship to librarianship. As demonstrated by the two responses above, this contributes significantly to resistance. Further (and stronger) opposition emerges from the ritualistic role, a form of professional inculcation, that the long-standing practice of output evaluation plays in obstructing the adoption of new evaluation methods.
Farkas (2013) skillfully captures this phenomenon in her review of cultural influences affecting library practices:
Current models of evaluation can be seen as organizational artifacts in public libraries. Circulation and attendance counts serve as iconic artifacts (as well as practice) rituals in which staff undergo both defining them as part of the organization and bonding them to it. As iconic ritual they serve as central tenets to the profession, and not conducting them or changing their delivery in any discernible way is tantamount to heresy. (p. 15)
In this sense, changing evaluation practices can be seen as a challenge to professional judgment and autonomy, a perception that creates powerful cultural resistance. Linn (2008) concludes that while cultural impacts on organizations vary, organizational culture is "a fundamental part of what integrates members of a group" (p. 89). Drilling down to understand the degree of impact culture has on introducing new work flows, and the need to create a different sense of value to the work on the part of staff, in this case outcome evaluation, is the challenge. Library leaders who ignore the central role that culture, especially the part residing below the proverbial waterline, has on the enterprise, do so at their own risk. "An institution's culture can be one obstacle . . . to having a library's director being able to institute changes" (Linn, 2008, p. 88).
Where we differ from Schein (1985) is in his contention that the only real work of managers is to build and maintain cultures. We assert that library culture transcends the ability of any one individual manager, or group of managers, to unilaterally implement change. Although authentic engagement of management in the effort is necessary, it is not sufficient for success. Given the deep and long-standing professional indoctrination practices in the profession, we proffer that a more organic and inclusionary process be considered. In this vein, we concur with Preston's (2004) observation that while cultural change is one of the more important factors to consider, given its intangible nature, it is one of the hardest to effect. In order for new and effective evidence-based models of outcome evaluation and decision making to take firm root within public libraries, existing organizational cultures need to be acknowledged, understood, and addressed simultaneously with the introduction of new evaluation systems.
A focused approach to cultural change is needed to prepare the way for the introduction of new assessments. New evaluation methods, if authentically executed, have the potential to fundamentally realign how the public library conducts its business, prioritizes its activities, and cherishes what is seen as valuable. When utilized in this fashion, evaluation provides an objective lens that reveals different and untraditional insights into how the organization operates, which may at times be a direct challenge to the current dominant culture. New systems of evaluation can be revolutionary, leading to a cultural paradigm shift within the organization. Farkas (2013) cites many articles in the library literature that "suggest that organizational culture is to blame for the lack of assessment cultures in many libraries" (p. 14). Furthermore, she offers an explanation that although this challenge may exist, "a culture of assessment could instead be used as a lever to change organizational culture." What we are offering as a result of our research on this subject is somewhat different; that is, neither a model where cultural change leads to evaluation change, nor a model where evaluation change leads to cultural change. Rather, we propose a model where new systems of evaluation and corresponding cultural change occur simultaneously.

Culture and Evaluation
Our proposed model is akin to an organic progression. Implementing new systems of outcome evaluation reveals the nuanced qualitative differences that service providers are making, affecting both library patrons and the staff delivering the service. Corresponding cultural shifts will follow, in terms of how the library sees itself, its role, and its place in the community. Simultaneously, shifts in culture create more willingness to change, allowing staff to participate in systems of evaluation that may create substantive transformations in how they understand their role in the library and community. This should in turn lead to real organizational realignment. This approach turns causality on its head. We are proposing a model in which the notion that "this change will impact that change" is not valid. Rather, both this and that need to change simultaneously; they act as co-determinants of each other.
The challenge in implementing this model of a dualistic cultural-evaluative paradigm shift resides in the problem of cultural inertia. Administrators and staff tend to perceive their privileged place as directly attributable to the existing structure of the organization and its corresponding (relatively stable) culture. Nussbaumer and Merkley (2010), in their review of obstacles to introducing new systems of assessment at the University of Lethbridge, noted,
(Its) culture gave precedent to the preference of library staff over the needs of students, faculty and other library clients. Rules and regulations abounded and the staff was not empowered to make decisions. Innovation was discouraged. The status quo ruled. Morale was low and many of the library staff had lost their voice through fear of negative repercussions from their colleagues. There was very little sense of personal responsibility and accountability. The culture reflected an inflexible and hostile environment. (p. 683)
The element of control is, to a large extent, a means for ensuring cultural stability, because the system in its current configuration is perceived as the source of power, position and privilege for the organizational office holder. Our research uncovers this pattern in some of the public libraries we surveyed, a point reinforced by some interviewees. As one participant from a large urban library recounted, "Our managers have had workshops on outcome evaluation, so the new program assessment approaches are out there. Honestly I can't say how much support (for the new approaches) we are getting from our managers." Along the same lines, with respect to the relationship between information control and message management, another interviewee from a medium-sized library stated,
The comments we collect from our evaluation are filtered to the staff and the Board. There is a choosing on which ones will go forward. Each manager chooses the ones from the department that will be presented. The Board likes the positive messages.
Further comments from this participant serve to reinforce the concept that the evaluation process is controlled in a manner that negates its overall effectiveness and ability to enact authentic change: "The Board wants to hear positive comments from the public. They are not interested in a balanced view." In this instance, evaluation is primarily a political activity rather than one designed to improve the library's effectiveness.
Given the impediments to cultural change with respect to creating an assessment-friendly environment, a model is needed that provides both contextual supports and guiding pathways for such a change to occur. Building upon current models of organizational change, we propose utilizing elements of both Kotter's (1995) and Schein's models in crafting our approach. Kotter's eight-step process for organizational change is perhaps one of the best known (Figure 5). He focuses on embedding sustainable change within organizational culture by establishing a sense of urgency, empowering action, and consolidating changes.
Kotter provides a starting point for our proposed model. Schein (1990) describes the need to address cultural dynamics through guided evaluation and managed change. His model includes seven steps: (a) unfreeze the present situation, (b) articulate a new direction, (c) fill key positions with new incumbents, (d) systematically reward adoption of new directions, (e) seduce and/or coerce members into adopting new positions, (f) discredit sacred cows and destroy artifacts associated with them, and (g) create new emotionally charged rituals and symbols.
We utilized elements of both Kotter's and Schein's models as reference points in developing our model that specifically addresses the challenges of organizational culture, with respect to introducing embedded outcome-based evaluation in public libraries. Our research respondents identified five key obstacles to embracing a more robust approach to program evaluation: (a) a lack of education, (b) not being able to produce meaningful results and change, (c) lack of inclusion and "Big Picture" relevance, (d) insufficient training in best practices, and (e) librarians who feel that their skill set is inadequate to the task, so they are reluctant to change. To be successful, any new model needs to address all of these obstacles.
A new model, while providing a structure and a context for change, also needs to support an organic approach. That is, given the diverse configurations and complexity of different libraries, with no two being exactly alike, a one-size-fits-all, top-down approach is destined to fail. A more self-directed, introspective approach is required. As noted by Martin and Meyerson (1988) in their analysis of library culture,
Through more intensive observation, through more focused questions, and through involving motivated members of the group in intensive self-analysis, one can seek out and decipher the taken-for-granted underlying, and usually unconscious assumptions that determine thought processes, feelings, and behaviour. (p. 12)
Finally, the current prevalence of NPM philosophy and the ideological imperatives of the neoliberal audit society create external motivators for change in evaluation systems. In the unlikely event that these kinds of pressure subside, there is a distinct possibility that many organizations will revert to their previous patterns because they have not internalized the values of outcome evaluation. The advantage of moving to an outcome evaluation model must transcend the need for external fiscal accountability and address more critical questions of organizational relevancy and evolving public needs and expectations. Dealing with change needs to be inculcated in the organization, or as Farkas (2013) notes, "people will eventually resume their old habits once the urgency has subsided" (p. 14).

The Model: Inculcating Evaluation Within an Existing Organizational Culture
The present model was developed to demonstrate how an organization can progress through a series of steps, moving from output-based performance metrics to an organizationally aligned outcome-based evaluation system, and to highlight the operational and cultural transformations this requires (Table 3). It can also be used as an analytical tool to determine where an organization is situated in terms of its cultural readiness for change. Our model is in keeping with Lindblom's (1959) theory of policy development favoring an incremental approach to institutional change, a "method of successive limited comparisons" (p. 81). It is designed to address concerns that contribute to a commonly held defeatist perception of outcome evaluation in public libraries as the practice of "measuring the unmeasurable" (Train & Elkin, 2001) or, as our research participants put it, as being "too hard." Kramer (2009) calls the practice of small exposures to assessment over time "building assessment anti-venom." Incremental change minimizes the perils of culture shock and catastrophic system failure that sudden, wholesale structural change can bring about.
We believe that our model will serve as a roadmap to assist organizations in moving successfully through a series of manageable steps by infusing a culture of outcome evaluation within the organization. Once an organization has achieved the final stage, evaluation then serves as an accepted and valued tool for (a) identifying operational issues and challenges, (b) realizing organizational priorities, and (c) educating stakeholders and funders about the range of possible quantitative and qualitative program and service impacts. As an analytical tool to assist with organizational culture change, transforming the workplace from mere grudging acceptance of evaluation to enthusiastically embracing it, this model addresses two objectives:
1. Situating the current place of a library within the model's five stages, leading to an awareness of the cultural context of the organization and subsequent amount of work needed to create robust change (self-awareness);
2. Plotting a course of action with embedded feedback loops leading to the emergence of a true culture of assessment (action plan).
This model for moving organizations toward full engagement in a culture of outcome evaluation and assessment is depicted in the above matrix (Table 3). There are five stages, each representing a level of institutional progression, awareness, acceptance and understanding, and these can be seen as steps toward evaluation enlightenment. Our experience and research suggest that the majority of libraries are either at Stage 0, or that at least some of their current practices are reflected in the dimensions of this stage. Kotter (2008) argues that getting buy-in is not enough because it only engages the head, not the heart. Moving through the stages is akin to Kotter's Step 6, creating short-term wins, building staff confidence, and utilizing an incremental approach. The model is designed to address many of the obstacles that inhibit organizational change, the most notable being subcultures. Subcultures can be seen as the greatest inculcator of organizational values (Schein, 1990) and as such are primary contributors to staff perception of the purpose and value of evaluations, including the view of assessment as "busy work." Ours is an inclusionary model. It goes beyond the shallow engagement of staff typical in top-down approaches mandated by senior management, instead encouraging full staff participation through consultation and training, so that staff take ownership of the process. The processes underlying each stage of the model are detailed below, demonstrating how it specifically addresses Schein's three components of organizational culture: artifacts, espoused values and beliefs, and underlying assumptions. As an inclusionary model there is a need for all institutional members to take a full participatory role in the process, as Farkas (2013) observes:
At many institutions, those tasked with building a culture of assessment are not administrators and do not have the ability to initiate such a system-wide change. The library administrator(s) may be supportive of building a culture of assessment, but the task of creating it is frequently delegated. (p. 17)
The challenge to instilling an authentic climate of evaluation rests with institutional leadership and its ability to inspire, or better yet fully participate in this activity.

The Model's Six Dimensions
The model is designed to be flexible and adaptable in its implementation, highly responsive to local needs and context. It is not a one-size-fits-all model. This is evident in the design and range of each of the model's dimensions:
1. Purpose: Steps in the Purpose continuum capture the rationale driving the evaluation work. For example, the Stage 1 rationale is listed as justification. At this stage participants are likely to be only reluctantly engaged in evaluation work, viewing this activity as part of the greater neoliberal audit society. They see themselves as unwilling participants in a command and control culture from which they will quickly disengage if the opportunity arises.
2. Motivation: Steps in the Motivation continuum reflect both the institution and the institutional players' impetus for engaging in new approaches to assessment. For example, the Stage 1 motivation is fear and survival. Organizations at this stage adopt new models of evaluation in a more or less cynical attempt to appease funders' expectation (real or perceived), to ensure institutional survival, rather than out of a genuine desire to improve organizational functioning and success.
3. Organizational Impact (tactical/strategic): Steps in the Organizational Impact (tactical/strategic) continuum differentiate between short-term tactical acquiescence to evaluation, and long-term strategic acceptance, illustrating the practice as an authentic cultural artifact, rather than a temporary adjustment to a crisis situation.
4. Organizational Impact (internal/external): Steps in the Organizational Impact (internal/external) continuum chart organizational transition from a closed to an open system; becoming more attuned to ensuring that program outcomes reflect the expectations, needs, and wants of the broadest possible constituency, and are not merely reflective of perceived needs of narrow, internally focused organizational imperatives.
5. Inhibitors and Enablers: Steps in the Inhibitors and Enablers continuum are designed to help understand where resistance (inhibitors) to change resides and what the real issues underlying that resistance are. They also assist in identifying champions for change (enablers) within the organization who can help to institutionalize the new approaches (i.e., Kotter's eighth step).
6. Implications: Steps in the Implications continuum help identify the potential consequences and impacts (organizational and psychological) that can be expected in libraries having attained a given stage of development. Both official cultures and subcultures are impacted.

Stage 0: Complacency
The key challenge at this stage is to overcome organizational inertia. Staff pushback is seen as a key cultural obstacle. Hiller et al. (2008) describe the standard cultural environment for libraries at this stage:
In general we found a number of library staff skeptical of quantitative or qualitative data from customers, preferring instead to rely on their own assumptions and past practices to make decisions. The lack of staff competencies in research methodologies and data analysis contributed to this skepticism. (p. 228)
Staff skepticism represents a significant inhibitor to moving beyond the complacency stage. Pfeffer and Sutton (2006) composed the following list of typical answers to the question "what makes it hard to be evidence based?"
• There's too much evidence;
• There's not enough good evidence;
• The evidence doesn't quite apply;
• People are trying to mislead you;
• You are trying to mislead you;
• The side-effects outweigh the cure;
• Stories are more persuasive anyways.
Pfeffer and Sutton's (2006) list is consistent with the rationales stated by our survey and interview participants for not using outcome/impact measures:
• Lack of staff capacity;
• Do not have anyone trained in outcome/impact measurement;
• Too time-consuming/lack of time;
• Not currently considered a priority by our governing body;
• Assuring outcomes fall short on our priority list;
• No formal criteria which is implemented system-wide.
It is interesting to note that although our research was conducted with Ontario public libraries in 2014, whereas Pfeffer and Sutton's (2006) findings come from U.S. academic libraries, there is significant consonance between the two lists. Apparently not much progress has been made in the interim in addressing the lack of meaningful evaluation in the library profession.
To move beyond complacency, library staff need a focused effort from management. Hodges and Hernandez (1999) note that staff agreement with a program or agency's stated vision and/or mission cannot be assumed. Willing consent is a critical element to successfully introduce a new system of outcome-based evaluation, as well as for establishing conditions amenable to cultural change (if necessary). Authentic participation requires a motivational force that inspires staff. In this scenario, the determinants and conditions of the status quo are evident, as identified by Nussbaumer and Merkley (2010), and need to be addressed as a precondition to progressing to Stage 1:
• Due to the "everyone does the same thing" culture and operational model it was impossible to make a change to workflow in one area without it directly impacting other areas - therefore systemic change was necessary.
• Organizational politics were so strong and polarized that they stalled or destroyed the development and implementation of new initiatives.
• The existing structures were so convoluted that the technical services review groups could not explain them and the focus needed to change from a "review" to "building it today." (p. 680)
A final symptom of Stage 0 that must be addressed to advance the organization to Stage 1 deals with locus of control. In addition to the need for a shared vision/mission, shared control is also a prerequisite to successful execution of new institution-wide systems of assessment. The locus of control, also defined as power, is best shared by encouraging decision making and effective action by those staff most affected by it; a top-down approach by management is unlikely to succeed. Distributed decision making enables the introduction of outcome evaluation and, equally (if not more) important, informs the actions taken on the basis of the information obtained.

Stage 1: Justification
At this stage, the need to conduct outcome-based assessment has become evident to key organizational decision and policy makers. It can be seen as an awakening, and at its most basic level the primal organizational imperative, survival, is the catalyst behind its introduction. Organizations at this stage tend to embrace simplistic elements of the NPM approach when planning the form of their new assessment system. Evaluation tends to take on a "Return-On-Investment" (ROI) posture: organizations are driven to demonstrate "worth," their competitive advantage and value propositions vis-à-vis similar competitors, who might be either private sector doppelgangers (e.g., bookstores) or direct competitors for funding in the form of other public services (e.g., police services). This being the case, there is generally no concerted effort taken to connect culture with evaluation in a meaningful or systematic fashion. As noted by Farkas (2013), "With limited time, faculty will look to using assessment tools that require the least investment of time rather than those that will provide the most meaningful data" (p. 22).
An excellent illustration of a Stage 1 justification approach is evident in the Best Value Initiative launched in the United Kingdom in 2000. Developed using NPM tenets, it required all public service agencies, including libraries, to submit a Best Value Performance Plan (BVPP) annually, to "demonstrate that the service it provides is delivered in the most 'economic, efficient and effective' way possible" (Train & Elkin, 2001, p. 296). Evaluation includes the "4Cs" of best value:
• Challenging why/how a service was delivered;
• Comparing performance with other organizations in the private and volunteer sectors;
• Embracing "Fair Competition" as a means of securing efficient and effective service;
• Consulting with local taxpayers, customers and the business community.
Train and Elkin review the implications of BVPPs in the evaluation of library literacy services. The very nature of this evaluation led to staff pushback, given that its design was predicated on proving the value of institutional existence, rather than meeting goals, and measuring outcomes and impacts: in short, making the service better. The rather shallow focus of the Best Value Initiative eliminated any expectation for system-wide buy-in. The best value approach was ideologically driven by the politics of the day and hence was tactical rather than strategic in nature. As such, it was viewed by library staff as a short-term scheme, easily abandoned when the political climate changed.

Stage 2: Self-Awareness
This stage centers on the organization moving away from a position of fear and survival and toward a sense of self-efficacy. This new sense of awareness is typified by a statement from one of our participants from a midsized library:
In terms of our staff perspective on a different type of evaluation; they are frustrated in evaluating programs based solely on the number of participants. Certain programs (we offer) are limited on the number of participants attending, given the current (evaluation's) design and purpose.
Further internal challenges, resource scarcity, and pressures can be impediments to better buy-in and more enthusiastic use of evaluation by staff. In their study of how ARL members use the results of evaluations, Hiller et al. (2008) observe that only a few libraries understood and were able to analyze and present data effectively. As one of our interviewees from a large urban library noted, a perceived lack of practical application of assessment results to decision making is frustrating and slows down acceptance of the new approach: "We have no results to give to staff to show that this (evaluation work) can help them in their work." The same interviewee stated that this lack of applicability has resulted in a "misunderstanding" with "front line staff saying I'm very busy and this is extra work for me." In this case, staff members had been ready and willing to participate in implementing a new approach to assessment. Unfortunately, the library system lacked the ability (whether due to skills or politics) to utilize the data in a manner that would be seen as meaningful to staff, thereby needlessly squandering their goodwill. Initial staff enthusiasm and acceptance of change is precious capital that, once expended without tangible results, is quite challenging to re-accumulate. Once lost, trust is difficult to regain.
In Stage 2, staff cautiously moves toward engagement. A lack of effective practical application of assessment results can act as an inhibitor, stalling or even reversing forward movement of the model. Conversely, effective engagement (a constructive feedback loop involving results-involvement-use of data) can act as a catalyst, propelling the organization on to the next stage.

Stage 3: Alignment
At Stage 3, the evaluation enterprise shifts away from being viewed as a tactical tool used to justify financial expenditures, create one-time budget victories, or realize short-term project benefits. Evaluation becomes a strategic exercise, positioning the organization for long-term success and refocusing human and financial resources in relevant, effective and sustainable ways. In short, it leads toward a system-wide alignment between the organization's vision/mission and its ability to execute the mission and realize its goals. This is an important pivot point in the model. Once Stage 3 is achieved, backsliding into previous ineffective behavior becomes more difficult because the organizational culture has evolved. A self-perpetuating degree of forward momentum, a virtuous cycle, is realized. Hodges and Hernandez (1999), reviewing evaluation as a tool for demonstrating greater relevance in community-service organizations, stress the value of system-wide alignment:
By linking outcome accountability with systems change, the assumption is made that if child-serving organizations know more information about the outcomes or results of the work they are doing, they can use this information to improve upon their work and make systems more responsive to the needs of the children and families they serve. (p. 184)
A research interviewee from a small/medium-sized library described the challenges of having staff readily use outcome evaluation at a strategic level: "staff value conducting evaluation, and they grasp the importance of it. They are grasping what is offered and using data to fix it (programs requiring change)." In stark contrast, the same library's board of directors is not interested in strategic uses of data: "We are not using evaluation results in making policy and financial decisions. It (evaluation) is separate from their (the library board's) agenda. It doesn't fit into their institutional perception." The board's traditional culture and simplistic perception of the evaluation effort has frustrated the staff, stymieing progress into Stage 3. When it is restricted to merely fulfilling the role of providing a set of tactical tools, meaningful evaluation can only go so far in helping libraries achieve greater relevance and sustainability. For authentic organizational alignment to be achieved, it must be seen and utilized as a strategic instrument affecting a greater degree of overall systemic alignment, including an intractable inculcation of shared vision and goals among all institutional stakeholders.

Stage 4: Actualization
Stage 4 involves the complete integration of the assessment system into the organization's culture. Evaluation has been inculcated within the organizational membership as a worthwhile and positive enterprise, becoming a naturalized element in work flows. Strategically, data from outcome evaluations assist in informing and forming priorities and activities, allocating and reallocating resources, and ensuring that the organization remains responsive and relevant to all of its stakeholders, both internal and external:
In a culture of assessment, assessment becomes the norm and a valued part of planning and teaching. New services are planned for with consideration for how they will be assessed. The library does not just collect data; it acts on and learns from the data. (Farkas, 2013, p. 15)

Steps to Moving Forward
The process of aligning evaluation with organizational culture is more than just "tinkering." Nussbaumer and Merkley (2010) state that a focused approach by management to create alignment is essential, and requires a commitment on the part of library administration, not to a rigid plan but to strategies that engage staff in an ongoing dialogue to clarify the vision and to encourage staff to see change as serving both the library's interests and their own self-interest. (p. 686)
There is a need to gauge the level of cultural readiness when new forms of evaluation process are initiated. As Hodges and Hernandez (1999) note, "if managers focus their attention on the cultural processes of their organizations, they might better understand the influence these cultural factors will have on the success of their quality improvement efforts" (p. 185). It is to this end that the present model was created.
Our plan is now to work with our public library partners, utilizing the model in a stock-taking exercise that will gauge each organization's state of outcome evaluation readiness. From there, future research will include working with our partners to coproduce a more prescriptive process that will help move organizations toward a culture of evaluation, as well as developing the tools and performance indicators necessary to conduct outcome evaluation in this environment.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research and/or authorship of this article.