Research article
First published online November 20, 2016

Measuring the Readability of Sustainability Reports: A Corpus-Based Analysis Through Standard Formulae and NLP

Abstract

This study characterises and problematises the language of corporate reporting along region, industry, genre, and content lines by applying readability formulae and more advanced natural language processing (NLP)–based analysis to a manually assembled 2.75-million-word corpus. Readability formulae reveal that, despite its wider readership, sustainability reporting remains a very difficult-to-read genre, sometimes more difficult than financial reporting. Although we find little industry impact on readability, region does prove an important variable, with NLP-based variables more strongly affected than formulae. These results highlight not only the impact of legislative contexts but also language variety itself as an underexplored variable. Finally, the study reveals some of the weaknesses of default readability formulae, which are largely unable to register syntactic variation between the varieties of English in the reports, and demonstrates the merits of NLP in report readability analysis as well as the need for more accessible sustainability reporting.

Introduction

Sustainability reporting, under its various names and in its various forms, has increased considerably in adoption in the 21st century. The 2013 KPMG Survey of Corporate Responsibility Reporting indicated that 51% of companies issuing reports worldwide include corporate responsibility (CR) information in their annual financial reporting, up from 20% in 2011 and a mere 9% in 2008. KPMG sets a low barrier for entry in not specifying a minimum required amount of sustainability-related disclosure (e.g., in pages or words),1 but these numbers still reveal an undeniable trend. The survey reads that “the debate [whether or not to report] is over. [ . . . ] It is now about the quality of CR reporting and the best means to reach relevant audiences.”
Quality reporting delivers relevant, accessible information to interested audiences, and one important facet of this process that scholars have devoted relatively little attention to is these reports’ readability. While sustainability reports have much in common with the notably difficult-to-read genre of corporate financial reporting (Courtis, 1995, 1998; Li, 2008; Stanton & Stanton, 2002), the latter typically addresses a far more specialised readership of investors and analysts likely better equipped to deal with its textual complexity. The audiences that might benefit from corporate sustainability reporting, conversely, are remarkably diverse. Any stakeholder in the company’s operations, be they investor, employee, or member of the community in which the company operates, might have an interest in its social or environmental performance. Not all members of this extended readership, however, will be able to decode a level of textual complexity similar to that of financial reporting. Furthermore, Lehavy, Li, and Merkley (2011) show the importance of producing readable texts, even for financial report writers, demonstrating that investors will rely more heavily on expert analyses as a company’s reporting becomes less readable.
We are aware of only two previous studies into the readability of corporate sustainability reporting, both echoing often-conducted studies into the (poor) readability of corporate financial reporting. Farewell, Fisher, and Daily (2014) indicate low readability for their sample of sustainability reports and voice concerns that “the average customer” will struggle to decode these reports, and they conclude with the plea that “companies should work harder to choose simple language,” which reinforces KPMG’s demand for higher quality reporting.2 Similarly, Abu Bakar and Ameer (2011) find consistently high reading difficulty across a sample of Malaysian corporate social responsibility (CSR) communications, noting that their communications’ readability deteriorates as company performance does.
These findings support the “obfuscation hypothesis” (Courtis, 1998; Rutherford, 2003), which posits that companies will make unfavorable news more difficult to decode. Research on the use of visuals in CSR offers further evidence for obfuscation and impression management. Cho, Michelon, and Patten (2012a, 2012b) found that sustainability reports, just like financial reports, show a preference for graphs that display positive trends while additional graph distortion is used to embellish results. Studies like Hrasky (2012) and Boiral (2013) in their turn illustrate how attractive imagery is used for window-dressing and green-washing in those cases where the impact of sustainability measures is unclear. These reports’ susceptibility to manipulation of their presentation further justifies examining their textual content.
We aim to expand research into corporate sustainability reporting readability in scope, genre and variety, and means of analysis. Our approach differs from previous studies in three key respects. First, we measure the impact of language variety and industry by examining readability across a larger corpus than previous studies used: approximately 2.75 million words in 470 texts, representing five language varieties and four industries. Second, by using a genre-diversified corpus, we aim to compare the readability of sustainability-related content with that of financial content, as represented by CEO letters, and to verify whether the sustainability content takes its extended audience’s accessibility requirements into account. For those purposes, we included the same companies’ chairmen’s letters (the most often-consulted sections of corporate reports according to Courtis, 1998; Clatworthy & Jones, 2003; and others) for both the company’s annual financial and sustainability reports, in addition to sustainability-related disclosures. As collecting or comparing with full financial reports is not viable, we focus mainly on the contrast between content types in CEO letters. Third, we explore how natural language processing (NLP) techniques might supplement the often-employed “shallow” readability formulae with a finer-grained level of analysis. In addition to measuring readability in terms of word and sentence length, NLP tools can quantify, for example, the use of passive structures, the syntactic (and thus, potentially, relational) depth of a given sentence, or lexical density. This study focuses on those specific variables.
This expansion of scope and methodology compared to previous research not only allows us to describe sustainability reporting’s language in greater detail but also examine whether the linguistic facets of prescriptive regulations (e.g., Securities and Exchange Commission 1998), or of the language variety itself (see Precht, 2003a, for instance, on American directness3) aid in characterizing the genre’s complexity, similar to how extent of legal enforcement can aid in predicting extent of earnings management (see Leuz, Nanda, & Wysocki, 2003). Additionally, it opens up avenues for more in-depth analysis compared to the entrenched use of readability formulae, with more nuanced perspectives on reports’ use of language and its impact across borders and language barriers in an increasingly internationally prominent field.
The rest of this article is structured as follows. First, we present a literature review and subsequent hypotheses focusing on three areas: readability, sustainability (reporting), and the determinants of readability in corporate (sustainability) reporting. Second, we discuss the corpus and how we compiled it. Third, we examine the readability of the corpus, test our hypotheses, and measure the impact of a number of variables such as industry and language variety on both formula-based readability and the linguistic features extracted through NLP techniques. Finally, we present avenues for future research along with our conclusions.

Literature Review

Readability

Scholars have fiercely contested the merits and methods of expressing a document’s readability numerically ever since “A New Readability Yardstick,” Rudolf Flesch’s (1948) seminal foray into the field. Edgar Dale and Jeanne Chall (1948), Robert Gunning (1952) and Kincaid, Fishburne, Rogers, and Chissom (1975) devised some of the better-known formulae for readability calculation still employed today. By 1980, the formulae proposed for readability calculation numbered in the hundreds, and that of its advocates and critics in the thousands (DuBay, 2004). However, consensus on how researchers should define readability remains elusive. DuBay (2004), for example, defines it as the text-internal characteristic of “what makes some texts easier to read than others,” opposed to the formal aspect of legibility, “which concerns typeface and layout,” but emphasises “understandability” and “comprehensibility” as concepts crucial to readability in most definitions he cites (e.g., Klare, 1963; McLaughlin, 1969). Conversely, Smith and Taffler (1992) contrast readability with understandability. They describe the former as a set of purely text-internal characteristics that determine difficulty and the latter as the interaction between the text and its reader, with prior knowledge also affecting comprehension.
As the concept of readability remains contested, this study will simply assume that when a text’s features make it easier for the reader to extract desired information it is more readable. Most changes in the text that make it easier for a reader unfamiliar with the genre to gain such information will make that same process easier for the more experienced reader, even though the ultimate extent of their understanding may differ. To offer a reductive example: the Securities and Exchange Commission (1998) promotes active over passive voice in order to facilitate reading; implementing that recommendation might make a fairly accessible text straightforwardly accessible to an expert, or simplify a nonexpert’s experience from very difficult to merely difficult.
We nevertheless need some means of quantifying readability. We consider three common readability indices: the Flesch Reading Ease Score, the Flesch-Kincaid Grade Level score, and the Gunning Fog Index. The Flesch Reading Ease Score (Flesch, 1948) is one of the oldest and still most widely used formulae for computing readability and is thus most suited for the first step of our inquiry, that is, comparing our corpus’ readability with that of other genres. Courtis (1995), for instance, adopts Flesch’s (1949) expansion on his original work of defined “degrees” of readability from 0 to 100, with a range of 0 to 30 signifying the lowest reading ease and incrementing in steps of 10 from there on. Drawing on the same textual variables as the Flesch Reading Ease Score, the Flesch-Kincaid Grade Level (Kincaid et al., 1975) attempts to quantify the years of education that the text requires of the reader. While grade levels are, again, approximations at best, they allow for more intuitive results than the Flesch Reading Ease Score does. The Gunning Fog Index (Gunning, 1952, revised in Bogert, 1985), finally, attempts to distil a grade-level measure similar to the Flesch-Kincaid formula’s, but places a stronger emphasis on the ratio of polysyllabic (“complex”) to mono- or disyllabic words present in the text. Studies into annual report readability often measure it through the Fog Index (e.g., Lehavy et al., 2011; Li, 2008), instead of or in addition to Flesch-based readability. Incorporating the Fog Index into our metrics aids us in comparing our results with those of previous studies into corporate report readability. Table 1 contains the exact formulae we used to calculate these readability measures.
Table 1. Readability Formulae.
| Measure | Formula |
| --- | --- |
| Flesch Reading Ease Score | 206.835 − (1.015 * average sentence length) − (84.6 * average syllables per word) |
| Flesch-Kincaid Grade Level | (0.39 * average sentence length) + (11.8 * average syllables per word) − 15.59 |
| Gunning Fog Index | 0.4 * (average sentence length + percentage of polysyllabic words) |
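For illustration only, the three formulae in Table 1 can be sketched in Python as follows. The sentence splitter and the vowel-group syllable counter are crude heuristics assumed here for demonstration, not the tooling used in this study, and all function names are hypothetical. Words of three or more syllables are treated as polysyllabic, in line with the contrast with mono- and disyllabic words drawn above.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a likely silent final "e" (e.g. "stone"), but keep "-le" ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def formula_scores(avg_sentence_length: float,
                   avg_syllables_per_word: float,
                   pct_polysyllabic: float) -> dict:
    """Apply the three formulae from Table 1 to precomputed averages."""
    return {
        "flesch_reading_ease": 206.835
        - 1.015 * avg_sentence_length
        - 84.6 * avg_syllables_per_word,
        "flesch_kincaid_grade": 0.39 * avg_sentence_length
        + 11.8 * avg_syllables_per_word
        - 15.59,
        "gunning_fog": 0.4 * (avg_sentence_length + pct_polysyllabic),
    }


def readability(text: str) -> dict:
    """Derive the averages from raw text (naive tokenisation) and score them."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = [count_syllables(w) for w in words]
    avg_sentence_length = len(words) / len(sentences)
    avg_syllables_per_word = sum(syllables) / len(words)
    # Polysyllabic: three or more syllables.
    pct_polysyllabic = 100 * sum(1 for s in syllables if s >= 3) / len(words)
    return formula_scores(avg_sentence_length, avg_syllables_per_word,
                          pct_polysyllabic)
```

Separating `formula_scores` from the text handling keeps the arithmetic of Table 1 independent of any particular tokeniser or syllable counter, which is where implementations of these measures tend to diverge.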
While we rely in part on the familiar readability formulae, we must echo the familiar caveat that they are a cost-effective means of estimating readability but do not necessarily allow for a fine-grained or universally applicable analysis of how easy a text is to decode. We will explore how NLP might assist in fine-grained text analysis by quantifying the deeper level linguistic features of lexical density, subordination, parse tree depth, and passivization (see section “Analysis”). Lexical density quantifies the number of content words (e.g., “sustainability” or “company”) relative to the number of grammatical words (e.g., “if,” “but,” “will”). Higher lexical density can lead to higher textual complexity (Halliday, 1989; Harrison & Bakker, 1998) due to a higher conceptual load. We quantify subordination as the average number of subclause-introducing elements per sentence, which serves as a syntactic complexity measure (Beaman, 1984; Dell’Orletta et al., 2014), and parse tree depth as the average number of levels in a sentence parse tree, which can indicate complexity and cognitive load (Coleman, 1964; Dell’Orletta et al., 2014). Finally, we measure the average number of passive structures per sentence, partially because the SEC strongly advises against passives in its Plain English guidelines (Securities and Exchange Commission, 1998). They argue its altered (and sometimes concealed) agency pattern runs counter to how readers typically process information. The above measures will indicate that our definition of readability is purely text-internal. Paratextual features such as font, layout, pictures, and graphs almost certainly affect most readers’ interpretation of these reports, and are also susceptible to obfuscating manipulation (see Cho et al., 2012a, 2012b). 
However, as the technical limitations of our means of analysis demand that we discard this paratext, we do not integrate it into our definition of readability as we explore its position in sustainability reporting.
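The following minimal Python sketch illustrates how three of these text-internal measures could be computed once a text has been tokenised and parsed. It rests on several assumptions: the word lists are short illustrative samples rather than the inventories used in this study, lexical density is expressed as the share of content words among all tokens (one common operationalisation), and parse trees are represented as nested tuples of the form (label, child, ...). Passive detection is omitted because it requires a tagger or parser.

```python
from statistics import mean

# Illustrative samples only; a real analysis would use a tagger or fuller lists.
FUNCTION_WORDS = {
    "a", "an", "the", "if", "but", "and", "or", "will", "would", "can", "may",
    "of", "in", "on", "to", "for", "with", "by", "is", "are", "was", "were",
    "be", "been", "it", "its", "this", "that", "which", "we", "they", "not",
}
SUBORDINATORS = {
    "because", "although", "though", "if", "unless", "while", "whereas",
    "when", "since", "that", "which", "who", "whose",
}


def lexical_density(tokens: list) -> float:
    """Share of content (non-grammatical) words among all tokens."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    content = [t for t in tokens if t.lower() not in FUNCTION_WORDS]
    return len(content) / len(tokens)


def subordination(sentences: list) -> float:
    """Average number of subclause-introducing elements per sentence."""
    return mean(
        sum(1 for t in sentence if t.lower() in SUBORDINATORS)
        for sentence in sentences
    )


def tree_depth(tree) -> int:
    """Number of levels in a parse tree; leaf words (strings) add no level."""
    if isinstance(tree, str):
        return 0
    return 1 + max((tree_depth(child) for child in tree[1:]), default=0)


# Usage with a hand-built toy parse of "We measure readability".
toy_parse = ("S", ("NP", "We"), ("VP", "measure", ("NP", "readability")))
toy_depth = tree_depth(toy_parse)
```

Averaging `tree_depth` over all sentences of a text would give the parse tree depth measure described above; the result depends heavily on the parser and its annotation scheme.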

Sustainability

The United Nations Environment Programme (UNEP; 2013) describes corporate sustainability as “an emerging discipline for which there is currently no universally agreed definition.” Because it is easily as contested a concept as readability is, distilling a single definition lies outside of the scope of a study into the linguistics of its reports. Scholars, producers, and users of these reports employ a plethora of terms, prime among which “sustainability” or “corporate (social) responsibility,” to describe various, often overlapping concepts. Dahlsrud (2008), for instance, counts a nonexhaustive 37 definitions. The World Commission on Environment and Development (1987; “Brundtland Commission”) offered perhaps the most rhetorically salient description of sustainability in its seminal (Hahn & Kühnen, 2013) report Our Common Future: “meet[ing] the needs of the present without compromising the ability of future generations to meet their own needs.” This phrasing still echoes in today’s corporate sustainability reports (e.g., Infineon, 2013).
While companies’ definitions of corporate sustainability can vary greatly, its implementation sees more uniformity due to the transnationally employed Global Reporting Initiative (GRI; 2013a) guidelines’ position as the de facto standard for sustainability reports’ form and content (Temouri & Jones, 2014). These voluntary guidelines offer a set of principles and requirements that companies may choose to subscribe to when issuing sustainability reports and that emphasise, inter alia, multistakeholder engagement, transparency, and materiality (i.e., identifying those issues that are key to a company’s sustainability). Based on these guidelines, companies self-declare their level of compliance with GRI’s principles and requirements.
Additionally, we note that scholars distinguish between two types of CSR based on the impetus behind them: explicit CSR occurs when companies signal their own behavior as engaging in CSR, while implicit CSR occurs when preexisting structures organically impel corporations toward CSR, without their explicitly indicating that commitment. Matten and Moon (2008) and Jackson and Apostolakou (2010) present U.S. and European corporate culture as prototypes of explicit and implicit CSR, respectively (see section “The Impact of Language Variety”); they expect explicit CSR culture to incentivise more active CSR-related stakeholder communication.
As CSR is a mosaic of concepts, a quantitative, broad-scope study such as this one will inevitably struggle to distil them into a single definition. Fortunately, as was the case for readability, this study can cast a wide net: as we study these reports from a linguistic perspective, we can consider “sustainability reporting” a linguistic construct that varies per reporting company. Due to the quantitative nature of the study, we approach the concept as we expect an “average” reader might, without a single fixed notion of what does and does not constitute CSR, but with the expectation that it targets a wider audience than financial reporting. Although we recognise that CSR reporting is an inherently complex and manifold genre, for most purposes, we do not need to define sustainability, CSR, or its related terms beyond how the authors of the reports in the corpus define them, as the company’s individual definition ultimately determines which content makes it into the report — and thereby determines the report’s linguistic features.
Regardless of reporters’ specific interpretations, a flexible definition of CSR enables us to analyze a genre that, like the financial report before it, demands inquiries into its readability because of its meteoric rise in prominence but has received little such attention. As is still the case for the financial report, both producers and readers of sustainability reports stand to benefit from having disclosures readable and accessible to their intended audience. To do otherwise is typically a waste of resources, an attempt at obfuscation, or both. That makes the sustainability report’s wider audience of stakeholders a crucial determinant in how such reporting should be conducted, albeit one that the “Analysis” section’s results suggest may not receive the necessary attention.

Determinants of Corporate (Sustainability) Report Readability

Previous studies into the readability of corporate reporting, typically of the annual financial report (e.g., Courtis, 1995, 1998; Kumar, 2014; Li, 2008; Lehavy et al., 2011), consistently portray it as a far more difficult genre to read than the average text. Jones and Shoemaker (1994) and Courtis (1995) also note the potential impact of the intended audience’s prior knowledge. The latter describes the prose in annual reports as “beyond the fluent comprehension skills of about 90 per cent of the adult population and 40 per cent of the investor population” (p. 5) based on findings derived from readability formulae. Courtis (1995) suggests a variance in sophistication between different investors while implicitly confirming them as a primary intended audience of financial reporting. Financial interests will similarly motivate most other groups interested in these reports, for example, analysts. Although many readers of sustainability reporting will similarly consult financial reports to make informed investment decisions (see, e.g., Carnevale & Mazzuca, 2014), we explore how sustainability reporting also appeals to an extended stakeholder audience with different readability requirements, in addition to investigating the potential impact of language variety and region, as well as company-specific features such as GRI-compliance, industry, and performance.

Language Variety and Region

Most scholars have, however, hitherto neglected the impact of language variety on corporate reporting, or indirectly integrated it as a variable at best. For instance, Leuz et al. (2003) distinguish between three clusters of declining legal enforcement: the United States, the United Kingdom, and Australia belong to the cluster with the highest enforcement, most European countries (save Greece, Portugal, Italy, and Spain) to the second, and the remaining ones, along with India, to the last, which faces the least legal enforcement. As the study finds that clusters with greater legal enforcement exhibit less earnings management, we might similarly expect that the countries in the first cluster will exhibit less textual manipulation, and thus better readability, than those in the clusters with a lesser extent of enforcement. Cho et al. (2012b), also drawing on Leuz et al.’s (2003) framework, find a greater skew toward positive graphs in countries from less-regulated clusters, similarly suggesting manipulation. Language variety is present by proxy in this analysis as we mainly find those countries with English as a sole official language in the first cluster and countries that employ Business English as a Lingua Franca in the second and third, which more linguistically diverse India also occupies.
However, scholars such as Precht (2003b) and Creese (1991) suggest variation between varieties in the same cluster in their application of syntactic and semantic elements such as passivation, impersonalization, and directness. As corporate reports reach ever-increasingly international audiences (Townsend, Bartels, & Renaut, 2010), we wish to examine how textual complexity, expressed both as a “shallow” formula or a set of linguistic features, differs across the five varieties present in our corpus. For instance, a British report might contain more passive structures in order to express itself less directly and maintain an (expected) discursive distance from the British reader, but might, in doing so, strike an American reader as evasive. Sections “Analysis” and “The Impact of Language Variety” demonstrate that, beyond the different clusters of institutional climates that might affect reporting, the different varieties of English also represent different linguistic attitudes that may influence report readability. As such, results may spur future research to examine language variety as an important variable in its own right. Given the above, we want to formulate two hypotheses: one that addresses the impact of legal enforcement on readability and one that addresses possible internal variation within these clusters by acknowledging the impact of language variety:
Hypothesis 1: Reports from clusters with higher legal enforcement will be more readable.
Hypothesis 2: Report readability will differ by region.

Sustainability Reports’ Expanded Audience

We expect sustainability- or CSR-related reporting to appeal to a wider audience than financial-only reporting, as the latter will primarily be relevant for shareholders, but the former can be relevant to a company’s stakeholders in the widest sense (see, e.g., GRI, 2013a). This distinction draws on Sacconi (2004), who presents stakeholders in the strict sense as “those who have an interest at stake because they have made specific investments in the firm (in the form of [human, financial, social, physical or environmental capital, or trust]),” (p. 7) opposing them to the wider-sense stakeholders, whom corporate operations affect positively or negatively even without their direct participation. Sacconi consequently presents CSR as a model of corporate governance in which the company should not just represent the interests of its owner(s), but those of “all its stakeholders (the owners included).” Varda (2014) emphasises two-way stakeholder communication as one of the prime means of ensuring said communication is comprehensible to stakeholders, and we expect companies adhering to this model of CSR to want to include a maximal proportion of this wider sphere of stakeholders in their reporting, be it out of moral obligation, a desire to legitimate corporate operations (Tilling, 2004), reputation management concerns (Bebbington, Larrinaga-González, & Moneva-Abadía, 2008; Sacconi, 2004), or, conversely, to signal intent, commitment, or competitiveness (Varda, 2014).
This sentiment echoes throughout many sustainability reports that acknowledge these “wider-sense” stakeholders, such as the communities in which companies operate, as integral to CSR, and emphasise stakeholders’ positions in these reports’ readership. For instance, mining company Lonmin (2013) states in its first sustainability report after what it refers to as “The Marikana Incident,”4 which led to 46 deaths, that “[t]he decision to produce a printed report was made to ensure accessibility by more stakeholders who may not have access to the report on an online platform.” Total (2013) points out its efforts to submit its CSR report to wider-circle stakeholders such as NGOs or governments local to operations and gather feedback. The Adidas Group’s (2013) sustainability report welcomes its reader with the acknowledgement that “you and many of our consumers and stakeholders have high expectations [ . . . ] when it comes to [our] sustainability efforts.” Not all reports signal their intended readership explicitly, but many integrate the breadth of their CSR efforts into their corporate sustainability narrative and acknowledge engaging with stakeholders beyond shareholders and partners.
These reports’ readers also appear to profile themselves as consumers more than investors in their use of sustainability reporting. Townsend et al.’s (2010) Readers and Reporters [of Sustainability Reporting] Survey indicates that the primary motivation behind readers’ use of sustainability reports is “inform[ing] decisions on use of the organisation’s products/services,” closely followed by “inform[ing] investment/divestment decisions.” While the investment motive remains prominent, sustainability and financial reporting audiences’ motives do appear to diverge. The authors define “readers” rather broadly, as “any stakeholders that have been engaged with an organisation’s reporting output.” The survey indicates the average reader read three reports but the top 5% of readers read between 10 and 20 reports per year; we thus observe that most readers in the sample are nonexpert, possibly casual users of these reports likely to lack the experience of veteran analysts or investors that regularly read financial reports.
Townsend et al.’s (2010) composition also suggests a notable expansion beyond financial reports’ audience. While 48% of the sampled readers were company-internal and 16% were investors, 14% consisted of the company-external value chain, and finally, 22% originated from “Civil Society,” entailing media, labor unions, public institutions, academics and other experts, and concerned citizens and consumers. Given this diversity, we might expect companies to adjust the accessibility of their reports accordingly, for example, by reducing textual complexity to ensure high readability compared with financial reporting, as sections “Stand-Alone Versus Mixed-Content Reports” and “The Expanded Audience” examine in greater detail. Readers have seemingly grown less skeptical as CSR matures: KPMG’s (2013) and Townsend et al.’s (2010) surveys indicate that a shrinking minority of readers sees sustainability reporting as “greenwashing,” that is, insubstantive impression management, and respondents considered corporate accountability the prime motivation behind reporting.
Based on all the above, as well as the few previous studies’ (Abu Bakar & Ameer, 2011; Farewell et al., 2014) findings on the genre’s relatively high reading difficulty, we formulate several hypotheses, two of them closely intertwined:
Hypothesis 3: Sustainability information in stand-alone sustainability reports will be more readable than that in financial reports.
Hypothesis 4a: Corporate sustainability reporting will be more readable than financial reporting.
Hypothesis 4b: Corporate sustainability reports will (nevertheless) be more difficult to read compared to other genres.

Company-Specific Features

Scholars appear more skeptical than readers do: Parsons and McKenna (2005) and Boiral (2013), conversely, signal how language used in sustainability reporting can twist its narrative frames to the company’s advantage,5 and Story and Neves (2015) indicate the risks of alienating readers when they perceive corporate social responsibility initiatives as purely strategic. “[I]f organizations do not engage in CSR they may jeopardise their brand and reputation, which, in turn, could decrease short- and long-term profitability,” (p. 111) potentially endangering the company’s social and environmental license to operate (Deegan, Rankin, & Tobin, 2002). Ideally (but not always), reports should contain a balance between positive and negative news as proposed in, for example, the GRI guidelines’ (GRI, 2013a) tenet of “balance.” The section “Introduction” already anticipated the risk of too positive a perspective as often problematic in both textual and nontextual content, and based on the obfuscation hypothesis, we might expect that companies with worse performance will issue less readable reports. Conversely, based on signaling theory, which posits that companies will want to indicate their execution of CSR processes in order to indicate underlying qualities and build corporate reputation (Varda, 2014), we see incentive for companies with better results to disseminate those results more efficiently and, potentially, vice versa. GRI compliance might offer a twofold impact here, creating better sustainability processes as well as better anticipating stakeholders’ needs through an enhanced materiality process.
Another company-specific feature likely to affect the reporting process is the industry the company operates in. Farewell et al. (2014) cite this as an avenue for research, expecting different industries to address the different relevant issues through different structures, as reporting complexity can vary across industries. For instance, many extractive industries have inherently nonrenewable production processes, which will typically require addressing in a sustainability report and render the company’s environmental performance a sensitive issue. Similarly, companies with a history of worker rights issues might face additional scrutiny or legislation in these areas, such as is the case with the U.S. semiconductor industry that requires companies that file with the SEC to disclose and describe their use of conflict minerals (Securities and Exchange Commission, 2014).
Hypothesis 5: GRI-compliant reports will be more readable than noncompliant reports.
Hypothesis 6: Reports from better performing companies will be more readable than those from poorer performing companies.
Hypothesis 7: Report readability will differ by industry.

Methodology

Corpus

Our corpus consists of 470 texts, split along three primary axes: region, industry, and (sub-)genre. These texts represent approximately 3.95 million tokens, 2.75 million of which are running plain text usable for NLP purposes.6 The five regions present in the corpus are the United States, the United Kingdom, (non-U.K.) Europe, Australia, and India. The four industries are mining and metals, oil, semiconductors, and apparel. The three subgenres present are the sustainability report and two types of CEO or chairman’s letter or address, one type oriented toward financial performance and the other concerning sustainability. We further subdivide the sustainability report between stand-alone sustainability reports and sustainability chapters from financial reports. Table 2 shows the number of texts per region and per industry for each of the genres, and Table 3 shows the number of unique companies per category.
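Table 2. Corpus Composition.
| Genre and region | Mining | Oil | Semiconductors | Apparel | Grand total |
| --- | --- | --- | --- | --- | --- |
| Fin.-oriented CEO letter(s) | 95 | 82 | 30 | 12 | 219 |
|   USA | 11 | 35 | 22 | 4 | 72 |
|   UK | 18 | 11 | 2 | 0 | 31 |
|   Europe | 17 | 15 | 5 | 8 | 45 |
|   Australia | 44 | 18 | 1 | 0 | 63 |
|   India | 5 | 3 | 0 | 0 | 8 |
| Sust.-oriented CEO letter(s) | 38 | 35 | 12 | 3 | 88 |
|   USA | 4 | 14 | 8 | 2 | 28 |
|   UK | 14 | 7 | 1 | 0 | 22 |
|   Europe | 9 | 8 | 3 | 1 | 21 |
|   Australia | 8 | 5 | 0 | 0 | 13 |
|   India | 3 | 1 | 0 | 0 | 4 |
| Sustainability report | 78 | 59 | 16 | 10 | 163 |
|   USA | 9 | 18 | 10 | 2 | 39 |
|   UK | 18 | 11 | 2 | 0 | 31 |
|   Europe | 17 | 16 | 4 | 8 | 45 |
|   Australia | 29 | 11 | 0 | 0 | 40 |
|   India | 5 | 3 | 0 | 0 | 8 |
| Totals | | | | | |
|   USA count | 24 | 67 | 40 | 8 | 139 |
|   UK count | 50 | 29 | 5 | 0 | 84 |
|   Europe count | 43 | 39 | 12 | 17 | 111 |
|   Australia count | 81 | 34 | 1 | 0 | 116 |
|   India count | 13 | 7 | 0 | 0 | 20 |
| Grand total | 211 | 176 | 58 | 25 | 470 |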
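Table 3. Corpus Composition by Unique Company.
| Region and industry | CEO letter(s) | Sustainability letter | Sustainability report | Grand total |
| --- | --- | --- | --- | --- |
| Australia | 63 | 13 | 40 | 64 |
|   Mining | 44 | 8 | 29 | 45 |
|   Oil | 18 | 5 | 11 | 18 |
|   Semiconductors | 1 | | | 1 |
| Europe | 45 | 21 | 45 | 48 |
|   Apparel | 8 | 1 | 8 | 9 |
|   Mining | 17 | 9 | 17 | 17 |
|   Oil | 15 | 8 | 16 | 16 |
|   Semiconductors | 5 | 3 | 4 | 6 |
| India | 8 | 4 | 8 | 8 |
|   Mining | 5 | 3 | 5 | 5 |
|   Oil | 3 | 1 | 3 | 3 |
| UK | 31 | 22 | 31 | 31 |
|   Mining | 18 | 14 | 18 | 18 |
|   Oil | 11 | 7 | 11 | 11 |
|   Semiconductors | 2 | 1 | 2 | 2 |
| USA | 72 | 28 | 39 | 78 |
|   Apparel | 4 | 2 | 2 | 4 |
|   Mining | 11 | 4 | 9 | 13 |
|   Oil | 35 | 14 | 18 | 37 |
|   Semiconductors | 22 | 8 | 10 | 24 |
| Grand total | 219 | 88 | 163 | 229 |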
We selected these four industries in order to achieve a diversified selection of potentially environmentally sensitive (cf. Cho & Patten, 2007) companies in the cases of Mining/Metals and Oil, and potentially socially sensitive companies in the cases of Semiconductors and Apparel. We assert the latter two’s sensitivity to social issues on the basis of the use of conflict minerals (Bafilemba, Mueller, & Lezhnev, 2014) and recent controversies around worker rights (e.g., the Savar building collapse; BBC, 2013), respectively. We selected five varieties of English to achieve a balance between some of the most prominent ones used in business today. The three (sub-)genres, finally, isolate the most often-read component of corporate reporting (Courtis, 1998), the CEO letter,7 for both types of content. The separation from the rest of the sustainability report proper serves to reveal differences between this most-consulted section and more in-depth reporting narrative.
However, several of these cross-sections lack entries altogether, and the totals indicate a skew toward the extractive industries (mining and oil). While corpus balance and representativeness may seem suboptimal based on these numbers, the data at least partially satisfy the representativeness criterion as this is the complete set of texts available based on the data collection process set out in the next section.

Data Collection and Processing

In addition to a wide selection of texts, this study needed both financial (to control for its impact) and nonfinancial (to investigate its impact in future studies) performance data. As no corpus was readily available, we assembled our own, based on Thomson Reuters Datastream, which offers various aggregate performance scores potentially relevant to sustainability reporting through its ASSET4 database8 (Thomson Reuters, 2013). Selecting companies included in Datastream also ensures that we select prominent, visible companies with strong incentive to perform well at CSR, lending additional relevance to results. We selected companies with information available based on the “ASSET4 Template,” and adopted ASSET4’s industry and country divisions and collected available texts for all listed companies within “Mining/Metals,” “Oil,” “Textiles/Apparel,” and “Semiconductors” for the United States, the United Kingdom, Australia, India, and all non-U.K. European countries.
We downloaded, where available, the annual financial and sustainability reports for fiscal 2012⁹ from the corporate website of all companies included in ASSET4 that met our criteria. When available, we integrated the entirety of the company’s separately published sustainability report into the corpus, only separating CEO letters into a separate document. As they address the reader directly through an idiosyncratic rhetoric, we consider them a separate subgenre.
When the company did not issue a separate sustainability report, we extracted any chapters with sustainability, CSR, or ESG-related keywords such as “health and safety” or “community engagement” present in the heading from a financial annual report or (self-declared) integrated report. We classified these as the company’s (de facto) sustainability report. We are unable to separate integrated reports as a genre; as the GRI (2013b) indicates, no universal standard for integrated reports was available for fiscal 2012. More recently, Wee et al. (2016) indicate that even with the International Integrated Reporting Council’s <IR> framework available, <IR> is still in its infancy and remains difficult to delineate.
Finally, the majority of companies issued a CEO letter or chairman’s address focusing on the company’s financial performance in the annual financial report. For a fair number of companies in our corpus, this is the only text type available out of the three.
We used ABBYY FineReader software for conversion from the report PDFs to plaintext usable for our analysis. During this conversion, we manually tagged any numerical or mixed-content tables encountered in order to be able to remove them and pass only running text along to the NLP toolset. A regular expression10 stripped away paragraphs that did not adhere to sentence case and end in a period, colon, semicolon, question mark, or exclamation mark. This extracted running text and purely textual enumerations, removing, for example, headings, subheadings, and tables. Finally, we normalised case so that words starting with a lowercase character changed entirely to lowercase, mitigating OCR casing errors. This process discarded about 1.2 million tokens, conserving 2.75 million out of an initial 3.95 million. To ensure sufficient remaining text length, we eliminated texts less than 200 words long after cleaning from our analysis, and omitted four outliers (Fog score > 27) where a document mainly consisted of enumerations without sentence boundaries. This issue substantially inflated the readability scoring relative to the text’s complexity, prompting the omission.
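The cleaning steps described in this paragraph can be sketched roughly as follows. This is a minimal illustration only: the original regular expression is not reproduced in this section, so the pattern `RUNNING_TEXT`, the helper names `normalise_case` and `clean_text`, and the rule of splitting paragraphs on blank lines are all assumptions made for the example.

```python
import re
from typing import Optional

# Assumed pattern (not the authors' expression): a paragraph is running text if
# it starts with an uppercase letter (sentence case) and ends in a period,
# colon, semicolon, question mark, or exclamation mark.
RUNNING_TEXT = re.compile(r"^[A-Z].*[.:;?!]$", re.DOTALL)


def normalise_case(paragraph: str) -> str:
    """Lowercase every word that starts with a lowercase character."""
    return re.sub(r"\b[a-z]\w*", lambda m: m.group(0).lower(), paragraph)


def clean_text(raw: str, min_words: int = 200) -> Optional[str]:
    """Keep running-text paragraphs; return None if too little text remains."""
    # Assumption: paragraphs are separated by blank lines in the plaintext.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", raw)]
    kept = [normalise_case(p) for p in paragraphs if RUNNING_TEXT.match(p)]
    text = "\n\n".join(kept)
    # Texts shorter than the minimum length after cleaning are excluded.
    return text if len(text.split()) >= min_words else None
```

Under these assumptions, a heading such as "Data Collection and Processing" is dropped because it lacks terminal punctuation, while an OCR casing error such as "rePorts" becomes "reports". A pattern this simple would also discard paragraphs that open with a quotation mark or a digit, so a production version would need a more permissive opening class.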
As conducting deep-level syntactic analysis requires a parsed corpus, and manually parsing a multi-million-word corpus is not viable, we rely on NLP technology capable of automatically analyzing text. NLP tools are significantly more technically demanding to implement than readability formulae, but should also allow for finer-grained analysis. After extracting the running text, we analyzed each of the trimmed files using the Stanford CoreNLP suite (Manning et al., 2014), which automatically annotates its input for part-of-speech, presence of named entities, syntactic structure, coreference, and other linguistic features. This allows us to quantify key linguistic aspects of our corpus not just in terms of readability formulae but also deeper level syntactic parameters such as the average depth (in levels) of the parse tree, the number of passive structures per sentence, or use of various types of connectives.
The parse tree in Figure 1 (generated by Stanford NLP Group 2015) visualises the levels of syntactic depth, with elements to the right deeper in the tree, whereas traditional formulae would only measure sentence and word length. For example, the parse defines “is of such good quality that we have official approval [ . . . ]” as the highest-level verb phrase (VP) in the sentence, with “SBAR” denoting subordination (an embedded sentence) introduced by the element “that.” We can count 13 levels to the parse tree. We conducted similar analyses on every sentence in our corpus using the Stanford CoreNLP suite (Manning et al., 2014), which iterates various task-specific annotators, such as a part-of-speech tagger or syntactic parser, over a given piece of text. As the toolkit integrates a staggering number of automatic annotators, we abstain from further technical description of their inner workings; the Stanford NLP Group (2016) offers a detailed description for each of them. This study’s primary concern is examining how the above shallow and deeper level readability measures, as made computable by CoreNLP, differ between (sub-)genre, industry, language variety, and economic performance.
Figure 1. Visualization of the sentence “In fact, much of the water used in production is of such good quality that we have official approval from relevant authorities to discharge it directly into rivers” (Infineon, 2013) as parsed by the Stanford NLP framework.
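To illustrate how a depth measure can be derived from parser output, the sketch below counts bracket nesting in a Penn Treebank style bracketed parse, the plain-text format in which constituency parsers can print their trees. The counting convention (the ROOT node and each part-of-speech node count as one level) and the function names are assumptions; this section does not specify the exact convention used in the study, so absolute values from this sketch may differ from those reported.

```python
from typing import Sequence


def parse_depth(bracketed: str) -> int:
    """Maximum bracket nesting in a bracketed constituency parse.

    Literal parentheses in the text are escaped as -LRB-/-RRB- in Penn
    Treebank style output, so every "(" here opens a tree node.
    """
    depth = 0
    max_depth = 0
    for ch in bracketed:
        if ch == "(":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == ")":
            depth -= 1
    return max_depth


def mean_parse_depth(parses: Sequence[str]) -> float:
    """Average depth over all sentences of one text."""
    depths = [parse_depth(p) for p in parses]
    return sum(depths) / len(depths)


# Hand-written example parse, not taken from the corpus.
example = "(ROOT (S (NP (PRP We)) (VP (VBP have) (NP (JJ official) (NN approval))) (. .)))"
print(parse_depth(example))
```

For the hand-written example, the deepest nodes are the adjective and noun inside the object noun phrase (ROOT, S, VP, NP, then the part-of-speech node), so the function returns 5.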

Analysis

While exploring our hypotheses, we will first characterise the corpus in terms of formula-based “shallow” readability, and then explore the added value of linguistic features over readability formulae. As Datastream only provided GRI compliance information for a part of the corpus, we will analyze the effects of GRI compliance after the primary analysis. This section only describes the linear models we assessed per dependent variable. The “Results” section investigates the results of these analyses in light of the hypotheses, and the “Discussion” section discusses the implications of our findings.
We used SPSS Version 23 to conduct the analyses. These consist, unless noted otherwise, of univariate general linear models investigating the impact of region, industry, whether a report originates from a financial report, and the company’s economic performance on a single measure of or proxy for readability, with post hoc analyses applied where appropriate. A single data point represents a single text from the corpus. Although we primarily sought to control for each company’s aggregate economic performance throughout the analysis, incorporating it as a covariate also allows us to investigate the extent of obfuscation in the corpus. We conducted most analyses per genre unless comparing the three directly. In the latter case, we compare only those cases where all three texts are available. We set the α level at .05 and apply Bonferroni correction to post hoc analyses. We test the assumptions for ANCOVA, applying the Kolmogorov-Smirnov and Shapiro-Wilk tests and visually inspecting the Q-Q plots for the dependent variables as well as the homoscedasticity of residuals. Save for cases where only the Shapiro-Wilk test, which is (too) sensitive at larger sample sizes, indicates a violation, we discuss any violations for the dependent variables. Tables 4, 5, and 6 summarise results for sustainability reports, financial CEO letters, and CEO letters from sustainability reports, respectively.11 Table 7 compares the three (sub-)genres.
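As a rough illustration of what the Bonferroni correction does to post hoc contrasts, the sketch below multiplies each unadjusted p value by the number of comparisons and caps the result at 1. The function name and the p values are hypothetical; the study itself ran these comparisons in SPSS, not in Python.

```python
from itertools import combinations

REGIONS = ["Australia", "Europe", "India", "UK", "USA"]


def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p, significant?) for each comparison.

    Adjusted p = min(1, p * m), where m is the number of comparisons.
    """
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]


# Five regions yield ten pairwise contrasts.
pairs = list(combinations(REGIONS, 2))
print(len(pairs))

# Hypothetical unadjusted p values for three contrasts.
print(bonferroni([0.004, 0.02, 0.3]))
```

With ten pairwise contrasts between five regions, an unadjusted p value must fall below .005 to remain significant at α = .05, which is why a significant omnibus effect for region can be accompanied by no significant pairwise contrast.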
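Table 4. Analysis for Sustainability Reports.
| Sustainability reports | Flesch score M | Flesch score SD | Kincaid score M | Kincaid score SD | Fog score M | Fog score SD | Lexical density M | Lexical density SD | Subordinators/sentence M | Subordinators/sentence SD | Parse tree depth M | Parse tree depth SD | Passives/sentence M | Passives/sentence SD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Regions | | | | | | | | | | | | | | |
|   Australia | 12.35 | 7.40 | 17.44 | 1.45 | 22.04 | 1.79 | 0.640 | 0.023 | 0.449 | 0.150 | 10.635 | 0.978 | 0.347 | 0.104 |
|   Europe | 19.36 | 5.53 | 16.38 | 1.50 | 20.73 | 1.65 | 0.629 | 0.017 | 0.420 | 0.106 | 10.307 | 0.810 | 0.295 | 0.082 |
|   India | 15.04 | 2.90 | 16.46 | 0.62 | 20.86 | 0.74 | 0.657 | 0.016 | 0.267 | 0.086 | 9.874 | 0.497 | 0.301 | 0.063 |
|   UK | 18.34 | 4.92 | 16.43 | 1.06 | 20.96 | 1.22 | 0.625 | 0.015 | 0.486 | 0.092 | 10.563 | 0.667 | 0.293 | 0.081 |
|   USA | 17.05 | 6.24 | 16.39 | 1.31 | 20.73 | 1.35 | 0.649 | 0.016 | 0.464 | 0.138 | 10.139 | 0.667 | 0.202 | 0.061 |
| Industries | | | | | | | | | | | | | | |
|   Apparel | 21.79 | 10.06 | 16.08 | 1.34 | 20.17 | 1.48 | 0.625 | 0.016 | 0.441 | 0.120 | 10.409 | 0.870 | 0.292 | 0.045 |
|   Mining | 15.56 | 7.59 | 16.83 | 1.63 | 21.35 | 1.87 | 0.638 | 0.022 | 0.434 | 0.137 | 10.375 | 0.868 | 0.308 | 0.096 |
|   Oil | 17.60 | 4.95 | 16.56 | 1.11 | 21.01 | 1.21 | 0.636 | 0.020 | 0.454 | 0.134 | 10.379 | 0.736 | 0.275 | 0.095 |
|   Semiconductors | 16.89 | 4.37 | 16.39 | 0.89 | 20.64 | 0.90 | 0.641 | 0.016 | 0.445 | 0.097 | 10.295 | 0.768 | 0.233 | 0.084 |
| Report type | | | | | | | | | | | | | | |
|   Stand-alone | 16.86 | 5.05 | 16.60 | 1.21 | 20.98 | 1.37 | 0.636 | 0.017 | 0.444 | 0.103 | 10.332 | 0.700 | 0.279 | 0.086 |
|   Mixed-content | 16.64 | 8.04 | 16.71 | 1.60 | 21.24 | 1.83 | 0.638 | 0.024 | 0.440 | 0.160 | 10.419 | 0.932 | 0.291 | 0.107 |
| M | 16.76 | 6.50 | 16.65 | 1.39 | 21.09 | 1.58 | 0.637 | 0.020 | 0.442 | 0.130 | 10.370 | 0.807 | 0.285 | 0.096 |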
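Table 5. Analysis for Financial CEO Letters.
| Financial CEO letters | Flesch score M | Flesch score SD | Kincaid score M | Kincaid score SD | Fog score M | Fog score SD | Lexical density M | Lexical density SD | Subordinators/sentence M | Subordinators/sentence SD | Parse tree depth M | Parse tree depth SD | Passives/sentence M | Passives/sentence SD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Regions | | | | | | | | | | | | | | |
|   Australia | 25.16 | 6.96 | 16.34 | 1.80 | 20.78 | 2.07 | 0.622 | 0.021 | 0.547 | 0.213 | 11.463 | 1.299 | 0.244 | 0.087 |
|   Europe | 26.45 | 10.21 | 15.67 | 2.41 | 19.83 | 2.70 | 0.610 | 0.027 | 0.537 | 0.240 | 10.869 | 1.420 | 0.208 | 0.092 |
|   India | 28.39 | 6.53 | 14.74 | 1.68 | 19.18 | 1.84 | 0.638 | 0.023 | 0.296 | 0.123 | 9.957 | 0.943 | 0.203 | 0.087 |
|   UK | 27.86 | 6.45 | 15.67 | 1.58 | 20.12 | 1.69 | 0.614 | 0.019 | 0.537 | 0.155 | 11.180 | 0.862 | 0.225 | 0.082 |
|   USA | 25.69 | 8.70 | 15.43 | 1.86 | 19.72 | 2.04 | 0.637 | 0.022 | 0.491 | 0.169 | 10.499 | 1.004 | 0.167 | 0.068 |
| Industries | | | | | | | | | | | | | | |
|   Apparel | 30.39 | 11.00 | 14.93 | 1.86 | 18.94 | 3.01 | 0.607 | 0.029 | 0.578 | 0.276 | 10.721 | 1.808 | 0.208 | 0.090 |
|   Mining | 25.98 | 7.06 | 15.93 | 1.87 | 20.30 | 2.11 | 0.620 | 0.023 | 0.535 | 0.210 | 11.106 | 1.294 | 0.221 | 0.087 |
|   Oil | 26.60 | 8.41 | 15.65 | 1.94 | 20.04 | 2.15 | 0.627 | 0.024 | 0.484 | 0.174 | 10.878 | 1.329 | 0.201 | 0.086 |
|   Semiconductors | 23.33 | 9.31 | 15.81 | 1.96 | 20.01 | 2.12 | 0.636 | 0.025 | 0.513 | 0.198 | 10.607 | 1.033 | 0.179 | 0.082 |
| M | 26.09 | 8.22 | 15.75 | 1.96 | 20.09 | 2.19 | 0.624 | 0.025 | 0.516 | 0.200 | 10.933 | 1.242 | 0.207 | 0.087 |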
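Table 6. Analysis for CEO Letters From Sustainability Reports.
| Sustainability CEO letters | Flesch score M | Flesch score SD | Kincaid score M | Kincaid score SD | Fog score M | Fog score SD | Lexical density M | Lexical density SD | Subordinators/sentence M | Subordinators/sentence SD | Parse tree depth M | Parse tree depth SD | Passives/sentence M | Passives/sentence SD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Regions | | | | | | | | | | | | | | |
|   Australia | 19.08 | 6.62 | 16.62 | 1.62 | 20.98 | 1.77 | 0.613 | 0.023 | 0.580 | 0.185 | 11.289 | 1.352 | 0.201 | 0.113 |
|   Europe | 24.33 | 9.70 | 15.54 | 1.95 | 19.74 | 2.15 | 0.604 | 0.026 | 0.597 | 0.253 | 10.457 | 0.811 | 0.203 | 0.058 |
|   India | 20.29 | 8.33 | 15.70 | 1.39 | 19.44 | 2.15 | 0.616 | 0.036 | 0.399 | 0.188 | 10.743 | 0.595 | 0.126 | 0.069 |
|   UK | 22.37 | 7.20 | 15.94 | 1.59 | 20.20 | 1.91 | 0.590 | 0.026 | 0.676 | 0.326 | 11.213 | 1.498 | 0.204 | 0.132 |
|   USA | 16.27 | 9.32 | 16.77 | 2.20 | 21.12 | 2.40 | 0.613 | 0.029 | 0.620 | 0.278 | 10.794 | 1.221 | 0.137 | 0.060 |
| Industries | | | | | | | | | | | | | | |
|   Apparel | 27.15 | 8.51 | 15.27 | 1.07 | 18.62 | 1.31 | 0.579 | 0.033 | 0.908 | 0.391 | 11.271 | 0.080 | 0.203 | 0.066 |
|   Mining | 19.76 | 8.06 | 16.39 | 1.65 | 20.61 | 1.86 | 0.602 | 0.023 | 0.653 | 0.294 | 11.174 | 1.381 | 0.203 | 0.120 |
|   Oil | 19.71 | 9.37 | 16.24 | 1.86 | 20.77 | 2.02 | 0.605 | 0.030 | 0.562 | 0.202 | 10.686 | 0.873 | 0.161 | 0.072 |
|   Semiconductors | 21.21 | 10.53 | 15.85 | 2.89 | 19.71 | 3.12 | 0.623 | 0.030 | 0.567 | 0.317 | 10.550 | 1.653 | 0.141 | 0.058 |
| M | 20.21 | 8.94 | 16.21 | 1.92 | 20.48 | 2.15 | 0.605 | 0.028 | 0.613 | 0.271 | 10.890 | 1.233 | 0.177 | 0.095 |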
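Table 7. Per Genre Analysis.
| Full corpus | Flesch score M | Flesch score SD | Kincaid score M | Kincaid score SD | Fog score M | Fog score SD | Lexical density M | Lexical density SD | Subordinators/sentence M | Subordinators/sentence SD | Parse tree depth M | Parse tree depth SD | Passives/sentence M | Passives/sentence SD |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Genres | | | | | | | | | | | | | | |
|   Financial CEO letter | 26.49 | 7.72 | 15.48 | 1.92 | 19.80 | 2.12 | 0.6228 | 0.022 | 0.487 | 0.169 | 10.740 | 1.092 | 0.191 | 0.075 |
|   Sustainability-related letter | 20.52 | 9.02 | 16.14 | 1.93 | 20.40 | 2.16 | 0.605 | 0.029 | 0.610 | 0.264 | 10.878 | 1.227 | 0.182 | 0.098 |
|   Sustainability report | 17.32 | 5.07 | 16.47 | 1.10 | 20.86 | 1.24 | 0.622 | 0.017 | 0.446 | 0.099 | 10.351 | 0.618 | 0.270 | 0.087 |
| M | 21.43 | 8.33 | 16.28 | 2.11 | 20.36 | 1.93 | 0.622 | 0.027 | 0.514 | 0.201 | 10.654 | 1.031 | 0.215 | 0.095 |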
Note. This table only considers the 78 companies with all three (sub-)genres available without outlier data in at least one (sub-)genre.

Readability Formulae

Flesch Score

The sustainability reports’ mean Flesch score of 16.76 (SD = 6.5) corroborates the few previous studies’ finding that sustainability reports are a very difficult-to-read genre according to formula-based analysis, as they firmly occupy the most difficult band (0-30) that the Flesch Reading Ease Score accounts for. CEO letters, both financial and otherwise, show similar results, albeit with a slightly higher overall readability of 26.09 (SD = 8.22) and 20.21 (SD = 8.94), respectively. This indicates that financial CEO letters are most readable overall, with the mean approaching the Flesch score’s upper boundary for “very difficult” material. In a by-genre analysis (R2 [adjusted] = .203 [.196]), CEO letters from financial reports are significantly (p < .001) more readable than those from sustainability reports, which are in turn significantly (p = .042) more readable than sustainability reports proper.
The model for sustainability reports (R2 [adjusted] = .290 [.206]) shows a significant association for region (p = .019) as well as the interaction between region and industry (p = .017). The mean score for the Australian mining sector (10.86) is considerably lower than the next lowest of U.S. semiconductors (15.69). The model for financial CEO letters shows no significant associations, while that for sustainability-related CEO letters (R2 [adjusted] = .219 [.135]) had a significant (p = .005) association for region, with the score for U.S. letters significantly higher than both U.K. and European letters.

Flesch-Kincaid Grade Level

The mean Flesch-Kincaid Grade Level of 16.65 (SD = 1.39) for sustainability reports again corroborates previous studies’ findings, estimating the required number of years of schooling for optimal understanding at approximately 4 years of higher education, that is, graduate level. While these are of course only estimates, this score is higher than that of many other genres. Within the by-genre model (R2 [adjusted] = .051 [.043]), CEO letters from financial reports are significantly (p < .033) more readable than sustainability content.
Models for neither sustainability genre showed significance for independent variables, but the model for CEO letters from financial reports did (R2 [adjusted] = .198 [.100]). The region variable, as well as its interactions with industry and economic performance, show significant association with the Flesch-Kincaid score. Given this plurality of interactions, this outcome is difficult to interpret, but might be meaningful for attempts to predict this score in future studies. Finally, the Kolmogorov-Smirnov test narrowly (p = .048) indicates nonnormality, but the Shapiro-Wilk test (p = .056) and visual inspection contradict this.

Gunning Fog Score

The mean Gunning Fog Score of 21.09 (SD = 1.58) for sustainability reporting aligns with the previous two scores’ outcomes, albeit likely inflated due to the Fog score’s greater weight on polysyllabic words, as it estimates readability requirements at 9 years of education past secondary education. Per-genre analysis (R2 [adjusted] = .046 [.037]) indicates that CEO letters from financial reports score significantly lower (p = .004), and are thus more readable, on the Fog score (cf. Table 7).
Only the model for CEO letters from sustainability reports (R2 [adjusted] = .337 [.190]) shows significant independent variables in region, industry, and the interaction between them, roughly consistent with sustainability reports’ results for the Flesch score. The Australian mining sector is again among the least readable, joined by U.S. oil and mining and European semiconductors. The U.K. semiconductor industry is most readable by a wide margin. Although both Kolmogorov-Smirnov and Shapiro-Wilk tests indicate nonnormality, visual inspection of the QQ- and residual plots indicates no violation of assumptions.
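For reference, the three formulae discussed in this section can be computed from four counts per text, as in the sketch below. The coefficients are those of the standard published formulae; the `TextCounts` container, the function names, and the example counts are assumptions for illustration, and the sketch deliberately leaves out how sentences, syllables, and complex words are counted, since tools differ on those details.

```python
from dataclasses import dataclass


@dataclass
class TextCounts:
    sentences: int
    words: int
    syllables: int
    complex_words: int  # words of three or more syllables


def flesch_reading_ease(c: TextCounts) -> float:
    """Higher is easier; scores of 0-30 are the 'very difficult' band."""
    return 206.835 - 1.015 * (c.words / c.sentences) - 84.6 * (c.syllables / c.words)


def flesch_kincaid_grade(c: TextCounts) -> float:
    """Estimated U.S. school grade level needed to understand the text."""
    return 0.39 * (c.words / c.sentences) + 11.8 * (c.syllables / c.words) - 15.59


def gunning_fog(c: TextCounts) -> float:
    """Estimated years of formal education; weights polysyllabic words heavily."""
    return 0.4 * ((c.words / c.sentences) + 100 * (c.complex_words / c.words))


# Hypothetical counts: 25 words per sentence, 1.9 syllables per word,
# 24% complex words.
counts = TextCounts(sentences=10, words=250, syllables=475, complex_words=60)
print(round(flesch_reading_ease(counts), 2))   # 20.72
print(round(flesch_kincaid_grade(counts), 2))  # 16.58
print(round(gunning_fog(counts), 2))           # 19.6
```

The hypothetical counts were chosen so that the output lands close to the corpus means reported in Tables 4 to 6, which shows that sentences of about 25 words combined with nearly two syllables per word are enough to place a text in the “very difficult” band. All three formulae rely only on sentence length and word length, which is why they cannot register syntactic differences such as voice or embedding.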

Syntactic Features

Lexical Density

The deeper level syntactic features show more systematic, interpretable differences between our independent variables, chiefly region. This comes at the cost of difficulty to calculate and compare with other studies, as fewer are conducted. Lexical density, however, immediately demonstrates the merits of deeper level analysis, with region consistently the only significant predictor for each of the three genres. These also differ internally (R2 [adjusted] = .263 [.256]), with CEO letters from sustainability reports least lexically dense, followed by those from financial reports and, finally, sustainability reports proper (p < .001).
The model for sustainability reports (R2 [adjusted] = .285 [.240]) shows a significant association (p < .001) between lexical density and region, with the U.K. reports significantly lower than the other regions except Europe, and European reports lower than Indian or U.S. reports. The results are similar but less pronounced for financial CEO letters (p < .001; R2 [adjusted] = .217 [.186]), with U.S. CEO letters significantly higher than all but Indian ones, and European letters additionally significantly lower than their Indian equivalents. CEO letters from sustainability reports show the same association (p = .04; R2 [adjusted] = .207 [.122]) but exemplify the Bonferroni procedure’s conservative nature as none of the contrasts between regions cross the threshold of significance.
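As a point of reference, lexical density is commonly operationalized as the share of content words among all words in a text. The sketch below follows that common definition using Penn Treebank part-of-speech tags; it is an assumption that this matches the study's operationalization, which is not restated in this section, and the tag prefixes and function name are chosen for illustration.

```python
# Assumed content-word classes: nouns, verbs, adjectives, adverbs
# (Penn Treebank tags starting with NN, VB, JJ, RB).
CONTENT_PREFIXES = ("NN", "VB", "JJ", "RB")


def lexical_density(tagged_tokens):
    """Share of content words among all word tokens.

    `tagged_tokens` is a list of (token, tag) pairs. Punctuation tags such as
    "." or "," do not start with a letter and are ignored.
    """
    words = [(tok, tag) for tok, tag in tagged_tokens if tag[:1].isalpha()]
    if not words:
        return 0.0
    content = sum(1 for _, tag in words if tag.startswith(CONTENT_PREFIXES))
    return content / len(words)


# Hand-tagged example sentence, not taken from the corpus.
sentence = [
    ("The", "DT"), ("company", "NN"), ("reduced", "VBD"), ("its", "PRP$"),
    ("emissions", "NNS"), ("significantly", "RB"), (".", "."),
]
print(round(lexical_density(sentence), 3))
```

In the example, four of the six word tokens are content words, giving a density of 0.667. Note that this simple version counts auxiliary verbs as content words because they share the VB tags; a stricter definition would exclude them and yield lower values.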

Subordinators

By-genre analysis (R2 [adjusted] = .127 [.119]) shows the CEO letters from sustainability reports to have the most subordinators overall (p < .001). This dependent continues the trend of the region variable being most significant, with Indian sustainability reports containing fewer subordinators than any other region (p = .001; R2 [adjusted] = .140 [.085]). The same significant association (p = .023) exists in the model for CEO letters from financial reports (R2 [adjusted] = .076 [.039]), but the amount of variance explained is rather limited and visual inspection of the residuals suggests some heteroscedasticity. As the dependent is not normally distributed for CEO letters from sustainability reports, we are unable to fit a suitable model.

Parse Tree Depth

The by-genre model (R2 [adjusted] = .055 [.047]) shows that the sustainability reports proper have the shallowest parse trees, significantly more so than both letters from financial (p = .041) and sustainability reports (p = .001).12 Remarkably, for the sustainability reports, the only significant independent variable is economic performance (p = .023; R2 [adjusted] = .122 [.066]); the better the performance, the shallower the trees. The CEO letters from financial reports (R2 [adjusted] = .232 [.138]) see significant associations with both region (p = .001) and its interaction with economic performance (p = .019). While neither test indicates normality, QQ-plots for the variable do.

Passivization

As a dependent variable, the average number of passive structures per sentence shows the most remarkable effects overall. In by-genre analysis (R2 [adjusted] = .165 [.158]), the sustainability reports proper show significantly more passives compared to both types of CEO letter (p < .001). The model for sustainability reports explains a considerably greater amount of variance (R2 [adjusted] = .353 [.312]) compared with the other variables, with both region and economic performance showing significant associations. U.S. reports contain significantly fewer passives than other regions’ (p < .048), while an increase in economic performance is associated with a decrease in passive structures. The same effect (albeit with a lower R2 [adjusted] = .157 [.123]) exists for CEO letters from financial reports, with U.S. letters significantly (p < .022) more active than Australian and U.K. letters and increased economic performance associated with more active language. For CEO letters from sustainability reports, only the latter association (p = .007; R2 [adjusted] = .230 [.148]) persists, although we note that for the latter genre the dependent variable is somewhat skewed.
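One way to obtain a passives-per-sentence figure from a dependency parse is to count passive auxiliary relations, as sketched below. This is an assumption about the general approach, not a description of the study's procedure: the label `auxpass` comes from the Stanford typed dependencies and `aux:pass` from Universal Dependencies v2, while the function name and the input format (one list of relation labels per sentence) are hypothetical.

```python
# Relation labels that mark a passive auxiliary ("was" in "was reduced").
PASSIVE_AUX_LABELS = {"auxpass", "aux:pass"}


def passives_per_sentence(sentences):
    """Average number of passive auxiliaries per sentence.

    `sentences` is a list of sentences, each a list of dependency relation
    labels as produced by a dependency parser.
    """
    if not sentences:
        return 0.0
    total = sum(
        sum(1 for rel in sentence if rel in PASSIVE_AUX_LABELS)
        for sentence in sentences
    )
    return total / len(sentences)


# Hand-written labels for "Emissions were reduced." and "We reduced emissions."
example_text = [
    ["nsubjpass", "auxpass", "root", "punct"],
    ["nsubj", "root", "dobj", "punct"],
]
print(passives_per_sentence(example_text))
```

The example yields 0.5, one passive in two sentences. Counting only the auxiliary avoids counting a clause twice when it also has a passive subject relation, but it misses reduced passives without an auxiliary (such as “the water used in production”), so this sketch would likely undercount relative to a more thorough procedure.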

GRI Compliance

We repeated the above analysis for only those reports for which Datastream specified their GRI compliance (n = 96) or lack thereof (n = 23) and found that including GRI compliance as an independent variable in these dependents’ respective models never showed a significant association (p > .59) between GRI compliance and any of the above readability features.

Results

Region

Throughout our examination of the dependent variables, we find region to show the most significant as well as the strongest associations with most of the dependents for most of the genres. We accept Hypothesis 2 as we see that the region or variety of English in which reports are written affects the different measures of and proxies for readability, typically more so than any other variables. We find, for instance, that Australian documents are typically less readable than other regions’ (at least for given industries) in terms of readability formulae, and U.S. documents are on the more readable end of the spectrum. Remarkably, the largest difference exists between two regions within the highest level of legal enforcement described in Leuz et al. (2003). Overall, however, the associations between region and the dependent variable are stronger for the syntactic variables, chiefly lexical density and number of passive structures.
For lexical density, we find U.K. and European reports and financial CEO letters on the lower end, and U.S. reports on the higher. Indian documents, in turn, show less subordination than other regions. Parse tree depth, meanwhile, shows a stronger association with economic performance (cf. section “Economic Performance”). Average number of passive structures, however, shows the most remarkable outcomes in the strong association with region, primarily through U.S. documents exhibiting considerably (and significantly) more active voice. Again, however, we see some influence from economic performance. As many of these patterns do not align or even run counter (as is the case for subordination in Indian reports) to readability improving in clusters with higher legal enforcement, we do not accept Hypothesis 1. Language variety appears to have a considerably stronger effect than legal enforcement.

Stand-Alone Versus Mixed-Content Reports

For the models for sustainability reports, we included as a variable whether the report came from a stand-alone sustainability report or was a part of the company’s financial report. This variable is never significant (p > .363) for any of the models. As such, we are unable to reject the null hypothesis for Hypothesis 3, finding no significant difference between both types of sustainability content. This outcome suggests that report writers of sustainability reports do not attempt to modify or do not succeed at modifying the readability of their output for the wider stakeholder audience. This outcome aligns with our findings for Hypothesis 4a, that is, sustainability content not being more readable than financial reporting.

Readability of Sustainability Reporting

Based on the three formulae we used, we observe that the mean readability for these documents generally occupies the least readable strata that these formulae distinguish between, which is consistent with previous studies’ findings. Courtis (1995), for instance, compiles mean Flesch scores for chairman’s addresses from various U.S., Canadian, New Zealand, and Hong Kong reports, with a minimum of 28.96 and a maximum of 47.2. The disparity with our own outcomes might be due to Jones and Shoemaker’s (1994) observed effect that the mean textual complexity of corporate reporting may increase outright over time. Abu Bakar and Ameer (2011) and Farewell et al. (2014) similarly indicate the majority of their sample occupying the “very difficult” categories of readability.
We have every reason to accept Hypothesis 4b given these outcomes of high reading difficulty. The outcome for Hypothesis 4a, however, does not meet our expectations. While we find that reports are generally more syntactically complex than CEO letters, content type (financial vs. sustainability) shows a stronger association with readability than (sub-)genre does, especially for the formulae, which cast both sustainability reports and the CEO letters from those reports as less readable than CEO letters from financial reports. Although we do note that CEO letters from financial reports are more lexically dense than those from sustainability reports (and thereby possibly more substantive), the overall trend casts sustainability content as less readable overall. As such, we reject Hypothesis 4a; rather than merely failing to reject the null hypothesis, the data support the opposite hypothesis of sustainability reporting being less readable.

GRI Compliance

As we see no significant association with GRI compliance, we are unable to accept Hypothesis 5. The enhanced materiality process inherent to GRI compliance does not appear to systematically affect a report’s readability, be it formula-based or syntactic. This is not particularly surprising overall, as the effect of company-specific features seems altogether limited, with the remarkable exception of economic performance on specific syntactic features.

Economic Performance

While we included companies’ economic performance in the model primarily in order to control for the confounding effect that better or poorer company-specific performance might have, we observe that it shows a significant negative association with the syntactic depth and frequency of passive structures of both sustainability reports and financial CEO letters. In other words, better economic performance is associated with lower syntactic complexity and, consequently, higher readability. While this association does not exist across all the readability measures, we consider it strong enough to accept Hypothesis 6, with the caveat that this effect does not appear to occur in a meaningful way across the formula scores. As such, these findings do not align with Abu Bakar and Ameer (2011), who also found sustainability reports to exhibit textual obfuscation based on company performance, but observed this effect based on readability formulae. While not entirely consistent with one another, both outcomes are consistent with the obfuscation hypothesis.

Industry

Industry is not a significant variable in most models, and only ever so in those models where its interaction with another variable, typically region, shows a significant association with the dependent. That significant association furthermore only ever occurs in models related to readability formulae. As the few effects that we see are usually conditioned on the region variable and the region variable shows a much stronger, more informative pattern of significance, we find limited evidence in favor of Hypothesis 7, especially compared with Hypothesis 2, and no useful interpretation for the pattern we do see. Variation does not appear to run alongside the divide between environmentally and socially sensitive industries, and where it exists, appears to vary per region. However, as our results show that some variation based on industry may exist, conditioned on region, we cannot fully reject Hypothesis 7, either, limited though the effect appears to be. We consider these results inconclusive and invite further study into the issue.

Discussion

Formula-Based Versus NLP-Based Readability

From a methodological point of view, this study sought to investigate how “shallow” readability formulae relate to deeper level syntactic features obtained through NLP techniques for the purposes of quantifying corporate report readability. On the whole, we see more associations between independent and dependent variables for the syntactic features, at the cost of these features being harder to compute and harder to compare with previous studies, as formula-based readability analysis remains the current standard. Nevertheless, for the purposes of this study, the benefits substantially outweigh the drawbacks in offering far more nuanced insight into the variation in readability between language varieties. Overall, the “shallow” (easily computable, low-feature) approach to readability analysis appears to yield substantially more shallow results than the NLP approach; the formulae are, for instance, by design unable to register U.S. reports’ more active language. The additional nuance that NLP techniques offer certainly seems to merit their inclusion in further study into (corporate report) readability.

The Expanded Audience

As section “Determinants of Corporate (Sustainability) Report Readability” discussed in greater detail, sustainability content is more likely to appeal to financial reports’ audience than the other way around. As such, sustainability reports will likely have the larger audience, expanded from the investors that financial reports address with NGOs, employees, concerned citizens, and other direct or indirect stakeholders. That expansion makes it less likely that every member of the more heterogeneous stakeholder population has the experience or capacity to process content that the more homogeneous investor population generally can. Rather than support our hypotheses that sustainability content’s readability would accommodate this disparity, the readability formulae indicate that sustainability content is both difficult to read in general, and less readable than financial content. Although both formulae and syntactic variables at least demonstrate that CEO letters, as the most often-read section of corporate reports (serving to offer a “first glance”), are more accessible than the reports proper, the outcomes for the formulae show no indication that the composers of sustainability reports consistently adjust their language to their expanded target audience.

The Impact of Language Variety

From a more linguistically oriented point of view, another of this study’s aims was to assess the impact of language variety on the readability and language use in corporate (sustainability) reporting. As section “Formula-Based Versus NLP-Based Readability” indicates, NLP-based syntactic features portray this difference substantially better than conventional readability assessment’s “shallow” formulae do. We find considerable evidence for the impact of different varieties of English on textual features that may affect readers’ experience and understanding of the text, albeit no systematic difference in readability between regions that exhibit implicit or explicit CSR.
As far as region is concerned, the most salient outcome of formula-based analysis is that Australian reports generally have the lowest readability, albeit with the significant associations often existing within an interaction with the industry variable. Australian reports’ somewhat increased complexity, inasmuch as readers might notice it, might be partially due to Australian Stock Exchange reporting requirements facing some flux between less stringency than U.S. or U.K. requirements (CPA Australia, 2013) and rising reporting standards (Nearmy, 2014) at the time of data collection.13
The deeper level features show much more meaningful differentiation between regions, suggesting that language variety deserves more attention as a determinant of sustainability report readability, but also show a greater relative effect size for number of passive constructions, suggesting that language variety can have a greater impact than company performance even in cases where we find evidence for obfuscation (cf. section “Company Performance”). The salient, recurring effects are lower lexical density for the United Kingdom and Europe, fewer subordinators for Indian documents, and more active language in U.S. reports.
The lower lexical density for U.K. and EU reports and the typically shallower parse trees for Indian reports demonstrate how readability varies by region and raise the question of how these variations in readability might affect reader or stakeholder perception across language varieties; for instance, would a British reader of an American report with higher lexical density find themselves frustrated with the report’s language? Would the same occur for an Indian reader of an Australian report with deeper parse trees?
The strongest and most salient association out of all the models we examined, however, remains that between region (as well as economic performance) and number of passive structures. Precht’s (2003a, 2003b) studies into U.S. English found it to be significantly more direct and active than U.K. English, and our findings allow us to extend that assessment to the rest of our corpus. This effect may also be partially due to the influential (Colesanti, 2012) U.S. Securities and Exchange Commission Plain English Reporting Guidelines of 1998, recommending against passive structures based on their impact on readability, which might encourage active voice in U.S. reports regardless of linguistic attitudes. Again, these outcomes raise the question whether, for example, a U.S. reader of a U.K. report might find it evasive, or a U.K. reader of a U.S. report find it overly direct or potentially disingenuous.

Company Performance

This section cannot be complete without a brief discussion of how the obfuscation hypothesis manifests in this corpus. While the more commonly used readability formulae show no association for the corpus between company performance and readability, our findings on syntactic features do align with, for instance, Courtis (1998), Rutherford (2003), or Abu Bakar and Ameer (2011), in better economic performance yielding shallower parse trees and more active language, both of which are likely easier to process for the reader. Additionally, passive structures allow for obfuscation in enabling report writers to conceal agency and, thereby, potentially, responsibility (compare, e.g., the fictitious “The company jeopardised its profitability this year” with “The company’s profitability was jeopardised this year”). The reason behind this incidence of obfuscation may be that favorable financial performance invites a culture of company-internal attribution of results described in corporate reporting, be they financial or otherwise, while poor performance invites external attribution. While this evidence of obfuscation in the corpus is certainly relevant to a description of its readability, obfuscation as a form of defensive attribution is a complex issue (see, e.g., Aerts, 1994, or Aerts & Cheng, 2011) beyond the scope of this study. We find that the remarkable patterning of obfuscation between formula-based readability and deeper level syntactic readability invites more detailed future analyses.
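
As a rough illustration of how passive structures can be counted automatically, the Python sketch below flags a form of “be” followed by a word ending in -ed or -en. This regular-expression heuristic is an assumption made purely for illustration. It is not the parser-based procedure used in this study, and it will miss irregular participles and flag some predicative adjectives.

```python
import re

# Hypothetical heuristic: a form of "be", an optional -ly adverb,
# then a word ending in -ed or -en (a likely past participle).
PASSIVE_PATTERN = re.compile(
    r"\b(?:am|is|are|was|were|be|been|being)\s+(?:\w+ly\s+)?\w+(?:ed|en)\b",
    re.IGNORECASE,
)


def count_passives(text):
    """Return the number of candidate passive constructions found in the text."""
    return len(PASSIVE_PATTERN.findall(text))


# The fictitious sentence pair from the discussion above.
active = "The company jeopardised its profitability this year."
passive = "The company's profitability was jeopardised this year."

print(count_passives(active))
print(count_passives(passive))
```

A dependency parser that labels passive subjects and auxiliaries would be considerably more reliable; the heuristic only shows that voice is a countable surface feature that length-and-syllable formulae ignore.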

Future Research

A first avenue for future research emerged in the association between company performance and readability, which invites more detailed investigation into not only the aforementioned association, but also that between nonfinancial performance and readability, as financial sustainability is infrequently the focal area of sustainability reports. This study’s initial outcomes, along with a more detailed inquiry’s results and the sustainability reports’ noted complexity, may elucidate whether sustainability content requires additional legislation to counteract risks of obfuscation that, among others, Abu Bakar and Ameer (2011) and Cho et al. (2012a, 2012b) note.
On a methodological note, this study tried to maintain a broad scope, and future quantitative expansion into language variety or the impact of industry might certainly make a valuable contribution to readability research into corporate reporting. There is similar value in more qualitative approaches, such as the study of finer-grained pragmatic factors’ impact that this study necessarily glossed over. These might include companies’ preference for biased and transparent reporting (e.g., Basu & Palazzo, 2008) or the different, difficult-to-delineate extents of integrated reporting that the GRI (2013b) recognises.
Finally, given the sometimes counterintuitive results this study encountered, we see merit in further investigating the relationship between these reports’ language, their authors, and their audience. Such studies might query the extent to which report writers are aware of the linguistic differences between varieties that this study attests, or whether they deliberately try to adjust sustainability reports’ language to its expanded audience. Focusing on the audience, we see merit in investigating the extent to which the syntactic differences between varieties affect readers’ perceptions of reports written in other varieties of English.

Conclusion

This study finds that KPMG’s plea for more qualitative sustainability reporting rings as true for its results as for the few that have preceded it: traditional readability formulae place the sustainability report among the most complex genres of writing. High-quality reporting means ensuring accessibility, and few companies will want to make the intense investment that exhaustive triple bottom line reporting requires only to frustrate their reports’ readers. The readability of sustainability-themed content compares unfavorably even with that of financially themed content, implying that (sections of) these reports may well tax the average reader of the latter, let alone the “average customer” that Farewell et al. (2014) mention, or many other groups within the sustainability report’s far wider audience. Many companies claim to take a stakeholder-inclusive CSR stance, but a less specialised audience and an increase in textual complexity for relevant content make an inefficient combination at best.
This study also evidences an association between textual complexity and different varieties of English, primarily in the “deeper level” syntactic features extracted through NLP techniques, revealing, for instance, U.S. reports’ markedly more active structures or U.K. reports’ lower lexical density. These findings underline the importance of language variety as a predictor of linguistic complexity in addition to current models, such as Leuz et al.’s (2003) clusters of legal enforcement. Additionally, the same analysis based on syntactic features reveals an association between better economic performance and lower syntactic complexity, supporting the long-contested obfuscation hypothesis. Both outcomes demonstrate the merit of supplementing the typical formula-based approach to corporate report readability analysis with now computationally viable NLP techniques.
The concept of sustainability will almost certainly continue to ingrain itself in many aspects of society. With over half of companies reporting on their CSR initiatives, the corporate sustainability report has become one of the primary means for those actors whose interests may not always appear to align with society’s to maintain social license by demonstrating clearly and transparently that how they fulfil their needs does not impact future generations’ ability to meet their own. Without that clarity present in the language, however, companies might ensure these reports’ availability to a large audience, but both corporate and scholarly initiatives, such as KPMG’s reporting survey and this study, may still impel them toward transparency and accessibility to that same audience.

Acknowledgments

The authors would like to thank Oveis Madadian for assistance with data collection, Orphée De Clercq for assistance with data collection and processing, and Véronique Hoste for assistance and guidance throughout the entire process.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: Research funded by Flanders Innovation & Entrepreneurship (VLAIO).

Footnotes

1. Tap Oil’s (2013) annual report for financial 2012, for example, has a single-page “Health, Safety, Environment & Community” section out of a 105-page report.
2. The meaning of “qualitative” in “qualitative reporting” is of course inherently subjective; it might be construed as “exhaustive,” “authoritative,” and thereby even “complex” or “complicated” just as well as “transparent,” “complete,” or “accessible.” One potential reason for corporate reporting’s relatively high reading difficulty might be inherent to the genre: given the genre’s long history, reputation, and often specialist content, an overly simplistic report might simply lose credibility.
3. Based on a 100,000-word corpus of British and American conversation, Precht finds that the former uses more modal verbs and the latter more emotive affect and emphatics.
4. This incident, also commonly called the “Marikana Massacre” in media (e.g., Smith, 2012), occurred in South Africa in 2012 when protests over worker remuneration at a Lonmin-operated mine turned violent and police opened fire on protestors. This national tragedy made international headlines and raised substantial questions about the sustainability of Lonmin’s operations (Smith, 2012; Tolsi, 2015).
5. For example, Parsons and McKenna (2005), in a case study, point out that many statements such as “we set out to build enduring relationships with our neighbours characterised by mutual respect [ . . . ]” offer no specific action that the company intends to take, nor a timeframe, nor, due to its vagueness, the ability to challenge these claims.
6. In order to prepare the data for processing, we stripped the texts of all paragraphs that were not formatted as running text. This not only removed, for instance, headings and subheadings but also all numeric or non-full-sentence information contained in graphs and tables, dramatically reducing the length of many texts. As we required texts to contain at least 200 words after this process, we eliminated six texts from the corpus.
7. We equate with one another, for the purposes of this study, the “CEO letter,” “Chairman’s Address,” and other variations thereupon that offer top-level management commentary on company performance.
8. Thomson Reuters (2015) describes ASSET4 as a database that “provides objective, relevant and systematic environmental, social and governance (ESG) information based on 250+ key performance indicators (KPIs) and 750+ individual data points along with their original data sources.”
9. In some cases, such as companies issuing biennial sustainability reports, we selected the report containing the greatest possible part of the calendar year 2012.
10. The full regular expression we used (via the PowerGREP application) was as follows: (?<=^\d{0,3}\W*\d{0,3}\W*)(?<!\W*\t\W*\t\W*)([(““‘‘] ?)?\w[^\t]+?(?<=[^\t]+? [^\t]+? [^\t]+?)[;:.?!](and| or)?,?( ?[“”‘‘„)])?(?= ?([0-9]|\W)? ?$).
11. These tables do not control for economic performance. That analysis is available on request.
12. While this outcome is not immediately relevant to our hypotheses, it may well be due to CEO letters requiring somewhat denser syntax due to special constraints.
13. At time of data collection, both governments and market regulators had implemented sustainability reporting recommendations or requirements in the United States, Australia, and India, and governments had implemented sustainability policy or regulation for certain companies or key performance indicators in the United Kingdom, Western Europe, and Scandinavia (UNEP, 2013). UNEP indicated no such recommendations or requirements in place for Ireland or Central Europe.
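
The preprocessing described in footnote 6 can be sketched in Python as follows. The `is_running_text` test is a hypothetical simplification (at least four words, ending in sentence-final punctuation) and does not reproduce the PowerGREP expression in footnote 10; only the 200-word threshold comes from the text.

```python
import re

MIN_WORDS = 200  # minimum length after cleaning, as stated in footnote 6


def is_running_text(paragraph):
    """Assumed stand-in test: four or more words and sentence-final punctuation."""
    stripped = paragraph.strip()
    has_enough_words = len(stripped.split()) >= 4
    ends_like_sentence = re.search("[.?!;:][\"')]?$", stripped) is not None
    return has_enough_words and ends_like_sentence


def clean_text(raw):
    """Drop headings, table rows, and other paragraphs that are not running text."""
    return "\n".join(p.strip() for p in raw.split("\n") if is_running_text(p))


def filter_corpus(texts):
    """Clean every text and keep only those with at least MIN_WORDS words left."""
    cleaned = {name: clean_text(raw) for name, raw in texts.items()}
    return {name: body for name, body in cleaned.items()
            if len(body.split()) >= MIN_WORDS}


# Toy corpus with invented content, for illustration only.
sample = {
    "long": "Heading\n" + "This sentence has five words.\n" * 50,
    "short": "Heading\nThis sentence has five words.\nTable 1\t12\t13",
}
kept = filter_corpus(sample)
print(sorted(kept))
```

In this toy corpus, the heading and the tab-separated table row are discarded, and the short text falls below the threshold once they are gone.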

References

Abu Bakar A. S., Ameer R. (2011). Readability of corporate social responsibility communication in Malaysia. Corporate Social Responsibility and Environmental Management, 18(1), 50-60.
Adidas Group. (2013). Sustainability progress report 2012: Performance counts. Retrieved from http://www.adidas-group.com/media/filer_public/2013/08/13/adidas_spr2012_full.pdf
Aerts W. (1994). On the use of accounting logic as an explanatory category in narrative accounting disclosures. Accounting, Organizations and Society, 19, 337-353.
Aerts W., Cheng P. (2011). Causal disclosures on earnings and earnings management in an IPO setting. Journal of Accounting and Public Policy, 30, 431-459.
Bafilemba F., Mueller T., Lezhnev S. (2014). The impact of Dodd-Frank and conflict minerals reforms on Eastern Congo’s conflict. Retrieved from http://www.enoughproject.org/reports/impact-dodd-frank-and-conflict-minerals-reforms-eastern-congo%E2%80%99s-war
Basu K., Palazzo G. (2008). Corporate social responsibility: A process model of sensemaking. Academy of Management, 33, 122-136.
BBC. (2013, May 10). Bangladesh factory collapse toll passes 1,000. Retrieved from bbc.com/news/world-asia-22476774
Beaman K. (1984). Coordination and subordination revisited: Syntactic complexity in spoken and written narrative discourse. Coherence in Spoken and Written Discourse, 12, 45-80.
Bebbington J., Larrinaga-González C., Moneva-Abadía J. M. (2008). Corporate social reporting and reputation risk management. Accounting, Auditing & Accountability Journal, 21, 337-361.
Bogert J. (1985). In defense of the Fog Index. Business Communication Quarterly, 48(2), 9-12.
Boiral O. (2013). Sustainability reports as simulacra? A counter-account of A and A+ GRI reports. Accounting, Auditing & Accountability Journal, 26, 1036-1071.
Carnevale C., Mazzuca M. (2014). Sustainability report and bank valuation: Evidence from European stock markets. Business Ethics, 23, 69-90.
Cho C. H., Patten D. M. (2007). The role of environmental disclosures as tools of legitimacy: A research note. Accounting, Organizations and Society, 32, 639-647.
Cho C. H., Michelon G., Patten D. M. (2012a). Enhancement and obfuscation through the use of graphs in sustainability reports: An international comparison. Sustainability Accounting, Management and Policy Journal, 3(1), 74-88.
Cho C. H., Michelon G., Patten D. M. (2012b). Impression management in sustainability reports: An empirical investigation of the use of graphs. Accounting and the Public Interest, 12(1), 16-37.
Clatworthy M. A., Jones M. J. (2003). Financial reporting of good and bad news: Evidence from accounting narratives. Accounting and Research, 33, 171-185.
Coleman E. B. (1964). The comprehensibility of several grammatical transformations. Journal of Applied Psychology, 48, 186-190.
Colesanti J. S. (2012). Demanding substance or form? The SEC’s plain English handbook as a basis for securities violations. Fordham Journal of Corporate and Financial Law. Retrieved from http://ir.lawnet.fordham.edu/jcfl/vol18/iss1/4/
Courtis J. K. (1995). Readability of annual reports: Western versus Asian evidence. Accounting, Auditing & Accountability Journal, 8(2), 4-17.
Courtis J. K. (1998). Annual report readability variability: Tests of the obfuscation hypothesis. Accounting, Auditing & Accountability Journal, 11, 459-472.
Creese A. (1991). Speech act variation in British and American English. PENN Working Papers, 7(2), 37-58.
Dahlsrud A. (2008). How corporate social responsibility is defined: An analysis of 37 definitions. Corporate Social Responsibility and Environmental Management, 15(1), 1-13.
Dale E., Chall J. S. (1948). A formula for predicting readability: Instructions. Educational Research Bulletin, 27(2), 37-54.
Deegan C., Rankin M., Tobin J. (2002). An examination of the corporate social and environmental disclosures of BHP from 1983-1997: A test of legitimacy theory. Accounting, Auditing & Accountability Journal, 15, 312-343.
Dell’Orletta F., Wieling M., Cimino A., Venturi G., Montemagni S. (2014, June 26). Assessing the readability of sentences: Which corpora and features? Paper presented at the Ninth Workshop on Innovative Use of NLP for Building Educational Applications. Baltimore, MD.
DuBay W. (2004). The principles of readability. Costa Mesa: Impact Information.
Farewell S., Fisher I., Daily C. (2014). The lexical footprint of sustainability reports: A pilot study of readability. Paper presented at the American Accounting Association Annual Meeting and Conference on Teaching and Learning in Accounting, Sarasota, FL.
Flesch R. (1948). A new readability yardstick. Journal of Applied Psychology, 32, 221-233.
Flesch R. (1949). The art of readable writing. New York, NY: Harper & Row.
Global Reporting Initiative. (2013a). An introduction to G4. The next generation of sustainability reporting. Retrieved from https://www.globalreporting.org/resourcelibrary/GRI-An-introduction-to-G4.pdf
Global Reporting Initiative. (2013b). The sustainability content of integrated reports: A survey of pioneers. Retrieved from https://www.globalreporting.org/resourcelibrary/GRI-IR.pdf
Gunning R. (1952). The technique of clear writing. New York, NY: McGraw-Hill.
Hahn R., Kühnen M. (2013). Determinants of sustainability reporting: A review of results, trends, theory, and opportunities in an expanding field of research. Journal of Cleaner Production, 15, 5-21.
Halliday M. A. K. (1989). Spoken and written language. Language education. Oxford, England: Oxford University Press.
Harrison S., Bakker P. (1998). Two new readability predictors for the professional writer: Pilot trials. Journal of Research in Reading, 21, 121-138.
Hrasky S. (2012). Visual disclosure strategies adopted by more and less sustainability-driven companies. Accounting Forum, 36, 154-165.
Infineon. (2013). The determining factor. Infineon Technologies AG annual report 2012. Retrieved from http://www.infineon.com/dgdl/Infineon-GB2012-E.pdf?fileId=db3a30433b92f0e8013b989bf5cd15f3
Jackson G., Apostolakou A. (2010). Corporate social responsibility in Western Europe: An institutional mirror or substitute? Journal of Business Ethics, 94, 371-394.
Jones M. J., Shoemaker P. A. (1994). Accounting narratives: A review of empirical studies of content and readability. Journal of Accounting Literature, 13, 142-184.
Kincaid J. P., Fishburne R. P., Rogers R. L., Chissom B. S. (1975, February). Derivation of new readability formulas (Automated Readability Index, Fog Count and Flesch Reading Ease Formula) for Navy enlisted personnel. Retrieved from http://www.dtic.mil/dtic/tr/fulltext/u2/a006655.pdf
Klare G. R. (1963). The measurement of readability. Ames: Iowa State University Press.
Kumar G. (2014). Determinants of readability of financial reports of U.S.-listed Asian companies. Asian Journal of Finance & Accounting, 6(2), 1.
Lehavy R., Li F., Merkley K. (2011). The effect of annual report readability on analyst following and the properties of their earnings forecasts. Accounting Review, 86, 1087-1115.
Leuz C., Nanda D., Wysocki P. D. (2003). Earnings management and investor protection: An international comparison. Journal of Financial Economics, 69, 505-527.
Li F. (2008). Annual report readability, current earnings, and earnings persistence. Journal of Accounting and Economics, 45, 221-247.
Lonmin. (2013). Sustainable development report for the year ended 30 September 2012. Retrieved from http://sd-report.lonmin.com/2012/downloads/lonmin-online-sustainable-development-report-2012.pdf
Manning C. D., Surdeanu M., Bauer J., Finkel J., Bethard S. J., McClosky D. (2014). The Stanford CoreNLP natural language processing toolkit. Paper presented at the 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations. Retrieved from http://nlp.stanford.edu/pubs/StanfordCoreNlp2014.pdf
Matten D., Moon J. (2008). “Implicit” and “explicit” CSR: A conceptual framework for a comparative understanding of corporate social responsibility. Academy of Management Review, 33, 404-424.
McLaughlin G. H. (1969). SMOG grading: A new readability formula. Journal of Reading, 12, 639-646.
Nearmy T. (2014, March 27). New ASX guidelines to force sustainability reporting. Conversation. Retrieved from http://theconversation.com/new-asx-guidelines-to-force-sustainability-reporting-24885
Parsons R., McKenna B. J. (2005, November). Constructing social responsibility in mining company reports. In Lê T., Short M. (Eds.), Proceedings of the International Conference on Critical Discourse Analysis Theory into Research (pp. 595-608). Retrieved from http://195.130.87.21:8080/dspace/bitstream/123456789/262/1/Parsons&McKenna-constructingsocialresponsibility.pdf
Precht K. (2003a). Great vs. lovely: Grammatical and lexical stance differences in American and British English. In Meyer C., Leistyna P. (Eds.), Corpus analysis: Language structure and language use (pp. 133-151). Amsterdam, Netherlands: Rodopi.
Precht K. (2003b). Stance moods in spoken English: Evidentiality and affect in British and American conversation. Text: Interdisciplinary Journal for the Study of Discourse, 23, 239-257.
Rutherford B. A. (2003). Obfuscation, textual complexity and the role of regulated narrative accounting disclosure in corporate governance. Journal of Management and Governance, 7, 187-210.
Sacconi L. (2004). Corporate social responsibility (CSR) as a model of “extended” corporate governance. An explanation based on the economic theories of social contract, conformism. Liuc Papers, 147(Wieland 2003), 1-49.
Securities and Exchange Commission. (1998). A plain English handbook. How to create clear SEC disclosure documents. Retrieved from https://www.sec.gov/pdf/handbook.pdf
Securities and Exchange Commission. (2014). Fact sheet. Disclosing the use of conflict minerals. Retrieved from https://www.sec.gov/News/Article/Detail/Article/1365171562058
Smith D. (2012, September 7). Marikana mine shootings revive bitter days of Soweto and Sharpeville. The Guardian. Retrieved from http://www.theguardian.com/world/2012/sep/07/marikana-mine-shootings-revive-soweto
Smith M., Taffler R. (1992). The chairman’s statement and corporate financial performance. Accounting & Finance, 32(2), 75-90.
Stanford NLP Group. (2015). Stanford parser. Retrieved from http://nlp.stanford.edu:8080/parser/
Stanford NLP Group. (2016). Software. Retrieved from http://nlp.stanford.edu/software/
Stanton P., Stanton J. (2002). Corporate annual reports: Research perspectives used. Accounting, Auditing & Accountability Journal, 15, 478-500.
Story J., Neves P. (2015). When corporate social responsibility (CSR) increases performance: exploring the role of intrinsic and extrinsic CSR attribution. Business Ethics: A European Review, 24, 111-124.
Temouri Y., Jones C. (2014). Introduction: International business and institutions after the financial crisis. Academy of International Business (UKI), 21, 1-6.
Thomson Reuters. (2015). Asset4. Retrieved from http://www.trcri.com/index.php?page=asset4
Tilling M. V. (2004). Refinements to legitimacy theory in social and environmental accounting. Commerce Research Paper Series, 6(4), 1-11. Retrieved from https://www.flinders.edu.au/sabs/business-files/research/papers/2004/04-6.pdf
Tolsi N. (2015). Marikana. Retrieved from http://marikana.mg.co.za/
Townsend S., Bartels W., Renaut J.-P. (2010). Reporting change. Change, 1-33. Retrieved from http://www.sustainability.com/library
United Nations Environment Programme. (2013). Frequently asked questions on corporate sustainability reporting. Retrieved from https://www.globalreporting.org/resourcelibrary/GoF47Para47-FAQs.pdf
Varda H. (2014). Signalling sustainability: Drivers, types of signals and methods a comparative study between certified and non-certified companies within the UK sustainable fashion sector (Unpublished doctoral dissertation). Cardiff University, Cardiff, Wales.
Wee M., Tarca A., Krug L., Aerts W., Pink P., Tilling M. V. (2016). Factors affecting preparers’ and auditors’ judgements about materiality and conciseness in integrated reporting. London, England: ACCA. Retrieved from http://integratedreporting.org/wp-content/uploads/2016/08/pi-materiality-conciseness-ir-FINAL.pdf
World Commission on Environment and Development. (1987). Report of the World Commission on Environment and Development: Our common future (The Brundtland Report). Retrieved from http://www.un-documents.net/our-common-future.pdf

Biographies

Nils Smeuninx is a doctoral researcher at Ghent University. His research focuses mainly on textual analysis of corporate sustainability reporting through natural language processing.
Bernard De Clerck is an associate professor at Ghent University. His main research interests include language variation and change and business communication from a multilingual and intercultural perspective.
Walter Aerts is a full professor at the Department of Accounting and Finance (University of Antwerp). He is part-time professor at Tilburg University and at the Antwerp Management School. His research focuses on nonfinancial corporate disclosure.

Information, rights and permissions

Information

Published In

Article first published online: November 20, 2016
Issue published: January 2020

Keywords

  1. corpus linguistics
  2. readability
  3. sustainability reporting
  4. language variety
  5. natural language processing

Rights and permissions

© The Author(s) 2016.

Authors

Affiliations

Nils Smeuninx
Bernard De Clerck
Ghent University, Ghent, Belgium
Walter Aerts
University of Antwerp, Antwerp, Belgium

Notes

Nils Smeuninx, Department of Translation, Interpreting and Communication, Ghent University, Groot-Brittaniëlaan 45, Ghent 9000, Oost-Vlaanderen, Belgium. Email: [email protected]

This article was published in International Journal of Business Communication.
