A Comparison of Children’s Reading on Paper Versus Screen: A Meta-Analysis

This meta-analysis examines the inconsistent findings across experimental studies that compared children’s learning outcomes with digital and paper books. We quantitatively reviewed 39 studies reported in 30 articles (n = 1,812 children) and compared children’s story comprehension and vocabulary learning in relation to medium (reading on paper versus on-screen), design enhancements in digital books, the presence of a dictionary, and adult support for children aged between 1 and 8 years. The comparison of digital versus paper books that only differed by digitization showed lower comprehension scores for digital books. Adults’ mediation during print books’ reading was more effective than the enhancements in digital books read by children independently. However, with story-congruent enhancements, digital books outperformed paper books. An embedded dictionary had no or negative effect on children’s story comprehension but positively affected children’s vocabulary learning. Findings are discussed in relation to the cognitive load theory and practical design implications.

routines that differ from those of conventional readers. Given that young children cannot decipher words independently, the question emerges whether digital books can provide the support emergent readers need to understand books on their own, without reliance on adults. A substantial body of experimental research focuses on comparing children's reading of digital books with print books, with evidence of both positive and negative effects on children's story comprehension and vocabulary learning. In previous studies, the difference in impact has been linked to the medium (paper vs. on-screen; e.g., Hoel & Tønnessen, 2019), the design of digital books (e.g., Christ, Wang, Chiu, & Strekalova-Hughes, 2019), and adults' support (e.g., Strouse et al., 2019). This meta-analysis sought to determine the relationships between some key children's learning outcomes and the reading of digital versus paper books, with a specific focus on the moderating effect of the design of digital books and the presence of literacy- and language-stimulating features, such as dictionaries and adult reading support. The focus of the meta-analysis is on digital books, also known as e-books, picture book apps, story apps, and iBooks, that have been available on the children's book market in various formats since the appearance of the desktop computer in the 1980s.
Current literature reports both positive and negative learning effects of young children's digital picture books, with several variables proposed to explain this variation. While digital enhancements aligned with the story content can support children's reading outcomes (Christ, Wang, Chiu, & Cho, 2019), digital books with enhancements unrelated to the narrative can have a negative effect. In particular, the presence of short games embedded in story apps may explain children's poor comprehension of digital books, as these distract young children's attention from the story (Munzer et al., 2019; Parish-Morris et al., 2013). Specific language-promoting features, such as embedded dictionaries that provide word definitions and follow story context definition, were found to enhance children's word learning. However, it is not clear how they impact story comprehension.
The inconsistent findings in research on children's print-versus-digital reading raise the question of whether the mere presence of a screen makes a difference to children's learning, how adults' reading support influences possible differences, and how the specific design features of digital books, such as the presence of a dictionary, affect children's learning outcomes. This meta-analysis aimed to determine the strength of these associations for 1- to 8-year-old children, that is, children at the earliest stages of their reading development, also referred to in the literature as early or emergent readers, whose first experience of books is typically modeled and mediated by adults at home (Teale & Sulzby, 1986).

Media Effects
Media or medium effect studies have a tradition in reading research, with three recent meta-analyses showing that reading on screen, when compared to reading on paper, is related to lower reading performance among adults, students, and secondary/primary school-aged children (Clinton, 2019; Delgado et al., 2018; Kong et al., 2018). This finding is referred to in the literature as the so-called "paper advantage" and "screen inferiority" effect. Despite the cumulative evidence favoring paper to on-screen for conventional readers, there is no quantitative synthesis of outcomes related to the youngest children's reading on paper versus on-screen. Literature reviews that summarize evidence from print-digital comparisons as well as studies of digital reading only found children's emergent literacy skills to be positively related to digital books (Biancarosa & Griffiths, 2012; Bus et al., 2015) but also to print books (e.g., Miller & Warschauer, 2014). Given that the human information processing system has a limited capacity (Mayer, 2009), distributing cognitive resources across the story narrative, handling the device, and children's expectations concerning an electronic device may be the reason for the reported negative effects. When children have to use a mouse or finger to activate hotspots and turn pages, they have to allocate some of their limited cognitive resources to point, click, and swipe while still following the narrative, which may negatively affect meaning-making (Lauricella et al., 2014). Furthermore, many children may constantly search for possibilities to interact with electronic devices, as they are accustomed to game-like activities. On the other hand, many studies suggest that children are more attentive to and engaged in reading digital books than paper books. For instance, in a study by Richter and Courage (2017), both 3- and 4-year-olds were more inattentive during the paper-based reading, and they also spent more time looking off-task than they did in the digital book reading session.
In light of this evidence, our first goal in this meta-analysis was to establish whether the medium per se (paper vs. screen) affects children's story comprehension and learning of new vocabulary.

Digital Design Effects
Traditionally built books and educational television programs are rapidly giving way to digital content on electronic devices, shifting the early learning environment at home and in school for very young children. Considering the low costs and accessibility, digital books supplement or fully provide the reading materials in low-income families (Picton, 2014) or developing countries (Jere-Folotiya et al., 2014). Reading on-screen has significant practical advantages for adults and children who cannot read together in person (e.g., in the case of pandemic lockdowns, displacement of families for work, war conflict, or health reasons). Thus, the question is not whether digital books are better than paper books but rather whether digital books open up new opportunities for book reading and how digital books could be optimized to increase children's learning. In other words, the focus on digital design effects is not on the overall difference a reading medium makes but on what is contributed by the specific enhancements in digital books.

Types of Enhancements
In this meta-analysis, we were less interested in enhancements that serve aesthetic purposes (even though these may increase children's interest and enjoyment of reading) but more in enhancements that target cognitive skills to facilitate children's story comprehension. Especially promising are the so-called digital storytelling enhancements that focus children's attention on the storyline, ranging from synchronizing visualizations with the narration that facilitate the integration of visual and verbal information to using techniques that encourage children's curiosity about new story events, which facilitate the processing of narration (Eng et al., 2019; Sarı et al., 2019; Verhallen et al., 2006).
Most digital books include voice narration that "reads" the book to the child, making adults' direct reading of the text unnecessary. However, there is likely to be a difference in relation to the type of voiceover provided by automated recordings and the dramatization provided by real adults. These audio enhancements afford a new kind of book reading experience to children that in various ways may qualitatively differ from a reading experience of sharing paper books with an adult. For instance, in a recent study, parents of 2- and 3-year-old children were provided with access to a reading platform that included film-like digital books with a voiceover (Bus & Anstadt, 2020). The analytics that registered which days children were logged in and which books were read each session showed that children read more books in one session and repeated the same books more often than reported for regular book reading sessions.
When individual features are aggregated under the broader category of multimedia, there is evidence of learning benefits of digital stories equipped with animated pictures, music, and sound effects. However, this evidence comes from a meta-analysis that took a broad definition of digital texts and included film and television shows presented on television sets, computers, and other electronic devices, as well as stories that did not include a narrative. For studies that directly compare digital and paper-based reading of narrative texts, it is not clear whether children's learning outcomes can be explained through the extent to which the digital books include multimedia affordances or not. This clarification was the second goal of our meta-analysis.

Dictionary Effects
In light of the central importance of vocabulary learning for children's language development (Clark, 2009), researchers and designers have been interested in enhancing children's reading experience with an online dictionary. Dynamic and static dictionaries were among the first enhancements embedded in digital texts, with visual and audio explanations of words, teaching young children either explicitly or implicitly through picture correspondence the meaning, pronunciation, and orthography of story-related vocabulary. Reading print books with dialogic support, which includes an explanation of new words, pointing to them in the text, and contextualizing them to the child's extant knowledge, significantly boosts children's book-related vocabulary as well as vocabulary on nontargeted measures of expressive vocabulary (Hargrave & Sénéchal, 2000). In contrast, parents' natural reading behaviors that do not draw children's attention to the words in the story collide with children's processing of the story. Observation studies show that parents rarely pay explicit attention to difficult words in narratives, possibly because parents intuitively sense that word explanations would disrupt the flow of the reading session (Evans et al., 2011). In addition to studies with parents, several studies examined the effect of a dictionary in digital books, with or without adult support. For example, Korat et al. (2013) compared 4- to 6-year-olds' learning in relation to different types of vocabulary support available during digital book reading and found a clear hierarchy in the extent to which the support was beneficial for children's vocabulary acquisition: adult support was most effective, followed by a dynamic dictionary, static dictionary, and no support. In another study, children's receptive and expressive word learning of target words was studied in relation to a digital book with a dictionary read with adult (mothers') support or independently by the child (Korat & Shneor, 2019). The presence of the dictionary was beneficial for the mothers' mediation of difficult words and children's word learning from digital books. In both Korat et al. (2013) and Korat and Shneor (2019), vocabulary acquisition was the only outcome measure; it was not tested whether explaining words interferes with story processing. Our third goal in this meta-analysis was to establish whether a dictionary affects the differences in learning outcomes when comparing digital and print books and, if it does, in relation to which learning outcomes.

Adult Support Effects
The advent of digital enhancements that provide targeted learning prompts and reading scaffolds for children's language learning opens up the possibility of replacing adult support with technology. This possibility can be framed in technophobic ideologies with technology replacing humans but also in terms of technology supplementing absent or unskilled parents and teachers. From a sociocultural perspective of learning, some form of scaffolding during children's reading is indispensable for them to acquire not only reading skills but also the important life skills that co-occur with reading stories, which include the emotional experiences of fictional story heroes as well as real-life examples (Gee, 1991; Rueda et al., 2001). Scaffolding can take the form of verbal support by the reading partner (e.g., a parent asking a child a question about the main story character), or it can be embedded in the digital book as a prompt (e.g., a written and audio-recorded question is activated when the child taps on a hotspot in the digital book). From a sociocultural perspective, the benefits of scaffolds in digital books depend on their relevance for the child's story understanding and the possibility of combining the built-in digital scaffolds with adult guidance. While sharing a digital book, a verbal and digital prompt can mutually reinforce or interfere with each other.
Studies that compared digital and print reading with and without adult presence have found various effects. For 5- to 7-year-olds at risk for learning disabilities, children's independent reading of a digital book was more beneficial for their vocabulary than a print book read by an adult (Shamir et al., 2012). For special needs children, automated reading by a computer was as effective as an adult reading of a print book for their vocabulary learning (Segers et al., 2006). A meta-analysis (Takacs et al., 2014) that investigated children's comprehension and word learning from print books as compared to digital picture books, film, and television shows found that for children's story comprehension, reading multimedia-enhanced digital stories was more beneficial than reading print books without adult guidance. There was no difference in children's comprehension when stories enhanced with multimedia were compared to reading print books with adult guidance, which suggests that adults' scaffolding might be of similar effect to well-designed multimedia books read by children on their own.
A main assumption in the book reading paradigm is that adults who guide children to story-relevant details and who include comments on the story plot and language-stimulating features are more likely to increase children's learning than adults not engaging in such dialogic reading behavior. An attempt to promote adult scaffolding by including questions in digital books was not beneficial for story comprehension outcomes. Other studies suggest that enhancements in digital books are difficult to combine with parent guidance: an enhanced digital book prompted more non-content-related interactions (e.g., device-focused talk, pushing hands away) from children and parents than paper books or digital books without enhancements (Chiong et al., 2012). The fourth goal of our meta-analysis was to establish whether and how adults' support during the book reading session influences children's learning with digital and print books.

Outcome Measures
Stories open a window on other people's emotions and behaviors, thus providing relevant knowledge for functioning in society (Wilson, 2014). This makes book reading from an early age vital for children's social and cognitive development (Dickinson & Morse, 2019). Given the important role of stories, a main outcome measure of book reading is children's narrative comprehension that encompasses not only understanding actions but also people's emotional and behavioral reactions. Assuming that storybook apps are more than a passing fad, the question of how the transition from print to digital books affects the act of meaning-making becomes more pressing. Children's vocabulary learning has attracted considerable research interest due to the well-documented evidence that books provide a unique context for learning new words rare in daily conversations (e.g., Hindman et al., 2012). There is no doubt that young children expand their vocabulary when exposed to books (Bus et al., 1995), but it would be erroneous to assume that word learning is the main aim of book reading. Children may learn individual word meanings from reading texts. However, the promotion of word learning should not come at the expense of the key purpose of book reading, children's narrative comprehension, which is basic to enjoyment (e.g., Kelley & Kinney, 2017). To enable a critical test of the hypothesis that isolating words and discussing their meaning outside the narrative context may hinder story comprehension, we included both outcomes (vocabulary and story comprehension) in this meta-analysis. We compare books with and without a dictionary, expecting that dictionaries positively affect word learning but may interfere with meaning-making.
The existing empirical literature (e.g., Evans & Saint-Aubin, 2005; Justice et al., 2005) does not corroborate the hypothesis that book reading is a source for connecting word pronunciation to its orthography. Therefore, we excluded basic reading outcomes, such as phonemic awareness, letter knowledge, or print knowledge, as outcome measures in our meta-analysis. Nor did we examine the effects of book reading on the adult-child relationship (Dickinson & Morse, 2019). Emerging evidence suggests that responsive and language-rich exchanges between adults and children are often missing in digital book reading, likely because many digital books are not designed to combine built-in affordances with parent-child book sharing. Such digital books may even complicate shared reading because the e-books' interactive features hinder adult-child interaction (Richter & Courage, 2017).

Research Questions
Overall, in light of the extant literature, this meta-analysis was guided by four research questions:
1. Do digital books have the same effect as paper books on children's story comprehension and vocabulary if the only difference is the reading medium (paper vs. digital)? If not, which medium is more beneficial?
2. Can the design of digital books explain the beneficial effects of digital books, especially if the enhancements support children's understanding of the storyline?
3. How does the presence of a dictionary interact with other enhancements and affect the outcomes?
4. How does the support provided by adults during the book reading session influence the findings?

Method
Initially, we identified 33 potentially relevant articles with the "snowball" method (using reference lists from key papers in the field) and the "invisible college" approach (using key figures in the field to collect recent and unpublished materials), out of which 21 met all the inclusion criteria (Bus et al., 2021). We then performed a systematic literature search in bibliographic databases (Science Direct, Web of Science, PsycINFO, Education Resources Information Center, Academic Search Complete, PubMed) using various combinations of the following search terms: ebook* OR e-book* OR "electronic book*" OR "story app*" OR "picturebook app*" OR "digital book*" OR "digital stories*" OR "digital reading*" OR "e-reading*" OR "multimedia stories*" OR "interactive stories*" OR "CD-ROM stories*" OR "DVD stories*" AND "pre-school*" OR preschool* OR kindergarten* OR "early child*." We created new combinations of terms using the three main Boolean operators (AND, OR, and NOT) until all relevant papers from the initial set of 21 articles had recurred. This way, nine additional articles/reports were identified. As the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram reports, after the initial screening based on title and abstract, 194 records were excluded, and 46 references were retrieved for full-text screening (see Figure 1). In this set, 30 publications met the inclusion criteria and were included in the meta-analysis. All searches and screenings were performed independently by the first and third authors.
Our inclusion criteria were the following: (1) the study needed to follow an experimental or quasi-experimental design with a contrast between reading a narrative in a digital and print format; (2) the study needed to include children aged between 1 and 8 years (inclusive); (3) the narrative text in digital books could be available both ways, in print and/or orally; (4) the digital reading format could be accessed on any digital device; (5) the study needed to include children's story comprehension and/or vocabulary as outcome measures; (6) articles needed to be written in English, Dutch, German, or Norwegian, but the study could have been conducted in any country; (7) studies needed to provide effect sizes or sufficient information (means, standard deviations, and sample sizes or frequency distributions for treatment and control groups at posttest) to enable calculation of effect sizes.
The focus of the meta-analysis was on studies that reported quantitative comparisons of reading a paper and digital version of the same story. We therefore excluded (1) studies without a control group (e.g., Klop et al., 2018; Messier & Wood, 2015) and case studies (e.g., Boyle et al., 2017, who used a single-case multiple baseline across participants design); (2) studies that compared paper on the one hand with video, an audio story, or a film on the other hand (e.g., Meringoff, 1980); (3) studies focusing only on behavior during book reading (e.g., Moody et al., 2010; Rees et al., 2017) and studies targeting basic reading skills such as letter knowledge or phonemic awareness (e.g., Evans et al., 2017; Segal-Drori et al., 2010; Willoughby et al., 2015) or recognition of kana characters (e.g., Masataka, 2014); (4) studies targeting participants older than 8 years (e.g., Connor et al., 2019), participants with cochlear implants (e.g., Messier & Wood, 2015), or participants diagnosed with autism (e.g., Wainwright et al., 2020).

Data Coding and Reliability
The authors' descriptions of digital books were by and large minimal in all the meta-analyzed studies. If available online, we looked up the title to understand the digital enhancements in the book. Each study was coded by digital book features, namely, whether the book was commercially produced or researcher-developed; whether or not a voiceover was available; whether the print was visible; whether print was highlighted while the text was read aloud; and whether interactive enhancements supported story comprehension or word learning. The most common genre was fiction. Books were coded as nonfiction where facts and a narrative storyline coexisted, for instance, to introduce the life of a polar bear and her two cubs living in the Arctic (e.g., Zhou & Yadav, 2017). Enhancements were coded as story-related enhancements when the book enabled interaction to support story comprehension; for example, in Elmo Goes to the Doctor, the user can click each character in the waiting room and see why each one is at the doctor's office (Lauricella et al., 2014); in the Tractor in the SandBox, hotspots elicit comments from characters that expand on the text (Korat & Shamir, 2007); in A Frog Thing, tapping hotspots reactivates story-congruent multimedia features such as a creaking door (Richter & Courage, 2017); in Confused Yuval, the computer prompts a question answered by clicking on a location on the screen (Shamir et al., 2012). We ignored automatic dynamic visuals, which might guide children's visual attention to story elements and/or prompt a surprise. In the books that we could access, dynamic visuals mostly missed these purposes and seemed to be added mainly for aesthetic reasons. We did code enhancements meant to teach story-related vocabulary. For instance, in The Polar Bear Horizon story, children see and hear word labels as soon as they touch the corresponding illustrations on the screen (Zhou & Yadav, 2017); in Tacky the Penguin, the reader can tap on individual pictures to make the name of the object or action pop up and hear the word spoken aloud (Zipke, 2016).
We also coded adult mediation: whether children received adult guidance while reading the paper book and/or the digital book. There were studies in which an adult guided the reading of both books; an adult guided the paper book reading but not the digital book; or children read both books by themselves. For the latter, as an example, O'Toole and Kannass (2018) studied how children listened in both conditions to a recording of the experimenter's voice, and although the experimenter was present during the session, she kept interactions to a minimum and did not provide commentary or answer children's questions about the story.
Furthermore, we coded sample characteristics (country, language(s) used in the intervention, socioeconomic status [SES; overall low SES, middle to high SES or a mix, as defined and categorized by the authors of the individual studies], and children's age range in years), publication year and status (published in a journal or unpublished dissertation or report), and indicators of design quality, particularly number of participants (overall and in individual conditions), type of design (quasi-experiment, cluster randomized control trial, randomized control trial, within-subject design), attrition rate, and blinding. Based on this information, we coded five domain-based risk of bias criteria using three categories (low risk, some concern, high risk): (1) bias arising from randomization; (2) bias due to deviations from the intended interventions (more risk of bias with group-wise interventions carried out by other persons than the researcher, as compared to individual interventions carried out by the researcher); (3) bias due to missing outcome data (attrition rate); (4) the bias in measurements (examiners were not blinded for the experimental condition to which participants were assigned); (5) selection bias due to paper and digital book group differences in gender or SES or other relevant variables at pretest.
We coded postintervention outcome measures, including indicators of story comprehension and vocabulary (mean and standard deviation, t test, F test, r, p value, frequency distributions, and sample size per test). Indicators of story comprehension were the number of correctly answered questions about the story content, the quality of a retelling (e.g., Neuman et al., 2017, gave one point for each story element: setting, characters, events, plot or theme, and resolution), or a sequencing task (e.g., Neuman et al., 2017, selected five pictures with event scenes per story, and the child received a total score ranging from 0 to 5 based on the number of events placed in the correct order). Indicators of children's vocabulary were the receptive knowledge task, assessing children's ability to identify a word from an array of three or four color pictures, balanced with two or three foils (i.e., similar appearance, similar function, and similar category); the expressive naming task, examining the children's ability to provide a label when given a picture (i.e., "What is this?") or to complete a sentence prompted by a picture; and the definition task, prompting the child for a definition (e.g., Leacox & Jackson, 2014, used as a first prompt, "What does __ mean?"; a second prompt queried further description, for example, "What else do you know about ___?"). The first and last authors coded all studies. Cohen's kappa was computed for 19 variables, yielding coefficients between .63 and 1.00, which is considered substantial to perfect agreement. Disagreements were resolved by consulting the original reports and discussing the issues until consensus was reached.
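As an illustration of the agreement statistic reported above, the following minimal sketch computes Cohen's kappa for two coders who assign categorical codes to the same studies. The function name and the example codes are hypothetical and are not taken from the coding data of this meta-analysis.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders' categorical codes on the same items."""
    if len(coder_a) != len(coder_b) or len(coder_a) == 0:
        raise ValueError("Both coders must rate the same non-empty set of items.")
    n = len(coder_a)
    # Observed proportion of items on which the coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's marginal frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both coders used one identical category throughout.
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes for "voiceover available" (1 = yes, 0 = no) in four studies.
kappa = cohens_kappa([1, 1, 0, 0], [1, 1, 0, 1])
```

In this made-up example the coders agree on three of four studies (observed agreement .75) while chance agreement is .50, so kappa is .50.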

Meta-Analytic Procedures
The Comprehensive Meta-Analysis computer software (Version 3.3; Borenstein et al., 2009) was used for analyzing the data. The standardized differences between the mean of a digital book and the mean of a paper book at posttests were computed to quantify a potential additional value of digital books compared to a paper book version of the same story. Hedges' g was calculated using posttest scores (means and standard deviations) of the digital and paper book condition or by transforming reported test statistics (e.g., t, F, r) into Hedges' g. A positive effect size indicates a favorable outcome for the digital book, while a negative effect implies a better result for print books. Given that studies with an increased sample size provide more reliable estimates of the population mean due to a smaller standard error, effect sizes were determined by weighting each outcome by the inverse of its variance (Cooper & Hedges, 1994;Lipsey & Wilson, 2001).
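The effect size computation described above can be sketched as follows. This is not the Comprehensive Meta-Analysis implementation; it assumes the standard textbook formulas for two independent groups (pooled standard deviation, small-sample correction factor J, and the usual large-sample variance), and all function names and numbers are hypothetical. Within-subject designs would require a different variance formula that accounts for the correlation between conditions.

```python
import math


def _apply_correction(d, n_digital, n_paper):
    """Turn Cohen's d into Hedges' g and return (g, variance of g)."""
    df = n_digital + n_paper - 2
    j = 1.0 - 3.0 / (4.0 * df - 1.0)  # small-sample correction factor
    var_d = (n_digital + n_paper) / (n_digital * n_paper) \
        + d ** 2 / (2.0 * (n_digital + n_paper))
    return j * d, j ** 2 * var_d


def hedges_g(mean_digital, sd_digital, n_digital, mean_paper, sd_paper, n_paper):
    """Hedges' g from posttest means and SDs; positive values favor the digital book."""
    df = n_digital + n_paper - 2
    pooled_sd = math.sqrt(
        ((n_digital - 1) * sd_digital ** 2 + (n_paper - 1) * sd_paper ** 2) / df
    )
    d = (mean_digital - mean_paper) / pooled_sd
    return _apply_correction(d, n_digital, n_paper)


def hedges_g_from_t(t, n_digital, n_paper):
    """Hedges' g from an independent-samples t statistic."""
    d = t * math.sqrt(1.0 / n_digital + 1.0 / n_paper)
    return _apply_correction(d, n_digital, n_paper)


# Hypothetical study: the paper group scores one point higher on comprehension.
g, var_g = hedges_g(10.0, 2.0, 20, 11.0, 2.0, 20)
weight = 1.0 / var_g  # inverse-variance weight used when pooling
```

Because the digital mean is lower in this made-up example, g is negative, which under the sign convention above indicates a better result for the print book.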
Effect sizes were aggregated within the two domains (story comprehension and vocabulary) before being averaged across studies. Assuming a distribution of true effect sizes in our sample, a random-effects model was preferred to a fixed model when we pooled effect sizes (see Borenstein et al., 2010). A study was defined as an outlier if the individual study's confidence interval did not overlap with the confidence interval of the pooled effect (Harrer et al., 2019; Viechtbauer & Cheung, 2010). Funnel plot analysis was used to examine publication bias due to the reduced likelihood of publication of studies with nonsignificant findings. To detect bias due to the underrepresentation of studies with small sample sizes that are less likely to be published, the effect sizes of each study's outcome measures were plotted against the inverse of the standard error. The "trim and fill" method was used to calculate the effect of potential publication bias (Duval & Tweedie, 2000a, 2000b). We also computed the fail-safe number: the number of studies with null results that would have to exist to overturn the effect of book format on story comprehension and vocabulary to a level of no significance (Lipsey & Wilson, 2001).
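A minimal sketch of random-effects pooling and of the outlier rule described above is given below. It assumes the DerSimonian-Laird estimator of the between-study variance and 95% confidence intervals based on z = 1.96; the source does not state which estimator was used, and the function names and input values are hypothetical.

```python
import math


def pool_random_effects(effects, variances):
    """Pool effect sizes under a random-effects model.

    Uses the DerSimonian-Laird estimate of tau-squared (an assumption here).
    Returns (pooled effect, its standard error, tau-squared, Q).
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("Need at least two effects with matching variances.")
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau-squared to each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    pooled_se = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled_se, tau2, q


def is_outlier(effect, variance, pooled, pooled_se, z=1.96):
    """True if the study's CI does not overlap the CI of the pooled effect."""
    low, high = effect - z * math.sqrt(variance), effect + z * math.sqrt(variance)
    pooled_low, pooled_high = pooled - z * pooled_se, pooled + z * pooled_se
    return high < pooled_low or low > pooled_high


# Hypothetical Hedges' g values and variances for three studies.
pooled, pooled_se, tau2, q = pool_random_effects([0.1, 0.1, 0.1], [0.04, 0.04, 0.04])
```

When all studies report the same effect, as in this made-up input, Q is zero, tau-squared is truncated at zero, and the random-effects estimate coincides with the fixed-effect estimate.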
We report three types of heterogeneity measures. Significant Qs indicate that the separate effect sizes are heterogeneous; they do not estimate the same population's mean effect size (Lipsey & Wilson, 2001). I² tells what proportion of the variance is due to variation in real effects rather than sampling error (Borenstein et al., 2017). Prediction intervals estimated with tau-squared (τ²) give us a range within which we can expect the effects of future studies to fall based on our present evidence in the meta-analysis. If the prediction interval lies completely on the positive side favoring the digital book, we can conclude that despite varying effects, the digital format might be at least in some way beneficial in all contexts that are studied in the future. If the prediction interval includes zero, we can be less sure about this, although it should be noted that broad prediction intervals are quite common. To compute prediction intervals for the mean effect sizes in this meta-analysis, we used a spreadsheet prepared by Michael Borenstein (updated August 10, 2019).
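The relation between Q, I², and the prediction interval can be sketched as follows. This is not the spreadsheet referred to above; it assumes the commonly used formulas I² = (Q − df)/Q and a prediction interval of the pooled effect ± t × sqrt(τ² + SE²), with the critical t value (0.975 quantile, k − 2 degrees of freedom) supplied by the caller. All names and numbers are hypothetical.

```python
import math


def i_squared(q, k):
    """Percentage of observed variance attributed to real differences in effects."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def prediction_interval(pooled, pooled_se, tau2, t_crit):
    """Range in which the true effect of a comparable future study is expected to fall.

    t_crit is the critical t value for k - 2 degrees of freedom, looked up
    by the caller (for example, roughly 2.03 when k = 39).
    """
    half_width = t_crit * math.sqrt(tau2 + pooled_se ** 2)
    return pooled - half_width, pooled + half_width


# Hypothetical results: Q = 20 across 11 studies, pooled g = 0.3.
i2 = i_squared(20.0, 11)
low, high = prediction_interval(0.3, 0.1, 0.03, 2.0)
includes_zero = low <= 0.0 <= high
```

In this made-up case I² is 50% and the prediction interval runs from about −0.1 to 0.7, so it includes zero even though the pooled effect is positive.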
To explain heterogeneity, we tested several moderator variables: digital book features (story-related enhancements, dictionary, voiceover, highlighted print); whether or not children received the same adult guidance during both paper and digital reading or only during paper book reading; sample characteristics (SES, the age range of participants, number of subjects); research quality (random assignment); and publication characteristics (publication outlet, publication year). Moderator analysis was carried out by applying a meta-regression model or by contrasting subsamples. To avoid a lack of power in the search for differences between subgroups, we only contrasted subsamples when subgroups contained a minimum of four studies.

Characteristics of Studies
We found 39 studies in 30 articles/reports in which learning from a paper book is compared with learning from a digital book. These articles/reports included 1,812 children. They are marked with an asterisk in the References list and are included in Table 1 with descriptive information. The bulk of studies (n = 25) was carried out between 2010 and 2019, and for the greater part in the last 4 years (n = 16); only four studies appeared between 2002 and 2010. Most articles/reports (n = 21) originated in the United States followed by Canada (n = 4), Israel (n = 4), and the Netherlands (n = 4). Most studies (n = 23) were concerned with 4- to 5-year-old children; a smaller number also included 6-year-olds (n = 14), and very few focused on 1- to 3-year-olds (n = 2). Although some studies (n = 9) included children mainly from low SES families, most studies focused on children from middle or high SES families (n = 13) or families from low as well as middle/high SES (n = 14).
The internal validity of the studies was generally satisfactory. Only four articles/reports (seven studies) were estimated to have a high overall risk of bias. Figure 2 presents a summary of the authors' judgments broken down for each risk of bias criterion across all included studies (not weighted for sample size). The risks of deviation from intended interventions and bias in measurement were quite high. In more than half of the studies, optimal implementation of the intervention was at risk because the intervention took place in the classroom or family (e.g., Broemmel et al., 2015; Ihmeideh, 2014). In many studies, the bias in the measurement of the outcomes was judged as rather high because assessors were not blinded to the condition; the person who carried out the assessment was aware of the condition to which participants were assigned (e.g., Chiong et al., 2012; De Jong & Bus, 2002; Robb, 2010). On the other hand, the attrition rate was predominantly low, thus keeping the risk of bias due to missing data low. Likewise, most studies scored low on bias due to randomization. All studies were experimental, mostly using random assignment to the paper and digital condition at the level of the individual (n = 21). Fourteen studies used a within-subject design counterbalancing the paper and digital book (e.g., Lauricella et al., 2014). Last, there was not much evidence of selection bias due to paper and digital book group differences in gender or SES or other relevant variables at pretest.
More studies (n = 28) involved books from the commercial market than researcher-developed books (n = 11). In very few studies (n = 5), digital books only differed from paper books on account of the presence of the screen (e.g., Krcmar & Cingel, 2014; Strouse & Ganea, 2017). Most digital books included additional features. A minimal addition was a voiceover, an audio recording of the text in the book (n = 34). In nine studies, the voiceover was the only additional feature in the digital book. In most studies (n = 29), the print was visible in the digital book but not always highlighted while the text was read aloud (n = 16). Thirteen digital books were enhanced with a dictionary and no other enhancements. Eighteen books included story-related enhancements (e.g., expanding on the text or illustrations to support meaning-making), in addition to or instead of a dictionary. In many studies (n = 18), the book was presented on a touchscreen device. Particularly the books in older studies needed mediation with a computer mouse. The adult role varied across the studies: In a small number of studies (n = 6), adults were present but did not provide any support during the reading session. In 13 studies, the adult provided support in the print condition by, for example, answering questions or discussing the images, but not in the digital book condition. In 20 studies, an adult was present in both the paper and digital conditions and provided support when necessary. There were no significant effects of sample size or design (RCT vs. within-subject) on the two outcome measures. To test the effect of publication status on comprehension, we contrasted four unpublished studies with the rest (n = 22), but the difference was not significant (p = .221). Publication status could not be tested for vocabulary because all studies were published in peer-reviewed journals.
For vocabulary, we found a positive effect of year of publication showing higher effects favoring the digital books as the study was more recent (z = 2.84, p = .005), which means that over the years, the digital books had a stronger effect on vocabulary. This may indicate that the quality of the digital book has been improving. As we did not find a similar effect for studies targeting comprehension, the improvement may be confined to enhancements for vocabulary.

Medium Effects
Twenty-nine studies involving 1,192 children (n digital = 797, n print = 760) reported effects on story comprehension. The studies by Altun (2018), one of the two studies by Chiong et al. (2012), and Bus (2004) were considered outliers following Harrer and colleagues' criteria (Harrer et al., 2019) and excluded from further analyses. The remaining 26 studies show varying effects, sometimes favoring digital books and sometimes print books (see Figure 3). The average difference approached zero (g = −0.07), and the confidence interval (CI) included zero (95% CI: [−0.17, 0.04]), indicating that studies favoring paper were in balance with studies favoring digital. Given the broad prediction interval ranging from −0.47 to 0.34, we can be confident that results will also be heterogeneous in future scenarios. The funnel plot showed asymmetry around the point estimate. After imputing two studies with small sample sizes, the effect size slightly increased in favor of paper books from −0.07 to −0.10, 95% CI [−0.20, 0.02]. Zero was still included in the confidence interval, indicating that the difference favored neither paper nor digital books. The interaction between medium and genre was not significant. In studies that were carried out in a school setting, paper books outperformed digital books (k = 9, g = −0.28, 95% CI [−0.42, −0.15]), while studies at home (k = 5, g = 0.08, 95% CI [−0.17, 0.32]) or lab studies (k = 12, g = 0.06, 95% CI [−0.06, 0.18]) did not show this preference for paper books, Q(2) = 15.56, p < .001. This may indicate that digital books were the least useful in the mostly group-based reading sessions in schools. As the prediction interval (−0.49, −0.07) did not include zero, we can be confident that the print books' effect is robust in future studies in schools. Of the participant characteristics, only socioeconomic background interacted with the medium. In studies that included low SES families, paper outperformed digital, while in samples that mainly included middle or high SES families, digital and paper had the same effect, Q(2) = 6.71, p = .010. The prediction interval indicates that not all future studies will favor print books when low SES families are involved.

FIGURE 2. Risk of bias summary: authors' judgments broken down for each risk of bias criterion across all included studies (not weighted for sample size), created in RStudio (Harrer et al., 2019).

FIGURE 3. Forest plot for 26 studies contrasting digital and print books on story comprehension; positive scores indicate that digital books outperform print books; the diamond represents the overall effect and its confidence interval; created in RStudio (Harrer et al., 2019).
Meta-regressing comprehension on the median of the age range in individual studies did not reveal a significant effect (see Table 2).

Digital Design Effects
In some studies (k = 10), the paper and digital book formats were almost the same except for minimal additions to the digital book: a voiceover and/or (sometimes) highlighted print. In those cases, paper outperformed digital (k = 10, g = −0.22, 95% CI [−0.36, −0.08]). Future studies are also expected to favor paper books when the enhancements in digital books are minimal; the prediction interval ranged from −0.38 to −0.06. Most studies, however, focused on enhanced digital books (16 out of 26 studies), enabling children to interact with the book, thereby receiving additional information about story content or the meaning of infrequent words. When digital books included enhancements, paper no longer outperformed digital (k = 16, g = −0.03, 95% CI [−0.18, 0.11]), nor did digital outperform paper. As the prediction interval ranged from −0.43 to 0.37, it is quite possible that future studies will show similar variation, probably due to the type of enhancements. The shift away from favoring paper only approached significance, Q(1) = 3.48, p = .062.

Dictionary Effects
Only five studies focused on digital books that included a dictionary and no other enhancements. This small subset of studies did not show a preference for digital or paper (k = 5, g = −0.05, 95% CI [−0.32, 0.23]), and given the broad prediction interval ranging from −0.86 to 0.76, future studies are likely to show the same variation. The other 11 studies targeted enhancements that focused on the story, sometimes combined with a dictionary (n = 3). The preference for digital over paper increased if the enhancements did not include a dictionary. If the digital book included both enhancements, digital books tended to do worse than print books (k = 3, g = −0.20, 95% CI [−0.40, 0.01]). In contrast, digital books were more effective than paper books if the enhancements in the digital book concerned only story content and not word meanings (k = 8, g = 0.17, 95% CI [0.01, 0.32]). Given a prediction interval ranging from −0.03 to 0.36, we may expect that some future studies focusing on books with content-related enhancements might include outcomes that favor paper over digital, probably depending on how well enhancements tie in with the storyline. Enhancements targeting the story content supported story comprehension, but the enhancements interfered with comprehension of the digital book if they were combined with a dictionary. Considering the low number of studies in the subgroup that included both enhancements (k = 3), we cannot rely on the Q-statistic revealing a significant effect across these additions (Q[2] = 7.79, p = .020).

Adult Support Effects
Studies differed in adult support, which may have interacted with the medium (paper vs. digital) and enhancements in digital books. In follow-up analyses, we therefore tested the effects of enhancements controlling for adult support. In seven out of 26 studies, children received adult support in the paper book condition but not in the digital condition. In this small subset, the paper condition outperformed the digital condition, k = 7, g = −0.22, 95% CI [−0.38, −0.06], meaning that the enhancements available in this set of digital books did not outweigh the adult support while sharing a paper book. Given a prediction interval ranging from −0.43 to −0.01, we might expect similar outcomes in future studies. However, it should be noted that the digital books tested in the seven studies were far from optimal: only three books' enhancements targeted the story content while the other four books included a dictionary, alone or in addition to content-related enhancements. Next, we tested whether enhancements in digital books affected comprehension when adult support was the same in the paper and digital conditions. In most studies, both conditions involved adult support (n = 14), while in a few studies, neither the paper condition nor the digital condition involved adult support (n = 5). In this set of 19 studies, the paper book outperformed the digital book when the digital book was not enhanced (k = 10, g = −0.22, 95% CI [−0.36, −0.08]). Six out of 10 effect sizes were negative. The prediction interval ranged from −0.38 to −0.06, indicating a high probability that future studies will show negative effects. When, however, the digital book was enhanced with content-related enhancements, the difference favored the digital book (k = 5, g = 0.20, 95% CI [0.03, 0.36]). Four out of five effect sizes were positive. According to the prediction interval (−0.06, 0.46), we cannot exclude that future studies will show negative effects. If the digital book was enhanced with a dictionary, scores ranged around zero (k = 4, g = 0.04, 95% CI [−0.21, 0.28]). In other words, if the amount of adult guidance was the same in the digital and paper book conditions, then reading digital books without enhancements was less effective than reading paper books, but the balance shifted in favor of digital books, particularly when enhancements were content-related; Q(2) = 14.68, p < .001.

Medium Effects
Twenty studies reported effects on vocabulary. After excluding four outliers (Harrer et al., 2019), these studies included 881 children (n digital = 557, n print = 488). Focusing on vocabulary tests, the random effect on language was positive and significantly different from zero (k = 18, g = 0.20, 95% CI [0.08, 0.32]), which implies that digital books were more effective for vocabulary development than paper books (see Figure 4). Given the broad prediction interval ranging from −0.09 to 0.49, which extends below zero, we cannot be overly confident that the positive effect we found for digital books is robust in every context. It is quite possible that digital books will not yield positive effects in some future scenarios. The fail-safe number representing the number of studies required to refute the significant meta-analytic mean equaled 55. To correct for asymmetry around the point estimate, six studies with small sample sizes were imputed. As a result, the effect size dropped from 0.20 to 0.09, which was no longer significantly different from zero (95% CI = [−0.04, 0.23]). The interaction between medium and genre was significant. When nonfiction books were included, the differences were more in favor of digital than with fiction, Q(1) = 7.87, p = .005. The 12 studies with only fiction did not reveal differences between paper and digital (k = 12, g = 0.09, 95% CI [−0.03, 0.22]), and according to the prediction interval (−0.09, 0.27), we might expect similar results in future studies. The six studies that also included nonfiction showed a higher score in favor of digital (k = 6, g = 0.42, 95% CI [0.23, 0.61]). The prediction interval (0.15, 0.69) predicts a similar result for future studies. Apparently, digital books were more suitable than paper books to highlight complex concepts in nonfiction books. Location and child characteristics (SES, age) were not related to the outcomes.

Digital Design Effects
When the print was visible in digital books, enabling children to see the word's orthography, digital outperformed paper (k = 13, g = 0.25, 95% CI [0.13, 0.37]), and future studies might show the same result (the prediction interval was positive, ranging from 0.11 to 0.38). At the same time, the outcomes were highly variable when print was not visible in digital books (k = 5, g = 0.12, 95% CI [−0.15, 0.39]). The visible print was in 10 out of 13 cases highlighted while the text was read aloud. When digital books included story-related enhancements, attracting children's attention to the story, word learning from digital no longer outperformed paper (k = 7, g = 0.10, 95% CI [−0.09, 0.28]). The prediction interval ranging from −0.36 to 0.56 indicates that we might expect the same variation in future studies. Digital books without story-related enhancements, by contrast, were more facilitative of word learning (k = 11, g = 0.29, 95% CI [0.14, 0.45]), and it is quite possible that we will find the same outcomes in future studies given a prediction interval between 0.11 and 0.47. However, the interaction between the medium and story-related enhancements in digital books was not significant, Q(1) = 2.53, p = .112.

FIGURE 4. Forest plot for studies contrasting digital and print books on vocabulary; positive scores indicate that digital books outperform print books; the diamond represents the overall effect and its confidence interval; created in RStudio (Harrer et al., 2019).

Dictionary Effects
Ten of 18 studies included a dictionary (k = 10, g = 0.20, 95% CI [0.03, 0.38]), suggesting a positive effect of such digital books on word learning. Given the broad prediction interval (−0.29, 0.69), which extends well below zero, we cannot be overly confident that the positive effect we found for a dictionary is robust in every context. It is quite possible that such digital books will not outperform print books in some future scenarios. Further exploration indicates that digital outperformed paper when the digital book had a dictionary and no other enhancements (k = 4, g = 0.49, 95% CI [0.23, 0.74]), even though the prediction interval (−0.06, 1.04) does not completely exclude that paper outperforms digital in future studies. The contrast between digital and paper was no longer significant when the book included other enhancements in addition to the dictionary (k = 6, g = 0.09, 95% CI [−0.11, 0.28]). Given the broad prediction interval (−0.45, 0.63), we may expect the same in future studies. With a dictionary alone, digital was more beneficial for word learning than paper, but this effect disappeared when the digital book also included story-related enhancements, Q(1) = 5.83, p = .016. To explain these findings, we may assume that a dictionary was no longer beneficial when words were only occasionally explained, as may be expected when enhancements also targeted the story content (see Table 3 for details).

Adult Effects
Controlling for adult guidance was not possible due to the small number of studies.

Discussion
This meta-analysis aimed to establish whether the medium (paper vs. digital), in and of itself, has a substantial effect on children's story comprehension and vocabulary learning and, if it does, whether it is moderated by the enhancements available in digital books, including story-related enhancements and/or a dictionary, and/or adults' presence during the reading sessions. We found that when the paper and digital versions of the story are practically the same and only differ by the voiceover or highlighted print as additional features in the digital book, then paper outperforms digital. Previous meta-analyses targeting more advanced readers (Clinton, 2019; Delgado et al., 2018) found an interaction between the screen inferiority effect and the reading genre in that screen inferiority was evident for expository texts but not for fiction. Our meta-analysis does not reveal a similar effect for story comprehension, probably because nonfiction at this early age often provides information that is mostly embedded in a narrative storyline, which makes fiction not very different from nonfiction. However, we found an interaction for vocabulary learning in a different direction than that reported in Delgado et al.'s (2018) meta-analysis, which showed screen inferiority for nonfiction, but not for fiction. In the six studies that included nonfiction books, core words to convey information such as "privacy" or "cub" were targeted by the digital enhancements, which may explain why children show a higher score on vocabulary in favor of digital. Our findings are in line with studies comparing the interactions between adults and children while sharing a digital or print book. Several studies showed that conversations during digital book reading were dominated by talk about the device or the child's behavior rather than the story content as is common with print books (e.g., Chiong et al., 2012; Parish-Morris et al., 2013; Richter & Courage, 2017).
Low SES children have more difficulties comprehending digital books than print books, possibly because children from low SES families are more used to game-like activities when interacting with digital devices (Bus & Neuman, 2009). As a result, they may target the interactive features in digital books and pay less attention to the story content. The screen-inferiority effect is strongest in a school context probably because in school, sessions are mostly group-based, and group-based sessions are difficult to reconcile with the digital book format that often includes interactive enhancements (Hoel & Tønnessen, 2019).
The screen inferiority effects can be moderated or overcome by the design of the digital books and/or the adults' mediation. When enhancements target the story content, for instance, by prompting children's background knowledge and/or providing additional explanations of events, these books not only outweigh the negative effects of the digital device on story comprehension, but they even outperform print books. On the other hand, a dictionary has no or negative effect on children's story comprehension, indicating that focusing attention on word meanings distracts children's attention from the story content. When adult mediation is the same in print and digital, digital books with content-related enhancements stimulate children's story comprehension more than paper books. In line with a previous meta-analysis (Takacs et al., 2014), we expected that the adults' mediation during the reading of paper books would be as effective as the enhancements in digital books read by children independently. However, the current findings suggest that adult guidance outperforms the effects of enhanced digital books. This result may reflect the rather low quality of enhancements in the digital books in the small subsample enabling a comparison of adult guidance with enhanced digital books read independently. In only a few studies were the enhancements optimal (i.e., just targeting the storyline and not the vocabulary).
For enhancing children's vocabulary, digital books are more effective than paper books. This is especially the case for digital books that include a dictionary that defines infrequent words and expressions. The dictionary is most effective when used on its own and not in combination with other content-related enhancements, which corroborates the theory that it is hard to combine both activities: engaging in a story and concentrating on word meanings. Nonfiction books seem to be particularly helpful for learning new words, most probably because the enhancements in these books aim at teaching new concepts such as "privacy" (e.g., in Zhang-Kennedy et al., 2017, the story is about Cyber heroes maintaining their secret identities on the Internet), thus making word learning a natural component in the digital books. Interestingly, children's learning of new vocabulary is further promoted when the words' orthography is visible like in the paper book and simultaneously highlighted, as was the case in 10 out of 13 studies. This finding aligns with Rosenthal and Ehri's (2008) conclusion that the word's orthography supports word learning.

Theoretical Implications
The finding that the device may negatively interfere with meaning-making is in line with the cognitive load theory (Kahneman, 1973) and Mayer's (2009) model of multimedia learning, positing that the human information processing system has a limited capacity. According to this theory, each channel of information processing (audio or visual) has a limited capacity, and an overload of information interferes with learning. In the studies examined in this meta-analysis, the device seems to attract young children's attention at the expense of attention paid to the storyline, even when the content of the paper and digital books was the same. The parsimonious resources available for processing the main information in picture books (the central narrative) may have been misallocated to the means of achieving it (e.g., point, click, and swipe), thus hampering meaning-making. Another source of interference may be that children expect interactivity because they are accustomed to game-like activities and are actively searching for such possibilities, which distracts their attention from the story. Thus, the increased demand placed on cognitive resources when children read digital books might be a performance disadvantage (Fisch, 2000).
The same theory can explain why the design of some digital books can interfere with story comprehension more than others. When the enhancements are aligned with the story content, they contribute to meaning-making, but enhancements that do not support the storyline distract children and diminish their meaning-making. Albeit novel and exciting, additional information such as word definitions may quickly overload the human capacity to learn. Processing word definitions means that cognitive resources are no longer available for processing the storyline. Consequently, a dictionary may limit meaning-making. In the same vein, the capacity model, first introduced by Fisch (2000) in relation to TV-based narratives and the influence of their design on children's learning, explains why some enhancements negatively affect meaning-making. Fisch's (2000) model introduced the notion of distance that he defines as the "degree to which the educational content is integral or tangential to the narrative" (p. 64). If we apply this model to our findings, we can conclude that if the distance between the narrative and the enhancements in digital books is small, both can complement one another rather than compete for cognitive resources, and thereby increase children's meaning-making of the narrative. Following this line of reasoning, we could expect that enhancements that are close to the main narrative can promote children's engagement and support greater story comprehension.
Originally proposed by Vygotsky (1978), sociocultural theory views learning and development of higher mental functioning, such as the cognitive and language skills necessary for reading, as a collaborative product of social interaction between children, adults, and sociocultural tools, such as written systems in the forms of books. In line with this theory, we hypothesized that enhancements in digital books, often modeled after adult guidance, may either outweigh or further support adult mediation. While the prompts embedded in a digital book might scaffold a child's understanding and cognitive learning outcomes such as vocabulary learning, the target books in the current set of studies appear insufficient for the complexity of a sociocultural learning process. The current findings show that children's meaning-making benefits more from adult mediated print book sessions than from enhanced digital books that children read on their own, thus suggesting that adults are better able to attune their support to children's needs. However, interpreting this finding, we also need to consider that the digital books in the current set of studies were not optimally enhanced, thus thwarting a fair comparison of digital enhancements with adult mediation. Most of the books were not equipped with an enhancement that supports children in comprehending the storyline. Given the digital reading format's flexibility for redesign and the current technology advancements (e.g., machine learning and artificial intelligence may enable new forms of interactivity), more research that compares adult mediation with more optimally designed digital books is needed.

Study Limitations
It matters which enhancements are embedded in digital books when comparing reading on paper versus on-screen. We could make a rather rough distinction between content-related enhancements and a dictionary and show different effects but failed to make more fine-grained distinctions. For instance, content-related enhancements may differ in how coherent they are with story content. This might be a core element for meaning-making (see Christ, Wang, Chiu, & Cho, 2019; Kucirkova, 2019). Due to the small numbers of studies, we were unable to test the effect of such a distinction. Second, due to the limited set of enhancements in the meta-analyzed studies, the potential of digital books may be undervalued by the current findings. For instance, embodied actions may raise the empathy-building potential of narratives and have a powerful effect on meaning-making (Kucirkova, 2019). Children may remember a story better if they can physically manipulate objects referred to in the story, thus stimulating the user's empathy with the character's doubts and feelings, potentially deepening story comprehension. Third, the enhancements in the current set of studies limited the comparisons between children's reading with adult support and children reading digital books independently. Although in the current set of studies enhancements did not outweigh adult guidance, future studies should further explore whether a sensible system of digital storytelling techniques may be powerful enhancements of picture book apps that outweigh or complement adult guidance. Fourth, in the studies that we meta-analyzed, only a few authors included reading motivation as an outcome measure, even though we may expect that interest in reading is one of the main outcomes of book reading (e.g., De Bondt et al., 2020).
Last, most studies involved children in the age range of 4 to 5 years, making it hard to generalize the current findings to infants or the older children who are at the transition from emergent to conventional readers.

Practical Implications
Reading stories on a digital device means managing the device, which may negatively affect processing the story content (cf. Lauricella et al., 2014). Nevertheless, digital books for young children can outperform paper books when they meet a minimum quality standard, that is, when they include enhancements that increase children's meaning-making of the narrative, for instance, by prompting children's background knowledge or providing additional explanations of story events. Based on inspiring prototypes grounded in multimedia learning theory, the research suggests that digital techniques can create new possibilities. However, designers need to be selective with the type of enhancements they include in the books. With a few exceptions, the commercially published books in the current set of studies did not involve digital storytelling techniques that are similar to what adults do when they share a story with a child, such as attracting children's attention to the main story elements and thus focusing their attention on the chain of story events (e.g., Bus et al., 2015; Eng et al., 2019).
Our meta-analysis showed that some enhancements help children's word acquisition but not meaning-making. Notably, a dictionary in digital books that defines words and expressions rare and unknown to a young reader promotes vocabulary learning but harms meaning-making. The finding that the presence of rather common enhancements such as a dictionary may interfere with processing the storyline points to the need for a careful balance between book reading's main aims. Despite the importance of vocabulary development, it seems preferable to include enhancements that align with the main aim of book reading: meaning-making and elaborating on the story content. In other words, digital book designers need to be careful with popular additions that may be helpful for isolated outcomes such as vocabulary learning, but not for the reading session overall. The practical recommendation that we can infer from the findings for teachers and parents/caregivers is to pay close attention to the design of books they use with young children and select books with content-related enhancements. Seemingly small and attractive design differences, such as the presence of a dictionary, may hinder children's meaning-making. Our findings could extend existing guidelines for designers and policymakers (e.g., The International Collective of Research and Design in Children's Digital Books) and inform new policy documents for educators and educational professionals.

Future Directions
Future research needs to explore books that include more sophisticated digital storytelling enhancements. The efficacy of digital books strongly depends on the quality of digital storytelling enhancements. In new experiments, the selection of enhancements should be carefully considered in light of the cognitive load theory and the body of evidence concerning the distracting and enabling effects of digital books' enhancements. The creation of picture book apps equipped with new digital storytelling techniques requires, in addition to an author and illustrator, the involvement of competent designers creative in finding ways to support story comprehension and arouse the readers' curiosity about what will happen while children are reading the story.
Most studies in the digital reading domain focus on effects concerning children's learning rather than reading routines even though we may expect that, due to a transition from print to digital, not just the reading materials will change but also the established home and school routines of book reading sessions. Digital books may significantly change established reading sessions, given that they include a voiceover and other digital features that make the story content accessible without adult mediation. It follows that digital books may elicit forms of book reading that are uncommon with print books, such as, for example, increased repetition of favorite texts, leading to children's greater exposure and discovery of more layers in stories. This is an exciting prospect for both research and practice of children's digital books.

Note
This study was supported by Grant No. 275576 from the Research Council of Norway.