Distant reading 940,000 online circulations of 26 iconic photographs

How do digital media impact the meaning of iconic photographs? Recent studies have suggested that online circulation, especially in a memeified form, might lead to the erosion, fracturing, or collapsing of the original contextual meaning of iconic pictures. Introducing a distant reading methodology to the study of iconic photographs, we apply the Google Cloud Vision Application Programming Interface (GCV API) to retrieve 940,000 online circulations of 26 iconic images between 1995 and 2020. We use document embeddings, a Natural Language Processing technique, to map in what contexts iconic photographs are circulated online. The article demonstrates that constantly changing configurations of contextual imagetexts, self-referential image-texts, and non-referential image/texts shape the online life of iconic photographs: ebbs and flows of slowly disappearing, suddenly resurfacing, and newly found meanings. While iconic photographs might not need captions to speak, this article argues that a large-scale analysis of texts can help us better grasp what they say.


Introduction
A naked girl running down a road, screaming in pain after a napalm attack; a Zeppelin going up in flames; a Cuban revolutionary looking into the future; an astronaut taking man's first steps on the moon; a protester standing defiantly in front of a squadron of tanks; a toddler lying face down on a beach. For many people, these phrases will immediately conjure up the iconic photographs which they describe. Images which, in the often-quoted definition by Hariman and Lucaites (2007: 27), are "widely recognised and remembered, are understood to be representations of historically significant events, activate strong emotional identification or response, and are reproduced across a range of media, genres, or topics." These pictures, according to Dahmen et al. (2018: 265), have come to the "forefront of our collective, visual public consciousness to become the defining, enduring image of an event." In addition, some researchers have ascribed enormous political power to iconic photographs: they are presented as having stopped the Vietnam War and caused a change in European and Canadian immigration policy (Chong, 1999; Durham, 2018).
Although it was fully developed around 2000 (Brink, 2000; Hariman and Lucaites, 2001; Perlmutter, 1998), when the advent of digital media was already clear, the concept of the iconic photograph has often been connected to the dominance of top-down mass media, such as the newspaper, the illustrated magazine, and television, in the three decades after the Second World War (Boudana et al., 2017; Hariman and Lucaites, 2018b). More recently, scholars have begun to debate the effects of digital media on the creation, selection, distribution, reception, and meaning of iconic photographs (Boudana et al., 2017; Dahmen et al., 2018; Durham, 2018; Hariman and Lucaites, 2018b; Ibrahim, 2016; Merrill, 2020; Mielczarek, 2020b; Mortensen, 2017; Olesen, 2018). Calling into question the "pre-digital typology of iconic images" (Mielczarek, 2020b), those scholars investigate what happens to older iconic photographs when they are circulated online and how digital media impact the formation and dissemination of new iconic images. While answers to these questions vary, scholars agree that the digital circulation of digitized and born-digital iconic pictures often leads to the "trivializing" (Boudana et al., 2017), "decontextualizing" (Mortensen, 2017), "eroding" (Dahmen et al., 2018), "fracturing" (Mielczarek, 2020a), or "collapsing" (Merrill, 2020) of the original contextual meaning of iconic images.
These discussions hark back to earlier debates about how iconic photographs get their meaning and if, and how, this meaning can change. Sontag (1977: 84) influentially argued that "words speak louder than pictures". Because every photograph carries a "plurality of meanings," captions, or maybe more generally, textual narratives are needed to explain them. Texts thus tend to "override the evidence of our eyes." In contrast, arguing against the widespread notion that photographs depend on captions for their meaning, Hariman and Lucaites (2007: 1, 19) famously noted that iconic photographs "bear witness to something that exceeds words". Studying the visual rhetoric of this class of pictures, they described them as visual "mediation[s] of an important question for public life" and emphasized their "visual eloquence": the fact that they do not need captions to speak.
The widespread digital circulation of iconic photographs offers the opportunity to study the relation between image and text at scale. It has been widely recognized that the iconic status of a picture, both offline and online, depends on its sustained and frequent circulation (Dahmen et al., 2018; Hariman and Lucaites, 2007). Most studies of iconic photographs, however, are based on a (close) reading or analysis of the iconic picture itself or of a limited sample (in number, period, and geographic distribution) of circulations, remediations, or appropriations (see Van der Hoeven, 2019 for an overview). Introducing a "distant reading" (Moretti, 2000, 2005) method to the study of iconic images, this article applies the Google Cloud Vision Application Programming Interface (GCV API) to retrieve 940,000 online circulations of 26 iconic photographs between 1995 and 2020 (Table 1). We use document embeddings, a recently developed Natural Language Processing (NLP) technique, to analyze the relationship between the iconic photographs and the text that surrounds them on the webpage.
After sections on iconic photographs in the digital age, data, and methods, we build on the work of Mitchell (1994, 2018) to show that the 26 iconic images circulate online in changing configurations of contextual imagetexts, self-referential image-texts, and non-referential image/texts. The article demonstrates that the meaning of iconic photographs in an online environment is determined by the "interaction" (Mitchell, 1994) between text and image. It also shows that, for the 26 iconic photographs studied in this article, digital circulation did not cause fundamental shifts in meaning. Contrary to the view that the digital circulation of iconic photographs is progressively dominated by memes and other decontextualized appropriations, our research reveals ebbs and flows of slowly disappearing, suddenly resurfacing, and sometimes newly found meanings. These ebbs and flows show that processes of online circulation, memeification, and commodification mostly do not threaten the original meaning of the iconic photographs but rather reproduce the dominant message(s) that made them iconic in the first place.

Iconic photographs in the digital age
Iconic photographs do not emerge in a "cultural and cognitive vacuum" (Dahmen et al., 2018: 273). Starting their life as images of newsworthy events, they have to move to another field, that of the iconic picture, to keep being circulated. In this process, they acquire a wider meaning than they had in the original context of being a visual representation of the news. Analyzing the iconic photograph of Neda Agha-Soltan, who was shot during the 2009 Iranian election protests, Assmann and Assmann (2010) note that all iconic images start their life as being of something but end up as being images for something. Dahmen et al. (2018: 271) describe the same process as a "symbolic collapse of ideas, historical events, and sentiments to an exemplary form." An iconic photograph becomes a "metonym that stands in for larger, more complex phenomena." The process can be illustrated by looking at one of the most frequently discussed iconic images: Nick Ut's 1972 accidental napalm or "napalm girl" photograph. Sturken (1997) argues that this highly disturbing image of an innocent victim of war quickly acquired a symbolic meaning as a serious indictment of the United States' methods of conducting war. Some iconic photographs gain an even more generalized or even universal meaning as symbolic visual representations of (specific) parts of the human condition. Harris (2019) argues that Ut's picture became emblematic for the effect of war on the innocent. Likewise, Lee-Koo (2018) asserts that the picture has come to speak for the human costs of war in general.
Most studies on iconic photographs describe the process of iconization as a one-way, progressive move from a short-lived, specific, and contextual meaning in the realm of news to a generalized, symbolic, or even universal meaning in the realm of iconic pictures. According to Sturken (2018), this sequence makes iconic pictures "paradoxical cultural and social objects . . . the more iconic they become, the less specific they remain" (p. 314). Boudana et al. (2017) similarly describe the process of iconization as a paradox: "The more an iconic photo is circulated, the more it is recognized as iconic, yet the more it may become devoid of the significance that made it iconic in the first place" (p. 1227). Sustained and widespread circulation is presented as both the cause and the result of the collapse of context and generalization of meaning.
To some researchers, the new digitalized media ecology, where diffusion takes place on a decentralized level and appropriation is made easy by photo-editing software, allows iconic photographs to take on more diverse political meanings (Olesen, 2018). However, most others present the affordances of digital media as threatening to overburden the iconization process. Digital circulation can overload the trade-off between widespread circulation and collapse of meaning to a dangerous extent. Memeification (Jensen et al., 2020) of photographic icons is seen as the most extreme manifestation of this process. According to recent scholarly work, memes can collapse the "original historical and biographical contexts" of an iconic photograph (Merrill, 2020: 115); they threaten to destroy its original meaning and, as a result, its "political and ethical significance" (Boudana et al., 2017: 1228); they decontextualize the iconic image and "divorce [it] from the political" (Ibrahim, 2016: 585), "poach[ing]" the original meaning of the icon and "supplement[ing] it with new interpretations that typically deviate from the main narrative behind the famous image" (Mielczarek, 2020b).
This collapse of meaning and context is often linked to the diminished capacity of mass media gatekeepers, such as photojournalists, news agencies, photo-editors, book publishers, and award committees, to shape the creation, selection, distribution, and meaning of icons. Online audiences are no longer passive recipients, but rather actively interact with iconic photographs (Perlmutter, 2003). Using digital media to create and distribute new photographic pictures, as well as recreate, photoshop, and re-work existing ones (Olesen, 2018), digital audiences can "circumvent" (Dahmen et al., 2018) the traditional mass media and "infiltrate" the iconization process (Mortensen, 2017). Finally, digital media are also presented as diminishing the capacity of photographs to stay iconic. While the affordances of "connective" digital media cause images to become iconic more quickly, these "instant news icons" (Mortensen, 2016), "hyper icons" (Perlmutter, 2003), or "speeded-up icons" (Dahmen et al., 2018) are believed to be quickly consumed and forgotten on the Internet.

Data
Researchers studying iconic photographs have made a distinction between national and global icons, while also noting the possible overlap between these two sets of images. For example, Paul (2009) makes a distinction between German and global photographic icons and Cohen et al. (2018) study which national and international photographs are remembered by a Jewish-Israeli audience. Van der Hoeven (2019) set out to discover if some iconic photographs are part of global visual memory: "a limited set of images that people all over the world have seen and remember" (pp. 107, 51). Based on a review of "works on historical photographs by historians, media scholars, and other academics," he selected 22 photographs that are often described as iconic. In addition, he included four photographs that figured less prominently in academic debates but are described as iconic in certain non-Western countries: Gandhi and the spinning wheel, Founding of the PRC, Assassination of Inejiro Asanuma, and Allende's last stand (Table 1).
Similar to Cohen et al. (2018), Van der Hoeven used an online survey, distributed to a controlled group of 3000 respondents around the world, to study which photographs were globally recognized. Three photographs (a man on the moon, Guerrillero heroico, and Hijacked airplane) were recognized by more than two-thirds of the respondents. We decided to use Van der Hoeven's (2019) list of 26 photographs as our starting corpus because it is based on academic discussions of iconic photographs (Table 1). It is certainly not exhaustive: it does not claim to contain all, the most recognized, or the most widely distributed iconic photographs. Furthermore, as Van der Hoeven follows the definition of Hariman and Lucaites (2007), the list only contains photographs of important political events and not of celebrities, such as Marilyn Monroe, which are sometimes described as iconic. However, the list serves the purpose of this research: to study patterns of meaning in the online life of photographs that are often described as iconic.
We retrieved the 940,000 online circulations of the 26 photographs in a two-part process. First, an image is uploaded to the GCV API, which enables users to apply computer vision techniques in the cloud. We rely on the basic functionality of the API to find full and partial circulations of the uploaded image on the web. The API returns a list of web addresses (URLs) that contain the iconic photograph. Because the version of the picture on these URLs often differs slightly from the version we uploaded, for example in its dimensions, it can be used as input for a second iteration. Using this iterative process, we find not only more circulations but also less recent ones (Smits and Ros, 2020b). The second part of the pipeline includes several methods to collect information from the URLs returned in the first part, such as HTML time-tags, the language of the webpage, and the text surrounding the iconic photograph.
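The iterative first part of this process can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline that was actually used: `find_matches` is a hypothetical stand-in for a reverse image lookup (such as a call to the GCV API) that takes an image reference and returns pairs of a page URL and the image version found on that page. The toy index and all names in the sketch are invented for illustration.

```python
from typing import Callable, Iterable, Set, Tuple

# (page URL, reference to the image version found on that page)
Match = Tuple[str, str]


def collect_circulations(
    seed_image: str,
    find_matches: Callable[[str], Iterable[Match]],
    rounds: int = 2,
) -> Set[str]:
    """Collect page URLs by feeding newly found image versions back into the lookup.

    `find_matches` is a hypothetical lookup function, not a real API call.
    """
    page_urls: Set[str] = set()
    queried: Set[str] = set()
    frontier = {seed_image}
    for _ in range(rounds):
        next_frontier: Set[str] = set()
        for image in frontier:
            queried.add(image)
            for page_url, image_version in find_matches(image):
                page_urls.add(page_url)
                # Only image versions not yet used as a query become new input.
                if image_version not in queried and image_version not in frontier:
                    next_frontier.add(image_version)
        if not next_frontier:
            break
        frontier = next_frontier
    return page_urls


# Toy index standing in for the web; purely illustrative data.
TOY_INDEX = {
    "seed.jpg": [
        ("https://example.org/a", "a_small.jpg"),
        ("https://example.org/b", "seed.jpg"),
    ],
    "a_small.jpg": [
        ("https://example.org/a", "a_small.jpg"),
        ("https://example.org/c", "c_crop.jpg"),
    ],
}


def toy_find_matches(image: str) -> Iterable[Match]:
    return TOY_INDEX.get(image, [])


first_pass = collect_circulations("seed.jpg", toy_find_matches, rounds=1)
second_pass = collect_circulations("seed.jpg", toy_find_matches, rounds=2)
```

In this toy example, the second iteration surfaces a page that the first query alone does not return, which mirrors why a slightly different version of a picture is useful as new input.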
Although our pipeline is able to find almost a million online circulations, the data aggregation and curation is limited by several factors. Most prominently, the GCV API is proprietary, which prevents us from properly evaluating the methods it uses to retrieve circulations of images. However, because Google's algorithms seek to optimize user experience, we are confident that our method extracts a significant number of relevant circulations of the iconic photographs in our corpus. This does mean that our method will be biased toward recent websites, as these will be more relevant to Google's users. This explains the relatively high number of URLs in our dataset in 2019 and 2020 (Figure 1). As a result, the dataset becomes relatively less representative the further back in time we go, meaning that it gives a reliable snapshot of the online circulation of pictures in our corpus for the last few years, but provides a less complete representation for the late-2000s.
Next to the year of publication, we can use other metadata elements to describe where on the Internet the iconic photographs can be found. Analyzing the top-level domains (Figure 2(a)), it becomes clear that social image-sharing network Pinterest, the social media platforms Facebook and Twitter, and the blog platforms Blogspot and WordPress are the most important. This supports Perlmutter's (2003) notion that Internet users play an important role in the online circulation of iconic photographs. However, an analysis of top-level domains only provides one measure of importance. The role of top-down mass media outlets, such as influential newspapers, is limited if we only look at the total number of URLs. At the same time, a single URL could attract a lot of traffic. As we did not have information on the number of times the URLs in our dataset were visited, the role of traditional top-down media in the online reception, as opposed to the circulation, of iconic pictures was impossible to assess. In addition, we can use a language identifier to map the most-used languages. Next to English (38.9%), we can find Spanish (10.4%), Chinese (6.6%), Russian (4.7%), and French (3.8%) in the top-five.
Concerning the textual data, the parsing of the texts on the retrieved URLs introduces some "noise" in our dataset. Because the retrieved websites are highly heterogeneous in form, we could not rely on algorithms that are able to extract relevant content from more homogeneous sets of web pages. Therefore, we had to extract text from HTML elements that commonly contain text, such as <p>, which denotes a text paragraph, or <h1>, which is often used for titles and headers. This somewhat blunt method catches all relevant textual data, but also introduces noise, like the website form and menus and possibly textual advertisements. By releasing our dataset, we hope to enable sound (historical) comparisons in the future (Smits and Ros, 2020a).
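The kind of element-based extraction described above can be sketched with Python's standard library. This is an illustrative assumption about how such a parser might look, not the parser used to build the dataset: the set of tags, the class name, and the sample page are all hypothetical, and the sketch assumes that the text elements are properly closed.

```python
from html.parser import HTMLParser

# Elements treated as text-bearing; an illustrative choice, not the authors' list.
TEXT_TAGS = {"p", "h1"}


class TextTagExtractor(HTMLParser):
    """Collect the text found inside <p> and <h1> elements (assumes closed tags)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._depth = 0      # how many text-bearing elements are currently open
        self._buffer = []    # text pieces of the element being read
        self.chunks = []     # one cleaned string per extracted element

    def handle_starttag(self, tag, attrs):
        if tag in TEXT_TAGS:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in TEXT_TAGS and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join the pieces and collapse runs of whitespace.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.chunks.append(text)
                self._buffer = []

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def extract_text(html):
    parser = TextTagExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


sample_page = (
    "<html><body><nav>Home</nav><h1>Tank Man</h1>"
    "<p>Taken in <b>1989</b> in Beijing.</p><div>advertisement</div></body></html>"
)
extracted = extract_text(sample_page)
```

Because menus, forms, and advertisements are often marked up with the same elements, a parser of this kind will also pick up the noise discussed above.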

Methodology
Our dataset provides a unique opportunity to study the relation between iconic photographs and texts because the same 26 pictures interact over and over again with different words. We use the work of visual culture studies scholar WJT Mitchell to describe how the meaning of iconic photographs is constructed. Mitchell (1994) argues that the meaning of a picture can be understood by analyzing how it interacts with surrounding (con)text. He identified three main types of the intertwined "dialectical constellations" between images and texts: "'imagetext' (if word and image are seamlessly united), image-text (if they are distinct but connected), and image/text (if they are in conflict or tension)" (Mitchell, 2018: 231). Using Mitchell's concepts, we see iconic imagetexts as constellations where the text refers to the original historical event. For example, the accidental napalm photograph is circulated to say something about the Vietnam War. Iconic image-texts are constellations that refer to concepts that fall outside what is shown on the image but are still connected to it. These combinations are often self-referential, referring to iconicity itself, image manipulation, or the power of photography. Third, iconic image/texts display a tension between the image and the text. Memeified versions fall within this category.

Figure 1.
The plot shows that the Google Cloud Vision API is biased toward more recent results. Dates are parsed using the htmldate Python module that reports an accuracy of 0.893 in identifying dates of publication (Barbaresi, 2020).

Figure 2.
Top 20 top-level domains and their relative prominence in the dataset for all the images, except Guerrillero heroico (2a) and for Guerrillero heroico (2b).

While there are thousands of beautiful, shocking, and powerful images being made each day, iconic photographs stand out because they are also widely disseminated, or, as Dahmen et al. (2018: 265) describe it, they are "prominent quantitatively (in a number of occurrences across many sites over time)". Van der Hoeven (2019) notes that this central aspect of iconicity could, previously, only be studied on a small scale as there were no feasible techniques to retrieve and analyze a sufficiently large set of circulations of iconic photographs. For example, Hariman and Lucaites (2003) analyzed the original accidental napalm photograph and several appropriations. Boudana et al. (2017) studied 34 different memes of the same image. Mielczarek (2020b) looked at "dozens" of memes of the situation room photograph. Olesen (2018) looked at a sample of 20 different versions of the Alan Kurdi photograph. Basing her research on the largest sample, Mortensen (2017) studied 656 circulations of 147 different appropriations of the same image.
First coined by Moretti (2000) in literary history, "distant reading" refers to a set of methodologies in the (digital) humanities that apply computational methods, often derived from NLP, to identify and analyze patterns in a large collection of digital or digitized texts (Underwood, 2017). We argue that we can use distant reading methods to analyze the interaction between iconic photographs and text at scale and identify in what contexts iconic photographs are circulated online. Several NLP techniques, such as Latent Dirichlet Allocation (LDA), also known as topic modeling, could have been used to gain insight into this semantic context (Blei et al., 2003; Tolonen et al., 2019). After some exploratory experiments, we decided to focus our research on the English-language websites in our corpus (38.8%) and use the recently proposed top2vec technique instead of LDA (Angelov, 2020; Smits and Ros, 2020b). Top2vec, which uses joint document and word semantic embedding and Hierarchical Density-Based Spatial Clustering (HDBSCAN), is not only more versatile than LDA, but also allows for a more data-driven selection of the optimal number of textual topics. However, the use of HDBSCAN in the top2vec method results in so-called hard clustering, which means that every document has to be assigned to exactly one cluster. Because of the heterogeneous nature of our corpus, we decided to use a Gaussian Mixture Model (GMM) instead. GMM models the data as a superposition of Gaussian distributions, one per cluster: for every document, the probability of belonging to each of the K clusters is calculated, resulting in a probability distribution. Exploratory experiments showed that the hard clustering of HDBSCAN obscures the self-referential language of iconicity (see next section). While GMM groups words such as "photograph, iconic, and famous" together, HDBSCAN disperses these words over other clusters, obscuring their overall significance (published paper of the authors, known to the editors). We have released the results of both clustering methods with our dataset (Smits and Ros, 2020a).
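The difference between soft and hard clustering can be made explicit with the standard formulation of a Gaussian mixture. The notation below is our own assumption for illustration and does not appear in the original text: x_d is the embedding of document d, and pi_k, mu_k, and Sigma_k are the weight, mean, and covariance of cluster k.

```latex
% Soft assignment: probability that document d belongs to cluster k
\gamma_{dk} = p(z_d = k \mid x_d)
  = \frac{\pi_k \, \mathcal{N}(x_d \mid \mu_k, \Sigma_k)}
         {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_d \mid \mu_j, \Sigma_j)},
\qquad \sum_{k=1}^{K} \gamma_{dk} = 1 .

% A hard assignment keeps only the most probable cluster
\hat{z}_d = \arg\max_{k} \gamma_{dk} .
```

A hard clustering corresponds to keeping only the second quantity, so a document that is, say, partly about the historical event and partly about iconicity contributes to a single cluster, whereas the soft assignment retains both contributions.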
An added benefit of GMM's soft clustering lies in the fact that it provides a probability distribution of a document belonging to any of the K clusters. As a result, our dataset contains a list of probabilities for every K for every URL. We can use the sum of these probabilities to calculate the prominence of a cluster in the entire dataset, which, by the nature of the dataset, also allows the mapping of prominence over time. By aggregating all probabilities for all documents in a year and normalizing the probabilities by the sum of all probabilities in that year, we can map the relative change in cluster prominence over time (Zosa et al., 2020). We show that we can use the document prominence to chart changes, which we describe as ebbs and flows, in the different ways in which iconic photographs circulate online.
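The aggregation and normalization described above can be sketched in a few lines. This is a minimal illustration under the assumption that each document is represented as a pair of a publication year and a list of K cluster probabilities; the function name and the sample values are hypothetical and do not come from the dataset.

```python
def cluster_prominence_by_year(documents):
    """Return, per year, the share of total probability mass held by each cluster.

    `documents` is an iterable of (year, [p_0, ..., p_{K-1}]) pairs.
    """
    sums = {}
    for year, probs in documents:
        # Aggregate the probabilities of all documents published in the same year.
        row = sums.setdefault(year, [0.0] * len(probs))
        for k, p in enumerate(probs):
            row[k] += p

    prominence = {}
    for year, row in sums.items():
        # Normalize by the sum of all probabilities in that year.
        total = sum(row)
        prominence[year] = [value / total for value in row] if total else row
    return prominence


# Illustrative values only: three documents, two clusters.
sample_documents = [
    (2019, [0.5, 0.5]),
    (2019, [1.0, 0.0]),
    (2020, [0.25, 0.75]),
]
prominence = cluster_prominence_by_year(sample_documents)
```

Because the scores are normalized within each year, they describe the relative weight of a cluster in that year and are not affected by the larger number of URLs retrieved for recent years.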

Imagetexts: contextual digital circulation of iconic photographs
By comparing the clusters of all the 26 iconic photographs, we identify several large-scale patterns that play an important role in the online circulation of an iconic picture. First of all, almost all the photographs have a contextual imagetext cluster that refers to the original historical event (Table 2). The prominent words in these clusters tell us what is on the photograph. The 10th cluster of Malcolm Browne's 1963 burning monk photograph starts with the words Quang, Thich, immolation, Buddhists, Duc. It contains all the necessary information to reconstruct the original context of the photograph. The second cluster concerns the wider historical context of the photograph: the Vietnam War. This combination between a specific and a more general contextual cluster is also observable for other iconic photographs. Cluster zero of John Filo's 1970 Kent State photograph starts with "Kent, State, university, shootings, nine" and cluster two with "Vietnam, presentations, war, vietnamization, involvement." There are slight differences in the contextual clusters for digitized and born-digital icons. Because our dataset includes the journalistic use of born-digital icons, the associated contextual clusters display a wider range of contextual meanings. Three clusters of Nilüfer Demir's 2015 photograph of the drowned Syrian toddler Alan Kurdi describe different aspects of the original context: cluster seven starts with the words "Syria" and "refugees," while cluster three prominently contains "migrants." Instead of focusing on the wider context that led to his death, cluster eleven tells a more personal story about Kurdi, noting that his uncle and aunt had attempted to bring him to Canada (Table 2).
Contextual clusters not only describe (the wider context of) what is on the image but can also refer to the story of the photograph (Table 2). Cluster fifteen of Dorothea Lange's 1936 migrant mother is related to her work for the Farm Security Administration and her commission to document the plight of poor farmers. Cluster eight of Robert Capa's 1936 falling soldier points to debates about the authenticity of the image, which flared up after hundreds of original negatives turned up in the so-called Mexican suitcase in 2007. Cluster eleven of Kevin Carter's 1994 the vulture and the little girl pertains to the photographer's unofficial membership of the Bang-Bang Club, a group of photojournalists that documented South Africa's transition from Apartheid to democracy (1990-1994). Carter died by suicide in July 1994, shortly after he had won the Pulitzer Prize for the vulture and the little girl, and Ken Oosterbroek, one of the other members of the club, was killed while working in the Thokoza township. Cluster three of Jeff Widener's 1989 tank man tells the story of the photographers that captured the scene from the same hotel balcony. Some clusters are only indirectly related to the original context of the photographs (Table 2). Because they are widely recognized, iconic pictures are frequently used as illustrations of other processes or events. Cluster eleven of the Hindenburg disaster pertains to circulations where the photograph is used to explain thermodynamic processes. Cluster thirteen of the Abu Ghraib image is related to Philip Zimbardo's well-known 1971 Stanford Prison Experiment. Another form of this kind of circulation are the clusters that connect the iconic pictures to famous albums and pop songs. Cluster five of Kent State refers to the 1970 protest song Ohio by Crosby, Stills, Nash, and Young. Cluster eight of the Hindenburg disaster is related to its use on the cover of Led Zeppelin's 1969 debut album. Cluster six of the burning monk photograph refers to the cover of the 1992 eponymous Rage Against the Machine album.

Table 2. Contextual clusters for the iconic photographs in our corpus selected by the authors. The 26 iconic pictures have different numbers of clusters. The number of a cluster is no indication of its importance. The sequence of the words in the cluster denotes the similarity of the word to the cluster vector. We display the top ten most similar words per cluster, as is common practice. Our dataset contains a full list of clusters for each of the 26 iconic photographs and the top fifty words per cluster (Smits and Ros, 2020a).

Image-texts: self-referential circulation
Next to the contextual clusters, iconic photographs also circulate on the Internet as self-referential image-texts. The textual clusters in this category are not related to what the photograph shows, in the broadest sense of the word, but rather interpret it in relation to general notions about the power of images, photography, and iconicity. As Table 3 shows, many self-referential clusters contain (combinations of) the words iconic, famous, prize (and Pulitzer), image, photo, picture, photographer, camera, capture, taken, and shot.
The self-referential digital circulation of iconic photographs is also observable in the frequent co-occurrence of multiple iconic photographs on the same URL; 13% of all the URLs in our dataset contain at least two of the photographs in our corpus. Based on a matrix of co-occurrence on the same URL of the images in our corpus, Table 4 shows that this form of self-referential circulation is important for the digital life of some iconic images but almost negligible for others. For example, 55% of all the online circulations of Eddie Adams' 1968 vietcong execution, 52% of Yasushi Nagao's 1960 Assassination of Inejiro Asanuma, 45% of the falling man, and 41% of the burning monk are published together with at least one other photograph in our corpus on the same URL. For some images, the connection to a single other iconic image is especially pronounced. Roughly 13% of all the online circulations of vietcong execution appear together with accidental napalm (6.8% the other way around). Similarly, 5% of all circulations of the vulture and the little girl can be found on the same URL as accidental napalm (4% the other way around). At the same time, co-occurrence has a much smaller effect on the online circulation of other images in our corpus. Only 2.4% of the 18,343 circulations of Lee Miller's 1945 Holocaust survivors, 2.4% of the 186,921 circulations of Neil Armstrong's 1969 a man on the moon, and 4.4% of the 108,288 circulations of Alberto Korda's 1960 guerrillero heroico appear together on the same URL with one of the other 25 photographs.
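The two kinds of percentages reported here, the share of a photograph's circulations that co-occur with any other photograph and the share that co-occur with one specific photograph, can be derived from a simple URL-level count. The sketch below is an assumption about how such a computation might look rather than the procedure behind Table 4; it treats every URL as one circulation, and the function name and sample data are invented for illustration.

```python
from collections import Counter
from itertools import permutations


def cooccurrence_shares(url_to_photos):
    """Compute co-occurrence shares from a mapping of URL -> photographs on that URL.

    Returns (any_share, pair_share):
      any_share[a]       share of a's URLs that also contain another photograph
      pair_share[(a, b)] share of a's URLs that also contain photograph b
    """
    totals = Counter()   # number of URLs per photograph
    shared = Counter()   # number of URLs shared with at least one other photograph
    pairs = Counter()    # number of URLs on which a and b both appear (directed)
    for photos in url_to_photos.values():
        photos = set(photos)
        for photo in photos:
            totals[photo] += 1
            if len(photos) > 1:
                shared[photo] += 1
        for a, b in permutations(sorted(photos), 2):
            pairs[(a, b)] += 1

    any_share = {photo: shared[photo] / totals[photo] for photo in totals}
    # Dividing by the total of the first photograph makes the measure asymmetric.
    pair_share = {(a, b): n / totals[a] for (a, b), n in pairs.items()}
    return any_share, pair_share


# Illustrative data only.
sample_urls = {
    "u1": {"napalm", "execution"},
    "u2": {"napalm"},
    "u3": {"napalm"},
    "u4": {"execution"},
    "u5": {"napalm", "execution", "monk"},
}
any_share, pair_share = cooccurrence_shares(sample_urls)
```

The pairwise share is normalized by the number of circulations of the first photograph, which is why the percentage for a pair differs depending on the direction in which it is read.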
Considering the frequent co-occurrence of iconic photographs on the same URL, it comes as no surprise that clusters of one iconic photograph sometimes refer to other iconic pictures (Table 3). Cluster fourteen of migrant mother contains references to the Hindenburg disaster, V-J Day in Times Square, raising the flag on Iwo Jima, guerrillero heroico, and the burning monk. Cluster sixteen of the burning monk photograph refers to migrant mother, the Hindenburg disaster, V-J Day in Times Square, a man on the moon, Afghan girl, tank man, and the vulture and the little girl. Some clusters reference a specific iconic photograph. Cluster five of the accidental napalm photograph starts with "Kurdi, Aylan, refugee, Syrian, boy," and cluster five of the vulture and the little girl with "Phuc, napalm, Ut, Kim, Vietnamese." Theories of iconic photographs note that they are widely circulated because they acquire a broad meaning. As visual metonyms, they come to represent a historical event or an even wider part of the human condition, such as the horrors of war. Based on this, we expected two, or more, pictures to appear on the same URL if they could be said to be visual metonyms of the same, or similar, event(s) or phenomena. This explains the co-occurrence of burning monk and accidental napalm (Vietnam War) or accidental napalm, vulture and the little girl, and Alan Kurdi (suffering children, see below). However, the importance of co-occurrence and self-referential textual clusters for some pictures (55% of all circulations of burning monk, for example) also suggests the existence of an observer effect in the iconization process: the fact that the act of observation (theories of iconic photographs) causes a disturbance in the observed system (circulation of iconic photographs). Some iconic photographs do not stay iconic as a result of the continued importance of what they show but simply because they keep on being described as iconic.
Before turning to the non-referential image/text clusters, a group of clusters should be discussed that points to the circulation of iconic photographs as products (Table 5). Cluster five of raising the flag on Iwo Jima, six of raising the flag over the Reichstag, eight of a man on the moon, and seven of the Afghan girl all contain the words "wallpaper, background and desktop." These clusters pertain to sites that offer the iconic photographs as downloads for people to use as backgrounds on their computer, tablet, or phone. Next to this kind of digital product, other clusters in this category concern websites where some iconic photographs are offered as prints. This circulation of iconic photographs in a commodified form is also represented in Figure 2(a), which prominently figures platforms such as Amazon, eBay, and Redbubble. Clusters two, five, and twelve of the man on the moon, taken by Neil Armstrong in 1969, all refer to different possibilities of buying a print of the photograph.
Commodification plays a particularly substantial role in the online circulation of Alberto Korda's 1960 guerrillero heroico portrait of Cuban revolutionary Che Guevara. The discrepancy between the political conviction and motivation of the photographer (and his subject) and the widespread commodification of the image has been widely noted (Casey, 2009). Clusters six, nine, fifteen, and sixteen of the photograph point to some sort of commercial distribution of the image (Table 5). Uniquely in our dataset, cluster sixteen refers to the printing of the image on t-shirts. Again, we see the importance of the commodified circulation of guerrillero heroico reflected in Figure 2(b), which shows that Amazon is more important for its circulation than Twitter. However, the online circulation of the image is not determined by its commodified form. The contextual cluster seven, which starts with the words "Batista, dictatorship, overthrow, dictator, he," refers to the life of Che Guevara, while the self-referential image-text cluster ten refers to its iconicity (Tables 2 and 3). The co-existence of these two sets of clusters suggests, as Cambre (2012: 64) argued, that the "processes of commodification and radicalization of the image of Che Guevara can coexist."

Image/texts: non-referential circulation
A group of non-referential or decontextualized image/text clusters can be connected to the memetic circulation of iconic photographs (Table 6). Because our method is only able to identify full and partial circulations of the iconic image, all the memes in our dataset fall within the category of the "reaction photoshop" (Shifman, 2014b): (part of) a photograph which has been inserted into a new context, or a photograph to which visual and/or textual elements have been added. Non-referential clusters that point to this kind of circulation often contain words like "meme(s), fun(ny), lol [laughing out loud], shit." The clusters also contain words, such as "mod[erator]" and "comment," that point to the specific part of the Internet where memes frequently circulate: social media platforms, forums, message boards, and imageboards. The occurrence of "nbsp" similarly points to the connection between memeified versions of iconic photographs and message boards. Non-breaking space (nbsp) is an HTML entity frequently used on message boards and forums to insert a blank line, for example, to separate text from an image. The entity is not visible to other users but is recognized as a textual "token" by our method. The appearance of HTML code in the non-contextual clusters points to the earlier-discussed problem of textual noise in our dataset. Looking at Figures 3 to 8, which will be discussed below, it becomes clear that, besides the contextual, self-referential, and non-contextual clusters, the parsed text surrounding iconic photographs also includes words that are grouped into clusters that seem to make little sense. Considering that the applied algorithms group words that appear together relatively often, these clusters are mostly the result of the clustering of textual noise.
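The textual noise described in this section, such as the "nbsp" token, could in principle be reduced before clustering. The following is a minimal sketch of one way to do so, not the preprocessing used in this study: the function name clean_tokens, the NOISE_TOKENS list, and the sample sentence are all hypothetical and chosen only for illustration.

```python
import html
import re

# Hypothetical stop list of leftover HTML entity names (an assumption,
# not a list used in the study).
NOISE_TOKENS = {"nbsp", "amp", "quot", "lt", "gt"}

def clean_tokens(raw_text):
    """Decode HTML entities, drop tags, and filter leftover entity names."""
    # "&nbsp;" becomes a non-breaking space character instead of the word "nbsp".
    text = html.unescape(raw_text)
    # Replace anything that looks like an HTML tag with a space.
    text = re.sub(r"<[^>]+>", " ", text)
    # Keep runs of Unicode letters only (digits and punctuation are dropped).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    # Remove entity names that survived because they lacked "&" or ";".
    return [t for t in tokens if t not in NOISE_TOKENS]

# Invented sample text for illustration.
sample = "Oh the huge manatee&nbsp;<br>lol nbsp funny meme"
tokens = clean_tokens(sample)
```

Filtering of this kind removes a token that the discussion above treats as a signal of message-board circulation, so whether to apply it depends on the research question.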
Some scholars argue that memes can become parasitic upon the iconic photograph, using its fame to spread and distort, or even destroy, its original meaning in the process (Boudana et al., 2017; Durham, 2018; Ibrahim, 2016). However, several clusters show that memes often rely on knowledge of the original context. Cluster one of the Hindenburg disaster refers to a meme where the Hindenburg is replaced by a giant manatee, often accompanied by the phrase "oh, the huge manatee," which refers to a famous radio report by Herbert Morrison, who shouted "oh, the humanity," as he witnessed the crash of the airship. The interaction between textual and visual elements of the original iconic photograph and its memeified version (the visual resemblance between a Zeppelin and a manatee, and the similarly sounding sentences) can only be grasped if viewers are aware of the original context. As a result, the circulation of the Hindenburg manatee meme reproduces the contextual meaning of the original iconic image in a "playful" way (Shifman, 2014a).
Cluster eighteen of the burning monk photograph starts with the words "meme, spray, pepper, UC, pike." On 18 November 2011, University of California, Davis police officer John Pike pepper-sprayed a group of peaceful Occupy protesters. The photograph of the incident quickly morphed into the "Casually Pepper Spray Everything Cop Meme," where the officer was photoshopped casually pepper-spraying characters in famous paintings and famous photographs. Scholars have described the meme as a new digital form of protest (Bayerl and Stoynov, 2016; Huntington, 2016; Milner, 2013; Peck, 2014), a way to make (the repression of) protest visible, or as a form of online retribution or punishment (Mielczarek, 2018). Concerning the association between the "Casually Pepper Spray Everything Cop" and the burning monk photograph, Peck (2014) noted that the association between the two images went too far for some message board users, who argued that it "trivialized" the protest of Thích Quảng Đức. Similar to the Hindenburg manatee meme, the "Casually Pepper Spray Everything Cop" meme can only be understood, condoned, or condemned with knowledge of the original context. While viewers may find the meme distasteful, it still reproduces the iconic status of burning monk.
Does our research reveal cases where the memeification of iconic photographs erases their original meaning? Boudana et al. (2017) discuss 34 different memeified versions of accidental napalm (pp. 1225-1226). They argue that some memes produce a discrepancy between the "compassion and pain that we are requested to feel when viewing the original photo and the laughter invited by the additional memetic figure." Cluster eight of accidental napalm reveals a connection between the iconic photograph and the "Swiggity Swooty (I'm Coming for That Booty)" meme. In most versions, Kim Phuc, the girl in the picture, is replaced by an older male celebrity (Jimmy Savile, Prince Charles, and others), creating the impression that the crying boy, originally on the left side of the image, is running away from him. The meme also often contains the words "Swiggity Swooty (I'm Coming for That Booty)." As it produces a strong sense of "intertextual incongruity," as Boudana et al. (2017) describe it, most viewers will find the meme distasteful or even offensive. However, even in this extreme case, the viewer's sense of unease, disgust, or transgressive excitement is largely predicated on knowledge of the original context. Without this knowledge, and the sense of taboo it creates, the meme would lose its edge. Following Hariman and Lucaites (2018a, 2018b), our research supports the observation that widespread "misuse" or "misreading" of an iconic photograph, that is, circulations that are considered to be incongruent with the original meaning of the picture, should be taken as proof of, rather than a danger to, its iconic status.

Ebbs and flows of meaning
How widespread are these misuses and misreadings, and do they overtake contextual readings of iconic photographs? A comparison of the prominence scores of different clusters over time demonstrates that the digital circulation of the iconic pictures in our corpus is never dominated by one specific kind of cluster. For example, Figure 3, which displays the normalized prominence scores per year of vietcong execution, shows that in 2018 and 2019, contextual (three and four), self-referential (five and seven), and non-referential (memetic) clusters (one) are all partly responsible for the online circulation of this iconic photograph. Similarly, Figure 4 shows that several clusters, such as eight on the Swiggity Swooty meme, the contextual clusters ten, eleven, and fourteen, and the self-referential cluster six, are the main building blocks of the online circulation of accidental napalm between 2010 and 2019.
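To make the idea of comparing normalized prominence scores per year concrete, the sketch below rescales raw cluster scores so that the scores within each year sum to one. This normalization scheme is an assumption made for illustration and may differ from the one used to produce Figures 3 to 8. The function name normalize_per_year and the toy numbers are hypothetical and are not taken from the dataset.

```python
def normalize_per_year(raw_scores):
    """Rescale {year: {cluster_id: raw_score}} so each year's scores sum to 1."""
    normalized = {}
    for year, clusters in raw_scores.items():
        total = sum(clusters.values())
        if total == 0:
            # No signal in this year: keep the clusters but set them to zero.
            normalized[year] = {cluster: 0.0 for cluster in clusters}
        else:
            normalized[year] = {
                cluster: score / total for cluster, score in clusters.items()
            }
    return normalized

# Invented raw scores for three clusters in two years (illustration only).
toy_scores = {
    2018: {1: 30, 3: 50, 5: 20},
    2019: {1: 10, 3: 10, 5: 20},
}
norm = normalize_per_year(toy_scores)

# The most prominent cluster per year under this assumed normalization.
dominant = {year: max(scores, key=scores.get) for year, scores in norm.items()}
```

Under this assumption, a year in which no single cluster approaches a share of one corresponds to the mixed pattern of contextual, self-referential, and non-referential circulation described above.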
These patterns of circulation can be interrupted or disrupted by regularly occurring or new events. In 2012, 40 years after accidental napalm was taken, the contextual cluster fourteen became prominent, demonstrating that the photograph was circulated in relation to this anniversary. The peak of cluster seven in 2016 captures the controversy surrounding the removal and censoring of the photograph by Facebook and the ensuing viral commentary of the editor-in-chief of the Norwegian newspaper Aftenposten. In some cases, the increasing prominence of a certain cluster suggests that an iconic image was being re-contextualized and re-politicized. Figure 5 shows that the contextual cluster six of the tank man rapidly became more prominent during the pro-democracy Umbrella Revolution of 2014 and the 2019-2020 Hong Kong protests. At the same time, the prominence of both the self-referential cluster thirteen and the non-referential "meme" cluster four dips in 2014 only to stabilize in the period 2015-2019: a good example of the ebbs and flows of contextual, self-referential, and non-referential circulation.
The changing prominence scores of some clusters reveal that iconic images are also circulated in relation to each other. Newly found meaning in one of two connected photographs, or the appearance of a new iconic picture that is connected to an older one, can cause shifts in the patterns of online circulation. The peak of cluster five of accidental napalm in 2015, which is related to Alan Kurdi, shows that from 2015 onwards, it was frequently circulated in connection with this picture (Figure 4). Similarly, Figure 6, which displays the prominence scores of the vulture and the little girl, shows a peak in 2015 of cluster five, which is related to accidental napalm. The widespread dissemination of the Alan Kurdi photograph caused a shift not only in the circulation of accidental napalm but also in that of the image to which it was connected: the vulture and the little girl. The prominence scores of the born-digital iconic Alan Kurdi photograph show a slightly different pattern (Figure 7). The clusters in the years 2015 and 2016 fall under the journalistic coverage of Kurdi's death on 2 September 2015. While some of them, such as cluster nine on his uncle and aunt, decline immediately, others, such as cluster three on the European refugee crisis, continue to increase in 2016 but then decline. In the same period, the self-referential cluster eight becomes more prominent. The peak of this cluster in 2019 can be explained by the connection between the iconic photograph of Kurdi and the graphic image of Óscar Ramírez and his daughter Valeria, who drowned in the Rio Grande while trying to reach the United States. Just as the publication of the Alan Kurdi image led to a shift in the circulation of accidental napalm and the vulture and the little girl, the image of Ramírez and his daughter changed the circulation of the Alan Kurdi image.
While the peaks in the prominence scores of accidental napalm and Alan Kurdi emphasize how new events suddenly become important for the digital circulation of an iconic image, Figure 8, which displays the scores for man on the moon, shows that the digital life of other pictures has taken on a more stable nature. For the years 2017-2019, a steady pattern of several textual clusters with prominence scores around 0.1 emerges. With several contextual image clusters, the online circulation of a man on the moon can be explained by the continued interest in the event itself and, more broadly, the exploration of space. While the image is often described as iconic, there is no clear self-referential cluster on iconicity that underpins its online circulation. This lack of references to the iconic status can be connected to the fact that the image only sparsely (2.4% of all circulations) appears on the same URLs as one of the other images among the 26 in our corpus.
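A share such as the 2.4% mentioned above can be thought of as the fraction of a photograph's URLs on which at least one other iconic photograph also appears. The sketch below illustrates that calculation under the assumption that the data can be represented as a mapping from URL to the photographs detected on it. The function name cooccurrence_share, the example URLs, and the toy data are hypothetical and do not come from the dataset.

```python
from collections import defaultdict

def cooccurrence_share(url_to_photos):
    """Per photo, the share of its URLs that also show another iconic photo."""
    total = defaultdict(int)   # URLs on which the photo appears
    shared = defaultdict(int)  # URLs on which it appears with another photo
    for photos in url_to_photos.values():
        unique = set(photos)   # count each photo once per URL
        for photo in unique:
            total[photo] += 1
            if len(unique) > 1:
                shared[photo] += 1
    return {photo: shared[photo] / total[photo] for photo in total}

# Invented toy data for illustration only.
toy = {
    "https://example.org/a": ["burning monk", "accidental napalm"],
    "https://example.org/b": ["burning monk"],
    "https://example.org/c": ["a man on the moon"],
    "https://example.org/d": ["accidental napalm", "Alan Kurdi", "burning monk"],
}
shares = cooccurrence_share(toy)
```

In this toy example, burning monk appears on three URLs and shares two of them with other photographs, while a man on the moon never co-occurs with another image.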

Conclusion
Instead of theorizing, and often speculating, about the online lives of iconic photographs, this article has shown how computational methods can be used to chart the different ways in which a wide array of configurations of image and text sustain their online circulation. Relying on limited samples, researchers previously presented the online life of iconic images in paradoxical terms: widespread circulation was seen as both the cause and the result of the collapse of context and meaning of the iconic photograph. Our large-scale analysis of contextual, self-referential, and non-referential image/text clusters has shown that digital circulation does not necessarily result in iconic photographs following this one-way progressive sequence. Rather, their online circulation is marked by ebbs and flows of disappearing, resurfacing, and newly emerging meanings and contextualizations. We have given special attention to the role of decontextualized appropriations, such as memes. While this role has often been described as parasitic, this article demonstrates that memeified versions do not overtake or overshadow contextual circulation: successful parasites will live with their hosts, rather than kill them. Indeed, our large-scale analysis has shown that online processes of circulation, memeification, and commodification reproduce rather than threaten the dominant message(s) which made the 26 photographs iconic.
To some readers, the textual focus of this article and the lack of analysis of the iconic images themselves might be disappointing or even problematic. However, we argue that our distant reading can reveal a lot about individual images. For example, it proves that commodification plays an important role in the online circulation of guerrillero heroico but is less important for that of accidental napalm or the vulture and the little girl. At the same time, while the horrific nature of these latter images makes them unsuitable for commodification, it plays a decisive role in their memeification. Memes like Swiggity Swooty are effective through the juxtaposition of the serious message of the host and the transgressive or banal message of the meme.
The article has revealed that the circulation of some iconic photographs strongly depends on their connection to other pictures. Some iconic photographs, such as accidental napalm, the vulture and the little girl, Alan Kurdi, and Óscar Ramírez and his daughter Valeria, are chained together because they operate as visual stand-ins for the same phenomenon or process: the suffering of innocent children in this case. This suggests that a possible connection to a chain of iconic pictures causes new photographs to become iconic more quickly, while, at the same time, maintaining the iconic status of older links in the chain.

Figure 1. The share of URLs published per year.

Figure 3. Normalized prominence scores per year of vietcong execution for the years 2010-2019. The prominence scores are more reliable the more URLs our pipeline was able to retrieve. As a result, the clusters tend to converge toward 2019. While our dataset contains URLs that were first published as early as 1995, the modest number of URLs in these years would distort the figures. Note that a document or collection of documents (URLs in our case) can be related to multiple topics. For example, a website can describe the original context of vietcong execution but also underline its iconicity. However, not all clusters are likely to co-occur with others to the same extent. Memetic clusters are not likely to appear in the same document as contextual clusters. Therefore, these clusters are more pronounced in the figures as they are not distributed over documents.

Figure 4. Normalized prominence scores per year of accidental napalm for the years 2010-2019.

Figure 5. Normalized prominence scores per year of tank man for the years 2010-2019.

Figure 6. Normalized prominence scores per year of the vulture and the little girl for the years 2010-2019.

Figure 7. Normalized prominence scores per year of Alan Kurdi for the years 2015-2019.

Figure 8. Normalized prominence scores per year of a man on the moon for the years 2010-2019.

Table 1. The 26 iconic images in our corpus.

Table 3. Self-referential clusters for the iconic photographs in our corpus.

Table 4. Co-occurrence of the iconic photographs in our corpus on the same URL in our dataset.

Table 5. Clusters that refer to the circulation of iconic photographs as products.

Table 6. Non-referential clusters for the iconic photographs in our corpus.