
Printed from Oxford Research Encyclopedias, Literature. Under the terms of the licence agreement, an individual user may print out a single article for personal use (for details see Privacy Policy and Legal Notice).

date: 25 May 2024

Digital Humanities

  • Simon Burrows, School of Humanities and Communication Arts, Western Sydney University
  • Michael Falk, School of English, University of Kent


The article offers a definition, overview, and assessment of the current state of digital humanities, particularly with regard to its actual and potential contribution to literary studies. It outlines the history of humanities computing and digital humanities, its evolution as a discipline, including its institutional development and outstanding challenges it faces. It also considers some of the most cogent critiques digital humanities has faced, particularly from North American-based literary scholars, some of whom have suggested it represents a threat to centuries-old traditions of humanistic inquiry and particularly to literary scholarship based on the tradition of close reading. The article shows instead that digital humanities approaches gainfully employed offer powerful new means of illuminating both context and content of texts, to assist with both close and distant readings, offering a supplement rather than a replacement for traditional means of literary inquiry. The digital techniques it discusses include stylometry, topic modeling, literary mapping, historical bibliometrics, corpus linguistic techniques, and sequence alignment, as well as some of the contributions that they have made. Further, the article explains how many key aspirations of digital humanities scholarship, including interoperability and linked open data, have yet to be realized, and it considers some of the projects that are currently making this possible and the challenges that they face. The article concludes on a slightly cautionary note: What are the implications of the digital humanities for literary study? It is too early to tell.


  • Enlightenment and Early Modern (1600-1800)
  • Literary Theory
  • Western European Literatures
  • Print Culture and Digital Humanities

Defining Digital Humanities: Who’s in the Tent?

Defined succinctly, “digital humanities” involves the application of computational techniques to traditional humanities problems, both as a scholarly practice and as the study thereof. “Digital humanities” thus describes both a technology-empowered methodological approach (or approaches) and a self-reflective critical component. Nevertheless, the precise meaning of the term has proven unstable and subject to debate since it supplanted the earlier label, “humanities computing,” shortly after the millennium.1 This instability is partly the result of technological shifts as visualization, immersive virtual reality, social media, 3-D modeling, online gaming, and machine learning have opened new vistas for researching, exploring, curating, presenting, and understanding objects, the archive, and the human condition. However, much debate and redefinition has occurred as a conscious academic strategy, as digital humanists have sought to emphasize that digital scholarship in the humanities now involves far more than opening up archives and texts to computational analysis through digitization.2

It remains an open question whether digital humanities should be considered as a discipline, a research field, or a movement. Some advocate a capacious, “Big Tent” definition of digital humanities, incorporating a “shared core of like-minded scholars who explore digital frontiers to undertake work in the humanities.”3 This is a widely shared ideal, but it faces two key problems. First, nearly all humanities scholars today rely on digital technology, but “if everyone is a Digital Humanist, then no one is really a Digital Humanist.”4 This has led some to predict that the term “digital humanities” will eventually “wither away” like Marx’s socialist state.5 The second problem is that the Big Tent might not be as big as it seems. Scholars of literature and linguistics—with valuable contributions from cognate areas such as biblical studies, classics, and medieval studies—have dominated digital humanities since its inception, probably because computer analysis of text has historically been simpler than analysis of sound or image.6 This dominance across much of the field’s history has been confirmed by recent statistical analysis, which shows that between 1966 and 2004, 64 percent of the articles in the field’s two most influential journals were devoted to the study of text.7 However, in subsequent years the expansion of the field has embraced—or some might prefer to say brought it into conversation with—disciplines as diverse as art history, design, and architecture through to social media analysis, social robotics, and many areas traditionally associated with the social sciences. Whether digital humanities is a Big Tent or a narrow trench, however, there is no doubt that it has radical implications for the scale and conduct of much humanities research, for modes of inquiry and analysis, and for the types and sophistication of questions that scholars can meaningfully ask of traditional sources. 
This article addresses how several core questions of literary theory—meaning, interpretation, textuality—have been affected by digital humanities.

The Distant Reading Debate: Can Computers Read?

In the first two decades of the millennium, the role and significance of digital humanities has become one of the most hotly contested and controversial areas of humanities study. In literary studies, the controversy began with a single sentence. In the year 2000, Franco Moretti observed that literary scholars only ever consider a tiny fraction of the world’s books. If they were ever to transcend the limits of individual scholarship, he suggested, then they would need “a little pact with the devil: we know how to read texts, now let’s learn how not to read them.”8 Moretti promoted a new form of “distant reading,” in which scholars would use digital methods to analyze “units much smaller or much larger than the text: devices, themes, tropes—or genres and systems.”9 Though Moretti later recanted his pact with the devil—“it was meant as a joke”—it has remained a totemic statement of a new concept of textual interpretation.10

Since the early 2010s, a series of scholars have begun to answer Moretti’s call. Jodie Archer, Matthew Jockers, Ted Underwood, and Andrew Piper have all published scholarly monographs exploring trends that cut across the boundaries of text, genre, and period, while Moretti and his colleagues at the Stanford Literary Lab have published a series of often field-defining pamphlets.11 These scholars not only claim to have made novel empirical discoveries; they claim to reformulate fundamental concepts of literary theory. Underwood, for instance, claims that digital methods fatally compromise the concept of period, arguing that the “largest patterns organizing literary history” do not fall neatly into the boxes scholars draw around particular times and places.12 Two decades after Moretti’s pronouncement, distant reading has arrived in literary studies.

Moretti’s vision of distant reading has inspired three kinds of critique. First, there are those who see it as an assault on a liberal tradition of humanistic inquiry. Digital humanities is a Trojan horse concealing a neoliberal agenda: it hoovers up research funding, devalues the free intellect of the scholar, and promotes the idea that literary studies should be applied and factual rather than critical and interpretative.13 David Golumbia suggests that digital humanities could bring about the “Death of a Discipline,” as “professionals who are not humanists” (e.g., librarians and computer scientists) become “engaged in setting standards for professional humanists.”14 Though arguments like these have failed to deter distant readers, they have some relevance to the institutionalization of digital humanities (see the section entitled “The Digital Divide”).

The second critique is of more recent origin. In 2019, Nan Da published “The Computational Case against Computational Literary Studies,” a thrilling essay that turns the techniques of distant reading against it.15 She replicates a number of digital studies, showing that minor adjustments to certain parameters can completely change the results. The fundamental problem, she argues, is that computational techniques are necessarily reductive, producing simplified models of text based on the arrangement of words. While such reduction may be useful in contexts like legal document discovery, “there is no rationale for such reductionism” in literature; “in fact, the discipline is about reducing reductionism.”16 Her critics point to her narrow conception of “Computational Literary Studies,” her selective citation of the literature, and her insistence that statistics are only useful if they provide clear causal explanations.17 Whether Da is right or wrong, she has almost singlehandedly put the debate on a new technical footing.

The third critique is the oldest and most interesting from a theoretical perspective, because it raises fundamental questions of meaning and interpretation. In everyday language, people say that a computer “reads” a file or a disk, but as Johanna Drucker observes, “computers do not interpret; they simply find patterns.”18 Computers have no conception of external reality, and therefore cannot comprehend how language is used by humans in concrete situations. Some argue that close reading is therefore unthreatened by distant reading.19 Others argue that digital methods “deform” or “transform” texts, enabling humans to discover or create new meanings that were not there before.20 Still others argue that information theory provides a basis for linking patterns and meaning.21 Moretti and his school ground their approach in Russian formalism and the philological tradition of Leo Spitzer and Erich Auerbach. Drucker herself argues that “modeling” can bridge the gap between the computer’s pattern and human meaning. Needless to say, the question of pattern and meaning is one of the most fruitful areas of theoretical inquiry in digital humanities today.

While the distant reading debate has brought digital humanities to the heart of literary theory, it has tended to obscure an older and possibly richer tradition of digital text analysis. At the heart of Moretti’s vision is a desire to see past individual texts and authors to reveal the “champ littéraire” [literary field], the field of cultural, social, and economic power that shapes literary production.22 Proponents and opponents of distant reading alike tend to argue that digital methods are necessarily “crude” or “brute,” and therefore inapt for more fine-grained analysis.23 An earlier tradition argued precisely the opposite: digital analysis can be extremely subtle, and is especially apt for studying the particularity of texts and the individuality of authors. In the 1970s, Robert Cluett rigorously studied the prose of authors such as Philip Sidney and Ernest Hemingway, seeking to identify the precise linguistic features that distinguished their characteristic styles.24 The following decade, John Burrows carefully sifted the language of Jane Austen’s novels, showing how the individuality of her characters could be seen in the patterns of high-frequency words like “of” and “the.”25 Burrows strongly defended the concept of “idiolect”: since the computer could so easily distinguish the different languages of individual characters and authors, he said, the anti-individualist philosophy of language espoused by Roland Barthes, Jacques Derrida, and Michel Foucault was empirically baseless.26 Perhaps this unorthodox theory contributed to the relative obscurity of Cluett and Burrows’s approach in the new millennium, though their work is fundamental to modern stylometry, and there seems to have been a revival of small-scale digital reading today.27

Modeling Text: Broader Currents

A second major strand of digital humanities has reshaped fundamental aspects of literary scholarship without rousing the controversy of distant reading. Distant reading is underpinned by the idea of computers as information processors, which read in text and reveal the underlying patterns. This idea of computers is false, argues Willard McCarty, one of the most prominent theorists in this second strand. Computers are in fact “modelling machines, not knowledge jukeboxes.”28 Whenever a computer is used to study a text or other artifact, the text or artifact must be reduced to “computational form,” but since no computer model can capture all the ripples and complexities of reality, there is always some residual complexity that the model cannot explain.29 This residual complexity or gap drives a creative process of interpretation. On the one hand, modeling knowledge forces scholars to make their concepts and assumptions explicit. On the other hand, modeling the text compels them to confront its difficult, “computationally unknown” aspects.30 As they become more aware of the gap between their concepts and reality, they are driven to improve their models and begin a playful process of testing and manipulating ideas in digital space.31

This vision of interactive modeling has transformed scholarly editing and book history, and has accordingly transformed how nearly all literary scholars encounter literary objects.32 Scholarly editors have always been aware of the instability of literary texts, as different versions of a text proliferate, and editors combine texts into holistic oeuvres representing the writer’s total vision. Digital editions allow editors to model this instability more thoroughly than ever before. As Jerome J. McGann explains, it is not simply that digital editions can “store vastly greater quantities of documentary materials,” and can “organize, access, and analyze” them more quickly than paper-based editions; hyperlinked digital editions also lead to a new kind of “decentered” textuality, in which no part or version of a text is prioritized over another.33 Likewise, stylometry has transformed how scholars understand the problem of authorship attribution. Under the influence of John Burrows, Patrick Juola, and the Polish School of Stylometry, scholars have learned to build statistical models of individual style, and distinguish authorship as a signal in the noisy flow of text. In book history and periodical studies, databases have destabilized the very concept of the “book,” “article,” or “issue,” and scholars have had to develop new ways to model the production and consumption of text. What unites all these enterprises is the effort to create adequate computer models of texts, authors, or books, and the resultant need to redefine the very concepts scholars were trying to model in the first place.

Surveying this situation, Katherine Bode argues that the transformation of textual scholarship renders the entire distant reading debate otiose. Moretti’s vision of “not reading” texts is an empty dream, because all texts have already been interpreted according to whatever model was used to digitize or edit them in the first place; the dream of Moretti’s critics, that the close reader can exercise subjective readerly freedom, is empty for the same reason.34 It must be said that more sophisticated distant readers grasp Bode and McCarty’s point, and have recast their approach as a kind of modeling.35 Meanwhile there has been an explosion of interest in new modeling techniques, including network analysis, literary mapping, and exciting experiments in gaming, virtual reality, and augmented reality.36 In the Global South, pioneering scholars are pursuing various kinds of decolonial “world making.”37 In these ways, McCarty’s vision of interactive modeling continues to transform the way literature is studied and experienced.

The Emergence and Institutionalization of Digital Humanities

In order to bring about this new world of decentered texts and distant readers, scholars in digital humanities have erected a large scholarly infrastructure. Scholarly journals, academic programs, job vacancies, and research centers have proliferated, particularly since 2010, along with national academic organizations, affiliated to a global body, the Alliance of Digital Humanities Organizations (ADHO), which organizes an annual international conference.38 Annual training camps have also emerged, staffed by academic enthusiasts who donate their time freely, including the Digital Humanities Summer Institute, the Oxford Digital Humanities Summer School, and Digital Humanities Down Under. This open team culture reflects the collaborative and interdisciplinary nature of digital humanities work and the commitment and openness of many practitioners to new modes of scholarly collaboration, public engagement, and open publishing practices.

University, national, and international infrastructures continue to evolve to promote, host, and sustain digital humanities work, and to accommodate academic practices, outputs, collaborations, and careers that defy traditional metrics for academic evaluation, accreditation, publication, and sustainability.39 These efforts have had mixed results, and scholars and administrators continue to disagree on how to establish effective research centers. Labs, social and creative spaces, software, hardware, and access to supercomputers and digital storage may all play a part. However, many scholars concur with Laurent Dousset, a prominent French scholar who advised the French government that the primary infrastructure lies in people.40 University administrations and funding bodies have been slow to absorb this lesson, which has led to well-known problems: broken and unstable teams; lack of career progression; loss of key institutional or project knowledge; projects delayed or scuttled; and outputs going offline prematurely. By contrast, successful centers such as Stanford University’s Center for Spatial and Textual Analysis (CESTA) or Sheffield’s Humanities Research Institute have invested heavily in people, realizing that the future development of digital humanities depends upon secure jobs and careers for both academic and technical staff.41 Even in these centers, however, the situation can seem precarious.42 Every project, and the accompanying years of work, seems to teeter perpetually on the verge of a precipice, menaced by the possibility of an undiscovered bug, the departure of a key person with irreplaceable knowledge, or the inability to secure funding to maintain an online resource. Barriers, even in privileged institutions, remain formidable.

Nevertheless, computationally based research in the humanities has a long pedigree: “humanities computing” can trace its origins back to the 1940s and 1950s. Its earliest pioneers include Professor Josephine Miles, who with an all-female team of students and punch-card operators between 1951 and 1956 produced a “Concordance to the Poetical Works of John Dryden,” and the Italian Jesuit priest Roberto Busa, who in 1946 began compiling a concordance of the nine million words in the sprawling work of Saint Thomas Aquinas, eventually with the support of IBM.43 By the mid-1960s there were enough practitioners to support a journal, Computers and the Humanities, and the first specialized academic associations were founded in the 1970s.44 Although the early history of humanities computing and digital humanities has often been written from an Anglo-Saxon perspective, significant work was conducted outside Britain and America.45

Much of this early work was revolutionary, even if its impact took decades to register. In 1965–1970, for example, François Furet and his collaborators published Livre et société dans la France du XVIIIe siècle.46 Using computational analysis of French bureaucratic and book trade records, Furet and his team offered foundational insights into reading prior to the 1789 revolution, and this work is only now being surpassed.47 A few years later, another French study pioneered the use of descriptive markers to assess the content of the pre-revolutionary newspaper press.48 The revolutionary research possibilities of large-scale digitization of extensive runs of newspapers using optical character recognition (OCR) and powerful bespoke search and analytic tools were only realized two decades later.49

These developments were accompanied from the mid-1990s by others which further empowered digital humanities research. These included the mass digitization of archives, objects, and printed texts; the advent and uptake of the internet; and the Text Encoding Initiative (TEI), a scholarly initiative which created machine-readable text encoding “guidelines for the creation and management in digital form of every type of data created and used by researchers in the humanities.”50 With the publication of the first TEI guidelines in 1994, the humanities community had for the first time common digital standards and markup to facilitate research, teaching, data curation and preservation, and, in due course, interoperability.51

The Digital Divide: Money, Power, Empire

We have seen already that digital humanities has inspired considerable theoretical and methodological debate. Its institutionalization has raised further criticisms. Critics worry about the creeping encroachments of neoliberalism and neocolonialism on academe, especially given the perceived technical and financial barriers of entry into the field.52 Teams of researchers and technologists can come with eye-watering price tags. The same can be said of indispensable research databases published by Gale-Cengage, ProQuest, and Adam Matthew Digital. For some, these costs represent an unwelcome corporate invasion of the traditional research space, particularly when accompanied by attempts to monetize humanities research.53 As winning funding has become an increasingly important activity for scholars, the expense of digital humanities research has perversely become one of its most prized aspects. Large European Research Council (ERC) grants for projects such as Radboud University’s MEDIATE project can reach €2 million, while projects such as Oxford University’s Cultures of Knowledge, the Old Bailey Online and its successors, or Western Sydney University’s French Book Trade in Enlightenment Europe project have often received repeat grants in six or seven figures. Major players have therefore emerged, surrounded by a cadre of precariously employed early-career scholars who often find themselves moving internationally as digital projects start and finish.54 Yet these projects are collectively dwarfed in scale by the most ambitious project to date, the ERC-backed Time Machine project, which requires funding on a breathtaking scale. 
It involves, at the time of writing, 645 partner organizations in forty-five countries and has received €1 million of ERC preliminary funding for just its strategic planning phase.55 For the flagship Venice Time Machine initiative alone, there are plans to digitize, analyze, and make accessible the entire Venetian state archives, which occupy eighty kilometers of archival storage.56

The fear of a growing “digital divide” between wealthier and poorer researchers, institutions, libraries, and countries is thus not without foundation. Nevertheless, scholars in the Global South have found ways to embrace digital humanities, as evidenced by pioneering efforts such as India’s Digital Humanities Alliance for Research and Teaching Innovations (DHARTI), the Network for Digital Humanities in Africa, and the global movement for Indigenous Data Sovereignty. Such scholars have also pioneered approaches to digital scholarship that harness the power of ordinary computers, eschewing centralized computing clusters and subscription databases. The movement is known as “jugaad in India, gambiarra in Brazil, rebusque in Colombia, jua kali in Kenya, and zizhu chuangxin in China,” or as “minimal computing” in the Anglophone world.57 This is a vital and creative movement, though of course jugaad [“making do”] would not be necessary if the digital divide were not a reality.

Given the huge expense and skewed allocation of resources, critics ask: Has digital humanities been worth the expense? As discussed in the section on “The Distant Reading Debate”, some critics argue that digital humanities is an uncritical enterprise, with a covert neoliberal agenda and an overt tendency to prioritize technicalities over critique. Certainly some digital humanities work necessarily engages with technical problems at the expense of humanistic ones, but much digital humanities research remains highly politically engaged. Cases in point include the “Slave Voyages” project’s attempts to capture in database form and visualize three centuries of monstrous transatlantic human trafficking or the “Colonial Frontier Massacres” project’s mapping of the true extent of settler violence against indigenous populations in Central and Eastern Australia between 1788 and 1930. Indeed, such digital projects have a capacity to engage and mobilize much broader publics interested in history, genealogy, demography, or racial politics than much traditional research, particularly when they offer online resources with intuitive interactive tools.58 Nor was the first attempt at a distant reading of 18th-century British erotica any less politically engaged than the scholarship that preceded it. 
Indeed, its feminist conclusions might appear more authoritative for resting on the comprehensive, if not quite exhaustive, evidential base provided by datamining Gale-Cengage’s magnificent Eighteenth Century Collections Online (ECCO) digital resource.59 Furthermore, digitally empowered techniques such as literary mapping, which explores the temporal-spatial dimensions of fictional settings to illuminate where events happened and the geospatial parameters for action within a novel, offer literary scholars (among others) previously unimagined means to develop deeper understandings of setting, plot, action, and even mood and emotional associations.60 They also offer new means for engaging audiences, particularly once a virtual reality (VR) dimension is added.61

Digital Humanities in Practice: A Case Study

Digital humanities now offers such a dazzling arsenal of tools, techniques, and possibilities to researchers that a comprehensive catalogue or typology is beyond the scope of a single article. This section considers a range of the most prominent techniques in literary studies, and highlights some of the practical issues encountered by scholars who use them. Many of the most innovative projects and digital resources bring together more than one technique or approach. Indeed, the next “big thing” in digital humanities is likely to be the development of tools to exploit multiple datasets using linked open data techniques, though the full promise of the so-called “semantic web” has yet to be realized in practice.62 In the meantime, the main practical effect of digital humanities has been to change the way scholars encounter the archive. At the touch of a button, a student can search an archive as easily as a professor, and could potentially retrieve in seconds material that would not so long ago have taken a lifetime of work to discover.63 Though the promise of such encounters is great, scholarly expertise is still required to analyze, interpret, and explain the significance of the information gathered, and in practice digital scholarship encounters numerous unforeseen barriers. To demonstrate this practical dimension, this section now considers several examples that relate to a particular historical and literary period, the 18th-century European Enlightenment. This is an area of study where digital humanities has had a particularly large impact.64

Much effort and expense in the early stages of the digital humanities revolution was expended on creating new digital editions and large digital text corpora. In both cases, the fundamental problem was the same: digitizing the text could be done relatively simply, but editing and organizing it was labor-intensive, and required the development of complex new models of the “decentered” textuality (see the section entitled “Modeling Text”). This mismatch has plagued many important digitization efforts, including Google Books and ECCO—which used OCR to transcribe thousands of historic books. These made huge amounts of text available, but the transcriptions were poor and in the case of Google Books the bibliographic data was patchy and impeded systematic discoverability. ECCO provides significantly better bibliographic data, relying on library Machine Readable Cataloging (MARC) records to annotate each file. But even this data, gathered and curated by generations of librarians to varying standards, is inadequate for many scholarly purposes. A scholar trying to estimate how many books were published in different towns, for example, would be unable to discover whether a book published in “Richmond” was published in Surrey (UK), Yorkshire (UK), Virginia (US), or Jamaica. Discerning which was intended in each individual case would be a daunting task in a dataset of 220,000 volumes. Similarly, the database records 91,875 distinct publishers, but since every inconsistently placed apostrophe or name variant generates a distinctive “result,” the true number of publishers is far lower. To realize the full potential of the data, such ambiguities need to be fully resolved by painstaking textual scholarship. This daunting data-cleaning task is being undertaken by the Digital Humanities group at the University of Helsinki under the direction of Mikko Tolonen.
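The publisher-variant problem described above can be illustrated with a minimal Python sketch. It assumes that variants differ only in case, punctuation, apostrophes, and spacing. The sample strings are invented for illustration and are not drawn from ECCO, and the function names `normalize_name` and `group_variants` are hypothetical. Real cleaning work of the kind described here relies on authority files and expert judgment, which this sketch does not attempt to reproduce.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_name(raw: str) -> str:
    """Reduce a publisher string to a crude comparison key (illustrative only)."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower()
    # Delete apostrophes outright so that "Brown's" and "Browns'" collapse together.
    text = re.sub(r"['’]", "", text)
    # Replace any remaining punctuation with a space, then collapse whitespace.
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return " ".join(text.split())


def group_variants(names):
    """Group raw strings that share the same normalized key."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize_name(name)].add(name)
    return dict(groups)


# Invented sample records: five distinct strings, but only two publishers.
records = ["J. Smith", "J Smith", "J. SMITH,", "T. Brown's Heirs", "T. Browns' heirs"]
groups = group_variants(records)
for key, variants in groups.items():
    print(key, "<-", sorted(variants))
```

A key-based approach of this kind only catches superficial variation. It cannot decide whether two differently spelled names refer to the same person, nor which “Richmond” a record intends, which is why the text stresses the need for painstaking scholarship.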

Luckily, along with mass-digitization projects like ECCO, there have been a range of projects collating key metadata. The research required to clean and edit ECCO is greatly aided by the Consortium of European Research Libraries’ “CERL Thesaurus” and its feeder databases (e.g., Data@bnf and the British Book Trade Index (BBTI)), though expertise and resources are required to make use of these highly technical resources. As discussed later in this section, when accurate, authoritative metadata is combined with cleaned up text, the possibilities for literary scholarship are immense. And as semantic web technologies become more widespread, it should soon be possible to automate and speed up large parts of the process.

Digital reading and modeling methods play a double role in this process of cleaning and editing. On the one hand, until the data itself is properly cleaned and organized, digital methods are of little use for literary interpretation. On the other hand, certain methods such as stylometry can be of great use during and beyond the cleaning phase. Stylometry has helped to confirm the long-suspected collaboration of Christopher Marlowe in the writing of Shakespeare’s plays, for example, and to identify the authors of scandalous political libelle pamphlets attacking Marie-Antoinette in the 1780s.65 In this way, it can help to provide the metadata on which large-scale digital studies rely. As data cleaning proceeds, more reading and modeling techniques become useful. One popular technique, topic modeling, is used to identify clusters of co-occurring words, or “topics,” in order to suggest what a document is about. It can thus be used to identify and interrogate recurring themes within a large set of texts.66 Other popular interpretative techniques include sentiment analysis, which predicts the emotional charge of a text based on its vocabulary and syntax; collocation analysis, which finds pairs, triplets, or larger sets of words that tend to co-occur with one another; and word vectors, where words are represented as points in a high-dimensional space that model their semantic relationships with one another.
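Of the techniques just listed, collocation analysis is the simplest to sketch. The following Python example is an illustration under stated assumptions rather than the method of any project discussed here. It counts adjacent word pairs and scores them by pointwise mutual information (PMI), one common association measure among several. The sample sentence, the `min_count` threshold, and the function names are all invented for the purpose.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep only alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bigram_pmi(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information.

    PMI compares how often a pair actually occurs with how often it would
    occur if the two words were distributed independently of each other.
    """
    word_counts = Counter(tokens)
    pair_counts = Counter(zip(tokens, tokens[1:]))
    n_words = len(tokens)
    n_pairs = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in pair_counts.items():
        if count < min_count:
            continue  # rare pairs give unreliable scores
        p_pair = count / n_pairs
        p_w1 = word_counts[w1] / n_words
        p_w2 = word_counts[w2] / n_words
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Invented sample text, used only to exercise the function.
sample = ("The social contract binds the people, "
          "and the social contract frees the people.")
scores = bigram_pmi(tokenize(sample))
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

In this toy example the pair “social contract” scores higher than “the social,” because “the” is frequent everywhere and so its co-occurrence with any one word is less surprising. On a real corpus the same logic helps separate meaningful phrases from pairs that merely contain common words.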

While these mathematically sophisticated techniques attract attention, it is often mathematically simpler methods that have proven most useful in practice, even when confronting “big data.” In one study, Clovis Gladstone and Charles Cooney used string-matching or sequence alignment techniques originally developed by computer scientists for applications such as spelling correction to identify literary “commonplaces” (frequently repeated phrases) in the ECCO database. They identified the Bible as the origin of 58.5 percent of all those whose origins could be determined, a surprising finding for a century associated with elite religious skepticism and growing secularization.67 Their results bring traditional theories of secularization into question, hinting at a culture still immersed, to an unsuspected degree, in religious imagery. Thus, string-matching techniques initially developed in bioinformatics offer support to the hypothesis that the Enlightenment rational critical tradition had largely religious origins, stemming from the habit of finding disputational evidence in scriptural texts.
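
The general idea of detecting a reused passage by string matching can be sketched briefly. The example below is an assumption-laden illustration, not Gladstone and Cooney's pipeline: it uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for a dedicated sequence aligner, and the function name `shared_passages`, the `min_words` threshold, and the "sermon" sentence are hypothetical.

```python
from difflib import SequenceMatcher


def shared_passages(text_a, text_b, min_words=4):
    """Return word sequences of at least `min_words` found in both texts."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    # autojunk=False stops difflib from discarding very frequent words
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)

    passages = []
    for block in matcher.get_matching_blocks():
        if block.size >= min_words:
            passages.append(" ".join(words_a[block.a:block.a + block.size]))
    return passages


# A well-known scriptural phrase and an invented sentence that reuses it.
source = "for dust thou art and unto dust shalt thou return"
sermon = "remember man that dust thou art and unto dust shalt thou return at last"
print(shared_passages(source, sermon))
```

Exact matching of this kind misses reuse that has been reworded or damaged by OCR errors. Work at the scale of ECCO would need tolerant alignment and far more efficient indexing than this pairwise comparison provides.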

Techniques like topic modeling and string matching analyze the text. But as scholars like Bode are so right to point out (see the section entitled “Modeling Text”), the metadata painstakingly collected by editors and book historians often harbors equally important insights. In book history, the use of such data is often referred to as bibliometry. Recent industrial-scale bibliometry conducted for the Mapping Print, Charting Enlightenment database project has suggested that religious texts were far more prevalent in pre-revolutionary France than previously thought. Previous scholars relied on sources that underreported such books; only large-scale digital bibliometric analysis of supply-side sources produced by the publishing industry could capture the production of religious writing on a grand scale.68 Likewise, the MEDIATE project’s study of private library catalogues and book ownership is uncovering the kinds of books that “mediated” between religious and secular worldviews, demonstrating the importance of overlooked authors like Madame Leprince de Beaumont and Stéphanie-Félicité de Genlis.69 Thus the textual methods used by Cooney and Gladstone can be combined with bibliometry to provide a multifaceted view of Enlightenment secularization.

Our emphasis on the European Enlightenment is not coincidental. As digital humanities emerged, scholars of 18th-century Britain and France in particular had the good fortune to enjoy a particularly privileged position.70 The materials available to them were both immense and finite, making it possible to build large and comprehensive databases. Further, 18th-century scholars very early benefitted from a number of highly significant digital products. ECCO and the Burney collection of British newspapers gave them digital access to the lion’s share of all 18th-century printed products in English. The Old Bailey Online digitized three centuries of London’s criminal court records, making them publicly accessible online with excellent research and analytic tools.71 Historians of France also had ready access via the Electronic Enlightenment database to a corpus of seventy thousand letters written to and from such luminaries as Voltaire and Rousseau. Its superb metadata has empowered the work of Stanford’s outstanding Mapping the Republic of Letters project and the rich prosopographical insights which have stemmed from it.72 These tools have been supplemented by the riches assembled on the digital library of the Bibliothèque nationale de France, Gallica, a national digitization project surpassing all others, and extensive historical bibliometric initiatives such as Mapping Print, Charting Enlightenment and MEDIATE.73 Thus digital humanists have been able to use the 18th century as a laboratory for experimentation, and 18th-century scholars have been at the forefront of attempts to realize the potential of linked data. In 2016, an annual international symposium and scholarly network, Digitizing Enlightenment, was established to further these efforts.74

One attempt to realize such ambitions is the Libraries, Reading Communities and Cultural Formation in the Eighteenth-Century Atlantic World, an international project based at Liverpool University (UK).75 It intends to create a database of every extant 18th-century subscription library catalogue in the English-speaking world (covering about eighty institutions) together with borrower records and, primarily through ECCO, the actual texts held. In this way, the project will enable a fundamentally new kind of literary history, in which corpus analysis is weighted according to readership. Rather than treating texts as fixed units published at a particular time, the project will treat them as items that flow and persist across time and space, overcoming Bode’s objections to bibliographically naïve forms of distant reading. This marriage of corpus linguistics, historical bibliometrics, discourse analysis, and big data analytics offers a powerful new means for exploring and conceptualizing the significance of literary texts and nonfiction works and their relationship to processes of social, cultural, and political change.
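
What it might mean to weight corpus analysis by readership can be shown with a toy calculation. The sketch below is purely illustrative and does not describe the Liverpool project's actual data model: each text's word counts are multiplied by the number of times the text was borrowed, so that heavily read books contribute more to the overall profile. The function name, titles, texts, and borrowing figures are all invented.

```python
from collections import Counter


def weighted_term_frequencies(texts, borrowings):
    """Relative word frequencies, with each text weighted by its borrowings.

    texts      -- mapping of title to full text
    borrowings -- mapping of title to number of recorded loans
    """
    totals = Counter()
    for title, text in texts.items():
        weight = borrowings.get(title, 0)  # unborrowed texts contribute nothing
        for word, count in Counter(text.lower().split()).items():
            totals[word] += count * weight

    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {word: value / grand_total for word, value in totals.items()}


# Hypothetical holdings and loan counts, for illustration only.
texts = {
    "Sermons": "grace grace virtue",
    "Travels": "voyage island virtue",
}
borrowings = {"Sermons": 1, "Travels": 9}

freqs = weighted_term_frequencies(texts, borrowings)
print(sorted(freqs.items(), key=lambda item: item[1], reverse=True))
```

In an unweighted count "grace" is the most frequent word in this tiny corpus. Once loans are taken into account, the vocabulary of the much-borrowed "Travels" dominates instead.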

If the digital revolution has brought new ways of exploring the production, dissemination, and content of texts, it has perhaps even greater potential for recovering and understanding reader response. Reader response, despite its interest to literary and book historians and the best efforts of educationalists and psychologists studying living subjects, remains relatively little understood. In past historical contexts it is particularly difficult to uncover how texts were read, since the act of reading generally leaves little tangible trace. Scholars have generally been forced to rely on printed reviews by highbrow literary reviewers or the equally atypical reading journals or commonplace books of diarists like Samuel Pepys or the Sheffield apprentice Joseph Hunter.76 The pioneering book historian Robert Darnton famously wrote a seminal article on reader responses to Rousseau based on just two readers.77

Innovative attempts to overcome this problem have been made by the Reading Experience Databases (REDs) that exist or are planned for several countries, most of them English-speaking: Britain, Canada, Australia, New Zealand, the Netherlands, and tentatively Finland. These projects, based largely on volunteer crowdsourced labor, gather data on documented reader experience wherever encountered, in some cases (e.g., Britain) over time spans as long as five hundred years, and in others at fixed moments (the New Zealand RED, for example, focuses on reading during World War I).78 This approach is extremely labor-intensive but yields rich information. The online data-entry form gathers data on time of day, location, social context, and postures in which acts of reading occurred, as well as on individual readers and their critical responses to texts. REDs can be hard to maintain, and are often skewed toward the experience of the most voracious readers. Even the largest RED, the British, is limited: thirty thousand entries covering five hundred years of reading equates to an average of just sixty documented acts per year.79 Nonetheless such databases put the study of reading on a fundamentally new footing.

The RED approach to reader experience is not, however, the only possible one. With the proliferation of digitized sources, improved OCR, and improvements in machine learning and sentiment analysis, harvesting dispersed traces of reading is increasingly feasible. The sources for such work include published book reviews; private correspondence; manuscript newsletters; reading journals; mentions of reading in fictional and nonfiction works; footnote citations and commonplaces which cross reference texts; readers’ marginal annotations; school exercise books and university essays; publishers’ correspondence; and censors’ reports.80 There are, additionally, reports of police spies and agents; court, police, and inquisitorial records; and mass observation diaries. Scholars regularly mine sites such as Goodreads and LibraryThing for data about reading practices today, as well as studying probably the most intense and digital form of reader response: fan fiction.81

The discussions of infrastructure, funding, and ambitious megaprojects in this article should not be taken to imply that the significance, utility, and quality of digital humanities projects are best measured by scale. As the jugaad movement demonstrates, technical barriers to entry are not necessarily very great. Effective and impressive projects have been conducted using technology no more complex than a spreadsheet. Recent work of Cheryl Knott is a case in point: she studied six tiny colonial libraries’ holdings, which pointed to significant differences in reading cultures either side of the Potomac.82 A recent project overseen by Gary Kates was similarly modest in method, though great in aspiration. Kates’s students collated WorldCat data on almost five thousand 18th-century editions of 171 leading titles which historians have associated with the Enlightenment, across all major European languages, to assess which books were most frequently reprinted across the century. The most popular proved to be political works, particularly novels, several of which were reprinted across the entire century. Montesquieu, Rousseau, and Marmontel were unsurprisingly among the most reprinted authors, but the two most reprinted books were Madame de Graffigny’s Lettres peruviennes (282 editions across the century) and Fénelon’s Télémaque, an apology for tempered monarchy, which first appeared in 1699 and ran to an astonishing 445 editions.83 This study, whose primary research tool was the search bar on WorldCat, seems to have turned back the clock on Enlightenment historiography by three generations, suggesting that constitutional monarchy was the period’s most popular political theme.84
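
A spreadsheet-level tally of this kind is equally simple to express in code. The following sketch assumes a hypothetical catalogue export with one row per edition; the column names and every row are placeholders, not WorldCat records and not data from the Kates project.

```python
import csv
import io
from collections import Counter

# Hypothetical export: one row per edition (invented placeholder rows).
CATALOGUE_CSV = """title,year,place
Title A,1701,Paris
Title A,1745,Amsterdam
Title B,1760,London
Title A,1789,Paris
Title C,1762,Geneva
Title B,1771,Dublin
"""


def editions_per_title(csv_text):
    """Count how many edition rows each title has."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["title"] for row in reader)


counts = editions_per_title(CATALOGUE_CSV)
for title, n_editions in counts.most_common():
    print(title, n_editions)
```

In practice most of the labor lies in deciding what counts as a distinct edition and in removing duplicate catalogue records. The counting itself is trivial.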

While most projects discussed here gathered their own data, existing textual databases and other digital resources offer near infinite research possibilities for digital humanities researchers and students, and many major digitized research collections are now coming with research tools attached, such as Gale-Cengage’s Digital Scholar Lab. Most digital humanists, however, use free software packages such as Voyant Tools, Gephi, and the stylo package of Maciej Eder and Jan Rybicki (see section entitled “Links to Digital Materials”). Increasingly, humanists are also learning to code, particularly in R, Python, and JavaScript. They release their code on GitHub under permissive licenses, allowing reuse, learning, and scholarly access to the bases of research. Online forums such as Stack Exchange provide copious free advice to beginning programmers. As digital humanities has matured, practitioners have begun to produce comprehensive textbooks describing basic methods and best practices (see “Further Reading”). Yet in the digital humanities research space, agreed methodologies and best practice are not enough. As James E. Dobson has argued, computational tools themselves need to be continually critiqued, and since the humanities, unlike the sciences, lack a defined knowledge frontier, both methods and findings are subjected to constant re-examination and reinterpretation. Envisaged thus, the digital humanities have the potential to continue in the best traditions of critical humanistic enquiry.85


Nothing in the discussion presented here suggests an intent, desire, or ability for digital humanities to replace traditional literary study. Rather, this article has sought to explore the power of computationally enhanced humanities to accommodate the subjective insights of close reading of texts within wider appreciations of the cultural, social, and literary contexts. This involves applying new methods to interpret, understand, and experience texts, by revealing patterns within, between, or beyond individual texts which are not evident or obvious to unaided human perception. It involves the development of new models to make sense of these patterns, and determine their connection to the world beyond the text. The tools, digital archives, datasets, ontologies, and techniques used to read and model literature are still at a relatively early stage, as are scholars’ abilities to interpret the complex visualizations and numerical outputs needed to understand them. These new technologies have changed how scholars encounter literature and have provoked considerable theoretical debate. Digital humanities remains a contested term describing a field in flux. Anyone asked to assess its impact would be wise to concur with Chinese premier Zhou Enlai’s apocryphal response when asked to assess the impact of the yet greater revolution that struck France in 1789: “It is too early to say.”86

Discussion of the Literature

The literature on digital humanities is voluminous. DARIAH-DE, Germany’s peak digital humanities body, maintains a useful bibliography online. The “Further Reading” section lists a number of useful general texts, case studies, and textbooks. In contrast, this section focuses on major works of theory and criticism that have driven debate in digital literary studies.

In the field of computational literary criticism, key early works include Robert Cluett’s Prose Style and Critical Reading and John Burrows’ Computation into Criticism, which were published in 1976 and 1987 respectively.87 Cluett and Burrows harnessed the power of early research computers to study subtle stylistic fluctuations in classic texts, and formulated a strong theory of language as an index of individual personality to justify their methods. Recently several scholars have revived this early vision of precise and subtle computational literary criticism. Stephen Ramsay’s Reading Machines (2011) promotes an anarchic form of digital reading rooted in the philosophy of Paul Feyerabend; Monika Bednarek’s Language and Television Series (2018) combines corpus linguistics with screenwriter interviews to show how race, class, and regional identity are represented in contemporary teleplays; and Martin Paul Eve’s Close Reading with Computers promotes a vision of “narrow deep” reading with computational tools.88

As mentioned in “The Distant Reading Debate” section, in recent decades the dominant trend in computational literary criticism has been toward “distant reading,” the statistical analysis of large corpora of texts to analyze long-term trends. This project is most often associated with Franco Moretti, whose early papers on the subject are collected in Distant Reading.89 Recent works in a similar vein include Matthew Jockers’s Macroanalysis and The Bestseller Code, co-written with Jodie Archer; Andrew Piper’s Enumerations; and Ted Underwood’s Distant Horizons.90 All these works contain useful theoretical reflections as well as practical applications of large-scale analysis. No reader of these works can afford to ignore Nan Da’s thrilling critique, “The Computational Case against Computational Literary Studies,” and the numerous responses it has provoked.91

An alternative vision of digital humanities sidelines the question of “reading,” considering instead how computers allow scholars to build models of literary history. In Humanities Computing, Willard McCarty describes a method of “interactive modelling,” in which scholars continually grapple with the computer’s inability to fully capture reality.92 Moretti himself once promoted a similar method, and in works such as Atlas of the European Novel and Graphs, Maps, Trees, he utilized modeling techniques such as digital mapping and network analysis.93 In A World of Fiction, Katherine Bode offers a profound critique of the way scholarly databases model literary history.94 In this vein, Laura Mandell’s Breaking the Book and Alan Liu’s Friending the Past consider how digital media change the way scholars encounter the literary past.95

This change is evident from the interrelated fields of stylometry, book history, and scholarly editing. Stylometry is now a standard tool of authorship attribution for scholarly editors. Readers interested in the underlying theory may begin with articles by John Burrows, Patrick Juola, and Jan Rybicki and Maciej Eder.96 Readers interested in how digital humanities is transforming book history and the concept of the scholarly edition should consult Jerome J. McGann’s Radiant Textuality, Bode’s A World of Fiction and Paul Eggert’s The Work and the Reader in Literary Studies.97 For practical examples of digital book history, or “historical bibliometrics,” readers may consult Simon Burrows’s and Mark Curran’s volumes on the French Book Trade in Enlightenment Europe, and Bode’s Reading by Numbers.98 The fascinating history of digital books lies beyond the scope of this article: interested readers should instead consult the articles on “Reading in the Digital Era,” “E-Text” and “Hypertext Theory” in the Oxford Research Encyclopedia of Literature.99

To conclude this discussion, no survey would be complete without noting Roopika Risam’s New Digital Worlds.100 Risam’s work is not confined to literary studies, but she identifies crucial power imbalances that undercut the ideal of digital humanities, and describes the pioneering interventions of scholars and theorists from the Global South.

Links to Digital Materials

  • 1947 Archive: A pioneering oral history archive, which collects testimonies of Partition from survivors in India, Pakistan, and Bangladesh. An example of digital “world making” in the Global South.
  • ADHO: The Alliance of Digital Humanities Organizations, which runs the yearly Digital Humanities conference and co-ordinates the Digital Scholarship in the Humanities journal.
  • DARIAH-DE bibliography: A reasonably comprehensive bibliography that is constantly updated, and can be downloaded in the convenient form of a Zotero library.
  • Drama Corpora Project: A huge database of German, Russian, Italian, and other plays, an excellent example of network analysis, text analysis, and linked open data.
  • Gephi: The most popular program for network analysis. Free and relatively intuitive, it provides support for a range of visualization and analysis techniques.
  • Humanist discussion group: This is the leading international forum for the discussion of digital humanities, and has been moderated by Willard McCarty since 1987.
  • Mapping Emotions in Victorian London: An interesting application of geospatial analysis and text analysis.
  • Programming Historian: Provides free, peer-reviewed, and well-pitched tutorials in numerous digital technologies and techniques, in English, Spanish, and French.
  • Python: Along with R, probably the most popular programming language in digital humanities. There are numerous online tutorials.
  • QGIS: One of many free geographic information systems used for mapping and geospatial analysis.
  • Rossetti Archive: Jerome J. McGann’s influential digital edition of Dante Gabriel Rossetti, which drove early debates about digital scholarly editing.
  • RStudio: R is one of the most popular programming languages in digital humanities, and RStudio is a free Integrated Development Environment (IDE) that makes the language easier to use. Many key tools in digital humanities are released as “R packages,” such as Jan Rybicki and Maciej Eder’s stylo package for authorship analysis, Matthew Jockers’s syuzhet package for sentiment analysis, and David Mimno’s mallet package for topic modeling.
  • Text Encoding Initiative: The TEI website contains detailed information about the TEI specifications, and links to training and resources.
  • Trove: A leading example of a national newspaper and document archive, built on open data and crowdsourcing principles, subsequently emulated around the world.
  • Voyant Tools: A popular suite of free text analysis tools, built by Geoffrey Rockwell and the late Stéfan Sinclair.
  • Zotero: The bibliographic software of choice for many digital humanists. The free version is highly functional, and allows for easy use of the DARIAH-DE bibliography.

Further Reading

I. Digital Humanities: General
  • Bodenhamer, David J., John Corrigan, and Trevor M. Harris, eds. The Spatial Humanities. Bloomington: Indiana University Press, 2010.
  • Gold, Matthew K., and Lauren F. Klein, eds. Debates in the Digital Humanities. Minneapolis: University of Minnesota Press, 2016.
  • McCarty, Willard. Humanities Computing. Basingstoke, UK: Palgrave, 2005.
  • Risam, Roopika. New Digital Worlds: Postcolonial Digital Humanities in Theory, Praxis and Pedagogy. Evanston, IL: Northwestern University Press, 2019.
  • Terras, Melissa, Julianne Nyhan, and Edward Vanhoutte, eds. Defining Digital Humanities: A Reader. New York: Routledge, 2013.
II. Digital Humanities: Literary Studies
  • Armstrong, Nancy, Warren Montag, Alison Booth, Johanna Drucker, Andrew Goldstone, Catherine Nicholson, Andrew Piper, Lisa Marie Rhody, Richard Jean So, Matthew Wickman, Bethany Wiggin and Franco Moretti. “Theories and Methodologies: On Franco Moretti’s Distant Reading.” PMLA 132, no. 3 (May 2017): 613–689.
  • Bode, Katherine. A World of Fiction: Digital Collections and the Future of Literary History. Ann Arbor: University of Michigan Press, 2018.
  • Burrows, John. Computation into Criticism: A Study of Jane Austen’s Novels and an Experiment in Criticism. Oxford: Clarendon Press, 1987.
  • Cluett, Robert. Prose Style and Critical Reading. New York: Teachers College Press, 1976.
  • Da, Nan Z. “The Computational Case against Computational Literary Studies.” Critical Inquiry 45, no. 3 (2019): 601–639.
  • Eve, Martin Paul. Close Reading with Computers: Textual Scholarship, Computational Formalism, and David Mitchell’s “Cloud Atlas.” Open access ebook. Stanford, CA: Stanford University Press, 2019.
  • Liu, Alan. Friending the Past: The Sense of History in the Digital Age. Chicago: Chicago University Press, 2018.
  • Mandell, Laura. Breaking the Book: Print Humanities in the Digital Age. Chichester, UK: Wiley-Blackwell, 2015.
  • McGann, Jerome J. Radiant Textuality: Literature after the World Wide Web. Houndmills, UK: Palgrave, 2001.
  • Moretti, Franco. Distant Reading. London and New York: Verso, 2013.
  • Ramsay, Stephen. Reading Machines: Towards an Algorithmic Criticism. Urbana: University of Illinois Press, 2011.
  • Underwood, Ted. Distant Horizons: Digital Evidence and Literary Change. Chicago: Chicago University Press, 2019.
III. Case Studies: Digital Studies of the European Enlightenment
  • Burrows, Simon, and Glenn Roe, eds. Digitizing Enlightenment: Digital Humanities and the Transformation of Eighteenth-Century Studies. Liverpool, UK: Liverpool University Press, 2020.
  • Edmondson, Chloe, and Dan Edelstein, eds. Networks of Enlightenment: Digital Approaches to the Republic of Letters. Liverpool, UK: Liverpool University Press, 2019.
  • Furet, François, ed. Livre et société dans la France du XVIIIe siècle. 2 vols. Paris: Mouton, 1965–1970.
  • Jansen, Paule, ed. L’Année 1778 à travers la presse traitée par ordinateur. Paris: Presses Universitaires de France, 1982.
IV. Textbooks in Digital Method
  • Arnold, Taylor, and Lauren Tilton. Humanities Data in R: Exploring Networks, Geospatial Data, and Text. Cham, Switzerland: Springer, 2015.
  • Jockers, Matthew. Text Analysis with R for Students of Literature. Cham, Switzerland: Springer, 2014.
  • Juola, Patrick. “Authorship Attribution.” Foundations and Trends in Information Retrieval 1, no. 3 (2006): 233–334.
  • Rockwell, Geoffrey, and Stéfan Sinclair. Hermeneutica: Computer-Assisted Interpretation in the Humanities. Cambridge, MA: MIT Press, 2016.


  • 1. On the discipline’s emergence and self-presentation, see Melissa Terras, Julianne Nyhan, and Edward Vanhoutte, eds., Defining Digital Humanities: A Reader (London: Routledge, 2013); and Matthew K. Gold and Lauren F. Klein, eds., Debates in the Digital Humanities (Minneapolis: University of Minnesota Press, 2016).

  • 2. The rhetorical shift has been attributed to John Unsworth, Susan Schreibman, and Ray Siemens in A Companion to the Digital Humanities (Malden, MA: Wiley-Blackwell, 2004). The term entered wider circulation around 2010, especially after the New York Times (November 10, 2010) ran a front-page article featuring a prize-winning student-produced visualization from Stanford University’s Mapping the Republic of Letters digital humanities project.

  • 3. Melissa Terras, “Peering Inside the Big Tent,” in Terras, Nyhan, and Vanhoutte, Defining Digital Humanities, 263–270, 267.

  • 4. Terras, “Peering Inside the Big Tent,” 268.

  • 5. A compelling refutation of this position was offered as early as 1998 by Willard McCarty in “What Is Humanities Computing? Toward a Definition of the Field,” Center for Humanities Computing, February 16, 1998.

  • 6. Matthew Kirschenbaum, “What Is Digital Humanities, and What Is It Doing in English Departments?” in Terras, Nyhan, and Vanhoutte, Defining Digital Humanities, 201.

  • 7. Chris Alen Sula and Heather V. Hill, “The Early History of Digital Humanities: An Analysis of Computers and the Humanities (1966–2004) and Literary and Linguistic Computing (1986–2004),” Digital Scholarship in the Humanities 34, supplement 1 (2019): i190–i206.

  • 8. Franco Moretti, “Conjectures on World Literature,” New Left Review 1 (2000): 54–68. The essay is reprinted in Distant Reading by Franco Moretti (London and New York: Verso, 2013), 43–62.

  • 9. Moretti, Distant Reading.

  • 10. Moretti, Distant Reading, 44.

  • 11. Jodie Archer and Matthew Jockers, The Bestseller Code: Anatomy of the Blockbuster Novel (London: Penguin, 2016); Matthew Jockers, Macroanalysis: Digital Methods and Literary History (Urbana: University of Illinois Press, 2013); Andrew Piper, Enumerations: Data and Literary Study (Chicago: Chicago University Press, 2018); Ted Underwood, Distant Horizons: Digital Evidence and Literary Change (Chicago: Chicago University Press, 2019); and Stanford Literary Lab, “Pamphlets,” January 2007–September 2018.

  • 12. Underwood, Distant Horizons, x.

  • 13. For this critique, made generally in the print media, see Daniel Allington, Sarah Brouillette, and David Golumbia, “Neoliberal Tools (and Archives): A Political History of Digital Humanities,” Los Angeles Review of Books, May 1, 2016; Adam Kirsch, “Technology Is Taking over English Departments: The False Promise of Digital Humanities,” The New Republic, May 3, 2014; Carl Straumsheim, “Digital Humanities Bubble,” Inside Higher Education, May 8, 2014; and Timothy Brennan, “The Digital Humanities Bust,” The Chronicle of Higher Education, October 15, 2017.

  • 14. David Golumbia, “Death of a Discipline,” Differences 25, no. 1 (May 2014): 157.

  • 15. Nan Z. Da, “The Computational Case against Computational Literary Studies,” Critical Inquiry 45, no. 3 (2019): 601–639.

  • 16. Da, “The Computational Case,” 638.

  • 17. Katherine Bode, “Computational Literary Studies: Participant Forum Responses, Day 2,” In the Moment (blog), April 2019; Andrew Piper, “Do We Know What We Are Doing?,” Journal of Cultural Analytics, Debates, January 2020; and Ted Underwood, “Critical Response II: The Theoretical Divide Driving Debates about Computation,” Critical Inquiry 46, no. 4 (2020): 900–912.

  • 18. Johanna Drucker, “Why Distant Reading Isn’t,” PMLA 132, no. 3 (2017): 628–635, 629.

  • 19. Barbara Herrnstein Smith, “What Was ‘Close Reading’? A Century of Method in Literary Studies,” Minnesota Review 87, no. 1 (2016): 73.

  • 20. Jerome J. McGann and Lisa Samuels, “Deformance and Interpretation,” in Radiant Textuality: Literature after the World Wide Web, by Jerome J. McGann (Houndmills, UK: Palgrave, 2001), 105–135; and Stephen Ramsay, Reading Machines: Towards an Algorithmic Criticism (Urbana: University of Illinois Press, 2011).

  • 21. See Wendy Wheeler, “Information and Meaning,” Oxford Research Encyclopedia of Literature, July 30, 2020.

  • 22. See Mark Algee-Hewitt, Marissa Gemma, Ryan Heuser, and Hannah Walser, “Canon/Archive: Large-Scale Dynamics in the Literary Field,” Pamphlets of the Stanford Literary Lab 11 (2016), 3–5; and Pierre Bourdieu, “Le champ littéraire,” Actes de la Recherche en Sciences Sociales 89 (1991): 3–46.

  • 23. Underwood, Distant Horizons, xxi; and Brennan, “The Digital Humanities Bust.”

  • 24. Robert Cluett, Prose Style and Critical Reading (New York: Teachers College Press, 1976).

  • 25. John F. Burrows, Computation into Criticism: A Study of Jane Austen’s Novels and an Experiment in Criticism (Oxford: Clarendon Press, 1987).

  • 26. Burrows, Computation into Criticism, 93–95. See also John Burrows, “Rho-Grams and Rho-Sets: Significant Links in the Web of Words,” Digital Scholarship in the Humanities 33, no. 4 (2018): 725.

  • 27. See, for instance, Monika Bednarek, Language and Television Series: A Linguistic Approach to TV Dialogue (Cambridge, UK: Cambridge University Press, 2018); and Martin Paul Eve, Close Reading with Computers: Textual Scholarship, Computational Formalism, and David Mitchell’s “Cloud Atlas,” Open access ebook (Stanford, CA: Stanford University Press, 2019).

  • 28. Willard McCarty, Humanities Computing (Houndmills, UK: Palgrave, 2005), 27.

  • 29. McCarty, Humanities Computing, 25.

  • 30. McCarty, Humanities Computing, 38.

  • 31. On modeling as an exploratory and experimental tool, see Dennis Yi Tenen, “Towards a Computational Archaeology of Fictional Space,” New Literary History 49, no. 1 (2018): 119–147. On the challenges of modeling in the humanities, see Julia Flanders and Fotis Jannidis, “Data Modelling,” in A New Companion to the Digital Humanities, ed. John Unsworth, Susan Schreibman, and Ray Siemens (Chichester, UK: John Wiley & Sons, 2015), 229–237.

  • 32. See Alan Liu, Local Transcendence: Essays on Postmodern Historicism and the Database (Chicago: Chicago University Press, 2009); Alan Liu, Friending the Past: The Sense of History in the Digital Age (Chicago: Chicago University Press, 2018); and Laura Mandell, Breaking the Book: Print Humanities in the Digital Age (Chichester, UK: Wiley-Blackwell, 2015).

  • 33. McGann, Radiant Textuality, 70; see also Paul Eggert on how digital editing expands the gap between the “archival” and “editorial” impulses: The Work and the Reader in Literary Studies: Scholarly Editing and Book History (Cambridge, UK: Cambridge University Press, 2019), chap. 5.

  • 34. Katherine Bode, A World of Fiction: Digital Collections and the Future of Literary History (Ann Arbor: University of Michigan Press, 2018), chap. 1.

  • 35. See, for instance, Underwood, Distant Horizons, xv–xvii.

  • 36. On networks, see Vincent Labatut and Xavier Bost, “Extraction and Analysis of Fictional Character Networks: A Survey,” ACM Computing Surveys 52, no. 5 (2019): 89:1–89:40. On literary mapping, see Sara Luchetta, “Exploring the Literary Map: An Analytical Review of Online Literary Mapping Projects,” Geography Compass 11, no. 1 (2017). On gaming, virtual reality and augmented reality, see the Hyde project, “which adapts Stevenson’s classic novella, Strange Case of Dr Jekyll and Mr. Hyde, into a pervasive media game driven by players’ bio-data.”

  • 37. Roopika Risam, New Digital Worlds: Postcolonial Digital Humanities in Theory, Praxis and Pedagogy (Evanston, IL: Northwestern University Press, 2019), 32–36.

  • 38. As early as 2008, Diane M. Zorich, A Survey of Digital Humanities Centers in the United States (Washington, DC: Council on Library and Information Resources, November 2008), 48, listed thirty-two surveyed organizations.

  • 39. Large-scale national or transnational infrastructure projects include the ERC’s Europeana and DARIAH initiatives, France’s Gallica, or Australia’s Humanities Networked Infrastructure (Huni). These issues are, for example, raised in the Australian Association for Digital Humanities (AaDH) submission to the Australian Academy for the Humanities “Future Humanities Workforce” consultation, July 28, 2019.

  • 40. Laurent Dousset, “The Politics of Interoperability in France,” unpublished paper to the Digital Humanities Research Group, Western Sydney University, May 3, 2018.

  • 41. See the AaDH submission to the Australian Academy of the Humanities “Future Humanities Workforce” consultation, which calls for “initiatives designed to address concerns around job security, and recognition and career development pathways for the ‘alt-ac’ positions that many ERCs enter (particularly in the digital humanities) as well as pathways for people to flow back and forth between such positions and more traditional academic roles.”

  • 42. Dan Edelstein, “Mapping the Republic of Letters: History of a Digital Humanities Project,” in Digitizing Enlightenment: Digital Humanities and the Transformation of Eighteenth-Century Studies, ed. Simon Burrows and Glenn Roe (Liverpool, UK: Liverpool University Press, 2020), chap. 3.

  • 43. Rachel Sagner Buurma and Laura Heffernan, “Search and Replace: Josephine Miles and the Origins of Distant Reading,” Modernism/Modernity Print+ 3, no. 1, April 11, 2018; and Thomas N. Winter, “Roberto Busa, S. J., and the Invention of the Machine-Generated Concordance,” The Classical Bulletin 75, no. 1 (1999): 3–20.

  • 44. ADHO’s website states that the Association for Literary and Linguistic Computing (now known as the European Association for Digital Humanities) was founded in 1973 and the Association for Computers and the Humanities was founded in 1978 (ADHO, “About,” June 30, 2019).

  • 45. See, for example, Susan Hockey, “The History of Humanities Computing,” in Unsworth, Schreibman, and Siemens, A Companion to the Digital Humanities, 1–19. In her opening paragraphs, Hockey mentions Busa, the Trésor de la Langue Française (now ARTFL), and the Institute of Dutch Lexicography at Leiden, but thereafter mostly discusses American, Canadian, and British projects, notwithstanding a couple of later examples from Italy and Norway.

  • 46. François Furet, ed., Livre et société dans la France du XVIIIe siècle, 2 vols. (Paris: Mouton, 1965–1970).

  • 47. See the outputs of the French Book Trade in Enlightenment Europe project, notably the later chapters of Simon Burrows, The French Book Trade in Enlightenment Europe II: Enlightenment Best-Sellers (London and New York: Bloomsbury, 2018), which draw on many of the sources used by Furet and his collaborators.

  • 48. Paule Jansen, ed., L’Année 1778 à travers la presse traitée par ordinateur (Paris: Presses Universitaires de France, 1982).

  • 49. The best and most comprehensive of digitized newspaper collections include the Burney collection, digitized by Gale-Cengage, and the Australian newspapers in Trove, whose OCR has been meticulously corrected by crowdsourced volunteers.

  • 50. Lou Burnard, What Is the Text Encoding Initiative? How to Add Intelligent Markup to Digital Resources (Marseille: Open Edition Press, 2014), Introduction [Online edition], June 30, 2019.

  • 51. TEI Consortium, eds., “TEI P5: Guidelines for Electronic Text Encoding and Interchange,” TEI Consortium, version 4.1.0, August 19, 2020.

  • 52. Brennan, “The Digital Humanities Bust,” is particularly vocal on this point.

  • 53. On digital humanities and the neoliberal agenda, see Allington, Brouillette, and Golumbia, “Neoliberal Tools (and Archives).”

  • 54. AaDH submission to the Australian Academy of the Humanities, “Future Humanities Workforce,” July 28, 2019.

  • 55. Technische Universität Dresden, “Time Machine Heralds New Era,” March 25, 2019; and Time Machine, “Members.”

  • 56. École Polytechnique Fédérale de Lausanne, “Venice Time Machine,” July 28, 2019.

  • 57. Risam, New Digital Worlds, 43.

  • 58. See Slave Voyages, “Transatlantic Slave Trade Database,” July 28, 2019; and University of Newcastle, Australia, “Colonial Frontier Massacres in Australia, 1788–1930,” July 28, 2019.

  • 59. See Jennifer Ann Skipp, “British Eighteenth-Century Erotica: A Reassessment,” PhD diss., University of Leeds, 2007. Skipp’s contentions about the sexual libertine and misogynist social attitudes underpinning this literature are based on a digitally empowered empirical analysis of sexual acts depicted in a corpus of almost 850 erotic texts. No previous scholar had identified more than 350.

  • 60. See, for example, Ryan Heuser, Franco Moretti, and Erik Steiner, “The Emotions of London,” Pamphlets of the Stanford Literary Lab 13, October 2016.

  • 61. For an overview of this sub-field, see Sara Luchetta, “Exploring the Literary Map,” Geography Compass 11, no. 1 (2017). Major literary mapping projects include A Literary Atlas of Europe, at the Institute of Cartography, ETH Zurich, and Mapping the Lakes: A Literary GIS, at the University of Lancaster.

  • 62. For a vision statement for one domain of inquiry—library history—see Simon Burrows, “Locating the Minister’s Looted Books: From Provenance and Library Histories to the Digital Reconstruction of Print Culture,” Library and Information History 31, no. 1 (2015): 1–17.

  • 63. To take but one example from the author’s experience, a student wanting to study the iconography of the French Revolution in Britain for her Bachelor of Arts dissertation located three thousand newspaper articles on “trees of liberty” drawn from across the 18th century with a single search query using the Burney collection.

  • 64. See also Burrows and Roe, Digitizing Enlightenment.

  • 65. See the two-part treatment of Robert A. J. Matthews and Thomas V. N. Merriam, “Neural Computation in Stylometry: An Application to the Works of Shakespeare and Fletcher,” Literary and Linguistic Computing 8, no. 4 (1993) and 9, no. 1 (1994); and Simon Burrows, A King’s Ransom: The Life of Charles Théveneau de Morande, Blackmailer, Scandalmonger and Master-Spy (London: Continuum, 2010), 152, 246 n. 124.

  • 66. See, for instance, Matthew L. Jockers and David Mimno, “Significant Themes in 19th-Century Literature,” Poetics 41, no. 6 (2013): 750–769.

  • 67. Clovis Gladstone and Charles Cooney, “Opening New Paths for Scholarship: Algorithms to Track Text Reuse in Eighteenth Century Collections Online (ECCO),” in Burrows and Roe, Digitizing Enlightenment, chap. 14.

  • 68. Simon Burrows, “Forgotten Best-Sellers of Pre-Revolutionary France,” French History and Civilisation: Papers from the George Rudé Seminar 7, 2016 seminar (2017): 51–65.

  • 69. Alicia Montoya, “Shifting Perspectives and Moving Targets: From Conceptual Vistas to Bits of Data in the First Year of the MEDIATE Project,” in Burrows and Roe, Digitizing Enlightenment, 195–218; and Alicia Montoya, “French and English Women Writers in Dutch Library Catalogues, 1700–1800: Some Methodological Considerations and Preliminary Results,” in “I Have Heard about You.” Foreign Women’s Writing Crossing the Dutch Border: From Sappho to Selma Lagerlöf, ed. Suzan van Dijk and Jo Nesbitt (Hilversum, The Netherlands: Verloren, 2004), 182–216.

  • 70. See Paddy Bullard, “Digital Humanities and Electronic Resources in the Long Eighteenth Century,” Literature Compass 10, no. 10 (2013): 748–760.

  • 71. Old Bailey Online, version 7.2, June 30, 2017; see also its successor projects, London Lives, 1690–1800, version 2.0, March 2018; and Digital Panopticon, version 1.2.1, February 2020.

  • 72. See Bodleian Libraries, University of Oxford, Electronic Enlightenment project, 2008–2019; and Humanities and Design at CESTA, Stanford University, Mapping the Republic of Letters, 2013.

  • 73. See Bibliothèque nationale de France, Gallica, July 28, 2019.

  • 74. Meetings of the Digitizing Enlightenment symposium have taken place at Western Sydney University (2016) and in Nijmegen (2017), Oxford (2018), and Edinburgh (2019). The 2020 symposium is scheduled for Montpellier.

  • 75. This project, funded by the British Arts and Humanities Research Council, is headed by Mark Towsey and has eight investigators and nine “impact partners” based in Britain, Australia, and North America.

  • 76. Stephen M. Colclough, “Procuring Books and Consuming Texts: The Reading Experience of a Sheffield Apprentice, 1798,” Book History 3 (2000): 21–44.

  • 77. Robert Darnton, “Readers Respond to Rousseau: The Fabrication of Romantic Sensibility,” in The Great Cat Massacre and Other Episodes in French Cultural History, by Robert Darnton (New York: Basic Books, 1984), 217–256.

  • 78. NZ-RED, the New Zealand Reading Experience Database (blog), July 28, 2019.

  • 79. The Open University, “About RED,” July 28, 2019.

  • 80. Nicole Moore, ed., Censorship and the Limits of the Literary: A Global View (London: Bloomsbury, 2015); and Robert Darnton, Censors at Work: How States Shaped Literature (New York: Norton, 2015). See also Nicole Moore, “Censorship,” Oxford Research Encyclopedia of Literature, December 22, 2016.

  • 81. For leading scholarship in fan-fiction studies, see Karen Hellekson and Kristina Busse, The Fan Fiction Reader (Iowa City: University of Iowa Press, 2014).

  • 82. See Cheryl Knott, “Uncommon Knowledge: Late Eighteenth-Century American Subscription Library Collections,” in Before the Public Library: Reading, Community, and Identity in the Atlantic World, 1650–1850, ed. Mark Towsey and Kyle Roberts (Leiden, The Netherlands: Brill, 2018), 149–173.

  • 83. The authors are grateful to Professor Kates for access to the beta version of his online database and permission to publish preliminary provisional summary statistics first presented at Digitizing Enlightenment 2 in Nijmegen in 2017.

  • 84. This finding supports Annelien de Dijn, who rejects a politically radical Enlightenment, as espoused by three generations of American Enlightenment scholars, notably Peter Gay, Robert Darnton, and Jonathan Israel. See Annelien de Dijn, “The Politics of Enlightenment from Peter Gay to Jonathan Israel,” Historical Journal 55, no. 3 (2012): 785–805.

  • 85. James E. Dobson, Critical Digital Humanities: The Search for a Methodology (Champaign: University of Illinois Press, 2019). For an illustrative example of an attempt to propound a “model of reading literary texts that,” in the words of its authors, “synthesizes familiar humanistic approaches with computational ones,” see Hoyt Long and Richard Jean So, “Literary Pattern Recognition: Modernism between Close Reading and Machine Learning,” Critical Inquiry 42, no. 2 (2016): 235–267.

  • 86. Oxford Essential Quotations, ed. Susan Ratcliffe, 6th ed. [online edition] (2018), citing the Financial Times of June 10, 2011, which reports that both American and Chinese sources confirm that Zhou was in fact referring to the 1968 Paris student rising. Retrieved November 2020.

  • 87. Cluett, Prose Style and Critical Reading; and Burrows, Computation into Criticism.

  • 88. Ramsay, Reading Machines; Bednarek, Language and Television Series; and Eve, Close Reading with Computers.

  • 89. Moretti, Distant Reading.

  • 90. Jockers, Macroanalysis; Archer and Jockers, The Bestseller Code; Piper, Enumerations; and Underwood, Distant Horizons.

  • 91. Da, “The Computational Case against Computational Literary Studies.” For responses, see in particular the three “Critical Responses” published by Leif Weatherby, Ted Underwood, and Nan Da in Critical Inquiry 46, no. 4 (2020): 891–924.

  • 92. McCarty, Humanities Computing.

  • 93. Franco Moretti, Atlas of the European Novel: 1800–1900 (London and New York: Verso, 1998); and Franco Moretti, Graphs, Maps, Trees: Abstract Models for a Literary History (London and New York: Verso, 2005).

  • 94. Bode, A World of Fiction.

  • 95. Mandell, Breaking the Book; and Liu, Friending the Past.

  • 96. See, in particular, John Burrows, “‘Delta’: A Measure of Stylistic Difference and a Guide to Likely Authorship,” Literary and Linguistic Computing 17, no. 3 (2002): 267–287; Patrick Juola, “Authorship Attribution,” Foundations and Trends in Information Retrieval 1, no. 3 (2006): 233–334; and Jan Rybicki and Maciej Eder, “Deeper Delta across Genres and Languages: Do We Really Need the Most Frequent Words?,” Literary and Linguistic Computing 26, no. 3 (2011): 315–321.

  • 97. McGann, Radiant Textuality; Bode, A World of Fiction; and Eggert, The Work and the Reader in Literary Studies.

  • 98. Mark Curran, The French Book Trade in Enlightenment Europe I: Selling Enlightenment (London and New York: Bloomsbury, 2018); Burrows, The French Book Trade in Enlightenment Europe II; and Katherine Bode, Reading by Numbers: Recalibrating the Literary Field (London: Anthem, 2012).

  • 99. Lutz Koepnick, “Reading in the Digital Era,” Oxford Research Encyclopedia of Literature, August 31, 2016; Niels Ole Finnemann, “E-Text,” Oxford Research Encyclopedia of Literature, January 21, 2018; and Astrid Ensslin, “Hypertext Theory,” Oxford Research Encyclopedia of Literature, March 31, 2020.

  • 100. Risam, New Digital Worlds.