Lives and Data
- Elizabeth Rodrigues, Burling Library, Grinnell College
Although frequently associated with the digital era, data is an epistemological concept and representational form that has intersected with the narration of lives for centuries. With the rise of Baconian empiricism, methods of collecting discrete observations became the predominant way of knowing the physical world in Western epistemology. Exhaustive data collection came to be seen as the precursor to ultimate knowledge, theorized to have the potential to reveal predictive patterns without the intervention of human theory. Lives came to be seen as potential data collections, on the individual and the social level. As individuals have come to see value in collecting the data of their own lives, practices of observing and recording the self that characterize spiritual and diaristic practices have been inflected by a secular epistemology of data emphasizing exhaustivity in collection and self-improvement goals aimed at personal wellness and economic productivity. At the social level, collecting data about human lives has become the focus of a range of academic disciplines, governmental structures, and corporate business models.
Nineteenth-century social sciences turned toward data collection as a method of explanation and prediction in earnest, and these methods were especially likely to be focused on the lives of minoritized populations. Theories of racial identity and difference emerging from such studies drew on the rhetoric of data as unbiased to enshrine white supremacist logic and law. The tendency to use data to categorize and thereby direct human lives has continued and manifests in 21st-century practices of algorithmic identification. At both the individual and social scales of collection, though, data holds the formal and epistemological potential to challenge narrative singularity by bringing the internal heterogeneity of any individual life or population into view. Yet it is often used to argue for singular revelation, the assignment of particular narratives to particular lives. Throughout the long history of representing lives as data in Western contexts, life writers have engaged with data conceptually and aesthetically in multiple ways: experimenting with its potential for revelation, critiquing its abstraction and totalization, developing data collection projects that are embodied and situated, using data to develop knowledge in service of oppressed communities, calling attention to data’s economic and political power, and asserting the narrative multiplicity and interpretive agency inherent in the telling of lives.
- Non-Fiction and Life Writing
The word “data” comes from the Latin verb dare, to give. Data’s most literal meaning is “the givens,” the foundational truths from which knowledge can be derived. Prior to the rise of empiricist epistemology, “data” referred to theological precepts from which knowledge was deduced.1 In early 21st-century usage, “data” typically refers to discrete recorded points of observation, collected in accordance with some set of disciplinary standards for method and notation. Data’s shift in reference, from divine truth to preinterpretive fleck of earthly reality, reflects the emergence of empiricist methods of knowledge creation. Francis Bacon’s treatise on empirical methods, Novum Organum (1620), proposes that those who would seek to understand the world should first aim to create “a store and collection of particular facts, capable of informing the mind.”2 Bacon emphasizes that the process of creating this collection must strive toward exhaustivity, the recording of all observations within a given scope of inquiry no matter how insignificant they might seem, and exteriorization, the material inscription of collected observations to create a model of reality that exists outside of the self and does not rely on the capacity of human memory.3 After such a collection has been assembled, Bacon asserts, investigators must also “bind themselves to two things: 1) to lay aside received opinions and notions; 2) to restrain themselves, till the proper season, from generalization.”4 The collection of particulars is thus initially meant to forestall the imposition of preexisting narrative on observed reality.
Ultimately, though, data collection is meant to achieve “not uncertainty but fitting certainty,” supposedly revealing a final, true narrative that will transcend the limits of human subjectivity.5 This aim is perhaps most directly expressed by 18th-century astronomer and mathematician Pierre-Simon Laplace, who equated exhaustive data with total prediction, hypothesizing, as Wendy Chun has glossed, that with enough data “an all-knowing intelligence can comprehend the future by apprehending the past and present.”6
But Bacon does not use the word “data,” in English or in Latin, to describe the collections of particulars, a reminder that the epistemological privilege accorded to contemporary concepts of data was not immediate or inevitable. It took until the mid-18th century for English usage of data to conflate the theological and empiricist meanings, and as Jonathan Furner notes,7
A major shift in the dominant interpretation of the concept of data began to take place in the second half of the nineteenth century. Accompanying the rapid development of the statistical and social sciences came the proliferation of systematically organized tables of numerical values, recording and reporting the frequencies and quantities resulting from observations and measurements conducted in accordance with the principles and standards of scientific method. The contents of these tables—the “givens” that, once collected and organized, became the raw materials for new, sophisticated forms of quantitative analysis—began to be known as data.8
As Furner’s historicization suggests, the turn toward massive collection of data to represent and understand lives, spurred by the rise of empiricist and quantitative social sciences, runs parallel to a growing tendency to enshrine data’s epistemological superiority. In popular discourse of the early 21st century, data as the collection of particulars has come to have the status of theological precept, the bedrock upon which inarguable truth can be built.9
The belief in data’s power to reveal remains implicit in much contemporary data rhetoric. danah boyd and Kate Crawford have called this “data mythology,” the assumption that “large data sets offer a higher form of intelligence and knowledge that can generate insights that were previously impossible, with the aura of truth, objectivity, and accuracy.”10 While having truly exhaustive data on any phenomenon or human life has always been an ideal of empiricist epistemology rather than a plausible goal, it remains an aspirational horizon in many areas of inquiry. Technological advance has given rise to modes of science that rely on large-scale, long-term data collection rather than experimentation for the generation of hypotheses.11 The 21st-century term “big data” refers to the confluence of widescale digital data collection, long-term data storage, and computational methods of manipulating large amounts of unstructured data. With the rise of ubiquitous networked computing mediating many life activities around the globe, the ideal of exhaustive data collection of human lives seems newly tenable.12 Accompanying these massive stores of personal data are computing methods that claim to reveal its meaning without the intervention of biased human reasoning.
The field of critical data studies examines the limits of data representation and the potentially harmful impacts of cultural assent to the idea that data can transcend bias and reveal atheoretical knowledge. As Craig Dalton, Jim Thatcher, Rob Kitchin, and Tracey P. Lauriault have outlined, critical data studies aims at “situating data regimes in time and space; exposing data as inherently political and identifying whose interests they serve . . . illustrating the ways in which data are never raw; [and] exposing the fallacies that data can speak for themselves.”13 In theory, the collection of particulars, granted the status of data, promises an ultimate revelation of reality, unsullied by human projections of ideology or theory.14 In practice, human choice affects data at every step of its conception, collection, remediation, and analysis. Media theorist Johanna Drucker has argued that the term “data,” with its passive reference to given truths, would more accurately be called “capta,” fragments of reality chosen to represent the whole.15 Popular discourse also tends to elide the materiality of data, but the physical preservation of data, in any form, is a labor- and resource-intensive endeavor.16 From creation to preservation to analysis, data is full of people.
Data, Narrative, and Lives
Formally, data exists simultaneously as point and collection, recorded particularity and aggregation of particularities. This duality is reflected in colloquial usage of the term, which is grammatically plural but frequently used as a collective singular, referring simultaneously to multiple, independent points and to a body of such points treated as a meaning-bearing entity in itself.17 The slippage between the two uses reflects the tension between our desire for data to serve as preinterpretive, objective reality and our desire for it to reveal meaning, to speak for itself. The term “data-driven” usually refers to the application of interpretive tools after data has been collected, but it is conflated with the existence of data itself, as if simply by being collected data becomes information.18 Data’s dual status as point and collection formally encodes an inherent tension, often overlooked in discourse surrounding data collection and analysis, between data as a method of disrupting narrative projection and data as a method of finalizing narrative prediction.
Data is functionally defined by the methods of collection that create it.19 The specific forms collected as data points vary widely. Quantitative data takes the form of numerical counts and mechanical measurements, while qualitative data can be textual and explicitly subjective. Different procedures of observation, conventions of notation, and scopes of collection reflect data’s widely varying contexts of use. Whatever form of data point is collected, data collections also share two other features: the aspiration to exhaustivity and some form of exteriorization as a model of reality beyond the self. A population study undertaken by public health researchers may gather hundreds of data points for thousands of lives and store them in a nationally networked database, but an individual undertaking a Quantified Self (QS) project typically only collects data on their own life and stores it locally or in personal cloud data storage accounts. Both of these scopes of collection, though, aim at a kind of exhaustive representation, hoping to get beyond representative cases to an overarching truth that can only be revealed through a commitment to data collection. Both also result in some material form inscribed beyond the self.
Data fundamentally conceptualizes lives as collections of discrete points, coequal and coexisting. Every life can be seen as a kind of data collection: a span of moments beginning with birth and ending with death that could, potentially, be captured and represented in some way. As Sidonie Smith and Julia Watson note, life writing is defined by the status of referentiality to an historical life, and therefore by referentiality to such a data collection, actual or hypothetical.20 Formally, data’s collectivity exists in constant tension with narrative’s traditional selectivity. While the data collection insists on the equal potential meaning of each point, narrative demands the selection of points designated as meaningful in the context of a certain beginning, middle, and end, and the exclusion of points that do not contribute to that understanding. In terms of individual life narrative, this tension results in a shift from imagining a life as a selection of meaningful points of experience that cohere in a kind of plot to imagining a life as the collection of all points, experiential or biological, that insist on the multiplicity of potential narrative.21 Imagining life as a data collection can suggest a proliferation of self-narratives rather than singular identities and bring renewed attention to the interpretive acts required to turn data into meaning.22 In terms of narrating group identity, data’s emphasis on collection, at least in theory, demands attention to the internal diversity and coequality of individual lives within any group. As self-tracking technologies model the self as an assemblage of discrete experiences and physical states, the census models a nation as a collection of equally potentially important people, marking an alignment between the form of the data point and the nation-state’s projection of the autonomous individual.23
In the context of data mythology, though, conceiving of lives as data has also functioned to reiterate rigid identity designations and spurred the growth of a global industrial infrastructure of surveillance and prediction. Data’s inherent challenge to narrativity is often overlooked in favor of sweeping designations that claim to identify individuals’ innate, immutable potentials, in effect assigning them a fixed life narrative. Life writing, as a site in which practices of data narration and cognitive relationships to data are foregrounded and explored, is a terrain on which data’s claim to reveal is continually tested and contested. The representation of lives with and in relationship to data collections offers a productive range of sites in which data’s formal potentials and cultural authority can be interrogated and reimagined.
Data Collection as Individual Practice
The inclusion of referential observation in life writing predates the introduction of Baconian empiricism, but the explicit intention to collect data exhaustively for the purposes of retrospective analysis is a definitional feature that marks the difference between traditions of diaristic and spiritual life writing and more data-oriented forms. In classical and medieval Christian life writing, for example, the systematic recording of one’s experiences is typically undertaken to demonstrate one’s alignment with a preexisting narrative template of identity. Augustine collects his experiences in order to narrate his conversion; Margery Kempe records her travels in order to prove herself a candidate for sainthood. Adam Smyth has noted that financial records permeated the self-documentation texts of 17th-century English life writing.24 While these forms could be said to incorporate collected data, they are not primarily defined by a commitment to collect data.
Another important difference between individual life writing practices in general and data-oriented life writing in particular is the narrator’s attitude toward data as a potential bearer of truth. Montaigne’s Essays set out to record the entirety of self, preserving even unflattering observations, yet Montaigne was skeptical of this or any project of data collection revealing ultimate knowledge. As Jacqueline Wernimont suggests, Montaigne’s attitude toward data much more closely aligns with a critical data studies approach than with the rhetoric of data mythology. For Montaigne, Wernimont argues, “the essay was not a transparent record of experience, or an articulation of certain and systematic (or total) knowledge, but instead an active grappling with the idiosyncratic and imprecise nature of knowledge of the world and self.”25 He chastises those who seek to “be beside themselves . . . to escape from their humanity.”26 Data as exteriorized selfhood speciously promises such becoming “beside” one’s self as being able to gain a “true” vantage. But to aspire to know everything, for Montaigne, was to refuse one’s humanity, defined not by knowing everything but by knowing that knowledge was a constant struggle toward imperfect understanding. The idea that collected observations could eventually speak for themselves would have been anathema to such an understanding of human selfhood.
Among forms of life writing, the diary is perhaps the most innately akin to data collection due to its temporal orientation and its foundation in repeated practice. As Philippe Lejeune has argued, “before becoming a text, the private diary is a practice.”27 Diary entries, like data points, are created in ongoing time without knowledge of the future. A diary cannot have the contours of narrative because its writer cannot know the significance of one day’s events with respect to the next. As well, diaries imply a governing principle of collection. Entries are meant to accrue. Felicity Nussbaum contrasts diary with autobiography by linking diary’s formal emphasis on collection to its narrative incoherence. She argues that if the “most typical [concept of] autobiography is one that presents a coherent core of a self with a beginning, middle, and end, and that embodies a later self that derives from a former self” then “nonfictional serial narrative forms, which allow contradictions to coexist without assimilating the dissonance, do not fit autobiographical conventions.”28 Diaries “describe discrete moments of experience” and thus “contest a coherent and stable self” by insisting, as a data collection also does, that each of these discrete points is equally real and representative of self.29 Yet few diaries are defined by a practice that aspires toward exhaustive collection, and even fewer are actually composed in a manner in keeping with such an aspiration.
Compositional practices aside, diaries have often been repurposed as data by readers and researchers. Diaries were collected during the Mass Observation Project, an interdisciplinary social research group originally based in London from 1937 to 1949, and continuing in various forms until the present. The team collected diary entries, surveys, and interviews from UK citizens with the goal of both understanding social dynamics and intervening in these dynamics by heightening cross-class awareness.30 Diaristic writings that were not explicitly created as data collections have also been reimagined as data by literary scholars and historians. One well-known example is the diary of Martha Ballard, an 18th-century US midwife. Ballard’s 10,000-plus entries have been remediated as data multiple times, including historian Laurel Thatcher Ulrich’s hand tabulation of various activities, digital transcription by Robert R. McCausland and Cynthia MacAlman McCausland, and Cameron Blevins’ assemblage of these digital transcriptions for computational analysis.31
The second part of Benjamin Franklin’s Autobiography marks a confluence of data collection as methodological commitment, spiritual exercise, and capitalist productivity booster that signals how these historically divergent concerns will be blended and blurred as data becomes an accepted, at times predominant, way of knowing the self. Franklin devises a “bold and arduous project of arriving at moral perfection” to be implemented through rigorous data collection on his own habits.32 He envisions this project as a substitute for attending worship services, as he finds listening to an inexpert preacher to be a waste of time that he could instead spend actively striving toward spiritual improvement. He constructs a method of data collection and visualization, which he shares in detail: “I made a little book, in which I allotted a page for each of the virtues. I rul’d each page with red ink so as to have seven columns, one for each day of the week, marking each column with a letter for the day.”33 His autobiography reproduces examples of his daily plan and the tracking scheme for seeing how well he achieves it. After following his “plan for self-examination” he “was surpris’d to find myself so much fuller of faults than I had imagined; but I had the satisfaction of seeing them diminish.”34 This section of his narrative illustrates emergent dynamics that will be frequently reprised in the rhetoric of data as a revelatory tool for knowledge: a do-it-yourself ethic that challenges expert pronouncement; an emphasis on personalization that suggests knowledge can and should be customized; a reliance on self-tracking and data visualization to provoke novel insight unattainable through traditional reflective practice; and a desire for self-improvement, especially in the areas of health and work productivity, which are imagined to be roughly synonymous with spiritual virtue.
The Quantified Self
The practitioners of QS, “an international community of users and makers of self-tracking tools who share an interest in ‘self-knowledge through numbers,’” are perhaps the most prominent contemporary heirs to Franklin’s example.35 Practitioners of QS devise “n = 1” studies in order to illuminate their habits, health, happiness, and productivity. Often, these studies involve a biometric tracking device, cloud data storage, and data visualization. In the same way that Franklin imagines that seeing a record of his own actions laid out before him will allow him to intervene in pernicious habits and achieve moral perfection, QS practitioners entertain the prospect that representing themselves through data will allow for insight that would otherwise be inaccessible and will prompt more desirable actions to achieve goals. Encapsulating this seamless equation of knowledge to action, Gary Wolf, cofounder of the website quantifiedself.com, writes in his 2010 New York Times Magazine article “The Data-Driven Life,” “Once you know the facts, you can live by them.”36 Knowing “the facts” is positioned as a prerequisite for “you” to “live,” extending Franklin’s somewhat ironic and self-deprecating project of perfection into a concept of self as machinic process awaiting optimization.
QS has drawn criticism for its reductionist rhetoric of human selfhood and imbrication in larger political economies of surveillance, but closer studies of its practitioners have allowed a more nuanced portrait of how the concept of self as data shapes lived experience.37 Researchers studying the lived experience of QS practitioners also find that self-tracking practices can raise awareness of data’s representational limits, its entanglement in proprietary platforms and corporate profit motives, and the necessity of interpretive agency in constructing data’s meaning. Dawn Nafus and Jamie Sherman find that QSers are forced into a critical perspective on big data’s claim to represent the totality of self as they work around the limits imposed by tracking devices’ built-in measurement schemas and non-interoperable data formats.38 Natasha Schüll tracks vigorous debate between practitioners on the relationship of data and narrative, whether data complicates or clarifies self-understanding, whether self-tracking is better compared to the partial parameters of sketching a portrait or the exhaustive representation of photography.39
QS practices propose to reveal hyperpersonal realities of self through highly customized collection practices, although to the extent that these practices rely on third-party, monetized applications, they also contribute to widescale and often hidden life data aggregation. The online genealogy site Ancestry.com offers another example of how rhetoric of data as ultimate revelation allows companies to capitalize on individual desires to know oneself as and through data. For a fee, Ancestry.com allows users to share and research genealogical information and incorporate the results of personal genome mapping. The rhetoric around such information is typical of data mythology. As Julie Rak has observed, it is presumed that “genealogy or DNA testing will tell us who we ‘really’ are, as advertisements for Ancestry.com appear to promise.”40 Rak finds the implications of such promises far-reaching and concerning because
companies like Ancestry.com make profits from the promise of the identity reveal, and sometimes use the data that genealogists entrust to them for religious or medical purposes without permission, setting the terms for who gets access to information and how much it will cost.41
Data Visualization as Self-Portrait
Data visualization is a popular practice of data presentation and analysis. Like data collection in general, data visualization is often framed as a tool of revelation, offering a tantalizing prospect of total vision that allows data to reveal its own narrative and meaning. As Minna Ruckenstein describes, “Knowing becomes inseparable from the data visualizations; smartphone applications and other monitoring devices act as mediators and translators that contribute to making human reactions and life visible, identifiable and knowable.”42 Data visualization of the self evidences a range of relationships to data’s claim to revelation.
Nicholas Felton’s Annual Reports, produced from 2005 to 2014, are an example of a more-or-less credulous approach to data collection and visualization as self-revelation. Felton composed his first report to visualize his life in 2005, retrospectively compiling and visualizing a range of activity counts produced by devices and services with which he interacted. Although he thought this would be of interest only to friends and family, burgeoning fascination with data collection augmented by his savvy visual design propelled the report to internet virality. His collection methodologies varied over time, and he eventually adopted more customized and personalized tracking techniques. The year 2008, for example, focused on distances traveled, while the 2010 report was composed of data left behind by his recently deceased father. Felton ultimately produced a decade’s worth of reports, some of which have been exhibited in major contemporary art museums. The reports combine compelling typography and infographic design with the prospect of self-knowledge. Felton’s description of the reports in a 2014 interview directly connects the project of data visualization with the aspiration to reveal an inarguable reality:
At the most philosophical level, I believe that this project is a search for TRUTH [sic]. There is a thought within me that if I can record just the right data and display it in the right way, I may discover something (. . .) some crystalline structure, a higher order than I am able to experience.43
That this lofty goal might be hard to reconcile with the banal reality of, for example, the number of times a popular song has been played on a streaming music service in a given year, serves as a reminder that actual data points are only meaningful when writers and readers invest them with the potential to mean. In 2011, Felton, along with another designer, was contracted by Facebook to redesign individual user profiles into what, in its 2021 iteration, is called the Timeline, translating his own experience with and understanding of the self as data into an interface that shapes millions of other users’ self-perception.
Dear Data offers a more embodied and self-consciously constructed approach to self-data visualization. Each week for a year, data design professionals Giorgia Lupi and Stefanie Posavec collected data about a specific area of their lives: “how often we complained, or the times when we felt envious; when we came into physical contact and with whom; the sounds we heard around us.”44 After the end of each week, they turned these collections into postcard-sized, hand-drawn visualizations that they then mailed to each other. They undertook this project in self-conscious relationship to a broader culture of data mythology, as they note in their introduction:
we are said to be living in the age of “Big Data,” where algorithms and computation are seen as the new keys to universal questions, and where a myriad of applications can detect, aggregate, and visualize our data for us to help us become these efficient super-humans.45
Their gloss of the ethos of big data echoes the Enlightenment dream of total representation as a means of total prediction and therefore total control. By collecting data on idiosyncratic personal experience and hand-drawing its visualization, Lupi and Posavec attempt to intervene in an environment of pervasive but inaccessible data collection, where personal data is passively collected, covertly sold, and revealed only to algorithms. They, instead, seek to “approach data in a slower, more analogue way” and characterize their project as “personal documentary” rather than “quantified self.”46 Lupi and Posavec’s explicitly embodied practice of data collection and visualization complicates the idea of the data point and necessitates a reading practice that demands physical engagement. They aim to collect and interact with self-data differently. “Instead of using data just to become more efficient,” they write, “we argue we can use data to become more humane and to connect with ourselves and others at a deeper level.”47 These are utopian claims, especially when made on behalf of a representational form linked with pervasive surveillance of our bodies and actions. Still, their practice of data collection does represent an alternative practice to popular norms of passive collection, computational visualization, and algorithmic analysis.
Data and Race
All these are individual life data collections used as forms of self-representation that are primarily conceived of and controlled by individuals.48 While these forms of data collection suggest the clearest analogies to autobiography and individual forms of life writing in general, they are not the only spheres in which data intersects with the narration of lives. Data is also, and more commonly, collected on the level of groups and populations. The narratives formed on the basis of these large-scale data collections take the form of theories of identity that project permanent and essential difference between groups, algorithmic identifications that categorize individuals, and in some cases government or corporate policy that allocates certain resources and opportunities to individuals assigned certain group identities, ostensibly on the basis of data.
The history of efforts to narrate the data of racialized lives to argue for the existence of biological racial identity and difference is perhaps the paradigmatic context for understanding how the supposedly self-revealing relationship between data and narrative can be used to justify and perpetuate structural inequality. As Ruha Benjamin notes, the concept of race as an empirical reality of human identity is “itself a kind of technology—one designed to separate, stratify, and sanctify the many forms of injustice experienced by racialized groups, but one that people routinely reimagine and redeploy to their own ends.”49 Data has played a role in building and bolstering this technology, but, as Benjamin notes of race, data has also offered avenues of resistance and repurposing.50
Theories of race rooted in biological reality have frequently leveraged data as an evidentiary and rhetorical tool. This tendency became especially prominent in the course of social science’s turn to data in the second half of the 19th century.51 In the United States, the social sciences’ turn to data also coincided with political struggle over legal chattel slavery and, post-Civil War, over the political and civil rights of Black Americans. Khalil Gibran Muhammad has called this historical moment a “racial data revolution” and identifies the publication of Frederick Hoffman’s Race Traits and Tendencies of the American Negro in 1896 as the beginning of data’s centrality to the theory and discourse of race in the United States.52 Hoffman, a naturalized German immigrant and professional actuary, surveyed crime and mortality statistics to argue for Black racial inferiority. Hoffman’s was not the first study to employ data collection as a method; numerous anthropometric efforts preceded his but had been unable to conclusively identify difference between racial groups that exceeded difference within racial groups. Hoffman’s innovation was to focus not on physical characteristics but on life outcomes, which he argued were evidence of inherent traits rather than of a social environment shaped by the effects of enslavement and continuing prejudice.
Instead of considering that predominantly white police officers employed by predominantly white city officials might tend to surveil Black neighborhoods more frequently and enforce legal regulations more rigorously for Black citizens, Hoffman allows counts of crimes committed to stand as evidence of Black character rather than white agency. By taking data such as arrest numbers as self-evident, Hoffman erases the human decisions behind the numbers. Because data was (and often is) popularly seen as objective, it provided a seemingly neutral justification for racist assumptions. This seemingly objective condemnation of Black Americans’ life potentials provided a justification for legal discrimination and continued white indifference to systemic inequity and its effect on Black lives. Predictive policing and recidivism risk assessment tools in many ways represent a more technologically advanced version of Hoffman’s uncritical analysis of crime data, resulting in the targeting of Black communities for increased surveillance and Black defendants for harsher sentencing and stricter probation requirements despite the tools’ supposedly race-blind methods.53
W. E. B. Du Bois’ pioneering sociological research methods repurpose data collection to tell very different stories of Black lives.54 Hired by the University of Pennsylvania to conduct a study of Philadelphia’s Black residents, Du Bois intuited that “the city of Philadelphia had a theory; and that theory was that this great, rich, and famous municipality was going to the dogs because of the crime and venality of its Negro citizens . . . Philadelphia wanted to prove this by figures.”55 Du Bois employed large-scale data collection in order to counter this expectation, rebuking Hoffman and his cohort’s selective data collection practices and the fixed narratives of Black life that they claimed as objective truth.56 In his own autobiography, he describes how he set about creating an exhaustive data set of Philadelphia’s Seventh Ward:
I sent out no canvassers. I went myself. Personally I visited and talked with 5,000 persons. What I could, I set down in orderly sequence on schedules which I made out and submitted to the University for criticism. Other information I stored in my memory or wrote out as memoranda. I went through the Philadelphia libraries for data, gained access in many instances to private libraries of colored folk and got individual information. I mapped the district, classifying it by conditions; I compiled two centuries of the history of the Negro in Philadelphia and in the Seventh Ward.57
His description underscores the difference of his methods and the great personal commitment necessary to see them to fruition as a Black practitioner. Rather than relying on secondhand statistics or sampling methods, he visits nearly every house in the ward. He compiles detailed notes for each household, allowing him to portray the neighborhood’s internal economic diversity. He combines quantitative, qualitative, and historical data to demonstrate the interweaving effects of disenfranchisement, prejudice, and entrenched white supremacy that allow employers to continue paying Black workers less for the same work, creating snowball effects such as increased reliance on short-term credit and deferred or skipped medical care. Finally, his description reminds us that, even as a Harvard-educated researcher, Du Bois also had to negotiate differential working conditions. He is hired as a temporary contractor and does not have a cadre of graduate student research assistants to help him carry out this massive project. There is no illusion of data as disembodied as he physically collects and stores it. He chooses to “ignore the pitiful stipend” to make the most of what opportunity he does have, but he has to work harder, longer, and more precariously in order to bring this unique data set of Black life into existence.58
The reality of data collection as racialized and embodied labor is further underscored in the equally pioneering work of journalist and political organizer Ida B. Wells-Barnett. After reporting on the lynchings of Thomas Moss, Calvin McDowell, and Henry Stewart, Black co-owners of a successful grocery store in Memphis, she was driven from her home by death threats. She launched an anti-lynching campaign built on data collected from “news gathered by white correspondents, compiled by white press bureaus and disseminated among white people.”59 Like Du Bois, she sought to collect exhaustively and used the collected data to refute a preexisting white narrative justifying lynching as a regrettable but unpreventable response to sexual violence against white women. Even as reported by the white press, she discovered, sexual violence was charged in less than a third of the cases (and rarely if ever proven, because lynching circumvents any investigation). Further analysis of the stated charges against lynching victims demonstrated that nearly any type of perceived infraction could be used as a justification. The most accurate and comprehensive justification for lynching shown by this data, she contends, is as an exercise of white power designed to disenfranchise, terrify, and impoverish Black citizens. Even with a massive data set behind her argument, Wells-Barnett’s writing alone did little to move northern white action against lynching. She had to physically circulate the data by traveling as a speaker, undertaking multiple tours in the United States and the United Kingdom. The data of lynching only began to come into existence because of her work, at great personal cost and imminent threat of physical peril, and it only circulated as knowledge because of her continued embodied labor.60
Despite a freighted history of data collection in service of racialization, some activists also see potential for data to play a role in combatting the effects of systemic racism and redistributing some forms of epistemic power. In 2018, Yeshimabeit Milner launched Data for Black Lives (D4BL), a research and organizing collective “committed to the mission of using data science to create concrete and measurable change in the lives of Black people.”61 As Milner observes, “In the world we live in, data is destiny. For Black people, who have been disproportionately harmed by data-driven decision-making, this is especially true.”62 D4BL’s campaign to open Facebook data to Black communities points at some of the interventions necessary to make data collection a tool for Black empowerment: access to Facebook’s data in the form of an anonymized public data trust, input in the development of a code of ethics, and the hiring of Black data and research professionals. The Detroit Community Technology Project outlines similar demands for data justice from city government, such as prioritizing new data creation projects based on community interest and creating more opportunities for residents to engage with open city data outside of digital platforms.63
That D4BL was launched after Facebook’s sale of user data to Cambridge Analytica for the purposes of election influence became public provides just one more example that Black data activists are leaders in a broader struggle to understand and repurpose the power of massive data collection toward more equitable and just ends. The changes for which D4BL advocates would not only empower Black communities in relation to their data but also model an alternative to an extractive data economy that turns daily life into proprietary information without oversight or recourse. As Shaka McGlotten has argued, the emerging concepts and practices of Black data constitute a critical response to the claims of big data.64
Surveillance, Capitalism, and Algorithmic Identification
The promise of others’ data has become one of the most valuable commodities in the global economy. Shoshana Zuboff has termed this development “surveillance capitalism,” defined as a “new form of information capitalism aim[ing] to predict and modify human behavior as a means to produce revenue and market control.”65 As technological capability meets economic incentive, the thought experiment of an exhaustive life data collection for every individual is coming closer to reality. As Paul Arthur elaborates,
over decades of engagement with the Internet individual users will each have generated, by default, an extensive and enduring digital life narrative . . . While much of the data will be self-generated, an increasingly large proportion will come from sources that are external, automated, untraceable, and unknown to the subject.66
Digital traces of online activity are complemented by growing stores of data, born-digital and digitized, generated by daily life. As Kevin Haggerty and Richard Ericson have observed, disparate stores of data are likely to be sold, reaggregated, and used in contexts far removed from their original collection, legally and illegally.67 As breakthroughs in medicine enabled by widescale genome sharing appear side by side with sophisticated techniques of election hacking and disinformation campaigns, it is clear that the data of self represents the potential for both knowledge and vulnerability, on the level of the individual as well as on the level of societies.
While the history of self-tracking stretching from Franklin to Felton presents data collection as an agential act of self-representation, data selfhood is increasingly beyond the purview of individual creation or intervention. John Cheney-Lippold has proposed the concept of algorithmic identity to describe how massive data collection has functionally altered concepts of identity and selfhood. Algorithmic identity is
an identity formation that works through mathematical algorithms to infer categories of identity on otherwise anonymous beings. It uses statistical commonality models to determine one’s gender, class, or race in an automatic manner at the same time as it defines the actual meaning of gender, class, or race themselves.68
Instead of being biologically male, female, or transsexual, empirically upper class or working class, socially Black or white, algorithmic identity temporarily assigns individuals to these categories based on how those categories are being defined by the algorithm itself at the time of its deployment. The same set of behaviors could be assigned differently on a different day based on what other individuals also assigned these categories have been doing. Although individuals still experience themselves as having biological and social identities, “who we are in terms of data depends on how our data is spoken for.”69 This speaking is often done not by humans exercising interpretive agency but by categorizing algorithms, deployed in a wide range of commercial and governmental contexts.70 The categorizations are fluid, but the act of categorization is constant. As Joel Haefner has pointed out, in terms of life writing, the centrality of categorizing algorithms, which produce the subject as a series of on-demand, contingent designations, “calls into question the centrality of narration in constructing identity, just as predictive algorithms themselves erode the centrality of human action in mediated life representations.”71
In many contexts, a categorization is in effect a decision about not only who an individual is but who they will be. Haefner observes, “Data mining and aggregation is used to alter users’ interfaces and hence their online behavior, which in turn can alter real-world behavior.”72 Algorithmic categorizations affect individuals at the level of what advertisements and articles they are shown while browsing the Internet, but also at the level of whether they are offered life insurance, home loans, student loans, subsidized healthcare, public assistance, and job opportunities. Categorization has always been an element of sociality, but algorithmic categorization exploits data as an epistemologically privileged representational form to present findings as true that are, as anyone involved in algorithm design and deployment is aware, probabilistic, often unexplainable, and inevitably biased.73
Autobiographies are beginning to represent and reflect on the cultural and economic presence of data as it shapes lives in the 21st century. Edward Snowden’s Permanent Record portrays the self in the age of mass data collection as a subject at first empowered but then existentially threatened by the emergent potentials of networked data for government surveillance. In 2013, Snowden shared with journalists copied files from his work as a contractor for the US government that confirmed large-scale collection of personal data generated by the digital daily lives of US citizens. Within days of his revelation, the US government promised to prosecute him upon his return to US soil and revoked his passport to limit any further travel. He has since lived in exile in Russia. His autobiography represents the rise of internet infrastructure on a personal and collective level. As an adolescent, he used fledgling hacker skills to gain a sense of power and identity, and after dropping out of college, he was able to parlay these skills into full-time work. Although he began his career as a civil servant, he moved to more lucrative, private-sector positions as a third-party contractor for digital intelligence services. As a contractor, he discovered surveillance tools and programs of data collection far vaster than he had imagined, representing a “historic effort to achieve total access to—and clandestinely take possession of—the records of all digital communications in existence.”74 He realized that when
the ubiquity of collection was combined with the permanency of storage, all any government had to do was select a person or a group to scapegoat and go searching—as I’d gone searching through the agency’s files—for evidence of a suitable crime.75
Even more deeply, though, he feared that the knowledge that a complete data record of one’s self exists imposes “fidelity to memory, identitarian consistency, and so ideological conformity,” epistemologically precluding the individual’s potential to grow and change, and essentially eliminating the conditions of possibility for life narrative as record of development.76
The entanglement of personal data, technologies of daily life, and global economies makes it difficult for individual subjects to meaningfully challenge the widening scope of data surveillance. This is a key dynamic represented by Anna Wiener’s Uncanny Valley, a memoir of her young adulthood as an employee of a Silicon Valley data analytics startup. Even though she initially worked toward a career in print publishing, after a few years of entry-level work she found herself drawn to the creativity, energy, and relatively higher pay of the technology startup world, and she relocated to the San Francisco Bay area for a position at a data analytics startup. Data quickly became an enthralling new way of seeing others’ selves. Wiener recalls, “It did not take long for me to understand the fetish for big data. Data sets were mesmerizing, digital streams of human behavior.”77 The startup’s tool tracks user website behaviors and, as a secondary product, generates profiles for individual users of a particular site based on “streams of personalized, searchable activity, as well as any identifying metadata.”78 The economics of data are ubiquitous, shaping the landscape of labor, sociality, aesthetic taste, and vocational aspiration. At the tech startups where she and her peers worked, “The endgame was the same for everyone: Growth at any cost. Scale above all. Disrupt, then dominate,” a business plan driven by the ideal of “A world improved by companies improved by data.”79 Capitalism became the middleman for data’s epistemological potential, which is less to expand knowledge than to create “a world freed of decision-making, the unnecessary friction of human behavior, where everything . . . could be optimized, prioritized, monetized, and controlled.”80 Snowden’s revelations produced barely a tremor in her data-dependent workplace. Wiener and her peers instead clung to the justification that they were “just a neutral platform” that helped “developers make better apps,” never questioning their “role in facilitating and normalizing the creation of unregulated, privately held databases on human behavior.”81 Her life outside of work was shaped by data just as readily: “I read whatever the other nodes in my social networks were reading. I listened to whatever music algorithm told me to . . . The algorithm told me what my aesthetic was: the same as everyone else I knew.”82 The results of the 2016 presidential election seemed, to Wiener, to signal a reckoning for companies made profitable by enabling the collection, analysis, and instrumentalization of user data, but, at least in her view, business as usual prevailed.
Toward Data Sovereignty and Data Justice
While Snowden’s and Wiener’s life narratives lay out the limits of individual resistance to the normalization of ubiquitous and unfettered collection of personal data, their reckoning with these limits also models a re-engagement with the definitions and terms of democratic citizenship for a generation that had been assured that technological advance would be a universal panacea. While Snowden did not single-handedly end US data collection surveillance programs, his revelations did help spark collective organization around data rights and government policy discussions. Concepts of data sovereignty and data justice are being formulated and debated in a wide range of settings, and these debates are in many ways being led by minoritized and indigenous communities.83 Legal definitions and frameworks protecting life data have been put into place in some jurisdictions. In April 2016, for example, the European Union passed the General Data Protection Regulation, recognizing the existence of a “data subject” who is
an identifiable natural person . . . who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person.84
Theoretical frameworks such as critical data studies, data feminism, and queer data are all characterized by attention to both the intellectual parameters of lives as data and the impacts of a data-centric discourse of knowledge on lived experience, especially in minoritized communities.85 Grassroots organizing around data collection and algorithmic analysis, such as the D4BL movement and the Our Data Bodies collective, is bringing forth new narratives from data and about data.86 Whether these efforts can effectively counter the immense economic and political will behind life data collection and proprietary modes of analysis may, in some part, be determined by how societies grapple with conceiving lives not only as data collections but also as inextricably linked to collective frameworks of meaning and action.
Discussion of the Literature
Critical data studies, focused on the situated, historical, and embodied dimensions of data, builds from a foundational body of work considering the history of epistemic values attached to objectivity and quantification. Lorraine Daston and Peter Galison’s Objectivity (2007), Mary Poovey’s A History of the Modern Fact: Problems of Knowledge in the Sciences of Wealth and Society (2009), and Theodore Porter’s Trust in Numbers: The Pursuit of Objectivity in Science and Public Life (1996) are touchstone works in this area.87 More explicitly data-centered works have followed, including Raw Data is an Oxymoron (2013) edited by Lisa Gitelman, Beautiful Data: A History of Vision and Reason since 1945 (2014) by Orit Halpern, and All Data Are Local: Thinking Critically in a Data-Driven Society (2019) by Yanni Loukissas.88
Critical data studies is a highly interdisciplinary and flourishing field. The open access, peer-reviewed journal Big Data & Society, launched in 2014, frequently features life writing-adjacent work from anthropologists, sociologists, and critical theorists of race, gender, and digital selfhood. Key books in this area include Thinking Big Data in Geography: New Regimes, New Research (2018), edited by Jim Thatcher, Josef Eckert, and Andrew Shears; The Big Data Agenda: Data Ethics and Critical Data Studies (2018) by Annika Richterich; Data Feminism (2020) by Catherine D’Ignazio and Lauren Klein; and Race after Technology: Abolitionist Tools for the New Jim Code (2019) by Ruha Benjamin.89 John Cheney-Lippold’s We Are Data: Algorithms and the Making of Our Digital Selves (2018) brings a critical data studies approach to the nature of selfhood and identity in the context of algorithmic identification.
The QS community has been a particularly salient site for considering the effects of datafication on concepts of selfhood at an individual and societal level. Gina Neff and Dawn Nafus’s Self-Tracking (2016), Nafus’s edited collection Quantified: Biosensing Technologies in Everyday Life (2016), and Deborah Lupton’s The Quantified Self all examine QS practice and lived experience.90
Considerations of data as a representational form and epistemological concept distinct from, if inextricably related to, the broader landscape of networked digital information have been relatively infrequent in literary studies broadly and life writing studies specifically. Nancy Katherine Hayles’s work on database aesthetics is an early example.91 But as the human concerns surfaced by critical data studies become more apparent, more literary and life writing scholars are beginning to consider data as form and concept. Jacqueline Wernimont considers the formal effects of quantification on the representation of lives in Numbered Lives: Life and Death in Quantum Media (2018).92 The collection Bodies of Information: Intersectional Feminism and the Digital Humanities (2018), edited by Wernimont and Elizabeth Losh, includes several essays that look at data and data visualization as representational forms with specific implications for understandings of race, gender, and professional identity.93 Similarly, Julie Rak and Anna Poletti’s edited collection Identity Technologies: Constructing the Self Online presents many essays that intersect with issues around life data collection in the course of digital self-representation.94 Paul Longley Arthur, Kylie Cardell, Joel Haefner, and Julie Rak have also considered data collection and its implications for selfhood and life narrative.95
- Benjamin, Ruha. Race after Technology: Abolitionist Tools for the New Jim Code. Cambridge, UK: Polity Press, 2019.
- Bigo, Didier, Engin Isin, and Evelyn Ruppert. Data Politics. Milton Park, UK: Taylor & Francis, 2019.
- boyd, danah, and Kate Crawford. “Critical Questions for Big Data: Provocations for a Cultural, Technological, and Scholarly Phenomenon.” Information, Communication & Society 15, no. 5 (2012): 662–679.
- Cheney-Lippold, John. We Are Data: Algorithms and the Making of Our Digital Selves. New York: New York University Press, 2018.
- D’Ignazio, Catherine, and Lauren Klein. Data Feminism. Cambridge: MIT Press, 2020.
- Eubanks, Virginia. Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. New York: St. Martin’s Press, 2019.
- Furner, Jonathan. “‘Data’: The Data.” In Information Cultures in the Digital Age. Edited by Matthew Kelly and Jared Bielby, 287–304. Wiesbaden, Germany: Springer, 2016.
- Gitelman, Lisa, ed. Raw Data is an Oxymoron. Cambridge: MIT Press, 2013.
- Halpern, Orit. Beautiful Data: A History of Vision and Reason since 1945. Durham, NC: Duke University Press, 2014.
- Loukissas, Yanni. All Data Are Local: Thinking Critically in a Data-Driven Society. Cambridge: MIT Press, 2019.
- Noble, Safiya Umoja. Algorithms of Oppression: How Search Engines Reinforce Racism. New York: New York University Press, 2018.
- Poletti, Anna, and Julie Rak. Identity Technologies: Constructing the Self Online. Madison: University of Wisconsin Press, 2014.
- Richterich, Annika. The Big Data Agenda: Data Ethics and Critical Data Studies. London: University of Westminster Press, 2018.
- Thatcher, Jim, Andrew Shears, and Josef Eckert, eds. Thinking Big Data in Geography: New Regimes, New Research. Lincoln: University of Nebraska Press, 2018.
- Wernimont, Jacqueline. Numbered Lives: Life and Death in Quantum Media. Cambridge: MIT Press, 2018.
1. For more on the history of word usage, see Daniel Rosenberg, “Data before the Fact,” in Raw Data is an Oxymoron, ed. Lisa Gitelman (Cambridge: MIT Press, 2013), 15–40, and Jonathan Furner, “‘Data’: The Data,” in Information Cultures in the Digital Age, ed. Matthew Kelly and Jared Bielby (Wiesbaden, Germany: Springer, 2016), 287–306.
2. Francis Bacon, Novum Organum, trans. Joseph Devey (New York: P. F. Collier, 1902), 78.
3. On aspiring to exhaustivity, see Bacon, Novum Organum, 95. On exteriorization, see 80–81.
4. Bacon, Novum Organum, 106–107.
5. Bacon, Novum Organum, 101.
6. Wendy Hui Kyong Chun, Programmed Visions: Software and Memory (Cambridge: MIT Press, 2011), 9.
7. Rosenberg, “Data before the Fact,” 15.
8. Furner, “‘Data,’” 295.
9. Lisa Gitelman and Virginia Jackson, “Introduction,” in Raw Data, ed. Gitelman, 1.
10. danah boyd and Kate Crawford, “Critical Questions for Big Data: Provocations for a Cultural, Technological, and Scholarly Phenomenon,” Information, Communication & Society 15, no. 5 (2012): 662–679, at 663.
11. See Gordon Bell, “Foreword,” in The Fourth Paradigm: Data-Intensive Scientific Discovery, ed. Tony Hey, Stewart Tansley, and Kristin Tolle (Redmond, WA: Microsoft Research, 2009), xi–xv.
12. See boyd and Crawford, “Critical Questions for Big Data.”
13. Rob Kitchin and Tracey P. Lauriault, “Toward Critical Data Studies: Charting and Unpacking Data Assemblages and Their Work,” in Thinking Big Data in Geography: New Regimes, New Research, ed. Jim Thatcher, Andrew Shears, and Josef Eckert (Lincoln: University of Nebraska Press, 2018), 7.
14. For one widely cited version of this claim, see Chris Anderson, “The End of Theory,” Wired 16, no. 7 (2008): 108.
15. See Johanna Drucker, “Graphical Approaches to the Digital Humanities,” in A New Companion to Digital Humanities, ed. John Unsworth, Ray Siemens, and Susan Schreibman (Chichester, UK: John Wiley & Sons, 2015), 238–250.
16. See Jean-Christophe Plantin, “Data Cleaners for Pristine Datasets: Visibility and Invisibility of Data Processors in Social Science,” Science, Technology, & Human Values 44, no. 1 (2018): 52–73; and Luke Munn, “Injecting Failure: Data Center Infrastructures and the Imaginaries of Resilience,” The Information Society 36, no. 3 (2020): 167–176.
17. Gitelman and Jackson, “Introduction,” 8.
18. Defined by philosopher of information Luciano Floridi as “data plus meaning.” Floridi, Information: A Very Short Introduction (Oxford: Oxford University Press, 2010), 21.
19. Roberto Franzosi, From Words to Numbers: Narrative, Data, and Social Science (Cambridge, UK: Cambridge University Press, 2004), 186.
20. Sidonie Smith and Julia Watson, Reading Autobiography: A Guide for Interpreting Life Narratives, 2nd ed. (Minneapolis: University of Minnesota Press, 2010), 1–3, 10.
21. For further consideration of the formal implications on narrative of conceptualizing life as data collection, see Elizabeth Rodrigues, “‘Contiguous but Widely Separate’ Selves: Im/Migrant Life Narrative as Data-Driven Form,” Biography 38, no. 1 (2015): 56–71.
22. Even for life narrators who reject the multiplicity of self implied by the concept of self as data collection, encountering their own data prompts a reckoning with the selective acts that form narrative, evidenced by the emergence of algorithms and commercial services that attempt to automate the selection of meaningful moments from amassed personal data. For a deeper look at companies proposing to turn life data into narrative for users, see Kylie Cardell, “Modern Memory-Making: Marie Kondo, Online Journaling, and the Excavation, Curation, and Control of Personal Digital Data,” a/b Auto/Biography Studies 32, no. 3 (2017): 499–517.
23. As Theodore Porter notes, “Implicitly, at least, statistics tended to equalize subjects. It makes no sense to count people if their common personhood is not seen as somehow more significant than their differences.” Theodore Porter, The Rise of Statistical Thinking, 1820–1900, new ed. (Princeton, NJ: Princeton University Press, 2020), 25.
24. See Adam Smyth, “Money, Accounting, and Life-Writing, 1600–1700: Balancing a Life,” in A History of English Autobiography, ed. Adam Smyth (Cambridge, UK: Cambridge University Press, 2016), 86–99.
26. Wernimont, Numbered Lives, 104.
27. Philippe Lejeune, On Diary, ed. Jeremy D. Popkin and Julie Rak, trans. Kathy Durnin (Honolulu: University of Hawaii Press, 2009), 31.
28. Felicity Nussbaum, “Toward Conceptualizing Diary,” in Studies in Autobiography, ed. James Olney (Oxford: Oxford University Press, 1988), 130.
29. Nussbaum, “Toward Conceptualizing Diary,” 132.
32. Benjamin Franklin, The Autobiography of Benjamin Franklin, ed. Peter J. Conn (Philadelphia: University of Pennsylvania Press, 2005), 65.
33. Franklin, Autobiography, 68.
34. Franklin, Autobiography, 70.
37. See Jose van Dijck, “Datafication, Dataism and Dataveillance: Big Data between Scientific Paradigm and Ideology,” Surveillance & Society 12, no. 2 (2014): 197–208; Gavin J. D. Smith and Ben Vonthethoff, “Health by Numbers? Exploring the Practice and Experience of Datafied Health,” Health Sociology Review 26, no. 1 (2014): 6–21; Joseph E. Davis and Paul Scherz, “Persons without Qualities: Algorithms, AI, and the Reshaping of Ourselves,” Social Research 86, no. 4 (2019): xxxiii–xxxix.
38. See Dawn Nafus and Jamie Sherman, “This One Does Not Go Up to 11: The Quantified Self Movement as an Alternative Big Data Practice,” International Journal of Communication 8 (2014): 1784–1794.
39. See Natasha D. Schüll, “The Data-Based Self: Self-Quantification and the Data-Driven (Good) Life,” Social Research 86, no. 4 (2019): 909–930.
40. Julie Rak, “Radical Connections: Genealogy, Small Lives, Big Data,” a/b Auto/Biography Studies 32, no. 3 (2017): 483.
41. Rak, “Radical Connections,” 482.
42. Minna Ruckenstein, “Visualized and Interacted Life: Personal Analytics and Engagements with Data Doubles,” Societies 4, no. 1 (2014): 80.
43. Nicholas Felton, “Tracing My Life,” in New Challenges for Data Design, ed. David Bihanic (London: Springer, 2015), 342.
44. Giorgia Lupi and Stefanie Posavec, Dear Data: A Friendship in 52 Weeks of Postcards (New York: Princeton Architectural Press, 2016), ix.
45. Lupi and Posavec, Dear Data, xi.
46. Lupi and Posavec, Dear Data, xi.
47. Lupi and Posavec, Dear Data, xi.
48. Although not exclusively, as the dependence of some QS practices on third-party technology companies illustrates.
50. For more historical examples of repurposing the tools and rhetoric of data to combat racialization and racism, see Britt Rusert, Fugitive Science: Empiricism and Freedom in Early African American Culture (New York: New York University Press, 2017).
51. See Nancy Leys Stepan and Sander L. Gilman, “Appropriating the Idioms of Science: The Rejection of Scientific Racism,” in The Bounds of Race: Perspectives on Hegemony and Resistance, ed. Dominick LaCapra (Ithaca, NY: Cornell University Press, 1991), 72–103.
52. Khalil Gibran Muhammad, The Condemnation of Blackness: Race, Crime, and the Making of Modern Urban America (Cambridge, MA: Harvard University Press, 2010), 33.
53. See Sarah Brayne, Alex Rosenblat, and danah boyd, Predictive Policing, Data & Civil Rights Conference, Data & Society, October 10, 2015. See also Julia Angwin et al., “Machine Bias,” ProPublica, May 23, 2016.
54. For a landmark study on Du Bois’s innovations and their impact on sociology as a discipline, see Aldon Morris, The Scholar Denied: W. E. B. Du Bois and the Birth of Modern Sociology (Berkeley: University of California Press, 2015).
55. W. E. B. Du Bois, Dusk of Dawn: An Essay Toward an Autobiography of a Race Concept (New York: Harcourt, Brace, 1940), 58.
56. See Mia Bay, “The World Was Thinking Wrong About Race: The Philadelphia Negro and Nineteenth-Century Science,” in W. E. B. Du Bois, Race, and the City: The Philadelphia Negro and Its Legacy, ed. Michael B. Katz and Thomas Sugrue (Philadelphia: University of Pennsylvania Press, 1998), 40–59.
57. W. E. B. Du Bois, The Autobiography of W. E. B. Du Bois: A Soliloquy on Viewing My Life from the Last Decade of Its First Century (New York: International Publishers, 1968), 198.
58. Du Bois, Dusk of Dawn, 51.
59. Ida B. Wells-Barnett, On Lynchings: Southern Horrors, a Red Record, Mob Rule in New Orleans (Salem, NH: Ayer, 1991), 71.
60. See Ida B. Wells-Barnett, Crusade for Justice: The Autobiography of Ida B. Wells (Chicago: University of Chicago Press, 1970). For additional analysis of the intersection between the work of data collection and the autobiographies of Du Bois and Wells-Barnett, see Elizabeth Rodrigues, Collecting Lives: Critical Data Narrative as Modernist Aesthetic in Early Twentieth-Century US Literatures (Ann Arbor: University of Michigan Press, forthcoming spring 2022).
62. Yeshimabeit Milner, “An Open Letter to Facebook from the Data for Black Lives Movement,” April 4, 2018.
64. Shaka McGlotten, “Black Data,” in No Tea, No Shade, ed. E. Patrick Johnson (Durham, NC: Duke University Press, 2020), 262–286.
65. Shoshana Zuboff, “Big Other: Surveillance Capitalism and the Prospects of an Information Civilization,” Journal of Information Technology 30, no. 1 (2015): 75.
66. Paul Longley Arthur, “Things Fall Apart: Identity in the Digital World,” Life Writing 14, no. 4 (2017): 542.
67. Kevin D. Haggerty and Richard V. Ericson, “The Surveillant Assemblage,” British Journal of Sociology 51, no. 4 (2000): 613.
68. John Cheney-Lippold, “A New Algorithmic Identity: Soft Biopolitics and the Modulation of Control,” Theory, Culture & Society 28, no. 6 (2011): 165.
71. Joel Haefner, “Modest_Witness in the Wire: Haraway, Predictive Algorithms, and Online Profiling,” a/b Auto/Biography Studies 34, no. 3 (2019): 414.
72. Haefner, “Modest_Witness,” 405–406.
73. See Jenna Burrell, “How the Machine ‘Thinks’: Understanding Opacity in Machine Learning Algorithms,” Big Data & Society 3, no. 1 (2016): 1–12.
74. Edward J. Snowden, Permanent Record (New York: Metropolitan Books, 2019), 178.
75. Snowden, Permanent Record, 185.
76. Snowden, Permanent Record, 47.
77. Anna Wiener, Uncanny Valley: A Memoir (New York: MCD, Farrar, Straus & Giroux, 2020), 42.
78. Wiener, Uncanny Valley, 44.
79. Wiener, Uncanny Valley, 136.
80. Wiener, Uncanny Valley, 136.
81. Wiener, Uncanny Valley, 83–84.
82. Wiener, Uncanny Valley, 187.
83. For an overview of data sovereignty definitions and contexts, see Patrik Hummel et al., “Data Sovereignty: A Review,” Big Data & Society 8, no. 1 (2021): 1–17. For an overview of data justice definitions and contexts, see Linnet Taylor, “What Is Data Justice? The Case for Connecting Digital Rights and Freedoms Globally,” Big Data & Society 4, no. 2 (2017): 1–14. Hummel et al., “Data Sovereignty,” for example, find in their review of the literature that Indigenous theorists of data sovereignty offer the most robust and avowedly emancipatory frameworks for the concept.
85. See Andrew Iliadis and Federica Russo, “Critical Data Studies: An Introduction,” Big Data & Society 3, no. 2 (2016): 1–7. See also Amelia Abreu, “Quantify Everything: A Dream of a Feminist Data Future,” Model View Culture, February 24, 2014; Catherine D’Ignazio and Lauren Klein, Data Feminism (Cambridge: MIT Press, 2020); and Bonnie Ruberg and Spencer Ruelos, “Data for Queer Lives: How LGBTQ Gender and Sexuality Identities Challenge Norms of Demographics,” Big Data & Society 7, no. 1 (2020): 1–12.
87. See Lorraine Daston and Peter Galison, Objectivity (New York: Zone Books, 2007); Mary Poovey, A History of the Modern Fact: Problems of Knowledge in the Sciences of Wealth and Society (Chicago: University of Chicago Press, 2009); Theodore Porter, Trust in Numbers: The Pursuit of Objectivity in Science and Public Life (Princeton, NJ: Princeton University Press, 1996).
88. Gitelman, ed., Raw Data; Orit Halpern, Beautiful Data: A History of Vision and Reason since 1945 (Durham, NC: Duke University Press, 2014); Yanni Loukissas, All Data Are Local: Thinking Critically in a Data-Driven Society (Cambridge: MIT Press, 2019).
89. Thatcher, Shears, and Eckert, eds., Thinking Big Data in Geography; Annika Richterich, The Big Data Agenda: Data Ethics and Critical Data Studies (London: University of Westminster Press, 2018); D’Ignazio and Klein, Data Feminism; and Benjamin, Race after Technology.
90. Gina Neff and Dawn Nafus, Self-Tracking (Cambridge: MIT Press, 2016); Dawn Nafus, ed., Quantified: Biosensing Technologies in Everyday Life (Cambridge: MIT Press, 2016); Deborah Lupton, The Quantified Self (Oxford: Polity Press, 2016).
91. Nancy Katherine Hayles, “Narrative and Database: Natural Symbionts,” PMLA: Publications of the Modern Language Association of America 122, no. 5 (2007): 1603–1608.
92. Wernimont, Numbered Lives.
93. Elizabeth Losh and Jacqueline Wernimont, eds., Bodies of Information: Intersectional Feminism and the Digital Humanities (Minneapolis: University of Minnesota Press, 2019).
95. See Paul Longley Arthur, “Data Portraits: Identity, Privacy, and Surveillance,” a/b Auto/Biography Studies 32, no. 2 (2017): 371–373; Paul Longley Arthur, “Coda: Data Generation,” Biography 38, no. 2 (2015): 312–320; Cardell, “Modern Memory-Making”; Haefner, “Modest_Witness”; Rak, “Radical Connections.”