
Computers in Libraries

Vol. 42 No. 10 — December 2022
THE DIGITAL ARCHIVIST

The Environmental Impact of Digital Preservation—Can Digital Ever Go Green?
by Jan Zastrow


I intended to write this article back in 2020, as a follow-on to my December 2019 column, “Environmental Sustainability and Climate Action in Libraries and Archives” (infotoday.com/cilmag/dec19/Zastrow--Environmental-Sustainability-and-Climate-Action-in-Libraries-and-Archives.shtml). In the subsequent months, the COVID-19 pandemic and its ongoing effects overshadowed concerns about our carbon footprint, but now it’s time to remember and to focus. World leaders are calling for transformational technologies that can help reduce CO2 greenhouse gas (GHG) emissions and slow down or reverse the effects of radical climate change. Climate action is no longer optional; it is a moral and economic imperative. Addressing environmental sustainability in libraries and archives is doing our part as global citizens and as conscientious information professionals.

Why Us?

What does this have to do with our collections and our institutions? Since the dawn of the digital age, knowledge and cultural heritage organizations (CHOs) have been intrinsically linked with information and communications technologies (ICTs). This includes the internet, wireless networks, cellphones, computers, software, videoconferencing, social networking, and other applications and services that enable users to access, retrieve, store, transmit, and manipulate information in digital form. It is the interconnected web of communication technologies that makes up our current digital environment.

Our profession intersects with ICT in how we create, collect, manage, and preserve digital collections. Simply put, energy consumption and hardware are required to keep our digital files alive; using digital technologies implicates libraries and archives in contributing to climate change. Painful as it is to acknowledge, we must countenance the culpability of our organizations—memory and knowledge institutions—as part of a system that has perpetuated values of capitalism and globalization and engendered the rapid onset of environmental degradation and climate catastrophe. Granted, what galleries, libraries, archives, and museums (GLAM) contribute to the global carbon footprint is minuscule compared with international corporations, the military-industrial complex, cryptocurrency mining, and the like. But that we have been documenting the decline and done little about it only reinforces that uncomfortable awareness.

We Didn’t See This Coming

The irony is that digitization was supposed to make us greener: more sustainable from the perspectives of infrastructure, budget, and the environment. But our current digital preservation strategies rely heavily on server farms and cloud storage. Given the extraordinary amount of electricity required to operate data centers (the physical infrastructure, electricity, cooling, and networking facilities that house server farms), can we continue to believe this is environmentally responsible? Can we ignore the downstream implications of our own well-intended policies, workflows, and preservation protocols? Approaches such as LOCKSS (lots of copies keep stuff safe) worked fine in the dawning age of the internet, but 30 years on, we are drowning in data and in the GHG emissions resulting from its preservation. How can we scale back, respecting the environment from a resource perspective, so that finite reserves can be used more wisely?

Linda Tadic, CEO of Digital Bedrock, has been paying attention to these concerns for a long time. In a recent ALA webinar, she asserts, “Every action by individuals, organizations, corporations, governments, and cultural heritage organizations impacts the environment.” She quotes 2019 estimates that a whopping two-thirds of global fossil carbon dioxide emissions were due to the combined use of electricity for on-premises and cloud storage (44.3%) and industry, aka manufacturing hardware such as servers and computers (22.4%). By 2030, data center emissions are expected to increase despite improvements in efficiency and cooling (Tadic 2022).

Happily, there are constant advances in new technology, and data centers are becoming more environmentally aware all the time. In his December 2019 column, “Working Toward More Sustainable Technology,” Marshall Breeding outlined these developments. I checked back with him for an update. He said, “I certainly agree that libraries must be concerned with using greener technologies. Much of what I said in that 2019 column is still true today. Cloud-based systems tend to rely on hosting facilities that are much more energy efficient than locally hosted systems.” (By the way, that whole December 2019 issue was devoted to environmental sustainability and is available in full text at infotoday.com/cilmag/dec19/index.shtml.)

Metadata Mushroom

It isn’t just the snowball of data per se that we’re striving to preserve as we create born-digital content and engage in mass digitization projects. The accompanying bundle of metadata surrounding each record adds substantially to the volume. Advancements that allow for higher-quality digitization, particularly with audiovisual content, result in huge files that bloat our servers unnecessarily. A single minute of digitized film can result in a 100GB file, which dramatically increases digital storage needs. “Cultural heritage professionals should critically examine standards and practices for file formats of born-digital and digitized content at their organizations. Should every item be migrated or digitized to the highest quality possible? Are there true preservation benefits to high-quality digital surrogates?” (Pendergrass et al. 2019). We should limit use of “highest possible” resolution for all but permanent preservation masters to keep file sizes manageable.
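To put the 100GB-per-minute figure in perspective, here is a rough back-of-the-envelope sketch. Only the 100GB figure comes from the text above; the 90-minute runtime, the three redundant copies, and the function name `storage_tb` are illustrative assumptions.

```python
# Rough storage estimate for digitized film.
# The 100 GB-per-minute figure is quoted in the article; the runtime and
# number of copies used below are illustrative assumptions only.

GB_PER_MINUTE = 100  # one minute of digitized film, per the article


def storage_tb(runtime_minutes: float, copies: int = 1) -> float:
    """Return the storage needed in terabytes (decimal: 1 TB = 1000 GB)."""
    return runtime_minutes * GB_PER_MINUTE * copies / 1000


if __name__ == "__main__":
    # Hypothetical 90-minute film, stored once and then as three copies.
    print(storage_tb(90))     # 9.0 TB for a single copy
    print(storage_tb(90, 3))  # 27.0 TB for three redundant copies
```

Under these assumptions, storage grows in direct proportion to runtime and to the number of copies kept, so a "lots of copies" strategy multiplies the footprint of every high-resolution master.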

Ethical Dilemmas, Human Costs

In addition to technical and environmental matters, going digital is not without ethical quandaries. Server farms and data centers are typically located in areas in which land is cheap, so poor/rural/Indigenous communities bear the burden of that unsightly blight and noise pollution. Tadic points out another wrinkle: “The global droughts experienced in the last two years have impacted communities that live in regions where there are many large data centers. Data centers require vast amounts of water for chilling, so water could be diverted from humans to keep those data centers cool. It’s not just electricity and power use we need to be mindful of.”

Librarian/archivist Stacie Williams goes a step further: “Our obsession with collecting and the ‘throw it in the cloud/storage is cheap’ mentality is in danger of destroying our built environment and our communities.” She frames the issues through a filter of care ethics, defined as a “moral and political philosophy that not only engages our community through basic acts of caring, but also provides justice to communities through that care.” Williams ponders how we can express care ethics in a digital library setting in a way that is sustainable and how we can express the act of preservation in a way that cares for people (2017).

Also concerning are issues of sweatshop labor, natural resource exploitation, and the pollution and illness that result from the manufacture and disposal of electronics. Librarian Jennifer Poggiali proposes that information professionals practice ethical consumption when purchasing electronic devices and consider curtailing the purchase of new electronics (computers, e-readers, tablets, cellphones, and more) based on environmental and social justice issues. Poggiali concludes that it may not be possible to reconcile ongoing growth with the demands of sustainability: “Since there is no perfectly ‘green’ electronic device, you will likely always have to accept that some harm to the environment is caused by your electronics purchase. Proper recycling can mitigate some of that harm” (2016).

There’s employee fallout too. The recently released “A*CENSUS II All Archivists” report, based on a survey conducted by the Society of American Archivists (SAA) and Ithaka S+R, found that 20% of those in the profession are considering leaving; another 1 in 4 aren’t sure if they’ll stay or leave. Burnout was one of the primary reasons listed.

Similarly, a recurring theme at the 2022 SAA annual meeting was the unsustainability of the increasing staff workload and the physical and psychological stress of processing the crushing backlog. Particularly hard-hit are those (often junior) staffers in digital preservation, who manage the burgeoning mass of digital records in addition to appraisal, processing, reference, outreach, and myriad other duties expected of an archivist/librarian.

Preservation Storage Solutions

To go back to our original question, can digital be environmentally sustainable? The short answer is no. What started in the ’90s with the promise of digitization and born-digital collections has sprouted into a multi-headed hydra out of our control, bibliographic or otherwise. As a colleague quipped, “We’re already reeling from the false promise of a paperless world; now we can reel from the false promise of cheap and harmless digitizing!” Given this new reality, how can we mitigate the environmental impact of our digital collections? In a nutshell, the answer is to use less energy related to storage and preservation.

Tadic suggests implementing hierarchical storage management (HSM) policies for alternative data storage. HSM tiers are offline, nearline, and online. Large and infrequently accessed files can be stored offline on data tape (spinning disk storage takes 26 times more energy than storing and infrequently accessing data tapes). Permanent preservation masters can also be stored offline. “Nearline” means using a storage area network (SAN) that can be accessed by multiple servers or computers to provide a shared pool of storage space. Only the most frequently accessed files should be stored online (Tadic 2022). From a workflow perspective, less frequent migration and fixity checks could be scheduled. Fixity checking is “the practice of algorithmically reviewing digital content to ensure that it has not changed over time. This process results in checksums, also known as cryptographic hashes, which can be compared over time to determine if a file has been altered” (Barsness et al. 2018).
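The fixity check described above can be sketched in a few lines. This is a minimal illustration, not any particular repository system's implementation: it assumes SHA-256 as the hash and a simple in-memory manifest mapping file names to stored checksums, and the names `sha256_of` and `check_fixity` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_fixity(manifest: dict[str, str], root: Path) -> dict[str, str]:
    """Compare stored checksums with freshly computed ones.

    `manifest` maps relative file names to previously recorded checksums
    (an assumed format). Returns "ok", "changed", or "missing" per file.
    """
    results = {}
    for relative_name, stored in manifest.items():
        target = root / relative_name
        if not target.is_file():
            results[relative_name] = "missing"
        elif sha256_of(target) == stored:
            results[relative_name] = "ok"
        else:
            results[relative_name] = "changed"
    return results
```

Every run reads each file in full, which is why the frequency of fixity checks has an energy cost: halving the schedule roughly halves the reads, at the price of detecting corruption later.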

Taking a records management approach to digital preservation, such that collections are preserved only as long as reasonable based on tiers of importance in a formal records schedule, might make sense. Whether importance is defined by frequency of use or by uniqueness in the world is the stuff of faculty committees and boards of directors.

Reappraising Appraisal and Retention

Thornier questions are those of appraisal and retention. We probably agree on recycling old hard drives and media, but does all digital content acquired or created require permanent retention? A phased approach to longevity of collections could mean migrating forward only select series of a digital collection rather than its entirety. If files are retained for a specific period (not permanently), can we move them between storage tiers and ultimately delete them?
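One way to picture a phased approach is as a small policy rule that places each file in a storage tier based on how recently it was used, and flags it for review once its retention period has lapsed. The sketch below is purely illustrative: the 90-day and 365-day thresholds, the function name `choose_tier`, and the "review-for-deletion" label are assumptions, not recommendations from any cited source.

```python
from datetime import date
from typing import Optional


def choose_tier(
    last_accessed: date,
    today: date,
    retain_until: Optional[date] = None,
    nearline_after_days: int = 90,   # assumed threshold
    offline_after_days: int = 365,   # assumed threshold
) -> str:
    """Suggest a storage tier for a file under a hypothetical policy.

    Files past their retention date are flagged for human review rather
    than deleted automatically.
    """
    if retain_until is not None and today > retain_until:
        return "review-for-deletion"
    idle_days = (today - last_accessed).days
    if idle_days >= offline_after_days:
        return "offline"
    if idle_days >= nearline_after_days:
        return "nearline"
    return "online"


if __name__ == "__main__":
    today = date(2022, 12, 1)
    print(choose_tier(date(2022, 11, 1), today))  # recently used file
    print(choose_tier(date(2020, 1, 1), today))   # long-idle file
```

Routing expired files to review instead of deleting them outright keeps the appraisal decision with archivists, which matches the spirit of a formal records schedule.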

Similarly, when deciding on digitizing priorities, only specific series might be digitized to begin with. This may cause consternation on the part of researchers and make it harder to verify authority and authenticity of records. It would surely require a more flexible approach to how we view research and produce a new model for scholarly citations that is very different from the long-standing tradition of rock-solid forever references. Would “Save Less, Forget More” become our new slogan?

Basing the continued preservation of a collection on its usage may be another way to slow the escalating need for more storage. While weeding is a useful collection development tool in libraries, it is seldom performed on unique primary sources. Reappraisal is more common but time-consuming and, in another catch, requires more energy to conduct. “Reappraisal itself generally requires accessing the data and, therefore, results in its own environmental impacts. However, these likely do not outweigh the costs of preserving a larger amount of data indefinitely” (Pendergrass et al. 2019).

Would it be heresy to deaccession (in digital terms, delete) all of the files, backups, metadata, and drive images if the content hasn’t been accessed in X number of years? Is the social contract of CHOs keeping everything forever—“The Ten Thousand Year Collection”—still appropriate or even desirable (Merritt 2015)? I posit that sustainability in the Anthropocene will require new models of resource management such as regular reappraisal and deaccessioning, acknowledging that archiving in perpetuity is no longer viable and allowing patrons to actively use materials rather than prioritizing preservation over access, as archivists have always done.

Back to the Future

Do we need to reconsider employing digital as a preservation method at all? As early as 2003, Thomas Hecker predicted that resource constraints would bring an end to digitization and that analog was the future for archives. “Contemporary wisdom holds that the scholarly community is in transition from a paper-based knowledge system to an electronically based system,” but Hecker argued that this transition is not sustainable and that constraints on energy and other necessary resources would arrest digitization in the not-distant future. “Archives in physical formats, not digitized archives, are essential to preserve the scholarly record” (Hecker 2003).

Going back to the recent past, when we could toggle between hard copy for preservation and digital for access and searchability, seems like a pipe dream now. And even if we could, that wouldn’t resolve the preservation issues for research datasets, video games, computer art, AI, immersive media, geospatial data, and other content that only exists within the digital realm. Can we stuff the genie back in the bottle? For all of its faults and not-so-green thumb, a hybrid digital-analog world is likely here to stay. Let’s figure out ways to generate electricity—wind, wave, solar—that are renewable and clean, as well as places to locate data centers—the cloud—that are safe, non-polluting, and sustainable. Wearing my futurist hat, I envision oceanic floating data islands, which can use solar power for energy and seawater for cooling.

And finally, what can we learn from Indigenous and non-Western perspectives when it comes to alternative ways of keeping archives, eschewing overreliance on technology, and taking better care of Mother Earth? That’s for another column … stay tuned!

Resources

“A*CENSUS II All Archivists,” Society of American Archivists and Ithaka S+R, 2022. doi.org/10.18665/sr.317224.

Abbey, Heidi N., “The Green Archivist: A Primer for Adopting Affordable, Environmentally Sustainable, and Socially Responsible Archival Management Practices,” Archival Issues 34, no. 2, pp. 91–115, 2012. jstor.org/stable/41756175.

Barsness, Sarah, et al., “2017 Fixity Survey Report,” NDSA, 2018. ndsa.org/documents/Report_2017NDSAFixitySurvery.pdf.

Goldman, Benjamin, “It’s Not Easy Being Green(e): Digital Preservation in the Age of Climate Change,” Archival Values: Essays in Honor of Mark Greene, ALA, 2019. scholarsphere.psu.edu/resources/381e68bf-c199-4786-ae61-671aede4e041.

Hecker, Thomas, “The Twilight of Digitization Is Now,” Journal of Scholarly Publishing 35, no. 1, pp. 52–62, October 2003. doi.org/10.3138/jsp.35.1.52.

Merritt, Elizabeth, Smith College “Futurisms” colloquium (Part I), keynote address Dec. 10, 2015. smith.edu/news/elizabeth-merritt-speaks-museums.

Pendergrass, Keith L., Sampson, Walker, Walsh, Tim, and Alagna, Laura, “Toward Environmentally Sustainable Digital Preservation,” The American Archivist 82, no. 1, pp. 165–206, 2019. doi.org/10.17723/0360-9081-82.1.165.

Poggiali, Jennifer, “Incorporating Ethical Consumption Into Electronic Device Acquisition: A Proposal,” Portal: Libraries and the Academy 16, no. 3, pp. 581–597, 2016. academicworks.cuny.edu/le_pubs/258.

Tadic, Linda, “Digital Preservation’s Impact on the Environment,” ALA Environmentally Sustainable Preservation, April 28, 2022. dropbox.com/s/csdc0ije7rru2j6/ALA_EnvironmentallySustainablePreservation_Tadic_20220428.pptx?dl=0.

Williams, Stacie, “Sustainable Digital Scholarship: Shrinking Our Footprint, Broadening Our Impact,” Digital Frontiers keynote speech, Sept. 21, 2017. medium.com/on-archivy/sustainable-digital-scholarship-the-limitations-of-space-662627e19e37.


Jan Zastrow is a certified archivist, librarian, and information professional based in Washington, D.C. Contact her at zastrow@hawaii.edu.