Preserving born-digital collections
Once digital collections have been reformatted or transferred, the Born Digital Preservation Lab (BDPL) is also responsible for ensuring that the collections are preserved over the long term. The BDPL preserves a wide range of digital content. In all cases, our goal is bit-level preservation: verifying that the content we preserve remains unchanged over time.
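Bit-level preservation is commonly checked with checksums: a digest is recorded for each file at transfer time and recomputed later, and any mismatch means the bits have changed. The following is a minimal sketch of that idea, not a description of the BDPL's actual tooling. The function names, the choice of SHA-256, and the in-memory manifest are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify_manifest(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose bits have changed."""
    problems = []
    for rel_path, expected in manifest.items():
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

In this sketch, `build_manifest` would run once when content is transferred, and `verify_manifest` would run on a schedule afterward; an empty list from `verify_manifest` means every recorded file is still present and unchanged.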
Our goal is to accession all of our digital content into the Stanford Digital Repository (SDR). This strategy requires that materials first be screened for personally identifiable information (PII), as we need to ensure that any high-risk data is confined to a secure server. Accessioning born-digital content into the SDR also requires distinguishing original copies from access copies and generating both, recording any changes to filenames, packaging hierarchical content so that its hierarchy is preserved, and setting access rights at the file level.
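To illustrate what recording filename changes can look like in practice, here is a small sketch that normalizes filenames and logs every change alongside the original name. The normalization rule (ASCII letters, digits, dots, underscores, and hyphens only) and the names `safe_name` and `rename_log` are hypothetical; the SDR's actual filename requirements are not described here.

```python
import re
import unicodedata


def safe_name(original: str) -> str:
    """Return a conservative ASCII version of a filename (hypothetical rule)."""
    # Decompose accented characters, then drop anything outside ASCII.
    normalized = unicodedata.normalize("NFKD", original)
    ascii_only = normalized.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of other characters into a single underscore.
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "_", ascii_only).strip("_")
    return cleaned or "unnamed"


def rename_log(names: list[str]) -> list[dict[str, str]]:
    """Record each filename that normalization would change."""
    log = []
    for original in names:
        renamed = safe_name(original)
        if renamed != original:
            log.append({"original": original, "renamed": renamed})
    return log
```

Keeping the original name next to the new one lets a researcher trace an accessioned file back to what appeared on the source media. This sketch does not handle two different originals that normalize to the same name; a real workflow would need to detect and resolve such collisions.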
Preservation also hinges on the ability to reformat the original computer media. Collections with historical hardware present reformatting challenges due to the wide range of original media, including computer cassettes, floppy disks, video game cartridges, 8-track tape, optical discs, and video game cards. Access to the content also depends on emulating the original media or finding legacy hardware that still works.
With all these steps, even deceptively simple files require extensive preservation work. For example, the collection of essays cited below was delivered as a PDF on a thumb drive. The BDPL needed to transfer the original PDF off the thumb drive and ensure it remained unchanged; scan it for high-risk data; create metadata that clarifies the file's relationship to the original media; set access rights; create an access derivative; and accession the original and access copies into the SDR. The work pays off once the content is in the SDR and can be served to researchers!
"As I see it" : a collection of essays, by Elbert Howard. Department of Special Collections and University Archives, Stanford University Libraries, Stanford, California.
Case study: Preserving a software collection
Stanford Libraries collaborated with the National Institute of Standards and Technology (NIST) to migrate data from 13,000 pieces of legacy software media in the Stephen M. Cabrinety Collection in the History of Microcomputing, ca. 1975–1995. During this project, Stanford cataloged and shipped the software to NIST, which forensically imaged and photographed the media. Copies of the software were sent to both Stanford and the National Software Reference Library for long-term preservation storage.
This collection presented interesting challenges and opportunities at every turn, from working out how to reformat a large and diverse body of computer media in the first place, to the ongoing project of obtaining permission to distribute the content from an almost equally diverse set of software companies, many of which are now out of business. Though the reformatting and metadata work is complete, many other aspects of the Cabrinety project are still ongoing.