DigitalGovernment.org - Home of the Nat'l. Science Foundation Digital Government Research Program

State Governments Grapple with Digital Archiving
U.S. Library of Congress Launches an Ambitious Project to Revolutionize the Way States Save Documents & Data
By Mack Reed
DGRC Communications Manager

LOC Digital Archiving
Lead Agency: Library of Congress NDIIPP




They stand like tombstones in every state library - microfilm reels, thick-inked ledgers and volume upon volume of official documents and records. To find anything in these obsolete storage systems, you need to plow through a dead-tree catalogue or, if you're lucky, a microfiche index or online database.

Yet as the ever-steepening tidal curve of information technology bears digital record-keeping into the future and grows too slippery for governments to keep up with, some of those old-tech storage solutions are turning out to be easier to access than supposedly modern archive systems.

With this in mind, the U.S. Library of Congress has launched an ambitious nationwide program to help all U.S. states and territories develop long-term digital archive solutions. The goal is to identify documents, media and data that need saving and to develop programs and tools for saving them and making them accessible to the public well into the future, says William LeFurgy, Digital Initiatives Project Manager of the Library's Office of Strategic Initiatives.

"Technology is moving very fast, there's no question about it, and it's also true that the preservation of the information hasn't kept up," LeFurgy says. "It's a danger that if we don't start moving rapidly that we are going to be falling further and further behind."

The problems are many. Operating systems and hardware platforms grow obsolete, and without the budget for translating their contents to newer formats and systems, data and documents either remain trapped in dead tech or vanish altogether, LeFurgy says.

Some problems are more insidious, such as the deprecation of document layout: As operating systems and document applications evolve, newer formatting standards render documents differently than their authors intended. Archivists also face the slow de-authentication of signed documents: inked signatures and even some digital signatures don't always survive reincarnation to a new format intact. If not, are they still valid?

The nose-to-grindstone work of the Library of Congress Collaboration for Preservation of State Government Digital Information begins in April and May, when state archivists and librarians will huddle in workshops with Library of Congress project leaders in Washington, D.C.

There, the Library's National Digital Information Infrastructure and Preservation Program (NDIIPP) will introduce a toolkit to librarians and archivists from U.S. states and territories (59 have been invited and a dozen have signed up so far, LeFurgy says).

The first version of the toolkit is a series of questionnaires meant to help state librarians and archivists quantify what they must store and how they're storing it now, and - ultimately - work toward short-term goals for better storage and access solutions for that material.

The LOC will work with individual teams to consider inter- or intra-state partnerships and to identify specific types of digital content upon which each state's team wants to focus its R&D efforts, such as electronic records and/or publications, datasets, digitized materials, audio/video and web resources.

Then, in June and December, a second round of workshops will convene at the Center for Technology in Government (CTG). The state teams will work with the Library and CTG researchers to use the toolkit to develop near-term plans and - in some of the more advanced cases - methods and agendas for digital preservation activities.

Some states, such as New Mexico, Connecticut and Arizona, are already collaborating with the Online Computer Library Center (OCLC) to develop long-term solutions, and are eager to discuss their work and their challenges with other participants in the Library of Congress project.

In addition to the broad challenges of building information systems that can survive hardware and software iterations yet to come, the state of New Mexico is wrestling with the challenges of specific document formats - particularly hyperlinked documents: "How deep do you go into a document that has a lot of links?" says Richard Akeroyd, New Mexico State Librarian. "You click on one link, and it takes you to another set of documents." The Library of Congress' project will help the state make decisions such as this, while helping to identify ways of storing and releasing state documents more consistently and efficiently, he says.

For instance: the U.S. Forest Service issues updates digitally on forest fire conditions before, during and after events - but the documents are not being stored at a national level for later use, so they disappear, he says. "I think we're going to be seeing more and more of those kinds of born-digital documents and resources that - if somebody doesn't capture them on a daily basis - they're going to go away forever," says Akeroyd.

While New Mexico is one of the more advanced states in terms of digital preservation, the vast majority of U.S. states and territories are not, says LeFurgy: Hampered by old technology, budget shortfalls or lack of initiative at the state level, some librarians have done very little to address their obsolescing archives.

In Colorado, for example, the state of digital archiving is "dismal," says Colorado State Librarian Nancy Bolt.

The state has just allocated $178,000 to set up a dedicated server for archiving born-digital documents and putting them online, for upgrading its system that catalogues existing digital and hard-copy documents and, ultimately, for digitizing hard-copy documents, Bolt says.

"Information's being lost on a daily basis - electronic information - and we have no program to digitize past documents that we think should be available in the long term," she says. "Other than searching our database, we have no way to search text information other than searching our web sites."

Bolt's colleague, Debbie MacLeod, director of the State Publications Library, puts it a bit more bluntly: "At present, we have not been able to store any born-digital documents, we have only been able to store the link, and that's not sufficient to being able to provide public access," she says. "As you and I know, the links break, and they disappear."

Both librarians say they greatly look forward to collaborating with other states in the Library of Congress project and learning from their experiences.

Somewhat closer to the other end of the archiving spectrum is the state of Minnesota.

Minnesota has been working to digitize 19th-century timber-survey maps in the form of geospatial-information datasets, says Bob Horton, the state's archivist and head of the Collections Department for the Minnesota Historical Society.

The state has digitized and databased land-survey records including 3,600 huge color TIFFs - some 1.2 terabytes of data - of maps that were made in the late 19th century just before the logging industry began a massive campaign of timber-cutting that essentially deforested the northern half of the state, Horton says.

"It's a wonderful amount of information about pre-settlement Minnesota, information about wetlands, forests and prairies, all of which have basically disappeared," he says.

The Minnesota team is working with the San Diego Supercomputer Center using an application called Storage Resource Broker and a grid server to test different data standards and schemes and work towards the ideal of a federated, distributed architecture for archiving, Horton says.

"We're particularly interested in the development of a collaborative network for digital preservation," he says. "We really want to share tools and best practices. I think it's a great idea."

LeFurgy says the project also suits the Library's overall goal of growing its preservation network, "which is not necessarily a formal arrangement, but just (one) where there's a group of like-minded institutions throughout the country that talk to each other and exchange information." LeFurgy explains, "Eventually, (we) may be able to operate on a similar technical infrastructure so that we're able to share content, back up storage for each other, and combine search -- but that's more down the road."

After the second round of workshops, CTG may develop a more advanced toolkit that can be extended to federal agencies, other state and regional agencies and perhaps even private industry to help them meet the challenge of staying ahead of digital information's decay. The Library may also work in conjunction with the Institute of Museum and Library Services to extend grants to states or state entities that need support for their digital preservation efforts, LeFurgy says.

The danger in not acting now is that the relentless forward surge of technology could sweep past some states, essentially erasing valuable documents, records, data and media forever: "That's why Congress came up with the idea of NDIIPP and asked the Library to do something about it," LeFurgy says. "We're hurrying on as fast as our legs can carry us."