Library of Congress

Digital Preservation

Library’s Approach to Open Source Attracts Crowd at GW Presentation

August 9, 2010 -- The Library of Congress is actively exploring development of open-source software for supporting digital stewardship activities.  Leslie Johnston, manager of Technical Architecture Initiatives for the National Digital Information Infrastructure and Preservation Program, spoke on this subject at George Washington University on July 15, 2010. 

Leslie Johnston describing BagIt at George Washington University on July 15, 2010. Credit: Sally Whiting

Leslie Johnston describing BagIt at George Washington University on July 15, 2010. Credit: Candice LaPlante

Johnston opened with an overview of NDIIPP, explaining how the program works with partners around the country on a collaborative approach to digital preservation.  Because of the distributed nature of this work, Johnston explained, there is a need for simple, reliable tools for large and complex file transfers. This led to the development of BagIt, a packaging specification for file transfers that uses checksums and content lists to help users on both ends of a file delivery verify that the correct files were successfully transferred.
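The checksum-and-content-list idea can be illustrated with a minimal Python sketch. It is not the Library's released tooling. It assumes a simplified bag layout with payload files under `data/` and a `manifest-sha256.txt` file listing one checksum and relative path per line. The function names `write_manifest` and `verify_bag` are hypothetical, chosen only for this illustration.

```python
import hashlib
from pathlib import Path

MANIFEST_NAME = "manifest-sha256.txt"  # assumed manifest file name for this sketch


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def payload_files(bag_dir: Path) -> set:
    """Relative POSIX paths of every file under the bag's data/ directory."""
    data_dir = bag_dir / "data"
    return {
        p.relative_to(bag_dir).as_posix()
        for p in data_dir.rglob("*")
        if p.is_file()
    }


def write_manifest(bag_dir: Path) -> Path:
    """Sender side: record a checksum and path for each payload file."""
    lines = [
        f"{file_sha256(bag_dir / rel)}  {rel}"
        for rel in sorted(payload_files(bag_dir))
    ]
    manifest = bag_dir / MANIFEST_NAME
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest


def verify_bag(bag_dir: Path) -> dict:
    """Receiver side: compare the files on disk against the manifest."""
    listed = {}
    manifest_text = (bag_dir / MANIFEST_NAME).read_text(encoding="utf-8")
    for line in manifest_text.splitlines():
        if line.strip():
            checksum, rel = line.split(maxsplit=1)
            listed[rel] = checksum.lower()

    present = payload_files(bag_dir)
    return {
        # Listed in the manifest but absent on disk.
        "missing": sorted(set(listed) - present),
        # Present on disk but with a checksum that differs from the manifest.
        "corrupted": sorted(
            rel for rel in listed
            if rel in present and file_sha256(bag_dir / rel) != listed[rel]
        ),
        # Present on disk but not listed in the manifest.
        "unexpected": sorted(present - set(listed)),
    }
```

In this sketch the sender would call `write_manifest` before shipping the directory, and the recipient would call `verify_bag` after delivery. Three empty lists in the result indicate that the files received match the files sent.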

Tools for working with BagIt were the first Library-developed software to be released as open source. They were posted on SourceForge, where they have been downloaded over 2,000 times since the initial release in December 2008. Johnston explained that it took time and patience to get approval for the release. "We had to start with something that every level of Library administration would feel comfortable with," she said. "This is an entirely new area for the Library, and it took some explaining to make the case that releasing open source made sense for us and the communities we work with."

Subsequent releases related to the BagIt program, Johnston noted, were much easier to process because of the precedent that the first release had set. "There was a lot of education that had to happen in the organization about roles," she said, describing the initially shaky process of sending the code around the Library for approval. "We were able to set a precedent with that project that is now a policy."

Johnston went on to describe the growing open source software community that the BagIt project tapped into. "There’s a community now at the Library around open source," she said, and added that the team had set up a Digital Curation Google group to encourage participation in the ongoing discussion.

Johnston noted that other open source projects were underway at the Library, including BWF MetaEdit, a software package recently released on SourceForge that enables embedding and editing metadata in Broadcast WAVE Format files, and Bagger, a desktop application currently under development that provides a graphical interface for consolidating digital content and transferring it through the BagIt system. These and future open source software releases will be announced as they become available.