Library of Congress

Digital Preservation

Access Through Metadata: Library Group Tackles the Challenge


Perfect metadata is NOT required - good metadata IS useful.

So say the resident metadata experts at the Library of Congress, who made this point very clearly at the recent National Digital Information Infrastructure and Preservation Program partners meeting. Consistent and rich metadata are needed in order to improve search of the Library’s collections and provide web services that users have come to expect.

Metadata for Digital Content group

Photo Credit: Barry Wheeler

To address the challenges in this area, the "Metadata for Digital Content" (MDC) group was formed at the Library in March 2009. This internal, cross-Library group is working toward new solutions, in line with a goal in the Library’s overall strategic plan to provide better access to digital materials. The group is co-chaired by Rebecca Guenther and Ann Della Porta from the Technology Policy Directorate of the Library.

The MDC group members include catalogers, programmers and digital project managers, and represent different service units of the Library concerned with digital content. All are united by the common need for more effective descriptive metadata, which grows in importance as more new digital material is added to the Library’s website every day. In studying the question "What are users looking for, and can they find it?" the group determined that the overall quality of the online bibliographic records plays a big part in success or failure. So, how can the records be structured to help users discover relevant resources when they search?

Jane Mandelbaum, manager in the Library’s Information Technology directorate and a founder of the group, said the group is focusing on "how we build standardized metadata that works across the spectrum of digital objects."

"This is an internal collaboration, with staff members representing multiple areas of the Library," she noted. "The hope is that this combined effort will result in a common set of guidelines that can be used Library-wide for a variety of digital library projects." The intent is to develop an approach that avoids a continuing need to "re-invent the wheel" for each separate digital collection.

In support of the Library’s goal for increased access, the group hopes to accomplish three objectives. First, to recommend a common set of metadata elements for current and future uses. Second, to provide more consistent metadata for access to and use of the digital objects, and to recommend how that metadata should be managed. And third, to develop recommendations for providing metadata for digital objects that currently have little or none.

Use Cases

The group began their effort by studying existing project models at the Library, including general standards development and the search and display mechanism for the World Digital Library collection. They then came up with a set of use cases to illustrate potential uses for better metadata. For example, what navigational options might users want for a given collection: search by place, by date, or by topic?

Mandelbaum provided a specific example of a use case: the need for consistent search by state. "We have some records that have the city but not the state, and we have other records with state or city name that are in the 'title' or 'place of publication/creation' field but not in the 'state' field," she said. "So, our goal there is to populate the 'state' field by using other information in the record."
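The "populate the 'state' field" use case can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the group's actual tooling: the field names (`state`, `place`, `title`), the lookup tables and the function `infer_state` are all hypothetical, and a real effort would draw on an authority file or gazetteer rather than a hand-built list.

```python
import re

# Hypothetical lookup data, for illustration only.
STATES = ["Ohio", "Texas", "Virginia"]
CITY_TO_STATE = {"Cleveland": "Ohio", "Austin": "Texas"}

# Hypothetical record fields that may mention a place.
SOURCE_FIELDS = ["place", "title"]


def _mentions(name, text):
    """True if `name` appears in `text` as a whole word or phrase."""
    return re.search(r"\b" + re.escape(name) + r"\b", text) is not None


def infer_state(record):
    """Return a copy of `record` with 'state' filled in when it can be inferred.

    An existing 'state' value is never overwritten.
    """
    updated = dict(record)
    if updated.get("state"):
        return updated
    for field in SOURCE_FIELDS:
        text = record.get(field) or ""
        # Prefer an explicit state name, then fall back to a known city.
        for state in STATES:
            if _mentions(state, text):
                updated["state"] = state
                return updated
        for city, state in CITY_TO_STATE.items():
            if _mentions(city, text):
                updated["state"] = state
                return updated
    return updated


record = {"title": "Street scene, Cleveland", "place": ""}
print(infer_state(record))
```

A lookup this simple cannot resolve city names shared by several states, and a state name in a title may refer to something other than the place of creation, so inferred values would likely need to be flagged for review rather than written back automatically.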

Profiles

The group has made considerable progress by creating a master list of standardized metadata elements, which is used to map existing digital collection records to a single XML metadata scheme. The XML metadata uses the Metadata Object Description Schema (MODS).

Profiles are being established for specific digital collections. A profile generally consists of a subset of the master list, sometimes with local elements added. Each element used is mapped to the common MODS schema. To date, the group has established at least 10 such profiles in draft form, for records in diverse Library collections such as American Memory, Web Archiving, World Digital Library, Performing Arts Encyclopedia and Library of Congress Experience. More will be added later as needed, to provide mapping for all digital records.
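As a rough illustration of what a profile does, the sketch below maps a collection's local field names onto MODS elements and builds a MODS record from them. The profile contents, the local field names and the function `to_mods` are assumptions made for this example; only the MODS namespace and element names (`titleInfo/title`, `name/namePart`, `originInfo/dateCreated`, `subject/topic`) come from the MODS schema itself.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)

# Hypothetical profile: local field name -> path of MODS elements.
PROFILE = {
    "title": ("titleInfo", "title"),
    "creator": ("name", "namePart"),
    "date": ("originInfo", "dateCreated"),
    "topic": ("subject", "topic"),
}


def to_mods(record, profile=PROFILE):
    """Build a MODS element from `record`; also return fields the profile lacks."""
    root = ET.Element(f"{{{MODS_NS}}}mods")
    unmapped = []
    for field, value in record.items():
        path = profile.get(field)
        if path is None:
            # Local element with no mapping yet: report it instead of guessing.
            unmapped.append(field)
            continue
        parent = root
        for name in path[:-1]:
            parent = ET.SubElement(parent, f"{{{MODS_NS}}}{name}")
        leaf = ET.SubElement(parent, f"{{{MODS_NS}}}{path[-1]}")
        leaf.text = value
    return root, unmapped


mods, unmapped = to_mods({"title": "Map of Ohio", "shelf": "A1"})
print(ET.tostring(mods, encoding="unicode"))
print("Unmapped fields:", unmapped)
```

Returning the unmapped fields separately makes it easy to see which local elements a draft profile still needs to account for.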

The MDC group will also develop additional metadata where there currently is little or none, for items such as webcasts and YouTube videos. According to Mandelbaum, "these changes can help provide new ways to expose our content to external search engines like Google. Also, if someone finds a Library video on YouTube, we want them to easily find it on our web site, as well as be able to find other similar videos in our collection."

Metadata Remediation

The master data element list will help with new collections going forward, and enable the metadata to be standardized from the start. In addition, the group is undertaking a parallel effort for "metadata remediation" based on best practices noted in the master data element list.

The remediation will focus on improvement or revision of metadata already included in existing digital collections. This work consists of mapping the existing metadata elements to standard elements (currently, for about 16 collections) to make these older collections consistent with updated metadata practices. Remediation also involves improving the data values in the elements in terms of consistency and accuracy.
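The second part of remediation, improving the consistency of data values, might look something like the following sketch. The variant table and the function `normalize_value` are hypothetical and stand in for whatever controlled vocabularies and rules the group actually applies.

```python
import re

# Hypothetical table of known variants (lowercased, trailing punctuation
# removed) and the preferred form each should be replaced with.
PREFERRED = {
    "pa": "Pennsylvania",
    "penn": "Pennsylvania",
    "penna": "Pennsylvania",
}


def normalize_value(value, preferred=PREFERRED):
    """Collapse whitespace, drop trailing punctuation, and apply preferred forms."""
    cleaned = re.sub(r"\s+", " ", value).strip().rstrip(" .,;:")
    return preferred.get(cleaned.lower(), cleaned)


for raw in ["  Penna. ", "New   York,", "Ohio"]:
    print(repr(raw), "->", repr(normalize_value(raw)))
```

Values with no entry in the table are only tidied, not changed, so the same function can be run safely over elements that are already consistent.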

In the end, all of the metadata changes, re-mappings and standardizing should enable better search results both across the institution and from external searches. The MDC group has posted results of much of this work, such as profiles and element lists, on the website "Metadata for Digital Content: Developing institution-wide policies and standards at the Library of Congress."

In the future, the group will continue to work on refining the master data element set, detailing best practices as needed, developing profiles for existing and new collections and continuing to remediate existing metadata to provide better access to the wealth of the Library’s digital content. To this end, the group also expects to reuse tools that are developed for the existing metadata remediation effort.

For a definition and other resources concerning metadata, see: http://www.digitizationguidelines.gov/term.php?term=metadata.