Library of Congress

Digital Preservation

CRL Report Describes Digital Newspaper Production

May 5, 2011 -- Preserving News in the Digital Environment: Mapping the Newspaper Industry in Transition (PDF, 1.69MB) was produced for the National Digital Information Infrastructure and Preservation Program by a team from the Center for Research Libraries.

This report provides a vivid glimpse inside the workplaces that produce what – not long ago – we would have called newspapers.  As digital news-gathering and production methods proliferate, and as digital avenues for distribution emerge, these workplaces are being transformed in profound ways, with electronic facsimiles and websites (and probably more) overtaking the paper format. 

The report is an outgrowth of the Preserving Digital News meeting held at the Library in September 2009, and it features illustrative examples from four American newspapers: The Arizona Republic, Seattle Post-Intelligencer (since 2008, [. . .]), Wisconsin State Journal, and The Chicago Tribune. There is additional information pertaining to the work of The New York Times, Investor’s Business Daily, and the Associated Press.  Altogether, the report makes it clear that the transition to the digital environment is not a neat, throw-the-switch change.

The CRL team of researchers, writers and illustrators included Jessica Alverson, Kalev Leetaru, Victoria McCargar, Kayla Ondracek, Bernard Reilly, James Simon and Eileen Wagner. Their narrative takes us through three major stages in the newspaper workflow: sourcing (gathering news information), editing and production, and distribution.  Each newspaper applies somewhat different practices in each stage, differing in the formatting of the content, the types of metadata employed, and the methods applied to manage the content in the information technology systems that support the workflow.

Here are a few highlights:

  • Most editorial systems are built around the traditional concept of a news article or story. Following longstanding newsroom practice, most text is maintained in the editorial system as part of a standard unit of content: it is a news item (for wire service reports) or a story, article or feature. (page 25)
  • After the articles and other components of a newspaper print edition are assembled and tagged in the editorial system they are usually exported to a pagination system where the page layouts for the print edition and e-facsimile edition are created. It is in the pagination system that most of the content for these editions of a newspaper is brought together for the first time. (page 28)
  • Once a locally produced news story is retired from the active or current pages of a newspaper’s website, it is often posted as the "archived" version or "version of record" in a separate part of the Web. Some newspapers outsource maintenance of these archived stories and features to archiving services like NewsBank, [. . .], and ProQuest. These services add value by formatting and indexing the stories and presenting them in searchable databases, which are normally hosted by the archiving service, but made to appear seamlessly connected to the newspaper site. (page 51)
  • A computer-assisted analysis of the Chicago Tribune Web site yielded a granular picture of the rate or "velocity" of updates on news web sites . . . [examining] the number of page URLs against minutes of persistence for a two-day period.  The analysis showed that in general business, entertainment, and sports news tended to be updated most frequently (sometimes several times within the half hour), while features, opinion, travel, and blog content changed less frequently. Hence the difference between print and electronic versions of newspaper content will vary considerably by type of content. (pages 55-56)
  • The newer model of the news Web, however, is exemplified by the Hearst Seattle Media’s "flagship site," [which] focuses heavily on information of local interest, such as crime, regional politics and local sports teams.  But [the site] is . . . fundamentally different from its now defunct predecessor, the Seattle Post-Intelligencer newspaper.  It features not only original staff reporting and breaking news, but blogs by staff and readers, links to other journalism and news web sites, community databases and photo galleries.  Through partnerships with other Seattle media (i.e., radio and television broadcasters), [the site] also has access to video and audio produced by their local staff. (pages 52-53)
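
The update "velocity" analysis quoted above (page URLs measured against minutes of persistence) can be illustrated with a small sketch. This is not the method the CRL team used; it only assumes that a crawler has recorded, at regular intervals, a content hash for each page URL. The function name `persistence_minutes`, the tuple layout, and the sample URLs and hashes are all hypothetical.

```python
from collections import defaultdict


def persistence_minutes(snapshots):
    """Return {url: [minutes each observed version stayed unchanged]}.

    snapshots: iterable of (minute, url, content_hash) tuples, where
    minute is the crawl time counted from the start of the window.
    (Hypothetical input format, assumed for illustration only.)
    """
    # Group observations by URL in chronological order.
    by_url = defaultdict(list)
    for minute, url, digest in sorted(snapshots):
        by_url[url].append((minute, digest))

    runs = defaultdict(list)
    for url, observations in by_url.items():
        start, current = observations[0]
        for minute, digest in observations[1:]:
            if digest != current:
                # Content changed: close the run for the previous version.
                runs[url].append(minute - start)
                start, current = minute, digest
        # The last version is only observed until the final crawl, so
        # this value is a lower bound on its true persistence.
        runs[url].append(observations[-1][0] - start)
    return dict(runs)


# Made-up sample: one page crawled every 30 minutes that changes often,
# and one that never changes during the window.
snapshots = [
    (0, "/sports/game", "a"), (30, "/sports/game", "b"),
    (60, "/sports/game", "c"), (90, "/sports/game", "c"),
    (0, "/travel/guide", "x"), (30, "/travel/guide", "x"),
    (60, "/travel/guide", "x"), (90, "/travel/guide", "x"),
]
runs = persistence_minutes(snapshots)
```

Short runs for a URL indicate frequent updates, as the report describes for business, entertainment, and sports pages, while long runs correspond to slower-changing features, opinion, and travel content. The resolution of the result is limited by the crawl interval: changes that happen between two crawls are not seen.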