Sustainability of Digital Formats
 Planning for Library of Congress Collections


PDF (Portable Document Format)


Identification and description

Full name PDF (Portable Document Format)
Description PDF (Portable Document Format), developed by Adobe Systems Incorporated, is described by Adobe as a general document representation language. PDF represents formatted, page-oriented documents. These documents may be structured or simple. They may contain text, images, graphics, and other multimedia content, such as video and audio. There is support for annotations, metadata, hypertext links, and bookmarks. Later versions provide additional functionalities, for example, to embed geospatial information within documents that represent maps or other geospatial images, such as satellite photographs.
Production phase In general, a final-state format for delivery to end users.
Relationship to other formats
    Has subtype PDF_1_3, PDF, Versions 1.0-1.3
    Has subtype PDF_1_4, PDF, Version 1.4
    Has subtype PDF_1_5, PDF, Version 1.5
    Has subtype PDF_1_6, PDF, Version 1.6
    Has subtype PDF_1_7, PDF, Version 1.7 (ISO 32000-1:2008)
    Has subtype PDF_1_7_ext03, PDF, Version 1.7, ExtensionLevel 3
    Has subtype PDF_1_7_ext05, PDF, Version 1.7, ExtensionLevel 5
    Has subtype PDF/X, PDF for Prepress Graphics File Exchange
    Has subtype PDF/A-1, PDF for Long-term Preservation, Use of PDF 1.4
    May contain PDF_geospatial, PDF, Geospatial encoding (Adobe). Supported by version 1.7 ExtensionLevel 3.
    May contain GeoPDF_2_2, GeoPDF encoding (TerraGo), version 2.2
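
As a rough illustration of how the numbered subtypes above might be told apart, the following minimal sketch reads the `%PDF-x.y` header comment and maps it to one of the identifiers in this list. The function names and the 1024-byte search window are assumptions made for this example. The header alone cannot distinguish the ExtensionLevel subtypes, since extension levels are recorded inside the document rather than in the header, and a later version declared inside the document may override the header, so the result is only a hint.

```python
import re

# Header comment at the start of a PDF file, e.g. b"%PDF-1.4".
HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")


def header_version(data: bytes):
    """Return (major, minor) from the %PDF-x.y header, or None if not found."""
    # Assumption of this sketch: search only the first 1024 bytes of the file.
    match = HEADER_RE.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def subtype_id(version):
    """Map a (major, minor) pair to a subtype identifier from the list above."""
    if version is None or version[0] != 1:
        return None
    minor = version[1]
    if minor <= 3:
        return "PDF_1_3"  # a single entry covers versions 1.0-1.3
    if minor <= 7:
        return f"PDF_1_{minor}"
    return None
```

For example, `subtype_id(header_version(open(path, "rb").read(1024)))` would give a first guess at the subtype for a file at a hypothetical `path`.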

Local use

LC experience or existing holdings Used as service format, including for some scanned historical materials, primarily to support convenient downloading and printing. Acceptable format for copyright registration.
LC preference

The Library of Congress expresses preferences for formats for content (primarily in physical form) for its collections through the "Best Edition" specification from the U.S. Copyright Office in Circular 7b. The 08/2010 revision of Circular 7b lists the formats acceptable for mandatory deposit of electronic serials available only online, in order of preference. For page-oriented renditions, PDF/A (the PDF/A-1 format or later versions) appears first on the list. Other forms of PDF are acceptable, preferably with searchable text.

In general, PDF is not a preferred format for images. For text, PDF/A is preferred over other PDF variants. PDF/X may be appropriate if it was used by the creator or publisher during production.


Sustainability factors

Disclosure

Fully documented. PDF was developed by Adobe Systems Incorporated, which makes the specification available openly and at no charge. Several subtypes of this proprietary format have been adopted as international standards by ISO. These include PDF/X (ISO 15930), PDF/A (ISO 19005), and PDF version 1.7 (ISO 32000-1:2008).

    Documentation Adobe provides documentation for the current version at http://www.adobe.com/devnet/pdf/pdf_reference.html and an archive of earlier versions at http://www.adobe.com/devnet/pdf/pdf_reference_archive.html.
Adoption Extremely widely adopted as a platform-independent format for disseminating page-oriented documents. Adobe Reader software for viewing PDF files is freely distributed and bundled with most personal computers.
    Licensing and patents

Adobe has a number of patents covering technology that is disclosed in the Portable Document Format (PDF) Specification, version 1.3 and later.

A summary of information on the Adobe Web site in September 2010 (see http://partners.adobe.com/public/developer/support/topic_legal_notices.html) follows.

To promote the use of PDF for information interchange, Adobe licenses the following patents on a royalty-free, non-exclusive basis, for the term of each patent, for developing software that produces, consumes, and interprets PDF files:

    5,634,064 (filed 1996-08-02, granted 1997-05-27)
    5,737,599 (filed 1995-12-07, granted 1998-04-07)
    5,781,785 (filed 1995-09-26, granted 1998-07-14)
    5,819,301 (filed 1997-09-09, granted 1998-10-06)
    6,028,583 (filed 1998-01-16, granted 2000-02-22)
    6,289,364 (filed 1997-12-22, granted 2001-09-11)
    6,421,460 (filed 1999-05-06, granted 2002-07-16)

Patent 5,860,074 (filed 1997-08-14, granted 1999-01-12) is similarly licensed on a royalty-free, non-exclusive basis for its term, but only for developing software that produces PDF files. Software that consumes or interprets PDF files is specifically excluded.

Adobe Reader displays additional patent numbers on launch.

In association with the adoption of PDF, version 1.7 as an ISO standard (ISO 32000-1:2008), Adobe issued a Public Patent License, granting "every individual and organization in the world the royalty-free right, under all Essential Claims that Adobe owns, to make, have made, use, sell, import and distribute Compliant Implementations."

Transparency Depends upon compliant software tools to read. Building tools requires sophistication.
Self-documentation Later versions of PDF can include XMP metadata packages.
External dependencies Faithful rendering requires that fonts be embedded. PDF/A, intended for archival purposes, and PDF/X, for prepress exchange, require that fonts be embedded.
Technical protection considerations The PDF format offers several forms of technical protection, including encryption, that could prevent custodians of digital content from ensuring accessibility in future technological environments.
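
Two of the factors above, XMP self-documentation and encryption, can sometimes be spotted with a crude scan of the raw file bytes. The sketch below is a heuristic only, not a substitute for a PDF parser: it assumes that an encrypted file exposes an `/Encrypt` entry in plain text and that an XMP packet is stored uncompressed with its usual `<?xpacket begin=` marker. Either marker can be missed (for example, when the metadata stream is compressed) or matched by coincidence in unrelated content. The function name `scan_markers` is hypothetical.

```python
def scan_markers(data: bytes) -> dict:
    """Heuristically flag encryption and XMP metadata in raw PDF bytes."""
    return {
        # A plain-text /Encrypt entry suggests the file uses encryption.
        "encrypt_entry": b"/Encrypt" in data,
        # The XMP packet header is visible only if the packet is uncompressed.
        "xmp_packet": b"<?xpacket begin=" in data,
    }
```

A custodian might use such a scan to queue files for closer inspection with a compliant tool, rather than to make a final determination.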

Quality and functionality factors

Still Image
Normal rendering PDF is designed for page-oriented documents. Scaling, zooming, and printing are expected functionalities for PDF viewers. The quality of a raster image depends on the quality of the embedded image. Note that, in general, PDF is not a preferred archival or master format for images.
Clarity (high image resolution) High-resolution images can be embedded using professional tools. See PDF/X, a standard version of PDF used by the printing industry.
Color maintenance Parameters to support color management, including CIE-based and ICC-based color spaces, can be stored in the file using professional tools. See PDF/X, a standard version of PDF used by the printing industry.
Support for vector graphics, including graphic effects and typography Extensive support for graphic elements. PDF 1.4 and later versions support a transparent imaging model in addition to the opaque model used by earlier versions. Hence images composed of layers can be stored without pre-composing them into a single image.
Support for multispectral bands TBD
Functionality beyond normal rendering PDF has extensive support for annotations of several types. PDF, Version 1.7, ExtensionLevel 3 (PDF_1_7_ext03), introduced with Acrobat 9.0, supports capabilities for embedding data in association with points within 3D and geospatial images.
Text
Normal rendering Good support is possible, but not guaranteed. The PDF format allows creators to disallow printing and extraction of text for quotations. PDF can also be used to create documents from scanned page images; such files do not necessarily support indexing of the document text.
Integrity of document structure The logical structure of a document is only represented in a PDF file if the creator or process during creation takes steps to incorporate structural tagging.
Integrity of layout and display PDF is designed to represent the layout of page-oriented documents.
Support for mathematics, formulae, etc. Can be represented by embedded graphics.
Functionality beyond normal rendering Supports embedding of media objects (in binary format) and links to external media objects, such as images, audio, or video.

File type signifiers

Tag | Value | Note
Filename extension | pdf |
Internet Media Type | application/pdf | From LC web server configuration (Apache) of 2004-04-28. Registered with IANA (see Application Media-Types) and described in IETF (Internet Engineering Task Force) RFC 3778. Reported for PDF files by JHOVE PDF-hul module for file identification.
Internet Media Type | application/x-pdf, application/acrobat, application/vnd.pdf, text/pdf, text/x-pdf | Selected media types listed at The File Extension Source.
Magic numbers | Hex: 25 50 44 46; ASCII: %PDF | From Gary Kessler's File Signatures Table.
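
The signifiers listed above can be checked with a few lines of standard-library Python. This is a minimal sketch: the function names are hypothetical, the signature test only confirms that the data begins with the four magic bytes, and the media type lookup relies on the local `mimetypes` configuration, which is assumed here to map the `pdf` extension to the registered type.

```python
import mimetypes

# Magic number from the table above: hex 25 50 44 46, ASCII "%PDF".
PDF_MAGIC = bytes.fromhex("25504446")


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the data starts with the PDF magic number."""
    return data.startswith(PDF_MAGIC)


def media_type_for(filename: str):
    """Guess the Internet Media Type from the filename extension alone."""
    return mimetypes.guess_type(filename)[0]
```

Checking the signature and the extension separately is deliberate: a mismatch between the two (for instance, a `.pdf` file that does not start with `%PDF`) is itself useful information during file identification.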

Notes

General In October 2009, ISO authorized a new project to develop the PDF 2.0 standard.
History Adapted from PDF Reference, Third Edition: The origins of PDF and the Adobe Acrobat product family date to early 1990. At that time, the PostScript page description language was rapidly becoming the worldwide standard for the production of the printed page. PDF builds on the PostScript page description language by layering a document structure and interactive navigation features on PostScript's underlying imaging model, providing a convenient, efficient mechanism enabling documents to be reliably viewed and printed anywhere.

Format specifications


Useful references

URLs


Last Updated: 09/29/2010