Sustainability of Digital Formats: Planning for Library of Congress Collections


Document Container File: Core (based on ZIP 6.3.3)

Format Description Properties

Identification and description

Full name ISO/IEC 21320-1 Information technology -- Document Container File -- Part 1: Core (formal name). Profile of ZIP File Format, Version 6.3.3 from PKWARE.
Description

The development of ISO/IEC 21320-1:2015 - Information technology -- Document Container File -- Part 1: Core was an activity under ISO/IEC JTC 1/SC 34/WG1 to develop a refinement of the widely used ZIP format from PKWARE that would use a subset of ZIP features to support royalty-free use as a container for documents. See ZIP_PK for more on the ZIP File Format in general.

The evolving ZIP format, as defined in a sequence of specifications referred to as the "Application Note," had been in wide use in the computing industry for over twenty years, and the specification had been freely available for much of that time. Initially, only the latest version of the Application Note was available, but since 2010, PKWARE has made available Application Note Archives. Although the format had not been formally standardized, several standards and widely used software applications had incorporated subsets of the specification based on a particular edition of APPNOTE.TXT. The objective of the standardization activity begun in 2011, as reflected in the approved ISO/IEC 21320-1 standard, was to address related challenges faced by standards developers, including those in ISO/IEC committees:

  • stability of reference: what is the correct reference to give for the Zip Application Note and how can it be ensured that this reference remains available?
  • intellectual property rights: what, if any, patents are necessary to implement this technology, and is there a subset that may be freely implemented?
  • cultural and linguistic adaptability: is the Zip Application Note sufficient by itself, or is additional expository material needed to define best practices for global use, e.g. the use of IRIs for file names?
  • interoperability within domain: is there a technology subset that will provide greater interoperability within the domain of Document Container Files than permitting all features of the Zip Application Note?

The new ISO "work item" was approved in August 2011. In November 2012, a working draft of the proposed standard, ISO/IEC 21320-1, was made available for discussion.

ZIP_21320_1 describes itself as a compatible profile of ZIP_6_3_3. The specification consists of restrictions in relation to the full ZIP specification, referred to by specific paragraph numbers in the PKWARE Version 6.3.3 of APPNOTE.TXT. The restrictions include:

  • Files stored in document container files may only be stored uncompressed or using the "deflate" mechanism as defined in RFC 1951
  • The encryption features defined in APPNOTE.TXT are prohibited
  • The digital signature features defined in APPNOTE.TXT are prohibited
  • The "patch data" features defined in APPNOTE.TXT are prohibited
  • Document container files should not be segmented or span multiple volumes
  • Filenames should be encoded in UTF-8 (which also covers ASCII filenames); UTF-8 is mandated only when a filename contains a byte with a value greater than 0x7F.

    The specification includes an annex on filenames and interoperability (Annex B), which discusses normalization of Unicode characters and prohibited characters in various container formats based on the ZIP format, including OPC/OOXML and the EPUB Open Container Format (OCF) 3.0 (see EPUB_3_0).
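The restrictions listed above can be partly checked with a short script. The following is a minimal sketch in Python using the standard zipfile module, not a conformance tool: the function names (entry_problems, check_container) are hypothetical, the flag bit positions are given as the author of this sketch understands them from APPNOTE.TXT and should be verified against the specification, and the reading of the UTF-8 rule is an assumption based on the summary above. It does not check for digital signatures or for segmented and multi-volume archives, which the zipfile module does not expose directly.

```python
import zipfile

# General-purpose bit flags (assumed bit positions; verify against APPNOTE.TXT).
FLAG_ENCRYPTED = 1 << 0          # entry is encrypted
FLAG_PATCH_DATA = 1 << 5         # "compressed patched data"
FLAG_STRONG_ENCRYPTION = 1 << 6  # strong encryption
FLAG_UTF8 = 1 << 11              # file name and comment are UTF-8
FLAG_MASKED_HEADER = 1 << 13     # set when the central directory is encrypted

# Only "stored" (method 0) and "deflate" (method 8) are permitted.
ALLOWED_METHODS = {zipfile.ZIP_STORED, zipfile.ZIP_DEFLATED}
ENCRYPTION_FLAGS = FLAG_ENCRYPTED | FLAG_STRONG_ENCRYPTION | FLAG_MASKED_HEADER


def entry_problems(info):
    """Return a list of profile violations found in one zipfile.ZipInfo."""
    problems = []
    if info.compress_type not in ALLOWED_METHODS:
        problems.append(
            f"{info.filename}: compression method {info.compress_type} is not allowed"
        )
    if info.flag_bits & ENCRYPTION_FLAGS:
        problems.append(f"{info.filename}: encryption is prohibited")
    if info.flag_bits & FLAG_PATCH_DATA:
        problems.append(f"{info.filename}: patch data is prohibited")
    # zipfile decodes names as CP437 when the UTF-8 flag is unset, so a
    # non-ASCII decoded name means the stored name had a byte above 0x7F.
    if not info.filename.isascii() and not info.flag_bits & FLAG_UTF8:
        problems.append(f"{info.filename}: non-ASCII name is not flagged as UTF-8")
    return problems


def check_container(source):
    """Check every entry of a ZIP archive given as a path or binary file object."""
    with zipfile.ZipFile(source) as archive:
        return [p for info in archive.infolist() for p in entry_problems(info)]
```

Calling check_container("example.zip") (a hypothetical file name) returns an empty list when none of the checked restrictions is violated. Archives written with zipfile using ZIP_STORED or ZIP_DEFLATED and no password should pass these particular checks.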

Production phase May be used at any lifecycle phase for bundling/packaging files together for exchange, storage, or distribution.
Relationship to other formats
    Subtype of ZIP_6_3_3, ZIP File Format, Version 6.3.3 (PKWARE)
    Subtype of ZIP_PK, ZIP File Format (PKWARE)

Local use

LC experience or existing holdings See ZIP_PK.
LC preference See ZIP_PK.

Sustainability factors

Disclosure ISO/IEC 21320-1 was developed as an international standard under ISO/IEC JTC 1/SC 34/WG1 [Markup Languages]. The ZIP format of which it is a profile was developed and has been published freely online by PKWARE.
    Documentation

The standard is available from ISO as ISO/IEC 21320-1 Information technology -- Document Container File -- Part 1: Core via the catalog record at https://www.iso.org/standard/60101.html. The specification can be downloaded from https://standards.iso.org/ittf/PubliclyAvailableStandards/c060101_ISO_IEC_21320-1_2015.zip

Version 6.3.3 of ZIP, on which ISO/IEC 21320-1 is based, is documented in APPNOTE.TXT, Version 6.3.3 (September 2012).

Adoption

The objective of the effort was to define a profile of ZIP that is compatible with the largest number of existing applications and hence provides the greatest level of interoperability. As of May 2020, existing standards have tended to retain, as normative references, the versions of APPNOTE.TXT used when they were originally developed, and the compilers of this resource have not identified a specification that uses ISO/IEC 21320-1 as its basis. Comments welcome.

See ZIP_PK for discussion of adoption of ZIP in general.

    Licensing and patents The features in this profile of ZIP are chosen to avoid patent and licensing implications. See ZIP_PK for discussion of patent issues for the parent ZIP format.
Transparency Encryption of individual files and of the central directory is prohibited. Hence this profile of ZIP_PK is more transparent than its parent format.
Self-documentation The ZIP format per se and this profile in particular provide no metadata support beyond what is needed to support unpacking the ZIP archive and extracting the component files. The document format specifications that build on restricted subsets of the ZIP format and might be expected to incorporate this profile in future versions are likely to mandate or facilitate some level of descriptive and structural metadata. For example, OOXML's OPC and EPUB's OCF both incorporate files that provide metadata for the document as a whole. Relationships between component files are also likely to be explicit in such document formats, either through a generic relationship representation or use of prescribed application-specific naming conventions.
External dependencies See ZIP_PK.
Technical protection considerations Encryption as supported within the ZIP specification is prohibited in this profile of the ZIP file format. However, it is possible for applications to apply encryption or DRM to the file as a whole or implement application-specific technical protection. Examples of the latter include SCORM and EPUB. See ZIP_PK.

Quality and functionality factors

Other
Bundling/compression Separate functionality factors for comparing formats that are used to bundle and/or compress files have not been developed. From the perspective of digital preservation, consideration of the sustainability factors above is more important than the degree of compression.

File type signifiers and format identifiers

Tag | Value | Note
Filename extension | zip, ZIP | Other extensions are used for particular applications that use the ZIP format as a container.
Internet Media Type | application/zip | Other Internet Media Types are used for particular applications that use the ZIP format as a container.
File signature | See related format. | See ZIP_PK.
Wikidata Title ID | Q26211840 | See https://www.wikidata.org/wiki/Q26211840.

Notes

General

The ZIP format is designed for cross-platform data exchange and efficient data storage for a set of related files. ZIP_PK is a de facto industry standard, developed, maintained, and openly documented by PKWARE.

See also ZIP_PK.

History

The original version of the format was developed by Phil Katz (hence the "PK" in PKWARE). Since the first specification was published in 1990, PKWARE has updated the format as supported in its products and issued new versions of the specification in a document called APPNOTE.TXT. See ZIP_PK for a more detailed history. The formats defined by versions 6.3.2 (September 2007) and 6.3.3 (September 2012) of APPNOTE.TXT are technically identical. Version 6.3.3 of the APPNOTE.TXT states that the changes from version 6.3.2 are "formatting changes to support easier referencing of this APPNOTE from other documents and standards."

As described in https://en.wikipedia.org/wiki/ZIP_(file_format), a proposed project to create an ISO/IEC international standard for a format compatible with ZIP failed to pass a 2010 ballot of national standards bodies. Instead, a study period was initiated, resulting in recommendations documented in ISO/IEC JTC 1/SC 34 N 1621. The recommendations were (a) to have PKWARE continue its maintenance of the ZIP Application Note, (b) to plan for a new multi-part ISO standard to build on top of the ZIP Application Note, and (c) to propose a new work item for Part 1 of the new standard for a Document Container File. The new work item was approved in August 2011 and ISO/IEC 21320-1 - Information technology -- Document Container File -- Part 1: Core was published in 2015.



Last Updated: 05/20/2020