Sustainability of Digital Formats: Planning for Library of Congress Collections

OpenDocument Text Document Format (ODT), Version 1.2, ISO 26300-1:2015

Format Description Properties

Identification and description

Full name OpenDocument Text Document Format (ODT), Version 1.2. Part of OASIS Open Document Format for Office Applications, Version 1.2 and the equivalent ISO 26300-1:2015.
Description

The OpenDocument Text Document Format (ODT), Version 1.2 (given the short name ODF_text_1_2 here) is a format for editable textual documents. It is one of several subtypes in the ODF family for particular content categories. Designed to be a native format for word-processing applications, the format is sometimes called ODT after its usual file extension. The term ODT will be used here to refer to ODF_text_1_2 and other chronological versions of the OpenDocument Text format. This description relates primarily to part 1 of the ODF 1.2 specification as published by OASIS and the equivalent ISO/IEC 26300-1:2015 specification. The specification covers two physical forms for ODF text documents, a flat form as a single XML file and a package form based on the ZIP_6_2_0 format. This description focuses on the more commonly used ZIP-based package format, given the .odt file extension. Files using the same markup specification and package but with an extension of .ott are for use as document templates. Similarly, files using the same markup specification and package but with an extension of .odm are for use as a "master" text file that primarily comprises subdocuments, e.g., for a book, with chapters as subdocuments.

An ODF package can be recognized as a textual document in several ways. Externally, there are file extensions for three ways in which ODF text documents may be used in word-processing applications, as noted above. The primary internal indication is that the mandatory file named mimetype will contain one of the corresponding strings listed as File signifiers below. An additional way to recognize a textual ODF document is that the <office:body> element, a child of the root <office:document-content> element in content.xml, has the child element <office:text>. See chapter 3 of the ODF 1.2 Part 1 specification for details.
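The two internal checks described above can be sketched in a few lines of Python. This is a minimal illustration, not a validator: the function name is hypothetical, the namespace URI is taken from the ODF specification rather than from this description, and the sketch requires both indications to be present, which is stricter than the text demands.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URI for the office: prefix, as defined in the ODF specification.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"

# The three MIME strings listed under File signifiers in this description.
TEXT_MIME_TYPES = {
    "application/vnd.oasis.opendocument.text",
    "application/vnd.oasis.opendocument.text-template",
    "application/vnd.oasis.opendocument.text-master",
}


def is_odf_text_package(source):
    """Return True if the ZIP package looks like an ODF text document.

    `source` is a path or a binary file-like object. Hypothetical helper.
    """
    with zipfile.ZipFile(source) as package:
        names = package.namelist()
        if "mimetype" not in names or "content.xml" not in names:
            return False
        # Primary indication: the content of the mimetype file.
        mimetype = package.read("mimetype").decode("ascii", errors="replace").strip()
        if mimetype not in TEXT_MIME_TYPES:
            return False
        root = ET.fromstring(package.read("content.xml"))
    # Additional indication: office:body has an office:text child.
    body = root.find(f"{{{OFFICE_NS}}}body")
    return body is not None and body.find(f"{{{OFFICE_NS}}}text") is not None
```

A caller could pass either a filename such as "report.odt" (a made-up name) or an in-memory buffer.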

The ZIP-based package for any ODF file contains, at a minimum, five files: a one-line mimetype file containing a single text string; content.xml; styles.xml; meta.xml; and settings.xml. The typical content.xml file for a minimal text file has the basic form:

    <office:document-content>
      <office:automatic-styles>
        ... styles created automatically by implementations when a user chooses format features, such as a font, directly. See style P1 used below. May include references to fuller style specifications in styles.xml ...
      </office:automatic-styles>
      <office:body>
        <office:text>
          <text:p text:style-name="P1">Hello World</text:p>
        </office:text>
      </office:body>
    </office:document-content>

Note that several of these elements will usually have attributes omitted here. Additional elements in the office: namespace may be included within the office:text element to reflect application-specific defaults.
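To illustrate how the structure above can be read, the following sketch pulls paragraph text out of a content.xml string using Python's standard library. It is an assumption-laden simplification: the function name is hypothetical, the namespace URIs come from the ODF specification (real files declare them on the root element, which the abbreviated example above omits), and it ignores special elements such as explicit spaces and tabs, so the output is only an approximation of the visible text.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as defined in the ODF specification (not restated in this description).
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
}


def paragraph_texts(content_xml):
    """Return the character data of every text:p under office:body/office:text."""
    root = ET.fromstring(content_xml)
    office_text = root.find("office:body/office:text", NS)
    if office_text is None:
        return []
    paragraph_tag = f"{{{NS['text']}}}p"
    # itertext() joins the text of nested inline elements such as text:span.
    return ["".join(p.itertext()) for p in office_text.iter(paragraph_tag)]
```

Because the search walks all descendants, paragraphs nested in tables or in change-tracking markup are returned as well; a more careful extractor would filter those.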

The corresponding styles.xml file will contain the much fuller style specifications needed for even the simplest document, including choices or defaults for page layout, hyphenation, spacing, writing orientation, etc.

See Notes in ODF Family for more information about the flat XML-only variant of ODF files. For a flat textual ODF file, the root <office:document> element has an office:mimetype attribute with one of the three values listed below as File signifiers.
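As a rough sketch of how the flat form might be inspected, the code below reads the office:mimetype and office:version attributes from the root element of a flat ODF file. The function name is hypothetical and the namespace URI is taken from the ODF specification; the sketch does not check that the MIME value is one of the three text-document values.

```python
import xml.etree.ElementTree as ET

# Namespace URI for the office: prefix, as defined in the ODF specification.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def flat_odf_info(xml_bytes):
    """Return mimetype and version for a flat ODF file, or None if the
    root element is not office:document."""
    root = ET.fromstring(xml_bytes)
    if root.tag != f"{{{OFFICE_NS}}}document":
        return None
    return {
        "mimetype": root.get(f"{{{OFFICE_NS}}}mimetype"),
        # A missing version attribute is returned as None.
        "version": root.get(f"{{{OFFICE_NS}}}version"),
    }
```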

For details of the ZIP-based package for ODF_text_1_2, see ODF_package_1_2. The package specification defines the form for a package manifest, and options for digital signatures, encryption, etc.

Apart from changes to the underlying package format, changes made to the markup for textual ODF documents between ODF versions 1.1 and 1.2 are few. They relate mainly to added formatting options for lists, tables, and references. See Appendix G of the ODF 1.2 Part 1 specification for details.

Production phase Can be used in any production phase. Particularly used for creating documents (initial state) and for editing and review (middle-state). Documents that are formally published are often converted to a format that is designed for final publication and not for convenient editing.
Relationship to other formats
    Subtype of ODF_Family, OpenDocument Format (ODF) Family, OASIS and ISO/IEC 26300
    Subtype of ODF_package_1_2, OpenDocument Package Format, ODF 1.2, ISO 26300-3:2015
    Subtype of ZIP_6_2_0, ZIP File Format, Version 6.2.0 (PKWARE). Various features of the ZIP File Format are not permitted in ODF.
    Contains META-INF/manifest.xml file. This manifest file is mandatory in all ODF packages.
    Has earlier version ODF_text_1_1, OpenDocument Text Document Format (ODT), Version 1.1, ISO 26300-1:2006
    Defined via XML_1_0, XML (Extensible Markup Language) 1.0. A normative RELAX NG schema is part of the specification for ODF 1.2, which includes the specification for text documents.

Local use

LC experience or existing holdings See ODF_family.
LC preference The Library of Congress Recommended Format Statement (RFS) lists ODF as an acceptable format for textual works in digital form and electronic serials. The RFS list does not distinguish between versions of ODF. In general, the Library of Congress prefers formats intended for final publication of textual works, rather than editable formats. Editable word-processing formats will be found in collections of papers of organizations and individuals.

Sustainability factors

Disclosure International open standard. Developed and maintained by OASIS Open Document Format for Office Applications (OpenDocument) TC as part of the OpenDocument Format (ODF) 1.2 specification published by OASIS in 2011. Also approved as part of the equivalent ISO/IEC 26300-1:2015 by ISO/IEC JTC1/SC34.
    Documentation

Specifications from OASIS: Open Document Format for Office Applications (OpenDocument) Version 1.2. The specification for ODF 1.2 text documents is found primarily in chapters 5-8 of Part 1 of the specification. The technical specification is in a normative RNG schema for primary component files for ODF documents.

The identical specification is published as ISO/IEC 26300-1:2015, Information technology -- Open Document Format for Office Applications (OpenDocument) v1.2 Part 1: OpenDocument Schema.

Although there are few changes from ODF_text_1_1 to the schema that defines markup for textual documents, there is a completely new organization and style for the newer specification document. The ODF 1.2 specification is divided into three parts, with the bulk of the markup specification in Part 1: OpenDocument Schema. The package specification is in Part 3: Packages. The new formula specification for spreadsheets is in its own Part 2: Recalculated Formula (OpenFormula) Format. Part 1, which is the part defining ODF_text_1_2, focuses on the schema, with fewer illustrative examples than its predecessor.

See ODF-Family for a listing of namespaces that can be incorporated into any ODF 1.1 or ODF 1.2 document and links to associated specifications.

Adoption

The major applications supporting ODF can read and write text documents as defined in ODF 1.2.

See ODF-Family for more detail on adoption of ODF in general, and particularly for mandates or recommendations for ODF when exchanging editable documents among government agencies and the individuals or organizations they serve.

    Licensing and patents No concerns. See ODF-Family.
Transparency

The structure and text of an ODT file are all represented in XML and hence viewable without special tools, although XML-aware tools that can show the element hierarchy make viewing and interpretation more convenient. The most commonly used parts, elements, and attributes have recognizable names. Simple documents can be interpreted with very basic tools. However, interpreting the semantics of some elements and the correspondence of some elements and attributes to word-processing functionality will require not only understanding of the schema and the specification text, but familiarity with the associated functionality.

Self-documentation

As for other members of the ODF 1.2 family, ODF_text_1_2 added support for metadata based on RDF (W3C's Resource Description Framework). As well as using RDF for metadata for the document package as a whole, RDF can be attached to elements within the document's content. The use of "custom" metadata as specified in ODF 1.1 is deprecated in ODF 1.2.

Pre-defined metadata elements for the document as a whole, stored in an office:meta element, include:

  • From the Dublin Core namespace, using the dc: prefix: Title, Creator (of most recent modification), Description, Subject, Date (last modified), Language
  • From the ODF specification, using the meta: prefix: Generator (creating software application), Keywords, Initial Creator, Creation Date and Time, Modification Date and Time, Print Date and Time, Document Template, Document Statistics (word count, page count, etc.)

The pre-defined elements are all optional and repeatable. However, applications are not required to update multiple occurrences in a specific way to reflect modifications to a document.

Also supported in both ODF 1.1 and ODF 1.2 is an XML structure for user-defined metadata, based on triplets of name, data type, and value.
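As an illustration of the pre-defined metadata described above, the sketch below reads a handful of elements from a meta.xml string. The function name and the dictionary keys are hypothetical, and the namespace URIs and element names (dc:title, meta:generator, and so on) come from the ODF and Dublin Core specifications rather than from this description. Since the elements are repeatable, the sketch simply takes the first occurrence of each.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as defined in the ODF and Dublin Core specifications.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "meta": "urn:oasis:names:tc:opendocument:xmlns:meta:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical output keys mapped to element names inside office:meta.
FIELDS = {
    "title": "dc:title",
    "creator": "dc:creator",
    "modified": "dc:date",
    "generator": "meta:generator",
    "initial_creator": "meta:initial-creator",
    "created": "meta:creation-date",
}


def read_metadata(meta_xml):
    """Return the first occurrence of selected pre-defined metadata elements."""
    root = ET.fromstring(meta_xml)
    meta = root.find("office:meta", NS)
    if meta is None:
        return {}
    result = {}
    for key, path in FIELDS.items():
        element = meta.find(path, NS)
        # All elements are optional, so absent or empty ones are skipped.
        if element is not None and element.text:
            result[key] = element.text
    return result
```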

External dependencies Depends on features used. Textual documents in ODF_text_1_2 format may include sections that import text from an external document or data source; see clause 5.4 in Part 1 of the specification. They may include links to external databases; see clause 7.6. They can also import scripts from external sources; see subclause 7.7.9.
Technical protection considerations Encryption is supported for files within an ODF 1.1 or ODF 1.2 package. In addition, an ODF package file may be encrypted during interchange or as part of DRM controlling distribution.
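One way to see whether files within a package are encrypted is to inspect META-INF/manifest.xml. The sketch below assumes, following the ODF package specification rather than this description, that an encrypted file's manifest:file-entry carries a manifest:encryption-data child. The function name is hypothetical, and the sketch says nothing about encryption applied to the package file as a whole.

```python
import xml.etree.ElementTree as ET

# Namespace URI for the manifest: prefix, as defined in the ODF package specification.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def encrypted_entries(manifest_xml):
    """Return the full-path of every manifest entry that has encryption data."""
    root = ET.fromstring(manifest_xml)
    encrypted = []
    for entry in root.iter(f"{{{MANIFEST_NS}}}file-entry"):
        if entry.find(f"{{{MANIFEST_NS}}}encryption-data") is not None:
            encrypted.append(entry.get(f"{{{MANIFEST_NS}}}full-path"))
    return encrypted
```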

Quality and functionality factors

Text
Normal rendering Editable document, with embedded support for powerful word-processing functionality. Textual content is conveniently extractable for quotation and for indexing. Full support for Unicode character set.
Integrity of document structure Paragraphs and sections are easily recognized, as are headers and footers. Excellent support is available for higher-level constructs through the consistent use of styles (e.g., for headings), automatically generated tables of contents and indexes, and structured templates. All formatting is represented in the stored ODT document by style structures in order to facilitate separation of the text and semantic structure from layout characteristics. Style structures for a document are stored in one file, styles.xml. Textual content, including the semantic structure of chapters, paragraphs, etc., is stored in content.xml.
Integrity of layout and display Excellent support for layout choices. Represents entire layout and formatting as intended by an author who used a word-processor for which ODT is a native format. Bi-directional and vertical display of text can be specified. Differences in detail can occur on display if the original fonts used are not available in the system used for viewing or due to conversion from another word-processing format with different markup semantics.
Support for mathematics, formulae, etc. Mathematical equations can be included in documents by use of MathML either as independent files that can be embedded in a document or as drawing objects.
Functionality beyond normal rendering

In contrast to formats designed for documents as publications, word-processing formats such as ODT typically store much information associated with the process of creating and reviewing documents, including tracked changes and other annotations.
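For a collection manager who wants to know whether a document still carries review markup, a simple count can be a useful first check. The sketch below counts two element types in content.xml: text:changed-region (one per tracked change) and office:annotation (one per comment). These element names and namespace URIs come from the ODF specification, not from this description, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as defined in the ODF specification.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def review_markup_counts(content_xml):
    """Count tracked-change regions and annotations in a content.xml string."""
    root = ET.fromstring(content_xml)
    return {
        "changed_regions": sum(1 for _ in root.iter(f"{{{TEXT_NS}}}changed-region")),
        "annotations": sum(1 for _ in root.iter(f"{{{OFFICE_NS}}}annotation")),
    }
```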

ODT files may include markup to support building an index or bibliography from references entered in the text. ODT documents may include tables of contents generated automatically from section headings; such files will include elements and attributes to support regeneration of the table of contents using the author's choices of levels to include, of layout style, and of paragraphs to attach to headings.

ODT files may include forms designed to be filled in by a reader. The ODF 1.1 and 1.2 specifications include elements and attributes in the forms namespace to define presentation and controls for interactive forms. They also allow use of the W3C XForms namespace for defining models and controls for forms.


File type signifiers and format identifiers

Tag | Value | Note
Filename extension | odt | .odt is the extension used for a regular word-processing document.
Internet Media Type | application/vnd.oasis.opendocument.text | The MIME types for ODF_text_1_2 are the same as for ODF_text_1_1.
Magic numbers | See note. | Magic numbers that apply to ODF document category subtypes incorporate the magic number for ZIP_PK, the string mimetype at position 30, and the MIME subtype string value at position 38.
Indicator for profile, level, version, etc. | ASCII: office:version="1.2" | The four root elements used in the primary files in an ODF package all require an attribute that records the ODF version. This is the signifier that distinguishes ODF 1.2 packages from earlier versions. Documents without this attribute are assumed to be from version 1.1 or earlier.
Pronom PUID | fmt/291 | See https://www.nationalarchives.gov.uk/PRONOM/fmt/291.
Wikidata Title ID | Q27203601 | See https://www.wikidata.org/wiki/Q27203601.

Tag | Value | Note
Filename extension | ott | The extension .ott is used for a text document used as a template.
Internet Media Type | application/vnd.oasis.opendocument.text-template |

Tag | Value | Note
Filename extension | odm | The .odm extension is intended for an ODF text file that primarily comprises subdocuments, e.g., for a book, with chapters as subdocuments.
Internet Media Type | application/vnd.oasis.opendocument.text-master | The text-master MIME type is intended for an ODF text file that primarily comprises subdocuments, e.g., for a book, with chapters as subdocuments.
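The magic-number note above can be turned into a byte-level check that does not require unpacking the ZIP archive. The sketch below is one possible reading of that note: it assumes the standard ZIP local file header layout (signature at offset 0, uncompressed size at offset 22, filename length at offset 26, extra-field length at offset 28), which comes from the ZIP specification rather than from this description, and it assumes the mimetype entry is stored first and uncompressed. The function name is hypothetical.

```python
# The three MIME strings listed as File signifiers in this description.
TEXT_MIME_TYPES = {
    "application/vnd.oasis.opendocument.text",
    "application/vnd.oasis.opendocument.text-template",
    "application/vnd.oasis.opendocument.text-master",
}


def sniff_odf_text_mimetype(data):
    """Return the ODF text MIME type found in the leading bytes of a file,
    or None if the signature does not match.

    `data` should hold at least the first 100 bytes or so of the file.
    """
    if len(data) < 38 or data[:4] != b"PK\x03\x04":
        return None
    name_length = int.from_bytes(data[26:28], "little")
    extra_length = int.from_bytes(data[28:30], "little")
    # The string "mimetype" sits at position 30 only when the first entry
    # has an 8-byte name; the value follows at 38 only with no extra field.
    if name_length != 8 or extra_length != 0 or data[30:38] != b"mimetype":
        return None
    size = int.from_bytes(data[22:26], "little")
    value = data[38:38 + size].decode("ascii", errors="replace")
    return value if value in TEXT_MIME_TYPES else None
```

Reading the declared size avoids mistaking one MIME string for a longer one that shares its prefix, such as the text and text-template values.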

Notes

General

ODF_text_1_2 introduces the concept of an ODF Extended document, and has a related clause 3.17 on Foreign Elements and Attributes. The inclusion of support for this possibility was controversial, judging from a blog post by Dennis Hamilton and subsequent comments, but some mechanism of this sort is necessary to support the introduction and testing of enhancements to the format to fulfill customer demand for new features in applications. ODF_text_1_1 also permits the use of elements beyond those covered by the specification. Section 1.5 of the ODF 1.1 specification begins, "Documents that conform to the OpenDocument specification may contain elements and attributes not specified within the OpenDocument schema. Such elements and attributes must not be part of a namespace that is defined within this specification and are called foreign elements and attributes." OASIS produced a so-called "strict" schema that could be used to permit only elements defined in the specification. The specification indicates how applications should treat foreign elements. The compilers of this resource have not determined whether there is a substantive difference between ODF 1.1 and ODF 1.2 in relation to the inclusion and recommended treatment of foreign elements and attributes. Comments welcome.

See Notes for DOCX/OOXML_2012 for notes on challenges for conversion between ODT and DOCX formats.

History

See ODF_package_1_2 for discussion of changes to ODF in general between versions 1.1 and 1.2. Changes to the specification for text documents between versions 1.1 and 1.2 were limited to corrections and small modifications requested by implementers.

ODF 1.3 was approved as an OASIS Committee Specification in December 2020, according to a December 4, 2020 announcement. This followed several periods of public review in 2019 and 2020. The next stage in the multi-step OASIS process is to gather three "statements of use", written statements that a party has successfully used or implemented the specification. See Approval of an OASIS Standard.

The specification for ODF 1.3 has been re-organized into four Parts. Part 1 is a brief introduction; Part 2 is the Packages specification; Part 3 defines the OpenDocument Schema, which includes specifications for the ODF content subtypes, including charts; and Part 4 defines the Recalculated Formula (OpenFormula) Format. The main specification for textual content is in subclause 3.4 and clauses 5, 6, and 7 of ODF 1.3, Part 3. Since text documents can include indexes, tables, illustrations, charts, database reports and forms, clauses 8, 9, 10, 11, 12 and 13 are also relevant. Clause 16 covers styles. Judging from the change log in Appendix G of ODF 1.3, Part 3, most of the changes are corrections and clarifications aimed at improving interoperability across implementations. Enhancements or changes to the markup of text documents include the following:

  • New elements have been added to allow specification of special header and footer formats on the first page of a document.
  • ODF 1.2 has a text:text-input element that permits freeform text to be entered by a user. ODF 1.3 adds a mechanism for supplying a drop-down list of choices for a user to pick from.
  • For changes to display or control of charts, see History notes in ODF_chart_1_2.
  • For updated support for digital signatures and encryption in the ODF package format, see History notes in ODF_package_1_2.

See ODF_family for more on the history of ODF in general.


Last Updated: 12/22/2021