|Full name||ISO/IEC 14496-14:2003. Information technology -- Coding of audio-visual objects -- Part 14: MP4 File Format (formal name); MPEG-4 file format, version 2 (common name)|
|Description||The second MPEG-4 file format developed by the Moving Picture Experts Group (MPEG). The format's object-based design defines a set of tools that provide binary coded representations of individual audiovisual objects, text, graphics, and synthetic objects. This format is intended to serve web and other online applications; mobile devices, e.g., cell phones and PDAs; and broadcasting and other professional applications. See also Notes below.|
|Production phase||Generally a final-state (end-user delivery) format, may also serve as middle-state format.|
|Relationship to other formats|
|Subtype of||ISO_BMFF, ISO Base Media File Format|
|Has subtype||MP4_FF_2_V, MPEG-4 File Format, V.2, with Visual Encoding (All Profiles)|
|Has subtype||MP4_FF_2_AVC, MPEG-4 File Format, V.2, with AVC, No Profile Indicated|
|Has subtype||MP4_FF_2_AVC_BP, MPEG-4 File Format, V.2, with AVC, Baseline Profile|
|Has subtype||MP4_FF_2_AVC_MP, MPEG-4 File Format, V.2, with AVC, Main Profile|
|Has subtype||MP4_FF_2_AVC_EP, MPEG-4 File Format, V.2, with AVC, Extended Profile|
|Has subtype||MP4_FF_2_AVC_HP, MPEG-4 File Format, V.2, with AVC, High Profile|
|Has subtype||MP4_FF_2_AVC_H10P, MPEG-4 File Format, V.2, with AVC, High 10 Profile|
|Has subtype||MP4_FF_2_AVC_H422P, MPEG-4 File Format, V.2, with AVC, High 4:2:2 Profile|
|Has subtype||MP4_FF_2_AVC_H444P, MPEG-4 File Format, V.2, with AVC, High 4:4:4 Profile|
|Has subtype||MP4_FF_2_AAC, MPEG-4 File Format, V.2, with Advanced Audio Coding|
|Has subtype||For other object types, not described at this time|
|Has earlier version||MP4_FF_1, MPEG-4 File Format, Version 1|
|LC experience or existing holdings||The content produced by the NDIIPP partnership project with SCOLA consists of foreign television news broadcasts in MP4_FF_2_V, MPEG-4 File Format, V.2, with Visual Encoding.|
|Disclosure||Open standard. Developed by ISO technical program JTC 1/SC 29 (WG11), aka the Moving Picture Experts Group (MPEG).|
|Documentation||ISO/IEC 14496-14:2003. Information technology -- Coding of audio-visual objects -- Part 14: MP4 File Format.
The total documentation package for ISO/IEC 14496 is extensive; 17 parts have been published from 1998 to 2004, with more to come. See complete list of documents in Format specifications below.
|Adoption||Appears to be more widely adopted than MP4_FF_1. Overall, the adoption of MPEG-4 has been slowed by licensing terms that require some content disseminators to pay fees according to the number of end users or the extent of content delivered. As adoption advances, it may not extend to all profiles, levels, or parts of the standard.|
|Licensing and patents||MPEG-4 Visual, Systems, and Advanced Video Coding licensing is managed by MPEG LA LLC (http://www.mpegla.com/). These licenses cover the manufacture and sale of devices or software and, for some content disseminators, levy fees according to the number of end users or the extent of content delivered. The arrangements are updated periodically; for example, in January 2005, MPEG LA announced that the patent portfolio had been expanded to cover the FRExt (Fidelity Range Extensions) associated with MPEG-4_AVC and ITU H.264.
MPEG-4 Audio licensing is managed by Via Licensing Corporation (http://www.vialicensing.com/), an independent subsidiary of Dolby Laboratories. MPEG-4 Audio licensing appears to be limited to the manufacture of devices or software.
|Transparency||Depends upon the included encodings, but all MPEG-4 encodings require algorithms and tools to read, and building such tools requires sophistication.|
|Self-documentation||The inclusion of metadata of various types is a key element in MPEG-4. As indicated in the notes below, object and scene descriptions are required in order for MPEG-4 content to be presented.
Semantic description is carried by Object Content Information (OCI) descriptors and streams; the standard also permits the inclusion of MPEG-7 data, a separately standardized structure for metadata to support discovery and other purposes.
|External dependencies||Playback of surround sound requires multiple loudspeakers.|
|Technical protection considerations||MPEG-4 offers a standardized Intellectual Property Management and Protection (IPMP) interface consisting of IPMP-Descriptors (IPMP-Ds) and IPMP-Elementary Streams (IPMP-ES) that allow the design and use of domain-specific IPMP systems.|
|Normal rendering||Good support. The format supports timescales that manage the playout of time-based media streams and hint tracks employed in streaming applications.|
|Clarity (high image resolution)||Depends upon encoding; see MPEG-4_V and MPEG-4_AVC.|
|Functionality beyond normal rendering||MPEG-4 program streams may be multiplexed in MPEG-2 transport streams. Random access and other features are discussed in the specification.|
|Fidelity (high audio resolution)||Depends upon encoding; the encodings used are generally lossy and provide moderate to very good fidelity. See, for example, AAC_MP4, considered to be superior to MP3 (MPEG-2 layer 3 audio) at a given bit rate.
The MPEG-4 standard also provides support for other "natural" sound encodings, e.g., parametric coding (HILN or Harmonic and Individual Lines plus Noise) and CELP (Code Excited Linear Prediction) and other encodings for speech. The standard also supports the synthesis of audio and what is called Synthetic-Natural Hybrid Coding (SNHC). The presentation of these elements depends upon the use of AudioBIFS (Audio BInary Format for Scenes). In 2005, the MPEG committee announced two additional audio capabilities: Audio Lossless coding (ALS; lossless compression of multi-channel sound using time-domain prediction and entropy coding) and Scalable to Lossless coding (SLS; a scalable enhancement layer is added to a lossy bitstream that extends the representation to lossless but which can be truncated at delivery time). The compilers of this document do not know the degree to which any of these various elements may be implemented in practice.
|Multiple channels||The AAC_MP4 audio structure provides a capability of up to 48 main audio channels, 16 LFE (Low Frequency Encoding or Effects) channels, 16 overdub/multilingual channels, and 16 data streams.
SNHC [and other note-based or synthetic?] sound can be spatially presented using extensions of the concepts initially implemented in Virtual Reality Modeling Language (VRML).
|Support for user-defined sounds, samples, and patches||Not applicable.|
|Functionality beyond normal rendering||Not fully investigated at this time. Recent published or announced additions to the standard include Part 16, the Animation Extension Framework; Part 17 for "timed text," e.g., subtitles or karaoke; Part 18 for font compression and streaming; and Part 22 for Open Fonts based on the OpenType specification.|
Paraphrased from www.m4a.com: MP4 can be used for MPEG-4 video files, combined video and audio files, or just plain MPEG-4 audio. M4A files contain only MPEG-4 Audio. Apple started using M4A to identify files unprotected by digital rights management; note that protected QTA_AAC files carry the M4P and M4B (for bookmarkable files) extensions. Apple felt that MP4 was too general (video, video/audio, or audio) and might confuse some media players. Until recently, encoder and player software like Nero and Compaact used .mp4 for audio files while WinAmp 5.02, Apple iTunes, and others used .m4a. Today, most audio software developers allow you to choose the file extension you prefer.
The Wikipedia article Apple Lossless (consulted November 2, 2012) reports that the m4a extension is used for files containing either AAC_MP4 or the Apple Lossless encoding, wrapped in the MP4_FF_2 (MPEG-4, version 2) file format.
|Internet Media Type||video/mp4
||According to IETF RFC 4337 (March 2006), for files with video and audio streams (including MPEG-J; see note 1).|
|Internet Media Type||audio/mp4
||According to IETF RFC 4337 (March 2006), for files with audio but no visual aspect (including MPEG-J; see note 1).|
|Internet Media Type||application/mp4
||According to IETF RFC 4337 (March 2006), for files with neither visual nor audio presentations but only MPEG-J (see note 1) or MPEG-7 metadata.|
|Internet Media Type||application/mpeg4-iod
||IOD (Initial Object Descriptor) in binary format and (with appended xmt) in textual format, from IETF RFC 4337 (March 2006).|
|Internet Media Type||video/mp4v-es
||Additional MIME types referred to in various documents. IETF RFC 3016 reports that MIME types may have indicators for data rate or profile-level appended to them.|
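The three RFC 4337 media types listed above are distinguished by which kinds of streams a file carries. A minimal Python sketch of that selection follows; the function name and its boolean parameters are hypothetical, and treating any file with visual content as video/mp4 (with or without audio) is an assumption of this sketch rather than something stated in the rows above.

```python
def mp4_media_type(has_video: bool, has_audio: bool) -> str:
    """Pick a media type for an MP4 file from the kinds of streams it carries.

    Assumption: any visual content selects video/mp4, whether or not
    audio is also present.
    """
    if has_video:
        return "video/mp4"
    if has_audio:
        # Audio but no visual aspect.
        return "audio/mp4"
    # Neither visual nor audio, e.g. only MPEG-J or MPEG-7 metadata.
    return "application/mp4"
```

For example, `mp4_media_type(False, True)` selects the audio type, and a metadata-only file falls through to the application type.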
|Magic numbers||See note.||None|
|Uniform Type Identifier (Mac OS)||mpg4
||Similar in function to a filename extension; the mpg4 type code is documented in IETF RFC 4337 (March 2006).|
|File type brand (ISO Base Media File Format)||mp42
||ISO_BMFF includes a file type box that contains major and minor brands (identifiers); this brand is specified in Part 14, Section 4 (ISO/IEC 14496-14:2003. Information technology -- Coding of audio-visual objects -- Part 14: MP4 File Format, p. 6).|
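As an illustration of how the file type box can be inspected, the following Python sketch reads the brands from the start of a file. It assumes the common ISO_BMFF layout (a 32-bit big-endian box size, the four-character type `ftyp`, a major brand, a 32-bit minor version, then a list of four-character compatible brands) and assumes the box comes first in the file; the function name `read_ftyp` is hypothetical, and 64-bit box sizes are not handled.

```python
import io
import struct
from typing import BinaryIO, List, Tuple


def read_ftyp(stream: BinaryIO) -> Tuple[str, int, List[str]]:
    """Return (major_brand, minor_version, compatible_brands) from a leading ftyp box."""
    header = stream.read(8)
    if len(header) < 8:
        raise ValueError("too short to hold a box header")
    size, box_type = struct.unpack(">I4s", header)
    if box_type != b"ftyp":
        raise ValueError("first box is not a file type box")
    if size < 16:
        raise ValueError("file type box too small (64-bit sizes not handled)")
    payload = stream.read(size - 8)
    if len(payload) < size - 8:
        raise ValueError("file type box is truncated")
    major, minor = struct.unpack(">4sI", payload[:8])
    # Remaining payload is a sequence of four-character compatible brands.
    compatible = [
        payload[i:i + 4].decode("latin-1")
        for i in range(8, len(payload) - 3, 4)
    ]
    return major.decode("latin-1"), minor, compatible


# Synthetic example: a 24-byte ftyp box with major brand mp42.
sample = struct.pack(">I4s4sI", 24, b"ftyp", b"mp42", 0) + b"mp42isom"
print(read_ftyp(io.BytesIO(sample)))  # ('mp42', 0, ['mp42', 'isom'])
```

A file can then be treated as a candidate MP4_FF_2 file when `mp42` appears as the major brand or among the compatible brands; that test is a heuristic of this sketch, not a rule quoted from Part 14.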
|General||The four file formats associated with the ISO/IEC 14496 family of specifications are:
• MP4_FF_1, "version 1" from Part 1 (2001)
• MP4_FF_2, "version 2," this document, from Part 14
• MP4_FF_AVCE, for Advanced Video Coding extensions, from Part 15
• MP4_XMT, "textual format" from Part 11
Version 2 is very similar to its predecessor MP4_FF_1, as both owe a debt to the QuickTime file format that preceded them. This lineage is shared with the supertype for MP4_FF_2, ISO_BMFF, defined in Part 12 of both the MPEG-4 and JPEG 2000 standards.
Note that "object-oriented building blocks" are called boxes in this file format and its parent, ISO_BMFF; in contrast, they are called atoms in the predecessor MP4_FF_1 and QuickTime.
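To make the box structure concrete, the following Python sketch walks the top-level boxes of a file and reports each one's type, offset, and size. It assumes the usual ISO_BMFF header conventions (a 32-bit big-endian size followed by a four-character type, with size 1 meaning a 64-bit size follows and size 0 meaning the box runs to the end of the file); the function name `iter_top_level_boxes` is hypothetical, and nested boxes are not descended into.

```python
import io
import struct
from typing import BinaryIO, Iterator, Tuple


def iter_top_level_boxes(stream: BinaryIO) -> Iterator[Tuple[str, int, int]]:
    """Yield (box_type, offset, size) for each top-level box in a seekable stream."""
    stream.seek(0, io.SEEK_END)
    end = stream.tell()
    offset = 0
    while offset + 8 <= end:
        stream.seek(offset)
        size, box_type = struct.unpack(">I4s", stream.read(8))
        min_size = 8
        if size == 1:
            # A 64-bit "largesize" field follows the type.
            raw = stream.read(8)
            if len(raw) < 8:
                break
            size = struct.unpack(">Q", raw)[0]
            min_size = 16
        elif size == 0:
            # The box extends to the end of the file.
            size = end - offset
        if size < min_size or offset + size > end:
            break  # malformed or truncated box; stop rather than guess
        yield box_type.decode("latin-1"), offset, size
        offset += size
```

Calling `list(iter_top_level_boxes(open(path, "rb")))` on a typical file would be expected to show a file type box followed by movie and media data boxes, although the exact order varies between files.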
The object-based design of MPEG-4 is characterized as follows in Fernando Pereira and Touradj Ebrahimi's The MPEG-4 Book (Upper Saddle River, NJ: IMSC Press, 2002): "MPEG-4 is an ISO/IEC standard developed by MPEG for communicating interactive audiovisual scenes. The standard defines a set of tools that provide binary coded representation of individual audiovisual objects, text, graphics, and synthetic objects. The interactive behaviors of these objects and the way they are composed in space and time to form an MPEG-4 scene are dependent on the scene description, which is coded in a binary format known as binary format for scenes (BIFS) . . . . The audiovisual streams are defined as elementary streams (ESs) and managed according to the object descriptor (OD) framework . . . . In addition, the OD framework defines additional streams for object content information (OCI), MPEG-J [Java APIs], and intellectual property management and protection (IPMP)." (p. 188)
BIFS owes a debt to the Virtual Reality Modeling Language (VRML), even as it extends VRML's capabilities and employs binary encoding. Timing of elements in MPEG-4 is managed by a Synchronization Layer (SL). The delivery of MPEG-4 content is supported by the Delivery Multimedia Integration Framework (DMIF) and its application interface.
MPEG-J is described in Part 1 of the standard (ISO/IEC 14496-1:2004). This API for the interoperation of MPEG-4 media players with Java code is contrasted with a conventional parametric system. "By combining MPEG-4 media and safe executable code, content creators may embed complex control and data processing mechanisms with their media data to intelligently manage the operation of the audio-visual session. The parametric MPEG-4 System forms the Presentation Engine while the MPEG-J subsystem controlling the Presentation Engine forms the Application Engine. The Java application is delivered as a separate elementary stream to the MPEG-4 terminal. There it will be directed to the MPEG-J run time environment, from where the MPEG-J program will have access to the various components and required data of the MPEG-4 player to control it." (p. xii)
1 From http://mpeg.chiariglione.org/faq/mp4-sys/sys-faq-mpegj.htm: MPEG-J is an extension of MPEG-4 that allows the use of Java classes within MPEG-4 content. It allows an audio-visual session to be adapted to the operating characteristics of the terminal. Important characteristics are, for example, the capability to allow graceful degradation under limited or time varying resources and the ability to respond to user interaction to allow programmatic control of the terminal to facilitate the integration of features for applications such as set top box, interactive games and mobile AV terminals in MPEG-4.