Sustainability of Digital Formats: Planning for Library of Congress Collections


ESRI Arc Geodatabase Format Family

Format Description Properties Explanation of format description terms

Identification and description

Full name ESRI Arc Geodatabase Format family
Description

The formats in the Esri Arc Geodatabase (GeoDB) format family are proprietary implementations of a data model used in the Esri ArcGIS product line, which includes ArcMap, ArcCatalog, ArcGlobe, and ArcScene for desktop computers, as well as ArcGIS software for enterprises and mobile devices. The data model extends the georelational data model that was the basis for the Esri ArcInfo_Coverage data format. For example, based on technological developments not available when the Esri coverage format was developed, the Esri geodatabase adds support for object-oriented functionality and takes advantage of the capabilities of off-the-shelf relational database management systems. The geodatabase data model serves as the common data storage and management framework for all ArcGIS software starting with v8.0, released in late 1999. The data model supports, as standard, a rich collection of objects (rows in a database table) and features (objects with geometry). It also supports advanced feature types such as geometric and logical networks, true curves, complex polylines, and user-defined features. Vector features can have two, three, or four dimensions (x, y, z, and m). Users can define topological and association relationships and rules that define how feature classes interact.

Included in the Esri geodatabase model is a storage mechanism for spatial and attribute data that contains specific storage structures for features, collections of features, attributes, relationships between attributes, and relationships between features. An Esri geodatabase has two major elements: first, a physical store of geographic information inside a relational database management system (DBMS); second, a data model that supports objects with attributes and behavior, and transactional views of the database including versioning. Behavior describes how an object or feature can be edited and displayed. Behavior includes, but is not limited to, relationships, validation rules, subtypes, and default values. With associated behaviors, data entry is regulated more efficiently, and data contamination issues can be avoided. See Notes below for a high-level explanation from Esri of what the term "geodatabase" means in ArcGIS.

The DBMS for an Esri geodatabase uses tables and other structures found in commercial off-the-shelf database management systems to store spatial data (vector, raster, address, measures, CAD, etc.) including:

  • Simple features such as shapefiles
  • Custom features with business logic and editing rules
  • Attribute data
  • Metadata
  • Images
  • Raster/Grid data
  • CAD data

The geodatabase schema includes the definitions, integrity rules, and behavior for these and for extended capabilities. These include properties for coordinate systems, coordinate resolution, feature classes, topologies, networks, raster catalogs, relationships, domains, and so forth. This schema information is persisted in a collection of geodatabase meta tables in the DBMS. These tables define the integrity and behavior of the geographic information.

The geodatabase data model has been implemented by Esri in three different Types of geodatabases: enterprise databases, also known as spatial database engine (SDE) databases because they use Esri's ArcSDE technology; file geodatabases; and personal geodatabases. These three storage options offer different capabilities and are suitable in different contexts:

  • Enterprise geodatabases are stored in a relational database using Oracle, Microsoft SQL Server, IBM DB2, IBM Informix, or PostgreSQL. These multiuser geodatabases require the use of ArcSDE software and can be unlimited in size and numbers of users. ArcSDE is the recommended native data format for ArcGIS stored and managed in a relational database. See GeoDB_SDE.
  • Personal geodatabases are editable by a single user at a time, with all datasets stored within a single Microsoft Access database file with the .mdb file extension. A personal geodatabase is limited to 2 gigabytes in size and is tied to the Windows operating system and the JET database engine. See MDB_family for information about the underlying format. This storage option was introduced with ArcGIS 8.0 in late 1999. Since the introduction of File geodatabases in ArcGIS version 9.2 in 2006, Esri has encouraged users to migrate away from Personal geodatabases. Personal geodatabases are not supported in the ArcGIS Pro product, which is a 64-bit desktop application released in 2015.
  • File geodatabases are stored as folders in a file system; the folder representing a geodatabase has a file for each dataset as well as supporting files. Users can edit separate datasets simultaneously. Each dataset file can be up to 1 terabyte in size and the File geodatabase is not limited to the Windows operating system. Esri recommends the File geodatabase format over Personal geodatabases. See GeoDB_file for a more detailed description of this storage option.
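
As a rough illustration of how the three storage options might be told apart on disk, the following Python sketch guesses the option from a path's suffix. Only the .mdb extension is stated in the text above; the .gdb folder suffix and the .sde connection-file suffix are common Esri naming conventions included here as assumptions, and the function name guess_storage_option is hypothetical. A suffix is only a hint: an .mdb file may be an ordinary Microsoft Access database that holds no geodatabase at all.

```python
from pathlib import PurePath

# Assumed naming conventions. Only ".mdb" comes from the description above;
# ".gdb" (folder) and ".sde" (connection file) are assumptions for illustration.
STORAGE_BY_SUFFIX = {
    ".mdb": "personal geodatabase (single Microsoft Access file)",
    ".gdb": "file geodatabase (folder of files)",
    ".sde": "enterprise geodatabase (connection file to a DBMS)",
}


def guess_storage_option(path: str) -> str:
    """Guess the geodatabase storage option from the path suffix alone."""
    suffix = PurePath(path).suffix.lower()
    return STORAGE_BY_SUFFIX.get(suffix, "unknown")


if __name__ == "__main__":
    # Hypothetical names, used only to exercise the lookup.
    for name in ("parcels.mdb", "Hydro.gdb", "production.sde", "roads.shp"):
        print(name, "->", guess_storage_option(name))
```

A check of this kind could serve as a first triage step when surveying a collection; confirming the actual contents still requires software that can read the format.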

Geodatabases can be exported from ArcGIS as GeoDB_XML workspaces.

The primary mechanism used in a geodatabase to organize and use geographic information in ArcGIS is the dataset. Three primary dataset types are used: feature classes; raster datasets; and attribute tables. Creating a collection of these dataset types is the first step in designing and building a geodatabase. Users typically start by building a number of these fundamental dataset types. They then add to or extend their geodatabase with more advanced capabilities (such as by adding topologies, networks, or domain-specific schemas) to model GIS behavior, maintain data integrity and work with a set of spatial relationships. See An overview of geodatabase design.

Production phase The Esri geodatabase model and its several storage options can be used to support any active stage of the lifecycle (creation and editing, data sharing and transfer, and distribution to end users) of a collection of closely related datasets.
Relationship to other formats
    Has subtype GeoDB_File, GeoDB, ESRI Geodatabase (File-based). The file-based geodatabase is one option for data storage for a single-user Esri Geodatabase. It is implemented as a collection of binary files in a file system.
    Has subtype GeoDB_SDE, GeoDB, ESRI Geodatabase ArcSDE. The spatial database engine is the multi-user and/or enterprise option for data storage for an Esri Geodatabase.
    Has subtype Esri Personal geodatabase. An option for data storage for a single-user Esri geodatabase that is implemented as a single Microsoft Access file. Esri recommends file geodatabases over Microsoft Access Personal geodatabases, because they offer more functionality and better performance. The Personal geodatabase format is not described separately on this website.
    Affinity to GeoDB_XML, ESRI Geodatabase (XML). Exchange format used by ArcGIS to import and export all items and data in a geodatabase such as domains, rules, feature datasets, and topologies.
    Affinity to ArcInfo_Coverage, ESRI ArcInfo Coverage. GeoDB replaced ArcInfo_Coverage for coverage data. In ArcGIS software releases subsequent to 8.3, ArcInfo_Coverage datasets were no longer editable. Coverage instances must be imported and stored in an Esri geodatabase to be editable.

Local use

LC experience or existing holdings  
LC preference  

Sustainability factors

Disclosure A proprietary data framework used for Esri GIS software applications. Partial documentation is available.
    Documentation

Partial documentation is available in Esri ArcGIS Help. Listed immediately below are versions from ArcGIS 9.2 (2006), ArcGIS 9.3 (2008), ArcGIS 10.3 (2014), and ArcGIS 10.7 (2019). See Format Specifications below for the latest versions of these pages.

  • An overview of the geodatabase: 9.2, 9.3, 10.3, 10.7
  • The architecture of a geodatabase: 9.2, 9.3, 10.3, 10.7
  • Types of geodatabases (comparing the different storage options for Esri geodatabases): 9.2, 9.3, 10.3, 10.7
  • An overview of geodatabase system tables: 10.3, 10.7

The compilers of this resource have noticed differences among the versions of these documents, but because the documents are written at a high level, the differences do not appear to reflect underlying technical differences in the format structures. However, The Simplified Geodatabase Schema in ArcGIS 10 makes it clear that substantial changes in the underlying table structure occurred between ArcGIS versions 9.x and 10.x. Likewise, Client and geodatabase compatibility suggests that the details of the format structures have been modified or extended over time to accommodate new ArcGIS functionality. Comments welcome.

A general description from 2008 for the XML Schema for the Geodatabase is available.

Adoption

The geodatabase data model was introduced by Esri in the late 1990s with the release of ArcGIS version 8.0. The release of the ArcGIS suite constituted a major change in Esri's software offerings, aligning all their client and server products under one software architecture known as ArcGIS, developed using Microsoft Windows COM standards. While the Esri shapefile (see ESRI_shape) is still used widely in the industry, at least for sharing and transferring datasets among different systems, the Esri geodatabase has become a mechanism of choice for data sharing and data interoperability among organizations and among departments within a single organization. Most of the GIS software market share that Esri holds (approximately 36 percent worldwide as of 2002 and over 45 percent as of January 2019) is held by ArcGIS products that use and support the Esri geodatabase formats. See Useful References below for resources indicating market share at various dates.

Esri's Industries menu page shows the range of industries in which Esri ArcGIS products are deployed.

See Wikipedia article on ArcGIS and COTS GIS: The Value of a Commercial Geographic Information System for more information on adoption.

See also the geodatabase subtypes described in GeoDB_file, GeoDB_SDE, and GeoDB_XML for more on adoption of particular Esri geodatabase storage options. Note that the Personal geodatabase storage option, which uses a single file in the Microsoft Access .mdb format (superseded in 2007 as the default format for Microsoft Access), is not supported in the ArcGIS Pro product, which is a 64-bit desktop application released in 2015. See ArcGIS Pro: Types of geodatabases.

    Licensing and patents Resources available at Esri | Master Agreements; Products and Services Terms of Use detail the terms of use for Esri GIS software.
Transparency Transparency depends on the storage option used.
Self-documentation The Esri geodatabase format supports the application of metadata and requires specifications of data types for attribute data. Semantic descriptions of a dataset and its attributes (variables) are optional.
External dependencies Full use of Esri geodatabases requires use of Esri software products. Enterprise (SDE) geodatabases require use of the Esri ArcSDE spatial database engine with one of a variety of DBMS implementations, including IBM DB2, IBM Informix, Oracle, PostgreSQL, and Microsoft SQL Server. See GeoDB_SDE. Personal geodatabases based on the Microsoft Access .mdb file format were the first form of geodatabase, introduced in ArcGIS 8.0; use requires Esri software and Microsoft Access. Esri advises against editing personal geodatabases directly in Access.

Esri provides products for developers to develop custom extensions or stand-alone applications using the ArcGIS framework. See ArcGIS Engine Developer Kit. Starting in June 2011, Esri provided an API that supports limited exploration, manipulation, and extraction of data in Esri file geodatabases. See GeoDB_File.

Technical protection considerations Whether technological protection can be applied will depend on the storage option used.

Quality and functionality factors

GIS images and datasets
Normal functionality

The Esri geodatabase data model allows users to take advantage of both basic and advanced spatial analysis when GIS data is stored within the geodatabase. Complex business logic can be applied to GIS data to create more detailed and accurate spatial data models that represent real-world GIS application workflows. Examples include land parcel management; natural resources management; river and stream system modeling; utility network system modeling, such as gas, water, and sewage pipelines; and three-dimensional surface modeling of the landscape.

By storing feature classes within a feature dataset, geospatial relationships can be modeled between the feature classes, enabling more advanced GIS analysis. The more common types of geospatial relationship data structures in the geodatabase are:

  • Topology -- Defines and enforces data integrity rules for features. For example, there should be no gaps between polygons. It supports topological relationship queries and navigation, such as feature adjacency or connectivity and sophisticated feature editing tools, and allows feature construction from unstructured geometry (for example, constructing polygon features from line features).
  • Geometric Network -- Consists of a set of connected edges and junctions (line and point features) that, along with connectivity rules, are used to represent and model the behavior of a common network infrastructure in the real world. Water distribution, electrical lines, gas pipelines, telephone services, and water flow in a stream are all examples of resource flows that can be modeled and analyzed using a geometric network.
  • Network Dataset -- Consists of a set of connected edges and junctions, as well as turn features, along with connectivity rules, that represent and model the behavior of transportation network systems. Highways, roads, and streets in a city; rail lines; and bus routes are examples of undirected network flows that can be modeled with a network dataset.
  • Terrain -- A data structure that is generated from a mass collection of elevation measurement points, typically from remote-sensing data sources. It is a triangulated irregular network (TIN)-based data structure with multiple levels of resolution and is used to represent surface morphology. A terrain is used for 3D surface modeling applications.
  • Cadastral Fabric -- A continuous surface of connected parcel features that represents the record of survey for an area of land. This data structure enables GIS data to be integrated with survey data to maintain a consistent and accurate survey record.
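
To make the idea of connected edges and junctions concrete, the Python sketch below stores a small network as an adjacency map and runs a breadth-first connectivity trace from one junction. It is a minimal illustration only, not Esri's implementation: it ignores connectivity rules, flow direction, and geometry, and the junction names and the functions build_network and trace_connected are hypothetical.

```python
from collections import defaultdict, deque


def build_network(edges):
    """Build an undirected adjacency map from (junction_a, junction_b) edges."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def trace_connected(adjacency, start):
    """Return the set of junctions reachable from `start` (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        junction = queue.popleft()
        # .get avoids creating entries for junctions that have no edges.
        for neighbor in adjacency.get(junction, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


if __name__ == "__main__":
    # Hypothetical water network: two pipes form one component, and a third
    # pipe forms a separate, disconnected component.
    pipes = [("pump", "valve"), ("valve", "tank"), ("well", "hydrant")]
    network = build_network(pipes)
    print(sorted(trace_connected(network, "pump")))
```

A trace of this kind answers questions such as which junctions are affected when a given valve is closed; the geodatabase structures listed above layer rules and geometry on top of this basic connectivity.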

Additional business logic in the geodatabase, in the form of subtypes and attribute domains, can also be applied to GIS data. Subtypes enable categorization of data in a table or feature class. For example, the streets in a streets feature class could be categorized into three subtypes: local streets, collector streets, and arterial streets. Such business logic in the geodatabase helps streamline data entry and ensure the integrity of a user's GIS data. The Esri geodatabase data model is designed to enable users to leverage and optimize their GIS data to its full potential and maintain a consistent, accurate repository. See The Geodatabase: Modeling and Managing Spatial Data for more information.
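
The effect of subtypes and attribute domains on data entry can be sketched in a few lines of Python. The example below uses the three street subtypes named above; the integer subtype codes, the speed_limit attribute, and the range values are invented for illustration and are not taken from Esri documentation. The sketch only mimics the kind of validation a geodatabase applies; it does not use any Esri API.

```python
from dataclasses import dataclass
from typing import List

# Subtype names come from the text above; the integer codes are hypothetical.
STREET_SUBTYPES = {1: "local", 2: "collector", 3: "arterial"}

# Hypothetical range domain: allowed speed limits (low, high) per subtype code.
SPEED_LIMIT_DOMAIN = {1: (15, 35), 2: (25, 45), 3: (35, 65)}


@dataclass
class Street:
    name: str
    subtype: int
    speed_limit: int


def validate(street: Street) -> List[str]:
    """Return a list of rule violations; an empty list means the row is valid."""
    if street.subtype not in STREET_SUBTYPES:
        return [f"unknown subtype code {street.subtype}"]
    low, high = SPEED_LIMIT_DOMAIN[street.subtype]
    if not low <= street.speed_limit <= high:
        label = STREET_SUBTYPES[street.subtype]
        return [
            f"speed_limit {street.speed_limit} outside domain "
            f"[{low}, {high}] for {label} streets"
        ]
    return []


if __name__ == "__main__":
    print(validate(Street("Elm Street", 1, 25)))   # valid local street
    print(validate(Street("Ring Road", 3, 90)))    # violates the range domain
```

Because the allowed range depends on the subtype, a single attribute can carry different rules for different categories of feature, which is how such logic helps keep entered data consistent.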


File type signifiers and format identifiers

Tag | Value | Note
Filename extension | See note. | File signifiers depend on geodatabase storage option.

Notes

General

The answer Esri provided for What is a geodatabase? in association with ArcGIS Pro, released in 2015, demonstrates the inclusive and complex nature of the model and its implementation in Esri products:

  • The geodatabase is the native data structure for ArcGIS and is the primary data format used for editing and data management. While ArcGIS works with geographic information in numerous geographic information system (GIS) file formats, it is designed to work with and leverage the capabilities of the geodatabase.
  • It is the physical store of geographic information, primarily using a DBMS or file system. You can access and work with this physical instance of your collection of datasets either through ArcGIS or through a database management system using SQL.
  • Geodatabases have a comprehensive information model for representing and managing geographic information. This information model is implemented as a series of tables holding feature classes and attributes. In addition, advanced GIS data objects add real world behavior; rules for managing spatial integrity; and tools for working with spatial relationships of the core features and attributes.
  • Geodatabase software logic provides the common application logic used throughout ArcGIS for accessing and working with all geographic data in a variety of files and formats. This supports working with the geodatabase, and it includes working with shapefiles, computer-aided drafting (CAD) files, triangulated irregular networks (TINs), grids, imagery, Geography Markup Language (GML) files, and numerous other GIS data sources.
  • Geodatabases have a transaction model for managing GIS data workflows.

See also The geodatabase is object-relational for an explanation from 2008 of the object-relational model behind GeoDB.

Migration tools have been included in the ArcGIS software suite since the introduction of the geodatabase formats, to import data from other Esri formats. See Migrating Coverages to Geodatabases (2001) and Migrating your existing data into the Geodatabase (2008).

A variant format implemented for the Esri geodatabase model is employed under the covers of the Collector for ArcGIS app used for data collection in the field. As explained in How To: Access offline edits from Collector for ArcGIS directly from an Android or iOS device, Collector for ArcGIS stores offline replicas in a SQLite database or runtime geodatabase (as files with the .geodatabase extension) before they are synchronized to an online geodatabase via a feature service. In the event the offline edits cannot be synchronized, the locally stored edits can be extracted from the mobile device and converted to a File geodatabase on a local personal computer using tools provided by Esri. This geodatabase variant is not considered a storage option. See Useful References below for resources related to this variant.
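
Because the offline replicas are described as SQLite databases, a first look inside one may be possible with general-purpose SQLite tooling. The Python sketch below lists the table names in a SQLite file using only the standard library. It assumes that a given .geodatabase file can be opened by a stock SQLite build, which is not confirmed here; the function name list_tables and the file name in the usage example are hypothetical, and the internal table layout of the variant is not documented in this description.

```python
import sqlite3
from pathlib import Path
from typing import List


def list_tables(db_path: str) -> List[str]:
    """Return the names of all tables in a SQLite database file, sorted."""
    # Open read-only through a file: URI so the replica cannot be modified.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Hypothetical path to a replica copied from a mobile device.
    replica = "offline_edits.geodatabase"
    if Path(replica).exists():
        print(list_tables(replica))
```

Opening the file read-only is a deliberate choice for preservation work: inspection should not alter a replica that may still need to be converted with the tools Esri provides.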

History

Prior to the development of the ArcGIS data model and software suite, Esri developed the Arc/INFO (now usually written as ArcInfo) workstation and various GUI-based products for a suite known as ArcView GIS. In 1999, Esri released ArcGIS 8.0 to provide a single integrated software architecture that included the geodatabase, an object-relational model. All subsequent ArcGIS products to date have used that model. As of June 2020, the most recent generation of ArcGIS application software is the multi-threaded 64-bit ArcGIS Pro, released in January 2015 in conjunction with version 10.3 of existing products, such as ArcMap. See the Esri announcement at ArcGIS 10.3 and ArcGIS Pro Modernize GIS for Organizations and Enterprises.

More information about the history of ArcGIS products can be found in the Wikipedia entry for ArcGIS.


Format specifications


Useful references

URLs


Last Updated: 07/02/2020