Sustainability of Digital Formats: Planning for Library of Congress Collections


Introduction to Geospatial Resources and Formats


Scope
The digital formats described within the geospatial category of this web site can be used by geographic information systems (GIS) or other software applications to access, visualize, manipulate, and analyze geospatial data. Resources in these formats are primarily geospatial, i.e., they focus upon conveying information about the Earth, the location of specific features, and attributes and properties of those geo-located features. As such, the formats included in this geospatial content category will often comprise information in the following three forms: (i) raster or bit-mapped images; (ii) vector images consisting of points, lines, and polygons; and (iii) data values that express attributes associated with geographic locations or features. A defining characteristic for geospatial resources is that they are intended to be used by computer systems that enable spatial analysis.

The intended audience for this web site is the librarian, archivist, and/or data manager responsible for preserving digital resources. This essay is an introduction to geospatial formats and GIS functionality for the generalist or specialist geospatial data manager rather than for a GIS domain expert actively using the geospatial data. See also the summary overview Geospatial Content: Quality and Functionality Factors.

The descriptions of geospatial formats on this web site are intended to support the preservation of the data and its documentation and metadata, as received by a digital archive. The preservation goal is to facilitate future viewing or rendering of the data, and, to the extent known and agreed upon by the geospatial community, enable the re-use, re-analysis, and/or re-compilation of the data in the future.

Characteristics of Geospatial Formats
Geospatial resources are composed of data about geo-located features represented primarily by images (raster or vector) and tables or grids of observed or calculated attributes. (For a useful overview, see GIS Data Types: Vector vs. Raster [1].) Increasingly, geospatial formats include geospatially focused datasets or databases that contain primary information about a geographic location. In addition, ancillary and supplemental data that either are included or can be derived using spatial analysis are considered necessary for the full and effective functioning, interpretation and re-use of the data.

Geospatial formats have been and are continuously being specified and adopted by governmental organizations, software vendors, and standards-making bodies. These formats are often based on specifications or standards that are more general, e.g., for still images (raster and vector) and for datasets. For information about such formats and associated factors for assessing quality and functionality, see Still Images: Quality and Functionality Factors and Datasets: Quality and Functionality Factors. Common to geospatial formats are the capabilities for accurate representation of the described resource’s location on the Earth using basic and inherent conceptual mechanisms such as georeferencing, scale, precision and accuracy.

Georeferencing. Georeferencing has been defined as the establishment of a relationship between information (e.g., documents, datasets, maps, images, biographical information) and geographic locations through mechanisms such as the addition of place labels (e.g., place codes or toponyms) or the assignment of geographic coordinates. (See the glossary in Linda Hill's Georeferencing: The Geographic Associations of Information [2], p. 228.)

Georeferencing must be understood as a multi-part process involving the concepts of geographic coordinates and two or three dimensional map projections. All of the terms in the following list bear on georeferencing.

  • Coordinates. Geographic coordinates locate points in space. Two of the most commonly used geographic coordinate systems are the latitude/longitude system and the Universal Transverse Mercator (UTM) system, which place points on grids that divide sections of the Earth by various means.
  • Projection. Map projection is necessary to correctly map points on Earth; the term refers to the process of mapping a three dimensional Earth onto a two dimensional planar surface such as a paper map or digital GIS. The Earth’s surface is not only curved but is curved irregularly, thus making projection necessary. A declaration of the projection system that was used at time of creation is important although conversion to another projection that will be used in an existing projection framework is often done regardless of the original projection. For a useful overview, see Projections: What You Need To Know for GIS [3].
  • Datum. Different approaches are used to represent the Earth’s surface in a regular way that allows for mathematical calculation. The set of parameters and control points that are used to accurately pinpoint the three dimensional shape of the Earth is called a datum. Each datum is based on a particular view of the earth’s shape as a spheroid or ellipsoid (the science of measuring the shape of the Earth is called geodesy). Along with its unit of measurement (e.g., feet or meters), a datum will vary depending upon whether it looks at the Earth from a primarily horizontal point of view (for a horizontal datum) or a vertical point of view (for a vertical datum), and where it begins the grid, i.e., its prime meridian. For a useful overview, see Datum: What You Need To Know for GIS [4].
          The fact that a particular datum was used is an essential characteristic of a geospatial resource. A declaration of which datum was used in data creation / collection is GIS metadata that should be represented in the data structure of a geospatial format. Users of a dataset may wish to transform the data from one datum to another for purposes of harmonization to a common view (shared by projects or sites, for instance). This is done using datum transformation methods often facilitated by a GIS or performed by external systems and hopefully documented within the GIS metadata that exists for a given dataset.
  • Map Scale. The concept of map scale is also important to be able to judge how to view the data within a geospatial format. Defined as the ratio between linear distance on a resource and the corresponding distance on the surface being mapped, it can involve both grain (the size of a pixel and the smallest resolvable unit) and extent (the size of the study area and the largest resolvable unit). (See the glossary in Linda Hill's Georeferencing: The Geographic Associations of Information [2], p. 231, and Scale in GIS: What You Need To Know for GIS [5].) Information about scale where present and about the readability by humans or machines should be recorded and preserved to facilitate normal functionality.
          Map scale is also related to the important concepts of accuracy and precision as they are reflected in the data, and built into the data structures of a given geospatial format. Each of these factors is important to understand about geospatial data in order to judge its appropriateness for a planned use.
  • Accuracy. Accuracy in the geospatial context generally measures how close an observed and recorded value is to the true value. Spatial accuracy is measured in four primary ways: positional accuracy, attribute accuracy, logical consistency, and completeness. It is by positional accuracy that one knows how close the recorded location is to the real location. Attribute accuracy allows one to measure (by means of error and percentage calculations, usually) how close to reality are the attributes recorded for a described location. Logical consistency reflects the presence, absence or frequency of inconsistent data, usually determined by comparison of themes or juxtaposition of facts. Finally, completeness describes how fully the data describes the location and features about the location that it is intending to represent. (See Paul Bolstad's GIS Fundamentals: A First Text on Geographic Information Systems [6], p. 516)
  • Precision. Precision in this context refers to the consistency of a measurement method, measured by how often the same results are achieved when measuring. It is usually defined in terms of how dispersed a set of repeat measurements are from the average measurement. (See Paul Bolstad's GIS Fundamentals: A First Text on Geographic Information Systems [6], p. 517).
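
The Coordinates and Map Scale items in the list above can be made concrete with a little arithmetic. The following is a minimal sketch, not a substitute for a GIS library: the function names are hypothetical, the UTM zone calculation uses the standard 6-degree zone rule and ignores the special-case zones around Norway and Svalbard, and the scale function simply applies the ratio between map distance and ground distance.

```python
import math


def utm_zone(longitude_deg: float) -> int:
    """Return the standard 6-degree UTM zone number (1-60) for a longitude.

    Assumption: the special-case zones near Norway and Svalbard are ignored.
    """
    if not -180.0 <= longitude_deg <= 180.0:
        raise ValueError("longitude must be in [-180, 180]")
    # Longitude +180 is clamped to zone 60 rather than a non-existent zone 61.
    return min(int(math.floor((longitude_deg + 180.0) / 6.0)) + 1, 60)


def ground_distance_m(map_distance_cm: float, scale_denominator: int) -> float:
    """Convert a distance measured on a map (cm) to metres on the ground.

    A scale of 1:24,000 has scale_denominator = 24000.
    """
    return map_distance_cm * scale_denominator / 100.0


# Example: a point at longitude 77 degrees west, and a 2.5 cm line on a 1:24,000 map.
zone = utm_zone(-77.0)
distance = ground_distance_m(2.5, 24000)
```

If the declared scale or coordinate system is missing from a resource's metadata, neither calculation can be made with confidence, which is why the declarations matter for preservation.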

In analysing geospatial formats, it is important to understand how or whether a given format provides the means for documenting and calculating the accuracy and precision of measurement, description, and placement of the feature. Particularly when data are to be re-used, re-computed, or appended in time series or by similar or different instrumentation, it is of primary importance that such measures of the quality of the data be documented. Data quality is important to assess from the perspective of a preservation goal of being able to reproduce or replicate the data for purposes of re-use, and to further scientific experimentation. Contributing to an assessment of data quality are the concepts of provenance and lineage for data. The term provenance concerns the factual establishment of data authorship and is important in assessing the authenticity and accuracy of data. The term lineage goes beyond provenance to include discussion of the source, methods, and timing of the data. All three characteristics of geospatial resources--discussed in more detail below--should be recorded as metadata to enable future users to assess the fitness of a geospatial resource for a particular purpose.
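
The difference between accuracy and precision can be illustrated numerically. The sketch below assumes repeated measurements of a single quantity (for instance, the elevation of one survey point) whose true value is known. It uses root-mean-square error as one possible accuracy measure and the standard deviation as one possible precision measure; the function names and sample values are invented for illustration.

```python
import math
import statistics


def rmse(measurements, true_value):
    """Accuracy: root-mean-square distance of the measurements from the true value."""
    squared_errors = [(m - true_value) ** 2 for m in measurements]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def spread(measurements):
    """Precision: dispersion of repeat measurements around their own average."""
    return statistics.pstdev(measurements)


TRUE_ELEVATION_M = 100.0

# Tightly clustered but offset from the truth: precise, not accurate.
biased_instrument = [103.0, 103.1, 102.9, 103.0]
# Centered on the truth but scattered: more accurate on average, less precise.
noisy_instrument = [98.0, 102.0, 99.0, 101.0]

biased_report = (rmse(biased_instrument, TRUE_ELEVATION_M), spread(biased_instrument))
noisy_report = (rmse(noisy_instrument, TRUE_ELEVATION_M), spread(noisy_instrument))
```

A format that can carry both figures alongside the data lets a future user distinguish an instrument with a systematic offset from one that is merely noisy.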

GIS Metadata and Data Documentation
What makes geospatial formats different from other formats is the fundamental capability of placing information in relation to the surface of the Earth either both horizontally and vertically, or horizontally with height or depth implicit. Important characteristics to be recorded in metadata include geographic coordinates, projection, scale, and datum, since the digital resources are not comprehensible or usable with normal functionality without them. Even a simple raster image that is a product of a dataset or other format, such as a simple PDF or bit-mapped image of a map, should include information that allows the viewer to determine how accurately the map corresponds to its geographic "reality," or to determine its accurate location from the data behind the map.
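
A data manager could check for the four characteristics named above with a simple completeness test. The sketch below is illustrative only: the class and field names are hypothetical and do not correspond to element names in FGDC, ISO 19115, or any other content standard.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Tuple


@dataclass
class SpatialReferenceMetadata:
    """Hypothetical minimal record of the spatial reference elements."""
    # Geographic coordinates as a bounding box: west, south, east, north.
    bounding_box: Optional[Tuple[float, float, float, float]] = None
    projection: Optional[str] = None
    datum: Optional[str] = None
    scale_denominator: Optional[int] = None


def missing_elements(record: SpatialReferenceMetadata) -> List[str]:
    """Return the names of elements that were not recorded."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


# Example: a scanned map whose datum was never documented.
scanned_map = SpatialReferenceMetadata(
    bounding_box=(-77.12, 38.79, -76.91, 38.99),
    projection="Transverse Mercator",
    scale_denominator=24000,
)
gaps = missing_elements(scanned_map)
```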

Content Standards. Community-based content standards exist for geospatial data such as the U.S. Federal Geographic Data Committee’s Content Standard for Digital Geospatial Metadata (FGDC) [7] and the broader ISO standard for geographic information, ISO 19115:2003 [8]. In the U.S., FGDC’s content standard elements have been adopted more widely as a result of being incorporated into commonly used software products by such domain giants as ESRI and GeoMedia. (ESRI uses a "profile" of the FGDC standard rather than a native FGDC XML schema.) ISO 19115:2003 is slowly being adopted by more U.S. federal agencies and international agencies and is beginning to be incorporated into common software packages.

Both the FGDC and the ISO 19115 content standards describe a number of characteristics of data including descriptive, technical, source, and preservation metadata. Descriptive metadata is necessary for identification, citation, and currency assessment of the data. Technical metadata describes the spatial references (projection, datum, and geographic coordinates). Preservation metadata includes the environment characteristics associated with the creation of the data, processing history, and provenance / lineage of the data sources and final output. Within the geospatial community, there is increased use of community-derived ontologies for controlled values of semantic concepts such as provenance and data quality, instrument descriptions, algorithm expressions, and workflow processes.

Content-specific metadata and data documentation can be expressed or noted within a given data format in terms of community-based content standards (such as ISO 19115, FGDC, SensorML [9], and UncertML [10]) and community-built ontologies. Such information is useful not only for sharing and comparing by subsequent data users, but also for purposes of replicative compilation and/or computation to prove or extend scientific research, and for data extension in the case of open or serial or time series additions to a data set. In addition, it’s useful to know the form of expression for the metadata and documentation, i.e., whether it is expressed in a well-known XML schema or RDF ontology, as CSV spreadsheets or relational database tables and attributes included with the data, or as links to external reports, ontologies or web-based services. Noting whether standards-based metadata can be included or referred to within a format is useful for sharing and comparing between subsequent data users. To ascertain which software products are compliant with OGC standards, see the OGC Product Registry [11].

Quality Factors. For some geospatial resources such as satellite data, it is critical to proper understanding and use of the data to include information about the quality of the data as well as its provenance and lineage. For example, for satellite images the percentage of cloud cover is a significant quality characteristic. Such information is critical not only to understanding what the data says, but also to understand appropriate and inappropriate uses of the data.

Levels of Data Quality. There can be many levels at which data quality is and should be documented. For example, at the product level, it is key to know how closely the data represents the actual geophysical state given the output from different instruments. Another quality level would be at the pixel level, where the algorithms used to create the data points are noted as well as an assessment of the usability of those data points. At the granule level, a statistical roll-up of pixel-level data is compiled. This kind of computation could be important to validate the model used. For example, climate change data models can have grids of contiguous data tagged with uncertainty statistics for each grid cell, thus providing the means to assign quantitative risk factors or uncertainty levels to different mitigation scenarios. Examples of data quality reports for a data set can be found at NASA's Surface Meteorology and Solar Energy: Accuracy [12]. The assessment of bias is a key data quality factor, i.e., bias that is generated from the instruments used (instrumental bias), or from the type of sampling or observations made that provide the view of the data produced. In addition, an assessment of appropriate and/or inappropriate use is often considered to be an important data quality consideration.
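
The roll-up from pixel-level to granule-level quality can be sketched as a simple tally. The example below assumes each pixel carries a single quality flag; the flag names ("good", "cloud", "fill") and the function name are invented for illustration and do not reflect any particular product's quality scheme.

```python
from collections import Counter


def granule_quality_summary(pixel_flags):
    """Roll pixel-level quality flags up into granule-level fractions."""
    counts = Counter(flag for row in pixel_flags for flag in row)
    total = sum(counts.values())
    return {flag: counts[flag] / total for flag in sorted(counts)}


# A hypothetical 2 x 4 granule of per-pixel quality flags.
pixel_flags = [
    ["good", "good", "cloud", "good"],
    ["good", "cloud", "fill", "good"],
]
summary = granule_quality_summary(pixel_flags)
# The "cloud" fraction corresponds to the percentage of cloud cover,
# the satellite image quality characteristic mentioned earlier.
```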

Provenance and Lineage. Documentation about the provenance of data in terms of factual establishment of its authorship is usually considered to be quite important in order to determine the authenticity of the data, and to some extent its accuracy. For example, knowing the name of the organization and/or person(s) responsible for the creation and/or collection of data may help ascertain whether or why certain features are or are not present, such as roads or buildings on a map of a city. A data consumer would have more confidence in the accuracy of such a map if it had been created by the city’s data center rather than by a student at a local college or university. Another example of describing the provenance of data is the tracking of what instrumentation was used to generate or record the data and the algorithms used to calculate the data output.

The term data lineage is often considered to encompass provenance in the sense of authorship, but can include discussion of source, methods, and timing of the data as it has been created, derived and/or subset over time and into different products. For example, MODIS data is characterized as being generated at various levels, starting with Level 1 which is closest to the raw data output by satellites and other remote sensing instruments, and is rarely used by itself. Level 2 data is derived from Level 1, and may involve a subset of data from certain instruments, or for certain time periods or locations. The crunching or compilation of Level 2 data often results in more specific products that can be used by themselves for various purposes (educational, policymaking, etc.), and are considered Level 3. Ideally, a Level 3 product would include documentation about its lineage going back to the Level 1 data.
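
One way to picture lineage documentation is as a chain of records, each pointing back to the product it was derived from. The sketch below is a hypothetical data structure, not a representation used by MODIS or by any metadata standard; the product names and method descriptions are invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Product:
    """Hypothetical lineage record for one data product."""
    name: str
    level: int
    method: str
    derived_from: Optional["Product"] = None


def lineage(product: Product) -> List[str]:
    """Walk from a product back to its earliest documented source."""
    chain = []
    current = product
    while current is not None:
        chain.append(f"Level {current.level}: {current.name} ({current.method})")
        current = current.derived_from
    return chain


level1 = Product("calibrated radiances", 1, "instrument output, calibrated")
level2 = Product("surface reflectance, one region", 2, "spatial subset", level1)
level3 = Product("monthly vegetation map", 3, "temporal composite", level2)

history = lineage(level3)
```

If any link in the chain is undocumented, the walk stops short of Level 1, which is exactly the gap a future user would need to know about.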

GIS Functionality: Introduction
In addition to support for locating a digital resource in relation to the earth’s surface, geospatial formats facilitate the methods used in geospatial information systems to view, access, manipulate, analyze and query the geospatial data represented. Whether focused upon an entire resource, e.g., a geo-referenced image, or attribute values within a resource, there are certain inherent activities that a geospatial format normally should allow a data consumer to perform. Some aspects of spatial analysis are basic, regardless of specific format (i.e., for vector, raster or attribute data). Some analysis techniques are particularly appropriate for raster formats, such as terrain analysis using raster digital elevation models (DEMs). Other techniques are more appropriate for vector formats or attribute data. To determine functionality for geospatial resources, one can ask what does a geospatial data consumer normally want to "do" with the data either immediately, or in the future?

One important feature of GIS systems is the display or printing of maps. Vector data such as geographic coordinates, points, lines, and polygons that describe areas are stored in mathematical form. This data can be used by GIS systems, or by vector graphics software, to print or to display on screen using scalable shapes, labels, and legends. When a user wishes to transform vector data into a raster format, the data structure of the original vector format should facilitate that basic need.

GIS Functionality: Types of Analysis and Uses for GIS Resources
The following sections describe the most important kinds of activities undertaken by users of geospatial resources. Various geospatial formats will have varying levels of fitness for each of these activities. In our descriptions of specific formats, we highlight the ways in which a given format will function in terms of these activities. The description that follows begins with basic spatial analysis and moves on to describe more specialized types of analysis: spatial interpolation and estimation, grid-based analysis, and the role of geospatial datasets. We also note the overlap with the functionality required of two non-GIS formats: still images (especially multispectral images) and datasets (containing data that is not specifically geospatial).

Basic Spatial Analysis. A fundamental activity associated with normal functionality for a geospatial resource involves the reconciling of multiple data sources to the same or compatible geo-referenced locations as represented by the spatial data (coordinate information that describes the resource’s geography), and the attribute data (the non-spatial characteristics describing the resource), and documented by the GIS metadata. In determining the fitness of a format for basic spatial analysis, we must bear in mind that some of the basic analysis techniques described below are only appropriate for vector or attribute data.

Typically, the reconciliation will take the form of performing operations on both the spatial data and the attribute data, if necessary, that allow the resource to become associated or converted to a (different) datum, map projection, and measurement units. Once any necessary reconciliation is done, the data are ready for further spatial analysis including sorting/selection, classification and other operations. Brief descriptions of some types of spatial analysis considered to be part of normal functionality follow.

  • Selection. Operations that enable the identification of one or more conditions or criteria, and the subsequent return of features or sets of features meeting the criteria. Implicit in these operations are the visualization of the spatial data, viewing their metadata, querying their attributes, and the capability of creating one or more layers as output. Querying may be done either by on-screen viewing capabilities within the GIS software, or by set or Boolean algebra mechanisms (using Structured Query Language, or SQL, for example). Other mechanisms enabling selection include the establishment of adjacency or containment conditions for either the spatial data or the attributes.
  • Classification. Operations that enable features or sets of features to be categorized based on a set of conditions. Classification operations may be done to help group the data for better visualization in the form of a map or other display format, or as the basis for more complex calculations or analysis. Typically, the grouping of the data is done based on the attributes describing the data.
  • Dissolves. Operations that combine similar features in a layer rather than differentiate them, often based on classifications that have already been done on the data or their attributes. Dissolve functions are usually applied to data in order to remove information that is unnecessary to understand, simplify, or process the data.
  • Proximity/Buffering. Operations that help measure the distance between features of interest. Proximity functions may cause existing attribute information to be modified, or to be added. One very common method of establishing proximity among features is by buffering, i.e., establishing an area that is less than or equal to a designated distance from one or more features. Buffers may be specified for point, line or area features, and for either raster or vector data. (See Paul Bolstad's GIS Fundamentals: A First Text on Geographic Information Systems [6], p. 342-3). Specific mechanisms for establishing buffers will differ depending upon the format of the base data.
  • Overlays/Layering. Operations that combine spatial and attribute data from different data sources, vertically merging them into a single layer that enables more complex querying about a given problem. Particularly with overlays, it is critical to ensure that the georeferencing information among the data sources is compatible so that features being merged align correctly.
  • Network Analysis. Operations that define a set of connected features by means of one or more network links that provide the paths between features. The paths can be measured and analyzed for various purposes including cost and time to traverse, and other factors related to resource allocation, for example.
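
Three of the operations listed above (selection, classification, and buffering) can be sketched on a handful of point features. The example assumes planar coordinates in metres, so that straight-line distance is meaningful; the feature records, attribute names, and class thresholds are all invented for illustration, and a real GIS would also handle lines, polygons, and geodetic distances.

```python
import math

# Hypothetical point features: spatial data (x, y) plus one attribute.
features = [
    {"name": "Well A", "x": 0.0, "y": 0.0, "depth_m": 12.0},
    {"name": "Well B", "x": 300.0, "y": 400.0, "depth_m": 45.0},
    {"name": "Well C", "x": 900.0, "y": 1200.0, "depth_m": 80.0},
]


def select(items, predicate):
    """Selection: return the features that meet a condition."""
    return [f for f in items if predicate(f)]


def classify(feature):
    """Classification: assign a category based on an attribute value."""
    depth = feature["depth_m"]
    if depth < 20:
        return "shallow"
    if depth < 60:
        return "medium"
    return "deep"


def within_buffer(items, center, radius_m):
    """Buffering: features no farther than radius_m from a center point."""
    cx, cy = center
    return [f for f in items if math.hypot(f["x"] - cx, f["y"] - cy) <= radius_m]


deep_wells = [f["name"] for f in select(features, lambda f: f["depth_m"] > 40)]
classes = {f["name"]: classify(f) for f in features}
nearby = [f["name"] for f in within_buffer(features, (0.0, 0.0), 500.0)]
```

Note that the buffer test only gives sensible answers because all features share one coordinate system and unit, which is the reconciliation step described earlier.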

Spatial Interpolation and Estimation. Some geospatial formats support more specialized spatial analyses that use statistical techniques to provide additional data points, especially when the complete extent of the data is unknown because data points are sparse, lost, or unobserved. These techniques are also used when changing the size of a grid, especially to a smaller cell size. The analysis can provide an estimate of a fuller extent of data points within a given sample. Some formats support the generation and writing of derived data from these statistical techniques back into a resource, usually with a calculated error or accuracy rate included. Some of the statistical methods include the following:

  • Spatial Interpolation. Allows the prediction of variable values at unknown locations based on sampling of the same kind of variables at known locations. This method is often used to estimate air and water temperatures, soil moisture, elevation, population density, etc.
  • Spatial Prediction. Uses spatial interpolation but often includes other variables as well as a total set of other measurements. For instance, a map of elevation values may be combined with a set of measured temperatures to estimate temperatures at elevated locations where the temperature values are unknown. Typically, this technique is used to generate more data points, lines or areas, i.e., to fill in information when samples do not cover the range of information desired.
  • Establishment of Core Areas. Uses a set of known samples to predict the frequency or likelihood of the occurrence of a feature (such as an event or object). The method is used to identify high use, density or intensity of features as well as the probability of occurrence for a variable or event.
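
As an illustration of spatial interpolation, the sketch below uses inverse distance weighting (IDW), one common technique among many; the text above does not name a specific method, so the choice of IDW, the function name, and the sample values are assumptions made for the example.

```python
import math


def idw(samples, x, y, power=2.0):
    """Estimate a value at (x, y) by inverse distance weighting.

    samples is a list of (sample_x, sample_y, value) tuples at known locations.
    Nearer samples receive larger weights; power controls how quickly
    influence falls off with distance.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(sx - x, sy - y)
        if distance == 0.0:
            # The query point coincides with a sample: use the observed value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical temperature readings (degrees C) 10 units apart.
readings = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
midpoint_estimate = idw(readings, 5.0, 0.0)
near_first_estimate = idw(readings, 2.0, 0.0)
```

An estimate produced this way is derived data, so a format that writes it back into the resource should also record the method and parameters (here, the power value) used to produce it.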

Grid-based Analysis. Grid-based analysis is a GIS functionality that begins by identifying an area of interest and dividing it into rectangular cells based on geo-location (using a known datum and projection). The cells contain data values from a variety of sources and are stored in a format designed to hold gridded data. The values are then available for various forms of spatial and statistical analysis. Grids contain information that can range from geographic coordinates to reflectance values from solar radiation hitting surface features. Since grid capability enhances the utility of geospatial data, format descriptions document when and how a format supports the use of gridded data.

Increasingly, geospatial data users transform their data to formats with more capability for grid-based analysis. For instance, a vector format containing point data may be transformed into a grid-capable format in order to perform area analysis over a broader geographic spectrum. To fully understand the implications of these kinds of data transformations, it is critical that the GIS metadata for the output of the grid-based analysis describe the factors that might impact the potential accuracy (error rate) of the data output such as the cell size and unit of measurement.
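
The effect of cell size on a vector-to-grid transformation can be seen in a small sketch that counts points per cell. The function name and sample points are hypothetical, coordinates are assumed to be planar and in the same unit as the cell size, and row 0 is taken to be the row nearest the grid origin.

```python
def points_to_grid(points, origin, cell_size, n_rows, n_cols):
    """Count how many points fall in each cell of a regular grid.

    Points outside the grid extent are ignored.
    """
    origin_x, origin_y = origin
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int((x - origin_x) // cell_size)
        row = int((y - origin_y) // cell_size)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            grid[row][col] += 1
    return grid


observations = [(5.0, 5.0), (15.0, 5.0), (16.0, 7.0), (25.0, 25.0)]

# A 3 x 3 grid of 10-unit cells keeps some of the spatial pattern.
fine = points_to_grid(observations, (0.0, 0.0), 10.0, 3, 3)
# A single 30-unit cell reduces the same points to one count.
coarse = points_to_grid(observations, (0.0, 0.0), 30.0, 1, 1)
```

The two outputs come from identical input, which is why the cell size and unit of measurement need to be recorded with the gridded result.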

  • Use of Vector Grids. Vector data is usually accompanied by attribute tables that can be the object of grid analysis based on simple mathematical functions such as summing, averaging, and identifying means and medians for values in a row. Some of the same functions described in the Basic Spatial Analysis section work for both vector and raster data, such as the use of map algebra, where values of cells are combined by addition, subtraction or multiplication, or the computation of statistics (such as majority, mean, and maximum). These kinds of operations can be done at a number of different levels including local functions (using data from one cell only to produce the output for another single cell), regional (data from a set of cells to produce the output for a single cell, i.e., within a neighborhood, aka "focal," or within a zone, aka "zonal"), or global (all data from a raster is operated upon as output for a single cell).
  • Use of Raster Grids. Powerful functions can be found in formats that deal with raster data. Terrain analysis is a very common use of the grids found in raster data sets. For example, a comparison of cell values using spatial functions would allow the calculation of a number of useful variables for a location including height (elevation above a base), slope (the rise of the land relative to a horizontal distance), aspect (the downhill direction of the steepest slope), profile curvature (the curvature of the land parallel to the slope direction), plan curvature (the curvature of the land perpendicular to the slope direction), and visibility (site obstruction from given viewpoints). This kind of information is useful to determine or to predict broader characteristics of a location on the Earth such as temperature, vegetation, erosion, water flow acceleration and convergence, and viewshed, to cite a few examples. Digital elevation models (DEMs), typically raster data sets, can be used to create contour lines (connected lines of uniform elevation that run at right angles to the slope) on a surface. Most people are familiar with topographic maps like those from the United States Geological Survey (USGS) upon which contour lines are displayed.
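
The local, focal, zonal, and global levels of map algebra described above can be sketched on a small grid held as a list of rows. The function names, the 3 x 3 neighborhood, and the sample values are assumptions for illustration; production tools operate on much larger arrays and handle missing data.

```python
def local_add(grid_a, grid_b):
    """Local: each output cell depends only on the matching input cells."""
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(grid_a, grid_b)]


def focal_mean(grid, row, col):
    """Focal: mean of the 3 x 3 neighborhood around a cell, clipped at edges."""
    rows = range(max(0, row - 1), min(len(grid), row + 2))
    cols = range(max(0, col - 1), min(len(grid[0]), col + 2))
    values = [grid[r][c] for r in rows for c in cols]
    return sum(values) / len(values)


def zonal_mean(values, zones):
    """Zonal: mean of the values within each labeled zone."""
    totals, counts = {}, {}
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            totals[zone] = totals.get(zone, 0) + value
            counts[zone] = counts.get(zone, 0) + 1
    return {zone: totals[zone] / counts[zone] for zone in totals}


def global_max(grid):
    """Global: one output derived from every cell in the raster."""
    return max(max(row) for row in grid)


elevation = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
offset = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
zones = [["a", "a", "b"], ["a", "b", "b"], ["c", "c", "c"]]

raised = local_add(elevation, offset)
center_mean = focal_mean(elevation, 1, 1)
corner_mean = focal_mean(elevation, 0, 0)
zone_means = zonal_mean(elevation, zones)
highest = global_max(elevation)
```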

Geospatial Datasets. For formats that support geospatial datasets, the means used to establish and maintain the relationships among the constructs of a dataset are important. These relationships are useful to know for the full extent of the data within the dataset (as well as selected subsets of the data) such as the attribute tables and features they describe, and the location information that places them on the Earth. In addition, very complex geospatial datasets may be open, and thus a capability may be needed to append data on a rolling or ongoing basis, or to set up relationships among data series based on time or instrumentation sources. It is important to document the extent and mechanism a format uses to maintain relationships among the parts of and the output from a geospatial dataset. In addition, because the primary and secondary or related parts of a geospatial data resource may not necessarily be connected by a given format, our format description documents will note when a community-based aggregation format is warranted to keep the dataset together. Examples of community-based aggregation formats include SDTS (Spatial Data Transfer Standard [13]), SAFE (Standard Archive Format for Europe [14]), XFDU (XML Formatted Data Unit [15]), DDI (Data Documentation Initiative [16]), METS (Metadata Encoding and Transmission Standard [17]) or one of the various flavors of ZIP [18].

  • Support for Specialized Software Interfaces. One of the most important sustainability factors for a format is Adoption. A key sign of adoption is support by a variety of software applications or the existence of a well-supported software toolkit or software library, especially a non-proprietary one. For some geospatial datasets it may be advantageous to use a format that supports a community-specific API (application programming interface). Such software interfaces are particularly important for extracting data subsets, data manipulation and transformation, or domain-specific functionality. Software interfaces that support open standards and compatibility with mechanisms for accomplishing common functions promulgated by the OGC (Open Geospatial Consortium [19]) are of particular value. For example, OGC has specified a number of web services, such as the OpenGIS Web Map Service, and the OpenGIS Web Coverage Service Interface Standard.
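
To give a sense of what a standards-based web service interface looks like in practice, the sketch below assembles a Web Map Service GetMap request URL using the key-value parameters defined for WMS version 1.3.0. The endpoint URL and layer name are placeholders, the helper function is hypothetical, and the sketch only builds the URL without contacting any server; CRS:84 is used so that the bounding box is given in longitude, latitude order.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def wms_getmap_url(base_url, layer, bbox, width, height):
    """Build a WMS 1.3.0 GetMap request URL (no network access is made).

    bbox is (min_lon, min_lat, max_lon, max_lat) in CRS:84.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and layer name, for illustration only.
url = wms_getmap_url(
    "https://example.org/wms", "elevation", (-78.0, 38.0, -76.0, 40.0), 512, 512
)
query = parse_qs(urlparse(url).query, keep_blank_values=True)
```

Because the parameter names are fixed by the standard, the same request shape works against any compliant server, which is part of what makes such interfaces a sign of adoption.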

Functionality Shared with Other Categories. The functionality associated with different geospatial formats will vary depending upon whether the format is used for raster data, vector data, or attribute data. For example, using a raster still image, one could identify general locations within an image, but precise location is facilitated by using Boolean logic to query attribute data associated with the raster image. The following sections discuss some of the functionality shared with other formats:

  • Multispectral Images. This factor is shared with still images. Multispectral images such as those generated by satellite and aerial photography are used more and more frequently for geospatial functions. Using techniques associated with grid-based analysis (see the discussion above), relatively wide and uniform data is made available in the form of grids. The grids often take advantage of the inherent capability of cameras and scanners to detect electromagnetic energy in ranges from below, through and past human eyesight including infrared and thermal (heat). This imagery collects the values associated with electromagnetic energy generated from solar radiation that bounces back or reflects from objects and features on the Earth. Measured in wavelengths (the distance between peaks in the electromagnetic stream), different objects and features will have different reflectance values, some seen as light visible to the human eye and some not. The known human visible range has been further divided into different color bands based on how humans perceive them, i.e., red, green, and blue. By isolating and combining the values from different color bands as well as the non-visible bands, it is possible to identify objects and features within the image at a certain point in time, both visually and by calculation.
          Following identification, objects and features can be grouped using classification techniques, and indices and models can be built, thus creating land cover and land use data layers that can be used for various purposes such as vegetation indexes and ocean depth. The still image quality and functionality factor Support for multispectral bands is particularly important for the image formats used for many geospatial raster images.
          Other operations associated with the use of multispectral images include image restoration and rectification which can correct degraded image data, remove systematic distortions in images, and change the image’s projection system. Images may also be enhanced to improve the display, and subsequent capability for better visual interpretation and identification of objects and features.
  • Datasets. Dataset formats intended for geospatial datasets should satisfy not only the requirements for GIS functionality, as described above, but also the normal functionality for datasets in general. As stated in the general discussion titled Datasets: Quality and Functionality Factors, the basic functionality for a dataset format is to provide a structured representation of numeric or character-based values that permits automated manipulation of the structure and the representation. By storing the structured representation together with other information, a dataset format should facilitate the performance of calculations and transformations based on the values.
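
The combination of band values described under Multispectral Images above can be sketched in a few lines of Python. This example uses one widely used vegetation index, the Normalized Difference Vegetation Index (NDVI), computed per grid cell as (NIR - Red) / (NIR + Red). The choice of NDVI, the tiny sample grids, the function names, and the threshold of 0.3 are assumptions made for illustration only; they are not taken from any particular format or product described on this site.

```python
def ndvi(red, nir):
    """Return a grid of NDVI values for two equally sized band grids.

    Each grid is a list of rows of reflectance values. A cell whose two
    band values sum to zero has no defined NDVI and is returned as None.
    """
    if len(red) != len(nir):
        raise ValueError("band grids must have the same number of rows")
    result = []
    for red_row, nir_row in zip(red, nir):
        if len(red_row) != len(nir_row):
            raise ValueError("band grids must have the same number of columns")
        row = []
        for r, n in zip(red_row, nir_row):
            total = n + r
            row.append((n - r) / total if total != 0 else None)
        result.append(row)
    return result


def classify_vegetation(ndvi_grid, threshold=0.3):
    """Mark cells whose NDVI meets an assumed, illustrative threshold."""
    return [
        [value is not None and value >= threshold for value in row]
        for row in ndvi_grid
    ]


# Hypothetical 2 x 2 reflectance grids for the red and near-infrared bands.
red_band = [[0.1, 0.4], [0.0, 0.2]]
nir_band = [[0.5, 0.4], [0.0, 0.6]]

index = ndvi(red_band, nir_band)
print(index)
print(classify_vegetation(index))
```

The second function is a very simple stand-in for the classification step: it turns the continuous index into a yes/no layer, in the same spirit as the land cover layers mentioned above, although real classification workflows are considerably more involved.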

Summary.
Geospatial formats range from fairly simple raster image formats to quite complex database formats designed specifically to describe, and often visualize, features placed in specific locations on the Earth. The functionality supported by the various geospatial formats represents the degree to which geographic information systems and other computer systems can create, use, and modify the geospatial resources, and will be described within the geospatial format descriptions.

References

1. Geospatial Innovation Facility, University of California, Berkeley. GIS Data Types: Vector vs. Raster, accessed April 8, 2011. http://gif.berkeley.edu/documents/GIS_Data_Formats.pdf.

2. Hill, Linda L. Georeferencing: The Geographic Associations of Information. MIT Press: Cambridge, Massachusetts, 2006.

3. Geospatial Innovation Facility, University of California, Berkeley. Projections: What You Need to Know for GIS. Page 1 of document; accessed April 8, 2011. http://gif.berkeley.edu/documents/Projections_Datums.pdf.

4. Geospatial Innovation Facility, University of California, Berkeley. Datum: What You Need to Know for GIS. Page 2 of document; accessed April 8, 2011. http://gif.berkeley.edu/documents/Projections_Datums.pdf.

5. Geospatial Innovation Facility, University of California, Berkeley. Scale in GIS: What You Need to Know for GIS, accessed April 8, 2011. http://gif.berkeley.edu/documents/Scale_in_GIS.pdf.

6. Bolstad, Paul. GIS Fundamentals: a First Text on Geographic Information Systems. White Bear Lake, Minn: Eider Press, 2008.

7. Federal Geographic Data Committee. Content Standard for Digital Geospatial Metadata. FGDC-STD-001-1998, accessed April 8, 2011. http://www.fgdc.gov/standards/projects/FGDC-standards-projects/metadata/base-metadata/index_html.

8. ISO (International Organization for Standardization). Geographic information -- Metadata. ISO 19115:2003, accessed April 11, 2011. http://www.iso.org/iso/catalogue_detail.htm?csnumber=26020.

9. Open Geospatial Consortium Inc., Mike Botts, editor. OpenGIS Sensor Model Language (SensorML) Implementation Specification. OGC 07-000, July 17, 2007; accessed April 11, 2011. http://www.opengeospatial.org/standards/sensorml.

10. Williams, Matthew, Dan Cornford, Lucy Bastin, and Edzer Pebesma. Uncertainty Markup Language (UncertML): OpenGIS Discussion Paper. 08-122r2, accessed January 16, 2012. http://portal.opengeospatial.org/files/?artifact_id=33234.

11. Open Geospatial Consortium Inc. All Registered Products. OGC Product Registry, accessed April 8, 2011. http://www.opengeospatial.org/resource/products.

12. NASA Langley Research Center. NASA Surface Meteorology and Solar Energy: Accuracy. Page accessed April 8, 2011. http://power.larc.nasa.gov/cgi-bin/cgiwrap/solar/print.cgi?accuracy.txt.

13. American National Standards Institute (ANSI). Spatial Data Transfer Standard (SDTS). ANSI NCITS 320-1998, June 9, 1998, accessed April 11, 2011. http://mcmcweb.er.usgs.gov/sdts/standard.html.

14. European Space Agency. Standard Archive Format for Europe (SAFE). Page accessed April 8, 2011. http://earth.esa.int/SAFE/index.html.

15. ISO (International Organization for Standardization). Space Data And Information Transfer Systems -- XML Formatted Data Unit (XFDU) Structure And Construction Rules. ISO 13527:2010, 2010, accessed April 11, 2011. http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=53985.

16. DDI Alliance. Data Documentation Initiative (DDI) Technical Specification. Version 3.1, 2009, accessed April 8, 2011. http://www.ddialliance.org/Specification/.

17. Digital Library Federation. METS Metadata Encoding & Transmission Standard. Page accessed April 8, 2011. //www.loc.gov/standards/mets/.

18. Wikipedia. ZIP (file format), accessed April 8, 2011. http://en.wikipedia.org/wiki/ZIP_%28file_format%29.

19. Open Geospatial Consortium, Inc. Welcome to the OGC Website. Page accessed April 8, 2011. http://www.opengeospatial.org/.

* Georeferencing ought not be confused with georegistration and geocoding. Georegistration is the process of adjusting one drawing or image (the "target component") so that its features match the geographic locations of the same features on a "reference component," i.e., a drawing, image, surface, or map that is known to be correct. Geocoding is the process of determining geographic coordinates from other data, such as street addresses or place names.



Last Updated: 01/21/2022