Table Extension Specification

This document explains the Table Extension to the SpatioTemporal Asset Catalog (STAC) specification. It can be used with the projection extension to describe geospatial tabular data.

A Collection or Item can describe tabular data assets using a list of Column Objects. Additionally, Collections can describe many tabular datasets using Table Objects.

Fields

The fields in the table below can be used in these parts of STAC documents:

  • Collections
  • Item Properties (incl. Summaries in Collections)
  • Assets (for both Collections and Items, incl. Item Asset Definitions in Collections)

| Field Name | Type | Description |
| ---------- | ---- | ----------- |
| table:columns | [Column Object] | A list of Column Objects describing each column. |
| table:primary_geometry | string | The primary geometry column name. |
| table:primary_datetime | string | The primary date/time column name. |
| table:row_count | number | The number of rows in the dataset. |

table:primary_geometry

This is the column name of the "primary" or "active" geometry. Libraries like geopandas and sf use it to control which geometry column is used. When a STAC Item uses both the projection and table extensions, it's understood that the values in proj:epsg, proj:bbox, etc. that (implicitly) apply to the asset refer to the primary_geometry column.
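As a minimal sketch of how a client might read these fields, the snippet below builds a hypothetical Item properties dictionary and looks up the Column Object named by table:primary_geometry. The column names, the types, and the helper primary_geometry_column are illustrative assumptions and are not defined by the extension.

```python
from typing import Any, Dict, Optional

# Hypothetical Item properties; column names, types and row count are made up.
properties: Dict[str, Any] = {
    "table:columns": [
        {"name": "id", "type": "int64", "description": "Row identifier."},
        {"name": "geometry", "type": "byte_array", "description": "Footprint geometry."},
        {"name": "centroid", "type": "byte_array", "description": "Centroid geometry."},
    ],
    "table:primary_geometry": "geometry",
    "table:row_count": 1000,
}


def primary_geometry_column(props: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the Column Object named by table:primary_geometry, or None."""
    name = props.get("table:primary_geometry")
    if name is None:
        return None
    for column in props.get("table:columns", []):
        if column.get("name") == name:
            return column
    return None


column = primary_geometry_column(properties)
print(column["name"])  # prints "geometry"
```

With geopandas, the resolved name could then be passed to GeoDataFrame.set_geometry to make that column the active geometry.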


The fields in the table below can be used in these parts of STAC documents:

  • Assets (for both Collections and Items, incl. Item Asset Definitions in Collections)

| Field Name | Type | Description |
| ---------- | ---- | ----------- |
| table:storage_options | Map<string, any> | DEPRECATED. Additional keywords for opening the dataset. |

table:storage_options

This can be used with fsspec to specify additional keywords necessary to open the data. For example, an asset might use {"account_name": "ai4edataeuwest"} to indicate that the asset is in the ai4edataeuwest storage account. Libraries like adlfs use this information to open the dataset.
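The following sketch shows one way a client could pull the href and the storage options out of an asset before handing them to a file-system library. The asset href is a placeholder and the helper open_arguments is hypothetical; only the account_name value comes from the example above.

```python
from typing import Any, Dict, Tuple

# Hypothetical asset; the href is a placeholder path.
asset: Dict[str, Any] = {
    "href": "abfs://container/path/to/table.parquet",
    "table:storage_options": {"account_name": "ai4edataeuwest"},
}


def open_arguments(asset: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Return the asset href and a copy of its storage options (empty if absent)."""
    href = asset["href"]
    options = dict(asset.get("table:storage_options", {}))
    return href, options


href, storage_options = open_arguments(asset)

# With fsspec and adlfs installed, the values could be forwarded, for example:
#   fsspec.open(href, **storage_options)
#   pandas.read_parquet(href, storage_options=storage_options)
```

Copying the options into a new dictionary keeps later changes by the caller from modifying the STAC document in memory.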

A potential alternative for storage options could be the Storage Extension.


The fields in the table below can be used in these parts of STAC documents:

  • Collections
  • Item Properties (incl. Summaries in Collections)
  • Assets (for both Collections and Items, incl. Item Asset Definitions in Collections)

They can be used to catalog a collection of tables, where each table is stored as an Item, without having to include column-level metadata from each table on the Collection.

| Field Name | Type | Description |
| ---------- | ---- | ----------- |
| table:tables | Map<string, Table Object> | DEPRECATED. A mapping of table names to Table Objects (see below). |

Column Object

Column objects contain information about each column in the table.

| Field Name | Type | Description |
| ---------- | ---- | ----------- |
| name | string | REQUIRED. The column name. |
| description | string | Detailed multi-line description to explain the column. CommonMark 0.29 syntax MAY be used for rich text representation. |
| type | string | Native data type of the column. If using a file format with a type system (like Parquet), we recommend you use those types. |

Other properties such as description, license, unit, data_type and statistics from STAC common metadata can be used in the Column Object.

It is also recommended to add vector:geometry_types from the Vector Extension to columns that describe geometry data, e.g. the column identified by the table:primary_geometry field.

type and data_type describe the same information, but type should use the native name in the given file format and data_type describes the standardized data type name according to the STAC specification.

Columns can also include additional information from other extensions that are not otherwise covered on the asset-level and are column specific, e.g. projection extension information for additional geometry columns.
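To illustrate how type and data_type can sit side by side, here is a hedged sketch of a table:columns list together with a small check for the one required field, name. The column names, the Parquet physical types chosen for type, and the helper find_invalid_columns are assumptions made for this example.

```python
from typing import Any, Dict, List

# Hypothetical columns: "type" holds a native (here Parquet) type name,
# "data_type" the standardized STAC data type, where one applies.
columns: List[Dict[str, Any]] = [
    {"name": "plot_id", "type": "INT64", "data_type": "int64",
     "description": "Identifier of the survey plot."},
    {"name": "height", "type": "DOUBLE", "data_type": "float64", "unit": "m"},
    {"name": "geometry", "type": "BYTE_ARRAY",
     "vector:geometry_types": ["Polygon"]},
]


def find_invalid_columns(columns: List[Dict[str, Any]]) -> List[int]:
    """Return the indices of Column Objects whose required name is missing or not a string."""
    return [
        index
        for index, column in enumerate(columns)
        if not isinstance(column.get("name"), str)
    ]


print(find_invalid_columns(columns))  # prints []
```

This check covers only the required name field; validating examples against the extension's JSON schema remains the authoritative test.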

Table Object

DEPRECATED: Table objects contain high-level summaries about a table.

| Field Name | Type | Description |
| ---------- | ---- | ----------- |
| name | string | REQUIRED. The table name. |
| description | string | Detailed multi-line description to explain the table. CommonMark 0.29 syntax MAY be used for rich text representation. |

Best Practices

STAC allows for some flexibility in how to catalog assets. In general, we recommend using the following hierarchy:

  • Use Collection objects to catalog a dataset consisting of one or more tables.
  • Use Item objects to catalog an individual table.

For a dataset consisting of a single table or many tables with the same schema (for example gbif, which provides snapshots of the same database at different points in time), you might include table:columns on the Collection itself, or on both the Collection and the Items.

For datasets with many tables (for example, USF Forest Inventory and Analysis), we recommend cataloging the columns at just the Item level in table:columns on each Item.
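The two layouts above can be sketched as follows. The fallback rule used by columns_for_item (prefer the Item's own table:columns, otherwise use the Collection's) is one possible client convention assumed for this example; the extension does not define it. All identifiers and column names are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical Collection whose Items all share one schema.
collection: Dict[str, Any] = {
    "type": "Collection",
    "id": "example-snapshots",
    "table:columns": [
        {"name": "species", "type": "string"},
        {"name": "geometry", "type": "binary"},
    ],
}

# An Item that relies on the Collection-level schema.
item_without_columns: Dict[str, Any] = {
    "type": "Feature",
    "id": "snapshot-2021",
    "properties": {},
}

# An Item that carries its own schema, as recommended for many-table datasets.
item_with_columns: Dict[str, Any] = {
    "type": "Feature",
    "id": "plots",
    "properties": {"table:columns": [{"name": "plot_id", "type": "int64"}]},
}


def columns_for_item(item: Dict[str, Any], collection: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the Item's table:columns, falling back to the Collection's (assumed convention)."""
    item_columns = item.get("properties", {}).get("table:columns")
    if item_columns is not None:
        return item_columns
    return collection.get("table:columns", [])


print([c["name"] for c in columns_for_item(item_without_columns, collection)])
# prints ['species', 'geometry']
print([c["name"] for c in columns_for_item(item_with_columns, collection)])
# prints ['plot_id']
```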

Contributing

All contributions are subject to the STAC Specification Code of Conduct. For contributions, please follow the STAC specification contributing guide. Instructions for running tests are copied here for convenience.

Running tests

The same checks that run on PRs are part of the repository and can be run locally to verify that changes are valid. To run tests locally, you'll need npm, which is a standard part of any node.js installation.

First, install the dependencies with npm. This only needs to be done once. Navigate to the root of this repository and run the following on your command line:

npm install

Then to check markdown formatting and test the examples against the JSON schema, you can run:

npm test

This prints the same output that you see in the online checks, and you can then go and fix your markdown or examples.

If the tests reveal formatting problems with the examples, you can fix them with:

npm run format-examples
