- Title: Table
- Identifier: https://stac-extensions.github.io/table/v1.2.0/schema.json
- Field Name Prefix: table
- Scope: Item, Collection
- Extension Maturity Classification: Pilot
- Owner: @TomAugspurger
This document explains the table Extension to the SpatioTemporal Asset Catalog (STAC) specification. It can be used with the projection extension to describe geospatial tabular data.
A Collection or Item can describe tabular data assets using a list of Column objects. Additionally, Collections can describe many tabular datasets using Table objects.
- Examples:
- Item example: Shows the basic usage of the extension in a STAC Item
- Collection example: Shows the basic usage of the extension in a STAC Collection
- JSON Schema
- Changelog
The fields in the table below can be used in these parts of STAC documents:
- Collections
- Item Properties (incl. Summaries in Collections)
- Assets (for both Collections and Items, incl. Item Asset Definitions in Collections)
| Field Name | Type | Description |
|---|---|---|
| table:columns | [Column Object] | A list of Column Objects describing each column. |
| table:primary_geometry | string | The primary geometry column name. |
| table:primary_datetime | string | The primary date/time column name. |
| table:row_count | number | The number of rows in the dataset. |
table:primary_geometry is the column name of the "primary" or "active" geometry. Libraries like geopandas and sf use it
to control which geometry column is used. When a STAC Item uses both the projection and table extensions, it's understood that the
values in proj:epsg, proj:bbox, etc. that (implicitly) apply to the asset refer to the table:primary_geometry column.
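As a rough illustration of how these fields fit together, the following Python sketch builds the properties of a hypothetical Item and checks that the primary geometry and datetime names refer to declared columns. The column names, types and row count are made up, and the consistency check (`undeclared_primary_columns`) is an assumption about good practice, not a rule taken from the extension's JSON Schema.

```python
# Hypothetical Item properties using the table extension fields.
properties = {
    "table:columns": [
        {"name": "geometry", "description": "Building footprint", "type": "byte_array"},
        {"name": "observed_at", "description": "Observation time", "type": "int64"},
        {"name": "height", "description": "Height in meters", "type": "double"},
    ],
    "table:primary_geometry": "geometry",
    "table:primary_datetime": "observed_at",
    "table:row_count": 1000,
}


def undeclared_primary_columns(props):
    """Return the primary_* fields whose value is not a declared column name."""
    declared = {column["name"] for column in props.get("table:columns", [])}
    problems = []
    for field in ("table:primary_geometry", "table:primary_datetime"):
        value = props.get(field)
        # A missing field is fine; only a name that matches no column is reported.
        if value is not None and value not in declared:
            problems.append(field)
    return problems


print(undeclared_primary_columns(properties))
```

An empty list means both primary fields point at columns listed in table:columns.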
The fields in the table below can be used in these parts of STAC documents:
- Assets (for both Collections and Items, incl. Item Asset Definitions in Collections)
| Field Name | Type | Description |
|---|---|---|
| table:storage_options | Map<string, any> | DEPRECATED Additional keywords for opening the dataset. |
This can be used with fsspec to specify additional keywords
necessary to open the data. For example, an asset might use {"account_name": "ai4edataeuwest"} to indicate that the asset is
in the ai4edataeuwest storage account. Libraries like adlfs use this information to open the dataset.
A potential alternative for storage options could be the Storage Extension.
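A minimal sketch of how a client might read this field from an asset is shown below. The asset href is hypothetical, and the helper name `split_open_arguments` is introduced here for illustration only. The sketch stops short of opening the file so that it runs without fsspec or adlfs installed.

```python
def split_open_arguments(asset):
    """Return (href, storage_options) for an asset dictionary."""
    href = asset["href"]
    # Copy the mapping so callers can modify it without changing the asset.
    storage_options = dict(asset.get("table:storage_options", {}))
    return href, storage_options


# Hypothetical asset; only the account name comes from the text above.
asset = {
    "href": "abfs://container/path/data.parquet",
    "table:storage_options": {"account_name": "ai4edataeuwest"},
}

href, storage_options = split_open_arguments(asset)

# With fsspec (and adlfs for abfs:// URLs) installed, the values could be
# passed along as: fsspec.open(href, **storage_options)
print(href, storage_options)
```

Assets without table:storage_options yield an empty dictionary, so the same call pattern works for publicly readable data.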
The fields in the table below can be used in these parts of STAC documents:
- Collections
- Item Properties (incl. Summaries in Collections)
- Assets (for both Collections and Items, incl. Item Asset Definitions in Collections)
They can be used to catalog a collection of tables, where each table is stored as an Item, without
having to include column-level metadata from each table on the Collection.
| Field Name | Type | Description |
|---|---|---|
| table:tables | Map<string, Table Object> | DEPRECATED A mapping of table names to Table Objects (see below). |
Column objects contain information about each column in the table.
| Field Name | Type | Description |
|---|---|---|
| name | string | REQUIRED. The column name. |
| description | string | Detailed multi-line description to explain the column. CommonMark 0.29 syntax MAY be used for rich text representation. |
| type | string | Native data type of the column. If using a file format with a type system (like Parquet), we recommend you use those types. |
Other properties such as description, license, unit, data_type and statistics from
STAC common metadata
can be used in the Column Object.
It is also recommended to add vector:geometry_types from the Vector Extension
to any column that describes geometry data, e.g. the column identified by the table:primary_geometry field.
type and data_type describe the same information, but type should use the native name in the given file format and data_type describes the standardized data type name according to the STAC specification.
Columns can also include additional information from other extensions that are not otherwise covered on the asset-level and are column specific, e.g. projection extension information for additional geometry columns.
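The distinction between type and data_type can be sketched as follows. The mapping from Parquet physical type names to STAC data_type values is an assumption made for this example and is not defined by the extension. The helper `make_column` is hypothetical, and anything not covered by the mapping falls back to "other".

```python
# Assumed mapping from Parquet physical type names to STAC data_type values.
PARQUET_TO_STAC = {
    "INT32": "int32",
    "INT64": "int64",
    "FLOAT": "float32",
    "DOUBLE": "float64",
}


def make_column(name, native_type, description=None):
    """Build a Column Object with both the native and the standardized type."""
    column = {"name": name, "type": native_type}
    if description is not None:
        column["description"] = description
    # Types without a standardized equivalent in the mapping become "other".
    column["data_type"] = PARQUET_TO_STAC.get(native_type.upper(), "other")
    return column


columns = [
    make_column("height", "DOUBLE", "Height in meters"),
    make_column("geometry", "BYTE_ARRAY"),
]
print(columns)
```

Here type keeps the name used by the file format, while data_type carries the standardized name that clients can compare across formats.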
DEPRECATED: Table objects contain high-level summaries about a table.
| Field Name | Type | Description |
|---|---|---|
| name | string | REQUIRED. The table name. |
| description | string | Detailed multi-line description to explain the table. CommonMark 0.29 syntax MAY be used for rich text representation. |
STAC allows for some flexibility in how to catalog assets. In general, we recommend using the following hierarchy:
- Use Collection objects to catalog a dataset consisting of one or more tables.
- Use Item objects to catalog an individual table.
For a dataset consisting of a single table or many tables with the same schema (for example
gbif, which provides snapshots of the same database at
different points in time), you might include table:columns on the Collection itself, or on both the Collection and its Items.
For datasets with many tables (for example, USF Forest Inventory and Analysis),
we recommend cataloging the columns at just the Item level in table:columns
on each Item.
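One way to apply this guidance programmatically is sketched below: if every Item declares the same table:columns, the shared list can also be placed on the Collection; otherwise the columns stay on the Items only. The helper `shared_columns` and the sample Items are hypothetical, and strict equality of the column lists is an assumed definition of "same schema".

```python
def shared_columns(items):
    """Return the table:columns common to all items, or None if they differ."""
    schemas = [item["properties"].get("table:columns") for item in items]
    if not schemas or schemas[0] is None:
        return None
    first = schemas[0]
    if all(schema == first for schema in schemas[1:]):
        return first
    return None


columns = [{"name": "species", "type": "BYTE_ARRAY"}]
# Two hypothetical snapshots of the same table.
items = [
    {"id": "snapshot-1", "properties": {"table:columns": columns}},
    {"id": "snapshot-2", "properties": {"table:columns": columns}},
]

collection = {"id": "example-collection"}
common = shared_columns(items)
if common is not None:
    # All tables share one schema, so describe it once on the Collection.
    collection["table:columns"] = common
print(collection)
```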
All contributions are subject to the STAC Specification Code of Conduct. For contributions, please follow the STAC specification contributing guide. Instructions for running tests are copied here for convenience.
The same checks that run on PRs are part of the repository and can be run locally to verify that changes are valid.
To run tests locally, you'll need npm, which is a standard part of any node.js installation.
First you'll need to install everything with npm once. Just navigate to the root of this repository and on your command line run:
npm install
Then to check markdown formatting and test the examples against the JSON schema, you can run:
npm test
This will spit out the same texts that you see online, and you can then go and fix your markdown or examples.
If the tests reveal formatting problems with the examples, you can fix them with:
npm run format-examples