Welcome to the HDMF-common Format Specification

Overview of hdmf-common

hdmf-common defines common data structures to be used across applications.

DynamicTable

The DynamicTable type is used to store tabular data. Tables are built column by column, with each column stored in its own VectorData object. Each row is assigned a unique id through the required id column of type ElementIdentifiers. The colnames attribute indicates the order of the columns.
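As a rough illustration of this columnar layout, the sketch below models a table as plain Python lists. It is not the HDMF API: the dictionary layout, the column values, and the helper get_row are hypothetical names chosen only for this example.

```python
# Hypothetical plain-Python stand-in for a DynamicTable; not the HDMF API.
table = {
    "colnames": ["name", "age"],       # order of the columns
    "id": [0, 1, 2],                   # required column of unique row ids
    "columns": {                       # one VectorData-like list per column
        "age": [34, 27, 45],
        "name": ["ann", "bob", "cy"],
    },
}


def get_row(table, row):
    """Return the cells of one row, ordered as listed in colnames."""
    return [table["columns"][name][row] for name in table["colnames"]]


row1 = get_row(table, 1)
```

Here colnames, not the storage order of the columns, decides the order in which the cells of a row are returned.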

VectorData

VectorData is the datatype used to store a column in a DynamicTable. If it is not paired with a VectorIndex object, the first dimension is the row dimension, which must have the same length across all of the columns in that DynamicTable.

Ragged Arrays

(also known as Jagged Arrays)

Sometimes, you want to have a 2-d array where each row of the array has a different number of elements. For instance, in neuroscience, when storing the action potential times of sorted neurons, you might want to store them as a neuron x times matrix, but the problem is that each neuron will have a different number of spikes, so the second dimension will be inconsistent.

[Figure: ragged array goal]

There are a number of possible solutions to this problem. Some solve it by NaN-padding the array. You could instead store the spike times of each neuron in a separate dataset, but that will not scale well if you have many neurons. In HDMF, you would store this using a pair of objects: a VectorData and a VectorIndex. The VectorData array holds all of the data concatenated as a 1-d array, and it is linked to a VectorIndex object that indexes the data, forming a map between the rows of the ragged array and the indices of VectorData.

[Figure: ragged arrays in HDMF]

These objects are generally stored inside a DynamicTable, and the elements of VectorIndex map onto the rows of the table. The VectorData object may be n-dimensional, but only the first dimension is ragged.
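The sketch below shows how a row of the ragged array can be recovered from the two objects. It uses plain Python lists rather than the HDMF API. The spike times and the helper ragged_row are made up for illustration, and the indexing follows the convention given in the VectorData overview (the first vector is VectorData[0:VectorIndex[0]]).

```python
# Spike times of three neurons, concatenated into one 1-d array
# (VectorData-like; the values are hypothetical).
spike_times = [0.1, 0.5, 0.9, 0.2, 0.3, 0.4, 0.7, 0.8, 1.1]

# End offset of each neuron's spikes within spike_times (VectorIndex-like).
spike_times_index = [3, 4, 9]


def ragged_row(data, index, i):
    """Return row i of the ragged array encoded by (data, index)."""
    start = 0 if i == 0 else index[i - 1]
    return data[start:index[i]]


neuron0 = ragged_row(spike_times, spike_times_index, 0)
neuron1 = ragged_row(spike_times, spike_times_index, 1)
neuron2 = ragged_row(spike_times, spike_times_index, 2)
```

Each neuron contributes one entry to the index regardless of how many spikes it has, so no padding is needed.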

Experimental data structures

The following data structures are currently available under the HDMF-experimental schema. These are subject to change! They are not guaranteed to exist in the future nor maintain backward compatibility.

ExternalResources

The ExternalResources type is used to store references to data stored in external, web-accessible databases. This information is maintained using four row-based tables.

Version v1.8.0 Jan 12, 2024

Format Overview

Namespace – HDMF Common

  • Description: Common data structures provided by HDMF

  • Name: hdmf-common

  • Full Name: HDMF Common

  • Version: 1.8.0

  • Authors:
    • Andrew Tritt

    • Oliver Ruebel

    • Ryan Ly

    • Ben Dichter

  • Contacts:
  • Schema:
    • doc: base data types

    • source: base.yaml

    • title: Base data types

    • doc: data types for a column-based table

    • source: table.yaml

    • title: Table data types

    • doc: data types for different types of sparse matrices

    • source: sparse.yaml

    • title: Sparse data types

Type Hierarchy

Type Specifications

Base data types

base data types

Data

Overview: An abstract data type for a dataset.

Container

Overview: An abstract data type for a group storing collections of data and metadata. Base type for all data and metadata containers.

SimpleMultiContainer

Overview: A simple Container for holding onto multiple containers.

SimpleMultiContainer extends Container and includes all elements of Container with the following additions or changes.

[Figure: SimpleMultiContainer]

Datasets, Links, and Attributes contained in <SimpleMultiContainer>

Id: <SimpleMultiContainer>
Type: Group
Description: Top level Group for <SimpleMultiContainer>

Id: .<Data>
Type: Dataset
Description: Data objects held within this SimpleMultiContainer.
  • Extends: Data
  • Quantity: 0 or more

Groups contained in <SimpleMultiContainer>

Id: <SimpleMultiContainer>
Type: Group
Description: Top level Group for <SimpleMultiContainer>

Id: .<Container>
Type: Group
Description: Container objects held within this SimpleMultiContainer.


Table data types

data types for a column-based table

VectorData

Overview: An n-dimensional dataset representing a column of a DynamicTable. If used without an accompanying VectorIndex, first dimension is along the rows of the DynamicTable and each step along the first dimension is a cell of the larger table. VectorData can also be used to represent a ragged array if paired with a VectorIndex. This allows for storing arrays of varying length in a single cell of the DynamicTable by indexing into this VectorData. The first vector is at VectorData[0:VectorIndex[0]]. The second vector is at VectorData[VectorIndex[0]:VectorIndex[1]], and so on.

VectorData extends Data and includes all elements of Data with the following additions or changes.

  • Extends: Data

  • Primitive Type: Dataset

  • Dimensions: [[‘dim0’], [‘dim0’, ‘dim1’], [‘dim0’, ‘dim1’, ‘dim2’], [‘dim0’, ‘dim1’, ‘dim2’, ‘dim3’]]

  • Shape: [[None], [None, None], [None, None, None], [None, None, None, None]]

  • Inherits from: Data

  • Subtypes: VectorIndex, DynamicTableRegion

  • Source filename: table.yaml

  • Source Specification: see Section 3.3.1

Datasets, Links, and Attributes contained in <VectorData>

Id: <VectorData>
Type: Dataset
Description: Top level Dataset for <VectorData>
  • Neurodata Type: VectorData
  • Extends: Data
  • Dimensions: [[‘dim0’], [‘dim0’, ‘dim1’], [‘dim0’, ‘dim1’, ‘dim2’], [‘dim0’, ‘dim1’, ‘dim2’, ‘dim3’]]
  • Shape: [[None], [None, None], [None, None, None], [None, None, None, None]]

Id: .description
Type: Attribute
Description: Description of what these vectors represent.
  • Data Type: text
  • Name: description

VectorIndex

Overview: Used with VectorData to encode a ragged array. An array of indices into the first dimension of the target VectorData, and forming a map between the rows of a DynamicTable and the indices of the VectorData. The name of the VectorIndex is expected to be the name of the target VectorData object followed by “_index”.
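The sketch below goes the other way: it flattens a list of variable-length rows into a data array and a matching index of end offsets. It uses plain Python, not the HDMF API. The helper names build_ragged and index_name are hypothetical, and only the "_index" suffix comes from the naming convention described above.

```python
from itertools import accumulate


def build_ragged(rows):
    """Flatten variable-length rows into (data, index).

    index[i] is the end offset of row i in data, so row i is
    data[index[i - 1]:index[i]] (with a start of 0 for the first row).
    """
    data = [value for row in rows for value in row]
    index = list(accumulate(len(row) for row in rows))
    return data, index


def index_name(target_name):
    """Expected name of the VectorIndex for a given VectorData name."""
    return target_name + "_index"


data, index = build_ragged([[1, 2], [], [3, 4, 5]])
name = index_name("spike_times")
```

An empty row simply repeats the previous offset in the index, so it needs no special handling.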

VectorIndex extends VectorData and includes all elements of VectorData with the following additions or changes.

  • Extends: VectorData

  • Primitive Type: Dataset

  • Data Type: uint8

  • Dimensions: [‘num_rows’]

  • Shape: [None]

  • Inherits from: VectorData, Data

  • Source filename: table.yaml

  • Source Specification: see Section 3.3.2

Datasets, Links, and Attributes contained in <VectorIndex>

Id: <VectorIndex>
Type: Dataset
Description: Top level Dataset for <VectorIndex>
  • Neurodata Type: VectorIndex
  • Extends: VectorData
  • Data Type: uint8
  • Dimensions: [‘num_rows’]
  • Shape: [None]

Id: .target
Type: Attribute
Description: Reference to the target dataset that this index applies to.
  • Data Type: object reference to VectorData
  • Name: target

ElementIdentifiers

Overview: A list of unique identifiers for values within a dataset, e.g. rows of a DynamicTable.

ElementIdentifiers extends Data and includes all elements of Data with the following additions or changes.

  • Extends: Data

  • Primitive Type: Dataset

  • Data Type: int

  • Dimensions: [‘num_elements’]

  • Shape: [None]

  • Default Name: element_id

  • Inherits from: Data

  • Source filename: table.yaml

  • Source Specification: see Section 3.3.3

DynamicTableRegion

Overview: DynamicTableRegion provides a link from one table to an index or region of another. The table attribute is a link to another DynamicTable, indicating which table is referenced, and the data is int(s) indicating the row(s) (0-indexed) of the target array. DynamicTableRegions can be used to associate rows with repeated meta-data without data duplication. They can also be used to create hierarchical relationships between multiple DynamicTables. DynamicTableRegion objects may be paired with a VectorIndex object to create ragged references, so a single cell of a DynamicTable can reference many rows of another DynamicTable.
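A minimal sketch of both uses, written with plain Python lists rather than the HDMF API. The electrodes table, its values, and the helpers resolve and ragged_refs are hypothetical. The region holds 0-indexed row numbers of the target table, and the paired index follows the same end-offset convention as VectorIndex.

```python
# Hypothetical target table with one row per electrode.
electrodes = {"id": [10, 11, 12], "location": ["CA1", "CA3", "DG"]}

# Region column: 0-indexed row numbers into `electrodes` (not id values).
region = [0, 2, 2, 1]

# Paired index that makes the references ragged:
# cell 0 -> region[0:1], cell 1 -> region[1:4].
region_index = [1, 4]


def resolve(table, rows, column):
    """Look up one column of the target table for the referenced rows."""
    return [table[column][r] for r in rows]


def ragged_refs(region, index, i):
    """Return the target row numbers referenced by cell i."""
    start = 0 if i == 0 else index[i - 1]
    return region[start:index[i]]


flat_locations = resolve(electrodes, region, "location")
cell1_locations = resolve(electrodes, ragged_refs(region, region_index, 1), "location")
```

The location "DG" is stored once in the target table even though several cells refer to it, which is the de-duplication the overview describes.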

DynamicTableRegion extends VectorData and includes all elements of VectorData with the following additions or changes.

  • Extends: VectorData

  • Primitive Type: Dataset

  • Data Type: int

  • Dimensions: [‘num_rows’]

  • Shape: [None]

  • Inherits from: VectorData, Data

  • Source filename: table.yaml

  • Source Specification: see Section 3.3.4

[Figure: DynamicTableRegion]

Datasets, Links, and Attributes contained in <DynamicTableRegion>

Id: <DynamicTableRegion>
Type: Dataset
Description: Top level Dataset for <DynamicTableRegion>
  • Neurodata Type: DynamicTableRegion
  • Extends: VectorData
  • Data Type: int
  • Dimensions: [‘num_rows’]
  • Shape: [None]

Id: .table
Type: Attribute
Description: Reference to the DynamicTable object that this region applies to.

Id: .description
Type: Attribute
Description: Description of what this table region points to.
  • Data Type: text
  • Name: description

DynamicTable

Overview: A group containing multiple datasets that are aligned on the first dimension (currently, this requirement is left up to APIs to check and enforce). These datasets represent different columns in the table. Apart from a column that contains unique identifiers for each row, there are no other required datasets. Users are free to add any number of custom VectorData objects (columns) here. DynamicTable also supports ragged array columns, where each element can be of a different size. To add a ragged array column, use a VectorIndex type to index the corresponding VectorData type. See documentation for VectorData and VectorIndex for more details. Unlike a compound data type, which is analogous to storing an array-of-structs, a DynamicTable can be thought of as a struct-of-arrays. This provides an alternative structure to choose from when optimizing storage for anticipated access patterns. Additionally, this type provides a way of creating a table without having to define a compound type up front. Although this convenience may be attractive, users should think carefully about how data will be accessed. DynamicTable is more appropriate for column-centric access, whereas a dataset with a compound type would be more appropriate for row-centric access. Finally, data size should also be taken into account. For small tables, performance loss may be an acceptable trade-off for the flexibility of a DynamicTable.
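The difference between the two layouts can be sketched in plain Python. The data and helper names below are hypothetical and are not part of the HDMF API. The alignment check stands in for the validation that the overview leaves to APIs, and it only covers columns that are not paired with a VectorIndex.

```python
# Array-of-structs: one record per row (the compound-type analogy).
records = [(0, "a", 1.5), (1, "b", 2.5), (2, "c", 3.5)]

# Struct-of-arrays: one list per column (the DynamicTable analogy).
columns = {"id": [0, 1, 2], "label": ["a", "b", "c"], "value": [1.5, 2.5, 3.5]}


def column_values(columns, name):
    """Column-centric access: a single lookup."""
    return columns[name]


def row_values(columns, row):
    """Row-centric access: has to touch every column."""
    return tuple(col[row] for col in columns.values())


def is_aligned(columns):
    """True if all (non-indexed) columns have the same number of rows."""
    return len({len(col) for col in columns.values()}) <= 1


values = column_values(columns, "value")
second_row = row_values(columns, 1)
```

Reading one column is a single lookup in the struct-of-arrays layout, while assembling one row visits every column. That is the access-pattern trade-off the overview refers to.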

DynamicTable extends Container and includes all elements of Container with the following additions or changes.

[Figure: DynamicTable]

Datasets, Links, and Attributes contained in <DynamicTable>

Id: <DynamicTable>
Type: Group
Description: Top level Group for <DynamicTable>

Id: .colnames
Type: Attribute
Description: The names of the columns in this table. This should be used to specify an order to the columns.
  • Data Type: text
  • Dimensions: [‘num_columns’]
  • Shape: [None]
  • Name: colnames

Id: .description
Type: Attribute
Description: Description of what is in this dynamic table.
  • Data Type: text
  • Name: description

Id: .id
Type: Dataset
Description: Array of unique identifiers for the rows of this dynamic table.
  • Extends: ElementIdentifiers
  • Data Type: int
  • Dimensions: [‘num_rows’]
  • Shape: [None]
  • Name: id

Id: .<VectorData>
Type: Dataset
Description: Vector columns, including index columns, of this dynamic table.

AlignedDynamicTable

Overview: DynamicTable container that supports storing a collection of sub-tables. Each sub-table is a DynamicTable itself that is aligned with the main table by row index. I.e., all DynamicTables stored in this group MUST have the same number of rows. This type effectively defines a 2-level table in which the main data is stored in the main table implemented by this type and additional columns of the table are grouped into categories, with each category being represented by a separate DynamicTable stored within the group.
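The row-count requirement can be sketched as a small check in plain Python. This is not the HDMF API: the function alignment_problems, the dictionary layout, and the example categories are hypothetical.

```python
def alignment_problems(num_rows, categories, category_tables):
    """Return a list of problems; an empty list means the tables are aligned.

    num_rows is the row count of the main table, categories is the ordered
    list of category names, and category_tables maps each name to a table
    (here a dict of column lists that includes an "id" column).
    """
    problems = []
    for name in categories:
        if name not in category_tables:
            problems.append(f"category {name!r} has no matching table")
            continue
        n = len(category_tables[name]["id"])
        if n != num_rows:
            problems.append(f"category {name!r} has {n} rows, expected {num_rows}")
    return problems


main_ids = [0, 1, 2]
categories = ["subject", "session"]
category_tables = {
    "subject": {"id": [0, 1, 2], "age": [3, 5, 4]},
    "session": {"id": [0, 1, 2], "duration": [60.0, 45.0, 30.0]},
}
problems = alignment_problems(len(main_ids), categories, category_tables)
```

Returning a list of problems instead of raising on the first one is a design choice of this sketch; it reports every misaligned category in one pass.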

AlignedDynamicTable extends DynamicTable and includes all elements of DynamicTable with the following additions or changes.

[Figure: AlignedDynamicTable]

Datasets, Links, and Attributes contained in <AlignedDynamicTable>

Id: <AlignedDynamicTable>
Type: Group
Description: Top level Group for <AlignedDynamicTable>

Id: .categories
Type: Attribute
Description: The names of the categories in this AlignedDynamicTable. Each category is represented by one DynamicTable stored in the parent group. This attribute should be used to specify an order of categories and the category names must match the names of the corresponding DynamicTable in the group.
  • Data Type: text
  • Dimensions: [‘num_categories’]
  • Shape: [None]
  • Name: categories

Groups contained in <AlignedDynamicTable>

Id: <AlignedDynamicTable>
Type: Group
Description: Top level Group for <AlignedDynamicTable>

Id: .<DynamicTable>
Type: Group
Description: A DynamicTable representing a particular category for columns in the AlignedDynamicTable parent container. The table MUST be aligned with (i.e., have the same number of rows as) all other DynamicTables stored in the AlignedDynamicTable parent container. The name of the category is given by the name of the DynamicTable and its description by the description attribute of the DynamicTable.


Sparse data types

data types for different types of sparse matrices

CSRMatrix

Overview: A compressed sparse row matrix. Data are stored in the standard CSR format, where column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their corresponding values are stored in data[indptr[i]:indptr[i+1]].
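The slicing rule above can be sketched in plain Python. The matrix values and the helpers csr_row and csr_to_dense are made up for illustration and are not part of the HDMF API.

```python
# A hypothetical 3 x 4 matrix:
# [[5, 0, 0, 8],
#  [0, 0, 0, 0],
#  [0, 3, 6, 0]]
shape = (3, 4)
data = [5, 8, 3, 6]        # non-zero values
indices = [0, 3, 1, 2]     # column index of each non-zero value
indptr = [0, 2, 2, 4]      # row i spans indptr[i]:indptr[i + 1]


def csr_row(data, indices, indptr, i):
    """Return row i as a {column: value} mapping of its non-zero entries."""
    lo, hi = indptr[i], indptr[i + 1]
    return dict(zip(indices[lo:hi], data[lo:hi]))


def csr_to_dense(shape, data, indices, indptr):
    """Expand the CSR arrays into a dense list of lists."""
    n_rows, n_cols = shape
    dense = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for col, value in csr_row(data, indices, indptr, i).items():
            dense[i][col] = value
    return dense


dense = csr_to_dense(shape, data, indices, indptr)
```

An all-zero row shows up as two equal consecutive entries in indptr, and indptr always has one more entry than the matrix has rows.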

CSRMatrix extends Container and includes all elements of Container with the following additions or changes.

[Figure: CSRMatrix]

Datasets, Links, and Attributes contained in <CSRMatrix>

Id: <CSRMatrix>
Type: Group
Description: Top level Group for <CSRMatrix>

Id: .shape
Type: Attribute
Description: The shape (number of rows, number of columns) of this sparse matrix.
  • Data Type: uint
  • Dimensions: [‘number of rows, number of columns’]
  • Shape: [2]
  • Name: shape

Id: .indices
Type: Dataset
Description: The column indices.
  • Data Type: uint
  • Dimensions: [‘number of non-zero values’]
  • Shape: [None]
  • Name: indices

Id: .indptr
Type: Dataset
Description: The row index pointer.
  • Data Type: uint
  • Dimensions: [‘number of rows in the matrix + 1’]
  • Shape: [None]
  • Name: indptr

Id: .data
Type: Dataset
Description: The non-zero values in the matrix.
  • Dimensions: [‘number of non-zero values’]
  • Shape: [None]
  • Name: data

Schema Sources

Source Specification: see Section 3.1

Namespace – HDMF Common

Description: see Section 1.1

YAML Specification:

author:
- Andrew Tritt
- Oliver Ruebel
- Ryan Ly
- Ben Dichter
contact:
- ajtritt@lbl.gov
- oruebel@lbl.gov
- rly@lbl.gov
- bdichter@lbl.gov
doc: Common data structures provided by HDMF
full_name: HDMF Common
name: hdmf-common
schema:
- doc: base data types
  source: base.yaml
  title: Base data types
- doc: data types for a column-based table
  source: table.yaml
  title: Table data types
- doc: data types for different types of sparse matrices
  source: sparse.yaml
  title: Sparse data types
version: 1.8.0

Base data types

base data types

Data

Description: see Section 2.1.1

YAML Specification:

data_type_def: Data
doc: An abstract data type for a dataset.

Container

Description: see Section 2.1.2

YAML Specification:

data_type_def: Container
doc: An abstract data type for a group storing collections of data and metadata. Base
  type for all data and metadata containers.

SimpleMultiContainer

Extends: Container

Description: see Section 2.1.3

YAML Specification:

data_type_def: SimpleMultiContainer
data_type_inc: Container
datasets:
- data_type_inc: Data
  doc: Data objects held within this SimpleMultiContainer.
  quantity: '*'
doc: A simple Container for holding onto multiple containers.
groups:
- data_type_inc: Container
  doc: Container objects held within this SimpleMultiContainer.
  quantity: '*'

Table data types

data types for a column-based table

VectorData

Extends: Data

Description: see Section 2.2.1

YAML Specification:

attributes:
- doc: Description of what these vectors represent.
  dtype: text
  name: description
data_type_def: VectorData
data_type_inc: Data
dims:
- - dim0
- - dim0
  - dim1
- - dim0
  - dim1
  - dim2
- - dim0
  - dim1
  - dim2
  - dim3
doc: An n-dimensional dataset representing a column of a DynamicTable. If used without
  an accompanying VectorIndex, first dimension is along the rows of the DynamicTable
  and each step along the first dimension is a cell of the larger table. VectorData
  can also be used to represent a ragged array if paired with a VectorIndex. This
  allows for storing arrays of varying length in a single cell of the DynamicTable
  by indexing into this VectorData. The first vector is at VectorData[0:VectorIndex[0]].
  The second vector is at VectorData[VectorIndex[0]:VectorIndex[1]], and so on.
shape:
- -
- -
  -
- -
  -
  -
- -
  -
  -
  -

VectorIndex

Extends: VectorData

Description: see Section 2.2.2

YAML Specification:

attributes:
- doc: Reference to the target dataset that this index applies to.
  dtype:
    reftype: object
    target_type: VectorData
  name: target
data_type_def: VectorIndex
data_type_inc: VectorData
dims:
- num_rows
doc: Used with VectorData to encode a ragged array. An array of indices into the first
  dimension of the target VectorData, and forming a map between the rows of a DynamicTable
  and the indices of the VectorData. The name of the VectorIndex is expected to be
  the name of the target VectorData object followed by "_index".
dtype: uint8
shape:
-

ElementIdentifiers

Extends: Data

Description: see Section 2.2.3

YAML Specification:

data_type_def: ElementIdentifiers
data_type_inc: Data
default_name: element_id
dims:
- num_elements
doc: A list of unique identifiers for values within a dataset, e.g. rows of a DynamicTable.
dtype: int
shape:
-

DynamicTableRegion

Extends: VectorData

Description: see Section 2.2.4

YAML Specification:

attributes:
- doc: Reference to the DynamicTable object that this region applies to.
  dtype:
    reftype: object
    target_type: DynamicTable
  name: table
- doc: Description of what this table region points to.
  dtype: text
  name: description
data_type_def: DynamicTableRegion
data_type_inc: VectorData
dims:
- num_rows
doc: DynamicTableRegion provides a link from one table to an index or region of another.
  The `table` attribute is a link to another `DynamicTable`, indicating which table
  is referenced, and the data is int(s) indicating the row(s) (0-indexed) of the target
  array. `DynamicTableRegion`s can be used to associate rows with repeated meta-data
  without data duplication. They can also be used to create hierarchical relationships
  between multiple `DynamicTable`s. `DynamicTableRegion` objects may be paired with
  a `VectorIndex` object to create ragged references, so a single cell of a `DynamicTable`
  can reference many rows of another `DynamicTable`.
dtype: int
shape:
-

DynamicTable

Extends: Container

Description: see Section 2.2.5

YAML Specification:

attributes:
- dims:
  - num_columns
  doc: The names of the columns in this table. This should be used to specify an order
    to the columns.
  dtype: text
  name: colnames
  shape:
  -
- doc: Description of what is in this dynamic table.
  dtype: text
  name: description
data_type_def: DynamicTable
data_type_inc: Container
datasets:
- data_type_inc: ElementIdentifiers
  dims:
  - num_rows
  doc: Array of unique identifiers for the rows of this dynamic table.
  dtype: int
  name: id
  shape:
  -
- data_type_inc: VectorData
  doc: Vector columns, including index columns, of this dynamic table.
  quantity: '*'
doc: A group containing multiple datasets that are aligned on the first dimension
  (Currently, this requirement if left up to APIs to check and enforce). These datasets
  represent different columns in the table. Apart from a column that contains unique
  identifiers for each row, there are no other required datasets. Users are free to
  add any number of custom VectorData objects (columns) here. DynamicTable also supports
  ragged array columns, where each element can be of a different size. To add a ragged
  array column, use a VectorIndex type to index the corresponding VectorData type.
  See documentation for VectorData and VectorIndex for more details. Unlike a compound
  data type, which is analogous to storing an array-of-structs, a DynamicTable can
  be thought of as a struct-of-arrays. This provides an alternative structure to choose
  from when optimizing storage for anticipated access patterns. Additionally, this
  type provides a way of creating a table without having to define a compound type
  up front. Although this convenience may be attractive, users should think carefully
  about how data will be accessed. DynamicTable is more appropriate for column-centric
  access, whereas a dataset with a compound type would be more appropriate for row-centric
  access. Finally, data size should also be taken into account. For small tables,
  performance loss may be an acceptable trade-off for the flexibility of a DynamicTable.

AlignedDynamicTable

Extends: DynamicTable

Description: see Section 2.2.6

YAML Specification:

attributes:
- dims:
  - num_categories
  doc: The names of the categories in this AlignedDynamicTable. Each category is represented
    by one DynamicTable stored in the parent group. This attribute should be used
    to specify an order of categories and the category names must match the names
    of the corresponding DynamicTable in the group.
  dtype: text
  name: categories
  shape:
  -
data_type_def: AlignedDynamicTable
data_type_inc: DynamicTable
doc: DynamicTable container that supports storing a collection of sub-tables. Each
  sub-table is a DynamicTable itself that is aligned with the main table by row index.
  I.e., all DynamicTables stored in this group MUST have the same number of rows.
  This type effectively defines a 2-level table in which the main data is stored in
  the main table implemented by this type and additional columns of the table are
  grouped into categories, with each category being represented by a separate DynamicTable
  stored within the group.
groups:
- data_type_inc: DynamicTable
  doc: A DynamicTable representing a particular category for columns in the AlignedDynamicTable
    parent container. The table MUST be aligned with (i.e., have the same number of
    rows) as all other DynamicTables stored in the AlignedDynamicTable parent container.
    The name of the category is given by the name of the DynamicTable and its description
    by the description attribute of the DynamicTable.
  quantity: '*'

Sparse data types

data types for different types of sparse matrices

CSRMatrix

Extends: Container

Description: see Section 2.3.1

YAML Specification:

attributes:
- dims:
  - number of rows, number of columns
  doc: The shape (number of rows, number of columns) of this sparse matrix.
  dtype: uint
  name: shape
  shape:
  - 2
data_type_def: CSRMatrix
data_type_inc: Container
datasets:
- dims:
  - number of non-zero values
  doc: The column indices.
  dtype: uint
  name: indices
  shape:
  -
- dims:
  - number of rows in the matrix + 1
  doc: The row index pointer.
  dtype: uint
  name: indptr
  shape:
  -
- dims:
  - number of non-zero values
  doc: The non-zero values in the matrix.
  name: data
  shape:
  -
doc: A compressed sparse row matrix. Data are stored in the standard CSR format, where
  column indices for row i are stored in indices[indptr[i]:indptr[i+1]] and their
  corresponding values are stored in data[indptr[i]:indptr[i+1]].

Making a Pull Request

Actions to take on each PR that modifies the schema and does not prepare the schema for a public release (this is also in the GitHub PR template):

If the current schema version on “main” is a public release, then:

  1. Update the version string in docs/source/conf.py and common/namespace.yaml to the next version with the suffix “-alpha”

  2. Add a new section in the release notes for the new version with the date “Upcoming”

Always:

  1. Add release notes for the PR to docs/source/hdmf_common_release_notes.rst and/or docs/source/hdmf_experimental_release_notes.rst

Documentation or internal changes to the repo (i.e., changes that do not affect the schema files) do not need to be accompanied by a version bump or an addition to the release notes.

Merging PRs and Making Releases

Public release: a tagged release of the schema. The version string MUST NOT have a suffix indicating a pre-release, such as “-alpha”. The current “dev” branch of HDMF and all HDMF releases MUST point to a public release of hdmf-common-schema. All schema that use hdmf-common-schema as a submodule MUST also point only to public releases.

Internal release: a state of the schema “main” branch where the version string ends with “-alpha”.

The default branch of hdmf-common-schema is “main”. The “main” branch holds the bleeding edge version of the hdmf-common schema specification.

PRs should be made to “main”. Every PR should include an update to the namespace release notes (docs/source/hdmf_common_release_notes.rst and/or docs/source/hdmf_experimental_release_notes.rst). If the current version is a public release, then the PR should also update the version of the schema in two places: docs/source/conf.py and common/namespace.yaml. The new version should be the next bugfix/minor/major version of the schema with the suffix “-alpha”. For example, if the current schema on “main” has version “2.2.0”, then a PR implementing a bug fix should update the schema version from “2.2.0” to “2.2.1-alpha”. Appending the “-alpha” suffix ensures that any person or API accessing the default “main” branch of the repo containing an internal release of the schema receives the schema with a version string that is distinct from public releases of the schema. If the current schema on “main” is already an internal release, then the version string does not need to be updated unless the PR requires an upgrade in the version (e.g., from bugfix to minor).
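The version rule can be sketched as a small helper. This is only an illustration: the function names are hypothetical and are not part of any repository tooling. Resetting the lower version parts to zero on a minor or major bump is an assumption borrowed from common semantic-versioning practice, and the public-release check only looks for the “-alpha” suffix.

```python
def is_public_release(version):
    """True if the version string carries no "-alpha" pre-release suffix."""
    return not version.endswith("-alpha")


def next_alpha_version(public_version, change):
    """Return the internal-release version that follows a public release.

    change is one of "bugfix", "minor", or "major".
    """
    major, minor, bugfix = (int(part) for part in public_version.split("."))
    if change == "major":
        major, minor, bugfix = major + 1, 0, 0
    elif change == "minor":
        minor, bugfix = minor + 1, 0
    elif change == "bugfix":
        bugfix += 1
    else:
        raise ValueError(f"unknown change type: {change!r}")
    return f"{major}.{minor}.{bugfix}-alpha"


bugfix_version = next_alpha_version("2.2.0", "bugfix")
```

With the example from the text, a bug-fix PR against public release “2.2.0” yields “2.2.1-alpha”.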

HDMF should contain a branch and PR that tracks the “main” branch of hdmf-common-schema. Before a public release of hdmf-common-schema is made, this HDMF branch should be checked to ensure that when the new release is made, the branch can be merged without issue.

Immediately prior to making a new public release, the version of the schema should be updated to remove the “-alpha” suffix and the documentation and release notes should be updated as needed (see next section).

The current “dev” branch of HDMF and all HDMF releases MUST always point to a public release of hdmf-common-schema. If a public release of HDMF contains an internal release of hdmf-common-schema, e.g., from an untagged commit on the “main” branch, then it will be difficult to find the version (commit) of hdmf-common-schema that was used to create an HDMF file when the schema is not cached.

Making a Release Checklist

Before merging:

  1. Update requirements versions as needed

  2. Update legal file dates and information in Legal.txt, license.txt, README.md, docs/source/conf.py, and any other locations as needed

  3. Update README.md as needed

  4. Update the version string in docs/source/conf.py and common/namespace.yaml (remove “-alpha” suffix)

  5. Update docs/source/conf.py as needed

  6. Update release notes (set release date) in docs/source/hdmf_common_release_notes.rst, docs/source/hdmf_experimental_release_notes.rst, and any other docs as needed

  7. Test docs locally (cd docs; make fulldoc) where the hdmf-common-schema submodule in the local version of HDMF is fully up-to-date with the head of the main branch.

  8. Push changes to a new PR and make sure all PRs to be included in this release have been merged. Add ?template=release.md to the PR URL to auto-populate the PR with this checklist.

  9. Check that the readthedocs build for this PR succeeds (build latest to pull the new branch, then activate and build docs for new branch): https://readthedocs.org/projects/hdmf-common-schema/builds/

After merging:

  1. Create a new git tag. Pull the latest main branch locally, run git tag [version] --sign, copy and paste the release notes into the tag message, and run git push --tags.

  2. On the GitHub tags page, click “…” -> “Create release” for the new tag on the right side of the page. Copy and paste the release notes into the release message, update the formatting if needed (reST to Markdown), and set the title to the version string.

  3. Check that the readthedocs “latest” and “stable” builds run and succeed. Delete the readthedocs build for the merged PR. https://readthedocs.org/projects/hdmf-common-schema/builds/

  4. Update the HDMF submodule in the HDMF branch corresponding to this schema version to point to the tagged commit.

This checklist can also be found in the GitHub release PR template.

The time between merging this PR and creating a new public release should be minimized.

hdmf-common Release Notes

1.8.0 (August 4, 2023)

  • No change in the hdmf-common namespace. See the hdmf-experimental Release Notes (0.5.0) for changes to the hdmf-experimental namespace.

1.7.0 (June 22, 2023)

  • No change in the hdmf-common namespace. See the hdmf-experimental Release Notes (0.4.0) for changes to the hdmf-experimental namespace.

1.6.0 (May 3, 2023)

  • No change in the hdmf-common namespace. See the hdmf-experimental Release Notes (0.3.0) for changes to the hdmf-experimental namespace.

1.5.1 (January 10, 2022)

  • No change in the hdmf-common namespace. See the hdmf-experimental Release Notes (0.2.0) for changes to the hdmf-experimental namespace.

1.5.0 (April 19, 2021)

  • Added AlignedDynamicTable, which defines a DynamicTable that supports storing a collection of sub-tables. Each sub-table is itself a DynamicTable that is aligned with the main table by row index. Each sub-table defines a sub-category in the main table effectively creating a table with sub-headings to organize columns.

1.4.0 (March 29, 2021)

Summary: In 1.4.0, the HDMF-experimental namespace was added, which includes the ExternalResources and EnumData data types. Schema in the HDMF-experimental namespace are experimental and subject to breaking changes at any time. ExternalResources was changed to support storing both names and URIs for resources. The VocabData data type was replaced by EnumData to provide more flexible support for data from a set of fixed values.

  • Added EnumData for storing data that comes from a set of fixed values. This replaces VocabData which could hold only string values. Also, VocabData could hold only a limited number of elements (~64k) when used with the HDF5 storage backend. EnumData gets around these restrictions by using an untyped dataset (VectorData) instead of a string attribute to hold the enumerated values.

  • Removed VocabData.

  • Renamed the “resources” table in ExternalResources to “entities”.

  • Created a new “resources” table to store the name and URI of the ontology / external resource used by the “entities” table in ExternalResources.

  • Renamed fields in ExternalResources.

  • Added “entities” dataset to ExternalResources. This is a row-based table dataset to replace the functionality of the “resources” dataset in ExternalResources.

  • Changed the “resources” dataset in ExternalResources to store the name and URI of the ontology / external resource used by the “entities” dataset in ExternalResources.

  • Added HDMF-experimental namespace.

  • Moved ExternalResources and EnumData to HDMF-experimental.

1.3.0 (December 2, 2020)

  • Add data type ExternalResources for storing ontology information / external resource references. NOTE: this data type is in beta testing and is subject to change in a later version.

  • Changed dtype for datasets within CSRMatrix from ‘int’ to ‘uint’. Negative values do not make sense for these datasets.

1.2.1 (November 4, 2020)

  • Update software process documentation for maintainers.

  • Fix missing data_type_inc for CSRMatrix. It now has data_type_inc: Container.

  • Add hdmf-schema-language comment at the top of each yaml file.

  • Add SimpleMultiContainer, a Container for storing other Container and Data objects together.

1.2.0 (July 10, 2020)

  • Add software process documentation.

  • Fix missing dtype for VectorIndex.

  • Add new VocabData data type.

  • Move Data, Index, and Container to base.yaml. This change does not functionally change the schema.

  • VectorIndex now extends VectorData instead of Index. This change allows VectorIndex to index other VectorIndex types.

  • The Index data type is now unused and has been removed.

  • Fix documentation for ragged arrays.

1.1.3 (January 21, 2020)

  • Fix missing ‘shape’ and ‘dims’ key for types VectorData, VectorIndex, and DynamicTableRegion.

1.1.2 (January 9, 2020)

  • Fix version number in namespace.yaml and docs.

1.1.1 (January 9, 2020)

  • Support for ReadTheDocs continuous documentation was added, and legal/license documents were also added. The schema is unchanged.

1.1.0 (January 3, 2020)

  • The ‘colnames’ attribute of DynamicTable changed from data type ‘ascii’ to ‘text’.

  • Improved documentation and type docstrings.

1.0.0 (September 26, 2019)

Initial release.

hdmf-experimental Release Notes

0.5.0 (August 4, 2023)

  • Updates ExternalResources to have a uniform name throughout the codebase and the literature, which is now HERD (HDMF External Resources Data).

  • Fixed schema bug regarding the missing quote.

0.4.0 (June 22, 2023)

  • In the experimental ExternalResources, added an entity_keys table and removed keys_idx from the entities table.

0.3.0 (May 3, 2023)

  • In the experimental ExternalResources, added a files table, removed the resources table, and adjusted existing columns.

0.2.0 (January 10, 2022)

  • In the experimental ExternalResources, added a relative_path field to the “objects” table dtype. This is used in place of the previous “field” field, which represented the relative path to get to the dataset/attribute from the object. The “field” field will now be used to represent a compound type field name if the dataset/attribute is a compound dtype.

  • Updated contributors.

0.1.0 (March 29, 2021)

Credits

Authors

  • Andrew Tritt

  • Oliver Ruebel

  • Ryan Ly

  • Ben Dichter

  • Matthew Avaylon