hdmf.common.table module

Collection of Container classes for interacting with data types related to the storage and use of dynamic data tables as part of the hdmf-common schema.

class hdmf.common.table.VectorData(name, description, data=[])

Bases: Data

An n-dimensional dataset representing a column of a DynamicTable. If used without an accompanying VectorIndex, the first dimension is along the rows of the DynamicTable and each step along the first dimension is a cell of the larger table. VectorData can also be used to represent a ragged array if paired with a VectorIndex. This allows for storing arrays of varying length in a single cell of the DynamicTable by indexing into this VectorData. The first vector is at VectorData[0:VectorIndex[0]]. The second vector is at VectorData[VectorIndex[0]:VectorIndex[1]], and so on.

property description

a description for this column

add_row(val)

Append a data value to this VectorData column

Parameters:

val (None) – the value to add to this column

get(key, **kwargs)

Retrieve elements from this VectorData

Parameters:
  • key – Selection of the elements

  • kwargs – Ignored

extend(ar, **kwargs)

Add all elements of the iterable ar to the end of this VectorData.

Each subclass of VectorData should have its own extend method to ensure functionality and efficiency.

Parameters:

ar – The iterable to add to the end of this VectorData

data_type = 'VectorData'
namespace = 'hdmf-common'
class hdmf.common.table.VectorIndex(name, data, target)

Bases: VectorData

When paired with a VectorData, this allows for storing arrays of varying length in a single cell of the DynamicTable by indexing into this VectorData. The first vector is at VectorData[0:VectorIndex[0]]. The second vector is at VectorData[VectorIndex[0]:VectorIndex[1]], and so on.

property target

the target dataset that this index applies to

add_vector(arg, **kwargs)

Add the given data value to the target VectorData and append the corresponding index to this VectorIndex.

Parameters:

arg – The data value to be added to self.target

add_row(arg, **kwargs)

Convenience function. Same as add_vector
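The bookkeeping behind add_vector can be sketched in plain Python. This is an illustration only, not the hdmf implementation: RaggedColumn, flat, and ends are hypothetical names, and the sketch assumes each index entry records the exclusive end offset of its vector in the flat target data.

```python
class RaggedColumn:
    """Toy stand-in for a VectorData/VectorIndex pair (hypothetical, not hdmf)."""

    def __init__(self):
        self.flat = []   # plays the role of the target VectorData
        self.ends = []   # plays the role of the VectorIndex

    def add_vector(self, values):
        # Append the values to the flat data, then record where this vector ends.
        self.flat.extend(values)
        self.ends.append(len(self.flat))


col = RaggedColumn()
col.add_vector([1, 2, 3])
col.add_vector([])        # an empty cell simply repeats the previous offset
col.add_vector([4, 5])
print(col.flat)
print(col.ends)
```

Storing only end offsets keeps the index one entry per row, at the cost of needing the previous entry to find where a vector starts.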

__getitem__(arg)

Select elements in this VectorIndex and retrieve the corresponding data from the self.target VectorData

Parameters:

arg – slice or integer index indicating the elements we want to select in this VectorIndex

Returns:

Scalar or list of values retrieved
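As a rough sketch of how an integer or slice selection on the index could be resolved against the target data, consider the following. It is not the hdmf code: select is a hypothetical helper, and it assumes the index holds exclusive end offsets, so that vector i spans target[index[i-1]:index[i]], with 0 as the start for i = 0.

```python
def select(target, index, arg):
    """Return the vector(s) of `target` chosen by an int or slice into `index`."""
    def one(i):
        if i < 0:
            i += len(index)          # support negative positions like a list
        start = 0 if i == 0 else index[i - 1]
        return target[start:index[i]]

    if isinstance(arg, slice):
        # Expand the slice into positions, then resolve each one.
        return [one(i) for i in range(*arg.indices(len(index)))]
    return one(arg)


target = ["a", "b", "c", "d", "e"]
index = [2, 2, 5]                    # three cells: two items, none, three items

first = select(target, index, 0)
tail = select(target, index, slice(1, 3))
last = select(target, index, -1)
```

An integer selection yields a single vector, while a slice yields a list of vectors, mirroring the "scalar or list" return described above.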

get(arg, **kwargs)

Select elements in this VectorIndex and retrieve the corresponding data from the self.target VectorData

Parameters:
  • arg – slice or integer index indicating the elements we want to select in this VectorIndex

  • kwargs – any additional arguments to get method of the self.target VectorData

Returns:

Scalar or list of values retrieved

data_type = 'VectorIndex'
namespace = 'hdmf-common'
class hdmf.common.table.ElementIdentifiers(name, data=[])

Bases: Data

Data container with a list of unique identifiers for values within a dataset, e.g. rows of a DynamicTable.

data_type = 'ElementIdentifiers'
namespace = 'hdmf-common'
class hdmf.common.table.DynamicTable(name, description, id=None, columns=None, colnames=None, target_tables=None)

Bases: Container

A column-based table. Columns are defined by the argument columns. This argument must be a list/tuple of VectorData and VectorIndex objects or a list/tuple of dicts containing the keys name and description that provide the name and description of each column in the table. Additionally, the keys index, table, and enum can be used to specify additional structure for the table columns. Setting the key index to True indicates that the VectorData column will store a ragged array (i.e. will be accompanied by a VectorIndex). Setting the key table to True indicates that the column will store regions of another DynamicTable. Setting the key enum to True indicates that the column data will come from a fixed set of values.

Columns in DynamicTable subclasses can be statically defined by specifying the class attribute __columns__, rather than specifying them at runtime at the instance level. This is useful for defining a table structure that will get reused. The requirements for __columns__ are the same as the requirements described above for specifying table columns with the columns argument to the DynamicTable constructor.
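To make the dict form of a column specification concrete, here is a small sketch. check_column_spec is a hypothetical helper written for this illustration and is not part of hdmf; it checks only the two required keys named above and leaves any other keys untouched. The column names and descriptions are made up.

```python
REQUIRED_KEYS = {"name", "description"}


def check_column_spec(spec):
    """Raise ValueError if a column dict lacks the required keys."""
    missing = REQUIRED_KEYS - set(spec)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return spec


# Column specifications in the dict form described above (illustrative values).
columns = (
    {"name": "label", "description": "a short label for the row"},
    {"name": "samples", "description": "samples for the row", "index": True},
    {"name": "kind", "description": "category of the row", "enum": True},
)
checked = [check_column_spec(c) for c in columns]

try:
    check_column_spec({"name": "no_description"})
    rejected = False
except ValueError:
    rejected = True
```

A tuple of dicts like columns is the kind of value that could be passed as the columns argument or assigned to __columns__ in a subclass.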

property description

a description of what is in this table

property id

the identifiers for this table

property colnames

the ordered names of the columns in this table. columns must also be provided.

property columns

the columns in this table

add_row(data=None, id=None, enforce_unique_id=False, check_ragged=True)

Add a row to the table. If id is not provided, it will auto-increment.

Parameters:
  • data (dict) – the data to put in this row

  • id (int) – the ID for the row

  • enforce_unique_id (bool) – enforce that the id in the table must be unique

  • check_ragged (bool) – whether or not to check for ragged arrays when adding data to the table. Set to False to avoid checking every element if performance issues occur.
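The id handling of add_row can be illustrated with a toy table. This is a sketch under stated assumptions, not the hdmf implementation: ToyTable is hypothetical, and it assumes that auto-increment means "use the current number of rows as the next id".

```python
class ToyTable:
    """Hypothetical stand-in for a table with an id column (not hdmf)."""

    def __init__(self, colnames):
        self.ids = []
        self.columns = {name: [] for name in colnames}

    def add_row(self, data, id=None, enforce_unique_id=False):
        # `id` shadows the builtin on purpose, to mirror the documented signature.
        if set(data) != set(self.columns):
            raise ValueError("row must supply exactly the table's columns")
        if id is None:
            id = len(self.ids)           # assumed auto-increment rule
        if enforce_unique_id and id in self.ids:
            raise ValueError(f"id {id} already in the table")
        self.ids.append(id)
        for name, value in data.items():
            self.columns[name].append(value)


table = ToyTable(["x", "y"])
table.add_row({"x": 1, "y": "a"})                 # id defaults to 0
table.add_row({"x": 2, "y": "b"}, id=10)          # explicit id
try:
    table.add_row({"x": 3, "y": "c"}, id=10, enforce_unique_id=True)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

Validating before appending means a rejected row leaves every column unchanged.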

add_column(name, description, data=[], table=False, index=False, enum=False, col_cls=<class 'hdmf.common.table.VectorData'>, check_ragged=True)

Add a column to this table.

If data is provided, it must contain the same number of rows as the current state of the table.

Extra keyword arguments will be passed to the constructor of the column class (“col_cls”).

Raises:

ValueError if the column has already been added to the table

Parameters:
  • name (str) – the name of this VectorData

  • description (str) – a description for this column

  • data (ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator or DataIO) – a dataset where the first dimension is a concatenation of multiple vectors

  • table (bool or DynamicTable) – whether or not this is a table region or the table the region applies to

  • index (bool or VectorIndex or ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator or int) –

    • False (default): do not generate a VectorIndex

    • True: generate one empty VectorIndex

    • VectorIndex: Use the supplied VectorIndex

    • array-like of ints: Create a VectorIndex and use these values as the data

    • int: Recursively create n VectorIndex objects for a multi-ragged array

  • enum (bool or ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator) – whether or not this column contains data from a fixed set of elements

  • col_cls (type) – class to use to represent the column data. If table=True, this field is ignored and a DynamicTableRegion object is used. If enum=True, this field is ignored and an EnumData object is used.

  • check_ragged (bool) – whether or not to check for ragged arrays when adding data to the table. Set to False to avoid checking every element if performance issues occur.
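The integer form of the index parameter describes a multi-ragged array, where one index points into another. The sketch below shows how a cell could be resolved through such a stack. It is an illustration only: resolve is a hypothetical helper, and it assumes every index level stores exclusive end offsets into the level beneath it.

```python
def resolve(flat, indices, i):
    """Resolve cell `i` through a stack of index levels (outermost level last)."""
    *inner, outer = indices
    start = 0 if i == 0 else outer[i - 1]
    end = outer[i]
    if not inner:
        return flat[start:end]           # innermost level slices the flat data
    # Otherwise each position in [start, end) is itself a cell one level down.
    return [resolve(flat, inner, j) for j in range(start, end)]


flat = [1, 2, 3, 4, 5, 6]
inner_index = [2, 3, 6]      # three vectors of lengths 2, 1, and 3
outer_index = [1, 3]         # cell 0 holds one vector, cell 1 holds two

cell0 = resolve(flat, [inner_index, outer_index], 0)
cell1 = resolve(flat, [inner_index, outer_index], 1)
```

With a single index level the same function reduces to the ordinary ragged-array lookup.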

create_region(name, region, description)

Create a DynamicTableRegion selecting a region (i.e., rows) in this DynamicTable.

Raises:

IndexError if the provided region contains invalid indices

Parameters:
  • name (str) – the name of the DynamicTableRegion object

  • region (slice or list or tuple) – the indices of the table

  • description (str) – a brief description of what the region is
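One way to picture the index validation is the sketch below. region_rows is a hypothetical helper, and the exact bounds rules (no negative indices, slice stop no larger than the row count) are assumptions of this sketch rather than documented behavior.

```python
def region_rows(region, n_rows):
    """Expand `region` (slice, list, or tuple) into row numbers, checking bounds."""
    if isinstance(region, slice):
        start, stop = region.start, region.stop
        if (start is not None and start < 0) or (stop is not None and stop > n_rows):
            raise IndexError(f"{region} is out of range for {n_rows} rows")
        return list(range(*region.indices(n_rows)))
    rows = list(region)
    for row in rows:
        if row < 0 or row >= n_rows:
            raise IndexError(f"row {row} is out of range for {n_rows} rows")
    return rows


ok_slice = region_rows(slice(1, 3), 5)
ok_list = region_rows([0, 4], 5)
try:
    region_rows([5], 5)
    invalid_rejected = False
except IndexError:
    invalid_rejected = True
```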

__getitem__(key)
get(key, default=None, df=True, index=True, **kwargs)

Select a subset from the table.

If the table includes a DynamicTableRegion column, then by default, the index/indices of the DynamicTableRegion will be returned. If df=True and index=False, then the returned pandas DataFrame will contain a nested DataFrame in each row of the DynamicTableRegion column. If df=False and index=True, then a list of lists will be returned where the list containing the DynamicTableRegion column contains the indices of the DynamicTableRegion. Note that in this case, the DynamicTable referenced by the DynamicTableRegion can be accessed through the table attribute of the DynamicTableRegion object. df=False and index=False is not yet supported.

Parameters:

key

Key defining which elements of the table to select. This may be one of the following:

  1. string with the name of the column to select

  2. a tuple consisting of (int, str) where the int selects the row and the string identifies the column to select by name

  3. int, list of ints, array, or slice selecting a set of full rows in the table. If an int is used, then scalars are returned for each column that has a single value. If a list, array, or slice is used and df=False, then lists are returned for each column, even if the list, array, or slice resolves to a single row.

Returns:

  1. If key is a string, then return the VectorData object representing the column with the string name

  2. If key is a tuple of (int, str), then return the scalar value of the selected cell

  3. If key is an int, list, np.ndarray, or slice, then return pandas.DataFrame or lists consisting of one or more rows

Raises:

KeyError
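The three kinds of key listed above amount to a dispatch on the key's type. The following sketch shows that dispatch over a plain dict of lists. select_from is a hypothetical helper, and for readability it returns dicts keyed by column name instead of a pandas DataFrame or nested lists.

```python
def select_from(columns, key):
    """Select from `columns`, a dict mapping column name -> list of cell values."""
    n_rows = len(next(iter(columns.values()), []))
    if isinstance(key, str):
        return columns[key]                      # whole column; KeyError if absent
    if isinstance(key, tuple):
        row, name = key
        return columns[name][row]                # a single cell
    if isinstance(key, int):
        # One row: a scalar per column.
        return {name: col[key] for name, col in columns.items()}
    if isinstance(key, slice):
        key = range(*key.indices(n_rows))
    # List, range, or other iterable of ints: a list per column, even for one row.
    return {name: [col[i] for i in key] for name, col in columns.items()}


columns = {"x": [1, 2, 3], "y": ["a", "b", "c"]}
column = select_from(columns, "x")
cell = select_from(columns, (1, "y"))
row = select_from(columns, 2)
one_row_list = select_from(columns, [0])
```

Note the difference between the last two calls: an int yields scalars, while a one-element list still yields lists.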

get_foreign_columns()

Determine the names of all columns that link to another DynamicTable, i.e., find all DynamicTableRegion type columns. Similar to a foreign key in a database, a DynamicTableRegion column references elements in another table.

Returns:

List of strings with the column names

has_foreign_columns()

Check whether the table contains DynamicTableRegion columns.

Returns:

True if the table contains a DynamicTableRegion column, else False

get_linked_tables(other_tables=None)

Get the full list of all tables that are linked to, directly or indirectly, from this table via DynamicTableRegion columns included in this table or in any table that can be reached through DynamicTableRegion columns.

Returns: List of NamedTuple objects with:
  • ‘source_table’ : The source table containing the DynamicTableRegion column

  • ‘source_column’ : The relevant DynamicTableRegion column in the ‘source_table’

  • ‘target_table’ : The target DynamicTable; same as source_column.table.

Parameters:

other_tables (list or tuple or set) – List of additional tables to consider in the search. Usually this parameter is used for internal purposes, e.g., when we need to consider AlignedDynamicTable
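The search for directly and indirectly linked tables is a graph traversal. Below is a minimal sketch of one, using a breadth-first walk. Everything in it is hypothetical: linked_tables, the Link tuple, the foreign mapping, and the table names are stand-ins, and tables are represented by their names instead of DynamicTable objects.

```python
from collections import namedtuple

Link = namedtuple("Link", ["source_table", "source_column", "target_table"])


def linked_tables(start, foreign):
    """Collect every link reachable from `start`.

    `foreign` maps a table name to a list of (column name, target table name)
    pairs, standing in for that table's DynamicTableRegion columns.
    """
    links, seen, todo = [], {start}, [start]
    while todo:
        source = todo.pop(0)
        for column, target in foreign.get(source, []):
            links.append(Link(source, column, target))
            if target not in seen:       # visit each table once, so cycles end
                seen.add(target)
                todo.append(target)
    return links


foreign = {
    "trials": [("units", "units_table")],
    "units_table": [("electrodes", "electrode_table")],
}
links = linked_tables("trials", foreign)
```

The second link is indirect: it is found only because the first link led to units_table.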

to_dataframe(exclude=None, index=False)

Produce a pandas DataFrame containing this table’s data.

If this table contains a DynamicTableRegion, by default, nested DataFrames will be returned for that column.

If exclude is None, this is equivalent to table.get(slice(None, None, None), index=False).

Parameters:
  • exclude (set) – Set of column names to exclude from the dataframe

  • index (bool) – Whether to return indices for a DynamicTableRegion column. If False, nested dataframes will be returned.

generate_html_repr(level: int = 0, access_code: str = '', nrows: int = 4)
classmethod from_dataframe(df, name, index_column=None, table_description='', columns=None)

Construct an instance of DynamicTable (or a subclass) from a pandas DataFrame.

The columns of the resulting table are defined by the columns of the dataframe, and the index by the dataframe’s index (make sure it has a name!) or by a column whose name is supplied to the index_column parameter. We recommend that you supply columns, a list/tuple of dictionaries containing the name and description of each column, to help others understand the contents of your table. See DynamicTable for more details on columns.

Parameters:
  • df (DataFrame) – source DataFrame

  • name (str) – the name of this table

  • index_column (str) – if provided, this column will become the table’s index

  • table_description (str) – a description of what is in the resulting table

  • columns (list or tuple) – a list/tuple of dictionaries specifying columns in the table

copy()

Return a copy of this DynamicTable. This is useful for linking.

data_type = 'DynamicTable'
namespace = 'hdmf-common'
class hdmf.common.table.DynamicTableRegion(name, data, description, table=None)

Bases: VectorData

DynamicTableRegion provides a link from one table to an index or region of another. The table attribute is another DynamicTable, indicating which table is referenced. The data is int(s) indicating the row(s) (0-indexed) of the target array. DynamicTableRegions can be used to associate multiple rows with the same meta-data without data duplication. They can also be used to create hierarchical relationships between multiple DynamicTables. DynamicTableRegion objects may be paired with a VectorIndex object to create ragged references, so a single cell of a DynamicTable can reference many rows of another DynamicTable.

property table

The DynamicTable this DynamicTableRegion is pointing to

__getitem__(arg)
get(arg, index=False, df=True, **kwargs)

Subset the DynamicTableRegion

Parameters:
  • arg

    Key defining which elements of the table to select. This may be one of the following:

    1. string with the name of the column to select

    2. a tuple consisting of (int, str) where the int selects the row and the string identifies the column to select by name

    3. int, list of ints, array, or slice selecting a set of full rows in the table. If an int is used, then scalars are returned for each column that has a single value. If a list, array, or slice is used and df=False, then lists are returned for each column, even if the list, array, or slice resolves to a single row.

  • index – Boolean indicating whether to return indices of the DynamicTableRegion (default False)

  • df – Boolean indicating whether to return the result as a pandas DataFrame (default True)

Returns:

Result from self.table[…] with the appropriate selection based on the rows selected by this DynamicTableRegion
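Conceptually, subsetting a region is a two-step lookup: select positions in the region's own data, then use the stored row numbers to select rows of the referenced table. The sketch below illustrates this with plain lists. resolve_region and the sample rows are hypothetical, and rows are returned as dicts, not as a pandas DataFrame.

```python
def resolve_region(region_data, table_rows, arg):
    """Return the target-table row(s) referenced by position(s) `arg` of the region."""
    selected = region_data[arg]              # row number(s) stored in the region
    if isinstance(selected, list):
        return [table_rows[i] for i in selected]
    return table_rows[selected]


# Rows of the referenced table (made-up data) and a region pointing into them.
table_rows = [{"id": 0, "name": "a"}, {"id": 1, "name": "b"}, {"id": 2, "name": "c"}]
region_data = [2, 0, 0]          # region element k refers to row region_data[k]

one = resolve_region(region_data, table_rows, 0)
many = resolve_region(region_data, table_rows, slice(1, 3))
```

Because region elements 1 and 2 both store row 0, the same target row is shared without being duplicated.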

to_dataframe(**kwargs)

Convert the whole DynamicTableRegion to a pandas dataframe.

Keyword arguments are passed through to the to_dataframe method of DynamicTable that is being referenced (i.e., self.table). This allows specification of the ‘exclude’ parameter and any other parameters of DynamicTable.to_dataframe.

property shape

Define the shape, i.e., (num_rows, num_columns) of the selected table region.

Returns:

Shape tuple with two integers indicating the number of rows and number of columns

data_type = 'DynamicTableRegion'
namespace = 'hdmf-common'
class hdmf.common.table.EnumData(name, description, data=[], elements=[])

Bases: VectorData

An n-dimensional dataset that can contain elements from a fixed set of elements.

property elements

lookup values for each integer in data

data_type = 'EnumData'
namespace = 'hdmf-experimental'
__getitem__(arg)
get(arg, index=False, join=False, **kwargs)

Return elements for the given argument.

Parameters:
  • index (bool) – Return indices rather than CV (controlled vocabulary) elements

  • join (bool) – Concatenate elements together into a single string

Returns:

CV elements if join is False or a concatenation of all selected elements if join is True.
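The relationship between the stored integers and the elements can be sketched as a simple lookup. enum_get is a hypothetical helper, not the hdmf method, and joining with an empty separator is an assumption of this sketch.

```python
def enum_get(data, elements, arg, index=False, join=False):
    """Look up the element(s) for position(s) `arg` of `data`."""
    selected = data[arg]
    if index:
        return selected                      # raw integer index or indices
    if isinstance(selected, list):
        values = [elements[i] for i in selected]
        return "".join(values) if join else values
    return elements[selected]


elements = ["A", "C", "G", "T"]      # lookup values
data = [0, 3, 3, 2]                  # each entry is an index into `elements`

single = enum_get(data, elements, 1)
several = enum_get(data, elements, slice(0, 3))
joined = enum_get(data, elements, slice(0, 3), join=True)
raw = enum_get(data, elements, slice(0, 3), index=True)
```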

add_row(val, index=False)

Append a data value to this EnumData column

If an element is provided for val (i.e. index is False), the correct index value will be determined. Otherwise, val will be added as provided.

Parameters:
  • val (None) – the value to add to this column

  • index (bool) – whether or not the value being added is an index
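The two modes of add_row can be illustrated as follows. enum_add_row is a hypothetical helper. That an element not yet in elements is appended to it is an assumption of this sketch, not something stated above.

```python
def enum_add_row(data, elements, val, index=False):
    """Append `val` to `data`, translating it to an index unless `index` is True."""
    if index:
        data.append(val)                 # val is already an index: store as given
        return
    if val not in elements:
        elements.append(val)             # assumption: unseen elements are registered
    data.append(elements.index(val))


elements = ["red", "green"]
data = []
enum_add_row(data, elements, "green")        # stored as index 1
enum_add_row(data, elements, "blue")         # new element, stored as index 2
enum_add_row(data, elements, 0, index=True)  # stored unchanged
```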