hdmf.common.table module
Collection of Container classes for interacting with data types related to the storage and use of dynamic data tables as part of the hdmf-common schema
- class hdmf.common.table.VectorData(name, description, data=[])
Bases: Data
An n-dimensional dataset representing a column of a DynamicTable. If used without an accompanying VectorIndex, the first dimension is along the rows of the DynamicTable and each step along the first dimension is a cell of the larger table. VectorData can also be used to represent a ragged array if paired with a VectorIndex. This allows for storing arrays of varying length in a single cell of the DynamicTable by indexing into this VectorData. The first vector is at VectorData[0:VectorIndex[0]]. The second vector is at VectorData[VectorIndex[0]:VectorIndex[1]], and so on.
- Parameters:
name (str) – the name of this VectorData
description (str) – a description for this column
data (ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator or DataIO) – a dataset where the first dimension is a concatenation of multiple vectors
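The ragged-array layout can be illustrated without hdmf itself. The following is a minimal plain-Python sketch, assuming each index entry stores the exclusive end offset of its vector within the flattened data. The names `read_vector`, `flat_data`, and `end_offsets` are hypothetical and are not part of the hdmf API.

```python
def read_vector(flat_data, end_offsets, i):
    """Return the i-th variable-length vector from a flattened column."""
    # The vector starts where the previous one ended (or at 0 for the first).
    start = 0 if i == 0 else end_offsets[i - 1]
    return flat_data[start:end_offsets[i]]


# flat_data plays the role of VectorData, end_offsets the role of VectorIndex.
flat_data = [1, 2, 3, 4, 5, 6]
end_offsets = [2, 3, 6]

rows = [read_vector(flat_data, end_offsets, i) for i in range(len(end_offsets))]
```

Here `rows` holds one list per table row, so a single cell can contain any number of values while the stored data stays flat.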
- property description
a description for this column
- add_row(val)
Append a data value to this VectorData column
- Parameters:
val (None) – the value to add to this column
- get(key, **kwargs)
Retrieve elements from this VectorData
- Parameters:
key – Selection of the elements
kwargs – Ignored
- extend(ar, **kwargs)
Add all elements of the iterable ar to the end of this VectorData.
Each subclass of VectorData should have its own extend method to ensure functionality and efficiency.
- Parameters:
ar – The iterable to add to the end of this VectorData
- data_type = 'VectorData'
- namespace = 'hdmf-common'
- class hdmf.common.table.VectorIndex(name, data, target)
Bases: VectorData
When paired with a VectorData, this allows for storing arrays of varying length in a single cell of the DynamicTable by indexing into this VectorData. The first vector is at VectorData[0:VectorIndex[0]]. The second vector is at VectorData[VectorIndex[0]:VectorIndex[1]], and so on.
- Parameters:
name (str) – the name of this VectorIndex
data (ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator or DataIO) – a 1D dataset containing indexes that apply to VectorData object
target (VectorData) – the target dataset that this index applies to
- property target
the target dataset that this index applies to
- add_vector(arg, **kwargs)
Add the given data value to the target VectorData and append the corresponding index to this VectorIndex
- Parameters:
arg – The data value to be added to self.target
- add_row(arg, **kwargs)
Convenience function. Same as add_vector.
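What "append the corresponding index" means can be sketched with a toy class. This is a simplified model, not the hdmf implementation: it assumes the appended index is the length of the target after the new values are added, and `RaggedColumn` is a hypothetical name.

```python
class RaggedColumn:
    """Toy stand-in for a VectorData/VectorIndex pair (not the hdmf classes)."""

    def __init__(self):
        self.target = []  # flattened values, plays the role of VectorData
        self.index = []   # end offsets, plays the role of VectorIndex

    def add_vector(self, values):
        # Append the values to the flat target, then record where they end.
        self.target.extend(values)
        self.index.append(len(self.target))

    # Mirrors the documented relationship: add_row is the same as add_vector.
    add_row = add_vector


col = RaggedColumn()
col.add_vector([10, 11])
col.add_row([12])
col.add_vector([])  # an empty cell repeats the previous offset
```

Under this assumption an empty vector still adds one index entry, which keeps the number of index entries equal to the number of rows.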
- __getitem__(arg)
Select elements in this VectorIndex and retrieve the corresponding data from the self.target VectorData
- Parameters:
arg – slice or integer index indicating the elements we want to select in this VectorIndex
- Returns:
Scalar or list of values retrieved
- get(arg, **kwargs)
Select elements in this VectorIndex and retrieve the corresponding data from the self.target VectorData
- Parameters:
arg – slice or integer index indicating the elements we want to select in this VectorIndex
kwargs – any additional arguments to get method of the self.target VectorData
- Returns:
Scalar or list of values retrieved
- data_type = 'VectorIndex'
- namespace = 'hdmf-common'
- class hdmf.common.table.ElementIdentifiers(name, data=[])
Bases: Data
Data container with a list of unique identifiers for values within a dataset, e.g. rows of a DynamicTable.
- Parameters:
name (str) – the name of this ElementIdentifiers
data (ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator or DataIO) – a 1D dataset containing integer identifiers
- data_type = 'ElementIdentifiers'
- namespace = 'hdmf-common'
- class hdmf.common.table.DynamicTable(name, description, id=None, columns=None, colnames=None, target_tables=None)
Bases: Container
A column-based table. Columns are defined by the argument columns. This argument must be a list/tuple of VectorData and VectorIndex objects or a list/tuple of dicts containing the keys name and description that provide the name and description of each column in the table. Additionally, the keys index, table, and enum can be used for specifying additional structure to the table columns. Setting the key index to True can be used to indicate that the VectorData column will store a ragged array (i.e. will be accompanied with a VectorIndex). Setting the key table to True can be used to indicate that the column will store regions to another DynamicTable. Setting the key enum to True can be used to indicate that the column data will come from a fixed set of values.
Columns in DynamicTable subclasses can be statically defined by specifying the class attribute __columns__, rather than specifying them at runtime at the instance level. This is useful for defining a table structure that will get reused. The requirements for __columns__ are the same as the requirements described above for specifying table columns with the columns argument to the DynamicTable constructor.
- Parameters:
name (str) – the name of this table
description (str) – a description of what is in this table
id (ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator or DataIO or ElementIdentifiers) – the identifiers for this table
colnames (ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator) – the ordered names of the columns in this table. columns must also be provided.
target_tables (dict) – dict mapping DynamicTableRegion column name to the table that the DTR points to. The column is added to the table if it is not already present (i.e., when it is optional).
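The dict form of a column specification can be sketched as plain data. The example below is only an illustration of the keys described for this class: `column_kind` is a hypothetical helper, the column names are invented, and the precedence it gives to table over enum is a choice made for this sketch rather than documented behavior.

```python
def column_kind(spec):
    """Describe the structure implied by one column spec dict."""
    missing = {"name", "description"} - spec.keys()
    if missing:
        raise ValueError(f"column spec is missing keys: {sorted(missing)}")
    if spec.get("table"):
        kind = "region into another DynamicTable"
    elif spec.get("enum"):
        kind = "values from a fixed set"
    else:
        kind = "plain values"
    # index=True means the column is ragged (accompanied by a VectorIndex).
    if spec.get("index"):
        kind = "ragged " + kind
    return kind


# A tuple like this could be passed as `columns` or reused as `__columns__`.
columns = (
    {"name": "label", "description": "free-text label"},
    {"name": "samples", "description": "variable-length samples", "index": True},
    {"name": "source_rows", "description": "rows of another table",
     "table": True, "index": True},
    {"name": "quality", "description": "one of a few grades", "enum": True},
)

kinds = {spec["name"]: column_kind(spec) for spec in columns}
```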
- property description
a description of what is in this table
- property id
the identifiers for this table
- property colnames
the ordered names of the columns in this table. columns must also be provided.
- property columns
the columns in this table
- add_row(data=None, id=None, enforce_unique_id=False, check_ragged=True)
Add a row to the table. If id is not provided, it will auto-increment.
- Parameters:
data (dict) – the data to put in this row
id (int) – the ID for the row
enforce_unique_id (bool) – enforce that the id in the table must be unique
check_ragged (bool) – whether or not to check for ragged arrays when adding data to the table. Set to False to avoid checking every element if performance issues occur.
- add_column(name, description, data=[], table=False, index=False, enum=False, col_cls=<class 'hdmf.common.table.VectorData'>, check_ragged=True)
Add a column to this table.
If data is provided, it must contain the same number of rows as the current state of the table.
Extra keyword arguments will be passed to the constructor of the column class (“col_cls”).
- raises ValueError:
if the column has already been added to the table
- Parameters:
name (str) – the name of this VectorData
description (str) – a description for this column
data (ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator or DataIO) – a dataset where the first dimension is a concatenation of multiple vectors
table (bool or DynamicTable) – whether or not this is a table region or the table the region applies to
index (bool or VectorIndex or ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator or int) –
False (default): do not generate a VectorIndex
True: generate one empty VectorIndex
VectorIndex: Use the supplied VectorIndex
array-like of ints: Create a VectorIndex and use these values as the data
int: Recursively create n VectorIndex objects for a multi-ragged array
enum (bool or ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator) – whether or not this column contains data from a fixed set of elements
col_cls (type) – class to use to represent the column data. If table=True, this field is ignored and a DynamicTableRegion object is used. If enum=True, this field is ignored and an EnumData object is used.
check_ragged (bool) – whether or not to check for ragged arrays when adding data to the table. Set to False to avoid checking every element if performance issues occur.
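A multi-ragged array, as produced by passing an int for index, can be pictured as stacked levels of indexing. The sketch below models two levels in plain Python, assuming the outer index selects ranges of inner vectors and every index entry is an exclusive end offset. `segment` and `read_cell` are hypothetical helpers, not hdmf functions.

```python
def segment(values, end_offsets, i):
    """Return the i-th slice of `values` delimited by exclusive end offsets."""
    start = 0 if i == 0 else end_offsets[i - 1]
    return values[start:end_offsets[i]]


def read_cell(flat_data, inner_ends, outer_ends, row):
    """Return one doubly ragged cell as a list of lists."""
    # The outer index picks which inner vectors belong to this row...
    inner_ids = segment(list(range(len(inner_ends))), outer_ends, row)
    # ...and the inner index picks the flat values of each inner vector.
    return [segment(flat_data, inner_ends, j) for j in inner_ids]


flat_data = ["a", "b", "c", "d", "e"]
inner_ends = [2, 3, 5]  # inner vectors: [a, b], [c], [d, e]
outer_ends = [1, 3]     # row 0 -> inner vector 0; row 1 -> inner vectors 1 and 2

cells = [read_cell(flat_data, inner_ends, outer_ends, r)
         for r in range(len(outer_ends))]
```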
- create_region(name, region, description)
Create a DynamicTableRegion selecting a region (i.e., rows) in this DynamicTable.
- raises:
IndexError if the provided region contains invalid indices
- __getitem__(key)
- get(key, default=None, df=True, index=True, **kwargs)
Select a subset from the table.
If the table includes a DynamicTableRegion column, then by default, the index/indices of the DynamicTableRegion will be returned. If df=True and index=False, then the returned pandas DataFrame will contain a nested DataFrame in each row of the DynamicTableRegion column. If df=False and index=True, then a list of lists will be returned where the list containing the DynamicTableRegion column contains the indices of the DynamicTableRegion. Note that in this case, the DynamicTable referenced by the DynamicTableRegion can be accessed through the table attribute of the DynamicTableRegion object. df=False and index=False is not yet supported.
- Parameters:
key –
Key defining which elements of the table to select. This may be one of the following:
string with the name of the column to select
a tuple consisting of (int, str) where the int selects the row and the string identifies the column to select by name
int, list of ints, array, or slice selecting a set of full rows in the table. If an int is used, then scalars are returned for each column that has a single value. If a list, array, or slice is used and df=False, then lists are returned for each column, even if the list, array, or slice resolves to a single row.
- Returns:
If key is a string, then return the VectorData object representing the column with the string name
If key is a tuple of (int, str), then return the scalar value of the selected cell
If key is an int, list, np.ndarray, or slice, then return pandas.DataFrame or lists consisting of one or more rows
- Raises:
KeyError
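The key forms listed above can be summarized with a toy dispatcher. This is a simplified model only: it stores the table as a dict of column lists and returns plain dicts, whereas the real method returns VectorData objects, pandas DataFrames, or lists. `select` and the sample table are hypothetical.

```python
def select(table, key):
    """Toy selection over a dict mapping column name -> list of cells."""
    if isinstance(key, str):
        return table[key]            # whole column; unknown name -> KeyError
    if isinstance(key, tuple):
        row, name = key
        return table[name][row]      # single cell
    if isinstance(key, (int, slice)):
        # An int yields one scalar per column; a slice yields one list per column.
        return {name: col[key] for name, col in table.items()}
    # Any other iterable of ints selects those rows, again as lists.
    return {name: [col[i] for i in key] for name, col in table.items()}


table = {"id": [0, 1, 2], "label": ["a", "b", "c"]}

one_row = select(table, 1)               # scalars
one_row_slice = select(table, slice(1, 2))  # lists, even for a single row
```

The last two calls show the distinction drawn in the parameter description: an int gives scalars, while a slice that resolves to a single row still gives lists.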
- get_foreign_columns()
Determine the names of all columns that link to another DynamicTable, i.e., find all DynamicTableRegion type columns. Similar to a foreign key in a database, a DynamicTableRegion column references elements in another table.
- Returns:
List of strings with the column names
- has_foreign_columns()
Does the table contain DynamicTableRegion columns
- Returns:
True if the table contains a DynamicTableRegion column, else False
- get_linked_tables(other_tables=None)
Get the full list of all tables that are linked to, directly or indirectly, from this table via DynamicTableRegion columns included in this table or in any table that can be reached through DynamicTableRegion columns.
- Returns:
List of NamedTuple objects with:
‘source_table’ : The source table containing the DynamicTableRegion column
‘source_column’ : The relevant DynamicTableRegion column in the ‘source_table’
‘target_table’ : The target DynamicTable; same as source_column.table.
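The direct-and-indirect traversal can be sketched as a graph walk. The model below is an assumption-laden illustration: tables and columns are represented by names instead of objects, and `Link`, `linked_tables`, and `region_columns` are hypothetical. Only the three field names come from the description above.

```python
from collections import namedtuple

Link = namedtuple("Link", ["source_table", "source_column", "target_table"])


def linked_tables(start, region_columns):
    """Collect direct and indirect links reachable from `start`.

    `region_columns` maps a table name to {column name: target table name}.
    """
    links, seen, stack = [], {start}, [start]
    while stack:
        table = stack.pop()
        for column, target in region_columns.get(table, {}).items():
            links.append(Link(table, column, target))
            if target not in seen:  # visit each table once, even with cycles
                seen.add(target)
                stack.append(target)
    return links


region_columns = {
    "trials": {"electrodes": "electrode_table"},
    "electrode_table": {"group": "group_table"},
}
links = linked_tables("trials", region_columns)
```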
- to_dataframe(exclude=None, index=False)
Produce a pandas DataFrame containing this table’s data.
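If this table contains a DynamicTableRegion column, then by default (index=False) each row of that column holds a nested DataFrame rather than the indices of the DynamicTableRegion.
If exclude is None, this is equivalent to table.get(slice(None, None, None), index=False).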
- classmethod from_dataframe(df, name, index_column=None, table_description='', columns=None)
Construct an instance of DynamicTable (or a subclass) from a pandas DataFrame.
The columns of the resulting table are defined by the columns of the dataframe and the index by the dataframe’s index (make sure it has a name!) or by a column whose name is supplied to the index_column parameter. We recommend that you supply columns, a list/tuple of dictionaries containing the name and description of each column, to help others understand the contents of your table. See DynamicTable for more details on columns.
- Parameters:
df (DataFrame) – source DataFrame
name (str) – the name of this table
index_column (str) – if provided, this column will become the table’s index
table_description (str) – a description of what is in the resulting table
columns (list or tuple) – a list/tuple of dictionaries specifying columns in the table
- copy()
Return a copy of this DynamicTable. This is useful for linking.
- data_type = 'DynamicTable'
- namespace = 'hdmf-common'
- class hdmf.common.table.DynamicTableRegion(name, data, description, table=None)
Bases: VectorData
DynamicTableRegion provides a link from one table to an index or region of another. The table attribute is another DynamicTable, indicating which table is referenced. The data is int(s) indicating the row(s) (0-indexed) of the target array. DynamicTableRegions can be used to associate multiple rows with the same meta-data without data duplication. They can also be used to create hierarchical relationships between multiple DynamicTables. DynamicTableRegion objects may be paired with a VectorIndex object to create ragged references, so a single cell of a DynamicTable can reference many rows of another DynamicTable.
- Parameters:
name (str) – the name of this VectorData
data (ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator or DataIO) – a dataset where the first dimension is a concatenation of multiple vectors
description (str) – a description of what this region represents
table (DynamicTable) – the DynamicTable this region applies to
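How row references avoid duplicating metadata can be shown with plain lists. This sketch is a model of the idea, not hdmf code: the target rows, the reference values, and the helpers `resolve` and `resolve_many` are all hypothetical, and the ragged variant assumes exclusive end offsets.

```python
# Rows of the referenced table; each is stored exactly once.
target_rows = [
    {"name": "sensor_a"},
    {"name": "sensor_b"},
]

# One 0-indexed target row per referencing row; rows 0, 1 and 3 share metadata.
region_data = [0, 0, 1, 0]


def resolve(i):
    """Return the target row referenced by referencing row i."""
    return target_rows[region_data[i]]


# Ragged references: pairing the region data with end offsets lets a single
# cell point at several target rows.
ragged_region = [0, 1, 1]
ragged_ends = [2, 3]


def resolve_many(i):
    start = 0 if i == 0 else ragged_ends[i - 1]
    return [target_rows[r] for r in ragged_region[start:ragged_ends[i]]]
```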
- property table
The DynamicTable this DynamicTableRegion is pointing to
- __getitem__(arg)
- get(arg, index=False, df=True, **kwargs)
Subset the DynamicTableRegion
- Parameters:
arg –
Key defining which elements of the table to select. This may be one of the following:
string with the name of the column to select
a tuple consisting of (int, str) where the int selects the row and the string identifies the column to select by name
int, list of ints, array, or slice selecting a set of full rows in the table. If an int is used, then scalars are returned for each column that has a single value. If a list, array, or slice is used and df=False, then lists are returned for each column, even if the list, array, or slice resolves to a single row.
index – Boolean indicating whether to return indices of the DTR (default False)
df – Boolean indicating whether to return the result as a pandas DataFrame (default True)
- Returns:
Result from self.table[…] with the appropriate selection based on the rows selected by this DynamicTableRegion
- to_dataframe(**kwargs)
Convert the whole DynamicTableRegion to a pandas dataframe.
Keyword arguments are passed through to the to_dataframe method of DynamicTable that is being referenced (i.e., self.table). This allows specification of the ‘exclude’ parameter and any other parameters of DynamicTable.to_dataframe.
- property shape
Define the shape, i.e., (num_rows, num_columns) of the selected table region
- Returns:
Shape tuple with two integers indicating the number of rows and number of columns
- data_type = 'DynamicTableRegion'
- namespace = 'hdmf-common'
- class hdmf.common.table.EnumData(name, description, data=[], elements=[])
Bases: VectorData
An n-dimensional dataset that can contain elements from a fixed set of elements.
- Parameters:
name (str) – the name of this column
description (str) – a description for this column
data (ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator or DataIO) – integers that index into elements for the value of each row
elements (ndarray or list or tuple or Dataset or Array or StrDataset or HDMFDataset or AbstractDataChunkIterator or DataIO or VectorData) – lookup values for each integer in data
- property elements
lookup values for each integer in data
- data_type = 'EnumData'
- namespace = 'hdmf-experimental'
- __getitem__(arg)
- get(arg, index=False, join=False, **kwargs)
Return elements for the given argument.
- add_row(val, index=False)
Append a data value to this EnumData column
If an element is provided for val (i.e. index is False), the correct index value will be determined. Otherwise, val will be added as provided.
- Parameters:
val (None) – the value to add to this column
index (bool) – whether or not the value being added is an index
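The relationship between data, elements, and the index flag of add_row can be sketched with a toy class. This is a simplified model, not hdmf's EnumData: `EnumColumn` is a hypothetical name, and looking up the element with `list.index` (which raises ValueError for an unknown value) is a choice made for this sketch.

```python
class EnumColumn:
    """Toy model of an enumerated column (not hdmf's EnumData)."""

    def __init__(self, elements):
        self.elements = list(elements)  # lookup values
        self.data = []                  # integers indexing into elements

    def add_row(self, val, index=False):
        # With index=False, val is an element and its position is looked up;
        # with index=True, val is stored as provided.
        self.data.append(val if index else self.elements.index(val))

    def get(self, i):
        """Return the element for row i."""
        return self.elements[self.data[i]]


col = EnumColumn(["low", "medium", "high"])
col.add_row("high")          # stored as index 2
col.add_row(0, index=True)   # stored as given
```

Storing small integers and resolving them through `elements` keeps repeated values compact, which is the point of restricting a column to a fixed set.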