AlignedDynamicTable

This is a user guide to interacting with AlignedDynamicTable objects.

Introduction

The class AlignedDynamicTable represents a column-based table with support for grouping columns by category. AlignedDynamicTable inherits from DynamicTable and may contain additional DynamicTable objects, one per sub-category. All tables must align, i.e., they are required to have the same number of rows. Some key features of AlignedDynamicTable are:

support for custom categories, each of which is a DynamicTable stored as part of the AlignedDynamicTable,
support for interacting with category tables individually as well as treating the AlignedDynamicTable as a single large table, and
because AlignedDynamicTable is itself a DynamicTable, users can:
- Use DynamicTableRegion to reference rows in AlignedDynamicTable,
- Add custom columns to the AlignedDynamicTable, and
- Interact with AlignedDynamicTable, as well as the category tables (sub-tables) it contains, in the same fashion as with DynamicTable.

When to use (and not use) AlignedDynamicTable?

AlignedDynamicTable is a useful data structure, but it is also fairly complex: it consists of multiple DynamicTable objects, each of which is itself a complex type composed of many datasets and attributes. In general, if a simpler data structure is sufficient, use it instead. For example, consider:

DynamicTable if a regular table is sufficient.
A compound dataset via Table if all columns of the table are fixed and fast row-based access matters more than fast column-based access.
Multiple, separate tables if using AlignedDynamicTable would lead to duplication of data (i.e., de-normalized data), e.g., by having to replicate values across rows of the table.

Use AlignedDynamicTable when:

you need to group columns in a DynamicTable by category, or
you need to avoid name collisions between columns in a DynamicTable and creating compound columns is not an option.

Constructing a table

To create an AlignedDynamicTable, call the constructor with:

name: a string with the name of the table, and
description: a string to describe the table.

from hdmf.common import AlignedDynamicTable

customer_table = AlignedDynamicTable(
    name='customers',
    description='an example aligned table',
)

Initializing columns of the primary table

Adding data to and initializing an AlignedDynamicTable works the same way as for DynamicTable. See the DynamicTable How-To Guide for details. For example, using the columns and colnames parameters (which are inherited from DynamicTable), we can define the columns of the primary table. All columns must have the same length.

from hdmf.common import VectorData

col1 = VectorData(
    name='firstname',
    description='Customer first name',
    data=['Peter', 'Emma']
)
col2 = VectorData(
    name='lastname',
    description='Customer last name',
    data=['Williams', 'Brown']
)

customer_table = AlignedDynamicTable(
    name='customer',
    description='an example aligned table',
    columns=[col1, col2]
)

Initializing categories

By passing a list of DynamicTable objects as category_tables, we can specify the sub-category tables directly. Optionally, we can also pass categories, an array of strings with the names of the sub-tables, to define the ordering of the categories.

from hdmf.common import DynamicTable

# create the home_address category table
subcol1 = VectorData(
    name='city',
    description='city',
    data=['Rivercity', 'Mountaincity']
)
subcol2 = VectorData(
    name='street',
    description='street data',
    data=['Amazonstreet', 'Alpinestreet']
)
homeaddress_table = DynamicTable(
    name='home_address',
    description='home address of the customer',
    columns=[subcol1, subcol2]
)

# create the table
customer_table = AlignedDynamicTable(
    name='customer',
    description='an example aligned table',
    columns=[col1, col2],
    category_tables=[homeaddress_table, ]
)

# render the table in the online docs
customer_table.to_dataframe()

	customer		home_address
	firstname	lastname	id	city	street
(customer, id)
0	Peter	Williams	0	Rivercity	Amazonstreet
1	Emma	Brown	1	Mountaincity	Alpinestreet

Adding more data to the table

We can add rows, columns, and new categories to the table.

Adding a row

To add a row via add_row we can either: 1) provide the row data as a single dict to the data parameter or 2) pass keyword arguments, with a value for each column of the primary table and a dict for each category. Additional optional arguments include id and enforce_unique_id.

customer_table.add_row(
    firstname='Paul',
    lastname='Smith',
    home_address={'city': 'Bugcity',
                  'street': 'Beestree'}
)

# render the table in the online docs
customer_table.to_dataframe()

	customer		home_address
	firstname	lastname	id	city	street
(customer, id)
0	Peter	Williams	0	Rivercity	Amazonstreet
1	Emma	Brown	1	Mountaincity	Alpinestreet
2	Paul	Smith	2	Bugcity	Beestree

Adding a column

To add a column we use add_column.

customer_table.add_column(
    name='zipcode',
    description='zip code of the city',
    data=[11111, 22222, 33333],  # specify data for the 3 rows in the table
    category='home_address'  # use None (or omit) to add columns to the primary table
)

# render the table in the online docs
customer_table.to_dataframe()

	customer		home_address
	firstname	lastname	id	city	street	zipcode
(customer, id)
0	Peter	Williams	0	Rivercity	Amazonstreet	11111
1	Emma	Brown	1	Mountaincity	Alpinestreet	22222
2	Paul	Smith	2	Bugcity	Beestree	33333

Adding a category

To add a new DynamicTable as a category, we use add_category.

Note

Only regular DynamicTables are allowed as category tables. Using an AlignedDynamicTable as a category for another AlignedDynamicTable is currently not supported.

# create a new category DynamicTable for the work address
subcol1 = VectorData(
    name='city',
    description='city',
    data=['Busycity', 'Worktown', 'Labortown']
)
subcol2 = VectorData(
    name='street',
    description='street data',
    data=['Cannery Row', 'Woodwork Avenue', 'Steel Street']
)
subcol3 = VectorData(
    name='zipcode',
    description='zip code of the city',
    data=[33333, 44444, 55555])
workaddress_table = DynamicTable(
    name='work_address',
    description='work address of the customer',
    columns=[subcol1, subcol2, subcol3]
)

# add the category to our AlignedDynamicTable
customer_table.add_category(category=workaddress_table)

# render the table in the online docs
customer_table.to_dataframe()

	customer		home_address				work_address
	firstname	lastname	id	city	street	zipcode	id	city	street	zipcode
(customer, id)
0	Peter	Williams	0	Rivercity	Amazonstreet	11111	0	Busycity	Cannery Row	33333
1	Emma	Brown	1	Mountaincity	Alpinestreet	22222	1	Worktown	Woodwork Avenue	44444
2	Paul	Smith	2	Bugcity	Beestree	33333	2	Labortown	Steel Street	55555

Note

Because each category is stored as a separate DynamicTable there are no name collisions between the columns of the home_address and work_address tables, so that both can contain matching city, street, and zipcode columns. However, since a category table is a sub-part of the primary table, categories must not have the same name as other columns or other categories in the primary table.

Accessing categories, columns, rows, and cells

Convert to a pandas DataFrame

If we need to access the whole table for analysis, then converting the table to a pandas DataFrame is a convenient option. To omit the id columns of all category tables, set ignore_category_ids=True.

# render the table in the online docs while ignoring the id column of category tables
customer_table.to_dataframe(ignore_category_ids=True)

	customer		home_address			work_address
	firstname	lastname	city	street	zipcode	city	street	zipcode
(customer, id)
0	Peter	Williams	Rivercity	Amazonstreet	11111	Busycity	Cannery Row	33333
1	Emma	Brown	Mountaincity	Alpinestreet	22222	Worktown	Woodwork Avenue	44444
2	Paul	Smith	Bugcity	Beestree	33333	Labortown	Steel Street	55555

Accessing categories

# Get the list of all categories
_ = customer_table.categories

# Get the DynamicTable object of a particular category
_ = customer_table.get_category(name='home_address')

# Alternatively, we can use normal array slicing to get the category as a pandas DataFrame.
# NOTE: In contrast to the previous call, the table is here converted to a DataFrame.
_ = customer_table['home_address']

Accessing columns

We can use the standard Python in operator to check if a column exists.

# To check if a column exists in the primary table we only need to specify the column name
# or alternatively specify the category as None
_ = 'firstname' in customer_table
_ = (None, 'firstname') in customer_table
# To check if a column exists in a category table we need to specify the category
# and column name as a tuple
_ = ('home_address', 'zipcode') in customer_table

We can use standard array slicing to get the VectorData object of a column.

# To get a column from the primary table we just provide the name.
_ = customer_table['firstname']
# To get a column from a category table we provide both the category name and column name
_ = customer_table['home_address', 'city']

Accessing rows

Accessing rows works much as described in the DynamicTable How-To Guide.

# Get a single row by index as a DataFrame
customer_table[1]

	customer		home_address				work_address
	firstname	lastname	id	city	street	zipcode	id	city	street	zipcode
(customer, id)
1	Emma	Brown	1	Mountaincity	Alpinestreet	22222	1	Worktown	Woodwork Avenue	44444

# Get a range of rows as a DataFrame
customer_table[0:2]

	customer		home_address				work_address
	firstname	lastname	id	city	street	zipcode	id	city	street	zipcode
(customer, id)
0	Peter	Williams	0	Rivercity	Amazonstreet	11111	0	Busycity	Cannery Row	33333
1	Emma	Brown	1	Mountaincity	Alpinestreet	22222	1	Worktown	Woodwork Avenue	44444

# Get a list of rows as a DataFrame
customer_table[[0, 2]]

	customer		home_address				work_address
	firstname	lastname	id	city	street	zipcode	id	city	street	zipcode
(customer, id)
0	Peter	Williams	0	Rivercity	Amazonstreet	11111	0	Busycity	Cannery Row	33333
2	Paul	Smith	2	Bugcity	Beestree	33333	2	Labortown	Steel Street	55555

Accessing cells

To get a set of cells, we need to specify three things when slicing into the table: 1) the category, 2) the column, and 3) the row index.

When selecting from the primary table we need to specify None for the category, followed by the column name and the selection.

# Select rows 0:2 from the 'firstname' column in the primary table
customer_table[None, 'firstname', 0:2]

['Peter', 'Emma']

# Select row 1 from the 'firstname' column in the primary table
customer_table[None, 'firstname', 1]

'Emma'

# Select rows 0 and 2 from the 'firstname' column in the primary table
customer_table[None, 'firstname', [0, 2]]

['Peter', 'Paul']

# Select rows 0:2 from the 'city' column of the 'home_address' category table
customer_table['home_address', 'city', 0:2]

['Rivercity', 'Mountaincity']
