DynamicTable Tutorial
This is a tutorial for interacting with DynamicTable objects. It is written for beginners and does not describe the full capabilities and nuances of DynamicTable functionality. Please see the DynamicTable How-To Guide for more complete documentation. This tutorial is designed to give you basic familiarity with how DynamicTable works and to help you get started with creating a DynamicTable, adding columns and rows to a DynamicTable, and accessing data in a DynamicTable.
Introduction
The DynamicTable class represents a column-based table to which you can add custom columns. It consists of a name, a description, a list of row IDs, and a list of columns.
Constructing a table
To create a DynamicTable, call the constructor for DynamicTable with a string name and a string description.
from hdmf.common import DynamicTable

users_table = DynamicTable(
    name='users',
    description='a table containing data/metadata about users, one user per row',
)
Adding columns
You can add columns to a DynamicTable using DynamicTable.add_column.
users_table.add_column(
    name='first_name',
    description='the first name of the user',
)
users_table.add_column(
    name='last_name',
    description='the last name of the user',
)
Adding ragged array columns
You may want to add columns to your table that have a different number of entries per row. Such a column is called a “ragged array column”. To create one, pass index=True to DynamicTable.add_column.
users_table.add_column(
    name='phone_number',
    description='the phone number of the user',
    index=True,
)
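As a rough mental model, a ragged array column can be pictured as one flat list of values plus a list of cumulative end offsets, one offset per row. The sketch below is plain Python, not HDMF code; the function name `get_ragged_row` and the sample values are made up for illustration.

```python
def get_ragged_row(data, end_offsets, row):
    """Return the entries of one row from a flat list plus cumulative end offsets."""
    start = end_offsets[row - 1] if row > 0 else 0
    end = end_offsets[row]
    return data[start:end]


# Hypothetical sample values: row 0 has one phone number, row 1 has two.
flat_numbers = ['123-456-7890', '555-666-7777', '888-111-2222']
end_offsets = [1, 3]  # row 0 ends at position 1, row 1 ends at position 3

rows = [get_ragged_row(flat_numbers, end_offsets, i) for i in range(len(end_offsets))]
print(rows)  # [['123-456-7890'], ['555-666-7777', '888-111-2222']]
```

Storing offsets rather than nested lists keeps the values in a single contiguous sequence, which is why each row can hold a different number of entries.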
Adding rows
You can add rows to a DynamicTable using DynamicTable.add_row. You must pass in a keyword argument for every column in the table. Ragged array column arguments should be passed in as lists or numpy arrays. The ID of the row will automatically be set and incremented for every row, starting at 0.
# id will be set to 0 automatically
users_table.add_row(
    first_name='Grace',
    last_name='Hopper',
    phone_number=['123-456-7890'],
)
# id will be set to 1 automatically
users_table.add_row(
    first_name='Alan',
    last_name='Turing',
    phone_number=['555-666-7777', '888-111-2222'],
)
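The two rules for adding rows (one keyword argument per column, and IDs assigned automatically from 0) can be pictured with a toy table class. This is a plain-Python sketch under assumed names: `SketchTable` and its attributes are hypothetical, it is not how DynamicTable is implemented, and raising ValueError on a mismatch is this sketch's own choice.

```python
class SketchTable:
    """Toy column-based table for illustration; not hdmf's DynamicTable."""

    def __init__(self, name, description):
        self.name = name
        self.description = description
        self.ids = []       # one ID per row
        self.columns = {}   # column name -> list with one value per row

    def add_column(self, name):
        # Simplification: columns may only be added while the table is empty.
        if self.ids:
            raise ValueError('this sketch only adds columns to an empty table')
        self.columns[name] = []

    def add_row(self, **values):
        # Every column needs exactly one keyword argument.
        if set(values) != set(self.columns):
            raise ValueError(
                f'expected values for {sorted(self.columns)}, got {sorted(values)}'
            )
        self.ids.append(len(self.ids))  # IDs count up from 0
        for column_name, value in values.items():
            self.columns[column_name].append(value)


table = SketchTable(name='users', description='toy users table')
table.add_column('first_name')
table.add_column('last_name')
table.add_row(first_name='Grace', last_name='Hopper')
table.add_row(first_name='Alan', last_name='Turing')
print(table.ids)  # [0, 1]
```

Because each column is a list with one value per row, a row missing a column would leave the columns with unequal lengths, which is why the sketch rejects it before changing anything.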
Displaying the table contents as a pandas DataFrame
pandas is a popular data analysis tool for working with tabular data. Convert your DynamicTable to a pandas DataFrame using DynamicTable.to_dataframe.
users_df = users_table.to_dataframe()
Accessing the table as a DataFrame gives you pandas' powerful methods for indexing, selecting, and querying tabular data.
Get the “last_name” column as a pandas Series:
users_df['last_name']
id
0    Hopper
1    Turing
Name: last_name, dtype: object
The index of the DataFrame is automatically set to the table IDs. Get the row with ID = 0 as a pandas Series:
users_df.loc[0]
first_name               Grace
last_name               Hopper
phone_number    [123-456-7890]
Name: 0, dtype: object
Get single cells of the table by indexing with both ID and column name:
print('My first user:', users_df.loc[0, 'first_name'], users_df.loc[0, 'last_name'])
My first user: Grace Hopper
Adding columns that reference rows of other DynamicTable objects
You can create a column that references rows of another DynamicTable. This is analogous to a foreign key in a relational database. To do this, use the table keyword argument for DynamicTable.add_column and set it to the other table.
# create a new table of users
users_table = DynamicTable(
    name='users',
    description='a table containing data/metadata about users, one user per row',
)
# add simple columns to this table
users_table.add_column(
    name='first_name',
    description='the first name of the user',
)
users_table.add_column(
    name='last_name',
    description='the last name of the user',
)
# create a new table of addresses to reference
addresses_table = DynamicTable(
    name='addresses',
    description='a table containing data/metadata about addresses, one address per row',
)
addresses_table.add_column(
    name='street_address',
    description='the street number and address',
)
addresses_table.add_column(
    name='city',
    description='the city of the address',
)
# add rows to the addresses table
addresses_table.add_row(
    street_address='123 Main St',
    city='Springfield',
)
addresses_table.add_row(
    street_address='45 British Way',
    city='London',
)
# add a column to the users table that references rows of the addresses table
users_table.add_column(
    name='address',
    description='the address of the user',
    table=addresses_table,
)
# add rows to the users table
users_table.add_row(
    first_name='Grace',
    last_name='Hopper',
    address=0,  # <-- row index of the addresses table
)
users_table.add_row(
    first_name='Alan',
    last_name='Turing',
    address=1,  # <-- row index of the addresses table
)
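Because the address column stores row indices, looking up a user's address is a positional lookup into the other table, much like following a foreign key. The sketch below shows the idea with plain dictionaries of lists rather than DynamicTable objects; the helper `resolve_row` is hypothetical.

```python
def resolve_row(columns, row_index):
    """Collect the values at one row position from a dict of column lists."""
    return {name: values[row_index] for name, values in columns.items()}


# Plain-Python stand-ins for the two tables above (not DynamicTable objects).
addresses = {
    'street_address': ['123 Main St', '45 British Way'],
    'city': ['Springfield', 'London'],
}
users = {
    'first_name': ['Grace', 'Alan'],
    'last_name': ['Hopper', 'Turing'],
    'address': [0, 1],  # row indices into the addresses columns
}

for row in range(len(users['address'])):
    address = resolve_row(addresses, users['address'][row])
    print(users['first_name'][row], users['last_name'][row], '->', address['city'])
```

Storing an index instead of a copy of the address means several users could point at the same address row without duplicating its data.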
Displaying the contents of a table with references to another table
Earlier, we converted a DynamicTable to a DataFrame using DynamicTable.to_dataframe and printed the DataFrame to see its contents. This also works when the DynamicTable contains a column that references another table. However, each entry in that column will be printed as a nested DataFrame, which can be difficult to read. To view only the row indices of the referenced table, pass index=True to DynamicTable.to_dataframe.
users_df = users_table.to_dataframe(index=True)
You can then access the referenced table using the table attribute of the column object. This is useful when reading a table from a file where you may not have a variable to access the referenced table.
First, use DynamicTable.__getitem__ (square brackets notation) to get the DynamicTableRegion object representing the column. Then access its table attribute to get the addresses table and convert the table to a DataFrame.
address_column = users_table['address']
read_addresses_table = address_column.table
addresses_df = read_addresses_table.to_dataframe()
Get the addresses corresponding to the rows of the users table:
address_indices = users_df['address'] # pandas Series of row indices into the addresses table
addresses_df.iloc[address_indices] # use .iloc because these are row indices not ID values
Note
The indices returned by users_df['address'] are row indices and not the ID values of the table. However, if you are using default IDs, these values will be the same.
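The difference between row indices and ID values only shows up when the IDs are not the default 0, 1, 2, and so on. The plain-Python sketch below uses hypothetical IDs 10 and 20 to show the two kinds of lookup side by side; the variable names are illustrative and no HDMF or pandas calls are involved.

```python
# Hypothetical addresses table with non-default IDs.
address_ids = [10, 20]
cities = ['Springfield', 'London']

# Lookup by row index (position), analogous to DataFrame.iloc.
row_index = 1
city_by_position = cities[row_index]

# Lookup by ID value (label), analogous to DataFrame.loc.
id_value = 10
city_by_id = cities[address_ids.index(id_value)]

print(city_by_position)  # London
print(city_by_id)        # Springfield

# With default IDs the two lookups agree, because ID i sits at row i.
default_ids = list(range(len(cities)))
assert all(default_ids.index(i) == i for i in default_ids)
```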
You now know the basics of creating DynamicTable objects and reading data from them, including tables that have ragged array columns and references to other tables. Learn more about working with DynamicTable in the DynamicTable How-To Guide, including:
ragged array columns with references to other tables
nested ragged array columns
columns with multidimensional array data
columns with enumerated (categorical) data
accessing data and properties from the column objects directly
writing and reading tables to a file
writing expandable tables
defining subclasses of DynamicTable