hdmf.common.hierarchicaltable module

Module providing additional functionality for dealing with hierarchically nested tables, i.e., tables containing DynamicTableRegion references.

hdmf.common.hierarchicaltable.to_hierarchical_dataframe(dynamic_table)

Create a hierarchical pandas.DataFrame that represents all data from a collection of linked DynamicTables.

LIMITATIONS: Currently this function only supports DynamicTables with a single DynamicTableRegion column. If a table has more than one DynamicTableRegion column then the function will expand only the first DynamicTableRegion column found for each table. Any additional DynamicTableRegion columns will remain nested.

NOTE: Some useful functions for further processing of the generated DataFrame include:

  • pandas.DataFrame.reset_index to turn the data from the pandas.MultiIndex into columns

  • drop_id_columns to remove all ‘id’ columns

  • flatten_column_index to flatten the column index

Parameters:

dynamic_table (DynamicTable) – DynamicTable object to be converted to a hierarchical pandas.DataFrame

Returns:

Hierarchical pandas.DataFrame, usually with a pandas.MultiIndex on both the index and the columns.

Return type:

DataFrame
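As an illustration of the reset_index step listed in the note above, the following sketch uses a small hand-built DataFrame as a stand-in for the output of to_hierarchical_dataframe. The table names ('parent', 'child'), the column names, and the values are assumptions made up for this example; the exact layout produced by hdmf may differ.

```python
import pandas as pd

# Hand-built stand-in for a hierarchical DataFrame.
# All table names, column names, and values here are hypothetical.
index = pd.MultiIndex.from_tuples(
    [(0, "a"), (0, "a"), (1, "b")],
    names=[("parent", "id"), ("parent", "label")],
)
columns = pd.MultiIndex.from_tuples([("child", "id"), ("child", "value")])
df = pd.DataFrame([[0, 10], [1, 20], [2, 30]], index=index, columns=columns)

# Move every index level into a regular column.
# The row index is replaced by a default RangeIndex.
flat = df.reset_index()
print(list(flat.columns))
```

After the call, the former index levels appear as ordinary columns ahead of the data columns, which is often more convenient for filtering and plotting.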

hdmf.common.hierarchicaltable.drop_id_columns(dataframe, inplace=False)

Drop all columns named ‘id’ from the table.

If a column name is a tuple, the function drops any column whose innermost name is ‘id’. The ‘id’ columns of a DynamicTable are in many cases not needed for analysis or display. This function makes it easy to filter out all of those columns.

Raises:

TypeError – If the dataframe parameter is not a pandas.DataFrame.

Parameters:
  • dataframe (DataFrame) – Pandas dataframe to update (usually generated by the to_hierarchical_dataframe function)

  • inplace (bool) – Update the dataframe in place or return a modified copy

Returns:

pandas.DataFrame with the id columns removed

Return type:

DataFrame
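The column-selection rule described above can be sketched with plain pandas. This is only an illustration of the documented behavior, not the hdmf implementation: drop_id_columns_sketch is a hypothetical helper, and the example table and column names are made up. It always returns a copy and does not model the inplace option.

```python
import pandas as pd


def drop_id_columns_sketch(dataframe):
    """Hypothetical helper: return a copy without columns whose innermost name is 'id'."""
    if not isinstance(dataframe, pd.DataFrame):
        raise TypeError("dataframe must be a pandas.DataFrame")

    def innermost(col):
        # Tuple labels come from a MultiIndex; plain labels are used as-is.
        return col[-1] if isinstance(col, tuple) else col

    to_drop = [col for col in dataframe.columns if innermost(col) == "id"]
    return dataframe.drop(columns=to_drop)


# Hypothetical two-level columns, similar in shape to a hierarchical table.
columns = pd.MultiIndex.from_tuples(
    [("parent", "id"), ("parent", "label"), ("child", "id"), ("child", "value")]
)
df = pd.DataFrame([[0, "a", 0, 10], [1, "b", 1, 20]], columns=columns)

without_ids = drop_id_columns_sketch(df)
print(list(without_ids.columns))
```

The original df is left unchanged, which corresponds to the default inplace=False behavior.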

hdmf.common.hierarchicaltable.flatten_column_index(dataframe, max_levels=None, inplace=False)

Flatten the column index of a pandas DataFrame.

The function changes dataframe.columns from a pandas.MultiIndex to a normal Index, with each column usually being identified by a tuple of strings. This function is typically used in conjunction with DataFrames generated by to_hierarchical_dataframe.

Raises:
  • ValueError – If max_levels is not >0.

  • TypeError – If the dataframe parameter is not a pandas.DataFrame.

Parameters:
  • dataframe (DataFrame) – Pandas dataframe to update (usually generated by the to_hierarchical_dataframe function)

  • max_levels (int or integer) – Maximum number of levels to use in the resulting column Index. Value must be >0. NOTE: When limiting the number of levels, the function simply removes levels from the beginning. As such, removing levels may result in columns with duplicate names.

  • inplace (bool) – Update the dataframe in place or return a modified copy

Returns:

pandas.DataFrame with a regular pandas.Index for the columns rather than a pandas.MultiIndex

Return type:

DataFrame
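The flattening and the max_levels caveat can be sketched with plain pandas as follows. This is a minimal illustration under assumptions, not the hdmf implementation: flatten_columns_sketch is a hypothetical helper, the example names are made up, and the sketch always returns a copy.

```python
import pandas as pd


def flatten_columns_sketch(dataframe, max_levels=None):
    """Hypothetical helper: return a copy whose columns are a flat Index of tuples."""
    if not isinstance(dataframe, pd.DataFrame):
        raise TypeError("dataframe must be a pandas.DataFrame")
    if max_levels is not None and max_levels <= 0:
        raise ValueError("max_levels must be > 0")

    result = dataframe.copy()
    if not isinstance(dataframe.columns, pd.MultiIndex):
        return result  # nothing to flatten

    labels = list(dataframe.columns.to_flat_index())
    num_levels = dataframe.columns.nlevels
    if max_levels is not None and max_levels < num_levels:
        # Remove levels from the beginning, keeping the innermost ones.
        labels = [label[num_levels - max_levels:] for label in labels]

    # tupleize_cols=False keeps the tuples as labels instead of rebuilding a MultiIndex.
    result.columns = pd.Index(labels, tupleize_cols=False)
    return result


# Hypothetical two-level columns, similar in shape to a hierarchical table.
columns = pd.MultiIndex.from_tuples(
    [("parent", "id"), ("parent", "label"), ("child", "id"), ("child", "value")]
)
df = pd.DataFrame([[0, "a", 0, 10], [1, "b", 1, 20]], columns=columns)

flat = flatten_columns_sketch(df)
trimmed = flatten_columns_sketch(df, max_levels=1)
print(list(flat.columns))
print(list(trimmed.columns))
```

Keeping only the last level turns both ‘id’ columns into the same label, which is the duplicate-name situation the max_levels note warns about.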