hdmf.common.hierarchicaltable module

Module providing additional functionality for dealing with hierarchically nested tables, i.e., tables containing DynamicTableRegion references.

hdmf.common.hierarchicaltable.to_hierarchical_dataframe(dynamic_table)

Create a hierarchical pandas.DataFrame that represents all data from a collection of linked DynamicTables.

LIMITATIONS: Currently this function only supports DynamicTables with a single DynamicTableRegion column. If a table has more than one DynamicTableRegion column then the function will expand only the first DynamicTableRegion column found for each table. Any additional DynamicTableRegion columns will remain nested.

NOTE: Some useful functions for further processing of the generated DataFrame include:

  • pandas.DataFrame.reset_index to turn the data from the pandas.MultiIndex into columns

  • drop_id_columns to remove all ‘id’ columns

  • flatten_column_index to flatten the column index

Parameters:

dynamic_table (DynamicTable) – DynamicTable object to be converted to a hierarchical pandas.DataFrame

Returns:

Hierarchical pandas.DataFrame, usually with a pandas.MultiIndex on both the index and the columns.

Return type:

DataFrame
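The reset_index step mentioned in the note above can be tried out on a hand-built stand-in for a hierarchical DataFrame. This is a minimal sketch using pandas only: the table and column names ("parent", "child", "label", "value") are made up for illustration, and the frame is constructed by hand rather than returned by to_hierarchical_dataframe.

```python
import pandas as pd

# Hand-built stand-in for a hierarchical DataFrame (assumed layout):
# the row index carries the parent table's columns, and the column
# index carries the child table's columns, both as MultiIndex objects.
index = pd.MultiIndex.from_tuples(
    [(0, "a"), (0, "a"), (1, "b")],
    names=[("parent", "id"), ("parent", "label")],
)
columns = pd.MultiIndex.from_tuples([("child", "id"), ("child", "value")])
hier_df = pd.DataFrame(
    [[0, 1.5], [1, 2.5], [2, 3.5]],
    index=index,
    columns=columns,
)

# reset_index moves the index levels into ordinary columns and returns
# a new frame with a default integer index; hier_df is left unchanged.
flat_df = hier_df.reset_index()
print(flat_df.shape)
```

After this step the parent data sits next to the child data as regular columns, which is usually easier to filter or export.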

hdmf.common.hierarchicaltable.drop_id_columns(dataframe, inplace=False)

Drop all columns named ‘id’ from the table.

If a column name is a tuple, the function drops any column whose innermost name is ‘id’. The ‘id’ columns of a DynamicTable are in many cases not needed for analysis or display. This function makes it easy to filter out all of those columns.

raises TypeError:

If the dataframe parameter is not a pandas.DataFrame.

Parameters:
  • dataframe (DataFrame) – Pandas dataframe to update (usually generated by the to_hierarchical_dataframe function)

  • inplace (bool) – Update the dataframe in place or return a modified copy

Returns:

pandas.DataFrame with the id columns removed

Return type:

DataFrame
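The selection rule described above (drop a column when the innermost name of its label is ‘id’) can be sketched in plain pandas. The helper drop_innermost_id and the example column names are hypothetical and only mirror the documented rule; this is not the library's implementation, and in practice one would call drop_id_columns directly.

```python
import pandas as pd


def drop_innermost_id(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy without columns whose innermost name is 'id'."""

    def innermost(label):
        # Tuple labels come from a MultiIndex; plain labels are used as-is.
        return label[-1] if isinstance(label, tuple) else label

    keep = [innermost(label) != "id" for label in df.columns]
    return df.loc[:, keep].copy()


# Made-up two-level column index of the form (table name, column name).
columns = pd.MultiIndex.from_tuples(
    [("parent", "id"), ("parent", "label"), ("child", "id"), ("child", "value")]
)
df = pd.DataFrame([[0, "a", 0, 1.5], [1, "b", 2, 3.5]], columns=columns)

result = drop_innermost_id(df)
print(list(result.columns))  # [('parent', 'label'), ('child', 'value')]
```

The sketch always returns a copy, which corresponds to the inplace=False behavior described for the documented function.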

hdmf.common.hierarchicaltable.flatten_column_index(dataframe, max_levels=None, inplace=False)

Flatten the column index of a pandas DataFrame.

The function changes dataframe.columns from a pandas.MultiIndex to a regular Index, with each column usually identified by a tuple of strings. This function is typically used in conjunction with DataFrames generated by to_hierarchical_dataframe.

raises ValueError:

If max_levels is not >0.

raises TypeError:

If the dataframe parameter is not a pandas.DataFrame.

Parameters:
  • dataframe (DataFrame) – Pandas dataframe to update (usually generated by the to_hierarchical_dataframe function)

  • max_levels (int or integer) – Maximum number of levels to use in the resulting column Index. Value must be >0. NOTE: When limiting the number of levels, the function simply removes levels from the beginning. As such, removing levels may result in columns with duplicate names.

  • inplace (bool) – Update the dataframe in place or return a modified copy

Returns:

pandas.DataFrame whose columns are a regular pandas.Index rather than a pandas.MultiIndex

Return type:

DataFrame
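The note on max_levels is easiest to see on a small example. The following sketch uses plain pandas to flatten a MultiIndex and then keeps only the last level of each label, showing how dropping leading levels can produce duplicate names. The helper keep_last_levels and the column names are hypothetical; the sketch illustrates the documented truncation rule and is not the library's implementation, so details such as the exact label type returned for a single level may differ.

```python
import pandas as pd

# Made-up two-level column index of the form (table name, column name).
columns = pd.MultiIndex.from_tuples(
    [("parent", "id"), ("parent", "value"), ("child", "id"), ("child", "value")]
)
df = pd.DataFrame([[0, 1.0, 0, 1.5]], columns=columns)

# Full flattening: a regular Index whose entries are tuples of strings.
flat_df = df.copy()
flat_df.columns = df.columns.to_flat_index()
flat_labels = list(flat_df.columns)


def keep_last_levels(labels, max_levels):
    """Hypothetical helper: keep only the last `max_levels` entries of each tuple label."""
    if max_levels <= 0:
        raise ValueError("max_levels must be > 0")
    return [label[-max_levels:] for label in labels]


# Keeping one level removes the table name from the front of each label,
# so columns from different tables collapse onto the same name.
truncated = keep_last_levels(flat_labels, 1)
has_duplicates = len(set(truncated)) < len(truncated)
print(truncated, has_duplicates)
```

When duplicates like these are a concern, keeping enough levels to retain the table name avoids ambiguous column lookups.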