hdmf.common.hierarchicaltable module
Module providing additional functionality for dealing with hierarchically nested tables, i.e., tables containing DynamicTableRegion references.
- hdmf.common.hierarchicaltable.to_hierarchical_dataframe(dynamic_table)
Create a hierarchical pandas.DataFrame that represents all data from a collection of linked DynamicTables.
LIMITATIONS: Currently this function only supports DynamicTables with a single DynamicTableRegion column. If a table has more than one DynamicTableRegion column then the function will expand only the first DynamicTableRegion column found for each table. Any additional DynamicTableRegion columns will remain nested.
NOTE: Some useful functions for further processing of the generated DataFrame include:
pandas.DataFrame.reset_index to turn the data from the pandas.MultiIndex into columns
drop_id_columns to remove all ‘id’ columns
flatten_column_index to flatten the column index
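As a rough illustration of the reset_index step mentioned in the note, the sketch below uses a small hand-built DataFrame as a stand-in for the output of to_hierarchical_dataframe. The table names ("parent", "child"), the column labels, and the variable names are assumptions made for this example only; the real labels depend on the linked DynamicTables.

```python
import pandas as pd

# Hypothetical stand-in for a hierarchical DataFrame: a MultiIndex on the
# rows (columns of the parent table) and on the columns (the child table).
index = pd.MultiIndex.from_tuples(
    [(0, "a"), (0, "a"), (1, "b")],
    names=[("parent", "id"), ("parent", "label")],
)
columns = pd.MultiIndex.from_tuples(
    [("child", "id"), ("child", "value")],
    names=["source_table", "label"],
)
hier_df = pd.DataFrame(
    [[0, 1.5], [1, 2.5], [2, 3.5]], index=index, columns=columns
)

# reset_index moves the levels of the row MultiIndex into regular columns
# and leaves a default integer index on the rows.
flat_rows = hier_df.reset_index()
print(flat_rows.shape)  # (3, 4)
```

After this step the parent-table values are ordinary columns, so they can be filtered or dropped like any other column.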
- Parameters:
dynamic_table (DynamicTable) – DynamicTable object to be converted to a hierarchical pandas.DataFrame
- Returns:
Hierarchical pandas.DataFrame, usually with a pandas.MultiIndex on both the index and columns.
- Return type:
DataFrame
- hdmf.common.hierarchicaltable.drop_id_columns(dataframe, inplace=False)
Drop all columns named ‘id’ from the table.
If a column name is a tuple, the function will drop any column for which the inner-most name is ‘id’. The ‘id’ columns of a DynamicTable are in many cases not necessary for analysis or display. This function allows us to easily filter out all those columns.
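The documented behavior can be approximated with plain pandas. The sketch below is not the hdmf implementation: drop_innermost_id is a hypothetical helper written for illustration, it always returns a copy (no inplace option), and the example column labels are assumptions.

```python
import pandas as pd


def drop_innermost_id(df):
    """Hypothetical helper (not part of hdmf): return a copy without 'id' columns."""
    if not isinstance(df, pd.DataFrame):
        raise TypeError("df must be a pandas.DataFrame")
    # For tuple labels compare the last (inner-most) element,
    # otherwise compare the label itself.
    to_drop = [
        label
        for label in df.columns
        if (label[-1] if isinstance(label, tuple) else label) == "id"
    ]
    return df.drop(columns=to_drop)


# Assumed example data with two-level column labels.
df = pd.DataFrame(
    [[0, "a", 10, 1.5]],
    columns=pd.MultiIndex.from_tuples(
        [("parent", "id"), ("parent", "label"), ("child", "id"), ("child", "value")]
    ),
)
trimmed = drop_innermost_id(df)
print(list(trimmed.columns))  # [('parent', 'label'), ('child', 'value')]
```

With hdmf installed, the corresponding call would be drop_id_columns(dataframe), using the signature documented above.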
- raises TypeError:
If the dataframe parameter is not a pandas.DataFrame.
- hdmf.common.hierarchicaltable.flatten_column_index(dataframe, max_levels=None, inplace=False)
Flatten the column index of a pandas DataFrame.
The function changes dataframe.columns from a pandas.MultiIndex to a normal Index, with each column usually being identified by a tuple of strings. This function is typically used in conjunction with DataFrames generated by
to_hierarchical_dataframe
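To make the effect of flattening and of limiting the number of levels concrete, here is a pandas-only sketch of the behavior described in this entry. It is an approximation and not the hdmf implementation: flatten_columns is a hypothetical helper, it always works on a copy, and the example labels are assumptions.

```python
import pandas as pd


def flatten_columns(df, max_levels=None):
    """Hypothetical helper (not part of hdmf): flatten MultiIndex columns on a copy."""
    if max_levels is not None and max_levels <= 0:
        raise ValueError("max_levels must be > 0")
    out = df.copy()
    # to_flat_index turns the MultiIndex into a regular Index of tuples.
    flat = out.columns.to_flat_index()
    if max_levels is not None:
        # Keep only the last max_levels entries of each label,
        # i.e. remove levels from the beginning.
        flat = pd.Index(
            [label[-max_levels:] for label in flat], tupleize_cols=False
        )
    out.columns = flat
    return out


# Assumed example data with two-level column labels.
df = pd.DataFrame(
    [[0, 10, 1.5]],
    columns=pd.MultiIndex.from_tuples(
        [("parent", "id"), ("child", "id"), ("child", "value")]
    ),
)
flat_df = flatten_columns(df)
short_df = flatten_columns(df, max_levels=1)
print(list(short_df.columns))  # [('id',), ('id',), ('value',)]
```

The second call shows why limiting the number of levels can produce duplicate column names: once the leading table name is removed, both ‘id’ columns carry the same label.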
- raises ValueError:
If max_levels is not >0.
- raises TypeError:
If the dataframe parameter is not a pandas.DataFrame.
- Parameters:
dataframe (DataFrame) – Pandas dataframe to update (usually generated by the to_hierarchical_dataframe function)
max_levels (int or integer) – Maximum number of levels to use in the resulting column Index. NOTE: When limiting the number of levels the function simply removes levels from the beginning. As such, removing levels may result in columns with duplicate names. Value must be >0.
inplace (bool) – Update the dataframe inplace or return a modified copy
- Returns:
pandas.DataFrame with a regular pandas.Index for the columns rather than a pandas.MultiIndex
- Return type:
DataFrame