.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "tutorials/plot_dynamictable_howto.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note :ref:`Go to the end ` to download the full example code. .. rst-class:: sphx-glr-example-title .. _sphx_glr_tutorials_plot_dynamictable_howto.py: .. _dynamictable-howtoguide: DynamicTable How-To Guide ========================= This is a user guide to interacting with ``DynamicTable`` objects. .. GENERATED FROM PYTHON SOURCE LINES 13-21 Introduction ------------ The :py:class:`~hdmf.common.table.DynamicTable` class represents a column-based table to which you can add custom columns. It consists of a name, a description, a list of row IDs, and a list of columns. Columns are represented by objects of the class :py:class:`~hdmf.common.table.VectorData`, including subclasses of :py:class:`~hdmf.common.table.VectorData`, such as :py:class:`~hdmf.common.table.VectorIndex`, and :py:class:`~hdmf.common.table.DynamicTableRegion`. .. GENERATED FROM PYTHON SOURCE LINES 23-28 Constructing a table -------------------- To create a :py:class:`~hdmf.common.table.DynamicTable`, call the constructor for :py:class:`~hdmf.common.table.DynamicTable` with a string ``name`` and string ``description``. Specifying the arguments with keywords is recommended. .. GENERATED FROM PYTHON SOURCE LINES 28-36 .. code-block:: Python from hdmf.common import DynamicTable table = DynamicTable( name='my_table', description='an example table', ) .. GENERATED FROM PYTHON SOURCE LINES 38-52 Initializing columns -------------------- You can create a :py:class:`~hdmf.common.table.DynamicTable` with particular columns by passing a list or tuple of :py:class:`~hdmf.common.table.VectorData` objects for the ``columns`` argument in the constructor. If the :py:class:`~hdmf.common.table.VectorData` objects contain data values, then each :py:class:`~hdmf.common.table.VectorData` object must contain the same number of rows as each other. A list of row IDs may be passed into the :py:class:`~hdmf.common.table.DynamicTable` constructor using the ``id`` argument. If IDs are passed in, there should be the same number of rows as the column data. If IDs are not passed in, then the IDs will be set to ``range(len(column_data))`` by default. .. GENERATED FROM PYTHON SOURCE LINES 52-81 .. code-block:: Python from hdmf.common import VectorData, VectorIndex col1 = VectorData( name='col1', description='column #1', data=[1, 2], ) col2 = VectorData( name='col2', description='column #2', data=['a', 'b'], ) # this table will have two rows with ids 0 and 1 table = DynamicTable( name='my table', description='an example table', columns=[col1, col2], ) # this table will have two rows with ids 0 and 1 table_set_ids = DynamicTable( name='my table', description='an example table', columns=[col1, col2], id=[0, 1], ) .. GENERATED FROM PYTHON SOURCE LINES 82-86 If a list of integers in passed to ``id``, :py:class:`~hdmf.common.table.DynamicTable` automatically creates an :py:class:`~hdmf.common.table.ElementIdentifiers` object, which is the data type that stores row IDs. The above command is equivalent to: .. GENERATED FROM PYTHON SOURCE LINES 86-96 .. code-block:: Python from hdmf.common.table import ElementIdentifiers table_set_ids = DynamicTable( name='my table', description='an example table', columns=[col1, col2], id=ElementIdentifiers(name='id', data=[0, 1]), ) .. GENERATED FROM PYTHON SOURCE LINES 97-103 Adding rows ----------- You can also add rows to a :py:class:`~hdmf.common.table.DynamicTable` using :py:meth:`DynamicTable.add_row `. A keyword argument for every column in the table must be supplied. You may also supply an optional row ID. .. GENERATED FROM PYTHON SOURCE LINES 103-110 .. code-block:: Python table.add_row( col1=3, col2='c', id=2, ) .. GENERATED FROM PYTHON SOURCE LINES 111-116 .. note:: If no ID is supplied, the row ID is automatically set to the number of rows of the table prior to adding the new row. This can result in duplicate IDs. In general, IDs should be unique, but this is not enforced by default. Pass `enforce_unique_id=True` to :py:meth:`DynamicTable.add_row ` to raise an error if the ID is set to an existing ID value. .. GENERATED FROM PYTHON SOURCE LINES 116-123 .. code-block:: Python # this row will have ID 3 by default table.add_row( col1=4, col2='d', ) .. GENERATED FROM PYTHON SOURCE LINES 124-130 Adding columns -------------- You can add columns to a :py:class:`~hdmf.common.table.DynamicTable` using :py:meth:`DynamicTable.add_column `. If the table already has rows, then the ``data`` argument must be supplied as a list of values, one for each row already in the table. .. GENERATED FROM PYTHON SOURCE LINES 130-137 .. code-block:: Python table.add_column( name='col3', description='column #3', data=[True, True, False, True], # specify data for the 4 rows in the table ) .. GENERATED FROM PYTHON SOURCE LINES 138-151 Enumerated (categorical) data ----------------------------- :py:class:`~hdmf.common.table.EnumData` is a special type of column for storing an enumerated data type. This way each unique value is stored once, and the data references those values by index. Using this method is more efficient than storing a single value many times, and has the advantage of communicating to downstream tools that the data is categorical in nature. .. warning:: :py:class:`~hdmf.common.table.EnumData` is currently an experimental feature and as such should not be used for production use. .. GENERATED FROM PYTHON SOURCE LINES 151-170 .. code-block:: Python from hdmf.common.table import EnumData import warnings warnings.filterwarnings(action="ignore", message="EnumData is experimental") # this column has a length of 5, not 3. the first row has value "aa" enum_col = EnumData( name='cell_type', description='this column holds categorical variables', data=[0, 1, 2, 1, 0], elements=['aa', 'bb', 'cc'] ) my_table = DynamicTable( name='my_table', description='an example table', columns=[enum_col], ) .. GENERATED FROM PYTHON SOURCE LINES 171-181 Ragged array columns -------------------- A table column with a different number of elements for each row is called a "ragged array column". To initialize a :py:class:`~hdmf.common.table.DynamicTable` with a ragged array column, pass both the :py:class:`~hdmf.common.table.VectorIndex` and its target :py:class:`~hdmf.common.table.VectorData` in for the ``columns`` argument in the constructor. For instance, the following code creates a column called ``col1`` where the first cell is ['1a', '1b', '1c'] and the second cell is ['2a']. .. GENERATED FROM PYTHON SOURCE LINES 181-201 .. code-block:: Python col1 = VectorData( name='col1', description='column #1', data=['1a', '1b', '1c', '2a'], ) # the 3 signifies that elements 0 to 3 (exclusive) of the target column belong to the first row # the 4 signifies that elements 3 to 4 (exclusive) of the target column belong to the second row col1_ind = VectorIndex( name='col1_index', target=col1, data=[3, 4], ) table_ragged_col = DynamicTable( name='my table', description='an example table', columns=[col1, col1_ind], ) .. GENERATED FROM PYTHON SOURCE LINES 202-205 .. note:: By convention, the name of the :py:class:`~hdmf.common.table.VectorIndex` should be the name of the target column with the added suffix "_index". .. GENERATED FROM PYTHON SOURCE LINES 207-213 VectorIndex.data provides the indices for how to break VectorData.data into cells You can add an empty ragged array column to an existing :py:class:`~hdmf.common.table.DynamicTable` by specifying ``index=True`` to :py:meth:`DynamicTable.add_column `. This method only works if run before any rows have been added to the table. .. GENERATED FROM PYTHON SOURCE LINES 213-225 .. code-block:: Python new_table = DynamicTable( name='my_table', description='an example table', ) new_table.add_column( name='col4', description='column #4', index=True, ) .. GENERATED FROM PYTHON SOURCE LINES 226-230 If the table already contains data, you must specify the new column values for the existing rows using the ``data`` argument and you must specify the end indices of the ``data`` argument that correspond to each row as a list/tuple/array of values for the ``index`` argument. .. GENERATED FROM PYTHON SOURCE LINES 230-238 .. code-block:: Python table.add_column( # <-- this table already has 4 rows name='col4', description='column #4', data=[1, 0, -1, 0, -1, 1, 1, -1], index=[3, 4, 6, 8], # specify the end indices (exclusive) of data for each row ) .. GENERATED FROM PYTHON SOURCE LINES 239-243 Alternatively we may also define the ragged array data as a nested list and use the ``index`` argument to indicate the number of levels. In this case, the :py:class:`~hdmf.common.table.DynamicTable.add_column` function will automatically flatten the data array and compute the corresponding index vectors. .. GENERATED FROM PYTHON SOURCE LINES 243-259 .. code-block:: Python table.add_column( # <-- this table already has 4 rows name='col5', description='column #5', data=[[[1, ], [2, 2]], # row 1 [[3, 3], ], # row 2 [[4, ], [5, 5]], # row 3 [[6, 6], [7, 7, 7]]], # row 4 index=2 # number of levels in the ragged array ) # Show that the ragged array was converted to flat VectorData with a double VectorIndex print("Flattened data: %s" % str(table.col5.data)) print("Level 1 index: %s" % str(table.col5_index.data)) print("Level 2 index: %s" % str(table.col5_index_index.data)) .. rst-class:: sphx-glr-script-out .. code-block:: none Flattened data: [1, 2, 2, 3, 3, 4, 5, 5, 6, 6, 7, 7, 7] Level 1 index: [1, 3, 5, 6, 8, 10, 13] Level 2 index: [2, 3, 5, 7] .. GENERATED FROM PYTHON SOURCE LINES 260-266 Referencing rows of other tables -------------------------------- You can create a column that references rows of another table by adding a :py:class:`~hdmf.common.table.DynamicTableRegion` object as a column of your :py:class:`~hdmf.common.table.DynamicTable`. This is analogous to a foreign key in a relational database. .. GENERATED FROM PYTHON SOURCE LINES 266-288 .. code-block:: Python from hdmf.common.table import DynamicTableRegion dtr_col = DynamicTableRegion( name='table1_ref', description='references rows of earlier table', data=[0, 1, 0, 0], # refers to row indices of the 'table' variable table=table ) data_col = VectorData( name='col2', description='column #2', data=['a', 'a', 'a', 'b'], ) table2 = DynamicTable( name='my_table', description='an example table', columns=[dtr_col, data_col], ) .. GENERATED FROM PYTHON SOURCE LINES 289-299 Here, the ``data`` of ``dtr_col`` maps to rows of ``table`` (0-indexed). .. note:: The ``data`` values of :py:class:`~hdmf.common.table.DynamicTableRegion` map to the row index, not the row ID, though if you are using default IDs, these values will be the same. Reference more than one row of another table with a :py:class:`~hdmf.common.table.DynamicTableRegion` indexed by a :py:class:`~hdmf.common.table.VectorIndex`. .. GENERATED FROM PYTHON SOURCE LINES 299-321 .. code-block:: Python indexed_dtr_col = DynamicTableRegion( name='table1_ref2', description='references multiple rows of earlier table', data=[0, 0, 1, 1, 0, 0, 1], table=table ) # row 0 refers to rows [0, 0], row 1 refers to rows [1], row 2 refers to rows [1, 0], row 3 refers to rows [0, 1] of # the "table" variable dtr_idx = VectorIndex( name='table1_ref2_index', target=indexed_dtr_col, data=[2, 3, 5, 7], ) table3 = DynamicTable( name='my_table', description='an example table', columns=[dtr_idx, indexed_dtr_col], ) .. GENERATED FROM PYTHON SOURCE LINES 322-338 Setting the target table of a DynamicTableRegion column of a DynamicTable ------------------------------------------------------------------------- A subclass of DynamicTable might have a pre-defined DynamicTableRegion column. To write this column correctly, the "table" attribute of the column must be set so that users know to what table the row index values reference. Because the target table could be any table, the "table" attribute must be set explicitly. There are three ways to do so. First, you can use the ``target_tables`` argument of the DynamicTable constructor as shown below. This argument is a dictionary mapping the name of the DynamicTableRegion column to the target table. Secondly, the target table can be set after the DynamicTable has been initialized using ``my_table.my_column.table = other_table``. Finally, you can create the DynamicTableRegion column and pass the ``table`` attribute to `DynamicTableRegion.__init__` and then pass the column to `DynamicTable.__init__` using the `columns` argument. However, this approach is not recommended for columns defined in the schema, because it is up to the user to ensure that the column is created in accordance with the schema. .. GENERATED FROM PYTHON SOURCE LINES 338-356 .. code-block:: Python class SubTable(DynamicTable): __columns__ = ( {'name': 'dtr', 'description': 'required region', 'required': True, 'table': True}, ) referenced_table = DynamicTable( name='referenced_table', description='an example table', ) sub_table = SubTable( name='sub_table', description='an example table', target_tables={'dtr': referenced_table}, ) # now the target table of the DynamicTableRegion column 'dtr' is set to `referenced_table` .. GENERATED FROM PYTHON SOURCE LINES 357-365 Creating an expandable table ---------------------------- When using the default HDF5 backend, each column of these tables is an HDF5 Dataset, which by default are set in size. This means that once a file is written, it is not possible to add a new row. If you want to be able to save this file, load it, and add more rows to the table, you will need to set this up when you create the :py:class:`~hdmf.common.table.DynamicTable`. You do this by wrapping the data with :py:class:`~hdmf.backends.hdf5.h5_utils.H5DataIO` and the argument ``maxshape=(None, )``. .. GENERATED FROM PYTHON SOURCE LINES 365-392 .. code-block:: Python from hdmf.backends.hdf5.h5_utils import H5DataIO col1 = VectorData( name='expandable_col1', description='column #1', data=H5DataIO(data=[1, 2], maxshape=(None,)), ) col2 = VectorData( name='expandable_col2', description='column #2', data=H5DataIO(data=['a', 'b'], maxshape=(None,)), ) # don't forget to wrap the row IDs too! ids = ElementIdentifiers( name='id', data=H5DataIO(data=[0, 1], maxshape=(None,)), ) expandable_table = DynamicTable( name='expandable_table', description='an example table that can be expanded after being saved to a file', columns=[col1, col2], id=ids, ) .. GENERATED FROM PYTHON SOURCE LINES 393-426 Now you can write the file, read it back, and run ``expandable_table.add_row()``. In this example, we are setting ``maxshape`` to ``(None,)``, which means this is a 1-dimensional matrix that can expand indefinitely along its single dimension. You could also use an integer in place of ``None``. For instance, ``maxshape=(8,)`` would allow the column to grow up to a length of 8. Whichever ``maxshape`` you choose, it should be the same for all :py:class:`~hdmf.common.table.VectorData` and :py:class:`~hdmf.common.table.ElementIdentifiers` objects in the :py:class:`~hdmf.common.table.DynamicTable`, since they must always be the same length. The default :py:class:`~hdmf.common.table.ElementIdentifiers` automatically generated when you pass a list of integers to the ``id`` argument of the :py:class:`~hdmf.common.table.DynamicTable` constructor is not expandable, so do not forget to create a :py:class:`~hdmf.common.table.ElementIdentifiers` object, and wrap that data as well. If any of the columns are indexed, the ``data`` argument of :py:class:`~hdmf.common.table.VectorIndex` will also need to be wrapped with :py:class:`~hdmf.backends.hdf5.h5_utils.H5DataIO`. Converting the table to a pandas ``DataFrame`` ---------------------------------------------- `pandas`_ is a popular data analysis tool, especially for working with tabular data. You can convert your :py:class:`~hdmf.common.table.DynamicTable` to a :py:class:`~pandas.DataFrame` using :py:meth:`DynamicTable.to_dataframe `. Accessing the table as a :py:class:`~pandas.DataFrame` provides you with powerful, standard methods for indexing, selecting, and querying tabular data from `pandas`_. This is the recommended method of reading data from your table. See also the `pandas indexing documentation`_. Printing a :py:class:`~hdmf.common.table.DynamicTable` as a :py:class:`~pandas.DataFrame` or displaying the :py:class:`~pandas.DataFrame` in Jupyter shows a more intuitive tabular representation of the data than printing the :py:class:`~hdmf.common.table.DynamicTable` object. .. _pandas: https://pandas.pydata.org/ .. _`pandas indexing documentation`: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html .. GENERATED FROM PYTHON SOURCE LINES 426-429 .. code-block:: Python df = table.to_dataframe() .. GENERATED FROM PYTHON SOURCE LINES 430-433 .. note:: Changes to the ``DataFrame`` will not be saved in the ``DynamicTable``. .. GENERATED FROM PYTHON SOURCE LINES 435-440 Converting the table from a pandas ``DataFrame`` ------------------------------------------------ If your data is already in a :py:class:`~pandas.DataFrame`, you can convert the ``DataFrame`` to a :py:class:`~hdmf.common.table.DynamicTable` using the class method :py:meth:`DynamicTable.from_dataframe `. .. GENERATED FROM PYTHON SOURCE LINES 440-446 .. code-block:: Python table_from_df = DynamicTable.from_dataframe( name='my_table', df=df, ) .. GENERATED FROM PYTHON SOURCE LINES 447-453 Accessing elements ------------------ To access an element in the i-th row in the column with name "col_name" in a :py:class:`~hdmf.common.table.DynamicTable`, use square brackets notation: ``table[i, col_name]``. You can also use a tuple of row index and column name within the square brackets. .. GENERATED FROM PYTHON SOURCE LINES 453-457 .. code-block:: Python table[0, 'col1'] # returns 1 table[(0, 'col1')] # returns 1 .. rst-class:: sphx-glr-script-out .. code-block:: none 1 .. GENERATED FROM PYTHON SOURCE LINES 458-460 If the column is a ragged array, instead of a single value being returned, a list of values for that element is returned. .. GENERATED FROM PYTHON SOURCE LINES 460-463 .. code-block:: Python table[0, 'col4'] # returns [1, 0, -1] .. rst-class:: sphx-glr-script-out .. code-block:: none [1, 0, -1] .. GENERATED FROM PYTHON SOURCE LINES 464-465 Standard Python and numpy slicing can be used for the row index. .. GENERATED FROM PYTHON SOURCE LINES 465-478 .. code-block:: Python import numpy as np table[:2, 'col1'] # get a list of elements from the first two rows at column 'col1' table[0:3:2, 'col1'] # get a list of elements from rows 0 to 3 (exclusive) in steps of 2 at column 'col1' table[3::-1, 'col1'] # get a list of elements from rows 3 to 0 in reverse order at column 'col1' # the following are equivalent to table[0:3:2, 'col1'] table[slice(0, 3, 2), 'col1'] table[np.s_[0:3:2], 'col1'] table[[0, 2], 'col1'] table[np.array([0, 2]), 'col1'] .. rst-class:: sphx-glr-script-out .. code-block:: none [1, 3] .. GENERATED FROM PYTHON SOURCE LINES 479-481 If the column is a ragged array, instead of a list of row values being returned, a list of list elements for the selected rows is returned. .. GENERATED FROM PYTHON SOURCE LINES 481-484 .. code-block:: Python table[:2, 'col4'] # returns [[1, 0, -1], [0]] .. rst-class:: sphx-glr-script-out .. code-block:: none [[1, 0, -1], [0]] .. GENERATED FROM PYTHON SOURCE LINES 485-490 .. note:: You cannot supply a list/tuple for the column name. For this kind of access, first convert the :py:class:`~hdmf.common.table.DynamicTable` to a :py:class:`~pandas.DataFrame`. .. GENERATED FROM PYTHON SOURCE LINES 492-497 Accessing columns ----------------- To access all the values in a column, use square brackets with a colon for the row index: ``table[:, col_name]``. If the column is a ragged array, a list of list elements is returned. .. GENERATED FROM PYTHON SOURCE LINES 497-501 .. code-block:: Python table[:, 'col1'] # returns [1, 2, 3, 4] table[:, 'col4'] # returns [[1, 0, -1], [0], [-1, 1], [1, -1]] .. rst-class:: sphx-glr-script-out .. code-block:: none [[1, 0, -1], [0], [-1, 1], [1, -1]] .. GENERATED FROM PYTHON SOURCE LINES 502-507 Accessing rows -------------- To access the i-th row in a :py:class:`~hdmf.common.table.DynamicTable`, returned as a :py:class:`~pandas.DataFrame`, use the syntax ``table[i]``. Standard Python and numpy slicing can be used for the row index. .. GENERATED FROM PYTHON SOURCE LINES 507-519 .. code-block:: Python table[0] # get the 0th row of the table as a DataFrame table[:2] # get the first two rows table[0:3:2] # get rows 0 to 3 (exclusive) in steps of 2 table[3::-1] # get rows 3 to 0 in reverse order # the following are equivalent to table[0:3:2] table[slice(0, 3, 2)] table[np.s_[0:3:2]] table[[0, 2]] table[np.array([0, 2])] .. raw:: html
col1 col2 col3 col4 col5
id
0 1 a True [1, 0, -1] [[1], [2, 2]]
2 3 c False [-1, 1] [[4], [5, 5]]


.. GENERATED FROM PYTHON SOURCE LINES 520-523 .. note:: The syntax ``table[i]`` returns the i-th row, NOT the row with ID of `i`. .. GENERATED FROM PYTHON SOURCE LINES 525-533 Iterating over rows -------------------- To iterate over the rows of a :py:class:`~hdmf.common.table.DynamicTable`, first convert the :py:class:`~hdmf.common.table.DynamicTable` to a :py:class:`~pandas.DataFrame` using :py:meth:`DynamicTable.to_dataframe `. For more information on iterating over a :py:class:`~pandas.DataFrame`, see https://pandas.pydata.org/pandas-docs/stable/user_guide/basics.html#iteration .. GENERATED FROM PYTHON SOURCE LINES 533-538 .. code-block:: Python df = table.to_dataframe() for row in df.itertuples(): print(row) .. rst-class:: sphx-glr-script-out .. code-block:: none Pandas(Index=0, col1=1, col2='a', col3=True, col4=[1, 0, -1], col5=[[1], [2, 2]]) Pandas(Index=1, col1=2, col2='b', col3=True, col4=[0], col5=[[3, 3]]) Pandas(Index=2, col1=3, col2='c', col3=False, col4=[-1, 1], col5=[[4], [5, 5]]) Pandas(Index=3, col1=4, col2='d', col3=True, col4=[1, -1], col5=[[6, 6], [7, 7, 7]]) .. GENERATED FROM PYTHON SOURCE LINES 539-547 Accessing the column data types ------------------------------- To access the :py:class:`~hdmf.common.table.VectorData` or :py:class:`~hdmf.common.table.VectorIndex` object representing a column, you can use three different methods. Use the column name in square brackets, e.g., ``table[col_name]``, use the :py:meth:`DynamicTable.get ` method, or use the column name as an attribute, e.g., ``table.col_name``. .. GENERATED FROM PYTHON SOURCE LINES 547-553 .. code-block:: Python table['col1'] table.get('col1') # equivalent to table['col1'] except this returns None if 'col1' is not found table.get('col1', default=0) # you can change the default return value table.col1 .. rst-class:: sphx-glr-script-out .. code-block:: none .. GENERATED FROM PYTHON SOURCE LINES 554-560 .. note:: Using the column name as an attribute does NOT work if the column name is the same as a non-column name attribute or method of the :py:class:`~hdmf.common.table.DynamicTable` class, e.g., ``name``, ``description``, ``object_id``, ``parent``, ``modified``. .. GENERATED FROM PYTHON SOURCE LINES 562-564 If the column is a ragged array, then the methods above will return the :py:class:`~hdmf.common.table.VectorIndex` associated with the ragged array. .. GENERATED FROM PYTHON SOURCE LINES 564-569 .. code-block:: Python table['col4'] table.get('col4') # equivalent to table['col4'] except this returns None if 'col4' is not found table.get('col4', default=0) # you can change the default return value .. rst-class:: sphx-glr-script-out .. code-block:: none .. GENERATED FROM PYTHON SOURCE LINES 570-575 .. note:: The attribute syntax ``table.col_name`` currently returns the ``VectorData`` instead of the ``VectorIndex`` for a ragged array. This is a known issue and will be fixed in a future version of HDMF. .. GENERATED FROM PYTHON SOURCE LINES 577-584 Accessing elements from column data types ----------------------------------------- Standard Python and numpy slicing can be used on the :py:class:`~hdmf.common.table.VectorData` or :py:class:`~hdmf.common.table.VectorIndex` objects to access elements from column data. If the column is a ragged array, then instead of a list of row values being returned, a list of list elements for the selected rows is returned. .. GENERATED FROM PYTHON SOURCE LINES 584-599 .. code-block:: Python table['col1'][0] # get the 0th element from column 'col1' table['col1'][:2] # get a list of the 0th and 1st elements table['col1'][0:3:2] # get a list of the 0th to 3rd (exclusive) elements in steps of 2 table['col1'][3::-1] # get a list of the 3rd to 0th elements in reverse order # the following are equivalent to table['col1'][0:3:2] table['col1'][slice(0, 3, 2)] table['col1'][np.s_[0:3:2]] table['col1'][[0, 2]] table['col1'][np.array([0, 2])] # this slicing and indexing works for ragged array columns as well table['col4'][:2] # get a list of the 0th and 1st list elements .. rst-class:: sphx-glr-script-out .. code-block:: none [[1, 0, -1], [0]] .. GENERATED FROM PYTHON SOURCE LINES 600-603 .. note:: The syntax ``table[col_name][i]`` is equivalent to ``table[i, col_name]``. .. GENERATED FROM PYTHON SOURCE LINES 605-609 Multi-dimensional columns ------------------------- A column can be represented as a multi-dimensional rectangular array or a list of lists, each containing the same number of elements. .. GENERATED FROM PYTHON SOURCE LINES 609-616 .. code-block:: Python col5 = VectorData( name='col5', description='column #5', data=[['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']], ) .. GENERATED FROM PYTHON SOURCE LINES 617-621 Ragged multi-dimensional columns --------------------------------- Each element within a column can be an n-dimensional array or list or lists. This is true for ragged array columns as well. .. GENERATED FROM PYTHON SOURCE LINES 621-633 .. code-block:: Python col6 = VectorData( name='col6', description='column #6', data=[['a', 'b', 'c'], ['d', 'e', 'f'], ['g', 'h', 'i']], ) col6_ind = VectorIndex( name='col6_index', target=col6, data=[2, 3], ) .. GENERATED FROM PYTHON SOURCE LINES 634-649 Nested ragged array columns --------------------------- In the example above, the ragged array column above has two rows. The first row has two elements, where each element has 3 sub-elements. This can be thought of as a 2x3 array. The second row has one element with 3 sub-elements, or a 1x3 array. This works only if the data for ``col5`` is a rectangular array, that is, each row element contains the same number of sub-elements. If each row element does not contain the same number of sub-elements, then a nested ragged array approach must be used instead. A :py:class:`~hdmf.common.table.VectorIndex` object can index another :py:class:`~hdmf.common.table.VectorIndex` object. For example, the first row of a table might be a 2x3 array, the second row might be a 3x2 array, and the third row might be a 1x1 array. This cannot be represented by a singly indexed column, but can be represented by a nested ragged array column. .. GENERATED FROM PYTHON SOURCE LINES 649-673 .. code-block:: Python col7 = VectorData( name='col7', description='column #6', data=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm'], ) col7_ind = VectorIndex( name='col7_index', target=col7, data=[3, 6, 8, 10, 12, 13], ) col7_ind_ind = VectorIndex( name='col7_index_index', target=col7_ind, data=[2, 5, 6], ) # all indices must be added to the table table_double_ragged_col = DynamicTable( name='my table', description='an example table', columns=[col7, col7_ind, col7_ind_ind], ) .. GENERATED FROM PYTHON SOURCE LINES 674-677 Access the first row using the same syntax as before, except now a list of lists is returned. You can then index the resulting list of lists to access the individual elements. .. GENERATED FROM PYTHON SOURCE LINES 677-681 .. code-block:: Python table_double_ragged_col[0, 'col7'] # returns [['a', 'b', 'c'], ['d', 'e', 'f']] table_double_ragged_col['col7'][0] # same as line above .. rst-class:: sphx-glr-script-out .. code-block:: none [['a', 'b', 'c'], ['d', 'e', 'f']] .. GENERATED FROM PYTHON SOURCE LINES 682-686 Accessing the column named 'col7' using square bracket notation will return the top-level :py:class:`~hdmf.common.table.VectorIndex` for the column. Accessing the column named 'col7' using dot notation will return the :py:class:`~hdmf.common.table.VectorData` object .. GENERATED FROM PYTHON SOURCE LINES 686-690 .. code-block:: Python table_double_ragged_col['col7'] # returns col7_ind_ind table_double_ragged_col.col7 # returns the col7 VectorData object .. rst-class:: sphx-glr-script-out .. code-block:: none .. GENERATED FROM PYTHON SOURCE LINES 691-707 Accessing data from a ``DynamicTable`` that contain references to rows of other ``DynamicTable`` objects -------------------------------------------------------------------------------------------------------- By default, when :py:meth:`DynamicTable.__getitem__ ` and :py:meth:`DynamicTable.get ` are supplied with an int, list of ints, numpy array, or a slice representing rows to return, a pandas :py:class:`~pandas.DataFrame` is returned. If the :py:class:`~hdmf.common.table.DynamicTable` contains a :py:class:`~hdmf.common.table.DynamicTableRegion` column that references rows of other ``DynamicTable`` objects, then by default, the :py:meth:`DynamicTable.__getitem__ ` and :py:meth:`DynamicTable.get ` methods will return row indices of the referenced table, and not the contents of the referenced table. To return the contents of the referenced table as a nested :py:class:`~pandas.DataFrame` containing only the referenced rows, use :py:meth:`DynamicTable.get ` with ``index=False``. .. GENERATED FROM PYTHON SOURCE LINES 707-771 .. code-block:: Python # create a new table of users users_table = DynamicTable( name='users', description='a table containing data/metadata about users, one user per row', ) # add simple columns to this table users_table.add_column( name='first_name', description='the first name of the user', ) users_table.add_column( name='last_name', description='the last name of the user', ) # create a new table of addresses to reference addresses_table = DynamicTable( name='addresses', description='a table containing data/metadata about addresses, one address per row', ) addresses_table.add_column( name='street_address', description='the street number and address', ) addresses_table.add_column( name='city', description='the city of the address', ) # add rows to the addresses table addresses_table.add_row( street_address='123 Main St', city='Springfield' ) addresses_table.add_row( street_address='45 British Way', city='London' ) # add a column to the users table that references rows of the addresses table users_table.add_column( name='address', description='the address of the user', table=addresses_table ) # add rows to the users table users_table.add_row( first_name='Grace', last_name='Hopper', address=0 # <-- row index of the address table ) users_table.add_row( first_name='Alan', last_name='Turing', address=1 # <-- row index of the address table ) # get the first row of the users table users_table.get(0) .. raw:: html
first_name last_name address
id
0 Grace Hopper 0


.. GENERATED FROM PYTHON SOURCE LINES 773-777 .. code-block:: Python # get the first row of the users table with a nested dataframe users_table.get(0, index=False) .. raw:: html
first_name last_name address
id
0 Grace Hopper street_address city id ...


.. GENERATED FROM PYTHON SOURCE LINES 779-783 .. code-block:: Python # get the first two rows of the users table users_table.get([0, 1]) .. raw:: html
first_name last_name address
id
0 Grace Hopper 0
1 Alan Turing 1


.. GENERATED FROM PYTHON SOURCE LINES 785-790 .. code-block:: Python # get the first two rows of the users table with nested dataframes # of the addresses table in the address column users_table.get([0, 1], index=False) .. raw:: html
first_name last_name address
id
0 Grace Hopper street_address city id ...
1 Alan Turing street_address city id ...


.. GENERATED FROM PYTHON SOURCE LINES 791-795 .. note:: You can also get rows from a :py:class:`~hdmf.common.table.DynamicTable` as a list of lists where the i-th nested list contains the values for the i-th row. This method is generally not recommended. .. GENERATED FROM PYTHON SOURCE LINES 797-809 Displaying the contents of a table with references to another table ------------------------------------------------------------------- Earlier, we converted a :py:class:`~hdmf.common.table.DynamicTable` to a :py:class:`~pandas.DataFrame` using :py:meth:`DynamicTable.to_dataframe ` and printed the :py:class:`~pandas.DataFrame` to see its contents. This also works when the :py:class:`~hdmf.common.table.DynamicTable` contains a column that references another table. However, the entries for this column for each row will be printed as a nested :py:class:`~pandas.DataFrame`. This can be difficult to read, so to view only the row indices of the referenced table, pass ``index=True`` to :py:meth:`DynamicTable.to_dataframe `. .. GENERATED FROM PYTHON SOURCE LINES 809-812 .. code-block:: Python users_df = users_table.to_dataframe(index=True) users_df .. raw:: html
first_name last_name address
id
0 Grace Hopper 0
1 Alan Turing 1


.. GENERATED FROM PYTHON SOURCE LINES 813-822 You can then access the referenced table using the ``table`` attribute of the column object. This is useful when reading a table from a file where you may not have a variable to access the referenced table. First, use :py:meth:`DynamicTable.__getitem__ ` (square brackets notation) to get the :py:class:`~hdmf.common.table.DynamicTableRegion` object representing the column. Then access its ``table`` attribute to get the addresses table and convert the table to a :py:class:`~pandas.DataFrame`. .. GENERATED FROM PYTHON SOURCE LINES 822-826 .. code-block:: Python address_column = users_table['address'] read_addresses_table = address_column.table addresses_df = read_addresses_table.to_dataframe() .. GENERATED FROM PYTHON SOURCE LINES 827-828 Get the addresses corresponding to the rows of the users table: .. GENERATED FROM PYTHON SOURCE LINES 828-831 .. code-block:: Python address_indices = users_df['address'] # pandas Series of row indices into the addresses table addresses_df.iloc[address_indices] # use .iloc because these are row indices not ID values .. raw:: html
street_address city
id
0 123 Main St Springfield
1 45 British Way London


.. GENERATED FROM PYTHON SOURCE LINES 832-836 .. note:: The indices returned by ``users_df['address']`` are row indices and not the ID values of the table. However, if you are using default IDs, these values will be the same. .. GENERATED FROM PYTHON SOURCE LINES 838-841 Creating custom DynamicTable subclasses --------------------------------------- TODO .. GENERATED FROM PYTHON SOURCE LINES 843-846 Defining ``__columns__`` ^^^^^^^^^^^^^^^^^^^^^^^^ TODO .. _sphx_glr_download_tutorials_plot_dynamictable_howto.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_dynamictable_howto.ipynb ` .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_dynamictable_howto.py ` .. container:: sphx-glr-download sphx-glr-download-zip :download:`Download zipped: plot_dynamictable_howto.zip ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_