hdmf.common.resources module

class hdmf.common.resources.KeyTable(name='keys', data=[])

Bases: Table

A table for storing keys used to reference external resources.

add_row(key)
Parameters

key (str) – The user key that maps to the resource term / registry symbol.

class hdmf.common.resources.Key(key, table=None, idx=None)

Bases: Row

A Row class for representing rows in the KeyTable.

Parameters
  • key (str) – The user key that maps to the resource term / registry symbol.

  • table (Table) – None

  • idx (int) – None

todict()
class hdmf.common.resources.ResourceTable(name='resources', data=[])

Bases: Table

A table for storing names and URIs of ontology sources.

add_row(resource, resource_uri)
Parameters
  • resource (str) – The resource/registry that the term/symbol comes from.

  • resource_uri (str) – The URI for the resource term / registry symbol.

class hdmf.common.resources.Resource(resource, resource_uri, table=None, idx=None)

Bases: Row

A Row class for representing rows in the ResourceTable.

Parameters
  • resource (str) – The resource/registry that the term/symbol comes from.

  • resource_uri (str) – The URI for the resource term / registry symbol.

  • table (Table) – None

  • idx (int) – None

todict()
class hdmf.common.resources.EntityTable(name='entities', data=[])

Bases: Table

A table for storing the external resources a key refers to.

add_row(keys_idx, resources_idx, entity_id, entity_uri)
Parameters
  • keys_idx (int or Key) – The index into the keys table for the user key that maps to the resource term / registry symbol.

  • resources_idx (int or Resource) – The index into the ResourceTable.

  • entity_id (str) – The unique ID for the resource term / registry symbol.

  • entity_uri (str) – The URI for the resource term / registry symbol.

class hdmf.common.resources.Entity(keys_idx, resources_idx, entity_id, entity_uri, table=None, idx=None)

Bases: Row

A Row class for representing rows in the EntityTable.

Parameters
  • keys_idx (int or Key) – The index into the keys table for the user key that maps to the resource term / registry symbol.

  • resources_idx (int or Resource) – The index into the ResourceTable.

  • entity_id (str) – The unique ID for the resource term / registry symbol.

  • entity_uri (str) – The URI for the resource term / registry symbol.

  • table (Table) – None

  • idx (int) – None

todict()
class hdmf.common.resources.ObjectTable(name='objects', data=[])

Bases: Table

A table for storing objects (i.e. Containers) that contain keys that refer to external resources.

add_row(object_id, relative_path, field)
Parameters
  • object_id (str) – The object ID for the Container/Data.

  • relative_path (str) – The relative_path of the attribute of the object that uses an external resource reference key. Use an empty string if not applicable.

  • field (str) – The field of the compound data type using an external resource. Use an empty string if not applicable.

class hdmf.common.resources.Object(object_id, relative_path, field, table=None, idx=None)

Bases: Row

A Row class for representing rows in the ObjectTable.

Parameters
  • object_id (str) – The object ID for the Container/Data.

  • relative_path (str) – The relative_path of the attribute of the object that uses an external resource reference key. Use an empty string if not applicable.

  • field (str) – The field of the compound data type using an external resource. Use an empty string if not applicable.

  • table (Table) – None

  • idx (int) – None

todict()
class hdmf.common.resources.ObjectKeyTable(name='object_keys', data=[])

Bases: Table

A table for identifying which keys are used by which objects for referring to external resources.

add_row(objects_idx, keys_idx)
Parameters
  • objects_idx (int or Object) – The index into the objects table for the Object that uses the Key.

  • keys_idx (int or Key) – The index into the keys table that is used to make an external resource reference.

class hdmf.common.resources.ObjectKey(objects_idx, keys_idx, table=None, idx=None)

Bases: Row

A Row class for representing rows in the ObjectKeyTable.

Parameters
  • objects_idx (int or Object) – The index into the objects table for the Object that uses the Key.

  • keys_idx (int or Key) – The index into the keys table that is used to make an external resource reference.

  • table (Table) – None

  • idx (int) – None

todict()
class hdmf.common.resources.ExternalResources(name, keys=None, resources=None, entities=None, objects=None, object_keys=None, type_map=None)

Bases: Container

A table for mapping user terms (i.e. keys) to resource entities.

Parameters
  • name (str) – The name of this ExternalResources container.

  • keys (KeyTable) – The table storing user keys for referencing resources.

  • resources (ResourceTable) – The table for storing names and URIs of resources.

  • entities (EntityTable) – The table storing entity information.

  • objects (ObjectTable) – The table storing object information.

  • object_keys (ObjectKeyTable) – The table storing object-resource relationships.

  • type_map (TypeMap) – The type map. If None is provided, the HDMF-common type map will be used.

property keys

The table storing user keys for referencing resources.

property resources

The table for storing names and URIs of resources.

property entities

The table storing entity information.

property objects

The table storing object information.

property object_keys

The table storing object-resource relationships.

static assert_external_resources_equal(left, right, check_dtype=True)

Check that the keys, resources, entities, objects, and object_keys tables of two ExternalResources objects match.

Parameters
  • left – ExternalResources object to compare with right

  • right – ExternalResources object to compare with left

  • check_dtype – Enforce strict checking of dtypes. Dtypes may differ, for example for ids, which may change from int64 to int32 depending on how the data was saved. (Default: True)

Returns

The function returns True if all values match. If mismatches are found, AssertionError will be raised.

Raises

AssertionError – Raised if any differences are found. The function collects all differences into a single error so that the assertion will indicate all found differences.
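
The collect-then-raise behaviour described above can be illustrated with a small standalone sketch. It does not use hdmf: the function name `assert_tables_equal` and the dict-of-lists representation of the tables are hypothetical stand-ins, and the sample rows are illustrative only.

```python
def assert_tables_equal(left, right):
    """Compare two dicts that map a table name to a list of row tuples.

    Hypothetical helper, not hdmf's implementation: every mismatch is
    collected first, then a single AssertionError reports all of them.
    """
    errors = []
    for name in ("keys", "resources", "entities", "objects", "object_keys"):
        if left.get(name) != right.get(name):
            errors.append(
                f"{name} table mismatch: {left.get(name)!r} != {right.get(name)!r}"
            )
    if errors:
        raise AssertionError("\n".join(errors))
    return True


a = {"keys": [("Mus musculus",)], "resources": [("NCBI_Taxonomy",)]}
b = {"keys": [("Homo sapiens",)], "resources": [("ITIS",)]}

try:
    assert_tables_equal(a, b)
    message = ""
except AssertionError as exc:
    # Both differing tables are named in the one error message.
    message = str(exc)
```

Reporting every difference at once saves the caller from fixing and re-running one mismatch at a time.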

get_key(key_name, container=None, relative_path='', field='')

Return a Key or a list of Key objects that correspond to the given key.

If container, relative_path, and field are provided, the Key that corresponds to the given name of the key for the given container, relative_path, and field is returned.

Parameters
  • key_name (str) – The name of the Key to get.

  • container (str or AbstractContainer) – The Container/Data object that uses the key or the object id for the Container/Data object that uses the key.

  • relative_path (str) – The relative_path of the attribute of the object that uses an external resource reference key. Use an empty string if not applicable.

  • field (str) – The field of the compound data type using an external resource.

get_resource(resource_name)

Retrieve resource object with the given resource_name.

Parameters

resource_name (str) – The name of the resource.

add_ref(container=None, attribute=None, field='', key=None, resources_idx=None, resource_name=None, resource_uri=None, entity_id=None, entity_uri=None)

Add information about an external reference used in this file.

It is possible to use the same name of the key to refer to different resources so long as the name of the key is not used within the same object, relative_path, and field combination. This method does not support such functionality by default.

Parameters
  • container (str or AbstractContainer) – The Container/Data object that uses the key or the object_id for the Container/Data object that uses the key.

  • attribute (str) – The attribute of the container for the external reference.

  • field (str) – The field of the compound data type using an external resource.

  • key (str or Key) – The name of the key or the Key object from the KeyTable for the key to add a resource for.

  • resources_idx (Resource) – The Resource from the ResourceTable.

  • resource_name (str) – The name of the resource to be created.

  • resource_uri (str) – The URI of the resource to be created.

  • entity_id (str) – The identifier for the entity at the resource.

  • entity_uri (str) – The URI for the identifier at the resource.
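
To make the relationship between the five tables more concrete, here is a toy model of the bookkeeping that a call like add_ref implies. It is a sketch under stated assumptions, not the hdmf implementation: the class `MiniResources`, its `_row_index` helper, and the `field_name` parameter are hypothetical, and the logic is simplified (each call appends a new key row, whereas the real method may reuse an existing Key).

```python
from dataclasses import dataclass, field


@dataclass
class MiniResources:
    """Toy stand-in for the five linked tables (not the hdmf classes)."""
    keys: list = field(default_factory=list)         # key
    resources: list = field(default_factory=list)    # (resource, resource_uri)
    entities: list = field(default_factory=list)     # (keys_idx, resources_idx, entity_id, entity_uri)
    objects: list = field(default_factory=list)      # (object_id, relative_path, field)
    object_keys: list = field(default_factory=list)  # (objects_idx, keys_idx)

    @staticmethod
    def _row_index(table, row):
        # Reuse an identical row if one exists; otherwise append it.
        if row not in table:
            table.append(row)
        return table.index(row)

    def add_ref(self, object_id, key, resource_name, resource_uri,
                entity_id, entity_uri, relative_path="", field_name=""):
        objects_idx = self._row_index(
            self.objects, (object_id, relative_path, field_name))
        # Simplification: always append a new key row for this reference.
        self.keys.append(key)
        keys_idx = len(self.keys) - 1
        resources_idx = self._row_index(
            self.resources, (resource_name, resource_uri))
        self.entities.append((keys_idx, resources_idx, entity_id, entity_uri))
        self.object_keys.append((objects_idx, keys_idx))
        return keys_idx


er = MiniResources()
taxonomy = ("NCBI_Taxonomy", "https://www.ncbi.nlm.nih.gov/taxonomy")
er.add_ref("1fc87200-e91e-45b3-978c-6d295af144c3", "Mus musculus", *taxonomy,
           "NCBI:txid10090",
           "https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=10090",
           field_name="species")
er.add_ref("9bf0c58e-09dc-4457-a652-94065b112c41", "Homo sapiens", *taxonomy,
           "NCBI:txid9606",
           "https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=9606",
           field_name="species")
```

After the two calls the resource row is stored once and referenced twice through resources_idx, which is the point of keeping the tables normalized.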

get_object_resources(container, relative_path='', field='')

Get all entities/resources associated with an object.

Parameters
  • container (str or AbstractContainer) – The Container/data object that is linked to resources/entities.

  • relative_path (str) – The relative_path of the attribute of the object that uses an external resource reference key. Use an empty string if not applicable.

  • field (str) – The field of the compound data type using an external resource.

get_keys(keys=None)
Return a DataFrame with information about keys used to make references to external resources.
The DataFrame will contain the following columns:
  • key_name: the key that will be used for referencing an external resource

  • resources_idx: the index into the ResourceTable

  • entity_id: the ID of the entity at the external resource

  • entity_uri: the URI for the entity at the external resource

It is possible to use the same key_name to refer to different resources so long as the key_name is not used within the same object, relative_path, field. This method doesn’t support such functionality by default. To select specific keys, use the keys argument to pass in the Key object(s) representing the desired keys. Note, if the same key_name is used more than once, multiple calls to this method with different Key objects will be required to keep the different instances separate. If a single call is made, it is left up to the caller to distinguish the different instances.

Parameters

keys (list or Key) – The Key(s) to get external resource data for.

Returns

a DataFrame with keys and external resource data

Return type

DataFrame

to_dataframe(use_categories=False)
Convert the data from the keys, resources, entities, objects, and object_keys tables to a single joint DataFrame. The data is denormalized in the process, e.g., keys that are used across multiple entities or objects will be duplicated across the corresponding rows.

Parameters

use_categories (bool) – Use a multi-index on the columns to indicate which category each column belongs to.

Returns

A DataFrame with all data merged into a flat, denormalized table.

Return type

DataFrame
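
The denormalization can be sketched without pandas by joining small in-memory tables by index. This is only an illustration of the idea, not how to_dataframe is implemented: the `flatten` function, the shortened entity rows, and the object ids "obj-a" and "obj-b" are all made up for the example.

```python
# Toy normalized tables; values are illustrative only.
keys = ["Mus musculus", "Homo sapiens"]
resources = [("NCBI_Taxonomy", "https://www.ncbi.nlm.nih.gov/taxonomy")]
entities = [(0, 0, "NCBI:txid10090"), (1, 0, "NCBI:txid9606")]  # (keys_idx, resources_idx, entity_id)
objects = [("obj-a", "", "species"), ("obj-b", "", "species")]   # (object_id, relative_path, field)
object_keys = [(0, 0), (1, 0), (1, 1)]                           # (objects_idx, keys_idx)


def flatten():
    """Join the tables into one flat list of rows (hypothetical helper)."""
    rows = []
    for objects_idx, keys_idx in object_keys:
        object_id, relative_path, field = objects[objects_idx]
        for entities_idx, (e_keys_idx, resources_idx, entity_id) in enumerate(entities):
            if e_keys_idx != keys_idx:
                continue  # this entity belongs to a different key
            resource, resource_uri = resources[resources_idx]
            rows.append({
                "objects_idx": objects_idx, "object_id": object_id,
                "relative_path": relative_path, "field": field,
                "keys_idx": keys_idx, "key": keys[keys_idx],
                "resources_idx": resources_idx, "resource": resource,
                "resource_uri": resource_uri,
                "entities_idx": entities_idx, "entity_id": entity_id,
            })
    return rows


flat = flatten()
```

Because the key at index 0 is used by both objects, its key, resource, and entity values appear in two of the three flattened rows.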

to_sqlite(db_file)

Save the keys, resources, entities, objects, and object_keys tables using sqlite3 to the given db_file.

The function will first create the tables (if they do not already exist) and then add the data from this ExternalResources object to the database. If the database file already exists, then the data will be appended as rows to the existing database tables.

Note, the index values of foreign keys (e.g., keys_idx, objects_idx, resources_idx) in the tables will not match between the ExternalResources here and the exported database, but they are adjusted automatically here, to ensure the foreign keys point to the correct rows in the exported database. This is because: 1) ExternalResources uses 0-based indexing for foreign keys, whereas SQLite uses 1-based indexing and 2) if data is appended to existing tables then a corresponding additional offset must be applied to the relevant foreign keys.

Raises

The function will raise errors if connection to the database fails. If the given db_file already exists, then there is also the possibility that certain updates may result in errors if there are collisions between the new and existing data.

Parameters

db_file (str) – Name of the SQLite database file
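
The foreign-key adjustment described above (0-based to 1-based, plus an offset when appending) can be sketched with the standard sqlite3 module. This is a simplified illustration, not the schema or code used by to_sqlite: the function `append_keys_and_entities`, the two-table schema, and the use of the current maximum id as the offset are assumptions made for the example.

```python
import sqlite3


def append_keys_and_entities(conn, keys, entities):
    """Append 0-based in-memory tables to a database (hypothetical helper).

    keys: list of key strings.
    entities: list of (keys_idx, entity_id) with 0-based keys_idx.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS keys (id INTEGER PRIMARY KEY, key TEXT)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entities ("
        "id INTEGER PRIMARY KEY, keys_idx INTEGER, entity_id TEXT, "
        "FOREIGN KEY (keys_idx) REFERENCES keys (id))"
    )
    # Rows already in the database shift every new foreign key by this much.
    offset = conn.execute("SELECT COALESCE(MAX(id), 0) FROM keys").fetchone()[0]
    conn.executemany("INSERT INTO keys (key) VALUES (?)", [(k,) for k in keys])
    # The +1 converts a 0-based index into SQLite's 1-based row id.
    conn.executemany(
        "INSERT INTO entities (keys_idx, entity_id) VALUES (?, ?)",
        [(keys_idx + 1 + offset, entity_id) for keys_idx, entity_id in entities],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
keys = ["Mus musculus", "Homo sapiens"]
entities = [(0, "NCBI:txid10090"), (1, "NCBI:txid9606")]
append_keys_and_entities(conn, keys, entities)
append_keys_and_entities(conn, keys, entities)  # second call appends rows

pairs = conn.execute(
    "SELECT k.key, e.entity_id FROM entities AS e "
    "JOIN keys AS k ON e.keys_idx = k.id ORDER BY e.id"
).fetchall()
```

After the second call the appended entities point at key ids 3 and 4 rather than 1 and 2, so each entity still joins to the key row that was written alongside it.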

to_tsv(path)
Write ExternalResources as a single, flat table to TSV

Internally, the function uses pandas.DataFrame.to_csv. Pandas can infer compression based on the filename, i.e., by changing the file extension to ‘.gz’, ‘.bz2’, ‘.zip’, ‘.xz’, or ‘.zst’ we can write compressed files. The TSV is formatted as follows: 1) line one indicates for each column the name of the table the column belongs to, 2) line two is the name of the column within the table, 3) subsequent lines are each a row in the flattened ExternalResources table. The first column is the row id in the flattened table and does not have a label, i.e., the first and second lines start with a tab character, and subsequent rows are numbered sequentially (0, 1, … in the example below). For example:

     objects objects objects objects keys    keys    resources       resources       resources       entities        entities        entities
     objects_idx     object_id       relative_path   field   keys_idx        key     resources_idx   resource        resource_uri    entities_idx    entity_id       entity_uri
0    0       1fc87200-e91e-45b3-978c-6d295af144c3            species 0       Mus musculus    0       NCBI_Taxonomy   https://www.ncbi.nlm.nih.gov/taxonomy   0       NCBI:txid10090  https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=10090
1    0       9bf0c58e-09dc-4457-a652-94065b112c41            species 1       Homo sapiens    0       NCBI_Taxonomy   https://www.ncbi.nlm.nih.gov/taxonomy   1       NCBI:txid9606   https://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=9606

See also from_tsv

Parameters

path (str) – path of the tsv file to write
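
The two-line header layout can be reproduced with the standard csv module, which may help when inspecting or generating such files outside of hdmf. This is a minimal sketch, not the to_tsv implementation (which relies on pandas): the function `write_flat_tsv` and the reduced set of four columns are assumptions for the example, and row ids start at 0 in this sketch.

```python
import csv
import io

# (table name, column name) pairs; a reduced, illustrative set of columns.
columns = [("objects", "objects_idx"), ("objects", "object_id"),
           ("keys", "key"), ("entities", "entity_id")]
rows = [
    (0, "1fc87200-e91e-45b3-978c-6d295af144c3", "Mus musculus", "NCBI:txid10090"),
    (1, "9bf0c58e-09dc-4457-a652-94065b112c41", "Homo sapiens", "NCBI:txid9606"),
]


def write_flat_tsv(stream, columns, rows):
    """Write a flat table with a two-line header (hypothetical helper)."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    # Line one: the table each column belongs to; the row-id column is unlabeled.
    writer.writerow([""] + [table for table, _ in columns])
    # Line two: the column name within that table.
    writer.writerow([""] + [name for _, name in columns])
    # Remaining lines: a sequential row id followed by the row values.
    for row_id, row in enumerate(rows):
        writer.writerow([row_id, *row])


buffer = io.StringIO()
write_flat_tsv(buffer, columns, rows)
lines = buffer.getvalue().splitlines()
```

Both header lines begin with a tab because the row-id column has no label, which matches the layout described above.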

classmethod from_tsv(path)
Read ExternalResources from a flat tsv file

Formatting of the TSV file is assumed to be consistent with the format generated by to_tsv. The function attempts to validate that the data in the TSV is consistent and parses the data from the denormalized table in the TSV to the normalized linked table structure used by ExternalResources. Currently the checks focus on ensuring that row id links between tables are valid. Inconsistencies in other (non-index) fields (e.g., when two rows with the same resources_idx have different resource_uri values) are not checked and will be ignored. In this case, the value from the first row that contains the corresponding entry will be kept.

Note

Since TSV files may be edited by hand or by other applications, it is possible that data in the TSV may be inconsistent. E.g., an objects_idx value may be missing if rows were removed and ids not updated. Also, since the TSV is flattened into a single denormalized table (i.e., data are stored with duplication, rather than normalized across several tables), it is possible that values may be inconsistent if edited outside of ExternalResources. E.g., we may have objects with the same index (objects_idx) but different object_id, relative_path, or field values. While flat TSVs are sometimes preferred for ease of sharing, editing the TSV without using the ExternalResources class should be done with great care!

Parameters

path (str) – path of the tsv file to read

Returns

ExternalResources loaded from TSV

Return type

ExternalResources
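
The "first row wins" rule and the index check can be illustrated for a single table with the standard csv module. This is a sketch under assumptions, not the from_tsv implementation: the function `read_resources`, the gap check shown, and the hand-edited URI `https://example.org/hand-edited` are hypothetical.

```python
import csv
import io


def read_resources(stream):
    """Rebuild a resources table from a flat TSV (hypothetical helper)."""
    reader = csv.reader(stream, delimiter="\t")
    next(reader)              # line one: table names (unused in this sketch)
    names = next(reader)[1:]  # line two: column names, minus the row-id column
    resources = {}
    for record in reader:
        values = dict(zip(names, record[1:]))
        idx = int(values["resources_idx"])
        # setdefault keeps the values from the first row seen for this index;
        # conflicting values in later rows are silently ignored.
        resources.setdefault(idx, (values["resource"], values["resource_uri"]))
    # Index check: the row ids must run from 0 without gaps.
    if sorted(resources) != list(range(len(resources))):
        raise ValueError("resources_idx values must run from 0 without gaps")
    return [resources[i] for i in range(len(resources))]


tsv_text = (
    "\tresources\tresources\tresources\n"
    "\tresources_idx\tresource\tresource_uri\n"
    "0\t0\tNCBI_Taxonomy\thttps://www.ncbi.nlm.nih.gov/taxonomy\n"
    "1\t0\tNCBI_Taxonomy\thttps://example.org/hand-edited\n"  # conflicting edit
)
loaded = read_resources(io.StringIO(tsv_text))
```

The second row's conflicting resource_uri is dropped without warning, which is why hand-edited files deserve a careful review before loading.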

data_type = 'ExternalResources'
namespace = 'hdmf-experimental'