SampleData Quick Reference Sheet

Notation

In the code lines of this reference sheet, the following notation conventions are used for the SampleData method arguments:

  • node_name, group_name, data_item_name are the Name, Indexname, Path or Alias of, respectively, a Data Array, a Group, or either of the two previous data item types.

  • parent_name is the Name, Indexname, Path or Alias of a group that is the parent of the data item being considered.

  • data is a SampleData class instance synchronized with the dataset files my_dataset.hdf5 and my_dataset.xdmf.

SampleData Naming system

Interacting with data items in a SampleData dataset requires providing their name to the various class methods. Four types of names can be provided for each data item, as illustrated in the example after this list:

  1. the Path of the data item in the HDF5 file.

  2. the Name of the data item.

  3. the Indexname of the data item.

  4. the Alias or aliases of the data item.
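
For example, assuming a hypothetical data array stored at the HDF5 path '/example_group/example_array', with Indexname 'example_array_id' and Alias 'my_alias', all four names designate the same data item:

[ ]:
# All four calls below return the same node (hypothetical names)
data.get_node('/example_group/example_array') # 1. Path
data.get_node('example_array')                # 2. Name
data.get_node('example_array_id')             # 3. Indexname
data.get_node('my_alias')                     # 4. Alias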

Creating, Opening datasets, and Exploring their content

Dataset creation/opening

Import SampleData class:

[ ]:
from pymicro.core.samples import SampleData as SD

Create/Open a SampleData dataset, and activate verbose mode:

[ ]:
# CREATE dataset: the file `filename` must not exist. Verbose mode OFF
data = SD(filename='my_first_dataset', verbose=False)
# OPEN dataset: the file `filename` must exist. Verbose mode ON
data = SD(filename='my_first_dataset', verbose=True)

Copy dataset and get class instance synchronized with new dataset:

[ ]:
data = SD.copy_sample(src_sample_file='source_dataset', dst_sample_file='destination_dataset', get_object=True)

Create a dataset overwriting an existing file, and enable automatic removal of the dataset files at class instance destruction (autodelete option):

[ ]:
# Create new dataset and overwrite already existing dataset files
data = SD(filename='my_first_dataset', verbose=True, overwrite_hdf5=True)
# Create new dataset with autodelete option ON
data = SD(filename='my_first_dataset', verbose=True, autodelete=True)
# Set autodelete option on
data.autodelete = True
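
With the autodelete option ON, the dataset files are removed at class instance destruction; a minimal sketch:

[ ]:
# Destroying the class instance removes the dataset HDF5 and XDMF files from disk
del data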

Getting information on datasets

Print information on the global content of the dataset:

[ ]:
# Print dataset index
data.print_index() # --> no option = local root '/' and max depth 3
data.print_index(max_depth=2, local_root='local_root_name') # --> with specified local root and depth

# Print dataset content (list all groups, nodes with detailed information)
data.print_dataset_content() # detailed output, printed in standard output
data.print_dataset_content(short=True, to_file='dataset_information.txt') # short output, written in text file

# Print both index and dataset content in short version --> class string representation
print(data)

# Print only grid groups information
data.print_grids_info()

# Print content of XDMF file
data.print_xdmf()

# Get the disk size of the HDF5 dataset file
size, unit = data.get_file_disk_size(convert=True, print_flag=False)
# value not printed, returned in method output and converted to the most readable memory unit

Command line tools to get information on dataset

[ ]:
# recursive (-r) and detailed (-d) output of h5ls --> also print the content of the data arrays
!h5ls -rd ../data/test_sampledata_ref.h5
# detailed (-d) output of ptdump
!ptdump -d ../data/test_sampledata_ref.h5
# detailed (-d) and verbose (-v) output of ptdump
!ptdump -dv ../data/test_sampledata_ref.h5
[ ]:
# Print information on data items content
data.print_node_info(nodename='node_name') # detailed information on a data item (group or array node)
data.print_group_content('group_name') # print information on the group's children
data.print_group_content('group_name', recursive=True)  # print information on the group's children recursively

# get data item disk size
size, unit = data.get_node_disk_size(nodename='node_name', print_flag=False, convert=False)

Dataset interactive visualization

[ ]:
# Visualize dataset organization and content with Vitables
data.pause_for_visualization(Vitables=True, Vitables_path='Path_to_Vitables_executable')
# Visualize spatially organized data with Paraview
data.pause_for_visualization(Paraview=True, Paraview_path='Path_to_Paraview_executable')

Basic data items: creating and getting them

Generic methods to get data items

[ ]:
# Dictionary like access
data['data_item_name']
# Attribute like access
data.data_item_name
# generic getter method
data.get_node('data_item_name') # --> returns a Pytables Group or Node object
data.get_node('data_item_name', as_numpy=True) # --> for array data items, returns a numpy array

Group data items

group_name and group_indexname are the Name and Indexname of a Group data item. parent_name is the Name, Path, Indexname or Alias of the Group where group_name will be stored.

[ ]:
# Create a group in a dataset with name `group_name`, stored in the group `parent_name`
data.add_group(groupname='group_name', location='parent_name', indexname='group_indexname')
# Create a group and overwrite pre-existing group with the same name + get the created Pytables Group object
group = data.add_group(groupname='group_name', location='parent_name', indexname='group_indexname', replace=True)

Data item attributes (metadata)

[ ]:
# Add attributes from a python dictionary (metadata_dictionary)
data.add_attributes(metadata_dictionary, nodename='node_name')

# get data item attributes (metadata)
data.print_node_attributes(nodename='node_name') # print all attributes of the node
attribute_value = data.get_attribute(attrname='attribute_name', nodename='node_name') # get value of one attribute
mesh_attrs = data.get_dic_from_attributes(nodename='node_name') # get all attributes as a dictionary

# set and get specific `description` attribute for node `node_name`
data.set_description(description="Write your description text here.", node='node_name')
data.get_description('node_name')
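
For reference, the metadata dictionary passed to add_attributes is a plain Python dictionary; the keys and values below are hypothetical:

[ ]:
# Hypothetical metadata dictionary
metadata_dictionary = {'sample_material': 'steel', 'acquisition_date': '2024-01-01'}
data.add_attributes(metadata_dictionary, nodename='node_name')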

Data arrays

[ ]:
# add a numpy array `array` in data item `node_name`
data.add_data_array(location='parent_name', name='node_name', indexname='array_indexname', array=array)
# replace=True allows overwriting a preexisting array with the same name
data.add_data_array(location='parent_name', name='node_name', indexname='array_indexname', array=array, replace=True)

# get data array from data item `node_name`
array_node = data.get_node('node_name') # --> returns a Pytables Node object
array = data.get_node('node_name', as_numpy=True) # --> returns a Numpy array
array = data['node_name'] # --> returns a Numpy array
array = data.node_name # --> returns a Numpy array

String arrays

[ ]:
# Add list of strings `List` as a string array `node_name`
data.add_string_array(name='node_name', location='parent_name', indexname='Sarray_indexname', data=List)

# get and decode binary strings stored in a String array
sarray = data['Sarray_indexname']
for string in sarray:
    print(string.decode('utf-8'), end=' ') # prints each string of the String array
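
For completeness, the `data` argument of add_string_array is a plain Python list of strings; the values below are hypothetical:

[ ]:
# Hypothetical list of strings to store as a string array
List = ['aluminium', 'steel', 'titanium']
data.add_string_array(name='node_name', location='parent_name', indexname='Sarray_indexname', data=List)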

Structured arrays

[ ]:
# Add structured array from Numpy structured array `structured_array` with Numpy.dtype `table_type`
data.add_table(name='node_name', location='parent_name', indexname='table_indexname', description=table_type,
               data=structured_array)
# Add lines to a structured array node from Numpy array `structured_array` (same dtype as the table)
data.append_table(name='table_indexname', data=structured_array)

# Add columns to a structured array node from a Numpy array `structured_array` with Numpy.dtype `cols_dtype`
data.add_tablecols(tablename='table_indexname', description=cols_dtype, data=structured_array)

# Structured arrays are retrieved just like Data arrays (see above)
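
For reference, a hypothetical dtype and structured array consistent with the calls above can be built with Numpy:

[ ]:
import numpy as np
# Hypothetical table description and matching structured array for `add_table`
table_type = np.dtype([('grain_id', np.int32), ('volume', np.float64)])
structured_array = np.array([(1, 0.5), (2, 1.2)], dtype=table_type)
# Hypothetical dtype for the additional columns passed to `add_tablecols`
cols_dtype = np.dtype([('surface', np.float64)])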

Remove data items

[ ]:
data.remove_node('node_name') # removes a group without children or a data array item
data.remove_node('group_name', recursive=True) # remove a Group data item and all its children recursively
# remove one or a list of attributes (metadata) from a node
data.remove_attribute(attrname='attribute_name', nodename='node_name')
data.remove_attributes(attr_list=['list','of','attributes','to','remove','from','node'], nodename='node_name')

Image Groups and Image fields: creating and getting them

Creating Image groups from fields

[ ]:
# Create an Image Group from a Numpy array `field_array` interpreted as a pixel/voxel wise constant scalar field
data.add_image_from_field(field_array=field_array, fieldname='node_name', imagename='group_name',
                          indexname='image_indexname', location='parent_name',
                          description="Write image group description here.", origin=[0.,10.], spacing=[2.,2.])

# Create an Image Group from a Numpy array `field_array` interpreted as a node value scalar field
data.add_image_from_field(field_array=field_array, fieldname='node_name', imagename='group_name',
                          indexname='image_indexname', location='parent_name', is_elemField=False,
                          description="Write image group description here.", origin=[0.,10.], spacing=[2.,2.])

# Create an Image Group from a Numpy array `field_array` interpreted as a non scalar field
data.add_image_from_field(field_array=field_array, fieldname='node_name', imagename='group_name',
                          indexname='image_indexname', location='parent_name', is_scalar=False,
                          description="Write image group description here.", origin=[0.,10.], spacing=[2.,2.])

# Set image position and dimensions
import numpy as np
data.set_voxel_size(image_group='image_indexname', voxel_size=np.array([4.,4.]))
data.set_origin(image_group='image_indexname', origin=np.array([10.,0.]))
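
A minimal sketch putting the above together, with a hypothetical random scalar field and hypothetical names (the image is stored under the root group '/'):

[ ]:
import numpy as np
# Hypothetical 50x60 pixel-wise constant scalar field
field_array = np.random.rand(50, 60)
data.add_image_from_field(field_array=field_array, fieldname='my_field', imagename='my_image',
                          indexname='image', location='/',
                          description="Random scalar field on a 2D image.",
                          origin=[0., 0.], spacing=[1., 1.])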

Creating image groups from image objects

[ ]:
# import BasicTools image object `ConstantRectilinearMesh`
from BasicTools.Containers.ConstantRectilinearMesh import ConstantRectilinearMesh
# Initialize image object, image dimension, origin and pixel/voxel size
image_object = ConstantRectilinearMesh(dim=3)
image_object.SetDimensions((50,50,3))
image_object.SetOrigin([0.,0.,0.])
image_object.SetSpacing([1.,1.,1.]) # pixel/voxel size in each dimension
# Create Image Group in dataset
data.add_image(image_object, imagename='group_name', indexname='image_indexname', location='parent_name',
               description="""Write image group description here.""")

Creating empty images

[ ]:
data.add_image(imagename='group_name', indexname='image_indexname', location='parent_name',
               description="""Write image group description here.""")

Get image object from Image Group

[ ]:
# Get BasicTools image object from SampleData image group `group_name`, including the image group's field data arrays
im_object = data.get_image('group_name', with_fields=True)

Creating and getting image Fields

[ ]:
# Creating a field for image group `group_name` from Numpy array `tensor_field`
data.add_field(gridname='group_name', fieldname='node_name', location='parent_name', indexname='field_indexname',
               array=tensor_field, replace=True) # replace=True allows overwriting a preexisting field with the same name

# Getting image fields
# --> field returned as Numpy array
field = data.get_field('node_name')
field = data.get_node('node_name', as_numpy=True)
field = data['node_name']
field = data.node_name
# --> field returned as a Pytables Node object
field = data.get_node('node_name')

Creating a field time series

[ ]:
instants = [1.,10., 100.]
# Add three temporal values for the field `node_name` in image group `group_name`, for the 3 time values given in
# the `instants` array. Field values are stored in Numpy arrays temporal_field_0, temporal_field_1, temporal_field_2
# instant 0
data.add_field(gridname='group_name', fieldname='node_name', location='parent_name', indexname='Field',
               array=temporal_field_0, time=instants[0])
# instant 1
data.add_field(gridname='group_name', fieldname='node_name', location='parent_name', indexname='Field',
               array=temporal_field_1, time=instants[1])
# instant 2
data.add_field(gridname='group_name', fieldname='node_name', location='parent_name', indexname='Field',
               array=temporal_field_2, time=instants[2])

Mesh Groups and Mesh Fields: creating and getting them

Creating Mesh objects with BasicTools

[ ]:
# Import BasicTools mesh creation tools
import BasicTools.Containers.UnstructuredMeshCreationTools as UMCT
# Create a Node and Connectivity (elements) array, then create a mesh:
mesh = UMCT.CreateMeshOfTriangles(mesh_nodes, mesh_elements) # mesh of triangles
mesh = UMCT.CreateMeshOf(mesh_nodes, mesh_elements, elemName='tet4') # mesh of tetrahedra
# Create a mesh of a cube with tetrahedron elements
mesh = UMCT.CreateCube(dimensions=[5,5,5],spacing=[2.,2.,2.],ofTetras=True)

# adding node and element tags to the mesh
mesh.nodesTags.CreateTag('nodetag_name', False).SetIds(nodetag_Id_list) # Node tag
mesh.GetElementsOfType('tri3').GetTag('elemtag_name').SetIds(elemtag_Id_list) # Element tag ( of type `tri3`)

# adding fields
mesh.nodeFields['nodal_fieldname'] = nodal_field_array
mesh.elemFields['element_fieldname'] = elem_field_array
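
For reference, hypothetical `mesh_nodes` and `mesh_elements` input arrays for the creation tools above (node coordinates and triangle connectivity):

[ ]:
import numpy as np
import BasicTools.Containers.UnstructuredMeshCreationTools as UMCT
# Hypothetical input: 4 nodes and 2 triangles meshing a unit square
mesh_nodes = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.], [1., 1., 0.]])
mesh_elements = np.array([[0, 1, 2], [1, 3, 2]])
mesh = UMCT.CreateMeshOfTriangles(mesh_nodes, mesh_elements)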

Creating a Mesh Group in a dataset

[ ]:
# Creating Mesh Group from Mesh object
# mesh is a BasicTools mesh object. The bin_fields_from_sets option loads node and element tags into the Mesh Group
data.add_mesh(mesh_object=mesh, meshname='meshname', indexname='mesh_indexname', location='mesh_parent',
              bin_fields_from_sets=True)

# Creating Mesh group from file
data.add_mesh(file=meshfile_name, meshname='meshname', indexname='mesh_indexname', location='mesh_parent',
              bin_fields_from_sets=True)

Creating and getting Mesh Fields

[ ]:
# creation of the mesh field
data.add_field(gridname='meshname', fieldname='fieldname', array=field_data_array, indexname='field_indexname')
# Creation of a field belonging to a time series, at a given (hypothetical) time value `time_value`
data.add_field(gridname='meshname', fieldname='fieldname', array=field_data_array, indexname='field_indexname',
               time=time_value)
# Force an element field to be defined on boundary elements when the mesh has the same number of bulk and boundary
# elements (assumption: the `bulk_padding` keyword controls this behavior)
data.add_field(gridname='meshname', fieldname='fieldname', array=field_data_array, indexname='field_indexname',
               bulk_padding=False)

# getting the array as it was added --> no options
field_data_array = data.get_field('fieldname')
# getting the visualization array of an integration point field
field_data_array = data.get_field('fieldname', get_visualisation_field=True)
# getting the padded (raw stored) visualization array of an integration point field
field_data_array = data.get_field('fieldname', unpad_field=False, get_visualisation_field=True)

Getting Mesh objects

[ ]:
# Get a BasicTools mesh object with all content of Mesh group 'meshname' (fields, tags, nodes, elements)
mesh = data.get_mesh('meshname')
# Get a BasicTools mesh object without fields (tags, nodes, elements) from Mesh group 'meshname'
mesh = data.get_mesh('meshname', with_fields=False)
# Get a BasicTools mesh object without fields and tags (just nodes, elements) from Mesh group 'meshname'
mesh = data.get_mesh('meshname', with_fields=False, with_tags=False)
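
A possible follow-up, using the BasicTools mesh attributes already used earlier in this sheet, is to inspect the returned object:

[ ]:
# Inspect the returned BasicTools mesh object
print(mesh.nodes.shape)        # node coordinates array
print(mesh.nodeFields.keys())  # nodal fields loaded from the Mesh Group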

Data Compression

[ ]:
# Set chunkshape and compression settings for one data item
compression_options = {'complib':'zlib', 'complevel':1, 'shuffle':True}
chunkshape = c_shape # tuple
data.set_chunkshape_and_compression(nodename='nodename_to_compress', compression_options=compression_options,
                                    chunkshape=chunkshape)

# Set chunkshape and compression settings for several nodes
compression_options = {'complib':'zlib', 'complevel':1, 'shuffle':True}
data.set_nodes_compression_chunkshape(node_list=['nodename_to_compress1', 'nodename_to_compress2',...],
                                      compression_options=compression_options,
                                      chunkshape=chunkshape)

# Apply lossy compression
compression_options = {'complib':'zlib', 'complevel':1, 'shuffle':True, 'least_significant_digit':2}
data.set_chunkshape_and_compression(nodename='nodename_to_compress', compression_options=compression_options,
                                    chunkshape=chunkshape)

# Apply lossy compression with normalization
compression_options = {'complib':'zlib', 'complevel':1, 'shuffle':True, 'least_significant_digit':2,
                       'normalization':'standard'}
data.set_chunkshape_and_compression(nodename='nodename_to_compress', compression_options=compression_options,
                                    chunkshape=chunkshape)


# Apply lossy compression with per-component normalization
compression_options = {'complib':'zlib', 'complevel':1, 'shuffle':True, 'least_significant_digit':2,
                       'normalization':'standard-per-component'}
data.set_chunkshape_and_compression(nodename='nodename_to_compress', compression_options=compression_options,
                                    chunkshape=chunkshape)

# Create an array with predefined chunkshape and compression settings
data.add_data_array(name='arrayname', indexname='array_indexname', location='parent_name', array=array,
                    chunkshape=chunkshape, compression_options=compression_options)
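
To check the effect of the chosen settings, the node disk size can be inspected with the method shown earlier in this sheet:

[ ]:
# Compare the node disk size before and after changing compression settings
size, unit = data.get_node_disk_size(nodename='nodename_to_compress', print_flag=False, convert=True)
print(size, unit)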