Data structure & access

Data produced during sessions is published to Redis (RAM storage) and at the same time written to disk in an HDF5 file (see data saving).

In Redis, data is kept for a limited period of time (1 day by default) and up to a limited total size (1 GB by default).

Experiment and data file structure

A complete experiment can be seen as a succession of measurements on different samples under different conditions.

In a Bliss session, the experiment can be seen as a tree, where the trunk is the session itself and where each measurement performed for a given sample and experimental conditions is a branch.

As an example, let’s consider:

two samples: ‘sample1’ and ‘sample2’
a session named ‘test_session’
several measurements with the two samples
a measurement that consists of scanning one of the samples by moving a motor ‘roby’ and measuring the value of a counter ‘diode’ along the scan.

The sample scan is performed with the following command (for details, see: scan commands):

ascan(roby, 0, 9, 10, 0.1, diode)

To start the session:

bliss -s test_session

Format data saving path for the experiment:

When a session is started for the first time, a directory with the same name as the session is created and the scan data is stored in this directory.

As two different samples will be scanned, one sub-directory per sample will be created. To do that, the SCAN_SAVING object has to be used. The data saving path is customized by adding a new parameter ‘s_name’ that can be used in the path template.

SCAN_SAVING.add('s_name', '')                # add a parameter named "s_name".
SCAN_SAVING.template = '{session}/{s_name}/' # modify data saving path template.
SCAN_SAVING.s_name = 'sample1'               # set value of parameter "s_name".
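The effect of the template can be pictured as named-placeholder substitution. The sketch below is not the BLISS implementation: it assumes the template behaves like a Python format string, and the helper name expand_template is hypothetical.

```python
def expand_template(template, **params):
    """Fill named placeholders such as {session} with parameter values."""
    return template.format(**params)


# Same template as above, evaluated for each sample in turn.
template = "{session}/{s_name}/"
path_sample1 = expand_template(template, session="test_session", s_name="sample1")
path_sample2 = expand_template(template, session="test_session", s_name="sample2")
print(path_sample1)
print(path_sample2)
```

Changing only the value of s_name is enough to redirect the data to another sub-directory, which is what the rest of this example relies on.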

Perform a first measurement:

ascan(roby, 0, 9, 10, 0.1, diode)

Perform a second measurement:

ascan(roby, 0, 9, 10, 0.1, diode)

Change the data saving path for measurements on sample2:

SCAN_SAVING.s_name = 'sample2'

Perform a measurement:

ascan(roby, 0, 9, 10, 0.1, diode)

For this experiment, the file structure inside the session main folder is described by the following tree:

Screenshot

The measurement data can be accessed by reading the content of the HDF5 files (see silx, pymca). One file usually contains the data of multiple scans.
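Besides graphical tools, a file can also be inspected from Python. h5py groups behave like mappings (they provide .items()), so listing the content comes down to a recursive walk. The sketch below uses nested dictionaries as a stand-in for an opened file so that it runs without h5py; the group and dataset names are invented for illustration and do not describe the actual layout of the files written by BLISS.

```python
def list_datasets(group, prefix=""):
    """Return the slash-separated paths of all leaves below a mapping-like group."""
    paths = []
    for name, item in group.items():
        path = f"{prefix}/{name}"
        if hasattr(item, "items"):
            # Sub-group: descend into it.
            paths.extend(list_datasets(item, path))
        else:
            # Leaf: treated as a dataset.
            paths.append(path)
    return paths


# Hypothetical layout: two scans, each holding one array per channel.
fake_file = {
    "1_ascan": {"roby": [0.0, 1.0], "diode": [70.0, -57.0]},
    "2_ascan": {"roby": [0.0, 1.0], "diode": [12.0, 5.0]},
}
dataset_paths = list_datasets(fake_file)
print(dataset_paths)
```

With h5py installed, the same function should work on a file object opened in read mode, since such an object exposes the same mapping interface.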

Screenshot

Experiment and Redis data structure

In parallel with the in-file data storage described above, Redis stores the data in RAM as soon as it is produced. Retrieving data from Redis therefore gives fast, live access to the data.

Accessing sessions

bliss.data.node.sessions_list() gives access to the list of sessions having data published in Redis. It returns a list of DataNodeContainer objects.

DEMO [6]: import bliss
DEMO [7]: bliss.data.node.sessions_list()
 Out [7]: [<bliss.data.node.DataNodeContainer object at 0x7f9e83ebab00>,
           <bliss.data.node.DataNodeContainer object at 0x7f9e83ffb828>]

DEMO [8]: bliss.data.node.sessions_list()[0].name
 Out [8]: 'demo'

Redis stores the data as a flat list of (key:value) pairs, but fortunately the bliss.data.node module provides a simple interface to access the data in a structured way that reflects the session structure.

from bliss.data.node import get_node
n = get_node("test_session")
for node in n.iterator.walk(wait=False):
    print(node.name, node.db_name, node)

test_session    test_session                                                      <bliss.data.node.DataNodeContainer object at 0x7ff7064f36d8>
sample1         test_session:mnt:c:tmp:sample1                                    <bliss.data.node.DataNodeContainer object at 0x7ff7063d3b38>
1_ascan         test_session:mnt:c:tmp:sample1:1_ascan                            <bliss.data.scan.Scan object at 0x7ff7063e7e10>
axis            test_session:mnt:c:tmp:sample1:1_ascan:axis                       <bliss.data.node.DataNodeContainer object at 0x7ff7063e7f60>
roby            test_session:mnt:c:tmp:sample1:1_ascan:axis:roby                  <bliss.data.channel.ChannelDataNode object at 0x7ff7062a4668>
timer           test_session:mnt:c:tmp:sample1:1_ascan:axis:timer                 <bliss.data.node.DataNodeContainer object at 0x7ff7063f5978>
elapsed_time    test_session:mnt:c:tmp:sample1:1_ascan:axis:timer:elapsed_time    <bliss.data.channel.ChannelDataNode object at 0x7ff70633e748>
diode           test_session:mnt:c:tmp:sample1:1_ascan:axis:timer:diode           <bliss.data.node.DataNodeContainer object at 0x7ff70633ef98>
diode           test_session:mnt:c:tmp:sample1:1_ascan:axis:timer:diode:diode     <bliss.data.channel.ChannelDataNode object at 0x7ff7063670f0>
2_ascan         test_session:mnt:c:tmp:sample1:2_ascan                            <bliss.data.scan.Scan object at 0x7ff7063e7438>
axis            test_session:mnt:c:tmp:sample1:2_ascan:axis                       <bliss.data.node.DataNodeContainer object at 0x7ff70633ee10>
roby            test_session:mnt:c:tmp:sample1:2_ascan:axis:roby                  <bliss.data.channel.ChannelDataNode object at 0x7ff7062edd30>
timer           test_session:mnt:c:tmp:sample1:2_ascan:axis:timer                 <bliss.data.node.DataNodeContainer object at 0x7ff7062edd68>
elapsed_time    test_session:mnt:c:tmp:sample1:2_ascan:axis:timer:elapsed_time    <bliss.data.channel.ChannelDataNode object at 0x7ff70633e470>
diode           test_session:mnt:c:tmp:sample1:2_ascan:axis:timer:diode           <bliss.data.node.DataNodeContainer object at 0x7ff70633efd0>
diode           test_session:mnt:c:tmp:sample1:2_ascan:axis:timer:diode:diode     <bliss.data.channel.ChannelDataNode object at 0x7ff7063e7908>
sample2         test_session:mnt:c:tmp:sample2                                    <bliss.data.node.DataNodeContainer object at 0x7ff7063f59b0>
1_ascan         test_session:mnt:c:tmp:sample2:1_ascan                            <bliss.data.scan.Scan object at 0x7ff7063e74e0>
axis            test_session:mnt:c:tmp:sample2:1_ascan:axis                       <bliss.data.node.DataNodeContainer object at 0x7ff7062edb00>
roby            test_session:mnt:c:tmp:sample2:1_ascan:axis:roby                  <bliss.data.channel.ChannelDataNode object at 0x7ff7063f5240>
timer           test_session:mnt:c:tmp:sample2:1_ascan:axis:timer                 <bliss.data.node.DataNodeContainer object at 0x7ff7063f5710>
elapsed_time    test_session:mnt:c:tmp:sample2:1_ascan:axis:timer:elapsed_time    <bliss.data.channel.ChannelDataNode object at 0x7ff70633ea20>
diode           test_session:mnt:c:tmp:sample2:1_ascan:axis:timer:diode           <bliss.data.node.DataNodeContainer object at 0x7ff70633e358>
diode           test_session:mnt:c:tmp:sample2:1_ascan:axis:timer:diode:diode     <bliss.data.channel.ChannelDataNode object at 0x7ff70633ea90>

In the call n = get_node("node_db_name"), node_db_name is your entry point and n is the associated DataNodeContainer.

Note

Prefer get_session_node() when you know that get_node() would return a session node.

With the function n.iterator.walk(wait=False) you can iterate over all the child nodes of the node n (see DataNodeIterator).

Among the child nodes, two other types of nodes can be distinguished:

the ChannelDataNode
and the Scan.

They both inherit from the DataNodeContainer class.

The measured data is associated with the ChannelDataNode objects, and the scan objects with the Scan nodes.

Screenshot

You can access any node using its full name (db_name):

cdn_roby = get_node("test_session:mnt:c:tmp:sample1:1_ascan:axis:roby")
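Judging from the listing above, a db_name reads as a colon-separated path from the session down to the node. Under that assumption (taken from the printed names only, not from the BLISS source), the flat names can be regrouped into a nested structure with a few lines of plain Python. The helper build_tree is hypothetical.

```python
def build_tree(db_names):
    """Group colon-separated db_names into a nested dictionary."""
    tree = {}
    for db_name in db_names:
        level = tree
        for part in db_name.split(":"):
            # Create the child on first sight, then step into it.
            level = level.setdefault(part, {})
    return tree


db_names = [
    "test_session:mnt:c:tmp:sample1:1_ascan:axis:roby",
    "test_session:mnt:c:tmp:sample1:1_ascan:axis:timer:elapsed_time",
    "test_session:mnt:c:tmp:sample2:1_ascan:axis:roby",
]
tree = build_tree(db_names)
samples = sorted(tree["test_session"]["mnt"]["c"]["tmp"])
print(samples)
```

This only illustrates how the flat names encode the hierarchy; in a real session the node objects returned by get_node() already give structured access.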

Online data analysis

The classes inheriting from the DataNode class provide an iterator attribute, which returns a DataNodeIterator object.

The DataNodeIterator provides the best methods to monitor the events happening during the experiment and follow the data production.

The method walk(filter=None, wait=True) iterates over existing child nodes that match the filter argument. If wait is True (default), the function blocks until a new node appears. It returns the new node when it appears and then waits again for a new node.
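The non-blocking part of this behaviour (wait=False) can be pictured as a depth-first traversal with a type filter. The following sketch is a simplified model, not the DataNodeIterator code: the Node class is a hypothetical stand-in, the filter is assumed to match on the node type, and only descendants of the starting node are yielded.

```python
class Node:
    """Minimal stand-in for a data node: a name, a type, and children."""

    def __init__(self, name, node_type=None, children=()):
        self.name = name
        self.type = node_type
        self.children = list(children)


def walk(node, filter=None):
    """Yield descendants of node depth-first, keeping those whose type matches."""
    for child in node.children:
        if filter is None or child.type == filter:
            yield child
        # Descend even when the child itself is filtered out.
        yield from walk(child, filter)


scan = Node("1_ascan", "scan", [
    Node("axis", None, [
        Node("roby", "channel"),
        Node("timer", None, [
            Node("elapsed_time", "channel"),
            Node("diode", None, [Node("diode", "channel")]),
        ]),
    ]),
])
session = Node("test_session", None, [scan])

all_names = [n.name for n in walk(session)]
channel_names = [n.name for n in walk(session, filter="channel")]
print(all_names)
print(channel_names)
```

The blocking mode (wait=True) additionally keeps the iteration open and resumes whenever a new node is published, which this model does not attempt to reproduce.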

The method walk_events(filter=None) walks through child nodes just like the walk method, but waits for node events (like EVENTS.NEW_CHILD or EVENTS.NEW_DATA_IN_CHANNEL). It returns the event type and the node, then waits again for the next event.

import gevent
from bliss.data.node import get_node

session = get_node("test_session")
def f(filter=None):
    """wait for any new node in the session"""
    for node in session.iterator.walk(filter=filter):
        print(node.name, node.type)

def g(filter='channel'):
    """wait for a new event happening in any node of the
    type 'channel' (ChannelDataNode)
    """
    for event_type, node in session.iterator.walk_events(filter=filter):
        print(event_type, node.name, node.get(-1))

# spawn greenlets to avoid blocking
g1 = gevent.spawn(f)
g2 = gevent.spawn(g)

# start a scan
ascan(roby,0,9,10,0.01,diode)

# the monitoring prints pop out during the scan
# (produced by f and g running in greenlets)
10_ascan scan
axis None
roby channel
event.NEW_CHILD roby None
timer None
elapsed_time channel
event.NEW_CHILD elapsed_time None
diode None
diode channel
event.NEW_CHILD diode None
event.NEW_DATA_IN_CHANNEL elapsed_time 0.0
event.NEW_DATA_IN_CHANNEL roby 0.0
event.NEW_DATA_IN_CHANNEL diode 70.0
event.NEW_DATA_IN_CHANNEL elapsed_time 0.157334566116333
event.NEW_DATA_IN_CHANNEL roby 1.0
event.NEW_DATA_IN_CHANNEL diode -57.0
event.NEW_DATA_IN_CHANNEL elapsed_time 0.33751416206359863
event.NEW_DATA_IN_CHANNEL roby 2.0
event.NEW_DATA_IN_CHANNEL diode -61.0
...

# do not forget to kill the greenlets at the end
g1.kill()
g2.kill()

Note

In the example above node.get(-1) is used to retrieve the last data value produced on this node. For more details see method get(from_index,to_index=None) of the ChannelDataNode.
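A typical online-analysis step is to accumulate the values per channel as the events arrive. The sketch below shows that bookkeeping on its own. Plain tuples with string event names stand in for the (event_type, node) pairs yielded by walk_events; this is an assumption made so the example runs without a BLISS session, and the helper name is hypothetical.

```python
from collections import defaultdict


def collect_channel_data(events):
    """Accumulate the values carried by data events, grouped by channel name."""
    data = defaultdict(list)
    for event_type, channel_name, value in events:
        # Only data events carry a measurement; skip node-creation events.
        if event_type == "NEW_DATA_IN_CHANNEL":
            data[channel_name].append(value)
    return dict(data)


# Events transcribed from the monitoring output above (elapsed time rounded).
events = [
    ("NEW_CHILD", "roby", None),
    ("NEW_DATA_IN_CHANNEL", "elapsed_time", 0.0),
    ("NEW_DATA_IN_CHANNEL", "roby", 0.0),
    ("NEW_DATA_IN_CHANNEL", "diode", 70.0),
    ("NEW_DATA_IN_CHANNEL", "elapsed_time", 0.157),
    ("NEW_DATA_IN_CHANNEL", "roby", 1.0),
    ("NEW_DATA_IN_CHANNEL", "diode", -57.0),
]
per_channel = collect_channel_data(events)
print(per_channel)
```

In a live session the same loop body could sit inside a function like g() above, reading the value with node.get(-1) instead of taking it from a tuple.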

Lima Data View

2D images from Lima servers have a special lima node type, which corresponds to the LimaImageChannelDataNode class. Calling .get() on those nodes does not return the raw images directly, since they can be very large, but a “view” on the data. The view holds references to the data already produced by the image channel at the moment the view object is instantiated.

g3 = gevent.spawn(g, "lima")

lima_simulator=config.get("lima_simulator")
ct(0.1, lima_simulator)
event.NEW_NODE image <bliss.data.lima.LimaImageChannelDataNode.LimaDataView object at 0x7f4d28931be0>
event.NEW_DATA_IN_CHANNEL image <bliss.data.lima.LimaImageChannelDataNode.LimaDataView object at 0x7f4d28931c18>

Function g() calls .get(-1) on the node object, thus returning the last image view. This works the same as with other nodes. It is perfectly valid to only get a view with partial data, for example: .get(0, 10) will return a view object for images from 0 to 10 only.
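The idea of a view can be modelled as an object that keeps lightweight references and resolves them only when an image is requested. The class below is a hypothetical illustration of that pattern, not the LimaDataView implementation; the loader function and the tiny 2x2 "images" are made up.

```python
class ImageView:
    """Hypothetical view: stores image references, loads pixels only on demand."""

    def __init__(self, references, loader):
        self.references = list(references)  # cheap handles, no pixel data
        self.loader = loader                # callable that resolves one reference
        self.loads = 0                      # counts how many images were fetched

    def get_image(self, index):
        """Resolve and return the image at the given (possibly negative) index."""
        self.loads += 1
        return self.loader(self.references[index])


def fake_loader(reference):
    # Stand-in for fetching from a server or a file: returns a tiny 2x2 "image".
    return [[reference, reference], [reference, reference]]


view = ImageView(references=range(3), loader=fake_loader)
loads_before = view.loads
last_image = view.get_image(-1)
print(loads_before, view.loads, last_image)
```

Creating the view costs almost nothing regardless of the image size; the expensive transfer happens once per requested image.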

Getting raw image data

The .get_image() method can be called on LimaDataView objects to retrieve raw image data from references stored within a Lima data view object.

ct(0.1, lima_simulator)
Scan(number=1, name=ct, path=<no saving>)

SCANS[-1].node
<bliss.data.scan.Scan object at 0x7efda1dd0898>

# there is only one Lima node for this 'ct'
lima_node = next(SCANS[-1].node.iterator.walk(filter="lima", wait=False))

lima_data_view = lima_node.get(-1)

raw_image_data = lima_data_view.get_image(-1)

Note

.get_image(-1) means “get latest image from the view”. In the example above, there is only one image so it would be equivalent to .get_image(0).

Scanning & experiment sequences

In the context of an experiment or a scan procedure, there is a more convenient way to obtain the data produced by a scan. At the shell level, a global variable SCANS stores the scan objects that have been launched in this shell session.

For example SCANS[-1] returns the last scan object and SCANS[-1].get_data() returns the data of that scan.

SCANS
deque([Scan(number=1, name=ascan, path=/mnt/c/tmp/sample2/test_session/data.h5),
       Scan(number=2, name=ascan, path=/mnt/c/tmp/sample2/test_session/data.h5)],
maxlen=20)

SCANS[-1]
Scan(number=11, name=ascan, path=/mnt/c/tmp/sample2/test_session/data.h5)

SCANS[-1].get_data()
{
'roby':         array([0., 1., 2., 3., 4., 5., 6., 7., 8., 9.]),
'elapsed_time': array([0., 0.15733, 0.33716, 0.49581, 0.65329,
                       0.80668, 0.94995, 1.12723, 1.28504, 1.43661 ]),
'diode':        array([ 70., -57., -61., -43., 89., 54., 23., -89., -87., -98.])
}
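Since the returned dictionary maps channel names to arrays of equal length, simple analysis is a matter of indexing. As a small sketch, the hypothetical helper below finds the motor position at which the counter reading is highest, using the values shown above copied into plain lists so that it runs without numpy.

```python
def position_of_maximum(data, motor, counter):
    """Return the motor position at which the counter reading is highest."""
    readings = data[counter]
    best_index = max(range(len(readings)), key=lambda i: readings[i])
    return data[motor][best_index]


# Values copied from the get_data() output above, as plain lists.
data = {
    "roby": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0],
    "diode": [70.0, -57.0, -61.0, -43.0, 89.0, 54.0, 23.0, -89.0, -87.0, -98.0],
}
peak_position = position_of_maximum(data, "roby", "diode")
print(peak_position)
```

With the numpy arrays returned in a real session, data["roby"][data["diode"].argmax()] would give the same result more directly.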

Note

Scans are stored per session in the .scans property of the Session object. The global SCANS is only available from the Bliss shell and refers to current_session.scans.
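The representation printed above shows that SCANS is a deque with maxlen=20, which suggests that only the most recent scans are kept. The standard-library behaviour of such a bounded deque can be checked in isolation; the integers below are placeholders for scan objects.

```python
from collections import deque

# A bounded deque silently discards its oldest entry once it is full.
scans = deque(maxlen=20)
for scan_number in range(1, 26):   # pretend 25 scans were run
    scans.append(scan_number)

oldest, latest = scans[0], scans[-1]
print(len(scans), oldest, latest)
```

Data from a scan that has dropped out of the deque would then have to be retrieved another way, for example from the saved file.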

Getting image data

Image data is not directly retrieved via the scan object .get_data() method, since it can be very voluminous. In case of an acquisition with images, the returned dictionary will contain a key with the image channel fullname, that gives access to a data view object.

The view object is used to get the raw image data from references. The image data may still be available in the Lima server memory; if not, the view may try other data sources and, as a last resort, tries to open the file (if image saving is activated).
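This lookup order is an instance of a general fallback pattern: try each source in turn and return the first result. The sketch below illustrates the pattern only. The source functions, their names, and the use of LookupError are assumptions for the example and do not reflect how the Lima data view is actually implemented.

```python
def first_available(sources):
    """Try each (name, fetch) pair in order; return the first successful result."""
    errors = []
    for name, fetch in sources:
        try:
            return name, fetch()
        except LookupError as exc:
            # This source no longer holds the image; remember why and move on.
            errors.append(f"{name}: {exc}")
    raise LookupError("image unavailable from all sources: " + "; ".join(errors))


def from_server_memory():
    # Hypothetical: the server buffer has already been overwritten.
    raise LookupError("frame no longer in memory")


def from_file():
    # Hypothetical: the saved file still contains the frame.
    return [[0, 0], [0, 0]]


source_name, image = first_available([
    ("server memory", from_server_memory),
    ("file", from_file),
])
print(source_name, image)
```

Ordering the sources from fastest to slowest means the file is only opened when nothing quicker can provide the image.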

The Lima data view object has a .get_image(image_index) method that returns the raw image data:

BLISS [1]: lima_simulator=config.get("lima_simulator")
BLISS [2]: ct(0.1, lima_simulator)
  Out [2]: Scan(number=14, name=ct, path=<no saving>)
BLISS [3]: SCANS[-1].get_data()['lima_simulator:image'].get_image(-1)
  Out [3]: array([[0, 0, 0, ..., 0, 0, 0],
                  [0, 0, 0, ..., 0, 0, 0],
                  [0, 0, 0, ..., 0, 0, 0],
                   ...,
                  [0, 0, 0, ..., 0, 0, 0],
                  [0, 0, 0, ..., 0, 0, 0],
                  [0, 0, 0, ..., 0, 0, 0]], dtype=uint32)