
cheetah_streaming

Cheetah Streaming.

This module contains a specific version of Cheetah, a data-processing program for Serial X-ray Crystallography. Compared to Cheetah, this version processes data frames but does not save the extracted data to files: it sends the data to external programs for further processing.

StreamingCheetahProcessing

Bases: OmProcessingProtocol

See documentation for the __init__ function.

__init__(*, monitor_parameters)

Cheetah Streaming.

This Processing class implements the Cheetah Streaming software package. Cheetah Streaming processes detector data frames, detecting Bragg peaks in each frame using the Peakfinder8PeakDetection algorithm. It retrieves information about the location, size, intensity, SNR and maximum pixel value of each peak, and then streams the information retrieved from the facility or extracted from the data to external programs for further processing. Optionally, it can also broadcast full detector data frames. Cheetah Streaming can also compute sums of detector data frames, calculating separate sums for hit and non-hit frames, and write them to HDF5 sum files. The sums can be saved together with their corresponding Virtual Powder patterns. Cheetah Streaming can also respond to requests for data or for a change of behavior from external programs (a control GUI, for example).

This class implements the interface described by its base Protocol class. Please see the documentation of that class for additional information about the interface.

Parameters:

Name Type Description Default
monitor_parameters MonitorParameters

An object storing OM's configuration parameters.

required
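The order in which the methods documented on this page are called can be illustrated with a minimal sketch. It is not OM's implementation: `ToyProcessing` and `run_in_one_process` are hypothetical names, a plain dictionary stands in for the MonitorParameters object, the `detector_data` key is invented, and the rank numbers (0 for the collecting node, 1 for a single processing node) are assumptions made only for the example. In a real deployment the nodes run as separate processes, whereas this sketch drives one shared object.

```python
from typing import Any, Dict, List, Optional, Tuple


class ToyProcessing:
    """Toy class exposing the method names documented on this page."""

    def __init__(self, *, monitor_parameters: Dict[str, Any]) -> None:
        # Assumption: a dict stands in for the MonitorParameters object.
        self._parameters = monitor_parameters
        self._frames_processed = 0
        self.collected: List[Tuple[Dict[str, Any], int]] = []

    def initialize_processing_node(self, *, node_rank: int, node_pool_size: int) -> None:
        self._frames_processed = 0

    def initialize_collecting_node(self, *, node_rank: int, node_pool_size: int) -> None:
        self.collected = []

    def process_data(
        self, *, node_rank: int, node_pool_size: int, data: Dict[str, Any]
    ) -> Tuple[Dict[str, Any], int]:
        # Reduce the frame to a small dictionary and tag it with the node rank.
        self._frames_processed += 1
        return {"num_values": len(data["detector_data"])}, node_rank

    def collect_data(
        self, *, node_rank: int, node_pool_size: int,
        processed_data: Tuple[Dict[str, Any], int],
    ) -> None:
        self.collected.append(processed_data)

    def end_processing_on_processing_node(
        self, *, node_rank: int, node_pool_size: int
    ) -> Optional[Dict[str, Any]]:
        return None

    def end_processing_on_collecting_node(self, *, node_rank: int, node_pool_size: int) -> None:
        pass


def run_in_one_process(protocol: ToyProcessing, frames: List[Dict[str, Any]]):
    """Calls the methods in order, simulating two nodes inside one process."""
    pool_size = 2  # one collecting node (rank 0) plus one processing node (rank 1)
    protocol.initialize_collecting_node(node_rank=0, node_pool_size=pool_size)
    protocol.initialize_processing_node(node_rank=1, node_pool_size=pool_size)
    for frame in frames:
        result = protocol.process_data(node_rank=1, node_pool_size=pool_size, data=frame)
        protocol.collect_data(node_rank=0, node_pool_size=pool_size, processed_data=result)
    protocol.end_processing_on_processing_node(node_rank=1, node_pool_size=pool_size)
    protocol.end_processing_on_collecting_node(node_rank=0, node_pool_size=pool_size)
    return protocol.collected
```

All calls in the driver pass their arguments by keyword, which works whether or not a given method declares them as keyword-only.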

initialize_processing_node(*, node_rank, node_pool_size)

Initializes the processing nodes for Cheetah Streaming.

This function initializes all the required algorithms (peak finding, binning, etc.), plus some internal counters.

Please see the documentation of the base Protocol class for additional information about this method.

Parameters:

Name Type Description Default
node_rank int

The OM rank of the current node, which is an integer that unambiguously identifies the current node in the OM node pool.

required
node_pool_size int

The total number of nodes in the OM pool, including all the processing nodes and the collecting node.

required

initialize_collecting_node(node_rank, node_pool_size)

Initializes the collecting node for Cheetah Streaming.

This function initializes the data accumulation algorithms, the storage buffers used to compute aggregated statistics on the processed data, and some internal counters. Additionally, it prepares all the necessary network sockets.

Please see the documentation of the base Protocol class for additional information about this method.

Parameters:

Name Type Description Default
node_rank int

The OM rank of the current node, which is an integer that unambiguously identifies the current node in the OM node pool.

required
node_pool_size int

The total number of nodes in the OM pool, including all the processing nodes and the collecting node.

required

process_data(*, node_rank, node_pool_size, data)

Processes a detector data frame.

This function processes retrieved data events, extracting the Bragg peak information. It prepares the reduced data (and optionally, the detector frame data) to be transmitted to the collecting node.

Please see the documentation of the base Protocol class for additional information about this method.

Parameters:

Name Type Description Default
node_rank int

The OM rank of the current node, which is an integer that unambiguously identifies the current node in the OM node pool.

required
node_pool_size int

The total number of nodes in the OM pool, including all the processing nodes and the collecting node.

required
data Dict[str, Any]

A dictionary containing the data that OM retrieved for the detector data frame being processed.

  • The dictionary keys describe the Data Sources for which OM has retrieved data. The keys must match the source names listed in the required_data entry of OM's om configuration parameter group.

  • The corresponding dictionary values must store the data that OM retrieved for each of the Data Sources.

required

Returns:

Type Description
Tuple[Dict[str, Any], int]

A tuple with two entries. The first entry is a dictionary storing the processed data that should be sent to the collecting node. The second entry is the OM rank number of the node that processed the information.
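The shape of the returned tuple can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: the function name `reduce_frame`, the dictionary keys, the representation of a peak as a dictionary with an `intensity` entry, and the rule that a frame counts as a hit when it contains at least `min_peaks_for_hit` peaks are all hypothetical.

```python
from typing import Any, Dict, List, Tuple


def reduce_frame(
    *, node_rank: int, peaks: List[Dict[str, float]], min_peaks_for_hit: int = 10
) -> Tuple[Dict[str, Any], int]:
    """Builds a (reduced data, node rank) tuple from a list of detected peaks."""
    reduced: Dict[str, Any] = {
        # Hypothetical keys: the real dictionary layout is not described here.
        "num_peaks": len(peaks),
        "peak_intensities": [peak["intensity"] for peak in peaks],
        # Assumed hit criterion: enough peaks were found in the frame.
        "frame_is_hit": len(peaks) >= min_peaks_for_hit,
    }
    # The second entry identifies the node that processed the frame.
    return reduced, node_rank
```

Returning the rank alongside the data lets the collecting node tell which processing node each result came from.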

wait_for_data(*, node_rank, node_pool_size)

Receives and handles requests from external programs.

This function receives requests from external programs over a network socket and reacts according to the nature of the request, sending data back to the source of the request or modifying the internal behavior of Cheetah Streaming.

Please see the documentation of the base Protocol class for additional information about this method.

Parameters:

Name Type Description Default
node_rank int

The OM rank of the current node, which is an integer that unambiguously identifies the current node in the OM node pool.

required
node_pool_size int

The total number of nodes in the OM pool, including all the processing nodes and the collecting node.

required
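The two kinds of reaction described above (sending data back, or changing internal behavior) can be sketched as a small dispatcher. The network layer is left out on purpose, and everything named here is hypothetical: the `RequestHandler` class, the request types `get_stats` and `set_broadcast`, and the reply format are invented for illustration and do not describe the messages that Cheetah Streaming actually exchanges.

```python
from typing import Any, Dict


class RequestHandler:
    """Reacts to already-decoded requests from an external program."""

    def __init__(self) -> None:
        # Internal behavior that a request is allowed to change.
        self.broadcast_frames = False
        # Data that a request is allowed to read.
        self.latest_stats: Dict[str, int] = {"num_events": 0, "num_hits": 0}

    def handle(self, request: Dict[str, Any]) -> Dict[str, Any]:
        kind = request.get("type")
        if kind == "get_stats":
            # Request for data: reply with a copy of the current statistics.
            return {"status": "ok", "data": dict(self.latest_stats)}
        if kind == "set_broadcast":
            # Request for a change of behavior: toggle frame broadcasting.
            self.broadcast_frames = bool(request.get("enabled", False))
            return {"status": "ok"}
        # Anything else is reported back to the sender rather than ignored.
        return {"status": "error", "reason": f"unknown request: {kind!r}"}
```

Keeping the dispatch logic separate from the socket code makes it possible to exercise the request handling without a network connection.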

collect_data(*, node_rank, node_pool_size, processed_data)

Computes statistics on aggregated data and broadcasts data to external programs.

This function collects and accumulates frame- and peak-related information received from the processing nodes, and streams it to external programs. Optionally, it computes the sums of hit and non-hit detector frames and the corresponding virtual powder patterns, and saves them to file. Additionally, this function writes information about the processing statistics (number of processed events, number of hits found and elapsed time) to a status file at regular intervals. External programs can inspect the file to follow the progress of the data processing.

Please see the documentation of the base Protocol class for additional information about this method.

Parameters:

Name Type Description Default
node_rank int

The OM rank of the current node, which is an integer that unambiguously identifies the current node in the OM node pool.

required
node_pool_size int

The total number of nodes in the OM pool, including all the processing nodes and the collecting node.

required
processed_data Tuple[Dict, int]

A tuple whose first entry is a dictionary storing the data received from a processing node, and whose second entry is the OM rank number of the node that processed the information.

required
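The accumulation of separate hit and non-hit sums can be sketched in plain Python. The sketch rests on several assumptions: `SumAccumulator` is a hypothetical name, frames are small nested lists rather than detector arrays, peaks are `(row, column, intensity)` tuples, and a virtual powder pattern is taken to mean an image in which each peak's intensity is added at the pixel where the peak was found. Writing the results to an HDF5 file is not shown.

```python
from typing import Dict, List, Tuple

Frame = List[List[float]]
Peak = Tuple[int, int, float]  # (row, column, intensity), an assumed layout


def zeros(shape: Tuple[int, int]) -> Frame:
    """Returns a rows x cols image filled with zeros."""
    rows, cols = shape
    return [[0.0] * cols for _ in range(rows)]


class SumAccumulator:
    """Keeps running sums for hit (True) and non-hit (False) frames."""

    def __init__(self, shape: Tuple[int, int]) -> None:
        self.num_frames: Dict[bool, int] = {True: 0, False: 0}
        self.frame_sum: Dict[bool, Frame] = {True: zeros(shape), False: zeros(shape)}
        self.powder: Dict[bool, Frame] = {True: zeros(shape), False: zeros(shape)}

    def add(self, frame: Frame, peaks: List[Peak], is_hit: bool) -> None:
        self.num_frames[is_hit] += 1
        # Add the full frame to the sum of its class, pixel by pixel.
        total = self.frame_sum[is_hit]
        for row_index, row in enumerate(frame):
            for col_index, value in enumerate(row):
                total[row_index][col_index] += value
        # Add only the peak intensities to the virtual powder pattern.
        powder = self.powder[is_hit]
        for row_index, col_index, intensity in peaks:
            powder[row_index][col_index] += intensity
```

With real detector data, array operations would replace the explicit loops; the bookkeeping of two independent accumulators stays the same.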

end_processing_on_processing_node(*, node_rank, node_pool_size)

Ends processing on the processing nodes for Cheetah Streaming.

This function prints a message on the console and ends the processing.

Please see the documentation of the base Protocol class for additional information about this method.

Parameters:

Name Type Description Default
node_rank int

The OM rank of the current node, which is an integer that unambiguously identifies the current node in the OM node pool.

required
node_pool_size int

The total number of nodes in the OM pool, including all the processing nodes and the collecting node.

required

Returns:

Type Description
Union[Dict[str, Any], None]

Usually nothing. Optionally, a dictionary storing information to be sent to the collecting node.

end_processing_on_collecting_node(*, node_rank, node_pool_size)

Ends processing on the collecting node for Cheetah Streaming.

This function prints a message on the console, writes the final information in the sum and status files, closes the files and ends the processing.

Please see the documentation of the base Protocol class for additional information about this method.

Parameters:

Name Type Description Default
node_rank int

The OM rank of the current node, which is an integer that unambiguously identifies the current node in the OM node pool.

required
node_pool_size int

The total number of nodes in the OM pool, including all the processing nodes and the collecting node.

required
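Because external programs read the status file while processing is still running, one way to write it is to replace the file in a single step. The sketch below shows that approach under stated assumptions: the function name `write_status_file`, the text layout of the file and its field labels are hypothetical, and writing to a temporary file followed by `os.replace` is a technique chosen for this example, not a description of how Cheetah Streaming writes its status file.

```python
import os
import tempfile
import time


def write_status_file(
    path: str, *, num_events: int, num_hits: int, start_time: float
) -> None:
    """Writes processing statistics so that readers never see a partial file."""
    elapsed = time.time() - start_time
    lines = [
        # Hypothetical field labels, one statistic per line.
        f"Frames processed: {num_events}",
        f"Number of hits: {num_hits}",
        f"Elapsed time: {elapsed:.1f}",
    ]
    # Write to a temporary file in the same directory, then swap it in.
    directory = os.path.dirname(os.path.abspath(path))
    descriptor, temporary_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(descriptor, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    os.replace(temporary_path, path)
```

The same function can be called at regular intervals during collection and once more when processing ends, so the last version of the file holds the final statistics.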