How to write a custom control line handling plugin module for spec#

A custom plugin module for spec2nexus.spec is provided as a Python module (a Python source code file). The module contains one subclass for each new control line to be supported. An exception will be raised if a custom plugin module tries to provide support for an existing control line.

Load a plugin module#

Control line handling plugins for spec2nexus will automatically register themselves when their module is imported.

import pathlib
import spec2nexus.plugin_core
import spec2nexus.spec

# load each custom plugin file:
path = pathlib.Path("my_plugin_file.py").absolute()
spec2nexus.plugin_core.install_user_plugin(path)

# read a SPEC data file, scan 5
spec_data_file = spec2nexus.spec.SpecDataFile("path/to/spec/datafile")
scan5 = spec_data_file.getScan(5)
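Registration on import can be pictured with a small, self-contained sketch. It is not the spec2nexus implementation: HandlerBase, registry, and DuplicateKeyError are hypothetical stand-ins. The sketch assumes that the base class records each subclass as soon as its class statement runs, and that a key which is already registered is rejected, as described in the introduction.

```python
# Stand-in sketch only; the names here are hypothetical, not the spec2nexus API.
registry = {}  # maps control line key -> handler class


class DuplicateKeyError(Exception):
    """Raised when a second handler claims a key that is already registered."""


class HandlerBase:
    """Hypothetical base class whose subclasses register themselves."""

    key = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.key in registry:
            raise DuplicateKeyError(f"{cls.key!r} is already handled")
        registry[cls.key] = cls


class PVHandler(HandlerBase):
    key = "#PV"  # defining the class (for example, on import) registers it


print("#PV" in registry)  # True
```

Because registration happens when the class is defined, nothing more than importing (or installing) the module is needed to make its handlers known.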

Write a plugin module#

Give the custom plugin module a name ending with .py. As with any Python module, the name must be unique within a directory. If the plugin is not in your working directory, there must be a __init__.py file in the same directory (even if that file is empty) so that your plugin module can be loaded with import <MODULE>.

Please view the existing plugins in spec_common for examples. The custom plugin module should contain at least one subclass of ControlLineBase. Such subclasses register themselves when their module is imported. A custom plugin module can contain as many such handlers as needs dictate.

These imports are necessary to write plugins for spec2nexus:

from spec2nexus.plugin_core import ControlLineBase
from spec2nexus.utils import strip_first_word

Attribute: ``key`` (required)

Each subclass must define key as a regular expression that matches the control line key. It is possible to override any of the supplied control line plugins. Caution is advised to avoid introducing instability.

Attribute: ``scan_attributes_defined`` (optional)

If your plugin adds any attributes to the SpecDataFileScan() object (such as the hypothetical scan.hdf5_path and scan.hdf5_file), declare the new attributes in the scan_attributes_defined list, such as this:

scan_attributes_defined = ['hdf5_path', 'hdf5_file']

Method: ``process()`` (required)

Each subclass must also define a process() method to process the control line. A NotImplementedError exception is raised if process() is not defined.

Method: ``match_key()`` (optional)

For difficult regular expressions (or other situations), it is possible to replace the function that matches for a particular control line key. Override the handler’s match_key() method. For more details, see the section Custom key match function.

Method: ``postprocess()`` (optional)

For some types of control lines, processing can only be completed after all lines of the scan have been read. In such cases, add a line such as this to the process() method:

scan.addPostProcessor(self.key, self.postprocess)

(You could replace self.key here with some other text. If you do, make sure that text will be unique as it is used internally as a python dictionary key.) Then, define a postprocess() method in your handler:

def postprocess(self, scan, *args, **kws):
    # handle your custom info here
    pass

See section Postprocessing below for more details. See spec_common for many examples.

Method: ``writer()`` (optional)

Writing a NeXus HDF5 data file is one of the main goals of the spec2nexus package. If you intend data from your custom control line handler to end up in the HDF5 data file, add a line such as this to either the process() or postprocess() method:

scan.addH5writer(self.key, self.writer)

Then, define a writer() method in your handler. Here’s an example:

def writer(self, h5parent, writer, scan, nxclass=None, *args, **kws):
    """Describe how to store this data in an HDF5 NeXus file"""
    desc = 'SPEC positioners (#P & #O lines)'
    # makeGroup() is provided by the spec2nexus eznx module
    group = makeGroup(h5parent, 'positioners', nxclass, description=desc)
    writer.save_dict(group, scan.positioner)

See section Custom HDF5 writer below for more details.

Full Example: #PV control line#

Consider a SPEC data file (named pv_data.txt) with the contrived example of a #PV control line that associates a mnemonic with an EPICS process variable (PV). Suppose we take this control line content to be two words (text with no whitespace):

#F pv_data.txt
#E 1454539891
#D Wed Feb 03 16:51:31 2016
#C pv_data.txt  User = spec2nexus
#O0 USAXS.a2rp  USAXS.m2rp  USAXS.asrp  USAXS.msrp  mr  unused37  mst  ast
#O1 msr  asr  unused42  unused43  ar  ay  dy  un47

#S 1  ascan  mr 10.3467 10.3426  30 0.1
#D Wed Feb 03 16:52:03 2016
#T 0.1  (seconds)
#P0 3.5425 6.795 7.7025 5.005 10.34465 0 0 0
#P1 7.6 17.17188 -8.67896 -0.351 10.318091 0 18.475664 0
#C tuning USAXS motor mr
#PV mr ioc:m1
#PV ay ioc:m2
#PV dy ioc:m3
#N 18
#L mr    ay  dy  ar_enc  pd_range  pd_counts  pd_rate  pd_curent  I0_gain  I00_gain  Und_E  Epoch  seconds  I00  USAXS_PD  TR_diode  I0  I0
10.34665  0.000 18.476 10.318091 1 5 481662 0.000481658 1e+07 1e+09 18.172565 33.037 0.1 199 2 1 114 114
10.34652  0.000 18.476 10.318091 1 5 481662 0.000481658 1e+07 1e+09 18.172565 33.294 0.1 198 2 1 139 139
10.34638  0.000 18.476 10.318091 1 5 481662 0.000481658 1e+07 1e+09 18.172565 33.553 0.1 198 2 1 181 181
10.34625  0.000 18.476 10.318091 1 5 481662 0.000481658 1e+07 1e+09 18.172565 33.952 0.1 198 2 1 274 274
10.34278  0.000 18.476 10.318091 1 5 481662 0.000481658 1e+07 1e+09 18.172309 41.621 0.1 198 2 1 232 232
10.34265  0.000 18.476 10.318091 1 5 481662 0.000481658 1e+07 1e+09 18.172565 41.867 0.1 199 2 1 159 159
#C Wed Feb 03 16:52:14 2016.  removed many data rows for this example.

A plugin (named pv_plugin.py) to handle the #PV control lines could be written as:

from collections import OrderedDict

from spec2nexus.plugin_core import ControlLineBase
from spec2nexus.utils import strip_first_word

class PV_ControlLine(ControlLineBase):
    '''**#PV** -- EPICS PV associates mnemonic with PV'''

    key = '#PV'
    scan_attributes_defined = ['EPICS_PV']

    def process(self, text, spec_obj, *args, **kws):
        args = strip_first_word(text).split()
        mne = args[0]
        pv = args[1]
        if not hasattr(spec_obj, "EPICS_PV"):
            # use OrderedDict since it remembers the order we found these
            spec_obj.EPICS_PV = OrderedDict()
        spec_obj.EPICS_PV[mne] = pv

When the scan parser encounters the #PV lines in our SPEC data file, it will call this process() code with the full text of the line and the SPEC scan object where this data should be stored. We choose to store this (following the pattern of other data names in SpecDataFileScan) as spec_obj.EPICS_PV, using a dictionary.

It is up to the user what to do with the spec_obj.EPICS_PV data. We will not consider the writer() method in this example. (We will not write this information to a NeXus HDF5 file.)
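The parsing step in process() can be tried without spec2nexus installed. The following is a minimal sketch under stated assumptions: strip_first_word() here is a local stand-in written for this example (assumed to return the text after the first word, like the imported helper), FakeScan is a hypothetical empty object in place of the real scan, and the check on the number of words is an addition that the plugin above does not make.

```python
from collections import OrderedDict


def strip_first_word(line):
    """Local stand-in: return the text that follows the first word."""
    parts = line.strip().split(None, 1)
    return parts[1] if len(parts) > 1 else ""


class FakeScan:
    """Hypothetical, empty stand-in for the scan object given to process()."""


def process_pv(text, spec_obj):
    """Store 'mnemonic -> PV name' from one #PV line on spec_obj."""
    words = strip_first_word(text).split()
    if len(words) != 2:
        # added check: report a malformed line instead of an IndexError
        raise ValueError(f"expected '#PV mnemonic pvname', got: {text!r}")
    mne, pv = words
    if not hasattr(spec_obj, "EPICS_PV"):
        spec_obj.EPICS_PV = OrderedDict()
    spec_obj.EPICS_PV[mne] = pv


scan = FakeScan()
for line in ("#PV mr ioc:m1", "#PV ay ioc:m2", "#PV dy ioc:m3"):
    process_pv(line, scan)
print(dict(scan.EPICS_PV))
```

Whether a malformed #PV line should raise an error or be skipped silently is a design choice left to the plugin author.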

We can then write a python program (named pv_example.py) that will load the data file and interpret it using our custom plugin:

import pathlib

import spec2nexus.plugin_core
import spec2nexus.spec
from spec2nexus.control_lines import control_line_registry
from spec2nexus.plugin_core import install_user_plugin

# show our plugin is not loaded
print("known: ", "#PV" in control_line_registry.known_keys) # expect False

# load our new plugin
install_user_plugin(pathlib.Path("pv_plugin.py").absolute())

# show that our plugin is registered
print("known: ", "#PV" in control_line_registry.known_keys) # expect True

# read a SPEC data file, scan 1
spec_data_file = spec2nexus.spec.SpecDataFile("pv_data.txt")
scan = spec_data_file.getScan(1)

# Do we have our PV data?
print(hasattr(scan, "EPICS_PV"))    # expect True
print(scan.EPICS_PV)

The output of our program:

known:  False
known:  True
True
OrderedDict([('mr', 'ioc:m1'), ('ay', 'ioc:m2'), ('dy', 'ioc:m3')])

Example to ignore a #Y control line#

Suppose a control line in a SPEC data file must be ignored. For example, suppose a SPEC file contains this control line: #Y 1 2 3 4 5. Since there is no standard handler for this control line, we create one that ignores processing by doing nothing:

from spec2nexus.plugin_core import ControlLineBase

class Ignore_Y_ControlLine(ControlLineBase):
    '''
    **#Y** -- as in ``#Y 1 2 3 4 5``

    example: ignore any and all #Y control lines
    '''

    key = '#Y'

    def process(self, text, spec_obj, *args, **kws):
        pass # do nothing

Postprocessing#

Sometimes, it is necessary to defer a step of processing until after the complete scan data has been read. One example is 2-D or 3-D data that has been acquired as a vector rather than as a matrix. The matrix can be constructed only after all the scan data has been read. Such postprocessing is handled in a method in a plugin file. The postprocessing method is registered from the control line handler by calling the addPostProcessor() method of the spec_obj argument received by the handler’s process() method. A key name [1] is supplied when registering, to avoid registering the same code more than once. The postprocessing function will be called with the instance of SpecDataFileScan as its only argument.

An important role of the postprocessing is to store the result in the scan object. It is important not to modify other data in the scan object. Pick an attribute named similarly to the plugin (e.g., MCA configuration uses the MCA attribute, UNICAT metadata uses the metadata attribute, …). This attribute will define where and how the data from the plugin is available. The writer() method (see below) is one example of a user of this attribute.

Example postprocessing#

Consider a handler for a #U control line that stores three user-supplied numbers as the list spec_obj.U (the User_ControlLine class in the summary example below). For some contrived reason, we wish to store the sum of the numbers as a separate number, but only after all the scan data has been read. This can be done with the simple expression:

spec_obj.U_sum = sum(spec_obj.U)

To build a postprocessing method, we write:

def contrived_summation(scan):
    '''
    add up all the numbers in the #U line

    :param SpecDataFileScan scan: data from a single SPEC scan
    '''
    scan.U_sum = sum(scan.U)

To register this postprocessing method, place this line in the process() of the handler:

spec_obj.addPostProcessor('contrived_summation', contrived_summation)
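The role of the key name can be illustrated with a self-contained sketch. FakeScan and its finish_reading() method are hypothetical stand-ins, not the spec2nexus classes. The sketch assumes that registrations are kept in a dictionary keyed by the name, so that a second registration under the same name is ignored, and that the registered functions run once, after the whole scan has been read.

```python
class FakeScan:
    """Hypothetical stand-in for the scan object; not the spec2nexus class."""

    def __init__(self):
        self.postprocessors = {}

    def addPostProcessor(self, label, func):
        # keep only the first registration under each label
        if label not in self.postprocessors:
            self.postprocessors[label] = func

    def finish_reading(self):
        # hypothetical hook: called once, after all lines of the scan are read
        for func in self.postprocessors.values():
            func(self)


def contrived_summation(scan):
    scan.U_sum = sum(scan.U)


scan = FakeScan()
scan.U = [1.0, 2.0, 3.0]
# a handler called for two lines would try to register the same label twice
scan.addPostProcessor("contrived_summation", contrived_summation)
scan.addPostProcessor("contrived_summation", contrived_summation)
scan.finish_reading()
print(len(scan.postprocessors), scan.U_sum)  # 1 6.0
```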

Summary Example Custom Plugin with postprocessing#

Gathering all parts of the examples above, the custom plugin module is:

from spec2nexus.plugin_core import ControlLineBase
from spec2nexus.utils import strip_first_word

class User_ControlLine(ControlLineBase):
    '''**#U** -- User data (#U user1 user2 user3)'''

    key = '#U'

    def process(self, text, spec_obj, *args, **kws):
        args = strip_first_word(text).split()
        user1 = float(args[0])
        user2 = float(args[1])
        user3 = float(args[2])
        spec_obj.U = [user1, user2, user3]
        spec_obj.addPostProcessor('contrived_summation', contrived_summation)


def contrived_summation(scan):
    '''
    add up all the numbers in the #U line

    :param SpecDataFileScan scan: data from a single SPEC scan
    '''
    scan.U_sum = sum(scan.U)


class Ignore_Y_ControlLine(ControlLineBase):
    '''**#Y** -- as in ``#Y 1 2 3 4 5``'''

    key = '#Y'

    def process(self, text, spec_obj, *args, **kws):
        pass

Custom HDF5 writer#

A custom HDF5 writer method defines how the data from the plugin will be written to the HDF5+NeXus data file. The writer will be called with several arguments:

h5parent: obj : the HDF5 group that will hold this plugin’s data

writer: obj : instance of Writer() that manages the content of the HDF5 file

scan: obj : instance of SpecDataFileScan() containing this scan’s data

nxclass: str : (optional) name of NeXus base class to be created

Since the file is being written according to the NeXus data standard [2], use the NeXus base classes [3] as references for how to structure the data written by the custom HDF5 writer.

One responsibility of a custom HDF5 writer method is to create unique names for every object written in the h5parent group. Usually, this will be an NXentry [4] group. You can determine the NeXus base class of this group using code such as this:

>>> print(h5parent.attrs['NX_class'])
NXentry
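One way to meet the unique-name responsibility is sketched below. The helper unique_name() is hypothetical (it is not part of spec2nexus), and the set of names is illustrative. The sketch assumes only that the names already present in the parent group can be tested with the in operator, as with a set or dict.

```python
def unique_name(base, existing):
    """Return base, or base_1, base_2, ...: the first name not in existing."""
    if base not in existing:
        return base
    n = 1
    while f"{base}_{n}" in existing:
        n += 1
    return f"{base}_{n}"


# names already present in the parent group (illustrative values)
taken = {"positioners", "metadata", "metadata_1"}
print(unique_name("EPICS_PV", taken))  # EPICS_PV
print(unique_name("metadata", taken))  # metadata_2
```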

Choice of NeXus base class#

If your custom HDF5 writer must create a group and you are uncertain which base class to select, it is recommended to use the NXnote [5] base class.

If your data does not fit the structure of the NXnote base class, you are encouraged to find one of the other NeXus base classes that best fits your data. Look at the source code of the supplied plugins for examples.

As a last resort, if your content cannot fit within the parameters of the NeXus standard, use NXcollection [6] (an unvalidated catch-all base class), which can store any content.

The writer uses the eznx module to create and write the various parts of the HDF5 file.

Example writer() method#

Here is an example writer() method from the unicat module:

def writer(self, h5parent, writer, scan, nxclass=None, *args, **kws):
    '''Describe how to store this data in an HDF5 NeXus file'''
    if hasattr(scan, 'metadata') and len(scan.metadata) > 0:
        desc='SPEC metadata (UNICAT-style #H & #V lines)'
        group = eznx.makeGroup(h5parent, 'metadata', nxclass, description=desc)
        writer.save_dict(group, scan.metadata)

Custom key match function#

The default test that a given line matches a specific spec2nexus.plugin_core.ControlLineBase subclass is to use a regular expression match.

def match_key(self, text):
    '''default regular expression match, based on self.key'''
    t = re.match(self.key, text)
    if t is not None:
        if t.regs[0][1] != 0:
            return True
    return False
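The effect of this test can be checked with a standalone function that applies the same logic. This is an illustration only: the key patterns below are chosen for the example and are not necessarily the keys used by the supplied plugins. Match.end() is used in place of t.regs[0][1]; both give the end offset of the whole match.

```python
import re


def match_key(key, text):
    """Same test as the method above, written as a plain function."""
    t = re.match(key, text)
    # a match that ends at offset 0 is an empty match, so it is rejected
    return t is not None and t.end() != 0


print(match_key(r"#P\d+", "#P0 3.5425 6.795"))  # True
print(match_key(r"#P\d+", "#PV mr ioc:m1"))     # False
print(match_key(r"\s*", "#S 1  ascan"))         # False
```

The last call shows why the end offset is compared with zero: a pattern that can match the empty string would otherwise match every line.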

In some cases, that may prove tedious or difficult, such as when testing for a floating point number with optional preceding white space at the start of a line. This is typical for data lines in a scan or continued lines from an MCA spectrum. In such cases, the handler can override the match_key() method. Here is an example from SPEC_DataLine:

def match_key(self, text):
    '''
    Easier to try conversion to number than construct complicated regexp
    '''
    try:
        float( text.strip().split()[0] )
        return True
    except ValueError:
        return False
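The same idea can be exercised as a standalone function. The name looks_like_data_line() is hypothetical, and the handling of IndexError (a blank line has no first word) is an addition in this sketch, not something shown in the method above.

```python
def looks_like_data_line(text):
    """Return True if the first word of text converts to float."""
    try:
        float(text.strip().split()[0])
        return True
    except (ValueError, IndexError):
        # ValueError: first word is not a number
        # IndexError: blank line, no first word (added in this sketch)
        return False


print(looks_like_data_line("10.34665  0.000 18.476"))  # True
print(looks_like_data_line("   -1.5e3 2"))             # True
print(looks_like_data_line("#S 1  ascan"))             # False
print(looks_like_data_line(""))                        # False
```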

Summary Requirements for custom plugin#

  • file can go in any directory

  • the containing directory does not need an __init__.py file

  • multiple control line handlers (plugins) can go in a single file

  • for each control line:

    • subclass ControlLineBase

    • identify the control line pattern

    • define key with a regular expression to match [7]

      • key is used to identify control line handlers

      • redefine existing supported control lines to replace supplied behavior (use caution!)

      • Note: key="scan data" is used to process the scan data: SPEC_DataLine()

    • define a process() method to handle the supplied text

    • (optional) define a postprocess() method to coordinate data from several process() steps

    • (optional) define a writer() method to write the in-memory data structure from this plugin to the HDF5+NeXus data file

    • (optional) define match_key() to override the default regular expression to match the key

  • for each postprocessing method:

    • write the method

    • register the method with spec_obj.addPostProcessor(key_name, the_method) in the handler’s process() method.

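  • for each plugin file you want to load:

    • call spec2nexus.plugin_core.install_user_plugin(path), as shown in the section Load a plugin module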


Changes in plugin format with release 2021.2.0#

Changes in plugin format with release 2021.0.0#

With release 2021.0.0, the code to set up plugins has changed. The new code allows all plugins in a module to auto-register themselves as long as the module is imported. All custom plugins must be modified, and the code that imports them revised, to work with the new system. See the spec2nexus.plugins.spec_common source code for many examples.

  • SAME: The basics of writing the plugins remain the same.

  • CHANGED: The method of registering the plugins has changed.

  • CHANGED: The declaration of each plugin has changed.

  • CHANGED: The naming requirements for plugin files have been relaxed.

  • CHANGED: Plugin files do not have to be in their own directory.

  • REMOVED: The SPEC2NEXUS_PLUGIN_PATH environment variable has been eliminated.


Footnotes#

[1] The key name must be unique amongst all postprocessing functions. A good choice is the name of the postprocessing function itself.

[2] https://nexusformat.org

[3] https://download.nexusformat.org/doc/html/classes/base_classes/

[4] https://download.nexusformat.org/doc/html/classes/base_classes/NXentry.html

[5] https://download.nexusformat.org/doc/html/classes/base_classes/NXnote.html

[6] https://download.nexusformat.org/doc/html/classes/base_classes/NXcollection.html

[7] It is possible to override the default regular expression match in the subclass with a custom match function. See the match_key() method for an example.