data_loaders.py

Provides several Dataloader objects that open different kinds of data files, typically acquired at different sources (e.g. beamlines at various synchrotrons), and bring them into the same shape and form.

class data_loaders.Dataset[source]

The generic class inheriting from BaseModel and defining the structure of a dataset in piva. Data files read by the corresponding Dataloader classes are returned in this format.

Creates a generic dataset object, inheriting from BaseModel, that is filled with all required data and metadata, depending on what is accessible from the files generated by different instruments; unavailable entries are left as None. An asterisk marks attributes that are mandatory for the DataViewers to function.

Object contains:

attribute        type               description
---------------  -----------------  ------------------------------------------
data *           np.ndarray         Acquired data set, always a 3D array (len(data.shape) = 3).
                                    Oriented as: dim(0) - scanned axis, dim(1) - analyzer axis,
                                    dim(2) - energy axis. When the scan type is a single cut
                                    (resulting data are 2D), the first dimension has length 1
                                    and the cut is stored in data[0, :, :].
xscale *         np.ndarray         Axis along the scanned direction; units depend on the scan
                                    type. When the scan type is a single cut (2D), it is set to
                                    np.array([1]).
yscale *         np.ndarray         Axis along the analyzer slit, most likely in [deg]
zscale *         np.ndarray         Axis along the energy direction, most likely in [eV]
ekin             np.ndarray | None  Energy axis in kinetic energy scale (if default scale is in
                                    binding energy)
kxscale          np.ndarray | None  Momentum axis (saved after conversion) along the scanned
                                    direction
kyscale          np.ndarray | None  Momentum axis (saved after conversion) along the analyzer
                                    direction
x                float | None       x position of the manipulator
y                float | None       y position of the manipulator
z                float | None       z position of the manipulator
theta            float | None       theta angle of the manipulator; often referred to as polar
phi              float | None       phi angle of the manipulator; often referred to as azimuth
tilt             float | None       tilt angle of the manipulator
temp             float | None       Temperature during the experiment
pressure         float | None       Pressure during the experiment
hv               float | None       Photon energy used during the experiment
wf               float | None       Work function of the analyzer
Ef               float | None       Correction for the Fermi level
polarization     str | None         Photon polarization
PE               int | None         Pass energy of the analyzer
exit_slit        float | None       Exit (vertical) slit of the beamline; responsible for
                                    energy resolution
FE               float | None       Front end of the beamline
scan_type        str | None         Type of the measurement (e.g. cut, tilt scan, hv scan)
scan_dim         list | None        If scan other than cut, scanned dimensions as list:
                                    [start, stop, step]
acq_mode         str | None         Data acquisition mode
lens_mode        str | None         Lens mode of the analyzer
ana_slit         str | None         Slit opening of the analyzer
defl_angle       float | None       Applied deflection angle
n_sweeps         int | None         Number of sweeps
DT               int | None         Dwell time
data_provenance  dict               Dataset logbook; contains information about the original
                                    file and keeps track of functions called on the data

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'allow'}

Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict (pydantic.config.ConfigDict).

add_org_file_entry(fname, dl)[source]

Add information about the original data file to the data provenance logbook.

Parameters:
  • fname (str) – file name

  • dl (str) – specific Dataloader (inheriting from Dataloader) that was used to open the file

Return type:

None

class data_loaders.Dataloader[source]

Parent class (interface) from which other Dataloaders inherit some methods. Even when the same acquisition software is used, files can differ from beamline to beamline in format, in the amount of saved metadata and in the way they are stored. To take these differences into account, Dataloaders for specific beamlines are implemented separately.

__init__()[source]
Return type:

None

abstract load_data(filename, metadata)[source]

Must be implemented in subclasses.

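Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. See load_ses_zip() for more info.
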
Return type:

Dataset
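
The interface pattern described above can be sketched with the standard abc module. This is only an illustration under stated assumptions: LoaderSketch and TxtLoaderSketch are hypothetical names, the extension check is invented for the example, and the returned dictionary merely stands in for a Dataset.

```python
from abc import ABC, abstractmethod


class LoaderSketch(ABC):
    """Hypothetical stand-in for the Dataloader interface."""

    @abstractmethod
    def load_data(self, filename, metadata=False):
        """Each beamline-specific subclass supplies its own reader."""


class TxtLoaderSketch(LoaderSketch):
    """Hypothetical subclass that only accepts *.txt files."""

    def load_data(self, filename, metadata=False):
        if not filename.endswith(".txt"):
            raise NotImplementedError("unsupported file type")
        # A real loader would parse the file here; this one echoes the request.
        return {"filename": filename, "metadata_only": metadata}


loaded = TxtLoaderSketch().load_data("cut.txt", metadata=True)
```

Instantiating LoaderSketch directly raises TypeError, which is how the abstract method forces every subclass to provide its own load_data.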

load_ses_zip(filename, bl_md=None, metadata=False)[source]

Load data from SES (Scienta) *.zip files.

Parameters:
  • filename (str) – absolute path to the file

  • bl_md (list | None) – beamline-specific metadata, passed as a list of tuples in the format (name str, label str, type type), where name is how the data entry is saved in the file, label is how the information should be called in the BaseModel, and type is the type of the variable (float, str, etc.)

  • metadata (bool) – if True, read only the metadata and size of the dataset to display them in the DataBrowser window. This helps to browse through files faster, without actually loading the entire file.

Return type:

None

load_ses_ibw(filename, bl_md=None, metadata=False)[source]

Load data from SES (Scienta) IGOR binary wave (*.ibw) files.

Parameters:
  • filename (str) – absolute path to the file

  • bl_md (list | None) – beamline-specific metadata. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

  • metadata (bool) – if True, read only metadata and size of the dataset. See load_ses_zip() for more info.

Return type:

None

load_ses_pxt(filename, bl_md=None, metadata=False)[source]

Load data from SES (Scienta) IGOR packed experiment (*.pxt) files.

Parameters:
  • filename (str) – absolute path to the file

  • bl_md (list | None) – beamline-specific metadata. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

  • metadata (bool) – if True, read only metadata and size of the dataset. See load_ses_zip() for more info.

Return type:

None

static read_ses_metadata(ns, meta, bl_md=None, zip=False)[source]

Load analyzer settings from the SES file, notes or comments, together with beamline-specific metadata, if provided.

Parameters:
  • ns (Dataset | Namespace) – object to fill up with values.

  • meta (list) – list of strings, usually lines read from loaded data, where the metadata can be found.

  • bl_md (list | None) – beamline specific metadata. See load_ses_zip() for more info.

  • zip (bool) – lines in meta might require slightly different decoding. If True, apply the one used in zip files.

Return type:

None
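
A minimal sketch of how a list of (name, label, type) tuples could be applied to metadata lines. It assumes each entry appears as a "name=value" line, which may not match the real SES layout; read_metadata_sketch and the default key list are hypothetical and not part of piva.

```python
from types import SimpleNamespace


def read_metadata_sketch(ns, meta, bl_md=None):
    """Copy 'name=value' entries from meta onto ns under their labels."""
    # (name in file, attribute label, type); these entries are illustrative.
    keys = [("Pass Energy", "PE", int), ("Lens Mode", "lens_mode", str)]
    if bl_md is not None:
        keys = keys + list(bl_md)
    for line in meta:
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a 'name=value' line
        for key_name, label, cast in keys:
            if name.strip() == key_name:
                # Convert the text to the requested type and store it.
                setattr(ns, label, cast(value.strip()))


ns = SimpleNamespace()
lines = ["Pass Energy=20", "Lens Mode=Angular30", "hv=75.5", "[Data 1]"]
read_metadata_sketch(ns, lines, bl_md=[("hv", "hv", float)])
```

After the call, ns carries PE, lens_mode and hv with the requested types, while the line without an equals sign is skipped.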

static load_raster_scan(wave, bl_md=None, metadata=False)[source]

Load data from xy manipulator raster scan. Each energy-momentum map is saved as a separate Dataset object.

Parameters:
  • wave (Any) – loaded ibw wave.

  • bl_md (list | None) – beamline-specific metadata. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

  • metadata (bool) – if True, read only metadata and size of the dataset. See load_ses_zip() for more info.

Returns:

array with loaded Dataset objects

Return type:

ndarray

validate_at_return(filename)[source]

Validate that the Dataset was correctly populated with data and add original file information to the data provenance record.

Parameters:

filename (str) – absolute path to the file

Returns:

loaded dataset with available metadata
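
What the validation checks exactly is not stated here. As a rough sketch, assuming it at least requires the four attributes marked as mandatory in the Dataset table, a check could look as follows; missing_mandatory is a hypothetical helper, and plain lists replace NumPy arrays so the example runs without extra packages.

```python
from types import SimpleNamespace

# Attributes marked with an asterisk in the Dataset description.
MANDATORY = ("data", "xscale", "yscale", "zscale")


def missing_mandatory(ds):
    """Return the names of mandatory attributes that are absent or None."""
    return [name for name in MANDATORY if getattr(ds, name, None) is None]


# A single cut: one slice along the scanned axis, 2 angles, 3 energies.
cut = SimpleNamespace(
    data=[[[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]],
    xscale=[1],
    yscale=[-1.0, 1.0],
    zscale=[16.0, 16.1, 16.2],
)
incomplete = SimpleNamespace(data=cut.data, xscale=None)
```

Here missing_mandatory(cut) is empty, while the incomplete object reports the scales that were never filled in.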

class data_loaders.DataloaderPickle[source]

Dataloader for opening files saved with piva. Files are in binary format, saved using the pickle module, and contain a raw Dataset object.

__init__()[source]
load_data(filename, metadata=False)[source]

Load pickle file and bring it into correct format.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Returns:

loaded dataset with available metadata

Return type:

Dataset

class data_loaders.DataloaderSIS[source]

Dataloader for opening files from SIS beamline at SLS (Swiss Light Source, Switzerland).

__init__()[source]
load_data(filename, metadata=False)[source]

Recognize correct format and load data from the file.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Returns:

loaded dataset with available metadata

Return type:

Dataset

load_h5(filename, metadata=False)[source]

Load HDF file and all available metadata.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Return type:

None

class data_loaders.DataloaderADRESS[source]

Dataloader for opening files from ADRESS beamline at SLS (Swiss Light Source, Switzerland).

__init__()[source]
load_data(filename, metadata=False)[source]

Recognize correct format and load data from the file.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Returns:

loaded dataset with available metadata

Return type:

Dataset

load_h5(filename, metadata=False)[source]

Load HDF file and all available metadata.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Return type:

None

read_metadata(keys, metadata_list)[source]

Read metadata from an HDF file in a similar fashion to Dataloader.read_ses_metadata() (see there for more details).

Parameters:
  • keys (list) – keys to metadata, passed as a list of tuples in the format (name str, label str, type type), where name is how the data entry is saved in the file, label is how the information should be called in the BaseModel, and type is the type of the variable (float, str, etc.)

  • metadata_list (list) – list of strings, usually lines read from loaded data, where the metadata can be found.

Return type:

None

class data_loaders.DataloaderBloch[source]

Dataloader for opening files from Bloch beamline at MAX IV (Sweden).

__init__()[source]
load_data(filename, metadata=False)[source]

Recognize correct format and load data from the file.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Returns:

loaded dataset with available metadata

Return type:

Dataset

class data_loaders.DataloaderI05[source]

Dataloader for opening files from I05 beamline at Diamond Light Source (UK).

__init__()[source]
load_data(filename, metadata=False)[source]

Recognize correct format and load data from the file.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Returns:

loaded dataset with available metadata

Return type:

Dataset

load_nxs(filename, metadata)[source]

Load NeXus file and all available metadata.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Return type:

None

class data_loaders.DataloaderMERLIN[source]

Dataloader for opening files from Merlin beamline at ALS (Advanced Light Source, Berkeley, CA).

__init__()[source]
load_data(filename, metadata=False)[source]

Recognize correct format and load data from the file.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Returns:

loaded dataset with available metadata

Return type:

Dataset

load_h5(filename, metadata=False)[source]

Load HDF file and all available metadata.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Return type:

None

class data_loaders.DataloaderHERS[source]

Dataloader for opening files from HERS endstation at ALS (Advanced Light Source, Berkeley, CA).

__init__()[source]
load_data(filename, metadata=False)[source]

Recognize correct format and load data from the file.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Returns:

loaded dataset with available metadata

Return type:

Dataset

class data_loaders.DataloaderURANOS[source]

Dataloader for opening files from Uranos beamline at Solaris (Poland).

__init__()[source]
load_data(filename, metadata=False)[source]

Recognize correct format and load data from the file.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Returns:

loaded dataset with available metadata

Return type:

Dataset

class data_loaders.DataloaderCASSIOPEE[source]

Dataloader for opening files from CASSIOPEE beamline at SOLEIL (France).

__init__()[source]
load_data(filename, metadata=False)[source]

Recognize correct format and load data from the file.

Parameters:
  • filename (str) – absolute path to the file.

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Returns:

loaded dataset with available metadata.

Return type:

Dataset

load_from_file(filename, metadata=False)[source]

Recognize correct format and load data from the file.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

Returns:

loaded dataset with available metadata

Return type:

None

load_from_dir(dirname)[source]

Read data from a directory containing slices of data saved in separate files. Note: at the CASSIOPEE beamline, multidimensional scans are saved as a collection of *.txt files that need to be combined into a full map.

Parameters:

dirname (str) – absolute path to the directory

Return type:

None
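
Combining per-slice text files into one map can be sketched as below. The sketch assumes header-free, whitespace-separated files whose lexicographic file name order matches the scan order; real CASSIOPEE files carry a header that would have to be skipped first. load_slices_sketch and the file names are hypothetical, and nested lists stand in for the 3D array.

```python
import os
import tempfile


def load_slices_sketch(dirname):
    """Stack whitespace-separated *.txt slices into a nested 3D list."""
    filenames = sorted(f for f in os.listdir(dirname) if f.endswith(".txt"))
    stack = []
    for fname in filenames:
        with open(os.path.join(dirname, fname)) as handle:
            # One row of numbers per non-empty line.
            rows = [[float(v) for v in line.split()]
                    for line in handle if line.strip()]
        stack.append(rows)
    return stack


# Write three small 2x2 slices to a temporary directory and read them back.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        with open(os.path.join(tmp, f"slice_{i:03d}.txt"), "w") as handle:
            handle.write(f"{i} {i + 1}\n{i + 2} {i + 3}\n")
    full_map = load_slices_sketch(tmp)
```

The first index of full_map then runs over the files, matching the convention that dim(0) is the scanned axis.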

load_from_txt(filename)[source]

Load data from *.txt file.

Parameters:

filename (str) – absolute path to the file

Return type:

None

get_outer_loop(dirname, filenames)[source]

Try to determine the scan type and the corresponding z-axis scale from the additional metadata text files. These follow the assumptions made in load_from_dir(). Additionally, the MONOCHROMATOR section must come before the UNDULATOR section, because both sections contain a key hv but only the former is meaningful.

Parameters:
  • dirname (str) – absolute path to the directory

  • filenames (list) – list of files’ names to load

Returns:

A tuple (scantype, zscale, hvs[0]), where scantype (str) is the scan type, zscale (np.ndarray) is the extracted scale, and hvs[0] (float) is the value of hv for non-hv scans; (None, None, hvs[0]) in case of failure.

Return type:

tuple

static get_metadata(filename)[source]

Extract some of the metadata stored in a CASSIOPEE output text file. Also try to detect the line number below which the data starts (for the skiprows argument of np.loadtxt()).

Parameters:

filename (str) – absolute path to the file

Returns:

(i, energy, angles), where i - number of rows to skip before reading data, energy - energy axis, angles - analyzer axis.

Return type:

tuple

static read_metadata(keys, metadata_file)[source]

Read some metadata from one of the header files.

Parameters:
  • keys (list) – keys to metadata, passed as a list of tuples in the format (name str, label str, type type), where name is how the data entry is saved in the file, label is how the information should be called in the Namespace, and type is the type of the variable (float, str, etc.)

  • metadata_file (IOBase) – opened file containing metadata

Returns:

object with collected metadata

Return type:

Namespace

data_loaders.start_step_n(start, step, n)[source]

Return an array that starts at the value start and advances n steps of size step. Helpful for generating axes, as many systems provide exactly the starting value, the step and the dimensionality of the data.

Parameters:
  • start (float) – beginning value of the axis

  • step (float) – step value along the axis

  • n (int) – number of steps

Returns:

generated axis

Return type:

ndarray
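
A minimal sketch of the idea, assuming that "n steps" means n axis points spaced exactly step apart. The real function returns an ndarray and may treat the end point differently; start_step_n_sketch is a hypothetical pure-Python stand-in.

```python
def start_step_n_sketch(start, step, n):
    """Return n values: start, start + step, ..., start + (n - 1) * step."""
    return [start + i * step for i in range(n)]


# Energy axis of four points starting at 16.0 eV with a 0.5 eV step.
energies = start_step_n_sketch(16.0, 0.5, 4)
```

This is the typical situation when a file header stores only the first value, the increment and the number of points of an axis.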

data_loaders.load_data(filename, metadata=False, suppress_warnings=False)[source]

Try to load file by iterating through all Dataloaders and applying the respective Dataloader’s load_data method.

Parameters:
  • filename (str) – absolute path to the file

  • metadata (bool) – if True, read only metadata and size of the dataset. Not used here, but required to match the format of other Dataloaders. See load_ses_zip() for more info.

  • suppress_warnings (bool) – if True, suppress possible warning to keep terminal clean

Returns:

loaded dataset with available metadata. NOTE: the function returns the Dataset loaded with the first Dataloader that did not raise any errors. Another Dataloader might perform better, especially with regard to loaded metadata.

Return type:

Dataset
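
The "first loader that does not raise" strategy can be sketched as follows. The two loader classes and load_data_sketch are hypothetical stand-ins, and the extension checks are an assumption made only to give the loaders something to fail on.

```python
class ZipLoaderSketch:
    """Hypothetical loader that accepts only *.zip files."""

    def load_data(self, filename, metadata=False):
        if not filename.endswith(".zip"):
            raise NotImplementedError("not a zip file")
        return {"loader": "zip", "filename": filename}


class H5LoaderSketch:
    """Hypothetical loader that accepts only *.h5 files."""

    def load_data(self, filename, metadata=False):
        if not filename.endswith(".h5"):
            raise NotImplementedError("not an h5 file")
        return {"loader": "h5", "filename": filename}


def load_data_sketch(filename, loaders, metadata=False):
    """Return the result of the first loader that does not raise."""
    errors = []
    for loader in loaders:
        try:
            return loader.load_data(filename, metadata=metadata)
        except Exception as exc:  # any failure means: try the next loader
            errors.append((type(loader).__name__, exc))
    raise ValueError(f"no loader could open {filename!r}: {errors}")


result = load_data_sketch("scan.h5", [ZipLoaderSketch(), H5LoaderSketch()])
```

Because the loop stops at the first success, the order of the loaders decides which one wins when several could open the same file.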

data_loaders.dump(data, filename, force=False)[source]

Wrapper for pickle.dump(), to save opened and modified Dataset.

Parameters:
  • data (Dataset) – dataset to save

  • filename (str) – absolute path to the file

  • force (bool) – if True, overwrite existing file without asking. Default is False

Return type:

None
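
A sketch of an overwrite-guarded pickle wrapper. It assumes the guard is a simple existence check; dump_sketch and its confirm callback are hypothetical, the callback replaces whatever interactive question the real function asks, and the boolean return value is added only to make the behaviour visible.

```python
import os
import pickle
import tempfile


def dump_sketch(data, filename, force=False, confirm=lambda path: False):
    """Pickle data to filename; return True if the file was written."""
    if os.path.exists(filename) and not force and not confirm(filename):
        return False  # keep the existing file untouched
    with open(filename, "wb") as handle:
        pickle.dump(data, handle)
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dataset.p")
    first = dump_sketch({"hv": 75.0}, path)                # new file
    blocked = dump_sketch({"hv": 80.0}, path)              # exists, not forced
    forced = dump_sketch({"hv": 80.0}, path, force=True)   # overwritten
    with open(path, "rb") as handle:
        stored = pickle.load(handle)
```

Only the forced call replaces the file, so the content read back is the one from the last dump.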

data_loaders.update_namespace(data, *attributes)[source]

Add attributes to a Dataset.

Parameters:
  • data (Dataset) – dataset object

  • attributes (list) – (name, value) tuples of the attributes to add, where name is a str and value is any Python object.

Return type:

None
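
Adding attributes from (name, value) pairs can be sketched with setattr. The sketch uses types.SimpleNamespace as a stand-in for a Dataset, and update_namespace_sketch is a hypothetical name.

```python
from types import SimpleNamespace


def update_namespace_sketch(data, *attributes):
    """Set each (name, value) pair as an attribute of data."""
    for name, value in attributes:
        setattr(data, name, value)


ds = SimpleNamespace(hv=75.0)
update_namespace_sketch(ds, ("temp", 12.5), ("scan_type", "cut"))
```

Existing attributes such as hv are left untouched unless one of the pairs names them explicitly.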