reborn.analysis package

Submodules

reborn.analysis.beam_center module

class reborn.analysis.beam_center.FriedelPearsonCC(dataframe=None, max_shift=10, min_unmasked=100)[source]

Bases: object

get_shifts(dataframe)[source]
get_shifted_geom(dataframe)[source]

reborn.analysis.masking module

reborn.analysis.masking.snr_filter_test(data, mask, mask2, nin, ncent, nout)[source]

This is a slow version of snr_filter, which is about 10-fold faster on multi-core CPUs.

reborn.analysis.masking.snr_filter(dat, mask_in, mask_out, n_in, n_cent, n_out)[source]

Transform an 2D image into a map of local signal-to-noise ratio by the following equivalent steps:

  1. For every pixel in the input data, do a local signal integration within a square region of size \(n_\text{in}*2+1\). Pixels masked by mask_in will be ignored. Masked pixels are indicated by the value zero, while unmasked pixels are indicated by the value one.

  2. Estimate background via a local integration within a square annulus of outer size \(2 n_\text{out} + 1\) and inner size \(2 n_\text{cent} - 1\). Pixels within mask_out will be ignored.

  3. From every pixel in the local signal integration square, subtract the average background value from step (2).

  4. Compute the standard deviation \(\sigma\) in the square annulus. Pixels within mask_out will be ignored.

  5. Divide the locally-integrated signal-minus-background by the standard error. The standard error is equal to \(\sigma/\sqrt{M}\) where \(M\) is the number of unmasked pixels in the locally-integratied signal region, and \(\sigma\) comes from step (4).

The use of two distinct masks allows for multi-pass SNR computations in which the results of the first pass may be used to exclude high-SNR regions from contributing to error estimates in the annulus. See snr_mask if you want to generate a mask this way.

Note

This routine will attempt to use openmp to parallelize the computations. It is affected by the environment variable OMP_NUM_THREADS. You can check how many cores are used by openmp by running the following:

import reborn.fortran; reborn.fortran.omp_test_f.omp_test()
Parameters:
  • dat (ndarray) – The image to analyze.

  • mask_in (ndarray) – The mask for the square central integration region.

  • mask_out (ndarray) – The mask for the square annulus integration region.

  • n_in (int) – Size of the central integration region; integrate from \((-n_{in}, n_{in})\), inclusively.

  • n_cent (int) – Define the annulus integration region; we ignore the box from (-n_cent, n_cent), inclusively.

  • n_out (int) – Define the annulus integration region; we include the box from (-n_out, n_out), inclusively.

Returns:

snr (ndarray): The SNR array.

signal (ndarray): The signal array.

Return type:

(tuple)

reborn.analysis.masking.snr_mask(dat, mask, nin, ncent, nout, threshold=6, mask_negative=True, max_iterations=3, pad_geometry=None, beam=None, subtract_median=False)[source]

Mask out pixels above some chosen SNR threshold. The image is converted to a map of SNR using boxsnr. Additional iterations follow, in which pixels above threshold in the previous run are also masked in the annulus. This iterative procedure helps avoid contributions of peak signals to the Noise calculation.

Parameters:
  • dat (numpy array) – Input data to calculate SNR from.

  • mask (numpy array) – Mask indicating bad pixels (zero means bad, one means ok)

  • nin (int) – See boxsnr function.

  • ncent (int) – See boxsnr function.

  • nout (int) – See boxsnr function.

  • threshold (float) – Reject pixels above this SNR.

  • mask_negative (bool) – Also reject pixels below the negative of the SNR threshold (default: True).

  • max_iterations (int) – The maxumum number of iterations (note: the loop exits if the mask stops changing).

  • pad_geometry (PADGeometryList) – PAD geometry (optional)

  • subtract_median (bool) – Subtract median profiler before masking.

Returns:

The mask with pixels above the SNR threshold

Return type:

numpy array

class reborn.analysis.masking.StreakMasker(geom: PADGeometryList, beam: Beam, n_q=100, q_range=(0, 3000000000.0), prominence=0.8, max_streaks=2, debug=0, streak_width=0.01, snr=6, angle_range=None)[source]

Bases: object

A tool for masking jet streaks or other streak-like features in diffraction patterns. It is assumed that the streak crosses through the beam center.

Parameters:
  • geom (PADGeometryList) – PAD geometry.

  • beam (Beam) – Beam info.

  • n_q (int) – Number of q bins. Default: 100.

  • q_range (float tuple) – Range of q values to consider. Default: (0, 2e10).

  • prominence (float) – Look at the corresponding parameter in scipy.signal.find_peaks. Default: 0.8.

  • max_streaks (int) – Maximum number of streaks. Default: 2.

  • debug (int) – Set to 1 or greater to see some standard output.

  • streak_width (float) – Angular width of the jet streak mask.

  • snr (float) – Threshold the signal-to-noise ration for jet streaks (peak height - average background)/standard deviation

dbgmsg(*args, **kwargs)[source]
get_mask(pattern, mask=None, return_extra=False)[source]

Find streaks and return a mask.

Parameters:
  • pattern (ndarray) – Diffraction intensities.

  • mask (ndarray) – Diffraction intensity mask. Zero means ignore.

Returns: ndarray

reborn.analysis.optimize module

Miscellaneous optimization and fitting routines.

reborn.analysis.optimize.fit_ellipse(x, y)[source]

Given lists of coordinates \((x,\;y)\), do least-squares fit to the coefficients \(a_i\) of the function

(1)\[a_{xx} x^2 + a_{xy} xy + a_{yy}y^2 +a_x x + a_y y + a_1 = 0 \;.\]

The algorithm ensures that the coefficients satisfy the ellipse condition \(4a_{xx} a_{yy}−a_{xy}^2 > 0\). We use the exact code found here, which in turn follows the approch in Fitzgibbon, A.W., Pilu, M., and Fischer R.B., Direct least squares fitting of ellipsees, Proc. of the 13th Internation Conference on Pattern Recognition, pp 253–257, Vienna, 1996.

For convenience, an alternative parameterization is also returned. In this parameterization the ellipse is specified by the coordinates \(X, Y\) and the following relations:

(2)\[\frac{X}{a^2} + \frac{Y}{b^2} = 1\]

and

(3)\[\begin{split}x = \phantom{-}(X - X_0)\cos\theta + (Y - Y_0)\sin\theta \\ y = -(X - X_0)\sin\theta + (Y - Y_0)\cos\theta\end{split}\]

By definition, \(a \ge b\) so that \(a\) is the semi-major axis, and \(\theta\) is the angle by which the semi-major axis is rotated.

Parameters:
  • x (ndarray) – The \(x\) coordinates

  • y (ndarray) – The \(y\) coordinates

Returns:

ndarray with the coefficients \([a_x, a_{xy}, a_{yy}, a_x, a_y, a_1, a, b, X_0, Y_0, \theta]\)

reborn.analysis.optimize.ellipse_center(a)[source]

Find the center \(x_0, \; y_0\) of the ellipse function

(4)\[\frac{(x-x_0)^2}{a^2} + \frac{(y-y_0)^2}{b^2} = 1\]

given the coefficients \(a_i\) of the function

(5)\[a_1 + a_x x^2 + a_{xy} xy + a_{yy}y^2 +a_x x + a_y y = 0 \;.\]

We use the exact code found here. This function should typically be used along with the function fit_ellipse.

Parameters:

a (ndarray) – The array of \(a_i\) coefficients

Returns:

ndarray with center position \([x_0, \; y_0]\)

reborn.analysis.optimize.ellipse_parameters(a)[source]

Convert between the ellipse parameterization

(6)\[a_{xx} x^2 + a_{xy} xy + a_{yy}y^2 +a_x x + a_y y + a_1 = 0 \;.\]

and the parameterization specified by the coordinates \(X, Y\) and the following relations:

(7)\[\frac{X}{a^2} + \frac{Y}{b^2} = 1\]

and

By definition, \(a \ge b\) so that \(a\) is the semi-major axis, and \(\theta\) is the angle by which the semi-major axis is rotated.

Parameters:

a (ndarray) – The coefficients \([a_x, a_{xy}, a_{yy}, a_x, a_y, a_1]\)

Returns:

The coefficients \([a, b, X_0, Y_0, \theta]\)

Return type:

ndarray

reborn.analysis.optimize.fit_ellipse_pad(pad_geometry, data, threshold, mask=None)[source]

Fit an ellipse to pixels above threshold. In order to deal with the possibility of multiple PADs with different detector distances, the x,y coordinates are projected onto a plane located at a distance of 1 meter from the origin. The x-ray beam is assumed to be along the z direction (we can change this if the need arises).

Parameters:
  • pad_geometry (list of PADGeometry’s) – PAD geometry.

  • data (list of ndarray’s) – Data to threshold.

  • threshold (float) – Threshold value. The x,y coordinates from pixels above this will be used in the fit.

  • mask (list of ndarray’s)

Returns:

Ellipse fit parameters (see fit_ellipse function)

Return type:

ndarray

class reborn.analysis.optimize.SphericalDroplets(q=None, r=None)[source]

Bases: object

Attempt to determine the radius of a sphere based on diffraction intensities. This does a very simple thing: it pre-computes a bunch of spherical diffraction profiles upon initialization. Subsequently, you may

Parameters:
  • q (ndarray) – Array of q magnitudes

  • r (ndarray) – Array of radii that you would like to fit against

fit_profile(data, mask=None)[source]

Fit a profile and determine optimal radius.

Parameters:
  • data (ndarray) – Diffraction data, on the same grid of q magnitudes used on the initialization.

  • mask (ndarray) – Optional mask corresponding to data (1 means “good pixe”)

Returns:

The best-fit radius and a dictionary of things that are not documented well (sorry…)

reborn.analysis.parallel module

reborn.analysis.parallel.get_setup_data(**kwargs)[source]
class reborn.analysis.parallel.ParallelAnalyzer(framegetter=None, start=0, stop=None, step=1, n_processes=1, debug=False, message_prefix=None, log_file=None, clear_logs=False, checkpoint_file=None, reduce_from_checkpoints=True, checkpoint_interval=None, clear_checkpoints=False, _parallel=True, _process_id=0)[source]

Bases: ABC

A skeleton for parallel processing of datasets with logging and checkpoints. This class is only useful if each frame is processed independently. The normal use case is to accumulate results from many frames in a run.

You must create a subclass as follows:

  • Define the to_dict method, which creates a dictionary that contains all information needed to restore the state of analysis. See method docs for more details.

  • Define the from_dict method, which restores the state of analysis. See method docs for more detail.

  • Define the add_frame method. This is the core of the processing pipeline; it does all needed actions associated with the addition of a DataFrame to the compiled results. See method docs for more detail.

  • Define the concatenate method, which combines data from different chunks handled by different processors.

  • Optionally define the finalize method that will run after all processing/concatenation is complete.

  • At the beginning of your __init__, include the following line: super().__init__(framegetter=framegetter, **kwargs) This is essential – your subclass will not function properly without this line. The initialization of the base class handles all the configurations associated with logging, checkpoints, and parallel processing.

Important numbers that you might use but should not modify:

  • self.n_processed: Counts the number of good frames processed (for which DataFrame is not None)

  • self.processing_index: Indicates the current raw index for the given analysis chunk.

    Starts at zero for each process.

  • self.framegetter_index: Indicates the current framegetter index.

    E.g. if you start at frame 10, this starts at 10 for worker process 1. It will start elsewhere for the other worker processes.

Parallel analysis base class.

Parameters:
  • framegetter (FrameGetter) – FrameGetter that serves data for analysis(default is None).

  • start (int) – Frame at which to begin analysis (default is 0).

  • stop (int) – Frame at which to end analysis (default is None).

  • step (int) – Number to next frame (default is 1).

  • n_processes (int) – Number of nodes to use for parallel analysis (default is 1).

  • debug (bool) – Set to true to produce debug messages in log files.

  • message_prefix (str) – Prefix to log messages (for example: “Run 7: “; default is None).

  • log_file (str) – The base filepath for log files (e.g. “/results/logs/run0010.log”). Processor IDs will be appended as necessary. Default is None.

  • clear_logs (bool) – Log files keep growing by default. Set this to True if you want them to be cleared. Default is False.

  • checkpoint_file (str) – The base filepath for checkpoint files (e.g. “./results/checkpoints/run0010.pkl”). Processor IDs will be appended as necessary. If this is set to None, no checkpoints will be created, and parallel processing might fail. Be careful to ensure that you will not be processing different runs with the same checkpoint files!!!

  • reduce_from_checkpoints (bool) – Reduce/concatenate data by first saving to disk (minimize memory use). Default is True.

  • checkpoint_interval (int) – How often to save checkpoints (default is None).

  • clear_checkpoints (bool) – Set this to True if you want to remove all stale checkpoints. Be very careful with paths if you do this. You might wipe out something important… Default is False.

  • _parallel (bool) – DO NOT USE. Run analysis on multiple nodes simultaneously (internal keyword).

  • _process_id (int) – DO NOT USE. ID of process that is running analysis (internal keyword).

super_initialized = False
abstract add_frame(dat: DataFrame)[source]

User-defined method that does all actions associated with the addition of one DataFrame to the results. You should probably add a test to determine if any initializations are needed, e.g. in the event that empty arrays must be pre-allocated or counters need to be set to zero.

abstract to_dict()[source]

User-defined method that compiles all relevant data into a dictionary. This will be used to save the state of the analysis in checkpoint files, so be sure that all information that is needed to fast-forward the processing to an intermediate state is provided. You should probably include the config dictionary in this file so you know how the analysis was configured.

abstract from_dict(stats)[source]

User-defined method that is complementary to the to_dict method. Given a dictionary produced by from_dict, this method must take all necessary actions to restore a given analysis state.

abstract concatenate(stats)[source]

User-defined method that combines an existing instance of ParallelAnalyzer with the results of another ParallelAnalyzer that has operated on a different chunk of DataFrames.

finalize()[source]

Optional user-defined method that will be called at the end of all processing, after concatenation and immediately before a final results dictionary is returned or saved to disk.

save(filepath)[source]

Saves dictionary produced by the to_dict method as a pickle file.

load(filepath)[source]

Loads pickled dictionary, as defined by the to_dict method.

process_frames()[source]

Process all dataframes. Will launch parallel processes if n_processes is greater than 1.

Note: If you want the results in a dictionary format, run this method and then run the to_dict() method.

Returns either a dictionary of results, or a string that indicates the path to the results file. In the case of parallel processing, each worker process creates a file containing results, and the main process then combines all of the worker results via the concatenate method.

reborn.analysis.peaks module

class reborn.analysis.peaks.MultiPADPeakFinder(snr_threshold=6, radii=(1, 5, 10), mask=None, subtract_radial_median=False, correct_polarization=False, pad_geometry=None, beam=None)[source]

Bases: object

A peak finder class that deals with multiple PADs.

Parameters:
  • snr_threshold – Peaks must have a signal-to-noise ratio above this value

  • radii – These are the radii associated with the boxsnr function.

  • mask – Ignore pixels where mask == 0.

  • subtract_radial_median – Boolean variable to switch on radial median subtraction

  • correct_polarization – Boolean variable to switch on polarization correction

  • pad_geometry – pad geometry object

  • beam – beam object

find_peaks(data, mask=None)[source]

Do peak-finding on a data array with multiple PADs.

Parameters:
  • data – The data to perform peak-finding on.

  • mask – A mask, if desired. By default, the mask used on creation of a PeakFinder instance will be used. This defaults to ones everywhere.

Returns:

A list of centroids of the peaks.

class reborn.analysis.peaks.PeakFinder(snr_threshold=6, radii=(3, 5, 10), mask=None, max_iterations=3, beam=None)[source]

Bases: object

A crude peak finder. It works by firstly running a signal-to-noise filter over an entire image, and then searching for connected regions with SNR above some threshold. It is not fully developed. For example, it does not yet check for the minimum distance between peaks.

Parameters:
  • snr_threshold – Peaks must have a signal-to-noise ratio above this value

  • radii – These are the radii associated with the boxsnr function.

  • mask – Ignore pixels where mask == 0.

snr = None
signal = None
labels = None
n_labels = 0
centroids = None
beam = None
snr_threshold = None
radii = None
mask = None

Ignore pixels where mask == 0.

find_peaks(data, mask=None)[source]

Do peak finding on a data array.

Parameters:
  • data – The data to perform peak finding on.

  • mask – A mask, if desired. By default, the mask used on creation of a PeakFinder instance will be used. This defaults to ones everywhere.

Returns:

reborn.analysis.peaks.boxsnr(dat, mask_in, mask_out, n_in, n_cent, n_out)[source]

Transform an 2D image into a map of local signal-to-noise ratio by the following equivalent steps:

  1. For every pixel in the input data, do a local signal integration within a square region of size \(n_\text{in}*2+1\). Pixels masked by mask_in will be ignored. Masked pixels are indicated by the value zero, while unmasked pixels are indicated by the value one.

  2. Estimate background via a local integration within a square annulus of outer size \(2 n_\text{out} + 1\) and inner size \(2 n_\text{cent} - 1\). Pixels within mask_out will be ignored.

  3. From every pixel in the local signal integration square, subtract the average background value from step (2).

  4. Compute the standard deviation \(\sigma\) in the square annulus. Pixels within mask_out will be ignored.

  5. Divide the locally-integrated signal-minus-background by the standard error. The standard error is equal to \(\sigma/\sqrt{M}\) where \(M\) is the number of unmasked pixels in the locally-integratied signal region, and \(\sigma\) comes from step (4).

The use of two distinct masks allows for multi-pass SNR computations in which the results of the first pass may be used to exclude high-SNR regions from contributing to error estimates in the annulus. See snr_mask if you want to generate a mask this way.

Note

This routine will attempt to use openmp to parallelize the computations. It is affected by the environment variable OMP_NUM_THREADS. You can check how many cores are used by openmp by running the following:

import reborn.fortran; reborn.fortran.omp_test_f.omp_test()
Parameters:
  • dat (ndarray) – The image to analyze.

  • mask_in (ndarray) – The mask for the square central integration region.

  • mask_out (ndarray) – The mask for the square annulus integration region.

  • n_in (int) – Size of the central integration region; integrate from \((-n_{in}, n_{in})\), inclusively.

  • n_cent (int) – Define the annulus integration region; we ignore the box from (-n_cent, n_cent), inclusively.

  • n_out (int) – Define the annulus integration region; we include the box from (-n_out, n_out), inclusively.

Returns:

snr (ndarray): The SNR array.

signal (ndarray): The signal array.

Return type:

(tuple)

reborn.analysis.peaks.snr_filter(dat, mask, nin, ncent, nout, threshold=6, mask_negative=True, max_iterations=1)[source]

Transform an image into a map of local signal-to-noise ratio.

Parameters:
  • dat (numpy array) – Input data to calculate SNR from.

  • mask (numpy array) – Mask indicating bad pixels (zero means bad, one means ok)

  • nin (int) – See boxsnr function.

  • ncent (int) – See boxsnr function.

  • nout (int) – See boxsnr function.

  • threshold (float) – Reject pixels above this SNR.

  • mask_negative (bool) – Also reject pixels below the negative of the SNR threshold (default: True).

  • max_iterations (int) – The maxumum number of iterations (note: the loop exits if the mask stops changing).

Returns:

The SNR map

Return type:

numpy array

reborn.analysis.peaks.snr_mask(dat, mask, nin, ncent, nout, threshold=6, mask_negative=True, max_iterations=3)[source]

Mask out pixels above some chosen SNR threshold. The image is converted to a map of SNR using boxsnr. Additional iterations follow, in which pixels above threshold in the previous run are also masked in the annulus. This iterative procedure helps avoid contributions of peak signals to the Noise calculation.

Parameters:
  • dat (numpy array) – Input data to calculate SNR from.

  • mask (numpy array) – Mask indicating bad pixels (zero means bad, one means ok)

  • nin (int) – See boxsnr function.

  • ncent (int) – See boxsnr function.

  • nout (int) – See boxsnr function.

  • threshold (float) – Reject pixels above this SNR.

  • mask_negative (bool) – Also reject pixels below the negative of the SNR threshold (default: True).

  • max_iterations (int) – The maxumum number of iterations (note: the loop exits if the mask stops changing).

Returns:

The mask with pixels above the SNR threshold

Return type:

numpy array

reborn.analysis.runstats module

Utilities for gathering statistics from data runs.

class reborn.analysis.runstats.PixelHistogram(bin_min=None, bin_max=None, n_bins=None, n_pixels=None, zero_photon_peak=None, one_photon_peak=None, peak_width=None)[source]

Bases: object

Creates an intensity histogram for each pixel in a PAD. For a PAD with N pixels in total, this class will produce an array of shape (M, N) assuming that you requested M bins in the histogram.

Parameters:
  • bin_min (float) – The minimum value corresponding to histogram bin centers.

  • bin_max (float) – The maximum value corresponding to histogram bin centers.

  • n_bins (int) – The number of histogram bins.

  • n_pixels (int) – How many pixels there are in the detector.

  • zero_photon_peak (float) – Where you think the peak of the zero-photon signal should be.

  • one_photon_peak (float) – Where you think the peak of the one-photon signal should be.

  • peak_width (float) – Width of peaks for fitting to 2nd-order polynomial

count = 0

Number of frames contributing to the histogram.

bin_min = None

The minimum value corresponding to histogram bin centers.

bin_max = None

The maximum value corresponding to histogram bin centers.

n_bins = None

The number of histogram bins.

to_dict()[source]

Convert class to dictionary for storage.

from_dict(d)[source]

Set data according to data stored in dictionary created by to_dict.

concatentate(hist)[source]

Combines two histograms (e.g. if parallel processing and chunks need to be combined).

get_bin_centers()[source]

Returns an 1D array of histogram bin centers.

get_histogram_normalized()[source]

Returns a normalized histogram - an ndarray of shape (M, N) where M is the number of pixels and N is the number of requested bins per pixel.

get_histogram()[source]

Returns a copy of the histogram - an ndarray of shape (M, N) where M is the number of pixels and N is the number of requested bins per pixel.

add_frame(data, mask=None)[source]

Add PAD measurement to the histogram.

get_peaks_fast()[source]

Finds the local maxima in the neighborhood of estimated zero-photon and one-photon peak locations, and peak width estimate.

get_average_peak_locations()[source]

Average all pixels in the entire detector, estimate the locations of the zero-photon and one-photon peaks. Assumes a very smooth histogram.

Returns np.nan values when there is one or zero peaks found (e.g. on a dark dataset there will not be a second peak!)

gain_and_offset()[source]
get_refined_gain_and_offset(n_processes=1)[source]

Does a 2nd-order polynomial fit to the peak. First guess is the local maximum.

correct_gain_and_offset(histogram=None, gain=None, offset=None)[source]

Apply gain and offset values to a histogram using interpolations.

convert_to_q_histogram(n_q_bins=None, q_range=None, pad_geometry=None, beam=None, mask=None, normalized=False)[source]

Convert pixel histogram to q-bin histogram.

Parameters:
  • n_q_bins (int) – = None,

  • q_range (tuple) – = None,

  • pad_geometry (PADGeometry) – = None,

  • beam (Beam) – = None,

Return type:

q_histogram (ndarray)

get_median_profile(n_q_bins=None, q_range=None, pad_geometry=None, beam=None, mask=None, normalized=False)[source]

Convert pixel histogram to q-bin histogram and then produce a scattering profile based on median values of each q bin.

Parameters:
  • n_q_bins (int) – = None,

  • q_range (tuple) – = None,

  • pad_geometry (PADGeometry) – = None,

  • beam (Beam) – = None,

Returns:

Return type:

tuple

process_framegetter(framegetter)[source]
class reborn.analysis.runstats.ParallelPADStats(framegetter=None, histogram_params=None, **kwargs)[source]

Bases: ParallelAnalyzer

Gather PAD statistics for a dataset.

Given a FrameGetter subclass instance, fetch the mean intensity, mean squared intensity, minimum, and maximum intensities, and optionally a pixel-by-pixel intensity histogram. The function can run on multiple processors via the joblib library. Logfiles and checkpoint files are created.

The return of this function is a dictionary with the following keys:

  • ‘sum’: Sum of PAD intensities

  • ‘dataset_id’: A unique identifier for the data set (e.g. ‘run0154’)

  • ‘pad_geometry’: PADGeometryList

  • ‘mask’: Bad pixel mask

  • ‘n_frames’: Number of valid frames contributing to the statistics

  • ‘sum’: Sum of PAD intensities

  • ‘sum2’: Sum of squared PAD intensities

  • ‘min’: Pixel-by-pixel minimum of PAD intensities

  • ‘max’: Pixel-by-pixel maximum of PAD intensities

  • ‘counts’: Sum of the PAD mask

  • ‘beam’: Beam info

  • ‘start’: Frame at which processing started (global framegetter index)

  • ‘stop’: Frame at which processing stopped (global framegetter index)

  • ‘step’: Step size between frames (helpful to speed up processing by sub-sampling)

  • ‘wavelengths’: An array of wavelengths

  • ‘histogram_params’: Dictionary with histogram parameters

  • ‘histogram’: Pixel-by-pixel intensity histogram (MxN array, with M the number of pixels)

There is a corresponding view_padstats function to view the results in this dictionary.

padstats needs a configuration dictionary with the following contents:

  • ‘log_file’: Base path/name for status logging. Set to None to skip logging.

  • ‘checkpoint_file’: Base path/name for saving check points. Set to None to skip checkpoints.

  • ‘checkpoint_interval’: How many frames between checkpoints.

  • ‘message_prefix’: Prefix added to all logging messages. For example: “Run 35” might be helpful

  • ‘debug’: Set to True for more logging messages.

  • ‘reduce_from_checkpoints’: True by default, this indicates that data produced by multiple processors should be compiled by loading the checkpoints from disk. Without this, you might have memory problems. (The need for this is due to the joblib package; normally the reduce functions from MPI would be used to avoid hitting the disk.)

  • ‘histogram_params’: If not None, triggers production of a pixel-by-pixel histogram. This is a dictionary with the following entries: dict(bin_min=-30, bin_max=100, n_bins=100, zero_photon_peak=0, one_photon_peak=30)

Parameters:
  • framegetter (FrameGetter) – A FrameGetter subclass. If running in parallel, you should instead pass a dictionary with keys ‘framegetter’ (with reference to FrameGetter subclass, not an actual class instance) and ‘kwargs’ containing a dictionary of keyword arguments needed to create a class instance.

  • histogram_params (dict)

  • start (int) – Which frame to start with.

  • stop (int) – Which frame to stop at.

  • step (int) – Step size between frames (default 1).

  • n_processes (int) – How many processes to run in parallel (if parallel=True).

  • **kwargs – Any additional key-word arguments you would like to pass to the base class. See: ..:py:class::~reborn.analysis.parallel.ParallelAnalyzer

Returns: dict

add_frame(dat)[source]

User-defined method that does all actions associated with the addition of one DataFrame to the results. You should probably add a test to determine if any initializations are needed, e.g. in the event that empty arrays must be pre-allocated or counters need to be set to zero.

finalize()[source]

Compute gain and offset from histogram, trim arrays.

initialize_data(rdat)[source]

Allocate memory, initialize counters.

to_dict()[source]

User-defined method that compiles all relevant data into a dictionary. This will be used to save the state of the analysis in checkpoint files, so be sure that all information that is needed to fast-forward the processing to an intermediate state is provided. You should probably include the config dictionary in this file so you know how the analysis was configured.

from_dict(stats)[source]

User-defined method that is complementary to the to_dict method. Given a dictionary produced by from_dict, this method must take all necessary actions to restore a given analysis state.

concatenate(stats)[source]

Take the existing data in this class instance and ‘add’ the date in the incoming stats dictionary. E.g. extend arrays as needed, or sum them. Also known as a ‘reduce’ step. One can think of ‘merging’ the data.

reborn.analysis.runstats.save_padstats(stats: dict, filepath: str)[source]

Saves the results of the padstats function.

Parameters:
  • stats (dict) – Dictionary output of padstats.

  • filepath (str) – Path to the file you want to save.

reborn.analysis.runstats.load_padstats(filepath: str)[source]

Load the results of padstats from disk.

Parameters:

filepath (str) – Path to the file you want to load.

Returns: dict

reborn.analysis.runstats.padstats_framegetter(stats)[source]

Make a FrameGetter that can flip through the output of padstats.

Parameters:

stats (dict) – Output of padstats.

Returns: ListFrameGetter

reborn.analysis.runstats.view_padstats(stats, histogram=False, start=True, main=True, show_corrected_histogram=False, pad_geometry=None, beam=None, debug=0)[source]

View the output of padstats.

Parameters:
  • stats (dict) – Output dictionary from padstats.

  • histogram (bool) – Show the histogram

  • start (bool) – Choose if the PADView app should be started (executed)

  • main (bool) – Choose if the window should be a Qt main window.

  • show_corrected_histogram (bool) – Undocumented.

  • pad_geometry (PADGeometryList) – PAD geometry

  • beam (Beam) – Beam parameters

reborn.analysis.runstats.get_gain_and_offset(stats, n_processes=1, _peaks=None, debug=1)[source]

Analyze histogram and attempt to extract offsets and gains from the zero- and one-photon peak. Experimental. Use at your own risk!

reborn.analysis.saxs module

reborn.analysis.saxs.debug_message(*args, caller=True, **kwargs)[source]

Standard debug message, which includes the function called.

class reborn.analysis.saxs.QHistogram(bin_min=None, bin_max=None, n_bins=None, n_q_bins=None, pad_geometry=None, beam=None)[source]

Bases: object

Creates an intensity histogram for each pixel in a PAD. For a PAD with N pixels in total, this class will produce an array of shape (M, N) assuming that you requested M bins in the histogram.

Parameters:
  • bin_min (float) – The minimum value corresponding to histogram bin centers.

  • bin_max (float) – The maximum value corresponding to histogram bin centers.

  • n_bins (int) – The number of histogram bins.

  • n_q_bins (int) – The number of q bins.

count = 0
bin_min = None
bin_max = None
n_bins = None
n_q_bins = None
to_dict()[source]

Convert class to dictionary for storage.

from_dict(d)[source]

Set data according to data stored in dictionary created by to_dict.

get_bin_centers()[source]

Returns an 1D array of histogram bin centers.

get_q_bin_centers()[source]

Returns an 1D array of histogram q bin centers.

get_histogram()[source]

Returns a copy of the histogram - an ndarray of shape (M, N) where M is the number of pixels and N is the number of requested bins per pixel.

add_frame(data, mask=None)[source]

Add PAD measurement to the histogram.

process_framegetter(framegetter)[source]

TODO: docstring

class reborn.analysis.saxs.RadialProfiler(mask=None, n_bins=None, q_range=None, pad_geometry=None, beam=None)[source]

Bases: object

A class for creating radial profiles from image data. Standard profiles are computed using fortran code. Bin indices are cached for speed, provided that the PADGeometry and Beam do not change. Arbitrary profiles can also be computed (slowly!) provided a function handle that operates on an ndarray.

For a parallelized version see: ..:py:class::~reborn.analysis.saxs.RadialProfiler

Parameters:
  • mask (ndarray) – Optional. The arrays will be multiplied by this mask, and the counts per radial bin will come from this (e.g. use values of 0 and 1 if you want a normal average, otherwise you get a weighted average).

  • n_bins (int) – Number of radial bins you desire.

  • q_range (list-like) – The minimum and maximum of the centers of the q bins.

  • pad_geometry (PADGeometryList) – Optional. Will be used to generate q magnitudes. You must provide beam if you provide this.

  • beam (Beam) – Optional, unless pad_geometry is provided. Wavelength and beam direction are needed in order to calculate q magnitudes.

n_bins = None
q_range = None
q_edge_range = None
bin_centers = None
bin_edges = None
bin_size = None
set_mask(mask)[source]

Update the mask. Concatenate (if needed), copy, and make it un-writeable.

Parameters:

mask (ndarray) – Mask.

set_beam(beam)[source]

Update the Beam with a copy and clear out derived caches.

Parameters:

beam (Beam) – Beam properties.

set_pad_geometry(geom)[source]

Update the PADGeometryList with a copy and clear out derived caches.

Parameters:

geom (PADGeometryList) – PAD geometry.

property q_bin_centers

The centers of the q bins.

make_plan(mask=None, n_bins=None, q_range=None, pad_geometry=None, beam=None)[source]

Set up the binning indices for the creation of radial profiles. Cache some useful data to speed things up later.

Parameters:
  • mask (ndarray) – Optional. The arrays will be multiplied by this mask, and the counts per radial bin will come from this (e.g. use values of 0 and 1 if you want a normal average, otherwise you get a weighted average).

  • n_bins (int) – Number of radial bins you desire.

  • q_range (tuple) – The minimum and maximum of the centers of the q bins.

  • pad_geometry (list of PADGeometry instances) – Optional. Will be used to generate q magnitudes. You must provide beam if you provide this.

  • beam (Beam instance) – Optional, unless pad_geometry is provided. Wavelength and beam direction are needed in order to calculate q magnitudes.

concat_data(data)[source]

Concatenate 2D diffraction data.

get_profile_statistic(data, mask=None, statistic=None)[source]

Calculate the radial profile using an arbitrary function.

Parameters:
  • data (ndarray) – The intensity data from which the radial profile is formed.

  • mask (ndarray) – Optional. A mask to indicate bad pixels. Zero is bad, one is good. If no mask is provided here, the mask configured with set_mask will be used.

  • statistic (function or list of functions) – Provide a function of your choice that runs on each radial bin.

Returns: ndarray

get_counts_profile(mask=None, quick=True)[source]

Calculate the radial profile of counts that fall in each bin. If the mask is not binary, then the “counts” are actually the sum of weights in each bin.

Parameters:

mask (ndarray) – Optional mask. Uses cache if not provided.

Returns:

ndarray

get_sum_profile(data, mask=None, quick=True)[source]

Calculate the radial profile of summed intensities. This is divided by counts to get an average.

Parameters:
  • data (ndarray) – The intensity data from which the radial profile is formed.

  • mask (ndarray) – Optional. A mask to indicate bad pixels. Zero is bad, one is good.

Returns: ndarray

get_mean_profile(data, mask=None, quick=True)[source]

Calculate the radial profile of averaged intensities.

Parameters:
  • data (ndarray) – The intensity data from which the radial profile is formed.

  • mask (ndarray) – Optional. A mask to indicate bad pixels. Zero is bad, one is good. If no mask is provided here, the mask configured with set_mask will be used.

Returns: ndarray

get_median_profile(data, mask=None)[source]

Calculate the radial profile of averaged intensities.

Parameters:
  • data (ndarray) – The intensity data from which the radial profile is formed.

  • mask (ndarray) – Optional. A mask to indicate bad pixels. Zero is bad, one is good. If no mask is provided here, the mask configured with set_mask will be used.

Returns: ndarray

get_sdev_profile(data, mask=None, quick=True)[source]

Calculate the standard deviations of radial bin.

Parameters:
  • data (ndarray) – The intensity data from which the radial profile is formed.

  • mask (ndarray) – Optional. A mask to indicate bad pixels. Zero is bad, one is good. If no mask is provided here, the mask configured with set_mask will be used.

Returns: ndarray

subtract_profile(data, mask=None, statistic=<function median>)[source]

Given some PAD data, subtract a radial profile (mean or median).

Parameters:
  • data (ndarray) – The intensity data from which the radial profile is formed.

  • mask (ndarray) – Optional. A mask to indicate bad pixels. Zero is bad, one is good. If no mask is provided here, the mask configured with set_mask will be used.

  • statistic (function) – Provide a function of your choice that runs on each radial bin.

Returns:

divide_profile(data, mask=None, statistic=<function median>)[source]

Given some PAD data, subtract a radial profile (mean or median).

Parameters:
  • data (ndarray) – The intensity data from which the radial profile is formed.

  • mask (ndarray) – Optional. A mask to indicate bad pixels. Zero is bad, one is good. If no mask is provided here, the mask configured with set_mask will be used.

  • statistic (function) – Provide a function of your choice that runs on each radial bin.

Returns:

ndarray

subtract_median_profile(data, mask=None)[source]

Given some PAD data, calculate the radial median and subtract it from the data.

Parameters:
Returns:

ndarray

quickstats(data, weights=None)[source]

Use the faster fortran functions to get the mean and standard deviations. These are weighted by the mask, and you are allowed to use non-binary values in the mask. The proper weighting is most likely equal to the product of the pixel solid angle, the polarization factor, and the binary mask.

Parameters:
  • data (ndarray) – Diffraction intensities.

  • weights (ndarray) – Weights (or binary mask if you prefer unweighted average)

Returns:

dict

class reborn.analysis.saxs.ParallelRadialProfiler(framegetter=None, n_q_bins=1000, q_range=None, pad_geometry=None, beam=None, mask=None, include_median=False, **kwargs)[source]

Bases: ParallelAnalyzer

Parallelized class for creating radial profiles from x-ray diffraction data.

Parallel analyzer that produces scattering (“radial”) profiles from diffraction patterns. See the to_dict method for details of what the output is.

Standard profiles are computed using fortran code. Bin indices are cached for speed, provided that the PADGeometry and Beam do not change.

Parameters:
  • framegetter (FrameGetter) – The FrameGetter that serves the data for analysis.

  • n_bins (int) – Number of radial bins (default is 1000).

  • q_range (list-like) – The minimum and maximum of the centers of the q bins.

  • pad_geometry (PADGeometryList) – Detector geometry, used to generate q magnitudes. If None, then automatically retrieved from raw data.

  • beam (Beam) – X-ray beam. Wavelength and beam direction required to calculate q magnitudes. If None, then automatically retrieved from raw data.

  • mask (List or ndarray) – Optional (default is no masked pixels). Data will be multiplied by this mask, and the counts per radial bin will come from this (e.g. use values of 0 and 1 if you want a normal average, otherwise you get a weighted average).

  • include_median (bool) – Set to True to include median profile (default is False).

  • start (int) – Which frame to start with.

  • stop (int) – Which frame to stop at.

  • step (int) – Step size between frames (default 1).

  • n_processes (int) – How many processes to run in parallel (if parallel=True).

  • **kwargs – Any key-word arguments you would like to pass to the base class. See: ..:py:class::~reborn.analysis.parallel.ParallelAnalyzer

include_median = False
profiler = None
radials = None
add_frame(dat: DataFrame)[source]

User-defined method that does all actions associated with the addition of one DataFrame to the results. You should probably add a test to determine if any initializations are needed, e.g. in the event that empty arrays must be pre-allocated or counters need to be set to zero.

concatenate(stats: dict)[source]

User-defined method that combines an existing instance of ParallelAnalyzer with the results of another ParallelAnalyzer that has operated on a different chunk of DataFrames.

to_dict()[source]

Create a dictionary of the results.

Returns:

  • mean (ndarray) – Mean of unmasked intensities

  • sdev (ndarray) – Standard deviation of unmasked intensities

  • median (ndarray) – Median of unmasked intensities (only if requested; this is slow)

  • sum (ndarray) – Sum of unmasked intensities

  • sum2 (ndarray) – Sum of squared unmasked intensities

  • wsum (ndarray) – Weighted sum (if no weights are provided,

    then this is the number of unmasked pixels in the q bin)

  • n_frames (int) – Number of frames analyzed.

  • initial_frame (int) – First frame from which geometry and beam data are extracted.

  • n_q_bins (int) – Number of q bins in radial profiles.

  • q_bins (ndarray) – The centers of the q bins.

  • experiment_id (str) – Identifier for experiment being analyzed.

  • run_id (str) – Identifier for run being analyzed

  • pad_geometry (PADGeometryList) – Detector geometry used to set up RadialProfiler.

  • beam (Beam) – X-ray beam used to set up RadialProfiler.

  • mask (ndarray) – Mask used to set up RadialProfiler.

Return type:

(dict)

from_dict(stats: dict)[source]

User-defined method that is complementary to the to_dict method. Given a dictionary produced by from_dict, this method must take all necessary actions to restore a given analysis state.

reborn.analysis.saxs.save_profiles(stats, filepath)[source]
reborn.analysis.saxs.load_profiles(filepath)[source]
reborn.analysis.saxs.view_profile_runstats(stats)[source]
reborn.analysis.saxs.normalize_profile_stats(stats, q_range=None)[source]

Given a stats dictionary, normalize by setting a particular q range to an average of 1.

Parameters:
  • stats (dict) – A stats dictionary created by get_profile_stats_from_pandas()

  • q_range (tuple) – Optionally specify the normalization range with a tuple containing (q_min, q_max)

Returns:

An updates stats dictionary

Return type:

dict

reborn.analysis.svd module

reborn.analysis.svd.addblock_svd_update(U, S, V, A, force_orth=False)[source]

This is a nearly direct translation of the Matlab code found here: https://pcc.byu.edu/scripts/addblock_svd_update.m

The mathematics is discussed here: Brand, M. Fast low-rank modifications of the thin singular value decomposition. Linear Algebra and its Applications 415, 20-30 (2006).

Note that there are some differences between Numpy and Matlab implementations of SVD. So far, I noticed that the V matrix in the X = USV decomposition is transposed when comparing Matlab to Numpy. Also, the ordering of the diagonal S matrix is different; Numpy is more economical since only the diagonals are specified as a 1D matrix. More importantly, the actual diagonal entries appear to be reversed in their ordering by comparison. I don’t know how they differ in the event of degenerate values. There are likely some “gotcha’s” that I am not yet aware of.

Original documenation by D. Wingate 8/17/2007:

=================================================================

Given the SVD of

X = U*S*V’

update it to be the SVD of

[X A] = Up*Sp*Vp’

that is, add new columns (ie, data points).

I have found that it is faster to add several (say, 200) data points at a time, rather than updating it incrementally with individual data points (for 500 dimensional vectors, the speedup was roughly 25x). However, in the rank-one case there is structure that I have not exploited, so that may still be faster than a block method.

The subspace rotations involved may not preserve orthogonality due to numerical round-off errors. To compensate, you can set the “force_orth” flag, which will force orthogonality via a QR plus another SVD. In a long loop, you may want to force orthogonality every so often.

See Matthew Brand, “Fast low-rank modifications of the thin singular value decomposition”.

  1. Wingate 8/17/2007

=====================================================================

Parameters:
  • U (numpy array) – Left singular vectors of shape p x q

  • S (numpy array) – Diagonal matrix (shape q – only the q diagonal elements specified)

  • V (numpy array) – Right singular vectors of shape q x n

  • A (numpy array) – The matrix to be appended to X = USV (shape p x n)

  • force_orth – Force orthogonality

Returns:

Up, Sp, Vp

Return type:

numpy arrays

Module contents