reborn.analysis package¶
Submodules¶
reborn.analysis.beam_center module¶
reborn.analysis.masking module¶
- reborn.analysis.masking.snr_filter_test(data, mask, mask2, nin, ncent, nout)[source]¶
This is a slow version of snr_filter, which is about 10-fold faster on multi-core CPUs.
- reborn.analysis.masking.snr_filter(dat, mask_in, mask_out, n_in, n_cent, n_out)[source]¶
Transform an 2D image into a map of local signal-to-noise ratio by the following equivalent steps:
For every pixel in the input data, do a local signal integration within a square region of size \(n_\text{in}*2+1\). Pixels masked by mask_in will be ignored. Masked pixels are indicated by the value zero, while unmasked pixels are indicated by the value one.
Estimate background via a local integration within a square annulus of outer size \(2 n_\text{out} + 1\) and inner size \(2 n_\text{cent} - 1\). Pixels within mask_out will be ignored.
From every pixel in the local signal integration square, subtract the average background value from step (2).
Compute the standard deviation \(\sigma\) in the square annulus. Pixels within mask_out will be ignored.
Divide the locally-integrated signal-minus-background by the standard error. The standard error is equal to \(\sigma/\sqrt{M}\) where \(M\) is the number of unmasked pixels in the locally-integratied signal region, and \(\sigma\) comes from step (4).
The use of two distinct masks allows for multi-pass SNR computations in which the results of the first pass may be used to exclude high-SNR regions from contributing to error estimates in the annulus. See
snr_mask
if you want to generate a mask this way.Note
This routine will attempt to use openmp to parallelize the computations. It is affected by the environment variable OMP_NUM_THREADS. You can check how many cores are used by openmp by running the following:
import reborn.fortran; reborn.fortran.omp_test_f.omp_test()
- Parameters:
dat (
ndarray
) – The image to analyze.mask_in (
ndarray
) – The mask for the square central integration region.mask_out (
ndarray
) – The mask for the square annulus integration region.n_in (int) – Size of the central integration region; integrate from \((-n_{in}, n_{in})\), inclusively.
n_cent (int) – Define the annulus integration region; we ignore the box from (-n_cent, n_cent), inclusively.
n_out (int) – Define the annulus integration region; we include the box from (-n_out, n_out), inclusively.
- Returns:
snr (
ndarray
): The SNR array.signal (
ndarray
): The signal array.- Return type:
(tuple)
- reborn.analysis.masking.snr_mask(dat, mask, nin, ncent, nout, threshold=6, mask_negative=True, max_iterations=3, pad_geometry=None, beam=None, subtract_median=False)[source]¶
Mask out pixels above some chosen SNR threshold. The image is converted to a map of SNR using boxsnr. Additional iterations follow, in which pixels above threshold in the previous run are also masked in the annulus. This iterative procedure helps avoid contributions of peak signals to the Noise calculation.
- Parameters:
dat (numpy array) – Input data to calculate SNR from.
mask (numpy array) – Mask indicating bad pixels (zero means bad, one means ok)
nin (int) – See boxsnr function.
ncent (int) – See boxsnr function.
nout (int) – See boxsnr function.
threshold (float) – Reject pixels above this SNR.
mask_negative (bool) – Also reject pixels below the negative of the SNR threshold (default: True).
max_iterations (int) – The maxumum number of iterations (note: the loop exits if the mask stops changing).
pad_geometry (
PADGeometryList
) – PAD geometry (optional)subtract_median (bool) – Subtract median profiler before masking.
- Returns:
The mask with pixels above the SNR threshold
- Return type:
numpy array
- class reborn.analysis.masking.StreakMasker(geom: PADGeometryList, beam: Beam, n_q=100, q_range=(0, 3000000000.0), prominence=0.8, max_streaks=2, debug=0, streak_width=0.01, snr=6, angle_range=None)[source]¶
Bases:
object
A tool for masking jet streaks or other streak-like features in diffraction patterns. It is assumed that the streak crosses through the beam center.
- Parameters:
geom (
PADGeometryList
) – PAD geometry.beam (
Beam
) – Beam info.n_q (int) – Number of q bins. Default: 100.
q_range (float tuple) – Range of q values to consider. Default: (0, 2e10).
prominence (float) – Look at the corresponding parameter in scipy.signal.find_peaks. Default: 0.8.
max_streaks (int) – Maximum number of streaks. Default: 2.
debug (int) – Set to 1 or greater to see some standard output.
streak_width (float) – Angular width of the jet streak mask.
snr (float) – Threshold the signal-to-noise ration for jet streaks (peak height - average background)/standard deviation
reborn.analysis.optimize module¶
Miscellaneous optimization and fitting routines.
- reborn.analysis.optimize.fit_ellipse(x, y)[source]¶
Given lists of coordinates \((x,\;y)\), do least-squares fit to the coefficients \(a_i\) of the function
(1)¶\[a_{xx} x^2 + a_{xy} xy + a_{yy}y^2 +a_x x + a_y y + a_1 = 0 \;.\]The algorithm ensures that the coefficients satisfy the ellipse condition \(4a_{xx} a_{yy}−a_{xy}^2 > 0\). We use the exact code found here, which in turn follows the approch in Fitzgibbon, A.W., Pilu, M., and Fischer R.B., Direct least squares fitting of ellipsees, Proc. of the 13th Internation Conference on Pattern Recognition, pp 253–257, Vienna, 1996.
For convenience, an alternative parameterization is also returned. In this parameterization the ellipse is specified by the coordinates \(X, Y\) and the following relations:
(2)¶\[\frac{X}{a^2} + \frac{Y}{b^2} = 1\]and
(3)¶\[\begin{split}x = \phantom{-}(X - X_0)\cos\theta + (Y - Y_0)\sin\theta \\ y = -(X - X_0)\sin\theta + (Y - Y_0)\cos\theta\end{split}\]By definition, \(a \ge b\) so that \(a\) is the semi-major axis, and \(\theta\) is the angle by which the semi-major axis is rotated.
- reborn.analysis.optimize.ellipse_center(a)[source]¶
Find the center \(x_0, \; y_0\) of the ellipse function
(4)¶\[\frac{(x-x_0)^2}{a^2} + \frac{(y-y_0)^2}{b^2} = 1\]given the coefficients \(a_i\) of the function
(5)¶\[a_1 + a_x x^2 + a_{xy} xy + a_{yy}y^2 +a_x x + a_y y = 0 \;.\]We use the exact code found here. This function should typically be used along with the function
fit_ellipse
.
- reborn.analysis.optimize.ellipse_parameters(a)[source]¶
Convert between the ellipse parameterization
(6)¶\[a_{xx} x^2 + a_{xy} xy + a_{yy}y^2 +a_x x + a_y y + a_1 = 0 \;.\]and the parameterization specified by the coordinates \(X, Y\) and the following relations:
(7)¶\[\frac{X}{a^2} + \frac{Y}{b^2} = 1\]and
By definition, \(a \ge b\) so that \(a\) is the semi-major axis, and \(\theta\) is the angle by which the semi-major axis is rotated.
- reborn.analysis.optimize.fit_ellipse_pad(pad_geometry, data, threshold, mask=None)[source]¶
Fit an ellipse to pixels above threshold. In order to deal with the possibility of multiple PADs with different detector distances, the x,y coordinates are projected onto a plane located at a distance of 1 meter from the origin. The x-ray beam is assumed to be along the z direction (we can change this if the need arises).
- Parameters:
pad_geometry (list of
PADGeometry
’s) – PAD geometry.data (list of
ndarray
’s) – Data to threshold.threshold (float) – Threshold value. The x,y coordinates from pixels above this will be used in the fit.
mask (list of
ndarray
’s)
- Returns:
Ellipse fit parameters (see
fit_ellipse
function)- Return type:
- class reborn.analysis.optimize.SphericalDroplets(q=None, r=None)[source]¶
Bases:
object
Attempt to determine the radius of a sphere based on diffraction intensities. This does a very simple thing: it pre-computes a bunch of spherical diffraction profiles upon initialization. Subsequently, you may
- Parameters:
reborn.analysis.parallel module¶
- class reborn.analysis.parallel.ParallelAnalyzer(framegetter=None, start=0, stop=None, step=1, n_processes=1, debug=False, message_prefix=None, log_file=None, clear_logs=False, checkpoint_file=None, reduce_from_checkpoints=True, checkpoint_interval=None, clear_checkpoints=False, _parallel=True, _process_id=0)[source]¶
Bases:
ABC
A skeleton for parallel processing of datasets with logging and checkpoints. This class is only useful if each frame is processed independently. The normal use case is to accumulate results from many frames in a run.
You must create a subclass as follows:
Define the to_dict method, which creates a dictionary that contains all information needed to restore the state of analysis. See method docs for more details.
Define the from_dict method, which restores the state of analysis. See method docs for more detail.
Define the add_frame method. This is the core of the processing pipeline; it does all needed actions associated with the addition of a
DataFrame
to the compiled results. See method docs for more detail.Define the concatenate method, which combines data from different chunks handled by different processors.
Optionally define the finalize method that will run after all processing/concatenation is complete.
At the beginning of your __init__, include the following line: super().__init__(framegetter=framegetter, **kwargs) This is essential – your subclass will not function properly without this line. The initialization of the base class handles all the configurations associated with logging, checkpoints, and parallel processing.
Important numbers that you might use but should not modify:
self.n_processed: Counts the number of good frames processed (for which
DataFrame
is not None)- self.processing_index: Indicates the current raw index for the given analysis chunk.
Starts at zero for each process.
- self.framegetter_index: Indicates the current framegetter index.
E.g. if you start at frame 10, this starts at 10 for worker process 1. It will start elsewhere for the other worker processes.
Parallel analysis base class.
- Parameters:
framegetter (
FrameGetter
) – FrameGetter that serves data for analysis(default is None).start (int) – Frame at which to begin analysis (default is 0).
stop (int) – Frame at which to end analysis (default is None).
step (int) – Number to next frame (default is 1).
n_processes (int) – Number of nodes to use for parallel analysis (default is 1).
debug (bool) – Set to true to produce debug messages in log files.
message_prefix (str) – Prefix to log messages (for example: “Run 7: “; default is None).
log_file (str) – The base filepath for log files (e.g. “/results/logs/run0010.log”). Processor IDs will be appended as necessary. Default is None.
clear_logs (bool) – Log files keep growing by default. Set this to True if you want them to be cleared. Default is False.
checkpoint_file (str) – The base filepath for checkpoint files (e.g. “./results/checkpoints/run0010.pkl”). Processor IDs will be appended as necessary. If this is set to None, no checkpoints will be created, and parallel processing might fail. Be careful to ensure that you will not be processing different runs with the same checkpoint files!!!
reduce_from_checkpoints (bool) – Reduce/concatenate data by first saving to disk (minimize memory use). Default is True.
checkpoint_interval (int) – How often to save checkpoints (default is None).
clear_checkpoints (bool) – Set this to True if you want to remove all stale checkpoints. Be very careful with paths if you do this. You might wipe out something important… Default is False.
_parallel (bool) – DO NOT USE. Run analysis on multiple nodes simultaneously (internal keyword).
_process_id (int) – DO NOT USE. ID of process that is running analysis (internal keyword).
- super_initialized = False¶
- abstract add_frame(dat: DataFrame)[source]¶
User-defined method that does all actions associated with the addition of one
DataFrame
to the results. You should probably add a test to determine if any initializations are needed, e.g. in the event that empty arrays must be pre-allocated or counters need to be set to zero.
- abstract to_dict()[source]¶
User-defined method that compiles all relevant data into a dictionary. This will be used to save the state of the analysis in checkpoint files, so be sure that all information that is needed to fast-forward the processing to an intermediate state is provided. You should probably include the config dictionary in this file so you know how the analysis was configured.
- abstract from_dict(stats)[source]¶
User-defined method that is complementary to the to_dict method. Given a dictionary produced by from_dict, this method must take all necessary actions to restore a given analysis state.
- abstract concatenate(stats)[source]¶
User-defined method that combines an existing instance of ParallelAnalyzer with the results of another ParallelAnalyzer that has operated on a different chunk of DataFrames.
- finalize()[source]¶
Optional user-defined method that will be called at the end of all processing, after concatenation and immediately before a final results dictionary is returned or saved to disk.
- process_frames()[source]¶
Process all dataframes. Will launch parallel processes if n_processes is greater than 1.
Note: If you want the results in a dictionary format, run this method and then run the to_dict() method.
Returns either a dictionary of results, or a string that indicates the path to the results file. In the case of parallel processing, each worker process creates a file containing results, and the main process then combines all of the worker results via the concatenate method.
reborn.analysis.peaks module¶
- class reborn.analysis.peaks.MultiPADPeakFinder(snr_threshold=6, radii=(1, 5, 10), mask=None, subtract_radial_median=False, correct_polarization=False, pad_geometry=None, beam=None)[source]¶
Bases:
object
A peak finder class that deals with multiple PADs.
- Parameters:
snr_threshold – Peaks must have a signal-to-noise ratio above this value
radii – These are the radii associated with the
boxsnr
function.mask – Ignore pixels where mask == 0.
subtract_radial_median – Boolean variable to switch on radial median subtraction
correct_polarization – Boolean variable to switch on polarization correction
pad_geometry – pad geometry object
beam – beam object
- find_peaks(data, mask=None)[source]¶
Do peak-finding on a data array with multiple PADs.
- Parameters:
data – The data to perform peak-finding on.
mask – A mask, if desired. By default, the mask used on creation of a PeakFinder instance will be used. This defaults to ones everywhere.
- Returns:
A list of centroids of the peaks.
- class reborn.analysis.peaks.PeakFinder(snr_threshold=6, radii=(3, 5, 10), mask=None, max_iterations=3, beam=None)[source]¶
Bases:
object
A crude peak finder. It works by firstly running a signal-to-noise filter over an entire image, and then searching for connected regions with SNR above some threshold. It is not fully developed. For example, it does not yet check for the minimum distance between peaks.
- Parameters:
snr_threshold – Peaks must have a signal-to-noise ratio above this value
radii – These are the radii associated with the
boxsnr
function.mask – Ignore pixels where mask == 0.
- snr = None¶
- signal = None¶
- labels = None¶
- n_labels = 0¶
- centroids = None¶
- beam = None¶
- snr_threshold = None¶
- radii = None¶
- mask = None¶
Ignore pixels where mask == 0.
- reborn.analysis.peaks.boxsnr(dat, mask_in, mask_out, n_in, n_cent, n_out)[source]¶
Transform an 2D image into a map of local signal-to-noise ratio by the following equivalent steps:
For every pixel in the input data, do a local signal integration within a square region of size \(n_\text{in}*2+1\). Pixels masked by mask_in will be ignored. Masked pixels are indicated by the value zero, while unmasked pixels are indicated by the value one.
Estimate background via a local integration within a square annulus of outer size \(2 n_\text{out} + 1\) and inner size \(2 n_\text{cent} - 1\). Pixels within mask_out will be ignored.
From every pixel in the local signal integration square, subtract the average background value from step (2).
Compute the standard deviation \(\sigma\) in the square annulus. Pixels within mask_out will be ignored.
Divide the locally-integrated signal-minus-background by the standard error. The standard error is equal to \(\sigma/\sqrt{M}\) where \(M\) is the number of unmasked pixels in the locally-integratied signal region, and \(\sigma\) comes from step (4).
The use of two distinct masks allows for multi-pass SNR computations in which the results of the first pass may be used to exclude high-SNR regions from contributing to error estimates in the annulus. See
snr_mask
if you want to generate a mask this way.Note
This routine will attempt to use openmp to parallelize the computations. It is affected by the environment variable OMP_NUM_THREADS. You can check how many cores are used by openmp by running the following:
import reborn.fortran; reborn.fortran.omp_test_f.omp_test()
- Parameters:
dat (
ndarray
) – The image to analyze.mask_in (
ndarray
) – The mask for the square central integration region.mask_out (
ndarray
) – The mask for the square annulus integration region.n_in (int) – Size of the central integration region; integrate from \((-n_{in}, n_{in})\), inclusively.
n_cent (int) – Define the annulus integration region; we ignore the box from (-n_cent, n_cent), inclusively.
n_out (int) – Define the annulus integration region; we include the box from (-n_out, n_out), inclusively.
- Returns:
snr (
ndarray
): The SNR array.signal (
ndarray
): The signal array.- Return type:
(tuple)
- reborn.analysis.peaks.snr_filter(dat, mask, nin, ncent, nout, threshold=6, mask_negative=True, max_iterations=1)[source]¶
Transform an image into a map of local signal-to-noise ratio.
- Parameters:
dat (numpy array) – Input data to calculate SNR from.
mask (numpy array) – Mask indicating bad pixels (zero means bad, one means ok)
nin (int) – See boxsnr function.
ncent (int) – See boxsnr function.
nout (int) – See boxsnr function.
threshold (float) – Reject pixels above this SNR.
mask_negative (bool) – Also reject pixels below the negative of the SNR threshold (default: True).
max_iterations (int) – The maxumum number of iterations (note: the loop exits if the mask stops changing).
- Returns:
The SNR map
- Return type:
numpy array
- reborn.analysis.peaks.snr_mask(dat, mask, nin, ncent, nout, threshold=6, mask_negative=True, max_iterations=3)[source]¶
Mask out pixels above some chosen SNR threshold. The image is converted to a map of SNR using boxsnr. Additional iterations follow, in which pixels above threshold in the previous run are also masked in the annulus. This iterative procedure helps avoid contributions of peak signals to the Noise calculation.
- Parameters:
dat (numpy array) – Input data to calculate SNR from.
mask (numpy array) – Mask indicating bad pixels (zero means bad, one means ok)
nin (int) – See boxsnr function.
ncent (int) – See boxsnr function.
nout (int) – See boxsnr function.
threshold (float) – Reject pixels above this SNR.
mask_negative (bool) – Also reject pixels below the negative of the SNR threshold (default: True).
max_iterations (int) – The maxumum number of iterations (note: the loop exits if the mask stops changing).
- Returns:
The mask with pixels above the SNR threshold
- Return type:
numpy array
reborn.analysis.runstats module¶
Utilities for gathering statistics from data runs.
- class reborn.analysis.runstats.PixelHistogram(bin_min=None, bin_max=None, n_bins=None, n_pixels=None, zero_photon_peak=None, one_photon_peak=None, peak_width=None)[source]¶
Bases:
object
Creates an intensity histogram for each pixel in a PAD. For a PAD with N pixels in total, this class will produce an array of shape (M, N) assuming that you requested M bins in the histogram.
- Parameters:
bin_min (float) – The minimum value corresponding to histogram bin centers.
bin_max (float) – The maximum value corresponding to histogram bin centers.
n_bins (int) – The number of histogram bins.
n_pixels (int) – How many pixels there are in the detector.
zero_photon_peak (float) – Where you think the peak of the zero-photon signal should be.
one_photon_peak (float) – Where you think the peak of the one-photon signal should be.
peak_width (float) – Width of peaks for fitting to 2nd-order polynomial
- count = 0¶
Number of frames contributing to the histogram.
- bin_min = None¶
The minimum value corresponding to histogram bin centers.
- bin_max = None¶
The maximum value corresponding to histogram bin centers.
- n_bins = None¶
The number of histogram bins.
- concatentate(hist)[source]¶
Combines two histograms (e.g. if parallel processing and chunks need to be combined).
- get_histogram_normalized()[source]¶
Returns a normalized histogram - an
ndarray
of shape (M, N) where M is the number of pixels and N is the number of requested bins per pixel.
- get_histogram()[source]¶
Returns a copy of the histogram - an
ndarray
of shape (M, N) where M is the number of pixels and N is the number of requested bins per pixel.
- get_peaks_fast()[source]¶
Finds the local maxima in the neighborhood of estimated zero-photon and one-photon peak locations, and peak width estimate.
- get_average_peak_locations()[source]¶
Average all pixels in the entire detector, estimate the locations of the zero-photon and one-photon peaks. Assumes a very smooth histogram.
Returns np.nan values when there is one or zero peaks found (e.g. on a dark dataset there will not be a second peak!)
- get_refined_gain_and_offset(n_processes=1)[source]¶
Does a 2nd-order polynomial fit to the peak. First guess is the local maximum.
- correct_gain_and_offset(histogram=None, gain=None, offset=None)[source]¶
Apply gain and offset values to a histogram using interpolations.
- convert_to_q_histogram(n_q_bins=None, q_range=None, pad_geometry=None, beam=None, mask=None, normalized=False)[source]¶
Convert pixel histogram to q-bin histogram.
- Parameters:
n_q_bins (int) – = None,
q_range (tuple) – = None,
pad_geometry (
PADGeometry
) – = None,beam (
Beam
) – = None,
- Return type:
q_histogram (
ndarray
)
- class reborn.analysis.runstats.ParallelPADStats(framegetter=None, histogram_params=None, **kwargs)[source]¶
Bases:
ParallelAnalyzer
Gather PAD statistics for a dataset.
Given a
FrameGetter
subclass instance, fetch the mean intensity, mean squared intensity, minimum, and maximum intensities, and optionally a pixel-by-pixel intensity histogram. The function can run on multiple processors via the joblib library. Logfiles and checkpoint files are created.The return of this function is a dictionary with the following keys:
‘sum’: Sum of PAD intensities
‘dataset_id’: A unique identifier for the data set (e.g. ‘run0154’)
‘pad_geometry’: PADGeometryList
‘mask’: Bad pixel mask
‘n_frames’: Number of valid frames contributing to the statistics
‘sum’: Sum of PAD intensities
‘sum2’: Sum of squared PAD intensities
‘min’: Pixel-by-pixel minimum of PAD intensities
‘max’: Pixel-by-pixel maximum of PAD intensities
‘counts’: Sum of the PAD mask
‘beam’: Beam info
‘start’: Frame at which processing started (global framegetter index)
‘stop’: Frame at which processing stopped (global framegetter index)
‘step’: Step size between frames (helpful to speed up processing by sub-sampling)
‘wavelengths’: An array of wavelengths
‘histogram_params’: Dictionary with histogram parameters
‘histogram’: Pixel-by-pixel intensity histogram (MxN array, with M the number of pixels)
There is a corresponding view_padstats function to view the results in this dictionary.
padstats needs a configuration dictionary with the following contents:
‘log_file’: Base path/name for status logging. Set to None to skip logging.
‘checkpoint_file’: Base path/name for saving check points. Set to None to skip checkpoints.
‘checkpoint_interval’: How many frames between checkpoints.
‘message_prefix’: Prefix added to all logging messages. For example: “Run 35” might be helpful
‘debug’: Set to True for more logging messages.
‘reduce_from_checkpoints’: True by default, this indicates that data produced by multiple processors should be compiled by loading the checkpoints from disk. Without this, you might have memory problems. (The need for this is due to the joblib package; normally the reduce functions from MPI would be used to avoid hitting the disk.)
‘histogram_params’: If not None, triggers production of a pixel-by-pixel histogram. This is a dictionary with the following entries: dict(bin_min=-30, bin_max=100, n_bins=100, zero_photon_peak=0, one_photon_peak=30)
- Parameters:
framegetter (
FrameGetter
) – A FrameGetter subclass. If running in parallel, you should instead pass a dictionary with keys ‘framegetter’ (with reference to FrameGetter subclass, not an actual class instance) and ‘kwargs’ containing a dictionary of keyword arguments needed to create a class instance.histogram_params (dict)
start (int) – Which frame to start with.
stop (int) – Which frame to stop at.
step (int) – Step size between frames (default 1).
n_processes (int) – How many processes to run in parallel (if parallel=True).
**kwargs – Any additional key-word arguments you would like to pass to the base class. See: ..:py:class::~reborn.analysis.parallel.ParallelAnalyzer
Returns: dict
- add_frame(dat)[source]¶
User-defined method that does all actions associated with the addition of one
DataFrame
to the results. You should probably add a test to determine if any initializations are needed, e.g. in the event that empty arrays must be pre-allocated or counters need to be set to zero.
- to_dict()[source]¶
User-defined method that compiles all relevant data into a dictionary. This will be used to save the state of the analysis in checkpoint files, so be sure that all information that is needed to fast-forward the processing to an intermediate state is provided. You should probably include the config dictionary in this file so you know how the analysis was configured.
- reborn.analysis.runstats.save_padstats(stats: dict, filepath: str)[source]¶
Saves the results of the padstats function.
- reborn.analysis.runstats.load_padstats(filepath: str)[source]¶
Load the results of padstats from disk.
- Parameters:
filepath (str) – Path to the file you want to load.
Returns: dict
- reborn.analysis.runstats.padstats_framegetter(stats)[source]¶
Make a FrameGetter that can flip through the output of padstats.
- Parameters:
stats (dict) – Output of padstats.
Returns: ListFrameGetter
- reborn.analysis.runstats.view_padstats(stats, histogram=False, start=True, main=True, show_corrected_histogram=False, pad_geometry=None, beam=None, debug=0)[source]¶
View the output of padstats.
- Parameters:
stats (dict) – Output dictionary from padstats.
histogram (bool) – Show the histogram
start (bool) – Choose if the PADView app should be started (executed)
main (bool) – Choose if the window should be a Qt main window.
show_corrected_histogram (bool) – Undocumented.
pad_geometry (
PADGeometryList
) – PAD geometrybeam (
Beam
) – Beam parameters
reborn.analysis.saxs module¶
- reborn.analysis.saxs.debug_message(*args, caller=True, **kwargs)[source]¶
Standard debug message, which includes the function called.
- class reborn.analysis.saxs.QHistogram(bin_min=None, bin_max=None, n_bins=None, n_q_bins=None, pad_geometry=None, beam=None)[source]¶
Bases:
object
Creates an intensity histogram for each pixel in a PAD. For a PAD with N pixels in total, this class will produce an array of shape (M, N) assuming that you requested M bins in the histogram.
- Parameters:
- count = 0¶
- bin_min = None¶
- bin_max = None¶
- n_bins = None¶
- n_q_bins = None¶
- class reborn.analysis.saxs.RadialProfiler(mask=None, n_bins=None, q_range=None, pad_geometry=None, beam=None)[source]¶
Bases:
object
A class for creating radial profiles from image data. Standard profiles are computed using fortran code. Bin indices are cached for speed, provided that the
PADGeometry
andBeam
do not change. Arbitrary profiles can also be computed (slowly!) provided a function handle that operates on anndarray
.For a parallelized version see: ..:py:class::~reborn.analysis.saxs.RadialProfiler
- Parameters:
mask (
ndarray
) – Optional. The arrays will be multiplied by this mask, and the counts per radial bin will come from this (e.g. use values of 0 and 1 if you want a normal average, otherwise you get a weighted average).n_bins (int) – Number of radial bins you desire.
q_range (list-like) – The minimum and maximum of the centers of the q bins.
pad_geometry (
PADGeometryList
) – Optional. Will be used to generate q magnitudes. You must provide beam if you provide this.beam (
Beam
) – Optional, unless pad_geometry is provided. Wavelength and beam direction are needed in order to calculate q magnitudes.
- n_bins = None¶
- q_range = None¶
- q_edge_range = None¶
- bin_centers = None¶
- bin_edges = None¶
- bin_size = None¶
- set_mask(mask)[source]¶
Update the mask. Concatenate (if needed), copy, and make it un-writeable.
- Parameters:
mask (
ndarray
) – Mask.
- set_beam(beam)[source]¶
Update the
Beam
with a copy and clear out derived caches.- Parameters:
beam (
Beam
) – Beam properties.
- set_pad_geometry(geom)[source]¶
Update the
PADGeometryList
with a copy and clear out derived caches.- Parameters:
geom (
PADGeometryList
) – PAD geometry.
- property q_bin_centers¶
The centers of the q bins.
- make_plan(mask=None, n_bins=None, q_range=None, pad_geometry=None, beam=None)[source]¶
Set up the binning indices for the creation of radial profiles. Cache some useful data to speed things up later.
- Parameters:
mask (
ndarray
) – Optional. The arrays will be multiplied by this mask, and the counts per radial bin will come from this (e.g. use values of 0 and 1 if you want a normal average, otherwise you get a weighted average).n_bins (int) – Number of radial bins you desire.
q_range (tuple) – The minimum and maximum of the centers of the q bins.
pad_geometry (list of
PADGeometry
instances) – Optional. Will be used to generate q magnitudes. You must provide beam if you provide this.beam (
Beam
instance) – Optional, unless pad_geometry is provided. Wavelength and beam direction are needed in order to calculate q magnitudes.
- get_profile_statistic(data, mask=None, statistic=None)[source]¶
Calculate the radial profile using an arbitrary function.
- Parameters:
data (
ndarray
) – The intensity data from which the radial profile is formed.mask (
ndarray
) – Optional. A mask to indicate bad pixels. Zero is bad, one is good. If no mask is provided here, the mask configured withset_mask
will be used.statistic (function or list of functions) – Provide a function of your choice that runs on each radial bin.
Returns:
ndarray
- get_counts_profile(mask=None, quick=True)[source]¶
Calculate the radial profile of counts that fall in each bin. If the mask is not binary, then the “counts” are actually the sum of weights in each bin.
- get_sum_profile(data, mask=None, quick=True)[source]¶
Calculate the radial profile of summed intensities. This is divided by counts to get an average.
- Parameters:
Returns:
ndarray
- get_mean_profile(data, mask=None, quick=True)[source]¶
Calculate the radial profile of averaged intensities.
- Parameters:
Returns:
ndarray
- get_median_profile(data, mask=None)[source]¶
Calculate the radial profile of averaged intensities.
- Parameters:
Returns:
ndarray
- get_sdev_profile(data, mask=None, quick=True)[source]¶
Calculate the standard deviations of radial bin.
- Parameters:
Returns:
ndarray
- subtract_profile(data, mask=None, statistic=<function median>)[source]¶
Given some PAD data, subtract a radial profile (mean or median).
- Parameters:
data (
ndarray
) – The intensity data from which the radial profile is formed.mask (
ndarray
) – Optional. A mask to indicate bad pixels. Zero is bad, one is good. If no mask is provided here, the mask configured withset_mask
will be used.statistic (function) – Provide a function of your choice that runs on each radial bin.
Returns:
- divide_profile(data, mask=None, statistic=<function median>)[source]¶
Given some PAD data, subtract a radial profile (mean or median).
- Parameters:
data (
ndarray
) – The intensity data from which the radial profile is formed.mask (
ndarray
) – Optional. A mask to indicate bad pixels. Zero is bad, one is good. If no mask is provided here, the mask configured withset_mask
will be used.statistic (function) – Provide a function of your choice that runs on each radial bin.
- Returns:
- subtract_median_profile(data, mask=None)[source]¶
Given some PAD data, calculate the radial median and subtract it from the data.
- quickstats(data, weights=None)[source]¶
Use the faster fortran functions to get the mean and standard deviations. These are weighted by the mask, and you are allowed to use non-binary values in the mask. The proper weighting is most likely equal to the product of the pixel solid angle, the polarization factor, and the binary mask.
- class reborn.analysis.saxs.ParallelRadialProfiler(framegetter=None, n_q_bins=1000, q_range=None, pad_geometry=None, beam=None, mask=None, include_median=False, **kwargs)[source]¶
Bases:
ParallelAnalyzer
Parallelized class for creating radial profiles from x-ray diffraction data.
Parallel analyzer that produces scattering (“radial”) profiles from diffraction patterns. See the
to_dict
method for details of what the output is.Standard profiles are computed using fortran code. Bin indices are cached for speed, provided that the
PADGeometry
andBeam
do not change.- Parameters:
framegetter (
FrameGetter
) – The FrameGetter that serves the data for analysis.n_bins (int) – Number of radial bins (default is 1000).
q_range (list-like) – The minimum and maximum of the centers of the q bins.
pad_geometry (
PADGeometryList
) – Detector geometry, used to generate q magnitudes. If None, then automatically retrieved from raw data.beam (
Beam
) – X-ray beam. Wavelength and beam direction required to calculate q magnitudes. If None, then automatically retrieved from raw data.mask (List or
ndarray
) – Optional (default is no masked pixels). Data will be multiplied by this mask, and the counts per radial bin will come from this (e.g. use values of 0 and 1 if you want a normal average, otherwise you get a weighted average).include_median (bool) – Set to True to include median profile (default is False).
start (int) – Which frame to start with.
stop (int) – Which frame to stop at.
step (int) – Step size between frames (default 1).
n_processes (int) – How many processes to run in parallel (if parallel=True).
**kwargs – Any key-word arguments you would like to pass to the base class. See: ..:py:class::~reborn.analysis.parallel.ParallelAnalyzer
- include_median = False¶
- profiler = None¶
- radials = None¶
- add_frame(dat: DataFrame)[source]¶
User-defined method that does all actions associated with the addition of one
DataFrame
to the results. You should probably add a test to determine if any initializations are needed, e.g. in the event that empty arrays must be pre-allocated or counters need to be set to zero.
- concatenate(stats: dict)[source]¶
User-defined method that combines an existing instance of ParallelAnalyzer with the results of another ParallelAnalyzer that has operated on a different chunk of DataFrames.
- to_dict()[source]¶
Create a dictionary of the results.
- Returns:
mean (
ndarray
) – Mean of unmasked intensitiessdev (
ndarray
) – Standard deviation of unmasked intensitiesmedian (
ndarray
) – Median of unmasked intensities (only if requested; this is slow)sum (
ndarray
) – Sum of unmasked intensitiessum2 (
ndarray
) – Sum of squared unmasked intensities- wsum (
ndarray
) – Weighted sum (if no weights are provided, then this is the number of unmasked pixels in the q bin)
- wsum (
n_frames (int) – Number of frames analyzed.
initial_frame (int) – First frame from which geometry and beam data are extracted.
n_q_bins (int) – Number of q bins in radial profiles.
q_bins (
ndarray
) – The centers of the q bins.experiment_id (str) – Identifier for experiment being analyzed.
run_id (str) – Identifier for run being analyzed
pad_geometry (
PADGeometryList
) – Detector geometry used to set up RadialProfiler.beam (
Beam
) – X-ray beam used to set up RadialProfiler.mask (
ndarray
) – Mask used to set up RadialProfiler.
- Return type:
(dict)
reborn.analysis.svd module¶
- reborn.analysis.svd.addblock_svd_update(U, S, V, A, force_orth=False)[source]¶
This is a nearly direct translation of the Matlab code found here: https://pcc.byu.edu/scripts/addblock_svd_update.m
The mathematics is discussed here: Brand, M. Fast low-rank modifications of the thin singular value decomposition. Linear Algebra and its Applications 415, 20-30 (2006).
Note that there are some differences between Numpy and Matlab implementations of SVD. So far, I noticed that the V matrix in the X = USV decomposition is transposed when comparing Matlab to Numpy. Also, the ordering of the diagonal S matrix is different; Numpy is more economical since only the diagonals are specified as a 1D matrix. More importantly, the actual diagonal entries appear to be reversed in their ordering by comparison. I don’t know how they differ in the event of degenerate values. There are likely some “gotcha’s” that I am not yet aware of.
Original documenation by D. Wingate 8/17/2007:
=================================================================
Given the SVD of
X = U*S*V’
update it to be the SVD of
[X A] = Up*Sp*Vp’
that is, add new columns (ie, data points).
I have found that it is faster to add several (say, 200) data points at a time, rather than updating it incrementally with individual data points (for 500 dimensional vectors, the speedup was roughly 25x). However, in the rank-one case there is structure that I have not exploited, so that may still be faster than a block method.
The subspace rotations involved may not preserve orthogonality due to numerical round-off errors. To compensate, you can set the “force_orth” flag, which will force orthogonality via a QR plus another SVD. In a long loop, you may want to force orthogonality every so often.
See Matthew Brand, “Fast low-rank modifications of the thin singular value decomposition”.
Wingate 8/17/2007
=====================================================================
- Parameters:
U (numpy array) – Left singular vectors of shape p x q
S (numpy array) – Diagonal matrix (shape q – only the q diagonal elements specified)
V (numpy array) – Right singular vectors of shape q x n
A (numpy array) – The matrix to be appended to X = USV (shape p x n)
force_orth – Force orthogonality
- Returns:
Up, Sp, Vp
- Return type:
numpy arrays