Dist Regression Tests¶
%load_ext autoreload
%autoreload 2
%config InlineBackend.figure_format = 'retina'
from distgen.physical_constants import unit_registry
import numpy as np
from matplotlib import pyplot as plt
Random Number Generation¶
To sample various distributions requires generating random numbers and supplying them to the $CDF^{-1}$ functions for each corresponding distribution. Currently, this is handled using
distgen.dist.random_generator(shape, sequence, **params)
.
Here shape = (n_dimension, n_particle)
determines the shape of the random numbers returned. The keyword 'sequence' can be used to set the sequence to Hammerlsey for quasi-random numbers.
The difference is shown below:
from distgen.dist import random_generator
shape = (2, 100)
p1 = random_generator(shape, sequence="hammersley")
p2 = random_generator(shape, "pseudo")
fig, ax = plt.subplots(1, 2, constrained_layout=True)
ax[0].plot(p1[0, :], p1[1, :], ".")
ax[0].set(xlabel="rx", ylabel="ry", title="hammersley")
ax[1].plot(p2[0, :], p2[1, :], "*")
ax[1].set(xlabel="rx", ylabel="ry", title="random.rand");
1D Distributions¶
Distgen supports several one dimensional distribution types.
Uniform 1D¶
The uniform distirbuition is defined by a probability distribution function:
$\rho(x) = \frac{1}{b-a}$ for $a\leq x\leq b$ and zero elsewhere.
The corresponding CDF is
$P(x) = \frac{x-a}{b-a}$ for $a\leq x\leq b$ and zero elsewhere.
The first and second moments of this distribution are:
$\langle x \rangle = \frac{1}{2}(a+b)$ and $\sigma_x = \frac{b-a}{\sqrt{12}}$
from distgen.dist import Uniform
var = "x"
verbose = 1
params = {"min_x": 2 * unit_registry("mm"), "max_x": 4 * unit_registry("mm")}
uniform = Uniform(var, verbose=verbose, **params)
uniform.plot_pdf()
uniform.plot_cdf()
uniform.test_sampling()
uniform min_x = 2 mm, max_x = 4 mm, avg_x = 3 mm, sigma_x: 0.57735 mm
Linear¶
The linear distirbuition is defined by a probability distribution function:
$\rho(x) \propto \frac{\rho_b-\rho_a}{b-a}(x-a) + \rho_a$ for $a\leq x\leq b$ and zero elsewhere.
The corresponding CDF is
$P(x) \propto \frac{1}{2}\frac{\rho_b-\rho_a}{b-a}(x-a)$ for $a\leq x\leq b$ and zero elsewhere.
The first and second moments of this distribution are:
$\langle x \rangle \propto \frac{\rho_b-\rho_a}{b-a}\left(\frac{1}{3}(b^3-a^3) -a(b^2-a^2)\right) + \frac{1}{2}\rho_a(b^2-a^2)$.
from distgen.dist import Linear
var = "x"
verbose = 1
params = {
"min_x": -2 * unit_registry("mm"),
"max_x": 1 * unit_registry("mm"),
"slope_fraction": -1,
}
norm = Linear(var, verbose=verbose, **params)
norm.plot_pdf()
norm.plot_cdf()
norm.test_sampling()
Linear
Normal Distribution (including truncation)¶
The general form of a normal distribution PDF with truncation is given by
$\rho(x) = \frac{1}{\sigma}\frac{\phi\left(\frac{x-\mu}{\sigma}\right)}{\Phi\left(\frac{b-\mu}{\sigma}\right)-\Phi\left(\frac{a-\mu}{\sigma}\right)}$.
In this expression $\phi(\xi) = \frac{1}{\sqrt{2\pi}}e^{-\frac{1}{2}\xi^2}$ is the canonical normal distribution, $\Phi(\xi) = \frac{1}{2}\left[1 + \text{erf}\left(\frac{\xi}{\sqrt{2}}\right) \right]$ is the canonical normal CDF, and $a=-N_{\text{cutoff}}\cdot\sigma$ and $b=-N_{\text{cutoff}}\cdot\sigma$ are the left and right truncation points. The CDF if given by
$P(x) = \frac{\Phi\left(\frac{x-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)}{\Phi\left(\frac{b-\mu}{\sigma}\right)-\Phi\left(\frac{a-\mu}{\sigma}\right)}$.
Defining $\alpha = \frac{a-\mu}{\sigma}$ and $\beta = \frac{b-\mu}{\sigma}$, the first and second moments of the distribution are:
$\langle x\rangle = \mu + \frac{\phi\left(\alpha\right) - \phi\left(\beta\right)}{\Phi\left(\beta\right)-\Phi\left(\alpha\right)}\sigma$ and $\sigma_x = \sigma \left\{1 + \frac{\alpha\phi\left(\alpha\right) - \beta\phi(\beta) }{\Phi(\beta) - \Phi(\alpha)} - \left(\frac{\phi\left(\alpha\right) - \phi(\beta)}{\Phi(\beta) - \Phi(\alpha)}\right)^{2} \right\}^{1/2} $.
When using this distribution, if the $N_{\text{cutoff}}$ is not set then the distribution reduces to an infinite range normal distribution, as first shown below:
from distgen.dist import Norm
var = "x"
verbose = 1
params = {"sigma_x": 2 * unit_registry("mm"), "avg_x": -1 * unit_registry("mm")}
norm = Norm(var, verbose=verbose, **params)
norm.plot_pdf()
norm.plot_cdf()
norm.test_sampling()
Gaussian avg_x = -1 mm, sigma_x = 2.000 mm
Below the $N_{\text{cutoff}}$ parameter is set to cut the distribution symmetrically:
from distgen.dist import Norm
var = "x"
verbose = 1
params = {
"sigma_x": 2 * unit_registry("mm"),
"avg_x": 0 * unit_registry("mm"),
"n_sigma_cutoff": 2,
}
norm = Norm(var, verbose=verbose, **params)
norm.plot_pdf()
norm.plot_cdf()
norm.test_sampling()
Gaussian avg_x = 0 mm, sigma_x = 2.000 mm Left n_sigma_cutoff = 2, Right n_sigma_cutoff = -2
The distribution can be truncated asymmetrically using the $N_{\text{cutoff},R}$ and $N_{\text{cutoff},L}$ parameters, as shown below. Note in this case, it is only required that $N_{\text{cutoff},L} < N_{\text{cutoff},R}$, allowing for completley arbtitray location of the truncation points. This requires a minus sign for the cut off parameters for truncation values less than zero.
from distgen.dist import Norm
params = {
"sigma_x": 2 * unit_registry("mm"),
"avg_x": 0 * unit_registry("mm"),
"n_sigma_cutoff_left": -1.5,
"n_sigma_cutoff_right": 1,
}
norm = Norm("x", verbose=1, **params)
norm.plot_pdf()
norm.plot_cdf()
norm.test_sampling()
Gaussian avg_x = 0 mm, sigma_x = 2.000 mm Left n_sigma_cutoff = 1, Right n_sigma_cutoff = -1.5
Super Gaussian¶
In additional to the regular Gaussian function, it is also possible to sample a super-Gaussian distribution defined by
$\rho(x; \lambda, p) = \frac{1}{2\sqrt{2}\Gamma\left(1+\frac{1}{2p}\right)\lambda } \exp\left[-\left(\frac{(x-\mu)^2 }{2\lambda^2}\right)^{p}\right]$
Here $\sigma_1$ is the length scale and $p$ is the power of the super-Gaussian. Note when $p=1$ reduces to a Normal distirbution, in which case $\sigma_x=\lambda$. As $p\rightarrow\infty$ the distribution reduces to a flat-top (uniform). The full range of powers is given by $p\in\left(0,\infty\right]$.
The first and second moments of the distribution are given by:
$\langle x\rangle = \mu$, and $\sigma_x = \left(\frac{2\Gamma\left(1+\frac{3}{2p}\right)}{3\Gamma\left(1+\frac{1}{2p}\right)}\right)^{1/2}\lambda$.
Often, it is convenient to scan the distribution from the uniform limit to the Gaussian limit. To do some, the input $p$ can be parameterized by $\alpha\in[0,1]$ where $p = 1/\alpha$. Here $\alpha=0$ corresponds to a flat-top (uniform) and $\alpha=1$ corresponds to a Gaussian. Examples of both types of usage are shown below.
from distgen.dist import SuperGaussian
ps = [0.5, 1, 5, float("Inf")]
alphas = [0, 0.25, 0.5, 1]
fig, (ax1, ax2) = plt.subplots(
1, 2, sharex="col", figsize=(12, 4), constrained_layout=True
)
plegs = ["p = " + str(p) for p in ps]
alegs = ["$\\alpha$ = " + str(a) for a in alphas]
for ii, p in enumerate(ps):
pparams = {
"lambda": 2 * unit_registry("mm"),
"p": p * unit_registry("dimensionless"),
}
supG = SuperGaussian("x", verbose=0, **pparams)
x = supG.get_x_pts(1000)
rho = supG.pdf(x)
ax1.plot(x, rho)
a = alphas[ii]
aparams = {
"lambda": 2 * unit_registry("mm"),
"alpha": a * unit_registry("dimensionless"),
}
x = np.linspace(-3 * aparams["lambda"], 3 * aparams["lambda"], 100)
supG = SuperGaussian("x", verbose=0, **aparams)
rho = supG.pdf(x)
ax2.plot(x, rho)
ax1.set_xlabel("x (mm)")
ax2.set_xlabel("x (mm)")
ax1.set_ylabel("pdf (1/mm)")
ax2.set_ylabel("pdf (1/mm)")
ax1.legend(plegs)
ax2.legend(alegs);
To set the length scale of the distribution, the user must either supply 'sigma_[var]' or 'lambda'. See usage below:
params = {
"sigma_x": 2 * unit_registry("mm"),
#'alpha': 0.75*unit_registry('dimensionless'),
"alpha": 0.003 * unit_registry("dimensionless"),
"avg_x": 0.25 * unit_registry("mm"),
}
supG = SuperGaussian("x", verbose=1, **params)
supG.plot_pdf()
supG.plot_cdf()
supG.test_sampling()
Super Gaussian sigma_x = 2 mm, power = 333.333
1D PDF from a file¶
Disgten supports importing a 1D PDF saved in column form in. The input form of the file should have space separated headers such as $x$ and $Px$, with corresponding column data below it. The PDF is normalized numerically using the numpy.trapz numerical integration routine. The CDF is computed using the scipy.cumtrapz cumulative numerical intgration routine.
The following example shows a gaussian PDF with cuts added to it.
from distgen.dist import File1d
var = "t"
verbose = 1
params = {"file": "../examples/data/cutgauss.1d.txt", "units": "ps"}
file1d = File1d(var, verbose=verbose, **params)
file1d.plot_pdf()
file1d.plot_cdf()
file1d.test_sampling()
t-distribution file: "../examples/data/cutgauss.1d.txt"
Laser pulse stacking¶
from distgen.dist import TemporalLaserPulseStacking
verbose = 1
params = {
"crystal_length_1": 15.096 * unit_registry("mm"),
"crystal_length_2": 7.548 * unit_registry("mm"),
"crystal_length_3": 3.774 * unit_registry("mm"),
"crystal_length_4": 1.887 * unit_registry("mm"),
"crystal_angle_1": 0.6 * unit_registry("deg"),
"crystal_angle_2": 1.8 * unit_registry("deg"),
"crystal_angle_3": -0.9 * unit_registry("deg"),
"crystal_angle_4": -0.5 * unit_registry("deg"),
}
laser_pulse = TemporalLaserPulseStacking(verbose=verbose, **params)
laser_pulse.plot_pdf()
laser_pulse.plot_cdf()
laser_pulse.test_sampling()
crystal temporal laser shaping crystal 1 length = 15.096 mm, angle = 0.6 deg crystal 2 length = 7.548 mm, angle = 1.8 deg crystal 3 length = 3.774 mm, angle = -0.9 deg crystal 4 length = 1.887 mm, angle = -0.5 deg Pulses propagated: min t = -23.9053 ps, max t = 23.9053 ps
Tukey 1D¶
from distgen.dist import Tukey
var = "y"
verbose = 1
params = {
"length": 2 * unit_registry("mm"),
"ratio": 0.75 * unit_registry("dimensionless"),
}
tukey = Tukey(var, verbose=1, **params)
tukey.plot_pdf()
tukey.plot_cdf()
tukey.test_sampling()
Tukey length = 2 mm, ratio = 0.75
Superposition 1D¶
This distribution allows the user to superimpose an arbitrary number of 1D PDFs. The general form is:
$\rho(x) = \sum_i w_i \frac{\rho_i(x)}{\max(\rho_i)}$.
Here the $w_i$ are user specified weights. If no weight is specified for a given $rho_i$, then the weight will default to one.
from distgen.dist import Superposition
params = {
"weights": {"d1": 1, "d2": 2},
"dists": {
"d1": {
"avg_z": -2 * unit_registry("mm"),
"sigma_z": 1 * unit_registry("mm"),
"type": "gaussian",
},
"d2": {
"avg_z": +2 * unit_registry("mm"),
"sigma_z": 1 * unit_registry("mm"),
"type": "gaussian",
},
},
}
sup = Superposition("z", 1, **params)
sup.plot_pdf()
sup.plot_cdf()
sup.test_sampling()
superpostion 1. distribution name: d1, type: Gaussian avg_z = -2 mm, sigma_z = 1.000 mm 2. distribution name: d2, type: Gaussian avg_z = 2 mm, sigma_z = 1.000 mm
Product Dist 1D¶
This distribution allows the user to multiply an arbitrary number of 1D PDFs. The general form is:
$\rho(x) = \prod_i \rho_i(x)$.
from distgen.dist import Product
params = {
"dists": {
"d1": {
"min_z": -3 * unit_registry("mm"),
"max_z": 3 * unit_registry("mm"),
"type": "uniform",
},
"d2": {
"avg_z": +0 * unit_registry("mm"),
"sigma_z": 3 * unit_registry("mm"),
"type": "gaussian",
},
}
}
pro = Product("z", 1, **params)
pro.plot_pdf()
pro.plot_cdf()
pro.test_sampling()
distribution name: d1 uniform min_z = -3 mm, max_z = 3 mm, avg_z = 0 mm, sigma_z: 1.73205 mm distribution name: d2 Gaussian avg_z = 0 mm, sigma_z = 3.000 mm
Deformable¶
from distgen.dist import Deformable
params = {
"alpha": 0.750 * unit_registry(""),
"slope_fraction": -1,
"sigma_x": 2 * unit_registry("mm"),
"avg_x": 0 * unit_registry("mm"),
}
deform = Deformable("x", verbose=1, **params)
deform.plot_pdf()
deform.plot_cdf()
deform.test_sampling()
Super Gaussian sigma_x = 2 mm, power = 1.33333 Linear
Interpolation1D¶
from distgen.dist import Interpolation1d
params = {
"avg_t": 0.0 * unit_registry("ps"),
"sigma_t": 2 * unit_registry("ps"),
"Pt": [1, 2.2, 3.3, 4, 3, 2, 6, 7, 7.5, 8, 7.5, 1],
"method": "spline",
"n_pts": 100000,
}
i1d = Interpolation1d("t", verbose=1, **params)
i1d.plot_pdf()
i1d.plot_cdf()
i1d.test_sampling()
Radial Distributions¶
from distgen.dist import UniformRad
verbose = 1
params = {"min_r": 1 * unit_registry("mm"), "max_r": 2 * unit_registry("mm")}
urad = UniformRad(verbose=1, **params)
urad.plot_pdf()
urad.plot_cdf()
urad.test_sampling()
radial uniform min_r = 1 mm, max_r = 2 mm
Radial Linear¶
from distgen.dist import LinearRad
verbose = 1
params = {
"min_r": 0 * unit_registry("mm"),
"max_r": 2 * unit_registry("mm"),
"slope_fraction": 1,
}
lin = LinearRad(verbose=verbose, **params)
lin.plot_pdf()
lin.plot_cdf()
lin.test_sampling()
LinearRad
Radial Normal Distribution (with truncation)¶
The radial normal distribution including truncation(s) has a probability function given by
$\rho_r(r) = \frac{1}{\sigma^2}\frac{\phi(r/\sigma)}{\phi\left(\frac{r_L}{\sigma}\right)-\phi\left(\frac{r_R}{\sigma}\right)} $ for $0 \leq r_L \leq r \leq r_R$ and zero everywhere else.
In this expresion $\phi(\xi) = \frac{1}{2\pi}\exp\left(-\xi^2/2\right)$ is the canonical raidial normal distirbution (no truncation), and the scale parameter $\sigma$ follows from the product of two normal distributions in $x$ and $y$ when $\sigma=\sigma_x=\sigma_y$. The corresponding CDF is given by
$P(r)= \frac{\phi\left(\frac{r_L}{\sigma}\right)-\phi\left(\frac{r}{\sigma}\right)}{\phi\left(\frac{r_L}{\sigma}\right)-\phi\left(\frac{r_R}{\sigma}\right)} $ for $0 \leq r_L \leq r$.
The corresponding first and second moments are:
$\langle r\rangle = \frac{\frac{r_L}{\sigma}\phi\left(\frac{r_L}{\sigma}\right) -\frac{r_R}{\sigma}\phi\left(\frac{r_R}{\sigma}\right) +\frac{1}{2\sqrt{2\pi}}\left( \text{erf}\left(\frac{r_R}{\sigma\sqrt{2}}\right) - \text{erf}\left(\frac{r_L}{\sigma\sqrt{2}}\right) \right) } {\phi\left(\frac{r_L}{\sigma}\right)-\phi\left(\frac{r_R}{\sigma}\right)}$,
$r_{rms} = \sqrt{ 2\sigma^2 + r_L^2 - \frac{(r_R^2-r_L^2)\phi(r_R/\sigma)}{\phi\left(\frac{r_L}{\sigma}\right)-\phi\left(\frac{r_R}{\sigma}\right)} }$.
Note that in the limits $r_L\rightarrow 0$ and $r_R -> \infty$ the above expressions reduce to the underlying radial normal distribution:
$\rho_r(r)\rightarrow \frac{\phi\left(\frac{r}{\sigma}\right)}{\sigma^2}$, $P(r)\rightarrow 1 - \phi\left(\frac{r}{\sigma}\right)$, $\langle r\rangle\rightarrow \sqrt{\frac{\pi}{2}}\sigma$, and $r_{rms}\rightarrow \sqrt{2}\sigma$. This limiting case is shown first below.
from distgen.dist import NormRad
verbose = 1
params = {"sigma_xy": 1 * unit_registry("mm")}
nrad = NormRad(verbose=1, **params)
nrad.plot_pdf()
nrad.plot_cdf()
nrad.test_sampling()
radial Gaussian
For laser scientists it can be convenient to to work with a pinhole radius and a fraction of the laser intensity to clip a transverse normal laser mode at. In this case the user can supply a truncation radius ($=r_R$) and a truncation fraction $f = \exp\left(-\frac{r_R^2}{2\sigma}\right)$ from which distgen determines the underlying $\sigma$. The example below demonstrates this usage:
from distgen.dist import NormRad
verbose = 1
params = {
"truncation_radius": 1 * unit_registry("mm"),
"truncation_fraction": 0.5 * unit_registry("dimensionless"),
}
nrad = NormRad(verbose=1, **params)
nrad.plot_pdf()
nrad.plot_cdf()
nrad.test_sampling()
radial Gaussian
from distgen.dist import NormRad
verbose = 1
params = {"sigma_xy": 2 * unit_registry("mm"), "n_sigma_cutoff": 1}
nrad = NormRad(verbose=1, **params)
nrad.plot_pdf()
nrad.plot_cdf()
nrad.test_sampling()
radial Gaussian
Radial Tukey¶
from distgen.dist import TukeyRad
verbose = 1
params = {
"length": 1 * unit_registry("mm"),
"ratio": 0.75 * unit_registry("dimensionless"),
}
rtukey = TukeyRad(verbose=1, **params)
rtukey.plot_pdf()
rtukey.plot_cdf()
rtukey.test_sampling()
TukeyRad legnth = 1.000 mm, ratio = 0.750
Radial Super Gaussian¶
This implements a radial version of the Super Gaussian function discussed above. Here the radial function takes the form:
$2\pi\rho(r;\lambda,p) = \frac{1}{\Gamma\left(1+\frac{1}{p}\right)\lambda^2} \exp\left[-\left(\frac{r^2}{2\lambda^2}\right)^p\right]$.
The first and (rms) second moment of the distribution are given by:
$\langle r\rangle = \frac{2\sqrt{2}}{3}\frac{\Gamma\left(1+\frac{3}{2p}\right)}{\Gamma\left(1+\frac{1}{p}\right)}\lambda$,
$r_{\text{rms}} = \sqrt{\frac{\Gamma\left(1+\frac{2}{p}\right)}{\Gamma\left(1+\frac{1}{p}\right)}}\lambda$.
from distgen.dist import SuperGaussianRad
verbose = 1
params = {
"sigma_xy": 1 * unit_registry("mm"),
"alpha": 0.0 * unit_registry("dimensionless"),
}
supG = SuperGaussianRad(verbose=1, **params)
supG.plot_pdf()
supG.plot_cdf()
supG.test_sampling()
SuperGaussianRad lambda = 1.41421 mm, power = INF
Radial File Distribution¶
from distgen.dist import RadFile
params = {"file": "../examples/data/cutgauss.rad.txt", "units": "mm"}
rfd = RadFile(verbose=1, **params)
rfd.plot_pdf()
rfd.plot_cdf()
rfd.test_sampling()
radial file r-dist file: "../examples/data/cutgauss.rad.txt"
Deformable¶
from distgen.dist import DeformableRad
params = {
"alpha": 0.05 * unit_registry(""),
"slope_fraction": -1,
"sigma_xy": 2 * unit_registry("mm"),
}
deform = DeformableRad(verbose=1, **params)
deform.plot_pdf()
deform.plot_cdf()
deform.test_sampling()
SuperGaussianRad lambda = 2.86117 mm, power = 20 LinearRad
Angular Distributions (TODO)¶
Angular distributions define one dimensional probability functions for the cylindrical variable $\theta$.