Continuous Statistical Distributions

Overview

All distributions have location (:math:`L`) and scale (:math:`S`) parameters along with any shape parameters needed; the names of the shape parameters vary. The standard form of each distribution is given with :math:`L=0` and :math:`S=1`. The nonstandard forms can be obtained for the various functions using the transformations in the following table (note that :math:`U` is a standard uniform random variate).

======================================  =============================================================================  =========================================================================================================================================
Function Name                           Standard Function                                                              Transformation
======================================  =============================================================================  =========================================================================================================================================
Cumulative Distribution Function (CDF)  :math:`F\left(x\right)`                                                        :math:`F\left(x;L,S\right)=F\left(\frac{\left(x-L\right)}{S}\right)`
Probability Density Function (PDF)      :math:`f\left(x\right)=F^{\prime}\left(x\right)`                               :math:`f\left(x;L,S\right)=\frac{1}{S}f\left(\frac{\left(x-L\right)}{S}\right)`
Percent Point Function (PPF)            :math:`G\left(q\right)=F^{-1}\left(q\right)`                                   :math:`G\left(q;L,S\right)=L+SG\left(q\right)`
Probability Sparsity Function (PSF)     :math:`g\left(q\right)=G^{\prime}\left(q\right)`                               :math:`g\left(q;L,S\right)=Sg\left(q\right)`
Hazard Function (HF)                    :math:`h_{a}\left(x\right)=\frac{f\left(x\right)}{1-F\left(x\right)}`          :math:`h_{a}\left(x;L,S\right)=\frac{1}{S}h_{a}\left(\frac{\left(x-L\right)}{S}\right)`
Cumulative Hazard Function (CHF)        :math:`H_{a}\left(x\right)=` :math:`\log\frac{1}{1-F\left(x\right)}`           :math:`H_{a}\left(x;L,S\right)=H_{a}\left(\frac{\left(x-L\right)}{S}\right)`
Survival Function (SF)                  :math:`S\left(x\right)=1-F\left(x\right)`                                      :math:`S\left(x;L,S\right)=S\left(\frac{\left(x-L\right)}{S}\right)`
Inverse Survival Function (ISF)         :math:`Z\left(\alpha\right)=S^{-1}\left(\alpha\right)=G\left(1-\alpha\right)`  :math:`Z\left(\alpha;L,S\right)=L+SZ\left(\alpha\right)`
Moment Generating Function (MGF)        :math:`M_{Y}\left(t\right)=E\left[e^{Yt}\right]`                               :math:`M_{X}\left(t\right)=e^{Lt}M_{Y}\left(St\right)`
Random Variates                         :math:`Y=G\left(U\right)`                                                      :math:`X=L+SY`
(Differential) Entropy                  :math:`h\left[Y\right]=-\int f\left(y\right)\log f\left(y\right)dy`            :math:`h\left[X\right]=h\left[Y\right]+\log S`
(Non-central) Moments                   :math:`\mu_{n}^{\prime}=E\left[Y^{n}\right]`                                   :math:`E\left[X^{n}\right]=L^{n}\sum_{k=0}^{n}\left(\begin{array}{c} n\\ k\end{array}\right)\left(\frac{S}{L}\right)^{k}\mu_{k}^{\prime}`
Central Moments                         :math:`\mu_{n}=E\left[\left(Y-\mu\right)^{n}\right]`                           :math:`E\left[\left(X-\mu_{X}\right)^{n}\right]=S^{n}\mu_{n}`
mean (mode, median), var                :math:`\mu,\,\mu_{2}`                                                          :math:`L+S\mu,\, S^{2}\mu_{2}`
skewness                                :math:`\gamma_{1}=\frac{\mu_{3}}{\left(\mu_{2}\right)^{3/2}}`                  :math:`\gamma_{1}`
kurtosis                                :math:`\gamma_{2}=\frac{\mu_{4}}{\left(\mu_{2}\right)^{2}}-3`                  :math:`\gamma_{2}`
======================================  =============================================================================  =========================================================================================================================================
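The transformations in the table can be checked numerically. The sketch below does so with ``scipy.stats``; the gamma distribution and the values of ``a``, ``L`` and ``S`` are arbitrary choices made for illustration and are not taken from the text.

```python
import numpy as np
from scipy import stats

# Illustrative values (assumptions): gamma shape a, location L, scale S.
a, L, S = 2.5, 1.0, 3.0
std = stats.gamma(a)                      # standard form: L = 0, S = 1
shifted = stats.gamma(a, loc=L, scale=S)  # nonstandard form

x = np.linspace(1.5, 20.0, 5)   # points inside the support (x > L)
q = np.linspace(0.1, 0.9, 5)    # probabilities
y = (x - L) / S                 # standardized argument

# Each entry compares one row of the table with the library's own result.
checks = {
    "cdf": np.allclose(shifted.cdf(x), std.cdf(y)),
    "pdf": np.allclose(shifted.pdf(x), std.pdf(y) / S),
    "ppf": np.allclose(shifted.ppf(q), L + S * std.ppf(q)),
    "sf": np.allclose(shifted.sf(x), std.sf(y)),
    "isf": np.allclose(shifted.isf(q), L + S * std.isf(q)),
    "entropy": np.isclose(shifted.entropy(), std.entropy() + np.log(S)),
    "mean": np.isclose(shifted.mean(), L + S * std.mean()),
    "var": np.isclose(shifted.var(), S**2 * std.var()),
    "skew, kurtosis": np.allclose(shifted.stats(moments="sk"),
                                  std.stats(moments="sk")),
}
for name, ok in checks.items():
    print(f"{name:15s} {ok}")
```

Skewness and kurtosis are unchanged by location and scale, which is why the last check compares them directly.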

Moments

Non-central moments are defined using the PDF:

.. math::

    \mu_{n}^{\prime}=\int_{-\infty}^{\infty}x^{n}f\left(x\right)dx.

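Note that these can always be computed using the PPF. Substitute :math:`x=G\left(q\right)` in the defining integral of :math:`\mu_{n}^{\prime}` to get

.. math::

    \mu_{n}^{\prime}=\int_{0}^{1}G^{n}\left(q\right)dq,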

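which may be easier to compute numerically. Note that :math:`q=F\left(x\right)` so that :math:`dq=f\left(x\right)dx`. Central moments are computed similarly, with :math:`\mu=\mu_{1}^{\prime}`:

.. math::

    \begin{aligned}
    \mu_{n} &= \int_{-\infty}^{\infty}\left(x-\mu\right)^{n}f\left(x\right)dx \\
            &= \int_{0}^{1}\left(G\left(q\right)-\mu\right)^{n}dq \\
            &= \sum_{k=0}^{n}\left(\begin{array}{c} n\\ k\end{array}\right)\left(-\mu\right)^{k}\mu_{n-k}^{\prime}.
    \end{aligned}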

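In particular,

.. math::

    \begin{aligned}
    \mu_{3} &= \mu_{3}^{\prime}-3\mu\mu_{2}^{\prime}+2\mu^{3} \\
            &= \mu_{3}^{\prime}-3\mu\mu_{2}-\mu^{3}, \\
    \mu_{4} &= \mu_{4}^{\prime}-4\mu\mu_{3}^{\prime}+6\mu^{2}\mu_{2}^{\prime}-3\mu^{4} \\
            &= \mu_{4}^{\prime}-4\mu\mu_{3}-6\mu^{2}\mu_{2}-\mu^{4}.
    \end{aligned}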

Skewness is defined as

.. math::

    \gamma_{1}=\frac{\mu_{3}}{\left(\mu_{2}\right)^{3/2}},

while (Fisher) kurtosis is

.. math::

    \gamma_{2}=\frac{\mu_{4}}{\left(\mu_{2}\right)^{2}}-3,

so that a normal distribution has a kurtosis of zero.
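The PPF form of the moment integral lends itself to numerical quadrature. The following is a minimal sketch, assuming a beta distribution with shape parameters 2 and 5 (an arbitrary illustrative choice) and using ``scipy.integrate.quad``; the helper ``noncentral_moment`` is a hypothetical name introduced here. Central moments, skewness and Fisher kurtosis are then built from the non-central moments and compared with the values reported by ``scipy.stats``.

```python
import numpy as np
from scipy import integrate, stats

# Illustrative distribution (assumption): beta with shapes 2 and 5.
dist = stats.beta(2.0, 5.0)


def noncentral_moment(n):
    """n-th non-central moment as the integral of G(q)**n over (0, 1)."""
    value, _ = integrate.quad(lambda q: dist.ppf(q) ** n, 0.0, 1.0)
    return value


mu = noncentral_moment(1)
m2p, m3p, m4p = (noncentral_moment(n) for n in (2, 3, 4))

# Central moments from non-central ones (binomial expansion).
mu2 = m2p - mu**2
mu3 = m3p - 3 * mu * m2p + 2 * mu**3
mu4 = m4p - 4 * mu * m3p + 6 * mu**2 * m2p - 3 * mu**4

skewness = mu3 / mu2**1.5
kurtosis = mu4 / mu2**2 - 3   # Fisher definition: zero for a normal

print("quadrature :", mu, mu2, skewness, kurtosis)
print("scipy.stats:", *(float(v) for v in dist.stats(moments="mvsk")))
```

Higher central moments are differences of similar-sized terms, so quadrature error is amplified in them; the direct integral of :math:`\left(G\left(q\right)-\mu\right)^{n}` may be preferable when accuracy matters.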

Median and mode

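The median, :math:`m_{n}`, is defined as the point at which half of the probability lies on one side and half on the other. In other words, :math:`F\left(m_{n}\right)=\frac{1}{2}`, so that

.. math::

    m_{n}=G\left(\frac{1}{2}\right).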

In addition, the mode, :math:`m_{d}`, is defined as the value at which the probability density function reaches its peak:

.. math::

    m_{d}=\arg\max_{x}f\left(x\right).
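Both quantities are easy to obtain numerically. The sketch below assumes a unimodal distribution (a gamma distribution with arbitrary illustrative parameters) and finds the mode by minimizing the negative PDF with ``scipy.optimize.minimize_scalar``; bracketing the search between the 0.1% and 99.9% quantiles is a choice made here, not something prescribed by the text.

```python
import numpy as np
from scipy import optimize, stats

# Illustrative parameters (assumptions): gamma, a = 3, L = 1, S = 2.
dist = stats.gamma(3.0, loc=1.0, scale=2.0)

# Median: the PPF evaluated at 1/2.
median = dist.ppf(0.5)

# Mode: maximize the PDF, i.e. minimize its negative, over a bracket
# that contains almost all of the probability mass.
lo, hi = dist.ppf([0.001, 0.999])
result = optimize.minimize_scalar(
    lambda x: -dist.pdf(x), bounds=(lo, hi), method="bounded"
)
mode = result.x

print("median:", median)
print("mode  :", mode)
```

For this gamma distribution the mode is known in closed form, :math:`L+S\left(a-1\right)`, which gives a convenient check on the optimizer. A multimodal PDF would need a global search instead of a single bounded minimization.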

Fitting data

To fit data to a distribution, maximizing the likelihood function is common. Alternatively, some distributions have well-known minimum variance unbiased estimators. These will be chosen by default, but the negative log-likelihood function will always be available for minimizing.

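If :math:`f\left(x;\boldsymbol{\theta}\right)` is the PDF of a random variable, where :math:`\boldsymbol{\theta}` is a vector of parameters (e.g., :math:`L` and :math:`S`), then for a collection of :math:`N` independent samples from this distribution, the joint distribution of the random vector :math:`\mathbf{x}` is

.. math::

    f\left(\mathbf{x};\boldsymbol{\theta}\right)=\prod_{i=1}^{N}f\left(x_{i};\boldsymbol{\theta}\right).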

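The maximum likelihood estimate of the parameters :math:`\boldsymbol{\theta}` is the set of parameter values that maximizes this function with :math:`\mathbf{x}` fixed and given by the data:

.. math::

    \hat{\boldsymbol{\theta}}=\arg\max_{\boldsymbol{\theta}}f\left(\mathbf{x};\boldsymbol{\theta}\right)=\arg\min_{\boldsymbol{\theta}}l_{\mathbf{x}}\left(\boldsymbol{\theta}\right),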

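where

.. math::

    l_{\mathbf{x}}\left(\boldsymbol{\theta}\right)=-\sum_{i=1}^{N}\log f\left(x_{i};\boldsymbol{\theta}\right)=-N\overline{\log f\left(x_{i};\boldsymbol{\theta}\right)}.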

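Note that if :math:`\boldsymbol{\theta}` includes only shape parameters, the location and scale parameters can be fit by replacing :math:`x_{i}` with :math:`\left(x_{i}-L\right)/S` in the log-likelihood function, adding :math:`N\log S`, and minimizing; thus

.. math::

    l_{\mathbf{x}}\left(L,S;\boldsymbol{\theta}\right)=N\log S-\sum_{i=1}^{N}\log f\left(\frac{x_{i}-L}{S};\boldsymbol{\theta}\right)=N\log S+l_{\frac{\mathbf{x}-L}{S}}\left(\boldsymbol{\theta}\right).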
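A minimal sketch of fitting location and scale by minimizing the negative log-likelihood follows. It assumes synthetic gamma data with a known shape parameter, optimizes :math:`\log S` rather than :math:`S` to keep the scale positive, and uses the Nelder-Mead method; all of these are illustrative choices, and ``neg_log_likelihood`` is a hypothetical helper name. The result is printed next to ``scipy.stats.gamma.fit`` with the shape held fixed via ``f0``.

```python
import numpy as np
from scipy import optimize, stats

# Synthetic data (assumption): gamma with known shape a, L = 2, S = 1.5.
rng = np.random.default_rng(0)
a = 3.0
data = stats.gamma.rvs(a, loc=2.0, scale=1.5, size=2000, random_state=rng)


def neg_log_likelihood(params):
    """N log S minus the sum of log f((x_i - L) / S) for the standard PDF."""
    L, log_S = params
    S = np.exp(log_S)            # parametrize by log S so that S > 0
    z = (data - L) / S
    return data.size * np.log(S) - np.sum(stats.gamma.logpdf(z, a))


# Start with L below the smallest observation so every term is finite.
start = np.array([data.min() - 1.0, np.log(data.std())])
res = optimize.minimize(neg_log_likelihood, start, method="Nelder-Mead")
L_hat, S_hat = res.x[0], np.exp(res.x[1])

print("custom fit   : L =", L_hat, " S =", S_hat)
print("gamma.fit(f0): ", stats.gamma.fit(data, f0=a))
```

The two results need not agree to many digits, since both rely on iterative optimizers with their own starting points and tolerances.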

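If desired, sample estimates for :math:`L` and :math:`S` (not necessarily maximum likelihood estimates) can be obtained from sample estimates of the mean and variance using

.. math::

    \begin{aligned}
    \hat{S} &= \sqrt{\frac{\hat{\mu}_{2}}{\mu_{2}}}, \\
    \hat{L} &= \hat{\mu}-\hat{S}\mu,
    \end{aligned}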

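where :math:`\mu` and :math:`\mu_{2}` are assumed known as the mean and variance of the untransformed distribution (when :math:`L=0` and :math:`S=1`) and

.. math::

    \begin{aligned}
    \hat{\mu} &= \frac{1}{N}\sum_{i=1}^{N}x_{i}=\bar{\mathbf{x}}, \\
    \hat{\mu}_{2} &= \frac{1}{N-1}\sum_{i=1}^{N}\left(x_{i}-\hat{\mu}\right)^{2}=\frac{N}{N-1}\overline{\left(\mathbf{x}-\bar{\mathbf{x}}\right)^{2}}.
    \end{aligned}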
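These moment-based estimates take only a few lines. The sketch below assumes synthetic gamma data with a known shape parameter (illustrative values only) and takes the mean and variance of the untransformed distribution from ``scipy.stats.gamma.stats``.

```python
import numpy as np
from scipy import stats

# Synthetic data (assumption): gamma with known shape a, L = 2, S = 1.5.
rng = np.random.default_rng(1)
a = 3.0
data = stats.gamma.rvs(a, loc=2.0, scale=1.5, size=5000, random_state=rng)

# Mean and variance of the untransformed distribution (L = 0, S = 1).
mu, mu2 = (float(v) for v in stats.gamma.stats(a, moments="mv"))

# Sample mean and the unbiased (1 / (N - 1)) sample variance.
mu_hat = data.mean()
mu2_hat = data.var(ddof=1)

S_hat = np.sqrt(mu2_hat / mu2)
L_hat = mu_hat - S_hat * mu

print("S_hat =", S_hat)
print("L_hat =", L_hat)
```

Such estimates are cheap and can serve as starting values for a likelihood-based fit.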

Standard notation for mean

We will use

.. math::

    \overline{y\left(\mathbf{x}\right)}=\frac{1}{N}\sum_{i=1}^{N}y\left(x_{i}\right),

where :math:`N` should be clear from context as the number of samples :math:`x_{i}`.

References

  • Documentation for ranlib, rv2, cdflib

  • Eric Weisstein's world of mathematics http://mathworld.wolfram.com/, http://mathworld.wolfram.com/topics/StatisticalDistributions.html

  • Documentation for Regress+ by Michael McLaughlin

  • Engineering and Statistics Handbook (NIST), https://www.itl.nist.gov/div898/handbook/

  • Documentation for DATAPLOT from NIST, https://www.itl.nist.gov/div898/software/dataplot/distribu.htm

  • Norman Johnson, Samuel Kotz, and N. Balakrishnan Continuous Univariate Distributions, second edition, Volumes I and II, Wiley & Sons, 1994.

In the tutorials several special functions appear repeatedly and are listed here.

===============================================================  ======================================================================================  =============================================================================================================================
Symbol                                                           Description                                                                             Definition
===============================================================  ======================================================================================  =============================================================================================================================
:math:`\gamma\left(s, x\right)`                                  lower incomplete Gamma function                                                         :math:`\int_0^x t^{s-1} e^{-t} dt`
:math:`\Gamma\left(s, x\right)`                                  upper incomplete Gamma function                                                         :math:`\int_x^\infty t^{s-1} e^{-t} dt`
:math:`B\left(x;a,b\right)`                                      incomplete Beta function                                                                :math:`\int_{0}^{x} t^{a-1}\left(1-t\right)^{b-1} dt`
:math:`I\left(x;a,b\right)`                                      regularized incomplete Beta function                                                    :math:`\frac{\Gamma\left(a+b\right)}{\Gamma\left(a\right)\Gamma\left(b\right)} \int_{0}^{x} t^{a-1}\left(1-t\right)^{b-1} dt`
:math:`\phi\left(x\right)`                                       PDF for normal distribution                                                             :math:`\frac{1}{\sqrt{2\pi}}e^{-x^{2}/2}`
:math:`\Phi\left(x\right)`                                       CDF for normal distribution                                                             :math:`\int_{-\infty}^{x}\phi\left(t\right) dt = \frac{1}{2}+\frac{1}{2}\mathrm{erf}\left(\frac{x}{\sqrt{2}}\right)`
:math:`\psi\left(z\right)`                                       digamma function                                                                        :math:`\frac{d}{dz} \log\left(\Gamma\left(z\right)\right)`
:math:`\psi_{n}\left(z\right)`                                   polygamma function                                                                      :math:`\frac{d^{n+1}}{dz^{n+1}}\log\left(\Gamma\left(z\right)\right)`
:math:`I_{\nu}\left(y\right)`                                    modified Bessel function of the first kind
:math:`\mathrm{Ei}\left(x\right)`                                exponential integral                                                                    :math:`-\int_{-x}^\infty \frac{e^{-t}}{t} dt`
:math:`\zeta\left(n\right)`                                      Riemann zeta function                                                                   :math:`\sum_{k=1}^{\infty} \frac{1}{k^{n}}`
:math:`\zeta\left(n,z\right)`                                    Hurwitz zeta function                                                                   :math:`\sum_{k=0}^{\infty} \frac{1}{\left(k+z\right)^{n}}`
:math:`\,{}_{p}F_{q}(a_{1},\ldots,a_{p};b_{1},\ldots,b_{q};z)`   Hypergeometric function                                                                 :math:`\sum_{n=0}^{\infty} {\frac{(a_{1})_{n}\cdots(a_{p})_{n}}{(b_{1})_{n}\cdots(b_{q})_{n}}} \,{\frac{z^{n}}{n!}}`
===============================================================  ======================================================================================  =============================================================================================================================
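Most of these functions are available in ``scipy.special``. The sketch below evaluates several of them at arbitrary illustrative arguments and cross-checks two against their integral definitions. Note that ``gammainc``, ``gammaincc`` and ``betainc`` return the regularized forms, so the unregularized functions in the table are recovered by multiplying by :math:`\Gamma\left(s\right)` or the complete Beta function.

```python
import numpy as np
from scipy import integrate, special

# Illustrative arguments (assumptions).
s, x = 2.5, 1.2
a, b, u = 2.0, 3.0, 0.4

# Incomplete Gamma functions: SciPy's versions are regularized.
lower_gamma = special.gammainc(s, x) * special.gamma(s)
upper_gamma = special.gammaincc(s, x) * special.gamma(s)

# Incomplete Beta: betainc is the regularized I(x; a, b).
reg_beta = special.betainc(a, b, u)
inc_beta = reg_beta * special.beta(a, b)

Phi = special.ndtr(x)                     # normal CDF
psi = special.digamma(x)                  # digamma
psi_1 = special.polygamma(1, x)           # polygamma of order 1
I_nu = special.iv(0.5, x)                 # modified Bessel, first kind
Ei = special.expi(x)                      # exponential integral
zeta_riemann = special.zeta(3.0)          # Riemann zeta (q defaults to 1)
zeta_hurwitz = special.zeta(3.0, 2.0)     # Hurwitz zeta
hyp = special.hyp2f1(1.0, 2.0, 3.0, 0.5)  # Gauss hypergeometric 2F1

# Cross-check two table definitions by direct quadrature.
lower_quad, _ = integrate.quad(lambda t: t ** (s - 1) * np.exp(-t), 0.0, x)
beta_quad, _ = integrate.quad(
    lambda t: t ** (a - 1) * (1 - t) ** (b - 1), 0.0, u
)

print("lower gamma:", lower_gamma, lower_quad)
print("incomplete beta:", inc_beta, beta_quad)
print("Phi:", Phi, 0.5 + 0.5 * special.erf(x / np.sqrt(2)))
```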

Continuous Distributions in scipy.stats