`scipy.stats._qmc:discrepancy`

source: /scipy/stats/_qmc.py :201

Signature

   def discrepancy(sample: npt.ArrayLike, *, iterative: bool = False, method: Literal['CD', 'WD', 'MD', 'L2-star'] = 'CD', workers: IntNumber = 1) -> float

Summary

Discrepancy of a given sample.

Parameters

sample : array_like (n, d): The sample to compute the discrepancy from.
iterative : bool, optional: Set to True only if the result will be passed to update_discrepancy to update the discrepancy with an additional point; otherwise it must be False. Default is False. Refer to the notes for more details.
method : str, optional: Type of discrepancy, can be CD, WD, MD or L2-star. Refer to the notes for more details. Default is CD.
workers : int, optional: Number of workers to use for parallel processing. If -1 is given all CPU threads are used. Default is 1.

Returns

discrepancy : float: Discrepancy.

Notes

The discrepancy is a uniformity criterion used to assess the space filling of a number of samples in a hypercube. A discrepancy quantifies the distance between the continuous uniform distribution on a hypercube and the discrete uniform distribution on $n$ distinct sample points.

The lower the value is, the better the coverage of the parameter space is.
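As a rough illustration of "lower is better", the sketch below compares a well-spread sample with the same points squeezed into one corner of the unit square. The choice of an unscrambled Halton sequence, the sample size of 64, and the 0.1 shrink factor are assumptions made for this example only.

```python
from scipy.stats import qmc

# Well-spread sample: 64 points of an unscrambled Halton sequence in [0, 1)^2.
spread = qmc.Halton(d=2, scramble=False).random(64)

# Poor coverage: the same points shrunk into the corner [0, 0.1)^2.
clustered = 0.1 * spread

disc_spread = qmc.discrepancy(spread)        # default method is CD
disc_clustered = qmc.discrepancy(clustered)

print(f"spread:    {disc_spread:.6f}")
print(f"clustered: {disc_clustered:.6f}")
```

The clustered sample leaves most of the square empty, so its discrepancy should come out far larger than that of the spread sample.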

Given a collection of subsets of the hypercube, the discrepancy measures the difference between the fraction of sample points falling in a subset and the volume of that subset, aggregated over the whole collection. Different collections of subsets give different definitions of discrepancy. Some versions aggregate with a root mean square of the differences over the subsets instead of the maximum.
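For concreteness, one standard instance of this idea from the quasi-Monte Carlo literature is sketched below. The notation is an assumption of this sketch: $x_1, \dots, x_n$ are the sample points, and $[0, t)$ is the box anchored at the origin with upper corner $t \in [0, 1]^d$. The local discrepancy of a box, and its maximum and root-mean-square aggregates, can be written as:

$$
\delta(t) = \frac{1}{n}\,\#\{\, i : x_i \in [0, t) \,\} - \prod_{k=1}^{d} t_k ,
\qquad
D_n^{*} = \sup_{t \in [0,1]^d} \lvert \delta(t) \rvert ,
\qquad
D_{n,2}^{*} = \left( \int_{[0,1]^d} \delta(t)^2 \, \mathrm{d}t \right)^{1/2} .
$$

The methods offered by this function differ in the collection of subsets they use, so these expressions illustrate the general construction rather than the exact quantity returned for every method.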

A measure of uniformity is reasonable if it satisfies the following criteria ^[1]:

It is invariant under permuting factors and/or runs.
It is invariant under rotation of the coordinates.
It can measure not only the uniformity of the sample over the hypercube, but also the uniformity of its projections onto any non-empty subset of the coordinates, that is, onto lower-dimensional hypercubes.
There is some reasonable geometric meaning.
It is easy to compute.
It satisfies the Koksma-Hlawka-like inequality.
It is consistent with other criteria in experimental design.

Four methods are available:

CD: Centered Discrepancy - subspace involves a corner of the hypercube
WD: Wrap-around Discrepancy - subspace can wrap around bounds
MD: Mixture Discrepancy - mix between CD/WD covering more criteria
L2-star: L2-star discrepancy - like CD, but not invariant under rotation

Methods CD, WD, and MD implement the right hand side of equations 9, 10, and 18 of ^[2], respectively; the square root is not taken. On the other hand, L2-star computes the quantity given by equation 10 of ^[3] as implemented by subsequent equations; the square root is taken.
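The sketch below evaluates all four methods on one sample. The sample (32 points of an unscrambled Halton sequence) is an arbitrary choice for illustration. Because the square root is taken only for L2-star, the values are not on a common scale as returned; the last line takes the square root of the CD value by hand to show one way of making it comparable.

```python
import numpy as np
from scipy.stats import qmc

# Arbitrary example sample in [0, 1)^2.
sample = qmc.Halton(d=2, scramble=False).random(32)

# Evaluate each available method on the same sample.
results = {
    method: qmc.discrepancy(sample, method=method)
    for method in ("CD", "WD", "MD", "L2-star")
}

for method, value in results.items():
    print(f"{method:8s} {value:.6e}")

# CD, WD and MD are returned without the square root; take it manually
# if a value on the same scale as L2-star is wanted.
cd_root = np.sqrt(results["CD"])
print(f"sqrt(CD) {cd_root:.6e}")
```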

Lastly, with iterative=True the discrepancy is computed as if the sample had $n + 1$ points. This is useful when adding a point to an existing sample and looking for the candidate that gives the lowest discrepancy: the value returned here can be updated for each candidate with update_discrepancy. This is faster than recomputing the full discrepancy for a large number of candidates.
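A minimal sketch of that candidate search follows. The base sample, the 50 random candidates, and the seed are assumptions chosen for illustration. Only qmc.discrepancy and qmc.update_discrepancy are library calls.

```python
import numpy as np
from scipy.stats import qmc

# Existing sample of n = 16 points (arbitrary choice for this sketch).
sample = qmc.Halton(d=2, scramble=False).random(16)

# Hypothetical pool of candidate points in [0, 1)^2.
rng = np.random.default_rng(0)
candidates = rng.random((50, 2))

# Discrepancy prepared for an update to n + 1 points.
disc_init = qmc.discrepancy(sample, iterative=True)

# Update the discrepancy once per candidate; each candidate is a 1-D point.
updated = np.array(
    [qmc.update_discrepancy(c, sample, disc_init) for c in candidates]
)

# Keep the candidate that yields the lowest discrepancy.
best = candidates[np.argmin(updated)]

# Cross-check against a full recomputation on the augmented sample.
full = qmc.discrepancy(np.vstack([sample, best]))
print(best, updated.min(), full)
```

The updated value and the full recomputation should agree up to floating-point rounding, so the cross-check is a sanity test, not a required step.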

Examples

Calculate the quality of the sample using the discrepancy:

import numpy as np
from scipy.stats import qmc
space = np.array([[1, 3], [2, 6], [3, 2], [4, 5], [5, 1], [6, 4]])
l_bounds = [0.5, 0.5]
u_bounds = [6.5, 6.5]
space = qmc.scale(space, l_bounds, u_bounds, reverse=True)

space

qmc.discrepancy(space)

We can also compute the CD discrepancy iteratively by using iterative=True.

disc_init = qmc.discrepancy(space[:-1], iterative=True)
disc_init
qmc.update_discrepancy(space[-1], space[:-1], disc_init)
