function

scipy.stats._morestats:boxcox_normmax

source: /scipy/stats/_morestats.py:1235

Signature

def boxcox_normmax(x, brack=None, method='pearsonr', optimizer=None, *, ymax=BIG_FLOAT)

Summary

Compute optimal Box-Cox transform parameter for input data.

Parameters

x : array_like

Input array. All entries must be positive, finite, real numbers.

brack : 2-tuple, optional, default (-2.0, 2.0)

The starting interval for a downhill bracket search for the default optimize.brent solver. Note that this is in most cases not critical; the final result is allowed to be outside this bracket. If optimizer is passed, brack must be None.
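As a minimal sketch of how brack behaves, the snippet below compares the default bracket with a narrower one and then checks that combining brack with optimizer is rejected. The lognormal sample, the seed, and the helper name bounded are illustrative assumptions, and catching ValueError reflects the exception SciPy is believed to raise in this case.

```python
import numpy as np
from scipy import optimize, stats

# Illustrative data (an assumption): strictly positive, right-skewed sample.
rng = np.random.default_rng(0)
x = rng.lognormal(mean=1.0, sigma=0.5, size=200)

# brack=None falls back to the documented default of (-2.0, 2.0).
lmbda_default = stats.boxcox_normmax(x)
# A different starting bracket should lead to essentially the same optimum.
lmbda_narrow = stats.boxcox_normmax(x, brack=(-1.0, 1.0))


def bounded(fun):
    # Hypothetical custom optimizer, used only to trigger the brack check.
    return optimize.minimize_scalar(fun, bounds=(-2, 2), method="bounded")


# Passing both brack and optimizer is not allowed.
try:
    stats.boxcox_normmax(x, brack=(-1.0, 1.0), optimizer=bounded)
    raised = False
except ValueError:
    raised = True

print(lmbda_default, lmbda_narrow, raised)
```

Because the bracket only seeds the downhill search, the two estimates are expected to agree up to the solver tolerance.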

method : str, optional

The method to determine the optimal transform parameter (boxcox lmbda parameter). Options are:

'pearsonr' (default)

Maximizes the Pearson correlation coefficient between y = boxcox(x) and the expected values for y if x were normally distributed.

'mle'

Maximizes the log-likelihood boxcox_llf. This is the method used in boxcox.

'all'

Use all optimization methods available, and return all results. Useful to compare different methods.
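To make the two objectives concrete, here is a rough sketch that rebuilds each one by hand and minimizes it with scipy.optimize.minimize_scalar. It assumes the 'pearsonr' objective matches the correlation coefficient reported by scipy.stats.probplot for the transformed data, which is an approximation of the internal implementation rather than a copy of it. The sample data and helper names are illustrative.

```python
import numpy as np
from scipy import optimize, stats

# Illustrative data (an assumption): strictly positive, right-skewed sample.
rng = np.random.default_rng(0)
x = rng.lognormal(mean=1.0, sigma=0.5, size=200)


def one_minus_r(lmbda):
    # Transform with a fixed lmbda, then measure how straight the normal
    # probability plot is; r close to 1 means "close to normal".
    y = stats.boxcox(x, lmbda=lmbda)
    (_, _), (_, _, r) = stats.probplot(y, dist="norm")
    return 1.0 - r


def neg_llf(lmbda):
    # Negative Box-Cox log-likelihood, so that minimizing it maximizes the llf.
    return -stats.boxcox_llf(lmbda, x)


res_pearson = optimize.minimize_scalar(one_minus_r, bracket=(-2.0, 2.0), method="brent")
res_mle = optimize.minimize_scalar(neg_llf, bracket=(-2.0, 2.0), method="brent")

lmbda_pearson_manual = res_pearson.x
lmbda_mle_manual = res_mle.x

# Library results for comparison.
lmbda_pearson = stats.boxcox_normmax(x, method="pearsonr")
lmbda_mle = stats.boxcox_normmax(x, method="mle")

print(lmbda_pearson_manual, lmbda_pearson)
print(lmbda_mle_manual, lmbda_mle)
```

The two methods optimize different criteria, so their estimates generally differ slightly even on the same data.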

optimizer : callable, optional

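optimizer is a callable that accepts one argument:

fun : callable

The objective function to be minimized. fun accepts one argument, the Box-Cox transform parameter lmbda, and returns the value of the objective at that argument. The job of optimizer is to find the value of lmbda that minimizes fun.

and returns an object, such as an instance of scipy.optimize.OptimizeResult, which holds the optimal value of lmbda in an attribute x.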

See the example below or the documentation of scipy.optimize.minimize_scalar for more information.
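The contract described above only requires that the returned object expose the optimum in an attribute x. As a hedged illustration, the sketch below uses a deliberately crude grid search and wraps the result in types.SimpleNamespace instead of an OptimizeResult. The grid range, the sample data, and the name grid_optimizer are assumptions made for this example.

```python
from types import SimpleNamespace

import numpy as np
from scipy import stats

# Illustrative data (an assumption): strictly positive, right-skewed sample.
rng = np.random.default_rng(0)
x = rng.lognormal(mean=1.0, sigma=0.5, size=200)


def grid_optimizer(fun):
    # Evaluate the objective on a fixed grid of candidate lmbda values
    # and report the grid point with the smallest objective value.
    grid = np.linspace(-1.0, 1.0, 201)
    values = np.array([float(fun(lm)) for lm in grid])
    best = float(grid[np.argmin(values)])
    # Any object with an attribute `x` holding the optimum is sufficient.
    return SimpleNamespace(x=best)


lmbda_grid = stats.boxcox_normmax(x, optimizer=grid_optimizer)
lmbda_brent = stats.boxcox_normmax(x)  # default Brent-based search

print(lmbda_grid, lmbda_brent)
```

A grid search is rarely the right tool in practice; it is used here only because it makes the role of fun and of the x attribute easy to see.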

ymax : float, optional

The unconstrained optimal transform parameter may cause Box-Cox transformed data to have extreme magnitude or even overflow. This parameter constrains MLE optimization such that the magnitude of the transformed x does not exceed ymax. The default is the maximum value of the input dtype. If set to infinity, boxcox_normmax returns the unconstrained optimal lambda. Ignored when method='pearsonr'.

Returns

maxlog : float or ndarray

The optimal transform parameter found. An array instead of a scalar for method='all'.
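A short sketch of the return shapes follows. It assumes that the array returned for method='all' is ordered as (pearsonr, mle), which matches the current implementation as far as is known but is not stated in the text above. The sample data is illustrative.

```python
import numpy as np
from scipy import stats

# Illustrative data (an assumption): strictly positive, right-skewed sample.
rng = np.random.default_rng(0)
x = rng.lognormal(mean=1.0, sigma=0.5, size=200)

lmbda_pearson = stats.boxcox_normmax(x, method="pearsonr")  # scalar
lmbda_mle = stats.boxcox_normmax(x, method="mle")           # scalar
lmbda_all = np.asarray(stats.boxcox_normmax(x, method="all"))  # one entry per method

# stats.boxcox estimates lmbda with the 'mle' method when lmbda is not given.
_, lmbda_boxcox = stats.boxcox(x)

print(lmbda_all.shape)
print(lmbda_pearson, lmbda_mle, lmbda_boxcox)
```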

Notes

Array API Standard Support

boxcox_normmax has experimental support for Python Array API Standard compatible backends in addition to NumPy. Please consider testing these features by setting an environment variable SCIPY_ARRAY_API=1 and providing CuPy, PyTorch, JAX, or Dask arrays as array arguments. The following combinations of backend and device (or other capability) are supported.

====================  ====================  ====================
Library               CPU                   GPU
====================  ====================  ====================
NumPy                 ✅                     n/a                 
CuPy                  n/a                   ⛔                   
PyTorch               ⛔                     ⛔                   
JAX                   ⛔                     ⛔                   
Dask                  ⛔                     n/a                 
====================  ====================  ====================

See dev-arrayapi for more information.

Examples

import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

We can generate some data and determine the optimal ``lmbda`` in various ways:

rng = np.random.default_rng()
x = stats.loggamma.rvs(5, size=30, random_state=rng) + 5
y, lmax_mle = stats.boxcox(x)
lmax_pearsonr = stats.boxcox_normmax(x)
lmax_mle
lmax_pearsonr
stats.boxcox_normmax(x, method='all')
fig = plt.figure()
ax = fig.add_subplot(111)
prob = stats.boxcox_normplot(x, -10, 10, plot=ax)
ax.axvline(lmax_mle, color='r')
ax.axvline(lmax_pearsonr, color='g', ls='--')
plt.show()

Alternatively, we can define our own `optimizer` function. Suppose we are only interested in values of `lmbda` on the interval [6, 7], we want to use `scipy.optimize.minimize_scalar` with ``method='bounded'``, and we want to use tighter tolerances when minimizing the objective function. To do this, we define a function that accepts the positional argument `fun` and uses `scipy.optimize.minimize_scalar` to minimize `fun` subject to the provided bounds and tolerances:

from scipy import optimize
options = {'xatol': 1e-12}  # absolute tolerance on `x`
def optimizer(fun):
    return optimize.minimize_scalar(fun, bounds=(6, 7),
                                    method="bounded", options=options)
stats.boxcox_normmax(x, optimizer=optimizer)

See also

boxcox
boxcox_llf
boxcox_normplot
scipy.optimize.minimize_scalar

Aliases

  • scipy.stats.boxcox_normmax
