`scipy.stats._stats_py:kruskal`

source: /scipy/stats/_stats_py.py :8392

Signature

   def kruskal(*samples, nan_policy='propagate', axis=0, keepdims=False)

Summary

Compute the Kruskal-Wallis H-test for independent samples.

Extended Summary

The Kruskal-Wallis H-test tests the null hypothesis that the population medians of all of the groups are equal. It is a non-parametric version of ANOVA. The test works on 2 or more independent samples, which may have different sizes. Note that rejecting the null hypothesis does not indicate which of the groups differ. Post hoc comparisons between groups are required to determine which groups are different.
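
As a rough illustration of the post hoc step mentioned above, the sketch below follows the omnibus test with pairwise Mann-Whitney U tests and a Bonferroni correction. This particular procedure is an assumption made for illustration: it is one common choice among several and is not prescribed by this function. The group names and data are made up.

```python
from itertools import combinations

from scipy import stats

# Hypothetical data: three independent groups of measurements.
groups = {
    "a": [1.1, 2.3, 2.9, 3.8, 4.2, 5.0],
    "b": [2.0, 3.1, 3.9, 4.4, 5.6, 6.1],
    "c": [7.2, 8.1, 8.8, 9.5, 10.3, 11.0],
}

# Omnibus test: do the groups differ at all?
omnibus = stats.kruskal(*groups.values())

# Post hoc step (an assumed choice): pairwise Mann-Whitney U tests,
# with each p-value multiplied by the number of comparisons (Bonferroni).
pairs = list(combinations(groups, 2))
adjusted = {}
for g1, g2 in pairs:
    p = stats.mannwhitneyu(groups[g1], groups[g2], alternative="two-sided").pvalue
    adjusted[(g1, g2)] = min(1.0, p * len(pairs))

print(omnibus)
for pair, p in adjusted.items():
    print(pair, p)
```

Pairs with a small adjusted p-value are the candidates for the groups that differ; other corrections or dedicated post hoc tests may be more appropriate depending on the study design.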

Parameters

sample1, sample2, ... : array_like

Two or more arrays with the sample measurements can be given as arguments. Samples must be one-dimensional.

nan_policy : {'propagate', 'omit', 'raise'}

Defines how to handle input NaNs.

propagate: if a NaN is present in the axis slice (e.g. row) along which the statistic is computed, the corresponding entry of the output will be NaN.
omit: NaNs will be omitted when performing the calculation. If insufficient data remains in the axis slice along which the statistic is computed, the corresponding entry of the output will be NaN.
raise: if a NaN is present, a ValueError will be raised.
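
A minimal sketch of the three policies on one-dimensional samples, using made-up data with a single NaN in the first sample:

```python
import numpy as np
from scipy import stats

x = [1.0, 3.0, 5.0, np.nan, 9.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

# 'propagate' (the default): the NaN carries through to the result.
res_prop = stats.kruskal(x, y)

# 'omit': the NaN is dropped before the statistic is computed.
res_omit = stats.kruskal(x, y, nan_policy='omit')
res_clean = stats.kruskal([1.0, 3.0, 5.0, 9.0], y)

# 'raise': a ValueError is raised because a NaN is present.
try:
    stats.kruskal(x, y, nan_policy='raise')
    raised = False
except ValueError:
    raised = True

print(res_prop.statistic, res_omit.statistic, raised)
```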

axis : int or None, default: 0

If an int, the axis of the input along which to compute the statistic. The statistic of each axis-slice (e.g. row) of the input will appear in a corresponding element of the output. If None, the input will be raveled before computing the statistic.
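
The sketch below illustrates the axis behavior on made-up two-dimensional inputs. It assumes a SciPy version in which kruskal accepts multi-dimensional samples together with axis, as the signature above indicates. The samples differ in length along axis 0 and share the remaining dimension.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(size=(5, 3))  # 5 observations for each of 3 columns
b = rng.normal(size=(7, 3))  # 7 observations for each of 3 columns

# One test per column: the statistic is computed along axis 0.
res = stats.kruskal(a, b, axis=0)
print(res.statistic.shape)

# The first entry matches a test on the first columns alone.
col0 = stats.kruskal(a[:, 0], b[:, 0])

# axis=None ravels each sample first, giving a single test.
res_flat = stats.kruskal(a, b, axis=None)
print(res_flat.statistic)
```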

keepdims : bool, default: False

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
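
A short sketch of the effect on output shape, using made-up two-dimensional inputs and assuming a SciPy version in which kruskal accepts multi-dimensional samples:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.normal(size=(6, 4))
b = rng.normal(size=(8, 4))

# Without keepdims, the reduced axis is dropped from the result.
res_drop = stats.kruskal(a, b, axis=0)

# With keepdims=True, the reduced axis is kept with size one.
res_keep = stats.kruskal(a, b, axis=0, keepdims=True)

print(res_drop.statistic.shape, res_keep.statistic.shape)
```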

Returns

statistic : float

The Kruskal-Wallis H statistic, corrected for ties.

pvalue : float

The p-value for the test using the assumption that H has a chi square distribution. The p-value returned is the survival function of the chi square distribution evaluated at H.
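
The relationship between the statistic and the p-value can be checked by hand. The sketch below assumes the usual degrees of freedom for the Kruskal-Wallis test, namely the number of groups minus one; that detail is not stated in this section. The data are made up.

```python
import numpy as np
from scipy import stats

x = [1, 3, 5, 7, 9]
y = [2, 4, 6, 8, 10]
z = [11, 12, 13, 14]

res = stats.kruskal(x, y, z)

# Assumed degrees of freedom: number of groups minus one.
k = 3
p_manual = stats.chi2.sf(res.statistic, df=k - 1)

print(res.pvalue, p_manual)
```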

Notes

Due to the assumption that H has a chi square distribution, the number of observations in each group must not be too small. A typical rule is that each sample must have at least 5 measurements.

Beginning in SciPy 1.9, np.matrix inputs (not recommended for new code) are converted to np.ndarray before the calculation is performed. In this case, the output will be a scalar or np.ndarray of appropriate shape rather than a 2D np.matrix. Similarly, while masked elements of masked arrays are ignored, the output will be a scalar or np.ndarray rather than a masked array with mask=False.

Array API Standard Support

kruskal has experimental support for Python Array API Standard compatible backends in addition to NumPy. Please consider testing these features by setting an environment variable SCIPY_ARRAY_API=1 and providing CuPy, PyTorch, JAX, or Dask arrays as array arguments. The following combinations of backend and device (or other capability) are supported.

====================  ====================  ====================
Library               CPU                   GPU
====================  ====================  ====================
NumPy                 ✅                     n/a                 
CuPy                  n/a                   ⛔                   
PyTorch               ✅                     ✅                   
JAX                   ⚠️ no JIT             ⚠️ no JIT           
Dask                  ⛔                     n/a                 
====================  ====================  ====================

See dev-arrayapi for more information.

Examples

from scipy import stats
x = [1, 3, 5, 7, 9]
y = [2, 4, 6, 8, 10]

stats.kruskal(x, y)

x = [1, 1, 1]
y = [2, 2, 2]
z = [2, 2]

stats.kruskal(x, y, z)
