bundles / scipy 1.17.1 / scipy / stats / _mannwhitneyu / mannwhitneyu
function
scipy.stats._mannwhitneyu:mannwhitneyu
Signature
def mannwhitneyu(x, y, use_continuity=True, alternative='two-sided', axis=0, method='auto', *, nan_policy='propagate', keepdims=False)
Summary
Perform the Mann-Whitney U rank test on two independent samples.
Extended Summary
The Mann-Whitney U test is a nonparametric test of the null hypothesis that the distribution underlying sample x is the same as the distribution underlying sample y. It is often used as a test of difference in location between distributions.
Parameters
x, y: array-like
N-d arrays of samples. The arrays must be broadcastable except along the dimension given by axis.
use_continuity: bool, optional
Whether a continuity correction (1/2) should be applied. Default is True when method is 'asymptotic'; has no effect otherwise.
alternative: {'two-sided', 'less', 'greater'}, optional
Defines the alternative hypothesis. Default is 'two-sided'. Let SX(u) and SY(u) be the survival functions of the distributions underlying x and y, respectively. Then the following alternative hypotheses are available:
- 'two-sided': the distributions are not equal, i.e. SX(u) ≠ SY(u) for at least one u.
- 'less': the distribution underlying x is stochastically less than the distribution underlying y, i.e. SX(u) < SY(u) for all u.
- 'greater': the distribution underlying x is stochastically greater than the distribution underlying y, i.e. SX(u) > SY(u) for all u.
Under a more restrictive set of assumptions, the alternative hypotheses can be expressed in terms of the locations of the distributions; see [5] section 5.1.
axis: int or None, default: 0
If an int, the axis of the input along which to compute the statistic. The statistic of each axis-slice (e.g. row) of the input will appear in a corresponding element of the output. If None, the input will be raveled before computing the statistic.
method: {'auto', 'asymptotic', 'exact'} or PermutationMethod instance, optional
Selects the method used to calculate the p-value. Default is 'auto'. The following options are available:
- 'asymptotic': compares the standardized test statistic against the normal distribution, correcting for ties.
- 'exact': computes the exact p-value by comparing the observed statistic against the exact distribution of the statistic under the null hypothesis. No correction is made for ties.
- 'auto': chooses 'exact' when the size of one of the samples is less than or equal to 8 and there are no ties; chooses 'asymptotic' otherwise.
- PermutationMethod instance: the p-value is computed using permutation_test with the provided configuration options and other appropriate settings.
nan_policy: {'propagate', 'omit', 'raise'}
Defines how to handle input NaNs.
- propagate: if a NaN is present in the axis slice (e.g. row) along which the statistic is computed, the corresponding entry of the output will be NaN.
- omit: NaNs will be omitted when performing the calculation. If insufficient data remains in the axis slice along which the statistic is computed, the corresponding entry of the output will be NaN.
- raise: if a NaN is present, a ValueError will be raised.
keepdims: bool, default: False
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
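The interaction of axis, keepdims, and nan_policy described above can be sketched as follows. The array shapes and random data are illustrative assumptions, not part of the original documentation.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Hypothetical data: 3 independent experiments, one per row.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 10))
y = rng.normal(loc=0.5, size=(3, 12))

# One test per row: the samples lie along axis 1.
res = mannwhitneyu(x, y, axis=1)
print(res.statistic.shape)       # (3,)

# keepdims=True leaves the reduced axis in place with size one.
res_keep = mannwhitneyu(x, y, axis=1, keepdims=True)
print(res_keep.statistic.shape)  # (3, 1)

# axis=None ravels both inputs and performs a single test.
res_flat = mannwhitneyu(x, y, axis=None)

# Put a NaN in the first row to compare NaN policies.
x_nan = x.copy()
x_nan[0, 0] = np.nan
prop = mannwhitneyu(x_nan, y, axis=1)                     # default 'propagate'
omit = mannwhitneyu(x_nan, y, axis=1, nan_policy="omit")  # drops the NaN
print(prop.pvalue)  # first entry is NaN, the others are unaffected
print(omit.pvalue)  # all entries are finite
```

With 'propagate', only the slice containing the NaN is affected; the remaining rows are computed as usual.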
Returns
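res: MannwhitneyuResult
An object containing attributes:
- statistic: float. The Mann-Whitney U statistic corresponding with sample x. See Notes for the statistic corresponding with sample y.
- pvalue: float. The associated p-value for the chosen alternative.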
Notes
If U1 is the statistic corresponding with sample x, then the statistic corresponding with sample y is U2 = x.shape[axis] * y.shape[axis] - U1.
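The relationship between U1 and U2 can be checked numerically. The sketch below assumes the usual pair-counting definition of the statistic (the number of pairs in which the x value exceeds the y value, with ties counting one half), which this page does not state explicitly; the sample values are borrowed from the Examples section.

```python
import numpy as np
from scipy.stats import mannwhitneyu

x = np.array([19, 22, 16, 29, 24])
y = np.array([20, 11, 17, 12])

# Pair-counting definition (assumed here): compare every x_i with every y_j.
diff = x[:, None] - y[None, :]
U1_manual = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)

# Statistic reported for the first argument in each call.
U1 = mannwhitneyu(x, y).statistic
U2 = mannwhitneyu(y, x).statistic

print(U1_manual)  # 17.0
print(U1)         # 17.0
print(U2)         # 3.0

# The two statistics sum to the number of (x, y) pairs.
print(U1 + U2 == len(x) * len(y))  # True
```

Swapping the argument order is therefore equivalent to computing U2 from U1 by hand.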
mannwhitneyu is for independent samples. For related / paired samples, consider scipy.stats.wilcoxon.
Method 'exact' is recommended when there are no ties and when either sample size is less than 8 [1]. The implementation follows the algorithm reported in [3]. Note that the exact method is not corrected for ties, but mannwhitneyu will not raise errors or warnings if there are ties in the data. If there are ties and either sample is small (fewer than ~10 observations), consider passing an instance of PermutationMethod as the method to perform a permutation test.
The Mann-Whitney U test is a non-parametric version of the t-test for independent samples. When the means of samples from the populations are normally distributed, consider scipy.stats.ttest_ind.
Beginning in SciPy 1.9, np.matrix inputs (not recommended for new code) are converted to np.ndarray before the calculation is performed. In this case, the output will be a scalar or np.ndarray of appropriate shape rather than a 2D np.matrix. Similarly, while masked elements of masked arrays are ignored, the output will be a scalar or np.ndarray rather than a masked array with mask=False.
Array API Standard Support
mannwhitneyu has experimental support for Python Array API Standard compatible backends in addition to NumPy. Please consider testing these features by setting an environment variable SCIPY_ARRAY_API=1 and providing CuPy, PyTorch, JAX, or Dask arrays as array arguments. The following combinations of backend and device (or other capability) are supported.
====================  ====================  ====================
Library               CPU                   GPU
====================  ====================  ====================
NumPy                 ✅                     n/a
CuPy                  n/a                   ⛔
PyTorch               ✅                     ⛔
JAX                   ⚠️ no JIT              ⛔
Dask                  ⛔                     n/a
====================  ====================  ====================
See dev-arrayapi for more information.
Examples
We follow the example from [4]: nine randomly sampled young adults were diagnosed with type II diabetes at the ages below.

    males = [19, 22, 16, 29, 24]
    females = [20, 11, 17, 12]

    # Exact two-sided test; the statistic returned belongs to the first sample (males).
    from scipy.stats import mannwhitneyu
    U1, p = mannwhitneyu(males, females, method="exact")
    print(U1)

    # Statistic for the second sample (females); see Notes.
    nx, ny = len(males), len(females)
    U2 = nx*ny - U1
    print(U2)

    print(p)

    # Same test using the normal approximation (continuity correction on by default).
    _, pnorm = mannwhitneyu(males, females, method="asymptotic")
    print(pnorm)

    # Reproduce the asymptotic p-value by hand, with a continuity correction of 0.5.
    import numpy as np
    from scipy.stats import norm
    U = min(U1, U2)
    N = nx + ny
    z = (U - nx*ny/2 + 0.5) / np.sqrt(nx*ny * (N + 1) / 12)
    p = 2 * norm.cdf(z)  # use CDF to get p-value from smaller statistic
    print(p)

    # Asymptotic p-value without the continuity correction.
    _, pnorm = mannwhitneyu(males, females, use_continuity=False, method="asymptotic")
    print(pnorm)
    # One-sided exact test with the female ages as the first sample.
    res = mannwhitneyu(females, males, alternative="less", method="exact")
    print(res)
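
For these data, the three alternatives can be compared side by side. The sketch below is illustrative only: it checks, for this particular sample and the exact method, that the two-sided p-value equals twice the smaller one-sided p-value (capped at 1). It is not offered as a general statement about every method or data set.

```python
from scipy.stats import mannwhitneyu

males = [19, 22, 16, 29, 24]
females = [20, 11, 17, 12]

# Exact p-values under each alternative, with females as the first sample.
p_less = mannwhitneyu(females, males, alternative="less", method="exact").pvalue
p_greater = mannwhitneyu(females, males, alternative="greater", method="exact").pvalue
p_two_sided = mannwhitneyu(females, males, alternative="two-sided", method="exact").pvalue

print(p_less, p_greater, p_two_sided)

# Twice the smaller one-sided p-value, capped at 1.
doubled = min(1.0, 2 * min(p_less, p_greater))
print(abs(p_two_sided - doubled) < 1e-12)  # True
```

Because the female ages tend to be lower, 'less' gives the small p-value and 'greater' the large one.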
    # Independent-samples t-test on the same data, for comparison (see Notes).
    from scipy.stats import ttest_ind
    res = ttest_ind(females, males, alternative="less")
    print(res)
See also
- scipy.stats.ranksums
- scipy.stats.ttest_ind
- scipy.stats.wilcoxon
Aliases
- scipy.stats.mannwhitneyu