function
scipy.stats._stats_py:weightedtau
source: /scipy/stats/_stats_py.py :5748
Signature
def weightedtau(x, y, rank=True, weigher=None, additive=True, *, axis=None, nan_policy='propagate', keepdims=False)
Summary
Compute a weighted version of Kendall's $\tau$.
Extended Summary
The weighted $\tau$ is a weighted version of Kendall's $\tau$ in which exchanges of high weight are more influential than exchanges of low weight. The default parameters compute the additive hyperbolic version of the index, $\tau_\mathrm{h}$, which has been shown to provide the best balance between important and unimportant elements [1].
The weighting is defined by means of a rank array, which assigns a nonnegative rank to each element (higher importance ranks being associated with smaller values, e.g., 0 is the highest possible rank), and a weigher function, which assigns a weight based on the rank to each element. The weight of an exchange is then the sum or the product of the weights of the ranks of the exchanged elements. The default parameters compute $\tau_\mathrm{h}$: an exchange between elements with rank $r$ and $s$ (starting from zero) has weight $1/(r+1) + 1/(s+1)$.
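The weight of a single exchange can be sketched in a few lines of plain Python. This is only an illustration, not SciPy's internal code: the names `hyperbolic_weigher` and `exchange_weight` are hypothetical, and the sketch assumes the hyperbolic weigher maps rank r to 1/(r+1).

```python
def hyperbolic_weigher(rank):
    """Hypothetical helper: map a zero-based rank to the weight 1/(rank + 1)."""
    return 1.0 / (rank + 1)


def exchange_weight(r, s, weigher=hyperbolic_weigher, additive=True):
    """Weight of an exchange between elements of rank r and s.

    The two element weights are summed when additive is True and
    multiplied otherwise.
    """
    wr, ws = weigher(r), weigher(s)
    return wr + ws if additive else wr * ws


# Exchanging the two most important elements (ranks 0 and 1).
print(exchange_weight(0, 1))                  # 1.5
print(exchange_weight(0, 1, additive=False))  # 0.5

# Exchanging two unimportant elements carries much less weight.
print(exchange_weight(8, 9))
```

Under this weighting, a disagreement near the top of the ranking contributes far more to the index than one near the bottom.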
Specifying a rank array is meaningful only if you have in mind an external criterion of importance. If, as is usually the case, you do not have a specific rank in mind, the weighted $\tau$ is defined by averaging the values obtained using the decreasing lexicographical rank by (x, y) and by (y, x). This is the behavior with default parameters. Note that the convention used here for ranking (lower values imply higher importance) is opposite to that used by other SciPy statistical functions.
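The averaging behavior can be checked directly. The following sketch assumes a SciPy version in which the result exposes a `statistic` attribute, and reuses the data from the Examples section; the variable names are only illustrative.

```python
from scipy import stats

x = [12, 2, 1, 12, 2]
y = [1, 4, 7, 1, 0]

# Default call (rank=True): average over both lexicographical rankings.
tau_default = stats.weightedtau(x, y).statistic

# rank=None uses only the decreasing lexicographical rank by (first, second).
tau_xy = stats.weightedtau(x, y, rank=None).statistic
tau_yx = stats.weightedtau(y, x, rank=None).statistic
tau_avg = (tau_xy + tau_yx) / 2

print(tau_default, tau_avg)
```

The two printed values should agree up to floating-point rounding, while `tau_xy` and `tau_yx` generally differ from each other.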
Parameters
x, y: array_like
Arrays of scores, of the same shape. If arrays are not 1-D, they will be flattened to 1-D.
rank: array_like of ints or bool, optional
A nonnegative rank assigned to each element. If it is None, the decreasing lexicographical rank by (x, y) will be used: elements of higher rank will be those with larger x-values, using y-values to break ties (in particular, swapping x and y will give a different result). If it is False, the element indices will be used directly as ranks. The default is True, in which case this function returns the average of the values obtained using the decreasing lexicographical rank by (x, y) and by (y, x).
weigher: callable, optional
The weigher function. Must map nonnegative integers (zero representing the most important element) to a nonnegative weight. The default, None, provides hyperbolic weighting, that is, rank $r$ is mapped to weight $1/(r+1)$.
additive: bool, optional
If True, the weight of an exchange is computed by adding the weights of the ranks of the exchanged elements; otherwise, the weights are multiplied. The default is True.
axis: int or None, default: None
If an int, the axis of the input along which to compute the statistic. The statistic of each axis-slice (e.g. row) of the input will appear in a corresponding element of the output. If None, the input will be raveled before computing the statistic.
nan_policy: {'propagate', 'omit', 'raise'}
Defines how to handle input NaNs.
- propagate: if a NaN is present in the axis slice (e.g. row) along which the statistic is computed, the corresponding entry of the output will be NaN.
- omit: NaNs will be omitted when performing the calculation. If insufficient data remains in the axis slice along which the statistic is computed, the corresponding entry of the output will be NaN.
- raise: if a NaN is present, a ValueError will be raised.
keepdims: bool, default: False
If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
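To illustrate the `rank` parameter, the sketch below compares `rank=False` with an explicit index array and then passes a made-up importance ranking. The array `importance_rank` is a hypothetical external criterion chosen only for illustration; it is not part of the SciPy documentation.

```python
import numpy as np
from scipy import stats

x = [12, 2, 1, 12, 2]
y = [1, 4, 7, 1, 0]

# rank=False uses the element indices 0, 1, ..., n-1 as ranks,
# so passing those indices explicitly should give the same value.
tau_index = stats.weightedtau(x, y, rank=False).statistic
tau_explicit = stats.weightedtau(x, y, rank=np.arange(len(x))).statistic

# Hypothetical external criterion: the last element is the most
# important (rank 0) and the first element the least (rank 4).
importance_rank = [4, 3, 2, 1, 0]
tau_custom = stats.weightedtau(x, y, rank=importance_rank).statistic

print(tau_index, tau_explicit, tau_custom)
```

Because the rank array decides which exchanges are heavily weighted, `tau_custom` will in general differ from `tau_index` even though the scores are unchanged.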
Returns
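res: SignificanceResult
An object containing attributes:
- statistic: float. The weighted $\tau$ correlation index.
- pvalue: float. Presently np.nan, as the null distribution of the statistic is unknown (even in the additive hyperbolic case).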
Notes
This function uses an $O(n \log n)$, mergesort-based algorithm [1] that is a weighted extension of Knight's algorithm for Kendall's $\tau$ [2]. It can compute Shieh's weighted $\tau$ [3] between rankings without ties (i.e., permutations) by setting additive and rank to False, as the definition given in [1] is a generalization of Shieh's.
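The mergesort-based algorithm itself is not reproduced here. To make the computed quantity concrete, the following quadratic brute-force sketch may help. It rests on two assumptions: that the index is the weighted sum of sign products over all pairs, normalized by the corresponding weighted norms, and that ranks are the element indices (as with rank=False). All function names are hypothetical, and NaN handling is omitted.

```python
import math


def hyperbolic(r):
    """Assumed default weigher: rank r maps to 1/(r + 1)."""
    return 1.0 / (r + 1)


def sign(a):
    """Return -1, 0 or 1 according to the sign of a."""
    return (a > 0) - (a < 0)


def weighted_tau_bruteforce(x, y, weigher=hyperbolic, additive=True):
    """O(n^2) sketch of a weighted tau, using element indices as ranks."""
    n = len(x)
    num = norm_x = norm_y = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            wi, wj = weigher(i), weigher(j)
            w = wi + wj if additive else wi * wj
            sx = sign(x[i] - x[j])
            sy = sign(y[i] - y[j])
            num += sx * sy * w      # concordant pairs add, discordant subtract
            norm_x += sx * sx * w   # pairs tied in x contribute nothing
            norm_y += sy * sy * w   # pairs tied in y contribute nothing
    return num / math.sqrt(norm_x * norm_y)


x = [12, 2, 1, 12, 2]
y = [1, 4, 7, 1, 0]
print(weighted_tau_bruteforce(x, y))
# With a constant weigher all pairs count equally.
print(weighted_tau_bruteforce(x, y, weigher=lambda r: 1.0))
```

A quadratic loop like this is only practical for small inputs, which is the reason an $O(n \log n)$ algorithm matters for long rankings.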
NaNs are considered the smallest possible score.
Beginning in SciPy 1.9, np.matrix inputs (not recommended for new code) are converted to np.ndarray before the calculation is performed. In this case, the output will be a scalar or np.ndarray of appropriate shape rather than a 2D np.matrix. Similarly, while masked elements of masked arrays are ignored, the output will be a scalar or np.ndarray rather than a masked array with mask=False.
Array API Standard Support
weightedtau has experimental support for Python Array API Standard compatible backends in addition to NumPy. Please consider testing these features by setting an environment variable SCIPY_ARRAY_API=1 and providing CuPy, PyTorch, JAX, or Dask arrays as array arguments. The following combinations of backend and device (or other capability) are supported.
====================  ====================  ====================
Library               CPU                   GPU
====================  ====================  ====================
NumPy                 ✅                     n/a
CuPy                  n/a                   ⛔
PyTorch               ⛔                     ⛔
JAX                   ⛔                     ⛔
Dask                  ⛔                     n/a
====================  ====================  ====================
See dev-arrayapi for more information.
Examples
import numpy as np
from scipy import stats

x = [12, 2, 1, 12, 2]
y = [1, 4, 7, 1, 0]
res = stats.weightedtau(x, y)
res.statistic
res.pvalue

res = stats.weightedtau(x, y, additive=False)
res.statistic

# NaNs are considered the smallest possible score.
x = [12, 2, 1, 12, 2]
y = [1, 4, 7, 1, np.nan]
res = stats.weightedtau(x, y)
res.statistic

# A constant weigher gives every exchange the same weight.
x = [12, 2, 1, 12, 2]
y = [1, 4, 7, 1, 0]
res = stats.weightedtau(x, y, weigher=lambda x: 1)
res.statistic

# With rank=None, swapping x and y gives a different result.
x = [12, 2, 1, 12, 2]
y = [1, 4, 7, 1, 0]
stats.weightedtau(x, y, rank=None)
stats.weightedtau(y, x, rank=None)
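As a sanity check on the constant-weigher example, the result can be compared with `scipy.stats.kendalltau`. This sketch assumes that `kendalltau` with its default settings returns the tie-corrected tau-b and that both results expose a `statistic` attribute.

```python
from scipy import stats

x = [12, 2, 1, 12, 2]
y = [1, 4, 7, 1, 0]

# With a constant weigher every exchange has the same weight,
# so the weighting no longer distinguishes between elements.
tau_weighted = stats.weightedtau(x, y, weigher=lambda r: 1).statistic
tau_kendall = stats.kendalltau(x, y).statistic

print(tau_weighted, tau_kendall)
```

The two values should match up to floating-point rounding for this data set.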
See also
- kendalltau
Calculates Kendall's tau.
- spearmanr
Calculates a Spearman rank-order correlation coefficient.
- theilslopes
Computes the Theil-Sen estimator for a set of points (x, y).
Aliases
- scipy.stats.weightedtau