`scipy.stats._morestats:directional_stats`

source: /scipy/stats/_morestats.py :4497

Signature

def directional_stats(samples, *, axis=0, normalize=True)

Summary

Computes sample statistics for directional data.

Extended Summary

Computes the directional mean (also called the mean direction vector) and mean resultant length of a sample of vectors.

The directional mean is a measure of the "preferred direction" of vector data. It is analogous to the sample mean, but is intended for data in which the magnitude of each observation is irrelevant (e.g. unit vectors).

The mean resultant length is a value between 0 and 1 used to quantify the dispersion of directional data: the smaller the mean resultant length, the greater the dispersion. Several definitions of directional variance involving the mean resultant length are given in ^[1] and ^[2].
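
As a rough illustration of how the mean resultant length tracks dispersion, the sketch below compares tightly clustered and evenly spread directions on the unit circle. It assumes two-dimensional unit vectors written as angles, in which case the mean resultant length equals the modulus of the mean of exp(i * theta). The helper name `resultant_length_2d` is hypothetical and not part of SciPy.

```python
import numpy as np

def resultant_length_2d(angles):
    """Hypothetical helper: mean resultant length of unit vectors given as angles (radians)."""
    # Each angle theta corresponds to the unit vector (cos theta, sin theta),
    # represented here as the complex number exp(i * theta).
    return np.abs(np.mean(np.exp(1j * np.asarray(angles))))

clustered = np.linspace(-0.1, 0.1, 50)                   # directions within about +/- 6 degrees
spread = np.linspace(0.0, 2 * np.pi, 8, endpoint=False)  # eight evenly spaced directions

r_clustered = resultant_length_2d(clustered)  # close to 1: little dispersion
r_spread = resultant_length_2d(spread)        # close to 0: the directions cancel out
```

The clustered sample should give a value just below 1, while the evenly spread sample should give a value that is zero up to floating-point error.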

Parameters

samples : array_like: Input array. Must be at least two-dimensional, and the last axis of the input must correspond to the dimensionality of the vector space. When the input is exactly two-dimensional, this means that each row of the data is a vector observation.
axis : int, default: 0: Axis along which the directional mean is computed.
normalize: boolean, default: True: If True, normalize the input to ensure that each observation is a unit vector. If the observations are already unit vectors, consider setting this to False to avoid unnecessary computation.
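
A brief sketch of how `axis` and `normalize` interact with the input shape. The array shapes and variable names below are illustrative assumptions; only `directional_stats` and its documented parameters come from SciPy.

```python
import numpy as np
from scipy.stats import directional_stats

# Four unit vectors in the plane, built from angles so that each row has norm 1.
angles = np.array([0.1, 0.4, -0.3, 0.2])
unit_vectors = np.column_stack([np.cos(angles), np.sin(angles)])  # shape (4, 2)

# The rows are already unit vectors, so skipping normalization should not change the result.
res_default = directional_stats(unit_vectors)
res_no_norm = directional_stats(unit_vectors, normalize=False)
results_match = np.allclose(res_default.mean_direction, res_no_norm.mean_direction)

# A stack of two independent samples: shape (2 samples, 4 observations, 2 components).
# Observations run along axis 1; the last axis still holds the vector components.
batch = np.stack([unit_vectors, -unit_vectors])
res_batch = directional_stats(batch, axis=1)
batch_shape = res_batch.mean_direction.shape  # one mean direction per sample
```

Here the second sample is the negation of the first, so its mean direction should point the opposite way.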

Returns

res : DirectionalStats

An object containing attributes:

mean_direction : ndarray: Directional mean.
mean_resultant_length : ndarray: The mean resultant length ^[1].

Notes

This uses a definition of directional mean from ^[1]. Assuming the observations are unit vectors, the calculation is as follows.

mean = samples.mean(axis=0)
mean_resultant_length = np.linalg.norm(mean)
mean_direction = mean / mean_resultant_length

This definition is appropriate for directional data (i.e. vector data for which the magnitude of each observation is irrelevant) but not for axial data (i.e. vector data for which both the magnitude and the sign of each observation are irrelevant).
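
The calculation described in these notes, including the optional normalization step, can be sketched with NumPy as follows. This is a minimal illustration for two-dimensional input with observations along the first axis, not SciPy's implementation; the function name `directional_stats_sketch` is hypothetical.

```python
import numpy as np

def directional_stats_sketch(samples, normalize=True):
    """Hypothetical helper mirroring the calculation in the Notes (2-D input, axis=0 only)."""
    samples = np.asarray(samples, dtype=float)
    if normalize:
        # Scale each row to unit length so that only its direction matters.
        samples = samples / np.linalg.norm(samples, axis=-1, keepdims=True)
    mean = samples.mean(axis=0)
    mean_resultant_length = np.linalg.norm(mean)
    mean_direction = mean / mean_resultant_length
    return mean_direction, mean_resultant_length

data = np.array([[3.0, 4.0], [6.0, -8.0]])
direction, length = directional_stats_sketch(data)

# Two antipodal observations describe the same axis but opposite directions:
# their unit vectors cancel, so the mean resultant length is 0.
axial = np.array([[1.0, 0.0], [-1.0, 0.0]])
axial_length = np.linalg.norm(axial.mean(axis=0))
```

For `data`, the unit vectors are (0.6, 0.8) and (0.6, -0.8), so their mean is (0.6, 0), giving a mean resultant length of 0.6 and a mean direction of (1, 0). The antipodal pair shows why the definition does not suit axial data: the two observations share an axis, yet the mean resultant length reports maximal dispersion.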

Several definitions of directional variance involving the mean resultant length R have been proposed, including 1 - R ^[1], 1 - R**2 ^[2], and 2 * (1 - R) ^[2]. Rather than choosing one, this function returns R as attribute mean_resultant_length so the user can compute their preferred measure of dispersion.
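
Given the mean resultant length, each of the dispersion measures listed above is a one-line computation. The sketch below assumes R = 0.6 purely for illustration; the dictionary keys are descriptive labels, not SciPy names.

```python
R = 0.6  # assumed mean resultant length, for illustration only

dispersion = {
    "1 - R": 1 - R,              # ranges from 0 to 1
    "1 - R**2": 1 - R**2,        # ranges from 0 to 1
    "2 * (1 - R)": 2 * (1 - R),  # ranges from 0 to 2
}
```

All three measures fall to 0 as R approaches 1, that is, when the observations share a single direction.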

Array API Standard Support

directional_stats has experimental support for Python Array API Standard compatible backends in addition to NumPy. Please consider testing these features by setting an environment variable SCIPY_ARRAY_API=1 and providing CuPy, PyTorch, JAX, or Dask arrays as array arguments. The following combinations of backend and device (or other capability) are supported.

====================  ====================  ====================
Library               CPU                   GPU
====================  ====================  ====================
NumPy                 ✅                     n/a                 
CuPy                  n/a                   ✅                   
PyTorch               ✅                     ✅                   
JAX                   ✅                     ✅                   
Dask                  ✅                     n/a                 
====================  ====================  ====================

See dev-arrayapi for more information.

Examples

import numpy as np
from scipy.stats import directional_stats
data = np.array([[3, 4],    # first observation, 2D vector space
                 [6, -8]])  # second observation
dirstats = directional_stats(data)
dirstats.mean_direction

In contrast, the regular sample mean of the vectors would be influenced by the magnitude of each observation. Furthermore, the result would not be a unit vector.

data.mean(axis=0)

An exemplary use case for `directional_stats` is to find a *meaningful* center for a set of observations on a sphere, e.g. geographical locations.

data = np.array([[0.8660254, 0.5, 0.],
                 [0.8660254, -0.5, 0.]])
dirstats = directional_stats(data)
dirstats.mean_direction

The regular sample mean on the other hand yields a result which does not lie on the surface of the sphere.

data.mean(axis=0)

The function also returns the mean resultant length, which can be used to calculate a directional variance. For example, using the definition `Var(z) = 1 - R` from ^[2], where `R` is the mean resultant length, the directional variance of the vectors in the above example is:

1 - dirstats.mean_resultant_length
