
function

scipy.stats._stats_py:f_oneway

source: /scipy/stats/_stats_py.py :3750

Signature

def f_oneway(*samples, axis=0, equal_var=True, nan_policy='propagate', keepdims=False)

Summary

Perform one-way ANOVA.

Extended Summary

The one-way ANOVA tests the null hypothesis that two or more groups have the same population mean. The test is applied to samples from two or more groups, possibly with differing sizes.

Parameters

sample1, sample2, ... : array_like

The sample measurements for each group. There must be at least two arguments. If the arrays are multidimensional, then all the dimensions of the array must be the same except for axis.

axis : int or None, default: 0

If an int, the axis of the input along which to compute the statistic. The statistic of each axis-slice (e.g. row) of the input will appear in a corresponding element of the output. If None, the input will be raveled before computing the statistic.

equal_var : bool, default: True

If True (default), perform a standard one-way ANOVA test that assumes equal population variances [2]. If False, perform Welch's ANOVA test, which does not assume equal population variances [4].

nan_policy : {'propagate', 'omit', 'raise'}

Defines how to handle input NaNs.

  • propagate: if a NaN is present in the axis slice (e.g. row) along which the statistic is computed, the corresponding entry of the output will be NaN.

  • omit: NaNs will be omitted when performing the calculation. If insufficient data remains in the axis slice along which the statistic is computed, the corresponding entry of the output will be NaN.

  • raise: if a NaN is present, a ValueError will be raised.

keepdims : bool, default: False

If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.

Returns

statistic : float

The computed F statistic of the test.

pvalue : float

The associated p-value from the F distribution.

Warns

scipy.stats.ConstantInputWarning

Emitted if all values within each of the input arrays are identical. In this case the F statistic is either infinite or undefined, so np.inf or np.nan is returned.

RuntimeWarning

Emitted if the length of any input array is 0, or if all the input arrays have length 1. np.nan is returned for the F statistic and the p-value in these cases.

Notes

The ANOVA test has important assumptions that must be satisfied in order for the associated p-value to be valid.

  • The samples are independent.

  • Each sample is from a normally distributed population.

  • The population standard deviations of the groups are all equal. This property is known as homoscedasticity.

If these assumptions do not hold for a given set of data, it may still be possible to use the Kruskal-Wallis H-test (scipy.stats.kruskal) or the Alexander-Govern test (scipy.stats.alexandergovern), although with some loss of power.
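
As a rough illustration, the three tests can be run on the same groups. The data below are randomly generated with deliberately unequal spreads (an assumption made for this sketch, not data from the references), so no particular outcome is implied:

```python
import numpy as np
from scipy.stats import alexandergovern, f_oneway, kruskal

rng = np.random.default_rng(2)  # arbitrary seed, illustration only
# Three groups with different sizes and different standard deviations.
g1 = rng.normal(loc=0.0, scale=1.0, size=12)
g2 = rng.normal(loc=0.5, scale=3.0, size=9)
g3 = rng.normal(loc=1.0, scale=0.5, size=15)

res_anova = f_oneway(g1, g2, g3)      # assumes equal variances
res_kw = kruskal(g1, g2, g3)          # rank-based, no normality assumption
res_ag = alexandergovern(g1, g2, g3)  # allows unequal variances

for name, res in [("f_oneway", res_anova),
                  ("kruskal", res_kw),
                  ("alexandergovern", res_ag)]:
    print(f"{name}: statistic={res.statistic:.4f}, pvalue={res.pvalue:.4f}")
```

All three result objects expose `statistic` and `pvalue` attributes, so they can be compared in the same loop.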

The length of each group must be at least one, and there must be at least one group with length greater than one. If these conditions are not satisfied, a warning is generated and (np.nan, np.nan) is returned.

If all values in each group are identical, and there exist at least two groups with different values, the function generates a warning and returns (np.inf, 0).

If all values in all groups are the same, the function generates a warning and returns (np.nan, np.nan).
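
The degenerate cases described above can be reproduced with tiny inputs. This sketch silences the documented warnings so that only the returned values are shown; the input values are arbitrary:

```python
import warnings

import numpy as np
from scipy.stats import f_oneway

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # hide the documented warnings for this demo
    # Each group is constant, but the groups differ from one another.
    const_diff = f_oneway([1.0, 1.0, 1.0], [2.0, 2.0, 2.0])
    # Every value in every group is the same.
    const_same = f_oneway([3.0, 3.0, 3.0], [3.0, 3.0, 3.0])
    # Every group has length one.
    too_small = f_oneway([1.0], [2.0])

print(const_diff.statistic, const_diff.pvalue)  # inf 0.0
print(np.isnan(const_same.statistic), np.isnan(too_small.statistic))
```

In real code it is usually better to let the warnings surface rather than suppress them, since they signal that the p-value is not meaningful.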

The algorithm is from Heiman [2], pp. 394-7.
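
For orientation, the standard textbook sum-of-squares computation of the F statistic can be sketched as follows. This is not the SciPy source code; it is the usual between-group and within-group decomposition, assumed here to be equivalent to the default (`equal_var=True`) result. The data and variable names are invented for the example:

```python
import numpy as np
from scipy.stats import f, f_oneway

# Illustrative data: three groups of five observations each.
groups = [np.array([6.9, 5.4, 5.8, 4.6, 4.0]),
          np.array([8.3, 6.8, 7.8, 9.2, 6.5]),
          np.array([8.0, 10.5, 8.1, 6.9, 9.3])]

k = len(groups)                         # number of groups
n_total = sum(len(g) for g in groups)   # total number of observations
grand_mean = np.concatenate(groups).mean()

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

df_between = k - 1
df_within = n_total - k

# F is the ratio of the two mean squares; the p-value is the upper tail
# of the F distribution with (df_between, df_within) degrees of freedom.
f_stat = (ss_between / df_between) / (ss_within / df_within)
p_value = f.sf(f_stat, df_between, df_within)

ref = f_oneway(*groups)
print(np.isclose(f_stat, ref.statistic), np.isclose(p_value, ref.pvalue))
```

The final line compares the hand-computed values with those returned by `f_oneway`.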

Beginning in SciPy 1.9, np.matrix inputs (not recommended for new code) are converted to np.ndarray before the calculation is performed. In this case, the output will be a scalar or np.ndarray of appropriate shape rather than a 2D np.matrix. Similarly, while masked elements of masked arrays are ignored, the output will be a scalar or np.ndarray rather than a masked array with mask=False.

Array API Standard Support

f_oneway has experimental support for Python Array API Standard compatible backends in addition to NumPy. Please consider testing these features by setting an environment variable SCIPY_ARRAY_API=1 and providing CuPy, PyTorch, JAX, or Dask arrays as array arguments. The following combinations of backend and device (or other capability) are supported.

====================  ====================  ====================
Library               CPU                   GPU
====================  ====================  ====================
NumPy                 ✅                     n/a                 
CuPy                  n/a                   ✅                   
PyTorch               ✅                     ⛔                   
JAX                   ⚠️ no JIT
Dask                  ✅                     n/a                 
====================  ====================  ====================

See dev-arrayapi for more information.

Examples

import numpy as np
from scipy.stats import f_oneway

Here are some data [3] on a shell measurement (the length of the anterior adductor muscle scar, standardized by dividing by length) in the mussel Mytilus trossulus from five locations: Tillamook, Oregon; Newport, Oregon; Petersburg, Alaska; Magadan, Russia; and Tvarminne, Finland, taken from a much larger data set used in McDonald et al. (1991).

tillamook = [0.0571, 0.0813, 0.0831, 0.0976, 0.0817, 0.0859, 0.0735,
             0.0659, 0.0923, 0.0836]
newport = [0.0873, 0.0662, 0.0672, 0.0819, 0.0749, 0.0649, 0.0835,
           0.0725]
petersburg = [0.0974, 0.1352, 0.0817, 0.1016, 0.0968, 0.1064, 0.105]
magadan = [0.1033, 0.0915, 0.0781, 0.0685, 0.0677, 0.0697, 0.0764,
           0.0689]
tvarminne = [0.0703, 0.1026, 0.0956, 0.0973, 0.1039, 0.1045]
f_oneway(tillamook, newport, petersburg, magadan, tvarminne)

`f_oneway` accepts multidimensional input arrays. When the inputs are multidimensional and `axis` is not given, the test is performed along the first axis of the input arrays. For the following data, the test is performed three times, once for each column.

a = np.array([[9.87, 9.03, 6.81],
              [7.18, 8.35, 7.00],
              [8.39, 7.58, 7.68],
              [7.45, 6.33, 9.35],
              [6.41, 7.10, 9.33],
              [8.00, 8.24, 8.44]])
b = np.array([[6.35, 7.30, 7.16],
              [6.65, 6.68, 7.63],
              [5.72, 7.73, 6.72],
              [7.01, 9.19, 7.41],
              [7.75, 7.87, 8.30],
              [6.90, 7.97, 6.97]])
c = np.array([[3.31, 8.77, 1.01],
              [8.25, 3.24, 3.62],
              [6.32, 8.81, 5.19],
              [7.48, 8.83, 8.91],
              [8.59, 6.01, 6.07],
              [3.07, 9.72, 7.48]])
F = f_oneway(a, b, c)
F.statistic
F.pvalue

Welch's ANOVA is performed if `equal_var` is False.

Aliases

  • scipy.stats.f_oneway

Referenced by