function

scipy.stats._hypotests:tukey_hsd

source: /scipy/stats/_hypotests.py :1949

Signature

def tukey_hsd(*args, equal_var=True)

Summary

Perform Tukey's HSD test for equality of means over multiple treatments.

Extended Summary

Tukey's honestly significant difference (HSD) test performs pairwise comparison of means for a set of samples. Whereas ANOVA (e.g. f_oneway) assesses whether the true means underlying each sample are identical, Tukey's HSD is a post hoc test used to compare the mean of each sample to the mean of each other sample.

The null hypothesis is that the distributions underlying the samples all have the same mean. The test statistic, which is computed for every possible pairing of samples, is simply the difference between the sample means. For each pair, the p-value is the probability under the null hypothesis (and other assumptions; see notes) of observing such an extreme value of the statistic, considering that many pairwise comparisons are being performed. Confidence intervals for the difference between each pair of means are also available.
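To make the definition of the test statistic concrete, the sketch below compares the `statistic` attribute of the result with pairwise differences of sample means computed directly with NumPy. It reuses the headache-medicine data from the Examples section and assumes that element (i, j) of the result corresponds to the comparison of group i with group j.

```python
import numpy as np
from scipy.stats import tukey_hsd

# Same data as in the Examples section of this page.
groups = [
    [24.5, 23.5, 26.4, 27.1, 29.9],
    [28.4, 34.2, 29.5, 32.2, 30.1],
    [26.1, 28.3, 24.3, 26.2, 27.8],
]

res = tukey_hsd(*groups)

# Difference of sample means for every ordered pair (i, j): mean_i - mean_j.
means = np.array([np.mean(g) for g in groups])
pairwise_diff = means[:, None] - means[None, :]

# Assumption being checked: res.statistic[i, j] equals mean_i - mean_j.
print(np.allclose(res.statistic, pairwise_diff))
```

The matrix is antisymmetric, so the comparison (j, i) carries the same magnitude as (i, j) with the opposite sign.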

Parameters

sample1, sample2, ... : array_like

The sample measurements for each group. There must be at least two arguments.

equal_var : bool, optional

If True (default) and the samples are of equal size, perform the Tukey HSD test [6]. If True and the sample sizes are unequal, perform the Tukey-Kramer test [4]. If False, perform the Games-Howell test [7], which does not assume equal variances.
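The following sketch shows how the two settings of `equal_var` might be exercised on groups of unequal size. The measurements are made up for illustration. Because older SciPy releases may not accept the `equal_var` keyword, the sketch guards that call rather than assuming it is available.

```python
from scipy.stats import tukey_hsd

# Made-up measurements; the groups deliberately differ in size and spread.
small = [24.5, 23.5, 26.4, 27.1, 29.9]
medium = [28.4, 34.2, 29.5, 32.2, 30.1, 31.0, 27.9]
wide = [20.1, 35.3, 24.3, 31.2, 27.8, 18.9]

# Default (equal_var=True) with unequal sample sizes: per the parameter
# description, this corresponds to the Tukey-Kramer test.
res_pooled = tukey_hsd(small, medium, wide)

# equal_var=False requests the Games-Howell test. This keyword may be
# missing in older SciPy versions, so fall back instead of failing.
try:
    res_gh = tukey_hsd(small, medium, wide, equal_var=False)
except TypeError:
    res_gh = None

print(res_pooled.pvalue)
if res_gh is not None:
    print(res_gh.pvalue)
```

When the group variances look clearly different, comparing the two p-value matrices is one way to see how much the equal-variance assumption affects the conclusions.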

Returns

result : `~scipy.stats._result_classes.TukeyHSDResult` instance

The return value is an object with the following attributes:

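statistic : float ndarray

The computed statistic of the test for each comparison. The element at index (i, j) is the statistic for the comparison between groups i and j.

pvalue : float ndarray

The computed p-value of the test for each comparison. The element at index (i, j) is the p-value for the comparison between groups i and j.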

The object has the following methods:

confidence_interval(confidence_level=0.95):

Compute the confidence interval for the specified confidence level.

Notes

The use of this test relies on several assumptions.

  • The observations are independent within and among groups.

  • The observations within each group are normally distributed.

  • The distributions from which the samples are drawn have the same finite variance.

The original formulation of the test was for samples of equal size drawn from populations assumed to have equal variances [6]. In case of unequal sample sizes, the test uses the Tukey-Kramer method [4]. When equal variances are not assumed (equal_var=False), the test uses the Games-Howell method [7].
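For the original equal-size, equal-variance case, the textbook computation can be sketched by hand: divide each absolute mean difference by the standard error derived from the pooled within-group variance, then evaluate the survival function of the studentized range distribution. This is an illustrative reconstruction of the classical procedure using `scipy.stats.studentized_range`, not a statement of how SciPy implements `tukey_hsd` internally; the variable names are chosen for this sketch.

```python
import numpy as np
from scipy.stats import studentized_range, tukey_hsd

# Same data as in the Examples section; all groups have equal size.
groups = [
    np.array([24.5, 23.5, 26.4, 27.1, 29.9]),
    np.array([28.4, 34.2, 29.5, 32.2, 30.1]),
    np.array([26.1, 28.3, 24.3, 26.2, 27.8]),
]

k = len(groups)        # number of treatments
n = len(groups[0])     # common sample size
df = k * n - k         # within-group degrees of freedom

means = np.array([g.mean() for g in groups])

# Pooled within-group variance (mean squared error). With equal sample
# sizes this is the plain average of the unbiased sample variances.
mse = np.mean([g.var(ddof=1) for g in groups])

# Studentized range statistic for every pair of groups.
q = np.abs(means[:, None] - means[None, :]) / np.sqrt(mse / n)

# p-value: probability of a studentized range at least this large.
p_manual = studentized_range.sf(q, k, df)

res = tukey_hsd(*groups)
print(np.allclose(p_manual, res.pvalue, rtol=1e-4))
```

The diagonal of `q` is zero (each group compared with itself), so the corresponding p-values equal 1.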

Array API Standard Support

tukey_hsd has experimental support for Python Array API Standard compatible backends in addition to NumPy. Please consider testing these features by setting an environment variable SCIPY_ARRAY_API=1 and providing CuPy, PyTorch, JAX, or Dask arrays as array arguments. The following combinations of backend and device (or other capability) are supported.

====================  ====================  ====================
Library               CPU                   GPU
====================  ====================  ====================
NumPy                 ✅                     n/a                 
CuPy                  n/a                   ⛔                   
PyTorch               ⛔                     ⛔                   
JAX                   ⛔                     ⛔                   
Dask                  ⛔                     n/a                 
====================  ====================  ====================

See dev-arrayapi for more information.

Examples

Here are some data comparing the time to relief of three brands of headache medicine, reported in minutes. Data adapted from [3].
import numpy as np
from scipy.stats import tukey_hsd
group0 = [24.5, 23.5, 26.4, 27.1, 29.9]
group1 = [28.4, 34.2, 29.5, 32.2, 30.1]
group2 = [26.1, 28.3, 24.3, 26.2, 27.8]
We would like to see if the means between any of the groups are significantly different. First, visually examine a box and whisker plot.
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1, 1)
ax.boxplot([group0, group1, group2])
plt.show()
[Figure: box and whisker plot of group0, group1, and group2]
From the box and whisker plot, we can see that the range of the first group (``group0``) overlaps those of the second (``group1``) and third (``group2``), but we can apply the ``tukey_hsd`` test to determine whether the differences between means are significant. We set a significance level of 0.05 for rejecting the null hypothesis.
res = tukey_hsd(group0, group1, group2)
print(res)
The null hypothesis is that each group has the same mean. The p-values for the comparisons between ``group0`` and ``group1`` and between ``group1`` and ``group2`` do not exceed 0.05, so for those pairs we reject the null hypothesis that the means are equal. The p-value for the comparison between ``group0`` and ``group2`` exceeds 0.05, so we fail to reject the null hypothesis: the data show no significant difference between their means. We can also compute the confidence interval associated with a chosen confidence level.
group0 = [24.5, 23.5, 26.4, 27.1, 29.9]
group1 = [28.4, 34.2, 29.5, 32.2, 30.1]
group2 = [26.1, 28.3, 24.3, 26.2, 27.8]
res = tukey_hsd(group0, group1, group2)
conf = res.confidence_interval(confidence_level=.99)
for ((i, j), l) in np.ndenumerate(conf.low):
    # filter out self comparisons
    if i != j:
        h = conf.high[i,j]
        print(f"({i} - {j}) {l:>6.3f} {h:>6.3f}")
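The decisions described in this example can also be read programmatically from the result arrays. The sketch below visits each unordered pair once, compares its p-value with the significance level, and reports whether the matching confidence interval excludes zero. The names `alpha` and `significant` are introduced here for illustration only.

```python
import numpy as np
from scipy.stats import tukey_hsd

group0 = [24.5, 23.5, 26.4, 27.1, 29.9]
group1 = [28.4, 34.2, 29.5, 32.2, 30.1]
group2 = [26.1, 28.3, 24.3, 26.2, 27.8]

res = tukey_hsd(group0, group1, group2)

alpha = 0.05  # significance level used in the example above
ci = res.confidence_interval(confidence_level=1 - alpha)

significant = []
# Upper-triangle indices visit each unordered pair (i < j) exactly once.
for i, j in zip(*np.triu_indices(3, k=1)):
    rejects = res.pvalue[i, j] < alpha
    excludes_zero = not (ci.low[i, j] <= 0 <= ci.high[i, j])
    if rejects:
        significant.append((int(i), int(j)))
    print(f"({i} - {j}) p={res.pvalue[i, j]:.3f} "
          f"reject={rejects} CI excludes 0={excludes_zero}")
```

For a given pair, rejecting at level `alpha` and a `1 - alpha` confidence interval that excludes zero are two views of the same decision, so the two flags are expected to agree.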

See also

dunnett

performs comparison of means against a control group.

Aliases

  • scipy.stats.tukey_hsd
