function
scipy.stats._stats_py:ks_1samp
source: /scipy/stats/_stats_py.py :7417
Signature
def ks_1samp(x, cdf, args=(), alternative='two-sided', method='auto', *, axis=0, nan_policy='propagate', keepdims=False)
Summary
Performs the one-sample Kolmogorov-Smirnov test for goodness of fit.
Extended Summary
This test compares the underlying distribution F(x) of a sample against a given continuous distribution G(x). See Notes for a description of the available null and alternative hypotheses.
Parameters
x: array_like
    A 1-D array of observations of iid random variables.
cdf: callable
    Callable used to calculate the cdf.
args: tuple, sequence, optional
    Distribution parameters, used with cdf.
alternative: {'two-sided', 'less', 'greater'}, optional
    Defines the null and alternative hypotheses. Default is 'two-sided'. Please see explanations in the Notes below.
method: {'auto', 'exact', 'approx', 'asymp'}, optional
    Defines the distribution used for calculating the p-value. The following options are available (default is 'auto'):
    - 'auto': selects one of the other options.
    - 'exact': uses the exact distribution of test statistic.
    - 'approx': approximates the two-sided probability with twice the one-sided probability.
    - 'asymp': uses asymptotic distribution of test statistic.
axis: int or None, default: 0
    If an int, the axis of the input along which to compute the statistic. The statistic of each axis-slice (e.g. row) of the input will appear in a corresponding element of the output. If None, the input will be raveled before computing the statistic.
nan_policy: {'propagate', 'omit', 'raise'}
    Defines how to handle input NaNs.
    - propagate: if a NaN is present in the axis slice (e.g. row) along which the statistic is computed, the corresponding entry of the output will be NaN.
    - omit: NaNs will be omitted when performing the calculation. If insufficient data remains in the axis slice along which the statistic is computed, the corresponding entry of the output will be NaN.
    - raise: if a NaN is present, a ValueError will be raised.
keepdims: bool, default: False
    If this is set to True, the axes which are reduced are left in the result as dimensions with size one. With this option, the result will broadcast correctly against the input array.
Returns
res: KstestResult
    An object containing attributes:
    - statistic
    - pvalue
    - statistic_location
    - statistic_sign
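As a rough illustration of what these attributes hold, the sketch below recomputes the two-sided statistic from the textbook definition, D+ = max_i (i/n - G(x_(i))) and D- = max_i (G(x_(i)) - (i-1)/n) over the sorted sample, and compares it with the returned result. The seed and sample size are arbitrary assumptions. The reading of `statistic_sign` as +1 when D+ is the larger deviation and -1 otherwise is likewise an assumption here, not something this page states.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2024)  # fixed seed, only for reproducibility
x = stats.norm.rvs(size=100, random_state=rng)
res = stats.ks_1samp(x, stats.norm.cdf)

# Recompute the statistic from the definition on the sorted sample.
xs = np.sort(x)
n = xs.size
cdf_vals = stats.norm.cdf(xs)
d_plus = np.max(np.arange(1, n + 1) / n - cdf_vals)   # ECDF above the cdf
d_minus = np.max(cdf_vals - np.arange(0, n) / n)      # ECDF below the cdf
d = max(d_plus, d_minus)
sign = 1 if d_plus > d_minus else -1

print("statistic:", res.statistic, "recomputed:", d)
print("pvalue:", res.pvalue)
print("statistic_location:", res.statistic_location)
print("statistic_sign:", res.statistic_sign, "recomputed:", sign)
```

The location is one of the observations in `x`, since the empirical distribution function only jumps at sample points.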
Notes
There are three options for the null and corresponding alternative hypothesis that can be selected using the alternative parameter.
- two-sided: The null hypothesis is that the two distributions are identical, F(x)=G(x) for all x; the alternative is that they are not identical.
- less: The null hypothesis is that F(x) >= G(x) for all x; the alternative is that F(x) < G(x) for at least one x.
- greater: The null hypothesis is that F(x) <= G(x) for all x; the alternative is that F(x) > G(x) for at least one x.
Note that the alternative hypotheses describe the CDFs of the underlying distributions, not the observed values. For example, suppose x1 ~ F and x2 ~ G. If F(x) > G(x) for all x, the values in x1 tend to be less than those in x2.
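The direction of the one-sided alternatives is easy to get backwards, so here is a small sketch under stated assumptions: the sample is drawn from a normal distribution shifted left by 0.5, with a seed and size picked only for illustration. Its CDF F lies above the standard normal CDF G everywhere, so the observed values tend to be smaller, and 'greater' is the alternative that should be detected.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)  # fixed seed, only for reproducibility

# Sample from N(-0.5, 1): F(x) > G(x) for all x, where G is the standard normal CDF.
x = stats.norm.rvs(loc=-0.5, size=200, random_state=rng)

p_greater = stats.ks_1samp(x, stats.norm.cdf, alternative='greater').pvalue
p_less = stats.ks_1samp(x, stats.norm.cdf, alternative='less').pvalue

print("sample mean:", np.mean(x))      # negative: values tend to be smaller
print("p (greater):", p_greater)       # small: evidence that F(x) > G(x) somewhere
print("p (less):   ", p_less)          # large: no evidence that F(x) < G(x)
```

In other words, 'greater' refers to the sample's CDF being larger, which corresponds to the sample's values being smaller.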
Beginning in SciPy 1.9, np.matrix inputs (not recommended for new code) are converted to np.ndarray before the calculation is performed. In this case, the output will be a scalar or np.ndarray of appropriate shape rather than a 2D np.matrix. Similarly, while masked elements of masked arrays are ignored, the output will be a scalar or np.ndarray rather than a masked array with mask=False.
Array API Standard Support
ks_1samp has experimental support for Python Array API Standard compatible backends in addition to NumPy. Please consider testing these features by setting an environment variable SCIPY_ARRAY_API=1 and providing CuPy, PyTorch, JAX, or Dask arrays as array arguments. The following combinations of backend and device (or other capability) are supported.
====================  ====================  ====================
Library               CPU                   GPU
====================  ====================  ====================
NumPy                 ✅                     n/a
CuPy                  n/a                   ⛔
PyTorch               ✅                     ⛔
JAX                   ⚠️ no JIT              ⛔
Dask                  ⛔                     n/a
====================  ====================  ====================
See dev-arrayapi for more information.
Examples
Suppose we wish to test the null hypothesis that a sample is distributed according to the standard normal. We choose a confidence level of 95%; that is, we will reject the null hypothesis in favor of the alternative if the p-value is less than 0.05. When testing uniformly distributed data, we would expect the null hypothesis to be rejected.
    import numpy as np
    from scipy import stats
    rng = np.random.default_rng()
    # Uniform data tested against the standard normal cdf.
    stats.ks_1samp(stats.uniform.rvs(size=100, random_state=rng), stats.norm.cdf)
    # Standard normal data tested against the standard normal cdf.
    x = stats.norm.rvs(size=100, random_state=rng)
    stats.ks_1samp(x, stats.norm.cdf)
    # Normal data shifted to the right, tested with the one-sided 'less' alternative.
    x = stats.norm.rvs(size=100, loc=0.5, random_state=rng)
    stats.ks_1samp(x, stats.norm.cdf, alternative='less')
See also
- ks_2samp
- kstest
Aliases
- scipy.stats.ks_1samp