class
scipy.stats._continuous_distns:rv_histogram
Signature
class rv_histogram(histogram, *args, density=None, **kwargs)
Summary
Generates a distribution given by a histogram. This is useful for generating a template distribution from a binned data sample.
Extended Summary
As a subclass of the rv_continuous class, rv_histogram inherits a collection of generic methods from it (see rv_continuous for the full list) and implements them based on the properties of the provided binned data sample.
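As a rough illustration of those inherited methods, the sketch below builds an rv_histogram from synthetic data and calls a few generic rv_continuous methods (ppf, cdf, rvs, mean, support). The exponential sample, the seed, and the bin count are arbitrary choices made for this example, not part of the API.

```python
import numpy as np
from scipy import stats

# Synthetic, illustrative data: any binned sample would do here.
rng = np.random.default_rng(42)
data = rng.exponential(scale=2.0, size=5_000)

# np.histogram returns (counts, bin_edges), which rv_histogram accepts directly.
hist = np.histogram(data, bins=50)
dist = stats.rv_histogram(hist, density=False)

# Generic methods inherited from rv_continuous.
q = np.array([0.1, 0.5, 0.9])
quantiles = dist.ppf(q)            # inverse cdf
roundtrip = dist.cdf(quantiles)    # should recover q
samples = dist.rvs(size=1_000, random_state=rng)
lower, upper = dist.support()      # outermost bin edges
bin_width = hist[1][1] - hist[1][0]

print("quantiles:", quantiles)
print("mean of distribution:", dist.mean(), "sample mean:", data.mean())
```

Because each observation is effectively spread uniformly over its bin, the distribution's mean can differ from the sample mean, but only by a fraction of a bin width.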
Parameters
histogram: tuple of array_like
    Tuple containing two array_like objects. The first contains the content of n bins; the second contains the (n+1) bin boundaries. In particular, the return value of numpy.histogram is accepted.
density: bool, optional
    If False, assumes the histogram is proportional to counts per bin; otherwise, assumes it is proportional to a density. For constant bin widths, these are equivalent, but the distinction is important when bin widths vary (see Notes). If None (default), sets density=True for backwards compatibility, but warns if the bin widths are variable. Set density explicitly to silence the warning.
Notes
When a histogram has unequal bin widths, there is a distinction between histograms that are proportional to counts per bin and histograms that are proportional to probability density over a bin. If numpy.histogram is called with its default density=False, the resulting histogram is the number of counts per bin, so density=False should be passed to rv_histogram. If numpy.histogram is called with density=True, the resulting histogram is in terms of probability density, so density=True should be passed to rv_histogram. To avoid warnings, always pass density explicitly when the input histogram has unequal bin widths.
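The distinction can be checked numerically. The following is a minimal sketch, assuming a SciPy version that supports the density keyword; the bin edges, seed, and sample are made up for illustration. Counts passed with density=False and densities passed with density=True should describe the same distribution, while counts mislabeled as densities should not.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(size=10_000)

# Deliberately unequal bin widths (illustrative values).
edges = np.array([-5.0, -2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 5.0])

counts, _ = np.histogram(data, bins=edges)                # counts per bin
dens, _ = np.histogram(data, bins=edges, density=True)    # probability density

# Matching the density flag to how the histogram was built.
dist_counts = stats.rv_histogram((counts, edges), density=False)
dist_dens = stats.rv_histogram((dens, edges), density=True)

# Mismatch: counts interpreted as if they were densities.
dist_wrong = stats.rv_histogram((counts, edges), density=True)

x = np.linspace(-4.0, 4.0, 9)
consistent = np.allclose(dist_counts.pdf(x), dist_dens.pdf(x))
mismatched = np.allclose(dist_wrong.pdf(x), dist_dens.pdf(x))
print("matching flags agree:", consistent)
print("mismatched flag agrees:", mismatched)
```

In the mismatched case the wide outer bins receive too much weight, because their counts are not divided by the bin width before normalization.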
There are no additional shape parameters apart from loc and scale. The pdf is defined as a stepwise function from the provided histogram. The cdf is piecewise linear: it interpolates linearly between the cumulative probabilities at the bin edges.
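To make the stepwise pdf and the interpolated cdf concrete, here is a small sketch with a hand-made four-bin histogram (the counts and edges are invented for the example). It rebuilds both functions with plain NumPy and compares them to rv_histogram; the manual reconstruction is an illustration of the idea, not a statement about SciPy's internal code.

```python
import numpy as np
from scipy import stats

# Hand-made histogram: 4 bins of width 1 on [0, 4] (illustrative values).
counts = np.array([1.0, 3.0, 4.0, 2.0])
edges = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
dist = stats.rv_histogram((counts, edges), density=False)

# Manual reconstruction: normalized bin heights and cumulative mass at edges.
widths = np.diff(edges)
heights = counts / widths
heights = heights / np.sum(heights * widths)       # [0.1, 0.3, 0.4, 0.2]
cdf_at_edges = np.concatenate([[0.0], np.cumsum(heights * widths)])

x = np.array([0.5, 1.5, 2.5, 3.5])                 # bin midpoints
# Stepwise pdf: look up the height of the bin containing each x.
manual_pdf = heights[np.searchsorted(edges, x, side="right") - 1]
# Piecewise-linear cdf: interpolate cumulative mass between bin edges.
manual_cdf = np.interp(x, edges, cdf_at_edges)     # approx. [0.05, 0.25, 0.6, 0.9]

# loc and scale behave as for any rv_continuous distribution:
# pdf(x, loc, scale) = pdf((x - loc) / scale) / scale.
shifted = dist.pdf(4.0, loc=1.0, scale=2.0)        # approx. 0.3 / 2 = 0.15

print(dist.pdf(x), manual_pdf)
print(dist.cdf(x), manual_cdf)
```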
Examples
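Create a scipy.stats distribution from a numpy histogram:

    import scipy.stats
    import numpy as np
    # Draw a reproducible normal sample and bin it into 100 equal-width bins
    data = scipy.stats.norm.rvs(size=100000, loc=0, scale=1.5, random_state=123)
    hist = np.histogram(data, bins=100)
    # np.histogram returned counts per bin, so pass density=False
    hist_dist = scipy.stats.rv_histogram(hist, density=False)

    # Evaluate the pdf and cdf like any other rv_continuous distribution
    hist_dist.pdf(1.0)
    hist_dist.cdf(2.0)

    # Evaluate at the outermost bin edges, which np.histogram places at the
    # max and min of the data
    hist_dist.pdf(np.max(data))
    hist_dist.cdf(np.max(data))
    hist_dist.pdf(np.min(data))
    hist_dist.cdf(np.min(data))

    # Plot the pdf and cdf over a histogram of the data
    import matplotlib.pyplot as plt
    X = np.linspace(-5.0, 5.0, 100)
    fig, ax = plt.subplots()

    ax.set_title("PDF from Template")
    ax.hist(data, density=True, bins=100)
    ax.plot(X, hist_dist.pdf(X), label='PDF')
    ax.plot(X, hist_dist.cdf(X), label='CDF')
    ax.legend()

    fig.show()
Aliases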
- scipy.stats.rv_histogram