SciPy - Binomial Distribution

The binomial distribution is a discrete probability distribution that gives the probability of a given number of successes in a sequence of n independent trials, each with the same known probability of success.

The probability mass function (pmf) of the binomial distribution is defined as:

$$P(X = k) = \binom{n}{k} p^k q^{n-k}, \quad k = 0, 1, \dots, n$$

Where,

p is the probability of success in each trial
q is the probability of failure in each trial, q = 1 - p
n is the number of trials
k is the number of successes which can occur anywhere among the n trials

A binomial distribution has mean np and variance npq.
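The pmf and the two moments above can be checked numerically. The following is a minimal sketch, assuming illustrative values n = 20, p = 0.5 and k = 7 (chosen here, not prescribed by the text); it writes the pmf out as C(n, k) * p^k * q^(n - k) and compares it with what scipy.stats.binom reports.

```python
from math import comb
from scipy.stats import binom

# Illustrative values (an assumption for this sketch)
n, p = 20, 0.5
q = 1 - p
k = 7

# pmf written out by hand: C(n, k) * p^k * q^(n - k)
manual_pmf = comb(n, k) * p**k * q**(n - k)

# The same quantity from SciPy
scipy_pmf = binom.pmf(k, n, p)

# Mean and variance as reported by SciPy ('mv' = mean, variance)
mean, var = binom.stats(n, p, moments='mv')

print(manual_pmf, scipy_pmf)
print(float(mean), n * p)      # both should equal np
print(float(var), n * p * q)   # both should equal npq
```

The two pmf values should agree up to floating-point rounding, and the reported mean and variance should match np and npq.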

The cumulative distribution function (cdf), evaluated at k, is the probability that the random variable X takes a value less than or equal to k. The cdf of the binomial distribution is defined as:

$$F(k; n, p) = P(X \le k) = \sum_{i=0}^{\lfloor k \rfloor} \binom{n}{i} p^i q^{n-i}$$

Where $\lfloor k \rfloor$ is the greatest integer less than or equal to k.
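Because the cdf is a running sum of the pmf up to the greatest integer not exceeding k, it can be reproduced by summing pmf values. The sketch below assumes illustrative values n = 20, p = 0.5 and a deliberately non-integer k = 7.6 to show the effect of rounding down.

```python
import math
import numpy as np
from scipy.stats import binom

# Illustrative values (an assumption for this sketch)
n, p = 20, 0.5
k = 7.6  # non-integer on purpose

# Greatest integer less than or equal to k
k_floor = math.floor(k)

# Sum the pmf over 0, 1, ..., floor(k)
manual_cdf = binom.pmf(np.arange(0, k_floor + 1), n, p).sum()

# cdf evaluated directly at the non-integer k
scipy_cdf = binom.cdf(k, n, p)

print(manual_cdf, scipy_cdf)
print(binom.cdf(k_floor, n, p))  # same value as at k = 7.6
```

All three printed values should coincide up to rounding, since the cdf is constant between consecutive integers.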

The scipy.stats.binom object provides the methods needed to generate and work with a binomial distribution. The most frequently used methods are listed below:

Syntax

scipy.stats.binom.pmf(k, n, p, loc=0)
scipy.stats.binom.cdf(k, n, p, loc=0)
scipy.stats.binom.ppf(q, n, p, loc=0)
scipy.stats.binom.rvs(n, p, loc=0, size=1)

Parameters

`k`	`Required.` Specify the number of successes at which to evaluate the function. float or array_like of floats.
`q`	`Required.` Specify the lower-tail probability (used by ppf; not the failure probability q = 1 - p). float or array_like of floats.
`n`	`Required.` Specify number of trials, must be >= 0. Floats are also accepted, but they will be truncated to integers.
`p`	`Required.` Specify probability of success in each trial, must be in range [0, 1]. float or array_like of floats.
`loc`	`Optional.` Specify the location of the distribution. Default is 0.
`size`	`Optional.` Specify output shape.
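As a small illustration of the loc parameter, the sketch below assumes n = 20, p = 0.5 and loc = 3 (illustrative values). It shows that loc shifts the distribution along the k axis, so evaluating the pmf at k with loc set gives the same result as evaluating it at k - loc without it.

```python
import numpy as np
from scipy.stats import binom

# Illustrative values (an assumption for this sketch)
n, p, loc = 20, 0.5, 3

# Shifted support runs from loc to n + loc; include the values below loc too
k = np.arange(0, n + loc + 1)

shifted = binom.pmf(k, n, p, loc=loc)   # pmf of the shifted distribution
unshifted = binom.pmf(k - loc, n, p)    # standard pmf evaluated at k - loc

print(np.allclose(shifted, unshifted))
print(shifted[:loc])                    # zero below the shifted support
print(binom.mean(n, p, loc=loc))        # mean moves from np to np + loc
```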

binom.pmf()

The binom.pmf() function evaluates the probability mass function (pmf) of the distribution at k.

from scipy.stats import binom
import matplotlib.pyplot as plt
import numpy as np

#creating an array of integers from
#0 to 20 (inclusive) with a difference of 1
x = np.arange(0, 21, 1)
   
y = binom.pmf(x, 20, 0.5)
   
plt.plot(x, y, 'bo') 
plt.show()

The code above plots the pmf as blue dots, one per integer k, peaking at the mean np = 10.
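Apart from plotting, the pmf values can be inspected numerically. The following sketch assumes the same n = 20 and p = 0.5; it evaluates the pmf over the full support 0, 1, ..., n (note that np.arange needs n + 1 as its stop value to include k = n), then checks that the probabilities sum to 1 and finds the most likely number of successes.

```python
import numpy as np
from scipy.stats import binom

n, p = 20, 0.5

# Full support of the distribution: 0, 1, ..., n
k = np.arange(0, n + 1)
pmf = binom.pmf(k, n, p)

# Probabilities over the full support add up to 1
total = pmf.sum()

# Value of k with the highest probability (the mode)
most_likely = k[np.argmax(pmf)]

print(total)
print(most_likely)
```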

binom.cdf()

The binom.cdf() function evaluates the cumulative distribution function (cdf) of the distribution at k.

from scipy.stats import binom
import matplotlib.pyplot as plt
import numpy as np

#creating an array of values between
#0 to 20 with a difference of 0.01
x = np.arange(0, 20, 0.01)
   
y = binom.cdf(x, 20, 0.5)
   
plt.plot(x, y) 
plt.show()

The code above plots the cdf as a non-decreasing step function that jumps at each integer k and rises from near 0 to near 1.
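A common use of the cdf is computing the probability of a range of outcomes. The sketch below, assuming n = 20, p = 0.5 and the illustrative range 8 to 12, obtains P(8 <= X <= 12) as a difference of two cdf values and cross-checks it against a direct sum of the pmf. It also uses binom.sf(), the survival function (1 - cdf), for the upper tail.

```python
import numpy as np
from scipy.stats import binom

n, p = 20, 0.5

# P(8 <= X <= 12) = P(X <= 12) - P(X <= 7)
p_interval = binom.cdf(12, n, p) - binom.cdf(7, n, p)

# Cross-check: add up the pmf over k = 8, ..., 12
p_direct = binom.pmf(np.arange(8, 13), n, p).sum()

# Upper tail P(X > 12) via the survival function
p_upper = binom.sf(12, n, p)

print(p_interval, p_direct)
print(p_upper, 1 - binom.cdf(12, n, p))
```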

binom.ppf()

The binom.ppf() function is the percent point function (the inverse of the cdf). It takes a probability q and returns the smallest value k for which the cdf is greater than or equal to q.

from scipy.stats import binom
import matplotlib.pyplot as plt
import numpy as np

#creating an array of probability from
#0 to 1 with a difference of 0.001
x = np.arange(0, 1, 0.001)
   
y = binom.ppf(x, 20, 0.5)
   
plt.plot(x, y) 
plt.show()

The code above plots the percent point function as a non-decreasing step function of the probability, taking integer values.
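The relationship between ppf and cdf can be verified directly. The following sketch assumes n = 20, p = 0.5 and an illustrative probability of 0.9; it checks that the returned k is the smallest value whose cdf reaches that probability.

```python
from scipy.stats import binom

n, p = 20, 0.5
prob = 0.9  # illustrative lower-tail probability

# Smallest k such that P(X <= k) >= prob (returned as a float)
k = binom.ppf(prob, n, p)

# cdf at k reaches prob, while cdf at k - 1 falls short of it
reaches = binom.cdf(k, n, p) >= prob
falls_short = binom.cdf(k - 1, n, p) < prob

print(k)
print(reaches, falls_short)
```

Both checks should print True, which is what makes ppf the inverse of the cdf for a discrete distribution.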

binom.rvs()

The binom.rvs() function generates an array containing the specified number of random values drawn from the given binomial distribution. In the example below, a histogram is plotted to visualize the result.

from scipy.stats import binom
import matplotlib.pyplot as plt
import numpy as np

#fixing the seed for reproducibility
#of the result
np.random.seed(10)

#creating a vector containing 10000
#random values from binomial distribution
y = binom.rvs(20, 0.5, loc=0, size=10000)

#creating bin edges (named bins to avoid
#shadowing the built-in bin function)
bins = np.arange(0, 25, 1)

plt.hist(y, bins=bins, edgecolor='blue')
plt.show()

The code above displays a histogram of the 10000 sampled values, roughly bell-shaped and centered near the mean np = 10.
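As an alternative to seeding NumPy's global random state, binom.rvs() also accepts a random_state argument. The sketch below assumes n = 20, p = 0.5 and a seed of 10 (all illustrative); it passes a NumPy Generator for reproducibility and compares the sample mean and variance with the theoretical values np and npq.

```python
import numpy as np
from scipy.stats import binom

n, p = 20, 0.5

# Local random generator; the seed value is arbitrary
rng = np.random.default_rng(10)

# Draw 10000 values from the binomial distribution
sample = binom.rvs(n, p, size=10000, random_state=rng)

sample_mean = sample.mean()
sample_var = sample.var()

print(sample_mean, n * p)            # sample mean vs. np
print(sample_var, n * p * (1 - p))   # sample variance vs. npq
```

With 10000 draws the sample statistics should land close to, but not exactly on, the theoretical values.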