Matplotlib Tutorial

Matplotlib - Histogram



A histogram is a graphical representation of the distribution of numerical data. It is constructed in two steps:

  • Bin (or bucket) the range of values, that is, divide the entire range of values into a series of intervals.
  • Count how many values fall into each interval.

The bins are usually specified as consecutive, non-overlapping intervals of a variable. The bins (intervals) must be adjacent and are often (but not required to be) of equal size.
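
The binning and counting steps can be sketched without any plotting. The following is a minimal illustration that assumes NumPy's numpy.histogram function for the counting; the data values and both sets of bin edges are made up for demonstration.

```python
import numpy as np

# Illustrative data (not taken from any real measurement)
values = np.array([45, 64, 5, 22, 55, 89, 59, 35, 78, 42, 34, 15])

# Equal-width bins: consecutive, non-overlapping, adjacent intervals
equal_edges = np.array([0, 20, 40, 60, 80, 100])
equal_counts, _ = np.histogram(values, bins=equal_edges)
print(equal_counts)    # [2 3 4 2 1]

# Bins do not have to be of equal width
unequal_edges = np.array([0, 50, 60, 100])
unequal_counts, _ = np.histogram(values, bins=unequal_edges)
print(unequal_counts)  # [7 2 3]
```

In both cases every value lands in exactly one bin, so the counts add up to the number of values. Each bin includes its left edge and excludes its right edge, except the last bin, which includes both.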

The matplotlib.pyplot.hist() function computes and draws the histogram of x, where x is an array or a sequence of arrays.

Syntax

matplotlib.pyplot.hist(x, bins=None, 
                       range=None, density=False, 
                       cumulative=False, histtype='bar', 
                       color=None, edgecolor=None)

Parameters

x Required. Specify array or sequence of arrays.
bins Optional. Specify the bins. It can be an int, a sequence, or a str.
If bins is an int, it defines the number of equal-width bins in the given range.
If bins is a sequence, it defines a monotonically increasing array of bin edges, including the rightmost edge, allowing for non-uniform bin widths.
If bins is a str, it names one of the binning strategies supported by numpy.histogram_bin_edges, such as 'auto' or 'sturges'.
range Optional. Specify the lower and upper range of the bins. If not provided, range is simply (x.min(), x.max()). Values outside the range are ignored. It has no effect if bins is a sequence.
density Optional. If it is set to False, the result will contain the number of samples in each bin. If True, each bin holds a probability density: its count divided by the total number of samples and by the bin width, so that the area under the histogram equals 1.
cumulative Optional. If it is set to True, the histogram is computed where each bin gives the counts in that bin plus all bins for smaller values.
histtype Optional. It can take one of the values {'bar', 'barstacked', 'step', 'stepfilled'}.
  • 'bar' - bar-type histogram. If multiple data are given the bars are arranged side by side.
  • 'barstacked' - bar-type histogram where multiple data are stacked on top of each other.
  • 'step' - generates a lineplot that is by default unfilled.
  • 'stepfilled' - generates a lineplot that is by default filled.
color Optional. Specify color or sequence of colors, one per dataset.
edgecolor Optional. Specify the color of the bar borders.
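
To see how range and density change the computed values, the return values of hist() can be inspected directly. The sketch below relies on hist() returning the bin values, the bin edges, and the drawn patches. The sample data is illustrative, and the Agg backend is selected only on the assumption that no display is available.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data
data = np.array([45, 64, 5, 22, 55, 89, 59, 35, 78, 42, 34, 15])

# range=(0, 50) with bins=5 gives the edges 0, 10, ..., 50;
# values above 50 are ignored
counts, edges, _ = plt.hist(data, bins=5, range=(0, 50))
print(counts)  # counts per bin: 1, 1, 1, 2, 2
plt.close()

# density=True rescales the bars so that the total area equals 1
dens, dens_edges, _ = plt.hist(data, bins=[0, 20, 40, 60, 80, 100], density=True)
area = np.sum(dens * np.diff(dens_edges))
print(area)    # approximately 1.0
plt.close()
```

Only seven of the twelve values fall inside the range (0, 50), so the counts in the first call sum to 7 rather than 12.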

Example: creating histogram

In the example below, the hist() function is used to create the histogram. The bins parameter is given a sequence of bin edges, which produces five bins of width 20 covering the range 0 to 100.

import matplotlib.pyplot as plt
import numpy as np

#creating dataset
Arr = np.array([45,64,5,22,55,89,59,35,78,42,34,15])
#creating bins
b = np.array([0,20,40,60,80,100])

#drawing histogram
plt.hist(Arr, bins = b, color="blue", edgecolor='darkblue') 
plt.title("Histogram") 

plt.show()

The output of the above code will be:

[Figure: Histogram]
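
The 'bar' and 'barstacked' values of histtype differ visibly only when several datasets are passed at once. The following is a minimal sketch of that comparison. The second dataset (group_b) is hypothetical and exists only for illustration, and the Agg backend is an assumption made so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

group_a = np.array([45, 64, 5, 22, 55, 89, 59, 35, 78, 42, 34, 15])
group_b = np.array([12, 38, 51, 67, 73, 29, 95, 48])  # hypothetical second dataset
edges = [0, 20, 40, 60, 80, 100]
colors = ['blue', 'orange']  # one color per dataset

fig, (ax_side, ax_stack) = plt.subplots(1, 2, figsize=(10, 4))

# 'bar': the bars of the two datasets are drawn side by side in each bin
n_side, _, _ = ax_side.hist([group_a, group_b], bins=edges,
                            histtype='bar', color=colors, edgecolor='black')
ax_side.set_title("histtype='bar'")

# 'barstacked': the bars of the second dataset sit on top of the first
ax_stack.hist([group_a, group_b], bins=edges,
              histtype='barstacked', color=colors, edgecolor='black')
ax_stack.set_title("histtype='barstacked'")

# Per-dataset counts for the side-by-side version
side_counts = np.asarray(n_side)
print(side_counts[0])  # counts for group_a: 2, 3, 4, 2, 1
print(side_counts[1])  # counts for group_b: 1, 2, 2, 2, 1
```

With an interactive backend, calling plt.show() at the end displays both panels next to each other.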

Example: creating cumulative histogram

Setting cumulative to True creates a cumulative histogram, as shown in the example below.

import matplotlib.pyplot as plt
import numpy as np

#creating dataset
Arr = np.array([45,64,5,22,55,89,59,35,78,42,34,15])
#creating bins
b = np.array([0,20,40,60,80,100])

fig, ax = plt.subplots()
ax.set_xlabel('Marks')
ax.set_ylabel('Number of Students')
ax.set_title("Cumulative Histogram") 

#drawing histogram
ax.hist(Arr, bins = b, cumulative=True, color="darkblue", edgecolor='white') 

plt.show()

The output of the above code will be:

[Figure: Cumulative Histogram]
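
The cumulative bar heights are the running totals of the per-bin counts. A minimal check of this relationship, assuming the same data and bin edges as in the example and a non-interactive backend so that no window is opened:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

Arr = np.array([45, 64, 5, 22, 55, 89, 59, 35, 78, 42, 34, 15])
b = np.array([0, 20, 40, 60, 80, 100])

# Plain histogram: number of values in each bin
plain, _, _ = plt.hist(Arr, bins=b)
plt.close()

# Cumulative histogram: each bin also includes all bins to its left
cumul, _, _ = plt.hist(Arr, bins=b, cumulative=True)
plt.close()

print(plain)  # per-bin counts: 2, 3, 4, 2, 1
print(cumul)  # running totals: 2, 5, 9, 11, 12
print(np.array_equal(cumul, np.cumsum(plain)))  # True
```

The last cumulative bar therefore equals the total number of values, 12 in this dataset.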
