
Pandas DataFrame - value_counts() function



The Pandas DataFrame value_counts() function returns a Series containing counts of unique rows in the DataFrame.

Syntax

DataFrame.value_counts(subset=None, normalize=False, sort=True, 
                       ascending=False, dropna=True)

Parameters

subset Optional. Column label or list of labels to use when counting unique combinations. By default, all columns are used.
normalize Optional. If set to True, returns proportions instead of counts. Default is False.
sort Optional. A boolean value that specifies whether to sort the result by frequency. Default is True.
ascending Optional. A boolean value that specifies whether to sort in ascending order of frequency. Default is False.
dropna Optional. A boolean value that specifies whether rows containing NA values are left out of the counts. Default is True.
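
The effect of dropna is easiest to see on data with a missing value. The snippet below is a minimal sketch: the small DataFrame is made-up illustration data (not one of the examples on this page), and it assumes a pandas version in which the dropna parameter is available.

```python
import pandas as pd
import numpy as np

# Hypothetical data: the second row has a missing value in "City"
df = pd.DataFrame({
    "Age": [24, 24, 27],
    "City": ["London", np.nan, "London"],
})

# Default (dropna=True): the row containing NaN is left out of the counts
counts_default = df.value_counts()

# dropna=False: the row containing NaN is counted as its own combination
counts_with_na = df.value_counts(dropna=False)

print("dropna=True:")
print(counts_default)
print("\ndropna=False:")
print(counts_with_na)
```

With the default setting only two rows are counted; with dropna=False all three rows appear in the result.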

Return Value

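Returns a Series whose index holds the unique rows (one index level per counted column) and whose values are the counts of those rows, or their proportions when normalize=True.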
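
As a rough sketch of how the returned Series can be used, the snippet below looks up the count for a single combination and converts the result back into a DataFrame. It reuses the sample data from the examples on this page (without the custom index); the variable names counts and table and the column name "n" are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Age": [24, 27, 24, 27, 30],
    "Sex": ["M", "F", "M", "F", "M"],
    "City": ["London", "London", "Paris", "London", "Paris"],
})

counts = df.value_counts()

# With several columns, the result is indexed by a MultiIndex
# whose levels are named after the counted columns
print(type(counts.index).__name__)   # MultiIndex
print(list(counts.index.names))      # ['Age', 'Sex', 'City']

# Look up the count of one combination with a tuple key
print(counts.loc[(27, "F", "London")])   # 2

# Turn the counts into an ordinary DataFrame with a count column "n"
table = counts.reset_index(name="n")
print(table)
```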

Example: using value_counts() on a DataFrame

In the example below, a DataFrame df is created. The value_counts() function is used to get the count of unique rows in this DataFrame.

import pandas as pd

df = pd.DataFrame({
  "Age": [24, 27, 24, 27, 30],
  "Sex": ['M', 'F', 'M', 'F', 'M'],
  "City": ['London', 'London', 'Paris', 'London', 'Paris']},
  index= ["John", "Marry", "Jo", "Kim", "Huang"]
)

print("The DataFrame contains:")
print(df)

print("\ndf.value_counts() returns:")
print(df.value_counts())

print("\ndf.value_counts(sort=False) returns:")
print(df.value_counts(sort=False))

print("\ndf.value_counts(ascending=True) returns:")
print(df.value_counts(ascending=True))

print("\ndf.value_counts(normalize=True) returns:")
print(df.value_counts(normalize=True))

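The output of the above code will be as follows (pandas 2.0 and later also print the name of the Series, count or proportion, on the last line of each result):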

The DataFrame contains:
       Age Sex    City
John    24   M  London
Marry   27   F  London
Jo      24   M   Paris
Kim     27   F  London
Huang   30   M   Paris

df.value_counts() returns:
Age  Sex  City  
27   F    London    2
24   M    London    1
          Paris     1
30   M    Paris     1
dtype: int64

df.value_counts(sort=False) returns:
Age  Sex  City  
24   M    London    1
          Paris     1
27   F    London    2
30   M    Paris     1
dtype: int64

df.value_counts(ascending=True) returns:
Age  Sex  City  
24   M    London    1
          Paris     1
30   M    Paris     1
27   F    London    2
dtype: int64

df.value_counts(normalize=True) returns:
Age  Sex  City  
27   F    London    0.4
24   M    London    0.2
          Paris     0.2
30   M    Paris     0.2
dtype: float64

Example: using the subset parameter

The subset parameter restricts the count to the given columns, so that rows are compared only on those columns. Consider the example below:

import pandas as pd

df = pd.DataFrame({
  "Age": [24, 27, 24, 27, 30],
  "Sex": ['M', 'F', 'M', 'F', 'M'],
  "City": ['London', 'London', 'Paris', 'London', 'Paris']},
  index= ["John", "Marry", "Jo", "Kim", "Huang"]
)

print("The DataFrame contains:")
print(df)

print("\ndf.value_counts() returns:")
print(df.value_counts())

print("\ndf.value_counts(subset=['Age', 'Sex']) returns:")
print(df.value_counts(subset=['Age', 'Sex']))

The output of the above code will be:

The DataFrame contains:
       Age Sex    City
John    24   M  London
Marry   27   F  London
Jo      24   M   Paris
Kim     27   F  London
Huang   30   M   Paris

df.value_counts() returns:
Age  Sex  City  
27   F    London    2
24   M    London    1
          Paris     1
30   M    Paris     1
dtype: int64

df.value_counts(subset=['Age', 'Sex']) returns:
Age  Sex
24   M      2
27   F      2
30   M      1
dtype: int64
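
The subset parameter can also be combined with the other parameters. The sketch below, which reuses the same sample data (without the custom index), passes normalize=True together with subset to get the share of rows for each Age and Sex combination; the variable name shares is chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Age": [24, 27, 24, 27, 30],
    "Sex": ["M", "F", "M", "F", "M"],
    "City": ["London", "London", "Paris", "London", "Paris"],
})

# Proportion of rows for each (Age, Sex) combination
shares = df.value_counts(subset=["Age", "Sex"], normalize=True)
print(shares)

# The proportions of all combinations add up to 1
print(round(shares.sum(), 6))
```

Here two of the five rows are (24, M) and two are (27, F), so each of these combinations has a share of 0.4, while (30, M) has a share of 0.2.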
