
Pandas DataFrame - nunique() function



The Pandas DataFrame nunique() function counts the number of distinct elements along the specified axis and returns the counts as a Series. By default, NaN values are not counted.

Syntax

DataFrame.nunique(axis=0, dropna=True)

Parameters

axis Optional. Specify {0 or 'index', 1 or 'columns'}. If 0 or 'index', distinct elements are counted in each column. If 1 or 'columns', distinct elements are counted in each row. Default is 0.
dropna Optional. If True, NaN values are excluded from the counts. Specify False to count NaN as a distinct value. Default is True.

Return Value

Returns a Series containing the number of distinct elements in each column (axis=0) or in each row (axis=1).
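
As a minimal sketch of what this return value looks like in practice, the snippet below builds a small DataFrame (the column names x and y are illustrative only) and inspects the result of nunique(). With the default axis, the index of the returned Series is expected to be the column labels, so individual counts can be read by label.

```python
import pandas as pd

# small illustrative DataFrame; the column names are arbitrary
df = pd.DataFrame({
  "x": [5, 5, 2, 2, 7],
  "y": [10, 5, 5, 10, 5]
})

result = df.nunique()

# the result is a Series indexed by the column labels
print(type(result).__name__)
print(list(result.index))

# an individual count can be read by its column label
print(result["x"])
```

Because the result is an ordinary Series, it can be filtered or sorted like any other Series.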

Example: using nunique() column-wise on whole DataFrame

In the example below, a DataFrame df is created. The nunique() function is used to get the count of distinct elements in each column.

import pandas as pd

df = pd.DataFrame({
  "x": [5, 5, 2, 2, 7],
  "y": [10, 5, 5, 10, 5]},
  index=["a", "b", "c", "d", "e"]
)

print("The DataFrame is:")
print(df)

# count the distinct elements in each column
print("\ndf.nunique() returns:")
print(df.nunique())

The output of the above code will be:

The DataFrame is:
   x   y
a  5  10
b  5   5
c  2   5
d  2  10
e  7   5

df.nunique() returns:
x    3
y    2
dtype: int64

Example: using nunique() row-wise on whole DataFrame

To count the distinct elements in each row instead, set the axis parameter to 1 (or 'columns').

import pandas as pd

df = pd.DataFrame({
  "a": [5, 10],
  "b": [5, 5],
  "c": [2, 5],
  "d": [2, 10],
  "e": [7, 5]},
  index=["x", "y"]
)

print("The DataFrame is:")
print(df)

# count the distinct elements in each row
print("\ndf.nunique(axis=1) returns:")
print(df.nunique(axis=1))

The output of the above code will be:

The DataFrame is:
    a  b  c   d  e
x   5  5  2   2  7
y  10  5  5  10  5

df.nunique(axis=1) returns:
x    3
y    2
dtype: int64
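
The axis parameter also accepts the label 'columns' in place of 1. The short sketch below reuses the same illustrative data and checks that both spellings give the same row-wise counts.

```python
import pandas as pd

df = pd.DataFrame({
  "a": [5, 10],
  "b": [5, 5],
  "c": [2, 5],
  "d": [2, 10],
  "e": [7, 5]},
  index=["x", "y"]
)

# row-wise counts requested by number and by label
by_number = df.nunique(axis=1)
by_label = df.nunique(axis="columns")

# both calls are expected to produce identical Series
print(by_number.equals(by_label))
```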

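Example: using nunique() on selected columns

Instead of the whole DataFrame, the nunique() function can be applied to selected columns. On a single column (a Series) it returns an integer; on a list of columns it returns a Series. Consider the following example.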

import pandas as pd

df = pd.DataFrame({
  "x": [5, 5, 2, 2, 7],
  "y": [10, 5, 5, 10, 5],
  "z": [1, 1, 1, 1, 1]},
  index=["a", "b", "c", "d", "e"]
)

print("The DataFrame is:")
print(df)

# count of distinct elements in a single column
print("\ndf['z'].nunique() returns:")
print(df["z"].nunique())

# count of distinct elements in multiple columns
print("\ndf[['x', 'z']].nunique() returns:")
print(df[["x", "z"]].nunique())

The output of the above code will be:

The DataFrame is:
   x   y  z
a  5  10  1
b  5   5  1
c  2   5  1
d  2  10  1
e  7   5  1

df['z'].nunique() returns:
1

df[['x', 'z']].nunique() returns:
x    3
z    1
dtype: int64
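
Example: using nunique() with the dropna parameter

The examples above contain no missing values, so the dropna parameter has no visible effect on them. The sketch below is an illustration with made-up data that includes NaN; it compares the default behavior with dropna=False. The variable names with_default and with_nan are chosen only for this example.

```python
import pandas as pd
import numpy as np

# illustrative data containing missing values
df = pd.DataFrame({
  "x": [5, 5, np.nan, 2, 7],
  "y": [10, np.nan, np.nan, 10, 5]},
  index=["a", "b", "c", "d", "e"]
)

# default: NaN is excluded from the counts
with_default = df.nunique()
print("df.nunique() returns:")
print(with_default)

# dropna=False: NaN is counted as one additional distinct value
with_nan = df.nunique(dropna=False)
print("\ndf.nunique(dropna=False) returns:")
print(with_nan)
```

With this data, the default call counts 3 distinct values in x and 2 in y, while dropna=False counts 4 and 3. Repeated NaN entries in a column add only one to the count.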
