
Pandas - DataFrame Comparison Functions



The Pandas package contains a number of comparison functions that cover the common element-wise comparison operations on a DataFrame. The most frequently used of these functions are listed below.

Function  Description
lt()      Checks whether each element of the DataFrame is less than the argument, element-wise (binary operator lt).
gt()      Checks whether each element of the DataFrame is greater than the argument, element-wise (binary operator gt).
le()      Checks whether each element of the DataFrame is less than or equal to the argument, element-wise (binary operator le).
ge()      Checks whether each element of the DataFrame is greater than or equal to the argument, element-wise (binary operator ge).
eq()      Checks whether each element of the DataFrame is equal to the argument, element-wise (binary operator eq).
ne()      Checks whether each element of the DataFrame is not equal to the argument, element-wise (binary operator ne).

Let's discuss these functions in detail:

Comparison Functions

Comparison operations can be performed element-wise on a DataFrame using the lt(), gt(), le(), ge(), eq() and ne() functions. They are equivalent to the operators <, >, <=, >=, == and != respectively, but with support for choosing the axis (rows or columns) and the level to compare on. Each function returns a DataFrame of boolean values. The syntax for these functions is given below:

Syntax

DataFrame.lt(other, axis='columns', level=None)
DataFrame.gt(other, axis='columns', level=None)
DataFrame.le(other, axis='columns', level=None)
DataFrame.ge(other, axis='columns', level=None)
DataFrame.eq(other, axis='columns', level=None)
DataFrame.ne(other, axis='columns', level=None)

Parameters

other Required. Specify a scalar, sequence, Series, or DataFrame to compare against.
axis Optional. Specify whether to compare by the index (0 or 'index') or columns (1 or 'columns'). Default is 'columns'.
level Optional. Specify an int or label to broadcast across a level, matching Index values on the passed MultiIndex level. Default is None.
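
As a minimal sketch of the axis parameter described above, the code below compares every row against its own threshold. The thresholds Series and its values are hypothetical and chosen only for illustration; the DataFrame mirrors the one used in the examples on this page.

```python
import pandas as pd

df = pd.DataFrame(
    {"Bonus": [5, 4, 2], "Salary": [60, 62, 65]},
    index=["John", "Marry", "Sam"],
)

# Hypothetical per-person thresholds: one value per row label.
thresholds = pd.Series([5, 10, 1], index=["John", "Marry", "Sam"])

# With axis="index", the Series is aligned with the row labels, so each
# value in a row is compared against that row's threshold.
by_row = df.gt(thresholds, axis="index")
print(by_row)
```

With the default axis='columns', a Series passed as other is instead matched against the column labels, which is why a list such as [4, 62] supplies one value per column.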

Example:

In the example below, a DataFrame df is created, and the lt(), gt(), le() and ge() functions are applied to it.

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 4, 2],
  "Salary": [60, 62, 65]},
  index= ["John", "Marry", "Sam"]
)

print("The DataFrame is:")
print(df)

#checking whether each entry of
#the DataFrame is less than 4
print("\ndf.lt(4) returns:")
print(df.lt(4))

#checking whether each Bonus entry is greater than 4
#and each Salary entry is greater than 62
print("\ndf.gt([4,62]) returns:")
print(df.gt([4,62]))

#checking whether each entry of the
#DataFrame is less than or equal to 4
print("\ndf.le(4) returns:")
print(df.le(4))

#checking whether each Bonus entry is greater than or equal to 4
#and each Salary entry is greater than or equal to 62
print("\ndf.ge([4,62]) returns:")
print(df.ge([4,62]))

The output of the above code will be:

The DataFrame is:
       Bonus  Salary
John       5      60
Marry      4      62
Sam        2      65

df.lt(4) returns:
       Bonus  Salary
John   False   False
Marry  False   False
Sam     True   False

df.gt([4,62]) returns:
       Bonus  Salary
John    True   False
Marry  False   False
Sam    False    True

df.le(4) returns:
       Bonus  Salary
John   False   False
Marry   True   False
Sam     True   False

df.ge([4,62]) returns:
       Bonus  Salary
John    True   False
Marry   True    True
Sam    False    True
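
The equivalence between these functions and the comparison operators can be checked directly. The following is a small sketch, not part of the original example; the variable names (limits, same_scalar, same_series) are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Bonus": [5, 4, 2], "Salary": [60, 62, 65]},
    index=["John", "Marry", "Sam"],
)

# Comparing against a scalar: the method and the operator should agree.
same_scalar = df.lt(4).equals(df < 4)

# Comparing against a Series keyed by column label (hypothetical limits).
limits = pd.Series({"Bonus": 4, "Salary": 62})
same_series = df.ge(limits).equals(df >= limits)

print(same_scalar, same_series)
```

The method form is mainly useful when the axis or level needs to be set explicitly, or when the comparison is part of a method chain.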

Example:

Similarly, the eq() and ne() functions can be used on a DataFrame. Consider the example below.

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 4, 2],
  "Salary": [60, 62, 65]},
  index= ["John", "Marry", "Sam"]
)

print("The DataFrame is:")
print(df)

#checking whether each entry of
#the DataFrame is equal to 4
print("\ndf.eq(4) returns:")
print(df.eq(4))

#checking whether each Bonus entry is not equal to 4
#and each Salary entry is not equal to 62
print("\ndf.ne([4,62]) returns:")
print(df.ne([4,62]))

The output of the above code will be:

The DataFrame is:
       Bonus  Salary
John       5      60
Marry      4      62
Sam        2      65

df.eq(4) returns:
       Bonus  Salary
John   False   False
Marry   True   False
Sam    False   False

df.ne([4,62]) returns:
       Bonus  Salary
John    True    True
Marry  False   False
Sam     True    True
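
The level parameter listed under Parameters is not exercised by the examples above. The sketch below shows one way it might be used, assuming a second DataFrame with a two-level (year, name) index; the yearly data and its values are hypothetical and exist only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Bonus": [5, 4, 2], "Salary": [60, 62, 65]},
    index=["John", "Marry", "Sam"],
)

# Hypothetical yearly figures indexed by (year, name).
yearly = pd.DataFrame(
    {"Bonus": [5, 3, 2, 6, 4, 1], "Salary": [60, 60, 70, 58, 62, 65]},
    index=[
        ["2022", "2022", "2022", "2023", "2023", "2023"],
        ["John", "Marry", "Sam", "John", "Marry", "Sam"],
    ],
)

# level=1 matches the index of df against the second level (the names)
# of the MultiIndex, so df is broadcast across every year.
by_year = df.le(yearly, level=1)
print(by_year)
```

Each row of the result answers, for one year and one person, whether the value in df is less than or equal to that year's value.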
