
Pandas DataFrame - sub() function



The Pandas sub() function subtracts other from a DataFrame element-wise and returns the result. It is equivalent to dataframe - other, but it also accepts a fill_value parameter that substitutes a value for missing data in either input.
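The relationship between sub(), the - operator, and fill_value can be sketched as follows. This is a minimal illustration only: the frames left and right and their values are made up for this sketch and are not part of the examples further down.

```python
import numpy as np
import pandas as pd

# Hypothetical frames with partly overlapping indexes and some missing values
left = pd.DataFrame({"Bonus": [5.0, np.nan, 2.0]},
                    index=["John", "Marry", "Sam"])
right = pd.DataFrame({"Bonus": [1.0, 1.0, np.nan]},
                     index=["John", "Marry", "Jo"])

# Without fill_value, sub() gives the same result as the - operator
plain = left.sub(right)
same_as_operator = plain.equals(left - right)

# With fill_value=0, a value missing on only one side is treated as 0;
# a location missing on both sides (Jo here) stays NaN
filled = left.sub(right, fill_value=0)

print(plain)
print(filled)
```

In this sketch only John has a value on both sides, so the plain subtraction yields NaN for every other row, while the filled version keeps Marry and Sam.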

Syntax

DataFrame.sub(other, axis='columns', 
              level=None, fill_value=None)

Parameters

other Required. Specify a scalar, a list-like object, a Series, or a DataFrame to subtract.
axis Optional. Specify whether to align other with the index (0 or 'index') or with the columns (1 or 'columns'). For Series input, this is the axis to match the Series index on. Default is 'columns'.
level Optional. Specify an int or label to broadcast across a level, matching Index values on the passed MultiIndex level. Default is None.
fill_value Optional. Specify a value to fill existing missing (NaN) values, and any new element needed for successful DataFrame alignment, before the subtraction is computed. If the data in both corresponding DataFrame locations is missing, the result will be missing. Default is None.
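The level parameter is easier to follow with a small example. The sketch below assumes two made-up frames, current and history (neither appears elsewhere in this tutorial), where history uses a two-level MultiIndex of year and name.

```python
import pandas as pd

# Hypothetical current salaries, indexed by name only
current = pd.DataFrame({"Salary": [60, 62]}, index=["John", "Marry"])

# Hypothetical salary history, indexed by (year, name)
history_index = pd.MultiIndex.from_tuples(
    [("2022", "John"), ("2022", "Marry"), ("2023", "John"), ("2023", "Marry")]
)
history = pd.DataFrame({"Salary": [50, 55, 58, 60]}, index=history_index)

# level=1 matches the index of current against the second level (the names)
# of the MultiIndex, so each current salary is reused for every year
raise_since = current.sub(history, level=1)
print(raise_since)
```

The result is indexed by the same (year, name) pairs as history and holds the difference between the current salary and the salary of that year.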

Return Value

Returns a DataFrame containing the result of the subtraction.

Example: using sub() on the whole DataFrame

In the example below, a DataFrame df is created. The sub() function is used to subtract a scalar value from the whole DataFrame.

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4],
  "Salary": [60, 62, 65, 59]},
  index=["John", "Marry", "Sam", "Jo"]
)

print("The DataFrame is:")
print(df)

#subtracting 2 from all entries of the DataFrame
print("\ndf.sub(2) returns:")
print(df.sub(2))

The output of the above code will be:

The DataFrame is:
       Bonus  Salary
John       5      60
Marry      3      62
Sam        2      65
Jo         4      59

df.sub(2) returns:
       Bonus  Salary
John       3      58
Marry      1      60
Sam        0      63
Jo         2      57

Example: Subtracting a different value from each column

A different scalar value can be subtracted from each column by passing a list as the other argument. The list must contain one value per column. Consider the following example:

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4],
  "Salary": [60, 62, 65, 59]},
  index=["John", "Marry", "Sam", "Jo"]
)

print("The DataFrame is:")
print(df)

#subtracting 2 from all entries of Bonus column
#subtracting 10 from all entries of Salary column
print("\ndf.sub([2,10]) returns:")
print(df.sub([2,10]))

The output of the above code will be:

The DataFrame is:
       Bonus  Salary
John       5      60
Marry      3      62
Sam        2      65
Jo         4      59

df.sub([2,10]) returns:
       Bonus  Salary
John       3      50
Marry      1      52
Sam        0      55
Jo         2      49
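
The list above is matched against the columns because axis defaults to 'columns'. To subtract a different value from each row instead, axis='index' can be used. The following is a minimal sketch under that assumption; the Series deduction and its values are hypothetical and only serve the illustration.

```python
import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4],
  "Salary": [60, 62, 65, 59]},
  index=["John", "Marry", "Sam", "Jo"]
)

# Hypothetical per-person deduction, labelled with the same row index
deduction = pd.Series([1, 2, 3, 4], index=["John", "Marry", "Sam", "Jo"])

# axis='index' aligns the Series with the row labels, so each row has
# its own value subtracted from every column
result = df.sub(deduction, axis="index")
print(result)
```

With the default axis, the Series index would be matched against the column labels rather than the row labels, which is usually not what is wanted for a per-row value.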

Example: using sub() on selected columns

Instead of the whole DataFrame, the sub() function can be applied to selected columns. Consider the following example.

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4],
  "Last Salary": [58, 60, 63, 57],
  "Salary": [60, 62, 65, 59]},
  index=["John", "Marry", "Sam", "Jo"]
)

print("The DataFrame is:")
print(df)

#subtracting 3 from all entries of Salary column
print("\ndf['Salary'].sub(3) returns:")
print(df["Salary"].sub(3))

#subtracting 3 from all entries of Salary column
#subtracting 2 from all entries of Bonus column
print("\ndf[['Salary', 'Bonus']].sub([3,2]) returns:")
print(df[["Salary", "Bonus"]].sub([3,2]))

The output of the above code will be:

The DataFrame is:
       Bonus  Last Salary  Salary
John       5           58      60
Marry      3           60      62
Sam        2           63      65
Jo         4           57      59

df['Salary'].sub(3) returns:
John     57
Marry    59
Sam      62
Jo       56
Name: Salary, dtype: int64

df[['Salary', 'Bonus']].sub([3,2]) returns:
       Salary  Bonus
John       57      3
Marry      59      1
Sam        62      0
Jo         56      2

Example: Subtracting columns in a DataFrame

The sub() function can also be called on one column (a Series) to subtract another column from it element-wise. Consider the following example.

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4],
  "Total Salary": [60, 62, 65, 59]},
  index=["John", "Marry", "Sam", "Jo"]
)

print("The DataFrame is:")
print(df)

#subtracting 'Bonus' from 'Total Salary' column
df['Salary'] = df['Total Salary'].sub(df['Bonus'])

print("\nThe DataFrame is:")
print(df)

The output of the above code will be:

The DataFrame is:
       Bonus  Total Salary
John       5            60
Marry      3            62
Sam        2            65
Jo         4            59

The DataFrame is:
       Bonus  Total Salary  Salary
John       5            60      55
Marry      3            62      59
Sam        2            65      63
Jo         4            59      55
