
Pandas DataFrame - pct_change() function



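The Pandas DataFrame pct_change() function computes the change between each element and a prior element, by default the one in the immediately preceding row. Despite its name, the result is a fraction rather than a percentage: a value of 0.5 means a 50% increase. This is useful for comparing changes across a time series.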

Syntax

DataFrame.pct_change(periods=1, fill_method='pad',
                     limit=None, freq=None, **kwargs)

Parameters

periods: Optional. Specify the number of periods to shift when calculating the percentage change. Default: 1
fill_method: Optional. Specify how to handle NAs before computing percentage changes. Default: 'pad'. It can take values from {'backfill', 'bfill', 'pad', 'ffill', None}. pad / ffill: use the last valid observation to fill the gap. backfill / bfill: use the next valid observation to fill the gap. None: do not fill. Note that pandas 2.1 and later deprecate every value except None.
limit: Optional. Specify the number of consecutive NAs to fill before stopping. Default is None. Deprecated in pandas 2.1 and later.
freq: Optional. A DateOffset, timedelta, or str specifying the increment to use from the time series API (e.g. 'M' or BDay()). Default is None.
**kwargs: Optional. Additional keyword arguments, such as axis, are passed on to DataFrame.shift() or Series.shift().
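
The effect of freq is easiest to see on a date-indexed DataFrame with a gap. The following is a minimal sketch, assuming a DatetimeIndex and made-up data (the column name "value" and the dates are hypothetical). fill_method=None is passed so that no filling takes place.

```python
import pandas as pd

# Hypothetical daily data with one missing day (2020-01-03)
idx = pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-04"])
df = pd.DataFrame({"value": [10.0, 11.0, 12.0]}, index=idx)

# Change relative to the previous row, whatever its date
by_row = df.pct_change(fill_method=None)

# Change relative to the value exactly one calendar day earlier
by_day = df.pct_change(freq="D", fill_method=None)

print("Row-based change:")
print(by_row)
print("\nChange with freq='D':")
print(by_day)
```

With freq="D", the entry for 2020-01-04 should come out as NaN, because there is no observation for 2020-01-03 to compare against, whereas the row-based call compares it with 2020-01-02.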

Return Value

Returns an object of the same type as the caller (a DataFrame or a Series) containing the percentage change of each element.
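
To make the meaning of the returned values concrete, the sketch below reproduces the calculation by hand with shift(), assuming data without missing values. The column name "price" and the variable names are made up for this illustration. The returned values are fractions, so they are multiplied by 100 to display percentages.

```python
import pandas as pd

# Hypothetical data with no missing values
df = pd.DataFrame({"price": [100.0, 110.0, 99.0]})

periods = 1

# fill_method=None skips NA filling, so only the raw formula is applied
result = df.pct_change(periods=periods, fill_method=None)

# The same calculation written out: current / prior - 1
manual = df / df.shift(periods) - 1

print(result.equals(manual))   # True

# Convert fractions to percentages for display
print(result * 100)
```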

Example: Percentage change of elements of a DataFrame

In the example below, a DataFrame df is created. The pct_change() function is used to calculate the percentage change of elements of all numerical columns.

import pandas as pd
import numpy as np

df = pd.DataFrame({
  "GDP": [1.5, 2.5, 3.5, 1.5, 2.5, -1],
  "GNP": [1, 2, 3, 3, 2, -1],
  "HPI": [2, 3, 2, np.nan, 2, 2]},
  index= ["2015", "2016", "2017", 
          "2018", "2019", "2020"]
)

print("The DataFrame is:")
print(df)

# Percentage change of each element with periods=1 (the default)
print("\ndf.pct_change() returns:")
print(df.pct_change())

# Percentage change of each element with periods=2
print("\ndf.pct_change(periods=2) returns:")
print(df.pct_change(periods=2))

The output of the above code will be:

The DataFrame is:
      GDP  GNP  HPI
2015  1.5    1  2.0
2016  2.5    2  3.0
2017  3.5    3  2.0
2018  1.5    3  NaN
2019  2.5    2  2.0
2020 -1.0   -1  2.0

df.pct_change() returns:
           GDP       GNP       HPI
2015       NaN       NaN       NaN
2016  0.666667  1.000000  0.500000
2017  0.400000  0.500000 -0.333333
2018 -0.571429  0.000000  0.000000
2019  0.666667 -0.333333  0.000000
2020 -1.400000 -1.500000  0.000000

df.pct_change(periods=2) returns:
           GDP       GNP       HPI
2015       NaN       NaN       NaN
2016       NaN       NaN       NaN
2017  1.333333  2.000000  0.000000
2018 -0.400000  0.500000 -0.333333
2019 -0.285714 -0.333333  0.000000
2020 -1.666667 -1.333333  0.000000

Example: Percentage change row-wise

To calculate the percentage change row-wise, pass axis=1. Consider the example below:

import pandas as pd
import numpy as np

df = pd.DataFrame({
  "2015": [1.5, 1, 2],
  "2016": [2.5, 2, 3],
  "2017": [3.5, 3, 2],
  "2018": [1.5, 3, np.nan],
  "2019": [2.5, 2, 2],
  "2020": [-1, -1, 2]},
  index= ["GDP", "GNP", "HDI"]
)

print("The DataFrame is:")
print(df)

# Row-wise percentage change with periods=1 (the default)
print("\ndf.pct_change(axis=1) returns:")
print(df.pct_change(axis=1))

# Row-wise percentage change with periods=2
print("\ndf.pct_change(axis=1, periods=2) returns:")
print(df.pct_change(axis=1, periods=2))

The output of the above code will be:

The DataFrame is:
     2015  2016  2017  2018  2019  2020
GDP   1.5   2.5   3.5   1.5   2.5    -1
GNP   1.0   2.0   3.0   3.0   2.0    -1
HDI   2.0   3.0   2.0   NaN   2.0     2

df.pct_change(axis=1) returns:
     2015      2016      2017      2018      2019  2020
GDP   NaN  0.666667  0.400000 -0.571429  0.666667  -1.4
GNP   NaN  1.000000  0.500000  0.000000 -0.333333  -1.5
HDI   NaN  0.500000 -0.333333  0.000000  0.000000   0.0

df.pct_change(axis=1, periods=2) returns:
     2015  2016      2017      2018      2019      2020
GDP   NaN   NaN  1.333333 -0.400000 -0.285714 -1.666667
GNP   NaN   NaN  2.000000  0.500000 -0.333333 -1.333333
HDI   NaN   NaN  0.000000 -0.333333  0.000000  0.000000
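
A row-wise result can be cross-checked by transposing the DataFrame, computing the change down the rows, and transposing back. The following is a minimal sketch with a smaller, made-up table that contains no missing values; the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical wide table: one row per indicator, one column per year
df = pd.DataFrame(
    {"2015": [1.5, 1.0], "2016": [2.5, 2.0], "2017": [3.5, 3.0]},
    index=["GDP", "GNP"],
)

# Change along each row (across the year columns)
row_wise = df.pct_change(axis=1, fill_method=None)

# Same calculation via transpose: years become rows, then flip back
via_transpose = df.T.pct_change(fill_method=None).T

print(row_wise)
print(row_wise.equals(via_transpose))
```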

Example: Percentage change of selected columns

Instead of the whole DataFrame, the pct_change() function can be applied to selected columns. Consider the following example.

import pandas as pd
import numpy as np

df = pd.DataFrame({
  "GDP": [1.5, 2.5, 3.5, 1.5, 2.5, -1],
  "GNP": [1, 2, 3, 3, 2, -1],
  "HPI": [2, 3, 2, np.nan, 2, 2]},
  index= ["2015", "2016", "2017", 
          "2018", "2019", "2020"]
)

print("The DataFrame is:")
print(df)

# Percentage change of a single column (returns a Series)
print("\ndf['GDP'].pct_change() returns:")
print(df['GDP'].pct_change())

# Percentage change of multiple columns (returns a DataFrame)
print("\ndf[['GDP', 'GNP']].pct_change() returns:")
print(df[['GDP', 'GNP']].pct_change())

The output of the above code will be:

The DataFrame is:
      GDP  GNP  HPI
2015  1.5    1  2.0
2016  2.5    2  3.0
2017  3.5    3  2.0
2018  1.5    3  NaN
2019  2.5    2  2.0
2020 -1.0   -1  2.0

df['GDP'].pct_change() returns:
2015         NaN
2016    0.666667
2017    0.400000
2018   -0.571429
2019    0.666667
2020   -1.400000
Name: GDP, dtype: float64

df[['GDP', 'GNP']].pct_change() returns:
           GDP       GNP
2015       NaN       NaN
2016  0.666667  1.000000
2017  0.400000  0.500000
2018 -0.571429  0.000000
2019  0.666667 -0.333333
2020 -1.400000 -1.500000
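
A common follow-up is to keep the change of a single column next to the original data. The following is a minimal sketch of that pattern; the column name "GDP_change" is made up for this illustration, and the data is a shortened version without missing values.

```python
import pandas as pd

df = pd.DataFrame(
    {"GDP": [1.5, 2.5, 3.5], "GNP": [1.0, 2.0, 3.0]},
    index=["2015", "2016", "2017"],
)

# Store the change of one column alongside the original columns.
# fill_method=None is used because the data has no gaps to fill.
df["GDP_change"] = df["GDP"].pct_change(fill_method=None)

print(df)
```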

Example: Using the fill_method parameter

The example below demonstrates how to use the fill_method parameter with this function.

import pandas as pd
import numpy as np

df = pd.DataFrame({
  "GDP": [1.5, np.nan, 2, np.nan, 2.5, -1],
  "GNP": [1, 2, 3, 3, 2, -1]},
  index= ["2015", "2016", "2017", 
          "2018", "2019", "2020"]
)

print("The DataFrame is:")
print(df)

# Fill each NaN with the next valid value before computing the change
print("\ndf['GDP'].pct_change(fill_method='bfill') returns:")
print(df['GDP'].pct_change(fill_method='bfill'))

# Fill each NaN with the last valid value before computing the change
print("\ndf['GDP'].pct_change(fill_method='ffill') returns:")
print(df['GDP'].pct_change(fill_method='ffill'))

The output of the above code will be:

The DataFrame is:
      GDP  GNP
2015  1.5    1
2016  NaN    2
2017  2.0    3
2018  NaN    3
2019  2.5    2
2020 -1.0   -1

df['GDP'].pct_change(fill_method='bfill') returns:
2015         NaN
2016    0.333333
2017    0.000000
2018    0.250000
2019    0.000000
2020   -1.400000
Name: GDP, dtype: float64

df['GDP'].pct_change(fill_method='ffill') returns:
2015         NaN
2016    0.000000
2017    0.333333
2018    0.000000
2019    0.250000
2020   -1.400000
Name: GDP, dtype: float64
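
Recent pandas releases (2.1 and later) deprecate every fill_method value except None, so it is worth checking the documentation for the installed version. One way to get the same effect, sketched below under that assumption, is to fill the gaps explicitly with ffill() and then call pct_change(fill_method=None). The variable names are made up for this illustration.

```python
import numpy as np
import pandas as pd

gdp = pd.Series(
    [1.5, np.nan, 2, np.nan, 2.5, -1],
    index=["2015", "2016", "2017", "2018", "2019", "2020"],
    name="GDP",
)

# No filling: any change that involves a NaN stays NaN
no_fill = gdp.pct_change(fill_method=None)

# Fill explicitly first, then compute the change without further filling
filled_first = gdp.ffill().pct_change(fill_method=None)

print("Without filling:")
print(no_fill)
print("\nAfter an explicit ffill():")
print(filled_first)
```

The second result should match the fill_method='ffill' output of the example, while the first leaves NaN wherever either the current or the prior value is missing.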
