Pandas DataFrame - cumprod() function



The Pandas DataFrame cumprod() function computes the cumulative product along a DataFrame or Series axis. It returns a DataFrame or Series of the same size, in which each element is the product of all values up to and including that position.

Syntax

DataFrame.cumprod(axis=None, skipna=True)

Parameters

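axis Optional. Specify {0 or 'index', 1 or 'columns'}. If 0 or 'index', cumulative products are generated for each column. If 1 or 'columns', cumulative products are generated for each row. Default: 0 (for a DataFrame, axis=None is treated as 0).
skipna Optional. If True, NA/null values are skipped: the running product continues past them and the result is NA only at those positions. If False, every result from the first NA onward is NA. Default: True.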
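The two spellings of each axis value can be compared directly. The snippet below is a minimal sketch on a small made-up DataFrame (the names df, by_index_number and so on are illustrative only and do not come from the examples in this tutorial).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only to compare the axis spellings
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [2.0, np.nan, 4.0]})

# axis=0 and axis="index" both accumulate down each column
by_index_number = df.cumprod(axis=0)
by_index_name = df.cumprod(axis="index")

# axis=1 and axis="columns" both accumulate left to right within each row
by_columns_number = df.cumprod(axis=1)
by_columns_name = df.cumprod(axis="columns")

# DataFrame.equals treats NaN values in the same position as equal
print(by_index_number.equals(by_index_name))
print(by_columns_number.equals(by_columns_name))
```

Both comparisons should report that the number and the name select the same operation, so the choice between them is a matter of readability.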

Return Value

Returns a Series or DataFrame of the same shape as the caller, containing the cumulative product.
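As a quick check of the return value, the sketch below uses made-up factors (the names factors, running, last_row and totals are illustrative). It confirms that the result keeps the shape and labels of the input and that, when no values are missing, its last row agrees with the plain column products from prod().

```python
import pandas as pd

# Hypothetical factors, chosen so the products are easy to verify by hand
factors = pd.DataFrame(
    {"x": [2.0, 3.0, 4.0], "y": [1.0, 0.5, 4.0]},
    index=["r1", "r2", "r3"],
)

running = factors.cumprod()

# The result keeps the shape and the row labels of the input
same_shape = running.shape == factors.shape
same_index = running.index.equals(factors.index)

# With no missing values, the last row matches the overall column products
last_row = running.iloc[-1]
totals = factors.prod()

print(same_shape, same_index)
print(last_row.tolist(), totals.tolist())
```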

Example: using cumprod() column-wise on whole DataFrame

In the example below, a DataFrame named report is created. The cumprod() function is used to get the cumulative product of each column, first with the default skipna=True and then with skipna=False.

import pandas as pd
import numpy as np

report = pd.DataFrame({
  "GDP": [1.02, 1.03, 1.04, 0.98],
  "GNP": [1.05, 0.99, np.nan, 1.04]},
  index= ["Q1", "Q2", "Q3", "Q4"]
)

#displaying the dataframe
print(report,"\n")

#displaying the cumulative product
print("report.cumprod() returns:")
print(report.cumprod(),"\n")

#using skipna=False
print("report.cumprod(skipna=False) returns:")
print(report.cumprod(skipna=False))

The output of the above code will be:

     GDP   GNP
Q1  1.02  1.05
Q2  1.03  0.99
Q3  1.04   NaN
Q4  0.98  1.04 

report.cumprod() returns:
         GDP      GNP
Q1  1.020000  1.05000
Q2  1.050600  1.03950
Q3  1.092624      NaN
Q4  1.070772  1.08108 

report.cumprod(skipna=False) returns:
         GDP     GNP
Q1  1.020000  1.0500
Q2  1.050600  1.0395
Q3  1.092624     NaN
Q4  1.070772     NaN
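The default skipna=True result for the GNP column above can be reproduced by hand, which may make the behaviour easier to remember. The sketch below is one way to do it, not a description of how pandas is implemented: missing values are replaced by 1 (which leaves a product unchanged), the product is accumulated, and the missing positions are then restored. The names gnp, manual and builtin are illustrative.

```python
import numpy as np
import pandas as pd

# Same GNP values as in the example above
gnp = pd.Series(
    [1.05, 0.99, np.nan, 1.04],
    index=["Q1", "Q2", "Q3", "Q4"],
    name="GNP",
)

# Replace NaN by 1, accumulate, then put NaN back where the input was missing
manual = gnp.fillna(1).cumprod().where(gnp.notna())

# Built-in behaviour with the default skipna=True
builtin = gnp.cumprod()

print(manual)
print(builtin)
```

With skipna=False there is no such replacement, so the first NaN propagates to every later position.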

Example: using cumprod() row-wise on whole DataFrame

To get the row-wise cumulative product, set the axis parameter to 1 (or 'columns').

import pandas as pd
import numpy as np

report = pd.DataFrame({
  "Q1": [1.02, 1.03, 1.02, 1.01, 1.03],
  "Q2": [0.98, 1.01, 1.01, np.nan, 1.01],
  "Q3": [1.02, 1.01, 1.02, 1.03, 1.04],
  "Q4": [1.02, 1.02, 0.99, 1.01, 1.02]},
  index= ["GDP", "GNP", "HDI", "Manufacturing", "Agriculture"]
)

#displaying the dataframe
print(report,"\n")

#displaying the cumulative product
print("report.cumprod(axis=1) returns:")
print(report.cumprod(axis=1),"\n")

#using skipna=False
print("report.cumprod(axis=1, skipna=False) returns:")
print(report.cumprod(axis=1, skipna=False))

The output of the above code will be:

                 Q1    Q2    Q3    Q4
GDP            1.02  0.98  1.02  1.02
GNP            1.03  1.01  1.01  1.02
HDI            1.02  1.01  1.02  0.99
Manufacturing  1.01   NaN  1.03  1.01
Agriculture    1.03  1.01  1.04  1.02 

report.cumprod(axis=1) returns:
                 Q1      Q2        Q3        Q4
GDP            1.02  0.9996  1.019592  1.039984
GNP            1.03  1.0403  1.050703  1.071717
HDI            1.02  1.0302  1.050804  1.040296
Manufacturing  1.01     NaN  1.040300  1.050703
Agriculture    1.03  1.0403  1.081912  1.103550 

report.cumprod(axis=1, skipna=False) returns:
                 Q1      Q2        Q3        Q4
GDP            1.02  0.9996  1.019592  1.039984
GNP            1.03  1.0403  1.050703  1.071717
HDI            1.02  1.0302  1.050804  1.040296
Manufacturing  1.01     NaN       NaN       NaN
Agriculture    1.03  1.0403  1.081912  1.103550
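Another way to read axis=1 is as a column-wise cumulative product on the transposed DataFrame. The sketch below illustrates this on a small made-up DataFrame with a single numeric dtype (the names df, row_wise and via_transpose are illustrative).

```python
import pandas as pd

# Hypothetical data with two rows and three quarterly columns
df = pd.DataFrame(
    {"Q1": [1.0, 2.0], "Q2": [2.0, 3.0], "Q3": [4.0, 0.5]},
    index=["A", "B"],
)

# Accumulate left to right within each row
row_wise = df.cumprod(axis=1)

# Transpose, accumulate down each column, transpose back
via_transpose = df.T.cumprod().T

print(row_wise)
print(via_transpose)
```

Passing axis=1 is the more direct form; the transpose version is shown only to make the direction of accumulation explicit.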

Example: using cumprod() on selected column

Instead of the whole DataFrame, the cumprod() function can be applied to selected columns. Selecting a single column gives a Series, and selecting a list of columns gives a DataFrame. Consider the following example.

import pandas as pd
import numpy as np

report = pd.DataFrame({
  "GDP": [1.02, 1.03, 1.04, 0.98],
  "GNP": [1.05, 0.99, np.nan, 1.04],
  "HDI": [1.02, 1.01, 1.02, 1.03]},
  index= ["Q1", "Q2", "Q3", "Q4"]
)

#displaying the dataframe
print(report,"\n")

#cumulative product on single column
print("report['GDP'].cumprod() returns:")
print(report['GDP'].cumprod(),"\n")

#cumulative product on multiple columns
print("report[['GDP', 'HDI']].cumprod() returns:")
print(report[['GDP', 'HDI']].cumprod(),"\n")

The output of the above code will be:

     GDP   GNP   HDI
Q1  1.02  1.05  1.02
Q2  1.03  0.99  1.01
Q3  1.04   NaN  1.02
Q4  0.98  1.04  1.03 

report['GDP'].cumprod() returns:
Q1    1.020000
Q2    1.050600
Q3    1.092624
Q4    1.070772
Name: GDP, dtype: float64 

report[['GDP', 'HDI']].cumprod() returns:
         GDP       HDI
Q1  1.020000  1.020000
Q2  1.050600  1.030200
Q3  1.092624  1.050804
Q4  1.070772  1.082328 
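The result for a selected column can also be stored next to the original data. The sketch below assumes the values are period-over-period growth factors, which the tutorial does not state explicitly; the column names GDP_cum and GDP_cum_pct are hypothetical.

```python
import pandas as pd

report = pd.DataFrame(
    {"GDP": [1.02, 1.03, 1.04, 0.98], "HDI": [1.02, 1.01, 1.02, 1.03]},
    index=["Q1", "Q2", "Q3", "Q4"],
)

# Store the cumulative product of one column as a new column
# (GDP_cum is a hypothetical name)
report["GDP_cum"] = report["GDP"].cumprod()

# Assumption: the values are growth factors, so (factor - 1) * 100
# expresses the cumulative change since the start in percent
report["GDP_cum_pct"] = (report["GDP_cum"] - 1) * 100

print(report.round(4))
```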
