
Pandas DataFrame - cumsum() function



The Pandas DataFrame cumsum() function computes the cumulative sum along a DataFrame or Series axis. It returns a DataFrame or Series of the same size as the input, where each entry holds the running total up to that position.

Syntax

DataFrame.cumsum(axis=None, skipna=True)

Parameters

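axis Optional. Specify {0 or 'index', 1 or 'columns'}. If 0 or 'index', cumulative sums are generated down each column. If 1 or 'columns', cumulative sums are generated across each row. Default: 0 (the None shown in the syntax is treated as 0).
skipna Optional. Specify True to exclude NA/null values when computing the result. With True, a missing value stays NaN at its own position and the running total continues after it. With False, every value from the first NaN onward is NaN. Default: True.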
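
As a quick check of the axis parameter, the minimal sketch below confirms that the string aliases select the same axis as the integer values. The DataFrame df and its values are made up for illustration and are not part of the examples that follow.

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# 'index' is an alias for 0: running totals go down each column
by_index = df.cumsum(axis="index")
print(by_index.equals(df.cumsum(axis=0)))

# 'columns' is an alias for 1: running totals go across each row
by_columns = df.cumsum(axis="columns")
print(by_columns.equals(df.cumsum(axis=1)))
```

Both comparisons should print True, since each alias refers to the same axis as its integer counterpart.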

Return Value

Returns a Series or DataFrame of the same size as the input, containing the cumulative sum.
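
The short sketch below illustrates what "same size" means in practice: the result keeps the shape and index labels of the input, and when no values are missing, its last row equals the column totals. The DataFrame df and the variable running are hypothetical names chosen for this illustration.

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]},
                  index=["x", "y", "z"])

running = df.cumsum()

# The result keeps the shape and row labels of the input
print(running.shape == df.shape)
print(running.index.equals(df.index))

# With no missing values, the last row matches the column totals
print(running.iloc[-1].tolist() == df.sum().tolist())
```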

Example: using cumsum() column-wise on the whole DataFrame

In the example below, a DataFrame named info is created. The cumsum() function is used to get the cumulative sum of each column, first with the default skipna=True and then with skipna=False.

import pandas as pd
import numpy as np

info = pd.DataFrame({
  "Salary": [25, 24, 30, 28, 25],
  "Bonus": [10, 8, 9, np.nan, 9]},
  index= ["2015", "2016", "2017", "2018", "2019"]
)

#displaying the dataframe
print(info,"\n")

#displaying the cumulative sum
print("info.cumsum() returns:")
print(info.cumsum(),"\n")

#using skipna=False
print("info.cumsum(skipna=False) returns:")
print(info.cumsum(skipna=False))

The output of the above code will be:

      Salary  Bonus
2015      25   10.0
2016      24    8.0
2017      30    9.0
2018      28    NaN
2019      25    9.0 

info.cumsum() returns:
      Salary  Bonus
2015      25   10.0
2016      49   18.0
2017      79   27.0
2018     107    NaN
2019     132   36.0 

info.cumsum(skipna=False) returns:
      Salary  Bonus
2015      25   10.0
2016      49   18.0
2017      79   27.0
2018     107    NaN
2019     132    NaN

Example: using cumsum() row-wise on the whole DataFrame

To get the row-wise cumulative sum, set the axis parameter to 1. The running total then moves from left to right across the columns of each row.

import pandas as pd
import numpy as np

info = pd.DataFrame({
  "2016": [25, 24, 30, 28, 25],
  "2017": [18, 20, 25, np.nan, 28],
  "2018": [25, 24, 25, 30, 25]},
  index= ["P1", "P2", "P3", "P4", "P5"]
)

#displaying the dataframe
print(info,"\n")

#displaying the cumulative sum
print("info.cumsum(axis=1) returns:")
print(info.cumsum(axis=1),"\n")

#using skipna=False
print("info.cumsum(axis=1, skipna=False) returns:")
print(info.cumsum(axis=1, skipna=False))

The output of the above code will be:

    2016  2017  2018
P1    25  18.0    25
P2    24  20.0    24
P3    30  25.0    25
P4    28   NaN    30
P5    25  28.0    25 

info.cumsum(axis=1) returns:
    2016  2017  2018
P1  25.0  43.0  68.0
P2  24.0  44.0  68.0
P3  30.0  55.0  80.0
P4  28.0   NaN  58.0
P5  25.0  53.0  78.0 

info.cumsum(axis=1, skipna=False) returns:
    2016  2017  2018
P1  25.0  43.0  68.0
P2  24.0  44.0  68.0
P3  30.0  55.0  80.0
P4  28.0   NaN   NaN
P5  25.0  53.0  78.0

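Example: using cumsum() on selected columns

Instead of the whole DataFrame, the cumsum() function can be applied to selected columns. Selecting a single column returns a Series, while selecting a list of columns returns a DataFrame. Consider the following example.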

import pandas as pd
import numpy as np

info = pd.DataFrame({
  "Salary": [25, 24, 30, 28, 25],
  "Bonus": [10, 8, 9, np.nan, 9],
  "Others": [5, 4, 7, 5, 8]},
  index= ["2015", "2016", "2017", "2018", "2019"]
)

#displaying the dataframe
print(info,"\n")

#cumulative sum on single column
print("info['Salary'].cumsum() returns:")
print(info['Salary'].cumsum(),"\n")

#cumulative sum on multiple columns
print("info[['Salary', 'Others']].cumsum() returns:")
print(info[['Salary', 'Others']].cumsum(),"\n")

The output of the above code will be:

      Salary  Bonus  Others
2015      25   10.0       5
2016      24    8.0       4
2017      30    9.0       7
2018      28    NaN       5
2019      25    9.0       8 

info['Salary'].cumsum() returns:
2015     25
2016     49
2017     79
2018    107
2019    132
Name: Salary, dtype: int64 

info[['Salary', 'Others']].cumsum() returns:
      Salary  Others
2015      25       5
2016      49       9
2017      79      16
2018     107      21
2019     132      29 
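
A possible follow-up is to keep the running total next to the original values. The sketch below stores the result of cumsum() in a new column. The column name Salary_total is a hypothetical name chosen for this illustration.

```python
import pandas as pd
import numpy as np

info = pd.DataFrame({
  "Salary": [25, 24, 30, 28, 25],
  "Bonus": [10, 8, 9, np.nan, 9]},
  index=["2015", "2016", "2017", "2018", "2019"]
)

# "Salary_total" is a hypothetical column name chosen for this sketch
info["Salary_total"] = info["Salary"].cumsum()

#displaying the dataframe with the added column
print(info)
```

Because cumsum() returns a Series with the same index as the selected column, the assignment lines up row by row without any extra alignment step.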
