
Pandas DataFrame - sum() function



The pandas DataFrame sum() function returns the sum of the values along the specified axis. Its syntax is shown below:

Syntax

DataFrame.sum(axis=None, skipna=None, level=None, 
              numeric_only=None, min_count=0)

Parameters

axis Optional. Specify {0 or 'index', 1 or 'columns'}. If 0 or 'index', sums are generated for each column. If 1 or 'columns', sums are generated for each row. Default: 0
skipna Optional. Specify True to exclude NA/null values when computing the result. Default is True.
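level Optional. Specify level (int or str). If the axis is a MultiIndex (hierarchical), sum along a particular level, collapsing into a Series. A str specifies the level name. Note: this parameter was deprecated in pandas 1.3 and removed in pandas 2.0; in those versions, use groupby(level=...).sum() instead.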
numeric_only Optional. Specify True to include only float, int or boolean data. Default: False
min_count Optional. Specify the required number of valid (non-NA) values to perform the operation. If the count of non-NA values is less than min_count, the result will be NA. Default: 0
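The effect of skipna, numeric_only and min_count is easiest to see on data that contains a missing value and a non-numeric column. The following is a minimal sketch; the "Dept" column and the NaN in "Bonus" are illustrative assumptions and are not part of the examples elsewhere on this page.

```python
import numpy as np
import pandas as pd

# Illustrative data: one missing bonus and one text column (both assumed).
df = pd.DataFrame(
    {
        "Bonus": [5, np.nan, 2, 4],
        "Salary": [60, 62, 65, 59],
        "Dept": ["HR", "IT", "IT", "HR"],
    },
    index=["John", "Marry", "Sam", "Jo"],
)

# numeric_only=True leaves out the text column "Dept".
# skipna is True by default, so the NaN in "Bonus" is ignored.
default_sum = df.sum(numeric_only=True)

# skipna=False propagates NaN: a column containing NaN sums to NaN.
strict_sum = df.sum(numeric_only=True, skipna=False)

# min_count=4 requires at least four non-NA values per column.
# "Bonus" has only three, so its result is NaN.
min_count_sum = df.sum(numeric_only=True, min_count=4)

print(default_sum)
print(strict_sum)
print(min_count_sum)
```

In this sketch only the "Bonus" result changes between the three calls, because "Salary" has no missing values.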

Return Value

Returns a Series containing the sums, or a DataFrame if a level is specified.
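The different return types can be checked directly. The sketch below assumes a two-level index with hypothetical department labels ("HR", "IT"). Because the level parameter is no longer available in pandas 2.0 and later, it uses groupby(level=...) to obtain the per-level DataFrame; this is a common substitute, not necessarily the only one.

```python
import pandas as pd

# Hypothetical MultiIndex: department (assumed) and employee name.
df = pd.DataFrame(
    {"Bonus": [5, 3, 2, 4], "Salary": [60, 62, 65, 59]},
    index=pd.MultiIndex.from_tuples(
        [("HR", "John"), ("IT", "Marry"), ("IT", "Sam"), ("HR", "Jo")],
        names=["Dept", "Name"],
    ),
)

# DataFrame.sum() returns a Series indexed by column label.
col_totals = df.sum()

# Series.sum() on a single column returns a scalar.
salary_total = df["Salary"].sum()

# Summing within each value of an index level returns a DataFrame
# with one row per department.
per_dept = df.groupby(level="Dept").sum()

print(type(col_totals).__name__)
print(salary_total)
print(per_dept)
```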

Example: using sum() column-wise on whole DataFrame

In the example below, a DataFrame df is created. The sum() function is used to get the sum of each column.

import pandas as pd

df = pd.DataFrame(
    {
        "Bonus": [5, 3, 2, 4],
        "Salary": [60, 62, 65, 59],
    },
    index=["John", "Marry", "Sam", "Jo"],
)

print("The DataFrame is:")
print(df)

# Sum of all entries column-wise
print("\ndf.sum() returns:")
print(df.sum())

The output of the above code will be:

The DataFrame is:
       Bonus  Salary
John       5      60
Marry      3      62
Sam        2      65
Jo         4      59

df.sum() returns:
Bonus      14
Salary    246
dtype: int64

Example: using sum() row-wise on whole DataFrame

To perform the operation row-wise, the axis parameter can be set to 1.

import pandas as pd

df = pd.DataFrame(
    {
        "Bonus": [5, 3, 2, 4],
        "Salary": [60, 62, 65, 59],
    },
    index=["John", "Marry", "Sam", "Jo"],
)

print("The DataFrame is:")
print(df)

# Sum of all entries row-wise
print("\ndf.sum(axis=1) returns:")
print(df.sum(axis=1))

The output of the above code will be:

The DataFrame is:
       Bonus  Salary
John       5      60
Marry      3      62
Sam        2      65
Jo         4      59

df.sum(axis=1) returns:
John     65
Marry    65
Sam      67
Jo       63
dtype: int64

Example: using sum() on selected columns

Instead of the whole DataFrame, the sum() function can be applied to selected columns. Consider the following example.

import pandas as pd

df = pd.DataFrame(
    {
        "Bonus": [5, 3, 2, 4],
        "Last Salary": [58, 60, 63, 57],
        "Salary": [60, 62, 65, 59],
    },
    index=["John", "Marry", "Sam", "Jo"],
)

print("The DataFrame is:")
print(df)

# Sum of a single column
print("\ndf['Salary'].sum() returns:")
print(df["Salary"].sum())

# Sum of multiple columns
print("\ndf[['Salary', 'Bonus']].sum() returns:")
print(df[["Salary", "Bonus"]].sum())

The output of the above code will be:

The DataFrame is:
       Bonus  Last Salary  Salary
John       5           58      60
Marry      3           60      62
Sam        2           63      65
Jo         4           57      59

df['Salary'].sum() returns:
246

df[['Salary', 'Bonus']].sum() returns:
Salary    246
Bonus      14
dtype: int64
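Column selection can also be combined with axis=1 to add up only certain columns for each row. The sketch below reuses the same data; the new column name "Total" is a hypothetical choice made for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Bonus": [5, 3, 2, 4],
        "Last Salary": [58, 60, 63, 57],
        "Salary": [60, 62, 65, 59],
    },
    index=["John", "Marry", "Sam", "Jo"],
)

# Row-wise sum over the selected columns only; "Last Salary" is left out.
total_pay = df[["Salary", "Bonus"]].sum(axis=1)

# Store the result in a new column ("Total" is an assumed name).
df["Total"] = total_pay

print(df)
```

Selecting the columns first keeps "Last Salary" out of the row totals, which a plain df.sum(axis=1) would include.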
