Pandas DataFrame - std() function



The Pandas DataFrame std() function returns the sample standard deviation of the values over the specified axis. Its syntax is shown below:

Syntax

DataFrame.std(axis=None, skipna=None, level=None, ddof=1, numeric_only=None)

Parameters

axis Optional. Specify {0 or 'index', 1 or 'columns'}. If 0 or 'index', the sample standard deviation is computed for each column. If 1 or 'columns', it is computed for each row. Default: 0.
skipna Optional. Specify True to exclude NA/null values when computing the result. Default: True.
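level Optional. Specify level (int or str). If the axis is a MultiIndex (hierarchical), compute the standard deviation along a particular level, collapsing into a Series. A str specifies the level name. Note that this parameter was deprecated in pandas 1.3 and removed in pandas 2.0.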
ddof Optional. Specify Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements. Default: 1.
numeric_only Optional. Specify True to include only float, int or boolean data. Default: False
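
The effect of ddof and skipna can be seen in a small sketch. The DataFrame below is hypothetical (it is not one of the examples on this page) and contains one missing value so that skipna has something to act on.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing Bonus value (assumption for illustration)
df = pd.DataFrame({
    "Bonus": [5, 3, np.nan, 4],
    "Salary": [60, 62, 65, 59],
})

# Default: ddof=1 (divisor N - 1), NaN values are skipped
sample_std = df.std()

# ddof=0 divides by N, giving the population standard deviation
population_std = df.std(ddof=0)

# skipna=False lets the NaN propagate, so Bonus becomes NaN
no_skip_std = df.std(skipna=False)

print(sample_std)
print(population_std)
print(no_skip_std)
```

With ddof=0 the result is always less than or equal to the default sample estimate, because the same sum of squared deviations is divided by a larger number.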

Return Value

Returns a Series containing the sample standard deviation of the values, or a DataFrame if a level is specified.

Example: Using std() column-wise on whole DataFrame

In the example below, a DataFrame df is created. The std() function is used to get the sample standard deviation of values for each column.

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4],
  "Salary": [60, 62, 65, 59]},
  index= ["John", "Marry", "Sam", "Jo"]
)

print("The DataFrame is:")
print(df)

#sample standard deviation of values 
#of all entries column-wise
print("\ndf.std() returns:")
print(df.std())

The output of the above code will be:

The DataFrame is:
       Bonus  Salary
John       5      60
Marry      3      62
Sam        2      65
Jo         4      59

df.std() returns:
Bonus     1.290994
Salary    2.645751
dtype: float64
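
To see where the Bonus value comes from, the sample standard deviation can be reproduced by hand. This is a minimal sketch using the same Bonus values and the N - ddof divisor with the default ddof=1; the variable names are only for illustration.

```python
import math
import pandas as pd

bonus = [5, 3, 2, 4]

n = len(bonus)
mean = sum(bonus) / n                               # 3.5
squared_dev = sum((x - mean) ** 2 for x in bonus)   # 5.0

# Sample standard deviation: divide by N - 1 (ddof=1)
manual_std = math.sqrt(squared_dev / (n - 1))

# The same quantity computed by pandas
pandas_std = pd.Series(bonus).std()

print(round(manual_std, 6))  # 1.290994
print(round(pandas_std, 6))  # 1.290994
```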

Example: Using std() row-wise on whole DataFrame

To get the row-wise sample standard deviation, the axis parameter can be set to 1.

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4],
  "Salary": [60, 62, 65, 59]},
  index= ["John", "Marry", "Sam", "Jo"]
)

print("The DataFrame is:")
print(df)

#sample standard deviation of values 
#of all entries row-wise
print("\ndf.std(axis=1) returns:")
print(df.std(axis=1))

The output of the above code will be:

The DataFrame is:
       Bonus  Salary
John       5      60
Marry      3      62
Sam        2      65
Jo         4      59

df.std(axis=1) returns:
John     38.890873
Marry    41.719300
Sam      44.547727
Jo       38.890873
dtype: float64
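
The row-wise values are large because each row mixes a small Bonus with a much larger Salary. With exactly two values per row, the sample standard deviation reduces to the absolute difference divided by the square root of 2. The sketch below checks that shortcut against df.std(axis=1); it is only a sanity check for this two-column case, not a general formula.

```python
import math
import pandas as pd

df = pd.DataFrame({
    "Bonus": [5, 3, 2, 4],
    "Salary": [60, 62, 65, 59]},
    index=["John", "Marry", "Sam", "Jo"]
)

# Row-wise sample standard deviation computed by pandas
row_std = df.std(axis=1)

# For two values a and b with ddof=1, std = |a - b| / sqrt(2)
shortcut = (df["Salary"] - df["Bonus"]).abs() / math.sqrt(2)

# True when both approaches agree for every row
print(((row_std - shortcut).abs() < 1e-9).all())
```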

Example: Using std() on selected column

Instead of the whole DataFrame, the std() function can be applied to selected columns. Consider the following example.

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4],
  "Last Salary": [58, 60, 63, 57],
  "Salary": [60, 62, 65, 59]},
  index= ["John", "Marry", "Sam", "Jo"]
)

print("The DataFrame is:")
print(df)

#sample standard deviation of 
#values of single column
print("\ndf['Salary'].std() returns:")
print(df["Salary"].std())

#sample standard deviation of
#values of multiple columns
print("\ndf[['Salary', 'Bonus']].std() returns:")
print(df[["Salary", "Bonus"]].std())

The output of the above code will be:

The DataFrame is:
       Bonus  Last Salary  Salary
John       5           58      60
Marry      3           60      62
Sam        2           63      65
Jo         4           57      59

df['Salary'].std() returns:
2.64575131106

df[['Salary', 'Bonus']].std() returns:
Salary    2.645751
Bonus     1.290994
dtype: float64
