
Pandas DataFrame - count() function



The Pandas DataFrame count() function counts the non-NA cells in each column or row. The values None, NaN, NaT, and optionally numpy.inf (depending on pandas.options.mode.use_inf_as_na) are considered NA.
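As a minimal sketch of which values are treated as NA, the snippet below builds a small DataFrame in which each column holds one kind of special value. The DataFrame sample and its column names are hypothetical and chosen only for illustration. It assumes the default setting, where infinity is not treated as NA.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: each column contains one kind of special value.
sample = pd.DataFrame({
    "with_none": ["a", None, "c"],
    "with_nan": [1.0, np.nan, 3.0],
    "with_nat": pd.to_datetime(["2024-01-01", None, "2024-01-03"]),
    "with_inf": [1.0, np.inf, 3.0],
})

# None, NaN and NaT are skipped; inf is counted under the default options.
na_counts = sample.count()
print(na_counts)
```

The first three columns should each report 2 and with_inf should report 3, since infinity is an ordinary float value unless configured otherwise.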

Syntax

DataFrame.count(axis=0, level=None, 
                numeric_only=False)

Parameters

axis Optional. Specify {0 or 'index', 1 or 'columns'}. If 0 or 'index', counts are generated for each column. If 1 or 'columns', counts are generated for each row. Default: 0
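level Optional. Specify level (int or str). If the axis is a MultiIndex (hierarchical), count along a particular level, collapsing into a DataFrame. A str specifies the level name. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0; use groupby with the same level instead.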
numeric_only Optional. Specify True to include only float, int or boolean data. Default: False
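The axis and numeric_only parameters can be illustrated with a short sketch that reuses the info DataFrame from the examples on this page. The variable names numeric_counts and row_counts are only illustrative.

```python
import numpy as np
import pandas as pd

info = pd.DataFrame({
    "Person": ["John", "Mary", "Jo", "Sam"],
    "Age": [25, 24, 30, 28],
    "Bonus": ["10K", np.nan, "10K", "9K"],
})

# numeric_only=True drops the non-numeric Person and Bonus columns.
numeric_counts = info.count(numeric_only=True)
print(numeric_counts)

# axis accepts the string aliases as well as 0 and 1.
row_counts = info.count(axis="columns")
print(row_counts)
```

With numeric_only=True only the Age column remains in the result, and axis="columns" behaves the same as axis=1.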

Return Value

Returns a Series containing the number of non-NA/null entries in each column or row. If level is specified, a DataFrame is returned instead.
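The return type can be checked with a short sketch like the one below. It reuses the info DataFrame from the examples on this page and, as a cross-check, compares the result with notna().sum(), which counts the same cells in a different way. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

info = pd.DataFrame({
    "Person": ["John", "Mary", "Jo", "Sam"],
    "Age": [25, 24, 30, 28],
    "Bonus": ["10K", np.nan, "10K", "9K"],
})

# On a DataFrame, count() returns a Series indexed by column name.
col_counts = info.count()
is_series = isinstance(col_counts, pd.Series)

# Summing the boolean notna() mask gives the same numbers.
matches_notna = bool((col_counts == info.notna().sum()).all())

print(is_series, matches_notna)
```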

Example: using count() column-wise on the whole DataFrame

In the example below, a DataFrame info is created. The count() function is used to get the count of non-NA values of each column.

import pandas as pd
import numpy as np

info = pd.DataFrame({
	"Person": ["John", "Mary", "Jo", "Sam"],
	"Age": [25, 24, 30, 28],
	"Bonus": ["10K", np.nan, "10K", "9K"]
})

print(info,"\n")
print(info.count())

The output of the above code will be:

  Person  Age Bonus
0   John   25   10K
1   Mary   24   NaN
2     Jo   30   10K
3    Sam   28    9K 

Person    4
Age       4
Bonus     3
dtype: int64

Example: using count() row-wise on the whole DataFrame

To get the row-wise count, the axis parameter can be set to 1.

import pandas as pd
import numpy as np

info = pd.DataFrame({
	"Person": ["John", "Mary", "Jo", "Sam"],
	"Age": [25, 24, 30, 28],
	"Bonus": ["10K", np.nan, "10K", "9K"]
})

print(info,"\n")
print(info.count(axis=1))

The output of the above code will be:

  Person  Age Bonus
0   John   25   10K
1   Mary   24   NaN
2     Jo   30   10K
3    Sam   28    9K 

0    3
1    2
2    3
3    3
dtype: int64

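Example: using count() on selected columns

Instead of the whole DataFrame, the count() function can be applied to selected columns. On a single column (a Series) it returns a single number; on a list of columns it returns a Series. Consider the following example.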

import pandas as pd
import numpy as np

info = pd.DataFrame({
	"Person": ["John", "Mary", "Jo", "Sam"],
	"Age": [25, 24, 30, 28],
	"Bonus": ["10K", np.nan, "10K", "9K"]
})

print(info)

#count on single column
print("\ncount on Person returns:")
print(info['Person'].count())

#count on multiple columns
print("\ncount on Person and Bonus returns:")
print(info[['Person', 'Bonus']].count())

The output of the above code will be:

  Person  Age Bonus
0   John   25   10K
1   Mary   24   NaN
2     Jo   30   10K
3    Sam   28    9K

count on Person returns:
4

count on Person and Bonus returns:
Person    4
Bonus     3
dtype: int64

Example: using count() with level parameter

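The level parameter counts non-NA values along one level of a MultiIndex. In the example below, Person and Bonus are set as the index, and the counts are grouped by the Person level. Note that the level parameter was deprecated in pandas 1.3 and removed in pandas 2.0, so this example runs only on earlier versions.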

import pandas as pd
import numpy as np

info = pd.DataFrame({
	"Person": ["John", "Mary", "Jo", "Sam"],
	"Age": [25, np.nan, 30, 28],
	"Bonus": ["10K", np.nan, "10K", "9K"]
})

print(info,"\n")

#count with level parameter
print(info.set_index(['Person', 'Bonus']).count(level='Person'))

The output of the above code will be:

  Person   Age Bonus
0   John  25.0   10K
1   Mary   NaN   NaN
2     Jo  30.0   10K
3    Sam  28.0    9K 

        Age
Person     
Jo        1
John      1
Mary      0
Sam       1
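On pandas versions where count() no longer accepts level, the same per-level count can be sketched with groupby. This is one possible equivalent rather than the only way to do it. The variable names indexed and per_person are illustrative.

```python
import numpy as np
import pandas as pd

info = pd.DataFrame({
    "Person": ["John", "Mary", "Jo", "Sam"],
    "Age": [25, np.nan, 30, 28],
    "Bonus": ["10K", np.nan, "10K", "9K"],
})

# Build the same MultiIndex as in the example above.
indexed = info.set_index(["Person", "Bonus"])

# Group by the Person level, then count non-NA values in each group.
per_person = indexed.groupby(level="Person").count()
print(per_person)
```

The result is a DataFrame indexed by Person with one Age column, where Mary has a count of 0 because her Age is NaN.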
