
Pandas - DataFrame Attributes



DataFrame attributes expose properties that are intrinsic to a DataFrame, such as its labels, data types, and dimensions. The most commonly used attributes are listed below:

Attribute          Description
DataFrame.columns  Returns the column labels of the DataFrame.
DataFrame.dtypes   Returns the dtypes in the DataFrame.
DataFrame.empty    Indicates whether the DataFrame is empty.
DataFrame.index    Returns the index (row labels) of the DataFrame.
DataFrame.ndim     Returns an int representing the number of axes / array dimensions.
DataFrame.shape    Returns a tuple representing the dimensionality of the DataFrame.
DataFrame.size     Returns an int representing the number of elements in this object.
DataFrame.values   Returns a NumPy representation of the DataFrame.

Let's discuss these attributes in detail:

DataFrame.columns

The columns attribute is used to return column labels of the DataFrame. Consider the following example:

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4, 3, 4],
  "Last Salary": [58, 60, 63, 57, 62, 59],
  "Salary": [60, 62, 65, 59, 63, 62]},
  index= ["John", "Marry", "Sam", "Jo", "Ramesh", "Kim"]
)

print("The DataFrame contains:")
print(df)

print("\nThe column labels are:")
print(df.columns)

The output of the above code will be:

The DataFrame contains:
        Bonus  Last Salary  Salary
John        5           58      60
Marry       3           60      62
Sam         2           63      65
Jo          4           57      59
Ramesh      3           62      63
Kim         4           59      62

The column labels are:
Index(['Bonus', 'Last Salary', 'Salary'], dtype='object')
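
The object returned by columns is a pandas Index, so it can be converted to a list, tested for membership, or measured with len(). The following is a minimal sketch on a smaller DataFrame than the one above; the variable names column_names, has_salary and n_columns are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Bonus": [5, 3, 2],
    "Salary": [60, 62, 65],
})

# Convert the Index of column labels to a plain Python list
column_names = df.columns.tolist()
print(column_names)

# Membership test against the column labels
has_salary = "Salary" in df.columns
print(has_salary)

# len() of the Index gives the number of columns
n_columns = len(df.columns)
print(n_columns)
```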

DataFrame.dtypes

The dtypes attribute returns a Series holding the data type of each column, indexed by column label. Consider the following example:

import pandas as pd

data = {'Name': ['John', 'Marry', 'Jo', 'Sam'],
        'Age': [25, 24, 30, 28]}
df = pd.DataFrame(data)

# Print the label and the Series separately so the output stays aligned
print("dtypes of df:")
print(df.dtypes)

The output of the above code will be:

dtypes of df:
Name    object
Age      int64
dtype: object
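
Because dtypes is itself a Series, it can be iterated like any other Series. The sketch below builds a dictionary of dtype names and picks out the float columns; the Height column and the helper names dtype_names and float_columns are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Age": [25, 24, 30, 28],
    "Height": [1.80, 1.65, 1.75, 1.70],  # hypothetical extra column
})

# Map each column label to the name of its dtype
dtype_names = {column: str(dtype) for column, dtype in df.dtypes.items()}
print(dtype_names)

# Keep only the columns stored as float64
float_columns = [column for column, name in dtype_names.items() if name == "float64"]
print(float_columns)
```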

DataFrame.empty

The empty attribute indicates whether the DataFrame is empty, that is, whether any of its axes has length 0. Consider the following example:

import pandas as pd

names = ['John', 'Marry', 'Jo', 'Sam']
df1 = pd.DataFrame(names)
df2 = pd.DataFrame()

print("Is df1 empty?:", df1.empty)
print("Is df2 empty?:", df2.empty)

The output of the above code will be:

Is df1 empty?: False
Is df2 empty?: True
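
Two edge cases are worth checking when relying on empty: a DataFrame that has columns but no rows counts as empty, while a DataFrame whose only values are NaN does not. A short sketch, with the names no_rows and only_nan chosen purely for illustration:

```python
import numpy as np
import pandas as pd

# Columns defined but no rows: one axis has length 0, so empty is True
no_rows = pd.DataFrame(columns=["Name", "Age"])

# One row holding only NaN: the row still counts, so empty is False
only_nan = pd.DataFrame({"Age": [np.nan]})

print("no_rows empty?:", no_rows.empty)
print("only_nan empty?:", only_nan.empty)

# Dropping the missing values first removes the only row
print("only_nan empty after dropna?:", only_nan.dropna().empty)
```

If missing values should count as "no data", drop them before checking empty, as in the last line.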

DataFrame.index

The index attribute is used to return the index (row labels) of the DataFrame.

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4, 3, 4],
  "Last Salary": [58, 60, 63, 57, 62, 59],
  "Salary": [60, 62, 65, 59, 63, 62]},
  index= ["John", "Marry", "Sam", "Jo", "Ramesh", "Kim"]
)

print("The DataFrame contains:")
print(df)

print("\nThe index (row labels) are:")
print(df.index)

The output of the above code will be:

The DataFrame contains:
        Bonus  Last Salary  Salary
John        5           58      60
Marry       3           60      62
Sam         2           63      65
Jo          4           57      59
Ramesh      3           62      63
Kim         4           59      62

The index (row labels) are:
Index(['John', 'Marry', 'Sam', 'Jo', 'Ramesh', 'Kim'], dtype='object')
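
When no index argument is passed, pandas assigns a default integer index, and assigning a new sequence to index relabels the rows. The sketch below assumes a small three-row DataFrame; default_index, labels and new_labels are illustrative names.

```python
import pandas as pd

# No index argument: pandas assigns a default RangeIndex starting at 0
df = pd.DataFrame({"Bonus": [5, 3, 2]})
default_index = df.index
print(default_index)

# Row labels as a plain Python list
labels = default_index.tolist()
print(labels)

# Assigning to index relabels the rows; the new length must match the row count
df.index = ["John", "Marry", "Sam"]
new_labels = df.index.tolist()
print(new_labels)
```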

DataFrame.ndim

The ndim attribute returns the number of axes (array dimensions) of the object. For a DataFrame this is always 2. Consider the example below:

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4, 3, 4],
  "Last Salary": [58, 60, 63, 57, 62, 59],
  "Salary": [60, 62, 65, 59, 63, 62]},
  index= ["John", "Marry", "Sam", "Jo", "Ramesh", "Kim"]
)

# Number of dimensions of df
print("Dimension of df:", df.ndim)

The output of the above code will be:

Dimension of df: 2
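
The value is easier to interpret next to a Series, which has a single axis. A minimal sketch, assuming a two-column DataFrame and selecting one column from it:

```python
import pandas as pd

df = pd.DataFrame({"Bonus": [5, 3], "Salary": [60, 62]})

# Selecting a single column yields a Series, which has one axis
column = df["Bonus"]

print("ndim of DataFrame:", df.ndim)
print("ndim of Series:", column.ndim)

# Even a DataFrame with no data still has two axes
empty_ndim = pd.DataFrame().ndim
print("ndim of empty DataFrame:", empty_ndim)
```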

DataFrame.shape

The shape attribute returns a tuple of the form (number of rows, number of columns) representing the dimensionality of the DataFrame. Consider the following example:

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4, 3, 4],
  "Last Salary": [58, 60, 63, 57, 62, 59],
  "Salary": [60, 62, 65, 59, 63, 62]},
  index= ["John", "Marry", "Sam", "Jo", "Ramesh", "Kim"]
)

# Shape of df as (rows, columns)
print("Shape of df:", df.shape)

The output of the above code will be:

Shape of df: (6, 3)
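
Since shape is an ordinary tuple, it can be unpacked directly into row and column counts. A small sketch, using a reduced version of the DataFrame above and the illustrative names n_rows, n_cols and row_count:

```python
import pandas as pd

df = pd.DataFrame(
    {"Bonus": [5, 3, 2, 4], "Salary": [60, 62, 65, 59]},
    index=["John", "Marry", "Sam", "Jo"],
)

# shape is (number of rows, number of columns), so it unpacks into two ints
n_rows, n_cols = df.shape
print(f"{n_rows} rows, {n_cols} columns")

# len(df) counts rows only and matches the first element of shape
row_count = len(df)
print(row_count)
```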

DataFrame.size

The size attribute returns the number of elements in the DataFrame, which equals the number of rows multiplied by the number of columns. Consider the example below:

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4, 3, 4],
  "Last Salary": [58, 60, 63, 57, 62, 59],
  "Salary": [60, 62, 65, 59, 63, 62]},
  index= ["John", "Marry", "Sam", "Jo", "Ramesh", "Kim"]
)

print("The DataFrame is:")
print(df)

print("\nThe number of elements in df:", df.size)

The output of the above code will be:

The DataFrame is:
        Bonus  Last Salary  Salary
John        5           58      60
Marry       3           60      62
Sam         2           63      65
Jo          4           57      59
Ramesh      3           62      63
Kim         4           59      62

The number of elements in df: 18
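
Note that size counts every cell, including missing ones. The sketch below, which assumes a DataFrame with two NaN cells, compares size with the product of shape and with the number of non-missing values obtained from count(); the variable names are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Bonus": [5, np.nan, 2],
    "Salary": [60, 62, np.nan],
})

n_rows, n_cols = df.shape

# size counts all cells, whether or not they hold a value
total_cells = df.size
print("size:", total_cells, "rows * columns:", n_rows * n_cols)

# count() returns the non-missing cells per column; summing gives the total
non_missing = int(df.count().sum())
print("non-missing cells:", non_missing)
```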

DataFrame.values

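The values attribute returns a NumPy representation of the DataFrame. Only the values are returned; the axis labels are dropped. The pandas documentation recommends the DataFrame.to_numpy() method for new code. Consider the following example: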

import pandas as pd

df = pd.DataFrame({
  "Bonus": [5, 3, 2, 4, 3, 4],
  "Last Salary": [58, 60, 63, 57, 62, 59],
  "Salary": [60, 62, 65, 59, 63, 62]},
  index= ["John", "Marry", "Sam", "Jo", "Ramesh", "Kim"]
)

print("The DataFrame is:")
print(df)

print("\nThe numpy representation of df:")
print(df.values)

The output of the above code will be:

The DataFrame is:
        Bonus  Last Salary  Salary
John        5           58      60
Marry       3           60      62
Sam         2           63      65
Jo          4           57      59
Ramesh      3           62      63
Kim         4           59      62

The numpy representation of df:
[[ 5 58 60]
 [ 3 60 62]
 [ 2 63 65]
 [ 4 57 59]
 [ 3 62 63]
 [ 4 59 62]]
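
When the columns do not share one dtype, the resulting array uses a dtype that can hold every column. The sketch below uses DataFrame.to_numpy(), the method the pandas documentation recommends over values, on two small hypothetical DataFrames; the Rate and Name columns are assumptions made for illustration.

```python
import pandas as pd

# Integer and float columns are upcast to a common float dtype
numeric = pd.DataFrame({"Bonus": [5, 3], "Rate": [0.5, 0.25]})
numeric_array = numeric.to_numpy()
print(numeric_array.dtype)

# Mixing text and numbers falls back to the generic object dtype
mixed = pd.DataFrame({"Name": ["John", "Sam"], "Bonus": [5, 3]})
mixed_array = mixed.to_numpy()
print(mixed_array.dtype)

# values and to_numpy() hold the same data for this DataFrame
same_data = (numeric.values == numeric_array).all()
print(same_data)
```

An object array loses most of NumPy's speed advantages, so it is usually better to select the numeric columns before converting.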