
Pandas DataFrame - transform() function



The Pandas DataFrame transform() function calls func on self and produces a DataFrame of transformed values with the same length as the original.

Syntax

DataFrame.transform(func, axis=0)

Parameters

func Required. The function used to transform the data. If a function, it must either work when passed a DataFrame or when passed to DataFrame.apply. If func is both list-like and dict-like, dict-like behavior takes precedence. Accepted combinations are:
  • function
  • string function name
  • list of functions and/or function names, e.g. [np.exp, 'sqrt']
  • dictionary of axis labels -> functions, function names or list of such.
axis Optional. The axis along which the function is applied. Default is 0. If 0 or 'index', the function is applied to each column. If 1 or 'columns', the function is applied to each row.
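The effect of the axis parameter is easiest to see with a function whose result depends on the direction in which it is applied. The following is a minimal sketch, not part of the original examples: the variable name row_share is chosen here for illustration, and the lambda divides each value by the total of its row.

```python
import pandas as pd

df = pd.DataFrame(
    {"Sample1": [1, 2, 3, 4, 5], "Sample2": [11, 12, 13, 14, 15]},
    index=["x1", "x2", "x3", "x4", "x5"],
)

# With axis=1 the lambda receives each row as a Series and must
# return a Series of the same length. Here every value is replaced
# by its share of the row total.
row_share = df.transform(lambda row: row / row.sum(), axis=1)

print(row_share)
```

With the default axis=0, the same lambda would instead divide each value by the total of its column.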

Return Value

Returns a DataFrame of the same length as self, containing the transformed values.

Exceptions

Raises ValueError if the returned DataFrame has a different length than self.
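As a rough illustration of this exception, the sketch below passes a function that reduces each column to a single value, so the result no longer lines up with the original rows. The flag name raised is introduced here for illustration only, and the exact error message may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame(
    {"Sample1": [1, 2, 3, 4, 5], "Sample2": [11, 12, 13, 14, 15]},
    index=["x1", "x2", "x3", "x4", "x5"],
)

# col.sum() collapses each column to one number, so the result is an
# aggregation rather than a transformation and transform() rejects it.
try:
    df.transform(lambda col: col.sum())
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)
```

For reductions such as this one, DataFrame.agg() or DataFrame.apply() is the more suitable call.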

Example: using transform() on the whole DataFrame

In the example below, a DataFrame df is created. The transform() function is then used to apply a given function to every column of this DataFrame.

import pandas as pd
import numpy as np

df = pd.DataFrame({
  "Sample1": [1, 2, 3, 4, 5],
  "Sample2": [11, 12, 13, 14, 15]},
  index= ["x1", "x2", "x3", "x4", "x5"]
)

print("The DataFrame contains:")
print(df)

print("\ndf.transform(lambda x: x*10) returns:")
print(df.transform(lambda x: x*10))

The output of the above code will be:

The DataFrame contains:
    Sample1  Sample2
x1        1       11
x2        2       12
x3        3       13
x4        4       14
x5        5       15

df.transform(lambda x: x*10) returns:
    Sample1  Sample2
x1       10      110
x2       20      120
x3       30      130
x4       40      140
x5       50      150

Example: using multiple operations on the whole DataFrame

Multiple operations can be applied to a DataFrame at the same time. In the example below, two operations, 'sqrt' (square root) and 'cbrt' (cube root), are applied. Each produces a result of the same length as the original DataFrame, and the results are placed side by side under each column name.

import pandas as pd
import numpy as np

df = pd.DataFrame({
  "Sample1": [1, 2, 3, 4, 5],
  "Sample2": [11, 12, 13, 14, 15]},
  index= ["x1", "x2", "x3", "x4", "x5"]
)

print("The DataFrame is:")
print(df)

print("\ndf.transform(['sqrt', 'cbrt']) returns:")
print(df.transform(['sqrt', 'cbrt']))

The output of the above code will be:

The DataFrame is:
    Sample1  Sample2
x1        1       11
x2        2       12
x3        3       13
x4        4       14
x5        5       15

df.transform(['sqrt', 'cbrt']) returns:
     Sample1             Sample2          
        sqrt      cbrt      sqrt      cbrt
x1  1.000000  1.000000  3.316625  2.223980
x2  1.414214  1.259921  3.464102  2.289428
x3  1.732051  1.442250  3.605551  2.351335
x4  2.000000  1.587401  3.741657  2.410142
x5  2.236068  1.709976  3.872983  2.466212
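The two-row header in the output above reflects a MultiIndex on the columns, with the original column name on the first level and the function name on the second. The following sketch shows two ways to pull individual results out of it. The variable names result, sample1_sqrt and all_sqrt are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Sample1": [1, 2, 3, 4, 5], "Sample2": [11, 12, 13, 14, 15]},
    index=["x1", "x2", "x3", "x4", "x5"],
)

result = df.transform(["sqrt", "cbrt"])

# Select one (column, function) pair with a tuple key.
sample1_sqrt = result[("Sample1", "sqrt")]

# Select the 'sqrt' results for every column by taking a
# cross-section on the second (function name) level.
all_sqrt = result.xs("sqrt", axis=1, level=1)

print(sample1_sqrt)
print(all_sqrt)
```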

Example: using transform() on selected columns

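Instead of the whole DataFrame, the transform() function can be applied to selected columns. Selecting a single column gives a Series, so the result is a Series; selecting a list of columns gives a DataFrame. Consider the following example.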

import pandas as pd
import numpy as np

df = pd.DataFrame({
  "Sample1": [1, 2, 3, 4, 5],
  "Sample2": [11, 12, 13, 14, 15],
  "Sample3": [21, 22, 23, 24, 25]},
  index= ["x1", "x2", "x3", "x4", "x5"]
)

print("The DataFrame contains:")
print(df)

# applying transform() on a single column
print("\ndf['Sample1'].transform('sqrt') returns:")
print(df['Sample1'].transform('sqrt'))

# applying transform() on multiple columns
print("\ndf[['Sample1', 'Sample2']].transform('sqrt') returns:")
print(df[['Sample1', 'Sample2']].transform('sqrt'))

The output of the above code will be:

The DataFrame contains:
    Sample1  Sample2  Sample3
x1        1       11       21
x2        2       12       22
x3        3       13       23
x4        4       14       24
x5        5       15       25

df['Sample1'].transform('sqrt') returns:
x1    1.000000
x2    1.414214
x3    1.732051
x4    2.000000
x5    2.236068
Name: Sample1, dtype: float64

df[['Sample1', 'Sample2']].transform('sqrt') returns:
     Sample1   Sample2
x1  1.000000  3.316625
x2  1.414214  3.464102
x3  1.732051  3.605551
x4  2.000000  3.741657
x5  2.236068  3.872983
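Because the transformed result keeps the same index as the input, it can be assigned straight back to the DataFrame as a new column. The sketch below assumes a new column name, Sample1_sqrt, which is hypothetical and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Sample1": [1, 2, 3, 4, 5],
        "Sample2": [11, 12, 13, 14, 15],
        "Sample3": [21, 22, 23, 24, 25],
    },
    index=["x1", "x2", "x3", "x4", "x5"],
)

# The result has one value per original row, so it aligns with df
# and can be stored alongside the existing columns.
df["Sample1_sqrt"] = df["Sample1"].transform("sqrt")

print(df)
```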

Example: using different operations on different columns

A different operation can be applied to each column by passing a dictionary that maps column labels to functions. Consider the following example.

import pandas as pd
import numpy as np

df = pd.DataFrame({
  "Sample1": [1, 2, 3, 4, 5],
  "Sample2": [11, 12, 13, 14, 15]},
  index= ["x1", "x2", "x3", "x4", "x5"]
)

print("The DataFrame contains:")
print(df)

# different operations on different columns
print("\ndf.transform({'Sample1':'sqrt', 'Sample2':'log10'}) returns:")
print(df.transform({'Sample1':'sqrt', 'Sample2':'log10'}))

The output of the above code will be:

The DataFrame contains:
    Sample1  Sample2
x1        1       11
x2        2       12
x3        3       13
x4        4       14
x5        5       15

df.transform({'Sample1':'sqrt', 'Sample2':'log10'}) returns:
     Sample1   Sample2
x1  1.000000  1.041393
x2  1.414214  1.079181
x3  1.732051  1.113943
x4  2.000000  1.146128
x5  2.236068  1.176091
