
Pandas DataFrame - shift() function



The Pandas DataFrame shift() function shifts the index by a specified number of periods, with an optional time frequency (freq).

When freq is not provided, the data is shifted by the given number of periods while the index stays in place, and the vacated positions are filled with missing values. When freq is provided, the index labels themselves are advanced by periods times freq and the data is left where it is. In that case the index must be date or datetime, or a NotImplementedError is raised.
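The two modes are easiest to compare side by side. The following is a minimal sketch, assuming a three-row DataFrame with a daily DatetimeIndex; the column name value and the variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical three-row frame with a daily DatetimeIndex.
df = pd.DataFrame(
    {"value": [10, 20, 30]},
    index=pd.date_range("2018-05-01", periods=3, freq="D"),
)

# Without freq: the index is unchanged and the data moves down one row.
data_shifted = df.shift(1)

# With freq: the data is unchanged and every index label moves forward one day.
index_shifted = df.shift(1, freq="D")

print(data_shifted)
print(index_shifted)
```

In data_shifted the first row is vacated and the last value (30) drops out, while index_shifted keeps all three values and relabels them 2018-05-02 through 2018-05-04.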

Syntax

DataFrame.shift(periods=1, freq=None, axis=0, fill_value=<no_default>)

Parameters

periods Optional. Number of periods to shift. It can be positive or negative. Default: 1
freq Optional. A DateOffset, tseries.offsets object, timedelta, or str giving the offset to use from the tseries module or a time rule (e.g. 'EOM'). If freq is specified, the index values are shifted but the data is not realigned. Pass 'infer' to take the frequency from the index itself. Default: None
axis Optional. One of {0 or 'index', 1 or 'columns'}. With 0 or 'index', values move along the index, that is, down or up the rows within each column. With 1 or 'columns', values move across the columns within each row. Default: 0
fill_value Optional. Scalar value to use for newly introduced missing values. The default depends on the dtype: NaN for numeric data, NaT for datetime, timedelta, or period data, and self.dtype.na_value for extension dtypes.
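As a small sketch of a negative periods value combined with each axis setting, assuming a made-up two-column frame with a default integer index:

```python
import pandas as pd

# Illustrative data; the default integer index is enough when freq is not used.
df = pd.DataFrame({"ColA": [20, 15, 25], "ColB": [17, 23, 18]})

# periods=-1 with axis=0: each value moves up one row, vacating the last row.
up = df.shift(-1)

# periods=-1 with axis=1: each value moves one column to the left,
# vacating the last column.
left = df.shift(-1, axis=1)

print(up)
print(left)
```

A negative value simply reverses the direction of the shift, so the vacated positions appear at the end of the axis instead of the start.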

Return Value

Returns a shifted copy of the input object. The original DataFrame is not modified.
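As a quick check, assuming a small made-up frame, the result can be compared with the original to confirm that shift() hands back a new object rather than changing df in place:

```python
import pandas as pd

df = pd.DataFrame({"ColA": [20, 15, 25]})

shifted = df.shift(1)

print(shifted is df)          # False: a separate object is returned
print(df["ColA"].tolist())    # [20, 15, 25]: the original is unchanged
```

To keep the shifted data, assign the result to a name, as done with shifted here.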

Example: using shift()

In the example below, a DataFrame df is created. The shift() function is then used to shift its data by a specified number of periods, first along the index and then along the columns.

import pandas as pd

df = pd.DataFrame({
  "ColA": [20, 15, 25, 32, 45],
  "ColB": [17, 23, 18, 33, 38],
  "ColC": [24, 27, 22, 37, 52]},
  index=pd.date_range("2018-05-01", "2018-05-05")
)

print("The DataFrame is:")
print(df)

# shift the data by 1 period along the index (down the rows)
print("\ndf.shift() returns:")
print(df.shift())

# shift the data by 3 periods along the index (down the rows)
print("\ndf.shift(3) returns:")
print(df.shift(3))

# shift the data by 1 period along the columns (across each row)
print("\ndf.shift(axis=1) returns:")
print(df.shift(axis=1))

The output of the above code will be:

The DataFrame is:
            ColA  ColB  ColC
2018-05-01    20    17    24
2018-05-02    15    23    27
2018-05-03    25    18    22
2018-05-04    32    33    37
2018-05-05    45    38    52

df.shift() returns:
            ColA  ColB  ColC
2018-05-01   NaN   NaN   NaN
2018-05-02  20.0  17.0  24.0
2018-05-03  15.0  23.0  27.0
2018-05-04  25.0  18.0  22.0
2018-05-05  32.0  33.0  37.0

df.shift(3) returns:
            ColA  ColB  ColC
2018-05-01   NaN   NaN   NaN
2018-05-02   NaN   NaN   NaN
2018-05-03   NaN   NaN   NaN
2018-05-04  20.0  17.0  24.0
2018-05-05  15.0  23.0  27.0

df.shift(axis=1) returns:
            ColA  ColB  ColC
2018-05-01   NaN    20    17
2018-05-02   NaN    15    23
2018-05-03   NaN    25    18
2018-05-04   NaN    32    33
2018-05-05   NaN    45    38

Example: using fill_value parameter

The fill_value parameter specifies the scalar used to fill newly introduced missing values. Consider the example below:

import pandas as pd

df = pd.DataFrame({
  "ColA": [20, 15, 25, 32, 45],
  "ColB": [17, 23, 18, 33, 38],
  "ColC": [24, 27, 22, 37, 52]},
  index=pd.date_range("2018-05-01", "2018-05-05")
)

print("The DataFrame is:")
print(df)

# shift the data by 3 periods along the index
print("\ndf.shift(3) returns:")
print(df.shift(3))

# shift the data by 3 periods along the index,
# filling the vacated rows with 0
print("\ndf.shift(3, fill_value=0) returns:")
print(df.shift(3, fill_value=0))

The output of the above code will be:

The DataFrame is:
            ColA  ColB  ColC
2018-05-01    20    17    24
2018-05-02    15    23    27
2018-05-03    25    18    22
2018-05-04    32    33    37
2018-05-05    45    38    52

df.shift(3) returns:
            ColA  ColB  ColC
2018-05-01   NaN   NaN   NaN
2018-05-02   NaN   NaN   NaN
2018-05-03   NaN   NaN   NaN
2018-05-04  20.0  17.0  24.0
2018-05-05  15.0  23.0  27.0

df.shift(3, fill_value=0) returns:
            ColA  ColB  ColC
2018-05-01     0     0     0
2018-05-02     0     0     0
2018-05-03     0     0     0
2018-05-04    20    17    24
2018-05-05    15    23    27
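The two outputs above also differ in how the numbers print: 20.0 without fill_value and 20 with it. NaN cannot be stored in an integer column, so pandas converts the column to float unless an integer fill value is supplied. A minimal sketch on a made-up single-column frame shows this:

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype

df = pd.DataFrame({"ColA": [20, 15, 25]})

# Default fill (NaN) forces the integer column to become float.
with_nan = df.shift(1)

# An integer fill value lets the column keep its integer dtype.
with_zero = df.shift(1, fill_value=0)

print(with_nan["ColA"].dtype, with_zero["ColA"].dtype)
print(is_float_dtype(with_nan["ColA"]), is_integer_dtype(with_zero["ColA"]))
```

Choosing a fill value of the same type as the column is therefore one way to avoid an unintended change of dtype.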

Example: using freq parameter

With the freq parameter, the index is shifted by the specified number of periods of that frequency, while the data stays in place. Consider the example below:

import pandas as pd

df = pd.DataFrame({
  "ColA": [20, 15, 25, 32, 45],
  "ColB": [17, 23, 18, 33, 38],
  "ColC": [24, 27, 22, 37, 52]},
  index=pd.date_range("2018-05-01", "2018-05-05")
)

print("The DataFrame is:")
print(df)

# shift the index by 3 periods using freq='D'
print("\ndf.shift(periods=3, freq='D') returns:")
print(df.shift(periods=3, freq='D'))

# shift the index by 3 periods using freq='infer',
# which takes the frequency from the index itself
print("\ndf.shift(periods=3, freq='infer') returns:")
print(df.shift(periods=3, freq='infer'))

The output of the above code will be:

The DataFrame is:
            ColA  ColB  ColC
2018-05-01    20    17    24
2018-05-02    15    23    27
2018-05-03    25    18    22
2018-05-04    32    33    37
2018-05-05    45    38    52

df.shift(periods=3, freq='D') returns:
            ColA  ColB  ColC
2018-05-04    20    17    24
2018-05-05    15    23    27
2018-05-06    25    18    22
2018-05-07    32    33    37
2018-05-08    45    38    52

df.shift(periods=3, freq='infer') returns:
            ColA  ColB  ColC
2018-05-04    20    17    24
2018-05-05    15    23    27
2018-05-06    25    18    22
2018-05-07    32    33    37
2018-05-08    45    38    52
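The restriction on the index type when freq is used can be checked directly. This is a minimal sketch, assuming a made-up frame with plain integer labels; the error message is not printed because its wording may vary between pandas versions.

```python
import pandas as pd

# Integer labels, so the index is not date- or datetime-like.
df = pd.DataFrame({"ColA": [20, 15, 25]}, index=[1, 2, 3])

try:
    df.shift(periods=1, freq="D")
    raised = False
except NotImplementedError:
    # Shifting with freq is not supported for this kind of index.
    raised = True

print(raised)

# Without freq the same frame shifts normally, because only the data moves.
print(df.shift(periods=1))
```

If a frame like this actually holds dates stored as another type, converting the index to datetime first is the usual prerequisite for using freq.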
