row_to_names : Elevates a row to be the column names of a DataFrame.

Background

This notebook shows a brief example of how to replace a dataframe's column names with the values from one of its rows.

[2]:
import pandas as pd
import janitor
from io import StringIO
[3]:
data = '''shoe, 220, 100
          shoe, 450, 40
          item, retail_price, cost
          shoe, 200, 38
          bag, 305, 25
       '''
[4]:
temp = pd.read_csv(StringIO(data), header=None)
temp
[4]:
      0             1     2
0  shoe           220   100
1  shoe           450    40
2  item  retail_price  cost
3  shoe           200    38
4   bag           305    25

In the dataframe above, row 2 holds the values we want as column names. One way to achieve this in plain pandas takes four steps:

  1. Use loc/iloc to assign row 2 to columns.

  2. Strip off any whitespace.

  3. Drop row 2 from the dataframe using the drop method.

  4. Set the name of the columns axis to None.

[5]:
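# Step 1: use the values in row 2 as the column labels.
temp.columns = temp.iloc[2, :]
# Step 2: strip leading and trailing whitespace from the new labels.
temp.columns = temp.columns.str.strip()
# Step 3: drop row 2, which is now redundant.
temp = temp.drop(2, axis=0)
# Step 4: clear the columns axis name (it was inherited from the row label, 2).
temp = temp.rename_axis(None, axis='columns')
temp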
[5]:
   item retail_price cost
0  shoe          220  100
1  shoe          450   40
3  shoe          200   38
4   bag          305   25

However, the first two steps are assignments to temp.columns, so they cannot be part of a method chain. The row_to_names function resolves this:

[6]:
df = (
    pd.read_csv(StringIO(data), header=None)
    .row_to_names(row_number=2, remove_row=True)
)

df
[6]:
   item retail_price cost
0  shoe          220  100
1  shoe          450   40
3  shoe          200   38
4   bag          305   25
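
For comparison, the same chainable behavior can be roughly sketched in plain pandas using DataFrame.pipe. This is only an illustration, not how row_to_names is implemented. The helper name promote_row is hypothetical (it is not part of pandas or pyjanitor), and it assumes the dataframe has unique index labels, as the default integer index does. Like step 2 of the manual approach, it strips whitespace from the new column names.

```python
from io import StringIO

import pandas as pd

data = '''shoe, 220, 100
          shoe, 450, 40
          item, retail_price, cost
          shoe, 200, 38
          bag, 305, 25
       '''


def promote_row(df: pd.DataFrame, row_number: int) -> pd.DataFrame:
    """Hypothetical helper: use the row at position `row_number` as column names."""
    # Read the row's values as strings and strip surrounding whitespace.
    names = df.iloc[row_number].astype(str).str.strip().tolist()
    # Drop the promoted row; the remaining index labels are left untouched.
    # Assumes index labels are unique, otherwise duplicates would be dropped too.
    out = df.drop(index=df.index[row_number])
    # Assigning a plain list leaves the columns axis unnamed.
    out.columns = names
    return out


df_plain = (
    pd.read_csv(StringIO(data), header=None)
    .pipe(promote_row, row_number=2)
)
```

Because pipe passes the dataframe as the first argument, the helper slots into a chain much like row_to_names does, at the cost of defining and maintaining the function yourself.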