janitor.drop_duplicate_columns

janitor.drop_duplicate_columns(df: pandas.core.frame.DataFrame, column_name: Hashable, nth_index: int = 0) → pandas.core.frame.DataFrame

Remove one of several columns that share the name column_name, selected by its index (nth_index) among those duplicates.

This method does not mutate the original DataFrame.

With nth_index=0 the first column named column_name is removed, with nth_index=1 the second, and so on.

The corresponding tidyverse (R) operation is: select(-<column_name>_<nth_index + 1>)
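The indexing rule above can be illustrated with plain pandas. This is only a minimal sketch of the described behaviour, not pyjanitor's implementation; the helper name drop_nth_duplicate is hypothetical and was chosen for this example.

```python
import pandas as pd


def drop_nth_duplicate(df: pd.DataFrame, column_name, nth_index: int = 0) -> pd.DataFrame:
    """Hypothetical helper: drop the nth column (zero-based) labelled column_name."""
    # Positions of every column whose label equals column_name.
    positions = [i for i, label in enumerate(df.columns) if label == column_name]
    # Position of the single column to drop (IndexError if nth_index is out of range).
    target = positions[nth_index]
    # Select all other columns by position; this returns a new DataFrame.
    keep = [i for i in range(df.shape[1]) if i != target]
    return df.iloc[:, keep]


# Three columns share the label "a".
df = pd.DataFrame(
    [[0, 10, 20, 30], [1, 11, 21, 31]],
    columns=["a", "b", "a", "a"],
)

result = drop_nth_duplicate(df, column_name="a", nth_index=1)
print(list(result.columns))     # ['a', 'b', 'a']
print(result.iloc[0].tolist())  # [0, 10, 30]
print(df.shape)                 # (2, 4): the input is left unchanged
```

Selecting by position rather than by label matters here: a label-based drop such as df.drop(columns="a") would remove every column named "a" at once.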

Method chaining syntax:

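import pandas as pd
import janitor  # registers clean_names and drop_duplicate_columns on DataFrame

df = pd.DataFrame({
    "a": range(10),
    "b": range(10),
    "A": range(10, 20),
    "a*": range(20, 30),
}).clean_names(remove_special=True)  # column names are now a, b, a, a

# remove the second of the duplicated 'a' columns
df.drop_duplicate_columns(column_name="a", nth_index=1)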
Parameters
  • df – A pandas DataFrame

  • column_name – Name of the duplicated column to remove

  • nth_index – Zero-based position, among the columns named column_name, of the column to drop. Defaults to 0.

Returns

A new pandas DataFrame with the selected duplicate column removed