janitor.take_first

janitor.take_first(df: pandas.core.frame.DataFrame, subset: Union[Hashable, Iterable[Hashable]], by: Hashable, ascending: bool = True) → pandas.core.frame.DataFrame

Take the first row within each group defined by subset, after sorting the rows by the by column.

This method does not mutate the original DataFrame.

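import pandas as pd
import janitor  # registers take_first as a DataFrame method

data = {
    "a": ["x", "x", "y", "y"],
    "b": [0, 1, 2, 3],
}
df = pd.DataFrame(data)

# For each value of "a", keep the row with the smallest "b".
df.take_first(subset="a", by="b")

Parameters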
  • df – A pandas DataFrame.

  • subset – Column name, or iterable of column names, defining the groups.

  • by – Column to sort by before the first row of each group is taken.

  • ascending – Whether to sort in ascending order. Defaults to True.
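
The effect of by and ascending can be illustrated with a plain-pandas sketch. It assumes that take_first behaves like a sort on by followed by keeping the first row for each distinct subset value; that is an assumption made for illustration, not a copy of the library's code, and the helper name take_first_sketch is hypothetical.

```python
import pandas as pd


def take_first_sketch(df, subset, by, ascending=True):
    """Hypothetical stand-in: sort by `by`, then keep the first row per `subset` group."""
    # sort_values returns a new frame, so the input is left untouched.
    ordered = df.sort_values(by=by, ascending=ascending)
    return ordered.drop_duplicates(subset=subset, keep="first")


df = pd.DataFrame({"a": ["x", "x", "y", "y"], "b": [0, 1, 2, 3]})

# ascending=True keeps the row with the smallest "b" in each "a" group.
smallest = take_first_sketch(df, subset="a", by="b")

# ascending=False keeps the row with the largest "b" in each "a" group.
largest = take_first_sketch(df, subset="a", by="b", ascending=False)
```

Under that assumption, smallest holds the rows with b equal to 0 and 2, and largest holds the rows with b equal to 3 and 1, in sorted order.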

Returns

A pandas DataFrame containing the first row of each group.