janitor.count_cumulative_unique
janitor.count_cumulative_unique(df: pandas.core.frame.DataFrame, column_name: Hashable, dest_column_name: str, case_sensitive: bool = True) → pandas.core.frame.DataFrame
Generates a running total of cumulative unique values in a given column.
Functional usage syntax:
    import pandas as pd
    import janitor as jn

    df = pd.DataFrame(...)
    df = jn.functions.count_cumulative_unique(
        df=df,
        column_name='animals',
        dest_column_name='animals_unique_count',
        case_sensitive=True
    )
Method chaining usage example:
    import pandas as pd
    import janitor

    df = pd.DataFrame(...)
    df = df.count_cumulative_unique(
        column_name='animals',
        dest_column_name='animals_unique_count',
        case_sensitive=True
    )
A new column is created containing a running count of the unique values seen so far in the specified column. If case_sensitive is True, letter case matters (i.e., ‘a’ != ‘A’); otherwise, values that differ only in letter case are counted as the same value.
This method mutates the original DataFrame.
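The counting rule described above can be illustrated with a small pure-Python sketch. This is not pyjanitor's implementation; `running_unique_count` is a hypothetical helper written only for illustration, and lowercasing the string form of each value for the case-insensitive branch is an assumption about how such a comparison might be done.

```python
from typing import Hashable, Iterable, List


def running_unique_count(
    values: Iterable[Hashable], case_sensitive: bool = True
) -> List[int]:
    """Return, for each position, how many distinct values have been seen so far.

    Hypothetical helper for illustration only; not part of pyjanitor.
    """
    seen = set()
    counts = []
    for value in values:
        # Assumption: when case is ignored, compare on the lowercased
        # string form of each value.
        key = value if case_sensitive else str(value).lower()
        seen.add(key)
        counts.append(len(seen))
    return counts


animals = ["cat", "Cat", "dog", "cat", "DOG", "bird"]

case_sensitive_counts = running_unique_count(animals, case_sensitive=True)
case_insensitive_counts = running_unique_count(animals, case_sensitive=False)

print(case_sensitive_counts)    # [1, 2, 3, 3, 4, 5]
print(case_insensitive_counts)  # [1, 1, 2, 2, 2, 3]
```

With case sensitivity on, ‘cat’, ‘Cat’, ‘dog’ and ‘DOG’ each count as new values the first time they appear. With it off, ‘Cat’ and ‘DOG’ do not increase the count.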
- Parameters
df – A pandas DataFrame.
column_name – Name of the column containing values from which a running count of unique values will be created.
dest_column_name – The name of the new column containing the cumulative count of unique values that will be created.
case_sensitive – Whether uppercase and lowercase letters are treated as distinct values (e.g., ‘A’ != ‘a’ if True). Defaults to True.
- Returns
A pandas DataFrame with a new column containing a cumulative count of unique values from another column.