janitor.filter_column_isin

janitor.filter_column_isin(df: pandas.core.frame.DataFrame, column_name: Hashable, iterable: Iterable, complement: bool = False) → pandas.core.frame.DataFrame

Filter a dataframe for values in a column that exist in another iterable.

This method does not mutate the original DataFrame.

Assumes exact matching; fuzzy matching not implemented.

The example below filters the DataFrame so that only rows whose “names” value is exactly “James” or “John” are kept.

import pandas as pd
import janitor  # importing janitor registers its methods on DataFrame

df = (
    pd.DataFrame(...)
    .clean_names()
    .filter_column_isin(column_name="names", iterable=["James", "John"])
)

This is the method chaining alternative to:

df = df[df['names'].isin(['James', 'John'])]

If complement is True, only rows whose names are neither “James” nor “John” are returned.
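To make the selection and its complement concrete, here is a minimal sketch that uses only plain pandas indexing, the equivalent form quoted above. The sample data and the variable names (mask, selected, complement_rows) are invented for illustration and do not come from the library.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
df = pd.DataFrame(
    {"names": ["James", "John", "Mary", "Anna"], "age": [31, 45, 28, 39]}
)

# Boolean mask: True where the name is exactly "James" or "John".
mask = df["names"].isin(["James", "John"])

selected = df[mask]          # rows described for complement=False
complement_rows = df[~mask]  # rows described for complement=True

print(selected["names"].tolist())         # ['James', 'John']
print(complement_rows["names"].tolist())  # ['Mary', 'Anna']
```

Both results are new DataFrames; df itself still holds all four rows.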

Parameters
  • df – A pandas DataFrame

  • column_name – The column on which to filter.

  • iterable – An iterable of values to match against, such as a list, a tuple, or another pandas Series.

  • complement – Whether to return the complement of the selection or not.

Returns

A filtered pandas DataFrame.

Raises

ValueError – if iterable is empty, that is, if it does not have a length of 1 or greater.
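The documented contract (non-mutating, optional complement, ValueError on an empty iterable) can be sketched in plain pandas as follows. This is a hypothetical stand-in, not pyjanitor's actual implementation: the name filter_column_isin_sketch, the error message, and the choice to copy the iterable into a list before checking its length are all assumptions made for illustration.

```python
from typing import Hashable, Iterable

import pandas as pd


def filter_column_isin_sketch(
    df: pd.DataFrame,
    column_name: Hashable,
    iterable: Iterable,
    complement: bool = False,
) -> pd.DataFrame:
    """Hypothetical stand-in mirroring the documented behaviour."""
    # Copy into a list so any iterable can be length-checked
    # (a choice of this sketch, not necessarily the library's).
    values = list(iterable)
    if len(values) == 0:
        raise ValueError("`iterable` must contain at least one value.")

    mask = df[column_name].isin(values)
    if complement:
        mask = ~mask

    # Boolean indexing returns a new DataFrame; `df` is left untouched.
    return df[mask]


# Usage with invented sample data.
people = pd.DataFrame({"names": ["James", "John", "Mary"]})
kept = filter_column_isin_sketch(people, "names", ["James", "John"])
others = filter_column_isin_sketch(people, "names", ("James", "John"), complement=True)

try:
    filter_column_isin_sketch(people, "names", [])
    raised = False
except ValueError:
    raised = True
```

Here kept contains the James and John rows, others contains only the Mary row, and the call with an empty list raises ValueError.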