janitor.expand_grid

janitor.expand_grid(df: Optional[pandas.core.frame.DataFrame] = None, df_key: Optional[str] = None, others: Optional[Dict] = None) → pandas.core.frame.DataFrame

Creates a dataframe from a cartesian combination of all inputs.

The inputs are supplied as a dictionary of name-value pairs.
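As a rough illustration of what "cartesian combination" means here, the sketch below builds the same rows with the standard library only. It is not janitor's implementation; the helper name cartesian_rows is hypothetical and returns plain dictionaries rather than a dataframe.

```python
from itertools import product


def cartesian_rows(named_values):
    """Return one dict per combination of the values in `named_values`.

    Hypothetical helper for illustration only; not part of janitor.
    """
    names = list(named_values)
    # product() varies the last input fastest, so "y" cycles within each "x".
    return [dict(zip(names, combo)) for combo in product(*named_values.values())]


rows = cartesian_rows({"x": range(1, 4), "y": [1, 2]})
for row in rows:
    print(row)
```

Three values of x combined with two values of y give six rows, one per pairing.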

The inputs are not restricted to dataframes; any list-like structure that is 1- or 2-dimensional can be used.

When method-chaining on a dataframe, a key must be provided; it represents the dataframe in the output column names.
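One way to picture the role of the key is the standard-library sketch below. It assumes the key is joined to each original column name with an underscore (for example, key "df" and column "x" give "df_x"), which is inferred from the example output on this page rather than from janitor's source. The helper expand_with_key is hypothetical, and the dataframe is stood in for by a list of row dictionaries.

```python
from itertools import product


def expand_with_key(records, df_key, others):
    """Pair every whole record with every combination of `others`.

    Hypothetical helper for illustration only; not part of janitor.
    """
    other_names = list(others)
    out = []
    for record, combo in product(records, product(*others.values())):
        # Assumed naming scheme: prefix each original column with the key.
        row = {f"{df_key}_{name}": value for name, value in record.items()}
        row.update(zip(other_names, combo))
        out.append(row)
    return out


# Each dict stands in for one row of the dataframe.
records = [{"x": 1, "y": 2}, {"x": 2, "y": 1}]
grid = expand_with_key(records, df_key="df", others={"z": range(1, 4)})
for row in grid:
    print(row)
```

Each record stays intact as a unit: its x and y values are never recombined with each other, only paired with every value of z.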

Data types are preserved in this function, including Pandas’ extension array dtypes.

The output will always be a dataframe.

Example:

import pandas as pd
import janitor as jn

df = pd.DataFrame({"x":range(1,3), "y":[2,1]})
others = {"z" : range(1,4)}

df.expand_grid(df_key="df",others=others)

# df_x |   df_y |   z
#    1 |      2 |   1
#    1 |      2 |   2
#    1 |      2 |   3
#    2 |      1 |   1
#    2 |      1 |   2
#    2 |      1 |   3

# create a dataframe from all combinations in a dictionary
data = {"x":range(1,4), "y":[1,2]}

jn.expand_grid(others=data)

#  x |   y
#  1 |   1
#  1 |   2
#  2 |   1
#  2 |   2
#  3 |   1
#  3 |   2

Note

If a MultiIndex DataFrame or Series is passed, the index/columns will be discarded, and a single-indexed dataframe will be returned.

Functional usage syntax:

import pandas as pd
import janitor as jn

df = pd.DataFrame(...)
df = jn.expand_grid(df=df, df_key="...", others={...})

Method-chaining usage syntax:

import pandas as pd
import janitor as jn

df = pd.DataFrame(...).expand_grid(df_key="bla",others={...})

Usage independent of a dataframe:

import pandas as pd
from janitor import expand_grid

# pass the dictionary by keyword; the first positional parameter is df
df = expand_grid(others={"x":range(1,4), "y":[1,2]})

Parameters
  • df – A pandas dataframe.

  • df_key – Name of the key for the dataframe. It becomes part of the column names in the output and is required when df is provided.

  • others – A dictionary that contains the data to be combined with the dataframe. If no dataframe is provided, all inputs in others will be combined to create a dataframe.

Returns

A pandas dataframe of all combinations of name-value pairs.

Raises
  • TypeError – if others is not a dictionary.

  • KeyError – if there is a dataframe and no key is provided.

  • ValueError – if others is empty.
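
The three documented error conditions can be mirrored in a small standalone check, sketched below. The function name check_expand_grid_args, the order of the checks, and the error messages are all assumptions for illustration; janitor's own validation may differ in each respect.

```python
def check_expand_grid_args(df=None, df_key=None, others=None):
    """Raise the documented exception for each invalid input.

    Hypothetical helper; check order and messages are assumptions.
    """
    if not isinstance(others, dict):
        raise TypeError("others must be a dictionary")
    if not others:
        raise ValueError("others cannot be empty")
    if df is not None and df_key is None:
        raise KeyError("df_key must be provided when df is given")


raised = []
bad_calls = (
    {"others": [1, 2]},                      # not a dictionary
    {"others": {}},                          # empty dictionary
    {"df": object(), "others": {"z": [1]}},  # dataframe stand-in without a key
)
for kwargs in bad_calls:
    try:
        check_expand_grid_args(**kwargs)
    except (TypeError, ValueError, KeyError) as exc:
        raised.append(type(exc).__name__)

print(raised)
```

A call that supplies a non-empty dictionary, and a key whenever a dataframe is present, passes all three checks.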