janitor.min_max_scale

janitor.min_max_scale(df: pandas.core.frame.DataFrame, old_min=None, old_max=None, column_name=None, new_min=0, new_max=1) → pandas.core.frame.DataFrame

Scales data to between a minimum and maximum value.
This method mutates the original DataFrame.
If old_min and old_max are provided, they replace the true minimum and maximum of the DataFrame or column in the scaling computation.
One can optionally set a new target minimum and maximum value using the new_min and new_max keyword arguments. This will result in the transformed data being bounded between new_min and new_max.
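The mapping described above can be written out explicitly. This is a sketch that assumes the function applies the standard linear min-max transformation, which this page does not spell out; here x is an original value and x' its scaled counterpart, and old_min and old_max default to the true minimum and maximum of the data when not supplied:

```latex
x' = \frac{x - \mathrm{old\_min}}{\mathrm{old\_max} - \mathrm{old\_min}}
     \cdot (\mathrm{new\_max} - \mathrm{new\_min}) + \mathrm{new\_min}
```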
If a column name is specified, only that column is scaled. Otherwise, the entire DataFrame is scaled.
Method chaining syntax:
df = pd.DataFrame(...).min_max_scale(column_name="a")
Setting custom minimum and maximum:
df = (
    pd.DataFrame(...)
    .min_max_scale(
        column_name="a",
        new_min=2,
        new_max=10
    )
)
Setting a minimum and maximum that are not based on the data, while scaling the entire DataFrame:
df = (
    pd.DataFrame(...)
    .min_max_scale(
        old_min=0,
        old_max=14,
        new_min=0,
        new_max=1,
    )
)
This example might be applied to something like scaling the isoelectric points of amino acids. While they technically range from approximately 3 to 10, we can also place them on the pH scale, taken here as 0 to 14 to match the old_min and old_max in the call. Hence, 3 is scaled not to 0 but to approximately 0.21, while 10 is scaled not to 1 but to approximately 0.71.
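The arithmetic for this example can be checked by hand with a small sketch. The helper `rescale` below is hypothetical and not part of janitor; it assumes a plain linear min-max mapping and mirrors the documented ValueError conditions, using the bounds passed in the call above (old_min=0, old_max=14).

```python
def rescale(value, old_min, old_max, new_min=0.0, new_max=1.0):
    """Hypothetical helper: linearly rescale one number (not janitor's API)."""
    # Mirror the documented error conditions.
    if old_max <= old_min:
        raise ValueError("old_max must be greater than old_min")
    if new_max <= new_min:
        raise ValueError("new_max must be greater than new_min")
    # Position of the value within the old range, as a fraction.
    fraction = (value - old_min) / (old_max - old_min)
    # Stretch that fraction onto the new range.
    return fraction * (new_max - new_min) + new_min


# Isoelectric points of 3 and 10, placed on a 0-14 scale.
low = rescale(3, old_min=0, old_max=14)
high = rescale(10, old_min=0, old_max=14)
print(round(low, 2), round(high, 2))  # prints: 0.21 0.71
```

Because the bounds come from the arguments rather than from the data, the smallest and largest observed values no longer map to exactly new_min and new_max.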
- Parameters
df – A pandas DataFrame.
old_min – (optional) Overrides the current minimum value of the data to be transformed.
old_max – (optional) Overrides the current maximum value of the data to be transformed.
new_min – (optional) The minimum value of the data after it has been scaled.
new_max – (optional) The maximum value of the data after it has been scaled.
column_name – (optional) The column on which to perform scaling.
- Returns
A pandas DataFrame with scaled data.
- Raises
ValueError – if old_max is not greater than old_min.
ValueError – if new_max is not greater than new_min.