optimize_pd
hhpy.ds.optimize_pd(df: pandas.core.frame.DataFrame, c_int: bool = True, c_float: bool = True, c_cat: bool = True, cat_frac: float = 0.5, convert_dtypes: bool = True, drop_all_na_cols: bool = False) → pandas.core.frame.DataFrame
Optimizes the memory usage of a pandas DataFrame by automatically downcasting numeric column types and converting object columns to categories.
Parameters:
- df – pandas DataFrame to be optimized. Other objects are implicitly cast to DataFrame
- c_int – Whether to downcast integers [optional]
- c_float – Whether to downcast floats [optional]
- c_cat – Whether to cast objects to categories. Uses cat_frac as condition [optional]
- cat_frac – Only used if c_cat is True: a column is cast to category if its fraction of unique values is less than cat_frac (the default 0.5 corresponds to 50%) [optional]
- convert_dtypes – Whether to call DataFrame.convert_dtypes (pandas 1.0.0+) [optional]
- drop_all_na_cols – Whether to drop columns that contain only missing values [optional]
Returns: the optimized pandas DataFrame
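
The parameters above can be illustrated with a minimal sketch in plain pandas. This is not hhpy's implementation: the function name optimize_sketch is hypothetical, the sketch assumes cat_frac is compared against the ratio of unique values to rows, and it leaves out convert_dtypes and drop_all_na_cols.

```python
import pandas as pd
from pandas.api import types as pdt


def optimize_sketch(df, c_int=True, c_float=True, c_cat=True, cat_frac=0.5):
    """Hypothetical stand-in showing the idea behind optimize_pd."""
    # Cast other objects to DataFrame and work on a copy
    out = pd.DataFrame(df).copy()
    for col in out.columns:
        s = out[col]
        if c_int and pdt.is_integer_dtype(s):
            # Smallest integer dtype that can hold all values
            out[col] = pd.to_numeric(s, downcast="integer")
        elif c_float and pdt.is_float_dtype(s):
            # float32 if the values survive the conversion
            out[col] = pd.to_numeric(s, downcast="float")
        elif c_cat and pdt.is_object_dtype(s):
            # Assumption: "fraction of unique values" means nunique / rows
            if len(s) > 0 and s.nunique() / len(s) < cat_frac:
                out[col] = s.astype("category")
    return out


df = pd.DataFrame({
    "n": [1, 2, 3, 4, 5, 6],
    "x": [0.5, 1.5, 2.5, 3.5, 4.5, 5.5],
    "label": pd.Series(["a", "a", "b", "a", "b", "a"], dtype=object),
})
small = optimize_sketch(df)
print(small.dtypes)
print(df.memory_usage(deep=True).sum(), small.memory_usage(deep=True).sum())
```

With the real function, the equivalent call would presumably be hhpy.ds.optimize_pd(df), with the same keyword arguments available to switch individual steps off.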