frflib.utils.ml_analysis.processing
Module for generating a preprocessing pipeline with sklearndf.
Functions
make_static_list(df) | Create a list to store the preprocessing transformations we want to apply to each column.
make_dict_preprocessing(data_static) | Create a dict with the names of the transformations to apply to the categorical and numerical columns.
make_pipeline(dict_preprocessing[, fill_num, fill_cat]) | Combine the numerical and categorical pipelines into the final pipeline that handles data preprocessing.
Module Contents
- frflib.utils.ml_analysis.processing.make_static_list(df: pandas.DataFrame) list
Create a list to store the preprocessing transformations we want to apply to each column.
- Parameters:
df (pd.DataFrame) – Input DataFrame whose columns are to be preprocessed
- Returns:
List of all columns with the transformation to apply to each column
- Return type:
list
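The exact structure of the returned list is not documented here. As a minimal sketch, assuming it holds one `(column, transformation)` pair per column and that the choice depends on the column dtype, the idea could look like this. The helper `sketch_static_list` and the labels `"scale"` and `"onehot"` are hypothetical and not part of frflib.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def sketch_static_list(df: pd.DataFrame) -> list:
    """Hypothetical stand-in: pair each column with a transformation name."""
    static = []
    for col in df.columns:
        # Assumption: numeric columns get "scale", all other columns get "onehot".
        kind = "scale" if is_numeric_dtype(df[col]) else "onehot"
        static.append((col, kind))
    return static


df = pd.DataFrame({"age": [31.0, 45.0, None], "city": ["Paris", "Lyon", None]})
static = sketch_static_list(df)
```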
- frflib.utils.ml_analysis.processing.make_dict_preprocessing(data_static: list) dict
Create a dict with the names of the transformations to apply to the categorical and numerical columns.
- Parameters:
data_static (list) – List of all columns with the transformation to apply to each column
- Returns:
Dict in which each available transformation maps to the list of names of the columns concerned
- Return type:
dict
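The grouping described above can be sketched in plain Python. This assumes the input is a list of `(column, transformation)` pairs, which the documentation does not confirm; `sketch_dict_preprocessing` and the transformation labels are hypothetical.

```python
from collections import defaultdict


def sketch_dict_preprocessing(data_static: list) -> dict:
    """Hypothetical stand-in: group column names by transformation name."""
    grouped = defaultdict(list)
    for column, transformation in data_static:
        # Append the column to the list kept for its transformation.
        grouped[transformation].append(column)
    return dict(grouped)


# Assumed input format: one (column, transformation) pair per column.
data_static = [("age", "scale"), ("income", "scale"), ("city", "onehot")]
grouped = sketch_dict_preprocessing(data_static)
```

Keying the dict by transformation rather than by column makes it straightforward to hand each group of columns to a single pipeline branch.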
- frflib.utils.ml_analysis.processing.make_pipeline(dict_preprocessing: dict, fill_num=DEFAULT_NUM_FILLNA, fill_cat=DEFAULT_CAT_FILLNA) sklearn.compose.ColumnTransformer
Combine the numerical and categorical pipelines into the final pipeline that handles data preprocessing.
- Parameters:
dict_preprocessing (dict) – Dict with the names of the transformations to apply to the categorical and numerical columns
fill_num (str, optional) – How to replace missing numerical values, defaults to “mean”
fill_cat (str, optional) – How to replace missing categorical values, defaults to “constant”
- Returns:
Final pipeline with all preprocessing steps
- Return type:
sklearndf.transformation.ColumnTransformerDF
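To illustrate how a numerical and a categorical pipeline can be combined, here is a minimal sketch built on plain scikit-learn rather than sklearndf. It is not the frflib implementation: the helper `sketch_pipeline`, the dict keys `"scale"` and `"onehot"`, the specific scaler and encoder, and the `"missing"` fill value are all assumptions. The only details taken from the documentation above are the default fill values `"mean"` and `"constant"`, which are passed here as `SimpleImputer` strategies.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def sketch_pipeline(dict_preprocessing, fill_num="mean", fill_cat="constant"):
    """Hypothetical stand-in: one branch per column group, joined by column."""
    num_pipe = Pipeline([
        ("impute", SimpleImputer(strategy=fill_num)),
        ("scale", StandardScaler()),
    ])
    cat_pipe = Pipeline([
        # Assumption: missing categories are replaced by the literal "missing".
        ("impute", SimpleImputer(strategy=fill_cat, fill_value="missing")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", num_pipe, dict_preprocessing["scale"]),
        ("cat", cat_pipe, dict_preprocessing["onehot"]),
    ])


df = pd.DataFrame({
    "age": [31.0, 45.0, np.nan],
    "city": ["Paris", "Lyon", np.nan],
})
preprocessor = sketch_pipeline({"scale": ["age"], "onehot": ["city"]})
features = preprocessor.fit_transform(df)
```

In this sketch the result is a plain array with one scaled numerical column followed by one indicator column per category. The documented return type, `ColumnTransformerDF`, suggests that the real function keeps DataFrame output instead.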