I'm facing an issue I don't know how to solve. As I'm a beginner, there is probably an easy solution I just can't find.
I'm playing with the Titanic dataset and I want to work with pipelines (in order to avoid data leakage when using cross-validation). For that reason I'm using two pipelines (one for numerical features, one for categorical features) plus FeatureUnion().
What's the problem? In the numerical pipeline I fill the NaN values of Age and then create buckets for that variable. The output of this pipeline is a dataframe containing all the numerical features plus one new categorical variable. To encode categorical variables I use a separate categorical pipeline, and then FeatureUnion joins both outputs. The problem is that the new variable created in the numerical pipeline never goes through the categorical pipeline, so I end up with a dataframe containing one categorical variable that hasn't been encoded. How can I solve this?
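To make the problem concrete, here is a minimal sketch that reproduces it with toy data. It does not use my actual transformers: the two functions are simplified stand-ins, and the column names, bucket edges and labels are made up for the example. As far as I understand, FeatureUnion hands the same original input to every branch, so the categorical branch never sees the column created in the numerical branch.

```python
import pandas as pd
from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import FunctionTransformer

# Toy data; column names are placeholders, not the full Titanic schema.
df = pd.DataFrame({"Age": [22.0, None, 54.0], "Sex": ["male", "female", "male"]})

seen_by_cat_branch = []  # records which columns reach the categorical branch


def add_age_bucket(X):
    # Stand-in for the numerical pipeline: impute Age, then derive a bucket.
    X = X[["Age"]].copy()
    X["Age"] = X["Age"].fillna(X["Age"].median())
    X["AgeBucket"] = pd.cut(
        X["Age"], bins=[0, 30, 100], labels=["young", "old"]  # hypothetical edges
    ).astype(str)
    return X


def encode_categoricals(X):
    # Stand-in for the categorical pipeline: integer-code the Sex column.
    seen_by_cat_branch.extend(X.columns)
    return X[["Sex"]].apply(lambda col: col.astype("category").cat.codes)


union = FeatureUnion([
    ("num", FunctionTransformer(add_age_bucket)),
    ("cat", FunctionTransformer(encode_categoricals)),
])
out = union.fit_transform(df)

print(seen_by_cat_branch)  # columns the categorical branch received
print(out)                 # the AgeBucket column is still made of strings
```

The categorical branch only receives Age and Sex, and the bucket column comes out of the union as raw strings.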
CODE:
    from sklearn.pipeline import Pipeline, FeatureUnion

    # DataFrameSelector, df_imputer, df_new_variables and MultiColumnLabelEncoder
    # are custom transformers defined elsewhere (they are not part of scikit-learn).
    num_pipeline = Pipeline(steps=[
        ('selector', DataFrameSelector(numerical_features)),
        ('imputer', df_imputer(strategy="median")),    # Numerical
        ('new_variables', df_new_variables())          # Numerical
    ])

    cat_pipeline = Pipeline(steps=[
        ('selector', DataFrameSelector(categorical_features)),
        ('label_encoder', MultiColumnLabelEncoder())   # Categorical
    ])

    full_pipeline = FeatureUnion(transformer_list=[
        ("num_pipeline", num_pipeline),
        ("cat_pipeline", cat_pipeline)
    ])

Thank you for your time
Best regards
EDIT:
I was thinking about using ColumnTransformer, since it seems a better fit for my case: I have to apply different transformations to different columns. The problem is that the output of ColumnTransformer is an array with no column names, which I think would be hard to work with if we want to do feature selection afterwards. That's why I chose Pipelines rather than ColumnTransformer.
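A minimal sketch of what I mean, using toy data and scikit-learn's SimpleImputer and OrdinalEncoder as stand-ins for my own transformers (the column names are placeholders). With the default settings, the result is a plain NumPy array, so the column names are gone.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OrdinalEncoder

# Toy data; column names are placeholders.
df = pd.DataFrame({"Age": [22.0, np.nan, 54.0], "Sex": ["male", "female", "male"]})

ct = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), ["Age"]),  # stand-in for my imputer
    ("cat", OrdinalEncoder(), ["Sex"]),                  # stand-in for my encoder
])
out = ct.fit_transform(df)

print(type(out))                # a NumPy array, not a DataFrame
print(hasattr(out, "columns"))  # no column names to select features by
```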
As for creating the buckets before the data goes into the pipeline: I can't, because they are built from the same variable (Age) whose missing values I'm imputing inside the pipeline.
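A small sketch of that dependency, with toy values and made-up bucket edges and labels. If I bucket Age before imputing, the rows with missing Age get no bucket at all, and the median used for imputing is supposed to be learned inside the pipeline on each training fold.

```python
import numpy as np
import pandas as pd

age = pd.Series([22.0, np.nan, 54.0], name="Age")
bins = [0, 30, 100]        # hypothetical bucket edges
labels = ["young", "old"]  # hypothetical bucket labels

# Bucketing before imputation: the missing row gets no bucket.
bucket_before = pd.cut(age, bins=bins, labels=labels)

# Bucketing after median imputation: every row gets a bucket.
bucket_after = pd.cut(age.fillna(age.median()), bins=bins, labels=labels)

print(bucket_before.isna().sum())  # number of rows without a bucket
print(bucket_after.isna().sum())
```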
What would be the best option in this case?

