Use ordinal encoding for dynamic categorical features in GluonTSAdapter #31
    df = df.astype(astype_dict)
    if category_as_ordinal:
        cat_cols = [col for col in df.select_dtypes(include="category").columns if col != id_column]
        df = df.assign(**{col: df[col].cat.codes for col in cat_cols})
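For reference, here is a minimal standalone sketch of what the ordinal encoding in the diff does. The helper name `encode_categories_as_ordinal` and the toy column names are made up for illustration and are not part of the adapter.

```python
import pandas as pd


def encode_categories_as_ordinal(df: pd.DataFrame, id_column: str) -> pd.DataFrame:
    """Replace every categorical column except the id column with its integer codes."""
    cat_cols = [col for col in df.select_dtypes(include="category").columns if col != id_column]
    return df.assign(**{col: df[col].cat.codes for col in cat_cols})


# Toy data (hypothetical column names).
df = pd.DataFrame(
    {
        "id": pd.Categorical(["A", "A", "B"]),
        "weather": pd.Categorical(["sunny", "rainy", "sunny"]),
        "price": [1.0, 2.0, 3.0],
    }
)
encoded = encode_categories_as_ordinal(df, id_column="id")

# The integer codes can be cast to float32, which a string-valued column cannot.
weather_as_float = encoded["weather"].to_numpy(dtype="float32")
```

Codes follow the order of the categories (here alphabetical), and pandas assigns the code -1 to missing values, so downstream models may need to handle that value explicitly.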
Possible alternatives:
- automatically one-hot-encode categorical columns
- drop categorical columns
Ideally, this should be a configurable option, but currently the fev.convert_input_data method does not allow routing kwargs to the individual adapters. @abdulfatir what do you think?
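To make the two alternatives concrete, here is a rough sketch using plain pandas. The toy DataFrame and column names are assumptions for illustration; nothing here is existing adapter code.

```python
import pandas as pd

# Toy data (hypothetical column names).
df = pd.DataFrame(
    {
        "id": pd.Categorical(["A", "A", "B"]),
        "weather": pd.Categorical(["sunny", "rainy", "sunny"]),
        "price": [1.0, 2.0, 3.0],
    }
)
id_column = "id"
cat_cols = [col for col in df.select_dtypes(include="category").columns if col != id_column]

# Alternative 1: one-hot encode the categorical columns (one float column per category).
one_hot = pd.get_dummies(df, columns=cat_cols, dtype="float32")

# Alternative 2: drop the categorical columns entirely.
dropped = df.drop(columns=cat_cols)
```

One-hot encoding grows the feature dimension with the number of categories, while dropping discards the information altogether, so both change what the model sees more than ordinal encoding does.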
Just discussed target encoding as a good option w/ @abdulfatir. Why not also use it here?
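As a rough illustration of the idea, target encoding could look like the sketch below: each category is replaced by the mean target value observed for that category within the same series. The function name `target_encode`, the column names, and the choice of per-series means are all assumptions, not something agreed on in this thread.

```python
import pandas as pd


def target_encode(df: pd.DataFrame, cat_col: str, target_col: str, id_column: str) -> pd.DataFrame:
    """Replace cat_col with the per-series mean of target_col for each category."""
    means = df.groupby([id_column, cat_col], observed=True)[target_col].transform("mean")
    return df.assign(**{cat_col: means.astype("float32")})


# Toy history (hypothetical column names).
history = pd.DataFrame(
    {
        "id": ["A", "A", "A", "B"],
        "weather": pd.Categorical(["sunny", "rainy", "sunny", "sunny"]),
        "target": [10.0, 4.0, 30.0, 7.0],
    }
)
encoded = target_encode(history, cat_col="weather", target_col="target", id_column="id")
```

In a forecasting setup the means would have to be computed on the historical window only and then applied to the future covariates, otherwise the encoding leaks the target.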
My initial idea was that adapters perform the bare minimum preprocessing such that the data can be consumed by the respective frameworks, but I agree that we can also incorporate the best practices here.
If we go for target encoding, we should probably enable/disable it via an optional argument to the GluonTSAdapter. Currently these are not supported since fev.convert_input_data does not forward kwargs to the adapters.
How about we
- Merge this (or some other simple strategy) as a simple default that unbreaks GluonTS models with covariates
- Add a better strategy after the Task refactor with an optional argument to the GluonTSAdapter?
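A toy sketch of how an optional adapter argument could be routed once kwargs are forwarded. Everything here is hypothetical: `ToyGluonTSAdapter`, `TOY_ADAPTERS`, and `toy_convert_input_data` are stand-ins invented for this example and do not reflect the actual fev API or its signatures.

```python
import pandas as pd


class ToyGluonTSAdapter:
    """Hypothetical stand-in for an adapter with a configurable encoding strategy."""

    def __init__(self, category_as_ordinal: bool = True):
        self.category_as_ordinal = category_as_ordinal

    def convert(self, df: pd.DataFrame, id_column: str) -> pd.DataFrame:
        if self.category_as_ordinal:
            cat_cols = [c for c in df.select_dtypes(include="category").columns if c != id_column]
            df = df.assign(**{c: df[c].cat.codes for c in cat_cols})
        return df


# Hypothetical registry mapping adapter names to adapter classes.
TOY_ADAPTERS = {"gluonts": ToyGluonTSAdapter}


def toy_convert_input_data(df: pd.DataFrame, adapter: str, id_column: str, **adapter_kwargs) -> pd.DataFrame:
    """Forward any extra keyword arguments to the selected adapter's constructor."""
    return TOY_ADAPTERS[adapter](**adapter_kwargs).convert(df, id_column)


df = pd.DataFrame(
    {
        "id": pd.Categorical(["A", "B"]),
        "weather": pd.Categorical(["sunny", "rainy"]),
    }
)
default_out = toy_convert_input_data(df, adapter="gluonts", id_column="id")
raw_out = toy_convert_input_data(df, adapter="gluonts", id_column="id", category_as_ordinal=False)
```

With this shape the simple ordinal default stays in place, and a different strategy could later be selected per call without touching the other adapters.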
I would vote for putting as little model-related stuff here as possible. If the user wants to do other types of encodings, they should do this on the model side.
Issue #, if available:
Description of changes:
object dtype, which broke GluonTS models that accept feat_dynamic_real/past_feat_dynamic_real and attempt to convert them to float32 inside the transform. Now categorical features are encoded as integers (using ordinal encoding).
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.