| VIF Score | Multicollinearity Severity | Actionable Interpretation |
|---|---|---|
| $VIF = 1$ | No multicollinearity | Ideal condition; the predictor is uncorrelated with the others. |
| $1 < VIF < 5$ | Moderate multicollinearity | Generally acceptable for most models. |
| $VIF > 5$ | Severe multicollinearity | Indicates a strong dependency among predictors; remove or combine features, or apply regularization. |
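A VIF score can be computed directly with statsmodels: for each predictor, the score is $1/(1 - R^2)$ from regressing that predictor on all the others. The sketch below (synthetic data; the column names and sizes are illustrative assumptions) shows two deliberately correlated columns crossing the 5 threshold while an independent one stays near 1.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Synthetic design matrix: x2 is a near-duplicate of x1, x3 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
X = pd.DataFrame({
    "x1": x1,
    "x2": 0.9 * x1 + rng.normal(scale=0.1, size=100),
    "x3": rng.normal(size=100),
})

# Add an intercept column, since VIF assumes a model with a constant term.
Xc = sm.add_constant(X)

# VIF for column i is 1 / (1 - R^2) from regressing column i on the rest.
vif = pd.Series(
    [variance_inflation_factor(Xc.values, i) for i in range(1, Xc.shape[1])],
    index=X.columns,
)
print(vif)  # expect x1, x2 well above 5 and x3 close to 1
```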
| Encoding Technique | Appropriate Data Type | Dimensionality Impact | Primary Risk | Multicollinearity Mitigation |
|---|---|---|---|---|
| Label Encoding | Ordinal | Minimal (1 feature) | Implies false order for nominal data | N/A |
| One-Hot Encoding | Nominal | Significant ($N$ features for $N$ categories) | Dummy Variable Trap, High Dimensionality | Drop one dummy variable ($N-1$ encoding) |
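The dummy variable trap in the table is easy to reproduce, and avoid, with pandas: the full set of dummies for a column always sums to 1, creating an exact linear dependency with the model intercept, while `drop_first=True` yields the $N-1$ encoding. The `color` column below is a hypothetical example.

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "blue", "green"]})

# Full one-hot: one column per category; every row sums to exactly 1,
# so the columns are perfectly collinear with a model intercept.
full = pd.get_dummies(df["color"], prefix="color")

# N-1 encoding: drop one level; the dropped category becomes the
# implicit baseline absorbed into the intercept.
reduced = pd.get_dummies(df["color"], prefix="color", drop_first=True)
print(reduced)
```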
| Encoding Technique | Primary Advantage | Mitigation/Regularization | Primary Risk/Trade-Off | Scalability |
|---|---|---|---|---|
| Target Encoding | High predictive power, dimensionality reduction | Smoothing, K-Fold Cross-Validation | Overfitting, Data Leakage | Moderate |
| Frequency/Count | Reduces feature space, compact representation | Grouping rare categories | Loss of distinct identity, potential bias | High |
| Binary/Base N | Significant dimensionality reduction relative to OHE ($\approx \log_2 N$ columns) | N/A | Loss of interpretability vs. OHE | Moderate to High |
| Feature Hashing | Fixed feature size, no dictionary needed | Tuning hash space dimension | Collision risk, complete loss of interpretability | Extreme |
| Embeddings | Captures category similarity and interaction | Tuning embedding size | Requires deep learning architecture | High |
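Of these techniques, target encoding carries the sharpest risks, so its two mitigations are worth seeing together. The helper below is an illustrative hand-rolled sketch, not a library API (`kfold_target_encode` and its parameters are assumptions): smoothing shrinks small-sample category means toward the global mean, and out-of-fold encoding ensures no row is encoded with statistics that saw its own target.

```python
import pandas as pd
from sklearn.model_selection import KFold

def kfold_target_encode(train, col, target, n_splits=5, smoothing=10.0):
    """Out-of-fold smoothed target encoding (illustrative sketch)."""
    global_mean = train[target].mean()
    encoded = pd.Series(index=train.index, dtype=float)
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for fit_idx, enc_idx in kf.split(train):
        fit = train.iloc[fit_idx]
        stats = fit.groupby(col)[target].agg(["mean", "count"])
        # Shrink small-sample category means toward the global mean.
        smooth = (stats["count"] * stats["mean"] + smoothing * global_mean) / (
            stats["count"] + smoothing
        )
        # Encode held-out rows using statistics from the other folds only,
        # so no row ever sees its own target value.
        encoded.iloc[enc_idx] = (
            train.iloc[enc_idx][col].map(smooth).fillna(global_mean).values
        )
    return encoded

df = pd.DataFrame({
    "city": ["NY", "NY", "SF", "SF", "SF", "LA"],  # hypothetical data
    "sold": [1, 0, 1, 1, 0, 1],
})
df["city_te"] = kfold_target_encode(df, "city", "sold", n_splits=3)
print(df)
```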
| Machine Learning Model Class | Mechanism Reliance | Preferred Encoder(s) | Rationale |
|---|---|---|---|
| Linear/Weight-Based Models (e.g., Regressors, SVM, MLP) | Feature Independence, Weight Learning | One-Hot Encoding ($N-1$) | Avoids implied ordering; enables learning of distinct additive coefficients [15] |
| Tree-Based Models (e.g., RF, XGBoost, LightGBM) | Optimal Threshold Splitting | Target/CatBoost Encoding | Single numerical feature leverages mean statistics for high-gain splits [15] |
| Deep Neural Networks (DNNs) | Feature Interaction Learning | Embeddings, Feature Hashing | Efficient dimensionality, captures deep semantic relationships |
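To make the DNN row concrete, an embedding is simply a trainable lookup table that maps category ids to dense vectors. A minimal PyTorch sketch (assuming PyTorch is available; the sizes and ids are arbitrary placeholders):

```python
import torch
import torch.nn as nn

# A 1000-category feature mapped to dense 16-d vectors instead of
# 1000-wide one-hot columns; sizes are illustrative rule-of-thumb values.
embedding = nn.Embedding(num_embeddings=1000, embedding_dim=16)

category_ids = torch.tensor([3, 17, 256])  # hypothetical batch of ids
dense = embedding(category_ids)            # shape: (3, 16)
# `dense` feeds the downstream layers; training pulls categories that
# behave similarly toward nearby points in the embedding space.
print(dense.shape)
```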
- scikit-learn: Provides basic Label/Ordinal Encoding.
- pandas: Used for simple One-Hot Encoding (`pandas.get_dummies`). [17]
- category_encoders library (scikit-learn-contrib): Recommended standard for sophisticated techniques (Target, CatBoost, Binary, Hashing). [18, 17] Its encoders are implemented as scikit-learn transformers, allowing seamless integration into ML pipelines. [18]
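Because category_encoders implements the scikit-learn fit/transform contract, an encoder drops straight into a Pipeline. A minimal sketch (the toy data and the choice of TargetEncoder plus LogisticRegression are illustrative):

```python
import category_encoders as ce
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy frame: one categorical column, one numeric column.
X = pd.DataFrame({"city": ["NY", "SF", "NY", "LA"], "n_rooms": [2, 3, 1, 4]})
y = [1, 0, 1, 0]

# The encoder receives y during fit, like any sklearn transformer in a
# supervised pipeline; numeric columns pass through untouched.
pipe = Pipeline([
    ("encode", ce.TargetEncoder(cols=["city"])),
    ("model", LogisticRegression()),
])
pipe.fit(X, y)
print(pipe.predict(X))
```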