Feature Engineering for Policy Analysis

Interactive Visualizations: Massachusetts Education Case Study

By Rosalina Torres
Data Analytics Engineering at Northeastern University

Model Performance Impact

R² Score Improvement: +32.8%
RMSE Reduction: -41.9%
Engineered Features: 17
Best Policy ROI: 4.2x

Before vs After Feature Engineering
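
The page does not say how the before/after percentages were derived. A minimal sketch, assuming they are relative changes in R² and RMSE between a baseline model and a model trained with the engineered features, might look like the following. The toy targets and predictions are illustrative values only, not the case-study data, and the helper names are hypothetical.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / abs(before)

# Toy data (assumption): predictions from a baseline model and from a
# model that also uses engineered features.
y_true = [10.0, 12.0, 14.0, 16.0]
pred_baseline = [11.0, 11.0, 15.0, 15.0]
pred_engineered = [10.5, 12.5, 13.5, 16.5]

r2_gain = pct_change(r2_score(y_true, pred_baseline),
                     r2_score(y_true, pred_engineered))
rmse_drop = pct_change(rmse(y_true, pred_baseline),
                       rmse(y_true, pred_engineered))
print(f"R2 change: {r2_gain:+.1f}%  RMSE change: {rmse_drop:+.1f}%")
```

A positive R² change together with a negative RMSE change indicates that the engineered model fits better on both measures.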

Feature Importance Analysis

Top 10 Most Important Engineered Features (SHAP Values)

💡 Key Insights from Feature Importance

  • Human Capital Index contributes 24% to predictions
  • Interaction terms outperform individual features by 3x
  • Polynomial features capture critical non-linearities
  • Domain-specific indices provide interpretable insights
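
One common way to turn SHAP values into a percentage contribution per feature is to normalize each feature's mean absolute SHAP value by the total. Whether the 24% figure above was computed this way is an assumption; the sketch below only illustrates that convention on a toy matrix of SHAP values (rows are samples, columns are features). The function name and the numbers are hypothetical.

```python
def shap_contribution_pct(shap_values, feature_names):
    """Share of total mean |SHAP| attributed to each feature, in percent.

    shap_values: list of rows, one per sample, each holding one SHAP
    value per feature (same order as feature_names).
    """
    n_rows = len(shap_values)
    mean_abs = [
        sum(abs(row[j]) for row in shap_values) / n_rows
        for j in range(len(feature_names))
    ]
    total = sum(mean_abs)
    shares = {name: 100.0 * m / total
              for name, m in zip(feature_names, mean_abs)}
    # Sort from most to least important.
    return dict(sorted(shares.items(), key=lambda kv: kv[1], reverse=True))

# Toy SHAP matrix (assumption): two samples, three engineered features.
toy_shap = [
    [0.6, -0.3, 0.1],
    [-0.4, 0.2, 0.4],
]
names = ["human_capital_index",
         "poverty_education_interaction",
         "education_squared"]

shares = shap_contribution_pct(toy_shap, names)
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")
```

Using absolute values matters here: a feature that pushes predictions up for some rows and down for others would otherwise cancel out and look unimportant.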

Feature Engineering Categories

Interaction Features Impact
Non-Linear Relationships
Policy-Specific Indices
Efficiency Ratios

Feature Correlation Analysis

Correlation Heatmap: Engineered Features
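
The matrix behind a heatmap like this can be computed with pandas. The sketch below is an illustration rather than the page's actual code: it builds a Pearson correlation matrix and lists feature pairs whose absolute correlation exceeds a threshold. The 0.8 threshold, the toy values, and the assumption that education_squared is the square of education_bachelor_plus are all illustrative choices.

```python
import pandas as pd

def high_correlation_pairs(features, threshold=0.8):
    """Return the Pearson correlation matrix and strongly correlated pairs."""
    corr = features.select_dtypes("number").corr()
    cols = list(corr.columns)
    pairs = []
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:  # upper triangle only, skip the diagonal
            r = float(corr.loc[a, b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return corr, pairs

# Toy feature table (assumption): not the Massachusetts data.
toy = pd.DataFrame({
    "education_bachelor_plus": [20.0, 30.0, 40.0, 50.0],
    "education_squared": [400.0, 900.0, 1600.0, 2500.0],
    "overall_poverty_rate": [12.0, 15.0, 9.0, 11.0],
})

corr, pairs = high_correlation_pairs(toy)
print(corr.round(2))
print(pairs)
```

Polynomial terms are usually strongly correlated with the feature they are derived from, so a check like this helps decide whether both should stay in a linear model.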

Policy Impact Visualization

Poverty-Education Interaction Effect
Feature Type | Example Feature | Policy Insight | Impact
------------ | --------------- | -------------- | ------
Interaction | poverty_education_interaction | Education impact varies with poverty | +2.3x in high-poverty areas
Polynomial | education_squared | Diminishing returns threshold at 50% | Optimal investment point identified
Domain Index | human_capital_index | Workforce development priority areas | 24% prediction contribution
Ratio | teacher_effectiveness | Optimal salary range: 1.4-1.6x median income | 15% efficiency gain
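
The page names these features but does not show how they are built. A minimal sketch under stated assumptions follows. The interaction is taken as the product of poverty rate and bachelor's attainment, the polynomial term as the square of bachelor's attainment, and teacher_effectiveness as teacher salary divided by median income (suggested by the "1.4-1.6x median income" wording). The columns avg_teacher_salary and median_income are hypothetical names, and the rows are toy values.

```python
import pandas as pd

def add_engineered_features(df):
    """Return a copy of df with interaction, polynomial, and ratio features."""
    out = df.copy()
    # Interaction: lets the effect of education vary with the poverty level.
    out["poverty_education_interaction"] = (
        out["overall_poverty_rate"] * out["education_bachelor_plus"]
    )
    # Polynomial: a squared term lets a model capture diminishing returns.
    out["education_squared"] = out["education_bachelor_plus"] ** 2
    # Ratio (assumed definition): teacher salary relative to median income.
    out["teacher_effectiveness"] = (
        out["avg_teacher_salary"] / out["median_income"]
    )
    return out

# Toy rows (assumption): illustrative values only.
toy = pd.DataFrame({
    "overall_poverty_rate": [10.0, 20.0],
    "education_bachelor_plus": [40.0, 30.0],
    "avg_teacher_salary": [90000.0, 75000.0],
    "median_income": [60000.0, 50000.0],
})

engineered = add_engineered_features(toy)
print(engineered.filter(["poverty_education_interaction",
                         "education_squared",
                         "teacher_effectiveness"]))
```

Working on a copy keeps the raw columns untouched, which makes it easier to compare models trained with and without the engineered features.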

Return on Investment Analysis

Policy ROI with Engineered Features

Feature Engineering Pipeline

Feature Engineering Workflow
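# Example: creating the Human Capital Index
# Assumes df is a pandas DataFrame that already holds these three columns.
df['human_capital_index'] = (
    df['education_bachelor_plus'] *
    df['gdp_per_capita'] /
    (df['overall_poverty_rate'] + 1)  # +1 keeps the denominator nonzero at 0% poverty
)

# Result: the most predictive feature, with a 24% SHAP contribution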
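
To try the index on its own, the same formula can be wrapped in a small function and run on a toy table. This is a usage sketch only: the function name is hypothetical and the two rows are made-up values chosen to show that a row with a 0% poverty rate still yields a finite index.

```python
import pandas as pd

def add_human_capital_index(df):
    """Return a copy of df with the human_capital_index column added."""
    out = df.copy()
    out["human_capital_index"] = (
        out["education_bachelor_plus"]
        * out["gdp_per_capita"]
        / (out["overall_poverty_rate"] + 1)
    )
    return out

# Toy rows (assumption): the second row has a poverty rate of exactly 0.
toy = pd.DataFrame({
    "education_bachelor_plus": [40.0, 30.0],
    "gdp_per_capita": [80000.0, 60000.0],
    "overall_poverty_rate": [9.0, 0.0],
})

indexed = add_human_capital_index(toy)
print(indexed["human_capital_index"].tolist())
```

Because the index multiplies a percentage by a dollar amount, its scale depends on the units of the inputs; rescaling or standardizing it before modeling may be worth considering.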