Project Overview
This project explores how economic, demographic, and development indicators shape migration flows globally. Using open datasets from the World Bank and UNDP, it builds an interpretable machine learning model that forecasts net migration rates and simulates "what-if" development and crisis scenarios.
Recent Update (October 2025)
The pipeline now includes forecasting through 2030 under four socioeconomic scenarios:
- Baseline: steady continuation of current trends
- High Growth: +2 pp GDP growth, +3 % HDI
- Crisis: −2 pp GDP growth, −2 % HDI
- Demographic Pressure: +5 % adolescent fertility
Each scenario includes 90 % prediction intervals and regional aggregation summaries.
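The scenario definitions above can be sketched as shocks applied to a baseline feature row. This is a minimal illustration, not the notebook code: the feature names (`gdp_growth`, `hdi`, `adolescent_fertility`), the sample values, and the reading of "pp" as an additive shift and "%" as a relative change are all assumptions.

```python
# Hypothetical scenario table. Feature names are illustrative assumptions.
# "add" shifts a value by percentage points; "scale" applies a relative change.
SCENARIOS = {
    "baseline": {},
    "high_growth": {"gdp_growth": ("add", 2.0), "hdi": ("scale", 1.03)},
    "crisis": {"gdp_growth": ("add", -2.0), "hdi": ("scale", 0.98)},
    "demographic_pressure": {"adolescent_fertility": ("scale", 1.05)},
}


def apply_scenario(features, scenario):
    """Return a copy of `features` with the named scenario's shocks applied."""
    shocked = dict(features)  # copy so the baseline row is left untouched
    for name, (kind, value) in SCENARIOS[scenario].items():
        if kind == "add":
            shocked[name] += value
        elif kind == "scale":
            shocked[name] *= value
        else:
            raise ValueError(f"unknown shock type: {kind}")
    return shocked


# Made-up baseline values for a single country-year, for illustration only.
baseline_row = {"gdp_growth": 1.5, "hdi": 0.700, "adolescent_fertility": 40.0}

for scenario_name in SCENARIOS:
    print(scenario_name, apply_scenario(baseline_row, scenario_name))
```

Each shocked row would then be passed to the trained model to obtain that scenario's forecast.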
Results are stored in outputs/forecast_results_2024_2030.csv and visualized
in the final notebook.
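One common way to derive a 90 % interval from a tree ensemble is to take empirical quantiles of the individual members' predictions. The sketch below shows that idea in plain Python. It is an assumption about how such intervals could be computed, not a description of the method used in the notebooks, and the function name and simulated predictions are hypothetical. Note that an interval built this way reflects disagreement among ensemble members and only approximates a true prediction interval.

```python
import random
import statistics


def prediction_interval(member_predictions, level=0.90):
    """Return (lower, median, upper) from empirical quantiles of ensemble predictions."""
    if not 0 < level < 1:
        raise ValueError("level must be between 0 and 1")
    ordered = sorted(member_predictions)
    if not ordered:
        raise ValueError("need at least one prediction")
    alpha = (1 - level) / 2  # probability mass left in each tail

    def quantile(q):
        # Linear interpolation between neighbouring order statistics.
        pos = q * (len(ordered) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

    return quantile(alpha), statistics.median(ordered), quantile(1 - alpha)


# Simulated per-tree predictions of a net migration rate (made-up numbers).
random.seed(0)
tree_predictions = [random.gauss(-1.2, 0.8) for _ in range(500)]

lower, median, upper = prediction_interval(tree_predictions, level=0.90)
print(f"90% interval: [{lower:.2f}, {upper:.2f}], median {median:.2f}")
```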
Workflow Notebooks
- Data Preparation & Cleaning: Collect, merge, and clean World Bank and HDI data.
- Exploratory Data Analysis (EDA): Explore distributions, correlations, and missingness patterns.
- Feature Engineering & Modeling: Build interpretable and interaction-based predictors for modeling.
- Model Interpretation & Scenario Analysis: Train Random Forest models and interpret them using SHAP values.
- Forecasting & Validation: Evaluate temporal performance and produce scenario-based forecasts through 2030.
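Evaluating temporal performance, as in the last notebook above, usually means training on earlier years and testing on later ones, so that no future information leaks into the model. A minimal sketch of such a split follows. The record layout, the `year` key, the cutoff year, and all values are assumptions made for illustration.

```python
def temporal_split(rows, cutoff_year, year_key="year"):
    """Split records so training uses years <= cutoff and testing uses later years."""
    train = [r for r in rows if r[year_key] <= cutoff_year]
    test = [r for r in rows if r[year_key] > cutoff_year]
    return train, test


def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up country-year records, for illustration only.
rows = [
    {"country": "A", "year": year, "net_migration": -1.0 + 0.1 * (year - 2015)}
    for year in range(2015, 2024)
]

train, test = temporal_split(rows, cutoff_year=2020)

# Naive reference forecast: carry the last training value forward.
last_value = train[-1]["net_migration"]
actual = [r["net_migration"] for r in test]
predicted = [last_value] * len(test)

print(len(train), "training rows;", len(test), "test rows")
print("MAE of naive forecast:", round(mean_absolute_error(actual, predicted), 3))
```

A random shuffle would mix future and past observations, which tends to overstate accuracy for forecasting tasks. A carry-forward reference like the one here gives a simple floor against which a trained model's error can be compared.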
Key Insights
- Most countries show stable migration rates near zero, with crisis periods driving extreme outflows.
- Fertility decline and HDI growth are the strongest predictors of sustained migration inflows.
- Economic expansion amplifies inflows in developed regions, while demographic pressure drives outflows in low-income countries.
- Forecasts for 2024–2030 reveal regional divergence between economic and demographic migration drivers.
Additional Information
Full project documentation, data sources, and reproducible code are available on GitHub. You can also explore interactive notebook outputs directly above.