Portfolio Project
COVID-19 Outbreak Drivers
Python, XGBoost & SHAP
Context
I used historical COVID-19 hospital-capacity data to predict where future outbreaks were most likely.
Approach
- Cleaned & enriched more than 50k records from the HHS hospital-capacity time-series; added rolling means, trends, and 1/3/7/14-day lag features.
- Built an XGBoost classifier with class-imbalance weighting and a strict time-based train/test split.
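The feature engineering, time-based split, and imbalance weighting described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the column names (`state`, `date`, `icu_covid_pct`), the 7-day window, the trend definition, and the helper names are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

LAGS = (1, 3, 7, 14)  # lag horizons in days, as listed in the approach


def add_time_features(df, value_col, group_col="state", date_col="date", window=7):
    """Add per-group lag, rolling-mean, and trend columns for one metric.

    Column names and the window size are hypothetical defaults.
    """
    out = df.sort_values([group_col, date_col]).copy()
    grouped = out.groupby(group_col)[value_col]
    for lag in LAGS:
        out[f"{value_col}_lag{lag}"] = grouped.shift(lag)
    # Shift by one day before rolling so each row only sees past values.
    out[f"{value_col}_roll{window}"] = grouped.transform(
        lambda s: s.shift(1).rolling(window, min_periods=1).mean()
    )
    # Trend (assumed definition): yesterday's value minus the value
    # `window` days before yesterday.
    out[f"{value_col}_trend{window}"] = grouped.shift(1) - grouped.shift(1 + window)
    return out


def time_split(df, cutoff, date_col="date"):
    """Train on rows strictly before `cutoff`, test on rows at or after it."""
    cutoff = pd.Timestamp(cutoff)
    train = df[df[date_col] < cutoff]
    test = df[df[date_col] >= cutoff]
    return train, test


def positive_class_weight(y):
    """Ratio of negative to positive labels, a common imbalance weight."""
    y = np.asarray(y)
    n_pos = int((y == 1).sum())
    n_neg = int((y == 0).sum())
    if n_pos == 0:
        raise ValueError("no positive examples in the training labels")
    return n_neg / n_pos
```

One way to use the resulting weight is to pass it as `scale_pos_weight` to `xgboost.XGBClassifier`, fitted on the training rows only. Whether the project used that parameter or per-sample weights is not stated here, so treat this as one plausible setup.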
Impact
- Used SHAP to surface the 7 most influential drivers and embedded the interactive plot in the report.
- The most significant driver of COVID-19 outbreaks was the percentage of ICU beds occupied by COVID-19 patients.
- The model ranked Utah as the most likely location of the next COVID-19 outbreak (6.1%).
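A common way to pick the top drivers from SHAP output is to rank features by their mean absolute SHAP value. The sketch below assumes that convention (the ranking method used in the project is not stated) and takes a precomputed matrix of SHAP values, for example one produced by `shap.TreeExplainer(model).shap_values(X)`. The function name `top_drivers` is hypothetical.

```python
import numpy as np


def top_drivers(shap_values, feature_names, k=7):
    """Rank features by mean absolute SHAP value and return the top k.

    shap_values: array-like of shape (n_samples, n_features).
    Returns a list of (feature_name, mean_abs_shap) pairs, largest first.
    """
    shap_values = np.asarray(shap_values, dtype=float)
    if shap_values.ndim != 2 or shap_values.shape[1] != len(feature_names):
        raise ValueError("expected shap_values of shape (n_samples, n_features)")
    # Average the magnitude of each feature's contribution across samples.
    importance = np.abs(shap_values).mean(axis=0)
    order = np.argsort(importance)[::-1][:k]
    return [(feature_names[i], float(importance[i])) for i in order]
```

Taking the absolute value before averaging keeps features whose positive and negative contributions would otherwise cancel out.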