- XGBoost regressor with 200 estimators, depth 6, subsample 0.8
- Optuna hyperparameter optimization
- TimeSeriesSplit cross-validation (temporal ordering preserved)
- Metrics: MAE, MAPE, RMSE, R², tracked per model version
- Model registry with versioning, artifact storage, and promotion
System Deep Dive
Snapshot: March 2026
Boosted Charge: ML-powered placement intelligence for physical revenue assets.
Boosted Charge is a revenue forecasting and venue recommendation engine for phone charging kiosks. It combines XGBoost with Optuna hyperparameter tuning, a 35-feature engineering pipeline spanning weather, demographics, foot traffic, and venue quality, 7 external API connectors, and a weighted scoring algorithm that explains every rank. 140 passing tests back the forecasting stack.
ML Pipeline
XGBoost, Optuna, and time-series cross-validation.
The model predicts daily revenue per kiosk 7 to 14 days ahead. Training uses TimeSeriesSplit, so each validation fold falls after the data the model was fit on. Optuna tunes the hyperparameters, and a model registry tracks versions, metrics, and promotion.
- Time-based (6): day of week, weekend, holiday, daypart weighting
- Weather (7): temp range, precipitation, comfort score, extreme flags
- Venue (8): type encoding across 14 categories, ratings, price level
- Foot traffic (4): Yelp checkins, Google popularity, nearby events
- Demographics (5): population, median income, age cohorts from Census
- Lag features (5): rolling means and recent revenue patterns
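Two of these feature groups are easy to sketch with pandas. The example below derives day-of-week and weekend flags plus lagged and rolling-mean revenue. Because the forecast looks at least 7 days ahead, it assumes every lag must be shifted by that horizon so a feature never uses revenue that would be unknown at prediction time. The column names (`kiosk_id`, `date`, `revenue`) and the function name are hypothetical.

```python
import pandas as pd


def add_time_and_lag_features(daily, horizon=7, window=7):
    """Add calendar and lag features to a frame with one row per kiosk per day.

    Expects columns kiosk_id, date (datetime64), and revenue.
    """
    out = daily.sort_values(["kiosk_id", "date"]).reset_index(drop=True)

    # Time-based features: Monday is 0, so 5 and 6 are the weekend.
    out["day_of_week"] = out["date"].dt.dayofweek
    out["is_weekend"] = (out["day_of_week"] >= 5).astype(int)

    # Lag features, computed per kiosk. Shifting by the horizon first means
    # the rolling window only covers revenue observed `horizon` days earlier.
    by_kiosk = out.groupby("kiosk_id")["revenue"]
    out["revenue_lag"] = by_kiosk.shift(horizon)
    out["revenue_roll_mean"] = by_kiosk.transform(
        lambda s: s.shift(horizon).rolling(window).mean()
    )
    return out
```

Rows near the start of each kiosk's history come back with NaN lags. XGBoost can handle missing values natively, though dropping those rows is another reasonable choice.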
Data Layer
Seven connectors. Five ETL pipelines. Cron-scheduled.
- Stripe — transaction data and revenue history
- Open-Meteo — historical and forecast weather (free)
- Google Places — venue discovery and place details
- Yelp Fusion — ratings, reviews, checkin counts
- Ticketmaster + Eventbrite — nearby event signals
- US Census Bureau — demographics by tract
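To make the connector idea concrete, here is a minimal sketch of what the Open-Meteo piece might look like: one function builds a daily forecast request and another flattens the response into rows with a temperature range and precipitation total. It uses Open-Meteo's public forecast endpoint and daily variable names; the function names and the output row shape are assumptions, and the HTTP call itself is left out.

```python
from urllib.parse import urlencode

OPEN_METEO_FORECAST = "https://api.open-meteo.com/v1/forecast"
DAILY_VARS = ("temperature_2m_max", "temperature_2m_min", "precipitation_sum")


def forecast_url(latitude, longitude, days=14):
    """Build a daily forecast request URL covering the next `days` days."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "daily": ",".join(DAILY_VARS),
        "forecast_days": days,
        "timezone": "auto",
    }
    # Keep commas unescaped so the variable list stays readable.
    return f"{OPEN_METEO_FORECAST}?{urlencode(params, safe=',')}"


def daily_rows(payload):
    """Flatten the parsed JSON response into one dict per forecast day."""
    daily = payload["daily"]
    return [
        {"date": day, "temp_range": t_max - t_min, "precipitation_mm": rain}
        for day, t_max, t_min, rain in zip(
            daily["time"],
            daily["temperature_2m_max"],
            daily["temperature_2m_min"],
            daily["precipitation_sum"],
        )
    ]
```

Keeping URL construction and response parsing as separate pure functions makes both testable without network access.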
- Transactions synced every 15 minutes from Stripe
- Weather refreshed every 6 hours
- Venue metrics updated daily at 2 AM
- Events synced daily at 3 AM
- APScheduler cron with background execution
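The refresh cadence above maps naturally onto cron fields. The sketch below keeps the schedule as plain data and registers it through an `add_job` call shaped like APScheduler 3.x's (`add_job(func, "cron", id=..., **fields)`). The job names and the `register_jobs` helper are hypothetical, and the scheduler is passed in rather than imported, so the block runs without APScheduler installed.

```python
# Cron fields per pipeline, taken from the schedule listed above.
# Field names follow APScheduler 3.x's "cron" trigger (hour, minute).
SCHEDULE = {
    "sync_transactions": {"minute": "*/15"},            # every 15 minutes
    "refresh_weather": {"hour": "*/6", "minute": 0},    # every 6 hours
    "update_venue_metrics": {"hour": 2, "minute": 0},   # daily at 2 AM
    "sync_events": {"hour": 3, "minute": 0},            # daily at 3 AM
}


def register_jobs(scheduler, handlers):
    """Register every pipeline on a scheduler exposing an APScheduler-style add_job.

    `handlers` maps each job name in SCHEDULE to the callable that runs it.
    """
    for job_id, cron_fields in SCHEDULE.items():
        scheduler.add_job(
            handlers[job_id],
            "cron",
            id=job_id,
            replace_existing=True,  # re-registering on restart should not fail
            **cron_fields,
        )
```

With APScheduler 3.x, passing a `BackgroundScheduler` instance to `register_jobs` and then calling its `start()` method would run these jobs on a background thread.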
- Scoring weights: predicted revenue (30%), foot traffic (25%), demographics (20%), competition (15%), venue quality (10%)
- Per-signal breakdowns for explainability
- Candidate venues ranked with composite scores
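The weighted score and its per-signal breakdown can be sketched in a few lines. The weights are the ones listed above. The example assumes each signal has already been normalized to the range 0 to 1 and that competition is oriented so a higher value means a more favorable (less crowded) location. The names `WEIGHTS`, `score_venue`, and `rank_venues` are hypothetical.

```python
from dataclasses import dataclass

# Weights from the list above; they sum to 1.
WEIGHTS = {
    "predicted_revenue": 0.30,
    "foot_traffic": 0.25,
    "demographics": 0.20,
    "competition": 0.15,
    "venue_quality": 0.10,
}


@dataclass(frozen=True)
class ScoredVenue:
    name: str
    composite: float
    breakdown: dict  # signal name -> weighted contribution to the composite


def score_venue(name, signals):
    """Score one venue from signals assumed to be normalized to [0, 1]."""
    missing = WEIGHTS.keys() - signals.keys()
    if missing:
        raise ValueError(f"{name}: missing signals {sorted(missing)}")
    breakdown = {key: weight * signals[key] for key, weight in WEIGHTS.items()}
    return ScoredVenue(name, sum(breakdown.values()), breakdown)


def rank_venues(candidates):
    """Rank a {venue name: signals} mapping by composite score, best first."""
    scored = [score_venue(name, signals) for name, signals in candidates.items()]
    return sorted(scored, key=lambda venue: venue.composite, reverse=True)
```

Because the composite is simply the sum of the breakdown, each rank can be explained by showing which signals contributed most to it.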
Stack
Python, FastAPI, PostgreSQL, XGBoost.
- Python
- FastAPI
- PostgreSQL
- SQLAlchemy 2.0
- Alembic
- Docker
- XGBoost
- Optuna
- scikit-learn
- pandas
- APScheduler
- Stripe
Need ML that explains its recommendations, not just outputs them?
That takes feature design, connector discipline, and a scoring model built for operators.