Transportation · Machine Learning
1.4M trips · XGBoost
NYC Taxi Trip Duration Forecasting at City Scale
Problem
NYC trip duration is driven by time-of-day, origin-destination patterns, and city-scale congestion. Simple time averages mis-price ETAs in operational routing systems, leading to customer dissatisfaction and driver inefficiency.
Approach
Trained an XGBoost model on 1.4 million trip records with engineered temporal and geospatial features. Incorporated weather data and traffic patterns, and used Folium maps for geographic sanity checks on the trip coordinates. Pandas-based data hygiene and leak-safe train-test splits keep the evaluation credible for production use.
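As a rough illustration of the temporal and geospatial features and the leak-safe split described above, here is a minimal, dependency-free sketch. The field names (`pickup_datetime`, `pickup_lat`, `pickup_lon`, `dropoff_lat`, `dropoff_lon`), the haversine distance feature, and the chronological split rule are assumptions made for this example, not the project's actual schema or pipeline; a Pandas version would compute the same quantities in vectorized form.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def build_features(trip):
    """Derive temporal and geospatial features from one trip record.

    `trip` is a dict using the hypothetical field names from the lead-in.
    Only information known at pickup time is used, so no feature leaks
    the target (the trip duration).
    """
    pickup = datetime.fromisoformat(trip["pickup_datetime"])
    return {
        "hour": pickup.hour,
        "weekday": pickup.weekday(),          # Monday = 0
        "is_weekend": int(pickup.weekday() >= 5),
        "distance_km": haversine_km(
            trip["pickup_lat"], trip["pickup_lon"],
            trip["dropoff_lat"], trip["dropoff_lon"],
        ),
    }


def chronological_split(trips, test_fraction=0.2):
    """Split trips so every test trip starts after every training trip.

    Sorting by pickup time before cutting is one way to avoid evaluating
    on trips that precede the training data (an assumed split rule).
    """
    ordered = sorted(
        trips, key=lambda t: datetime.fromisoformat(t["pickup_datetime"])
    )
    n_test = max(1, round(len(ordered) * test_fraction))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Made-up sample records, for illustration only.
sample_trips = [
    {"pickup_datetime": "2016-03-14 17:24:55", "pickup_lat": 40.7679,
     "pickup_lon": -73.9822, "dropoff_lat": 40.7656, "dropoff_lon": -73.9646},
    {"pickup_datetime": "2016-01-19 11:35:24", "pickup_lat": 40.7640,
     "pickup_lon": -73.9790, "dropoff_lat": 40.7100, "dropoff_lon": -74.0053},
    {"pickup_datetime": "2016-06-12 00:43:35", "pickup_lat": 40.7386,
     "pickup_lon": -73.9804, "dropoff_lat": 40.7312, "dropoff_lon": -73.9995},
    {"pickup_datetime": "2016-04-06 19:32:31", "pickup_lat": 40.7199,
     "pickup_lon": -74.0101, "dropoff_lat": 40.7067, "dropoff_lon": -74.0123},
    {"pickup_datetime": "2016-03-26 13:30:55", "pickup_lat": 40.7931,
     "pickup_lon": -73.9731, "dropoff_lat": 40.7825, "dropoff_lon": -73.9729},
]

train, test = chronological_split(sample_trips, test_fraction=0.2)
train_rows = [build_features(t) for t in train]
test_rows = [build_features(t) for t in test]
```

The resulting feature rows could then be passed to a gradient-boosted regressor such as XGBoost. A chronological cut is only one choice for a leak-safe split; the key property is that nothing derived from the test period, or from the trip's own duration, reaches the training features.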
Result
High-accuracy trip duration model deployed on Streamlit and ready for use in routing and driver-ETA systems. Supports route planning in a complex urban environment, with real-world credibility beyond toy notebooks.
- XGBoost model trained on 1.4M+ trip records
- Production-ready for routing and ETA systems
- Temporal and geospatial feature engineering
- Weather and traffic pattern integration
Training data: 1.4M trips
Features: Temporal · Geo · Weather
Model: XGBoost
Deployment: Streamlit
Transportation · XGBoost · Urban Analytics · Geospatial · Time Series