GlucoPred
Degree thesis
Forecasts blood glucose 15-60 minutes ahead for Type 1 Diabetes, using models trained on ~2.5 years of my child's CGM and insulin-pump data. An automated pipeline builds a 45-column feature matrix; an LSTM beats the classical baselines (1.28 mmol/L RMSE at +30 min), and Clarke Error Grid analysis shows where accuracy and clinical safety diverge.
GlucoPred is my degree thesis (examensarbete): forecasting blood glucose 15-60 minutes ahead for a person with Type 1 Diabetes, using ~2.5 years of my child's continuous glucose monitor (CGM) and insulin-pump data (Dexcom via Glooko, 5-minute intervals). All values are in mmol/L (≈18 mg/dL per mmol/L).
Pipeline
- Automated, idempotent ZIP→DuckDB ingestion of raw CGM and pump exports (language-aware: Swedish/English)
- CGM cleaning, sensor-session detection, and gap interpolation
- 45-column engineered feature matrix - lags, rolling statistics, time encodings, and a NovoRapid pharmacokinetic insulin-on-board model (bolus + basal) - against +15 / +30 / +60 min targets, with verified zero future leakage (CI-tested)
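To make the feature and target construction concrete, here is a minimal sketch in Python with pandas. It is not the thesis code: the column names, lag set, rolling window, and the `features_ignore_future` check are all hypothetical, and it covers only a handful of columns rather than the real 45. It assumes a regular 5-minute series, so +15 / +30 / +60 min correspond to 3 / 6 / 12 steps.

```python
import numpy as np
import pandas as pd

STEP_MIN = 5                  # CGM sampling interval in minutes
HORIZONS_MIN = (15, 30, 60)   # forecast horizons from the text


def build_features(glucose: pd.Series) -> pd.DataFrame:
    """Features at row t use only values at or before t (hypothetical columns)."""
    out = pd.DataFrame(index=glucose.index)
    out["glucose"] = glucose
    for k in (1, 2, 3):
        out[f"lag_{k}"] = glucose.shift(k)             # value k steps in the past
    out["roll_mean_6"] = glucose.rolling(6).mean()     # trailing 30-minute mean
    minutes = (glucose.index.hour * 60 + glucose.index.minute).to_numpy()
    out["tod_sin"] = np.sin(2 * np.pi * minutes / 1440)  # cyclic time of day
    out["tod_cos"] = np.cos(2 * np.pi * minutes / 1440)
    return out


def build_targets(glucose: pd.Series) -> pd.DataFrame:
    """Targets are the future values, pulled back with a negative shift."""
    return pd.DataFrame(
        {f"target_{h}": glucose.shift(-(h // STEP_MIN)) for h in HORIZONS_MIN}
    )


def features_ignore_future(glucose: pd.Series, cut: int) -> bool:
    """Leakage check: tampering with values after `cut` must not change
    any feature row up to and including `cut`."""
    tampered = glucose.copy()
    tampered.iloc[cut + 1:] += 100.0
    before = build_features(glucose).iloc[: cut + 1]
    after = build_features(tampered).iloc[: cut + 1]
    return before.equals(after)


# Toy data: four hours of a slowly rising trace (synthetic, not real CGM data).
idx = pd.date_range("2024-01-01", periods=48, freq="5min")
glucose = pd.Series(np.linspace(5.0, 9.7, 48), index=idx)
X = build_features(glucose)
Y = build_targets(glucose)
leak_free = features_ignore_future(glucose, cut=20)
```

A check like `features_ignore_future` is the kind of assertion that could run in CI: if someone later adds a centered rolling window or a forward fill, the tampered and untampered feature rows stop matching and the check fails.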
Models & evaluation
Baselines (persistence, moving average, linear extrapolation, AR(2)) and an LSTM are scored with RMSE/MAE, time-in-range, and Clarke Error Grid zone analysis (zones A-E per model and horizon). On held-out recent data, the LSTM beats every baseline at all horizons: at +30 min its RMSE is 1.28 mmol/L versus 1.45 for AR(2).
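For readers unfamiliar with the simpler baselines, the sketch below shows how persistence and linear extrapolation forecasts can be scored with RMSE and MAE. It is an illustration under assumptions, not the thesis implementation: the function names are hypothetical, the series is a toy ramp, and the horizon is given in 5-minute steps (6 steps = +30 min).

```python
import numpy as np


def persistence(y: np.ndarray, h: int):
    """Forecast y[t+h] as y[t]. Returns aligned (prediction, truth) arrays."""
    return y[:-h], y[h:]


def linear_extrapolation(y: np.ndarray, h: int):
    """Forecast y[t+h] as y[t] + h * (y[t] - y[t-1])."""
    slope = y[1:] - y[:-1]            # slope[i] is the last-step change at t = i + 1
    pred = y[1:-h] + h * slope[:-h]   # forecasts issued at t = 1 .. n-h-1
    truth = y[1 + h:]
    return pred, truth


def rmse(pred: np.ndarray, truth: np.ndarray) -> float:
    return float(np.sqrt(np.mean((pred - truth) ** 2)))


def mae(pred: np.ndarray, truth: np.ndarray) -> float:
    return float(np.mean(np.abs(pred - truth)))


# Toy series rising by 0.1 mmol/L per 5-minute step (synthetic).
y = 5.0 + 0.1 * np.arange(20)
h = 6  # +30 min

rmse_persistence = rmse(*persistence(y, h))
rmse_linear = rmse(*linear_extrapolation(y, h))
mae_persistence = mae(*persistence(y, h))
```

On a perfectly linear ramp, linear extrapolation is exact, while persistence lags by the full rise over the horizon (here 6 steps of 0.1 mmol/L). Real CGM traces are far less regular, which is why both remain only baselines.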
The most interesting finding is a divergence between accuracy and clinical safety: although the LSTM is the most accurate model, a plain persistence baseline produces a lower fraction of dangerous-zone errors at longer horizons. The best average-error model isn't automatically the safest near hypoglycemia - surfacing that, rather than just the RMSE headline, is central to the work.
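The zone analysis can be illustrated with a deliberately coarse sketch. The code below classifies only the two extremes of the Clarke Error Grid, zone A (prediction within 20% of the reference, or both at or below 70 mg/dL) and zone E (hypoglycemia predicted as hyperglycemia or the reverse), and lumps everything else into "B-D". It is not a full Clarke implementation and not the thesis code; the function names are hypothetical, and the conversion uses the approximate factor of 18 mg/dL per mmol/L mentioned above.

```python
MGDL_PER_MMOL = 18.0  # approximate conversion factor, as stated in the text


def coarse_clarke_zone(ref_mmol: float, pred_mmol: float) -> str:
    """Return 'A', 'E', or 'B-D' (zones B, C and D are not separated here)."""
    ref = ref_mmol * MGDL_PER_MMOL
    pred = pred_mmol * MGDL_PER_MMOL
    # Zone A: clinically accurate.
    if (ref <= 70 and pred <= 70) or (0.8 * ref <= pred <= 1.2 * ref):
        return "A"
    # Zone E: hypo- and hyperglycemia confused with each other.
    if (ref <= 70 and pred >= 180) or (ref >= 180 and pred <= 70):
        return "E"
    return "B-D"


def zone_fractions(ref, pred) -> dict:
    """Share of predictions falling in each coarse zone."""
    zones = [coarse_clarke_zone(r, p) for r, p in zip(ref, pred)]
    return {z: zones.count(z) / len(zones) for z in ("A", "B-D", "E")}


# Made-up reference/prediction pairs in mmol/L, purely for illustration.
ref = [5.0, 3.0, 3.0, 12.0, 8.0]
pred = [5.5, 3.5, 11.0, 3.5, 10.5]
fractions = zone_fractions(ref, pred)
```

The point of scoring this way is that RMSE weights an error of 2 mmol/L the same wherever it occurs, while the grid asks whether the error would lead to a wrong treatment decision. Two models can therefore rank differently on the two metrics.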
Interface
A React + Recharts UI renders actual-vs-predicted traces with model and horizon selectors. Because the real app runs against personal medical data, it is never hosted - instead, a public demo built on entirely synthetic data is deployed separately, so the interface can be explored without exposing any real record.
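As a rough idea of what "entirely synthetic" can mean, the sketch below generates a plausible-looking one-day glucose trace from a daily sine wave plus a seeded random walk. This is an assumption for illustration only, not how the demo data is actually produced; the function name, baseline level, amplitudes, and the 2.2-22.2 mmol/L clipping range are all chosen here.

```python
import numpy as np


def synthetic_cgm(n_steps: int = 288, seed: int = 0) -> np.ndarray:
    """One synthetic trace at 5-minute resolution (288 steps = 24 hours).

    All constants are illustrative assumptions, not fitted to any real data.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_steps)
    daily = 1.5 * np.sin(2 * np.pi * t / 288)           # slow daily swing
    drift = np.cumsum(rng.normal(0.0, 0.1, n_steps))    # random-walk wander
    return np.clip(6.5 + daily + drift, 2.2, 22.2)      # keep within an assumed sensor range


trace = synthetic_cgm()
```

Because the generator is seeded and takes no real input, the same demo trace can be rebuilt anywhere without any link to a real record.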
Next steps
Planned next: gradient-boosting models, hyperparameter tuning, and a deeper clinical error analysis that pinpoints where and when the unsafe predictions occur. The figures quoted here reflect the current state of an evolving thesis project.