Characterising information gains and losses when collecting multiple epidemic model outputs

K. Sherratt; A. Srivastava; K. Ainslie; D. E. Singh; A. Cublier; M. C. Marinescu; J. Carretero; A. Cascajo Garcia; N. Franco; L. Willem; S. Abrams; C. Faes; P. Beutels; N. Hens; S. Mueller; B. Charlton; R. Ewert; S. Paltra; C. Rakow; J. Rehmann; T. Conrad; C. Schuette; K. Nagel; R. Grah; R. Niehus; B. Prasse; F. Sandmann; S. Funk; Katharine Sherratt

doi:10.1016/j.epidem.2024.100765

Research output: Contribution to journal › Article › peer-review

Abstract

Background. Collaborative comparisons and combinations of multiple epidemic models are used as policy-relevant evidence during epidemic outbreaks. Typically, each modeller summarises their own distribution of simulated trajectories using descriptive statistics at each modelled time step. We explored information losses compared to directly collecting a sample of the simulated trajectories, in terms of key epidemic quantities, ensemble uncertainty, and performance against data.

Methods. We compared July 2022 projections from the European COVID-19 Scenario Modelling Hub. Using shared scenario assumptions, five modelling teams contributed up to 100 simulated trajectories projecting incidence in Belgium, the Netherlands, and Spain. First, we compared epidemic characteristics including incidence, peaks, and cumulative totals. Second, we drew a set of quantiles from the sampled trajectories for each model at each time step. We created an ensemble as the median across models at each quantile, and compared this to an ensemble of quantiles drawn from all available trajectories at each time step. Third, we compared each trajectory to between 4 and 29 weeks of observed data, using the mean absolute error to weight trajectories in consecutive ensembles.

Results. We found that collecting models' simulated trajectories, as opposed to collecting models' quantiles at each time point, enabled us to show additional epidemic characteristics, a wider range of uncertainty, and performance against data. Sampled trajectories contained a right-skewed distribution which was poorly captured by an ensemble of models' quantile intervals. Ensembles weighted by predictive performance narrowed the range of plausible incidence over time, excluding some epidemic shapes altogether.

Conclusions. Understanding potential information loss when collecting model projections can support the accuracy, reliability, and communication of collaborative infectious disease modelling efforts. The importance of different information losses may vary with each collaboration's aims, with lesser impact on short term predictions compared to assessing threshold risks and longer term uncertainty.
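
The two ensemble constructions and the error-based weighting described in the Methods can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function names, the quantile levels, and the inverse-error weighting scheme are all assumptions made here for the example, since the abstract only states that mean absolute error was used to weight trajectories.

```python
import numpy as np

# Assumed quantile levels; the abstract does not say which were used.
QUANTILE_LEVELS = np.array([0.05, 0.25, 0.5, 0.75, 0.95])


def quantiles_per_model(trajectories_by_model, levels=QUANTILE_LEVELS):
    """Quantiles of each model's trajectories at each time step.

    Each value in trajectories_by_model has shape (n_trajectories, n_steps).
    Returns an array of shape (n_models, n_levels, n_steps).
    """
    return np.stack(
        [np.quantile(t, levels, axis=0) for t in trajectories_by_model.values()]
    )


def ensemble_median_of_quantiles(trajectories_by_model, levels=QUANTILE_LEVELS):
    """Ensemble as the median across models at each quantile and time step."""
    return np.median(quantiles_per_model(trajectories_by_model, levels), axis=0)


def ensemble_pooled_trajectories(trajectories_by_model, levels=QUANTILE_LEVELS):
    """Ensemble as quantiles of all available trajectories at each time step."""
    pooled = np.concatenate(list(trajectories_by_model.values()), axis=0)
    return np.quantile(pooled, levels, axis=0)


def mae_weights(trajectories, observed):
    """Normalised trajectory weights from mean absolute error.

    Inverse-MAE weighting is an assumption of this sketch.
    Only the first len(observed) time steps are scored.
    """
    n_obs = len(observed)
    mae = np.mean(np.abs(trajectories[:, :n_obs] - observed), axis=1)
    inverse = 1.0 / np.maximum(mae, 1e-9)  # guard against division by zero
    return inverse / inverse.sum()
```

With trajectories stored per model as arrays of shape (trajectories, time steps), the two ensemble functions return arrays of shape (quantile levels, time steps) that can be compared directly. Pooling keeps any skew in the combined sample, whereas taking the median across models at each quantile can smooth it away.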

Original language	English
Journal	Epidemics
Volume	47
Issue number	100765
DOIs	https://doi.org/10.1016/j.epidem.2024.100765 https://doi.org/10.1101/2023.07.05.23292245
Publication status	Published - Jun 2024

Keywords

epidemiology

Access to Document

1-s2.0-S1755436524000264-main
Final published version, 2.68 MB
Licence: CC BY

High Performance Computing Technology Platform
Benoît Champagne (Manager)
Technological Platform High Performance Computing
Facility/equipment: Technological Platform

Cite this

Sherratt, K., Srivastava, A., Ainslie, K., Singh, D. E., Cublier, A., Marinescu, M. C., Carretero, J., Cascajo Garcia, A., Franco, N., Willem, L., Abrams, S., Faes, C., Beutels, P., Hens, N., Mueller, S., Charlton, B., Ewert, R., Paltra, S., Rakow, C., ... Sherratt, K. (2024). Characterising information gains and losses when collecting multiple epidemic model outputs. Epidemics, 47(100765). https://doi.org/10.1016/j.epidem.2024.100765, https://doi.org/10.1101/2023.07.05.23292245

@article{03aec36aa590416291ce6c982d0228ba,

title = "Characterising information gains and losses when collecting multiple epidemic model outputs",

abstract = "Background. Collaborative comparisons and combinations of multiple epidemic models are used as policy-relevant evidence during epidemic outbreaks. Typically, each modeller summarises their own distribution of simulated trajectories using descriptive statistics at each modelled time step. We explored information losses compared to directly collecting a sample of the simulated trajectories, in terms of key epidemic quantities, ensemble uncertainty, and performance against data. Methods. We compared July 2022 projections from the European COVID-19 Scenario Modelling Hub. Using shared scenario assumptions, five modelling teams contributed up to 100 simulated trajectories projecting incidence in Belgium, the Netherlands, and Spain. First, we compared epidemic characteristics including incidence, peaks, and cumulative totals. Second, we drew a set of quantiles from the sampled trajectories for each model at each time step. We created an ensemble as the median across models at each quantile, and compared this to an ensemble of quantiles drawn from all available trajectories at each time step. Third, we compared each trajectory to between 4 and 29 weeks of observed data, using the mean absolute error to weight trajectories in consecutive ensembles. Results. We found that collecting models' simulated trajectories, as opposed to collecting models' quantiles at each time point, enabled us to show additional epidemic characteristics, a wider range of uncertainty, and performance against data. Sampled trajectories contained a right-skewed distribution which was poorly captured by an ensemble of models' quantile intervals. Ensembles weighted by predictive performance narrowed the range of plausible incidence over time, excluding some epidemic shapes altogether. Conclusions. Understanding potential information loss when collecting model projections can support the accuracy, reliability, and communication of collaborative infectious disease modelling efforts. 
The importance of different information losses may vary with each collaboration's aims, with lesser impact on short term predictions compared to assessing threshold risks and longer term uncertainty.",

keywords = "epidemiology, information, scenarios, uncertainty, aggregation, modelling",

author = "K. Sherratt and A. Srivastava and K. Ainslie and Singh, {D. E.} and A. Cublier and Marinescu, {M. C.} and J. Carretero and {Cascajo Garcia}, A. and N. Franco and L. Willem and S. Abrams and C. Faes and P. Beutels and N. Hens and S. Mueller and B. Charlton and R. Ewert and S. Paltra and C. Rakow and J. Rehmann and T. Conrad and C. Schuette and K. Nagel and R. Grah and R. Niehus and B. Prasse and F. Sandmann and S. Funk and Katharine Sherratt",

note = "Publisher Copyright: {\textcopyright} 2024 The Authors",

year = "2024",

month = jun,

doi = "10.1016/j.epidem.2024.100765",

language = "English",

volume = "47",

journal = "Epidemics",

issn = "1755-4365",

publisher = "Elsevier",

number = "100765",

}

Sherratt, K, Srivastava, A, Ainslie, K, Singh, DE, Cublier, A, Marinescu, MC, Carretero, J, Cascajo Garcia, A, Franco, N, Willem, L, Abrams, S, Faes, C, Beutels, P, Hens, N, Mueller, S, Charlton, B, Ewert, R, Paltra, S, Rakow, C, Rehmann, J, Conrad, T, Schuette, C, Nagel, K, Grah, R, Niehus, R, Prasse, B, Sandmann, F, Funk, S & Sherratt, K 2024, 'Characterising information gains and losses when collecting multiple epidemic model outputs', Epidemics, vol. 47, no. 100765. https://doi.org/10.1016/j.epidem.2024.100765, https://doi.org/10.1101/2023.07.05.23292245

TY - JOUR

T1 - Characterising information gains and losses when collecting multiple epidemic model outputs

AU - Sherratt, K.

AU - Srivastava, A.

AU - Ainslie, K.

AU - Singh, D. E.

AU - Cublier, A.

AU - Marinescu, M. C.

AU - Carretero, J.

AU - Cascajo Garcia, A.

AU - Franco, N.

AU - Willem, L.

AU - Abrams, S.

AU - Faes, C.

AU - Beutels, P.

AU - Hens, N.

AU - Mueller, S.

AU - Charlton, B.

AU - Ewert, R.

AU - Paltra, S.

AU - Rakow, C.

AU - Rehmann, J.

AU - Conrad, T.

AU - Schuette, C.

AU - Nagel, K.

AU - Grah, R.

AU - Niehus, R.

AU - Prasse, B.

AU - Sandmann, F.

AU - Funk, S.

AU - Sherratt, Katharine

PY - 2024/6

Y1 - 2024/6

N2 - Background. Collaborative comparisons and combinations of multiple epidemic models are used as policy-relevant evidence during epidemic outbreaks. Typically, each modeller summarises their own distribution of simulated trajectories using descriptive statistics at each modelled time step. We explored information losses compared to directly collecting a sample of the simulated trajectories, in terms of key epidemic quantities, ensemble uncertainty, and performance against data. Methods. We compared July 2022 projections from the European COVID-19 Scenario Modelling Hub. Using shared scenario assumptions, five modelling teams contributed up to 100 simulated trajectories projecting incidence in Belgium, the Netherlands, and Spain. First, we compared epidemic characteristics including incidence, peaks, and cumulative totals. Second, we drew a set of quantiles from the sampled trajectories for each model at each time step. We created an ensemble as the median across models at each quantile, and compared this to an ensemble of quantiles drawn from all available trajectories at each time step. Third, we compared each trajectory to between 4 and 29 weeks of observed data, using the mean absolute error to weight trajectories in consecutive ensembles. Results. We found that collecting models' simulated trajectories, as opposed to collecting models' quantiles at each time point, enabled us to show additional epidemic characteristics, a wider range of uncertainty, and performance against data. Sampled trajectories contained a right-skewed distribution which was poorly captured by an ensemble of models' quantile intervals. Ensembles weighted by predictive performance narrowed the range of plausible incidence over time, excluding some epidemic shapes altogether. Conclusions. Understanding potential information loss when collecting model projections can support the accuracy, reliability, and communication of collaborative infectious disease modelling efforts. 
The importance of different information losses may vary with each collaboration's aims, with lesser impact on short term predictions compared to assessing threshold risks and longer term uncertainty.

AB - Background. Collaborative comparisons and combinations of multiple epidemic models are used as policy-relevant evidence during epidemic outbreaks. Typically, each modeller summarises their own distribution of simulated trajectories using descriptive statistics at each modelled time step. We explored information losses compared to directly collecting a sample of the simulated trajectories, in terms of key epidemic quantities, ensemble uncertainty, and performance against data. Methods. We compared July 2022 projections from the European COVID-19 Scenario Modelling Hub. Using shared scenario assumptions, five modelling teams contributed up to 100 simulated trajectories projecting incidence in Belgium, the Netherlands, and Spain. First, we compared epidemic characteristics including incidence, peaks, and cumulative totals. Second, we drew a set of quantiles from the sampled trajectories for each model at each time step. We created an ensemble as the median across models at each quantile, and compared this to an ensemble of quantiles drawn from all available trajectories at each time step. Third, we compared each trajectory to between 4 and 29 weeks of observed data, using the mean absolute error to weight trajectories in consecutive ensembles. Results. We found that collecting models' simulated trajectories, as opposed to collecting models' quantiles at each time point, enabled us to show additional epidemic characteristics, a wider range of uncertainty, and performance against data. Sampled trajectories contained a right-skewed distribution which was poorly captured by an ensemble of models' quantile intervals. Ensembles weighted by predictive performance narrowed the range of plausible incidence over time, excluding some epidemic shapes altogether. Conclusions. Understanding potential information loss when collecting model projections can support the accuracy, reliability, and communication of collaborative infectious disease modelling efforts. 
The importance of different information losses may vary with each collaboration's aims, with lesser impact on short term predictions compared to assessing threshold risks and longer term uncertainty.

KW - epidemiology

KW - information

KW - scenarios

KW - uncertainty

KW - aggregation

KW - modelling

U2 - 10.1016/j.epidem.2024.100765

DO - 10.1016/j.epidem.2024.100765

M3 - Article

SN - 1755-4365

VL - 47

JO - Epidemics

JF - Epidemics

IS - 100765

ER -
