Tasks

  • Daily target: On day t, predict the target values for day t+2:

    • province/state (optional), country, total number of cases [t+2], total number of recovered [t+2] (optional), total number of deaths [t+2], total number of tests [t+2] (optional), mortality estimate [t+2] (optional), and confidence intervals for all point predictions (optional)

  • Weekly target: On day t, predict the target values for day t+7:

    • province/state (optional), country, total number of cases [t+7], total number of recovered [t+7] (optional), total number of deaths [t+7], total number of tests [t+7] (optional), mortality estimate [t+7] (optional), and confidence intervals for all point predictions (optional)

    • Optional: About 5-10 % of all cases are so-called serious/critical infections that require hospitalization and treatment in ICUs. Hospitals need these estimates to plan and prepare their response. You may include such estimates in your weekly target submission.


  • Monthly target: On day t, predict the target values for day t+30:

    • province/state (optional), country, total number of cases [t+30], total number of recovered [t+30] (optional), total number of deaths [t+30], total number of tests [t+30] (optional), mortality estimate [t+30] (optional), and confidence intervals for all point predictions (optional)

  • External datasets: You may also consider further datasets in your models (e.g., mobility data, weather data, etc.)  

  • Prediction for more than one location/country/state: We explicitly encourage participants to develop “global” modeling frameworks that can model the evolution of confirmed COVID-19 cases for more than one location at a given time. Such frameworks can give a more complete picture of the current global transmission dynamics.

 

Comments: 

  • Optional target: Study data and models to identify measures that can help contain the outbreak (e.g., quarantine measures, mobility restrictions, etc.).

  • All submissions should be based on cumulative numbers. However, note that in the modelling phase, you are allowed to explore non-cumulative variants.
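As an illustration of the cumulative requirement above, the following minimal Python sketch converts between cumulative totals and daily increments. The function names `to_daily` and `to_cumulative` and the example numbers are hypothetical and not part of any official tooling.

```python
from itertools import accumulate


def to_daily(cumulative):
    """Convert cumulative totals into daily increments.

    The first entry is kept unchanged because no earlier total is known.
    """
    if not cumulative:
        return []
    increments = [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]
    return [cumulative[0]] + increments


def to_cumulative(daily):
    """Convert daily increments back into cumulative totals."""
    return list(accumulate(daily))


# Hypothetical example data (not real case counts).
cumulative_cases = [100, 130, 170, 220]
daily_cases = to_daily(cumulative_cases)
restored = to_cumulative(daily_cases)
```

A model may be fitted on `daily_cases`, but its output would need to pass through something like `to_cumulative` before submission.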

To make a prediction, please select at least one country from the following list:

http://lab.iamrohit.in/php_ajax_country_state_city_dropdown/

 

You may also select states or cities for your predictions.

 

The corresponding JSON files with country/state/city names can be found here:

https://github.com/hiiamrohit/Countries-States-Cities-database

 

Please adhere to the country/state/city naming convention in the above lists! We are not able to process your submission if you use other names.

 

If you decide to submit mortality predictions, make sure that your estimate M is a probability (i.e., M must be between 0 and 1): the probability that an individual who has tested positive eventually dies of the disease. A simple estimate is the so-called case fatality rate (https://en.wikipedia.org/wiki/Case_fatality_rate). You may also consider other estimators that are more accurate during the course of an outbreak. To keep things simple, the estimate should not be stratified by age. For each country, we use the final number of deaths (after the end of the outbreak) divided by the final number of confirmed cases as the ground truth for M (neglecting unreported cases). We will thus evaluate your mortality predictions only at the end of the outbreak. For more information on mortality estimates, see: Böttcher, L., Xia, M., & Chou, T. (2020). Why estimating population-based case fatality rates during epidemics may be misleading. arXiv:2003.12032.
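The simplest estimate mentioned above, deaths divided by confirmed cases, can be sketched in Python as follows. The function name `naive_cfr` and the example numbers are hypothetical. During an ongoing outbreak this ratio can be biased, because the outcomes of recently confirmed cases are not yet known, so it should be read as a starting point only.

```python
def naive_cfr(total_deaths, total_cases):
    """Naive case fatality rate: cumulative deaths / cumulative confirmed cases.

    The result is clamped to [0, 1] so that it is always a valid probability M.
    """
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    ratio = total_deaths / total_cases
    return min(max(ratio, 0.0), 1.0)


# Hypothetical example: 50 deaths among 1000 confirmed cases.
m_estimate = naive_cfr(50, 1000)
```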

 

If you provide confidence intervals, make sure that they correspond to 95 % confidence intervals.
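One possible way to obtain such an interval, assuming your model can produce an ensemble of simulated forecasts for the same target, is to take the 2.5th and 97.5th percentiles of the samples. The sketch below uses only the Python standard library. The helper names and the sample values are hypothetical, and other interval constructions are equally acceptable.

```python
def percentile(sorted_values, q):
    """Percentile with linear interpolation; q is a fraction in [0, 1].

    Expects a non-empty list that is already sorted in ascending order.
    """
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def interval_95(samples):
    """Return the (2.5th, 97.5th) percentile pair of the forecast samples."""
    ordered = sorted(samples)
    return percentile(ordered, 0.025), percentile(ordered, 0.975)


# Hypothetical ensemble of forecasts for the total number of cases.
forecast_samples = list(range(0, 201))
lower_bound, upper_bound = interval_95(forecast_samples)
```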

Evaluation

 

The submission server has the following leaderboards for each country:

  • 2-day prediction leaderboard

  • 7-day prediction leaderboard

  • 30-day prediction leaderboard

UPDATE [March 31st, 2020]:

The submissions for each country will be ranked daily by the mean absolute error (MAE) over all mandatory target variables (N: total number of cases, D: total number of deaths). MAE is easier to interpret than the root mean squared error (the square root of the mean of the squared deviations).
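A minimal Python sketch of an MAE over the two mandatory variables is shown here. How the server actually pools N and D, and over which days, is not specified in this text, so the equal-weight pooling, the function name, and the example numbers are all assumptions.

```python
def mean_absolute_error(predicted, observed):
    """Mean of the absolute differences between paired predictions and observations."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Hypothetical predictions and ground truth for two days.
predicted = {"N": [1000, 1200], "D": [20, 25]}
observed = {"N": [1100, 1150], "D": [22, 24]}

# Assumption: N and D errors are pooled with equal weight.
score = mean_absolute_error(
    predicted["N"] + predicted["D"],
    observed["N"] + observed["D"],
)
```

Because case counts are usually much larger than death counts, a pooled MAE of this kind is dominated by the error in N.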

Additionally, the organizing and advisory boards will select the global leaderboard each week. The criteria for the global leaderboard include error terms and the accuracy of confidence intervals for mandatory and optional variables over all countries in the world for the past 7 days. Only open-source solutions will be taken into account!

UPDATE [April 23rd, 2020]: Please see our document (specifically, equation 4) for the main leaderboard.

 

All predictions must be submitted as separate CSV files. We created some baseline solutions for reference. The Newton-Leibniz baseline uses a simple forward Euler extrapolation to predict the evolution of case numbers in the 2-day prediction window. As an alternative, we also provide a logistic growth model baseline solution. Please feel free to use and further develop these simple baseline models.
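The two baseline ideas can be sketched in Python as follows. This is not the organizers' baseline code. It assumes that the growth rate in the forward Euler step is estimated from the most recent daily difference, and that the logistic parameters are already known rather than fitted. All names and numbers are hypothetical.

```python
import math


def euler_forecast(cumulative, horizon_days=2):
    """Forward Euler step: N(t + h) ~ N(t) + h * dN/dt.

    dN/dt is approximated by the latest one-day difference (an assumption).
    """
    if len(cumulative) < 2:
        raise ValueError("need at least two observations")
    slope = cumulative[-1] - cumulative[-2]
    return cumulative[-1] + horizon_days * slope


def logistic_cases(t, capacity, growth_rate, initial_cases):
    """Logistic growth curve: N(t) = K / (1 + ((K - N0) / N0) * exp(-r * t))."""
    ratio = (capacity - initial_cases) / initial_cases
    return capacity / (1.0 + ratio * math.exp(-growth_rate * t))


# Hypothetical cumulative case counts for three consecutive days.
two_day_prediction = euler_forecast([100, 130, 170], horizon_days=2)

# Hypothetical logistic parameters: K = 10000, r = 0.2 per day, N0 = 100.
cases_at_start = logistic_cases(0, capacity=10000, growth_rate=0.2, initial_cases=100)
```

The Euler step extrapolates linearly and is therefore only reasonable over short horizons such as the 2-day window, while the logistic curve saturates at `capacity` and may be better suited to the longer horizons.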

Submit your results!

www.epidemicdatathon.com 2020. For Questions - Contact us by email