Claudia Vitolo from the European Centre for Medium-Range Weather Forecasts (ECMWF) had the brilliant idea to host a workshop about building reproducible workflows for earth sciences. It is not surprising that weather forecasts strongly depend on computational analyses, statistics, and data. Wait! Isn’t this exactly what o2r addresses? Well observed, that is certainly correct. For this reason, we were very happy to receive an invitation from Claudia for giving a keynote. But let’s start from the beginning.
Ana Trisovic from the Harvard University opened the workshop with an interesting keynote about the reproducibility challenges in physics and the social sciences. She also conducted a reproducibility study to investigate if R scripts stored on Dataverse are actually reproducible. Her results were similarly worrying as those reported in our paper about computational reproducibility in the geosciences. From the 3208 R files, only 502 could be executed successfully. The remaining scripts had issues such as a wrong file directory or a missing functionality.
The second keynote was given by Carol Willing from Project Jupyter. She argued that lives depend on scaling reproducible research and took the example of the typhoon that hit Japan recently. Her main point was that reproducible research improves prediction which is particularly necessary in the context of storms. Having reliable predictions can help people to prepare accordingly. Based on this use case, she presented some Jupyter-based tools such as Jupyter notebooks and Binder.
Many other talks presented approaches to address very specific reproducibility issues. Some of these approaches build on top of Jupyter notebooks and containers which demonstrates again that these two tools are probably the right way to go for the next few years. However, the speakers did not put much focus on user-related aspects and the publication process. As a consequence, the talks were more about the technical realization and less about creating a connection between the article and the reproducible analysis. This was a nice gap for o2r to fill. Markus presented our key concepts such as the Executable Research Compendium (ERC) and bindings and how these can be integrated into the process of publishing scientific articles.
All in all, the number of talks indicates that there is some need for exchange about reproducibility in weather forecasting. This is not surprising due to the intense and often emotional discussion about climate change. Reproducible research can help to demonstrate the robustness of the results and to find errors in the analysis before publication. Hence, this way of publishing research makes it easier to counteract two popular points of attack used by climate change deniers. We hope ECMWF is going for a second edition next year!
By the way, all slides and even the video recordings of the talks are available online: https://events.ecmwf.int/event/116/timetable/#20191014.detailed