17 Dec 2018 | Markus Konkol, Daniel Nüst
A few weeks ago, a new journal article written by o2r team member Markus was published.
In our last article, we talked about the reproducibility of papers submitted to the AGILE conference.
We checked if the papers had materials attached and if these materials were complete.
The results were rather unfortunate.
In our newest article, we took one further step and tried to re-run the analyses of articles which had code and data in the supplements.
Markus Konkol, Christian Kray & Max Pfeiffer (2019). Computational reproducibility in geoscientific papers: Insights from a series of studies with geoscientists and a reproduction study, International Journal of Geographical Information Science, 33:2, 408-429, DOI: 10.1080/13658816.2018.1508687
The article builds upon our paper corpus for demonstrating the o2r platform.
Feel free to distribute this piece of research to whoever might be interested.
Feedback is always welcome.
Here is a non-specialist summary:
Recreating scientific data analysis is hard, but important.
To learn more about the state of reproducibility in geosciences, we conducted several studies.
We contacted over 150 geoscientists who publish and read articles based on code and data.
We learned that as readers they often would like to have access to these materials, but as authors they often do not have the time or expertise to make them available.
We also collected articles which use computational analyses and tried to execute the attached code.
This was not as easy as it sounds! In this publication we describe these numerous issues, and our experiences with them, in a structured way.
Some issues were pretty easy to solve, such as installing a missing library.
Others were more demanding and required deep knowledge of the code which is, as you might imagine, highly time consuming.
Further issues were missing materials (code snippets, data subsets) and flawed functionalities.
In some cases, we contacted the original authors who were, and this was a positive outcome, mostly willing to help.
We also compared the figures we got out of the code with those contained in the original article.
Bad news: We found several differences related to the design of the figures and results that deviated from those described in the paper.
OK, this is interesting, but why is it important?
We argue that a key advantage of open reproducible research is that you can reuse existing materials.
Apparently, this is usually not possible without some significant effort. Our goal is not to blame authors.
We are very happy that they shared their materials.
But they did that with a specific purpose in mind, i.e. making code and data available and reusable so that others can build upon them.
One incentive in this context is an increased number of citations, one of the main currencies for researchers.
To facilitate that, we suggest some guidelines to avoid the issues we encountered during our reproducibility study, such as using Executable Research Compendia (ever heard of them? :)).
21 Nov 2018 | Daniel Nüst
This article reports on a project, integrating Stencila and Binder, which started at the eLife Innovation Sprint 2018. It has been cross-posted on multiple blogs (eLife Labs, Stencila, Jupyter). We welcome comments and feedback on any of them!
eLife, an open science journal published by the non-profit organisation eLife Sciences Publications from the UK, hosted the first eLife Innovation Sprint 2018 as part of their Innovation Initiative in Cambridge, UK:
“[..] a two-day gathering of 62 researchers, designers, developers, technologists, science communicators and more, with the goal of developing prototypes of innovations that bring cutting-edge technology to open research communication.”
One of the 13 projects at the excellently organised event was an integration of Binder and Stencila…
14 Aug 2018 | Daniel Nüst
We’ve been working on demonstrating our reference implementation during spring and managed to create a number of example workspaces.
We now decided to publish these workspaces on our demo server.
Screenshot 1: o2r reference implementation listing of published Executable Research Compendia. The right-hand side shows a metadata summary including original authors.
The papers were originally published in…
13 Jul 2018 | Daniel Nüst
Today a new journal article led by o2r team member Daniel was published in the journal PeerJ:
Reproducible research and GIScience: an evaluation using AGILE conference papers by Daniel Nüst, Carlos Granell, Barbara Hofer, Markus Konkol, Frank O. Ostermann, Rusne Sileryte, Valentina Cerutti
PeerJ. 2018. doi: 10.7717/peerj.5072
The article is an outcome of a collaboration around the AGILE conference, see https://o2r.info/reproducible-agile/ for more information.
Please retweet and spread the word!
Your questions & feedback are most welcome.
Here is Daniel’s attempt at a non-specialist summary:
More and more research uses data and algorithms to answer a question.
That makes it harder for researchers to understand a scientific publication, because you need more than just the text to understand what is really going on.
You need the software and the data to be able to tell if everything is done correctly, and to be able to re-use new and exciting methods.
We took a look at the existing guides for such research and created our own criteria for research in sciences using environmental observations and maps.
We used the criteria to test how reproducible a set of papers from the AGILE conference actually are.
The conference is quite established and the papers are of high quality, because they were all nominated for the “best paper” awards at the conference.
The results are quite bad!
We could not re-create any of the analyses.
Then we asked the authors of the papers we evaluated if they had considered that someone else might want to re-do their work.
While they all think the idea is great, many said they do not have the time for it.
The only way for researchers to have the time and resources to work in a way that is transparent to others and reusable openly is either to convince them of the importance or to force them.
We came up with a list of suggestions to publishers and scientific conference organisers to create enough reasons for researchers to publish science in a re-creatable way.
21 Jun 2018 | Daniel Nüst
Last week o2r team member Daniel co-organised a workshop at the 21st AGILE International Conference on Geographic Information Science in Lund, Sweden.
The workshop went very well and Daniel together with his colleagues was able to spread the word about reproducible research and Open Science.
They are pretty sure they convinced some new scientists to reconsider their habits!
Daniel wrote a short report about the workshop: https://o2r.info/reproducible-agile/2018/#workshop-report
The workshop series will probably be continued at the next AGILE conference in Limassol, Cyprus.
For o2r participating in such a workshop is a great way to stay in touch with users of reproducibility tools and practices, and to give back to the communities not only with technology but with education.