
Dr. Domenico Giusti
Paläoanthropologie, Senckenberg Centre for Human Evolution and Palaeoenvironment
Open research software, or open-source research software, refers to the use and development of software for analysis, simulation, visualization, etc. where the full source code is available [and] shared under a license that allows modification, derivation, and redistribution.
Archaeology is undergoing a tool-driven revolution, or at least a change, in search of a more transparent and reproducible model of research.
Factors:
Percentage of articles per year citing R in top Ecology journals (5,800 articles out of 42,659). Data from Web of Science. Schmidt & Marwick 2020
Proportion of Archaeology articles per year citing R (a total of 154 out of 42,991 articles in our sample for 2008–2018). Sub-plot shows articles published in the Journal of Archaeological Science during 2008–2017.
Articles in archaeology journals using R for reproducible research, and making code files openly available to accompany the published article (n = 85). Schmidt & Marwick 2020
Median citation rates per year for archaeology articles 2010–2017 that cite R (n = 216) and articles that do not cite R (n = 42,828). On average, articles citing R have higher numbers of citations (m = 10.1) than articles that do not (m = 6.5), t(158) = 3.38, p = 0.00092. Schmidt & Marwick 2020
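The reported test statistic can be checked independently. The following is a minimal sketch, assuming the test was two-sided and that t(158) denotes a Student's t statistic with 158 degrees of freedom; the function names are hypothetical and the integration uses only the Python standard library, not the method used by Schmidt & Marwick.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=10_000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    t = abs(t)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        # Simpson weights: 4 for odd interior points, 2 for even ones
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 <= T <= |t|)
    return 1.0 - 2.0 * central  # probability mass in both tails


# Values taken from the caption above; two-sidedness is an assumption.
p_value = two_sided_p(3.38, 158)
print(f"two-sided p = {p_value:.5f}")
```

Under these assumptions the result should land close to the reported p = 0.00092, which suggests the statistic, degrees of freedom, and p-value in the caption are mutually consistent.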
This image was created by Scriberia for The Turing Way community and is used under a CC-BY licence
An article about a computational result is advertising, not scholarship. The actual scholarship is the full software environment, code, and data that produced the result.
Buckheit & Donoho 1995
A certain familiarity with the computational tools is required to be able to use them not just on a technological basis, but creatively and in full knowledge of their restrictions and ambiguities.
As with data, preparing code to make it publicly available takes time to ensure that it is fit for others to read and use.
I can’t share my [(R) code] — it’s too messy / it doesn’t have good documentation / I didn’t leave good comments!
Developers of research software around the world empathize with this feeling: people rarely feel that their code is “ready” to share publicly or that it is “finished”. However, as Barnes (2010) put it, “if your code is good enough to do the job, then it is good enough to release”, and releasing it “will help your research and your field.” In other words, if you feel comfortable enough with your software to publish a study or report results, then the code is sufficiently developed to share with your colleagues. (Conversely, if you don’t feel comfortable sharing the code, then perhaps it requires more development or testing before being used in a publication.) Plus, sharing your code allows others to improve and build upon it, leading to even greater impact and innovation (and citations for you!).
What if someone takes the code I have shared and uses it for nefarious purposes, or claims they wrote it?
Selecting an appropriate license for your software helps protect you from liability for how others use it; for example, the widely used MIT License both limits liability and states that no warranty is provided. If someone else claims that they wrote the software you made available, you can point to the timestamps on your repository or archived versions as evidence of your prior work.
If I share my code in an online repository, I will be deluged with requests for user support.
Although potential users may ask you for help, whether by email or through issues filed on the online repository, you are under no obligation to provide support if you prefer not to or cannot do so. An appropriate license even gives you legal protection here (e.g., the no-warranty clause of the MIT License).
The primary cost of enhancing reproducibility is the time required to learn to use the software tools.
"Developing competence in using these tools for enhancing computational reproducibility is time-consuming, and raises the question of how much of this is practical for most archaeologists, and what the benefits and costs might be. Our view is that once the initial costs of learning the tools is paid off, implementing the principles [of RR] makes research and analysis easier, and has material professional benefits [such as citation advantages]." Marwick 2017