Open Science

Basic principles and best practices

Dr. Domenico Giusti
Paläoanthropologie, Senckenberg Centre for Human Evolution and Palaeoenvironment

Introducing myself

I am a research/technician at the Senckenberg Centre for Human Evolution and Palaeoenvironment in Tübingen and an adjunct lecturer at the University of Tübingen, WG Palaeoanthropology. My main research interest lies in the study of archaeological site formation processes through spatial analysis. I specialize in computer applications and quantitative methods for archaeological research, and I am active in international archaeological research projects. I am also interested in techniques and methods for reproducible research and open science.

My contacts


1. Open Concepts and Principles

Outline

  • Course syllabus
  • Definitions
  • Rationale
  • Summary
  • FAQ
  • Food for thought
  • Practical exercises

Course syllabus

Associated course of studies

Bachelor Ur- und Frühgeschichtliche Archäologie und Archäologie des Mittelalters / Paläoanthropologie

  • 5 enrolled students

Master Naturwissenschaftliche Archäologie

  • 6 enrolled students

+ 2 (Bachelor Geowissenschaften, Master English Linguistics)

Teaching methods

Lecture

Due to the Covid-19 pandemic, the lecture takes place online as recorded sessions. Course material will be uploaded to ILIAS on a weekly basis.

Exercise

Due to the Covid-19 pandemic, the exercise takes place online, live via Zoom.

Requirements

The course has NO specific prerequisites. NO background knowledge or programming skills are expected.

Technical requirements: Regardless of your platform (Windows, Mac or Linux), you will need a high-speed Internet connection to watch the videos, download data and software, and submit your assignments. You will also need to be able to install software on your PC. This course uses free and open source software exclusively.

Learning objectives

  • Learn guiding principles of Open Science:
    • Open Data
    • Open Source
    • Open Methods
    • Open Access
    • Open Peer-review
    • Open Educational Resources
  • Understand the ethical, legal, social, economic, and research impact arguments for and against Open Science
  • Understand EU and publishing policies
  • Learn some of the best practices of Open Science:
    • R programming
    • Writing dynamic documents with (R)markdown
    • Version control with Git and GitHub
    • Pre-print & data publishing

Programme

Week           Module
1 (26 Apr.)    1. Open Concepts and Principles
2 (3 May)      2. Open Research Data
3 (10 May)     3. Open Research Software and Open Source
4 (17 May)     4. Reproducible Research and Data Analysis
Pfingsten      Break
5 (31 May)     Guest lecture (B. Marwick)
6 (7 Jun.)     5. Open Access to Published Research Results
7 (14 Jun.)    6. Open Licensing and File Formats
8 (21 Jun.)    7. Collaborative Platforms
9 (28 Jun.)    8. Open Peer Review, Metrics, and Evaluation
10 (5 Jul.)    9. Open Science Policies
11 (12 Jul.)   10. Citizen Science
12 (19 Jul.)   11. Open Educational Resources
13 (26 Jul.)   Wrap-up & Final exam

Assignments

  • Quizzes (Modules 1 - 11)
  • swirl modules (15)
  • Final exercise
  • Final exam

Course resources

A growing list of resources is available on ILIAS.

Pre-course survey

  • Why are you doing the course? Which goals do you want to achieve?
  • What is your level of prior knowledge in this course's subject area?
  • Do you already have programming skills?

Please complete the pre-course survey.

Definitions

What is Open?

Open means anyone can freely access, use, modify, and share for any purpose (subject, at most, to requirements that preserve provenance and openness)

Open Definition

"The Open Definition was initially derived from the Open Source Definition, which in turn was derived from the original Debian Free Software Guidelines, and the Debian Social Contract of which they are a part, which were created by Bruce Perens and the Debian Developers. Bruce later used the same text in creating the Open Source Definition. This definition is substantially derivative of those documents and retains their essential principles. Richard Stallman was the first to push the ideals of software freedom which we continue."

Open Definition

What is Free?

"Free software means software that respects users' freedom and community. Roughly, it means that the users have the freedom to run, copy, distribute, study, change and improve the software. Thus, “free software” is a matter of liberty, not price. To understand the concept, you should think of “free” as in “free speech,” not as in “free beer”. We sometimes call it “libre software,” borrowing the French or Spanish word for “free” as in freedom, to show we do not mean the software is gratis."

Free Software Foundation

The 4 essential freedoms

A program is free software if the program's users have the four essential freedoms:

  • The freedom (0) to run the program as you wish, for any purpose
  • The freedom (1) to study how the program works, and change it so it does your computing as you wish. Access to the source code is a precondition for this.
  • The freedom (2) to redistribute copies so you can help others.
  • The freedom (3) to distribute copies of your modified versions to others. By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.

Free Software Foundation

Free VS Proprietary software

What is Open Science?

"The movement to make scientific research (including publications, data, physical samples, and software) and its dissemination accessible to all levels of an inquiring society, amateur or professional. Open science is transparent and accessible knowledge that is shared and developed through collaborative networks. It encompasses practices such as publishing open research, campaigning for open access, encouraging scientists to practice open-notebook science, and generally making it easier to publish and communicate scientific knowledge."

Wikipedia Open Science Definition

"Open Science is the practice of science in such a way that others can collaborate and contribute, where research data, lab notes and other research processes are freely available, under terms that enable reuse, redistribution and reproduction of the research and its underlying data and methods."

FOSTER Open Science Definition

The 4 fundamental rules of Open Science

Free Software

The 4 essential freedoms

  • The freedom (0) to run the program as you wish, for any purpose
  • The freedom (1) to study how the program works, and change it so it does your computing as you wish. Access to the source code is a precondition for this.
  • The freedom (2) to redistribute copies so you can help others.
  • The freedom (3) to distribute copies of your modified versions to others. By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this.

Open Science

The 4 fundamental rules

  • Transparent
    • TRANSPARENCY is a precondition for looking inside, for studying the inner mechanism. It is a precondition for REPRODUCIBILITY
  • Available & free
  • Accessible
  • Reusable

Open Science basic principles

Schools of thought

The origin and the future of science

Openness in science is significant in that it both defines the origins of modern science and imagines the future of science.

Fecher & Friesike 2013

Rationale

Research integrity

Bias can be introduced throughout the research process by

  • Research misconduct (making up data or results, selective reporting, cherry picking...),
  • Errors (statistical reporting errors),
  • Inconsistencies (results do not match the data generating process),
  • HARKing (Hypothesizing After the Results are Known),
  • Publication bias (published literature is systematically unrepresentative of the real population of completed studies, e.g., preference to publish positive results or to reject negative results, cooking/writing up mixed results or non-significant results).

Challenge all steps of your research (research question/hypothesis, study design, data sourcing, processing, analysis, report)!

Research integrity

  • Apophenia: the tendency to see patterns in random data
  • Confirmation bias: the tendency to focus on evidence that is in line with our expectations or favoured explanation
  • Hindsight bias: the tendency to see an event as having been predictable only after it has occurred

Munafò et al. 2017
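
The pull of apophenia can be illustrated with a small simulation. The sketch below is not part of the original slides. It is written in Python using only the standard library (the course itself works in R), and the function names `two_sided_p` and `simulate_null_studies`, the group size, the number of studies, and the 0.05 threshold are all illustrative assumptions. It repeatedly compares two groups drawn from the same distribution and counts how often a simple z-test with known variance still reports a "significant" difference.

```python
import math
import random
import statistics


def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_null_studies(n_studies=1000, n_per_group=20, alpha=0.05, seed=1):
    """Share of studies reporting p < alpha when there is NO true effect.

    Both groups are sampled from the same normal distribution
    (mean 0, standard deviation 1), so every 'significant' result
    is a pattern seen in pure noise.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    significant = 0
    for _ in range(n_studies):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        # z-test for a difference in means with known variance 1 per group
        diff = statistics.fmean(group_a) - statistics.fmean(group_b)
        z = diff / math.sqrt(2 / n_per_group)
        if two_sided_p(z) < alpha:
            significant += 1
    return significant / n_studies


false_positive_rate = simulate_null_studies()
print(f"Share of 'significant' results with no true effect: {false_positive_rate:.3f}")
```

Under these assumptions about one comparison in twenty is expected to cross the threshold by chance alone. If only those comparisons are written up, the published record shows an effect that does not exist.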

"A research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser preselection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical modes; when there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance."

Ioannidis 2005

Motivations

  • Increased TRANSPARENCY can reduce fraud, data manipulation, and selective reporting of results
    • Increased transparency for greater efficiency, rigour, accountability, sustainability for future generations, and reproducibility.
  • Sharing resources between research disciplines facilitates scientific collaboration
    • Open research outputs are openly licensed in order to maximize re-use while allowing the creator to retain ownership and receive credit for their work.
  • Open Science leads to increased impact associated with wider sharing and re-use (e.g., the so-called open access citation advantage)
  • Pressure from research academies and governments for publicly-funded research to be shared more openly
    • Publicly funded research outputs should be publicly available (EU's open science policy)
    • Need to drive cultural change in research and amongst researchers
  • Open Science could increase trust in science and in the reliability of scientific results.

How is Open Science relevant to archaeology?

  • "Open Science practices encourage archaeologists to conduct research that is transparent, reusable, and easily accessible (open data and open methods) without financial or copyright barriers (open access)."
  • "Open Science practices improve archaeology by increasing transparency and reproducibility in archaeological research. This approach enables archaeologists to more readily and responsibly build on the work of their colleagues, advancing archaeological practice and accelerating discovery. Transparency and reproducibility also enhance the credibility of archaeological research by allowing more complete independent assessment of research findings than is possible with traditional peer review of only research results. Open Science practices promote ethical research by enabling researchers to efficiently demonstrate the chain of reasoning behind their data analysis and expose more of their research workflow to the research community and the public".
  • "Community best practices for open science in archaeology facilitate the sharing of methods, data, and results by encouraging researchers to deposit them in trustworthy online repositories. Standardizing research-sharing practices enhances engagement between archaeologists, our collaborators, and the communities we work with, including policymakers and project managers."

Marwick et al. 2017

Major challenges to the realization of Open Science

  • Number of actors: authors, reviewers, publishing houses, funding bodies, educational institutions, etc.
    • Authors have to invest a considerable amount of time and effort to make their research reproducible
    • Reviewers have to invest a considerable amount of time and effort to reproduce the reproducible research
    • Publishing houses are reluctant to give up a commercial model that has been very successful for them in favour of a model that is relatively untested and probably less lucrative.
    • Funding bodies have to foster Open Science by networking with all actors
    • Educational institutions have to invest resources to develop Open Science curricula and institutional repositories or alternative strategies

  • Licensing and copyright issues
    • Sensitive data
    • Data ownership: authors often do not realize that open research outputs are typically openly licensed in order to maximize re-use while allowing the creator to retain ownership and receive credit for their work.
  • Lack of standards, or shared practices
  • CULTURAL CHANGE
    • To achieve the necessary cultural change in academia, a new, widely-accepted incentive and reward system for researchers is required.

Good news

Summary

  • Definition of Open & how this is related to the free software philosophy
  • Definition of Open Science & key values of TRANSPARENCY, ACCESSIBILITY, AVAILABILITY, REPRODUCIBILITY, REUSABILITY
  • Open Science basic principles & the multiple perspectives of approaches
    • Open Data
    • Open Source
    • Open Methods
    • Open Access
    • Open Peer-review
    • Open Educational Resources
  • Research process, motivation & major challenges

FAQ

What is the difference between Open Science and ‘science’?

Open Science refers to doing traditional science with more transparency involved at various stages, for example by openly sharing code and data. Many researchers do this already, but don’t call it Open Science.

Does ‘Open Science’ exclude the Humanities and Social Sciences?

No, the term Open Science is inclusive. Open Science is sometimes referred to more broadly as ‘Open Research’ or ‘Open Scholarship’ to be more inclusive of other disciplines, principles and practices. However, Open Science is a commonly used term at multiple levels, so it makes sense to adopt it for communication purposes, with the proviso that it includes all research disciplines.

Does Open Science lead to misuse or misunderstanding of research?

No, the application of Open Science principles is in fact a safeguard against misuse or misunderstanding. Transparency breeds trust, confidence and allows others to verify and validate the research process.

Will Open Science lead to information overload?

It is better to have too much information and deal with it, than to have too little and live with the risk of missing the important parts. And there are technologies such as RSS feeds, machine learning and artificial intelligence that are making content aggregation easier.

Food for thought

What is the current application of Open Science principles in your institute? Picture the positive outcomes that Open Science might have on your current academic life.

References & further resources

Practical exercises

Outline

  • Introduction to some of the best tools for doing Reproducible Research
    • R & RStudio
    • (R)Markdown
    • Git & GitHub
    • Zenodo, OSF

Setup

Download and install the R base system and RStudio. Both are needed. Installing RStudio will not automatically install R.