Project completion
When your research project concludes, the question arises of what to do with your research data. Depending on their value and reuse potential, various options are available for publishing research data and making them available to third parties. You should also consider archiving your data properly so that your research results remain comprehensible and reproducible, in accordance with good scientific practice.
Publishing research data
Why should I publish my data?
There are a number of reasons why it is advisable to publish research data. One is enhanced reputation through greater visibility of your academic achievements, for example through citation and continued use of your research data by third parties. Publishing the underlying data is good scientific practice because it affords transparency and renders research results reproducible. Sharing research data also strengthens academic dialogue and partnerships, and can thereby have a greater impact on societal debates and political decision-making. A common open science argument is that the results of publicly funded research should, as a rule, be publicly available, and third-party funders are increasingly calling for research data to be published unless there are compelling reasons not to.
There are of course cases where your data should not or may not be published, for example if they are copyright-protected, subject to confidentiality or contain personal data. In some cases, however, such problems can be avoided upfront through careful project planning (see Project planning). You should therefore clarify any relevant legal issues with your project partners and other stakeholders as early as possible, obtain informed consent declarations and plan strategies for anonymizing your data (see Legal issues). We would be pleased to advise you on data publication in every phase of your research project; please contact us.
Repositories, data supplements and data journals
When publishing your data you should follow the FAIR principles so that your data can be used and cited by third parties. According to these principles, data should be findable, accessible, interoperable and reusable.
Most data are published in repositories. This has the advantage that the data can be found independently of text publications via certain search engines and cited as a separate data publication. As a rule, we recommend that you use a discipline-specific repository that specializes in the data formats and metadata standards of your field and affords maximum visibility within your academic community. Many institutions also have their own repositories for their researchers. At the University of Bonn you have access to the RADAR service for archiving and publishing research data from all academic disciplines. In addition to RADAR, other general repositories can be used to publish a wide array of data types. Please see our discipline-specific advisories on identifying suitable repositories.
Academic journals are increasingly offering the option of attaching research data to an article as a file attachment or supplementary material. The advantage is that the data are published together with the article in one step, without your having to meet the requirements of other platforms. Whether data published in this way conform to the FAIR principles depends on the extent to which the respective journal or publisher ensures such conformity.
Data journals are increasingly seen as the gold standard for data publication. They provide a peer review process that documents the quality of your research data in a published data paper. Data papers are particularly useful for demonstrating outstanding data quality and high reuse potential, but not all academic fields have suitable data journals at this time. A list of existing data journals is provided at Forschungsdaten.org: https://www.forschungsdaten.org/index.php/Data_Journals
Please contact us if you have questions or need assistance deciding on the most suitable publication route for your data. The following resources provide more information on research data publishing routes:
- CESSDA: Data publishing routes https://www.cessda.eu/Training/Training-Resources/Library/Data-Management-Expert-Guide/6.-Archive-Publish/Data-publishing-routes
- Biernacka, Katarzyna et al. (2019). Wie FAIR sind Deine Forschungsdaten? (How FAIR are your research data?) https://doi.org/10.5281/zenodo.2547338
- OpenAire: How to find a trustworthy repository for your data? https://www.openaire.eu/find-trustworthy-data-repository
- Forschungsdaten.info: Datenjournale – Peer-Review-Publikationen für Daten (Data journals – data publishing with peer review) https://www.forschungsdaten.info/themen/aufbereiten-und-veroeffentlichen/datenjournale/
Data preparation
It is important to prepare data prior to publication to enable interpretation and further usage of the data by third parties. The preparation process starts with selection and curation, determining what files in what versions are to be included in the publication and which are not. The subsequent usage potential of the data should be the primary selection criterion. In addition to using the data collected to pursue further research questions, you should consider other usage scenarios as well. Could your data be of interest for teaching, for example, or do you have materials that could be beneficial in terms of methodology (survey methods, questionnaires, analysis techniques, visualization scripts, etc.)? Data quantity is another selection criterion, as publishing large datasets can involve considerable cost.
All data you publish should be documented comprehensively so that third parties can understand them on their own, ideally without even having to read your paper. Please also note our data documentation advisories.
Has it been determined whether the data are subject to copyright or exploitation rights? Once any necessary consent declarations have been obtained and personal data have been anonymized (please note our advisories on legal issues), check your data against formal criteria such as date and number formats, value scales, field and variable naming conventions, abbreviations, spelling errors, and so on. Also look for extra spaces and special characters, because these are notorious for causing errors in automated data processing.
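Some of these formal checks can be automated. The following Python sketch is only an illustration under stated assumptions: the column name sampling_date, the expected date format YYYY-MM-DD and the rule that only printable ASCII characters are allowed are all hypothetical and should be adapted to your own data.

```python
import csv
import io
import re
from datetime import datetime

# Hypothetical rule set: columns listed here must hold dates as YYYY-MM-DD.
DATE_COLUMNS = {"sampling_date"}
# Anything outside printable ASCII is flagged; relax this if your data
# legitimately contain umlauts or other non-ASCII characters.
SUSPICIOUS = re.compile(r"[^\x20-\x7E]")


def check_rows(reader):
    """Return a list of (row_number, column, problem) tuples."""
    problems = []
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for column, value in row.items():
            if value != value.strip():
                problems.append((row_number, column, "leading/trailing whitespace"))
            if SUSPICIOUS.search(value):
                problems.append((row_number, column, "non-ASCII or control character"))
            if column in DATE_COLUMNS:
                try:
                    datetime.strptime(value.strip(), "%Y-%m-%d")
                except ValueError:
                    problems.append((row_number, column, "date not in YYYY-MM-DD format"))
    return problems


# Invented sample data with a trailing space, a German-style date
# and a non-breaking space hidden at the end of a value.
sample = "site,sampling_date,value\nA1,2021-03-04,1.5\nA2 ,04.03.2021,2.0\nA3,2021-03-06,3\u00a0\n"
problems = check_rows(csv.DictReader(io.StringIO(sample)))
for problem in problems:
    print(problem)
```

To check a real file, pass csv.DictReader an opened file instead of the in-memory sample. A check like this does not replace a manual review of value scales, abbreviations and naming conventions.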
For further information on data preparation see:
- J. Trixa & T. Ebel (2015) (Verbund Forschungsdaten Bildung): Hinweise zur Aufbereitung quantitativer Daten (Information on preparing quantitative data) https://www.forschungsdaten-bildung.de/get_files.php?action=get_file&file=fdb-informiert-nr-4.pdf
- Forschungsdaten.info: Datenvalidierung (Data validation) https://www.forschungsdaten.info/themen/veroeffentlichen-und-archivieren/datenvalidierung/
- Digital Curation Center: Five steps to decide what data to keep http://www.dcc.ac.uk/resources/how-guides/five-steps-decide-what-data-keep
Metadata tagging
The general practice is to tag your data with metadata prior to publication. Put simply, metadata are structured information that describes your dataset. Without metadata, your dataset is essentially worthless to third parties because the content of the data is not apparent. Metadata for research data include bibliographical and administrative information explaining how the dataset was created, as well as technical and content-related information, which may vary greatly depending on the academic field. For this reason many different metadata standards exist, each prescribing a fixed set of fields for describing a dataset. The spectrum ranges from general standards like the popular DataCite Schema (https://schema.datacite.org/meta/kernel-4.3/) to specialized schemas like the Ecological Metadata Language (https://nceas.github.io/eml/). Metadata are generally recorded in text-based formats like XML and JSON, which are both human-readable and machine-readable. This renders research data findable via relevant search engines for subsequent use. Most data repositories, however, provide input wizards and help texts that facilitate metadata tagging and spare you considerable technical effort.
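To give an impression of what such a machine-readable record can look like, here is a minimal Python sketch that builds a metadata record and writes it out as JSON. The field names are loosely modeled on the mandatory properties of the DataCite schema (identifier, creator, title, publisher, publication year, resource type); all values, including the DOI, are invented placeholders, and the result is not a validated DataCite record.

```python
import json

# All values are invented placeholders. The field names loosely follow the
# mandatory DataCite properties; this is not a schema-validated record.
record = {
    "identifier": {"identifierType": "DOI", "identifier": "10.1234/example"},
    "creators": [{"name": "Doe, Jane"}],
    "titles": [{"title": "Example survey dataset"}],
    "publisher": "Example University",
    "publicationYear": "2021",
    "resourceType": {"resourceTypeGeneral": "Dataset"},
}

# Assumed list of fields that must not be empty before publication.
REQUIRED = ["identifier", "creators", "titles", "publisher",
            "publicationYear", "resourceType"]


def missing_fields(metadata):
    """Return the required fields that are absent or empty."""
    return [field for field in REQUIRED if not metadata.get(field)]


metadata_json = json.dumps(record, indent=2, ensure_ascii=False)
print(metadata_json)
print("Missing fields:", missing_fields(record))
```

In practice the input forms of a repository usually generate such a record for you, so a script like this is mainly useful for understanding the structure or for preparing many datasets at once.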
We invite you to contact us for any metadata-related questions you may have, and encourage you to read our discipline-specific advisories. You may also find the following resources useful:
- Riley, Jenn (2017): Understanding metadata. What is metadata, and what is it for? Baltimore, MD: National Information Standards Organization (NISO Primer series) https://www.niso.org/publications/understanding-metadata-2017
- Cornell University: Metadata and describing data: https://data.research.cornell.edu/content/writing-metadata
- DataCite Metadata Generator: https://dhvlab.gwi.uni-muenchen.de/datacite-generator/
- Forschungsdaten.info: Metadaten und Metadatenstandards (Metadata and metadata standards) https://www.forschungsdaten.info/themen/aufbereiten-und-veroeffentlichen/metadaten-und-metadatenstandards/
Usage licenses
When publishing your research data you will generally provide a usage license which stipulates the conditions for usage of your data. Standardized Creative Commons (CC) licenses are often utilized for digital works, as these cover most usage scenarios for copyrighted material.
CC licenses are in widespread use for publishing research data, but there are other licenses as well which are specifically designed for data and databases. The following licenses are available through the Open Data Commons (ODC) project (https://opendatacommons.org/): Public Domain Dedication and License (PDDL, similar to CC0), Attribution License (ODC-By, similar to CC-BY) and Open Database License (ODC-ODbL, similar to CC-BY-SA).
By granting a license you permit third parties to use your data under certain terms. These terms are only valid if you actually hold the rights to the data. With research data this point is often unclear, and there can be various legal implications under copyright, ancillary copyright for press publishers, data protection law, personal rights and/or labor law (see also the section Legal Issues). Before licensing your data you should conduct a legal review of the risk of claims being asserted over the data. As a basic rule, data not subject to protective rights must be designated “public domain”. You can (and should) publish such data, but you do not hold any copyright to them that you could assert under a CC or ODC license.
Please contact us with any data licensing questions you may have, but be advised that the Research Data Service Center does not provide binding legal advice. If you believe you may require binding legal advice, please contact the Legal Department of the University of Bonn.
You may find the following license-related resources useful:
- Ball, Alex (2014): How to License Research Data? Digital Curation Centre http://www.dcc.ac.uk/sites/default/files/documents/publications/reports/guides/How_To_License_Research_Data.pdf
- Das Portal Forschungslizenzen (The research licenses portal) http://forschungslizenzen.de
- Forschungsdaten.info: Forschungsdaten veröffentlichen? Die wichtigsten rechtlichen Aspekte (Publishing research data? The most important legal aspects) https://zenodo.org/record/3368293
- Baumann, Paul; Krahn, Philipp; Lauber-Rönsberg, Anne (2020): Entscheidungsbaum für die Veröffentlichung von Forschungsdaten (Decision tree for publishing research data). Dresden: Technische Universität Dresden. https://d-nb.info/122783327X/34
- Kreutzer, Till; Lahmann, Henning (2021): Rechtsfragen bei Open Science (Legal Issues with Open Science). Hamburg University Press http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:gbv:18-3-2112
- Kreutzer, Till (ed.) (2014): Open Content. A practical guide to using Creative Commons Licences https://commons.wikimedia.org/wiki/File:Open_Content_-_A_Practical_Guide_to_Using_Creative_Commons_Licences.pdf
- Public License Selector https://ufal.github.io/public-license-selector/
Archiving research data
Upon conclusion of your project, the question arises of what to do with the research data. According to the DFG's precepts of good research practice, data must be retained for at least ten years to fulfill the requirements of verifiability and transparency in academic research. A well-thought-out archiving concept will also benefit you if you take up that research again in the future. Research data should be archived along with descriptive documentation, as well as any preliminary results and processing tools. Data are generally archived irrespective of whether some of the data have been made available in publication form. This does not mean, however, that you always have to archive all files and file versions. As with publication (see above), archiving requires data selection and curation. Generally, you yourself are the best judge of what data should be archived in what form to meet documentation requirements.
Do not archive your data on a USB stick in your desk drawer! Different archiving strategies are advisable depending on the scope, format, content and sensitivity of the data concerned. The simplest option is to keep the data yourself, but if you do so you should store them on at least three separate data carriers kept in at least two different locations (see also Storage and Backup Strategies). Please be aware that storage media have a limited useful life. You should encrypt your data carriers to prevent misuse of the data. We recommend that you store an additional copy of your data on the University of Bonn file service infrastructure (see section on Storage and Security).
In addition to such solutions for handling storage yourself, a number of data repositories are available (see above) offering professional archiving services, some of which are geared to long-term archiving. Graduate students and faculty of the University of Bonn also have the option of archiving data for up to 15 years via the RADAR service. The service allows you to decide which parties can have access to your data. Other repositories afford a higher level of security for sensitive data requiring greater protection against unauthorized access.
In addition to simply storing datasets so as to protect them from being compromised (bitstream preservation), archiving research data involves challenges arising from continuous changes in operating systems, software environments and data formats. Files you save today may not open at some point in the future because the necessary software is no longer available or installable, or because the format is no longer supported. There are, however, two measures you can take to minimize this risk of data loss. (1) Documentation: the more precisely you describe the software environment (tool version, add-ons used, operating system, etc.), the easier it will be to restore the data if necessary. This is especially important if you use specialized software that is not widely in use. (2) Data formats: use a data format for storage that is as open and standardized as possible (see also the section Organizing Research Data). Such formats are usually interoperable, i.e. they are much easier to process in differing system environments.
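If you keep several copies yourself, it is worth checking from time to time that they are still identical. The following Python sketch shows one possible way to do this by comparing SHA-256 checksums. The file names are invented, and temporary files stand in for the separate data carriers; this is an illustration, not a prescribed procedure.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copies):
    """Map each copy to True if its checksum equals that of the original."""
    reference = sha256_of(original)
    return {str(copy): sha256_of(copy) == reference for copy in copies}


# Demonstration: temporary files stand in for copies on different carriers.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    original = tmp / "data.csv"
    original.write_bytes(b"id,value\n1,2.5\n")
    good_copy = tmp / "copy_a.csv"
    good_copy.write_bytes(original.read_bytes())
    damaged_copy = tmp / "copy_b.csv"
    damaged_copy.write_bytes(b"id,value\n1,2.6\n")  # one digit differs
    result = copies_match(original, [good_copy, damaged_copy])
    print(result)
```

Storing the checksum of each file alongside the archived data makes it possible to detect a damaged copy later, even if the original is no longer at hand.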
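Measure (1) can be partly automated when your analysis is script-based. As an illustration only, the Python sketch below records the operating system, the Python version and the versions of selected packages in a JSON structure that could be archived next to the data. The package names passed in are arbitrary examples, and the second one is deliberately a name that should not be installed.

```python
import json
import platform
import sys
from importlib import metadata


def describe_environment(packages):
    """Collect basic facts about the software environment as a dict."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return {
        "operating_system": platform.platform(),
        "python_version": sys.version.split()[0],
        "packages": versions,
    }


# Example call; replace the list with the packages your analysis relies on.
environment = describe_environment(["pip", "a-package-that-does-not-exist"])
print(json.dumps(environment, indent=2))
```

For software outside the Python ecosystem the same information has to be noted by hand, for example in a README file stored with the data.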
We would be pleased to advise you regarding any data archiving questions you may have. You may also find the following resources helpful:
- Briney, Kristin A. (2020) Project Close-Out Checklist for Research Data https://resolver.caltech.edu/CaltechAUTHORS:20200519-142758925
- Forschungsdaten.info: Langzeitarchivierung (Long-term archiving) https://www.forschungsdaten.info/themen/bewahren-und-nachnutzen/langzeitarchivierung/
- Rekowski, Thomas. (2018). Durability of Storage Media. Zenodo. https://doi.org/10.5281/zenodo.1468358
- Weichselgartner, Erich et al. (2011): Archivierung von Forschungsdaten (Archiving research data). In: Stephan Büttner and Hans-Christoph Hobohm (eds.): Handbuch Forschungsdatenmanagement. Bad Honnef: Bock + Herchen, pp. 191–202. https://opus4.kobv.de/opus4-fhpotsdam/files/208/HandbuchForschungsdatenmanagement.pdf
- Redwine, Gabriela; Beagrie, Neil (2015): Personal Digital Archiving http://dx.doi.org/10.7207/twr15-01
Image sources:
Publications and Data: Auke Herrema (CC-BY)
Data Repository: Roche DG, Lanfear R, Binning SA, Haff TM, Schwanz LE, et al. (2014) CC-BY 4.0
Metadata: Jørgen Stamp CC-BY 2.5 DK
CC licenses: TU Darmstadt CC-BY-SA 3.0 DE
Floppy Box: Jeremy Keith CC-BY 2.0
Publishing icon: Adrien Coquet from the Noun Project CC-BY 3.0 US
Archiving icon: Nicolas Vicent from the Noun Project CC-BY 3.0 US