Data management in Computer Science

In the exact sciences, large quantities of data of all kinds are often produced: data from various kinds of measuring apparatus, image files, databases, simulations, statistical data, geographical data, spreadsheets, etc. This also includes data in publications, such as (Open) Office documents and CSV, HTML or PDF files.

To keep these data retrievable, accessible and understandable in the long term, the storage, sharing and archiving of the data must be carefully organised and documented. Responsible storage and handling of research data is what is meant by research data management (RDM).

Data Management Plan

A first step towards responsible data management is the drawing up of a Data Management Plan (DMP). In this plan the researcher(s) describe(s) what type of data will be collected, how and where the data are stored, and who will have access to the data. A DMP is often mandatory when submitting a grant or research proposal.
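As a rough illustration, the three questions named above (what type of data, how and where they are stored, who has access) can be treated as a checklist. The sketch below is only an assumption about how one might track them; the class name, field names and sample values are hypothetical and do not correspond to any official DMP template.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    # Hypothetical field names; one per question a DMP should answer.
    data_types: str = ""   # what type of data will be collected
    storage: str = ""      # how and where the data are stored
    access: str = ""       # who will have access to the data


def missing_sections(plan: DataManagementPlan) -> list[str]:
    """Return the names of sections that are still empty."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


# Example draft with the access section not yet filled in.
draft = DataManagementPlan(
    data_types="Simulation output (CSV files)",
    storage="Institutional data repository",
)
print(missing_sections(draft))  # → ['access']
```

A real DMP covers more questions than these three; the funder's or institution's own template remains the authoritative list.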

On the Data Management Plan page you can find the kind of questions which must be covered by a DMP. There are also links to models, checklists and online DMP tools which may help when drawing up a DMP.

On the website of Wageningen University there is a template, together with examples of Data Management Plans for PhD research in eco-hydrology and eco-toxicology.

Researchers must draw up the DMP themselves. An information specialist from the Library can offer advice and support during the writing.

Metadata

Metadata are the data recorded alongside the production of raw data. They describe who collected the data, where, when, what type of data, within which discipline, etc. When depositing a dataset in a data repository you will also be asked to enter information which describes the dataset. A widely used standard is Dublin Core, which can be applied across a wide range of disciplines and is suitable for many types of data.
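To make this concrete, the sketch below writes a small Dublin Core record as XML using Python's standard library. The element names (creator, coverage, date, type, subject, title) come from the Dublin Core element set; how they map onto "who, where, when, what type, which discipline" is one plausible reading, and the wrapper element `metadata`, the function name and all sample values are assumptions for illustration only. A repository will normally ask for these fields through its own deposit form.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(values: dict[str, str]) -> str:
    """Serialise a dict of Dublin Core element name -> value as XML."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in values.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# All values below are made up for the example.
record = build_dc_record({
    "title": "Example simulation dataset",
    "creator": "A. Researcher",       # who collected the data
    "coverage": "Amsterdam",          # where
    "date": "2017-10-10",             # when
    "type": "Dataset",                # what type of data
    "subject": "Computer science",    # within which discipline
})
print(record)
```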

A number of disciplines have their own standards:

  • Biology: Darwin Core

  • Ecology: Ecological Metadata Language (EML)

  • Genomics: Genome Metadata, MIxS

  • Informatics: among others, Resource Description Framework (RDF) and eXtensible Markup Language (XML)

  • General: Dublin Core, DataCite Metadata Schema

An extensive overview for all disciplines can be found on the website of the Digital Curation Centre.

The person who submits the data is responsible for adding the metadata. An information specialist from the Library can help you select the most useful metadata.

Storage of research data

UvA/AUAS figshare

In UvA/AUAS figshare you can store your research data safely and still have access: your files are stored on ISO certified servers in Switzerland, and you can access them from any computer with an internet connection.

You can also safely share, archive and publish your research data. Your file is assigned a Digital Object Identifier (DOI), a unique permanent link you can use in articles and presentations.
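A DOI becomes a clickable permanent link when it is appended to the resolver address https://doi.org/. The sketch below shows one way to build such a link for use in an article or presentation. The function name is hypothetical, the validity check is deliberately simple, and the DOI in the example is a made-up placeholder rather than a real dataset.

```python
from urllib.parse import quote

RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn a DOI such as 'doi:10.1234/abc' into a resolvable link."""
    doi = doi.strip()
    # Drop common prefixes so that only the bare DOI remains.
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Simplified check: a DOI starts with '10.' and has a '/' separator.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path.
    return RESOLVER + quote(doi, safe="/")


# Placeholder DOI, for illustration only.
print(doi_to_url("doi:10.1234/example.5678"))
# → https://doi.org/10.1234/example.5678
```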

4TU.ResearchData

4TU.ResearchData is a collaboration of the three technical universities in the Netherlands and Wageningen University. UvA researchers may use 4TU.ResearchData for long-term archiving of their data.

Dryad

Dryad is a digital repository for storing data which accompany scientific publications. Dryad arose out of an initiative of a group of journals and scientific organisations in the fields of evolutionary biology and ecology. The charges depend on the journal in which the article is published.

TAIR

The Arabidopsis Information Resource (TAIR) was created for the storage of genetic and molecular-biological information on the model higher plant Arabidopsis thaliana (thale cress). The UvA pays an annual contribution in support.

Other data repositories

Data repositories such as Ecological Archives (ESA, ecology), GitHub (software), the University of Florida Sparse Matrix Collection (mathematics and physics) and others can be found via registers of data repositories.

Support

If you have any questions on research data management in Biology, Informatics or Logic, please contact drs. G.H. (George) Meerburg. He can also give you advice on drawing up a DMP or on selecting a good metadata standard or repository for your research data. 

  • drs. G.H. (George) Meerburg

    Information specialist Biology, Informatics and Logic

    G.H.Meerburg@uva.nl | T: 0205256643


Published by RDM support

10 October 2017