Simply put, metadata are data about data: the information that characterises or identifies a file or dataset. When you publish or archive your data, you will be asked to provide this information.
Metadata serve two purposes:
- to make data findable: metadata provide the information that the search engine of a data archive (repository) needs to decide whether a dataset satisfies a query and should be presented in the search results.
- to make data citable: metadata provide the elements for a correct citation of a dataset.
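Both purposes can be sketched in a few lines of Python. This is a minimal illustration, not how any particular repository works: the record, its field names (`creators`, `title`, `year`, `repository`, `identifier`), the identifier value and the citation layout are all assumptions made for the example.

```python
# Hypothetical metadata record; every field name and value is illustrative.
record = {
    "creators": ["Jansen, A.", "de Vries, B."],
    "title": "Survey on commuting habits",
    "year": 2020,
    "repository": "Example Data Archive",
    "identifier": "example-id-0001",
}


def matches(metadata, query):
    """Findable: True if every query word occurs (as a substring,
    ignoring case) in the title or the creator names."""
    searchable = " ".join([metadata["title"], *metadata["creators"]]).lower()
    return all(word in searchable for word in query.lower().split())


def cite(metadata):
    """Citable: assemble a citation string from the metadata elements."""
    authors = "; ".join(metadata["creators"])
    return (
        f"{authors} ({metadata['year']}). {metadata['title']}. "
        f"{metadata['repository']}. {metadata['identifier']}"
    )


found = matches(record, "commuting survey")
citation = cite(record)
```

A real search engine would index many more fields and rank the results, and a real citation would follow the style the repository or journal prescribes. The point is only that both operations work on the metadata, not on the data themselves.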
Metadata are usually recorded when the data are archived or published. If you deposit a dataset in a repository, you will be asked to provide information about it, for example:
- the maker(s) of the dataset
- title or name of the dataset
- a brief description of the dataset
- the date on which the dataset was made or completed
- the period during which the data were collected or to which they refer
- the geographical area where the data were collected or to which they refer
- rights regarding the data (ownership, copyright)
- possible restrictions of access (embargo, conditions)
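As a rough sketch, the items listed above could be collected in a structured record before deposit. The class name `DatasetRecord`, its attribute names and the sample values are hypothetical. Real repositories define their own forms and field names.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DatasetRecord:
    """Hypothetical deposit record; one attribute per item in the list above."""

    creators: list[str]
    title: str
    description: str
    date_completed: str       # ISO 8601 date, e.g. "2020-06-30"
    collection_period: str    # start and end of data collection
    geographic_area: str
    rights: str               # ownership, copyright
    access_restrictions: str  # embargo, conditions

    def missing_fields(self):
        """Return the names of the fields that were left empty."""
        return [name for name, value in asdict(self).items() if not value]


record = DatasetRecord(
    creators=["Jansen, A."],
    title="Survey on commuting habits",
    description="Responses to a questionnaire on daily commuting.",
    date_completed="2020-06-30",
    collection_period="2020-01-01/2020-03-31",
    geographic_area="Netherlands",
    rights="",  # left empty on purpose to demonstrate the check
    access_restrictions="Embargo until 2021-01-01",
)

missing = record.missing_fields()                   # fields still to fill in
record_json = json.dumps(asdict(record), indent=2)  # machine-readable form
```

Keeping such a record from the start of a project makes the deposit step largely a matter of copying values into the repository's form.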
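There are various metadata standards. One of the best-known and most widely used is Dublin Core, a general-purpose standard with fifteen basic elements such as title, creator, date and rights.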
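To give an impression of what a Dublin Core description can look like, the sketch below writes a few values as XML using the basic Dublin Core element names and namespace. The wrapper element `metadata`, the sample values and the mapping chosen here (collection period and area both to `coverage`, access conditions to `rights`) are assumptions for illustration. A repository may map its fields differently or use a more detailed schema.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # namespace of the basic element set
ET.register_namespace("dc", DC_NS)

# The fifteen elements of the basic Dublin Core element set.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def to_dublin_core(pairs):
    """Serialise (element, value) pairs as XML with dc: elements."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in pairs:
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_dublin_core([
    ("creator", "Jansen, A."),
    ("title", "Survey on commuting habits"),
    ("description", "Responses to a questionnaire on daily commuting."),
    ("date", "2020-06-30"),
    ("coverage", "2020-01-01/2020-03-31"),  # period covered
    ("coverage", "Netherlands"),            # geographical area
    ("rights", "Embargo until 2021-01-01"),
])
```

Elements may be repeated, which is why the input is a list of pairs and not a dictionary.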
Some equipment and software also assign metadata to a digital file automatically; these are called embedded metadata. Examples are the information a digital camera stores in a digital picture and the information recorded under file properties in Microsoft Office. In Microsoft Office and in software for qualitative data analysis, you can also assign your own metadata to a file.
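Embedded metadata can often be read back by a program. Files in the newer Microsoft Office formats (.docx, .xlsx, .pptx) are ZIP archives in which the file properties are conventionally stored in a part named `docProps/core.xml`. The sketch below reads the title and creator from such a part. To stay self-contained it builds a small stand-in archive in memory instead of opening a real document; the function name `read_core_properties` and the sample values are made up for the example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Stand-in for the properties part of an Office file (illustrative values).
CORE_XML = (
    '<cp:coreProperties xmlns:cp="{cp}" xmlns:dc="{dc}">'
    "<dc:title>Interview transcript 12</dc:title>"
    "<dc:creator>Jansen, A.</dc:creator>"
    "</cp:coreProperties>"
).format(**NS)


def read_core_properties(file_obj):
    """Return title and creator from docProps/core.xml in a ZIP-based file."""
    with zipfile.ZipFile(file_obj) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))
    return {
        "title": root.findtext("dc:title", default="", namespaces=NS),
        "creator": root.findtext("dc:creator", default="", namespaces=NS),
    }


# Build the stand-in archive in memory; it is not a complete Office document.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("docProps/core.xml", CORE_XML)
buffer.seek(0)

properties = read_core_properties(buffer)
```

With a real document, a path such as `"report.docx"` could be passed in place of the in-memory buffer. Because such properties travel with the file, it is worth checking them before sharing data, for example for author names in anonymised material.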
Metadata vs documentation
Assigning metadata means describing a dataset in a way that computers can read, for example to support the search function on the website of a data repository. In addition, a dataset is accompanied by documentation: information meant to be read by humans.
The distinction can be illustrated with a questionnaire survey:
- data: the answers to questions that have been posed
- metadata: the name of the maker of the questionnaire, the date on which the questionnaire was submitted, etc.
- documentation: the questionnaire, description of the method used, etc.