Metadata is information about the context, content, quality, provenance, and/or accessibility of data. It is the critical information for ensuring the longevity and reproducibility of research data.
Metadata can exist in a variety of different formats. Some of the most common ones are summarized below:
Discipline | Definition | Example |
---|---|---|
Biodiversity | Darwin Core (DwC) is a standard designed to facilitate the exchange of information about the geographic occurrence of species and the existence of specimens in collections. | Example |
Geospatial | Geospatial metadata commonly document geographic digital data such as Geographic Information System (GIS) files, geospatial databases, and earth imagery but can also be used to document geospatial resources including data catalogs, mapping applications, data models and related websites. | Example |
Social Science | Data Documentation Initiative (DDI) is an effort to create an international standard for describing data from the social, behavioral, and economic sciences. Expressed in XML, the DDI metadata specification now supports the entire research data life cycle. | Example |
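To make the idea of a discipline-specific standard more concrete, here is a minimal sketch of a single species occurrence described with Darwin Core-style field names and written out as CSV. The record values are invented for illustration, and the field names are meant to follow Darwin Core term names; check them against the current standard and your repository's guidelines before relying on them.

```python
import csv
import io

# Illustrative occurrence record. All values are invented for this sketch.
# The keys are intended to match Darwin Core term names; verify them
# against the standard before using them for a real deposit.
record = {
    "occurrenceID": "urn:example:occurrence:0001",
    "basisOfRecord": "HumanObservation",
    "scientificName": "Danaus plexippus",
    "eventDate": "2021-06-15",
    "decimalLatitude": "43.0731",
    "decimalLongitude": "-89.4012",
    "countryCode": "US",
}


def to_csv(rows):
    """Serialize a list of record dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv([record])
print(csv_text)
```

Because every record uses the same shared field names, files produced this way by different projects can be combined or searched together, which is the main benefit of adopting a standard rather than inventing column headings.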
If you are uncertain which metadata standards are in use in your discipline, the Digital Curation Centre maintains a list of commonly used metadata standards organized by discipline. If you intend to deposit your data in a data repository, that repository may have guidelines on which metadata standard(s) should be used to describe deposited data.
A controlled vocabulary is a collection of preferred terms used to describe and retrieve content consistently. Only predefined, authorized terms may be used, in contrast to free-form tags or keywords, which are uncontrolled and therefore often ambiguous and inconsistent. Taxonomies, thesauri, and ontologies are types of controlled vocabulary.
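The difference between free keywords and a controlled vocabulary can be sketched in a few lines of code. The example below assumes a tiny, hypothetical vocabulary (loosely styled after subject headings, not taken from any actual vocabulary file) that maps each preferred term to the variants it should replace. The names `CONTROLLED_VOCABULARY`, `build_lookup`, and `normalize_keywords` are made up for this illustration.

```python
# Hypothetical controlled vocabulary: preferred term -> accepted variants.
# The terms are illustrative only and not copied from a real vocabulary.
CONTROLLED_VOCABULARY = {
    "Neoplasms": {"cancer", "tumor", "tumour", "neoplasm"},
    "Myocardial Infarction": {"heart attack", "mi"},
}


def build_lookup(vocabulary):
    """Map every lowercase preferred term and variant to its preferred term."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        lookup[preferred.lower()] = preferred
        for variant in variants:
            lookup[variant.lower()] = preferred
    return lookup


def normalize_keywords(keywords, vocabulary):
    """Split free keywords into preferred terms and unrecognized keywords."""
    lookup = build_lookup(vocabulary)
    accepted, rejected = [], []
    for keyword in keywords:
        preferred = lookup.get(keyword.strip().lower())
        if preferred is None:
            rejected.append(keyword)      # not in the vocabulary
        elif preferred not in accepted:
            accepted.append(preferred)    # collapse variants into one term
    return accepted, rejected


accepted, rejected = normalize_keywords(
    ["Tumour", "cancer", "heart attack", "wellness"], CONTROLLED_VOCABULARY
)
print(accepted)  # ['Neoplasms', 'Myocardial Infarction']
print(rejected)  # ['wellness']
```

Here "Tumour" and "cancer" collapse into a single preferred term, so a search on that term retrieves both records, while "wellness" is flagged for review instead of being silently accepted.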
In some fields, vocabularies are well established; in other disciplines, they are still emerging. You may want to check professional societies and journals for vocabularies that have been developed in your disciplinary area. The list below is a starting point.
Disciplinary area | Example |
---|---|
Life Science | BioPortal: biomedical vocabularies from the NIH National Centers for Biomedical Computing. |
Geospatial | Geographic Names Information System (GNIS): developed by the USGS in cooperation with the U.S. Board on Geographic Names, it contains information about physical and cultural geographic features in the United States and associated areas, both current and historical (not including roads and highways). The database holds the federally recognized name of each feature and defines the location of the feature by state, county, USGS topographic map, and geographic coordinates. |
Medical | Medical Subject Headings (MeSH) is a controlled vocabulary used to index journal articles and books in the life sciences. It is created and updated by the US National Library of Medicine. |
Agriculture | The Agricultural Thesaurus is an online vocabulary tool of agricultural terms in English and Spanish, cooperatively produced by the National Agricultural Library, USDA, and the Inter-American Institute for Cooperation on Agriculture. |
Biodiversity | Biocomplexity Thesaurus displays terminologies and term relationships in the fields of biology, ecology, environmental sciences, and sustainability. |
Humanities | Music Ontology provides main concepts and properties for describing music (artists, albums, tracks, arrangements). |
Parts of this guide were borrowed and/or adapted from resources from the University of Wisconsin-Madison and the University of Nebraska-Lincoln. Thanks for sharing!