An identifier, most simply, is “a name that identifies either a unique object or a unique class of objects.” In computer or technical systems, these are often found in the form of bar codes, QR codes or unique alphanumeric strings such as ISSNs or URLs.
In data curation, we need identifiers to battle reference rot: if data are to serve as evidence in research and public policy, they need to remain accessible over the long term. Stable identifiers are key to this, because they allow users to reliably reference and locate materials.
Identifiers for data curation must be:
There are many different kinds of identifier schemas for use with data:
All of these have different properties:
From Duerr et al., 2011
Digital Object Identifiers, or DOIs, have recently gained widespread use with both bibliographic materials and datasets, largely thanks to their adoption and subsequent popularization by CrossRef.
DOIs are character strings. When written as a link, a DOI can be divided into three parts: the URL of the resolution service (i.e., the service that redirects the user to the current location of the digital object), a prefix that identifies the registrant (e.g., the company or individual assigning the DOI), and a suffix that identifies the object itself. The prefix and suffix are separated by a forward slash:
From the Australian National Data Service "Cite My Data" Guide
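The three-part structure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a full DOI parser: the `split_doi` function and `DoiParts` type are hypothetical names introduced here, and the example uses the DOI of the Duerr et al. (2011) article from the reference list, written against the `https://doi.org` resolver. It assumes the DOI is given as a full link and splits on the first forward slash only, since a suffix may itself contain slashes.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class DoiParts(NamedTuple):
    """Hypothetical container for the three parts of a DOI link."""
    resolver: str  # URL of the resolution service
    prefix: str    # identifies the registrant
    suffix: str    # identifies the object itself


def split_doi(doi_url: str) -> DoiParts:
    """Split a DOI written as a link into resolver, prefix, and suffix."""
    parts = urlsplit(doi_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"expected a full link, got: {doi_url!r}")
    resolver = f"{parts.scheme}://{parts.netloc}"
    # The DOI name is the path without its leading slash.
    name = parts.path.lstrip("/")
    # Only the first slash separates prefix from suffix.
    prefix, separator, suffix = name.partition("/")
    if not separator or not prefix or not suffix:
        raise ValueError(f"no prefix/suffix pair found in: {doi_url!r}")
    return DoiParts(resolver, prefix, suffix)


example = split_doi("https://doi.org/10.1007/s12145-011-0083-6")
print(example.resolver)  # resolution service
print(example.prefix)    # registrant prefix
print(example.suffix)    # object suffix
```

Because the sketch relies on ordinary URL splitting, a suffix containing characters such as `?` or `#` would need to be percent-encoded in the link before being passed in.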
Once created, a DOI cannot be renamed or deleted.
There are a number of services that can "mint" or assign DOIs for datasets, including:
EZID: Run out of the California Digital Library. Major benefits include links with CrossRef and additional metadata and citation formatting services.
DataCite: Stores metadata in its own in-house datastore, which may offer some advantages over CrossRef, such as stability and possibly better search.
Both services require institution- or project-level paid subscriptions.
The prevalence of DOI minting services means that while each identifier is unique and only points to one resource, there's nothing keeping different DOI services from minting multiple identifiers for one resource.
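One way to spot this situation in a local catalogue is to group identifiers by some fingerprint of the resource itself. The following is a minimal sketch, assuming each record pairs a DOI with a content checksum. The function name, the record shape, and every DOI and checksum in the sample data are made up for illustration and do not refer to real datasets.

```python
from collections import defaultdict


def find_multiply_identified(records):
    """Return resources that carry more than one distinct DOI.

    `records` is an iterable of (doi, fingerprint) pairs, where the
    fingerprint is any stable stand-in for the resource (here, a checksum).
    """
    dois_by_resource = defaultdict(set)
    for doi, fingerprint in records:
        # DOI names are case-insensitive, so compare them in lower case.
        dois_by_resource[fingerprint].add(doi.lower())
    return {
        fingerprint: sorted(dois)
        for fingerprint, dois in dois_by_resource.items()
        if len(dois) > 1
    }


# Hypothetical sample data: two services minted DOIs for the same file.
sample_records = [
    ("10.5555/dataset-a", "sha256:aaa"),
    ("10.1234/copy-of-a", "sha256:aaa"),
    ("10.5555/dataset-b", "sha256:bbb"),
]
print(find_multiply_identified(sample_records))
```

A checksum only matches byte-identical copies, so this approach would miss the same dataset stored in a different file format or version.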
Just as an address won’t prevent a house from being demolished or remodeled, an identifier will not prevent “content drift,” nor will it prevent data from being deleted from its hosting server. Furthermore, it's important to remember that a DOI doesn't:
For more on this, see "DOIs unambiguously and persistently identify published, trustworthy, citable online scholarly literature. Right?"
Australian National Data Service. (2011). Cite My Data Guide. Available from: http://ands.org.au/guides/cite-my-data.html
Bilder, G. (2013). DOIs unambiguously and persistently identify published, trustworthy, citable online scholarly literature. Right? CrossTech Blog. Available from: http://crosstech.crossref.org/2013/09/dois-unambiguously-and-persistently-identify-published-trustworthy-citable-online-scholarly-literature-right.html
Duerr, R. E., Downs, R. R., Tilmes, C., Barkstrom, B., Lenhardt, W. C., Glassy, J. J., Slaughter, P. (2011). On the Utility of identification schemes for digital earth science data: an assessment and recommendations. Earth Science Informatics, (4), 139–160. doi:10.1007/s12145-011-0083-6
DataCite: https://www.datacite.org/
International DOI Foundation. (2014). Chapter 2: Numbering. Available from: http://www.doi.org/doi_handbook/2_Numbering.html
Klein, M., Van de Sompel, H., Sanderson, R., Shankar, H., Balakireva, L., Zhou, K., & Tobin, R. (2014). Scholarly Context Not Found: One in Five Articles Suffers from Reference Rot. PLoS ONE, 1–39. doi:10.1371/journal.pone.0115253
Page, R. D. M. (2013). Multiple DOIs for one article issued by different publishers. iPhylo. Available from: http://iphylo.blogspot.com/2013/05/duplicate-dois-for-same-article-issued.html
Page, R. D. M. (2014). DOIs are not enough. iPhylo. Available from: http://iphylo.blogspot.com/2014/05/dois-are-not-enough.html
Starr, J., & Gastl, A. (2011). isCitedBy: A Metadata Scheme for DataCite. D-Lib Magazine, 17(12).
Van de Sompel, H., Sanderson, R., Shankar, H., & Klein, M. (2014). Persistent Identifiers for Scholarly Assets and the Web: The Need for an Unambiguous Mapping. International Journal of Digital Curation, 9(1), 331-342.