A lifecycle approach to data curation is needed because curatorial activities (or a lack thereof) at each stage of the lifecycle directly influence practitioners' ability to preserve data and later researchers' ability to reuse it. It is therefore important to identify the specific curatorial actions that should or can be undertaken at each stage.
The DCC has created checklists of tasks to accomplish for each of the "sequential" steps in their lifecycle. For instance:
Conceptualize
- Get into the habit of equating data curation with good research.
- Know what your funding body expects you to do with your data and for how long. Assess your ability to be able to meet these expectations (i.e., do you need additional funding or staff?).
- Determine intellectual property rights from the outset and ensure they are documented
- Identify any anticipated publication requirements (embargoes, restrictions on publishing over multiple sites).
The tasks outlined in the checklists are intentionally broad -- this makes them applicable to a wide range of data types and institutional setups.
The DCC has also created numerous checklists for other curatorial tasks, for instance "Five steps to decide what data to keep: a checklist for appraising research data" (DCC, 2014).
The Curation Profiles Project (CPP) was an IMLS-funded research project between Purdue University Libraries and GSLIS; the overall goal was to understand which researchers would be willing to share data, when, and why, via detailed case studies of different researchers' data practices. The CPP team conducted case studies of 12 different research domains, and from these, developed Data Curation Profiles:
A Data Curation Profile is essentially an outline of the “story” of a data set or collection, describing its origin and lifecycle within a research project.
(From http://datacurationprofiles.org/)
Data Curation Profiles can:
- provide a guide for discussing data with researchers
- give insight into areas of attention in data management
- help assess information needs related to data collections
- give insight into differences between data in various disciplines
- help identify possible data services
- create a starting point for curating a data set for archiving and preservation
From these profiles, researchers at Purdue went on to develop the Data Curation Profile Toolkit for use by librarians and data curators, to help guide conversations between them and researchers. The Toolkit itself is essentially an interview guide -- similar to a reference interview, in some ways, but with the goal of helping the librarian understand a dataset's curatorial needs, rather than a patron's research needs. We will work with these further in today's lab.
Increasingly, US federal funding agencies such as the National Science Foundation (NSF) and the National Institutes of Health (NIH), as well as private funders, are requiring that researchers submit a data management plan along with their funding applications. For instance, the NSF writes,
Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing... Proposals submitted or due on or after January 18, 2011, must include a supplementary document of no more than two pages labeled “Data Management Plan”. This supplementary document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results.
Even if researchers aren't required to create a data management plan, creating one can be helpful both for the researchers themselves and for the data curators who will eventually be tasked with caring for the data. Creating a management plan at the beginning of research can help ensure that important data objects are preserved throughout the research lifecycle -- not accidentally discarded or lost.
The California Digital Library has created the Data Management Planning (DMP) Tool, which we will use in today's lab to create a DMP. This tool walks users through templates for different institutions and funding agencies, and allows users to create their own templates as well.
Curation Profiles Project website: http://datacurationprofiles.org/
DCC (no date). Checklist for conceptualisation. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/sites/default/files/Conceptualisation%20Checklist.pdf
DCC (2014). 'Five steps to decide what data to keep: a checklist for appraising research data v.1'. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/how-guides
Curation Profiles Project
Cragin, M. H., Palmer, C. L., Carlson, J. R., & Witt, M. (2010). Data sharing, small science and institutional repositories. Philosophical Transactions of the Royal Society A, 368(1926), 4023–4038. doi:10.1098/rsta.2010.0165
Witt, M., Carlson, J., Brandt, D. S., & Cragin, M. H. (2009). Constructing Data Curation Profiles. International Journal of Digital Curation, 4(3). doi:10.2218/ijdc.v4i3.117
The Data Curation Profiles Directory - http://docs.lib.purdue.edu/dcp/
DMP Tool
Sallans, A., & Donnelly, M. (2012). DMP Online and DMPTool: Different strategies towards a shared goal. International Journal of Digital Curation, 7(2), 123-129.
Mallery, M. (2014). DMPTool: Guidance and Resources for Your Data Management Plan; https://dmp.cdlib.org. Technical Services Quarterly, 31(2), 197-199.
Shreeves, S. L. (2014). Presenting the New and Improved DMPTool (presentation).
The DMP Tool website: https://dmptool.org/