Skilling Up for Data Curation Infographic

My latest infographic, Skilling Up for Data Curation, created with Piktochart, examines the skills and tools I’ll need for data curation at my campus.  The infographic was used for a poster session on the topic at the Fall 2013 conference of the Western New York/Ontario Chapter of the Association of College and Research Libraries (ACRL).

There has been a lot of discussion over the last several years about the role that libraries should play in data curation efforts at their institutions.  Technical advances have made it possible to create larger and larger amounts of information/data/research/scholarship.  How best to manage and preserve this influx is under debate, especially given the challenges: the sheer volume, different media types, intellectual property issues, obsolescence of formats/software and lack of metadata, to name a few…


DCC Curation Lifecycle Model

What we do know is that data curation must be a collaborative effort between librarians and data creators for two important reasons: to have the metadata necessary for curation later in the data lifecycle, and to educate data creators about the need for standardization of metadata.  Consistent standards used by researchers within a discipline, or better yet across disciplines, would make it possible to automate some (perhaps all) of the curation process and to add smaller datasets to the corpus of curated data beyond our own small institution for reuse by others, attaining an even greater return on investment.

Digital curation is far removed from the institutional repository of the past.  Reappraisal and providing access in ways (and formats) that allow the data to be readily reused are key, so that our digital collections don’t wind up looking like an old attic where we’ve abandoned our institutional data.

Through a series of discussions with faculty on our campus this summer, we found that, as yet, there is no great demand for data curation with respect to faculty research.  However, we have begun developing skill sets in this area so we’ll be prepared with technology and infrastructure options in anticipation of future needs.

During these discussions, we found several areas where the library could serve immediately:

  • Provide assistance locating discipline-specific repositories for finding and publishing research data.
  • Provide instruction or workshops for undergraduate students to improve skills in managing both laboratory data and their own data.
  • Provide assistance in developing data management plans on funding applications.
  • Identify faculty work or research projects that could/should be digitally curated.

Education and ongoing discussion with faculty about developing standards for metadata, and about the opportunities and benefits of open data sharing, will be key.  We also have the opportunity to be selective about the projects we pursue and to pace our digital initiatives in ways that are practicable in terms of resources, time and funding.

Bird, C., Willoughby, C., Coles, S., and Frey, J. (2013). Data curation: Issues in the chemical sciences. Information Standards Quarterly, 25(3): 7-12.

Digital Curation Centre. (n.d.). Data curation lifecycle model.

DataCite: Helping you find, access, and reuse data. (n.d.). Why cite data?

Lee, C. A., Tibbo, H., & Schaefer, J. C. (2007a). DigCCurr: Building an international digital curation curriculum & the Carolina Digital Curation Fellowship Program.

Schirrwagen, J., Manghi, P., Manola, N., Bolikowski, L., Rettberg, N., and Schmidt, B. (2013). Data curation in the OpenAIRE scholarly communication infrastructure. Information Standards Quarterly, 25(3): 13-19.

Smith, K. (2009). All universities should have an institutional repository. Bulletin of the American Society for Information Science and Technology (Online), 35(4), 11-31.

The roles of data in publishing and scholarship; musing on research attribution

A few weeks ago, I had the opportunity to listen in on portions of the ORCID/Dryad symposium on research attribution being held in Oxford. Something that really stood out for me was David De Roure’s description of the future of scholarship as “an ecosystem of interacting scholarly social machines” where the case can be made that social tech like Twitter becomes an “infrastructure and every hash tag is a social machine” for scholarship. What an exciting time to be involved in developing/contributing to this new ecosystem.

When it comes to data citation/altmetric tracking and attribution, there are so many things that still have to be hashed out. Christine Borgman, who serves on the CODATA-ICSTI task group on data citation standards and practices, pointed out that we already have a broken model for citation metrics just for article publication, let alone standards that include data, software and other less traditional forms of scholarly output. She posed many questions that need answers quickly. For example, the many contributors involved in data often want attribution, but which of those roles deserve it, along with the responsibility/accountability (either social or legal) that comes with it? Publication of these nontraditional outputs will help with both attribution and accountability, but will also create a whole new set of problems, questions, and needs for standardization.  It’ll also be interesting to see where the library/librarian’s role in assisting with the curation of data falls with regard to questions of attribution…

Why we cite will also drive the shape of this evolving model, particularly decisions about whether more emphasis will be placed on creating mechanisms for linking, discovering and REUSING these outputs (including the transformation of data as it is reused) than on the idea of simply giving credit.  I’m hoping it’s more the former than the latter, though I completely understand the importance and need for attribution.

This summer, some of my colleagues on campus and I are dedicating a bit of time to exploring this ecosystem and the future that SUNY Geneseo might have in it.  I have no doubt that it will be an interesting process.