Manage Your Research Data: Overview
Introduction
What is Data Management?
Data Management is the process by which researchers plan to collect, store, archive, and ultimately share their research data. Many questions related to Data Management have long been issues researchers are trained to address through the course of their work:
- What data are collected and created?
- How is the data created or collected?
- What supplemental documentation is needed to understand the data?
- Are there privacy issues associated with your data collection?
- Are there legal issues associated with your data?
- How will your data be stored and backed up during the project?
- How will you ensure data security?
- How much of the collected data will be retained and shared when the project is concluded?
- What is the long-term preservation plan?
- How will the final data be shared?
Answers to these questions will vary by discipline and project, though there are general best practices.
The most useful resource for answering these questions as they relate to individual projects is the DMPTool.
Definitions
Data Management guidelines were written to apply across all disciplines. As a result, the definitions and terminology used are often extremely broad. While potentially frustrating, this ambiguity is meant to allow flexibility for researchers as they write a Data Management Plan. Below are some common terms and definitions:
Data
Data refers to any information created during the course of a research project that is needed to validate or recreate the final results of the study. This can include, but is not limited to: test results, statistics, code, images, computer files, survey responses, transcripts, recordings, laboratory logs, or algorithms.
Live Data
Live data is data currently being created, manipulated, or used for an ongoing research project.
Storage & Stored Data
Short- or long-term storage of active data. This may be on OIT Research Computing data storage devices, on local hard drives, or in the cloud.
Archived Data
Archived data is data that is no longer being altered or manipulated, or that has served as final research data for a grant or published study. Archived data is stored in a secure and permanent system and remains accessible to researchers.
Shared
Data that is made publicly accessible through data repositories like PDXScholar.
Final Research Data
This is data generated during a project that is needed to validate or recreate the results and conclusion of the completed study. The scope of “Final Research Data” may vary between projects, and it is the responsibility of the Principal Investigator to determine and justify the scope of their Final Research Data within their Data Management Plan.
Principal Investigator
The primary researcher responsible for managing the research data. The Principal Investigator may also be the researcher tasked with overseeing a laboratory, or the lead team member responsible for overseeing data collection and creation.
Data Management Plan
The document drafted by the Principal Investigator that outlines data creation, preservation, and sharing policies. Fundamental issues to be addressed in each plan must include: data collection methods, documentation and metadata, ethics and legal compliance, storage and backup policies, data sharing, and data management responsibilities.
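As an illustration only, the list of issues above can be treated as a checklist when reviewing a draft plan. The sketch below is not part of the DMPTool or of any funder's requirements; the section names are copied from the definition above, and the `missing_sections` helper and the `draft` dictionary are hypothetical.

```python
# Section names copied from the Data Management Plan definition above.
# Treating them as dictionary keys is an assumption made for this sketch.
REQUIRED_SECTIONS = [
    "data collection methods",
    "documentation and metadata",
    "ethics and legal compliance",
    "storage and backup policies",
    "data sharing",
    "data management responsibilities",
]


def missing_sections(plan):
    """Return the required sections that are absent or left blank in a draft plan."""
    return [
        name
        for name in REQUIRED_SECTIONS
        if not plan.get(name, "").strip()
    ]


# Hypothetical draft: three sections started, one of them still blank.
draft = {
    "data collection methods": "Online survey of 200 participants.",
    "storage and backup policies": "Nightly backup to university storage.",
    "data sharing": "   ",  # whitespace-only entries count as missing
}

print(missing_sections(draft))
```

A check like this only confirms that each topic has been addressed; whether the content satisfies a particular funder still depends on that funder's guidance.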
Open Data vs Public Data
Public data is publicly available upon request, while open data is immediately and freely available without an intermediary. Data produced by the National Center for Education Statistics (NCES) that must be requested is "public" while data that can be immediately downloaded from the NCES website is "open." Similarly, if researchers need to directly contact a PI for access to research data, this does not meet open data sharing requirements; conversely, depositing data in an open repository like PDXScholar, which provides 24/7 immediate access to content, does count as open data.
Quick Reference
- DMPTool: Provides a step-by-step guide to creating a Data Management Plan that meets federal requirements.
- SPARC "Data Sharing Requirements by Federal Agency": Outlines data sharing requirements for: AHRQ, ASPA, CDC, D. Defense, D. Education, D. Energy, D. Transportation, EPA, FDA, Homeland Security, NASA, NIH, NIST, NOAA, NSF, USAID, USDA, and USGS.
- Portland State University Research Data Guidebook: This guidebook, compiled by Prof. Kimberly Pendell, assists Portland State faculty and students with the proper care and management of their research data by gathering together the University’s infrastructure, training, and recommended best practices.
Further Readings
- Data management: A gentle introduction (ebook) (ISBN: 9401805504; Publication Date: 2020). The overall objective of this book is to show that data management is an exciting and valuable capability that is worth time and effort.
- Data Management for Researchers (Call Number: PSU Library Shelves -- 5th floor Q180.55.E4 B75 2015; ISBN: 9781784270131; Publication Date: 2015-09-01). A comprehensive guide for scientific researchers providing everything they need to know about data management and how to organize, document, use and reuse their data.--Source other than the Library of Congress.
- Engaging researchers with data management (ebook) (Call Number: E-Book; ISBN: 9781783747986; Publication Date: 2019-10-01). A collection of 24 case studies, drawn from institutions across the globe, that demonstrate clearly and practically how to engage the research community with RDM.
- Data Management and Data Description (ISBN: 9781857420388; Publication Date: 2020). "The author sets out the main issues in Data Management, from the first principles of meta modelling and data description through the comprehensive management exploitation, re-use, valuation, extension and enhancement of data as a valuable organizational resource."
- Data stewardship: An actionable guide to effective data management and data governance (Call Number: QA76.9.A25 P594 2021; ISBN: 9780128221327; Publication Date: 2020).
- Teaching Research Data Management (Call Number: PSU Library Shelves ZA4065 .T43 2022; ISBN: 9780838937976; Publication Date: 2022-01-03). "This collection gathers practitioners from a broad range of academic libraries to describe their services and instruction around research data. You will learn about such topics as integrating research data management into information literacy instruction; threshold concepts for novice learners of data management; designing a data management workshop series; and key competencies that are entry points for library-faculty collaboration in data instruction"--Provided by publisher.
- Big data management: Data governance principles for big data analytics (ISBN: 9783110662917; Publication Date: 2020). "Data analytics is core to business and decision making. The rapid increase in data volume, velocity and variety offers both opportunities and challenges....Big Data Management discusses numerous policies, strategies and recipes for managing big data. The book is a must-read for data scientists, data engineers and corporate leaders who are implementing big data platforms in their organizations."
Guide Editors
This guide is edited by Julia Stone, Open Scholarship Librarian, and Kimberly Pendell, Research & Instruction Services Manager, Social Work and Social Sciences Librarian.
If you have questions, please contact: lib-data-management@pdx.edu