Manage Your Research Data: Overview
On This Page
Introduction
What is Data Management?
What is a Data Management Plan?
Definitions
Defines important terms like "Final Data," "Live Data," "Archived Data," "Open Data," and "Public Data."
Further Readings
Provides additional suggested readings on Data and Data Management.
Introduction
What is Data Management?
Data Management is the process by which researchers plan to collect, store, archive, and ultimately share their research data. Many Data Management questions are ones that researchers have long been trained to address in the course of their work:
What data is collected or created?
How is the data created or collected?
What supplemental documentation is needed to understand the data?
Are there privacy issues associated with your data collection?
Are there legal issues associated with your data?
How will your data be stored and backed up during the project?
How will you ensure data security?
How much of the collected data will be retained and shared when the project is concluded?
What is the long-term preservation plan?
How will the final data be shared?
Answers to these questions will vary by discipline and project, though there are general best practices.
The most useful resource for answering these questions as they relate to individual projects is the DMPTool.
In addition to this guide, the Library also provides DMP workshops.
What is a Data Management Plan?
A Data Management Plan (DMP) is a document, typically a page in length, that explicitly addresses the questions above as they relate to a specific research project. The content of a DMP will vary by discipline, funding agency, and project. Some DMPs will be no more than a paragraph explaining that no data was collected or shared, while a few may run to two pages to explain why specific data is not being shared (usually for legal or privacy reasons). Pragmatically, a successful DMP should at least meet the basic requirements for the grant (these guidelines will be spelled out in the grant or by the granting agency).
Data Management requirements are constantly changing, making it extremely difficult to provide a definitive "how to" guide for drafting a DMP. For up-to-date guidance relating to your specific topic, consult:
- The instructions in your grant application.
- The standards suggested by your professional association (e.g., Data Management recommendations via the American Psychological Association).
- The standards suggested by the federal agency funding projects in your field (e.g., the NSF Data Sharing Policy).
Examples
Data Management requirements and individual DMPs vary, making it difficult to provide sample DMPs that are both current and representative. The University of Arizona and Stanford University, however, do provide examples (below).
Please note that agency requirements are constantly changing, so a DMP from even a few years ago may not be an ideal template.
The best resources to ensure your DMP is accurate are guidelines within your grant, or those posted on the funding agency's website.
The DMPTool is also constantly updated, and is your best resource for drafting a Data Management Plan.
Definitions
Data Management guidelines were written to apply across all disciplines. As a result, the definitions and terminology used are often extremely broad. While potentially frustrating, this ambiguity is meant to allow flexibility for researchers as they write a Data Management Plan. Below are some common terms and definitions:
Data
Data is any information created during the course of a research project that is needed to validate or recreate the final results of the study. This can include, but is not limited to: test results, statistics, code, images, computer files, survey responses, transcripts, recordings, laboratory logs, or algorithms.
Live Data
Live data is data currently being created, manipulated, or used for an ongoing research project.
Storage & Stored Data
Short- or long-term storage of active data. This may be on OIT Research Computing data storage devices, local hard drives, or in the cloud.
Archived Data
Archived Data is data that is no longer being altered or manipulated, or that has served as final research data for a grant or published study. Archived data is stored in a secure and permanent system, and is accessible to researchers.
Shared
Data that is made publicly accessible through data repositories like PDXScholar.
Final Research Data
This is data generated during a project that is needed to validate or recreate the results and conclusion of the completed study. The scope of “Final Research Data” may vary between projects, and it is the responsibility of the Principal Investigator to determine and justify the scope of their Final Research Data within their Data Management Plan.
Principal Investigator
The primary researcher responsible for managing the research data. The Principal Investigator may also be the researcher tasked with overseeing a laboratory, or the lead team member responsible for overseeing data collection and creation.
Data Management Plan
The document drafted by the Principal Investigator, through which data creation, preservation, and sharing policies are outlined. Each plan must address these fundamental issues: data collection methods, documentation and metadata, ethics and legal compliance, storage and backup policies, data sharing, and data management responsibilities.
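For researchers who keep a draft DMP as a list of section headings, the fundamental issues named in this definition can be treated as a simple checklist. The following Python sketch is illustrative only: the `REQUIRED_SECTIONS` list is copied from the definition above, while the `missing_sections` function and the sample draft are hypothetical and not part of the DMPTool or any funder's requirements.

```python
# Hypothetical checklist helper; the section names come from the
# Data Management Plan definition in this guide, not from any funder.
REQUIRED_SECTIONS = [
    "data collection methods",
    "documentation and metadata",
    "ethics and legal compliance",
    "storage and backup policies",
    "data sharing",
    "data management responsibilities",
]


def missing_sections(draft_headings):
    """Return the required sections that do not appear among the draft's headings."""
    # Compare case-insensitively and ignore surrounding whitespace.
    covered = {heading.strip().lower() for heading in draft_headings}
    return [section for section in REQUIRED_SECTIONS if section not in covered]


# Example draft (assumed for illustration) that covers only three sections.
draft = ["Data Collection Methods", "Storage and Backup Policies", "Data Sharing"]
for section in missing_sections(draft):
    print("Not yet addressed:", section)
```

A check like this only confirms that each topic has a heading; whether the content satisfies a specific grant still depends on the guidelines from the funding agency.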
Open Data vs Public Data
Public data is publicly available upon request, while open data is immediately and freely available without an intermediary. Data produced by the National Center for Education Statistics (NCES) that must be requested is "public" while data that can be immediately downloaded from the NCES website is "open." Similarly, if researchers need to directly contact a PI for access to research data, this does not meet open data sharing requirements; conversely, depositing data in an open repository like PDXScholar, which provides 24/7 immediate access to content, does count as open data.
Further Readings
- (E-Book) Big Data Management: Data Governance Principles for Big Data Analytics
Call Number: E-Book; ISBN: 9783110662917; Publication Date: 2020-11-09
"Data analytics is core to business and decision making. The rapid increase in data volume, velocity and variety offers both opportunities and challenges. While open source solutions to store big data, like Hadoop, offer platforms for exploring value and insight from big data, they were not originally developed with data security and governance in mind. Big Data Management discusses numerous policies, strategies and recipes for managing big data. It addresses data security, privacy, controls and life cycle management offering modern principles and open source architectures for successful governance of big data. The author has collected best practices from the world's leading organizations that have successfully implemented big data platforms. The topics discussed cover the entire data management life cycle, data quality, data stewardship, regulatory considerations, data council, architectural and operational models are presented for successful management of big data. The book is a must-read for data scientists, data engineers and corporate leaders who are implementing big data platforms in their organizations."
- (E-Book) Data Management: a Gentle Introduction
Call Number: E-Book; ISBN: 9401805504; Publication Date: 2020-03-03
"The overall objective of this book is to show that data management is an exciting and valuable capability that is worth time and effort. More specifically it aims to achieve the following goals:
1. To give a “gentle” introduction to the field of DM by explaining and illustrating its core concepts, based on a mix of theory, practical frameworks such as TOGAF, ArchiMate, and DMBOK, as well as results from real-world assignments.
2. To offer guidance on how to build an effective DM capability in an organization. This is illustrated by various use cases, linked to the previously mentioned theoretical exploration as well as the stories of practitioners in the field.
The primary target groups are: busy professionals who “are actively involved with managing data”. The book is also aimed at (Bachelor’s/ Master’s) students with an interest in data management. The book is industry-agnostic and should be applicable in different industries such as government, finance, telecommunications etc."
- Data Management and Data Description
ISBN: 9781857420388; Publication Date: 2020
"The author sets out the main issues in Data Management, from the first principles of meta modelling and data description through the comprehensive management exploitation, re-use, valuation, extension and enhancement of data as a valuable organizational resource."
- Data Management for Researchers
Call Number: PSU Library Shelves -- 5th floor Q180.55.E4 B75 2015; ISBN: 9781784270131; Publication Date: 2015-09-01
"A comprehensive guide for scientific researchers providing everything they need to know about data management and how to organize, document, use and reuse their data."
- (E-Book) Engaging Researchers with Data Management
Call Number: E-Book; ISBN: 9781783747986; Publication Date: 2019-10-01
"Engaging Researchers with Data Management is an invaluable collection of 24 case studies, drawn from institutions across the globe, that demonstrate clearly and practically how to engage the research community with RDM. These case studies together illustrate the variety of innovative strategies research institutions have developed to engage with their researchers about managing research data. Each study is presented concisely and clearly, highlighting the essential ingredients that led to its success and challenges encountered along the way. By interviewing key staff about their experiences and the organisational context, the authors of this book have created an essential resource for organisations looking to increase engagement with their research communities. This handbook is a collaboration by research institutions, for research institutions. It aims not only to inspire and engage, but also to help drive cultural change towards better data management. It has been written for anyone interested in RDM, or simply, good research practice."
Guide Editor
Rick Mikulski, Government Documents & Social Sciences Librarian.
If you have questions, please contact: lib-data-management@pdx.edu