Manage Your Research Data: Data Archives & Data Sharing

This guide provides a primer on the fundamentals of data management.

Archiving Data at Portland State University

"Archived data" is material you have collected throughout the research process that you want to share when the project is concluded.
Most grants call this "final research data," which PIs share with colleagues to facilitate future research projects. Once data is archived, it is static and cannot be changed.

Live or active data, which is still being collected or manipulated throughout a project's life cycle, should not be archived. For live data storage, contact PSU's Office of Information Technology.

If you choose to archive your data in PDXScholar, the following is an example description of PDXScholar for your data management plan. Copy and edit as needed.

PDXScholar is Portland State University’s institutional repository that supports long-term data storage and access. Data will be archived in perpetuity at Portland State University’s institutional repository, PDXScholar. The data will be available [upon creation/upon conclusion of the grant/after some embargo period] in accordance with the rights policies outlined [elsewhere]. Primary responsibility for long-term preservation of the data rests with the Digital Initiatives Unit at Portland State University.

Data Repositories

Below is a sample of data repositories in which researchers can deposit their data.
Please note that grants or journals may require PIs to deposit their data into a specific repository. 

FAIR Data

The ultimate goal of metadata is to make data findable, accessible, interoperable, and reusable ("FAIR Data"). The defining principles of FAIR data are:
Findable
F1. (Meta)data are assigned a globally unique and persistent identifier
F2. Data are described with rich metadata (defined by R1 below)
F3. Metadata clearly and explicitly include the identifier of the data they describe
F4. (Meta)data are registered or indexed in a searchable resource
Accessible
A1. (Meta)data are retrievable by their identifier using a standardised communications protocol
A1.1 The protocol is open, free, and universally implementable
A1.2 The protocol allows for an authentication and authorisation procedure, where necessary
A2. Metadata are accessible, even when the data are no longer available
Interoperable
I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2. (Meta)data use vocabularies that follow FAIR principles
I3. (Meta)data include qualified references to other (meta)data
Reusable
R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (Meta)data are released with a clear and accessible data usage license
R1.2. (Meta)data are associated with detailed provenance
R1.3. (Meta)data meet domain-relevant community standards
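
As a rough illustration of how a few of these principles can be checked before deposit, the sketch below looks for empty or missing fields in a metadata record. The field names (identifier, description, data_identifier, license, provenance) and their mapping to F1, F2, F3, R1.1, and R1.2 are assumptions made for this example; they are not a PDXScholar schema or part of any formal FAIR specification, and a non-empty field does not by itself make data FAIR.

```python
# Hypothetical field names mapped to the FAIR principle each one loosely supports.
# This mapping is an assumption for illustration, not an official schema.
REQUIRED_FIELDS = {
    "identifier": "F1",        # globally unique, persistent identifier
    "description": "F2",       # rich descriptive metadata
    "data_identifier": "F3",   # identifier of the data the metadata describe
    "license": "R1.1",         # clear data usage license
    "provenance": "R1.2",      # how the data were produced
}


def missing_fair_fields(metadata):
    """Return (principle, field) pairs for fields that are absent or blank."""
    missing = []
    for field, principle in REQUIRED_FIELDS.items():
        value = metadata.get(field)
        if value is None or not str(value).strip():
            missing.append((principle, field))
    return missing


if __name__ == "__main__":
    # Placeholder values only; replace with your own record.
    record = {
        "identifier": "placeholder-identifier",
        "description": "Survey responses collected for an example project.",
        "license": "",
    }
    for principle, field in missing_fair_fields(record):
        print(f"{principle}: '{field}' is missing or empty")
```

A check like this only flags gaps; whether the content of each field is adequate still needs human review.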

Data Repository Directories

Final Steps Before Sharing Data

Before sharing data, consider the following:

  • Are there any legal restrictions? 
  • Does your funder have sharing restrictions?
  • Have sensitive data or personal identifiers of subjects been redacted?
  • Do you have the legal rights to share data?

Privacy Considerations:

  • Anonymized data: Data in which identifying information has been removed, substituted, aggregated, or generalized.
  • Identifiers: Variables that could be used to identify individuals or groups in the study.
  • Direct identifiers are information that can directly identify participants. These may include:
    • Name, initials, address, contact information, social security numbers, vehicle registration data, medical data, device information, IP address, photographs, recordings, family information, date of birth.
  • Indirect identifiers are information that can identify an individual or group when combined with other data. These may include:
    • Gender, place of birth, rare medical condition, place of treatment, zip or area code, socioeconomic data (ex. workplace, income, education institution), ethnicity, age, geographic indicators, transcript records.
  • Restricted data: Data in the study that may be shared, but only under secure conditions.
  • Sensitive data: Data that can be used to identify a group or individual.
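
The definitions above suggest two basic anonymization steps: removing direct identifiers and generalizing indirect ones. The following is a minimal sketch of both, assuming records are held as Python dictionaries. The column names (name, address, ssn, ip_address, date_of_birth, age, zip) and the choices of ten-year age bands and three-digit zip prefixes are hypothetical; the right fields and level of generalization depend on your study and on any IRB or funder requirements.

```python
# Hypothetical column names; adjust to match your own dataset.
DIRECT_IDENTIFIERS = {"name", "address", "ssn", "ip_address", "date_of_birth"}


def age_band(age, width=10):
    """Generalize an exact age into a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def zip_prefix(zip_code, keep=3):
    """Keep only the leading digits of a zip code and mask the rest."""
    return zip_code[:keep] + "*" * (len(zip_code) - keep)


# Indirect identifiers and the function used to generalize each one.
GENERALIZERS = {"age": age_band, "zip": zip_prefix}


def anonymize(records, direct=DIRECT_IDENTIFIERS, generalizers=GENERALIZERS):
    """Drop direct identifiers and generalize indirect ones in each record."""
    cleaned = []
    for record in records:
        row = {}
        for field, value in record.items():
            if field in direct:
                continue  # remove direct identifiers entirely
            if field in generalizers:
                value = generalizers[field](value)
            row[field] = value
        cleaned.append(row)
    return cleaned


if __name__ == "__main__":
    sample = [{"name": "Example Person", "age": 34, "zip": "97201", "score": 5}]
    print(anonymize(sample))
```

Removing and generalizing fields lowers the risk of re-identification but does not guarantee anonymity, since combinations of the remaining indirect identifiers may still single out an individual.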

Metadata and Documentation:

Before sharing the project, be sure to include appropriate metadata, which is information and documentation about the data itself. Metadata includes codebooks, README files, and supporting documentation. See the guide for a full discussion of metadata and documentation.
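
A codebook can be started from the data file itself. The sketch below, which assumes the data is available as CSV text, lists each variable with a count of empty values and one example value, and leaves a placeholder for the description that the researcher must write. The function name build_codebook and the output fields are hypothetical choices for this example, not a required codebook format.

```python
import csv
import io


def build_codebook(csv_text):
    """Return one codebook entry per CSV column, with a description to fill in."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    codebook = []
    for column in reader.fieldnames or []:
        values = [row[column] for row in rows]
        present = [v for v in values if v not in ("", None)]
        codebook.append({
            "variable": column,
            "description": "TODO: describe this variable",
            "missing": len(values) - len(present),   # count of empty cells
            "example": present[0] if present else "",
        })
    return codebook


if __name__ == "__main__":
    # Tiny made-up dataset used only to show the output shape.
    sample = "id,age\n1,34\n2,\n"
    for entry in build_codebook(sample):
        print(entry)
```

The generated entries are only a starting point: units, coding schemes, and collection methods still need to be documented by hand.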