Manage Your Research Data: Data Archives & Data Sharing
Archiving Data at Portland State University
"Archived Data" is material you have collected throughout the research process that you want to share when the project is concluded.
Most grants will call this "final research data," which PIs will share with colleagues to facilitate future research projects. Once data is archived, it is static and cannot be changed.
Live or active data, which is still being collected or manipulated throughout a project's life cycle, should not be archived. For live data storage, contact PSU's Office of Information Technology.
- Storage of live & active research data: For storage of live/active data, please contact the Office of Information Technology. For long-term data storage at PSU, please see the Data Storage link below.
- Data Archive at PSU (PDXScholar): PDXScholar is Portland State University’s institutional repository. Features:
*No fees
*No file size limit
*No type/format limits
*Amazon S3 backup
*DOI – permanent, unique identifier
If you choose to archive your data in PDXScholar, the following is an example description of PDXScholar for your data management plan. Copy and edit as needed.
PDXScholar is Portland State University’s institutional repository that supports long-term data storage and access. Data will be archived in perpetuity at Portland State University’s institutional repository, PDXScholar. The data will be available [upon creation/upon conclusion of the grant/after some embargo period] in accordance with the rights policies outlined [elsewhere]. Primary responsibility for long-term preservation of the data rests with the Digital Initiatives Unit at Portland State University.
Data Repositories
Below is a sample of data repositories in which researchers can deposit their data.
Please note that grants or journals may require PIs to deposit their data into a specific repository.
- ICPSR (Inter-university Consortium for Political and Social Research): PSU is a member of ICPSR, which allows members of our community to deposit data in the ICPSR database. ICPSR is similar to OpenICPSR, which allows non-subscribers to deposit data, but has two advantages: 1) there is no data cap, and 2) ICPSR staff assist depositors throughout the process.
- Dryad: The Dryad Digital Repository is a curated resource that makes research data discoverable, freely reusable, and citable. Dryad provides a general-purpose home for a wide diversity of data types. It is an open, easy-to-use, not-for-profit, community-governed data infrastructure.
- Figshare: Figshare is a multidisciplinary repository where users can make all of their research outputs available.
- OpenICPSR: "A service of the Inter-university Consortium for Political and Social Research (ICPSR), openICPSR is a self-publishing repository for social, behavioral, and health sciences research data. openICPSR is particularly well-suited for the deposit of replication data sets for researchers who need to publish their raw data associated with a journal article so that other researchers can replicate their findings." Note: Limit of 2GB space.
- PDXScholar: PDXScholar is Portland State University's institutional repository, which allows PIs to freely share their data. PDXScholar is a trusted data depository that meets federal open data sharing requirements.
FAIR Data
The ultimate goal of metadata is to make data findable, accessible, interoperable, and reusable ("FAIR Data"). The defining principles of FAIR data are:
Findable
F1. (Meta)data are assigned a globally unique and persistent identifier
F2. Data are described with rich metadata (defined by R1 below)
F3. Metadata clearly and explicitly include the identifier of the data they describe
F4. (Meta)data are registered or indexed in a searchable resource
Accessible
A1. (Meta)data are retrievable by their identifier using a standardised communications protocol
A1.1 The protocol is open, free, and universally implementable
A1.2 The protocol allows for an authentication and authorisation procedure, where necessary
A2. Metadata are accessible, even when the data are no longer available
Interoperable
I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2. (Meta)data use vocabularies that follow FAIR principles
I3. (Meta)data include qualified references to other (meta)data
Reusable
R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (Meta)data are released with a clear and accessible data usage license
R1.2. (Meta)data are associated with detailed provenance
R1.3. (Meta)data meet domain-relevant community standards
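A few of the principles above can be checked mechanically before deposit. The following is a minimal sketch, not a PDXScholar or repository schema: the field names (`identifier`, `description`, `license`, `provenance`) and the helper `missing_fair_fields` are hypothetical, and the DOI is a placeholder.

```python
# Hypothetical field names chosen for illustration; a real repository
# defines its own metadata schema.
REQUIRED_FIELDS = {
    "identifier": "F1: globally unique and persistent identifier (e.g. a DOI)",
    "description": "F2: rich descriptive metadata",
    "license": "R1.1: clear and accessible data usage license",
    "provenance": "R1.2: detailed provenance",
}


def missing_fair_fields(record):
    """Return the required field names that are absent or blank in record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


example_record = {
    "identifier": "doi:10.0000/placeholder",  # placeholder, not a real DOI
    "description": "Survey responses collected for an example project.",
    "license": "",  # left blank on purpose to show the check
    "provenance": "Collected by the project team; cleaned with documented scripts.",
}

for name in missing_fair_fields(example_record):
    print(f"Missing {name} ({REQUIRED_FIELDS[name]})")
```

A check like this only confirms that fields are present. Whether the metadata are rich enough or meet community standards (R1.3) still needs human review.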
Data Repository Directories
- re3data.org: Re3data is a global registry of research data repositories that covers research data repositories from different academic disciplines.
- Open Access Directory: Disciplinary Data Repositories: List of disciplinary data repositories.
- Repository Finder: A pilot project of the Enabling FAIR Data Project led by the American Geophysical Union (AGU) in partnership with DataCite and the Earth, space and environment sciences community. Searches Re3data, but also lists repositories that meet the criteria of the Enabling FAIR Data Project and those that meet the criteria of the FAIRsFAIR Project.
Final Steps Before Sharing Data
Before sharing data, consider the following:
- Are there any legal restrictions?
- Does your funder have sharing restrictions?
- Have sensitive data and personal identifiers of subjects been redacted?
- Is the redaction in compliance with FERPA and HIPAA?
- Do you have the legal rights to share data?
Privacy Considerations:
- Anonymized data: Data in which identifying information has been removed, substituted, aggregated, or generalized.
- Identifiers: Variables that could be used to identify individuals or groups in the study.
- Direct identifiers are information that can directly identify participants. These may include:
- Name, initials, address, contact information, social security numbers, vehicle registration data, medical data, device information, IP address, photographs, recordings, family information, date of birth.
- Indirect identifiers are information that can identify an individual or group when combined with other data. These may include:
- Gender, place of birth, rare medical condition, place of treatment, zip or area code, socioeconomic data (e.g., workplace, income, educational institution), ethnicity, age, geographic indicators, transcript records.
- Restricted data: Data in the study that may be shared, but only under secure conditions.
- Sensitive data: Data that can be used to identify a group or individual.
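The distinction between direct and indirect identifiers can be sketched in code: direct identifiers are removed, while indirect identifiers are generalized. This is an illustration only, with hypothetical column names (`name`, `email`, `ssn`, `date_of_birth`, `age`, `zip`) and arbitrary generalization choices. Running it does not by itself establish FERPA or HIPAA compliance.

```python
# Hypothetical column names; adapt to the variables in your own data set.
DIRECT_IDENTIFIERS = {"name", "email", "ssn", "date_of_birth"}


def age_band(age, width=10):
    """Generalize an exact age into a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymize(row):
    """Drop direct identifiers and generalize two indirect identifiers."""
    cleaned = {key: value for key, value in row.items()
               if key not in DIRECT_IDENTIFIERS}
    if "age" in cleaned:
        cleaned["age"] = age_band(int(cleaned["age"]))
    if "zip" in cleaned:
        # Keep only the first three digits of the zip code.
        cleaned["zip"] = str(cleaned["zip"])[:3] + "XX"
    return cleaned


row = {
    "name": "Example Person",
    "email": "person@example.org",
    "age": 37,
    "zip": "97201",
    "response": "agree",
}
print(anonymize(row))
# prints {'age': '30-39', 'zip': '972XX', 'response': 'agree'}
```

Generalized values can still identify someone when combined, for example a rare medical condition within a small geographic area, so the output should be reviewed before sharing.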
Metadata and Documentation:
Before sharing the project, be sure to include appropriate metadata, which is information and documentation about the data itself. Metadata includes codebooks, README files, and supporting documentation. See the guide for a full discussion of metadata and documentation.
- ICPSR Guide to Social Science Data Preparation and Archiving (5th ed.): For additional information, see ICPSR's guide, specifically "Phase 5: Preparing Data for Sharing," pp. 36-39.
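As one illustration of the documentation described above, a starting codebook can be drafted from the data file itself and then completed by hand. The sketch below assumes a small CSV file with a header row. The function `build_codebook`, the sample data, and the output fields are hypothetical and are not taken from the ICPSR guide.

```python
import csv
import io


def build_codebook(csv_text, descriptions):
    """Draft one codebook entry per column of a CSV with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    codebook = []
    for column in reader.fieldnames or []:
        values = [row[column] for row in rows]
        codebook.append({
            "variable": column,
            # Flag undocumented variables so they are not overlooked.
            "description": descriptions.get(column, "NOT DOCUMENTED"),
            "n_missing": sum(1 for value in values if value == ""),
            "n_distinct": len({value for value in values if value != ""}),
        })
    return codebook


# Made-up sample data for illustration only.
sample = "age_band,response\n30-39,agree\n40-49,\n30-39,disagree\n"
descriptions = {"age_band": "Participant age in ten-year bands"}

for entry in build_codebook(sample, descriptions):
    print(entry)
```

The generated entries are only a skeleton. Variable descriptions, units, coding schemes, and collection methods still need to be written by the research team.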