Manage Your Research Data: Backup, Storage & Security
This guide provides a primer on the fundamentals of data management.
Storage, Backup and Security for Your Data
- Fundamental and interrelated components of a data management strategy
- Together ensure ongoing integrity of research data
- Project planning and data management must account for all three
Storage and Security Best Practices
- Use a hard drive or tape backup system
- Unless encryption is required, store data unencrypted, because unencrypted files are the most easily read by you and others in the future
- If data must be encrypted because it involves human subjects, keep the passwords and keys on paper (2 copies) and in encrypted digital files
- Uncompressed storage is ideal; if you need to conserve space, limit compression to your 3rd backup copy
- Save electronic data on a device that has appropriate security safeguards, such as:
  - unique identification of authorized users,
  - password protection,
  - encryption,
  - automated operating system patches (bug fixes),
  - anti-virus controls,
  - firewall configuration, and
  - scheduled and automatic backups to protect against data loss or theft.
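The advice above to compress only the 3rd backup copy can be sketched in Python. This is a minimal illustration, not a recommended tool: the function name `compress_copy` and the folder layout are hypothetical, and gzip is just one possible compression format. The sketch decompresses the result and compares checksums, so a damaged compressed copy is caught at once.

```python
import gzip
import hashlib
import shutil
from pathlib import Path


def _sha256(stream) -> str:
    """Return the SHA-256 hex digest of an open binary stream."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(1 << 20), b""):
        digest.update(chunk)
    return digest.hexdigest()


def compress_copy(source: Path, dest_dir: Path) -> Path:
    """Write a gzip copy of source into dest_dir (hypothetical 3rd-copy folder)."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    target = dest_dir / (source.name + ".gz")
    with open(source, "rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Verify: the decompressed bytes must hash to the same value as the original.
    with open(source, "rb") as original, gzip.open(target, "rb") as restored:
        if _sha256(original) != _sha256(restored):
            raise IOError(f"Compressed copy of {source} failed verification")
    return target
```

The original file is left untouched, so the first two copies stay uncompressed and directly readable.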
Backup Best Practice = 3 Copies
Researchers are encouraged to keep 3 copies of their active files:
- The original/active file stored on your computer. This is the file you update daily throughout the course of your work.
- A backup copy on a physical external drive (e.g., disk, jump drive, external hard drive). This should be updated frequently in case your primary work computer is lost.
- A backup file stored on a remote or cloud drive (e.g., shared university drive, cloud storage). This should be updated frequently in case your physical storage devices are lost. This also ensures remote access to files.
Finally, when a project is concluded and the data is no longer being amended or collected, a final archived copy of the files can be permanently stored in Portland State's data repository, PDXScholar, or another established data repository.
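As a rough illustration of the three-copy practice, the Python sketch below copies an active file to two backup locations and checks that each copy matches the original byte for byte. The function name `back_up` and the two directory arguments are hypothetical. In practice the second directory would be wherever your external drive is mounted, and the third would be a synced university or cloud folder.

```python
import filecmp
import shutil
from pathlib import Path


def back_up(active_file: Path, external_dir: Path, remote_dir: Path) -> list[Path]:
    """Copy active_file to both backup locations (hypothetical paths) and verify."""
    copies = []
    for backup_dir in (external_dir, remote_dir):
        backup_dir.mkdir(parents=True, exist_ok=True)
        target = backup_dir / active_file.name
        shutil.copy2(active_file, target)  # copy2 also preserves timestamps
        # shallow=False forces a byte-by-byte comparison of the two files.
        if not filecmp.cmp(active_file, target, shallow=False):
            raise IOError(f"Backup at {target} does not match {active_file}")
        copies.append(target)
    return copies
```

Running a script like this on a schedule is one way to keep the two backup copies updated frequently. It is not a substitute for institutional backup services.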
Research Computing Services
The Office of Information Technology provides assistance in storing live/active data. This support includes:
- Data storage
- Research software
- High-performance computing
- Platforms for long-running processes
- Support for cluster computing
Additional information about the range of OIT's services can be found on the Research Computing Services website:
- OIT, Research Computing Services: "Computing resources ranging from laptop to supercomputer, from Excel-scale to Exascale, with the purpose of facilitating scientific discovery. OIT provides servers, data storage, networking and a broad range of software to handle many different computational requirements. These systems:
  * provide platforms for very long running processes.
  * provide platforms for high-performance computing at scale with MPI.
  * supply data storage for active computing.
  * give you access to a range of installed research software."
Questions to Ask Yourself about Securing Data
- How often should data be backed up?
- How many copies of data should you have?
- Where can you store your data?
- How much server space can you get?
- Should your data be encrypted?
Storage Options
- Internal, local hard drives
- Networked storage
- External storage devices
- Physical storage
- Remote storage services (The Cloud)
- Private sector services: Amazon S3, Elephant Drive, Jungle Disk, Mozy, Carbonite
Things to Remember
- Data sharing is not data storage
- Read terms of service
- Consult your PI or institutional policies
Version Tracking
- GitHub: GitHub is primarily a repository where software developers collaborate on common projects while keeping track of versions of code uploaded by multiple contributors. Its collaborative development workflow can also be used to create and maintain a working dataset repository whenever the data is stored in comma-separated or tab-delimited text file formats.
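Text formats suit version tracking because changes can be compared line by line. The Python sketch below illustrates the idea with the standard library's `difflib` module, which stands in here for the comparison a version control tool performs; it is not GitHub's own implementation. The function name `csv_changes` and the sample data are hypothetical.

```python
import difflib


def csv_changes(old_text: str, new_text: str) -> list[str]:
    """Return the removed (-) and added (+) rows between two versions of a text file."""
    diff = list(difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(),
        fromfile="previous", tofile="current", lineterm="", n=0,
    ))
    # Skip the two file-header lines and the @@ hunk markers; with n=0 there
    # are no context lines, so every remaining line is an added or removed row.
    return [line for line in diff[2:] if not line.startswith("@@")]


# Hypothetical example data: one corrected reading and one new row.
previous = "id,temp\n1,20.5\n2,21.0\n"
current = "id,temp\n1,20.5\n2,21.3\n3,19.8\n"
print(csv_changes(previous, current))
# ['-2,21.0', '+2,21.3', '+3,19.8']
```

Binary formats such as spreadsheets cannot be compared this way, which is why the row-level history is only available for delimited text files.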