Portland State University Copyright Guide: Text and Data Mining
What is Text and Data Mining?
Text and Data Mining (TDM) is the automated process of selecting and analyzing large amounts of text and/or data in order to search it, find patterns and relationships, and learn how content relates to ideas and needs, in a way that can inform studies, research, and similar work. This allows researchers to work through more content than they could manually.
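As a rough illustration of what "finding patterns" can mean in practice, the minimal sketch below counts how often each word appears across a small set of documents. The function name term_frequencies and the two sample sentences are made up for this example and are not drawn from any licensed resource; real TDM projects typically use far larger corpora and more sophisticated methods.

```python
import re
from collections import Counter


def term_frequencies(documents):
    """Count how often each lowercase word appears across all documents."""
    counts = Counter()
    for text in documents:
        # Split each document into runs of letters, ignoring case and punctuation.
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


# Hypothetical sample corpus, invented for illustration only.
sample_documents = [
    "Text mining finds patterns in text.",
    "Data mining finds relationships in data.",
]

frequencies = term_frequencies(sample_documents)
print(frequencies.most_common(3))
```

Even a simple count like this shows the basic idea: a script reads every document and surfaces the terms that recur, something that quickly becomes impractical by hand as a collection grows.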
Policy
TDM for non-profit educational purposes is often considered fair use under copyright law. However, the licenses for some of our databases and electronic resources prohibit TDM. In addition, the use of crawlers, bots, scripts, or other automated methods to search for and extract data is usually not permitted. Violating license terms can result in the University losing access to the electronic resource.
If a database does permit TDM, there are often restrictions on how data can be accessed, used, and shared. Some vendors require researchers to provide detailed information about the research they plan to conduct, and researchers must sometimes sign additional licenses or pay a fee for data files.
As a member of the Orbis Cascade Alliance, our Library recognizes as a licensing best practice that authorized users may use licensed materials to perform and engage in text and/or data mining activities within the context of scholarship, research, and other educational purposes. Authorized users may make the results available for use by others, so long as the purpose is not to create a product for use by third parties that would substitute for the licensed materials. However, publishers may require payment for text and data mining.
Please direct any questions to your Subject Librarian. Note that, as a researcher, you may need to contact the content owners yourself for permission or clarification.