MSU Libraries offer several datasets and resources for research, including text and data collections suited to text and data mining and analysis. The datasets span a broad range of categories and genres.
TDM Studio is a text and data mining (TDM) solution that allows users to text mine newspapers, scholarly journals, dissertations, and other content available through MSU’s ProQuest subscriptions.
Text Assembler enables the construction and bulk download of full-text corpora derived from newspapers, newswires, web-based news publications, broadcast news transcripts, magazines, and legal and trade publications published from 1980 to the present.
The Google Dataset (GDS) is a collection of scanned books totaling approximately 3 million volumes of text, or 2.9 terabytes (2,970 gigabytes) of data. The books in the dataset are public domain works digitized by Google and made available by the HathiTrust Digital Library.
The Inter-university Consortium for Political and Social Research (ICPSR), established in 1962, is an integral part of the infrastructure of social science research. ICPSR maintains and provides access to a vast archive of social science data for research and instruction. Search the ICPSR site for holdings. Data downloads are free of charge after registering with an MSU email address.
This resource contains scholarly documents and primary source material from JSTOR and partners. Users can build their own datasets and take tutorials to learn about text analysis and data literacy.
The DSL features cutting-edge technologies, including a 360-degree room, that may be new to many scholars, faculty, researchers, and librarians. It also provides infrastructure and support for better-known digital practices such as scanning and digitizing, text mining, and data management.