Learn more about text mining

HathiTrust Data Capsule workshop series

The MIT Libraries is pleased to announce a series of workshops on June 3 on text mining with HathiTrust.

HathiTrust is a collaboration between several hundred academic and research organizations that provides access to millions of digitized texts from libraries all over the world. The HathiTrust Research Center provides computational access to this collection through a variety of tools. Join us for one workshop in the series (or attend the whole thing!) to dive into different aspects of these tools and learn more about the text mining capabilities in HathiTrust.

Attendees should bring their own laptops if they wish to follow along with the hands-on portions of each workshop. Registration links are included below.

All workshops will be led by Eleanor Koehl, Digital Scholarship Librarian for the HathiTrust Research Center. Eleanor provides training and outreach for the HathiTrust Research Center, including workshops and research support. She focuses on deepening engagement with member librarians and scholars on text mining resources for the HathiTrust corpus.

Introduction to Text Mining with HathiTrust
Time: 9-10:30 a.m.
Room: 2-105
Registration

This workshop provides an overview of the HathiTrust as well as the tools available to researchers. It will cover the following areas:

  • Overview of HathiTrust and the digital library data
  • Overview of HTRC and its tools and services
  • HTRC use cases (focused on the data capsules)

HathiTrust Research Center Data Capsules for Secure Full Text Data Mining
Time: 10:45 a.m.-12 p.m.
Room: 2-105
Registration
Prerequisites:
Familiarity with HathiTrust, or Introduction to Text Mining with HathiTrust workshop

This workshop focuses on the HathiTrust Data Capsules, a tool that provides a non-consumptive full-text data mining service over the full HathiTrust corpus via the Data Capsule API. Each capsule comes pre-loaded with standard data analysis programs and software that can be configured by the researcher. Topics include:

  • Introduction to the Data Capsule — why, where, and how
  • Demonstration and hands-on practice working within the Data Capsule

HathiTrust Data Capsule Discussion
Time: 1-2 p.m.
Room: 2-105
Registration
Prerequisites:
Familiarity with HathiTrust Data Capsules, or Introduction to Text Mining with HathiTrust and HathiTrust Research Center Data Capsules for Secure Full Text Data Mining workshops.

This workshop will allow participants to discuss the HathiTrust Data Capsules, as well as possible applications for their own research and/or teaching. It will also give attendees an opportunity to provide feedback on the Data Capsules themselves, such as what they think works particularly well or what other features they would like to see.

HathiTrust Hands-On Data Access Tutorial and Discussion
Time: 2:30-4 p.m.
Room: 2-105
Registration
Prerequisites: Familiarity with HathiTrust Data Capsules, or Introduction to Text Mining with HathiTrust, HathiTrust Research Center Data Capsules for Secure Full Text Data Mining, and HathiTrust Data Capsule Discussion workshops.

This session on the HathiTrust Data Capsule will provide:

  • Hands-on demo and discussion session about the capsule architecture with interested technologists
  • Guiding question: What access mechanisms are in place for HathiTrust, and why?