Distinctive Collections opens web archives to the public

Archived MIT-produced and affiliated webpages ensure valuable information about the Institute remains accessible

The Department of Distinctive Collections is pleased to announce that our web archives collections are now open to the public. These web archives consist of MIT-produced and affiliated webpages selected for preservation, so that valuable information about the Institute and the groups and individuals connected to it remains available to researchers in the long term. You can access these archived webpages through the Archive-It website.

Have you ever been browsing for something on an MIT webpage and run into the “404 page not found” screen? While the little drawing tool that shows up when that happens is fun to play with, mostly you want to get to the information you were looking for. These newly accessible collections from the MIT Libraries Distinctive Collections can help with that. In addition to reaching websites that are no longer available on the live web, you can use these archived websites to study the history of the organizations and people that created them. For instance, if you want to study what the Computer Science and Artificial Intelligence Lab (CSAIL) was working on in the late 2010s, their web archives would be a good place to start, alongside their other archival records, which you can access by request in the Distinctive Collections Reading Room.

Scope of collections

Distinctive Collections began collecting web pages through the Archive-It tool (software and a platform offered by the Internet Archive) in 2016. Since then, the number of websites, or “seeds,” that we are capturing has risen to 61, and we plan to keep adding more. Our initial effort focuses on websites of and about the Institute, building on the extensive collections of related materials already preserved by Distinctive Collections. This scope will grow as we start to collect websites programmatically and identify resources that are unique and distinctive to the Institute and beyond. As our capacity grows, we plan to develop options for archiving community-recommended websites.

Screen shot of archived spxce.mit.edu website


Accessing the web archives

By clicking on a URL listed in Archive-It, you can see the dates on which that webpage was captured, and you can click on any of those dates to see the webpage as it appeared then. Some of the sites we’ve captured belong to groups on campus that started only recently, such as SPXCE: Social Justice Programming & Cross Cultural Engagement, along with websites related to ongoing developments with the College of Computing. Because we capture the websites of groups soon after their creation, future researchers can see how these areas evolved from conception to established groups at the Institute. You can also see how a student group website like that of the Student Information Processing Board has changed over the past few years. In addition to these examples, there are many other websites from special projects, departments, alumni groups, and more available at the Archive-It page. If you’re interested in working with any of this data (in the form of WARC files), please contact Distinctive Collections and we can provide it on an individual basis (we are working on making these files more widely available).