Tag: Big Data

Program on Information Science awarded $10,000 by Amazon, Inc. for open data research

Photo of the Constitution of the United States of America. A feather quill is included in the photo. The Constitution of the United States is the supreme law of the United States of America and is the oldest codified written national constitution still in force. It was completed on September 17, 1787.

The Amazon Corporation awarded $10,000 through the AWS Cloud Computing for Research program to support a joint project between the University of Florida and MIT to develop a prototype for an open election data facility. The project is led by Micah Altman, Director of Research and Head/Scientist, Program on Information Science at MIT Libraries; Michael P. McDonald, associate professor of political science at the University of Florida; and Charles Stewart III, Kenan Sahin Distinguished Professor of Political Science at MIT. It aims to develop a prototype system that automatically collects and archives data related to United States elections.

This collaboration builds upon a history of prior joint research and software development. The work will contribute to the universities’ capacity to collect, archive, and disseminate research data.

Making decisions in a world awash in data: We’re going to need a different boat

Join us for a brown bag talk with Anthony Scriffignano, SVP/Chief Data Scientist at Dun and Bradstreet. In this session, Scriffignano will explore some of the ways in which the massive availability of data is changing and the types of questions we must ask in the context of making business decisions.

The session will cover three main themes: the new normal (how the data around us continues to change), how we are reacting (bringing data science into the room), and the path ahead (creating a mindset in the organization that evolves). Ultimately, what we learn is governed as much by the data available as by the questions we ask. This talk, both relevant and occasionally irreverent, will explore some of the new ways data is being used to expose risk and opportunity and the skills we need to take advantage of a world awash in data.

Scriffignano is an internationally recognized data scientist with over 35 years of experience in multiple industries and enterprise domains. He has an extensive background in linguistics and advanced algorithms, which he has leveraged as primary inventor on multiple patents worldwide. He provides thought leadership globally, including serving as a forum panelist at the World Internet Conference hosted by Chinese President Xi Jinping in Wuzhen, China, and providing subject matter expertise to the US National Security Telecommunications Advisory Committee Report to the President on Big Data Analytics. He was recently published in CIOReview (US) and Mint (India) and quoted in various publications including China Daily, Xinhua, and People's Daily. He was profiled by InformationWeek and by BizCloud, and was a recent CXOTalk guest. He regularly presents at business and academic venues globally on emerging trends in data and information stewardship relating to the “Big Data” explosion, multilingual challenges in business identity, and malfeasance in commercial settings.

Information Science Brown Bag talks, hosted by the Program on Information Science, consist of regular discussions and brainstorming sessions on all aspects of information science and uses of information science and technology to assess and solve institutional, social and research problems. These are informal talks. Discussions are often inspired by real-world problems being faced by the lead discussant.

Event Details
Location: E25-401
Lunch will be provided; please bring your own drink and your questions.

November 14, 2016 12 - 1pm

Issues in curating the open web at scale

Gary Price

Much of the web remains invisible: resources are undescribed, unindexed, or simply buried (many people rarely look past the first page of Google search results), or they are unavailable from traditional library resources. At the same time, many traditional library databases pay little attention to quality content from credible sources accessible on the open web.

How do we build collections of quality open-web resources (e.g., documents, specialty databases, and multimedia) and make them accessible to individuals and user groups when and where they need them? This talk reflects on the emerging tools for systematic programmatic curation; the legal challenges to open-web curation; long-term access issues; and the historical challenges to building sustainable communities of curation.

Event Details
Location: E25-401
Lunch will be provided

About Gary Price
Price received a Bachelor of Arts degree from the University of Kansas and a master’s in library and information science from Wayne State University. He co-authored the book The Invisible Web with Chris Sherman in July 2001. He has worked as a reference librarian at George Washington University and as Director of Online Information Resources at the search engine Ask.com. He also takes on frequent consulting projects and has written for a number of publications. Currently, he is a contributing editor at Search Engine Land. Before launching INFOdocket.com and FullTextReports.com in February 2011, Price and Shirl Kennedy worked together for 10 years as founders and co-editors of ResourceShelf and DocuTicker. Price won the Special Libraries Association’s “Innovations in Technology Award” in 2002 and its News Division’s “Agnes Henebry Roll of Honor Award” in 2004. He was also awarded the Alumni of the Year Award from Wayne State’s Library and Information Program.

September 20, 2016 12 - 1pm

Confidential Information: Storage, Sharing & Publication

This class focuses on the tools and good practices for storing confidential data, sharing data for collaboration, and publishing data or derivative results for broad use.  Topics covered in this class include: an overview of information security standards and frameworks; information security core practices (credentials, authentication, authorization, and auditing); information partitioning and secure linking; file, disk, and network encryption tools and practices; cloud storage practices for confidential information; data “de-identification” tools and practices; statistical disclosure limitation approaches and tools; and data use agreements.

Event Details
Location: E25-401
The course will be presented in a half-day format. Individual consultations may be scheduled with Micah Altman by contacting Kelly Hopkins at khopkins@mit.edu.

Discussant Bio: Micah Altman, PhD, is Director of Research and Head/Scientist, Program on Information Science for the MIT Libraries, at the Massachusetts Institute of Technology. Altman is also a Non-Resident Senior Fellow at The Brookings Institution. Prior to arriving at MIT, Altman served at Harvard University for 15 years as the Associate Director of the Harvard-MIT Data Center, Archival Director of the Henry A. Murray Archive, and Senior Research Scientist in the Institute for Quantitative Social Sciences.

Altman conducts work primarily in the fields of social science, information privacy, information science and research methods, and statistical computation—focusing on the intersections of information, technology, privacy, and politics; and on the dissemination, preservation, reliability and governance of scientific knowledge.

July 19, 2016 12 - 3pm

Searching in harsh environments

A Libraries & Big Data Brown Bag brought to you by the Program on Information Science at the MIT Libraries

Bring your lunch and join us for this fascinating presentation with Ophir Frieder, the Robert L. McDevitt, K.S.G., K.C.H.S. and Catherine H. McDevitt L.C.H.S. Chair in Computer Science and Information Processing at Georgetown University. This brown bag is appropriate for intermediate and advanced researchers and anyone interested in Big Data.

Many consider “searching” a solved problem, and for digital text processing, this belief is factually based. The problem is that many “real-world” search applications involve complex documents — comprising a mixture of images, text, signatures, tables, etc., and often available only in scanned hardcopy formats — and such applications are far from solved. Some of these documents are corrupted, some contain multiple languages, and accurate search systems for such document collections are currently unavailable.

This session will cover:

  • Efforts at building a complex document information-processing prototype and previous complex document benchmark development efforts
  • Spelling correction in adverse environments, including foreign name search and medical term search
  • Analyzing social media, an additional, non-traditional search environment

Location: E25-401
Lunch will be provided, but attendees should bring their own beverage.

March 15, 2016 12 - 1pm