Recognizing the Next Generation of Open Data Leaders

MIT President Sally Kornbluth joins the Libraries and the School of Science in celebrating the 2025 MIT Prize for Open Data

Sally Kornbluth. Photo by Bryce Vickmark.

The fourth annual MIT Prize for Open Data, which included a $2,500 cash prize, was awarded in October to seven individual and group research projects. Presented jointly by the School of Science and the MIT Libraries, the prize highlights the value of open data — research data that is openly accessible and reusable — at the Institute.

The 2025 awards were presented at a celebratory event held during International Open Access Week. Winners gave five-minute presentations on their projects and the role that open data plays in their research. MIT President Sally Kornbluth opened the event by offering her congratulations to the winners, noting their creativity and determination and the wide range of projects being celebrated.

“I was excited to see how many of these projects support priorities we’ve identified for MIT, high-impact areas where we have a responsibility to harness our collective efforts and make a real difference – from climate and health care to generative AI and manufacturing,” said Kornbluth.

2025 MIT Prize for Open Data honorees

Photo by Bryce Vickmark.

Winners were chosen from more than 65 nominees, representing 30 different academic departments, labs, centers, and institutes:

  • Lucas Attia, a graduate student in chemical engineering, won for fastsolv, along with graduate student Jackson Burns and faculty members Patrick S. Doyle and William H. Green. Fastsolv, an open-source deep learning model for organic solubility prediction, is freely available online.
  • Timur Cinay, a graduate student in earth, atmospheric, and planetary sciences (EAPS), won for the Galapagos Emissions Monitoring Station, a first-of-its-kind continuous dataset monitoring ocean emissions of the greenhouse gas nitrous oxide, made freely and openly available to researchers worldwide.
  • Edgar Costa, a research scientist in mathematics, was recognized for the L-functions and modular forms database (LMFDB), a database of mathematical objects arising in number theory and arithmetic geometry that illustrates some of the mathematical connections predicted by the Langlands program.
  • The team behind the Geospatial Trucking Industry Decarbonization Explorer (Geo-TIDE) was recognized for this open data platform that synthesizes fragmented public datasets into more than 400 curated, cloud-hosted geospatial layers for freight decarbonization planning.
  • Austin Saragih, PhD candidate, MIT Center for Transportation and Logistics, Willem Guter, research engineer, MIT CAVE and MIT Intelligent Logistics Systems, and their collaborators won for SCGraph, an open-source Python package that transforms scattered open transportation datasets into clean, ready-to-use geographic networks for research and real-world analysis.
  • Nada Tarkhan, graduate student in architecture, and Paolo Giani, postdoctoral associate, EAPS, won for their project, “Extreme-Aware Meteorological Years: Open Weather Data for Climate-Resilient Building Simulations.” The project’s novel weather file formats embed extreme events into building simulation workflows using anomaly detection and climate model emulators.
  • Jonathan Zheng, graduate student, chemical engineering, and his collaborators won for their project, “Widespread misinterpretation of pKa terminology for zwitterionic compounds and its consequences.” This work explained an error in a widely used biochemical dataset, ChEMBL, examined the downstream repercussions, and made recommendations for data curation to avoid these issues in the future.

Learn more about the winning projects, as well as honorable mentions, and see links to all the projects’ research data, at libraries.mit.edu/opendata.