MIT Libraries

Find a data repository

Repositories can help you:

  • manage your data
  • cite your data by supplying a persistent identifier
  • facilitate discovery of your data
  • preserve your data over time

Selecting your repository

In choosing a repository for your data, first consider:

While there are many options to choose from, we’ve highlighted a small set of trusted repositories (Harvard Dataverse, Zenodo, and Dryad) to help researchers quickly compare key features when making a decision.

Please refer to each repository’s documentation for the most current information. If you see an error in the information below or would like to discuss what might work for you and your data, contact data-management@mit.edu.

File management

File & dataset size limits:

  • Harvard Dataverse: 2.5GB/file; 1TB per researcher
  • Zenodo: 50GB/dataset (contact to discuss larger datasets)
  • Dryad: 300GB/dataset (contact to discuss larger datasets)

Useful integrations:

  • Harvard Dataverse: Open Science Framework (OSF); Dropbox
  • Zenodo: GitHub (archive a GitHub repo in Zenodo)
  • Dryad: Zenodo (for software publication); Frictionless Data

Versioning support?

  • Harvard Dataverse: Yes
  • Zenodo: Yes
  • Dryad: Yes

Persistent, unique identifier support:

  • Harvard Dataverse: DOI
  • Zenodo: DOI for each version, with a “Concept” DOI to represent “all versions”
  • Dryad: DOI

Permissions & access

Allows multiple administrators?

  • Harvard Dataverse: Yes
  • Zenodo: Yes for collections, unknown for individual items
  • Dryad: Yes

User guestbook?

  • Harvard Dataverse: Yes
  • Zenodo: No
  • Dryad: No

Data licensing options:

  • Harvard Dataverse: CC0 default; custom Terms of Use optional
  • Zenodo: Creative Commons licenses for Open Access
  • Dryad: CC0 required

Allows embargo?

  • Harvard Dataverse: No, on roadmap for 2021
  • Zenodo: Yes
  • Dryad: Yes, if allowed by journal

Provides private URLs for peer review?

  • Harvard Dataverse: Yes
  • Zenodo: Yes for software through Dryad integration; unknown for other situations
  • Dryad: Yes

Other access restrictions available?

  • Harvard Dataverse: Yes, allow access to specific accounts, and allow users to request access
  • Zenodo: Yes, limit access with custom terms of use
  • Dryad: No

Computational access:

  • Harvard Dataverse: Search API, Data Access API, and Dataverse Package for R on rOpenSci Project
  • Zenodo: OAI-PMH
  • Dryad: REST API and Dryad Package for R on rOpenSci Project

Administrative considerations

Costs:

  • Harvard Dataverse: Free up to 1TB
  • Zenodo: Free up to 50GB/dataset
  • Dryad: Costs covered by institutional, publisher, and funder members. Otherwise a one-time $120 fee for authors for curation and preservation costs.

Usage analytics:

  • Harvard Dataverse: Downloads. Follows the COUNTER Code of Practice for Research Data Usage Metrics (Make Data Count).
  • Zenodo: Views, Downloads, Data volume, Unique views, Unique downloads. Follows the COUNTER Code of Practice for Research Data Usage Metrics (Make Data Count).
  • Dryad: Views, Downloads, Citations. Follows the COUNTER Code of Practice for Research Data Usage Metrics (Make Data Count).

Trusted repository? *

  • Harvard Dataverse: Yes
  • Zenodo: Yes
  • Dryad: Yes

* Trusted repositories are those that commit to providing “reliable, long-term access to managed digital resources” (RLG-OCLC, 2002). “The TRUST Principles for digital repositories” offers a useful framework – Transparency, Responsibility, User focus, Sustainability and Technology – for evaluating the alignment of a repository to this mission. 
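As a rough illustration of how the size limits in the comparison above can narrow the choice, the following Python sketch encodes those limits and lists the repositories a dataset would fit into. The names RepositoryLimits, fits, and candidates are hypothetical. The sketch assumes 1TB = 1000GB and that the researcher has nothing else stored against the Harvard Dataverse per-researcher quota, and it ignores the option of contacting a repository about larger datasets. The figures are copied from this page and may be out of date, so check each repository's documentation.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class RepositoryLimits:
    """Size limits for one repository; None means no limit of that kind is listed."""
    name: str
    per_file_gb: Optional[float] = None
    per_dataset_gb: Optional[float] = None
    per_researcher_gb: Optional[float] = None


# Values taken from the comparison on this page (assumption: 1TB = 1000GB).
REPOSITORIES = [
    RepositoryLimits("Harvard Dataverse", per_file_gb=2.5, per_researcher_gb=1000.0),
    RepositoryLimits("Zenodo", per_dataset_gb=50.0),
    RepositoryLimits("Dryad", per_dataset_gb=300.0),
]


def fits(repo: RepositoryLimits, file_sizes_gb: Sequence[float]) -> bool:
    """Return True if every listed limit of the repository is satisfied."""
    total = sum(file_sizes_gb)
    largest = max(file_sizes_gb, default=0.0)
    if repo.per_file_gb is not None and largest > repo.per_file_gb:
        return False
    if repo.per_dataset_gb is not None and total > repo.per_dataset_gb:
        return False
    # Assumption: the whole quota is available to this one dataset.
    if repo.per_researcher_gb is not None and total > repo.per_researcher_gb:
        return False
    return True


def candidates(file_sizes_gb: Sequence[float]) -> List[str]:
    """Names of repositories whose listed limits the dataset stays within."""
    return [repo.name for repo in REPOSITORIES if fits(repo, file_sizes_gb)]


if __name__ == "__main__":
    # A dataset made of a 10GB file and a 20GB file.
    print(candidates([10.0, 20.0]))
```

Size is only one criterion. Licensing, embargo, and access-restriction needs from the comparison would have to be weighed separately.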

 

Using data from a repository?

Cite the data to give credit to the data producer, enable others to use the data, and meet journal requirements.
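A data citation usually names the creators, year, title, repository, and persistent identifier, as the reference at the end of this page does. The Python sketch below assembles such a citation from those fields. The function name format_data_citation, the max_creators parameter, and the "; " separator are illustrative choices, not a prescribed style, and journals may require a different format. The example values come from the Generalist Repository Comparison Chart reference on this page, with the creator list shortened for brevity.

```python
from typing import Sequence


def format_data_citation(
    creators: Sequence[str],
    year: int,
    title: str,
    publisher: str,
    doi: str,
    max_creators: int = 3,
) -> str:
    """Build a citation of the form: Creators (Year). Title. Publisher. DOI link."""
    names = "; ".join(creators[:max_creators])
    # Abbreviate long creator lists (an illustrative choice, not a rule).
    if len(creators) > max_creators:
        names += "; et al."
    return f"{names} ({year}). {title}. {publisher}. https://doi.org/{doi}"


if __name__ == "__main__":
    citation = format_data_citation(
        creators=[
            "Stall, Shelley",
            "Martone, Maryann E.",
            "Chandramouliswaran, Ishwar",
            "Crosas, Mercè",  # remaining creators omitted here for brevity
        ],
        year=2020,
        title="Generalist Repository Comparison Chart",
        publisher="Zenodo",
        doi="10.5281/zenodo.3946720",
    )
    print(citation)
```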

 

We are grateful to the creators of the Generalist Repository Comparison Chart for developing a concise comparison chart, which inspired much of the content of this page.

Stall, Shelley, Martone, Maryann E., Chandramouliswaran, Ishwar, Crosas, Mercè, Federer, Lisa, Gautier, Julian, Hahnel, Mark, Larkin, Jennie, Lowenberg, Daniella, Pfeiffer, Nicole, Sim, Ida, Smith, Tim, Van Gulick, Ana E., Walker, Erin, Wood, Julie, Zaringhalam, Maryam, & Zigoni, Alberto. (2020). Generalist Repository Comparison Chart. Zenodo. https://doi.org/10.5281/zenodo.3946720