Documentation & metadata
Data documentation, also known as metadata, helps you understand your data in detail, and also helps other researchers find, use, and properly cite your data.
Various metadata standards are available for particular file formats and disciplines. General guidelines are provided below. For more details, see materials from our workshop on file organization. For help in documenting your data, email data-management@mit.edu.
Important things to do while you collect or create your data
- Make a note of all file names and formats associated with the project, how the data is organized, how the data was generated (including any equipment or software used), and information about how the data has been altered or processed.
- Include an explanation of codes, abbreviations, or variables used in the data or in the file naming structure.
- Keep notes about where you got the data so that you and others can find it.
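As one way to start the note-taking described above, the file names and formats in a project folder can be gathered automatically. The following is a minimal sketch in Python, not an official tool: the function name `inventory_files` and the record keys are hypothetical, and it assumes that the file extension is a reasonable stand-in for the format.

```python
from pathlib import Path


def inventory_files(root):
    """Return one record per file under root: relative name, format, size.

    Hypothetical helper for illustration; the format is guessed from the
    file extension only, which is an assumption and not a real format check.
    """
    root = Path(root)
    records = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            records.append({
                # Path relative to the project folder, with forward slashes
                "name": path.relative_to(root).as_posix(),
                # Lower-cased extension without the dot, or "unknown"
                "format": path.suffix.lower().lstrip(".") or "unknown",
                # File size in bytes
                "bytes": path.stat().st_size,
            })
    return records
```

The resulting list can be pasted into your notes or saved next to the data. It does not replace a written explanation of how the data was generated or processed.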
Things to document about your data
Title
Name of the dataset or research project that produced it
Creator
Names and addresses of the organization or people who created the data
Identifier
Number used to identify the data, even if it is just an internal project reference number
Dates
Key dates associated with the data, including project start and end dates, data modification date, data release date, and time period covered by the data
Subject
Keywords or phrases describing the subject or content of the data
Funders
Organizations or agencies who funded the research
Rights
Any known intellectual property rights held for the data
Language
Language(s) of the intellectual content of the resource, when applicable
Location
Where the data relates to a physical location, record information about its spatial coverage
Methodology
How the data was generated, including equipment or software used, experimental protocol, and other things you might include in a lab notebook
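The elements listed above can be kept in a simple plain-text record alongside the data. The sketch below, in Python, shows one possible way to do that. The names `FIELDS`, `render_readme`, and `missing_fields` are hypothetical, the layout is an assumption, and it does not follow any particular metadata standard.

```python
# Field names taken from the list above; the order is a choice, not a standard.
FIELDS = [
    "Title", "Creator", "Identifier", "Dates", "Subject",
    "Funders", "Rights", "Language", "Location", "Methodology",
]


def missing_fields(record):
    """Return the fields that are absent or blank in the record."""
    return [f for f in FIELDS if not record.get(f, "").strip()]


def render_readme(record):
    """Render the record as 'Field: value' lines, flagging blank fields."""
    lines = []
    for field in FIELDS:
        value = record.get(field, "").strip()
        lines.append(f"{field}: {value or 'NOT YET DOCUMENTED'}")
    return "\n".join(lines)
```

Calling `missing_fields` before sharing a dataset gives a quick list of what still needs to be written down. Fields such as Language or Location may legitimately stay blank when they do not apply.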
Need more help?
- Cornell’s Research Data Management Service Group has a Guide to writing “readme” style metadata
- The Social Science Data Editors have released a template README for social science replication packages
- Documenting data that you are reusing (secondary data) or that you are backing up from some other location? Please see our README template for secondary data