Michigan State University

How to Cite Data: General Info

Introduction

Data requires citations for the same reasons journal articles and other types of publications require citations: to acknowledge the original author/producer and to help other researchers find the resource.

A dataset citation includes all of the same components as any other citation:

  • author
  • title
  • year of publication
  • publisher (for data, this is often the archive where it is housed)
  • edition or version
  • access information (a URL, DOI or other persistent identifier)

Unfortunately, standards for the citation of data are not uniformly agreed upon and have yet to be codified by the National Information Standards Organization (an organization that sets technical standards for other bibliographic materials).  However, many data providers and distributors, and some style manuals, do provide guidelines.  Some of these instructions are listed in this guide.

Be sure to follow the general citation format for the style manual your professor has asked you to use.  It is always better to provide more information about a resource rather than less!

The Dataset Citations and Statistical Table Citations tabs provide specific examples from style manuals, data archives, and distributors.

Read on below for some general rules...

General Rules

Some style manuals do provide instructions for the citation of data, and selected examples are listed on the Dataset Citations tab.  If the style manual you are using does not address data citations, you can follow these general rules.

Usually a style manual will lay out basic rules for the order of citation elements, regardless of the type of work.  This is what you will need to pay close attention to in order to format your citation correctly.  If you can’t find a generic list of rules, then look at how the citation for a book is formatted. 

These are the citation elements you need to consider when building a data citation:

Author

Who is the creator of the data set?  This can be an individual, a group of individuals, or an organization.

Title

What is the data set called, or what is the name of the study?

Edition or Version

Is there a version or edition number associated with the data set?

Date

What year was the data set published?  When was the data set posted online?

Editor

Is there a person or team responsible for compiling or editing the data set?

Publisher and Publisher Location

What entity is responsible for producing and/or distributing the data set?  Also, is there a physical location associated with the publisher? 

In some cases, the publisher of a data set differs from what we think of as the publisher of a book.  A data set can have both a producer and a distributor.

The producer is the organization that sponsored the author’s research and/or the organization that made the creation of the data set possible, for example by codifying and digitizing the data.

The distributor is the organization that makes the data set available for downloading and use. 

You may need to distinguish the producer and the distributor in a citation by adding explanatory brackets, e.g., [producer] and [distributor].

Some citation styles (e.g., APA) do not require listing the publisher if an electronic retrieval location is available.  However, you may consider including the most complete citation information possible and retaining publisher information even in the case of electronic resources.

Material Designator

What type of file is the data set?  Is it on CD-ROM or online? 

This may or may not be a required field depending on the style manual.  Often this information is added in explanatory brackets, e.g., [computer file].

Electronic Retrieval Location

At what web address is the data set available?  Is there a persistent identifier available?  If a DOI or other persistent identifier is associated with the data set, it should be used in place of the URL.
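
One way to keep track of these elements is to collect them in a simple record before formatting them. The Python sketch below is only an illustration: the class name DataCitation, its field names, and the retrieval_location helper are assumptions made for this example and do not come from any style manual. It shows the rule that a DOI or other persistent identifier is used in place of the URL when one is available.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Hypothetical container for the citation elements described above."""
    author: str
    title: str
    year: str
    version: Optional[str] = None
    editor: Optional[str] = None
    producer: Optional[str] = None
    distributor: Optional[str] = None
    material_designator: Optional[str] = None
    doi: Optional[str] = None
    url: Optional[str] = None

    def retrieval_location(self) -> Optional[str]:
        """Prefer a persistent identifier (here, a DOI) over a plain URL."""
        if self.doi:
            return f"doi:{self.doi}"
        return self.url


citation = DataCitation(
    author="Milberger, S.",
    title=("Evaluation of violence against women with physical "
           "disabilities in Michigan, 2000-2001"),
    year="2002",
    version="ICPSR version",
    material_designator="data file and codebook",
    doi="10.3886/ICPSR03414",
    url="https://example.org/dataset",  # placeholder URL for illustration only
)
print(citation.retrieval_location())
```

Because a DOI is present, the helper returns the DOI rather than the placeholder URL. Optional fields that a style manual does not require can simply be left empty.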

Examples using the General Rules

APA (6th edition)

 

Minimum requirements based on instructions and example for dataset reference:

Milberger, S. (2002). Evaluation of violence against women with physical disabilities in Michigan, 2000-2001 (ICPSR version) [data file and codebook]. doi:10.3886/ICPSR03414

With optional elements:

Milberger, S. (2002). Evaluation of violence against women with physical disabilities in Michigan, 2000-2001 (ICPSR version) [data file and codebook]. Detroit: Wayne State University [producer]. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor]. doi:10.3886/ICPSR03414
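
The two APA examples above differ only in the optional producer and distributor statements. As a rough sketch, the assembly could look like the following in Python. The function name apa_data_citation and its parameters are hypothetical and chosen for this illustration; the element order simply mirrors the examples above and is not an official APA implementation.

```python
def apa_data_citation(author, year, title, version, designator, doi,
                      producer=None, distributor=None):
    """Assemble an APA-style data citation in the order used above.

    The producer and distributor are optional; when given, each is
    labelled with an explanatory bracket.
    """
    parts = [f"{author} ({year}). {title} ({version}) [{designator}]."]
    if producer:
        parts.append(f"{producer} [producer].")
    if distributor:
        parts.append(f"{distributor} [distributor].")
    parts.append(f"doi:{doi}")
    return " ".join(parts)


minimal = apa_data_citation(
    author="Milberger, S.",
    year="2002",
    title=("Evaluation of violence against women with physical "
           "disabilities in Michigan, 2000-2001"),
    version="ICPSR version",
    designator="data file and codebook",
    doi="10.3886/ICPSR03414",
)
print(minimal)
```

Passing producer="Detroit: Wayne State University" and distributor="Ann Arbor, MI: Inter-university Consortium for Political and Social Research" to the same function yields the longer form with optional elements.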

MLA (7th edition)

 

Minimum requirements based on instructions and examples for books and web publications:

Milberger, Sharon. Evaluation of Violence Against Women With Physical Disabilities in Michigan, 2000-2001. ICPSR version. Inter-university Consortium for Political and Social Research, 2002. Web. 19 May 2011.

With optional elements:

Milberger, Sharon. Evaluation of Violence Against Women With Physical Disabilities in Michigan, 2000-2001. ICPSR version. Detroit: Wayne State U [producer]. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2002. Web. 19 May 2011. doi:10.3886/ICPSR03414

Chicago (16th edition)

 

Bibliography style (based on documentation for books):

Milberger, Sharon. Evaluation of Violence Against Women With Physical Disabilities in Michigan, 2000-2001. ICPSR version. Detroit: Wayne State University, 2002. Distributed by Ann Arbor, MI: Inter-university Consortium for Political and Social Research, 2002. doi:10.3886/ICPSR03414.

Author-Date style:

Milberger, Sharon. 2002. Evaluation of Violence Against Women With Physical Disabilities in Michigan, 2000-2001. ICPSR version. Detroit: Wayne State University. Distributed by Ann Arbor, MI: Inter-university Consortium for Political and Social Research. doi:10.3886/ICPSR03414.
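
The two Chicago examples contain the same elements and differ only in where the year appears. The Python sketch below illustrates that difference. The function name chicago_data_citation and its author_date flag are hypothetical, and the code only reproduces the pattern of the two examples above rather than the full Chicago rules.

```python
def chicago_data_citation(author, year, title, version, producer,
                          distributor, doi, author_date=False):
    """Format a data citation following the two Chicago examples above."""
    if author_date:
        # Author-date style: the year follows the author's name.
        return (f"{author}. {year}. {title}. {version}. {producer}. "
                f"Distributed by {distributor}. doi:{doi}.")
    # Bibliography style: the year follows the producer and the distributor.
    return (f"{author}. {title}. {version}. {producer}, {year}. "
            f"Distributed by {distributor}, {year}. doi:{doi}.")


elements = dict(
    author="Milberger, Sharon",
    year="2002",
    title=("Evaluation of Violence Against Women With Physical "
           "Disabilities in Michigan, 2000-2001"),
    version="ICPSR version",
    producer="Detroit: Wayne State University",
    distributor=("Ann Arbor, MI: Inter-university Consortium for "
                 "Political and Social Research"),
    doi="10.3886/ICPSR03414",
)
print(chicago_data_citation(**elements))
print(chicago_data_citation(**elements, author_date=True))
```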

For Librarians and others interested in further information on data citations

Key articles proposing data citation standards (in chronological order):

Dodd, S. A. (1979). Bibliographic references for numeric social science data files: Suggested guidelines. Journal of the American Society for Information Science, 30(2), 77-82. doi:10.1002/asi.4630300203

Altman, M., & King, G. (2007). A proposed standard for the scholarly citation of quantitative data. D-Lib Magazine, 13(3/4). doi:10.1045/march2007-altman 

Green, T. (2009). We need publishing standards for datasets and data tables. OECD Publishing White Paper. Paris: OECD Publishing. doi:10.1787/603233448430

Starr, J., & Gastl, A. (2011). isCitedBy: A metadata scheme for DataCite. D-Lib Magazine, 17(1/2). doi:10.1045/january2011-starr

DataCite Metadata Working Group. (2011, March). DataCite metadata scheme for the publication and citation of research data (Version 2.1). doi:10.5438/0003

IASSIST Quick Guide to Data Citation

The IASSIST organization of data professionals recommends consulting the ICPSR's Quick Guide to Data Citation for suggestions on the best way to cite data in APA, MLA, and Chicago styles.

International Polar Year Data Citation Guidelines

The International Polar Year 2007-2008 maintained a Data and Information Service program to coordinate and manage the participation of groups from around the world. The group developed guidelines for data users to cite IPY data in order to properly acknowledge the valuable contributions that go into the creation of a data set.

This document, How to Cite a Data Set, explains all of the citation elements to consider for inclusion and formats them in author-date style. As the IPY Data and Information Service program is no longer active, this document is available via the Internet Archive.

Digital Curation Centre Guide