Data Use Policy


This work is supported by the Genomic Science Program in the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research (BER) under contract numbers DE-AC02-05CH11231 (LBNL), 89233218CNA000001 (LANL), and DE-AC05-76RL01830 (PNNL).

How to cite the NMDC:

E. M. Wood-Charlson, Anubhav, D. Auberry, H. Blanco, M. I. Borkum, Y. E. Corilo, K. W. Davenport, S. Deshpande, R. Devarakonda, M. Drake, W. D. Duncan, M. C. Flynn, D. Hays, B. Hu, M. Huntemann, P.-E. Li, M. Lipton, C.-C. Lo, D. Millard, K. Miller, P. D. Piehowski, S. Purvine, T. B. K. Reddy, M. Shakya, J. C. Sundaramurthi, P. Vangay, Y. Wei, B. E. Wilson, S. Canon, P. S. G. Chain, K. Fagnan, S. Martin, L. A. McCue, C. J. Mungall, N. J. Mouncey, M. E. Maxon, E. A. Eloe-Fadrosh, The National Microbiome Data Collaborative: enabling microbiome science. Nature Reviews Microbiology 18, 313-314 (2020). doi: 10.1038/s41579-020-0377-0

The data accessible on the National Microbiome Data Collaborative (NMDC) science gateway are contributed by NMDC Collaborators. The NMDC’s Data Policy is that all data and data contributors should be properly acknowledged in alignment with the FAIR (findable, accessible, interoperable, and reusable) data principles (Wilkinson, 2016) and in accordance with the Creative Commons Attribution 4.0 International License.

Part A: NMDC Data Citations

Every data set and project in the NMDC is attributed to an NMDC Collaborator who chose to openly share their data with the global microbiome community. Proper citation of data sets discovered and accessed through the NMDC both supports reproducibility and provides important recognition of the NMDC data contributors through quantitative metrics on data reuse and impact of each contribution. To read more on data citations, please review the ESIP Data Citation Guidelines for Earth Science Data.

Citation elements include:

  • Author: Person, people, or organization(s) responsible for the intellectual work to develop a data set. 
  • Public Release Date: When the source data were first made available for use.
  • Title: Formal title of the data set. It is recommended that version information be independent of the title. Note this is the title of the data set, not the project or a related publication. It is important for the data set to have an identity and title of its own. 
  • Version ID: Careful versioning and documentation of version changes are central to enabling accurate citation. Data stewards need to track and clearly indicate precise versions as part of the citation for any version greater than 1. It may be appropriate to track major and minor versions. 
  • Repository: Name of the entity that holds, archives, publishes, prints, distributes, releases, issues, or produces the data. If citing processed data, this should also include a reference to the repository that holds the raw, source data. 
  • Resolvable Persistent Identifier(s): Unique identifier that provides the ability to access the data. If citing processed data, this should also include a reference to the identifier for the raw, source data. Not all data have Persistent Identifiers (PIDs) or can be digitally accessed, so an alternative method to access metadata, such as a URL or a physical address, can be provided instead.
  • Access Date: Data can be dynamic and changeable in ways that are not always reflected in release dates and versions, so it is important to indicate when data were accessed.

Citing a single NMDC data set or project

A data citation captures the core concepts necessary to provide attribution and provenance. When citing a single data set or project accessed through the NMDC, please follow the format(s) below:

Data set – source:
Wrighton, K. (2014) Deep subsurface shale carbon reservoir microbial communities from Ohio, USA – Utica-2 Time Series 2014_10_10 [Data set]. DOE Joint Genome Institute (PID). Accessed: 5 November 2020.

Data set – processed:
Wrighton, K. (2014) Deep subsurface shale carbon reservoir microbial communities from Ohio, USA – Utica-2 Time Series 2014_10_10 [Data set]. v2.0. DOE Joint Genome Institute (PID); NMDC (PID). Accessed: 5 November 2020.

Project:
Wrighton, K. (2014). Microbial controls on biogeochemical cycling in deep subsurface shale carbon reservoirs [Data set]. DOE Joint Genome Institute ( Accessed: 5 November 2020.
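
For contributors who manage many citations, the citation elements and example formats above can be assembled programmatically. The following is a minimal sketch only, not an NMDC tool: the `DataCitation` class, its field names, and the `render` method are hypothetical, and the literal string "PID" stands in for a real persistent identifier, as in the examples above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class DataCitation:
    """Hypothetical container for the citation elements listed in Part A."""
    author: str                          # person, people, or organization(s)
    release_year: int                    # year of public release
    title: str                           # title of the data set itself
    repositories: List[Tuple[str, str]]  # (repository name, identifier), raw source first
    access_date: str                     # date the data were accessed
    version: Optional[str] = None        # version ID; omit for unversioned source data

    def render(self) -> str:
        """Return the citation as one line, following the example formats."""
        parts = [f"{self.author} ({self.release_year}) {self.title} [Data set]."]
        if self.version:
            parts.append(f"v{self.version}.")
        # Processed data cite both the source repository and the processing one.
        repos = "; ".join(f"{name} ({pid})" for name, pid in self.repositories)
        parts.append(f"{repos}.")
        parts.append(f"Accessed: {self.access_date}.")
        return " ".join(parts)


# Reproduces the "Data set – processed" example; "PID" is a placeholder.
citation = DataCitation(
    author="Wrighton, K.",
    release_year=2014,
    title=("Deep subsurface shale carbon reservoir microbial communities "
           "from Ohio, USA – Utica-2 Time Series 2014_10_10"),
    repositories=[("DOE Joint Genome Institute", "PID"), ("NMDC", "PID")],
    access_date="5 November 2020",
    version="2.0",
)
print(citation.render())
```

Keeping the repositories as an ordered list lets the same structure cover both cases: one entry for source data, and two entries (raw source first) for processed data.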

Citing more than one NMDC data set or project

The NMDC science gateway enables the discovery and recombination of diverse data sets across projects. Each unique combination of data can be captured for citation and publication, enabling downstream attribution and reproducibility. The NMDC is able to issue a Collection DOI, through the U.S. Department of Energy Office of Scientific and Technical Information (OSTI), that appropriately cites each data set and/or project represented in the new analysis/study. The collection DOI refers to an NMDC landing page that can be provided as part of a data availability statement for a journal publication or report. 

Example citation:
Jones, J. (2020) Title [Data set]. NMDC (PID).

Part B: NMDC Data Contributions

The NMDC supports open science and FAIR data principles. By contributing to the NMDC, your data become discoverable and accessible to the broader microbiome community. Data are processed through the standardized NMDC workflows with predefined parameters, so that samples collected at different times and across diverse projects are analyzed in exactly the same way, making results comparable across studies.

During the NMDC pilot, direct contributions of data have been restricted to enable rapid, iterative feature-driven design, focused on metagenome, metatranscriptome, metaproteome, and metabolomic data. All data in the NMDC are associated with curated sample metadata. The NMDC has identified required metadata, based on the Genomic Standards Consortium MIxS packages. Finally, data in the NMDC adhere to the Creative Commons Attribution 4.0 License for data usage rights.

If you are interested in working with the NMDC and believe you have data that meet these requirements, please contact the NMDC team (

Part C: UC Berkeley Lab Data Contributor License Agreement

In order to clarify the intellectual property rights granted with respect to any contributions from any person or entity (“Contribution”), The Regents of the University of California, Department of Energy contract-operators of the Ernest Orlando Lawrence Berkeley National Laboratory (“Berkeley Lab”) must have a Contributor License Agreement (the “Agreement”) agreed to by each contributor. The license granted hereunder is for your protection as a contributor as well as for the protection of Berkeley Lab; it does not change your rights to use your own Contributions for any other purpose.

Either individuals or businesses, governmental or non-profit entities, including without limitation, all employees or agents acting on behalf of any such entity (an Entity), may submit Contributions to the National Microbiome Data Collaborative (NMDC) data project (the Data Project) under this Agreement. By clicking “I Agree”, you indicate that you are entering this Agreement on behalf of an Entity, you represent that you have the authority to bind such Entity to this Agreement, in which case, the terms “You” and “Your” shall refer to such Entity, as further defined below.

Please read this document carefully before agreeing to it, and print a copy for your records if required.

You accept and agree to the following terms and conditions for (i) all contributions that You may have previously submitted to the Data Project, unless otherwise governed by a written license agreement, and (ii) Your present and future Contributions submitted to the Data Project. Except for the licenses granted herein You reserve all right, title, and interest in and to Your Contributions.

1. Definitions.

“You” (or “Your”) shall mean the holder of intellectual property rights in the Contributions or the Entity authorized by such holder of intellectual property rights to enter into this Agreement with Berkeley Lab. For Entities, the Entity making a Contribution and all other Entities that control, are controlled by, or are under common control with that Entity are considered to be a single Contributor. For the purposes of this definition, “control” means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, (ii) the power to appoint a majority of the board of directors or other similar governing body of such entity, (iii) ownership of fifty percent (50%) or more of the outstanding voting shares of, or other voting equity interests in, such entity, or (iv) any other beneficial majority ownership of such entity.

“Contribution” shall mean any data or other content including any modifications or additions to any existing data or content (including metadata), that is or has been intentionally submitted by You to Berkeley Lab for inclusion in, or documentation of, Berkeley Lab. For the purposes of this definition, “submitted” means any form of electronic, verbal, or written communication sent to Berkeley Lab or its representatives, for the purpose of adding to, modifying, and/or improving the Work other than a communication that is conspicuously designated in writing by You as “Not a Contribution.” Data Project and Contributions are collectively referred to herein as the “Work”.

2. Grant of License.

You hereby grant, to Berkeley Lab and its agents, under the terms of the Creative Commons Attribution 4.0 International License, all rights necessary to copy, store, redistribute, and share Your data, metadata, and any other content of Your Work, with the public.

You assert that Your Work is not subject to current export control laws and does not contain personally identifiable information, and Berkeley Lab hereby agrees to share Your Work with the public subject to the terms of the Creative Commons Attribution 4.0 International License.

Your Work is subject to the terms of the Creative Commons Attribution 4.0 International License indefinitely.

3. Representations and Warranties.

If You are entering into this Agreement as an individual, You represent and warrant that You are legally entitled to grant the above license. If Your employer(s) has intellectual property rights in the Contributions, You represent and warrant that You have received permission to make Contributions on behalf of that employer, that Your employer has waived such rights for Your Contributions to Berkeley Lab, or that Your employer has executed a separate contribution agreement with Berkeley Lab.

You represent and warrant that (i) You will observe all applicable United States and foreign laws and regulations (if any) with respect to the export, re-export, diversion or transfer of any software, any data, or both, including, without limitation, the International Traffic in Arms Regulations (ITAR) and the Export Administration Regulations, and (ii) You will not transfer to Berkeley Lab any personally identifiable information or any information that is export controlled other than information classified as EAR99 under the Export Administration Regulations without prior written authorization from Berkeley Lab.

If You are entering into this Agreement on behalf of an Entity, You represent and warrant that You are legally entitled to grant the above license. You further represent and warrant that each employee of the Entity that submits Contributions is authorized to submit such Contributions on behalf of the Entity.

Except as warranted above, You provide Your Contributions on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE.

4. Covenants

You agree to promptly notify Berkeley Lab of any facts or circumstances of which you become aware that would make any representations herein inaccurate in any respect.

5. Miscellaneous

This Agreement shall be governed by and construed in accordance with the laws of the State of California, without regard to the conflict of laws provisions thereof.

Any provision of this Agreement that is determined to be unenforceable or unlawful shall not affect the remainder of the Agreement and shall be severable therefrom, and the unenforceable or unlawful provision shall be limited or eliminated to the minimum extent necessary so that this Agreement shall otherwise remain in full force and effect and enforceable.

This Agreement constitutes the entire agreement between the parties and supersedes any and all prior agreements between them, whether written or oral, with respect to the subject matter hereof.

This Agreement may be terminated by either party at any time for any reason, upon thirty (30) days prior written notice. Excepting Section 2 of this Agreement all other terms and conditions shall survive any such termination.

This Agreement may not be amended or modified, nor any provision hereof waived, except in a writing signed by the parties hereto.

No waiver by either party, whether express or implied, of any provision of this Agreement, or of any breach thereof, shall constitute a continuing waiver of such provision or a breach or waiver of any other provision of this Agreement.
