Connecting over data and ideas
Building a FAIR microbiome data sharing network, through infrastructure, data standards, and community building, to address pressing challenges in environmental science.
In 2016, the White House Office of Science and Technology Policy (OSTP), in collaboration with federal agencies and private-sector stakeholders, launched the National Microbiome Initiative focused on three main priorities: supporting interdisciplinary research, developing platform technologies, and expanding the microbiome workforce. This spurred a call to action for microbiome data science, led by Nikos Kyrpides, Natalia Ivanova, and Emiley Eloe-Fadrosh. A small workshop convened in early 2017 at the Department of Energy’s Joint Genome Institute focused on developing a vision for microbiome data science to address gaps in existing infrastructure.
To build on the collaborative vision for a microbiome data infrastructure outlined at the workshop, an open-invitation town hall was organized at the 2017 ASM Microbe conference to gauge support and solicit input from the microbiome research community. Stakeholders from all facets of microbiome science filled the standing-room-only meeting and signaled a collective eagerness for concerted data infrastructure solutions. In November 2017, a stakeholder workshop hosted by the American Society for Microbiology brought together representatives from academia, industry, government, and philanthropic funding agencies to conceptualize the National Microbiome Data Collaborative (NMDC). These efforts serve as the foundation of a community-driven national effort aimed at developing standards, processes, and infrastructure for an integrated microbiome data ecosystem.
Following these program development activities, the FY19 Energy and Water Appropriations Bill included $10 million to “begin establishment of a national microbiome database.” In January 2019, Berkeley Lab was formally tasked by the DOE Office of Biological and Environmental Research to develop a pilot NMDC as a non-competed program. The NMDC was initiated in July 2019 as a pilot project led by Berkeley Lab, in partnership with Los Alamos National Laboratory (LANL), Pacific Northwest National Laboratory (PNNL) and Oak Ridge National Laboratory (ORNL).
The NMDC pilot focused on developing core capabilities and resources centered on standards and metadata; bioinformatic workflows; data integration and access; and community engagement. During the pilot phase, we also developed strategic partnerships with DOE’s Environmental Systems Science Data Infrastructure for a Virtual Ecosystem (ESS-DIVE), DOE’s Systems Biology Knowledgebase (KBase), and NSF’s National Ecological Observatory Network (NEON). To advance the NMDC beyond the pilot phase, we were invited to prepare a 3-year plan in 2022 that drew on lessons learned and set out an ambitious framework for collaborative, interdisciplinary data infrastructure to support integrative science.
We use proven approaches and new innovations in distributed data infrastructure and linked data technologies to support new ways for researchers to ask questions about how genes, individual microbes, and communities are associated with environmental processes. Engagement and user research throughout the development lifecycle ensure that we provide resources that are essential to advancing microbiome research. The NMDC aims to close existing gaps in environmental microbiome research and to broadly support access to data, information, and knowledge through NMDC products driven by community needs.
Enabling inclusive and interdisciplinary environmental microbiome science by connecting data, people, and ideas.
The work conducted by the National Microbiome Data Collaborative (https://ror.org/05cwx3318) is supported by the Genomic Science Program in the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research (BER) under contract numbers DE-AC02-05CH11231 (LBNL), 89233218CNA000001 (LANL), and DE-AC05-76RL01830 (PNNL).
How to cite the NMDC:
Eloe-Fadrosh EA, Ahmed F, Anubhav, Babinski M, Baumes J, Borkum M, Bramer L, Canon S, Christianson DS, Corilo YE, Davenport KW, Davis B, Drake M, Duncan WD, Flynn MC, Hays D, Hu B, Huntemann M, Kelliher J, Lebedeva S, Li PE, Lipton M, Lo CC, Martin S, Millard D, Miller K, Miller MA, Piehowski P, Jackson EP, Purvine S, Reddy TBK, Richardson R, Rudolph M, Sarrafan S, Shakya M, Smith M, Stratton K, Sundaramurthi JC, Vangay P, Winston D, Wood-Charlson EM, Xu Y, Chain PSG, McCue LA, Mans D, Mungall CJ, Mouncey NJ, Fagnan K. The National Microbiome Data Collaborative Data Portal: an integrated multi-omics microbiome data resource. Nucleic Acids Res. 2022 January 7;50(D1):D828–D836. doi: 10.1093/nar/gkab990.
Kelliher JM, Rudolph M, Vangay P, Abbas A, Borton MA, Davenport ER, Davenport KW, Erazo NG, Herman C, Karstens L, Kocurek D, Lutz HL, Myers KS, Ockert I, Rodriguez FE, Santistevan C, Saunders JK, Smith ML, Vogtmann E, Windsor A, Wood-Charlson EM, Woodley L, Eloe-Fadrosh EA. Cohort-based learning for microbiome research community standards. Nature Microbiology. 2023 April 17. doi:10.1038/s41564-023-
E. M. Wood-Charlson, Anubhav, D. Auberry, H. Blanco, M. I. Borkum, Y. E. Corilo, K. W. Davenport, S. Deshpande, R. Devarakonda, M. Drake, W. D. Duncan, M. C. Flynn, D. Hays, B. Hu, M. Huntemann, P.-E. Li, M. Lipton, C.-C. Lo, D. Millard, K. Miller, P. D. Piehowski, S. Purvine, T. B. K. Reddy, M. Shakya, J. C. Sundaramurthi, P. Vangay, Y. Wei, B. E. Wilson, S. Canon, P. S. G. Chain, K. Fagnan, S. Martin, L. A. McCue, C. J. Mungall, N. J. Mouncey, M. E. Maxon, E. A. Eloe-Fadrosh, The National Microbiome Data Collaborative: enabling microbiome science. Nature Reviews Microbiology 18, 313-314 (2020). doi: 10.1038/s41579-020-0377-0