Data Standards

Supporting standards throughout the data lifecycle

What are metadata?

Metadata are contextual information about your experimental data, and are critical for the sharing and reuse of scientific data.

Get an introduction to metadata→

What is FAIR for microbiome research?

Making microbiome data Findable, Accessible, Interoperable, and Reusable (FAIR) promotes data stewardship.

Learn how we are making data FAIR→

What is the data lifecycle for microbiome research?

Data management best practices span a project from start to end. Standards play a critical role in supporting the data lifecycle, and well-established standards like the Genomic Standards Consortium’s (GSC) Minimum Information about any (x) Sequence (MIxS) reporting standard and the Environment Ontology (EnvO) are key to microbiome data use and reuse. The NMDC aims to support the data lifecycle by providing tools and resources for following best practices and using standards. The NMDC Metadata Documentation provides further details on how we are working across standards organizations and with the research community.

NMDC schema

The NMDC schema models data using the Linked Data Modeling Language (LinkML). The schema weaves together several community standards, such as the Minimum Information about any (x) Sequence (MIxS) standard from the Genomic Standards Consortium (GSC).
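As a rough illustration of what a machine-readable schema makes possible, the Python sketch below checks a sample record for a handful of required fields. It is not the NMDC schema or the LinkML tooling: the `REQUIRED_SLOTS` list, the `missing_slots` helper, and the example record are hypothetical, and only the three `env_*` names are borrowed from MIxS terms.

```python
# Hypothetical list of required slots. The three env_* names follow MIxS
# terms; the choice of which slots are required here is an assumption.
REQUIRED_SLOTS = ("id", "env_broad_scale", "env_local_scale", "env_medium")


def missing_slots(record: dict) -> list:
    """Return the required slot names that are absent or empty in `record`."""
    return [slot for slot in REQUIRED_SLOTS if not record.get(slot)]


# Illustrative record with placeholder values, not real NMDC data.
example_record = {
    "id": "example:sample-1",
    "env_broad_scale": "placeholder broad-scale term",
    "env_medium": "placeholder medium term",
}

print(missing_slots(example_record))  # ['env_local_scale']
```

A real schema would also constrain value types and controlled vocabularies; this sketch only covers presence checks.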

Persistent identifiers (PIDs)

The NMDC persistent identifier service (NMDC API) supports links across studies, samples, and workflow runs. For NMDC data, we provide stable, unique, and long-term identifiers to support interoperability, attribution, and linking across resources.
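To show how stable identifiers can support linking, here is a minimal Python sketch that splits an identifier into a namespace prefix and a local part. It assumes identifiers are written in a compact `prefix:local` form; that format, the `CompactId` type, the helper names, and the example values are illustrative assumptions, not a description of the NMDC service.

```python
from typing import NamedTuple


class CompactId(NamedTuple):
    """A prefix:local identifier split into its two parts (hypothetical type)."""
    prefix: str
    local_id: str


def parse_compact_id(identifier: str) -> CompactId:
    """Split 'prefix:local' at the first colon, rejecting malformed input."""
    prefix, sep, local_id = identifier.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"not a prefix:local identifier: {identifier!r}")
    return CompactId(prefix, local_id)


def same_namespace(first: str, second: str) -> bool:
    """Return True when both identifiers share the same prefix."""
    return parse_compact_id(first).prefix == parse_compact_id(second).prefix


# Made-up identifiers, used only to exercise the helpers.
print(parse_compact_id("example:study-1").local_id)           # study-1
print(same_namespace("example:study-1", "example:sample-9"))  # True
```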

Technical solutions for standards and distributed data infrastructure

The NMDC leverages distributed data infrastructure and linked data technologies to support community standards and data sharing across existing platforms. By using a robust yet flexible data schema, we can translate reporting standards into machine-actionable solutions that help researchers access, share, and reuse data.

Advancing community standards

We work closely with the GSC’s Compliance and Interoperability Working Group and Technical Working Group. The latest release of the GSC standard (MIxS) and its associated documentation (docs) are managed using LinkML, which makes the metadata machine-actionable and makes conversion between formats for different tools straightforward.

Interoperability

The NMDC links across complementary data platforms including JGI’s Integrated Microbial Genomes and Microbiomes (IMG/M) platform and Genomes Online Database (GOLD), NCBI, MassIVE, ESS-DIVE, KBase, and the NEON Data Portal. This effort supports interoperability across the microbiome data ecosystem.

Programmatic access

Programmatic access to all NMDC study and biosample data is available through the NMDC API. The Joint Genome Institute’s (JGI) Integrated Microbial Genomes and Microbiomes (IMG/M) platform uses the API to share data. Documentation on accessing the API is available to the research community.
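A request to a REST-style API of this kind might look like the Python sketch below, which uses only the standard library. The base URL, the `biosamples` resource path, and the `per_page` parameter are assumptions made for illustration; consult the API documentation for the actual endpoints and parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed base URL; confirm against the API documentation before use.
BASE_URL = "https://api.microbiomedata.org"


def build_url(base_url: str, resource: str, **params) -> str:
    """Join a base URL and resource path, appending any query parameters."""
    url = f"{base_url.rstrip('/')}/{resource.lstrip('/')}"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


def fetch_json(url: str, timeout: float = 30.0):
    """Send a GET request and decode the JSON response body."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical usage (requires network access; path and parameter are assumed):
#   data = fetch_json(build_url(BASE_URL, "biosamples", per_page=5))
print(build_url("https://example.org/", "biosamples", per_page=5))
# https://example.org/biosamples?per_page=5
```

Keeping URL construction separate from the network call makes the request logic easy to check without contacting the server.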

FAIR evaluation

Across our software tools and data, we adhere to the FAIR data principles by using the FAIRness evaluation framework and creating FAIRness Maturity Indicators (MIs) and Compliance Tests for microbiome data.
