Standardized Bioinformatics Workflows
The NMDC supports standardized workflows for processing microbiome omics data
Microbiome datasets are often processed with different tools and pipelines, which makes reuse and cross-study comparison difficult. To address these challenges, the NMDC integrates production-quality, open-source bioinformatics tools into accessible, standardized workflows for processing omics data (e.g., metagenome, metatranscriptome, metaproteome, and metabolome data) to produce interoperable and reusable annotated data products. These workflows further the NMDC’s commitment to the FAIR data principles.
The NMDC workflows include bioinformatics tools developed by the Joint Genome Institute (JGI) and the Environmental Molecular Sciences Laboratory (EMSL), among others. The workflows run in production at JGI and EMSL process thousands of datasets annually and have been extensively benchmarked to ensure that they generate high-quality data.
The NMDC workflows are publicly available through GitHub and DockerHub as standalone, containerized workflows, so any institution or individual can obtain, install, and run them in their own environment.
Documentation
The NMDC documentation provides additional information on each workflow, including its standardized parameters, associated databases, versions, and tools.
Feedback
We highly value user input and suggestions, and our team works to incorporate as much user feedback as possible. Beta testing opportunities are announced through the NMDC newsletter, “The Microbiome Standard”; on the NMDC User Research webpage; and through communication channels such as the NMDC Community Slack. To participate, sign up here and the team will reach out when a beta testing round opens.