Mar. 4, 2025 | Blog Post

NMDC EDGE and the NMDC workflows help microbiome researchers around the world process and reuse microbiome data

As the amount of publicly available multi-omics data grows, researchers can reuse it to synthesize new findings from past work at geographic and experimental scales not previously possible. Many researchers want to reuse data, but the microbiome community has pointed to the lack of metadata as a major hindrance to understanding the study context behind their comparisons. On reusing processed data, NMDC Microbiome QA/QC Manager Alicia Clum said, “Some of the major hurdles with data reuse are not having enough information, or metadata, about how the data was processed (i.e., what database versions were used or listing the software used but not the version or parameters), and consistent data processing for samples collected over a long period of time.” These gaps make it hard for researchers to know how to compare datasets, so they often have to reprocess the raw data, which can be time-intensive or require extensive bioinformatics experience.

The National Microbiome Data Collaborative (NMDC) has worked with the Joint Genome Institute and the Environmental Molecular Sciences Laboratory to develop publicly available bioinformatics workflows for microbiome multi-omics data. These workflows enable researchers to process metagenomics, metatranscriptomics, metaproteomics, and metabolomics data in a standardized manner. They are available through GitHub and Docker Hub and can be run on the NMDC EDGE (Empowering the Development of Genomics Expertise) graphical user interface, which is free to use and accessible to all. Recently, the NMDC team published an article about NMDC EDGE to provide more information about the workflows and interface. When asked about the benefits of standardizing workflows, NMDC Co-PI Patrick Chain replied, “Standards more readily allow comparisons across studies. Researchers can generate the same types of standardized outputs, and be more confident in differences or similarities they observe across studies, without having to reprocess all the data from these studies on their own.” Clum added, “NMDC EDGE provides a friendly user interface to run these workflows backed by public repositories.” Users can run the workflows on their own uploaded data or directly pull in datasets available in the NCBI SRA (National Center for Biotechnology Information Sequence Read Archive). Combined with robust, standardized sample metadata, these standardized processed datasets become much more Findable, Accessible, Interoperable, and Reusable (FAIR). This benefits the whole microbiome research community by enabling connections across studies that can lead to new insights.

The NMDC Ambassador program trains early-career researchers in data management and FAIR data practices and shows how the NMDC data ecosystem supports them. These researchers then host events at their institutions and within their research fields. This year, 2024 NMDC Ambassadors Viviana Alban, Lílian Caesar, and Lennel Camuy-Vélez led workshops on NMDC EDGE and the NMDC workflows to support microbiome research initiatives in their home communities. Read on to learn about our Ambassadors and their experiences (answers have been lightly edited for spelling and grammar):

Viviana Alban is a Ph.D. candidate at the University of Washington and has had the privilege of working with research communities in Ecuador to advance microbiome research. She is interested in understanding how environmental exposures, such as contact with animals and their feces, affect the gut microbiome of children living in low- and middle-income countries. Recently, she collaborated with Universidad San Francisco de Quito – Ecuador (USFQ) and its Institute of Microbiology to organize two workshops that introduced graduate students to the tools offered by the National Microbiome Data Collaborative (NMDC), with a special focus on NMDC EDGE workflows.

Lílian Caesar is a postdoctoral researcher in the Department of Biology at Indiana University, working with Dr. Irene Newton on the ecology and evolution of bee microbiomes. She is also part of the National Science Foundation (NSF) Biology Integration Institute on Genomics and Eco-evolution of Multi-Scale Symbioses (GEMS). Before coming to the US, she completed her academic training in Brazil, where she maintains active collaborations on diverse microbiome research projects. She led a virtual workshop for Portuguese speakers with a focus on data management and NMDC EDGE and co-hosted a virtual workshop for the GEMS institute.

Lennel Camuy-Vélez is a Ph.D. candidate in Dr. Samiran Banerjee’s laboratory at North Dakota State University, co-advised by Dr. Kevin Sedivec. He studies the plant-microbe interactions of invasive species and the effects of grassland management practices on soil microbial communities. Born and raised in Puerto Rico, he is passionate about supporting research communities on the island and empowering students as they transition from undergraduate studies to graduate school and microbiome research careers. He led a virtual workshop with undergraduates at Inter American University of Puerto Rico – Metropolitan Campus. Beyond the Puerto Rico community, he also had the opportunity to speak at the ASA-CSSA-SSSA conference, reaching another audience excited to get involved in microbiome science.

What inspired you to teach your community about NMDC EDGE and the NMDC workflows?

Viviana: The inspiration for these workshops came from a gap in bioinformatics training within Ecuadorian graduate programs, particularly in the area of metatranscriptomics. While topics like metagenomics are often covered, metatranscriptomics—the study of all RNA transcripts in a microbial community—rarely makes it into the curriculum. To address this, we developed two complementary sessions as part of a graduate-level genomics course at USFQ, which included students from master’s and doctoral programs. The first session provided an overview of NMDC tools, highlighting their potential for microbiome data analysis, while the second was a hands-on event focused on the NMDC EDGE metatranscriptomics pipeline. For me, sharing NMDC EDGE and its workflows is about more than teaching software—it’s about building capacity in regions where access to advanced tools is often limited. These workshops demonstrated how free, open-source platforms like NMDC EDGE can level the playing field, allowing scientists across many backgrounds to engage in cutting-edge microbiome research. By equipping students with these skills, we’re not just enhancing individual careers; we’re contributing to a stronger global research community.

Lílian: My inspiration to teach Portuguese-speaking researchers about NMDC EDGE and the NMDC workflows comes from my experiences during my Ph.D. and conversations with my collaborators. As a Ph.D. student, I spent several hours installing software and benchmarking pipelines since microbiome research was just emerging in my lab and no one else in my program was working on similar projects. On top of that, my limited English fluency made it difficult to use available tutorials or reach out to program developers, who were often based abroad. Many of my Brazilian collaborators—some of whom have reached out to me specifically for help with microbiome analyses—face similar challenges. These barriers often delay research and limit the advancement of science globally. Furthermore, many labs entering this field still lack the computational infrastructure required for intensive analyses. By proposing this workshop in Portuguese, my goal was to address these challenges by providing accessible computational tools and standardized workflows, empowering non-English-speaking researchers to overcome these hurdles and advance their work.

Lennel: What inspires me to teach my community about NMDC EDGE and the NMDC workflows is their ease of use and accessibility for researchers at all levels. The NMDC platform opens doors to opportunities and resources that might not always be readily available to everyone, making bioinformatics more approachable.

Workshop participants at Alban’s USFQ in-person event. Photo credit: Viviana Alban

How have NMDC EDGE and the NMDC workflows helped your research or your research community?

Viviana: The impact of these workshops was immediate and inspiring. Graduate students, many of whom had little to no prior experience with metatranscriptomics, were thrilled to discover that NMDC EDGE offers free, accessible, and standardized workflows. These tools provide an incredible opportunity for students to integrate multi-omics data into their research without the typical financial and technical barriers. At the end of the workshops, participants expressed their enthusiasm, not only for the practical skills they gained but also for the broader possibilities these tools open up for their research and careers. The enthusiasm and engagement of the students were a testament to the importance of initiatives like this. As these young scientists move forward, I am excited to see how they use these tools to tackle complex questions in microbiome research, drive innovation, and ultimately make meaningful contributions to science and society.

Lílian: The NMDC EDGE and NMDC workflows have been very helpful for my research and the broader research community. Many researchers, especially those new to microbiome analysis, often feel overwhelmed by the number of programs available for tasks like assembly or taxonomic assignment. Through the workshop, I introduced researchers to key technical terms and demonstrated NMDC EDGE as a user-friendly interface with standardized pipelines. The platform allows users to progress from read quality assessment to functional analysis with just a few clicks, significantly accelerating their research. Participants were particularly excited about the platform’s accessibility, ease of use, and extensive support, including written and video tutorials. This accessibility has helped reduce barriers and streamline microbiome analyses for researchers, enabling them to move their projects forward more efficiently.

Lennel: My involvement with NMDC has enhanced my research and improved my ability to communicate science effectively. It has equipped me with tools to support colleagues and students in conducting standardized, reproducible analyses. In the future, I plan to fully integrate NMDC EDGE into my projects, mainly to implement standardized pipelines for reproducible data analysis. This will be especially valuable for researchers and students who are just beginning to explore the world of bioinformatics.

 

LA-UR-24-33296
