Quantitative PCR provides a simple and accessible method for quantitative microbiota profiling

Introduction

The use of relative abundance data from next generation sequencing (NGS) can lead to misinterpretations of microbial community structures, because in compositional data an increase in one taxon produces a concurrent decrease in the others. Since the changes of components are mutually dependent, high false discovery rates occur when compositional data are analyzed using traditional statistical methods. Although different DNA- and cell-based methods as well as statistical approaches have been developed to overcome the compositionality problem, and the biological relevance of absolute bacterial abundances has been demonstrated, human microbiome research has not yet adopted these methods, likely due to feasibility issues. Here, we describe how quantitative PCR (qPCR) done in parallel to NGS library preparation provides an accurate estimation of absolute taxon abundances from NGS data and hence provides an attainable solution to compositionality in high-throughput microbiome analyses. The advantages and potential challenges of the method are also discussed.
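The compositional effect can be illustrated with a small numeric sketch. The taxon names and counts below are made-up values chosen for illustration only; they are not data from the study.

```python
# Hypothetical absolute abundances for two samples (illustrative values only).
# Only taxon_A changes between the samples; taxon_B and taxon_C stay constant.
sample_1 = {"taxon_A": 100, "taxon_B": 100, "taxon_C": 200}
sample_2 = {"taxon_A": 500, "taxon_B": 100, "taxon_C": 200}


def relative_abundance(counts):
    """Convert absolute counts to fractions that sum to 1."""
    total = sum(counts.values())
    return {taxon: value / total for taxon, value in counts.items()}


rel_1 = relative_abundance(sample_1)
rel_2 = relative_abundance(sample_2)

# Print the relative abundance of each taxon in sample 1 and sample 2.
for taxon in sample_1:
    print(f"{taxon}: {rel_1[taxon]:.3f} -> {rel_2[taxon]:.3f}")
```

In this sketch the relative abundances of taxon_B and taxon_C are halved even though their absolute abundances are unchanged, which is the kind of apparent decrease that relative data alone cannot distinguish from a real one.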

Method

1-Bacterial DNA extraction

Bacterial DNA will be extracted from fecal samples using a modified version of repeated bead beating that efficiently extracts bacterial DNA from both Gram-positive and -negative bacteria.

2-16S rRNA gene sequencing

3-Sequencing data processing and analysis

The preprocessing will be done in the R package mare, utilizing USEARCH for quality filtering, chimera removal, and taxonomic annotation. Only the high-quality forward reads should be used.

4-Quantitative PCR

Quantification of total bacteria, specific taxa and butyrate production capacity should be carried out by qPCR.

5-Calculation of absolute abundance and copy-number correction

The sequencing reads assigned to each taxon in a sample will be divided by the total number of reads for that sample to obtain the relative abundance of each taxon. These relative abundances will be translated into absolute abundances by multiplying the relative abundance of each taxon by the total bacterial abundance in the sample, as measured by qPCR. The resulting figures will be further corrected for 16S rRNA gene copy-number variation by dividing the abundance of a taxon by the number of 16S copies in its genome. For the copy-number correction, the 16S copy number database rrnDB can be used.
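The calculation can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the taxon labels, the read counts, the qPCR total and the copy numbers are all hypothetical placeholders, and real copy numbers would need to be looked up in rrnDB.

```python
def absolute_abundances(read_counts, total_bacterial_abundance, copy_numbers):
    """Estimate copy-number-corrected absolute abundance per taxon.

    read_counts               -- reads assigned to each taxon in one sample
    total_bacterial_abundance -- total bacterial abundance from qPCR
                                 (e.g. 16S copies per gram of sample)
    copy_numbers              -- 16S rRNA gene copies per genome, per taxon
    """
    total_reads = sum(read_counts.values())
    if total_reads == 0:
        raise ValueError("sample has no reads")

    corrected = {}
    for taxon, reads in read_counts.items():
        relative = reads / total_reads                    # step 1: relative abundance
        absolute = relative * total_bacterial_abundance   # step 2: scale by qPCR total
        corrected[taxon] = absolute / copy_numbers[taxon]  # step 3: copy-number correction
    return corrected


# Hypothetical inputs for one sample (placeholders, not measured values).
reads = {"taxon_A": 6000, "taxon_B": 3000, "taxon_C": 1000}
copies_per_genome = {"taxon_A": 6.0, "taxon_B": 3.0, "taxon_C": 4.0}
qpcr_total = 1e10  # assumed total 16S copies per gram from qPCR

estimates = absolute_abundances(reads, qpcr_total, copies_per_genome)
for taxon, value in estimates.items():
    print(f"{taxon}: {value:.3e}")
```

In this example taxon_A has twice as many reads as taxon_B, yet both end up with the same estimated abundance after correction, because taxon_A is assumed to carry twice as many 16S copies per genome.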

Conclusion

Importantly, qPCR-based quantitative microbiome profiling enjoys the following conceptual and practical benefits over other approaches:

1-Cost-effectiveness and feasibility: qPCR is cost-effective and accessible, as the laboratory settings, machinery and reagents are similar to those needed for preparing the NGS libraries. The same DNA extract serves as the starting material for both qPCR and NGS, making qPCR done in 96- or 384-well format easy to implement in the workflow for high-throughput analysis of up to thousands of microbiome samples.

2-Simplicity: qPCR is relatively simple to perform compared with flow cytometry, which requires considerable expertise for reproducible results. In fact, flow cytometric enumeration of microbial cells was initially restricted to pure cultures and remains challenging when performed in complex matrices [32]. Also, no spikes, other exogenous controls, or complicated transformation/computation are needed in qPCR-based quantitative microbiome profiling.

3-Comparability to NGS: Unlike flow cytometry, which counts cells, qPCR and NGS both target bacterial DNA, including extracellular DNA derived from lysed bacteria. Extracellular DNA can be intrinsic or result from the differential lysis of Gram-positive and Gram-negative bacteria during the common freeze-thawing prior to fecal DNA extraction. As the 16S profiles from the gut appear very different for intracellular and extracellular DNA [33], qPCR is expected to reflect the NGS-targeted community structure both quantitatively and qualitatively more closely than flow cytometry.

4-Applicability: qPCR-based quantitative microbiome profiling is also applicable to samples containing a substantial amount of host or non-bacterial DNA, in which bacterial density cannot be reliably estimated by total DNA yield [5]. Moreover, the qPCR-based method can also be employed to study non-bacterial communities for which a universal marker gene is available, such as fungi.

References

Knight R, Vrbanac A, Taylor BC, Aksenov A, Callewaert C, Debelius J, et al. Best practices for analysing microbiomes. Nature reviews Microbiology. 2018; 16(7):410–22. Epub 2018/05/26. https://doi.org/10.1038/s41579-018-0029-9 PMID: 29795328.

Morton JT, Marotz C, Washburne A, Silverman J, Zaramela LS, Edlund A, et al. Establishing microbial composition measurement standards with reference frames. Nat Commun. 2019; 10(1):2719. Epub 2019/06/22. https://doi.org/10.1038/s41467-019-10656-5 PMID: 31222023; PubMed Central PMCID: PMC6586903.

Props R, Kerckhof FM, Rubbens P, De Vrieze J, Hernandez Sanabria E, Waegeman W, et al. Absolute quantification of microbial taxon abundances. The ISME journal. 2017; 11(2):584–7. Epub 2016/09/10. https://doi.org/10.1038/ismej.2016.117 PMID: 27612291; PubMed Central PMCID: PMC5270559.

Jian C, Luukkonen P, Yki-Järvinen H, Salonen A, Korpela K. Quantitative PCR provides a simple and accessible method for quantitative microbiota profiling. PLoS One. 2020 Jan 15; 15(1):e0227285.

What is Epigenomics?

Epigenomics

Epigenomics is the study of the complete set of epigenetic modifications on the genetic material of a cell, known as the epigenome. The field is analogous to genomics and proteomics, which are the study of the genome and proteome of a cell. Plant flavones are said to inhibit epigenomic marks that cause cancers. Epigenomics refers to identifying modifications of DNA or DNA-associated proteins. These include DNA methylation and histone modifications. Apart from genetic changes, cell fate and function can be altered by modifications of DNA and histones. These changes can be induced by the environment and passed on to progeny. Epigenetic changes in the genome can also act as markers for metabolic syndromes, cardiovascular diseases, and physiological disorders. These changes can be cell- and tissue-specific. Thus, it is critical to identify the epigenetic changes during native and diseased states. Next generation sequencing is also used to assess DNA modifications.

What is the difference between epigenetics and epigenomics?

Epigenetics focuses on processes that regulate how and when certain genes are turned on and turned off, while epigenomics pertains to analysis of epigenetic changes across many genes in a cell or entire organism.

What is Epigenomic profiling?

Epigenomics involves the profiling and analysis of epigenetic marks across the genome. These processes modify local genome activity without changing the underlying DNA sequences and thus determine cellular phenotypes by regulating gene expression dynamics.

What technology is essential for Epigenomics?

Epigenomics has only become possible in recent years because of the advent of various sequencing tools and technologies, such as DNA microarrays, cheap whole-genome resequencing, and databases for studying entire genomes.

What is Whole genome sequencing (WGS)?

Whole genome sequencing (WGS) is the most global approach to identifying genetic variations. Whole genome sequencing, or full genome sequencing, is the process of determining the entirety of the DNA sequence of an organism’s genome at a single time. This entails sequencing all of an organism’s chromosomal DNA as well as DNA contained in the mitochondria and, for plants, in the chloroplast. Genomic information has been instrumental in identifying inherited disorders, characterizing the mutations that drive cancer progression, and tracking disease outbreaks. Whole genome sequencing has largely been used as a research tool. In the future of personalized medicine, whole genome sequence data may be an important tool to guide therapeutic intervention.

Advantages of Whole-Genome Sequencing

  • Provides a high-resolution, base-by-base view of the genome
  • Captures both large and small variants that might be missed with targeted approaches
  • Identifies potential causative variants for further follow-up studies of gene expression and regulation mechanisms
  • Delivers large volumes of data in a short amount of time to support assembly of novel genomes

Next generation sequencing

WGS analysis is made feasible by next generation sequencing (NGS) technologies, which require substantial computational and biomedical resources to acquire and analyze large and complex sequence data. Meanwhile, the rapid progress and innovation of NGS technology has enabled the generation of large volumes of sequence data and reduced the expense of WGS. While WGS is commonly associated with sequencing human genomes, the scalable, flexible nature of NGS technology makes it equally useful for sequencing any species, such as agriculturally important livestock, plants, or disease-related microbes.

Current challenges and best-practice protocols for microbiome analysis

Analyzing the microbiome of diverse species and environments using next-generation sequencing techniques has significantly enhanced our understanding of the metabolic, physiological and ecological roles of environmental microorganisms. However, the analysis of the microbiome is affected by experimental conditions (e.g. sequencing errors and genomic repeats) and by computationally intensive and cumbersome downstream analysis (e.g. quality control, assembly, binning and statistical analyses). Moreover, the introduction of new sequencing technologies and protocols has led to a flood of new methodologies, which also have an immediate effect on the results of the analyses. The aim of this work is to review the most important workflows for 16S rRNA sequencing and shotgun and long-read metagenomics, as well as to provide best-practice protocols on experimental design, sample processing, sequencing, assembly, binning, annotation and visualization. To simplify and standardize the computational analysis, we provide a set of best-practice workflows for 16S rRNA and metagenomic sequencing data (available at https://github.com/grimmlab/MicrobiomeBestPracticeReview).

The methods for gut microbiota analysis

Both target gene and metagenomic sequencing approaches are key to deciphering the plethora of roles played by environmental microorganisms. However, both sequencing and computational methods still suffer from many biases that are due to errors in sample handling, experimental errors and downstream bioinformatics analysis. Thus, improvements in sequencing technologies and the development of new computational tools and algorithms should always be based on prior knowledge, e.g. known caveats at each sample processing step. Factors that potentially influence preprocessing, as well as downstream analysis of both short-read and long-read data including sample preparation, sequencing, binning, assembly and functional annotations, should be catalogued precisely.
Herein, we have attempted to list challenges and best-practice protocols utilized during microbiome acquisition using 16S rRNA and metagenomic sequencing. This is important due to the large and expanding range of computational tools that have been developed in recent years for analyzing long- and short-read sequencing data. Here, we provide a workflow of optimally tested tools available for processing sequencing samples, estimating microbial abundances, and classification, assembly and functional annotations. In addition, we also discuss the experimental challenges with a systematic review of the steps involved in 16S rRNA and shotgun metagenomics.
The experimental challenges mainly relate to factors responsible for contamination in isolated microbial genomes and the resulting variations in microbial profiles. Although these factors have gradually been improved, extensive and multilayered sequencing data remain prone to errors at various levels. Hence, we believe that utilization and awareness of the integrated methods described here will not only help to improve the reliability of sequencing outcomes but also reduce variability in the data generation and processing steps.

Key words

microbiome; amplicon sequencing; 16S rRNA sequencing; metagenomics

Reference

Bharti R, Grimm DG. Current challenges and best-practice protocols for microbiome analysis. Briefings in bioinformatics. 2021 Jan;22(1):178-93. doi: 10.1093/bib/bbz155

Guidelines for quality control of NGS techniques

Next-generation sequencing (NGS) refers to large-scale, fast and efficient DNA (and RNA) sequencing technology. NGS is changing the paradigm in precision medicine and continues to fuel innovation.

Data-Driven NGS Quality Control Guidelines

The data-driven guidelines detailed below for the quality control of next-generation sequencing (NGS) files were generated by comparing 47 quality features on 2000+ FastQ files manually labelled for quality. Guidelines have been generated for specific organisms and assays, and also for experimental conditions related to particular cell types or ChIP protein and antibody targets. Focus is given to the understanding of individual quality features and their effective combination.
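To make the idea of a per-file quality feature concrete, the sketch below computes one simple, commonly used measure: the mean Phred quality score of each read in a FASTQ file. It is an illustration only and is not claimed to be one of the 47 features used in the guidelines. The function name is hypothetical, and the code assumes four-line FASTQ records with Phred+33 quality encoding.

```python
import io


def mean_read_qualities(handle, offset=33):
    """Yield (read_id, mean Phred score) for each 4-line FASTQ record.

    Assumes Phred+33 encoding (offset=33) and unwrapped records.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file (or a blank line) stops the iteration
        handle.readline()  # sequence line, not needed for this feature
        handle.readline()  # '+' separator line
        quality = handle.readline().rstrip("\n")
        scores = [ord(ch) - offset for ch in quality]
        mean_score = sum(scores) / len(scores) if scores else 0.0
        yield header[1:], mean_score  # drop the leading '@' from the read ID


# Two made-up reads: 'I' encodes Phred 40 and '!' encodes Phred 0.
example = "@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!II\n"
results = list(mean_read_qualities(io.StringIO(example)))
for read_id, score in results:
    print(read_id, score)
```

The same function accepts any open text file handle, so it could be pointed at a real FASTQ file instead of the in-memory example.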

Guideline Documents

  • Scientific publication (link to article): detailed motivations, methods, results and discussion.
  • Decision trees (link to PDF): methods summary and decision trees to classify NGS files by quality in a selection of data subsets
  • Interactive tables (below on this page): compare classification performance of the quality features in a selection of data subsets
  • Online interactive dashboard (external web site): compare values of the quality features in user-defined data subsets, including statistical test results on selected subsets

Reference

Sprang M, Krüger M, Andrade-Navarro MA, Fontaine JF. Statistical guidelines for quality control of next-generation sequencing techniques. Life science alliance. 2021 Nov 1; 4(11).