New reference catalogue for the microbiome and its function

To properly determine the effect of a bacterial species on human health, it is important to study the behavior and characteristics of that species. An important tool for characterizing a bacterium is determining the composition of its genome. The gene content of the genome tells us something about the different activities and processes the bacterium is involved in. The first major research project that tried to determine the genomes of bacterial species found in people was the Human Microbiome Project. In this study, hundreds of participants were enrolled, which resulted in the identification of many types of bacteria specific to the human gut. From the genomes that were found for the different bacteria, the genes in each genome were extracted. This resulted in a global overview of all genes, and therefore of the activities, present in the gut microbiota. This overview of the microbial gene content was termed the ‘Integrated Gene Catalogue’. Based on earlier research and imputation, functions could be ascribed to the different genes, which gave the project a perspective on the influence of the bacterial species on processes important for human health.

One important caveat of this overview, however, was that the individual genes could not be traced back to the bacteria from which they were derived. A recent study has tried to overcome this problem by combining the data of many studies on the human microbiome. Using this dataset, the researchers were able to link the genes in the DNA database directly to the relevant bacterial species. This greatly helps in elucidating the specific gene content and activities of each bacterium. In total, the researchers collected approximately 200,000 bacterial genomes, containing over 150 million genes! In comparison, the human genome consists of only roughly 25,000 genes.

It should be mentioned that, due to the nature of the studies that were combined, not all relevant functional information about these (newly identified) bacteria is known. Traditionally, bacteria are isolated and cultured to study their properties. With the method used for this study, namely next-generation DNA sequencing, only the DNA content of each bacterium is determined, without any further information about its morphology, metabolism and other characteristics. Of all the bacteria identified in the combined study set, only 29% have been described according to their culturing characteristics.

Source: Nature Biotechnology (2020)