Thursday, 29 December 2016


Written by Emilio J. Laserna Mendieta | Marcus J. Claesson, Posted in Volumen 4

Many metagenomic studies aim to determine the bacterial composition of a specific type of sample. This is usually done by next-generation sequencing of libraries created by PCR amplification of the prokaryotic 16S ribosomal RNA (rRNA) gene. This gene is 1542 base pairs long and consists of nine variable regions interspersed with more conserved regions. Amplifying one or two of these variable regions is a common approach for the phylogenetic identification of the hundreds of species present in the human microbiota [1].
Figure 1. Summary of the approach followed to generate 16S rRNA gene libraries. The amplified region comprised variable regions V3 and V4 and the conserved region between them. A first PCR was performed with V3-V4-specific primers that carry adaptors for Nextera XT indexes. In a second PCR, a different combination of indexes for each sample and the P5/P7 sequences (which hybridise with the flow cell adaptors in the MiSeq system) were introduced. After normalising the quantity of each sample included in the final pool, the libraries were ready to be sequenced on the MiSeq system.
One of the main sources of variation in this kind of study, apart from the DNA extraction method and the sequencing platform, is the choice of primers used for 16S rRNA gene amplification [2]. Klindworth and co-authors [3] evaluated this issue in silico and identified a pair of degenerate primers (that is, primers with two or more possible nucleotide bases at several positions) for the V3-V4 regions as the most promising. As a result, these primers have been widely used in later studies (almost 300 citations by the end of September 2016).
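To make the idea of a degenerate primer concrete, the following minimal Python sketch expands an IUPAC-coded sequence into every concrete sequence it represents. The function names and the example sequence `ACNGW` are hypothetical and chosen only for illustration; it is not one of the primers discussed in this article.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer):
    """Return the number of distinct sequences a degenerate primer encodes."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical example sequence (NOT one of the article's primers).
example = "ACNGW"
print(degeneracy(example))  # 1 * 1 * 4 * 1 * 2 = 8
print(expand(example))
```

Each degenerate position multiplies the number of primer variants in the mix, which is why a primer with only a few ambiguous positions can still cover many more target sequences than a non-degenerate one.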

In our laboratory, we have been analysing microbiota composition across multiple studies in hundreds of human stool and biopsy samples. DNA was extracted from those samples and used in two consecutive PCR amplifications (Figure 2). In this way, following the protocol summarised in Figure 1, we created libraries containing 16S rRNA gene amplicons from several samples. This approach was based on the method recommended by Illumina [4], with some modifications.

Figure 2. Left: PCR result for 16S rRNA gene amplification in six DNA samples from human faeces. Clear, specific bands are observed in the 500-600 base pair size range. Right: PCR result after adding the Nextera XT indexes; the amplicons are now larger than 600 base pairs. In both cases, samples were run on a 2% agarose gel stained with Midori Green (Nippon Genetics).

The first step was a PCR with V3-V4 primers to which an adaptor sequence for Nextera XT indexes (Illumina) had been added. The complete sequences of these primers were (IUPAC nomenclature):


This PCR was performed using a high-fidelity DNA polymerase (in our case, Phusion High-Fidelity DNA Polymerase, purchased from Thermo Scientific). PCR conditions were: 98°C for 30 s, followed by 25 cycles of 98°C for 10 s, 55°C for 15 s and 72°C for 20 s, with a final extension step at 72°C for 5 min. PCR products were purified using magnetic beads (Agencourt AMPure XP beads, purchased from Beckman Coulter). Then, 5 μL of the purified DNA was used in the second PCR, which employed the Nextera XT index primers. The program for this second PCR was the same as for the 16S rRNA PCR, except that the number of amplification cycles was reduced from 25 to 8. After this PCR, the 16S amplicons of each sample carried a unique combination of indexes. After a second purification of the PCR products, the DNA was quantified using high-sensitivity methods for double-stranded DNA, such as Quant-iT PicoGreen or Qubit (both from Thermo Scientific). Finally, we pooled all samples, which should all be at the same final concentration, by mixing them in a single tube. This pool was then diluted to a concentration suitable for sequencing on the MiSeq system (Illumina).
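The arithmetic behind the quantification and pooling step can be sketched as follows. This is only an illustrative calculation, not the exact procedure used in our laboratory: it assumes the widely used approximation of 660 g/mol per base pair of double-stranded DNA, and the function names, sample names, concentrations, amplicon length (630 bp) and target amount per sample are all hypothetical.

```python
# Common approximation: average mass of one base pair of double-stranded DNA.
AVG_BP_MASS = 660.0  # g/mol per bp (assumption, not stated in the article)


def ng_per_ul_to_nM(conc_ng_per_ul, amplicon_bp):
    """Convert a dsDNA concentration from ng/µL to nM for a given amplicon length."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * amplicon_bp)


def pooling_volumes(concs_ng_per_ul, amplicon_bp, fmol_per_sample):
    """Return the volume (µL) of each library that contains the same molar amount.

    1 nM equals 1 fmol/µL, so volume = target amount (fmol) / concentration (nM).
    """
    volumes = {}
    for sample, conc in concs_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"No measurable DNA in sample {sample!r}")
        volumes[sample] = fmol_per_sample / ng_per_ul_to_nM(conc, amplicon_bp)
    return volumes


# Hypothetical fluorometric readings (ng/µL) for three indexed libraries.
readings = {"S1": 12.4, "S2": 8.1, "S3": 20.3}

# Hypothetical indexed amplicon length (bp) and target amount per sample (fmol).
for sample, vol in pooling_volumes(readings, 630, 20.0).items():
    print(f"{sample}: {vol:.2f} µL")
```

Working in molar rather than mass units matters because the sequencer responds to the number of molecules loaded; two libraries with the same ng/µL reading but different amplicon lengths would otherwise contribute unequal numbers of reads.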

Sequencing results must then be processed with appropriately selected bioinformatic pipelines to obtain the bacterial composition of each sample. This methodology was successfully used in a study from our laboratory comparing the stool microbiota of people on long-term proton pump inhibitor treatment with that of control individuals. The analysis of 16S rRNA gene amplicons revealed changes in the colonic microbiota, such as in the Firmicutes/Bacteroides ratio, associated with the long-term prescription of these drugs [5].
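As a simple illustration of how a summary measure such as the Firmicutes/Bacteroides ratio might be derived once reads have been assigned to taxa, consider the sketch below. It assumes a per-sample table of read counts is already available from whichever pipeline was used; the function name and all counts are hypothetical and are not results from the study.

```python
def taxon_ratio(counts, numerator, denominator):
    """Return the ratio of read counts between two taxa, or None if undefined.

    `counts` maps taxon names to the number of reads assigned to them.
    """
    num = counts.get(numerator, 0)
    den = counts.get(denominator, 0)
    if den == 0:
        return None  # ratio is undefined when the denominator taxon is absent
    return num / den


# Hypothetical read counts for two samples (illustrative values only).
samples = {
    "control_01": {"Firmicutes": 5200, "Bacteroides": 3900, "Other": 900},
    "treated_01": {"Firmicutes": 6100, "Bacteroides": 2500, "Other": 1400},
}

for name, counts in samples.items():
    ratio = taxon_ratio(counts, "Firmicutes", "Bacteroides")
    print(name, ratio)
```

Because the ratio uses counts from the same sample in both numerator and denominator, it does not depend on the total number of reads obtained for that sample.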

