Next-generation sequencing (NGS)
Next-generation sequencing (NGS), also known as massively parallel sequencing (MPS), is a high-throughput method that enables rapid sequencing of the base pairs in DNA or RNA molecules. Unlike traditional Sanger sequencing, NGS utilizes simultaneous sequencing of millions of DNA molecules, enabling a comprehensive analysis of the genome in a much shorter time. Currently, NGS technology is widely used by both scientific and medical communities to uncover genetic information with unprecedented speed and accuracy, leading to advancements in research, diagnostics, and personalized medicine.
NGS began to enter clinical practice in the late 2000s and early 2010s. The first clinical applications appeared around 2008, with adoption and validation increasing in subsequent years. By 2013, NGS had gained significant traction in clinical diagnostics, particularly for hereditary disease and oncology. Advances in technology, falling sequencing costs, and the development of robust bioinformatics tools facilitated its integration into routine clinical practice. As a result, NGS is now an accessible and reliable tool for genetic testing.
In the context of hereditary diseases, NGS enables the identification of germline mutations across a wide range of genes, allowing screening for life-threatening diseases such as cystic fibrosis, muscular dystrophy, and various inherited cancers shortly after birth. By sequencing genes known to be associated with specific disorders, clinicians can pinpoint the genetic variants responsible for a patient’s condition, leading to more accurate diagnoses and personalized treatment plans, which may significantly improve prognosis or even prevent the disease from developing.
In oncology, by contrast, NGS is used to detect somatic mutations, that is, genetic changes acquired in a cell’s DNA during life that can lead to tumor formation. The technology can provide a detailed molecular profile of a cancer, enabling the identification of targetable mutations and the development of tailored therapies. For example, NGS can identify mutations in the EGFR gene in lung cancer, guiding the use of specific inhibitors that target these alterations. Moreover, NGS is instrumental in monitoring minimal residual disease and tracking the evolution of cancer, thus improving prognosis and treatment strategies.
NGS also plays a crucial role in the detection and monitoring of infectious diseases. By providing complete, high-resolution data on the genetic material of pathogens, NGS enables the rapid identification of bacteria, viruses, fungi, and other microorganisms present in clinical and environmental samples. This technology allows for the detection of novel and emerging pathogens, antimicrobial resistance genes, and virulence factors, enhancing our understanding of pathogen biology and epidemiology. NGS can be particularly valuable in outbreak situations, where it facilitates real-time tracking of pathogen transmission and evolution. By sequencing the genomes of pathogens from different patients or environmental sources, NGS can map the spread of an infection and identify sources of contamination, which helps in implementing effective infection control measures and mitigating the impact of outbreaks.
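The idea of comparing pathogen genomes from different sources can be illustrated with a minimal Python sketch. It assumes the genomes have already been assembled and aligned to equal length; the sample names, the toy sequences, and the helper names `snp_distance` and `pairwise_distances` are hypothetical and chosen only for illustration. Real outbreak analyses use full genomes and dedicated phylogenetic tools.

```python
from itertools import combinations

def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences differ, ignoring unknown bases (N)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        a != b and a != "N" and b != "N"
        for a, b in zip(seq_a, seq_b)
    )

def pairwise_distances(genomes):
    """Return the SNP distance for every pair of named genomes."""
    return {
        (p, q): snp_distance(genomes[p], genomes[q])
        for p, q in combinations(sorted(genomes), 2)
    }

# Toy aligned consensus sequences (hypothetical data, far shorter than real genomes).
genomes = {
    "patient_A": "ACGTACGTAC",
    "patient_B": "ACGTACGTAT",
    "patient_C": "ACGAACGTAT",
    "sink_swab": "TCGAACGGAT",
}

for pair, distance in pairwise_distances(genomes).items():
    print(pair, distance)
```

Isolates separated by few differences are more likely to belong to the same transmission chain, whereas a large distance argues against a direct link. Any threshold for calling two isolates related depends on the pathogen and is not implied by this sketch.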
Short-read sequencing (SR-Seq)
Short-read sequencing is an NGS method for analyzing DNA molecules of various origins. It involves the simultaneous reading of millions to billions of short DNA fragments, typically ranging from 50 to 300 base pairs. Before sequencing itself, a DNA library must be prepared. Library preparation typically starts by fragmenting the original DNA molecule into smaller pieces. Next, molecular adapters are attached to the ends of these fragments in a process that usually employs the polymerase chain reaction (PCR) to amplify the library at the same time. The DNA library is then loaded onto a sequencing chip, where the fragments bind to the solid surface so that each of them can be sequenced. The most widespread technology for short-read DNA sequencing is sequencing by synthesis (SBS), which is used in Illumina’s instruments. SBS adds fluorescently labeled nucleotides to the individual DNA fragments on the chip one cycle at a time, while a camera detects their incorporation. By capturing the optical signal in each sequencing cycle, the nucleotide sequence of each fragment can be resolved, and the resulting reads are aligned to a reference to reconstruct the sequence of the original DNA molecule. Subsequently, sophisticated bioinformatics tools are used to analyze the data and pinpoint genetic variants.
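The last steps of this workflow (placing reads on a reference and identifying variants) can be illustrated with a toy Python sketch. It is a deliberate simplification under stated assumptions: reads are error-free, placement is done by brute-force mismatch counting rather than an indexed aligner, and only single-base substitutions are considered. The sequences and the function names `make_reads`, `best_position`, and `call_variants` are hypothetical.

```python
from collections import Counter

def make_reads(sample, read_len, step):
    """Cut a sample sequence into overlapping fixed-length reads (a toy 'library')."""
    return [sample[i:i + read_len]
            for i in range(0, len(sample) - read_len + 1, step)]

def best_position(read, reference, max_mismatches=1):
    """Return the reference offset with the fewest mismatches, or None if too many."""
    best, best_mm = None, max_mismatches + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mm = sum(a != b for a, b in zip(read, window))
        if mm < best_mm:
            best, best_mm = pos, mm
    return best

def call_variants(reads, reference, min_depth=2):
    """Pile up placed reads and report positions whose majority base differs."""
    pileup = [Counter() for _ in reference]
    for read in reads:
        pos = best_position(read, reference)
        if pos is None:
            continue  # read could not be placed within the mismatch limit
        for offset, base in enumerate(read):
            pileup[pos + offset][base] += 1
    variants = []
    for i, counts in enumerate(pileup):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too few reads to make a call
        base, _ = counts.most_common(1)[0]
        if base != reference[i]:
            variants.append((i, reference[i], base, depth))
    return variants

# Hypothetical 30-base reference and a sample carrying one substitution at index 14.
reference = "ACGTTGCAAGGCTTACCGATAGGCTAACGT"
sample = reference[:14] + "G" + reference[15:]

reads = make_reads(sample, read_len=10, step=2)
print(call_variants(reads, reference))
```

With these inputs the sketch should report a single A-to-G change at index 14, supported by the five reads that overlap it. Production pipelines additionally model base quality, sequencing errors, and insertions or deletions, none of which are handled here.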
Compared with classical methods such as Sanger sequencing, SR-Seq offers several advantages. One of the most significant benefits of this method (and of virtually any NGS method) is its high throughput. While Sanger sequencing reads only one short DNA fragment at a time, SR-Seq can sequence billions of fragments simultaneously, generating massive amounts of data in a much shorter time. This makes NGS particularly suitable for large-scale genomic projects and population studies. Another advantage of NGS is its cost-effectiveness. The per-base cost of sequencing with NGS is significantly lower than that of Sanger sequencing, making it accessible for a wide range of research and clinical applications. This cost efficiency, combined with the high throughput, allows researchers to undertake more extensive and comprehensive studies. Because each position is typically covered by many independent reads, NGS also achieves high accuracy and low error rates in the final result, so the data are reliable enough for detailed genetic analysis and for diagnostics. Additionally, SR-Seq is highly versatile: it supports whole-genome sequencing, targeted sequencing, RNA sequencing, microbiome sequencing, and other applications, which allows researchers to apply NGS to a wide range of scientific questions. In clinical diagnostics, targeted sequencing allows specific genes known to be associated with disease development to be interrogated in multiple patients at the same time.
Long-read sequencing (LR-Seq)
Long-read sequencing represents a new generation of NGS-based approaches and technologies designed for detailed, comprehensive, and robust DNA analysis. LR-Seq introduces several innovations to the DNA sequencing process, including real-time sequencing of individual DNA molecules, faster and more streamlined sample preparation, and new bioinformatic solutions for data interpretation. Standard NGS methods, regardless of platform (Illumina, Ion Torrent, MGI), rely on sequencing short DNA fragments (typically up to 300 base pairs), which traditionally allow the detection of only a limited range of genetic variant types, such as single nucleotide polymorphisms (SNPs), small insertions or deletions (indels), and occasionally copy number variants (CNVs). Although MPS techniques based on short reads are currently considered the gold standard in genomic analysis, recent scientific discoveries and challenges have highlighted certain limitations, particularly in the detection of structural genetic aberrations and repetitive sequences within the human genome.
Genomic analysis using LR-Seq technology is based on the isolation of ultra-high molecular weight DNA (uHMW DNA), which is required as the source material. This allows for the sequencing of long DNA fragments (typically tens of kilobases), providing an unprecedented amount of genomic information within a single experiment. In addition to information on the presence of SNPs, indels, and CNVs, LR-Seq offers the ability to detect structural changes in DNA, to phase haplotypes (that is, to determine whether variants were inherited from the mother or the father), and to obtain information on native DNA methylation, an epigenetic modification that influences gene expression. Furthermore, it enables the analysis of previously challenging structural regions of the genome, such as centromeres, telomeres, acrocentric chromosome ends, and areas of trinucleotide repeats.
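Haplotype phasing works because a single long read can span several heterozygous sites, so the alleles observed together on one read must originate from the same parental chromosome. The Python sketch below illustrates this with a simple greedy majority vote between neighbouring sites. It is an assumption-laden illustration rather than the algorithm of any particular tool: the positions, alleles, reads, and the function name `phase` are hypothetical.

```python
def phase(reads, sites):
    """Greedily phase heterozygous sites using reads that span neighbouring sites.

    reads: list of dicts mapping site position -> allele seen on that read
    sites: dict mapping position -> (allele_a, allele_b) for heterozygous sites
    Returns two dicts (hap1, hap2), each mapping position -> allele.
    """
    ordered = sorted(sites)
    first = ordered[0]
    hap1 = {first: sites[first][0]}
    hap2 = {first: sites[first][1]}
    for prev, cur in zip(ordered, ordered[1:]):
        a, b = sites[cur]
        cis = trans = 0  # votes: does allele `a` travel with hap1 or with hap2?
        for read in reads:
            if prev not in read or cur not in read:
                continue  # read does not link the two sites
            if read[prev] not in sites[prev] or read[cur] not in (a, b):
                continue  # unexpected allele, e.g. a sequencing error
            on_hap1 = read[prev] == hap1[prev]
            if (read[cur] == a) == on_hap1:
                cis += 1
            else:
                trans += 1
        if cis == trans:
            raise ValueError(f"cannot link sites {prev} and {cur}")
        hap1[cur], hap2[cur] = (a, b) if cis > trans else (b, a)
    return hap1, hap2

# Hypothetical heterozygous sites several kilobases apart, and long reads covering them.
sites = {100: ("A", "G"), 5200: ("C", "T"), 18400: ("G", "T")}
reads = [
    {100: "A", 5200: "T"},
    {100: "G", 5200: "C"},
    {5200: "T", 18400: "G"},
    {5200: "C", 18400: "T"},
    {100: "A", 5200: "T", 18400: "G"},
]

hap1, hap2 = phase(reads, sites)
print(hap1)
print(hap2)
```

The sites in this example lie thousands of bases apart, so a read of a few hundred base pairs could not cover two of them at once. Determining which haplotype is maternal and which is paternal requires additional information, such as parental genotypes, which this sketch does not model.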
Beyond its added value in genomics, LR-Seq also brings a revolution to the field of transcriptomics. Due to its capability to analyze RNA transcripts in their full length, LR-Seq allows for the differentiation of various products from the same gene resulting from alternative splicing, as well as more detailed identification of fusion genes and precise characterization of exon-intron boundaries. This leads to a better understanding of gene expression and its regulation.
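Because a full-length read covers a transcript from end to end, its splice isoform can be read directly from the exons it contains. A minimal Python sketch of this idea follows. It assumes each read has already been aligned to the gene and is given as a list of aligned blocks; the exon coordinates, read data, and the names `EXONS`, `exon_chain`, and `count_isoforms` are hypothetical.

```python
from collections import Counter

# Hypothetical exon annotation of one gene: name -> (start, end), half-open coordinates.
EXONS = {"E1": (100, 200), "E2": (300, 380), "E3": (500, 650), "E4": (800, 900)}

def exon_chain(blocks, exons=EXONS):
    """Translate a read's aligned blocks into the ordered tuple of exons they overlap."""
    chain = []
    for b_start, b_end in blocks:
        for name, (e_start, e_end) in exons.items():
            if b_start < e_end and e_start < b_end:  # intervals overlap
                chain.append(name)
    return tuple(chain)

def count_isoforms(reads):
    """Count how many full-length reads support each exon combination."""
    return Counter(exon_chain(blocks) for blocks in reads)

# Each read is a list of aligned blocks (toy data).
reads = [
    [(100, 200), (300, 380), (500, 650), (800, 900)],
    [(100, 200), (500, 650), (800, 900)],
    [(102, 200), (300, 380), (500, 650), (800, 898)],
    [(100, 200), (500, 650), (800, 900)],
    [(100, 200), (300, 380), (800, 900)],
]

for isoform, n in count_isoforms(reads).items():
    print("-".join(isoform), n)
```

With short reads, the same information would have to be inferred indirectly from fragments that each cover at most one or two exon junctions, which is why combinations of distant exons are hard to resolve.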