Next-Generation Sequencing Advances

This year marks a new chapter in the second-generation (next-generation, or NGS) DNA sequencing market, as a new set of NGS systems becomes available. The first generation of DNA sequencers consisted of capillary electrophoresis (CE) systems employing Sanger sequencing. Third-generation, or single-molecule, systems directly sequence DNA molecules. NGS technology, which is characterized by massively parallel sequencing and clonal amplification, has progressed at a rapid pace. The new NGS systems seek to expand the market, capitalize on technology differentiators, and offer new options for both established and newer applications.

To gain an understanding of the current state of NGS technology and applications, IBO spoke with Elaine Mardis, PhD, the codirector and director of Technology Development at the Genome Institute at Washington University. Her current NGS work includes the study of tumor genome evolution and participation in the Human Microbiome Project. Dr. Mardis surmised that the three most popular applications of NGS are targeted capture (targeted resequencing), whole-genome sequencing and RNA-Seq.

One area that has advanced the use of NGS is informatics. Asked about current NGS informatics challenges for primary analysis, Dr. Mardis told IBO that there is a need for faster alignment algorithms. “Another [challenge] is the need to eliminate various sources of ‘noise’ from alignment algorithms.” Alignment using short reads has been improved through the use of paired reads: paired ends for short inserts and mate pairs for long inserts. By producing DNA fragments a known distance apart, these approaches can be used to close gaps in a genome using a reference genome or to assemble small de novo sequences. “A last pressing need is for improved assembly of short read pairs, often in combination with long reads or read pairs (mate pairs), so that better de novo assemblies can be obtained,” she said.
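The logic behind paired reads can be illustrated with a minimal Python sketch. It assumes a hypothetical aligner has already reported the reference start coordinate of each read in a pair, and it only checks whether the observed span matches the expected insert size. The function name, coordinates and tolerance are illustrative assumptions, not part of any tool mentioned here.

```python
def is_concordant(start_1, start_2, read_length, expected_insert, tolerance):
    """Return True if a read pair spans roughly the expected insert size.

    start_1 and start_2 are 0-based reference start positions of the two
    reads (hypothetical aligner output). The span runs from the leftmost
    start to the end of the rightmost read.
    """
    left = min(start_1, start_2)
    right = max(start_1, start_2) + read_length
    observed = right - left
    return abs(observed - expected_insert) <= tolerance


# A short-insert paired end: 100 bp reads from a library with ~300 bp inserts.
print(is_concordant(1000, 1200, 100, 300, 50))
# The same reads mapping 5,000 bp apart hint at a gap or rearrangement.
print(is_concordant(1000, 6000, 100, 300, 50))
```

Pairs that fail a check like this are the ones that carry information about gaps or misassemblies, which is why a known fragment distance is useful.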

Another area that has influenced the development of NGS is sample preparation. Asked about challenges for NGS sample preparation, Dr. Mardis said, “I think that we constantly struggle with size-based separation for different library-construction approaches—namely, how best to isolate the tightest-size fraction possible, ranging from 100 bp to 100,000 bp, in a reasonable time frame and without requiring huge amounts of DNA.” Current systems for size selection include Caliper Life Sciences’ LabChip XT fractionation system and Sage Science’s Pippin Prep gel electrophoresis system.

New sample preparation and informatics strategies, as well as new instrument designs, have been brought together in the introduction of systems designed to make NGS faster, less expensive and easier in order to bring it to new users. This quarter, Illumina will begin shipping MiSeq. The instrument is priced at $125,000 and does not require ancillary sample preparation equipment. Prices per run range from $400 to $750. A single 8-hour run (1.5 hours for library preparation, 4.5 hours for sequencing, and 2 hours for alignment and variant calling) yields 120 Mb of data at read lengths of 1 x 35 bp. More than 1 Gb of data at 2 x 150 bp can be produced in 27 hours. Ease of use is enabled by preloaded reagents, on-board clustering and on-board software (software for instrument control, base calling and secondary analysis).
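As a quick check on the short-run figures quoted above, the arithmetic can be written out in Python. The only assumption added here is that every base comes from a full-length 35 bp read, so the read count is a rough estimate rather than a vendor specification.

```python
# Figures quoted in the article for a single short MiSeq run.
step_hours = {
    "library preparation": 1.5,
    "sequencing": 4.5,
    "alignment and variant calling": 2.0,
}
total_hours = sum(step_hours.values())

yield_bases = 120e6   # 120 Mb
read_length = 35      # 1 x 35 bp

# Implied read count, assuming all bases come from full-length reads.
implied_reads = yield_bases / read_length

print(f"Total run time: {total_hours} hours")
print(f"Implied reads: about {implied_reads / 1e6:.1f} million")
```

The three steps sum to the stated 8 hours, and the quoted yield works out to roughly 3.4 million reads under the stated assumption.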

According to Illumina, targeted (amplicon) sequencing is the application for which there has been the most initial customer interest. “This is a perfect fit with MiSeq as the amount of sequence and price per target are in its wheelhouse,” Illumina told IBO. To increase throughput for this application, this month Illumina introduced the TruSeq Custom Amplicon Assay for multiplexing samples. Illumina said the Assay “vastly simplifies interrogating up to 384 targets per sample and up to 96 samples per run, all of which can be done in one MiSeq run with a DNA-to-data turnaround time of two days.” Other applications for MiSeq include clone checking, small-genome sequencing, ChIP-Seq and RNA-Seq.

Amplicon sequencing is one of the applications for which Illumina hopes the MiSeq will replace CE-based sequencers. “[Amplicon sequencing] is also the main application currently performed on CE systems, albeit on a smaller scale per system.” The time and cost savings make MiSeq an attractive alternative to CE-based systems, and it can perform many CE-based sequencing applications. “The largest application performed on CE systems today is that of sequencing PCR fragments or a short stretch of DNA from a sample of interest. The other applications that also drive sizable share on CE systems are clone checking and small-genome sequencing, all of which can be done on MiSeq.”

As for longer-read applications, Illumina stated that “the technology still has headroom for increasing read lengths.” Also, as Illumina President and CEO Jay Flatley explained in a conference call, other approaches alleviate the need for longer read lengths for many applications. Such approaches rely on paired-end technology, informatics and target enrichment.

MiSeq also employs the same TruSeq library preparation chemistries as Illumina’s other NGS platforms for easier adoption by labs familiar with Illumina technology. “The process for MiSeq is the same as that used for the HiSeq system, except for the fact that it is completed on MiSeq in an hour in a fully automated fashion. This is in contrast to the highly manual and error-prone emulsion PCR process common to Roche and Ion Torrent [Life Technologies],” said Illumina.

Illumina foresees applications for MiSeq in many markets. “It has tremendous potential in several applied markets, such as agriculture, forensics, food and environmental testing, or any market with a need for fast, high-quality results using targeted sequencing,” said the company. Asked about possible clinical applications for MiSeq, Illumina told IBO, “In our review, MiSeq will have the most impact in biomarker discovery and validation, stratification of patients for enrollment in therapeutic cancer drug clinical trials, and research to determine which biomarkers are more predictive for survival.”

Competing with MiSeq to bring faster, cheaper and easier NGS to labs is Life Technologies/Ion Torrent’s Personal Genome Machine (PGM). Released late last year, the PGM is the first system to utilize semiconductor sequencing chips. When a nucleotide is added to each strand of DNA, a hydrogen ion is released, resulting in a change in pH that is measured by sensors, each of which is contained in a well on the chip. The process involves no scanning, cameras or light. The PGM is priced at $49,500.
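The detection principle can be sketched in a few lines of Python. The article does not describe how sensor signals become base calls, so this toy example rests on two assumptions: nucleotides are flowed over the chip one type at a time in a fixed repeating order (here the hypothetical order "TACG"), and each well's normalized signal is roughly proportional to the number of bases incorporated during that flow. It is an illustration only, not Ion Torrent's base caller.

```python
def call_bases(signals, flow_order="TACG"):
    """Convert per-flow signals from one well into a base sequence.

    signals: normalized pH-change signals, one per nucleotide flow
    (assumed to be ~0 for no incorporation, ~1 for one base, ~2 for two).
    flow_order: assumed repeating order in which nucleotides are flowed.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        count = max(0, round(signal))  # nearest whole number of incorporations
        bases.append(nucleotide * count)
    return "".join(bases)


# Eight flows (T, A, C, G, T, A, C, G) with made-up signal values.
example_signals = [1.02, 0.05, 1.9, 0.1, 0.0, 1.1, 0.97, 0.0]
print(call_bases(example_signals))
```

In this made-up example the third flow has a signal near 2, which is read as two consecutive C bases.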

First released was the 314 chip, which contains 1.2 million wells and produces more than 10 Mb of data. This month, Life Technologies released the 316 chip with 6.2 million wells. It can produce 100 Mb in less than two hours. The 314 chip now sells for $99. Later this year, the 318 chip will be released, containing 11.1 million wells for production of more than 1000 Mb of data.
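The chip figures above can be compared with a short calculation. This sketch divides each chip's quoted output by its well count to give an average number of bases per well. That average covers all wells, including any that yield no usable read, so it is not a read length. The 314 and 318 outputs are quoted as lower bounds ("more than").

```python
# Chip name -> (wells in millions, quoted output in Mb), as given in the article.
chips = {
    "314": (1.2, 10),
    "316": (6.2, 100),
    "318": (11.1, 1000),
}


def bases_per_well(wells_millions, output_mb):
    """Mb divided by millions of wells gives average bases per well."""
    return output_mb / wells_millions


for name, (wells, output) in chips.items():
    print(f"{name} chip: {bases_per_well(wells, output):.1f} bases per well")
```

Quoted output grows faster than well count from one chip to the next, so the increase in data is not explained by added wells alone.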

In addition to the reduced hardware costs, reagent costs are also low as the PGM uses natural chemistries. According to Mike Lelivelt, director of Bioinformatics & Software Products at Life Technologies, “Our sample preparation is on the order of about $250 per sample,” as the system uses off-the-shelf reagents. In addition, a simplified instrument design lowers service costs. “The machines are very robust,” he told IBO.

Low cost is just one feature that will drive wider adoption of NGS; speed is another. With the 316 chip, Life Technologies has also updated associated products for a faster workflow. Total library preparation time is now 3.5 hours. The Ion Xpress Fragment Library Kit offers enzymatic-based digestion, reducing library enrichment time to 30 minutes. The automated Ion OneTouch System, which will ship later this quarter, completes template preparation in 3 hours. The new Torrent Suite 1.4 software analyzes Ion 314-chip results in 30 minutes and 316-chip results in 60 minutes. “We’re currently at about 9 hours of complete workflow from genomic DNA to a report off of the Torrent Suite software,” explained Mr. Lelivelt. PGM run time is 40 minutes.

Mr. Lelivelt highlighted the use of the PGM to sequence the E. coli strain behind the spring outbreak in Germany, an example of a small microbial-sequencing application. Another application for the PGM that is attracting interest is amplicon sequencing. The PGM can also be used for RNA-Seq. In May, Life Technologies introduced the Ion Total RNA-Seq Kit for transcript-expression analysis. As for CE-based sequencing, which the company also offers, he stated, “What semiconductor sequencing does is give greater read depth to try to augment complex samples, whereas CE is the gold standard on read length as well as accuracy. So they are really complementary systems.”

The PGM’s scalability is also a key advantage, according to Mr. Lelivelt. Scalability applies to both the exponential increase in the amount of data available from each new chip, as well as read length. “We’ve been able to grow our read length from 36 bases. Now we’re at 120 bases on average. Internally, our maximum read length to date is 250 bases.” He also explained it is a combination of capabilities that makes PGM technology unique. “It’s not just the fact that they are long reads; it’s the fact that we’re able to achieve those long read lengths with such speed.” Life Technologies has announced that read lengths of 400 bp will be available in 2012.

Like Life Technologies and Illumina, Roche 454 Life Sciences has also introduced a new sequencing product that takes advantage of technology differentiation. The GS FLX+ is an upgrade for the GS FLX system. It increases read lengths from 600 bp to 1000 bp. Roche’s GS systems have the longest read lengths among NGS systems. According to Roche, raising the read length to 1000 bp makes the system competitive with CE systems and increases throughput.

Katie Montgomery, marketing communications manager at Roche 454 Life Sciences, told IBO that initial interest for the system has been for standard applications requiring long reads. “We have seen tremendous interest in the GS FLX+ System for whole-genome sequencing and de novo assembly, particularly of large, complex and highly polyploid genomes such as plants and animals.”

Transcriptome analysis is another application for the GS FLX+, according to Ms. Montgomery. “Transcriptome-sequencing projects will also benefit as the long, accurate reads cover more exons and splice junctions and further extend into untranslated regions. This results in improved coverage of transcripts and more accurate reconstruction of gene models.” Another application for Roche NGS technology is metagenomics. The system is also expected to be used as part of hybrid approaches that combine short-read and long-read data.

In addition to the GS FLX+ system, Roche also offers the GS Junior system, a lower-cost, easier-to-use system with a 10-hour run time, which was launched in 2010. Recent publications and product introductions have emphasized the system’s use for clinical applications. Asked about this, Ms. Montgomery stated, “We are focusing on applications where there is a clear advantage over existing approaches and a natural fit for the long-read sequencing technology.” Describing some of these advantages, she said, “The long reads enable many clinical researchers, who are most familiar with Sanger sequencing, to easily make the transition to the 454 high-throughput approach, while improving their ability to discover novel genetic mutations and rearrangements underlying disease.” As both a drug company and molecular diagnostics provider, Roche has experience with such applications.

In March, Roche released the GS GType HLA Primer Sets for genotyping of class I and class II loci of the human leukocyte antigen genes, an application for which PCR and CE-based sequencing are used. The Sets can be used with the GS Junior or the GS FLX. According to Roche, NGS approaches enable higher resolution and unambiguous allele assignment in a single run. Roche also plans to launch assays for oncology. “A product is currently in development to target the TET2, CBL, KRAS and RUNX1 genes, which are important markers for leukemia (acute myeloid leukemia, myelodysplasia and acute lymphocytic leukemia),” said Ms. Montgomery.

Other focuses are infectious disease and virology. “A wide range of studies have shown the power of 454 sequencing to detect rare drug-resistant variants in HIV and hepatitis C virus/hepatitis B virus,” she said. Longer reads can also provide more molecular information for disease management. “This deep-sequencing method enables the characterization of both common viral variants and those undetected by standard population-sequencing approaches.” The company is also working with external partners to develop assays.
