Rapid molecular genetic diagnosis of hypertrophic cardiomyopathy by semiconductor sequencing

Background: Rapidly determining the complex genetic basis of hypertrophic cardiomyopathy (HCM) is vital to better understanding and optimally managing this common polygenic cardiovascular disease. Methods: A rapid custom Ion amplicon resequencing assay covering 30 genes commonly affected in HCM was developed and validated in 120 unrelated patients with HCM to facilitate genetic diagnosis of this disease. With this HCM-specific panel and only 20 ng of input genomic DNA, physicians can, for the first time, go from blood samples to variants within a single day. Results: On average, this approach yielded 595,628 mapped reads per sample, 95.51% of reads on target (64.06 kb), 490-fold base coverage depth and 93.24% uniformity of base coverage in the CDS regions of the 30 HCM genes. After validation, we detected underlying pathogenic variants in 87% (104 of 120) of samples. When seven randomly selected HCM genes were tested in eight samples by Sanger sequencing, the sensitivity and false-positive rate of this HCM panel were 100% and 5%, respectively. Conclusions: This Ion amplicon HCM resequencing assay provides what is currently the most rapid, comprehensive, cost-effective and reliable means of genetic diagnosis of HCM in routinely obtained samples.


Background
Hypertrophic cardiomyopathy (HCM) is regarded as the most common inherited cardiac disorder (prevalence 1/500) and the leading cause of sudden cardiac death in adolescents [1-4]. So far, over 1000 mutations in at least 30 genes have been reported to be responsible for HCM, which implies high genetic heterogeneity and results in varied clinical phenotypes, ranging from asymptomatic forms to sudden cardiac death in the young [3,5-12]. Although disease-causing mutations in MYH7, MYBPC3 and TNNT2 are considered to explain about half of HCM cases [8,9], the frequency of each causal variant is relatively low, and most rare mutations are unique to specific families [13]. Moreover, about 10% of HCM patients harbor more than one mutation and thus suffer from an earlier onset or worse prognosis [2,7]. Therefore, systematic genetic diagnosis for HCM patients is necessary and is recommended by current clinical guidelines [14-17]. For instance, patients identified as being at high risk of sudden death could benefit from an implantable cardioverter-defibrillator in primary prevention [18].
However, conventional Sanger sequencing is too laborious and expensive for routine clinical practice [19]. Advancing high-throughput next-generation sequencing (NGS) technologies have the potential to solve this problem by rapidly interrogating large regions at low cost [1,20-22]. Nevertheless, current NGS platforms have several weaknesses, including sample scalability, sequencing time and cost of entry, which need to be addressed if these technologies are to serve routine clinical genetic diagnosis [23]. With the lowest price, shortest run time, smallest input DNA requirement and flexible sequencing chips and reagents, the recently flourishing semiconductor sequencing technique is notable [21,24].
Our study provides what is, to our knowledge, currently the most rapid, comprehensive, cost-efficient and reliable assay for genetic diagnosis of hypertrophic cardiomyopathy in everyday clinical practice. Implementation of this method will change the diagnosis and understanding of the molecular etiologies of HCM.

HCM resequencing panel design
To select the target genes for the HCM resequencing panel, the literature of the past ten years, including articles on genetic detection techniques, reviews and case reports of HCM, was carefully reviewed. To maximize coverage of the mutation spectrum of this polygenic disorder, we designed what is currently the most comprehensive HCM-specific resequencing panel, including the 30 causal genes most frequently affected in patients with HCM (Table 1). Primers for overlapping amplicons covering the CDS region and flanking sequences of each targeted gene were then designed automatically with the Ion AmpliSeq™ Ready-to-Use custom designer platform, following the guidance on the website (https://www.ampliseq.com/protected/dashboard.action) (primers for semiconductor sequencing are presented in Additional file 1: Table S1). Because ultrahigh-multiplex PCR reactions can be performed in parallel in a single tube, the primers were mixed and supplied (Life Technologies, Carlsbad, California, USA) in two primer pools. Eventually, 97.96% of the targeted

Patients and DNA sample preparation
With approval from the local ethics committee and written informed consent, 120 unrelated Han Chinese patients with HCM, confirmed by echocardiography between 2008 and 2013, were included in this study. Two of the 120 patients were from independent HCM pedigrees and carried the known pathogenic mutations rs121913641 and rs121913637, respectively, at the same locus (p.R719Q, p.R719W) in MYH7. Genomic DNA (gDNA) of each patient was extracted from peripheral leukocytes and treated with RNase, using a DB-S kit (FUJIFILM Corporation, Tokyo, Japan) according to the manufacturer's instructions. The purified gDNA was then checked by electrophoresis to exclude fragment degradation and RNA contamination.

Library preparation and sequencing
Ion Torrent adapter-ligated libraries were built using the Ion AmpliSeq™ Library Kit 2.0 (Life Technologies) following the manufacturer's protocol, which takes about 5 hours. Briefly, 20 ng of gDNA per sample, quantified with a Qubit 2.0 fluorometer (Invitrogen, Carlsbad, CA, USA), was used for multiplex PCR amplification with each of the two primer pools separately. The resulting amplicons from the two primer pools were mixed and then ligated to barcodes and Ion Torrent adapters (Life Technologies). Subsequently, libraries were purified with AMPure XP beads (Beckman Coulter, Brea, CA, USA), subjected to 5 cycles of PCR amplification and purified again, followed by quantification with the Qubit 2.0 fluorometer. To increase efficiency and reduce costs, sixteen uniquely barcoded libraries were combined in equimolar ratios for one 318 chip. Emulsion PCR and enrichment of the sequencing beads for the pooled libraries were then performed using the OneTouch system (Life Technologies) according to the manufacturer's protocol, which takes about 5 hours. Finally, sequencing with 500 flows (125 cycles) was performed on the 318 chip using the Ion PGM 200 Sequencing Kit (Life Technologies) on the Ion Torrent Personal Genome Machine (PGM) (Life Technologies) (Figure 1).
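The equimolar pooling step described above can be illustrated with a minimal sketch. It assumes that each library's Qubit reading (ng/µL) is converted to a molar concentration using an approximate mass of 660 g/mol per base pair of double-stranded DNA. The barcode names, the readings, the mean library length of 200 bp and the target of 10 fmol per library are hypothetical values chosen for illustration; they are not reported in this study.

```python
# Approximate molar mass of one base pair of double-stranded DNA (g/mol).
DSDNA_G_PER_MOL_PER_BP = 660.0


def ng_per_ul_to_nM(conc_ng_per_ul, mean_length_bp):
    """Convert a mass concentration (ng/uL) to a molar concentration (nM)."""
    return conc_ng_per_ul * 1e6 / (DSDNA_G_PER_MOL_PER_BP * mean_length_bp)


def equimolar_volumes(conc_nM_by_barcode, fmol_per_library):
    """Volume (uL) of each library that contributes the same amount in fmol.

    1 nM equals 1 fmol/uL, so volume = fmol / nM.
    """
    return {bc: fmol_per_library / c for bc, c in conc_nM_by_barcode.items()}


# Hypothetical Qubit readings (ng/uL) for three of the sixteen barcoded libraries.
qubit_ng_per_ul = {"BC01": 1.32, "BC02": 0.66, "BC03": 2.64}
MEAN_LIBRARY_LENGTH_BP = 200  # assumed value, not reported in the text

conc_nM = {
    bc: ng_per_ul_to_nM(c, MEAN_LIBRARY_LENGTH_BP)
    for bc, c in qubit_ng_per_ul.items()
}
volumes_ul = equimolar_volumes(conc_nM, fmol_per_library=10.0)

for bc in sorted(volumes_ul):
    print(f"{bc}: {conc_nM[bc]:.1f} nM -> pool {volumes_ul[bc]:.2f} uL")
```

More dilute libraries contribute proportionally larger volumes, so each barcode enters the pool with the same number of molecules and should receive a similar share of the reads on the chip.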

Bioinformatic analysis
Raw data from the 4.5-hour PGM runs were initially processed using the Ion Torrent platform-specific software Torrent Suite v3.6.2 to generate sequence reads, trim adapter sequences, align reads to the hg19 human reference genome, analyze coverage and call variants (for Variant Caller parameter settings, see Additional file 1: Table S2). All variants were then annotated with the online tool Variant Effect Predictor (http://asia.ensembl.org/info/docs/variation/vep/index.html). To predict the possible impact of detected non-synonymous variants in exons, all missense substitutions were scored and in-silico-function-predicted

Study population
One hundred and twenty unrelated patients with HCM were studied (Table S3).

Sequencing output and coverage
Sequencing of the selected regions of the 30 HCM-associated genes on the Ion Torrent PGM achieved an average output of 595,628 mapped reads per sample, with 95.51% on target, in the 120 HCM specimens. In summary, 99.55% of all target amplicons were covered at least once, 96.98% were covered at least 20 times and 91.95% were covered at least 100 times. The mean uniformity of base coverage was 93.24% in this panel. The average read depth in the 64.06 kb target region across the 120 samples was approximately 490-fold (Figure 2). Moreover, after a few trials at the beginning of the experiments, the chip loading rate improved quickly and the polyclonal rate decreased markedly, which resulted in an increase in mean coverage.
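As an illustration of how such coverage summaries can be derived, the following minimal sketch computes the fraction of targets reaching a given depth and a uniformity value from a list of per-target depths. The depth values are hypothetical. The uniformity definition used here (fraction of positions covered at 0.2 times the mean depth or more) is an assumption for illustration, since the text does not state how uniformity was defined.

```python
def fraction_at_least(depths, threshold):
    """Fraction of targets (amplicons or bases) with depth >= threshold."""
    return sum(d >= threshold for d in depths) / len(depths)


def uniformity(depths, fraction_of_mean=0.2):
    """Fraction of targets covered at >= fraction_of_mean x the mean depth.

    The 0.2 cut-off is an assumed convention, not taken from this study.
    """
    mean_depth = sum(depths) / len(depths)
    return fraction_at_least(depths, fraction_of_mean * mean_depth)


# Hypothetical per-amplicon depths for one sample.
example_depths = [0, 10, 25, 150, 500, 600]

for cutoff in (1, 20, 100):
    share = fraction_at_least(example_depths, cutoff)
    print(f">= {cutoff}x: {share:.2%}")
print(f"uniformity: {uniformity(example_depths):.2%}")
```

Applied to the per-amplicon depths of each sample and averaged over the cohort, the same functions would yield summaries of the kind reported above.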

Mutation detection and Sanger sequencing validation
The Ion Torrent platform-specific software Torrent Suite v3.6.2 and the online tool Variant Effect Predictor were used to align the reads to the human reference genome build hg19, call variants and annotate them. The criterion for variant identification was a read coverage higher than 30-fold. Altogether, in the 120 patients, 458 known or novel variants were detected by semiconductor sequencing, with an average of 80 variants per sample. After Sanger sequencing validation, 433 variants were confirmed and 25 were not. Most of the 25 false-positive miscalls were insertions or deletions and were detected in more than one sample. Of the 433 confirmed variants, 345 (80%) were predicted to be noncoding or synonymous, whereas 88 (20%) were non-synonymous, including missense mutations and small insertions/deletions, resulting in amino acid changes (Table 2). Notably, we identified at least one functional variant in 104/120 (87%) HCM patients and more than one functional variant in 12/120 (10%) HCM patients. Furthermore, the two known pathogenic mutations (rs121913641 and rs121913637) in the two probands were successfully identified.

Sensitivity and false-positive-rate evaluation
To further assess the sensitivity and false-positive rate of this HCM panel, direct Sanger sequencing of seven randomly selected genes was performed in eight selected subjects. In these regions, 38 variants were detected by semiconductor sequencing, including 35 known variants and 3 novel variants (Table 3). Compared with the Sanger sequencing results, 2 variants failed validation. Therefore, the sensitivity of this HCM panel was estimated as 100% and the false-positive rate as 5%.
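The arithmetic behind these two figures can be sketched as follows. The calculation assumes that the reported false-positive rate is the proportion of semiconductor calls not confirmed by Sanger sequencing (2 of 38), and that Sanger sequencing found no variant missed by the panel, which is implied by the reported 100% sensitivity but not stated as a count. The function names are illustrative.

```python
def sensitivity(true_positives, false_negatives):
    """Share of Sanger-confirmed variants that the panel also called."""
    return true_positives / (true_positives + false_negatives)


def unconfirmed_call_fraction(true_positives, false_positives):
    """Share of panel calls that Sanger sequencing did not confirm."""
    return false_positives / (true_positives + false_positives)


called_by_panel = 38   # variants called in the seven genes of eight subjects
not_confirmed = 2      # calls that failed Sanger validation
confirmed = called_by_panel - not_confirmed
missed_by_panel = 0    # assumption implied by the reported 100% sensitivity

print(f"sensitivity: {sensitivity(confirmed, missed_by_panel):.0%}")
print(f"false-positive rate: {unconfirmed_call_fraction(confirmed, not_confirmed):.1%}")
```

Under these assumptions, 2 unconfirmed calls out of 38 give about 5.3%, consistent with the rounded value of 5% reported here.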

Discussion
This study provides the first comprehensive HCM-specific semiconductor sequencing assay, with the aim of facilitating the clinical diagnosis and optimal management of HCM patients. Compared with other NGS platforms, semiconductor sequencing has the highest throughput and shortest run time [20,21]. From DNA extraction to data analysis, the 64.06 kb of targeted CDS and flanking regulatory regions of 30 genes can be scanned in parallel in up to sixteen samples within one day, using a single Ion Torrent 318 chip. As described in this article, our workflow leads to a mean coverage of 490X, allowing reliable detection of sequence variants with high accuracy. Overall, we identified 140 novel sequence variants that are not listed in the NCBI dbSNP or 1000 Genomes Project databases. Using bioinformatic prediction with SIFT and PolyPhen-2, we revealed potentially functional mutations in known disease genes in 104 (87%) of the 120 patients with HCM. This detection rate is in the expected range and represents much better performance than previous studies [7,19,25].
To evaluate the capability of our Ion amplicon HCM-specific panel, we assessed several aspects of its performance. By Sanger sequencing, we examined in all patients the 1304 bp of regions in 13 targeted genes that the panel did not technically cover and identified no additional potentially functional variants. In addition, the panel gave satisfactory results, with high sensitivity (100%) and a low false-positive rate (5%), in the subsequent validation tests. Thus, it is reasonable to believe that our panel has sufficient power to detect potentially functional variants in HCM patients. The Ion Torrent PGM is known to be prone to insertion/deletion errors associated with long homopolymers [21]. Indeed, on carefully examining the sequences after validation, we found that this type of error was the most common cause of miscalls with this HCM panel. In addition, Sanger sequencing showed that a heterozygous substitution (c.T136C, p.F46L) in MYH7, called in 58 of 120 subjects, was a miscall. Since this miscall was present in a high proportion of participants and with high coverage, we suspect that it arises from an error during multiplex PCR. Although this HCM panel can generate the false-positive calls described above, subsequent Sanger sequencing verification can easily eliminate them.
Although some other HCM-relevant genes have been reported sporadically, such as TTN, MYPN, CRYAB, MTTL1, RAF1 and FHL1, the connection between them and HCM pathogenesis has not been established [26]. Our panel was designed for clinical genetic diagnosis and therefore includes only causal genes. However, we will continue to monitor these and other candidate genes and will update our panel once they are established as pathogenic.

Conclusions
This study established what is currently the most comprehensive and reliable semiconductor HCM-specific resequencing assay and provides a useful, rapid and cost-effective means of routine clinical genetic diagnosis of HCM. Implementation of this method will significantly improve the routine diagnosis of HCM and change the understanding of the molecular etiologies of this disease.