Unmasking Retinitis Pigmentosa complex cases by a whole genome sequencing algorithm based on open-access tools: hidden recessive inheritance and potential oligogenic variants
Journal of Translational Medicine volume 18, Article number: 73 (2020)
Retinitis Pigmentosa (RP) is a clinically and genetically heterogeneous disorder that results in inherited blindness. Despite the large number of genes identified, only ~ 60% of cases receive a genetic diagnosis using targeted-sequencing. The aim of this study was to design a whole genome sequencing (WGS) based approach to increase the diagnostic yield of complex Retinitis Pigmentosa cases.
WGS was conducted in three family members, belonging to one large apparent autosomal dominant RP family that remained unsolved by previous studies, using Illumina TruSeq library preparation kit and Illumina HiSeq X platform. Variant annotation, filtering and prioritization were performed using a number of open-access tools and public databases. Sanger sequencing of candidate variants was conducted in the extended family members.
We have developed and optimized an algorithm, based on the combination of different open-access tools, for variant prioritization of WGS data which allowed us to reduce significantly the number of likely causative variants pending to be manually assessed and segregated. Following this algorithm, four heterozygous variants in one autosomal recessive gene (USH2A) were identified, segregating in pairs in the affected members. Additionally, two pathogenic alleles in ADGRV1 and PDZD7 could be contributing to the phenotype in one patient.
The optimization of a diagnostic algorithm for WGS data analysis, accompanied by a hypothesis-free approach, has allowed us to unmask the genetic cause of the disease in one large RP family, as well as to reassign its inheritance pattern, which implies differences in the clinical management of these cases. These results add to the number of cases with apparently dominant inheritance that carry causal mutations in recessive genes, and support the possible involvement of several genes in the pathogenesis of RP in one patient. Moreover, our WGS-analysis approach, based on open-access tools, can easily be implemented by other researchers and clinicians to improve the diagnostic yield of additional patients with inherited retinal dystrophies.
Retinitis Pigmentosa (RP, ORPHA:791) is the most common form of inherited retinal dystrophy (IRD), affecting 1 in 4000 individuals worldwide. RP is characterized by the primary death of rods, which typically manifests as progressive night blindness followed by visual field constriction. As the disease progresses, cone dysfunction also occurs, leading to decreased visual acuity and central vision loss. RP shows marked phenotypic variability: age of onset and disease progression can vary from patient to patient, even within the same family (inter- and intra-familial variability) [3, 4]. Moreover, RP is one of the most genetically heterogeneous disorders, as mutations in 88 genes have been associated with it so far. RP can be inherited as an autosomal dominant (adRP), autosomal recessive (arRP) or X-linked (XLRP) trait. However, in a large percentage of cases, the mode of inheritance is unknown due to the absence of additional affected members (simplex RP, sRP) [6, 7]. In other cases, the mode of inheritance can be inaccurately assumed due to pseudo-dominance of certain XLRP variants [8, 9], or the presence of more than one genetic cause in the same family [10,11,12].
In this scenario, receiving a genetic diagnosis becomes increasingly important to confirm the clinical diagnosis, to provide genetic and reproductive counseling for proper clinical management of patients, and because gene therapies are being developed for some retinal dystrophies. The methods and tools available for genetic diagnosis have evolved dramatically during the last two decades. Nowadays, different next-generation sequencing (NGS) approaches are commonly used for the genetic diagnosis of IRD, most of them based on targeted sequencing of a variable number of disease-associated genes [16,17,18,19]. The overall diagnostic yield of targeted sequencing is 55–65% [7, 20, 21], suggesting the involvement of either novel genes or mutations that are not detected, or are filtered out, by standard diagnostic algorithms, such as structural, deep-intronic, non-coding or synonymous variants. Whole genome sequencing (WGS) has been shown to overcome some of the disadvantages of whole exome sequencing (WES) and targeted gene panels, owing to its coverage uniformity, and it also allows the identification of deep-intronic variants and structural variations. Indeed, a comparative study between WGS and targeted gene panels concluded that WGS may improve the pathogenic variant detection rate by facilitating detection of structural variations and variants in regulatory regions. The remaining challenge lies in handling the large amount of data generated and in variant interpretation, which is further aggravated by the absence of consensus workflows. Thus, WGS could be useful to characterize those cases in which the screening for previously identified disease-causing variants has been inconclusive.
The aim of this work was to uncover the genetic cause of RP in one Spanish family using WGS. The causal mutations underlying the phenotype of this family were not previously identified using gene-panel sequencing. Therefore, we have implemented and optimized a diagnostic algorithm for the systematic analysis of WGS data, which included variant annotation, filtering, prioritization, bioinformatics pathogenicity predictions, Sanger sequencing validation and segregation studies.
Subjects, clinical evaluation and previous studies
Nineteen participants from the same family, including five affected individuals, were recruited and received comprehensive ophthalmic evaluations. Three individuals underwent WGS (III:3, III:23 and IV:1). The DNA of these three individuals, together with the DNA of sixteen additional family members, was used for segregation analysis of candidate variants by Sanger sequencing (II:1, III:3, III:4, III:5, III:7, III:8, III:10, III:11, III:15, III:17, III:18, III:20, III:21, III:23, IV:1, IV:2, IV:3, IV:4 and IV:35) (Fig. 1).
Clinical diagnosis of retinal dystrophy was based on fundus examination, visual acuity, computerized testing of central and peripheral visual fields and electroretinography (ERG) findings. RP was defined as bilateral visual loss, initial night blindness, restrictions of visual field, gradual increased bone spicule pigmentation, decrease of visual acuity, attenuation of retinal vessels and reduced or undetectable ERG .
Peripheral blood was collected from all subjects to extract genomic DNA using standard protocols. Prior to WGS, individual III:3 was analyzed by targeted sequencing using a panel of 64 IRD genes without achieving a genetic diagnosis.
Moreover, in order to facilitate the filtering and prioritization steps during the bioinformatic analysis, an in-house database containing WGS data was used as pseudo-controls. This pseudo-control cohort was composed of six unaffected individuals belonging to unrelated IRD families processed under similar conditions.
Whole genome sequencing and data analysis
WGS was performed by Edinburgh Genomics using Illumina SeqLab, which integrates Illumina TruSeq library preparation, Illumina cBot2 cluster generation, Illumina HiSeq X sequencing, Hamilton Microlab STAR integrative automation, and Genologics Clarity LIMS X Edition.
Briefly, genomic DNA (gDNA) samples with a concentration of 20–100 ng/µl were sheared to a 450 bp mean insert size using a Covaris LE220 focused-ultrasonicator. The inserts were blunt ended, A-tailed, size selected and the TruSeq adapters were ligated onto the ends. The insert size for each library was evaluated using the Caliper GX Touch to ensure that the mean fragment sizes fell between 300 and 800 bp. The concentration of each library was calculated using a Roche LightCycler 480 and a Kapa Illumina Library Quantification kit to ensure that the concentration of each library was between 1.1 and 8 nM.
The libraries were normalized to 1.5 nM and were denatured for clustering and sequencing at 300 pM using Hamilton MicroLab STAR with Genologics Clarity LIMS X (4.2) Edition. Libraries were clustered onto a HiSeq X Flow cell v 2.5 on cBot2s and the clustered flow cell was transferred to a HiSeqX platform for sequencing using a HiSeqX Ten Reagent kit v2.5.
The developed algorithm for WGS data analysis is shown in Fig. 2. The bioinformatics analysis was executed using several tools: bcl2fastq v.22.214.171.124 (Illumina) for demultiplexing, allowing one mismatch when assigning reads to barcodes, and BCBio-Nextgen v.0.9.7 (https://github.com/bcbio/bcbio-nextgen) to perform alignment, BAM file preparation and variant detection. BCBio uses BWA mem v.0.7.13 to align the raw reads to the human reference genome (hg19), samblaster v.0.1.22 to mark the duplicated fragments, and the Genome Analysis Toolkit (GATK v.3.4-0-g7e26428) for indel realignment and base recalibration. Finally, the genotype likelihoods were calculated using GATK HaplotypeCaller (3.4-0-g7e26428), creating a final VCF file for each of the sequenced samples.
Additionally, in order to facilitate the subsequent data analysis, the tool VCF Combine, available in the Galaxy web-based platform , was used to generate a combined VCF file containing all variants from sequenced samples selected in each of the optimization phases of the algorithm.
CNV analysis was conducted using the tool Estimation by Read Depth with Single-nucleotide variants (ERDS). CNV annotation was done using an in-house solution based on the UCSC Table Browser. Large deletions and duplications were visually inspected with the Integrative Genomics Viewer (IGV). All likely pathogenic CNVs were checked in the Database of Genomic Variants (DGV) and DECIPHER.
Variants filtering, prioritization and pathogenicity assessment
The tertiary WGS data analysis was done following a step-by-step in-house algorithm, using the online tool Bystro and a VCF file as the starting point (Fig. 3). Subsequently, several filtering steps were applied. (i) The “recurrence filtering” is applicable if a combined VCF file containing the variants of all sequenced samples (including pseudo-controls) is available. This filter discards sequencing artefacts and polymorphisms, leaving only variants that are exclusive to the family under study and absent in homozygosis in the pseudo-control cohort. Prior to the application of this filter, we checked whether there was any variant consistent with the patient’s phenotype, that is, variants previously associated with any type of IRD and described as pathogenic or likely pathogenic in the ClinVar database. For this purpose, Bystro’s filters, such as ‘ClinVar clinical significance’ and ‘ClinVar phenotype list’, were employed. Variants with ‘conflicting interpretations of pathogenicity’ were also checked in case the conflict was between pathogenic and VUS. (ii) The “frequency filtering” was used to discard variants with a MAF > 0.01 in the Genome Aggregation Database (gnomAD). (iii) The “IRD genes filtering” was used to prioritize variants located in any of the 274 genes previously associated with IRD according to RetNet (Additional file 1: Table S1). This filter prioritizes all exonic and intronic variants in these genes, regardless of their distance from splice sites, before candidate variants in novel genes are evaluated. (iv) The “pedigree filtering” helps to prioritize variants according to their zygosity and the phenotype of each individual, as long as the starting point is not a single-sample VCF file. This filter should be applied taking into account the specific pedigree of each family. In this case, heterozygous or homozygous variants in one patient were prioritized, whether or not they were present in the other affected individual. Moreover, variants that were homozygous in the unaffected individual were discarded. Furthermore, since WGS was conducted in the affected individual III:3 and in her unaffected son IV:1, all heterozygous variants located in cis in the same gene could be filtered out.
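The four filters described above can be illustrated with a minimal Python sketch. It is not the authors' Bystro workflow: the `Variant` record, the function names and the genotype encoding are hypothetical, and the recurrence filter is implemented under one reading of the text (a variant is dropped if it is homozygous in any pseudo-control). The cis-phasing rule based on the unaffected son is not modelled.

```python
from dataclasses import dataclass

MAF_CUTOFF = 0.01  # frequency threshold stated in the text


@dataclass
class Variant:
    """Hypothetical, simplified variant record (not Bystro's schema)."""
    gene: str
    gnomad_maf: float  # 0.0 when the variant is absent from gnomAD
    genotypes: dict    # sample id -> "0/0", "0/1" or "1/1"

    def gt(self, sample):
        # Samples missing from the record are treated as reference.
        return self.genotypes.get(sample, "0/0")


def passes_recurrence(v, controls):
    # Assumption: discard anything homozygous in a pseudo-control.
    return all(v.gt(c) != "1/1" for c in controls)


def passes_frequency(v):
    # Variants with MAF > 0.01 are discarded.
    return v.gnomad_maf <= MAF_CUTOFF


def passes_ird_genes(v, ird_genes):
    return v.gene in ird_genes


def passes_pedigree(v, affected, unaffected):
    # Keep variants carried by at least one patient and not
    # homozygous in any unaffected sequenced relative.
    carried = any(v.gt(s) in ("0/1", "1/1") for s in affected)
    hom_in_unaffected = any(v.gt(s) == "1/1" for s in unaffected)
    return carried and not hom_in_unaffected


def prioritize(variants, ird_genes, affected, unaffected, controls):
    """Apply the four filters in sequence and return the survivors."""
    return [
        v for v in variants
        if passes_recurrence(v, controls)
        and passes_frequency(v)
        and passes_ird_genes(v, ird_genes)
        and passes_pedigree(v, affected, unaffected)
    ]
```

In practice the gene set would be the RetNet list and the sample identifiers would come from the combined VCF header; both are left as plain arguments here.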
Following this, a manual prioritization was conducted in which variants in the coding exons or within ± 10 bp of their flanking intronic regions were prioritized. If no causal mutation was detected, deep-intronic variants were considered. Although only variants with MAF < 0.01 were selected, the absence of homozygotes in gnomAD should be checked manually for each candidate variant. For intronic variants, three online tools were used to assess the impact on splicing mechanisms: two algorithms included in Human Splicing Finder (HSF and MaxEntScan) and NNSPLICE. Specific thresholds were defined for two tools, based on the validation of known deep-intronic variants: a minimum score of 2 and a score variation > 15% for MaxEnt, and a minimum score of 70 and a score variation > 10% for HSF, were necessary to pass the quality threshold. For NNSPLICE, we used default settings (cut-off > 0.4) and required a score difference between wild-type and mutated sequence > 10% for a variant to be considered for further analysis. The pathogenicity prediction of variants was performed by SIFT, PolyPhen-2 and/or Mutation Taster, depending on the type of mutation. Candidate variants were classified using the ACMG guidelines.
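The splicing thresholds above can be written as a small check. This is a sketch under stated assumptions: the text does not define how "score variation" is computed, so it is taken here as the relative change of the mutant score with respect to the wild-type score, and the minimum score is applied to the higher of the two scores (the site being created or weakened). The function and dictionary names are illustrative only.

```python
# Thresholds taken from the text; "strict" marks the NNSPLICE cut-off,
# which is given as "> 0.4" rather than as a minimum score.
THRESHOLDS = {
    "MaxEnt": {"min_score": 2.0, "min_variation": 15.0, "strict": False},
    "HSF": {"min_score": 70.0, "min_variation": 10.0, "strict": False},
    "NNSPLICE": {"min_score": 0.4, "min_variation": 10.0, "strict": True},
}


def percent_variation(wt, mut):
    """Assumed definition: relative change versus wild type, in percent."""
    if wt == 0:
        return float("inf") if mut != 0 else 0.0
    return abs(mut - wt) / abs(wt) * 100.0


def passes_splice_threshold(tool, wt, mut):
    """Return True if a wild-type/mutant score pair meets the tool's thresholds."""
    t = THRESHOLDS[tool]
    best = max(wt, mut)  # assumption: test the stronger of the two sites
    score_ok = best > t["min_score"] if t["strict"] else best >= t["min_score"]
    return score_ok and percent_variation(wt, mut) > t["min_variation"]
```

For example, a hypothetical MaxEnt pair of 1.0 (wild type) and 8.5 (mutant) passes, whereas an HSF pair of 80.0 and 85.0 does not, because the variation is only 6.25%.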
Sanger sequencing was conducted in order to validate and segregate all the candidate variants in available family members. Specifically, gDNA from 19 individuals (Fig. 1) was used to verify segregation of the sequence alteration with the phenotype by conventional Sanger sequencing according to the manufacturer’s protocols (BigDye® Terminator v3.1 Cycle Sequencing Kit, 3730 DNA Analyzer, Applied Biosystems, USA). Primer sequences and reaction conditions are available upon request.
The nomenclature of all variants was adjusted to the Human Genome Variation Society guidelines using Mutalyzer .
The analyzed family was of Spanish origin. Affected individuals received a well-defined clinical diagnosis of RP and had a suspected autosomal dominant pedigree due to the existence of multiple affected individuals of both genders in three consecutive generations (Fig. 1). Clinical findings of the sequenced patients are reported in Table 1.
NGS data quality
Quality assurance and quality control are essential to ensure the reliability of the generated data. Genome sequencing in the three sequenced individuals produced an average mapping yield of 134.2 Gb ± 4.12 (mean ± SD) and an average coverage of 34.2x ± 0.97 (mean ± SD). Overall, 99.4% of reads were mapped, and the percentage of duplicated reads was 15.5%. The proportion of bases with Q ≥ 30 was 85.9%. A Q score of 30 corresponds to a probability of an incorrect base call of 1 in 1000, that is, a base call accuracy of 99.9%. Therefore, 85.9% of the bases were called with an accuracy of at least 99.9%. All these parameters indicated that the WGS data were of sufficient quality to continue the analysis.
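The relationship between a Phred quality score and the error probability used above is Q = -10 log10(P). A minimal sketch of the conversion follows; the function names are illustrative.

```python
import math


def phred_to_error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p):
    """Phred score corresponding to an error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


# Q30 corresponds to an error probability of 0.001 (1 in 1000),
# i.e. a base call accuracy of 99.9%.
q30_error = phred_to_error_probability(30)
q30_accuracy = 1 - q30_error
```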
Diagnostic algorithm optimization for the WGS data analysis
Tertiary data analysis comprising filtering, annotation, prioritization and biological interpretation of candidate variants was performed using an in-house algorithm (Fig. 2). Tertiary data analysis is the most complex, experiment-specific, time-consuming and manual phase of the NGS data analysis pipeline, and therefore, part of this work implied the optimization of the filtering steps (Fig. 3).
Initially, an unprocessed VCF file containing all the variants present in one single individual (~4.8 million) was used (Fig. 3a). After selecting all variants with a MAF < 0.01 located in IRD genes, around 400 variants remained for manual prioritization and in silico predictions. Of these, around 30 possibly causative variants were selected for family segregation studies by Sanger sequencing. In this first approach, the VCF of each individual was annotated separately using Bystro, but this hampered the integration of the data from the other members of the same family and required considerable time and effort. Therefore, a second approach was followed, consisting of the generation of a single VCF file that combined the data of the three individuals belonging to the same family (Fig. 3b). This allowed all the data to be integrated automatically using Bystro filtering options, taking into account the phenotype data of each individual (“pedigree filtering”). Although the initial number of variants was greater in this file (~7.2 million), after applying the corresponding filters, the number of variants remaining for manual prioritization was substantially reduced (from 403 to 184) (Fig. 3b), but it was still excessive. Finally, to further optimize the data analysis, a single VCF file containing all the variants (~16.5 million) present in the 3 individuals of our family and the 6 individuals of the pseudo-control cohort was generated (Fig. 3c). The use of this single VCF file allowed us to filter out sequencing artefacts and polymorphisms, which facilitated data interpretation. Interestingly, after applying the “recurrence filtering”, the number of variants exclusive to the family dropped from ~7.2 million to ~3.9 million. This strategy allowed us to further reduce the number of likely causative variants pending manual assessment and segregation (from 184 to 104) (Fig. 3c).
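The size of each reduction reported above can be made explicit with a short calculation. The counts are those given in the text; the helper name is illustrative.

```python
def percent_reduction(before, after):
    """Percentage of variants removed when going from `before` to `after`."""
    return (before - after) / before * 100.0


# Variants left for manual prioritization at each optimization phase.
steps = [
    ("single-sample VCF -> family VCF", 403, 184),
    ("family VCF -> family + pseudo-controls VCF", 184, 104),
    ("overall", 403, 104),
]

for label, before, after in steps:
    print(f"{label}: {percent_reduction(before, after):.1f}% fewer variants")
```

On these figures, the combined family VCF removes roughly 54% of the candidates, adding the pseudo-controls removes a further 43% or so, and the overall workload falls by about 74%.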
Hence, the proposed algorithm for the variant prioritization of WGS data from IRD patients comprised the application of five different filters (Fig. 3d).
Likewise, if no candidate variants in any of the IRD genes segregated with the disease in the family, mutations in novel genes would be considered following the discovery pipeline (Fig. 2). The prioritization of variants in novel genes would take into account multiple factors, such as the pathogenicity predictors provided by Bystro (CADD, PhastCons and PhyloP), the absence of homozygotes in gnomAD, literature searches, the expression of the gene in the retina according to public expression databases, the presence of retinal changes in knockout mouse models, and the involvement of the novel protein in known retinal interaction networks (STRING).
Identification of mutations by whole genome sequencing
The application of the diagnostic algorithm led to the initial identification of four candidate variants (M1–M4) in the analyzed family (Table 2). Among these, three had been previously reported as pathogenic or likely pathogenic variants in clinical databases: two missense variants (M2: p.(Arg334Trp) and M3: p.(Cys759Phe)) and one frameshift mutation (M1: p.(His308Glnfs*16)) in USH2A. The fourth variant was a missense variant in USH2A (M4: p.(Arg4187His)). Further analysis led to the identification of two additional candidate variants (M5 and M6), comprising one nonsense variant in PDZD7 (M5: p.(Gln515*)) and one novel frameshift variant in ADGRV1 (M6: p.(Gly4360Glufs*10)). Interestingly, the M4 and M5 variants have been reported in the general population (gnomAD) with a very low MAF, but neither has an entry in clinical databases (Table 2).
Although the family under study was clinically diagnosed with RP with presumed autosomal dominant inheritance, our diagnostic algorithm led to the identification of six candidate heterozygous variants in three autosomal recessive IRD-associated genes: USH2A (M1–M4), PDZD7 (M5) and ADGRV1 (M6). Therefore, more than one combination of pathogenic variants was identified in this large family. Biallelic combinations of the USH2A variants M1, M2 and M3 segregated with the disease in the third branch of the family (individuals III:17–III:23) (Fig. 1). The first branch of the family (individuals III:3–III:8) also harboured biallelic variants in USH2A (M2 and M4), but these variants did not entirely segregate with the disease, as an unaffected sister (III:4) shared this genotype with her affected sister (III:3). Moreover, individual III:3 was also a carrier of two additional likely pathogenic variants, one in the PDZD7 gene (M5) and another in the ADGRV1 gene (M6), the latter being present only in the affected individual III:3. Both variants were absent from clinical databases and were classified as pathogenic and likely pathogenic according to the ACMG guidelines (Table 2). No additional candidate variant (SNVs, small indels, deep-intronic variants or CNVs) was identified either in these genes or in any of the 274 IRD genes.
In this study, a WGS approach was used to identify the genetic cause of RP in one Spanish family that remained undiagnosed despite previous studies. Currently, the use of NGS approaches in the clinical setting is primarily based on gene panels or exome analysis, both of which involve selective capture of target regions. However, capture-based strategies have some limitations, such as the lack of uniformity in sequencing depth and coverage. Thus, WGS can be a better approach than WES, as it allows not only the identification of mutations in non-coding regions but also greater sensitivity in detecting structural variants. As sequencing costs decline and bioinformatics analysis improves, WGS will have the potential to entirely replace WES. Currently, filtering and prioritization of variants derived from WGS data remain challenging due to the enormous amount of information generated and the lack of systematized protocols for variant prioritization.
During this work, the optimization and implementation of a personalized diagnostic algorithm for WGS data analysis led to a reliable and versatile approach. The optimization process minimized the number of candidate variants pending validation and segregation in the available family members. This resulted in increased cost-effectiveness by reducing the amount of tedious work such as in silico predictions, manual review of the number of homozygotes in gnomAD, and Sanger sequencing. The efficacy of the “recurrence filtering” was particularly evident, as the number of variants exclusive to the family decreased from ~7.2 million to ~3.9 million. Our approach is based on open-access software and online tools whose algorithms are updated more frequently than those of commercial solutions, facilitating data interpretation. Moreover, these filtering steps can be easily used by other researchers without investing large amounts of resources in commercial licenses. In fact, despite the substantial reductions in sequencing costs, the cost of bioinformatics analyses is, in some cases, similar to that of sequencing; therefore, an algorithm based on free software and tools would facilitate the implementation of WGS in research as well as in clinical practice.
Here, the application of the diagnostic algorithm led to the genetic diagnosis of one family that had received a clinical diagnosis of RP with suspected autosomal dominant inheritance (Fig. 1). Previous analyses of the index patient (III:3) using gene-panel sequencing were initially focused on the identification of variants segregating as a dominant trait, but no causal mutations were detected in adRP genes. Although both USH2A variants (M2 and M4) were detected by the panel, the segregation was not conclusive and, therefore, WGS was conducted not only in this patient but also in two additional relatives. As a result of our hypothesis-free WGS data analysis and the sequencing of more than one relative, heterozygous variants in the recessive genes USH2A, ADGRV1 and PDZD7 were detected in this family. Mutations in the USH2A gene cause non-syndromic RP and Usher Syndrome type II, both autosomal recessive conditions. In this case, three of the identified USH2A variants (M1, M2 and M3) were previously reported as pathogenic in ClinVar for both phenotypes, while one of them (M4) had not been detected in IRD patients. Notably, a careful selection of the samples in which WGS is going to be conducted is highly recommended for a successful application of this pipeline.
Therefore, different combinations of USH2A pathogenic variants were found in this family. While individuals III:17 and III:23 harboured M3 (p.(Cys759Phe)) in trans with M1 (p.(His308Glnfs*16)), individual III:20 carried M3 (p.(Cys759Phe)) in trans with M2 (p.(Arg334Trp)). Interestingly, all three affected siblings (III:17, III:23 and III:20) harboured the M3 (p.(Cys759Phe)) variant in compound heterozygosity with a second variant (Fig. 1). The variant p.(Cys759Phe) is one of the most prevalent USH2A variants, mainly associated with non-syndromic RP [51, 52]. None of the patients manifested sensorineural hearing loss, indicating that the combination of these mutations caused arRP. The expression of the phenotype varies depending on the nature of the mutations [53, 54], and these combinations result in a less severe condition, in this case, non-syndromic RP. Clinical findings also revealed the existence of intra-familial phenotypic variability among relatives of the same and different branches, reinforcing the hypothesis of more than one genetic cause underlying the phenotype in this family. In fact, the affected individuals who carry a frameshift variant (III:3, III:17 and III:23) manifested an earlier age of onset than individual III:20, who carries two missense variants.
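The segregation reasoning for the USH2A variants can be sketched as a simple consistency check: a biallelic combination is flagged when it is carried by both an affected and an unaffected relative. The genotypes below are limited to those stated in the text (including the M2/M4 combination shared by III:3 and III:4 reported in the Results); phase is not modelled, and the function name is illustrative.

```python
# USH2A allele pairs per individual, as reported in the text.
ush2a_genotypes = {
    "III:17": {"M1", "M3"},
    "III:23": {"M1", "M3"},
    "III:20": {"M2", "M3"},
    "III:3": {"M2", "M4"},
    "III:4": {"M2", "M4"},
}
affected = {"III:3", "III:17", "III:20", "III:23"}  # III:4 is unaffected


def conflicting_pairs(genotypes, affected_ids):
    """Allele pairs carried by at least one affected and one unaffected person."""
    in_affected = {frozenset(g) for p, g in genotypes.items() if p in affected_ids}
    in_unaffected = {frozenset(g) for p, g in genotypes.items() if p not in affected_ids}
    return in_affected & in_unaffected


conflicts = conflicting_pairs(ush2a_genotypes, affected)
```

With these data the only flagged combination is M2/M4, which matches the inconclusive segregation described for the first branch of the family.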
The index patient (III:3) harboured the USH2A pathogenic allele M2 (p.(Arg334Trp)), like her cousin (III:20), inherited from her affected father (II:1). However, the second mutation in USH2A identified in this case was the variant M4 (p.(Arg4187His)). Although this variant has been identified in the general population (6/281154 alleles in gnomAD), its frequency is consistent with disease prevalence. Computational prediction tools and conservation analyses do not provide strong support for or against an impact on the protein. Therefore, the clinical significance of M4 is uncertain. Moreover, the segregation analysis for this variant was inconclusive, as individual III:4 was an asymptomatic carrier (aged 58 years).
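As a worked check of the gnomAD figure quoted above, the allele frequency of M4 can be computed from the allele count and allele number and compared with the MAF cut-off of 0.01 used in the frequency filter. The helper name is illustrative.

```python
MAF_CUTOFF = 0.01  # threshold used in the "frequency filtering" step


def allele_frequency(allele_count, allele_number):
    """Allele frequency as allele count divided by total alleles genotyped."""
    return allele_count / allele_number


# M4, p.(Arg4187His): 6 of 281154 alleles in gnomAD, as stated in the text.
m4_af = allele_frequency(6, 281154)
m4_passes_frequency_filter = m4_af <= MAF_CUTOFF
```

The resulting frequency is about 2.1 in 100,000 alleles, far below the 1% cut-off, so the variant is retained by the frequency filter.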
Further analysis in individual III:3 led to the identification of two additional likely pathogenic variants in two genes associated with Usher Syndrome type II (PDZD7 and ADGRV1). Previous studies have identified PDZD7 variants as modifiers of the phenotype caused by biallelic mutations in an USH gene, including a homozygous truncating USH2A mutation associated with a more severe RP when accompanied by a PDZD7 mutation. Therefore, mutations in this gene could aggravate the ocular phenotype in these cases. In addition, both PDZD7 and ADGRV1 have also been associated with digenic inheritance. Moreover, two pathogenic variants in two different USH2 genes (USH2A and ADGRV1) were detected in one patient, suggesting that both together could be contributing to the phenotype, although segregation analysis would be needed to confirm this. According to our proposed algorithm, the screening of mutations in deep-intronic regions of known IRD genes must be conducted as an essential step prior to evaluating mutations in novel genes or oligogenic traits, reinforcing the importance of adopting a WGS-based strategy.
Therefore, the index patient of our family (individual III:3) harboured four clinically relevant alleles: one in PDZD7, one in ADGRV1 and two in USH2A, of which one has been reported as pathogenic while the pathogenicity of the other remains unclear. This combination of variants is present only in this patient, as it is not shared by the rest of the affected individuals. One possibility is that causative variants in the index patient remain undetected, but neither CNVs nor deep-intronic variants consistent with the disease were detected in any of the known IRD genes. Another possibility is that the RP of individual III:3 is caused by mutations segregating under oligogenic inheritance among USH2A, ADGRV1 and/or PDZD7. In this scenario, the variants in PDZD7 and/or ADGRV1 could act as genetic modifiers capable of modulating the penetrance of the milder USH2A allele, although further studies are needed. Oligogenic inheritance and the involvement of genetic modifiers have been demonstrated experimentally to contribute to heterogeneous disorders such as human heart diseases or Bardet–Biedl syndrome. Moreover, incomplete penetrance has been reported in some specific RP genes generally associated with a dominant trait [59, 60]. In addition, this mechanism has also been proposed to explain the absence of RP symptoms in another family with the homozygous USH2A allele p.Cys759Phe [61, 62]. In this regard, it cannot be ruled out that these two USH2A mutations are pathogenic and that individual III:4 shows no signs of the disease due to the lack of an updated clinical evaluation, incomplete penetrance or the involvement of genetic modifiers.
The huge phenotypic overlap and genetic heterogeneity of IRDs mean that patients who received a clinical diagnosis of a particular condition may harbour causal variants in genes not specifically associated with that diagnosis. For instance, a significant number of non-syndromic RP patients carry mutations in genes also associated with syndromic ciliopathies [63, 64]. Moreover, patients who received an initial clinical diagnosis of adRP may carry causal mutations in XLRP genes. Our results are in agreement with previous studies suggesting that the contribution of mutations in recessive genes to RP in suspected autosomal dominant pedigrees should be taken into consideration. Thus, diagnostic approaches focused on a limited number of genes for a specific phenotype and mode of inheritance may fail to detect variants in genes not typically associated with that clinical diagnosis in a number of patients.
This family is a good example of the enormous genetic and clinical heterogeneity of IRD, since within a pseudo-dominant pedigree, six different variants segregating under a recessive inheritance pattern were identified in three genes causing IRD. These results contribute to expanding the mutational spectrum of IRD genes, as well as the number of cases that may be explained by oligogenic inheritance. The role of genetic modifiers and oligogenic inheritance should not be underestimated in those families that remain without a conclusive genetic diagnosis, even after being thoroughly analyzed using updated approaches.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Abbreviations
adRP: Autosomal dominant Retinitis Pigmentosa
arRP: Autosomal recessive Retinitis Pigmentosa
CNVs: Copy number variations
IRD: Inherited retinal dystrophies
IGV: Integrative Genomics Viewer
MAF: Minor allele frequency
RPE: Retinal pigment epithelium
SNVs: Single nucleotide variants
VCF: Variant call format
WES: Whole exome sequencing
WGS: Whole genome sequencing
XLRP: X-linked Retinitis Pigmentosa
The authors are grateful to the family described in this study.
This work was supported by the Instituto de Salud Carlos III (ISCIII), Spanish Ministry of Economy and Competitiveness, Spain, and co-funded by the European Union (ERDF, “A way to make Europe”) [PI15-01648] and [PI18-00612]; CIBERER ACCI [ER16P1AC702/2017]; the Regional Ministry of Economy, Innovation, Science and Employment of the Autonomous Government of Andalusia [CTS-1664]; the Regional Ministry of Health and Families of the Autonomous Government of Andalusia [PEER-0501-2019]; and the Foundation Isabel Gemio/Fundación Cajasol. The CIBERER is an initiative of the ISCIII, Spanish Ministry of Economy and Competitiveness. EFS is supported by fellowship FI19/00091 from ISCIII, co-funded by the ESF (“Investing in your future”). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Ethics approval and consent to participate
This study was conducted in accordance with the ethical principles for medical research involving human subjects set out in the Declaration of Helsinki (Edinburgh, 2000). Prior to the study, written informed consent for clinical and molecular genetic studies was obtained from all participants or their legal guardians. The study was approved by the Ethical Committees of the University Hospital Virgen del Rocio (Seville) and the University Hospital Virgen Macarena (Seville).
Consent for publication
Consent for publication was obtained from the affected subjects of the family under study.
The authors declare that they have no competing interests.
Additional file 1: Table S1. List of genes associated with IRD according to the Retinal Information Network. This list of 274 genes was used to prioritize variants during the “IRD genes filtering” step.
Cite this article
González-del Pozo, M., Fernández-Suárez, E., Martín-Sánchez, M. et al. Unmasking Retinitis Pigmentosa complex cases by a whole genome sequencing algorithm based on open-access tools: hidden recessive inheritance and potential oligogenic variants. J Transl Med 18, 73 (2020). https://doi.org/10.1186/s12967-020-02258-3
- Retinitis Pigmentosa
- Inherited retinal dystrophies