Skip to main content

Comprehensive analysis of oncogenic fusions in mismatch repair deficient colorectal carcinomas by sequential DNA and RNA next generation sequencing



Colorectal carcinoma (CRC) harboring oncogenic fusions has been reported to be highly enriched in mismatch repair deficient (dMMR) tumors with MLH1 hypermethylation (MLH1me+) and wild-type BRAF and RAS. In this study, dMMR CRCs were screened for oncogene fusions using sequential DNA and RNA next generation sequencing (NGS).


Comprehensive analysis of fusion variants, genetic profiles and clinicopathological features in fusion-positive dMMR CRCs was performed. Among 193 consecutive dMMR CRCs, 39 cases were identified as MLH1me+ BRAF/RAS wild-type. Eighteen fusion-positive cases were detected by DNA NGS, all of which were MLH1me+ and BRAF/RAS wild-type. RNA NGS was sequentially conducted in the remaining 21 MLH1me+ BRAF/RAS wild-type cases lacking oncogenic fusions by DNA NGS, and revealed four additional fusions, increasing the proportion of fusion-positive tumors from 46% (18/39) to 56% (22/39) in MLH1me+ BRAF/RAS wild-type dMMR cases. All 22 fusions were found to involve RTK-RAS pathway. Most fusions affected targetable receptor tyrosine kinases, including NTRK1(9/22, 41%), NTRK3(5/22, 23%), ALK(3/22, 14%), RET(2/22, 9%) and MET(1/22, 5%), whilst only two fusions affected mitogen-activated protein kinase cascade components BRAF and MAPK1, respectively. RNF43 was identified as the most frequently mutated genes, followed by APC, TGFBR2, ATM, BRCA2 and FBXW7. The vast majority (19/22, 86%) displayed alterations in key WNT pathway components, whereas none harbored additional mutations in RTK-RAS pathway. In addition, fusion-positive tumors were typically diagnosed in elder patients and predominantly right-sided, and showed a significantly higher preponderance of hepatic flexure localization (P < 0.001) and poor differentiation (P = 0.019), compared to fusion-negative MLH1me+ CRCs.


We proved that sequential DNA and RNA NGS was highly effective for fusion detection in dMMR CRCs, and proposed an optimized practical fusion screening strategy. We further revealed that dMMR CRCs harboring oncogenic fusion was a genetically and clinicopathologically distinctive subgroup, and justified more precise molecular subtyping for personalized therapy.


Colorectal carcinoma (CRC) represents one of the most common malignancies worldwide, ranking third and fifth for cancer-related deaths in United States and China, respectively [1]. Nowadays, there is an increasing recognition that AJCC-TNM staging is insufficient for personalized therapy. The molecular heterogeneity of CRCs has been widely emphasized, and proved to be of critical prognostic and therapeutic significance.

Oncogenic fusions have long been well-recognized as not only diagnostic or prognostic markers, but also potential therapeutic targets in different cancer types, including CRCs [2]. With the emerging introduction of fusion targeted therapy, efficient and accurate detection of druggable gene fusions is becoming increasingly important for clinical decision making. Fusion gene diagnosis was traditionally performed by fluorescence in situ hybridization (FISH) or quantitative real-time polymerase chain reaction (RT-PCR) assay. Despite the high sensitivity, these methods typically test for only one specific fusion gene, and provide very limited information of the fusion partners and breakpoints [3]. Targeted DNA-based next generation sequencing (NGS) has been proved to effectively detect common oncogenic fusions with high confidence. However, some gene fusions of high clinical relevance may be missed due to the insufficient coverage of large introns and blind-spot within the targeted areas [4]. By comparison, RNA NGS can overcome many of these limitations by conducting genome-wide inspection of gene fusions with nucleotide-level resolution of genomic breakpoints, identifying both known and novel fusion genes, and delineating the fusion transcripts directly at the mRNA level [3, 5]. Currently, RNA NGS has been proved to be an indispensable testing in routine diagnostics for sarcoma[6], and an important complement to DNA NGS for high yield detection of targetable gene fusions in non-small cell lung cancers [7, 8]. Nevertheless, reports regarding RNA NGS in fusion gene diagnosis of other cancers, including CRCs, are still limited.

Previously, oncogenic fusions were considered to be rare molecular events in CRCs, presenting in less than 1% of unselected patients [9]. Due to the extremely low prevalence, universal assessment for gene fusions utilizing high-throughput methods in routine clinical practice could be expensive and time-consuming. A practical and efficient strategy to screen for such rare but clinically critical molecular alteration was highly warranted. Notably, we and others have recently uncovered that gene fusions were nearly exclusively detected, and significantly enriched in a specific molecular subtype of mismatch repair deficient (dMMR) CRCs, characterized by hypermethylated MLH1 (MLH1me+) and wild-type BRAF/RAS [9,10,11]. A preliminary screening protocol using routine molecular pathological assays has also been proposed by us [10]. In the present study, we enlarged the sample size and incorporated RNA NGS in complement to DNA NGS for fusion detection, aiming to improve our prior fusion screening strategy, and achieve more comprehensive understanding of this rare CRC subtype.

In this study, DNA NGS was performed in a retrospective consecutive cohort of dMMR CRCs, whilst RNA NGS was sequentially conducted in MLH1me+ BRAF/RAS wild-type dMMR CRCs lacking oncogenic fusions by DNA NGS. We revealed that additional RNA NGS could efficiently enhance fusion detection, and accordingly proposed an optimizing strategy to screen for potential targetable gene fusions in CRCs using combined DNA NGS and RNA NGS. A complete review of fusion genes and variants was presented. Molecular genetic features and clinicopathological features in dMMR CRC with oncogenic fusions were also analyzed.

Materials and methods

Patient selection

This retrospective study involved consecutive CRC cases (n = 2230) from July 2015 until June 2020 in Peking Union Medical College Hospital (PUMCH). All patients with materials included in the study underwent a partial colectomy for primary CRC. None of the patients were known to have received neoadjuvant therapy or tyrosine kinase inhibitor therapy prior to surgery. This study was approved upon ceding review by the PUMCH Institutional Review Board for review.

DNA and RNA extraction

DNA and RNA were isolated from formalin-fixed paraffin-embedded (FFPE) CRC specimens using Direct FFPE DNA Kit (Qiagen #A31133) and RNeasy FFPE Kit (Qiagen #73504), respectively, according to the manufacturer’s protocols.

DNA NGS and determination of mutational significance

DNA targeted sequencing was performed using hybrid capture-based targeted next-generation sequencing (NGS) as previously described. Barcoded libraries were hybridized to our customized panel of 1,021 genes containing whole exons, selected introns of 288 genes and selected regions of 733 genes (Additional file 1: Table S1). The libraries were prepared and sequenced to a uniform median depth (> 500×). Genomic alterations, including single nucleotide variants, small insertions and deletions, copy number alterations, and gene fusions/rearrangements, were compared against each patient’s corresponding normal sample. After removing raw reads containing adaptor sequences, those with more than 50% low-quality base reads, or those with more than 50% N bases, together with their mate pair, reads were mapped to the reference human genome (hg19) using the Burrows-Wheel Aligner ( with default parameters. Duplicate reads were identified and marked with Picard’s Mark Duplicates tool ( for tumor and germline DNA data and were clustered according to UID and position of the template fragments for cfDNA data. Errors introduced by PCR or sequencing were corrected according to clustered reads. Local realignment and base quality recalibration were performed using The Gene Analysis Toolkit ( Somatic single-nucleotide variations (SNVs) were called using the MuTect2 algorithm ( Candidate mutations were filtered if: (1) more than 10 reads with insertions/deletions in an 11-bp window were centered; (2) the matched germline DNA control sample carried ≥ 3% or ≥ 2% alternate allele reads, and the sum of quality scores was above 80; (3) the candidate was found in dbsnp (version 138, but not listed in the COSMIC database; (4) the candidate was supported by fewer than five high-quality reads (base quality ≥ 30, mapping quality ≥ 30); or (5) the allele frequency was less than 1%. Insertions or deletions of small fragments (indels) were called using MuTect2 with default parameters. Variants detected in matched control samples with three or more reads indicating indels at the same location or in the 40-bp flanking regions of experimental samples or residing near regions with low complexity or short tandem repeats were removed. Remaining mutations were considered validated somatic variants. CNVs in tumor DNA was called using The Contra algorithm ( Genomic DNA sequencing libraries were prepared using the protocols recommended The KAPA Library Preparation Kit (Kapa Biosystems, Wilmington, MA, USA). Genomic DNA sequencing libraries were prepared using the protocols recommended The KAPA Library Preparation Kit (Kapa Biosystems, Wilmington, MA, USA). The libraries were hybridized to custom-designed probes covering 1021 genes (Integrated DNA Technology, Coralville, IA, USA), including selected for the detection of genomic rearrangements. Genomic rearrangements were identified by the software developed in-house analyzing chimeric read pairs. MSI status was determined using MSIsensor (v0.2), which reported the percentage of unstable somatic microsatellites through a Chi-square test on predefined microsatellite regions covered by our panel. The average sequencing depth for the target regions of the tumor samples was 2447×, and 99.0% of the average coverage of the targeted regions was more than 200×, which were qualified for variant calling and the MSI analysis.

Mutations of oncogenes were filtered according to the corresponding documentation in the Catalog of Somatic Mutations in Cancer [12] and OncoKB [13] annotation. Mutational significance of tumor suppressor genes was determined according to protocols described in our previous study [14], and only “predicted deleterious” mutations were included in the analysis.


NEBNext rRNA Depletion Kit (Human/Mouse/Rat) (NEB #Z1955E) was chosen to remove the targeted ribosomal RNA (rRNA). All RNA with a percentage of RNA fragments > 200 nucleotides (DV200) ≤ 50% skipped fragmentation and proceeded to library preparation. After rRNA depletion and fragmentation, cDNA synthesis and NGS library preparation were performed using NEBNext® Ultra™ II Directional RNA Library Prep Kit (NEB#E7760L). The library was quantitated using Qubit 3.0 (life Invitrogen, USA) and quality was assessed with LabChip GX Touch (PerkinElmer, USA). After removal of terminal adaptor sequences and low-quality data by using fastp (version: 0.19.5) [15] and removal rRNA reads through aligning clean reads to rRNA database (download from NCBI) by using bowtie2 (version:2.2.8) [16], clean reads without known rRNA were aligned to the reference human genome (hg19) through STAR (version 020201) [17]. Fusions were detected by a customized version of Arriba 1.1.0. and annotated by in house software annoFilterArriba (version:1.0.0) with NCBI release 104 database. All final candidate fusions were manually verified with the integrative genomics viewer browser. A series of quality control metrics was computed by using RNA-SeQC assessment [18]. A threshold of ≥ 80 million mapped reads and ≥ 10 million junction reads per sample was set.

MLH1 promoter hypermethylation analysis

MLH1 promoter hypermethylation analysis was performed using methylation-specific PCR, with the protocol as previously described [10, 14].

Statistical methods

Continuous variables were presented as mean ± standard deviation, and categorical variables were expressed as percentages. Chi-square test, Fisher’s exact test, or Mann–Whitney test was used when appropriate for comparison between dMMR CRCs with fusion and dMMR CRCs without fusion. Statistical processing was performed using SPSS version 24 (SPSS Inc., Chicago, IL, USA) and P < 0.05 (two-sided) was considered statistically significant.


Screening for MLH1-hypermethylated dMMR CRC cases

Of the 2230 cases in the consecutive CRC cohort, 193 (9%) cases showed absent immunohistochemical (IHC) staining in any of four MMR proteins (MLH1, MSH2, MSH6 and PMS2), and were identified as dMMR tumors. One hundred and forty-three cases showing lost MLH1/PMS2 expression were subjected to methylation-specific PCR. Of these, ninety-one cases (91/143, 64%) presented MLH1 promoter hypermethylation.

Complete review of gene fusions detected by sequential DNA and RNA NGS

DNA NGS was conducted in all 193 dMMR tumors, and identified eighteen genetic fusions (detailed in Additional file 2: Figure S1 and summarized in Fig. 1A). All gene fusions were exclusively presented in tumors harboring MLH1 promoter hypermethylation and lacking concurrent BRAF or RAS driver mutations. These fusion-positive cases by DNA NGS represented 9% (18/193) of all dMMR tumors, 19% (18/91) of MLH1me+ tumors, and 46% (18/39) of MLH1me+ tumors with wild type BRAF or RAS. NTRK1 fusions were the most frequent fusion events detected by DNA NGS, presenting in nine cases. All NTRK1 fusions were intrachromosomal rearrangements involving known NTRK1 partners. Six of these cases (6/9, 67%) harbored TPM3-NTRK1 fusions with three different fusion breakpoints: exon(e)7 to e10 (3/9, 33%), e7 to e9 (2/9, 22%) and e5 to e11(1/9, 11%). LMNA-NTRK1 fusions were found in two cases, with e9 to e12 and e10 to e10 fusion breakpoints, respectively. PLEKHA6-NTRK1 fusion with e22 to e10 fusion breakpoint was found in one case. NTRK3 gene fusions were identified in three cases, which were interchromosomal translocations with identical fusion breakpoints involving ETV6 e1–5 on chromosome 12 and NTRK3 e15–20 on chromosome 15. In-frame ALK gene rearrangements were found in three cases. Two of them were well-reported fusions connecting STRN e3 to ALK e20. Another one showed a fusion between EML4 e1–2 and atypical breakpoint at ALK e19. NCOA4-RET fusion gene involving NCOA4 e1–11 and RET e12–19 were observed in two cases. CUL1-BRAF fusion gene were found in one case, with the BRAF breakpoint located in intron 8, preserving the portion encoding the BRAF kinase domain.

Fig. 1
figure 1

A Schematic representation of the predicted products of the 18 gene fusions detected by DNA NGS. B Schematic representation of the predicted products of four gene fusions detected by RNA NGS, but not by DNA NGS

Additional RNA NGS was performed in 21 MLH1me+ CRCs where neither oncogenic gene fusions nor BRAF/RAS driver mutations were detected by DNA NGS. Gene fusions were identified by RNA NGS in four (4/21, 19%) cases (detailed in Additional file 3: Figure S2 and summarized in Fig. 1B). Among them, two cases presented EML4-NTRK3 fusions, which were formed through reciprocal translocation that joined the e1–2 of EML4 with e14–19 of NTRK3. One case showed MET gene rearrangement involving a novel partner gene SNRNP70, with fusion breakpoints of SNRNP70 e8 to MET e15. In another case, a novel in-frame fusion involving YPEL1 and the extracellular signal-regulated kinase gene MAPK1 was detected. This YPEL1-MAPK1 chimeric transcript contained only part of the MAPK1 C-terminal kinase domain by connecting e1 of YPEL1 to e5 of MAPK1. EML4-NTRK3 fusion was validated by RT-PCR and Sanger sequencing on FFPE samples of two cases. (Additional file 4: Figure S3).

All 22 fusion events were identified as driver alterations within RTK-RAS signaling pathway (Fig. 2). The majority of fusions affected upstream receptor tyrosine kinases (RTKs), including NTRK1(9/22, 41%), NTRK3(5/22, 23%), ALK(3/22, 14%), RET(2/22, 9%) and MET(1/22, 5%). Two other fusions involved components of mitogen-activated protein kinase (MAPK) cascade BRAF(1/22, 5%) and MAPK1 (1/22, 5%), functioning in intracellular signal transduction of RTK-RAS pathway.

Fig. 2
figure 2

Schematic representation showing the activation of RTK-RAS signaling pathway by 22 gene fusions in our colorectal carcinoma cohort. All of the detected gene rearrangements within NTRK1, NTRK3, ALK, RET, MET, BRAF and MAPK1 are targetable with currently available small molecule kinase inhibitors

Development of screening strategy for gene fusions in CRC using integrative DNA NGS and RNA NGS

Comparing to DNA NGS alone, additional RNA NGS increased the proportion of detected fusion-positive tumors from 9% (18/193) to 11% (22/193) in dMMR cases, 19% (18/91) to 24% (22/91) in MLH1me+ dMMR cases, and from 46% (18/39) to 56% (22/39) in MLH1me+ BRAF/RAS wild-type dMMR cases, respectively. Based on these and our previously published findings, we developed an improved strategy with combined use of DNA NGS and RNA NGS to screen for potentially targetable gene fusions in CRCs (Fig. 3). In the molecular workup for MLH1me+ dMMR CRCs, when BRAF/KRAS/NRAS driver mutation testing was performed by DNA NGS, sequential RNA NGS was indicated when no gene fusions were found. Additionally, direct RNA NGS was suggested in BRAF/RAS wild-type cases when PCR assay was performed instead of DNA NGS for BRAF/RAS genotyping.

Fig. 3
figure 3

An optimized strategy incorporating RNA next generation sequencing to screen for gene fusions in colorectal carcinomas. CRC, colorecta carcinoma; NGS, next generation sequencing; MMR, mismatch repair

Molecular genetic features of dMMR CRCs with gene fusions

RNF43(17/22, 77%), FAT2(10/22, 45%), APC(9/22, 41%), FAT1(9/22, 41%), TGFBR2(9/22, 41%), ATM(8/22, 36%), TP53(8/22, 36%), ARID2(8/22, 36%), BRCA2(7/22, 32%), FBXW7(7/22, 32%) and ARID1A(7/22, 32%) were identified as most recurrently mutated genes in dMMR CRCs harboring gene fusions (Fig. 4).

Fig. 4
figure 4

Mutation profile of top 20 most frequently mutated genes in 22 fusion-positive colorectal carcinomas. The significantly mutated genes are displayed as bar chart, ordered according to gene mutation frequencies (right plot). Different types of gene alterations in each tumor sample are displayed as heatmap (left plot)

Alterations in key WNT pathway components were found in nineteen (19/22, 86%) cases. Apart from one CTNNB1 activating mutation, these were primarily truncating mutations affecting various tumor suppressor genes RNF43 (n = 17), APC(n = 9), ARID1A(n = 7), FBXW7(n = 7), AXIN2(n = 5), TCF7L2(n = 4), FAM123B(n = 3), and SOX9 (n = 2). Nine (9/22, 41%) cases harbored frameshift mutations in TGFBR2, which encoded a key kinase receptor mediating TGF-β signaling transduction. However, few mutations affecting other key TGF-β pathway components ACVR1B, SMAD2, SMAD3 and SMAD4 were identified. In five (5/22, 23%) tumors, mutations in key genes of PI3K pathway were detected, including PTEN (n = 3), PIK3CA (n = 2), and PIK3R1(n = 1). Notably, both of the tumors with fusions affecting MAPK cascade components BRAF and MAPK1 presented PI3K pathway aberrations (PIK3CA and PTEN mutation, respectively). None of the 22 tumors harbored mutations in other key RTK-RAS driver genes BRAF, KRAS, NRAS, ERBB2 and ERBB3.

Clinicopathological features of dMMR CRCs with gene fusions

The clinicopathological features of 22 tumors harboring gene fusions detected by either DNA NGS or RNA NGS were listed in Table 1. The majority of these tumors were diagnosed in female (13/22, 59%). All patients were elderly over 50 years old, with the median age of 72 years. Tumors were predominantly right-sided (20/22, 91%), and over half were located at hepatic flexure (13/22, 59%). All tumors were either stage II (15/22, 68%) or stage III (7/22, 32%) according to TNM classification. Histologically, poorly differentiated areas were detected in more than half of these tumors (13/22, 59%). Nine cases (9/22, 41%) presented focal to extensive mucinous components, including one case displaying a diffuse signet-ring mucinous component. Lymphovascular invasion was observed in ten cases (10/22, 45%), and perineural invasion was observed in two cases (2/22, 9%) (Fig. 5). Within 91 MLH1me+ CRCs cases, patients with fusion-positive tumors were significantly older (median 72 vs. 62 years, P = 0.013) comparing with those harboring fusion-negative tumors. They also showed a significantly higher preponderance of hepatic flexure localization (59% vs. 12%, P < 0.001) and poor differentiation (55% vs. 23%, P = 0.019) (Table 2). No statistically significant differences in other clinicopathological features, including gender, stage, mucinous differentiation, lymphovascular and perineural invasion were observed between two groups.

Table 1 Clinicopathological features of 22 tumors harboring gene fusions detected by either DNA or RNA next generation sequencing
Fig. 5
figure 5

Histologic features of colorectal carcinomas harboring gene fusions. A Poorly differentiated area in tumor harboring TPM3(e7)–NTRK1(e10) fusion, showing ribbon-like growth pattern. B Poorly differentiated area in tumor harboring TPM3(e7)–NTRK1(e9) fusion, displaying vague nested growth pattern. C Mucinous differentiated area in a tumor harboring ETV6(e5)–NTRK3(e15) fusion. D Diffuse signet-ring mucinous component in a LMNA(e9)–NTRK1(e12) fusion tumor. E Lymphovascular invasion in a TPM3(e5)–NTRK1(e11) fusion tumor; F Perineural invasion in a ETV6(e5)–NTRK3(e15) fusion tumor

Table 2 Comparison of clinicopathological features between fusion-positive MLH1 hypermethylated colorectal cancers, and fusion-negative MLH1 hypermethylated colorectal cancers


It has been documented in our previous study that oncogenic fusions were significantly enriched in dMMR CRCs harboring hypermethylated MLH1 and wild-type BRAF/RAS [10]. Herein, we conducted further study using integrative DNA and RNA sequencing, aimed for more accurate and comprehensive characterization of gene fusions in CRCs. We proved that RNA NGS was a valuable addition to DNA NGS for enhancing fusion detection (46–56% in MLH1me+ BRAF/RAS wild-type dMMR CRCs), as well as identifying novel or atypical fusion types. An optimizing strategy incorporating RNA NGS to screen for oncogenic fusions in CRCs was thus proposed. Next, we presented a detailed analysis of molecular genetic profile and clinicopathological features of fusion-positive dMMR CRCs. All fusions involved RTK-RAS signaling pathway, predominantly RTKs, and were mutually exclusive to other RTK-RAS driver mutations. WNT pathway alterations were also frequently detected. Fusion-positive tumors were typically diagnosed in elder patients, predominantly right-sided, preferentially occurred at hepatic-flexure and showed histologically poor-differentiated components.

Considering the distinct advantages over other techniques in gene fusion detection, the latest National Comprehensive Cancer Network guideline for non-small cell lung cancer recommended RNA-based NGS in patients with no identifiable driver oncogenes detected by broad panel DNA NGS [19]. In the present study, we revealed that nearly 20% (n = 4) MLH1me+ dMMR tumors with neither oncogenic fusions nor BRAF/RAS driver mutations detected by DNA NGS were positive for gene fusions by RNA NGS. In all of these four cases, the genomic breakpoints were located at large introns or intronic repetitive elements, which were typically not sufficiently covered by large hybrid-capture based DNA NGS panel. In our cohort, fusion-positive tumors by integrative DNA and RNA NGS represented 11% of dMMR cases, 24% of MLH1me+ dMMR cases, and 56% of MLH1me+ dMMR cases with wild-type BRAF/RAS. These proportions were much higher in comparison to that reported in prior DNA-based large-scale clinical research using MSK-IMPACT assay [9], suggesting that optimizing fusion detection process by incorporating additional RNA NGS was able to achieve a considerably higher yield of gene fusions in CRCs. In addition, RNA NGS successfully identified two potentially actionable kinase fusions (SNRNP70-MET and YPEL1-MAPK1) which have not been reported in CRCs before. Therefore, we suggested the sequentially combined use of DNA NGS and RNA NGS as a highly effective strategy to uncover oncogenic gene fusions in MLH1me+ CRCs, which were suggested as markers for unfavorable prognosis and targets for personalized therapy [20]. In clinical settings where BRAF/RAS PCR was applied as an alternative to DNA NGS, direct RNA NGS was recommended in BRAF/RAS wild-type cases for maximized cost-efficiency.

RNA extracted from fresh-frozen (FF) tissue was preferentially used for gene expression study. However, the availability of FF tissue was very limited in clinical practice. FFPE specimens represent more accessible and exploitable sources for molecular studies. Despite that RNA isolated from FFPE samples often suffer degradation and chemical modification due to fixation and archiving method, recent comparative studies have reported high correlation of RNA NGS detected gene expression profile between paired FFPE and FF samples [21, 22]. Notably, artifacts introduced during library preparation and sequence alignment might hamper the reliable prediction of gene fusions by RNA NGS, leading to unaligned or out-of-frame transcripts. In clinical practice, sequential cross-validation using PCR or Sanger sequencing might be considered for RNA-NGS detected novel fusions, especially those with low abundance transcripts and with multiple breakpoints within the same exon of the fusion partner [22].

Aberrant activation of RTK-RAS signaling pathway has been well-recognized as key molecular event in CRC tumorigenesis. Previously, among MLH1me+ dMMR CRCs, RTK-RAS activation was generally considered to be mediated by BRAF oncogenic mutation, occurring at the early stage of serrated neoplasia pathway [23]. In this and our prior studies [14], we revealed that almost all gene fusions were detected in dMMR CRCs harboring hypermethylated MLH1, which presented as the only RTK-RAS driver alteration in these tumors. It is rational to suggest gene fusions as one major mechanism of RTK-RAS oncogenic activation in MLH1me+ dMMR CRCs, second only to BRAF mutation. Most of the fusion-positive cases harbored RTK fusions susceptible to tyrosine kinase inhibition therapy. In spite of the rarity, it is worth noting that a minority of fusions involved MAP3K(BRAF) and MAP1K, genes encoding key components of downstream mitogen-activated protein kinase (MAPK) cascade which were essential for intracellular RTK-RAS signal transduction. Due to the potential feedback activation of EGFR [24, 25], combination therapy consisting of both EGFR and RAS/RAF inhibitors might be required in these cases [26,27,28].

Despite that dMMR was typically considered as a favorable prognostic marker in CRC patients, oncogenic fusions have been shown to be associated with poorer clinical outcome [29, 30]. The detected genetic fusions primarily affected RTKs, and rendered those tumors amenable to FDA approved targeted therapy that might reverse the otherwise poor prognosis. Therefore, efficient identification and detailed characterization of fusion variants is of key clinical significance. In our dMMR CRC cohort, TRK fusions, particularly NTRK1 fusions, were the most frequently detected fusion events. We observed that TPM3 was the most common fusion partner of NTRK1 in CRCs (66%), which was in consistent with previous reports [31, 32]. NTRK1-LMNA and NTRK1-PLEKHA6, two other NTRK1 fusion types documented in CRCs before [31], were found to take a lesser proportion in our cases. We did not detect NTRK1 fusions with SCYL3 and TPR, which have been reported rarely before [32]. In previously published reports, NTRK3 fusions were found in only a few CRCs, accounting for two out of 21 fusion events in cases assessed by MSK-IMPACT testing [9], and one out of 16 NTRK fusion events in cases screened by pan-TRK IHC testing [32]. However, it has been implicated that substantial numbers of NTRK3 gene rearrangements occurred at large introns (NTRK3 intron 13 and 14), and might be omitted by DNA NGS alone [7]. Also, large scale clinical researches have documented a lower sensitivity of pan-TRK IHC assay for NTRK3 fusions comparing to NTRK1/2 fusions [33, 34]. In the present study, using sequentially combined DNA NGS and RNA NGS, we observed a much higher proportion of NTRK3 fusions in all detected fusion events (5/22). This finding further justified incorporating RNA NGS in clinical practice to more efficiently identify fusion-positive tumors, especially those harboring NTRK3 fusions. Although several rare NTRK3 fusion types were previously identified in CRCs, including KANK1-NTRK3, COX5A-NTRK3 and VPS18-NTRK3 [11, 32], here we observed that NTRK3 exclusively formed fusion with its main partner gene ETV6 or EML4. As far as we can see, two of the gene fusions affecting RTKs presented in our cohort were not well-documented in CRCs previously. An EML4-ALK fusion was found to involve atypical ALK breakpoint within exon 19 that encoded transmembrane domain. ALK rearrangements at exon 19, instead of usual site within intron 19 or exon 20, has only been rarely described in malignant stromal sarcoma [35] and lung adenocarcinoma [36, 37] before. Except for a case demonstrating a partial response to targeted therapy [36], reports on clinical implication of this breakpoint were very limited. A MET fusion with novel partner gene SNRNP70 encoding a key component of spliceosome was identified in one case. Although MET gene copy number gain and protein overexpression were proved to drive CRC tumor malignant progression [38], MET gene fusions have not been noted in CRCs before.

Apart from RTKs, gene fusions involving the downstream MAPK cascade were also potentially actionable. Both of the two fusions affecting MAPK cascade detected in our cohort have been rarely reported before. The CUL1(e7)–BRAF(e9) fusion was previously observed in a few cases of melanoma [39] and low-grade serous carcinoma (LGSC) [40], and only once in CRC [9]. Tumor cells harboring CUL1-BRAF fusion have been found to show activation of MAPK signaling pathway and sensitivity to MEK/RAF inhibition. Moreover, complete response to MEK inhibitor-based combination therapy was noted in one LGSC patient bearing CUL1-BRAF fusion [40]. The YPEL1(e1)–MAPK1 (e5) was a novel fusion to our limited knowledge. Typically, abnormal overactivation of MAPK1 (ERK) was induced by hyperactivated upstream RTK/RAS signaling. Gain-of-function mutations in the gene itself were only seldomly documented in laboratory models or in clinical cases [41]. Since only part of MAPK1 C-terminal kinase domain was involved in the detected YPEL1-MAPK1 chimeric transcript, whether this fusion gene possessed oncogenic properties awaited further investigation. Given that constitutively activated RTK fusions could concurrently induce downstream RAS and PI3K pathways, it is not surprising to find the general low frequency of PI3K pathway aberration among tumors harboring RTK fusions. However, PIK3CA and PTEN mutations were observed in these two cases with fusions involving MAPK cascade. This finding indicated that despite the well-established intimate intersection of RTK downstream pathways RAS-MAPK and PI3K-mTOR, constitutive activation of MAPK cascade by gene rearrangements might not be sufficient to cross-activate PI3K-mTOR signaling and give rise to malignant transformation events.

We observed that RNF43 was the most frequently mutated one among all genes analyzed in this study. This result strengthened our previous finding that RNF43 inactivation was directly correlated with MLH1 hypermethylation, instead of BRAF mutation status [14]. Nearly 90% of the fusion-positive cases were presented with WNT pathway alterations. Additionally, four out of 12 top recurrently mutated genes (RNF43, APC, FBXW7 and ARID1A) were found to be involved in WNT signaling. It is rational to assume that synergistic cooperation of WNT pathway components might play an important role in tumorigenesis of fusion-positive CRCs. A very recent in vitro study revealed susceptibility to poly (ADP-ribose) polymerase (PARP) inhibitors in a subset of poor prognostic CRCs with DNA homologous recombination repair (HRR) pathway deficiency [42]. Our data showed that one third of fusion-positive tumors harbored mutations in crucial HRR genes ATM and BRCA2, and lay a rationale for further clinical studies investigating PARP inhibitors as a potential therapeutic option for these tumors.

Based on large sample size and detailed molecular subclassification, we further conducted comparison between fusion-positive and fusion-negative tumors within MLH1me+ CRCs. Fusion-positive tumors were found to exhibit characteristic clinicopathological features, including old age, preferential hepatic flexure localization and poor differentiation. Typically, dMMR tumors were considered as a relatively homogeneous molecular entity characterized by vulnerability to immunotherapy, which have recently been approved by FDA as first-line treatment for metastatic dMMR CRCs. Our findings highlighted the delicate yet noticeable heterogeneity within dMMR CRCs, and justified more precise molecular subtyping for personalized diagnosis and therapy in CRCs. In addition, a recent study has uncovered the continuum variation of tumor molecular profile along the large intestine, and necessitated more precise classification of CRCs by tumor location [43]. In this study, we not only confirmed that fusion-positive CRCs were primarily right-sided lesions, but also specified that more than half of them were localized at hepatic flexure. In clinical practice, these results implicated that CRC patients with above-mentioned clinicopathological features might be prioritized for molecular assay for gene fusions, including RNA NGS.

In the present study, we found that fusion-positive tumors showed a significantly higher preponderance of hepatic flexure localization. Variations of microbiome, clinicopathological features and molecular profiles have been reported to be associated with primary tumor localization along the large intestine. Several studies have documented the emerging role of gut microbiota in CRC formation and progression [43, 44]. However, as far as we know, the microbiome characterization of hepatic flexure has not been well described. The mechanism underlying the preferential localization of fusion-positive in hepatic flexure remained to be further explored.

In summary, our study presented a practical and highly effective screening procedure for genetic fusions through integrated DNA NGS and RNA NGS in a selected subset of dMMR CRCs harboring hypermethylated MLH1. With a detailed description of fusion variants, molecular profile and clinicopathologic features, we further characterized fusion-positive CRCs as a distinctive subtype with key clinical significance.

Availability of data and materials

The datasets used, generated, and analyzed during the present study are available from the corresponding authors on reasonable request.



Colorectal carcinoma


Fluorescence in situ hybridization


Real-time polymerase chain reaction


Next generation sequencing


Mismatch repair deficient

MLH1 me + :

Hypermethylated MLH1


Formalin-fixed paraffin-embedded


Ribosomal RNA




Receptor tyrosine kinases


Mitogen-activated protein kinase


Low-grade serous carcinoma


Poly ADP-ribose polymerase


  1. Siegel RL, Miller KD, Jemal A. Cancer statistics, 2020. CA Cancer J Clin. 2020;70(1):7–30.

    Article  PubMed  Google Scholar 

  2. Mertens F, Johansson B, Fioretos T, Mitelman F. The emerging complexity of gene fusions in cancer. Nat Rev Cancer. 2015;15(6):371–81.

    Article  CAS  Google Scholar 

  3. Heyer EE, Deveson IW, Wooi D, Selinger CI, Lyons RJ, Hayes VM, et al. Diagnosis of fusion genes using targeted RNA sequencing. Nat Commun. 2019;10(1):1388.

    Article  CAS  Google Scholar 

  4. Lang UE, Yeh I, McCalmont TH. Molecular melanoma diagnosis update: gene fusion, genomic hybridization, and massively parallel short-read sequencing. Clin Lab Med. 2017;37(3):473–84.

    Article  Google Scholar 

  5. Reeser JW, Martin D, Miya J, Kautto EA, Lyon E, Zhu E, et al. Validation of a targeted RNA sequencing assay for kinase fusion detection in solid tumors. J Mol Diagn. 2017;19(5):682–96.

    Article  CAS  Google Scholar 

  6. Lam SW, Cleton-Jansen AM, Cleven AHG, Ruano D, van Wezel T, Szuhai K, et al. Molecular analysis of gene fusions in bone and soft tissue tumors by anchored multiplex PCR-based targeted next-generation sequencing. J Mol Diagn. 2018;20(5):653–63.

    Article  CAS  Google Scholar 

  7. Benayed R, Offin M, Mullaney K, Sukhadia P, Rios K, Desmeules P, et al. High yield of RNA sequencing for targetable kinase fusions in lung adenocarcinomas with no mitogenic driver alteration detected by DNA sequencing and low tumor mutation burden. Clin Cancer Res. 2019;25(15):4712–22.

    Article  CAS  Google Scholar 

  8. Cohen D, Hondelink LM, Solleveld-Westerink N, Uljee SM, Ruano D, Cleton-Jansen AM, et al. Optimizing mutation and fusion detection in NSCLC by sequential DNA and RNA sequencing. J Thorac Oncol. 2020;15(6):1000–14.

    Article  CAS  Google Scholar 

  9. Cocco E, Benhamida J, Middha S, Zehir A, Mullaney K, Shia J, et al. Colorectal carcinomas containing hypermethylated MLH1 promoter and wild-type BRAF/KRAS are enriched for targetable kinase fusions. Cancer Res. 2019;79(6):1047–53.

    Article  CAS  Google Scholar 

  10. Wang J, Yi Y, Xiao Y, Dong L, Liang L, Teng L, et al. Prevalence of recurrent oncogenic fusion in mismatch repair-deficient colorectal carcinoma with hypermethylated MLH1 and wild-type BRAF and KRAS. Mod Pathol. 2019;32(7):1053–64.

    Article  CAS  Google Scholar 

  11. Sato K, Kawazu M, Yamamoto Y, Ueno T, Kojima S, Nagae G, et al. Fusion kinases identified by genomic analyses of sporadic microsatellite instability-high colorectal cancers. Clin Cancer Res. 2019;25(1):378–89.

    Article  CAS  Google Scholar 

  12. Forbes SA, Beare D, Boutselakis H, Bamford S, Bindal N, Tate J, et al. COSMIC: somatic cancer genetics at high-resolution. Nucleic Acids Res. 2017;45(D1):D777–83.

    Article  CAS  Google Scholar 

  13. Chakravarty D, Gao J, Phillips SM, Kundra R, Zhang H, Wang J, et al. OncoKB: a precision oncology knowledge base. JCO Precis Oncol. 2017.

    Article  PubMed  PubMed Central  Google Scholar 

  14. Wang J, Li R, He Y, Yi Y, Wu H, Liang Z. Next-generation sequencing reveals heterogeneous genetic alterations in key signaling pathways of mismatch repair deficient colorectal carcinomas. Mod Pathol. 2020;33(12):2591–601.

    Article  CAS  Google Scholar 

  15. Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34(17):i884–90.

    Article  Google Scholar 

  16. Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9(4):357–9.

    Article  CAS  Google Scholar 

  17. Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29(1):15–21.

    Article  CAS  Google Scholar 

  18. DeLuca DS, Levin JZ, Sivachenko A, Fennell T, Nazaire MD, Williams C, et al. RNA-SeQC: RNA-seq metrics for quality control and process optimization. Bioinformatics. 2012;28(11):1530–2.

    Article  CAS  Google Scholar 

  19. National Comprehensive Cancer Network I. NCCN clinical practice guidelines in oncology (NCCN guidelines). Non-small cell lung cancer (version 2.2021). 2021.

  20. Pagani F, Randon G, Guarini V, Raimondi A, Prisciandaro M, Lobefaro R, et al. The landscape of actionable gene fusions in colorectal cancer. Int J Mol Sci. 2019.

    Article  PubMed  PubMed Central  Google Scholar 

  21. Esteve-Codina A, Arpi O, Martinez-Garcia M, Pineda E, Mallo M, Gut M, et al. A comparison of RNA-seq results from paired formalin-fixed paraffin-embedded and fresh-frozen glioblastoma tissue samples. PLoS ONE. 2017;12(1): e0170632.

    Article  Google Scholar 

  22. Barua S, Wang G, Mansukhani M, Hsiao S, Fernandes H. Key considerations for comprehensive validation of an RNA fusion NGS panel. Pract Lab Med. 2020;21: e00173.

    Article  Google Scholar 

  23. Parsons MT, Buchanan DD, Thompson B, Young JP, Spurdle AB. Correlation of tumour BRAF mutations and MLH1 methylation with germline mismatch repair (MMR) gene mutation status: a literature review assessing utility of tumour features for MMR variant classification. J Med Genet. 2012;49(3):151–7.

    Article  CAS  Google Scholar 

  24. Prahallad A, Sun C, Huang S, Di Nicolantonio F, Salazar R, Zecchin D, et al. Unresponsiveness of colon cancer to BRAF(V600E) inhibition through feedback activation of EGFR. Nature. 2012;483(7387):100–3.

    Article  CAS  Google Scholar 

  25. Corcoran RB, Ebi H, Turke AB, Coffee EM, Nishino M, Cogdill AP, et al. EGFR-mediated re-activation of MAPK signaling contributes to insensitivity of BRAF mutant colorectal cancers to RAF inhibition with vemurafenib. Cancer Discov. 2012;2(3):227–35.

    Article  CAS  Google Scholar 

  26. Huijberts SC, van Geel RM, Bernards R, Beijnen JH, Steeghs N. Encorafenib, binimetinib and cetuximab combined therapy for patients with BRAFV600E mutant metastatic colorectal cancer. Future Oncol. 2020;16(6):161–73.

    Article  CAS  Google Scholar 

  27. Kopetz S, Grothey A, Yaeger R, Van Cutsem E, Desai J, Yoshino T, et al. Encorafenib, binimetinib, and cetuximab in BRAF V600E-mutated colorectal cancer. N Engl J Med. 2019;381(17):1632–43.

    Article  CAS  Google Scholar 

  28. Van Cutsem E, Huijberts S, Grothey A, Yaeger R, Cuyle PJ, Elez E, et al. Binimetinib, encorafenib, and cetuximab triplet therapy for patients with BRAF V600E-mutant metastatic colorectal cancer: safety lead-in results from the phase III BEACON colorectal cancer study. J Clin Oncol. 2019;37(17):1460–9.

    Article  Google Scholar 

  29. Pietrantonio F, Di Nicolantonio F, Schrock AB, Lee J, Tejpar S, Sartore-Bianchi A, et al. ALK, ROS1, and NTRK rearrangements in metastatic colorectal cancer. J Natl Cancer Inst. 2017.

    Article  PubMed  Google Scholar 

  30. Pietrantonio F, Di Nicolantonio F, Schrock AB, Lee J, Morano F, Fucà G, et al. RET fusions in a small subset of advanced colorectal cancers at risk of being neglected. Ann Oncol. 2018;29(6):1394–401.

    Article  CAS  Google Scholar 

  31. Hsiao SJ, Zehir A, Sireci AN, Aisner DL. Detection of tumor NTRK gene fusions to identify patients who may benefit from tyrosine kinase (TRK) inhibitor therapy. J Mol Diagn. 2019;21(4):553–71.

    Article  CAS  Google Scholar 

  32. Lasota J, Chłopek M, Lamoureux J, Christiansen J, Kowalik A, Wasąg B, et al. Colonic Adenocarcinomas harboring NTRK fusion genes: a clinicopathologic and molecular genetic study of 16 cases and review of the literature. Am J Surg Pathol. 2020;44(2):162–73.

    Article  Google Scholar 

  33. Gatalica Z, Xiu J, Swensen J, Vranic S. Molecular characterization of cancers with NTRK gene fusions. Mod Pathol. 2019;32(1):147–53.

    Article  CAS  Google Scholar 

  34. Hechtman JF, Benayed R, Hyman DM, Drilon A, Zehir A, Frosina D, et al. Pan-Trk immunohistochemistry Is an efficient and reliable screen for the detection of NTRK fusions. Am J Surg Pathol. 2017;41(11):1547–51.

    Article  Google Scholar 

  35. Ren H, Tan ZP, Zhu X, Crosby K, Haack H, Ren JM, et al. Identification of anaplastic lymphoma kinase as a potential therapeutic target in ovarian cancer. Cancer Res. 2012;72(13):3312–23.

    Article  CAS  Google Scholar 

  36. Doebele RC, Pilling AB, Aisner DL, Kutateladze TG, Le AT, Weickhardt AJ, et al. Mechanisms of resistance to crizotinib in patients with ALK gene rearranged non-small cell lung cancer. Clin Cancer Res. 2012;18(5):1472–82.

    Article  CAS  Google Scholar 

  37. Penzel R, Schirmacher P, Warth A. A novel EML4-ALK variant: exon 6 of EML4 fused to exon 19 of ALK. J Thorac Oncol. 2012;7(7):1198–9.

    Article  CAS  Google Scholar 

  38. Lee SJ, Lee J, Park SH, Park JO, Lim HY, Kang WK, et al. c-MET overexpression in colorectal cancer: a poor prognostic factor for survival. Clin Colorectal Cancer. 2018;17(3):165–9.

    Article  Google Scholar 

  39. Botton T, Talevich E, Mishra VK, Zhang T, Shain AH, Berquet C, et al. Genetic heterogeneity of BRAF fusion kinases in melanoma affects drug responses. Cell Rep. 2019;29(3):573-88.e7.

    Article  CAS  Google Scholar 

  40. Grisham RN, Sylvester BE, Won H, McDermott G, DeLair D, Ramirez R, et al. Extreme outlier analysis identifies occult mitogen-activated protein kinase pathway mutations in patients with low-grade serous ovarian cancer. J Clin Oncol. 2015;33(34):4099–105.

    Article  CAS  Google Scholar 

  41. Smorodinsky-Atias K, Soudah N, Engelberg D. Mutations that confer drug-resistance, oncogenicity and intrinsic activity on the ERK MAP kinases—current state of the art. Cells. 2020.

    Article  PubMed  PubMed Central  Google Scholar 

  42. Arena S, Corti G, Durinikova E, Montone M, Reilly NM, Russo M, et al. A subset of colorectal cancers with cross-sensitivity to olaparib and oxaliplatin. Clin Cancer Res. 2020;26(6):1372–84.

    Article  CAS  Google Scholar 

  43. Loree JM, Pereira AAL, Lam M, Willauer AN, Raghav K, Dasari A, et al. Classifying colorectal cancer by tumor location rather than sidedness highlights a continuum in mutation profiles and consensus molecular subtypes. Clin Cancer Res. 2018;24(5):1062–72.

    Article  CAS  Google Scholar 

  44. Stintzing S, Tejpar S, Gibbs P, Thiebach L, Lenz HJ. Understanding the role of primary tumour localisation in colorectal cancer treatment and outcomes. Eur J Cancer. 2017;84:69–80.

    Article  Google Scholar 

Download references




This work was supported by Non-profit Central Research Institute Fund of Chinese Academy of Medical Sciences (2019XK320045) and Chinese Academy of Medical Sciences (CAMS) Innovation Fund for Medical Sciences (No.2019-I2M-2-002).

Author information

Authors and Affiliations



JW, HW and ZL performed study concept and design; JW and HW performed development of methodology and writing, review and revision of the paper; JW, RL, JL, YY, XL, JC, WH and HZ provided acquisition, analysis and interpretation of data, and statistical analysis; JL and CL performed validation experiments. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Huanwen Wu or Zhiyong Liang.

Ethics declarations

Ethics approval and consent to participate

This study was approved upon ceding review by the Peking Union Medical College Hospital Review Board (Beijing, China).

Consent for publication

Not applicable.

Competing interests

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Table S1.

List of genes included in the 1021 genes panel.

Additional file 2: Figure S1.

Schematic representation of the predicted products of the 18 gene fusions detected by DNA NGS.

Additional file 3: Figure S2.

Schematic representation of the predicted products of the four gene fusions detected by RNA NGS.

Additional file 4: Figure S3.

Validation of EML4-NTRK3 fusion in sample 0394 and sample 0447 using RT-PCR (top panel) and Sanger sequencing (bottom panel). The sequence spanning the break point is 5′- CAGTCTCAAGTAAAG- GTCCCGTGGCTGTCA-3′, which confirms the fusion identified by our assay.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Wang, J., Li, R., Li, J. et al. Comprehensive analysis of oncogenic fusions in mismatch repair deficient colorectal carcinomas by sequential DNA and RNA next generation sequencing. J Transl Med 19, 433 (2021).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: