Subtype-specific associations between breast cancer risk polymorphisms and the survival of early-stage breast cancer

Background Limited evidence suggests that inherited predisposing risk variants might affect disease outcome. In this study, we analyzed the effect of breast cancer-risk single nucleotide polymorphisms identified by genome-wide association studies (GWAS) on the survival of early-stage breast cancer patients in a Chinese population.
Methods This retrospective study investigated the relationship between 21 GWAS-identified breast cancer-risk single nucleotide polymorphisms and the outcome of 1177 early-stage breast cancer patients with a long median follow-up time of 174 months. Cox proportional hazards regression models were used to estimate the hazard ratios and their 95% confidence intervals. Primary endpoints were breast cancer specific survival and overall survival, while secondary endpoints were invasive disease free survival and distant disease free survival.
Results Multivariate survival analysis showed that only the rs2046210 GA genotype significantly decreased the risk of recurrence and death for early-stage breast cancer. When patients were grouped by breast cancer subtype, significantly reduced survival was associated with the variant alleles of rs9485372 for luminal A and rs4415084 for triple negative breast cancer. Importantly, three single nucleotide polymorphisms, rs889312, rs4951011 and rs9485372, had marked effects on survival of luminal B early-stage breast cancer (EBC), either individually or in combination. Furthermore, statistically significant multiplicative interactions were found between rs4415084 and age at diagnosis and between rs3803662 and tumor grade.
Conclusions Our results suggest that breast cancer susceptibility loci identified by GWAS may influence the outcome of early-stage breast cancer patients, depending on the intrinsic tumor subtype, in Chinese women.
Electronic supplementary material The online version of this article (10.1186/s12967-018-1634-0) contains supplementary material, which is available to authorized users.


Background
Breast cancer (BC) is the most commonly diagnosed cancer and the fifth leading cause of cancer death among women in China [1]. The 5-year survival of early stage breast cancer (EBC) patients in China is about 58-78%, which is low compared to that in the United States and varies between geographic areas of China [2]. Traditional prognostic factors for EBC survival include tumor size, lymph node involvement, tumor grade, and hormone receptor (HR) status. However, inherited host characteristics, such as single nucleotide polymorphisms (SNPs), have also been shown to play an important role [3].
survival of BC patients in those two studies. Differences might be due to the different sample sizes and the different enrolled BC cases. Still, those studies already demonstrated the possible associations between BC risk loci and BC survival.
Similarly, several BC-risk GWAS focusing on East Asian women have identified BC risk variants, most of which differ from those identified in other ethnic populations [8,9]. However, the relation between these polymorphisms and the survival of Asian EBC patients has never been established. In the present study, we analyzed the association between 21 GWAS-identified SNPs and the survival of patients with EBC in Southeastern China.

Study populations
This hospital-based study included 1177 early breast cancer cases treated at Fujian Medical University Union Hospital between July 2000 and October 2014. All participants had histopathologically confirmed invasive breast cancer and were subsequently treated with curative surgical resection and systemic therapy. Clinicopathological and demographic data were collected from hospital records, and survival data were obtained from the follow-up database, which was updated annually. Patients were staged according to the 7th edition of the American Joint Committee on Cancer (AJCC) tumor-node-metastasis (TNM) staging system [10]. Estrogen receptor (ER) and progesterone receptor (PR) positivity was determined by immunohistochemistry (IHC) as ≥ 10% positively stained nuclei, and hormone receptor (HR) positivity was defined as ER+ and/or PR+. Tumors were considered human epidermal growth factor receptor 2 (HER2) positive when cells exhibited strong membrane staining (3+). Tumors scored 2+ required further in situ hybridization testing for HER2 gene amplification, while scores of 0 or 1+ were regarded as negative. The subtypes were categorized as follows [11]: luminal A (ER+, PR+ > 20%, HER2−, and Ki67 < 14%, or grade I when Ki67 was unavailable); luminal B (HR+, HER2−, and Ki67 > 14%, or grade II/III when Ki67 was unavailable; or HR+, HER2+); HER2-enriched (HR−, HER2+); and triple negative (HR− and HER2−). The study was approved by the Institutional Ethics Committee, and all participants consented to genetic testing and data contribution at the time of their participation.
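The receptor and subtype rules above can be written as a small decision function. The following Python sketch is illustrative only: the function names, argument names and the "unclassified" fallback are assumptions that do not come from the study, and cases the text does not cover (for example Ki67 exactly 14%, or HR+/HER2− tumors with low Ki67 but PR ≤ 20%) are deliberately returned as "unclassified" rather than guessed.

```python
from typing import Optional


def her2_status(ihc_score: int, ish_amplified: Optional[bool] = None) -> Optional[bool]:
    """HER2 status from the IHC score; 2+ defers to in situ hybridization."""
    if ihc_score == 3:
        return True
    if ihc_score in (0, 1):
        return False
    if ihc_score == 2:
        return ish_amplified  # stays None until ISH has been performed
    raise ValueError("IHC score must be 0, 1, 2 or 3")


def classify_subtype(er_pct: float, pr_pct: float, her2_positive: bool,
                     ki67_pct: Optional[float] = None,
                     grade: Optional[int] = None) -> str:
    """Assign an intrinsic subtype from the rules stated in the text."""
    er_pos = er_pct >= 10          # positivity cut-off: >= 10% stained nuclei
    pr_pos = pr_pct >= 10
    hr_pos = er_pos or pr_pos      # HR+ means ER+ and/or PR+

    if not hr_pos:
        return "HER2-enriched" if her2_positive else "triple negative"
    if her2_positive:
        return "luminal B"         # HR+, HER2+

    # HR+, HER2-: Ki67 decides, with grade as the fallback when Ki67 is missing.
    luminal_a_receptors = er_pos and pr_pct > 20
    if ki67_pct is not None:
        if ki67_pct > 14:
            return "luminal B"
        if ki67_pct < 14 and luminal_a_receptors:
            return "luminal A"
        return "unclassified"      # assumption: not covered by the stated rules
    if grade in (2, 3):
        return "luminal B"
    if grade == 1 and luminal_a_receptors:
        return "luminal A"
    return "unclassified"
```

For instance, `classify_subtype(90, 50, False, ki67_pct=5)` falls into the luminal A branch, while the same tumor with Ki67 of 30% falls into luminal B.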

SNPs selection
We selected polymorphisms associated with breast cancer susceptibility from the US National Human Genome Research Institute (NHGRI) Catalog of Published Genome-Wide Association Studies. We used the following inclusion criteria: (i) a genome-wide association significance level of P ≤ 1 × 10−9; (ii) a minor allele frequency (MAF) of at least 10% in the HapMap CHB data of the public SNP database (http://www.ncbi.nlm.nih.gov/SNP); (iii) pairwise linkage disequilibrium (LD) between eligible SNPs, calculated with Haploview 4.1 software, of r² < 0.8. In total, 21 polymorphisms were included in this study; they are listed in Additional file 1: Table S1.
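The three inclusion criteria can be illustrated with a short filtering sketch. Everything below is hypothetical: the SNP names, P values, allele frequencies and r² values are invented, and the greedy rule for pairs in high LD (keep the SNP with the smaller P value) is an assumption, because the text does not say how one SNP of a linked pair was chosen.

```python
def select_snps(candidates, r2, p_max=1e-9, maf_min=0.10, r2_max=0.8):
    """Apply the P-value, MAF and pairwise-LD criteria to candidate SNPs.

    candidates: list of dicts with keys 'rsid', 'p' and 'maf'.
    r2: dict mapping frozenset({rsid_a, rsid_b}) to pairwise r^2;
        pairs that are absent are treated as unlinked (r^2 = 0).
    """
    eligible = [s for s in candidates if s["p"] <= p_max and s["maf"] >= maf_min]
    eligible.sort(key=lambda s: s["p"])  # most significant first (assumed rule)

    kept = []
    for snp in eligible:
        linked = any(
            r2.get(frozenset((snp["rsid"], other["rsid"])), 0.0) >= r2_max
            for other in kept
        )
        if not linked:
            kept.append(snp)
    return [s["rsid"] for s in kept]


# Hypothetical candidates, not data from the study.
candidates = [
    {"rsid": "snpA", "p": 1e-12, "maf": 0.30},
    {"rsid": "snpB", "p": 1e-10, "maf": 0.25},  # in high LD with snpA
    {"rsid": "snpC", "p": 1e-11, "maf": 0.05},  # fails the MAF criterion
    {"rsid": "snpD", "p": 1e-8, "maf": 0.40},   # fails the P-value criterion
    {"rsid": "snpE", "p": 5e-10, "maf": 0.20},
]
r2 = {frozenset(("snpA", "snpB")): 0.9}

selected = select_snps(candidates, r2)
print(selected)
```

In this toy input only snpA and snpE survive all three criteria.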

DNA extraction and SNPs genotyping
Blood samples were collected in EDTA anticoagulant tubes and stored at −80 °C until DNA extraction. Genomic DNA was extracted using the Whole-Blood DNA Extraction Kit (Bioteke, Beijing, China), according to the manufacturer's protocol. Genotyping was performed with SNPscan, a high-throughput SNP genotyping technology (Genesky Biotechnologies Inc., Shanghai, China). The raw data were then analyzed with GeneMapper 4.0 software (Applied Biosystems, Foster City, CA). Five percent of samples were randomly selected as blinded duplicates for quality assessment, and 100% concordance was obtained.

Statistical analyses
Overall survival (OS) and breast cancer specific survival (BCSS) were our primary endpoints, defined as the time from the date of cancer diagnosis to the date of death from any cause and from breast cancer, respectively. Invasive disease free survival (iDFS) and distant disease free survival (DDFS) were our secondary endpoints, calculated as the time from the date of diagnosis to the date of any recurrence and of distant recurrence, respectively, or to the last patient contact [12]. Survival data were analyzed using the Kaplan-Meier method with the log-rank test and multivariate stepwise Cox regression up to the end of follow-up (31 December 2016). Analyses were adjusted for age at diagnosis, tumor size, lymph node involvement, histological grade, ER status, and HER-2/neu expression. The hazard ratios (HRs) and 95% confidence intervals (CIs) for each factor in multivariate analyses were calculated from the Cox regression model. The Chi square-based Q test was used to examine heterogeneity between subgroups. Possible gene-environment interactions were also evaluated with Cox proportional hazards regression models. All tests were 2-sided, and P values < 0.05 were considered statistically significant. SAS 9.4 (SAS Institute Inc., Cary, NC) was used for all statistical analyses.
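To make the survival estimate concrete, the sketch below implements the Kaplan-Meier product-limit estimator in plain Python. It is not the SAS code used in the study. The function name and the toy follow-up times are hypothetical, and a patient censored at the same time as an event is assumed to be still at risk at that time, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times:  follow-up time for each patient (e.g. months since diagnosis).
    events: 1 if the endpoint (e.g. death) was observed, 0 if censored.
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Gather every patient whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve


# Hypothetical toy cohort: five patients, two of them censored.
curve = kaplan_meier(times=[6, 12, 12, 20, 30], events=[1, 1, 0, 1, 0])
for t, s in curve:
    print(f"t = {t:>2} months  S(t) = {s:.2f}")
```

With these five toy patients the estimated survival steps down to 0.80, 0.60 and 0.30 at 6, 12 and 20 months; the censored patients contribute to the risk set but do not cause a step.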

Patient characteristics and clinical features
Patients' clinical characteristics and survival are summarized in Table 1. All 1177 patients in the early breast cancer cohort were female, and their mean age at breast cancer diagnosis was 47.0 ± 10.3 years. During a median follow-up time of 174 months, 446 cases experienced recurrence (142 locoregional and 410 distant) and 343 died (333 of BC and 10 of other diseases).
No significant difference in BC-DDFS, BCSS, or OS was observed between age-at-diagnosis subgroups (P = 0.087, 0.420, and 0.402, respectively). However, patients with a tumor size > 2 cm, positive lymph nodes, grade III disease, clinical stage II + III, or HER2 positivity had significantly shorter survival times, whereas ER or HR positivity markedly improved the survival of EBC patients (log-rank P < 0.05, Table 1). Furthermore, the intrinsic molecular subtypes (luminal A, luminal B, HER2-enriched, and triple negative) were also associated with significantly different survival (log-rank P < 0.05, Table 1).

Prognostic implication of risk variants in molecular subtypes
Given the large number of patients enrolled in this study, we analyzed the association between the selected SNPs and survival within the different molecular subtypes of EBC. The results are shown in Table 3. No significant effect was observed in the HER2-enriched subtype for any of the 21 polymorphisms under any model.

Combined analysis of three risk SNPs on survival of luminal B EBC
To assess the combined effects on the risk of recurrence and death from luminal B EBC, we combined the risk genotypes of rs4951011, rs889312 and rs9485372. In univariate survival analysis by number of combined risk genotypes, iDFS, DDFS, BCSS and OS all differed significantly between groups (log-rank P < 0.01) (Fig. 1). As shown in Table 4, compared to subjects with one or no unfavorable genotype, subjects carrying more unfavorable loci had shorter survival and a 1.534- to 1.645-fold increased risk of recurrence and/or death, even after adjustment (iDFS: aHR = 1.534, 95% CI 1.288-1.827; DDFS: aHR = 1.632, 95% CI 1.356-1.964; BCSS: aHR = 1.570, 95% CI 1.267-1.944; and OS: aHR = 1.645, 95% CI 1.334-2.029, for trend).
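The reported intervals can be checked for internal consistency. If the confidence intervals are Wald-type intervals from the Cox model (an assumption, since the interval type is not stated), they are symmetric on the log scale, so the geometric mean of the two limits should reproduce the adjusted hazard ratio and the width gives the standard error of the log hazard ratio. The sketch below uses only the values quoted above; the function name is hypothetical.

```python
import math


def wald_summary(lower, upper, z=1.959964):
    """Recover the point estimate and log-scale SE from a 95% Wald CI."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    hr = math.exp((log_lower + log_upper) / 2)   # geometric mean of the limits
    se = (log_upper - log_lower) / (2 * z)       # SE of log(HR)
    return hr, se


# (aHR, lower limit, upper limit) as reported for the trend analysis.
reported = {
    "iDFS": (1.534, 1.288, 1.827),
    "DDFS": (1.632, 1.356, 1.964),
    "BCSS": (1.570, 1.267, 1.944),
    "OS": (1.645, 1.334, 2.029),
}

recovered = {}
for endpoint, (ahr, lower, upper) in reported.items():
    hr, se = wald_summary(lower, upper)
    recovered[endpoint] = hr
    print(f"{endpoint}: reported aHR {ahr:.3f}, recovered {hr:.3f}, SE(log HR) {se:.3f}")
```

For all four endpoints the recovered value agrees with the reported aHR to within rounding, which is what Wald intervals would produce.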

Stratification and interaction analysis
The associations between breast cancer risk loci genotypes and EBC survival were then evaluated by stratified analysis of age at diagnosis, tumor size, lymph node involvement, grade, hormone receptor status and HER2 status (Table 5). An interaction analysis was also performed (Table 6), and statistically significant multiplicative interactions on EBC survival were found both between rs4415084 genotypes and age at diagnosis (adjusted Pint: iDFS 0.045, DDFS 0.013, BCSS 0.025 and OS 0.018) and between rs3803662 genotypes and tumor grade (adjusted Pint: iDFS 0.011, DDFS 0.001, BCSS 4.7 × 10−4 and OS 9.9 × 10−4).
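A multiplicative interaction in a Cox model is commonly expressed with a product term. The parameterization actually used in the study is not stated, so the form below is an assumed, generic one: G is a genotype indicator, E is the stratification factor (for example age group or tumor grade), and Z collects the adjustment covariates.

```latex
h(t \mid G, E, \mathbf{Z})
  = h_0(t)\,\exp\!\bigl(\beta_G G + \beta_E E + \beta_{GE}\, G E
    + \boldsymbol{\gamma}^{\top}\mathbf{Z}\bigr),
\qquad
H_0 : \beta_{GE} = 0 .
```

Under this form, $\exp(\beta_{GE})$ is the ratio of the genotype hazard ratios between the two levels of E, and the interaction P value tests whether that ratio differs from 1.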

Discussion
In this study, we evaluated the possible relation between 21 GWAS-identified BC susceptibility germline variants and EBC clinical outcome in a large Chinese cohort of 1177 EBC cases. To the best of our knowledge, this is the first study to report the association between GWAS-identified BC susceptibility loci and clinical outcomes in a Chinese population, and its results differ from the findings of two American studies [6,7]. The most significant and novel result of this study is that the influence of BC risk polymorphisms on the outcome of EBC depends on the intrinsic molecular subtype, especially for luminal B breast cancer. Recently, Zhang and colleagues demonstrated that some GWAS-identified SNPs are associated with molecular subtypes of EBC in Chinese women [13]. It is widely accepted that breast cancer is a complex disease consisting of several intrinsic subtypes with different etiologies and prognoses [14]. It is increasingly recognized that putative SNPs, by altering the expression and/or function of related genes in key signaling pathways, may act in a subtype-dependent manner on both the risk and the clinical outcome of EBC [15-17].
The loci rs889312, rs4951011, and rs9485372 play significant and independent roles in the survival of luminal B breast cancer patients, both individually and jointly, across all four outcome indicators (iDFS, DDFS, BCSS and OS). Recently, MAP3K1 rs889312 has been identified as a low-penetrance risk factor for both ER+ and ER− breast cancer [18]. It was also shown to be an independent risk factor for poor survival in diffuse-type gastric cancer in an overdominant model [19]. However, two similar investigations failed to show that this variant was associated with BC clinical outcome [6,7], although neither carried out survival analysis by BC intrinsic subtype. In the most recent available data, rs889312 (C/C) was significantly associated with poor DFS, DDFS and OS among HR positive breast cancer patients [20], which is similar to our results. The MAP3K1 gene is a key member of the MAPK signaling pathway, which activates the transcription of essential cancer genes [21]. However, the exact mechanism by which rs889312 might change MAP3K1 protein structure and/or function remains unknown.
The SNP rs4951011, located in intron 2 of the zinc finger CCCH domain-containing protein 11A (ZC3H11A) gene and in the 5′-UTR of the ZBED6 gene, was first identified as a BC susceptibility locus in East Asians [8]. In another study, it was associated only with triple negative breast cancer and not with other BC subtypes [22]. For rs4951011 in the dominant model, we found that the GA + GG genotype was significantly associated with better iDFS, DDFS, BCSS and OS (aHR = 0.690-0.734). However, there is no evidence indicating a relation between this variant and the clinical outcome of other malignant tumors. ENCODE data from human mammary epithelial cells (HMEC) suggest that rs4951011 may be located in a strong enhancer region marked by peaks of several active histone modifications (H3K4me1, H3K4me3, H3K9ac, and H3K27ac) [23]. Furthermore, in colorectal cancer cell lines, repressing transcription of ZBED6 was found to modulate the expression of 10 genes, including PTBN1, WWC1, WWTR1, etc., that are linked to important signaling pathways and tumor development, depending on the genetic background of the tumor cells and the transcription state of its target genes [24]. Thus, rs4951011 may regulate the expression of some important metastasis-related genes and thereby influence the course of breast cancer.
The SNP rs9485372 was also found to play a significant role in the clinical outcome of luminal A and luminal B breast cancer patients. For luminal A BC, rs9485372 in the recessive model was associated with worse iDFS, DDFS, BCSS, and OS (aHR = 2.465-3.522). For luminal B BC, the GA + AA genotypes were associated with worse iDFS, DDFS, BCSS and OS (aHR = 1.482-1.557), compared to the GG genotype. This variant is located in TAB2 (TGF-β activated kinase 1/MAP3K7 binding protein 2), which plays a pivotal role in the TGF-β pathway and contributes to the development of cancer [25]. TAB2 is near the ESR1 gene and was found to be co-expressed with ESR1 in hepatocellular carcinoma [26]. TAB2 was also found to be a mediator of resistance to endocrine therapy, which is a poor prognostic indicator for HR+ breast cancer patients, and is a potential new target to reverse pharmacological resistance and potentiate antiestrogen action [27]. Therefore, it is possible that the association between rs9485372 and the survival of luminal A and B BC patients is mediated by regulation of estrogen signaling and the TGF-β pathway.
Two GWAS-identified BC risk loci, rs1219648 and rs13387042, were reported to affect the overall survival of EBC in Tunisians [28]. In contrast, we could not confirm this result in our Chinese population. We attribute this difference to the following reasons. Firstly, the two studies focused on different ethnic groups with different genetic backgrounds. Secondly, we used a much larger sample size and a longer follow-up than the other study, which may make our results more reliable. Finally, both studies are retrospective. We used the multivariate Cox proportional hazard model to evaluate the independent effect of every SNP on survival of EBC