Research design
The institutional review board approved this retrospective analysis (approval number: 2019PHB171-01) and waived the requirement for informed consent. This study included 13 patients with locally advanced GC (pathologically proven tumor [T], node [N], and metastatic [M] status of T4aN + M0 as defined by the American Joint Committee on Cancer [AJCC]). All patients underwent contrast-enhanced multidetector CT (ceMDCT) scans, followed by curative surgery, chemotherapy, and follow-up. The frozen cancerous samples were all stored in the institute biobank.
The flowchart of the study is shown in Fig. 1. First, the study patients underwent preoperative abdominal ceMDCT. Tissue samples were assessed with transcriptome sequencing, and EMVI-related genes that demonstrated P values < 0.001 on Spearman analysis were selected for further analysis. The immunity scores for all GC patients in The Cancer Genome Atlas (TCGA) were then calculated using a single-sample gene set enrichment analysis (ssGSEA) algorithm, and patients were divided into high- and low- immunity groups based on these scores. The EMVI-immune-related genes were then generated by overlapping the immunity score differential genes (DEGs) and EMVI-related genes. Cluster analysis using the "ConsensusClusterPlus" package in R was then used to classify all GC samples in TCGA according to EMVI-immune-related genes. Next, the immune cell infiltration status of GC clusters was determined using the ssGSEA algorithm and the gene ontology (GO) and Kyoto Encyclopedia of Gene and Genomes (KEGG) analyses. The EMVI-immune-related gene scores for all GC samples in TCGA were then calculated using principal component analysis (PCA). The parameter lambda was adjusted using cross-validation, and the final EMVI-immune-related genes were selected to construct an EMVI score model. Next, all GC samples were classified into high and low EMVI score groups. Sensitivity to ICIs and immune evasion were also analyzed. A protein–protein interaction (PPI) network and modules of EMVI gene signature were established, and related pan-cancer analysis was performed using the "reshape2" and "ggpubr" packages. Finally, we performed western blotting and immunofluorescence of the core in PPI on GC cells and immunohistochemistry staining on formalin-fixed paraffin-embedded (FFPE) tissues from the patients initially included in the study.
CT-detected EMVI scoring and mRNA sequencing
CT-detected EMVI of GC could be scored from 0 to 4 [15], with scores of 0 to 2 defined as EMVI negative and scores of 3 or 4 defined as EMVI positive. Of the 13 included patients, 2 had EMVI scores of 1, 5 had scores of 2, 5 had scores of 3, and 3 had scores of 4. No patients had a score of 0, as all patients had T4a disease, meaning that the masses had a nodular appearance.
After quality control testing, frozen tumor tissue samples from these 13 patients underwent transcriptome sequencing using the Illumina HiSeq 4000 system. We used the R language to calculate the Spearman correlation between gene expression and EMVI score, with the threshold of correlation significance set at P < 0.001. A heat map was used to illustrate EMVI-related genes and the clinical data. The sequencing data had been uploaded to the Gene Expression Omnibus (GEO) database (GSE182831).
Gene data collection and processing
Through TCGA (https://portal.gdc.cancer.gov, accessed December 2021), we obtained raw data for the mRNA matrix in fragments per kilobase million (FPKM) format for GC, as well as the copy number data. The mRNA matrix raw data were processed to remove duplicate samples. We also obtained clinical data related to GC patients from TCGA. Through the GEO database (accessed December 2021), we downloaded the GSE84437 dataset and obtained the mRNA matrix and clinical data for GC patients. The FPKM matrix was converted into transcript per million (TPM) format and merged with the GEO matrix, and some missing genes were removed by batch correction. GO and KEGG files were downloaded from the GSEA website (https://www.gsea-msigdb.org, accessed December 2021). We then downloaded the Tumor Immune Dysfunction and Exclusion (TIDE) score, exclusion score, and dysfunction data from the TIDE database (http://tide.dfci.harvard.edu, accessed December 2021), as well as the GC's immune checkpoint therapy score data from The Cancer Immunome Atlas (TCIA) (https://tcia.at, accessed December, 2021) and the pan-cancer mRNA matrix raw and gene mutation data from the UCSC database (xena.ucsc.edu/December, 2021). A PPI network was constructed using the STRING website (https://string-db.org/cgi/input.pl, accessed December 2021).
ssGSEA
Through the ssGSEA algorithm, we calculated the immune cell infiltration status and immunity scores for all GC samples in TCGA. We divided the samples into high and low immune groups based on these immunity scores. The Estimate of Stromal and Immune cells in Malignant Tumors (ESTIMATE) was used to calculate the stromal score, immune score, and estimate score for the high and low immune groups. An immune microenvironment heat map was then drawn for the high and low immune groups. Using differential analysis (false discovery rate [FDR] < 0.05), we identified DEGs in the high and low immune groups.
Cluster analysis of EMVI-immune-related genes
We classified GC data using the "ConsensusClusterPlus" R package to verify the association between EMVI-immune-related genes and GC. A range of 2 to 9 was used; a κ value equaling 2 was considered the ideal number of clusters, as the correlation between groups was weak whereas the correlation within groups was strong. Using the ssGSEA algorithm, we determined the expression of immune cells in different GC clusters and illustrated them with boxplots. GO and KEGG analyses were performed for EMVI-immune-related genes. The "GSEABase" package and the R language "GSVA" package were then used to create a heat map of enriched functional pathways in GC clusters.
EMVI-immune-related gene signature construction
PCA was performed using the stats R package to calculate EMVI scores and to identify EMVI-immune-related genes for all GC patients. The penalty parameter (λ) was determined using tenfold cross-validation, and patients were then stratified into high and low EMVI score groups based on the median value of the EMVI score. The association between EMVI score and TMB was calculated using the "survival" and "survminer" R packages. The correlation between EMVI score and immune cell infiltration was analyzed using the ssGSEA algorithm. Correlations between EMVI score and clinical-pathological characteristics were illustrated with histograms and box plots using the "plyr" and "ggpubr" P packages. A NOMO model was constructed to classify risk factors as predictors of survival.
Response to ICIs treatment
We calculated the TIDE score, exclusion score, and dysfunction score to identify immune escape and immune dysfunction status in the high and low EMVI score groups. Using the EMVI score, we also analyzed the cytotoxic T-lymphocyte-associated antigen 4 (CTLA-4) and PD-1 immune checkpoints therapy in GC. MSI status was analyzed in the high and low score groups. A PPI network for EMVI-immune-related genes was constructed, and the core proteins and hub-genes were identified.
Cell culture
GC cells (GES-1, AGS, HGC-27, KATO III, MKN-1, MKN-45) were purchased from Procell Life Science & Technology (Wuhan, China), and the cells were cultured according to the manufacturer’s instructions. The cell lines were cultured in RPMI-1640 medium (Gibco, USA) and were supplemented with 10% fetal bovine serum (FBS) (Gibco) and 1% penicillin/streptomycin (Gibco).
Western blotting for GC cells
Western blotting was performed as described previously [16]. In brief, we used radioimmunoprecipitation assay (RIPA) buffer to extract the proteins needed for western blotting. Sodium dodecyl sulfate–polyacrylamide gel electrophoresis was then performed to separate the proteins, which were transferred to poly (vinylidene fluoride) membranes. The following antibodies were used: anti-SOX17 (Abcam, USA), anti-SDF1/CXCL12 (Abcam), and GAPDH (ImmunoWay, USA). Corresponding Alexa Fluor dyes were used for fluorescent detection. DAPI was used for nuclear counter staining. Images were captured on a Zeiss LSM780 laser scanning confocal microscope.
Immunohistochemistry on FFPE GC tissues
FFPE cancerous tissues from the 13 study patients were cut into 5-μm sections. Immunohistochemical (IHC) staining was performed using anti-SOX17 and anti-CXCL12. Tissue sections were deparaffinized in xylene and rehydrated in graded ethanol. Antigen retrieval was performed by heating sections in boiling sodium citrate buffer (Sigma-Aldrich, C-9999) for 20 min. After blocking was performed using 3% hydrogen peroxide and bovine serum albumin (BSA), the tissues were incubated with the primary antibody at 4 °C overnight. After being washed, the tissues were incubated with corresponding horseradish peroxidase (HRP)-conjugated secondary antibodies. Color was developed using diaminobenzidine (DAB) substrate (Sigma-Aldrich, D-7304), and slides were counterstained with hematoxylin. Images of three random areas from each section were captured at 20× magnification.
Pan-cancer analysis
The “reshape2” and “ggpubr” packages were used to draw a sorted boxplot of single gene expression in pan-cancer. Single-gene TMBs in pan-cancer samples were calculated, and radar charts of TMB and MSI were drawn using the "fmsb" package. With the "CIBERSORT" algorithm, the expression of immune cells in pan-cancer samples was calculated, and a co-expression heat map of pan-cancer immune cells was drawn.
Statistical analysis
The copy number variation frequencies of EMVI-related genes were obtained by calculating the gene copy number gains and deletions in GC samples from TCGA. The number of mutations in genes was calculated and waterfall charts were drawn. Using the "RCircos" package, a gene copy number circle diagram was drawn. Cox regression and co-expression analyses were used to construct a prognostic network associated with EMVI-related genes. Kaplan–Meier curves and log-rank analyses were used to compare survival among GC patients, and plots were constructed using the "survminer" package. A nomogram and receiver operating characteristic (ROC) curve were then constructed for risk stratification based on EMVI score and other clinical-pathological characteristics in GC patients.