Fukushima Medical University
Title Glycosyltransferase gene expression profiling identifies a molecularly distinct subtype of colorectal cancer associated with poor prognosis (full text)
Author(s) 野田, 勝
Citation
Issue Date 2017-03-24
URL http://ir.fmu.ac.jp/dspace/handle/123456789/946
Rights Fulltext: This is the preprint of "Clin Cancer Res. 2018 Sep 15;24(18):4468-4481. doi: 10.1158/1078-0432.CCR-17-3533.
©2018 American Association for Cancer Research".
DOI
Text Version ETD
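Doctoral Dissertation
Glycosyltransferase gene expression profiling identifies a molecularly distinct subtype of colorectal cancer associated with poor prognosis
(Identification of a poor-prognosis colorectal cancer subtype by glycosyltransferase gene expression profiling)
Department of Surgical Oncology, Graduate School of Medicine, Fukushima Medical University
野田 勝
Dissertation title: Identification of a poor-prognosis colorectal cancer subtype by glycosyltransferase gene expression profiling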
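Colorectal cancer is molecularly and genetically heterogeneous and is generally considered to comprise two molecular subtypes: a group showing deficient DNA mismatch repair (dMMR), and a group showing chromosomal instability together with mutations in genes such as KRAS and TP53. However, this understanding of molecular mechanisms has not translated directly into clinical application, and treatment decisions still depend on the TNM classification. Because tumor phenotypes are diverse and clinical outcomes differ even among patients at the same stage, there is a need to identify molecular subtypes of colorectal cancer that relate to the prognosis and treatment response of individual tumors, together with biomarkers that discriminate them. Glycosylation plays an important role in cellular function, and glycan structures on the cancer cell surface are known to change during colorectal carcinogenesis and progression. Cancer-specific glycans are directly linked to tumor phenotype and biological function, and their usefulness as cancer biomarkers has been strongly suggested. Glycosylation is a complex process involving a very large number of glycosyltransferases, and abnormalities of glycan biosynthesis at the transcriptional level are also known in cancer. We hypothesize that glycosyltransferase gene expression is associated with the clinicopathological features and clinical outcomes of individual colorectal cancers and may lead to the establishment of novel molecular subtypes and biomarkers.
We first clustered 177 colorectal cancers by comprehensive glycosyltransferase gene expression and found a group associated with poor differentiation and poor prognosis. We extracted 15 glycosyltransferase genes whose expression was specifically altered in this group and, by applying them to four colorectal cancer cohorts totaling 764 cases, verified the existence of a distinct group (15-glycogene cluster A) characterized by right-sided colon location, poorly differentiated or mucinous histology, dMMR, BRAF mutation, wild-type p53 and poor prognosis. GALNT6, a gene characterizing this group, was shown to be downregulated in dMMR cases by analyses of 11 independent cohorts totaling 2261 colorectal cancers and of 151 colorectal cancer cell lines. Furthermore, gene expression analyses and immunostaining of colorectal adenomas and carcinomas in multiple cohorts showed that, at both the mRNA and protein levels, GALNT6 is markedly upregulated in colorectal adenomas and downregulated in a subset of colorectal cancers. Because reduced GALNT6 expression correlated with methylation levels, and expression was restored by demethylating treatment of colorectal cancer cell lines, GALNT6 was suggested to be epigenetically silenced in a subset of colorectal cancers. GALNT6 protein expression was lost in approximately 15% of 335 colorectal cancers; this group was associated with poorly differentiated or mucinous histology, dMMR and poor prognosis, recapitulating the 15-glycogene cluster A described above. Multivariate analysis showed that loss of GALNT6 expression was associated with poor cancer-specific survival and overall survival independently of other clinicopathological factors, and its significance was particularly marked in stage III cases. In addition, among patients who received 5-FU-based postoperative adjuvant therapy, loss of GALNT6 expression was associated with poor recurrence-free survival. In colorectal cancer cell lines, siRNA-mediated knockdown of GALNT6 did not alter cell proliferation or apoptosis but increased invasion, migration and 5-FU resistance. Furthermore, analysis of the cell surface glycan profile by lectin microarray showed increased expression of the Tn antigen, which is recognized by the lectin HPA, in GALNT6-knockdown cells; this was verified by flow cytometry using the monoclonal antibody MLS128. The Tn antigen is known as a truncated cancer-associated glycan antigen that is highly expressed in poorly differentiated and mucinous carcinomas and is associated with high-grade phenotypes and poor prognosis. Therefore, in the early stages of colorectal carcinogenesis, GALNT6 expression is thought to be suppressed by epigenetic mechanisms in a subset of tumors, contributing to increased Tn antigen expression and cancer progression, which is consistent with the concept of incomplete glycan synthesis in cancer. Taken together, this study demonstrates the significance of GALNT6 expression as a biomarker associated with prognosis and treatment resistance in colorectal cancer, and suggests the importance of aberrant GALNT6 expression in colorectal carcinogenesis and progression.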
(Journal of publication, date of publication, volume, pages)
cause of cancer-death worldwide (1, 2). CRC is a highly heterogeneous disease that can develop through multiple molecular pathways involving the sequential accumulation of genetic and epigenetic alterations (3, 4). CRC is generally grouped into two biological categories: tumors with microsatellite instability (MSI), caused by defective function of the DNA mismatch repair (MMR) system, and tumors that are microsatellite stable but exhibit chromosomal instability (CIN) (4-6). The majority of CRCs (~85%) follow the CIN pathway, often accompanied by activating mutations in the KRAS oncogene and inactivation of the TP53 tumor suppressor gene.
Approximately 15% of tumors have deficient MMR (dMMR), are characterized by a hypermutated phenotype, and frequently carry oncogenic BRAF mutations (4, 5). Recent clinical trials and meta-analyses have implicated MMR status as an important classifier with potential for therapeutic stratification after surgery in the adjuvant setting (7-9). In the metastatic setting, KRAS mutation testing is currently used for predicting unresponsiveness to EGFR-targeted therapies, while the predictive value of BRAF mutation testing is under investigation (10-12). Despite this growing knowledge of the molecular mechanisms of CRC, the clinicopathological staging system remains the only prognostic classification currently used in clinical practice to guide treatment for each patient. However, clinicopathologically similar tumors can differ strikingly in clinical behavior, including drug response and patient survival, which likely reflects molecular heterogeneity. Although it is recommended that all patients with stage III CRC receive adjuvant chemotherapy, irrespective of MMR status, approximately 30-40% of patients develop recurrence even after curative surgery with postoperative chemotherapy (13-15). Likewise, the current clinical factors and molecular biomarkers are still insufficient to identify the 10-20% of stage II patients at high risk of recurrence who may benefit from adjuvant chemotherapy (15, 16).
Glycosylation is one of the most common and important post-translational modifications of proteins and lipids in mammalian cells and plays a pivotal role in the regulation of diverse physiological processes. The mechanism of glycosylation involves the sequential addition of single sugar residues to target structures, resulting in glycan elongation.
Further chemical modifications and branching can ultimately form a vast array of glycan structures. These processes are regulated by multienzymatic reactions of glycosyltransferases, whose encoding genes, termed glycogenes, account for 1% of the human genome. In human cancers, including CRC, cell surface glycans are known to undergo drastic changes during malignant transformation and tumor progression. Aberrant glycosylation can contribute to distinct biological functions and unique tumor phenotypes, thereby making glycans potential biomarkers for CRC.
For instance, a cancer-associated glycan epitope, CA19-9, also called sialyl Lewis A (sLea), is widely used as a serum tumor marker in the management of CRC to aid in monitoring clinical response to therapy and in disease surveillance after surgery (17). CA19-9 on the cancer cell surface is considered to facilitate cancer metastasis through cell-cell adhesion (18).
Correspondingly, elevated CA19-9 levels are known to be associated with hepatic metastasis and poor prognosis in CRC. Several other glycan epitopes, including sialyl Lewis X (sLex), sialyl Tn (STn), Tn and T antigens, have also been reported as cancer-associated glycans that are increased during colorectal carcinogenesis and are associated with poor prognosis. Altered glycan structures that preferentially appear in cancer cells can be attributed to dysregulation of multiple glycosyltransferases at the transcriptional level, for which two principal mechanisms have been postulated: “incomplete synthesis” and “neo-synthesis” (18-20). In incomplete synthesis, transcription of some glycogenes is repressed by epigenetic silencing during early stages of tumorigenesis, which leads to the biosynthesis of truncated structures such as Tn and STn. Conversely, in the neo-synthesis process, some glycogenes can be transcriptionally induced by tumor hypoxia accompanying cancer progression at advanced stages, leading to the de novo expression of cancer antigens such as sLea and sLex.
In the present study, with the aim of discovering distinct classes of CRC on the basis of glycosyltransferase expression, we compiled a large number of transcriptomic profiles. We first describe a novel CRC subtype based upon unsupervised clustering of genome-wide “glycogene” expression patterns, and characterize the glycogene-derived subgroup in terms of common clinicopathological variables, patient outcomes and known molecular markers. Using multiple cohorts consisting of more than 3000 samples and integrating other available data sources, including mutations, MMR status, methylation, protein expression and cancer-precursor samples, we further attempted to establish more practical, single gene-based biomarkers that can be applied to formalin-fixed paraffin-embedded (FFPE) specimens with high reproducibility. This led us to identify a glycogene, GALNT6, as a candidate biomarker for disease prognosis and therapeutic outcome in CRC. Moreover, we investigated the functional characteristics of GALNT6 in relation to the incomplete synthesis of cell surface glycans and the expression of the cancer-associated Tn antigen, which suggested a contribution of altered GALNT6 expression to colorectal carcinogenesis.
Materials and Methods
Microarray data analysis
All microarray data are publicly available from the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo). We utilized the normalized expression values obtained from each dataset. If a gene was represented by multiple probe sets, the expression values of those probes were averaged. To generate a list of glycogenes, official gene symbols and Entrez Gene IDs for 190 glycogenes were obtained from GGDB (GlycoGene DataBase; http://acgg.asia/ggdb2/). Among the 190 glycogenes, 185 unique genes were converted to Affymetrix_3PRIME_IVT_ID (2401 probes / 185 genes) using DAVID Bioinformatics Resources 6.7 (The Database for Annotation, Visualization and Integrated Discovery; http://david.abcc.ncifcrf.gov/home.jsp) as shown in Supplementary Table 1.
contains 483 probes corresponding to the 185 glycogenes. This dataset consisted of 177 CRC patients with disease-specific survival (DSS), disease-free survival (DFS) and overall survival (OS) information (21). Expression levels of the 185 glycogenes were median-centered, and then genes and samples were subjected to unsupervised hierarchical clustering by the centroid linkage method using the Cluster 3.0 program. The results were visualized using the Java Treeview program (22, 23). Among 39 genes differentially expressed between the two major clusters (Cluster A vs B, p<0.001 by t-test), 15 genes exhibited significant differential expression between the subcluster (Cluster A1) and the remaining subclusters (Clusters A2, B1 and B2) at a stringent threshold of p<0.0001. We then obtained two Affymetrix datasets with DFS information, GSE41258 (HG-U133A) and GSE33113 (HG-U133+2.0), which originally contained 186 stage I-IV and 90 stage II primary CRCs, respectively (24, 25). We used 121 stage I to III patients in GSE41258 and 89 stage II patients in GSE33113 with available survival information for hierarchical clustering. Based on the 3 independent clustering analyses, 3 glycogenes that were consistently upregulated or downregulated between clusters with log2 fold-change >0.4 were identified.
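The probe-averaging and median-centering steps described above can be sketched as follows. This is a minimal illustration in plain Python, not the Cluster 3.0 workflow used in the study; the probe identifiers and expression values are hypothetical toy data, and the function names are introduced here only for illustration.

```python
from collections import defaultdict
from statistics import mean, median


def average_probes(probe_values, probe_to_gene):
    """Collapse probe-level values to gene-level values by averaging.

    probe_values:  {probe_id: [expression value per sample]}
    probe_to_gene: {probe_id: gene_symbol}
    """
    grouped = defaultdict(list)
    for probe, values in probe_values.items():
        grouped[probe_to_gene[probe]].append(values)
    # zip(*rows) iterates sample by sample across all probes of one gene
    return {gene: [mean(col) for col in zip(*rows)]
            for gene, rows in grouped.items()}


def median_center(gene_values):
    """Subtract each gene's median across samples from its values."""
    centered = {}
    for gene, values in gene_values.items():
        m = median(values)
        centered[gene] = [v - m for v in values]
    return centered


# Hypothetical probe IDs and values (three samples), for illustration only
probes = {"p1": [1.0, 3.0, 5.0], "p2": [3.0, 5.0, 7.0], "p3": [2.0, 2.0, 8.0]}
mapping = {"p1": "GALNT6", "p2": "GALNT6", "p3": "FUT8"}

genes = average_probes(probes, mapping)
centered = median_center(genes)
```

The centered gene-by-sample values would then be the input to hierarchical clustering, which is left to a dedicated tool here.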
Assembly of the TCGA datasets
Level 3 Illumina RNA-Seq data for both colon and rectal adenocarcinoma (COADREAD) were downloaded through cBioPortal (http://www.cbioportal.org/) (26). For the TCGA samples, clinicopathological and molecular features, including patient age, gender, pathological stage, tumor histology, tumor location, microsatellite instability, and somatic mutations in KRAS, HRAS, NRAS, BRAF, and TP53, were obtained from the TCGA data portal (http://tcga-data.nci.nih.gov/) in June 2015 (5). We utilized two different versions of the RNA-Seq data, normalized by either the RPKM or the RSEM method. These two TCGA datasets, namely RNA-Seq RPKM and RNA-Seq V2 RSEM, contained 193 and 361 CRC samples, respectively, with available clinicopathological and molecular information, after removing 3 redundant samples from the latter dataset. Hierarchical clustering based on the mRNA expression z-scores for the 15 glycogenes was applied to each TCGA dataset as described above.
For the analysis of GALNT6, both mRNA expression z-Scores by RNA-Seq V2 RSEM and DNA methylation β-values by Illumina Infinium HumanMethylation450 for 357 samples with available MMR status were also downloaded from cBioPortal.
Association with molecular features
In addition to the TCGA and GSE41258 datasets that had molecular profiles, we further obtained 8 independent microarray datasets, including GSE39582 (n=566), GSE39084 (n=70), GSE42284 (n=188), GSE26682 (n=331), GSE13294 (n=155), GSE4554 (n=84), GSE13067 (n=74) and GSE18088 (n=53) (27-33). They were identified by searching the GEO database for datasets containing more than 10 dMMR samples each. These additional datasets were based on the Affymetrix HG-U133+2.0 platform, except for GSE42284, which was based on the Agilent Homo sapiens 37K DiscoverPrint_19742 platform. This led to the analysis of 11 cohorts consisting of a total of 2261 patients. We also obtained a microarray dataset, GSE59857, based on the Illumina HumanHT-12 V4.0 expression beadchip, in which mutational and transcriptional profiles of 151 CRC cell lines were available (34).
Precursor lesions
We used formalin-fixed, paraffin-embedded (FFPE) specimens of endoscopically-resected colorectal adenomas from 40 patients treated at Fukushima Medical University Hospital as described previously (35). In GSE41258 dataset, expression data for 53 normal colon and 49 colon adenoma samples were available (24). We also utilized three microarray datasets of colon biopsy specimens from normal colon, adenoma and carcinoma, including GSE4183 (8 normal, 15 adenoma and 15 cancer), GSE77953 (13 normal, 17 adenoma and 17 cancer) and GSE37364 (38 normal, 29 adenoma and 27 cancer) based on Affymetrix HG-U133+2.0 platform (36-38). Two additional microarray datasets, including GSE45270 and GSE79460, were further obtained to compare the expression of genes between conventional tubular adenomas and serrated adenomas (39, 40).
CRC samples
We enrolled 368 consecutive patients with primary CRC who underwent surgery between 1990 and 2010 at Fukushima Medical University Hospital. Tumors were classified according to the TNM classification of malignant tumors (UICC 7th edition) (41). After exclusion of patients who received preoperative chemotherapy or radiotherapy, 335 stage 0 to IV patients with available FFPE tumor sections were used in the present study. Among them, adjacent normal mucosae from 304 sections were also available for evaluation. Clinical information was retrospectively obtained by reviewing medical records, with the last follow-up in February 2016. For survival analysis, 17 patients with stage 0 tumors (carcinoma in situ) were omitted, and 267 stage I to IV patients who underwent curative resection (R0), with survival information, were utilized. The endpoints of interest were DFS, DSS, and OS, which were defined as the time from the date of surgery to the date of disease recurrence, cancer death, and death from any cause, respectively.
The study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Board of Fukushima Medical University.
Immunohistochemistry
For immunohistochemistry (IHC), primary rabbit polyclonal anti-GALNT6 antibody (HPA011762, Prestige Antibodies Powered by Atlas Antibodies, Sigma-Aldrich, Co. LLC. St. Louis, MO, USA) was identified using the Human Protein Atlas database (www.proteinatlas.org) (42, 43). Four-μm thick sections were deparaffinized and
Sections were incubated with anti-GALNT6 antibody at a 1:500 dilution in 10 mM phosphate-buffered saline containing Tween 20 (Sigma-Aldrich) at 4°C overnight, and the antibody was subsequently detected with a horseradish peroxidase (HRP)-coupled anti-rabbit polymer followed by incubation with diaminobenzidine (Dako EnVision+ System, Dako, Heverlee, Belgium) (44).
Sections were counterstained with hematoxylin.
Each section was considered positive for GALNT6 staining when more than 10% of tumor cells were stained in the cytoplasm, as previously described.
Determination of MMR status
In our FFPE cohort, IHC for MMR proteins was performed using the Dako EnVision+ System as described above, with primary mouse or rabbit monoclonal antibodies against MLH1 (ES05, 1:50, Dako), MSH2 (FE11, 1:50, Dako), MSH6 (EP49, 1:200, Dako) and PMS2 (EP51, 1:50, Dako). All 335 CRCs were initially stained for PMS2 and MSH6.
Subsequently, any tumors exhibiting abnormal or equivocal MMR protein staining were further evaluated for MLH1 and MSH2 (45). Loss of an MMR protein was defined as the absence of nuclear staining in tumor cells in the presence of positive nuclear staining in normal colonic epithelium and lymphocytes. In the expression datasets, MSI testing data (MSI-H, MSI-L and MSS) were obtained through the GEO or the TCGA data portal. Tumors demonstrating MSI-H or loss of at least one MMR protein were collectively designated as dMMR, and tumors with MSS/MSI-L or intact MMR protein expression as proficient MMR (pMMR).
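The dMMR/pMMR designation rule stated above can be written as a small decision function. This is only a sketch of the rule as worded in the text; the function name, argument names and the handling of samples with no data are assumptions made for illustration.

```python
def classify_mmr(msi_status=None, lost_proteins=None):
    """Designate a tumor as 'dMMR' or 'pMMR' from MSI testing and/or IHC.

    msi_status:    'MSI-H', 'MSI-L', 'MSS', or None if MSI was not tested
    lost_proteins: collection of MMR proteins with lost nuclear staining,
                   an empty collection if all are intact, or None if IHC
                   was not performed
    Returns None when neither data source is available (an assumption).
    """
    # MSI-H or loss of at least one MMR protein -> deficient MMR
    if msi_status == "MSI-H" or (lost_proteins is not None and len(lost_proteins) >= 1):
        return "dMMR"
    # MSS/MSI-L or intact MMR protein expression -> proficient MMR
    if msi_status in ("MSS", "MSI-L") or lost_proteins is not None:
        return "pMMR"
    return None


# Hypothetical example calls
status_a = classify_mmr(msi_status="MSI-H")
status_b = classify_mmr(lost_proteins=["MLH1", "PMS2"])
status_c = classify_mmr(msi_status="MSI-L")
status_d = classify_mmr(lost_proteins=[])
```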
Cell culture and reagents
Human CRC cell lines, LoVo, HCT15, HCT116, SW48, RKO, LS174T, LS180, SW480, SW620, SW837, Colo201 and Colo205 were obtained as described previously (35). RKO and LS180 cells were maintained with DMEM (ThermoFisher Scientific, Waltham, MA, USA); others with RPMI-1640 (ThermoFisher Scientific) containing 10%
fetal bovine serum (FBS) and penicillin/streptomycin (100 IU/ml) (ThermoFisher Scientific) at 37℃ in a humidified atmosphere of 5% CO2. A demethylation reagent, 5-aza-2’-deoxycytidine (5-aza-dC) (Sigma-Aldrich, St. Louis, MO, USA) was dissolved in DMSO at 10 mM and stored in aliquots at −80°C until use. Cells were treated with 5-aza-dC at different concentrations (5 and 10 μM) for 72 hours. The culture medium containing freshly prepared 5-aza-dC was replaced every 24 hours.
Quantitative real-time PCR
Total RNA was extracted from cells using TRIzol Reagent (ThermoFisher Scientific), and the amount of RNA was quantified by NanoDrop (46). One μg of total RNA was reverse transcribed to cDNA using the SuperScript III First-Strand Synthesis System (ThermoFisher Scientific) according to the manufacturer’s instructions. qRT-PCR was carried out in triplicate using TaqMan Gene Expression Master Mix on the 7500 real-time PCR system with TaqMan assays, including GALNT6 (Assay ID Hs00926629_m1), MLH1 (Hs00179866_m1), and ACTB (Hs99999903_m1) (ThermoFisher Scientific). Relative expression levels were determined with SDS software by the 2^(−ΔΔCt) method as described by the manufacturer, with ACTB used as the calibrator gene.
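For readers unfamiliar with the 2^(−ΔΔCt) calculation, a minimal sketch follows. It assumes the standard form of the method, in which the target gene Ct is first normalized to ACTB within each sample and then compared against a comparison sample (for example, control-treated cells; this choice is an assumption, as are all Ct values shown).

```python
def relative_expression(ct_target, ct_actb, ct_target_control, ct_actb_control):
    """Relative expression by the 2^(-ddCt) method.

    ct_target, ct_actb:                 Ct values in the sample of interest
    ct_target_control, ct_actb_control: Ct values in the comparison sample
    """
    delta_ct_sample = ct_target - ct_actb                   # normalize to ACTB
    delta_ct_control = ct_target_control - ct_actb_control  # same for comparison sample
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values: target amplifies 2 cycles later than in the control,
# at equal ACTB, which corresponds to a 4-fold lower expression
fold = relative_expression(ct_target=26.0, ct_actb=18.0,
                           ct_target_control=24.0, ct_actb_control=18.0)
```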
Western blotting
Total protein was extracted using RIPA lysis buffer supplemented with Halt Protease Inhibitor Cocktail (ThermoFisher Scientific), and samples were boiled in Tris-Glycine SDS Sample Buffer (ThermoFisher Scientific). Equal amounts of protein were loaded and separated on a 10% SDS-PAGE gel, and then transferred onto PVDF membranes (ThermoFisher Scientific). The membrane was blocked with 5% non-fat dried skimmed milk powder (Cell Signaling Technology), and incubated with primary rabbit anti-GALNT6 (#HPA011762, 1:250, Atlas Antibodies) or mouse anti-β-actin (#SC-69879, 1:2000, Santa Cruz Biotechnology). The membrane was then incubated with goat anti-rabbit or anti-mouse HRP secondary antibody (Santa Cruz Biotechnology), and developed with the SuperSignal West Pico Chemiluminescent Substrate (ThermoFisher Scientific) using an LAS4000 imager (GE Healthcare).
siRNA transfection
During the exponential growth phase, cells were plated in 6-well plates and transfected with siRNA oligonucleotides of GALNT6 or scramble control (Ambion® Silencer Select; s22154, s22155 and negative control #1) using Lipofectamine RNAiMAX Reagent (ThermoFisher Scientific), according to manufacturer’s instructions.
Following 48 hours of incubation, cells were harvested and used for each experiment. Experiments were repeated at least three times. The knockdown efficiency was evaluated using qRT-PCR and western blot analysis.
Cell proliferation assay
Cell proliferation was measured using the Cell Counting Kit-8 (CCK-8, DOJINDO, Kumamoto, Japan) according to the manufacturer’s instructions. Briefly, 24 hours after transfection, SW480 cells of three groups were harvested, re-suspended and seeded at a density of 4 × 10^3 cells/well in 96-well culture plates. After 24, 48 and 72 hours of incubation in complete medium, cells were incubated with 10 μl of the CCK-8 reagent at 37℃ in a humidified atmosphere of 5% CO2 for 3 hours. The absorbance at 450 nm was measured using a microplate reader.
The experiment was repeated three times, each time in triplicate.
Flow cytometry and detection of apoptosis
Apoptotic cells were detected using the Annexin V-PE/7-AAD Apoptosis Detection Kit (BD Biosciences,
labeling with Annexin V and 7-AAD, followed by flow cytometry. Annexin V-positive cells were regarded as apoptotic cells.
For the analysis of cell surface Tn-antigen expression, cell suspensions were incubated with mouse monoclonal anti-Tn antibody (MLS128, 1:00, Wako, Osaka, Japan), followed by staining with goat anti-mouse IgG H&L (Alexa Fluor 488) (ab150113, 1:00; Abcam, Cambridge, UK). The data were acquired on a FACSCanto II (Becton Dickinson, Franklin Lakes, NJ, USA) and analyzed with FlowJo software.
Wound-healing assay
Cells were seeded on a 6-well plate and allowed to reach confluency. After scratching the bottom of the well with a pipette tip, the monolayer of cells was washed two times with PBS to remove the detached cells. Photographs of wound closure were captured at 0, 6, 12, 18 and 24 hours after scratching using a phase contrast microscope. The percentage of wound closure was calculated as the ratio of the cell migration distance to the initial wound distance.
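The wound-closure calculation can be illustrated with a short sketch. It assumes that the migration distance at a given time point is the initial wound width minus the remaining wound width; the widths used below are hypothetical, and the function name is introduced only for illustration.

```python
def wound_closure_percent(initial_width, current_width):
    """Percent wound closure: migration distance relative to initial wound width.

    Both widths must be in the same unit (e.g. micrometers measured on the
    photographs). Migration distance is taken as initial minus current width.
    """
    if initial_width <= 0:
        raise ValueError("initial_width must be positive")
    migrated = initial_width - current_width
    return 100.0 * migrated / initial_width


# Hypothetical measurements: an 800-unit wound narrowed to 200 units
closure_24h = wound_closure_percent(800.0, 200.0)
```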
Transwell invasion assay
The invasive capacities of cells were determined using Corning BioCoat 24-Multiwell Tumor Cell Invasion Systems (Corning, NY, USA). Briefly, cells were resuspended in serum-free medium, 5 × 10^4 cells were seeded into the upper chamber, and 750 μL of RPMI containing 5% FBS was added to the lower chamber. After 22 hours of incubation at 37°C in a 5% CO2 atmosphere, cells that had migrated to the bottom of the filter were labeled using Calcein AM (Corning). Following incubation for 1 hour, fluorescence of invaded cells was read at wavelengths of 485/530 nm (Ex/Em) on a SkanIt RE for Varioskan Flash 2.4 (Thermo Fisher Scientific). Percentage of invasion was calculated as described by the manufacturer.
5-FU cytotoxicity assay
GALNT6 siRNA- and control siRNA-transfected SW480 cells were plated into 96-well plates at a density of 5 × 10^3 cells per well. After 24 hours of incubation, cells were treated with various concentrations of 5-FU (Sigma-Aldrich) and the plates were incubated for 72 hours at 37°C under 5% CO2. Cytotoxicity was assessed by CCK-8 assay as described above. We initially applied a series of 5-FU concentrations ranging from 0.1 to 1,000 μg/ml, or vehicle alone, to generate dose-response curves. We then used 1, 5, 10, 50, and 100 μg/ml of 5-FU for experiments. At least three independent experiments were performed and each sample was analyzed in triplicate.
Lectin Microarray
The lectin microarray was performed essentially as described elsewhere. Briefly, the membrane fractions of GALNT6 siRNA- and control siRNA-transfected cells were obtained using the ProteoExtract Subcellular Proteome Extraction kit (Merck Millipore, Darmstadt, Germany). One-μg of each sample was added to 100 μg of Cy3 monoreactive dye pack (GE Healthcare Life Science, Pittsburgh, PA, USA). Cy3-labeled proteins were desalted using a Zeba Spin Desalting Column (ThermoFisher Scientific) and diluted, and then 100 μl of samples were analyzed on a lectin microarray glass slide (LecChip ver 1.0; GlycoTechnica, Yokohama, Japan). Fluorescent images were acquired using an evanescent-field fluorescence scanner (GlycoStation Reader 1200; GlycoTechnica). The net intensity for each spot was calculated by subtracting the background value from the raw signal intensity, with averaging of the results from three spots. Acquired data were normalized by making the total fluorescence in each well (i.e. for 45 lectins) equivalent. All data were analyzed with GlycoStation Tools Pro Suite 1.5 (GlycoTechnica).
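The spot-level processing described above (background subtraction, averaging over three spots, and scaling each well to the same total fluorescence) can be sketched as follows. This is not the GlycoStation Tools implementation; apart from HPA, the lectin names, the intensities and the target total are hypothetical.

```python
def net_intensity(raw_spots, background):
    """Average background-subtracted intensity over replicate spots."""
    return sum(raw - background for raw in raw_spots) / len(raw_spots)


def normalize_well(lectin_signals, target_total=100.0):
    """Scale all lectin signals in one well so that they sum to target_total."""
    total = sum(lectin_signals.values())
    if total <= 0:
        raise ValueError("total fluorescence must be positive")
    return {lectin: value * target_total / total
            for lectin, value in lectin_signals.items()}


# Hypothetical raw triplicate spots for two lectins in one well
well = {
    "HPA": net_intensity([110.0, 120.0, 130.0], background=20.0),
    "lectin_B": net_intensity([320.0, 320.0, 320.0], background=20.0),
}
normalized = normalize_well(well)
```

Scaling every well to the same total makes signals comparable between the GALNT6 siRNA and control siRNA samples even if labeling efficiency differs between them.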
Statistical analysis
Fisher’s exact test, the Chi-square test, the unpaired t-test and the Mann-Whitney U test were used to determine differences between two variables. Spearman’s correlation was used to evaluate the correlations between levels of expression and methylation. Cumulative survival was estimated by the Kaplan-Meier method, and differences between two groups were analyzed by the log-rank test. Univariate and multivariate models were computed using Cox proportional hazards regression. All statistical analyses were conducted using GraphPad Prism v6.0 (GraphPad Software Inc., La Jolla, CA, USA) and SPSS Statistics version 24 (IBM Corporation, NY, USA). All P-values were two-sided, and P-values less than 0.05 were considered statistically significant.
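As an illustration of the Kaplan-Meier method named above, the product-limit estimator can be written in a few lines. This is a sketch in plain Python rather than the GraphPad Prism or SPSS procedure used in the study, and the follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate of the survival function.

    times:  follow-up time for each patient
    events: 1 if the event (e.g. recurrence or death) occurred, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect all patients whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving  # events and censored cases both leave the risk set
    return curve


# Hypothetical follow-up data (months); 0 marks a censored observation
curve = kaplan_meier(times=[5, 8, 12, 12, 20], events=[1, 0, 1, 1, 0])
```

In this toy example the estimate drops to 4/5 at month 5 and then to (4/5) × (1/3) at month 12, because only three patients remain at risk when the two events at month 12 occur.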
Results
Transcriptional glycogene profiling demonstrated subgroups of CRC with distinct survival outcomes
The overall study design is shown in Supplementary Figure 1. We generated a list of 2401 Affymetrix microarray probes that correspond to 185 unique glycogenes (Supplementary Table 1). Using the 185 glycogenes, we initially conducted an unsupervised hierarchical clustering analysis of 177 patients with stage I-IV colon cancer from GSE17536 (21). As depicted in Figure 1A, we found two major clusters (Cluster A and B) and four subclusters (Cluster A1, A2, B1 and B2). Although there was no significant difference in clinical parameters between the two clusters (Supplementary Table 2), Cluster A tended to be associated with worse clinical outcomes, including DSS (P=0.0696, stage I-IV) and DFS (P=0.0436, stage I-III) (Supplementary Figure 2A-B). Furthermore, when focusing on the four subclusters, patients segregating to Cluster A1 had the worst prognosis for DSS and DFS (Supplementary Figure 2C-D). Cluster A1 was significantly associated with poor DSS and DFS as compared to the remaining subclusters, comprising Cluster A2, B1 and B2 (P=0.0243 and P=0.0003, respectively, Figure 1B-C). This
stage and other clinical factors by multivariate analysis (Supplementary Table 3). Similar results were obtained in the analysis for OS. This poor prognosis subtype (Cluster A1) was also significantly associated with poorly-differentiated histology (P=0.007, Supplementary Table 2).
Since Cluster A and its subcluster Cluster A1 exhibited worse survival outcome, we next sought to identify a minimum set of genes whose expression was closely related to these poor prognosis subgroups. Thirty-nine differentially expressed genes between Cluster A and B were further narrowed down to 15 genes (designated the 15-glycogene signature) that were significantly altered between Cluster A1 and the remaining subclusters by applying stringent p-values (Supplementary Table 4).
Prognostic validation of the 15-glycogene signature in two independent datasets
To test the hypothesis that the 15-glycogene signature can discriminate prognostic subgroups, independent microarray datasets with survival information were utilized for hierarchical clustering analysis based on the 15 glycogenes. We initially used resected primary CRC samples with stage I-III disease from GSE41258 (24). As demonstrated in Figure 1F-G, 121 patients were clearly separated into two clusters, designated as 15-Glycogene Cluster A (n=56) and 15-Glycogene Cluster B (n=65), with a statistically significant difference in DFS (P=0.0021). Similar results were obtained when only stage II-III patients were analyzed (Figure 1H, n=94, P=0.0020). We next studied a homogeneous group of stage II colon cancers from GSE33113, in which 89 patient samples with DFS data were available (25). Clustering of the GSE33113 dataset by the 15 glycogenes closely recapitulated the prognostic subgroups, demonstrating that 15-Glycogene Cluster A (n=43) patients had significantly shorter DFS than 15-Glycogene Cluster B (n=46) patients (Figure 1I-J, P=0.0080). Also, multivariate Cox analysis revealed that 15-Glycogene Cluster A was significantly associated with poor survival, independent of stage and other clinical features (Supplementary Table 5). In all 3 independent clustering analyses, the expression of each of the 15 genes was consistently altered between clusters (Supplementary Figure 2E). Specifically, we identified upregulation of GCNT3 and FUT8, and downregulation of GALNT6, as common features of Cluster A1 and 15-Glycogene Cluster A.
The 15-gene signature identified a subgroup exhibiting unique clinicopathological and genomic profiles
To further characterize our glycogene-derived subtypes, we analyzed the association between the 15-Glycogene Clusters and known molecular markers. Using the GSE41258 dataset, in which MMR and TP53 mutation status were known, we found that tumors in 15-Glycogene Cluster A were significantly associated with dMMR (P=0.003) and wild-type TP53 (P=0.050) (Figure 1K and Supplementary Table 6). To validate this finding, the same clustering procedure was applied to two independent RNA-Seq datasets obtained from TCGA, consisting of RNA-Seq RPKM (n=193) and RNA-Seq V2 RSEM (n=361). Tumor histology, location, stage and MMR data were known for both datasets, whereas mutation data for RAS (KRAS, HRAS, and NRAS), BRAF and TP53 were also available for the former dataset. This demonstrated a clear association between dMMR and the 15-Glycogene Cluster A in both TCGA datasets, in agreement with the finding of the GSE41258 analysis (Figure 1F, K-L and Supplementary Table 6, P=5.0E-10 and P=6.0E-18, respectively). The association of this cluster with wild-type TP53 was also validated in the TCGA RNA-Seq RPKM dataset (P=1.1E-04). Intriguingly, we found that proximal location, mucinous histology, mutant RAS and mutant BRAF were significantly enriched in the 15-Glycogene Cluster A.
Decreased expression of GALNT6 in dMMR tumors in 11 independent cohorts of CRC patients and a dataset of CRC cell lines
Since the 15-Glycogene Cluster A was associated with specific genomic alterations, including dMMR, BRAF mutation, RAS mutation, and wild-type TP53, we next focused on single genes whose altered expression might be characteristic of these genomic features. To determine the potential relationship between known molecular markers and the expression of the 3 glycogenes GCNT3, FUT8 and GALNT6, 8 additional datasets with MMR status and/or mutation status were assembled. This allowed us to analyze the association between the expression of the 3 genes and known molecular subgroups in 11 independent cohorts containing a total of 2261 CRC patients (Figure 2). In particular, this analysis revealed a highly reproducible association between GALNT6 expression and MMR status: GALNT6 was statistically significantly downregulated in dMMR tumors in all 11 cohorts, comprising 393 dMMR and 1618 pMMR patients. GALNT6 expression was also significantly decreased in BRAF-mutant tumors in 3 of the 4 datasets analyzed, whereas it showed no clear tendency with respect to RAS or TP53 mutations. Notably, the association of decreased GALNT6 with dMMR and BRAF mutation was clearly recapitulated in CRC cell lines using the expression dataset GSE59857, in which 151 CRC cell lines were profiled on an Illumina microarray platform (Figure 2). Accordingly, we focused specifically on the significance of GALNT6, which encodes a member of the polypeptide GalNAc transferase enzyme family.
GALNT6 is upregulated in precursor lesions and downregulated in a subset of carcinoma upon malignant transformation
It is well accepted that genetic and epigenetic alterations accumulate in tumor cells and contribute to the transition from premalignant to malignant lesions. Also, downregulation of some glycogenes is known to be an important step in colorectal cancer development and progression (47, 48). Thus, we hypothesized that decreased GALNT6 expression may be crucially involved in at least certain pathways of colorectal carcinogenesis. To test this, we analyzed 4 microarray datasets that consisted of normal colon (n=113, in total), colon adenoma (n=110, in total) and carcinoma (n=227, in total) samples. In all 4 microarray datasets, GALNT6 mRNA expression was significantly higher
microarray analyses, IHC for GALNT6 protein expression were conducted using our large series of colorectal adenoma (n=40) and carcinoma specimens (n=335), in which 304 adjacent normal tissues were also available for evaluation.
This FFPE cohort contained 17 carcinoma in situ (Tis) samples (5.1%), and 25 of 335 carcinomas (7.4%) were determined to be dMMR (Table 1). IHC demonstrated that GALNT6 protein expression was not detected in the vast majority of normal colon mucosal cells (92.8% of 304 normal tissues were considered negative for GALNT6 expression).
In contrast, virtually all adenoma and carcinoma in situ (Tis) samples showed strong granular cytoplasmic staining of GALNT6 in tumor cells essentially throughout the tumor area (97.5% and 100.0% were positive for GALNT6, respectively) (Figure 4A-C). Likewise, in carcinoma specimens, intense GALNT6 staining was diffusely found in carcinoma cells (Figure 4D-E). However, approximately 15% of carcinomas lacked GALNT6 protein expression (Figure 4F-G and Table 1). We found that GALNT6 staining was frequently lost in dMMR tumors (52.0% were negative), whereas the majority of pMMR tumors showed positive GALNT6 staining (11.6% were negative) (Figure 3E and Table 1). This was highly consistent with the analysis of mRNA levels, where decreased GALNT6 was found in dMMR tumors (Figure 2 and 3D). Collectively, at both the mRNA and protein levels, GALNT6 expression was highest in precursor and preinvasive tumors, and was subsequently downregulated or lost in a subset of carcinomas, a change that was associated with dMMR.
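As a rough consistency check on the percentages above, the following sketch back-calculates approximate GALNT6-negative counts from the rounded proportions. It assumes, which this passage does not state, that all 335 carcinomas were scored for both MMR status and GALNT6, so that 310 tumors are pMMR; the function and variable names are illustrative only.

```python
# Back-calculate approximate GALNT6-negative counts from the rounded
# percentages reported in the text. Assumption: every one of the 335
# carcinomas has both an MMR call and a GALNT6 score.
TOTAL_CARCINOMAS = 335
DMMR_N = 25
PMMR_N = TOTAL_CARCINOMAS - DMMR_N  # assumed, not reported in this passage


def approx_count(n: int, percent: float) -> int:
    """Nearest whole number of cases matching a rounded percentage."""
    return round(n * percent / 100)


dmmr_negative = approx_count(DMMR_N, 52.0)   # GALNT6-negative dMMR tumors
pmmr_negative = approx_count(PMMR_N, 11.6)   # GALNT6-negative pMMR tumors
overall_negative_pct = 100 * (dmmr_negative + pmmr_negative) / TOTAL_CARCINOMAS

print(dmmr_negative, pmmr_negative, round(overall_negative_pct, 1))
```

Under that assumption the two strata give about 13 and 36 negative tumors, or roughly 14.6% of 335, in line with the "approximately 15%" quoted above.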
It has recently become apparent that CRC develops through histologically different pathways. Since we found that decreased GALNT6 mRNA expression was associated not only with MMR status but also with BRAF mutations (Figure 2), it was speculated that GALNT6 downregulation could be associated with the serrated pathway (49).
However, we observed no difference in GALNT6 expression between serrated adenomas and conventional adenomas in 2 independent datasets of histologically-confirmed adenoma samples (Supplementary Figure 3A-B).
Epigenetic silencing may contribute to GALNT6 downregulation
It is known that, upon malignant transformation, downregulation of some glycogenes, along with incomplete glycan synthesis in tumor cells, can result from epigenetic silencing of glycogenes, mainly by DNA hypermethylation (47). Thus, we addressed the possibility that DNA methylation could contribute to decreased GALNT6 expression in CRC. As shown in Figure 3F, we observed a significant inverse correlation between mRNA expression and methylation of GALNT6 in TCGA data (n=357, Spearman’s r=-0.49, P<0.0001). This suggests that reduced GALNT6 expression in CRC may be regulated at least in part by DNA methylation. Correspondingly, treatment of CRC cell lines with a DNA methyltransferase inhibitor, 5-aza-dC, induced GALNT6 expression in LoVo and SW48 cells, irrespective of MLH1 methylation status (Figure 3G-H). On the other hand, in HCT116 and RKO cells, GALNT6 expression was not restored by 5-aza-dC. These findings suggest that decreased GALNT6 expression and the loss of GALNT6 enzyme activity may contribute to the early stages of colorectal carcinogenesis through epigenetic silencing, at least in a subset of CRC.
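For readers unfamiliar with the statistic, the inverse expression-methylation relationship reported above is a Spearman rank correlation. The sketch below is a minimal pure-Python illustration of how such a coefficient is computed (Pearson correlation applied to average ranks); it is not the authors' analysis code, and the expression and methylation values are hypothetical placeholders rather than TCGA data.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical values only: higher methylation paired with lower expression.
methylation = [0.10, 0.25, 0.40, 0.55, 0.80]
expression = [9.1, 8.7, 8.9, 6.2, 5.0]
print(round(spearman_rho(methylation, expression), 2))
```

A negative coefficient, as in the TCGA analysis, means that samples ranked higher for methylation tend to rank lower for expression; it does not by itself establish that methylation causes the silencing.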
Lack of GALNT6 protein expression was associated with poor prognosis
Since we identified and validated a transcriptional subtype, namely the 15-Glycogene Cluster A, characterized by poor prognosis, dMMR and decreased GALNT6 expression, we sought to examine the association between GALNT6 protein expression and clinicopathological characteristics as well as prognosis in 335 patients with CRC. Consistent with previous studies, dMMR CRC in our cohort (n=25) was significantly associated with younger age at diagnosis (P=0.011), proximal location (P=0.036), and poorly-differentiated or mucinous histology (P<0.0001) (Supplementary Table 7), but MMR status had no significant impact on DSS, DFS or OS. We found that tumors lacking GALNT6 expression were predominantly poorly-differentiated or mucinous adenocarcinomas (P<0.0001). In contrast, GALNT6 staining exhibited no association with other clinical features, including age, gender, tumor location or TNM classifications.
We next evaluated the prognostic significance of GALNT6 in 267 patients with stage I-IV CRC who underwent potentially curative resection. As shown in Figure 5A and Supplementary Figure 4A, patients with negative-GALNT6 tumors had significantly poorer DSS and OS than those with positive-GALNT6 tumors (P=0.0038 and P=0.022, respectively). This remained statistically significant for DSS and OS when the analysis was restricted to 195 patients with stage II and III CRC (Figure 5B and Supplementary Figure 4B, P=0.0008 and P=0.014, respectively).
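The survival comparisons above rest on Kaplan-Meier curves. As a minimal sketch of the product-limit estimator behind such curves, the following pure-Python function computes survival probabilities from follow-up times and event indicators. The function name and the toy follow-up data are hypothetical and are not taken from the study cohort; the log-rank comparison between groups is not shown.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (e.g. cancer death) occurred, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months for five patients (not study data).
toy_times = [5, 8, 8, 12, 20]
toy_events = [1, 1, 0, 0, 1]
for t, s in kaplan_meier(toy_times, toy_events):
    print(t, round(s, 3))
```

Censored patients contribute to the number at risk until their last follow-up but never lower the curve themselves, which is why the estimator can use patients with incomplete follow-up.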
Furthermore, univariate and multivariate Cox regression analyses revealed that the lack of GALNT6 protein expression was significantly associated with poor DSS (HR 3.01; 95%CI 1.16-7.80; P=0.024) and OS (HR 2.54; 95%CI 1.24-5.17; P=0.010), independent of stage, TNM classifications, tumor histology, and other conventional clinical factors (Table 2 and Supplementary Table 8). Stratified analyses further demonstrated that negative-GALNT6 had a significant prognostic impact on DSS and OS in stage III patients, but this was not evident in stage II patients (Figure 5C-D and Supplementary Figure 4C-D). However, for DFS, the prognostic value of GALNT6 expression showed only a trend that did not reach statistical significance (Supplementary Figure 4F-I).
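The hazard ratios, confidence intervals and P values quoted for DSS and OS can be cross-checked against one another. The sketch below assumes, as is conventional for Cox models but not stated in the text, that the 95% intervals are Wald intervals on the log-hazard scale, so that the standard error can be recovered from the interval width and a two-sided P value from the resulting z statistic. The function name and the 1.96 quantile are choices made for this illustration.

```python
import math

Z_95 = 1.96  # normal quantile assumed for a two-sided 95% interval


def wald_check(hr: float, ci_low: float, ci_high: float):
    """Return (log-HR, standard error, z, two-sided P) implied by an HR and its 95% CI."""
    beta = math.log(hr)
    # On the log scale a Wald interval is beta +/- Z_95 * se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = beta / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p_two_sided


# Values as reported in the text for GALNT6-negative versus GALNT6-positive tumors.
for label, hr, lo, hi in [("DSS", 3.01, 1.16, 7.80), ("OS", 2.54, 1.24, 5.17)]:
    beta, se, z, p = wald_check(hr, lo, hi)
    print(f"{label}: beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.3f}")
```

Under this assumption the implied P values come out close to 0.02 for DSS and 0.01 for OS, which agrees with the reported P=0.024 and P=0.010 up to the rounding of the published hazard ratios and interval limits.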
As an exploratory analysis, we further addressed the prognostic effect of GALNT6 by stratifying patients according to MMR status. When dMMR tumors were analyzed for DSS, OS and DFS, we found a striking contrast in survival outcomes between positive and negative GALNT6 expression. On the other hand, the prognostic significance of GALNT6 seemed equivocal in pMMR tumors (Figure 5E and Supplementary Figure 4E, J). This may suggest that tumors exhibiting both dMMR and negative-GALNT6 could be a distinct prognostic subgroup of CRC. However, this finding must be interpreted cautiously because of the exploratory nature of this retrospective analysis, with a small number of patients and a low number of events in each subgroup.
Lack of GALNT6 expression was associated with poor therapeutic response to 5-FU-based adjuvant chemotherapy
Figure 3D-E) but also with poorer prognosis in stage II and III patients after curative resection (Figure 5B, Supplementary Figure 4B and 4G), we sought to determine if the expression of GALNT6 was associated with response to postoperative adjuvant chemotherapy. Among 190 patients with stage II and III CRC for which information on the administration of adjuvant chemotherapy was available, 114 patients received adjuvant chemotherapy after surgery, while 76 patients were treated by surgery alone. Their regimens were intravenous or oral 5-FU-based drugs. To examine the postoperative therapeutic response to 5-FU-based adjuvant chemotherapy, we conducted DFS analyses for GALNT6 expression by stratifying stage II and III patients on the basis of adjuvant treatment history (Supplementary Figure 5A-F). Among patients who received adjuvant chemotherapy, negative-GALNT6 showed a nonsignificant trend towards worse DFS (Supplementary Figure 5A). Notably, in stage III patients receiving adjuvant chemotherapy, negative-GALNT6 was significantly associated with poor therapeutic outcome (P=0.0079, Supplementary Figure 5C).
This prognostic effect was not observed in stage III patients without adjuvant chemotherapy, although the number of patients in each group was limited (Supplementary Figure 5D). When stage II patients were analyzed, there was no association between GALNT6 expression and DFS in those with or without adjuvant chemotherapy (Supplementary Figure 5E-F).
Depletion of GALNT6 increased the chemoresistance to 5-FU
In an attempt to understand the biologic function of GALNT6 in CRC, GALNT6 protein levels in a panel of CRC cell lines were initially examined by western blotting, and a pMMR cell line, SW480, with relatively high GALNT6 expression was selected for further analyses (Supplementary Figure 6). We used two different siRNAs targeting GALNT6 and confirmed by qRT-PCR and western blotting that GALNT6 was effectively silenced in SW480 cells at both the mRNA and protein levels (Figure 6A-B). Silencing of GALNT6 had no significant impact on cell proliferation or apoptosis, as determined by CCK-8 assay and flow cytometry, respectively (Figure 6C-D). On the other hand, GALNT6 depletion increased both cell migration and invasion, as determined by wound healing assay and transwell invasion assay, respectively (Figure 6E-F and Supplementary Figure 7). As we found an association between negative-GALNT6 and poor response to 5-FU-based adjuvant chemotherapy in patients with stage III CRC (Supplementary Figure 5), we tested the in vitro contribution of GALNT6 expression to 5-FU sensitivity. We examined the effect of GALNT6 knockdown on cell viability in response to 5-FU treatment by CCK-8 assay using SW480 cells. We found a moderate but significant increase in 5-FU resistance in GALNT6-knockdown cells compared with cells treated with control siRNA (Figure 6G). Collectively, decreased GALNT6 may be involved in the progression of CRC by enhancing invasion, migration and 5-FU resistance.
Decreased GALNT6 resulted in the increase of cancer-associated truncated glycan, Tn antigen
Malignant transformation is commonly associated with dysregulated glycogenes, resulting in alteration of cell surface glycosylation. Thus, we attempted to determine whether the depletion of GALNT6 could affect cell surface glycan profiles. Lectin microarray analysis was conducted to examine the lectin profiles of surface membranous fractions in SW480 cells. Compared to control siRNA, GALNT6-silenced cells demonstrated decreased signals for the lectins Jacalin and ACA, which recognize core 1 and core 3 extensions of O-glycan, respectively (Supplementary Figure 8).
Conversely, GALNT6 knockdown increased the signal for the lectin HPA, which recognizes GalNAc-Ser/Thr, also known as Tn-antigen (Figure 6H and Supplementary Figure 8). We further confirmed that GALNT6 knockdown could increase the cell surface expression of Tn-antigen by flow cytometry using a monoclonal antibody, MLS128 (Figure 6I-J).
Discussion
The present study provides several lines of evidence that the expression of GALNT6 is a robust biomarker for identifying a prognostic subgroup of CRC and is implicated in colorectal carcinogenesis. First, a glycogene-derived transcriptional subtype, namely 15-Glycogene Cluster A, was identified and validated using a total of 941 samples from multiple microarray and RNA-Seq datasets. This novel subgroup, in which GALNT6 was downregulated, was characterized by poor prognosis, mucinous-poorly differentiated histology, proximal tumor location, and dMMR.
Moreover, the strong association between decreased GALNT6 mRNA expression and dMMR was reproducibly confirmed in all 11 independent patient cohorts and in the dataset of CRC cell lines that we analyzed, and was then confirmed at the protein level in a large FFPE cohort, for a total of more than 3000 samples. Second, at both the mRNA and protein levels, downregulation of GALNT6 seemed to occur at an early stage of colorectal carcinogenesis, possibly through epigenetic silencing: GALNT6 was expressed in almost all precursor/preinvasive lesions but was subsequently decreased in a subset of carcinomas. In addition, lectin microarray analysis followed by confirmation with flow cytometry showed that GALNT6 silencing upregulated the cell surface expression of a cancer-associated truncated glycan, Tn-antigen, which has previously been implicated in colorectal carcinogenesis and poor prognosis. Third, the lack of GALNT6 protein expression discriminated a poor prognostic subgroup of CRC that was largely consistent with that of the 15-Glycogene Cluster A, reinforcing the notion that the glycogene-derived transcriptional subtype is in part recapitulated by tumors lacking GALNT6 protein. It is also likely that patients with negative-GALNT6 tumors receiving 5-FU-based adjuvant chemotherapy have poor therapeutic outcomes. This was further supported by the finding that GALNT6-depleted CRC cells demonstrated increased chemoresistance to 5-FU treatment.
The present study addressed the hypothesis that the expression of glycogenes can define clinically relevant molecular subtypes in CRC. We initially conducted an unsupervised clustering of a large microarray dataset based on
validation, the 15-glycogene signature successfully identified a glycogene-derived prognostic subgroup that was associated with poor outcomes, independent of other clinical factors in multivariate analyses. Moreover, this subgroup was also characterized by mucinous or poorly differentiated histology, dMMR, mutations in KRAS or BRAF, and wild-type P53. Subsequently, we identified a novel association between GALNT6 expression and MMR status that was robustly validated using a total of 11 independent patient cohorts as well as a dataset of CRC cell lines. This finding was further confirmed in a large, well-characterized cohort of FFPE tissues. Importantly, our strategy integrated various gene expression platforms, including Affymetrix, Agilent and Illumina microarrays and RNA-Seq data obtained from different laboratories, as well as a technologically independent approach, IHC. Likewise, GALNT6 downregulation in carcinoma tissues, compared to adenoma tissues, was clearly reproduced in 5 independent series of cancer precursor and cancer lesions at both the mRNA and protein levels. Such integrated, multistep analyses can minimize false-positive results. It is therefore unlikely that the presence of this subgroup of CRC is related to false discoveries or batch effects from high-throughput data analyses.
Using IHC analysis of 335 CRC specimens, we finally identified a distinct subgroup lacking GALNT6 protein expression in approximately 15% of tumors. Although there was no relationship between GALNT6 expression and TNM factors, GALNT6-negative CRC was significantly associated with dMMR and mucinous or poorly differentiated adenocarcinoma. Furthermore, patients with GALNT6-negative CRC had significantly worse OS and DSS, independent of other clinical factors, including stage, tumor location, histological differentiation, MMR status and receipt of adjuvant chemotherapy, by multivariate analysis. It seems that the negative-GALNT6 tumors by IHC closely recapitulated the 15-glycogene-derived poor-prognostic cluster, in which GALNT6 was robustly downregulated.
Therefore, both the 15-glycogene mRNA expression signature and GALNT6 protein expression can be robust prognostic biomarkers for CRC patients who have undergone curative surgery. The prognostic performance of GALNT6 staining was particularly remarkable in stage III patients in OS and DSS analyses when stratified by stage. Intriguingly, although loss of GALNT6 showed only a nonsignificant trend toward poorer DFS, even in patients with stage III CRC, its negative prognostic effect on DFS was particularly evident in stage III patients who received 5-FU-based adjuvant therapy, in striking contrast to those who were treated by surgery alone. This seems to imply that negative-GALNT6 may also have predictive value for poor response to 5-FU-based adjuvant chemotherapy in stage III CRC. Therefore, alternative therapeutic strategies, including combination regimens or targeted drugs, may be more effective for stage III patients with negative-GALNT6 tumors. Given the exploratory and retrospective nature of this study, the results presented here need to be validated in future investigations.
In normal tissues, GalNAc-type O-glycans are modified by core enzymes to generate core 1 and core 3 extensions, and core O-glycans are further extended and capped by the addition of different terminal structures that are usually sialylated and fucosylated (20, 51). In this process, glycosyltransferases act in a step-wise and competitive manner to synthesize various glycans with diverse biological functions. The initiation step of GalNAc-type O-glycosylation is controlled by a large number of isoenzymes, the polypeptide GalNAc-transferases (ppGalNAc-Ts).
GALNT6 is one of the ppGalNAc-Ts that catalyze the transfer of GalNAc to Ser/Thr residues on substrate proteins, initiating O-glycosylation. The ppGalNAc-Ts form a family of 20 distinct enzymes expressed in a cell-type specific manner, with different but overlapping substrate specificities; thus, O-glycans are synthesized through the concerted and occasionally competitive action of ppGalNAc-Ts (51, 52). In many types of cancer, altered expression of ppGalNAc-Ts, including GALNT6, has been investigated, suggesting ppGalNAc-Ts as potential cancer biomarkers (20). Previous studies have, in general, demonstrated that GALNT6 protein was expressed in most specimens of adenocarcinoma lesions in breast, lung, pancreas and renal cancer, whereas it was undetectable or only very weakly detected in their normal counterparts (53-56). Consistent with those studies, we demonstrated that GALNT6 was not detectable in almost all normal mucosae, but was expressed in virtually all adenomas as well as the majority of CRC samples, indicating a role of altered GALNT6 expression during carcinogenesis. In pancreatic cancer, Li et al. reported that loss of GALNT6 expression was associated with poor overall survival (54). This seems highly concordant with our finding that tumors lacking GALNT6 expression were independently associated with poor survival in patients with CRC. In both Li’s study and ours, GALNT6-negative subsets of pancreatic cancer and CRC, respectively, were significantly correlated with poor differentiation (54). Conversely, the same group has recently reported that GALNT6 expression independently predicted poor overall survival after curative resection in patients with lung adenocarcinoma (53). In breast cancer, overexpression of GALNT6 may contribute to mammary carcinogenesis through aberrant glycosylation (56, 57), and GALNT6 mRNA expression positively correlated with bone marrow disseminated breast cancer cells (58).
In gastric carcinoma, GALNT6 expression was associated with venous invasion (59). Such conflicting findings between different cancers have also been observed in several studies investigating other ppGalNAc-Ts. For instance, in advanced ovarian cancer, overexpression of GALNT3 correlated with poor survival and is likely to contribute to ovarian cancer progression through aberrant mucin O-glycosylation (60). GALNT3 expression was also reported to predict poor prognosis in renal cell carcinoma (55), whereas it was associated with better survival in lung adenocarcinoma (61), gastric cancer (62) and colorectal cancer (63). Although there is no direct explanation for the contradictory influence of ppGalNAc-Ts in different cancers, these conflicting data may reflect the complexity of O-glycosylation, along with the diversity and distinct substrate specificities of ppGalNAc-Ts, which can confer specific roles in specific cellular contexts.
Upon malignant transformation and cancer progression, epigenetic alterations are recognized as key characteristics of cancer that cause dysregulation of glycogenes, resulting in aberrant expression of cell surface glycans (18). In the present study, we found an inverse correlation between the expression and methylation of GALNT6.
Correspondingly, treatment with DNA demethylating agent reactivated the expression of GALNT6 in CRC cell lines.
with the concept of incomplete synthesis, in which the glycan elongation seen in nonmalignant cells is impaired upon malignant transformation by silencing of glycogenes, resulting in the expression of cancer-associated truncated glycans (18, 20, 48). We also observed by lectin microarray analysis that GALNT6 silencing affected cell surface glycan profiles. It is worth noting that the levels of GalNAc-Ser/Thr, recognized by the lectin HPA, were specifically increased in GALNT6-depleted cells, as confirmed by flow cytometry using a monoclonal antibody, MLS128. This truncated O-glycan structure is also known as the cancer-associated Tn antigen, which has been shown to be involved in tumor progression in many types of cancer, including CRC (64-67). Indeed, Tn antigen can be a marker of poorly differentiated and mucinous adenocarcinoma, and increased expression of Tn antigen was associated with an aggressive tumor phenotype and poor patient prognosis in CRC (20, 52). In addition to the upregulation of Tn antigen, knockdown of GALNT6 led to increased invasion and migration, which was in agreement with the known phenotype of Tn antigen overexpression.
Furthermore, GALNT6-depleted cells showed increased resistance to 5-FU treatment. Collectively, our findings suggest that in the early stages of colorectal carcinogenesis, GALNT6 expression is epigenetically regulated in a subset of CRC. This may cause incomplete O-glycan synthesis with increased expression of Tn antigen, and may contribute to cancer progression by enhancing tumor cell invasion, migration and chemoresistance.
In conclusion, we developed and validated the 15-glycogene signature, which can identify a genomically distinct subgroup with poor patient outcomes. The association between dMMR and decreased GALNT6 expression, which represents the hallmark of the glycogene-derived subtype, was reproducibly verified in 11 independent gene expression datasets and a large series of IHC specimens. Our results indicate that not only the 15-glycogene mRNA expression signature but also the lack of GALNT6 expression by IHC can be novel prognostic biomarkers for CRC. Furthermore, GALNT6 downregulation, in part due to epigenetic silencing, may contribute to incomplete O-glycan synthesis and increased expression of the cancer-associated Tn antigen, highlighting the possible role of GALNT6 in colorectal carcinogenesis. Future studies are required to validate our findings in larger patient cohorts and to elucidate the precise mechanism of GALNT6 in CRC.
Acknowledgements
This work was supported by JSPS KAKENHI Grant Numbers 15K10143 and 25870582.
Figure Legends
Figure 1: Identification of glycogene-derived subtypes of CRC. (A) Unsupervised hierarchical clustering analysis based on 185 glycogenes in 177 patients with stage I to IV CRC obtained from GSE17536, demonstrating two major clusters, Cluster A and B, and four subclusters, Cluster A1, A2, B1 and B2. Poorly differentiated tumors and patients with disease relapse are shown below the tree. (B,C,D,E) Kaplan-Meier curves depicting disease-specific survival (B,C) and disease-free survival (D,E) in the GSE17536 dataset. Patients classified as Cluster A1 had significantly worse survival in comparison to those in the remaining subclusters. (F) Clustering analysis for prognostic validation using 15 glycogenes (designated as the 15-glycogene signature) in 121 patients with stage I to III CRC with survival information obtained from GSE41258, demonstrating the 15-Glycogene Cluster A and B. Clinical and genetic features, including deficient mismatch-repair (dMMR), TP53 mutation, tumor location and disease recurrence, are indicated. (G,H) Kaplan-Meier curves depicting disease-free survival in the GSE41258 dataset. Patients segregating into the 15-Glycogene Cluster A were significantly associated with poor survival. (I) Clustering analysis for prognostic validation based on the 15-glycogene signature in 89 patients with stage II CRC from GSE33113. (J) Patients segregating into the 15-Glycogene Cluster A had significantly poorer disease-free survival in the GSE33113 dataset. (K,L) Clustering analysis using the 15-glycogene signature in two independent RNA sequencing (RNA-Seq) datasets obtained from TCGA (COADREAD), consisting of RNA-Seq RPKM (n=193) and RNA-Seq V2 RSEM (n=361). Clinical and genetic features, including dMMR, mutations in RAS, BRAF and TP53, tumor location and mucinous histology, are indicated.
Figure 2: The association between the expression of 3 glycogenes, GCNT3, FUT8 and GALNT6, and known molecular markers in CRC, including dMMR and mutations in BRAF, RAS and TP53. Eleven independent cohorts, comprising 2261 patients with CRC, and a dataset of 151 CRC cell lines were used. Red and blue in the heatmap represent glycogenes with statistically significant upregulation and downregulation, respectively, in tumors harboring dMMR, mutated BRAF, mutated RAS, or mutated TP53. The expression of GALNT6 was statistically significantly downregulated in dMMR tumors in all 12 datasets of CRC patients and cell lines.
Figure 3: Alteration of GALNT6 expression in colorectal carcinogenesis. (A,B,C,D) GALNT6 mRNA expression levels in normal colon, colon adenoma and carcinoma samples in GSE4183 (A), GSE77953 (B), GSE37364 (C) and GSE41258 (D). Compared to adenoma samples, GALNT6 mRNA was significantly downregulated in carcinoma samples, particularly in those with dMMR. (E) GALNT6 protein expression by immunohistochemistry in normal colon mucosa, colon adenoma, carcinoma in situ, and invasive carcinoma with MMR status. Loss of GALNT6 protein was found in carcinoma specimens, particularly in those with dMMR. *P < 0.05, ***P < 0.001. (F) Inverse correlation between GALNT6 mRNA expression and GALNT6 DNA methylation in 357 patients with CRC obtained from TCGA.
(G,H) qRT-PCR analysis of MLH1 (G) or GALNT6 (H) expression in CRC cell lines treated with a DNA demethylating agent, 5-aza-dC.