Abstract
The majority of gallbladder cancer cases are discovered at later stages, which frequently leads to poor prognoses. Small-cell gallbladder neuroendocrine carcinoma (GB-SCNEC) is a relatively rare histological type of gallbladder cancer, and its survival rate is exceptionally low because of its greater malignant potential. In addition, the genomic landscape of GB-SCNEC is rarely considered in treatment decisions. We performed whole-genome sequencing on an advanced case of GB-SCNEC. By analyzing the whole-genome sequencing data of the primary cancer tissue (76.29X coverage), lymphatic metastatic cancer tissue (73.92X coverage) and matched non-cancerous tissue (35.73X coverage), we identified approximately 900 high-quality somatic single nucleotide variants (SNVs), 109 of which were shared by both the primary and metastatic tumor tissues. Somatic non-synonymous coding variations with damaging impact in HMCN1 and CDH10 were observed in both the primary and metastatic tissue specimens. A pathway analysis of the genes mapped to the SNVs revealed gene enrichment associated with axon guidance, ERBB signaling, sulfur metabolism and calcium signaling. Furthermore, we identified 20 chromosomal rearrangements that included 11 deletions, 4 tandem duplications and 5 inversions that mapped to known genes. Two gene fusions, NCAM2-SGCZ and BTG3-CCDC40 were also discovered and validated by Sanger sequencing. Additionally, we identified genome-wide copy number variations and microsatellite instability. In this study, we identified novel biological markers of GB-SCNEC that may serve as valuable prognostic factors or indicators of treatment response in patients with GB-SCNEC with lymphatic metastasis.
Introduction
Gallbladder cancer is the most common aggressive tumor of the biliary tract and accounts for 80-95% of these types of cancers worldwide(Lai and Lau 2008). Because symptoms are frequently indiscernible in the early stages, patients are often diagnosed at more advanced stages of the disease. As a result, the median 5-year survival rate associated with gallbladder cancer remains low(Zhu et al. 2010). Adenocarcinomas account for most gallbladder malignancies (90%), which include the less common squamous, adenosquamous, and neuroendocrine carcinoma (NEC) histologic subtypes. NEC commonly arises in the gastrointestinal tract and the bronchi but is rarely observed in the gallbladder(Yao et al. 2008). Small-cell neuroendocrine carcinoma (GB-SCNEC) of the gallbladder is even less common and has an incidence of approximately 0.5% of all gallbladder cancers(Henson et al. 1992) and 0.2% of all gastrointestinal carcinomas according to Surveillance, Epidemiology and End Results (SEER) data(Jun et al. 2006). The symptomatology of GB-SCNEC is nonspecific, and the diagnosis of GB-SCNEC is often based on the detection of cholecystolithiasis or polyps by cholecystectomy. Women and patients with cholelithiasis are considered high-risk populations for GB-SCNEC, and approximately 75% of patients from these populations present with concomitant hepatic, pulmonary, peritoneal and lymph node metastases. Because of the rarity of this disease, standards of care for treatment have not been clearly defined, and effective treatment options are currently limited to complete surgical resection with negative margins(El Fattach et al. 2015). Although GB-SCNEC cases represent a relatively small proportion of the total gallbladder cancer cases, the survival rate for neuroendocrine carcinomas is much less that the overall rate for gallbladder cancer because of its malignant potential and late-stage diagnosis. Thus, novel insights into the molecular and genetic aberrations specific to GB-SCNEC might facilitate the development of more efficient methods for the prevention, early diagnosis and curative treatment of this disease.
Gene mutations associated with gallbladder cancers are typically somatic in nature rather than inherited (2015). Mutations and overexpression of the p53 gene are the most common genetic changes observed in human cancers(Itoi et al. 1997), and p53 mutations are observed in up to 70% of the gallbladder cancer cases evaluated in various studies(Saetta 2006). In a previous study, we observed a significant frequency of non-silent p53 mutations in gallbladder cancer (47.1%)(Li et al. 2014). K-ras is another well-studied gene implicated in gallbladder cancer. Recent investigations have shown that K-ras mutation rates in various cancers range from 10 to 67%(Saetta 2006). We previously reported that a substantial proportion of K-ras mutations in gallbladder cancer are non-silent (7.8%), which is similar to p53(Li et al. 2014). The ERBB signaling pathway has the most well-established association with gallbladder carcinogenesis. Mutations in the components of the ERBB signaling pathway, including EGFR, ERBB2, ERBB3 and ERBB4, are the most frequently observed mutations in gallbladder cancer and affect approximately 37% of all gallbladder tumors(Li et al. 2014). In a recent study, Miyahara N demonstrated that Muc4 activation of ERBB2 signaling contributed to gallbladder carcinogenesis(Miyahara et al. 2014). Microsatellite instability (MSI) is another notable genetic factor associated with gallbladder cancer, and the frequency of MSI ranges from 3% to 35%(Saetta 2006).
Whole-genome sequencing (WGS) is considered a more powerful method for identifying putative disease-causing mutations in exonic regions compared with whole-exome sequencing (WES)(Belkadi et al. 2015); however, WGS studies in gallbladder cancer and metastatic gallbladder cancer have not been reported. The present study is the first to report the application of WGS technology to study genome-wide single nucleotide variants (SNVs), structural variations, copy number variations (CNVs) and microsatellite stability in a patient with GB-SCNEC. We also evaluated genomic differences between primary and metastatic GB-SCNEC tissues. Together, our findings might provide insights into genome-wide aberrations that mediate gallbladder carcinogenesis and metastasis.
Materials and Methods
1. Clinicopathological characteristics of the patient
A 56-year-old man was hospitalized with a 1-month history of pain in the right hypochodrium. The physical examination revealed no abnormalities. On admission, tumor marker levels were within the normal range. An abdomen CT with and without contrast revealed markedly abnormal gallbladder morphology with an enlargement at the neck region measuring 4.7 × 4.5 cm. The porta hepatis lymph nodes were abnormally enlarged, and stones were not visible in the gallbladder mass (Figure 1A, 1B). No neoadjuvant or adjuvant chemotherapy was administered prior to surgery. The surgical procedure (which included cholecystectomy as well as extensive surgical resections, including segment 4 of the liver and regional lymph node clearances) was performed in July 2014. After the surgical resections, histopathological analyses revealed that the tumor was composed of monomorphic cells containing small round nuclei and eosinophilic cytoplasm. The cells were organized in small nodular, trabecular or acinar structures surrounded by highly vascularized stroma with a high mitotic index (Figure 1C, 1D). Immunohistochemistry assays revealed that the cells in both the tumor and the resected lymph nodes were negative for chromogranin A but positive for synaptophysin, NSE and Ki 67 (80% of cells) (Figure 1E-1L). These findings were consistent with a diagnosis of GB-SCNEC with lymph node involvement. All of the surgical margins were negative, and a pathological R0 resection was achieved. The patient later received systemic chemotherapy with VP-16 and cisplatin and died 17 months after the operation.
2. Whole-genome sequencing
Genomic DNA was extracted from the tissue specimens using the QIAamp DNA kit (Qiagen). The libraries were prepared using the protocols provided by Illumina. Briefly, 1 μg of DNA was sheared into small fragments (approximately 200-300 bp) using a Covaris S220 system. DNA fragments were end repaired and 3’ polyadenylated. Adaptors with barcode sequences were ligated to both ends of the DNA fragments. E-Gel was used to select the target-size DNA fragments. Then, the tissues underwent 10 cycles of PCR amplification, and the resulting PCR products were purified. Validated DNA libraries were sequenced using the Illumina Sequencing System (HiSeq Series).
3. Whole-genome sequencing data processing and mutation identification
Read pairs (FASTQ data) generated from the sequencing system were trimmed and filtered using Trimmomatic(Bolger et al. 2014). The resulting reads were mapped to the human reference genome hg19 downloaded from the UCSC Genome Browser using BWA-MEM (http://arxiv.org/abs/1303.3997)) in the SpeedSeq framework(Chiang et al. 2015). Duplicate marking, position sorting and BAM file indexing were also conducted during the SpeedSeq align procedure using SAMBLASTER(Faust and Hall 2014) and Sambamba (https://github.com/lomereiter/sambamba). The sorted and indexed BAM files were directly loaded into the SpeedSeq somatic module. SpeedSeq analysis using FreeBayes (http://arxiv.org/abs/1207.3907) was performed on the tumor/normal pair of BAM files to achieve somatic SNVs and small INDELs. For the indexed BAM files, we also conducted INDEL realignment and base quality score recalibration using the Genome Analysis Toolkit (GATK)(McKenna et al. 2010), and then the MuTect algorithm(Cibulskis et al. 2013) was applied to identify somatic SNVs. BEDtools was used to filter the somatic SNVs identified by both SpeedSeq and Mutect. Finally, we used the Ensembl Variant Effect Predictor (VEP)(McLaren et al. 2010) to annotate the merged vcf file.
4. Pathway analysis
We performed a pathway analysis of the genes that mapped to the somatic SNVs in the tumor samples using DAVID (https://david.ncifcrf.gov/). Genes listed on the DAVID web site and all known human genes were used as the background. A KEGG database analysis was used to evaluate the pathway enrichment.
5. Structural variation analysis
LUMPY was run in the SpeedSeq sv module to identify somatic structural variations(Layer et al. 2014), and the vcf files generated by SpeedSeq were annotated using the VEP algorithm. We predicted the somatic structural variations in the primary and lymphatic metastatic GB-SCNECs using DELLY. We identified large somatic deletions, duplications and inversions detected by both LUMPY and DELLY. With respect to the genomic translocational structural variants, we focused on putative gene fusions. Somatic genomic translocational events were generated and filtered by DELLY. OncoFuse(Shugay et al. 2013) was used to identify the putative activating gene fusions.
6. Copy number variation analysis
Saas-CNV(Zhang and Hao 2015) was used to study somatic copy number variations in the primary and metastatic GB-SCNEC tissues. The GATK variant analysis pipeline was used to process the data generated from the sequencing system as described above. High-quality SNVs and small INDELs called by GATK were saved in variant call format (VCF) files, and the files were loaded into the Saas-CNV algorithm. The read depth and alternative allele frequency at each site were determined and used to identify somatic copy number alterations (SCNAs).
7. Microsatellite stability analysis
We used the vcf2MSAT (http://sourceforge.net/projects/vcf2msat/) script to retrieve the short tandem repeat (STR) data from the vcf files generated from the SpeedSeq somatic module. Information regarding the microsatellite repeat regions identified as INDELs was saved in the vcf files. The data obtained from identifying the genomic location and somatic changes in repeat number were used for the follow-up analyses.
Results
1. Whole-genome sequencing (WGS) and identification of SNVs
We performed WGS of the primary cancer, lymphatic metastatic cancer and matched non-cancerous tissue specimens. The sequencing data were mapped to the human reference genome hg19 with a mean sequencing depth of 61.98X (range 35.73-76.29X) and a mean coverage of 98.7%. Additional information regarding the WGS data is summarized in Supplementary Table 1.
Using the SpeedSeq framework and GATK-MUTECT with the non-cancerous gallbladder tissue as the normal reference, we identified 295 and 712 high-quality somatic SNVs in the primary and lymphatic metastatic tumor tissues, respectively (Figure 2, 3A, 3B). One hundred nine somatic SNVs were shared by the primary and metastatic tumors. Additional information regarding the 109 somatic SNVs is listed in Supplementary Table 2. The majority of the identified SNVs were located in intronic and intergenic regions (Figure 2). A pathway analysis of the genes mapped to the somatic SNVs revealed a significant enrichment in genes associated with axon guidance, ERBB signaling, sulfur metabolism and calcium signaling (Figure 4A). Notably, we identified somatic SNVs in 7 genes (PAK7, HRAS, ERBB4, SOS1, MTOR, NRG1 and ABL1) in the ERBB signaling pathway, which is consistent with the results of our previous study(Li et al. 2014).
Next, we investigated the impact of the genetic variations on amino acid changes. Somatic non-synonymous coding variations in HMCN1 and CDH10 were observed in the both the primary and metastatic tissue specimens (Table 1). In addition, somatic non-synonymous coding variations in MAST2, C6orf118, UBR5, OR51F1, SLC22A8, CSPG4 and BAIAP2L2 were observed in the metastatic GB-SCNEC tissue. The SNVs associated with amino acid variations were further validated using conventional Sanger sequencing.
We further evaluated the impact of the SNVs in HMCN1 and CDH10 on the structure and function of proteins using Polyphen2. We found that the amino acid changes in the proteins encoded by the HMCN1 (E171K) and CDH10 (R4623W) variants were predicted to disrupt the structure and function of proteins (score=0.996 and 0.880, respectively) (Figure 4B). The two variants were also clearly validated by Sanger sequencing (Figure 4C).
2. Analysis of structural variations
Using LUMPY and Delly, we identified more than 50 somatic genomic rearrangements in the primary and lymphatic metastatic GB-SCNECs tissue samples, which included 11 deletions, 4 tandem duplications and 5 inversions mapped to known genes. Additional information regarding the genes that mapped to structural variations is summarized in Table 2, and the genomic locations of the structural variations are indicated in Figure 3A and Figure 3B. Notably, 2 tandem duplications that encompassed the CLCNKA, CLCNKB, FAM131C, C1DP2 and C1DP3 genes were observed in both the primary and lymphatic metastatic GB-SCNEC samples.
We also considered the tanslocational events generated by DELLY. The high-quality somatic translocational structure variants were filtered. In primary and lymphatic metastatic tumor tissues, we identified 35 and 41 translocations respectively, 10 of which were shared by the primary and metastatic tumors. The detailed information regarding the translocations is summarized in Supplementary Table 3. A total of 6 gene fusions were identified, 2 of which, including NCAM2-SGCZ and BTG3-CCDC40 were validated by conventional Sanger sequencing. The genomic locations of all the translocations were labeled in Figure 5.
3. Analysis of copy number variants
We identified putative genome-wide copy number variants according to the total read depth and alternative allele proportion at each SNP/INDEL locus. First, we plotted the genome-wide distribution of the alternative allele frequency (also referred to as the B allele frequency or BAF) of the primary and metastatic tumor tissues (Figure 6). We found that the general distribution of the BAF was conserved between the 2 primary and metastatic tumor tissues. The SCNAs were estimated by combining the total read depth and the BAF at each locus. A total of 668 SCNAs in the primary tumor and 553 SCNAs in the metastatic tumor were identified. Next, we focused on the SCNAs that mapped to known genes and were observed in both of the primary and metastatic tumors tissues (Table 3).
4. Microsatellite instability analysis
We extracted the genomic locations of the genomic repeats and the changes in the somatic repeat count identified using the SpeedSeq framework. Additional information regarding the repeat regions identified in the primary and lymphatic metastatic GB-SCNEC tissues is provided in Supplementary Table 4 and Table 5 respectively. Repeat motif counts ≥ 4 and somatic count changes > 5 were filtered (Table 4). The genomic locations of the repeat motifs with changes in repeat number are also indicated in Figure 3A and Figure 3B. Genome-wide microsatellite instability was observed in both the primary and metastatic GB-SCNECs tissue, which is consistent with observations from many previous studies.
Discussion
GB-SCNEC is a relatively rare and aggressive tumor that is associated with a poor prognosis, and early diagnosis with prompt surgical intervention is currently the only treatment that provides favorable long-term outcomes. Therefore, novel insights into the molecular mechanisms underlying GB-SCNEC are required for the development of effective treatment strategies. In the present study, we performed whole-genome sequencing of primary gallbladder cancer, lymphatic metastatic cancer and matched non-cancerous tissues and identified 295 and 712 somatic SNVs in the primary and metastatic tumor tissues, respectively. The genetic aberrations were mapped to genes associated with axon guidance, ERBB signaling, sulfur metabolism and calcium signaling. Among these genes, HMCN1 and CDH10 were determined to be novel suppressors of gallbladder cancer metastasis. In addition, we identified 11 deletions, 4 tandem duplications and 5 inversions that mapped to known genes. Copy number variants were also analyzed, and genome-wide microsatellite instability was observed in both the primary and metastatic tumor specimens.
We previously reported that the ERBB signaling pathway is associated with the development of gallbladder cancer. Using whole-exome and targeted sequencing technology, we identified 12 genes that are involved in the ERBB signaling pathway and associated with somatic mutations in gallbladder cancer, and they included ERBB4, HRAS and NRG1(Li et al. 2014). In the present study, we identified somatic mutations in ERBB4, HRAS and NRG1 in gallbladder cancer tissue. Together, these findings further validate the critical role of the ERBB signaling pathway in gallbladder carcinogenesis.
CDH10 encodes a type II classical cadherin that mediates calcium-dependent cell-cell adhesion. In this study, we identified a somatic mutation in CDH10 with a deleterious effect on the structure and function of certain proteins. The mutation was observed in both the primary and metastatic tumor tissue, indicating that it is associated with tumor metastasis. A relationship between CDH10 and tumor metastasis has also been observed in colorectal cancer. Jun Yu et al. found that mutations in CDH10 as well as 4 other genes (COL6A3, SMAD4, TMEM132D, VCAN) were significantly associated with improved overall survival in colorectal cancer independent of tumor node metastasis (TNM) stage(Yu et al. 2015). Cadherin-10, the protein encoded by the CDH10 gene, can directly interact with β-catenin, a protein that mediates cell-cell interactions(Shimoyama et al. 2000). Notably, cadherins are calcium-dependent cell adhesion proteins, and we identified mutations in 9 genes involved in calcium signaling.
We also identified mutations in 10 genes that play a role in axon guidance signaling; therefore, this is the first study to report a link between axon guidance signaling and gallbladder neuroendocrine carcinoma. A relationship between axon guidance signaling and neuroendocrine carcinoma has also been observed in small intestine neuroendocrine tumors (SI-NETs) as reported by Banck MS et al.(Banck et al. 2013). Using exome sequencing, these authors found that mutations in axon guidance were frequently disrupted in SI-NET. In summary, disruptions in axon guidance signaling might play crucial roles in neuroendocrine carcinogenesis.
Structural variations that mapped to known genes were also observed in gallbladder neuroendocrine carcinoma tissues. This study is also the first to report the identification of 11 deletions, 4 tandem duplications and 5 inversions in gallbladder cancer. Notably, a 23,182 bp tandem duplication in chromosome 1 and a 15,645 bp tandem duplication in chromosome 10 were observed in both the primary and metastatic tumor tissues. These duplications might disrupt the expression of genes mapped to these regions (CLCNKA, CLCNKB, FAM131C, C1DP2 and C1DP3), and these potential disruptions in gene expression might play a role in tumor malignancy. NCAM2-SGCZ and BTG3-CCDC40 fusions were among the translocational structure variations. Notably, the functional domains in the 4 genes were partially retained in the fusions, which indicated that novel fusion transcripts were involved in the development and metastasis of gallbladder cancer.
This study is also the first to report the identification of copy number variations in primary and metastatic GB-SCNEC tissues. In addition, microsatellite instability has long been considered a contributor to gallbladder carcinogenesis. The genome-wide microsatellite instability observed in the primary and metastatic gallbladder tumor tissues in this study is consistent with the results from many previous reports(Yoshida et al. 2000; Sessa et al. 2003). In summary, this is first study to report the whole-genome sequencing of matched primary and metastatic gallbladder tumor tissues. Our findings have the potential for use in the identification of novel approaches for predicting and treating lymphatic metastasis in gallbladder cancer.
Acknowledgments
This study was supported by the National Natural Science Foundation of China (No. 81172026, 81272402, 81301816, 81172029, 91440203, 81402403, 81502433 and 31501127), the National High Technology Research and Development Program (863 Program) (No. 2012AA022606), the Foundation for Interdisciplinary Research of Shanghai Jiao Tong University (No. YG2011ZD07), the Shanghai Science and Technology Commission Intergovernmental International Cooperation Project (No. 12410705900), the Shanghai Science and Technology Commission Medical Guiding Project (No. 12401905800), the Program for Changjiang Scholars, the Natural Science Research Foundation of Shanghai Jiao Tong University School of Medicine (No. 13XJ10037), the Leading Talent program of Shanghai and Specialized Research Foundation for the PhD Program of Higher Education-Priority Development Field (No. 20130073130014), the Interdisciplinary Program of Shanghai Jiao Tong University (No.14JCRY05), the Shanghai Rising-Star Program (No. 15QA1403100), Top 6 elitist summit of jiangsu Province (No. 2013-WSN-025), and the People’s Livelihood Science and Technology Demonstration Project of Yixing City (No. 2014-80).