Gene expression profiling utilizing extremely sensitive CDNA arrays and enrichment-based network study of major bone cancer genes
Qiang Lin1, Anum Munir2, Sana Masood2, Shahid Hussain2, Mashal Naeem3, Sahar Fazal2
1 The First Department of Orthopedic Injury, Baoji Hospital of Traditional Chinese Medicine, Jintai District, Baoji City, Shanxi Province, China
2 Department of Bioinformatics and Biosciences, Faculty of Health and Life Sciences, Capital University of Science and Technology, Islamabad, Pakistan
3 Department of Bioscience, Comsats Institute of Information Technology, Islamabad, Pakistan
Date of Submission: 31-May-2020
Date of Decision: 30-Nov-2020
Date of Acceptance: 15-Feb-2021
Date of Web Publication: 31-Jul-2021
Dr. Anum Munir
Department of Bioinformatics and Biosciences, Faculty of Health and Life Sciences, Capital University of Science and Technology, Islamabad
Source of Support: None, Conflict of Interest: None
Background: A gene interaction network is a set of genes connected by functional interactions. Such networks are studied to determine pathways and regulatory mechanisms in model organisms. In this research, an enrichment study of bone cancer-causing genes was undertaken to identify hub genes associated with the development of bone cancer. Materials and Methods: Data on bone cancer were obtained from mutated gene samples, and highly mutated genes were selected for enrichment analysis. Because the hub genes interact with each other, an interaction network model was developed for them and simulations were produced to determine their levels of expression. For the array analyses, a total of 100 tumor specimens were collected. Cell cultures were prepared, RNA was extracted, cDNA array probes were generated, and the expression of the hub genes was determined. Results: In the cDNA array findings, only 7 genes (CDKN2A, AKT1, NRAS, PIK3CA, RB1, BRAF, and TP53) were differentially expressed and significant in the development of bone tumors. Approximately 15 pathways containing all 7 identified genes were found, including pathways for non-small cell lung cancer, prostate cancer, pancreatic cancer, chronic myeloid leukemia, and glioma. After clinical validation of tumor samples, the IDH1 and TP53 genes revealed a significant number of mutations, similar to other genes. Specimen analysis showed that RB1, P53, and NRAS are amplified in brain tumors, while BRAF, CDKN2A, and AKT1 are amplified in sarcoma. The maximum number of deletion mutations of the PIK3CA gene was observed in leukemia. CDKN2A gene amplifications were observed in virtually all tumor specimens. Conclusion: This study identifies novel overlapping pathway mechanisms strongly linked to cancer.
Keywords: Bone cancer, enrichment, gene ontology, network, phylogenetic analysis
How to cite this article:
Lin Q, Munir A, Masood S, Hussain S, Naeem M, Fazal S. Gene expression profiling utilizing extremely sensitive CDNA arrays and enrichment-based network study of major bone cancer genes. J Res Med Sci 2021;26:49
How to cite this URL:
Lin Q, Munir A, Masood S, Hussain S, Naeem M, Fazal S. Gene expression profiling utilizing extremely sensitive CDNA arrays and enrichment-based network study of major bone cancer genes. J Res Med Sci [serial online] 2021 [cited 2021 Dec 3];26:49. Available from: https://www.jmsjournal.net/text.asp?2021/26/1/49/322870
Introduction
A bone tumor is a neoplastic growth of bone tissue. Abnormal growths of the bone can be either benign or malignant. In the US, the 5-year survival after diagnosis of bone and joint malignancy is 67%. Bone cancer is a significant cancer relative to several other forms of cancer. Primary bone tumors can be classified into two groups: benign tumors and malignant tumors (cancers). Benign bone tumors include osteoma, enchondroma, osteochondroma, osteoid osteoma, osteoblastoma, giant cell tumor, aneurysmal bone cyst, and fibrous dysplasia of bone. The principal malignant bone tumors are (a) osteosarcoma (36%), which arises in the leg bones of adults and children, is more frequent among girls under 15 and boys over 15 years of age, and is more common among nonwhites than whites; (b) chondrosarcoma (30%), a small tumor that usually affects individuals over 40 years of age and frequently grows in the pelvic bones; and (c) Ewing's sarcoma (16%), a rapidly developing tumor that primarily affects children and adolescents, develops in large bones such as those in the upper arm, thigh, pelvis, or shin, affects males about twice as often as females, and is nearly 9-fold more frequent among whites than nonwhites.
Bone tumors can be divided into primary tumors, which form in bone or bone-derived cells and tissues, and metastatic tumors, which begin at other locations and spread to several areas of the skeleton. The most frequent symptom of bone tumors is pain, which increases gradually over time and usually worsens as the tumor grows. Additional symptoms may include weakness, fever, weight loss, iron deficiency, and unexplained bone fractures. Many patients experience no symptoms apart from a painless mass. Any bone tumor can weaken the structure of the bone, causing pathologic fractures. Exploring the association between multiple genes and related phenotypes is an important approach in both molecular and cancer biology. In osteosarcoma, the major mutations are observed in the GRM4, CDKN2A/B, P53, RB, RECQL2/3/4, E2F, MDM2, WWOX, FGFR2, MAPK, and VEGF genes. Similarly, in chondrosarcoma, the commonly mutated genes are Cyclooxygenase-2, PTHLH, bcl2, P53, MDM2, CDKN2, and INK4A. The common genes involved in Ewing's sarcoma include C-MYC, FLI1, ERG, ETV1, FEV, STAG2, P53, CDKN2A, and TERT.
A gene interaction network is a set of genes (nodes) interconnected by edges that represent functional interactions between these genes. The edges are known as interactions: two given genes either interact physically through their gene products (proteins), or one of the genes modifies or interrupts the activity of the other. In addition to these physical interactions, there are genetic interactions, in which two gene variants have a combined effect that is not manifested by either of them alone. These types of interactions are essential for understanding pathways and regulation in model organisms, as well as for insight into complex diseases. Gene set enrichment analysis (GSEA), also known as functional enrichment analysis, is a method for recognizing classes of genes or proteins that are over-represented in a large set of genes or proteins and may be correlated with disease phenotypes. The approach uses statistical and computational techniques to identify significantly enriched or depleted groups of genes. Microarray and proteomics experiments usually yield thousands of genes for analysis.
In bioinformatics research, pathway analysis software is used to identify linked proteins within a pathway or to construct a de novo pathway from the proteins of interest. Pathway analysis helps to explain omics data in the context of pathway diagrams. It enables the detection of cell processes, diseases, or signaling pathways that are statistically related to genes differentially expressed between two samples. The term pathway analysis is also used interchangeably with network analysis, functional enrichment analysis, and GSEA.
This study focuses on a systematic examination of the functional enrichment of genes to establish their involvement in important pathways contributing to the development of several cancers. To this end, the key genes involved in bone cancer were examined for their enrichment in multiple pathways, and highly sensitive cDNA experiments were performed to determine their levels of expression in other cancers as well.
Materials and Methods
Data on mutated bone cancer gene samples were collected from the cancer browser of the Catalogue of Somatic Mutations in Cancer (COSMIC) database, an online database of somatic mutations found in human cancers that incorporates knowledge from the scientific literature and from clinical trial data of the Cancer Genome Project at the Sanger Institute. COSMIC includes 4800 somatic mutations in a variety of cancers.
Highly mutated genes and network-based enrichment analysis
Genes with the maximum number of mutations in the samples were selected, and their network-based enrichment analysis was performed with the EnrichNet server, a web-based tool for the investigation of gene and protein lists that uses evidence from molecular networks and offers a graph-based representation of the findings. EnrichNet exploits the structure of the molecular network connecting two gene/protein sets, which gives a more intuitive understanding of network substructures. It thus allows a direct molecular interpretation of how a user-defined set of genes/proteins is related to a gene/protein set of known function. The probability of genes forming a network was estimated using the Fisher's exact test formula shown in equation 1.
$$p = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!} \tag{1}$$
In this formula, a, b, c, and d are the frequencies of gene occurrence in each pathway (the four cells of a 2 × 2 contingency table), and n = a + b + c + d is the cumulative frequency of gene appearance in all the pathways. Genes with significant enrichment scores were identified, and ontologies were determined for these genes.
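As an illustration of equation 1, the following minimal Python sketch evaluates the probability of a single 2 × 2 table and a one-sided p-value for the overlap between a query gene set and a pathway. The function names and the interpretation of the cells (a = query genes in the pathway, b = query genes outside it, c = other pathway genes, d = all remaining genes) are assumptions made for this example and are not taken from the EnrichNet implementation.

```python
from math import comb, factorial


def table_probability(a, b, c, d):
    """Probability of one 2x2 contingency table, following equation 1."""
    n = a + b + c + d
    numerator = (factorial(a + b) * factorial(c + d)
                 * factorial(a + c) * factorial(b + d))
    denominator = (factorial(a) * factorial(b) * factorial(c)
                   * factorial(d) * factorial(n))
    return numerator / denominator


def fisher_one_sided(a, b, c, d):
    """One-sided p-value: probability of an overlap of at least `a`
    genes when the row and column totals are held fixed."""
    row_total = a + b          # size of the query gene set (assumed meaning)
    col_total = a + c          # size of the pathway gene set (assumed meaning)
    n = a + b + c + d
    largest_overlap = min(row_total, col_total)
    favourable = sum(
        comb(col_total, k) * comb(n - col_total, row_total - k)
        for k in range(a, largest_overlap + 1)
    )
    return favourable / comb(n, row_total)


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the functions.
    print(table_probability(3, 1, 1, 3))
    print(fisher_one_sided(3, 1, 1, 3))
```

The one-sided sum is shown because enrichment analysis usually asks whether the overlap is larger than expected by chance; whether EnrichNet applies a one-sided or two-sided test is not stated in the text.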
Identification of significant pathways and network development
The genes having significant enrichment scores were identified and their associated pathway mechanisms were determined. A gene interaction network was developed with the GeneMANIA tool for the genes that showed maximum overlap (involvement) between the pathway genes and the submitted gene data sets. GeneMANIA (http://www.genemania.org) is a flexible, easy-to-use web interface for generating hypotheses about gene function, analyzing gene lists, and prioritizing genes for functional assays. Given a query list, GeneMANIA generates an interaction network using available genomics and proteomics data. The hub nodes of the network were identified through modularity analysis calculated by equation 2.
Here, e is the number of edges in the network, s is the number of strongly connected edges, and d is the degree of a node. Genes showing a large number of interactions were considered hub genes.
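The idea that genes with many interactions are treated as hub genes can be sketched with a simple degree count. This is only a minimal illustration: it does not reproduce the modularity calculation of equation 2, and the edge list, the function names, and the degree threshold are hypothetical.

```python
from collections import defaultdict

# Hypothetical edge list for illustration only; not the network from the study.
EXAMPLE_EDGES = [
    ("TP53", "CDKN2A"), ("TP53", "RB1"), ("TP53", "AKT1"),
    ("NRAS", "BRAF"), ("NRAS", "PIK3CA"), ("PIK3CA", "AKT1"),
    ("CDKN2A", "RB1"), ("TP53", "NRAS"),
]


def node_degrees(edge_list):
    """Count how many edges touch each gene in an undirected network."""
    degree = defaultdict(int)
    for gene_a, gene_b in edge_list:
        degree[gene_a] += 1
        degree[gene_b] += 1
    return dict(degree)


def hub_genes(edge_list, min_degree):
    """Return genes whose degree is at least `min_degree`, sorted by name."""
    degrees = node_degrees(edge_list)
    return sorted(gene for gene, d in degrees.items() if d >= min_degree)


if __name__ == "__main__":
    print(node_degrees(EXAMPLE_EDGES))
    print(hub_genes(EXAMPLE_EDGES, min_degree=3))
```

A degree threshold is the simplest possible criterion; in practice the cut-off would depend on the size and density of the network.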
Functional based enrichment analysis and drugs identification
Gene enrichment analysis and candidate gene prioritization were also performed for the hub genes, based on functional annotations and protein interaction networks, using the ToppGene and ToppFun tools. ToppFun reports functional enrichment of a gene list based on transcriptome, proteome, regulome, ontologies, phenotype, pharmacome, literature co-citation, and other features, whereas ToppGene ranks candidate genes based on functional similarity to a training gene list. The overlapping genes were identified by the formula shown in equation 4.
In this formula, On represents the number of overlapped genes and tn is the total number of genes. The overlapped genes are those that show some degree of similarity. The drugs used to combat the diseases caused by the selected mutated genes were also identified through the ToppFun tool.
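The overlap quantities described above can be sketched as a set intersection. This example assumes that the overlap is expressed as the ratio On / tn and that tn counts the genes in the query list; both points are assumptions, since the formula itself is not reproduced in the text. The function name and the pathway gene list are hypothetical.

```python
def overlap_fraction(query_genes, pathway_genes):
    """Return the overlapped genes (sorted) and the ratio On / tn,
    where tn is taken to be the size of the query list (an assumption)."""
    query = set(query_genes)
    if not query:
        raise ValueError("query gene list must not be empty")
    overlapped = query & set(pathway_genes)
    return sorted(overlapped), len(overlapped) / len(query)


if __name__ == "__main__":
    hub = ["CDKN2A", "AKT1", "NRAS", "PIK3CA", "RB1", "BRAF", "TP53"]
    # Hypothetical pathway membership, for illustration only.
    pathway = ["TP53", "RB1", "CDKN2A", "EGFR", "KRAS"]
    genes, fraction = overlap_fraction(hub, pathway)
    print(genes, round(fraction, 3))
```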
The expression analysis of significant genes and their survival rate
Expression analysis and patient survival analysis were conducted for the significant genes that overlapped the maximum number of pathways, using the Gene Expression Profiling Interactive Analysis (GEPIA) server, a newly built interactive web server for analyzing gene expression data of 9,736 tumors. GEPIA supports differential expression analysis of tumor genes, profiling according to cancer type, patient survival analysis according to gene expression, similar gene detection, correlation analysis, and dimensionality reduction analysis.
cDNA expression assay profiling and verification of abnormal expression of hub genes
Blood samples of cancer patients were collected with informed consent. Ethical committee approval was obtained from the Baoji Hospital of Traditional Chinese Medicine and the Capital University of Science and Technology, Islamabad. A total of 100 fresh-frozen cancer specimens were collected from surplus material used for biopsy purposes. Cell cultures were prepared, and the samples were placed in 165 IU collagenase type 2 (GIBCO, Invitrogen Corporation) to disaggregate the cells. The cultures were grown in DMEM with 10% FCS. The cDNA array studies were performed on Human GeneFilters GF211 arrays. Nylon microarrays with 7196 known human cDNA probes chosen from UniGene were used. Tumor tissues were dissected into pieces of roughly 0.5 cm3. Each piece was placed in a sterile cryotube and kept in liquid nitrogen until ready for the extraction of RNA. RNA was extracted from frozen tumor tissues using a buffer solution containing guanidine thiocyanate and β-mercaptoethanol, as instructed by the manufacturers. The specimens were homogenized and RNA extraction was performed using an RNeasy Mini Kit (Qiagen Ltd).
The integrity of the extracted RNA was confirmed by the presence of 28S and 18S ribosomal bands on 1% agarose gels. Single-stranded cDNAs were synthesized according to the Research Genetics protocol. Each probe was made using 6 mg RNA. Duplicate probes were produced from one pool of RNA for each individual patient. The probes were then radio-labeled with 33P-dCTP (ICN Radiochemicals, Amersham, UK) and purified. All the radio-labeled probes were hybridized on a nylon filter overnight. After hybridization, the filters were washed using 0.5%–1% sodium dodecyl sulfate and 2X–0.5X SSC. The filters were exposed to a phosphor screen for 48 h and then scanned using a Cyclone (Packard Instrument Company, Meriden, USA). Genetic alterations related to the identified hub genes, comprising gene amplifications, down-regulations, deletions, missense mutations, and substitutions, were measured, and transcriptional changes and mutual expression of genes were evaluated through Kaplan–Meier analysis. Moreover, datasets were extracted to verify the highly differentiated genes among the hub genes.
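The Kaplan–Meier analysis mentioned above can be illustrated with a minimal product-limit estimator. This is a sketch under stated assumptions, not the procedure used by GEPIA or in this study: the function name, the input format (follow-up times plus event flags, where 1 means the event was observed and 0 means the observation was censored), and the example data are all hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times  : follow-up duration for each patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    Returns a list of (time, survival_probability) at each event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == current_time:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((current_time, survival))
        # Both events and censored patients leave the risk set.
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up data (months), for illustration only.
    example_times = [5, 8, 8, 12, 15, 20]
    example_events = [1, 1, 0, 1, 0, 1]
    for t, s in kaplan_meier(example_times, example_events):
        print(t, round(s, 4))
```

Censored patients reduce the number at risk without lowering the survival estimate, which is the property that distinguishes this estimator from a simple proportion of survivors.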
Results and Discussions
The dataset consists of approximately 13473 genes with major mutations in several forms of bone cancer. The 45 genes with the largest number of reported mutations were chosen for enrichment analysis to explore the wide variety of possible biomedical applications. Enrichment analysis is the mapping of associations between a group of identified disease-related genes and predefined gene sets that represent cellular pathways. We found 82 different Gene Ontology terms with network-distance distribution values (XD-scores) ranging from 9.4 × 102 to 1.0 × 101; of these, only 9 ontologies with significant values greater than or equal to 9.0 × 101 were selected. More specifically, gene sets with similar or closely related over-representation scores can be distinguished using their XD-distances. The identified biological process ontologies with a significant score greater than or equal to 9.0 × 101 for the gene set are shown in [Table 1].
Table 1: The identified biological process ontologies as a result of genes enrichment
[Table 1] shows the pathway identifiers for the gene set and their network-similarity scores (XD-scores). Fisher's exact test was used to calculate the significance of the overlap between the pathway gene sets and the user-specified gene set. Furthermore, the numbers of genes in the uploaded and mapped user-defined gene set, in the mapped pathways, and in their intersections are shown.
A total of 196 pathways related to the gene set were identified, but only 15 pathways were chosen, indicating a significant correlation between the XD-score and Fisher's exact test [Table 2].
Table 2: The pathways identified on the basis of enrichment analysis of bone cancer mutated genes set
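The agreement between XD-scores and Fisher's exact test values for a list of pathways can be quantified with Pearson's correlation coefficient. The sketch below assumes the two scores are available as paired lists; the function name and the example values are hypothetical and do not come from the study's tables.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical paired scores for five pathways, for illustration only.
    xd_scores = [0.9, 1.4, 2.1, 3.0, 3.8]
    fisher_scores = [1.1, 1.9, 2.4, 3.9, 4.2]
    print(round(abs(pearson_r(xd_scores, fisher_scores)), 3))
```

The absolute value is printed because the sign depends on whether Fisher's test results are expressed as p-values (smaller is stronger) or as transformed scores (larger is stronger).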
The absolute Pearson correlation coefficient associated with the XD-score is 0.8 at a 95% confidence level, which indicates that the uploaded gene set and the mapped genes were strongly correlated for Gene Ontology. As shown in [Table 1] and [Table 2], the maximum number of genes overlapping with the pathway genes is 7 in total: CDKN2A, AKT1, NRAS, PIK3CA, RB1, BRAF, and TP53. The involvement of these genes in pathways was also cross-verified with the FunCoup tool. The gene interaction network is shown in [Figure 1].
Figure 1: The gene interaction network and the roles of the genes in several pathways. The hub genes are represented by solid lines within the circles. Red represents up-regulation of genes, whereas pink shows down-regulation. Blue arrows show state changes, green arrows show changes in expression, and brown arrows represent gene complexes
The network shown in [Figure 1] has a predictive confidence value of 0.85, corresponding to 85% accuracy. The overlapped genes, outlined in circles with solid lines in [Figure 1], are the most important components of the significant pathways. About 30.99% physical interactions were identified, and 28.29% of the genes showed co-expression with each other. In addition, 18.6% state changes, 5.7% changes in expression level, and 7.2% gene complexes were observed. All 7 highlighted genes showed co-expression with each other. A gene interaction network is an arrangement of genes connected by edges that represent functional relations between them. The edges are known as interactions, since two given genes either interact physically through their gene products (for example, proteins) or one gene modifies or influences the activity of the other gene of interest. Besides these physical interactions, genetic interactions also occur. Visualization and investigation of these networks are essential for researchers, who analyze them and use them to integrate external information sources such as the Gene Ontology. Because understanding the physical and functional interactions between molecules in the living body is of great significance in biology, understanding interactions between proteins is increasingly helpful in evaluating their roles in healthy and diseased states and may help in the diagnosis, prevention, and treatment of diseases.
The expression levels of hub genes in cancers were evaluated by uploading the gene set [Table 3] into GEPIA and choosing the respective cancers [Table 2], and the survival rate of patients having particular gene mutations was analyzed [Figure 2].
Table 3: Different mutations of hub genes analyzed in the collected samples of the cancer cells and their clinical analysis and verifications
Figure 2: The analysis of survival rate of patients with gene mutations in several cancers: (a) RB1 mutations, (b) NRAS mutations, (c) TP53 mutations, (d) BRAF mutations, (e) PIK3CA mutations, (f) AKT1 mutations, (g) CDKN2A mutations
[Figure 2] indicates that none of the patient groups showed a favorable survival outcome; the survival rate decreased continuously over time. Therefore, understanding the gene enrichment of pathways may help to identify novel biomarkers for cancer. Specifically, this type of gene mutation data for bone cancer can be used to guide medicinal research toward innovative applications and novel uses of existing chemotherapies. The involvement of the significant genes in multiple pathway profiles reflects their enhanced biological activity. Our investigation offered important insights into the behavior and mechanism of these genes, their involvement in a variety of pathways, and their relatedness.
Approximately 100 samples from various cancer patients were collected, and cDNA array studies were performed on the samples to measure the expression of the hub genes in the specimens. Good-quality RNA was extracted for the preparation of radio-labeled cDNA probes, and array data were derived. The highly expressed genes in the tumor samples included several significant markers, but only the hub genes obtained through GSEA were used to validate expression across all the samples. Analysis of the dataset found that RB1, P53, and NRAS were amplified in brain cancer, while BRAF, CDKN2A, and AKT1 were amplified in the sarcoma samples. The maximum number of deletion mutations of the PIK3CA gene was observed in leukemia. CDKN2A gene amplifications were found in nearly all cancers. In some cancers, truncating mutations, missense mutations, and deletions were also observed. [Table 3] summarizes the detailed analysis of the expression levels of the hub genes.
Conclusion
Various tumor genes were analyzed for their roles in the signaling and metabolic pathways involved in the formation of bone tumors and were assembled into a useful functional map. This detailed investigation supports a more systematic perspective on normal and human tumor systems. The study leads to the identification of novel overlapping pathways that are strongly related to cancer and tumor biology. In total, 82 different Gene Ontology terms were found with XD-scores ranging from 9.4 × 102 to 1.0 × 101; of these, only 9 ontologies with significant values greater than or equal to 9.0 × 101 were chosen. The larger Fisher's exact test values were obtained for the thyroid and non-small cell lung cancers. The CDKN2A, AKT1, NRAS, PIK3CA, RB1, BRAF, and TP53 genes were found to be the most important among the 41 selected genes. In addition, notable gene expression profiles were measured with respect to their biological behavior, to support the effective combination of targeted tumor therapeutics and to broaden the application of established tumor treatments. In the future, this approach will be used to characterize mechanisms related to the development of many diseases and the roles of different genes in a variety of diseases.
Financial support and sponsorship
Conflicts of interest
There are no conflicts of interest.
References
Gisbert-Garzarán M, Manzano M, Vallet-Regí M. Mesoporous silica nanoparticles for the treatment of complex bone diseases: Bone cancer, bone infection and osteoporosis. Pharmaceutics 2020;12:83.
Lim W, Kim B, Jo G, Yang DH, Park MH, Hyun H. Bioluminescence and near-infrared fluorescence imaging for detection of metastatic bone tumors. Lasers Med Sci 2020;35:115-20.
Johnston WT, Erdmann F, Newton R, Steliarova-Foucher E, Schüz J, Roman E. Childhood cancer: Estimating regional and global incidence. Cancer Epidemiol 2020;101662 [In Press].
Das S, Clézardin P, Kamel S, Brazier M, Mentaverri R. The CaSR in pathogenesis of breast cancer: A new target for early stage bone metastases. Front Oncol 2020;10:69.
Siemiatycki J. Historical overview of occupational cancer research. In: Occupational Cancers. Cham: Springer; 2020. p. 1-20.
Lindquester WS, Crowley J, Hawkins CM. Percutaneous thermal ablation for treatment of osteoid osteoma: A systematic review and analysis. Skeletal Radiol 2020;49:1-9.
Marc-André W, Simon David S, Georg WO, Burkhard L, Bernd W, Hans-Ulrich K, et al. Clinical long-term outcome, technical success, and cost analysis of radiofrequency ablation for the treatment of osteoblastomas and spinal osteoid osteomas in comparison to open surgical resection. Skeletal Radiol 2015;44:981-93.
Matthew RC, Damian ED, Stephen BS, Robert AB, Peter JL, Kirkland WD, et al. Percutaneous image-guided cryoablation of painful metastases involving bone. Cancer 2013;119:1033-41.
Morrow JJ, Khanna C. Osteosarcoma genetics and epigenetics: Emerging biology and candidate therapies. Crit Rev Oncog 2015;20:173-97.
Kim MJ, Cho KJ, Ayala AG, Ro JY. Chondrosarcoma: With updates on molecular genetics. Sarcoma 2011;1-15.
Montoya C, Rey L, Rodríguez J, Fernández MJ, Troncoso D, Cañas A, et al. Epigenetic control of the EWSFLI1 promoter in Ewing's sarcoma. Oncol Rep 2020;43:1199-207.
Kobaisi F, Fayyad N, Sulpice E, Badran B, Fayyad-Kazan H, Rachidi W, et al. High-throughput synthetic rescue for exhaustive characterization of suppressor mutations in human genes. Cell Mol Life Sci 2020;77:4209-22.
Chen FY, Li X, Zhu HP, Huang W. Regulation of the ras-related signaling pathway by small molecules containing an indole core scaffold: A potential antitumor therapy. Front Pharmacol 2020;11:280.
Bebek G. Identifying gene interaction networks. Methods Mol Biol 2012;850:483-94.
Pathan M, Keerthikumar S, Ang CS, Gangoda L, Quek CY, Williamson NA, et al. Technical brief funrich: An open access standalone functional enrichment and interaction network analysis tool. Proteomics 2015;15:2597-601.
Berg JM, Tymoczko JL, Stryer L. Biochemistry. 5th ed. New York: W. H. Freeman; 2002.
García-Campos MA, Espinal-Enríquez J, Hernández-Lemus E. Pathway analysis: State of the art. Front Physiol 2015;6:383.
Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, et al. Bioconductor: Open software development for computational biology and bioinformatics. Genome Biol 2004;5:R80.
Simon AF, Nidhi B, Sally B, Charlotte C, Chai Yin K, David B, et al. COSMIC: Mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res 2011;39:D945-50.
Forbes SA, Bhamra G, Bamford S, Dawson E, Kok C, Clements J, et al. The Catalogue of Somatic Mutations in Cancer (COSMIC). Curr Protoc Hum Genet 2008. [doi: 10.1002/0471142905.hg1011s57].
Glaab E, Baudot A, Krasnogor N, Schneider R, Valencia A. EnrichNet: Network-based gene set enrichment analysis. Bioinformatics 2012;28:i451-7.
Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, et al. The GeneMANIA prediction server: Biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res 2010;38:W214-20.
Chen J, Xu H, Aronow BJ, Jegga AG. Improved human disease candidate gene prioritization using mouse phenotype. BMC Bioinformatics 2007;8:392.
Chen J, Aronow BJ, Jegga AG. Disease candidate gene identification and prioritization using protein interaction networks. BMC Bioinformatics 2009;10:73.
Wolpert D, Macready W. No Free Lunch Theorems for Search. Technical Report SFI-TR-95-02-010. Santa Fe, NM: Santa Fe Institute; 1995.
Fisher RA. On the interpretation of χ2 from contingency tables, and the calculation of P. J R Stat Soc 1922;85:87-94.
Fisher RA. Statistical Methods for Research Workers. In: Breakthroughs in Statistics. 1992. p. 66-70.
Ng SK, Zhang Z, Tan SH. Integrative approach for computationally inferring protein domain interactions. Bioinformatics 2003;19:923-9.
Sharma P, Bhattacharyya DK, Kalita JK. Detecting protein complexes based on a combination of topological and biological properties in protein-protein interaction network. J Gen Eng Biotechnol 2018;16:217-26.