Construction of a risk model based on complement gene expression in the CGGA database and its validation in the TCGA database
First, a total of 194 complement genes were selected from the GSEA-MSigDB database to construct the gene signature (Supplement Table 1). We then built a gene-based risk model from the CGGA database using univariate and multivariate regression analyses. Consequently, an eight-gene signature was selected, and the risk score was calculated as follows: (-2.670×GPD2 expression) + (-0.297×LGMN expression) + (-0.391×KCNIP3 expression) + (-0.002×ANXA5 expression) + (-0.003×C1R expression) + (0.029×RAF1 expression) + (0.061×GNAI3 expression) + (-0.124×ZFYVE20 expression). The Kaplan–Meier curves for the CGGA database indicated that patients in the low-risk group had a significantly longer OS than patients in the high-risk group in the WHO low-grade glioma (LGG), GBM, and glioma cohorts (Fig. 1A). These results were validated in the TCGA database for the GBM and glioma cohorts (Fig. 1B). We then analyzed the relationship between the complement gene signature and clinicopathological characteristics, including WHO grade, IDH1 mutation, MGMT promoter methylation, 1p19q codeletion, and tumor subtype, in the CGGA database (Fig. 2A) and the TCGA database (Fig. 2B). The results showed that glioma patients with higher risk scores were more likely to have a higher WHO grade, 1p19q non-codeletion, IDH wild-type status, and the classical or mesenchymal subtype in both the CGGA (Fig. 2C) and TCGA (Fig. 2D) databases.
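As an illustration of how the risk score above can be applied, the following minimal Python sketch computes the score for one patient from the eight reported coefficients and divides a cohort into low-risk and high-risk groups. The median cutoff, the function names, and the example expression values are assumptions made for illustration only; the text does not state which cutoff was used, and the symbol ZFYVE20 is assumed for the eighth gene.

```python
from statistics import median

# Coefficients of the eight-gene signature as reported in the text.
# The eighth gene symbol is assumed to be ZFYVE20.
COEFFICIENTS = {
    "GPD2": -2.670,
    "LGMN": -0.297,
    "KCNIP3": -0.391,
    "ANXA5": -0.002,
    "C1R": -0.003,
    "RAF1": 0.029,
    "GNAI3": 0.061,
    "ZFYVE20": -0.124,
}


def risk_score(expression):
    """Return the weighted sum of expression values for the eight genes.

    `expression` maps gene symbol -> expression value for one patient.
    A KeyError is raised if any signature gene is missing.
    """
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Assign each patient to 'high' or 'low' risk.

    Assumption: the cohort median is the cutoff, and scores strictly
    above it are labeled 'high'. The source does not specify the cutoff.
    """
    cutoff = median(scores.values())
    return {
        patient: ("high" if score > cutoff else "low")
        for patient, score in scores.items()
    }


if __name__ == "__main__":
    # Hypothetical expression values for three hypothetical patients.
    cohort = {
        "patient_1": {"GPD2": 1.2, "LGMN": 8.0, "KCNIP3": 2.5, "ANXA5": 150.0,
                      "C1R": 90.0, "RAF1": 20.0, "GNAI3": 15.0, "ZFYVE20": 4.0},
        "patient_2": {"GPD2": 0.4, "LGMN": 3.0, "KCNIP3": 0.5, "ANXA5": 300.0,
                      "C1R": 200.0, "RAF1": 45.0, "GNAI3": 40.0, "ZFYVE20": 1.0},
        "patient_3": {"GPD2": 0.9, "LGMN": 5.0, "KCNIP3": 1.5, "ANXA5": 220.0,
                      "C1R": 120.0, "RAF1": 30.0, "GNAI3": 25.0, "ZFYVE20": 2.0},
    }
    scores = {pid: risk_score(expr) for pid, expr in cohort.items()}
    groups = split_by_median(scores)
    for pid in cohort:
        print(pid, round(scores[pid], 3), groups[pid])
```

Because seven of the eight coefficients are negative, higher expression of most signature genes lowers the score in this formulation; only RAF1 and GNAI3 raise it.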
The immune landscapes of patients in low-risk and high-risk groups from the CGGA and the TCGA databases
Next, we explored the tumor purity, estimate score, stromal score, tumor score, and immune cell infiltration in the TME of the low-risk and high-risk groups from the CGGA and TCGA databases (Fig. 3). Patients in the high-risk group from the CGGA database had a lower tumor purity and a higher estimate score, stromal score, and tumor score than those in the low-risk group (Fig. 3A). These results were consistent in the TCGA database (Fig. 3B). The CIBERSORT algorithm revealed fewer M0 macrophages and more naive CD4 T cells and monocytes in the low-risk group from the CGGA database (Fig. 3C). In addition, the results from the TCGA database indicated that the levels of several immune cell types, such as naive B cells, plasma cells, CD8 T cells, CD4 T cells, and NK cells, also differed significantly between the low-risk and high-risk groups (Fig. 3D).
The immune aspects in the tumor microenvironment from our sequencing data and enrichment pathways related to the complement genes
Then, to test the reliability of the risk score model, we validated the immune results using our sequencing data (Fig. 4). The data showed that patients in the high-risk group had a higher estimate score, stromal score, and tumor score than those in the low-risk group (Fig. 4A-C). Moreover, patients in the high-risk group exhibited TME resistance (Fig. 4D). In addition, patients in the high-risk group had more infiltration of follicular helper T cells, monocytes, and M2 macrophages, and less infiltration of resting memory CD4 T cells and M0 macrophages in the TME (Fig. 4E).
To investigate the function of the complement genes in gliomas, we analyzed the biological processes, cellular components, molecular functions, and enriched pathways using the GO and KEGG databases (Supplement Fig. 1). The KEGG enrichment analysis indicated that the top five enriched pathways were complement and coagulation cascades, Kaposi sarcoma-associated herpesvirus infection, pertussis, chemokine signaling, and tuberculosis (Supplement Fig. 1A). In addition, the GO enrichment analysis showed that the most enriched biological processes were coagulation, regulation of body fluid levels, blood coagulation, hemostasis, and regulation of inflammatory response (Supplement Fig. 1B); the cellular components were enriched mainly in the extracellular matrix, collagen-containing extracellular matrix, secretory granule lumen, cytoplasmic vesicle lumen, and vesicle lumen (Supplement Fig. 1C); and the most enriched molecular functions were endopeptidase activity, serine-type peptidase activity, serine hydrolase activity, serine-type endopeptidase activity, and enzyme activator activity (Supplement Fig. 1D).
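Enrichment analyses of this kind are commonly based on an over-representation test. As a rough sketch, assuming a one-sided hypergeometric test (the text does not name the software or the statistical test that was used), the p-value for a single pathway could be computed as shown here. The function name and every count other than the 194 complement genes are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, query_size, overlap):
    """One-sided over-representation p-value, P(X >= overlap).

    population   : number of annotated background genes
    pathway_size : number of background genes in the pathway
    query_size   : number of genes in the query list
    overlap      : number of query genes that fall in the pathway
    """
    upper = min(pathway_size, query_size)
    total = comb(population, query_size)
    # Sum the hypergeometric probability mass from `overlap` upward.
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # 194 query genes as in the text; the other counts are made up.
    p = hypergeom_enrichment_p(
        population=20000, pathway_size=85, query_size=194, overlap=30
    )
    print(f"p-value: {p:.3e}")
```

In practice the p-values for all tested pathways would also be corrected for multiple testing before pathways are ranked.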