Combined mRNAs and clinical factors model on predicting prognosis in patients with triple-negative breast cancer

  • Yanjun Hu,

    Roles Conceptualization, Data curation, Methodology, Project administration, Software, Supervision, Writing – original draft

    Affiliation Department of Breast Surgery, The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, Zhejiang, China

  • Dehong Zou

    Roles Conceptualization, Project administration, Resources, Writing – review & editing

    huyj840205@163.com

    Affiliation Department of Breast Surgery, The Cancer Hospital of the University of Chinese Academy of Sciences (Zhejiang Cancer Hospital), Institute of Basic Medicine and Cancer (IBMC), Chinese Academy of Sciences, Hangzhou, Zhejiang, China

Abstract

Objective

Triple-negative breast cancer (TNBC) is an aggressive cancer, usually diagnosed in young women, for which no effective prognostic prediction model is available. The present study was performed to develop a prognostic model for predicting overall survival (OS) in TNBC patients.

Methods

The Cancer Genome Atlas (TCGA) and Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) databases were used as the training and validation data sets, respectively, from which the gene expression levels and clinical prognostic information of TNBC patients were collected. Differentially expressed genes (DEGs) between TNBC and non-TNBC (NTNBC) were identified with thresholds of false discovery rate < 0.05 and |log2 Fold Change| > 1. DEGs present in the AmiGO2 and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases were retained for further study. Univariate Cox, multivariate Cox, and logistic regression analyses were conducted to identify a DEG signature, with log-rank P < 0.05 as the threshold. Prognostic models based on the mRNA signature, on clinical factors, and on their combination were constructed and compared.

Results

A five-DEG signature comprising CHST4, COCH, CST9, SOX11, and TDGF1 was identified in the DEG prognostic model. Stratified analysis showed that patients aged over 60, with a higher pathologic stage (III-IV), or with recurrence had a significantly lower survival rate than those aged below 60, with a lower pathologic stage, or without recurrence. In the subgroup aged over 60, patients with high-risk scores had a significantly lower survival rate than those with low-risk scores [HR = 3.780 (1.801–7.933), P < 0.0001]. Among patients with a higher pathologic stage or with recurrence, high-risk scores were likewise associated with a significantly lower survival rate than low-risk scores. The five-mRNA signature combined with the clinical model (AUC = 0.950) predicted prognosis better than the clinical model alone (AUC = 0.795) or the five-mRNA signature model alone (AUC = 0.823).

Conclusion

The present study identified a prognostic prediction model (combining a five-mRNA signature with clinical factors) for TNBC patients receiving immunotherapy, which will benefit future research and clinical therapies.

Introduction

Triple-negative breast cancer (TNBC) is an aggressive breast cancer that is negative for progesterone receptor (PR), estrogen receptor (ER), and human epidermal growth factor receptor 2 (Her-2). TNBC patients benefit little from endocrine therapy or Her-2-targeted therapy, which leaves chemotherapy as the traditional treatment [1]. Even worse, TNBC patients suffer from earlier recurrence, worse prognosis, and shorter survival time than patients with other breast cancer subtypes [2–4]. Immunotherapy is now drawing extensive attention in TNBC therapy [5–8]. The efficacy of immunotherapy in most types of breast cancer has not been confirmed, in contrast to cancers with higher immunogenicity such as malignant melanoma and small cell lung cancer [9, 10]. However, the TNBC immunotherapy approved by the FDA in the United States has achieved outstanding curative effects [11]. Although the success of immunotherapy is exciting, many patients do not respond to it.

The complexity and diversity of the tumor microenvironment (TME) have been gradually understood in recent years, and its importance in immunotherapy has also been recognized. The TME is the cellular environment of the tumor, including immune cells, mesenchymal cells, endothelial cells, inflammatory mediators, and extracellular matrix molecules. It is widely considered that the microenvironment plays a significant role in tumor development [12]. Therefore, a comprehensive analysis of the correlation between gene signatures and overall survival (OS) may shed light on the pathogenesis of TNBC.

Chemotherapy remains the most effective therapy [13]. In recent years, immunotherapy has been widely studied in cancers, especially TNBC. PD-1 and PD-L1 are a pair of immune checkpoint molecules targeted by drugs such as pembrolizumab [14], atezolizumab [15], and durvalumab [16], which have been reported to prolong OS. Clinical studies have found that immune infiltration can improve prognosis in TNBC patients [17, 18]. Therefore, identifying the DEGs of TNBC may contribute to an in-depth analysis of the factors affecting survival.

Traditionally, prognosis has been predicted from clinical risk factors, including age, tumor size, pathologic stage, and location. With the development of high-throughput sequencing technologies, multigene signatures, including miRNA, mRNA, and lncRNA signatures, have been recognized as valuable biomarkers for predicting breast cancer prognosis, such as Oncotype DX, B-cell/IL8, Mammo-print, and the Genomic Grading Index [19–21]. In recent years, an increasing number of studies have demonstrated that mRNAs play roles in identification, as biomarkers, and in prognosis prediction in TNBC patients [22–27]. However, no mRNA signature for prognosis prediction in TNBC patients has yet been reported.

The Cancer Genome Atlas (TCGA) database and the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) database provide a wealth of cohorts with cancer-specific gene expression information and detailed clinical characteristics, and they have accelerated the molecular analysis of cancers. In the present study, the mRNA signature of TNBC was identified using four databases: TCGA, METABRIC, AmiGO2, and the Kyoto Encyclopedia of Genes and Genomes (KEGG). Prognosis prediction models based on clinical factors and on the mRNA signature were built and compared, and a stratified analysis was performed to refine the prognosis prediction. Our objective was to find the most accurate and straightforward prognostic model for use in the clinical management of TNBC.

In the present study, a five-mRNA signature (CHST4, COCH, CST9, SOX11, and TDGF1) was constructed to predict the prognosis of TNBC. Moreover, the prognostic model combining the mRNA signature with clinical factors predicted better than the mRNA signature model or the clinical factors alone.

Materials and methods

Clinical information and RNA-Seq dataset in TCGA data set

We downloaded level three gene-level RNA-Seq data, expressed as fragments per kilobase of transcript per million mapped reads (FPKM), for breast cancer samples sequenced on the Illumina HiSeq 2000 RNA sequencing platform. The data were obtained from the TCGA database as the training data set through the Genomic Data Commons (GDC) Data Transfer tool (https://portal.gdc.cancer.gov/) before February 01, 2020. Patients with complete expression profiles and clinical information, including OS, ER, PR, and Her-2 status, were selected for the study. This study met the publication guidelines provided by TCGA. The gene expression and clinical information of breast cancer samples were also downloaded from the METABRIC database (http://molonc.bccrc.ca/) as the validation data set.

Identification of DEGs

In the TCGA training data set, the differentially expressed genes (DEGs) between TNBC and NTNBC patients were identified with the limma package in R 3.4.1 (S1 File) [28] (https://bioconductor.org/packages/release/bioc/html/limma.html), with thresholds of false discovery rate (FDR) < 0.05 and |log2 Fold Change (FC)| > 1. Based on the expression levels of the DEGs, two-way hierarchical clustering analysis was performed with the centered Pearson correlation algorithm [29] using pheatmap version 1.0.8 (https://cran.r-project.org/web/packages/pheatmap/index.html) [30]. The DEGs present in the AmiGO2 (http://amigo.geneontology.org/amigo) and KEGG pathway (https://www.kegg.jp/) databases were retained for further study.
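The thresholding step described above can be illustrated with a minimal Python sketch. The authors worked in R with limma; this sketch does not reproduce limma's moderated statistics. It only shows how an FDR cutoff and a fold-change cutoff are applied together, assuming the FDR is obtained by Benjamini-Hochberg adjustment. The gene names, fold changes, and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, log2_fc, p_values, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Keep genes with FDR < fdr_cutoff and |log2 FC| > lfc_cutoff."""
    fdr = benjamini_hochberg(p_values)
    selected = []
    for gene, lfc, q in zip(genes, log2_fc, fdr):
        if q < fdr_cutoff and abs(lfc) > lfc_cutoff:
            selected.append((gene, "up" if lfc > 0 else "down"))
    return selected


# Hypothetical input: four genes with log2 fold changes and raw p-values.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
log2_fc = [2.3, -1.6, 0.4, 1.2]
p_values = [0.0001, 0.004, 0.0002, 0.30]
degs = select_degs(genes, log2_fc, p_values)
print(degs)
```

In this toy input, GENE_C is dropped for its small fold change despite a low p-value, and GENE_D is dropped for its high adjusted p-value despite a fold change above the cutoff.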

DEGs signature identification and survival prognosis models construction

Univariate and multivariate Cox regression analyses were performed to identify independent prognosis-related DEGs, based on the clinical prognosis information (recurrence, death, and OS time) of the samples in the TCGA training data set and the expression levels of the DEGs identified above, using the survival package (version 2.41–1, http://cran.r-project.org/web/packages/survival/index.html) in R 3.4.1 (S1 File). Log-rank P < 0.05 was taken as the threshold for a significant correlation. The feature DEGs among these were then screened out with a Logit regression model using the glm function in R 3.4.1 (S1 File) [31, 32].

The DEG prognostic risk score was calculated from the expression levels of the feature DEGs obtained in the previous steps and the prognostic coefficient of each DEG in the optimized combination. The DEG prognostic model was calculated as follows:

Prognostic risk score = Σ βDEGs × ExpDEGs

Here, βDEGs indicates the coefficient of each DEG derived from multivariate Cox regression, whereas ExpDEGs represents the expression level of that DEG in the training data set.

Finally, the DEG prognostic risk score of each sample was evaluated. Taking the median value as the threshold, samples in the TCGA training data set were divided into high- and low-risk groups. We compared the actual survival of the two groups defined by the DEG prognostic score model using the Kaplan–Meier method in the survival package (version 2.41–1) in R 3.4.1 (S1 File).
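The grouping and survival-curve steps can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' R code: it assumes that samples with a score strictly above the median are labeled high-risk (the text does not say how ties are handled), and it uses the standard Kaplan–Meier product-limit estimator. All input values are hypothetical.

```python
from statistics import median


def split_by_median(scores):
    """Label each sample 'high' if its score exceeds the median, else 'low'."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if death was observed at times[i], 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


# Hypothetical risk scores, follow-up times (months), and death indicators.
groups = split_by_median([0.1, 0.4, -0.2, 0.9])
curve = kaplan_meier([5, 8, 8, 12], [1, 1, 0, 1])
print(groups)
print(curve)
```

In practice one curve would be estimated per risk group and the two curves compared with a log-rank test, which is omitted here for brevity.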

DEGs prognostic risk model validation analysis

Samples in both the TCGA training and METABRIC validation data sets were used to validate the performance of the DEG prognostic risk prediction model. The samples in each data set were separately classified into TNBC and NTNBC groups by the logistic regression model. DEGs with P < 0.05 were considered feature DEGs. The accuracy of classification was then validated by comparing the predicted group with the actual group.
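Comparing predicted and actual groups amounts to building a confusion matrix. The sketch below is a hedged illustration in Python: it assumes that the logistic model outputs a probability of TNBC for each sample and that a cutoff of 0.5 is used, neither of which is stated in the text. The labels and probabilities are hypothetical.

```python
def confusion_and_accuracy(actual, predicted_prob, threshold=0.5):
    """Compare predicted classes with actual classes (1 = TNBC, 0 = NTNBC)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, p in zip(actual, predicted_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            counts["tp"] += 1
        elif predicted == 1 and y == 0:
            counts["fp"] += 1
        elif predicted == 0 and y == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    accuracy = (counts["tp"] + counts["tn"]) / len(actual)
    return counts, accuracy


# Hypothetical labels and model probabilities for five samples.
counts, accuracy = confusion_and_accuracy(
    [1, 1, 0, 0, 0], [0.9, 0.4, 0.2, 0.6, 0.1]
)
print(counts, accuracy)
```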

Screening for independent prognostic clinical models

The independent prognostic clinical factors in the breast cancer tumor samples of the TCGA training data set were screened using the univariate and multivariate Cox regression analysis in R3.4.1 survival package (version 2.41–1; S1 File). Log-rank P < 0.05 was used as the threshold. Then, the stratified analysis was performed.

Comparison analysis between prognostic clinical factors and DEGs prognostic risk models

To further investigate the correlation between the independent prognostic clinical factors and the DEG prognostic risk model, nomogram analyses of three- and five-year survival rate prediction models were performed with the rms package (version 5.1–2; https://cran.r-project.org/web/packages/rms/index.html) in R 3.4.1 (S1 File) [33, 34]. The clinical prognosis models were then constructed and compared with the DEG prognostic model.

Results

Data source and preprocessing

Overall, 1,217 samples were assessed, of which 710 samples (116 TNBC and 594 NTNBC) with complete clinical information in the TCGA training data set were retained for further study. The clinical information statistics are shown in Table 1 and S1 Table. In addition, 299 TNBC and 1,605 NTNBC samples with RNA-Seq expression profiles were downloaded from the METABRIC database as the validation data set.

Table 1. The clinical prognosis information statistics of breast cancer samples in the TCGA database.

https://doi.org/10.1371/journal.pone.0260811.t001

Sampling and detection of DEGs

Significant differences were observed in terms of age (P = 0.00021), pathologic_N (P = 0.00266), and recurrence (P = 0.0375) between TNBC and NTNBC patients (Table 1) in TCGA database. No significant differences were detected between TNBC and NTNBC in other clinical factors, including pathologic_M, pathologic_T, pathologic stage, radiotherapy, target-therapy, dead, and OS time (P > 0.05; Table 1).

A total of 15,583 mRNAs were identified in the TCGA training data set after removing those with an expression level of 0. With the thresholds of FDR < 0.05 and |log2 FC| > 1, 884 DEGs (578 up-regulated and 306 down-regulated) were identified (Fig 1A; S2 Table). Moreover, the two-way hierarchical clustering heatmap of these DEGs clearly separated TNBC from NTNBC patients (Fig 1B). A total of 164 DEGs present in both the AmiGO2 and KEGG databases were used for further study (S3 Table).

Fig 1. The DEGs detection analysis.

(A) The volcano map of the mRNAs. Horizontal dashed lines represent false discovery rate <0.05, and two vertical dashed lines represent |log2Fold Change|> 1. The size of the dots represents the absolute log2Fold Change, and the larger the value, the larger the point. (B) A two-way hierarchical clustering heat map based on the expression level of DEGs. DEGs, differentially expressed genes.

https://doi.org/10.1371/journal.pone.0260811.g001

Screening of characteristic DEGs and constructing of survival prognosis models

In the TCGA training data set, twenty prognosis-related DEGs were selected through univariate Cox regression analysis (S4 Table), seven of which were left after multivariate Cox regression analysis, including cochlin (COCH), carbohydrate sulfotransferase 4 (CHST4), LIM domain only 1 (LMO1), cystatin 9 (CST9), SRY-box transcription factor 11 (SOX11), histatin 3 (HTN3), and teratocarcinoma-derived growth factor 1 (TDGF1) (S5 Table). Thereafter, a five-DEG signature associated with independent prognosis was screened out and used for the Logit regression model construction. These five genes are: CHST4, HR = 0.9379 (0.8816–0.9978); COCH, HR = 0.8080 (0.7303–0.8939); CST9, HR = 0.9499 (0.9127–0.9885); SOX11, HR = 1.109 (1.001–1.230); and TDGF1, HR = 0.9066 (0.8419–0.9763); Table 2.

Table 2. Important DEG signature lists assessed through the Logit regression model and multi-variable Cox regression.

https://doi.org/10.1371/journal.pone.0260811.t002

Then, the DEG prognostic risk model was constructed based on the expression profiles of CHST4, COCH, CST9, SOX11, and TDGF1 in the TCGA training data set. The prognostic risk score of each TCGA sample was calculated as: prognostic risk score = (–0.064) × ExpCHST4 + (–0.213) × ExpCOCH + (–0.051) × ExpCST9 + (0.104) × ExpSOX11 + (–0.098) × ExpTDGF1.
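As a worked illustration, the formula above can be evaluated in a few lines of Python. The coefficients are those stated in the text; the expression values of the example sample are hypothetical. The sketch also checks that exponentiating each Cox coefficient approximately reproduces the hazard ratios reported for the five genes, which is the usual relationship HR = exp(β).

```python
import math

# Coefficients as given in the risk score formula in the text.
COEFFICIENTS = {
    "CHST4": -0.064,
    "COCH": -0.213,
    "CST9": -0.051,
    "SOX11": 0.104,
    "TDGF1": -0.098,
}

# Hazard ratios reported for the same genes in the text (point estimates).
REPORTED_HR = {
    "CHST4": 0.9379,
    "COCH": 0.8080,
    "CST9": 0.9499,
    "SOX11": 1.109,
    "TDGF1": 0.9066,
}


def risk_score(expression):
    """Weighted sum of expression levels using the Cox coefficients."""
    return sum(COEFFICIENTS[g] * expression[g] for g in COEFFICIENTS)


def hazard_ratio(coef):
    """Hazard ratio implied by a Cox regression coefficient."""
    return math.exp(coef)


# Hypothetical expression values for one sample (not taken from the data).
sample = {"CHST4": 1.0, "COCH": 2.0, "CST9": 0.0, "SOX11": 3.0, "TDGF1": 1.0}
score = risk_score(sample)
print(round(score, 3))

for gene, coef in COEFFICIENTS.items():
    print(gene, round(hazard_ratio(coef), 4), REPORTED_HR[gene])
```

Only SOX11 has a positive coefficient, so higher SOX11 expression raises the score, while higher expression of the other four genes lowers it.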

According to the median value of the risk scores, samples in each data set were divided into high- and low-risk groups. Survival differed significantly between the high- and low-risk groups in the TCGA training data set [HR = 2.509 (1.570–4.012), P < 0.0001; Fig 2A]. Patients in the high- and low-risk groups of the METABRIC validation data set also differed significantly in survival [HR = 1.234 (1.096–1.389), P < 0.0001; Fig 2B]. The receiver operating characteristic (ROC) curve was plotted, with an AUC of 0.823 (95% CI: 0.675–0.956; Fig 2C) in the training data set and 0.642 (95% CI: 0.605–0.617; Fig 2D) in the validation data set.
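For readers unfamiliar with the AUC, the following Python sketch computes it for a binary outcome as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties count as one half). This is an assumption-laden illustration: the text does not state whether a binary or a time-dependent ROC analysis was used, and the labels and scores below are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (1 = event, 0 = no event)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical outcome labels and risk scores for four samples.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(auc)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.823 and 0.642 should be read.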

Fig 2. The mRNA prognostic model used the mortality risk score calculation in the TCGA training data set and METABRIC validation set.

(A, B) Kaplan-Meier curves based on the mRNA prognostic prediction model in the TCGA training data set and the METABRIC validation set. Significant differences were observed. (C, D) ROC curves of the prediction results based on the prognosis model. TCGA, The Cancer Genome Atlas. METABRIC, Molecular Taxonomy of Breast Cancer International Consortium. ROC, receiver operating characteristic.

https://doi.org/10.1371/journal.pone.0260811.g002

DEGs prognostic risk model validation analysis

The five-DEG signature was used to classify TNBC and NTNBC samples in the TCGA training and METABRIC validation data sets. The scatter distribution map revealed that the five-DEG signature could precisely identify TNBC and NTNBC samples in both the TCGA training (AUC = 0.938, 95% CI: 0.897–0.921; Fig 3A and 3C) and the METABRIC validation data sets (AUC = 0.831, 95% CI: 0.749–0.864; Fig 3B and 3D). When the predicted classification was compared with the actual group, the overall TNBC predictive percentage of this model was 94.08% and 92.91% in the TCGA training and METABRIC validation data sets, respectively (Fig 3E and 3F).

Fig 3. The Logit regression model analysis of TCGA training and METABRIC validation data sets.

(A, B) Scatter distributions of TNBC and NTNBC samples. (C, D) Classification results of the logistic regression model. The five-mRNA signature precisely identified TNBC and NTNBC samples in the TCGA training and the METABRIC validation data sets. (E, F) Fuzzy classification matrices for the TCGA and METABRIC data sets, showing overall TNBC predictive percentages of 94.08% and 92.91% when the predicted classification was compared with the actual group. TNBC, triple-negative breast cancer; NTNBC, non-TNBC. TCGA, The Cancer Genome Atlas. METABRIC, Molecular Taxonomy of Breast Cancer International Consortium.

https://doi.org/10.1371/journal.pone.0260811.g003

Screening for independent prognostic clinical models in the TCGA training cohort

In the TCGA training set, several independent clinical factors significantly associated with OS were screened out, including age [HR = 1.060 (1.014–1.109), P = 0.00995], pathologic stage [HR = 7.367 (1.168–46.48), P = 0.0336], tumor recurrence [HR = 2.237 (1.486–7.129), P < 0.0001], and DEG prognostic model status [HR = 2.064 (1.687–6.208), P = 0.00197] (Table 3). We then stratified samples into subgroups according to age (over 60 and under 60 years old), pathologic stage (I-II and III-IV), and recurrence (with and without recurrence). Stratified analysis showed that patients aged over 60 [HR = 2.573 (1.644–4.029), P < 0.0001; Fig 4A], at a higher pathologic stage [III-IV; HR = 2.873 (1.830–4.511), P < 0.0001; Fig 4D], or with recurrence [HR = 9.362 (5.015–17.48), P < 0.0001; Fig 4G] had a significantly lower survival rate than those who were younger, at a lower pathologic stage, or without recurrence. OS did not differ significantly between the high- and low-risk groups among patients aged under 60, at an early pathologic stage, or without recurrence (P > 0.05; Fig 4B, 4E and 4H). Among patients aged over 60, those with high-risk scores had significantly lower OS than those with low-risk scores [HR = 3.780 (1.801–7.933), P < 0.0001; Fig 4C]. Among patients at a high pathologic stage or with recurrence, high-risk scores were likewise associated with a significantly lower survival rate than low-risk scores [HR = 4.027 (1.656–9.797), P = 0.00053; HR = 2.940 (1.068–8.091), P = 0.0224; Fig 4F and 4I].

Fig 4. The prognostic-related Kaplan-Meier curve of age, pathologic stage, and tumor recurrence.

(A) The prognosis-related Kaplan-Meier curve of age. (B, C) Prognosis-related Kaplan-Meier curves of TCGA patients aged over 60 and below 60. (D) The prognosis-related Kaplan-Meier curve of pathologic stage. (E, F) Prognosis-related Kaplan-Meier curves of the pathologic stage I-II and III-IV groups in the TCGA samples. (G) The prognosis-related Kaplan-Meier curve of recurrence. (H, I) Kaplan-Meier curves, based on the prognostic prediction model, of the groups with and without recurrence. TCGA, The Cancer Genome Atlas.

https://doi.org/10.1371/journal.pone.0260811.g004

Comparison analysis between prognostic clinical factors and DEGs prognostic risk models

The survival nomogram analysis performed on the TCGA training data set revealed that age and DEG prognostic model status contributed most to the three-year and five-year OS (Fig 5A). The predicted three-year (C-index = 0.872) and five-year (C-index = 0.856) survival probabilities based on the model were broadly consistent with the actual survival rates (Fig 5B).
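The concordance index (C-index) quoted above measures how often, among comparable pairs of patients, the one predicted to be at higher risk actually dies earlier. The Python sketch below implements Harrell's C-index in its basic form as an illustration; it is not the rms implementation the authors used, and the follow-up times, event indicators, and risk scores are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an observed event
    and a shorter follow-up time than patient j.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times, death indicators, and predicted risks.
c_index = concordance_index(
    [2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.3, 0.5, 0.1]
)
print(c_index)
```

As with the AUC, 0.5 indicates no discrimination and 1.0 indicates perfect ranking of patients by risk.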

Fig 5. The prognostic nomogram model for independent prognostic factors and three-year and five-year survival prediction.

(A) Nomogram of a prognostic model for independent prognostic factors. (B) Line chart of three-year and five-year survival predictions and actual survival. The horizontal axis represents the predicted OS rate, the vertical axis represents the real OS rate, and black and red represent the three-year and five-year forecast line graphs, respectively. OS, overall survival. c-index, concordance index.

https://doi.org/10.1371/journal.pone.0260811.g005

The model comparison analysis revealed that the combined clinical model (AUC = 0.795, P < 0.0001; Fig 6; Table 4) predicted better than any single clinical factor, including age (AUC = 0.531, P < 0.0001), pathologic stage (AUC = 0.540, P = 0.00036), and recurrence (AUC = 0.734, P < 0.0001). Moreover, the model combining the five-DEG signature with clinical factors (AUC = 0.950, P < 0.00001) clearly outperformed the combined clinical model (AUC = 0.795, P < 0.0001) and the five-DEG signature model alone (AUC = 0.823, P = 0.00039) in predicting prognosis.

Fig 6. ROC curve comparison based on different models.

ROC, receiver operating characteristic. The clinical model contains the factors of age, pathologic stage, and recurrence.

https://doi.org/10.1371/journal.pone.0260811.g006

Discussion

TNBC is a heterogeneous and aggressive disease with a short treatment-to-relapse time and a high rate of visceral metastasis [35]. Its long-term prognosis is worse than that of other breast cancer subtypes [36]. Worse, TNBC recurrence typically occurs within the first three years after therapy [37]. Once recurrence and metastasis occur, the median survival time is less than one year [38]. Chemotherapy remains the most effective therapy [13]. In recent years, immunotherapy has been widely studied in cancers, especially TNBC. PD-1 and PD-L1 are a pair of immune checkpoint molecules targeted by drugs such as pembrolizumab [14], atezolizumab [15], and durvalumab [16], which have been reported to prolong OS. Clinical studies have found that immune infiltration can improve prognosis in TNBC patients [17, 18]. Therefore, identifying the DEGs of TNBC may contribute to an in-depth analysis of the factors affecting survival. The TCGA, METABRIC, AmiGO 2, and KEGG databases provide comprehensive and reliable high-throughput data for mining. In the present study, a five-DEG signature (CHST4, COCH, CST9, SOX11, and TDGF1) was constructed to predict the prognosis of TNBC. Moreover, the prognostic model combining the DEG signature with clinical factors predicted better than the DEG signature model or the clinical factors alone.

The signature in the present study consisted of five genes: CHST4, COCH, CST9, SOX11, and TDGF1. CHST4 was enriched in the biological process of immune response, that is, the response to any potential internal or invasive threat to an organism [39]. COCH was enriched in the positive regulation of innate immune response, a process that activates or increases the frequency, rate, or extent of the innate immune response, which is the first line of defense against infection. CST9 was enriched in the immune response against microbes mediated through body fluid. SOX11 was enriched in the negative regulation of lymphocyte proliferation, a process that stops, prevents, or reduces the rate or extent of lymphocyte proliferation. TDGF1 was enriched in the GO term of cellular response to interferon-gamma. These genes are involved in immune-related GO terms and play essential roles in immune regulation. However, their mechanisms in TNBC patients still need further analysis.

In recent years, the association between mRNAs, lncRNAs, and the prognosis of TNBC has been widely studied. Jiang et al. [26] developed a lncRNA-mRNA signature to predict the benefit of taxane chemotherapy and recurrence in TNBC and to assist individualized treatment for TNBC patients. Loibl et al. [40] reported an immune-related mRNA signature (TIL and IFN-γ) that predicted response to durvalumab in primary TNBC. Ren et al. [41] reported a seven-gene signature (1.108*TMEM101–0.213*KRT5–0.315*ACAN–0.464*LCA5+0.446*RPP40–0.373*LAGE3–0.257*CDKL2) for predicting prognosis and de-escalating treatment in early-stage TNBC. Wang et al. [23] developed a response score with one lncRNA and two mRNAs (2.595*BPESC1–1.09*WDR72–1.428*GADD45A–0.731), which could be employed clinically to predict complete pathological responses in TNBC patients receiving neoadjuvant chemotherapy. An integrated signature of three DE mRNAs (TRIM59, EXO1, and RAD51AP1), one DE lncRNA (KIRREL3-AS1), and one DE miRNA (hsa-mir-106a) was found to be significantly associated with the prognosis of patients with TNBC [42]. However, the application of these molecular signatures has been limited because they focused on molecular factors and gave little weight to clinical factors. In this study, the prognostic risk was predicted by single clinical factors, by the five-mRNA signature, and by their combination. The predictive performance of our combined prognostic model was superior to that of the five-mRNA signature, age, stage, recurrence, or the clinical model alone. Prognostic models should be as simple as possible for patients and doctors to use in clinical practice, and they should also be accurate. Our combined prognosis prediction model was based on routine factors, including genetic differences (the five-mRNA signature), a baseline demographic factor (age), a histopathological characteristic (pathologic stage), and a prognostic factor (recurrence). Combining these essential factors makes the association between risk factors and outcomes more accurate. Consequently, the prognosis risk of TNBC patients can be easily estimated.

A good prognosis prediction model generally takes stratified clinical factors into account, with patients divided into high- and low-risk groups. The prognosis prediction model with stratified factors was more accurate than the one without. The five-DEG signature performed well in risk stratification in the subgroups with pathologic stage III-IV, with recurrence, and aged over 60. Regarding the impact of age on prognosis, Liedtke et al. [43] demonstrated that patients ≤ 40 years old have poorer survival despite more aggressive systemic therapy, an age cutoff 20 years younger than the one in our present study. This difference might point to a more accurate signature in our study. Regarding pathologic stage, He et al. [44] reported that a higher stage correlated with shorter disease-free survival (HR = 3.13, 95% CI: 1.94–5.06), which was consistent with the result in our study (HR = 2.873, 95% CI: 1.830–4.511). However, our present study was more comprehensive, as it also included stage IV patients. Moreover, the predictive capacities of the factors, including the five-DEG signature, age, pathologic stage, and recurrence, were independent of each other.

There are some limitations in the present study. First, we only developed the model and have not yet applied it in actual clinical work. Second, the mechanisms and functions of these genes in TNBC need to be further elucidated.

Conclusions

The present study identified a TNBC prognostic prediction model combining a five-DEG signature with clinical factors, which will benefit future research and clinical therapies.

Supporting information

S1 Table. The clinical information statistics.

The clinical information of overall survival time, age, stage, radiation, targeted-therapy, and recurrence were provided.

https://doi.org/10.1371/journal.pone.0260811.s002

(XLSX)

S2 Table. The list of differentially expressed genes.

The thresholds were set as false discovery rate (FDR) < 0.05 and |log2 Fold Change| > 1.

https://doi.org/10.1371/journal.pone.0260811.s003

(XLSX)

S3 Table. The list of differentially expressed genes retained for signature identification.

https://doi.org/10.1371/journal.pone.0260811.s004

(XLSX)

S4 Table. The list of differentially expressed genes selected through univariate Cox regression analysis.

https://doi.org/10.1371/journal.pone.0260811.s005

(XLSX)

S5 Table. The list of differentially expressed genes involved in the model construction.

https://doi.org/10.1371/journal.pone.0260811.s006

(XLSX)

Acknowledgments

The results shown here are based on the analysis of data sets from TCGA (https://gdc-portal.nci.nih.gov/), METABRIC (http://molonc.bccrc.ca/), AmiGO 2 (http://amigo.geneontology.org/amigo), and KEGG (https://www.kegg.jp/).

References

  1. 1. Agrawal LS, Mayer IA. Platinum agents in the treatment of early-stage triple-negative breast cancer: is it time to change practice? Clinical advances in hematology oncology: H. 2014;12(10):654–8. pmid:25658890
  2. 2. Sistigu A, Yamazaki T, Vacchelli E, Chaba K, Enot DP, Adam J, et al. Cancer cell–autonomous contribution of type I interferon signaling to the efficacy of chemotherapy. Nature medicine. 2014;20(11):1301. pmid:25344738
  3. 3. Loi S, Pommey S, Haibe-Kains B, Beavis PA, Darcy PK, Smyth MJ, et al. CD73 promotes anthracycline resistance and poor prognosis in triple negative breast cancer. Proceedings of the National Academy of Sciences. 2013;110(27):11091–6. pmid:23776241
  4. 4. Wahba HA, El-Hadaad HA. Current approaches in treatment of triple-negative breast cancer. Cancer Biology Medicine. 2015;000(002):106–16. pmid:26175926
  5. 5. Nalley C. Metastatic TNBC Patients Could Benefit From Immunotherapy. Oncology Times. 2017;39:26–7.
  6. 6. Grima D, Marshall D, Weinstein M, Wong J, Kleinman S, Aubuchon J. Immunotherapy Slows TNBC Progression. Cancer Discovery. 2015;5(6):570.
  7. 7. Rugo Hope S. The Role of Immunotherapy in the Treatment of Triple Negative Breast Cancer (TNBC). Breast. 2017;36:29–30.
  8. 8. Jia H, Truica CI, Wang B, Wang Y, Yang JM. Immunotherapy for Triple-Negative Breast Cancer: Existing Challenges and Exciting Prospects. Drug Resist Updat. 2017;32:S1368764617300316. pmid:29145974
  9. 9. Achkar T, Tarhini AA. The use of immunotherapy in the treatment of melanoma. J Hematol Oncol. 2017;10(1):88. pmid:28434398
  10. 10. Suresh K, Naidoo J, Lin CT, Danoff S. Immune checkpoint immunotherapy for non-small cell lung cancer: benefits and pulmonary toxicities. Chest. 2018;154(6):1416–23. pmid:30189190
  11. 11. Soare GR, Soare CA. Immunotherapy for Breast Cancer: First FDA Approved Regimen. Discoveries. 2019;7(1). pmid:32309609
  12. 12. Dieterich LC, Bikfalvi A, editors. The tumor organismal environment: Role in tumor development and cancer immunotherapy. Seminars in Cancer Biology; 2019: Elsevier.
  13. 13. Twelves C, Jove M, Gombos A, Awada A. Cytotoxic chemotherapy: still the mainstay of clinical practice for all subtypes metastatic breast cancer. Crit Rev Oncol hemat. 2016;100:74–87. pmid:26857987
  14. Ho AY, Barker CA, Arnold BB, Powell SN, Hu ZI, Gucalp A, et al. A phase 2 clinical trial assessing the efficacy and safety of pembrolizumab and radiotherapy in patients with metastatic triple‐negative breast cancer. Cancer. 2020;126(4):850–60. pmid:31747077
  15. Schmid P, Rugo HS, Adams S, Schneeweiss A, Barrios CH, Iwata H, et al. Atezolizumab plus nab-paclitaxel as first-line treatment for unresectable, locally advanced or metastatic triple-negative breast cancer (IMpassion130): updated efficacy results from a randomised, double-blind, placebo-controlled, phase 3 trial. Lancet Oncology. 2020;21(1):44–59. pmid:31786121
  16. Loibl S, Untch M, Burchardi N, Huober J, Sinn B, Blohmer J-U, et al. A randomised phase II study investigating durvalumab in addition to an anthracycline taxane-based neoadjuvant therapy in early triple-negative breast cancer: clinical results and biomarker analysis of GeparNuevo study. Annals of Oncology. 2019;30(8):1279–88. pmid:31095287
  17. Bianchini G, Qi Y, Alvarez RH, Iwamoto T, Coutant C, Ibrahim NK, et al. Molecular anatomy of breast cancer stroma and its prognostic value in estrogen receptor-positive and -negative cancers. Journal of Clinical Oncology. 2010;28(28):4316–23. pmid:20805453
  18. Karn T, Pusztai L, Rody A, Holtrich U, Becker S. The influence of host factors on the prognosis of breast cancer: stroma and immune cell components as cancer biomarkers. Current Cancer Drug Targets. 2015;15(8):652–64. pmid:26452382
  19. Sotiriou C, Wirapati P, Loi S, Harris A, Fox S, Smeds J, et al. Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. Journal of the National Cancer Institute. 2006;98(4):262–72. pmid:16478745
  20. Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, et al. A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. New England Journal of Medicine. 2004;351(27):2817–26. pmid:15591335
  21. Van De Vijver MJ, He YD, Van’t Veer LJ, Dai H, Hart AA, Voskuil DW, et al. A gene-expression signature as a predictor of survival in breast cancer. New England Journal of Medicine. 2002;347(25):1999–2009. pmid:12490681
  22. Zhang C, Zhang X, Zhao W, Zeng C, Li W, Li B, et al. Chemotherapy drugs derived nanoparticles encapsulating mRNA encoding tumor suppressor proteins to treat triple-negative breast cancer. Nano Research. 2019;12(4):855–61. pmid:31737223
  23. Wang Q, Li C, Tang P, Ji R, Chen S, Wen J. A minimal lncRNA-mRNA signature predicts sensitivity to neoadjuvant chemotherapy in triple-negative breast cancer. Cellular Physiology and Biochemistry. 2018;48(6):2539–48. pmid:30121642
  24. Liu L, Wang Y, Miao L, Liu Q, Musetti S, Li J, et al. Combination immunotherapy of MUC1 mRNA nano-vaccine and CTLA-4 blockade effectively inhibits growth of triple negative breast cancer. Molecular Therapy. 2018;26(1):45–55. pmid:29258739
  25. Katsuta E, Yan L, Takeshita T, McDonald K-A, Dasgupta S, Opyrchal M, et al. High MYC mRNA Expression Is More Clinically Relevant than MYC DNA Amplification in Triple-Negative Breast Cancer. International Journal of Molecular Sciences. 2020;21(1):217.
  26. Jiang Y-Z, Liu Y-R, Xu X-E, Jin X, Hu X, Yu K-D, et al. Transcriptome analysis of triple-negative breast cancer reveals an integrated mRNA-lncRNA signature with predictive and prognostic value. Cancer Research. 2016;76(8):2105–14. pmid:26921339
  27. Gong W, Liu Y, Preis S, Geng X, Petit-Courty A, Kiechle M, et al. Prognostic value of kallikrein-related peptidase 12 (KLK12) mRNA expression in triple-negative breast cancer patients. Molecular Medicine. 2020;26(1):19. pmid:32028882
  28. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research. 2015;43(7):e47. pmid:25605792
  29. Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences. 1998;95(25):14863–8. pmid:9843981
  30. Wang L, Cao C, Ma Q, Zeng Q, Wang H, Cheng Z, et al. RNA-seq analyses of multiple meristems of soybean: novel and alternative transcripts, evolutionary and functional implications. BMC Plant Biology. 2014;14(1):169. pmid:24939556
  31. Lee JW, Lee JB, Park M, Song SH. An extensive comparison of recent classification tools applied to microarray data. Computational Statistics & Data Analysis. 2005;48(4):869–85.
  32. Zhu J, Hastie T. Classification of gene microarrays by penalized logistic regression. Biostatistics. 2004;5(3):427–43. pmid:15208204
  33. Anderson WI, Schlafer DH, Vesely KR. Thyroid follicular carcinoma with pulmonary metastases in a beaver (Castor canadensis). Journal of Wildlife Diseases. 1989;25(4):599–600. pmid:2810561
  34. Eng KH, Schiller E, Morrell K. On representing the prognostic value of continuous gene expression biomarkers with the restricted mean survival curve. Oncotarget. 2015;6(34):36308–18. pmid:26486086
  35. Lee KK, Kim JY, Jung JH, Park JY, Park HY. Clinicopathological feature and recurrence pattern of triple negative breast cancer. J Korean Surg Soc. 2010;79(1):14–9.
  36. Qiu J, Xue X, Hu C, Xu H, Kou D, Li R, et al. Comparison of clinicopathological features and prognosis in triple-negative and non-triple negative breast cancer. J Cancer. 2016;7(2):167–73. pmid:26819640
  37. Gougis P, Carton M, Tchokothe C, Campone M, Dalenc F, Mailliez A, et al. CinéBreast-factors influencing the time to first metastatic recurrence in breast cancer: Analysis of real-life data from the French ESME MBC database. The Breast. 2020;49:17–24. pmid:31675683
  38. Balkenhol MC, Vreuls W, Wauters CA, Mol SJ, van der Laak JA, Bult P. Histological subtypes in triple negative breast cancer are associated with specific information on survival. Ann Diagn Pathol. 2020;46:151490. pmid:32179443
  39. Cui Y, Chen W, Chi J, Wang L, editors. Differential expression network analysis for diabetes mellitus type 2 based on expressed level of islet cells. Annales d’endocrinologie; 2016: Elsevier.
  40. Loibl S, Sinn B, Karn T, Untch M, Treue D, Sinn H, et al. Abstract PD2-07: mRNA signatures predict response to durvalumab therapy in triple negative breast cancer (TNBC)–Results of the translational biomarker programme of the neoadjuvant double-blind placebo controlled GeparNuevo trial. AACR; 2019.
  41. Ren Y, Jiang Y, Zuo W, Xu X, Jin X, Ma D, et al. Abstract P2-08-33: A novel seven-gene signature predicts prognosis in early-stage triple-negative breast cancer. AACR; 2019.
  42. Yan P, Tang L, Liu L, Tu G. Identification of candidate RNA signatures in triple‑negative breast cancer by the construction of a competing endogenous RNA network with integrative analyses of Gene Expression Omnibus and The Cancer Genome Atlas data. Oncology Letters. 2020;19(3):1915–27. pmid:32194687
  43. Liedtke C, Rody A, Gluz O, Baumann K, Beyer D, Kohls E-B, et al. The prognostic impact of age in different molecular subtypes of breast cancer. Breast Cancer Research and Treatment. 2015;152(3):667–73. pmid:26195120
  44. He J, Peng R, Yuan Z, Wang S, Peng J, Lin G, et al. Prognostic value of androgen receptor expression in operable triple-negative breast cancer: a retrospective analysis based on a tissue microarray. Medical Oncology. 2012;29(2):406–10. pmid:21264529