Reclassification of breast cancer: Towards improved diagnosis and outcome

Alexander P. Landry; Zsolt Zador; Rashida Haq; Michael D. Cusimano

doi:10.1371/journal.pone.0217036

Abstract

Background

The subtyping of breast cancer based on features of tumour biology such as hormonal receptor and HER2 status has led to increasingly patient-specific treatment and thus improved outcomes. However, such subgroups may not be sufficiently informed to best predict outcome and/or treatment response. The incorporation of multi-modal data may identify unexpected and actionable subgroups to enhance disease understanding and improve outcomes.

Methods

This retrospective cross-sectional study used the cancer registry Surveillance, Epidemiology and End Results (SEER), which represents 28% of the U.S. population. We included adult female patients diagnosed with breast cancer in 2010. Latent class analysis (LCA), a data-driven technique, was used to identify clinically homogeneous subgroups (“endophenotypes”) of breast cancer from receptor status (hormonal receptor and HER2), clinical, and demographic data and each subgroup was explored using Bayesian networks.

Results

Included were 44,346 patients, 1257 (3%) of whom had distant organ metastases at diagnosis. Four endophenotypes were identified with LCA: 1) “Favourable biology” had entirely local disease with favourable biology, 2) “HGHR-” had the highest incidence of HR- receptor status and highest grade but few metastases and relatively good outcomes, 3) “HR+ bone” had isolated bone metastases and uniform receptor status (HR+/HER2-), and 4) “Distant organ spread” had high metastatic burden and poor survival. Bayesian networks revealed clinically intuitive interactions between patient and disease features.

Conclusions

We have identified four distinct subgroups of breast cancer using LCA, including one unexpected group with good outcomes despite having the highest average histologic grade and rate of HR- tumours. Deeper understanding of subgroup characteristics can allow us to 1) identify actionable group properties relating to disease biology and patient features and 2) develop group-specific diagnostics and treatments.

Citation: Landry AP, Zador Z, Haq R, Cusimano MD (2019) Reclassification of breast cancer: Towards improved diagnosis and outcome. PLoS ONE 14(5): e0217036. https://doi.org/10.1371/journal.pone.0217036

Editor: Aamir Ahmad, University of South Alabama Mitchell Cancer Institute, UNITED STATES

Received: December 18, 2018; Accepted: May 2, 2019; Published: May 22, 2019

Copyright: © 2019 Landry et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All data was obtained, and is publicly available, from the open cancer repository Surveillance, Epidemiology, and End Results (SEER). A Research Data Agreement form is required for data access. More information is available at: https://seer.cancer.gov/data/access.html

Funding: ZZ acknowledges additional support from the National Institute for Health Research through an Academic Clinical Lectureship, award number: CL-2014-06-004. No funding bodies had any role in study design, in the collection, analysis, or interpretation of data, nor in the writing of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Background

Significant research efforts have been undertaken to identify markers of breast cancer biology and prognosis. Gene expression profiling has elucidated five “intrinsic” subgroups of the disease: luminal A, luminal B, HER2-overexpressing, basal-like, and normal-like[1,2]. Conveniently, hormone receptors (HR; estrogen and progesterone) and human epidermal growth factor receptor 2 (HER2) function as classification biomarkers; their status can approximate intrinsic subtypes without necessitating cumbersome genetic profiles (luminal and normal-like are HR+, HER2-overexpressing is HR-/HER2+, and basal-like is HR-/HER2-)[3,4]. The HR-/HER2- subtype, also called “triple-negative” (TN) disease since it lacks all three receptors, is associated with the worst prognosis[2,5]. Importantly, the molecular classification of breast cancer has led to the development of targeted therapies which have improved survival in primary disease[6]. However, while HER2-targeted treatments also show promise in the setting of metastasis[7], outcomes in metastatic disease have not improved significantly over the last several years[7,8] and only about 30% of stage 4 breast cancers are HER2+[4] and thus amenable to this therapy. Further, the effectiveness of HER2 treatment in the setting of brain metastasis is limited by blood brain barrier (BBB) impermeability[9]. Novel classification systems have the potential to improve our understanding of metastatic breast cancer and further inform the clinical management of this challenging disease state.

The clinical course and prognosis of breast cancer are highly variable[10,11]. Identifying mutually exclusive subgroups of patients within a heterogeneous patient population is called endophenotype discovery[12–14], and has translated into practice-changing recommendations in the management of asthma[12], ARDS[13] and sepsis[14]. Latent class analysis (LCA) has been used for this purpose in the clinical setting[13,15,16], and is preferred over conventional clustering[17]. This technique is based on the assumption that there exists an unobserved categorical variable, inferable from a set of observed variables, which groups subjects into “latent” classes based on individual similarities[18]. The technique of endophenotype discovery can ultimately inform subgroup-specific treatment plans (e.g. identifying tumour-specific drugs), trial design (e.g. testing prophylactic irradiation strategies for high risk groups) and prognostication. Bayesian Network Analysis (BNA), a machine learning based method for computing and visualizing probabilistic associations within a dataset[19], is useful for further data exploration and hypothesis generation, and has been applied to hepatocellular carcinoma[20], lung cancer[21], traumatic brain injury[19], and subarachnoid hemorrhage[22]. Network output is visualized with a directed acyclic graph (DAG), wherein probabilistic associations between nodes (parameters) are indicated by edges (arrows). The utility of these novel approaches in breast cancer has not been adequately explored, and it may provide further insight into this heterogeneous disease.

The purpose of this work is to identify novel subgroups of breast cancer using a large cancer repository, and explore groups for whom current classification systems are inadequate. This will serve as an important guide for future work aimed at constructing deeper, more informative endophenotypes of this disease.

Methods

Data collection

Data was extracted from the 2016 edition of the open-source cancer registry Surveillance, Epidemiology and End Results (SEER). SEER encodes data on individual cancer cases in the USA from 18 population-based registries and represents approximately 28% of the country’s population[23]. Sites of distant metastases and hormone receptor/HER2 statuses of primary breast tumours have been recorded since 2010. Consequently, we limited our study to cases diagnosed in 2010 to maximize follow up and minimize time-related cohort heterogeneity. Demographic information for each case included age (<50 vs. 50–69 vs. ≥70 years old)[4], race (white vs. black vs. other), marital status (married vs. unmarried) and health insurance status (insured vs. uninsured). Disease biology at diagnosis was described by estrogen and progesterone receptor status (present vs. absent vs. borderline), HER2 status (present vs. absent), and grade (1–4) of the primary tumour. Primary tumour size (<2 cm vs. 2–4.9 cm vs. ≥5 cm)[4] as well as lymph node and distant organ (brain, liver, lung, bone) metastases (present vs. absent) served to describe disease burden at the time of diagnosis. Finally, we assessed overall survival based on the most recent documented follow-up (dead vs. alive). Following a recent SEER-based study from Gong et al[4], estrogen receptor (ER) and progesterone receptor (PR) status coded as “borderline” in the database were defined as positive in our study.

Inclusion/Exclusion criteria

This study included female patients from the 2016 SEER registry who were diagnosed with breast cancer during the year 2010. All included patients were at least 18 years old at the time of diagnosis. We required cases to have been diagnosed by histology and excluded those with a diagnosis made at autopsy or based on a death certificate alone[4]. We also excluded cases if any included parameters (defined in the previous subsection) were coded as “unknown” in the database. Similarly, cases with “borderline” HER2 status were not included in the analysis[4].

Endophenotype discovery and hypothesis generation

Comparison was first made between patients with and without distant organ metastases at diagnosis using contingency tables for each parameter and applying a chi-squared test of independence. Building and testing an LCA model then served as the primary tool for endophenotype discovery. Since these models are best suited for binary classification variables, we used the status of receptors (HR, HER2) and metastases (lymph nodes, brain, lung, liver, and bone) for this purpose. The optimal number of classes was determined by computing the Bayesian Information Criterion (BIC) for models with between 2 and 7 classes and selecting the configuration which minimized the BIC. Contingency tables were generated for each parameter and a chi-square test used to probe inter-class differences. For descriptive purposes, the LCA was repeated 10 times to observe the degree of reproducibility.

Bayesian networks were used to explore variable interactions within each endophenotype and thus further probe the characteristics of each class. A hill-climbing algorithm was used to train the networks[19], and BIC values were used to represent association strengths (with an inverse proportionality). In the DAG representation, arrows with tails at the “survival” node were omitted, as a distal outcome cannot logically influence a patient’s state at diagnosis. Results from the BNA were validated using structural equation modelling (SEM), a statistical tool capable of assessing the fit of a theoretical model on a dataset using inter-related multiple regression models. Root mean squared error of approximation (RMSEA) and standardized root mean squared residual (SRMR), both common measures of absolute model fit (wherein the former includes a penalty for model complexity and the latter does not) were used to assess the Bayesian model configurations. We consider fits to be adequate when both RMSEA and SRMR are < 0.05 (based on recommended cutoffs of 0.06 and 0.08, respectively[24]).

Notably, both LCA and BNA were used to explore potential metastatic “drivers” (factors which predispose to metastasis) and “organ-selectors” (factors which influence the location of a metastasis, without necessarily affecting the probability of one actually occurring). A p-value ≤ 0.001 was considered significant in this study.

Computational platform

All analysis and modelling used R, an open-source platform for statistical computation and graphics[25]. The packages “poLCA”[26], “bnlearn”[27], and “lavaan”[28] were used for LCA, BNA, and SEM, respectively.

Results

Patient characteristics

An overview of patient characteristics is presented in Table 1. The majority are white (81.1%), insured (98.4%) and married (72.1%). Most primary tumours are <2 cm (57.2%) and HR+/HER2- (73.3%). About one third of patients have lymph node metastases at diagnosis (32%) but distant organ metastases are rare, with bone being the most common site (2.1%) and brain being the least common (0.2%). Most patients are alive at the most recent documented follow up (86.3%). In the comparison between those with and without organ metastases at diagnosis, all parameters except age and marital status are significantly different (p<0.001, chi square). Patients presenting with metastasis are more likely to be black, uninsured, and unmarried but these differences are small in absolute terms. The metastasis group also has larger and higher grade primary tumours which are more likely to be HR-, and be associated with greater lymph node burden and comparatively poor survival.

Download:

Table 1. Demographics of the final dataset.

https://doi.org/10.1371/journal.pone.0217036.t001

Endophenotype discovery with latent class analysis

A four-latent-class model was selected based on BIC values, which demonstrate good reproducibility (Table 2). LCA outcome probabilities are shown in Fig 1, and a frequency table of class compositions is presented in Table 3. We observe unambiguous HR/HER2 statuses in classes 1 and 4 (high likelihood of HR+/HER2-) yet greater heterogeneity in classes 2 and 3, a high probability of metastases in group 3, and varying probabilities of lymph node metastasis between groups (Fig 1). In terms of case distribution (Table 3), class 4 accounts for the majority of cases, and is associated with the best survival. Patients are more likely to be older and white and none have lymph node or distant organ metastases at diagnosis. This group also has the smallest, lowest grade primary tumours which are mostly HR+/HER2-. Class 2 is the youngest, with higher grade tumours which are more likely to be HR-/HER2+ or TN. Despite higher grade and HR/HER2 status, tumour size and lymph node involvement are moderate, there are very few distant organ metastases (none to bone; no patients have metastases to more than one site), and survival is quite good. Class 1, which is entirely HR+/HER2-, has an especially high lymph node burden and contains approximately half the total number of bone metastases (without any metastases to other sites). Class 3 has the worst prognosis and accounts for the vast majority of metastatic burden at diagnosis (all patients have distant organ involvement); patients are also more likely to have larger primary tumours and to be black, uninsured, and unmarried. Interestingly, this group has fewer lymph node metastases than class 1 and tumours are lower grade and are less likely to be HR-/HER2+ or TN compared to class 2. Based on these observations, we will refer to these endophenotypes as follows: favourable biology (class 4), HGHR- (class 2), HR+ bone (class 1), and distant organ spread (class 3). In Table 3, there are highly significant differences (p<0.001, chi square) between classes for all parameters. Notably, similar LCA outputs were observed in 9/10 repetitions.

Download:

Fig 1. LCA model output.

A: Probabilistic distribution of patients by latent class. Note that this represents the probability of a patient being classified in each subgroup, and is not equivalent to the distribution of cases in this study (Table 3). B: Outcome probabilities for each class. Each bar represents the probability of a parameter being positive in a particular class. Bars are colour coded based on the following groupings: HR/HER2 subtype (blue), distant organ metastases (orange), and lymph node metastases (green).

https://doi.org/10.1371/journal.pone.0217036.g001

Download:

Table 2. LCA model optimization.

Note that BIC is minimized with 4 classes.

https://doi.org/10.1371/journal.pone.0217036.t002

Download:

Table 3. Latent class compositions.

https://doi.org/10.1371/journal.pone.0217036.t003

Bayesian network analysis

We present the DAG outputs of the BNA in Fig 2. In class 1, age and race are proximal features which are associated with disease-specific parameters (grade, spread) and survival. Class 2 is notable for having tumour size and spread (nodes, liver, lung) as proximal variables influencing HR/HER2 subtype, demographic features and survival. Age and HER2 status are directly associated with survival, though the hormone receptor status has no significant effect. In class 3, demographic features don’t appear to have a significant effect on disease or survival despite the increased proportion of patients who are black, uninsured, and unmarried. Receptor subtype affects survival and HR status influences site of metastasis (bone/lung), though there is no significant association between site of metastasis and mortality. Finally, age directly influences all variables (except HR status) in class 4, and survival is also a direct function of size and HR status.

Download:

Fig 2. DAG plots for each latent class.

Arrow thickness represents the strength of association, which is inversely proportional to BIC. Dotted lines represent associations which are ≤5% of the strength of the strongest association. Node colours are grouped as follows: host features (blue), tumour biology (orange), disease burden (green), and overall survival (grey). Red boxes indicate interactions which were not validated by the SEM regression (p > 0.001). RMSEA = root mean square error of approximation; SRMR = standardized root mean square residual; “surv” = survival; “insur” = insurance.

https://doi.org/10.1371/journal.pone.0217036.g002

Structural equation modelling

A single discrepancy is noted between BNA and SEM in classes 2 and 4 (Fig 2). However, each Bayesian model achieved an adequate fit (with the discrepant associations removed) based on the RMSEA and SRMR values, indicating they are plausible constructs with which to describe this data.

Discussion

Overview of key results

Latent class analysis identified four endophenotypes of breast cancer: favourable biology, with biologically favourable and locally contained tumours; HGHR- (“high grade HR-“), with the highest grade tumours on average and largely HR- receptor subtypes, yet limited disease spread and good survival; HR+ bone, described by HR+/HER2- subtype, lymph node and isolated lung metastases, and good survival; distant organ spread, which represents the vast majority of metastatic burden and death despite less aggressive tumour grade and HR/HER2 status than the HGHR- cohort. Bayesian networks identified differences in variable interactions among classes as well.

The “HGHR-” class has unexpected features

The HGHR- group is of particular importance due to its unexpected features. As the cohort with the highest grade tumours which are predominantly HR-/HER2+ and TN, its histopathology would classically be considered unfavourable. It is therefore surprising that this is not the group with the highest metastatic burden and lowest survival. It seems that while HR/HER2 status may be an “organ-selector” of metastasis it may not be a key “driver” of them; i.e. cannot be used to reliably predict whether or not a metastasis will occur in the first place. This postulate is based on the fact that this endophenotype has so few metastases and that HR status does not significantly drive metastases in this cohort’s BNA. Alternatively, these patients may still develop metastases with time but have prolonged survival nevertheless.

The relationship between hormone receptor and HER2 subtype and preferential site of metastasis is not new, and it is known that HR+ tumours are associated with spread to bone while HR- tumours primarily distribute to liver, lung, or brain[4]. This is corroborated by observing differences between HR+ bone (HR+ only, bone is the only site of metastasis) and HGHR- (mostly HR-, no bone metastases) and by noting the influence of HR status on site of metastasis in the distant organ spread BNA (Fig 2; since all patients have metastases, influence is purely on location). Nevertheless, there is considerable overlap in sites of tumour spread[4] and predictions are largely limited to “bone vs. non-bone”. These observations indicate that HR/HER2 subtype alone is not a sensitive nor specific predictor of metastasis, and clearly suggest a need for further exploration of prediction models.

Social factors may play a role in metastasis and outcome

There appears to be a relationship between socioeconomic status and disease burden/outcome as observed in the distant organ spread endophenotype (Table 3). Additionally, race directly influences tumour grade in the HR+ bone BNA (Fig 2) and the relationship between race and HR (which, in turn, is related to tumour grade) has been suggested to be mediated by socioeconomic factors rather than genetics alone[29]. The relevance of host features is not surprising as it has previously been shown that indicators of socioeconomic status including age, marital status, and occupational status are predictors of breast cancer mortality[4,30,31] and stage at diagnosis[30]. This may be mediated by a lack of access to healthcare, differences in exposures/behaviours, and/or underlying biology. For example, better access to care may have played a role in the HGHR- group’s favourable outcomes. Addressing barriers of care will be critical in this subset of patients in order to prevent early disease spread and consequent poor outcomes.

Limitations

Several limitations of this study must be considered. This study is cross sectional in nature. SEER does not record data beyond the time of diagnosis, making it impossible to track disease evolution (i.e. the development of new metastases with time). This hindered our ability to assess metastatic risk based on a patient’s state at diagnosis. The nature of the dataset, with the rarity of distant organ metastases at diagnosis, may also have limited the statistical power of our results, particularly in the BNA. This limited our abilities to draw conclusions, particularly relating to brain and liver metastases which are associated with particularly dismal outcomes and are therefore very important. In addition, molecular classification available in SEER was based on HER2 and HR status alone, which is not fully concordant with gene expression-derived intrinsic subtypes[2]. Similarly, we were not able to include potentially important parameters such as gene expression data and more extensive host information due to limited data availability. This also precluded the analysis of metastasis biology, allowing the possibility of receptor conversion from the primary tumour[9,32]. Finally, despite its size, SEER may not be fully representative of the US breast cancer population and does not consider populations outside the US. Nevertheless, the purpose of this work was to generate hypotheses using data covering significant segments of the population in order to guide the construction of deeper endophenotypes using other, more complete data sources.

Future directions

Validation of our results by independently detecting similar subgroups based on prospectively gathered clinical, demographic, and molecular data will be an important next step. Subsequently, deriving endophenotypes of breast cancer from comprehensive gene expression profiles, in tandem with richer host phenotypes (i.e. patient characteristics), may lead to more holistic and informative classification models. Inclusion of all previously identified genes of significance in these models should be attempted in order to ensure completeness. Consequent identification of the precise phenotypes which are associated with metastases on a longitudinal basis may improve our understanding of this disease state, allow for better prevention and screening, and ultimately lead to more effective treatments.

Conclusions

Data-driven techniques using receptor (HR and HER2) subtypes, host factors and metastatic potential from the SEER dataset identified novel unexpected endophenotypes of breast cancer and suggest the need for better classification using deeper models. Future work to expand on the endophenotypes we have described using more complete data and longitudinal designs will lead to a better ability to understand, predict, and treat metastases and ultimately improve outcomes in these patients.

References

1. Perou CM, Sørile T, Eisen MB, Van De Rijn M, Jeffrey SS, Ress CA, et al. Molecular portraits of human breast tumours. Nature. 2000;406: 747–752. pmid:10963602
- View Article
- PubMed/NCBI
- Google Scholar
2. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A. 2001;98: 10869–10874. pmid:11553815
- View Article
- PubMed/NCBI
- Google Scholar
3. Dai X, Li T, Bai Z, Yang Y, Liu X, Zhan J, et al. Breast cancer intrinsic subtype classification, clinical use and future trends. Am J Cancer Res. 2015;5: 2929–2943. www.ajcr.us/ISSN:2156-6976/ajcr0014158 pmid:26693050
- View Article
- PubMed/NCBI
- Google Scholar
4. Gong Y, Liu Y-R, Ji P, Hu X, Shao Z-M. Impact of molecular subtypes on metastatic breast cancer patients: a SEER population-based study. Sci Rep. 2017;7: 45411. pmid:28345619
- View Article
- PubMed/NCBI
- Google Scholar
5. Fan C, Oh DS, Wessels L, Weigelt B, Nuyten DSA, Nobel AB, et al. Concordance among Gene-Expression–Based Predictors for Breast Cancer. N Engl J Med. 2006;355: 560–569. pmid:16899776
- View Article
- PubMed/NCBI
- Google Scholar
6. Lim YJ, Lee SW, Choi N, Kwon J, Eom KY, Kang E, et al. Failure patterns according to molecular subtype in patients with invasive breast cancer following postoperative adjuvant radiotherapy: long-term outcomes in contemporary clinical practice. Breast Cancer Res Treat. 2017;163: 555–563. pmid:28315066
- View Article
- PubMed/NCBI
- Google Scholar
7. Gobbini E, Ezzalfani M, Dieras V, Bachelot T, Brain E, Debled M, et al. Time trends of overall survival among metastatic breast cancer patients in the real-life ESME cohort. Eur J Cancer. 2018;96: 17–24. pmid:29660596
- View Article
- PubMed/NCBI
- Google Scholar
8. Dawood S, Broglio K, Gonzalez-Angulo AM, Buzdar AU, Hortobagyi GN, Giordano SH. Trends in survival over the past two decades among white and black patients with newly diagnosed stage IV breast cancer. J Clin Oncol. 2008;26: 4891–8. pmid:18725649
- View Article
- PubMed/NCBI
- Google Scholar
9. Shao MM, Liu J, Vong JS, Niu Y, Germin B, Tang P, et al. A subset of breast cancer predisposes to brain metastasis. Med Mol Morphol. 2011;44: 15–20. pmid:21424932
- View Article
- PubMed/NCBI
- Google Scholar
10. Molnár IA, Molnár BÁ, Vízkeleti L, Fekete K, Tamás J, Deák P, et al. Breast carcinoma subtypes show different patterns of metastatic behavior. Virchows Arch. 2017;470: 275–283. pmid:28101678
- View Article
- PubMed/NCBI
- Google Scholar
11. Nieder C, Spanne O, Mehta MP, Grosu AL, Geinitz H. Presentation, patterns of care, and survival in patients with brain metastases: What has changed in the last 20 years? Cancer. 2011;117: 2505–2512. pmid:24048799
- View Article
- PubMed/NCBI
- Google Scholar
12. Anderson GP. Endotyping asthma: new insights into key pathogenic mechanisms in a complex, heterogeneous disease. Lancet. 2008;372: 1107–1119. pmid:18805339
- View Article
- PubMed/NCBI
- Google Scholar
13. Calfee CS, Delucchi K, Parsons PE, Taylor B, Ware LB, Matthay MA. Latent Class Analysis of ARDS Subphenotypes: Analysis of Data From Two Randomized Controlled Trials. Lancet Respir Med. 2014;2: 611–620. pmid:24853585
- View Article
- PubMed/NCBI
- Google Scholar
14. Scicluna BP, van Vught LA, Zwinderman AH, Wiewel MA, Davenport EE, Burnham KL, et al. Classification of patients with sepsis according to blood genomic endotype: a prospective cohort study. Lancet Respir Med. 2017;5: 816–826. pmid:28864056
- View Article
- PubMed/NCBI
- Google Scholar
15. Depner M, Fuchs O, Genuneit J, Karvonen AM, Hyvärinen A, Kaulek V, et al. Clinical and epidemiologic phenotypes of childhood asthma. Am J Respir Crit Care Med. 2014;189: 129–138. pmid:24283801
- View Article
- PubMed/NCBI
- Google Scholar
16. Ezzedine K, Le Thuaut A, Jouary T, Ballanger F, Taieb A, Bastuji-Garin S. Latent class analysis of a series of 717 patients with vitiligo allows the identification of two clinical subtypes. Pigment Cell Melanoma Res. 2014;27: 134–139. pmid:24127636
- View Article
- PubMed/NCBI
- Google Scholar
17. Schreiber JB, Pekarik AJ. Technical Note: Using Latent Class Analysis versus K-means or Hierarchical Clustering to Understand Museum Visitors. Curator Museum J. 2014;57: 45–59.
- View Article
- Google Scholar
18. Lanza ST, Rhoades BL. Latent Class Analysis: An Alternative Perspective on Subgroup Analysis in Prevention and Treatment. Prev Sci. 2013;14: 157–168. pmid:21318625
- View Article
- PubMed/NCBI
- Google Scholar
19. Zador Z, Sperrin M, King AT. Predictors of outcome in traumatic brain injury: New insight using receiver operating curve indices and Bayesian network analysis. PLoS One. 2016;11: 1–18. pmid:27388421
- View Article
- PubMed/NCBI
- Google Scholar
20. Cai Z, Si S, Chen C, Zhao Y, Ma Y, Wang L, et al. Analysis of Prognostic Factors for Survival after Hepatectomy for Hepatocellular Carcinoma Based on a Bayesian Network. PLoS One. 2015;10: e0120805. pmid:25826337
- View Article
- PubMed/NCBI
- Google Scholar
21. Berkan Sesen M, Nicholson AE, Banares-Alcantara R, Kadir T, Brady M. Bayesian networks for clinical decision support in lung cancer care. PLoS One. 2013;8. pmid:24324773
- View Article
- PubMed/NCBI
- Google Scholar
22. Zador Z, Huang W, Sperrin M, Lawton MT. Multivariable and Bayesian Network Analysis of Outcome Predictors in Acute Aneurysmal Subarachnoid Hemorrhage: Review of a Pure Surgical Series in the Postinternational Subarachnoid Aneurysm Trial Era. Oper Neurosurg. 2017; pmid:28973260
- View Article
- PubMed/NCBI
- Google Scholar
23. Nih . Surveillance epidemiology and end results. 2012;2: 2–4. Available: http://seer.cancer.gov/popdata/download.html
- View Article
- Google Scholar
24. Hu LT, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Struct Equ Model. 1999;6: 1–55.
- View Article
- Google Scholar
25. R Development Core Team. R: A Language and Environment for Statistical Computing. R Found Stat Comput Vienna Austria. 2016;0: {ISBN} 3-900051-07-0. https://doi.org/10.1038/sj.hdy.6800737
26. Linzer DA, Lewis JB. poLCA: An R Package for Polytomous Variable Latent Class Analysis. J Stat Softw. 2011;42: 1–29.
- View Article
- Google Scholar
27. Scutari M. Learning Bayesian Networks with the bnlearn R Package. J Stat Softw. 2010;35: 1–22.
- View Article
- Google Scholar
28. Rosseel Y. lavaan: An R Package for Structural Equation Modeling. J Stat Softw. 2012;48: 1–36.
- View Article
- Google Scholar
29. Gordon NH. Socioeconomic factors and breast cancer in black and white Americans. Cancer Metastasis Rev. 2003;22: 55–65. pmid:12716037
- View Article
- PubMed/NCBI
- Google Scholar
30. Gentil-Brevet J, Colonna M, Danzon A, Grosclaude P, Chaplain G, Velten M, et al. The influence of socio-economic and surveillance characteristics on breast cancer survival: A French population-based study. Br J Cancer. 2008;98: 217–224. pmid:18182980
- View Article
- PubMed/NCBI
- Google Scholar
31. Ren Z, Li Y, Shen T, Hameed O, Siegal GP, Wei S. Prognostic factors in advanced breast cancer: Race and receptor status are significant after development of metastasis. Pathol Res Pract. Elsevier GmbH.; 2016;212: 24–30. pmid:26616114
- View Article
- PubMed/NCBI
- Google Scholar
32. Hoefnagel LDC, van de Vijver MJ, van Slooten HJ, Wesseling P, Wesseling J, Westenend PJ, et al. Receptor conversion in distant breast cancer metastases. Breast Cancer Res. BioMed Central Ltd; 2010;12: R75. pmid:20863372
- View Article
- PubMed/NCBI
- Google Scholar

[ref1] 1. Perou CM, Sørile T, Eisen MB, Van De Rijn M, Jeffrey SS, Ress CA, et al. Molecular portraits of human breast tumours. Nature. 2000;406: 747–752. pmid:10963602
View Article
PubMed/NCBI
Google Scholar

[2] View Article

[3] PubMed/NCBI

[4] Google Scholar

[ref2] 2. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A. 2001;98: 10869–10874. pmid:11553815
View Article
PubMed/NCBI
Google Scholar

[6] View Article

[7] PubMed/NCBI

[8] Google Scholar

[ref3] 3. Dai X, Li T, Bai Z, Yang Y, Liu X, Zhan J, et al. Breast cancer intrinsic subtype classification, clinical use and future trends. Am J Cancer Res. 2015;5: 2929–2943. www.ajcr.us/ISSN:2156-6976/ajcr0014158 pmid:26693050
View Article
PubMed/NCBI
Google Scholar

[10] View Article

[11] PubMed/NCBI

[12] Google Scholar

[ref4] 4. Gong Y, Liu Y-R, Ji P, Hu X, Shao Z-M. Impact of molecular subtypes on metastatic breast cancer patients: a SEER population-based study. Sci Rep. 2017;7: 45411. pmid:28345619
View Article
PubMed/NCBI
Google Scholar

[14] View Article

[15] PubMed/NCBI

[16] Google Scholar

[ref5] 5. Fan C, Oh DS, Wessels L, Weigelt B, Nuyten DSA, Nobel AB, et al. Concordance among Gene-Expression–Based Predictors for Breast Cancer. N Engl J Med. 2006;355: 560–569. pmid:16899776
View Article
PubMed/NCBI
Google Scholar

[18] View Article

[19] PubMed/NCBI

[20] Google Scholar

[ref6] 6. Lim YJ, Lee SW, Choi N, Kwon J, Eom KY, Kang E, et al. Failure patterns according to molecular subtype in patients with invasive breast cancer following postoperative adjuvant radiotherapy: long-term outcomes in contemporary clinical practice. Breast Cancer Res Treat. 2017;163: 555–563. pmid:28315066
View Article
PubMed/NCBI
Google Scholar

[22] View Article

[23] PubMed/NCBI

[24] Google Scholar

[ref7] 7. Gobbini E, Ezzalfani M, Dieras V, Bachelot T, Brain E, Debled M, et al. Time trends of overall survival among metastatic breast cancer patients in the real-life ESME cohort. Eur J Cancer. 2018;96: 17–24. pmid:29660596
View Article
PubMed/NCBI
Google Scholar

[26] View Article

[27] PubMed/NCBI

[28] Google Scholar

[ref8] 8. Dawood S, Broglio K, Gonzalez-Angulo AM, Buzdar AU, Hortobagyi GN, Giordano SH. Trends in survival over the past two decades among white and black patients with newly diagnosed stage IV breast cancer. J Clin Oncol. 2008;26: 4891–8. pmid:18725649
View Article
PubMed/NCBI
Google Scholar

[30] View Article

[31] PubMed/NCBI

[32] Google Scholar

[ref9] 9. Shao MM, Liu J, Vong JS, Niu Y, Germin B, Tang P, et al. A subset of breast cancer predisposes to brain metastasis. Med Mol Morphol. 2011;44: 15–20. pmid:21424932
View Article
PubMed/NCBI
Google Scholar

[34] View Article

[35] PubMed/NCBI

[36] Google Scholar

[ref10] 10. Molnár IA, Molnár BÁ, Vízkeleti L, Fekete K, Tamás J, Deák P, et al. Breast carcinoma subtypes show different patterns of metastatic behavior. Virchows Arch. 2017;470: 275–283. pmid:28101678
View Article
PubMed/NCBI
Google Scholar

[38] View Article

[39] PubMed/NCBI

[40] Google Scholar

[ref11] 11. Nieder C, Spanne O, Mehta MP, Grosu AL, Geinitz H. Presentation, patterns of care, and survival in patients with brain metastases: What has changed in the last 20 years? Cancer. 2011;117: 2505–2512. pmid:24048799
View Article
PubMed/NCBI
Google Scholar

[42] View Article

[43] PubMed/NCBI

[44] Google Scholar

[ref12] 12. Anderson GP. Endotyping asthma: new insights into key pathogenic mechanisms in a complex, heterogeneous disease. Lancet. 2008;372: 1107–1119. pmid:18805339
View Article
PubMed/NCBI
Google Scholar

[46] View Article

[47] PubMed/NCBI

[48] Google Scholar

[ref13] 13. Calfee CS, Delucchi K, Parsons PE, Taylor B, Ware LB, Matthay MA. Latent Class Analysis of ARDS Subphenotypes: Analysis of Data From Two Randomized Controlled Trials. Lancet Respir Med. 2014;2: 611–620. pmid:24853585
View Article
PubMed/NCBI
Google Scholar

[50] View Article

[51] PubMed/NCBI

[52] Google Scholar

[ref14] 14. Scicluna BP, van Vught LA, Zwinderman AH, Wiewel MA, Davenport EE, Burnham KL, et al. Classification of patients with sepsis according to blood genomic endotype: a prospective cohort study. Lancet Respir Med. 2017;5: 816–826. pmid:28864056
View Article
PubMed/NCBI
Google Scholar

[54] View Article

[55] PubMed/NCBI

[56] Google Scholar

[ref15] 15. Depner M, Fuchs O, Genuneit J, Karvonen AM, Hyvärinen A, Kaulek V, et al. Clinical and epidemiologic phenotypes of childhood asthma. Am J Respir Crit Care Med. 2014;189: 129–138. pmid:24283801
View Article
PubMed/NCBI
Google Scholar

[58] View Article

[59] PubMed/NCBI

[60] Google Scholar

[ref16] 16. Ezzedine K, Le Thuaut A, Jouary T, Ballanger F, Taieb A, Bastuji-Garin S. Latent class analysis of a series of 717 patients with vitiligo allows the identification of two clinical subtypes. Pigment Cell Melanoma Res. 2014;27: 134–139. pmid:24127636
View Article
PubMed/NCBI
Google Scholar

[62] View Article

[63] PubMed/NCBI

[64] Google Scholar

[ref17] 17. Schreiber JB, Pekarik AJ. Technical Note: Using Latent Class Analysis versus K-means or Hierarchical Clustering to Understand Museum Visitors. Curator Museum J. 2014;57: 45–59.
View Article
Google Scholar

[66] View Article

[67] Google Scholar

[ref18] 18. Lanza ST, Rhoades BL. Latent Class Analysis: An Alternative Perspective on Subgroup Analysis in Prevention and Treatment. Prev Sci. 2013;14: 157–168. pmid:21318625
View Article
PubMed/NCBI
Google Scholar

[69] View Article

[70] PubMed/NCBI

[71] Google Scholar

[ref19] 19. Zador Z, Sperrin M, King AT. Predictors of outcome in traumatic brain injury: New insight using receiver operating curve indices and Bayesian network analysis. PLoS One. 2016;11: 1–18. pmid:27388421
View Article
PubMed/NCBI
Google Scholar

[73] View Article

[74] PubMed/NCBI

[75] Google Scholar

[ref20] 20. Cai Z, Si S, Chen C, Zhao Y, Ma Y, Wang L, et al. Analysis of Prognostic Factors for Survival after Hepatectomy for Hepatocellular Carcinoma Based on a Bayesian Network. PLoS One. 2015;10: e0120805. pmid:25826337
View Article
PubMed/NCBI
Google Scholar

[77] View Article

[78] PubMed/NCBI

[79] Google Scholar

[ref21] 21. Berkan Sesen M, Nicholson AE, Banares-Alcantara R, Kadir T, Brady M. Bayesian networks for clinical decision support in lung cancer care. PLoS One. 2013;8. pmid:24324773
View Article
PubMed/NCBI
Google Scholar

[81] View Article

[82] PubMed/NCBI

[83] Google Scholar

[ref22] 22. Zador Z, Huang W, Sperrin M, Lawton MT. Multivariable and Bayesian Network Analysis of Outcome Predictors in Acute Aneurysmal Subarachnoid Hemorrhage: Review of a Pure Surgical Series in the Postinternational Subarachnoid Aneurysm Trial Era. Oper Neurosurg. 2017; pmid:28973260
View Article
PubMed/NCBI
Google Scholar

[85] View Article

[86] PubMed/NCBI

[87] Google Scholar

[ref23] 23. Nih . Surveillance epidemiology and end results. 2012;2: 2–4. Available: http://seer.cancer.gov/popdata/download.html
View Article
Google Scholar

[89] View Article

[90] Google Scholar

[ref24] 24. Hu LT, Bentler PM. Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Struct Equ Model. 1999;6: 1–55.
View Article
Google Scholar

[92] View Article

[93] Google Scholar

[ref25] 25. R Development Core Team. R: A Language and Environment for Statistical Computing. R Found Stat Comput Vienna Austria. 2016;0: {ISBN} 3-900051-07-0. https://doi.org/10.1038/sj.hdy.6800737

[ref26] 26. Linzer DA, Lewis JB. poLCA: An R Package for Polytomous Variable Latent Class Analysis. J Stat Softw. 2011;42: 1–29.
View Article
Google Scholar

[96] View Article

[97] Google Scholar

[ref27] 27. Scutari M. Learning Bayesian Networks with the bnlearn R Package. J Stat Softw. 2010;35: 1–22.
View Article
Google Scholar

[99] View Article

[100] Google Scholar

[ref28] 28. Rosseel Y. lavaan: An R Package for Structural Equation Modeling. J Stat Softw. 2012;48: 1–36.
View Article
Google Scholar

[102] View Article

[103] Google Scholar

[ref29] 29. Gordon NH. Socioeconomic factors and breast cancer in black and white Americans. Cancer Metastasis Rev. 2003;22: 55–65. pmid:12716037
View Article
PubMed/NCBI
Google Scholar

[105] View Article

[106] PubMed/NCBI

[107] Google Scholar

[ref30] 30. Gentil-Brevet J, Colonna M, Danzon A, Grosclaude P, Chaplain G, Velten M, et al. The influence of socio-economic and surveillance characteristics on breast cancer survival: A French population-based study. Br J Cancer. 2008;98: 217–224. pmid:18182980
View Article
PubMed/NCBI
Google Scholar

[109] View Article

[110] PubMed/NCBI

[111] Google Scholar

[ref31] 31. Ren Z, Li Y, Shen T, Hameed O, Siegal GP, Wei S. Prognostic factors in advanced breast cancer: Race and receptor status are significant after development of metastasis. Pathol Res Pract. Elsevier GmbH.; 2016;212: 24–30. pmid:26616114
View Article
PubMed/NCBI
Google Scholar

[113] View Article

[114] PubMed/NCBI

[115] Google Scholar

[ref32] 32. Hoefnagel LDC, van de Vijver MJ, van Slooten HJ, Wesseling P, Wesseling J, Westenend PJ, et al. Receptor conversion in distant breast cancer metastases. Breast Cancer Res. BioMed Central Ltd; 2010;12: R75. pmid:20863372
View Article
PubMed/NCBI
Google Scholar

[117] View Article

[118] PubMed/NCBI

[119] Google Scholar

Figures

Abstract

Background

Methods

Results

Conclusions

Background

Methods

Data collection

Inclusion/Exclusion criteria

Endophenotype discovery and hypothesis generation

Computational platform

Results

Patient characteristics

Endophenotype discovery with latent class analysis

Bayesian network analysis

Structural equation modelling

Discussion

Overview of key results

The “HGHR-” class has unexpected features

Social factors may play a role in metastasis and outcome

Limitations

Future directions

Conclusions

References