Skip to main content

ORIGINAL RESEARCH article

Front. Genet., 22 July 2019
Sec. Computational Genomics

Weighted Fused Pathway Graphical Lasso for Joint Estimation of Multiple Gene Networks

Nuosi WuNuosi Wu1Jiang HuangJiang Huang2Xiao-Fei ZhangXiao-Fei Zhang3Le Ou-Yang,*Le Ou-Yang4,5*Shan HeShan He6Zexuan Zhu*Zexuan Zhu2*Weixin XieWeixin Xie1
  • 1College of Electronics and Information Engineering, Shenzhen University, Shenzhen, China
  • 2College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
  • 3School of Mathematics and Statistics, Central China Normal University, Wuhan, China
  • 4Guangdong Key Laboratory of Intelligent Information Processing and Shenzhen Key Laboratory of Media Security, College of Electronics and Information Engineering, Shenzhen University, Shenzhen, China
  • 5Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, China
  • 6School of Computer Science, University of Birmingham, Birmingham, United Kingdom

Gene regulatory networks (GRNs) are often inferred based on Gaussian graphical models that could identify the conditional dependence among genes by estimating the corresponding precision matrix. Classical Gaussian graphical models are usually designed for single network estimation and ignore existing knowledge such as pathway information. Therefore, they can neither make use of the common information shared by multiple networks, nor can they utilize useful prior information to guide the estimation. In this paper, we propose a new weighted fused pathway graphical lasso (WFPGL) to jointly estimate multiple networks by incorporating prior knowledge derived from known pathways and gene interactions. Based on the assumption that two genes are less likely to be connected if they do not participate together in any pathways, a pathway-based constraint is considered in our model. Moreover, we introduce a weighted fused lasso penalty in our model to take into account prior gene interaction data and common information shared by multiple networks. Our model is optimized based on the alternating direction method of multipliers (ADMM). Experiments on synthetic data demonstrate that our method outperforms other five state-of-the-art graphical models. We then apply our model to two real datasets. Hub genes in our identified state-specific networks show some shared and specific patterns, which indicates the efficiency of our model in revealing the underlying mechanisms of complex diseases.

Introduction

Most biological processes within cells involve multiple genes (Schlitt and Brazma, 2007; Zhang et al., 2014). Inferring the regulatory relationships between genes is important for understanding the functional organization within cells and helps to reveal the mechanisms of complex diseases (Liu et al., 2016). In recent years, a large number of works have been proposed for inferring gene regulatory networks (GRNs) from gene expression data (Schlitt and Brazma, 2007; Danaher et al., 2014; Verfaillie et al., 2014; Zhang et al., 2014; Liu et al., 2016; Ou-Yang et al., 2017a, Ou-Yang et al., 2017b). Despite their success in addressing some biological problems, revealing the comprehensive GRNs is still a challenging task (Zhang et al., 2014; Grechkin et al., 2015).

Gaussian graphical model (GGM) is an attractive paradigm to depict the associations among biomolecules (Yuan and Lin, 2007). In GGM, each node of the graph represents a random variable from a random vector subjected to multivariate normal distribution, and there is an edge between two nodes if the corresponding two random variables are conditionally dependent, which means the corresponding element of the precision matrix (or inverse covariance matrix) is non-zero (Dempster, 1972; Uhler, 2017). This property makes GGM so popular because we are able to get the network structure by just estimating the precision matrix (Yuan and Lin, 2007). Unfortunately, in the analysis of gene expression data, the number of samples is usually far less than the dimension of a random vector, which makes it hard to estimate the precision matrix directly (the empirical covariance is not invertible). By assuming that the precision matrix is sparse and the data samples are drawn independently from the same distribution, several approaches have been proposed to estimate the precision matrix (Meinshausen and Bühlmann, 2006; Yuan and Lin, 2007; Banerjee et al., 2008).

However, as biological systems are highly dynamic (Ideker and Krogan, 2012), we are faced with observations collected from different states (Huang et al., 2018a). For example, the gene expression data can be collected from both the diseased and normal tissues (Tian et al., 2016; Ou-Yang et al., 2017a). Thus, if we estimate each state-specific network separately for each sample set, the common structures within different state-specific networks will be ignored. In contrast, inferring a single network from all sample sets may mask their differences. To address this problem, many works have been proposed in recent years to jointly estimate multiple graphical models (Guo et al., 2011; Danaher et al., 2014; Ou-Yang et al., 2017b). When the focus is to infer the differential network between two different states, instead of inferring two state-specific networks, some works have also been developed to estimate the differential network directly (Zhang and Zou, 2014; Yuan et al., 2017).

Although the above methods for learning the structure of multiple GGMs have been successfully used to estimate the regulatory relationships among genes, their performance may be limited since they do not consider the existing knowledge about genes and their regulatory relationships. For example, a pathway is a set of components that interact with each other to perform specific biological tasks. Researches have found that many diseases arose from the joint action of multiple genes within a pathway (Peng et al., 2010). Therefore, pathway-based learning of gene regulatory networks may yield biological insights that are hard to detect by traditional GGMs (Grechkin et al., 2015). Although a pathway graphical lasso model has been proposed to incorporate pathway-based constraints into GGMs (Grechkin et al., 2015), it is designed for single network estimation and cannot jointly estimate multiple GGMs. Moreover, with the accumulation of high-throughput data, we are able to collect some literature-curated gene interactions from public database (Han et al., 2017; Xu et al., 2018). As the state-specific gene regulatory relationships as well as their changes across multiple states are more likely to take place between genes that are known to have interactions, incorporating these prior information may help to identify the changes of GRNs more accurately (Xu et al., 2018).

To address the above problems, in this paper, we proposed a novel weighted fused pathway graphical lasso (WFPGL) to jointly estimate multiple gene networks as well as their difference by incorporating prior knowledge derived from known pathways and gene interactions. In particular, given a set of pathways, we first assume that regulatory relationships will not take place between genes that belong to different pathways. Here, the pathway information is assumed to be able to provide us prior knowledge that certain edges are unlikely to be present (Grechkin et al., 2015). Under this assumption, the incorrect links across pathways will be eliminated, as shown in Figure 1. To make use of the prior knowledge from public gene interaction database and draw support from multiple sample sets collected from different states, we introduce a weighted fused lasso penalty in our model. The proposed WFPGL is optimized by alternating direction method of multipliers (ADMM) (Boyd, 2010), and we follow the idea of pathway graphical lasso that decomposes the original problem into pathway-based subproblem to accelerate the optimization. To demonstrate the performance of our new algorithm, we first conduct simulation studies and compare our algorithm with other five state-of-the-art graphical models. Experiment results on synthetic data show that our WFPGL outperforms other related methods. We then apply our WFPGL on two real datasets. The first experiment is to estimate the gene regulatory networks of insulin sensitive and insulin resistant type 2 diabetes patients. Experiment results demonstrate that our method could identify some promising candidate genes related to insulin resistance. The second experiment is to jointly estimate the gene regulatory networks of four breast cancer subtypes. We find that our identified subtype-specific networks have some shared and specific structures, which may help to reveal the mechanisms of cancer differentiation. Overall, the experiment results on synthetic and real data demonstrate that our WFPGL could effectively utilize prior knowledge to jointly estimate multiple gene networks. The datasets and source codes of our proposed WFPGL are freely available at https://github.com/NuosiWu/WFPGL.

FIGURE 1
www.frontiersin.org

Figure 1 Illustration of the proposed method. (A) Graphical lasso just uses gene expression data to separately estimate each state-specific network, leading to incorrect estimation results. (B) The proposed weighted fused pathway graphical lasso jointly estimates multiple state-specific networks by considering the prior knowledge of gene interaction networks and pathways, which could eliminate the spurious links between different pathways and results in more accurate estimation of the networks across multiple states.

The rest of this paper is organized as follows. In the section Weighted Fused Pathway Graphic Lasso, we formulate our WFPGL and describe the optimization processes. In the section Simulation Studies, we illustrate the performance of WFPGL and another five state-of-the-art methods on synthetic datasets. The section Real Data Analysis provides studies on two real datasets, and the section Conclusion discusses the utilization of the method.

Weighted Fused Pathway Graphical Lasso

Problem Formulation

Suppose X (1), …, X ( K ) are K groups of sample sets that measure the gene expression levels of p genes across K different states. Each X ( k ) ∈ Rp × nk may have a different sample size nk, and samples within each group (or state) are independent and follow same multivariate Gaussian distribution. Then maximizing the likelihood functions of multiple Gaussian graphical model is equivalent to solving the following optimization problem (Uhler, 2017):

maximizeΘk=1Knk[log{det(Θ(k))}tr(S(k)Θ(k))],(1)

where S(k) is the sample covariance matrix for X ( k ) and Θ( k ) is the inverse covariance matrix (or precision matrix).

To learn the parameters Θ( k ) (which represent networks) for k = 1, …, K, additional penalty is often required to guarantee structural recovery especially for high-dimensional setting. Fused lasso penalty, which imposes sparse penalties not only on individual networks but also on the differences between each pair of networks, has been proven to be effective on joint estimation of multiple networks (Danaher et al., 2014). However, traditional fused lasso penalty does not take into account prior information. As the changes of GRNs across different states are more likely to take place between genes that are known to interact with each other (Xu et al., 2018), we encourage the identification of differential edges that appear in the known gene interaction network. Thus, given a prior gene interaction network with adjacency matrix G, we introduce the following weighted fused lasso penalty function:

F({Θ(k)})=λ1k=1KijWij|θij(k)|+λ2k<ki,jWij|θij(k)θij(k)|.(2)

Here, λ1 > 0 is a tuning parameter controlling the sparsity of precision matrices, λ2 > 0 is a tuning parameter controlling the sparsity of the differential networks, and Wij is the weight assigned to a pair of genes. In this study, the weight is set as follows:

Wij={1ifGij=0,wifGij=1.(3)

where w ∈ [0, 1] is a predefined parameter. The smaller the value of w, the more likely the corresponding edge will be identified.

As we could make use of pathway information to specify sets of genes that are more likely to work together, incorporating pathway information may help to improve the structure learning of GRNs and achieve more meaningful and interpretable results. Following the ideal of pathway graphical lasso (Grechkin et al., 2015), we constrain the graphical model so that the elements in precision matrices are fixed to 0 if the corresponding gene pairs are not together in any pathways.

Suppose there are J pathways within the p genes (denoted as Pt, t = 1, …, J), and considering the pathway constraints and the knowledge of prior gene interaction network, the objective function of our WFPGL model is as follows:

maximize{Θ(k)}  k=1Knk[log{det(Θ(k))}tr(S(k)Θ(k))]F({Θ(k)})s.t. Θ(k)0,    θij(k)=0,   if (ij)t=1J{(i,j)|i,jPt},

where F({Θ( k )}) takes the form of function (2).

Optimization

Pathway Separable

Given J pathways, let Θt(k)(t=1,,J) denote a sub-matrix of Θ( k ), which models the sub-graph within the pt genes in t-th pathway (p 1 + ⋯ + pJ = p). After some permutations of the rows and columns under the pathway constraints, Θ( k ) can be rearranged into the following form:

Θ(k)=[Θ11(k)Θ12(k)0Θ21(k)Θ22(k)Θ23(k)0Θ32(k)Θ33(k)](5)

where Θ1(k)=[Θ11(k)Θ12(k)Θ21(k)Θ22(k)] contains all edges in the first pathway, and Θother(k)=[Θ22(k)Θ23(k)Θ32(k)Θ33(k)] contains the parameter in the rest of the pathways. Θ22(k) is the overlapping part of Θ1 and Θother, which is a square matrix containing all links among the genes that co-occur in the pathways as well as the remaining pathways. Applying the Schur complement decomposition, we have

det(Θ(k))=det(Θ33k)det(Θ1k1(k)),(6)

where 1(k)=[0;Θ23k](Θ33k)1[0,Θ32(k)]. If we hold the parameters from other pathways fixed and just update the parameters in the first pathway, 1(k) will be a constant matrix, and problem (1) boils down to maximizing the following function:

nk[log{det(Θ1(k))1(k))}tr(S1(k)(Θ1(k)1(k)))],(7)

where the matrix with subscript denotes the sub-matrix corresponding to the first pathway. Notice that (7) is exactly the likelihood function for updating Θ1(k)1(k), and 1(k) is constant and can be pre-calculated. Updating Θ1(k) within a certain pathway via maximizing likelihood function only requires information from the corresponding pathway itself. We define a function with such property to be pathway separable.

Optimization Work Flow

Problem (4) is converted into the following equivalent problem:

minimize{Θ(k)},{Z(k)}  k=1Knk[log{det(Θ(k))}tr(S(k)Θ(k))]+F({Z(k)}),    s.t. Z(k)=Θ(k),k=1,2,,K,(8)

which can be solved with the ADMM method (Boyd, 2010) by minimizing the following augmented Lagrangian function:

k=1Knk[log{det(Θ(k))}tr(S(k)Θ(k))]+(F{Z(k)})  +k=1K(μ2||Θ(k)Z(k)||F2+Y(k),ΘkZ(k)),(9)

where <A, B> = tr (ABT),‖A‖ denotes the Frobenius norm of matrix A, μ > 0 is a penalty parameter, and Y ( k ) is a Lagrangian multiplier. We let V ( k ) = Y ( k )/μ, and (9) can be rewritten as the ultimate objective function:

Lμ({Θ(k)},{Z(k)},V(k)})=k=1Knk[log{det(Θ(k))}tr(S(k)Θ(k))]+F({Z(k)})+k=1K(μ2||Θ(k)Z(k)+V(k)||F2μ2||V(k)||F2).

www.frontiersin.org

ADMM solves above problem by three steps in every iteration, as shown in Algorithm 1.

ADMM Solver

It is easy to verify that ||Θ(k)Z(k)+V(k)||F2 is pathway separable. This allows us to use a block-coordinate descent approach for accelerating the updating of Θ( k ) (Tseng, 2001; Grechkin et al., 2015). By updating Θt(k) in each pathway individually while leaving the parameters in other pathway fixed, the complexity of the problem is narrowed down, and the pathway constraints on the precision matrix naturally meet since the parameters outside the pathway remain 0. Specifically, let Tt(k)=Θt(k)t(k), and we update Θt(k) by solving the subproblem, as follows:

minimize{Θt(k)}  nk[log{det(Tt(k))}tr(St(k)Tt(k))]+μ2||Tt(k)+t(k)Zt(k)+Vt(k)||F2.

Here, the subscript t of a matrix denotes the index of the sub-matrix that corresponds to the t-th pathway. Grechkin et al. (2015) have introduced an efficient message passing algorithm to calculate t(k) efficiently. The solution to this problem is UCUT, where UDUT represents the eigendecomposition of St(k)+μ/nk(t(k)Zt(k)+Vt(k)), and C is a diagonal matrix with i-th diagonal element to be nk2μ{Dii+(Dii2+4μ/nk)1/2}

The second step is to update Z ( k ) as the minimizer of

k=1K(μ2||Θ(k)Z(k)+V(k)||F2)+λ1k=1KijWij|zij(k)|+λ2k<ki,jWij|zij(k)zij(k)|).(12)

The solution to this problem can be obtained by the method introduced in Hoefling (2010). The last step for each iteration is to update V( k ) as V( k ) + Θ( k )Z ( k ). We stop the algorithm until k||Θ(i)(k)Θ(i1)(k)||F2/k||Θ(i1)(k)||F2<105

Simulation Studies

We compare the performance of WFPGL with that of five state-of-the-art graphical models: 1) graphical lasso (GL) (Meinshausen and Bühlmann, 2006), which is a classical algorithm for precision matrix estimation; 2) pathway graphical lasso (PGL) (Grechkin et al., 2015), which is a framework that uses pathway knowledge to estimate single Gaussian graphical model; 3) fused graphical lasso (FGL) (Danaher et al., 2014), which is a method for joint estimation of multiple precision matrices across multiple states; 4) differential network estimation via D-trace loss (Dtrace) (Yuan et al., 2017), which is a method for direct estimation of a differential network between two states; and 5) weighted D-trace loss (WDtrace) (Xu et al., 2018), which is an algorithm proposed for inferring differential network rewiring by integrating static gene regulatory network information. We implement GL and FGL with their R packages and perform PGL and WFPGL in python environment and using Matlab to carry out Dtrace and WDtrace.

WFPGL has three tuning parameters, i.e., the sparsity controllers λ1 and λ2 and the weight of prior information w. For w, as suggested by Xu et al. (2018), we set w = 0.3 in the following experiments.

To evaluate the performance of various, we adopt four evaluation metrics named true-positive rate (TPR), false-positive rate (FPR), true-positive differential rate (TPDR), and false-positive differential rate (FPDR). Let θij and θ^ij denote the elements in true precision matrix and the estimated precision matrix, respectively. The definitions of these four metrics are as follows:

TPR=k=1Ki<jI{θ^ij(k)0 and θij(k)0}k=1Ki<jI{θij(k)0},FPR=k=1Ki<jI{θ^ij(k)0 and θij(k)=0}k=1Ki<jI{θij(k)=0},TPDR=k<kk<jI{θ^ij(k)θ^ij(k) and θij(k)θij(k)}k<ki<jI{θij(k)θij(k)},FPDR=k<ki<jI{θ^ij(k)θ^ij(k) and θij(k)=θij(k)}k<ki<jI{θij(k)=θij(k)},

According to the above definitions, TPR and FPR measure the accuracy of network estimation, whereas TPDR and FPDR measure the accuracy of differential network estimation.

Experiments on Two Groups of Samples

In this section, we first consider the situation when there are two groups of samples corresponding to two different states. The main generation procedure of the synthetic datasets is described as follows.

STEP 1: PATHWAY DEFINITION

For ease of simulation study, we simply put successive features into one pathway and let the intersection of two pathways to be non-empty set only when the pathways are “neighbors.” We create 10 pathways with same size that covers all 400 features with n ol = 5 features overlapped in neighbor pathway. Let Pt (t = 1, …, 10) represent a pathway set in which the element indicates the index of feature. In such configuration, pathways are generated as P 1 = {1, 2, …, 45}, P 2 = (41, 42, …, 85}, …, P 10 = (361, 362, …, 400}.

STEP 2: NETWORK CONSTRUCTION

We first build a random scale-free network for the state 1, denoted by its adjacency binary matrix M (1). M˜(1) is a copy of M (1) with each non-zero element substituted by a uniform distribution value on [−0.6, −0.3] ∪ [0.3, 0.6]. Then M˜(2) is generated by the copy of M˜(1), with about r = 30% non-zero elements vanished.

STEP 3: PRIOR NETWORK GENERATION

We select a proportion (ㅰη) of edges from M (1) and connect the corresponding features in prior gene interaction network G.

STEP 4: PRECISION MATRIX CALCULATION

To ensure the positive definiteness of the covariance matrix, we get the real precision matrix Θ˜(k) as

Θ˜(k)=M˜(k)Q+σI,

where ⚬ is Hadamard product operator, Q is a p × p binary “pathway network” matrix in which the “1” element indicates the corresponding feature pairs that co-occur in any of the pathways, σ is the absolute value of the minimum eigenvalues of (M˜(k)Q) for k = 1, 2, and I is identity matrix.

After the precision matrix Θ˜(k) for each state is settled in Step 4, the synthetic gene expression data could be generated with zero means and Σ=Θ˜1.

We generate 50 random two-state datasets with the same setting: nk = 100 samples for both two states, 10 pathways with an overlapping number n ol = 5 covering all p = 400 genes as introduced in Step 1.

As defined in Step 3 of dataset generation, prior rate η controls the proportion of true edges that is covered by prior gene interaction network. Note that the edges in the prior gene interaction network are not necessarily differential edges. We set η to be 0, 0.4, and 0.8 to see its impact on the performance of WFPGL.

Figure 2 presents the average performance of various methods over 50 random generations of data with different values of parameters. In particular, for GL, PGL, Dtrace, and WDtrace that have one sparsity-controlling parameter, we vary the value of the parameter and show their performance. For FGL and WFPGL that have two tuning parameters, i.e., λ1 and λ2, which control the sparsity of individual networks and their difference, respectively, we fix the value of λ2 and show their performance with the value of λ1 varied. The performance of different methods is shown in distinct colors. Dtrace and WDtrace do not have TPR and FPR, since these two methods directly estimate the differential network between two states and do not predict the individual networks. For GL, PGL, Dtrace, and WDtrace, each curve depicts the performance from a wide range of the sparsity-controlling parameter. For FGL and WFPGL, we pick up three curves here to show their performance. Each curve presents the results with λ2 being fixed to a certain value: solid line for λ2 = 0.0001, dashed line for λ2 = 0.001, and dotted line for λ2 = 0.01.

FIGURE 2
www.frontiersin.org

Figure 2 The experiment results of various methods on two groups of samples, with the value of η changing at (A) η = 0, (B) η = 0.4, and (C) η = 0.8. The performance of various methods on individual network estimation [with respect to true-positive rate (TPR) and false-positive rate (FPR)] is shown on the left side, while the performance of various methods on differential network estimation [with respect to true-positive differential rate (TPDR) and false-positive differential rate (FPDR)] is shown on the right side. For weighted fused pathway graphical lasso (WFPGL) and fused graphical lasso (FGL), different line styles correspond to different choices of λ2: solid line for λ2 = 0.0001, dashed line for λ2 = 0.001, and dotted line for λ2 = 0.01.

The plots on the left side of Figure 2 show the performance of various methods for individual network estimation. We can find that pathway-based methods WFPGL and PGL dominate other methods that do not use pathway information. Joint estimation methods outperform single network estimation methods (WFPGL performs better than PGL, and FGL performs better than GL). This may be because the joint estimation methods can draw support from multiple sample sets to achieve a more accurate estimation. Though λ2 controls the sparsity of differential networks, it also affects the accuracy of individual network estimation with larger values result in a better prediction.

The plots on the right side of Figure 2 show the results of differential network prediction. WFPGL also outperforms other methods, since it uses both two views of prior knowledge, followed by PGL that only uses pathway information and WDtrace that only uses prior gene interaction network knowledge. GL, FGL, and Dtrace, which do not use any prior knowledge, cannot identify differential networks accurately. There is a distinct improvement on the performance of WFPGL and WDtrace when the value of η increases, which indicates the effectiveness of using prior gene interactions.

To evaluate the performance of various methods without prior network information or prior pathway information, we show the results of various methods without prior network information (i.e., η = 0) in Figure 2, and we show the results of various methods without prior pathway information in the Supplementary Material (Figure S1). In fact, when there is no prior network information, our method degenerates to a fused pathway graphical lasso model. When there is no prior pathway information, our method degenerates to a weighted fused graphical lasso model. We can find from Figure 2 and Figure S1 that when there is no prior network or prior pathway information, our method could still achieve competitive performance with other comparative methods.

Experiments on Three and Four Groups of Samples

We generate three groups of samples (corresponding to three different states) with p = 400 features and n = 50 samples per group, and we assume that there are eight pathways covering all features with the overlapping nol = 10. The first two networks are generated in the same way as the above section, except that only r = 10% edges are deleted from state 1 to form the network of state 2 (Step 2). The precision matrix for state 3 is copied from state 2, with the removed elements in state 2 being reassigned by uniform distributed values [−0.6, −0.3] ∪ [0.3, 0.6] with the possibility of 0.5.

To further test the robustness of WFPGL, we generated a dataset with four different states. In particular, we generated four groups of samples with p = 200 features and n = 50 samples per group, covered by five pathways with nov = 8. The first three networks are generated the same way as the above section except that the different rate is set to be r = 20%. Then we randomly deleted 20% non-zero elements from state 1 to build the network of state 4.

Similarly, 50 random datasets are generated for the above two situations, and Figure 3 shows the averaged results with η = 0.8. Because Dtrace and WDtrace cannot handle the estimation of multiple differential networks, we compare WFPGL with GL, PGL, and FGL. WFPGL still outperforms others on both individual and differential network estimations, followed by PGL, which uses pathway information only. FGL and GL do not perform well on these datasets, though FGL performs slightly better than GL on individual network estimation.

FIGURE 3
www.frontiersin.org

Figure 3 The experiment results of various methods on multiple groups of samples: (A) Dataset 1 for three states. (B) Dataset 2 for four states. The performance of various methods on individual network estimation (with respect to TPR and FPR) is shown on the left side, while the performance of various methods on differential network estimation (with respect to TPDR and FPDR) is shown on the right side. For WFPGL and FGL, different line styles corresponding to different choices of λ2: solid line for λ2 = 0.0001, dashed line for λ2 = 0.001, and dotted line for λ2 = 0.01.

Real Data Analysis

In this section, we use WFPGL to estimate the gene regulatory networks of type 2 diabetes patients and breast cancer patients, with 8,444 known gene interactions from TRRUST database (Han et al., 2017) as prior network information and pathways collected from Kyoto Encyclopedia of Genes and Genomes (KEGG) database (Kanehisa, 2000) as prior pathway information. The tuning parameters (λ1, λ2) are determined based on the following Akaike information criterion (AIC):

AIC(λ1,λ2)=k=1K[nk tr(S(k)Θ^(k))nklogdet(Θ^(k))+2ne(k)],

where Θ^(k) is the estimated precision matrix by using the tuning parameters λ1 and λ2, and 2ne(k) is the number of non-zero entities in Θ^(k). We select the parameters λ1 and λ2, which obtain the outputs that minimize AIC(λ1, λ2) to generate the final estimation of Θ( k ).

Insulin Resistance in Type 2 Diabetes

We downloaded the gene expression data of type 2 diabetes patients from (Elbein et al., 2012), which contain 31 insulin-resistant (IR) samples and 31 insulin-sensitive (IS) samples. Four relevant pathways are used as prior pathway information, i.e., type 2 diabetes mellitus pathway, Wnt signaling pathway, AMPK signaling pathway, and PI3K–AKT signaling pathway. These four pathways are all implicated in type 2 diabetes. Wnt co-receptor (LRP-5) and the Wnt pathway effector TCF7L2 have been revealed in the development of diabetes (Ip et al., 2012). Targeted drugs have been designed to treat type 2 diabetes by activating AMPK signaling pathway (Jeon, 2016) or PI3K–AKT signaling pathway (Huang et al., 2018b).

We normalize each gene to have 0 mean and 1 standard deviation. By merging rows with the same gene name based on their average, 442 genes are covered by the prior pathway information. According to the experiences in simulation studies, we set w = 0.3 and choose λ1 (ranging from 0.05 to 0.5) and λ2 (ranging from 0.1 to 1) based on AIC. Networks for both IR and IS samples are built, and we calculate their difference to identify the network rewiring associated with insulin resistance.

For real data analysis, due to the lack of ground truth, it is hard to know whether the predicted regulatory relationships are real. As the changes of gene regulatory relationships are often derived by the aberrations of certain genes, we assess the performance of our method by quantifying how hub nodes in our estimated differential networks capture known functionally important genes. Table 1 shows the top 10 hub genes in our predicted differential network between IR and IS patients. Among these genes, overexpression of MYC is verified to prevent insulin resistance (Riu et al., 2003); TP53 is found to be responsible for the formation of tissue-specific insulin resistance (Strycharz et al., 2017). In liver, inactivation of RELA leads to the improvement of insulin sensitivity (Ke et al., 2015); whole-body insulin sensitivity improves when VEGFA is overexpressed in adipose tissue (Fatima et al., 2017); PRKAA2 gene interfering insulin resistance has been observed in the Japanese population (Horikoshi et al., 2006). Although not confirmed, indirect evidence like statistics (Naderpoor et al., 2017) or mouse experiments (Williams et al., 2015) indicate that EGFR and ITGA1 may be good candidates for insulin resistance. Besides, how RAC1 relates to insulin sensitivity after exercise has attracted a lot of attention (Maarbjerg et al., 2011).

TABLE 1
www.frontiersin.org

Table 1 Top 10 nodes with the highest degree in the predicted differential network between insulin-resistant (IR) and insulin-sensitive (IS) patients.

Breast Cancer Subtypes

We consider the gene expression data from The Cancer Genome Atlas (Cancer Genome Atlas Research Network et al., 2013) that measure the expression levels of 17,327 genes in 511 patients with breast cancer. The observations are classified into four subtypes: 95 for basal-like, 58 for HER2 enriched, and 231 and 127 for luminal A and luminal B, respectively. In this study, we consider seven pathways, namely, apoptosis pathway, hedgehog signaling pathway, homologous recombination pathway, notch signaling pathway, TGF-5 signaling pathway, mTOR signaling pathway, and p53 signaling pathway, which cover 360 genes in the gene expression data. All these pathways are related to breast cancer. Strong evidence shows that reduced apoptosis will cause breast tumor growth, and high levels of apoptosis in a breast tumor are likely to predict worse survival (Parton et al., 2001). DNA repair pathways that targeted genes involved in homologous recombination were discovered to be associated with hereditary breast cancer, while almost 40% of familial and sporadic breast cancers are homologous recombination deficient (den Brok et al., 2017). High notch1 expression was found to be associated with not only high-grade tumors but also poor prognosis for breast cancer (Zardawi et al., 2009). Changes in regulators of p53 activity were demonstrated to be predictive of early relapse in breast cancer (Gasco et al., 2002). mTOR pathway is implicated in endocrine resistance in ER-positive tumors, and the targeted drugs may be used to treat brain metastases (Paplomata and O’Regan, 2014). Dysregulation of hedgehog and TGF-5 signaling pathway have been identified in the development and progression of breast cancer (Habib and O’Shaughnessy, 2016; Skoda et al., 2018).

We build four state-specific networks corresponding to four breast cancer subtypes. We still set w = 0.3 but choose λ1 ranging from 0.01 to 0.1 and λ2 ranging from 0.001 to 0.01. Table 2 shows the nodes with the highest degree in our inferred networks. Two types of genes are categorized by the fused penalty—cancer-related genes and subtype-specific genes.

TABLE 2
www.frontiersin.org

Table 2 Top 30 nodes with the highest degree in four breast cancer subtypes.

The genes appeared in all subtypes are significantly associated with breast cancer. For example, IGF-1 gene is identified as a key player in major signaling pathways involved in breast cancer growth (Christopoulos et al., 2015); TP53 gene mutations have been found in almost all subtypes (Bertheau et al., 2013); TNF gene is highly expressed in breast carcinomas, and its chronic expression supports tumor growth (Kamel et al., 2012); BAMBI and interferon gamma protein (encoded by IFNG gene) are found to inhibit the tumor growth of breast cancer (Shangguan et al., 2012; Ning et al., 2010). In detecting breast cancer, experiments have revealed the diagnostic value of THBS1 protein (Suh et al., 2012). The overexpression and amplification of RPS6KB2 gene as well as FAS gene are reported to be associated with breast cancer prognosis (Pérez-Tenorio et al., 2011; Bębenek et al., 2013); BMP2 is considered as a driving factor for promoting breast cancer stemness, and BMP4 is a potent suppressor of breast cancer metastasis (Huang et al., 2017; Ampuja et al., 2013).

In addition, the genes that emerged only in one subtype are regarded as potential subtype-specific genes, which may have diverse functions across subtypes. In this result, there are three subtype-specific hub genes identified by our method, i.e., CDKN2B in basal-like, and IL1R1 and TNFRSF10B in luminal B. CDKN2B protein is a cyclin-dependent kinase inhibitor that functions as a cell growth regulator, and its methylation is part of triple negative breast cancer (TNBC) profile (Branham et al., 2012). This result is also in accordance with a recent study (Bareche et al., 2018) that declared that the copy number aberrations of CDKN2B gene suffer a high gain in basal-like 1 subtype. The high expression of IL-1a has been found to be correlated with better prognosis in luminal B breast cancer (Dagenais et al., 2017). Since IL1R1 protein is the receptor for IL-1α, there may be a latent relationship between the IL1R1 gene expression level and luminal B breast cancer. TNFRSF10B (also named as DR5) is a cell surface receptor that can be activated by tumor necrosis factor-related apoptosis inducing ligand (TNFSF10/TRAIL) (Benson et al., 2018). TRAIL kills tumor cells while sparing normal cells and becomes a drug target. However, TRAIL may be selective to patients, since only a small subset of patients respond well to TRAIL in previous clinical trials (Dimberg et al., 2013). Considering that the lack of surface DR5 is sufficient to render tumors resistant to the targeted therapies (Twomey et al., 2015), the correlation between DR5 and luminal B discovered by this paper could provide a new insight of TRAIL therapies on their receivers.

Conclusion

In this paper, we propose a novel weighted fused pathway graphical lasso (WFPGL) that can effectively incorporate additional knowledge including pathway information and gene interaction networks to jointly estimate multiple gene regulatory networks. These two kinds of prior information have different effects on our algorithm. We incorporate gene interaction priors by assigning a weight matrix to the estimated individual networks and differential networks. When there is no prior gene interaction information, all elements in the weight matrix are set to 1, and our model degenerates to a fused pathway graphical lasso model. For prior pathway information, we utilize the information by imposing constraints that a pair of genes can be connected to each other only if they co-occur in at least one pathway. The constraints have the potential of improving structure learning of gene regulatory networks. First, it can accelerate the optimization of the algorithm, leading to acceptable results when dealing with high-dimensional data. Second, making use of such prior information in learning the structures of networks can yield results that are more meaningful and interpretable. Moreover, as our algorithm is a flexible framework, the “pathway” here does not need to be an exact biological pathway. The “pathway” here stands for a partition of genes such that genes that belong to different “pathways” are less likely to have regulatory relationships. On the one hand, if we can collect additional information, such as the transcript factors (TFs) of genes in a given pathway, we could combine these TFs and their regulated genes into a new “pathway” such that the regulatory relationships are more likely to take place between genes within the same “pathways.” On the other hand, if the pathway information is not comprehensive or remains unknown to us, we could treat all genes as a pathway, and our model degenerates to a weighted fused graphical lasso model.

Data Availability

Publicly available datasets were analyzed in this study. This data can be found here: https://www.genome.jp/; https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE40234; https://portal.gdc.cancer.gov/

Author Contributions

NW conceived and designed the study, performed the statistical analysis, and drafted the manuscript. LO-Y and ZZ conceived the study, and participated in its design and coordination, and helped to draft the manuscript. JH and X-FZ participated in the design of the study, performed the statistical analysis, and helped to revise the manuscript. SH and WX participated in the design of the study and helped to revise the manuscript. All authors read and approved the final manuscript.

Funding

This work is supported by the National Natural Science Foundation of China under grants No. 61602309, 61871272, 61575125, 11871026 and 61402190, Shenzhen Fundamental Research Program, under grant JCYJ20170817095210760 and JCYJ20170302154328155, Natural Science Foundation of SZU [2017077], Guangdong Special Support Program of Topnotch Young Professionals, under grants 2014TQ01X273, and 2015TQ01R453, Guangdong Foundation of Outstanding Young Teachers in Higher Education Institutions, under grant Yq2015141, Natural Science Foundation of Hubei province [ZRMS2018001337] and Funding of Shenzhen Institute of Artificial Intelligence and Robotics for Society.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary Material

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fgene.2019.00623/full#supplementary-material

References

Ampuja, M., Jokimäki, R., Juuti-Uusitalo, K., Rodriguez-Martinez, A., Alarmo, E.-L., Kallioniemi, A. (2013). BMP4 inhibits the proliferation of breast cancer cells and induces an MMP-dependent migratory phenotype in MDA-MB-231 cells in 3D environment. BMC Cancer 13, 429. doi: 10.1186/1471-2407-13-429

PubMed Abstract | CrossRef Full Text | Google Scholar

Banerjee, O., Ghaoui, L. E., d’ Aspremont, A. (2008). Model selection through sparse maximum likelihood estimation for multivariate Gaussian or binary data. J. Mach. Learn. Res. 9, 485–516. doi: 10.1145/1390681.1390696

CrossRef Full Text | Google Scholar

Bareche, Y., Venet, D., Ignatiadis, M., Aftimos, P., Piccart, M., Rothe, F., et al. (2018). Unravelling triple-negative breast cancer molecular heterogeneity using an integrative multiomic analysis. Ann. Oncol. 29, 895–902. doi: 10.1093/annonc/mdy024

PubMed Abstract | CrossRef Full Text | Google Scholar

Bębenek, M., Duś, D., Koźlak, J. (2013). Prognostic value of the Fas/Fas ligand system in breast cancer. Contemp. Oncol. 17, 120. doi: 10.5114/wo.2013.34612

CrossRef Full Text | Google Scholar

Benson, D. A., Cavanaugh, M., Clark, K., Karsch-Mizrachi, I., Ostell, J., Pruitt, K. D., et al. (2018). Genbank. Nucleic Acids Res. 46, D41–D47. doi: 10.1093/nar/gkx1094

PubMed Abstract | CrossRef Full Text | Google Scholar

Bertheau, P., Lehmann-Che, J., Varna, M., Dumay, A., Poirot, B., Porcher, R., et al. (2013). p53 in breast cancer subtypes and new insights into response to chemotherapy. Breast 22, S27–S29. doi: 10.1016/j.breast.2013.07.005

PubMed Abstract | CrossRef Full Text | Google Scholar

Boyd, S. (2010). Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends. Mach. Learn. 3, 1–122. doi: 10.1561/2200000016

CrossRef Full Text | Google Scholar

Branham, M., Marzese, D., Laurito, S., Gago, F., Orozco, J., Tello, O., et al. (2012). Methylation profile of triple-negative breast carcinomas. Oncogenesis 1, e17. doi: 10.1038/oncsis.2012.17

PubMed Abstract | CrossRef Full Text | Google Scholar

Cancer Genome Atlas Research Network, Weinstein, J. N., Collisson, E. A., et al. (2013). The cancer genome Atlas pan-cancer analysis project. Nat. Genet. 45 (10), 1113–1120. doi: 10.1038/ng.2764

PubMed Abstract | CrossRef Full Text | Google Scholar

Christopoulos, P. F., Msaouel, P., Koutsilieris, M. (2015). The role of the insulin-like growth factor-1 system in breast cancer. Mol. Cancer 14, 1. doi: 10.1186/s12943-015-0291-7

PubMed Abstract | CrossRef Full Text | Google Scholar

Dagenais, M., Dupaul-Chicoine, J., Douglas, T., Champagne, C., Morizot, A., Saleh, M. (2017). The interleukin (IL)-1R1 pathway is a critical negative regulator of PyMT-mediated mammary tumorigenesis and pulmonary metastasis. Oncoimmunology 6, e1287247. doi: 10.1080/2162402X.2017.1287247

PubMed Abstract | CrossRef Full Text | Google Scholar

Danaher, P., Wang, P., Witten, D. M. (2014). The joint graphical lasso for inverse covariance estimation across multiple classes. J. R. Stat. Soc. Series B. Stat. Methodol. 76, 373–397. doi: 10.1111/rssb.12033

PubMed Abstract | CrossRef Full Text | Google Scholar

Dempster, A. P. (1972). Covariance selection. Biometrics, 28, 157–175. doi: 10.2307/2528966

CrossRef Full Text | Google Scholar

den Brok, W. D., Schrader, K. A., Sun, S., Tinker, A. V., Zhao, E. Y., Aparicio, S., et al. (2017). Homologous recombination deficiency in breast cancer: a clinical review. JCO Precis. Oncol. 1, 1–13. doi: 10.1200/PO.16.00031

CrossRef Full Text | Google Scholar

Dimberg, L. Y., Anderson, C. K., Camidge, R., Behbakht, K., Thorburn, A., Ford, H. L. (2013). On the trail to successful cancer therapy? Predicting and counteracting resistance against trail-based therapeutics. Oncogene 32, 1341. doi: 10.1038/onc.2012.164

PubMed Abstract | CrossRef Full Text | Google Scholar

Elbein, S. C., Gamazon, E. R., Das, S. K., Rasouli, N., Kern, P. A., Cox, N. J. (2012). Genetic risk factors for type 2 diabetes: a trans-regulatory genetic architecture? Am. J. Human Genet. 91, 466–477. doi: 10.1016/j.ajhg.2012.08.002

CrossRef Full Text | Google Scholar

Fatima, L., Campello, R., de Souza Santos, R., Freitas, H., Frank, A., Machado, U., et al. (2017). Estrogen receptor 1 (ESR1) regulates VEGFA in adipose tissue. Sci. Rep. 7, 16716. doi: 10.1038/s41598-017-16686-7

PubMed Abstract | CrossRef Full Text | Google Scholar

Gasco, M., Shami, S., Crook, T. (2002). The p53 pathway in breast cancer. Breast Cancer Res. 4, 70. doi: 10.1186/bcr426

PubMed Abstract | CrossRef Full Text | Google Scholar

Grechkin, M., Fazel, M., Witten, D., Lee, S., (2015). Pathway graphical lasso, in Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, vol. 2015. (Austin, Texas: NIH Public Access), 2617–2623.

Google Scholar

Guo, J., Levina, E., Michailidis, G., Zhu, J. (2011). Joint estimation of multiple graphical models. Biometrika 98, 1–15. doi: 10.1093/biomet/asq060

PubMed Abstract | CrossRef Full Text | Google Scholar

Habib, J. G., O’Shaughnessy, J. A. (2016). The hedgehog pathway in triple-negative breast cancer. Cancer Med. 5, 2989–3006. doi: 10.1002/cam4.833

PubMed Abstract | CrossRef Full Text | Google Scholar

Han, H., Cho, J.-W., Lee, S., Yun, A., Kim, H., Bae, D., et al. (2017). TRRUST v2: an expanded reference database of human and mouse transcriptional regulatory interactions. Nucleic Acids Res. 46, D380–D386. doi: 10.1093/nar/gkx1013

CrossRef Full Text | Google Scholar

Hoefling, H. (2010). A path algorithm for the fused lasso signal approximator. J. Comput. Graph. Stat. 19, 984–1006. doi: 10.1198/jcgs.2010.09208

CrossRef Full Text | Google Scholar

Horikoshi, M., Hara, K., Ohashi, J., Miyake, K., Tokunaga, K., Ito, C., et al. (2006). A polymorphism in the AMPKA2 subunit gene is associated with insulin resistance and type 2 diabetes in the Japanese population. Diabetes 55, 919–923. doi: 10.2337/diabetes.55.04.06.db05-0727

PubMed Abstract | CrossRef Full Text | Google Scholar

Huang, F., Chen, S., Huang, S.-J. (2018a). Joint estimation of multiple conditional Gaussian graphical models. IEEE Trans. Neural Netw. Learn. Syst. 29, 3034–3046. doi: 10.1109/tnnls.2017.2710090

PubMed Abstract | CrossRef Full Text | Google Scholar

Huang, P., Chen, A., He, W., Li, Z., Zhang, G., Liu, Z., et al. (2017). BMP-2 induces EMT and breast cancer stemness through Rb and CD44. Cell Death Discov. 3, 17039. doi: 10.1038/cddiscovery.2017.39

PubMed Abstract | CrossRef Full Text | Google Scholar

Huang, X., Liu, G., Guo, J., Su, Z. (2018b). The PI3K/AKT pathway in obesity and type 2 diabetes. Int. J. Biol. Sci. 14, 1483. doi: 10.7150/ijbs.27173

PubMed Abstract | CrossRef Full Text | Google Scholar

Ideker, T., Krogan, N. J. (2012). Differential network biology. Mol. Syst. Biol. 8, 565. doi: 10.1038/msb.2011.99

PubMed Abstract | CrossRef Full Text | Google Scholar

Ip, W., Chiang, Y.-t. A., Jin, T. (2012). The involvement of the Wnt signaling pathway and TCF7L2 in diabetes mellitus: the current understanding, dispute, and perspective. Cell Biosci. 2, 28. doi: 10.1186/2045-3701-2-28

PubMed Abstract | CrossRef Full Text | Google Scholar

Jeon, S.-M. (2016). Regulation and function of AMPK in physiology and diseases. Exp. Mol. Med. 48, e245. doi: 10.1038/emm.2016.81

PubMed Abstract | CrossRef Full Text | Google Scholar

Kamel, M., Shouman, S., El-Merzebany, M., Kilic, G., Veenstra, T., Saeed, M., et al. (2012). Effect of tumour necrosis factor-alpha on estrogen metabolic pathways in breast cancer cells. J. Cancer. 3, 310. doi: 10.7150/jca.4584

PubMed Abstract | CrossRef Full Text | Google Scholar

Kanehisa, M. (2000). Kegg: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30. doi: 10.1093/nar/28.1.27

PubMed Abstract | CrossRef Full Text | Google Scholar

Ke, B., Zhao, Z., Ye, X., Gao, Z., Manganiello, V., Wu, B., et al. (2015). Inactivation of NF-κB p65 (rela) in liver improves insulin sensitivity and inhibits cAMP/PKA pathway. Diabetes 64, 3355–3362. doi: 10.2337/db15-0242

PubMed Abstract | CrossRef Full Text | Google Scholar

Liu, X., Wang, Y., Ji, H., Aihara, K., Chen, L. (2016). Personalized characterization of diseases using sample-specific networks. Nucleic Acids Res. 44, e164–e164. doi: 10.1093/nar/gkw772

PubMed Abstract | CrossRef Full Text | Google Scholar

Maarbjerg, S. J., Sylow, L., Richter, E. (2011). Current understanding of increased insulin sensitivity after exercise-emerging candidates. Acta Physiol. 202, 323–335. doi: 10.1111/j.1748-1716.2011.02267.x

CrossRef Full Text | Google Scholar

Meinshausen, N., Bühlmann, P. (2006). High-dimensional graphs and variable selection with the lasso. Ann. Stat., 34, 1436–1462. doi: 10.1214/009053606000000281

CrossRef Full Text | Google Scholar

Naderpoor, N., Lyons, J. G., Mousa, A., Ranasinha, S., De Courten, M. P., Soldatos, G., et al. (2017). Higher glomerular filtration rate is related to insulin resistance but not to obesity in a predominantly obese non-diabetic cohort. Sci. Rep. 7, 45522. doi: 10.1038/srep45522

PubMed Abstract | CrossRef Full Text | Google Scholar

Ning, Y., Riggins, R. B., Mulla, J. E., Chung, H., Zwart, A., Clarke, R. (2010). IFNy restores breast cancer sensitivity to fulvestrant by regulating STAT1, IFN regulatory factor 1, NF-κB, BCL2 family members, and signaling to caspase-dependent apoptosis. Mol. Cancer Ther. 9, 1274–1285. doi: 10.1158/1535-7163.MCT-09-1169

PubMed Abstract | CrossRef Full Text | Google Scholar

Ou-Yang, L., Yan, H., Zhang, X. F. (2017a). Identifying differential networks based on multi-platform gene expression data. Mol. Biosyst. 13, 183–192. doi: 10.1039/C6MB00619A

CrossRef Full Text | Google Scholar

Ou-Yang, L., Zhang, X.-F., Wu, M., Li, X.-L. (2017b). Node-based learning of differential networks from multi-platform gene expression data. Methods 129, 41–49. doi: 10.1016/j.ymeth.2017.05.014

PubMed Abstract | CrossRef Full Text | Google Scholar

Paplomata, E., O’Regan, R. (2014). The PI3K/AKT/MTOR pathway in breast cancer: targets, trials and biomarkers. Ther. Adv. Med. Oncol. 6, 154–166. doi: 10.1177/1758834014530023

PubMed Abstract | CrossRef Full Text | Google Scholar

Parton, M., Dowsett, M., Smith, I. (2001). Studies of apoptosis in breast cancer. BMJ 322, 1528–1532. doi: 10.1136/bmj.322.7301.1528

PubMed Abstract | CrossRef Full Text | Google Scholar

Peng, G., Luo, L., Siu, H., Zhu, Y., Hu, P., Hong, S., et al. (2010). Gene and pathway-based second-wave analysis of genome-wide association studies. Eur. J. Hum. Genet. 18, 111. doi: 10.1038/ejhg.2009.115

PubMed Abstract | CrossRef Full Text | Google Scholar

Pérez-Tenorio, G., Karlsson, E., Waltersson, M. A., Olsson, B., Holmlund, B., Nordenskjold, B., et al. (2011). Clinical potential of the mTOR targets S6K1 and S6K2 in breast cancer. Breast Cancer Res. Treat. 128, 713–723. doi: 10.1007/s10549-010-1058-x

PubMed Abstract | CrossRef Full Text | Google Scholar

Riu, E., Ferre, T., Hidalgo, A., Mas, A., Franckhauser, S., Otaegui, P., et al. (2003). Overexpression of c-myc in the liver prevents obesity and insulin resistance. FASEB J. 17, 1715–1717. doi: 10.1096/fj.02-1163fje

PubMed Abstract | CrossRef Full Text | Google Scholar

Schlitt, T., Brazma, A. (2007). Current approaches to gene regulatory network modelling. BMC Bioinf. 8, S9. doi: 10.1186/1471-2105-8-S6-S9

CrossRef Full Text | Google Scholar

Shangguan, L., Ti, X., Krause, U., Hai, B., Zhao, Y., Yang, Z., et al. (2012). Inhibition of TGF-β/Smad signaling by BAMBI blocks differentiation of human mesenchymal stem cells to carcinoma-associated fibroblasts and abolishes their protumor effects. Stem cells 30, 2810–2819. doi: 10.1002/stem.1251

PubMed Abstract | CrossRef Full Text | Google Scholar

Skoda, A. M., Simovic, D., Karin, V., Kardum, V., Vranic, S., Serman, L. (2018). The role of the hedgehog signaling pathway in cancer: a comprehensive review. Bosn. J. Basic Med. Sci. 18, 8. doi: 10.17305/bjbms.2018.2756

PubMed Abstract | CrossRef Full Text | Google Scholar

Strycharz, J., Drzewoski, J., Szemraj, J., Sliwinska, A. (2017). Is p53 involved in tissue-specific insulin resistance formation? Oxid. Med. Cellular Longev. 2017, 15. doi: 10.1155/2017/9270549

CrossRef Full Text | Google Scholar

Suh, E. J., Kabir, M. H., Kang, U.-B., Lee, J. W., Yu, J., Noh, D.-Y., et al. (2012). Comparative profiling of plasma proteome from breast cancer patients reveals thrombospondin-1 and BRWD3 as serological biomarkers. Exp. Mol. Med. 44, 36. doi: 10.3858/emm.2012.44.1.003

PubMed Abstract | CrossRef Full Text | Google Scholar

Tian, D., Gu, Q., Ma, J. (2016). Identifying gene regulatory network rewiring using latent differential graphical models. Nucleic Acids Res. 44, e140. doi: 10.1093/nar/gkw581

PubMed Abstract | CrossRef Full Text | Google Scholar

Tseng, P. (2001). Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109, 475–494. doi: 10.1023/A:1017501703105

CrossRef Full Text | Google Scholar

Twomey, J. D., Kim, S.-R., Zhao, L., Bozza, W. P., Zhang, B. (2015). Spatial dynamics of trail death receptors in cancer cells. Drug Resist. Updat. 19, 13–21. doi: 10.1016/j.drup.2015.02.001

PubMed Abstract | CrossRef Full Text | Google Scholar

Uhler, C. (2017). Gaussian graphical models: an algebraic and geometric perspective. arXiv preprint arXiv:1707.04345.

Google Scholar

Verfaillie, A., Imrichova, H., de Sande, B., Standaert, L., Christiaens, V., Hulselmans, G., et al. (2014). iRegulon: from a gene list to a gene regulatory network using large motif and track collections. PLoS Comput. Biol. 10, e1003731. doi: 10.1371/journal.pcbi.1003731

PubMed Abstract | CrossRef Full Text | Google Scholar

Williams, A. S., Kang, L., Zheng, J., Grueter, C., Bracy, D. P., James, F. D., et al. (2015). Integrin a1-null mice exhibit improved fatty liver when fed a high fat diet despite severe hepatic insulin resistance. J. Biol. Chem. 290, 6546–6557. doi: 10.1074/jbc.M114.615716

PubMed Abstract | CrossRef Full Text | Google Scholar

Xu, T., Ou-Yang, L., Hu, X., Zhang, X.-F. (2018). Identifying gene network rewiring by integrating gene expression and gene network data. IEEE/ACM Trans. Comput. Biol. Bioinf. 15, 2079–2085. doi: 10.1109/TCBB.2018.2809603

CrossRef Full Text | Google Scholar

Yuan, H., Xi, R., Chen, C., Deng, M. (2017). Differential network analysis via lasso penalized D-trace loss. Biometrika 104, 755–770. doi: 10.1093/biomet/asx049

CrossRef Full Text | Google Scholar

Yuan, M., Lin, Y. (2007). Model selection and estimation in the Gaussian graphical model. Biometrika 94, 19–35. doi: 10.1093/biomet/asm018

CrossRef Full Text | Google Scholar

Zardawi, S. J., O’Toole, S. A., Sutherland, R. L., Musgrove, E. A. (2009). Dysregulation of hedgehog, Wnt and notch signalling pathways in breast cancer. Histol. Histopathol. 24 (3), 385–398. doi: 10.14670/HH-24.385

CrossRef Full Text | Google Scholar

Zhang, T., Zou, H. (2014). Sparse precision matrix estimation via lasso penalized D-trace loss. Biometrika 101, 103–120. doi: 10.1093/biomet/ast059

CrossRef Full Text | Google Scholar

Zhang, X., Zhao, J., Hao, J.-K., Zhao, X.-M., Chen, L. (2014). Conditional mutual inclusive information enables accurate quantification of associations in gene regulatory networks. Nucleic Acids Res. 43, e31–e31. doi: 10.1093/nar/gku1315

PubMed Abstract | CrossRef Full Text | Google Scholar

Keywords: Gaussian graphical model, precision matrix, prior information, fused lasso penalty, gene network analysis

Citation: Wu N, Huang J, Zhang X-F, Ou-Yang L, He S, Zhu Z and Xie W (2019) Weighted Fused Pathway Graphical Lasso for Joint Estimation of Multiple Gene Networks. Front. Genet. 10:623. doi: 10.3389/fgene.2019.00623

Received: 01 March 2019; Accepted: 13 June 2019;
Published: 22 July 2019.

Edited by:

Alfredo Pulvirenti, University of Catania, Italy

Reviewed by:

Anand Anbarasu, VIT University, India
Hongwu Ma, Tianjin Institute of Industrial Biotechnology (CAS), China

Copyright © 2019 Wu, Huang, Zhang, Ou-Yang, He, Zhu and Xie. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Le Ou-Yang, leouyang@szu.edu.cn; Zexuan Zhu, zhuzx@szu.edu.cn

Disclaimer: All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.