Article

A Penalized Likelihood Approach to Parameter Estimation with Integral Reliability Constraints

Barry Smith, Steven Wang, Augustine Wong and Xiaofeng Zhou
1 Department of Economics, York University, Toronto, ON M3J 1P3, Canada
2 Department of Mathematics and Statistics, York University, Toronto, ON M3J 1P3, Canada
* Author to whom correspondence should be addressed.
Entropy 2015, 17(6), 4040-4063; https://0-doi-org.brum.beds.ac.uk/10.3390/e17064040
Submission received: 21 April 2015 / Accepted: 10 June 2015 / Published: 12 June 2015
(This article belongs to the Special Issue Inductive Statistical Methods)

Abstract:
Stress-strength reliability problems arise frequently in applied statistics and related fields. Often they involve two independent and possibly small samples of measurements on strength and breakdown pressures (stress). The goal of the researcher is to use the measurements to obtain inference on reliability, which is the probability that stress will exceed strength. This paper addresses the case where reliability is expressed in terms of an integral which has no closed form solution and where the number of observed values on stress and strength is small. We find that the Lagrange approach to estimating constrained likelihood, necessary for inference, often performs poorly. We introduce a penalized likelihood method, which appears to work well throughout. We use third-order likelihood methods to partially offset the issue of small samples. The proposed method is applied to draw inferences on reliability in stress-strength problems with independent exponentiated exponential distributions. Simulation studies are carried out to assess the accuracy of the proposed method and to compare it with some standard asymptotic methods.

1. Introduction

We consider a stress-strength reliability problem, R = P (Y < X), where X and Y are independently distributed as exponentiated exponential random variables. In this case, the parameter of interest, R, is an integral with no closed form solution. Likelihood-based inference for this problem involves, in part, numerical constrained optimization. The likelihood ratio test, for example, uses constrained maximum likelihood parameter estimates where the log-likelihood function is maximized subject to the constraint that the parameter of interest takes on the null-hypothesized value.
A standard approach to solving (equality) constrained problems is to apply the Lagrange method. Detailed discussions can be found in many reference works in optimization, for example, see [1]. Using the Lagrange technique, this problem requires that, in each iteration, one parameter be used to guarantee that the integral equality constraint is satisfied up to a given level of numerical accuracy. Satisfying the constraint involves simultaneous numerical integration and the numerical solution of a nonlinear equation for the correct parameter value. Moreover, all of this needs to be repeated for each value of the remaining parameters.
We found that the Lagrange approach led to problems of numerical accuracy and slow convergence. We also discovered that the integral function was homogeneous of degree 0 in two of the parameters of the model. The integral is therefore insensitive to equi-proportionate changes in the two parameters and this complicated optimization.
Penalty function methods are also often used to solve equality and inequality-constrained optimization problems. See Byrne [2] and the extensive list of references therein. The basic idea is to solve a constrained optimization problem by solving a sequence of unconstrained problems where constraint violation is penalized in a successively harsher manner. The sequence of unconstrained optima converges to the constrained optimum. Smith et al. (2014) [3] demonstrates the accuracy of the penalty method in a constrained optimization problem. When implementing this approach to the problem considered in this paper, we found it was numerically stable and rapid in the sense that parameter estimates from successively more harshly penalized models quickly converged to optimal values and became effectively insensitive to further penalization.
The remainder of the paper is organized as follows. Section 2 begins with a brief description of our stress-strength reliability problem. We then examine some of the properties of the unconstrained likelihood function. This seems necessary because the likelihood function is not concave and generally cannot be optimized by an algorithm that uses the exact Hessian matrix. We then introduce the integral constraint and some of its properties together with a numerical example to illustrate how the Lagrange approach failed. The section concludes with a discussion of some properties and an implementation of the penalty function approach. Section 3 briefly reviews the likelihood-based third-order method for inference concerning a scalar parameter of interest when samples are small. Section 4 applies the approaches developed in Sections 2 and 3 to study inference for the stress-strength reliability problem with independent exponentiated exponential distributions. Results from simulation studies are presented to illustrate the extreme accuracy of our suggested approach. Some concluding remarks are given in Section 5. Technical details related to some issues raised in Section 2 are presented in the Appendix.

2. Computational Issues and Penalized Likelihood

2.1. Problem

We consider statistical inference for the reliability, R = P(Y < X), where X and Y are independently distributed with exponentiated exponential distributions. The estimation of R is an important problem in the statistical literature and has been widely studied in areas such as radiology, mechanical engineering and materials science. In the context of the stress-strength model, reliability refers to the probability that the unit's strength Y is less than the stress X. Following the notation in [4], the cumulative distribution function of the two-parameter exponentiated exponential distribution, EE(α, β), is
$$F(x; \alpha, \beta) = (1 - e^{-\beta x})^{\alpha}; \qquad \alpha > 0, \; \beta > 0, \; x > 0 \qquad (1)$$
where α is the shape parameter and β is the scale parameter. The EE(α, β) distribution is also known as the generalized exponential distribution. It is equivalent to the exponentiated Weibull (κ = 1, α, σ) distribution as introduced in [5], where κ is the first shape parameter, α is the second shape parameter and σ is the scale parameter. If α = 1, the distribution reduces to the one-parameter exponential(β) model. The EE(α, β) distribution is recognized as a useful model for lifetime data or skewed data. It has a monotone increasing hazard function when α > 1, decreasing when α < 1 and constant when α = 1. It also has a unimodal and right-skewed density function. For a fixed scale parameter value, the skewness gradually decreases as the shape parameter increases.
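These shape properties are easy to verify numerically. The following minimal R sketch (the function names pee, dee and hee are our own illustrative choices, not from the paper's code) evaluates the EE(α, β) distribution, density and hazard functions and plots the hazard for α above, at and below 1.

```r
# EE(alpha, beta) helper functions (illustrative names)
pee <- function(x, alpha, beta) (1 - exp(-beta * x))^alpha            # CDF in (1)
dee <- function(x, alpha, beta)                                       # density
  alpha * beta * (1 - exp(-beta * x))^(alpha - 1) * exp(-beta * x)
hee <- function(x, alpha, beta)                                       # hazard f/(1 - F)
  dee(x, alpha, beta) / (1 - pee(x, alpha, beta))

xx <- seq(0.05, 3, by = 0.05)
plot(xx, hee(xx, 2, 1), type = "l")   # increasing hazard (alpha > 1)
lines(xx, hee(xx, 1, 1), lty = 2)     # constant hazard (alpha = 1)
lines(xx, hee(xx, 0.5, 1), lty = 3)   # decreasing hazard (alpha < 1)
```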
Kundu and Gupta [6] and Raqab et al. [7] considered maximum likelihood estimation of R when X and Y are independent EE(α1, β1) and EE(α2, β2) distributed random variables and where β1 and β2 are assumed to be the same. Then $R = \alpha_1/(\alpha_1 + \alpha_2)$. Likelihood-based inference for R can then be obtained easily. However, without the assumption that β1 = β2, then
$$R = P(Y < X) = \int_0^{\infty} \alpha_1 \beta_1 (1 - e^{-\beta_1 x})^{\alpha_1 - 1} e^{-\beta_1 x} (1 - e^{-\beta_2 x})^{\alpha_2} \, dx \qquad (2)$$
and this integral does not have a known closed form. To obtain likelihood-based inference for R in this case, it is necessary to solve an optimization problem with an integral constraint. We found that standard macros, functions and subroutines in popular software packages such as R or Matlab experienced problems in converging to constrained maximum likelihood estimates in a Lagrange setting. As noted in the Introduction, a penalized likelihood method is proposed to provide a solution for this constrained maximization problem.
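Since (2) must be evaluated repeatedly inside any constrained optimizer, it is worth noting how cheaply it can be computed by one-dimensional quadrature. A minimal R sketch follows (the function name reliability and the tolerance are our own illustrative choices):

```r
# Numerical evaluation of R = P(Y < X) in Equation (2)
reliability <- function(a1, b1, a2, b2) {
  integrand <- function(x)
    a1 * b1 * (1 - exp(-b1 * x))^(a1 - 1) * exp(-b1 * x) *
      (1 - exp(-b2 * x))^a2
  integrate(integrand, lower = 0, upper = Inf, rel.tol = 1e-10)$value
}
reliability(2, 3, 5, 1.6015)   # close to 0.1, the configuration behind Table 1
```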
Once the constrained maximum likelihood estimates have been obtained, likelihood methods can be used to draw inferences about R. In this paper, the third-order method discussed in Fraser and Reid [8] is applied. This is known to be very accurate even when the sample sizes are small. See, for example, Chang and Wong [9] and She et al. [10] where the accuracy of third order methods has been assessed.

2.2. Unconstrained Likelihood and Its Properties

Let x = (x1,…, xn) and y = (y1,…, ym) be independent random samples from EE(α1, β1) and EE(α2, β2) respectively. Then the log-likelihood function is:
$$l(\theta; x, y) = l(\alpha_1, \beta_1, \alpha_2, \beta_2; x, y) = n \log \alpha_1 + n \log \beta_1 + (\alpha_1 - 1) \sum_{i=1}^{n} \log(1 - e^{-\beta_1 x_i}) - \beta_1 \sum_{i=1}^{n} x_i + m \log \alpha_2 + m \log \beta_2 + (\alpha_2 - 1) \sum_{j=1}^{m} \log(1 - e^{-\beta_2 y_j}) - \beta_2 \sum_{j=1}^{m} y_j \qquad (3)$$
where θ = (α1, α2, β1, β2).
The log-likelihood function is infinitely differentiable and has some interesting properties. In the first place, it is not concave. Nor is it quasiconcave, pseudoconcave, or a member of any of the other classes of generalized concave functions that appear regularly in the literature on optimization. However, the function is strictly concave in each of its parameters individually. For fixed values of any three parameters, the function reaches a unique maximum with respect to the fourth. The determinant of the Hessian matrix changes sign throughout the domain of the function. But the gradient of the log-likelihood function vanishes at a unique vector of parameter values, and at that point and in a neighbourhood around it the Hessian matrix is negative definite. All of these results are derived in the Appendix. So, when the function is restricted to an open region about the global optimum, the function is concave. These results have some important implications for how to maximize the log-likelihood for any given sample. In particular, the parameters typically cannot all be estimated simultaneously by an algorithm that uses exact Hessian information. Quasi-Newton techniques (sometimes called variable metric methods) work by building an increasingly accurate quadratic approximation to the log-likelihood surface. These will work in this setting as long as exact Hessians are not used in building the surface approximation and updating the parameters. Hessian matrices from the exponentiated exponential model will usually not be negative definite and will lead to inappropriate updating. The Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm can work because some versions of it start with negative definite and symmetric "approximate" Hessians and update them until they converge to the true Hessian at the optimum. Given the independence of X and Y, it is also possible to concentrate the log-likelihood using the derivative information on α1 and α2 and then apply simple line searches to find the β parameters. The Appendix shows why this latter, perhaps more cumbersome, approach will work, but only in the unconstrained case.
In summary, obtaining the unconstrained parameter estimates is possible if done carefully. As well, we found it is often helpful to restrict large changes in the parameters during estimation. We recommend that box constraints be placed on the parameters during estimation. Preset upper and lower bounds require that the parameters be chosen from a four-dimensional hyper-cube. A version of BFGS that implements constraints of this type is available in the statistical package R.
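A minimal sketch of this recommendation in R follows, using the two data sets of Section 4.2; the starting values and bounds are illustrative choices.

```r
# Observed samples (the two data sets of Section 4.2)
x <- c(2.1828, 0.5911, 1.0711, 0.9007, 1.7814, 1.3616,
       0.8629, 0.2301, 1.5183, 0.8481, 1.0845)
y <- c(0.8874, 1.1482, 0.8227, 0.4086, 0.5596, 1.1978,
       1.1324, 0.5625, 1.0679)

# Negative log-likelihood from Equation (3); par = (alpha1, beta1, alpha2, beta2)
negloglik <- function(par, x, y) {
  a1 <- par[1]; b1 <- par[2]; a2 <- par[3]; b2 <- par[4]
  -(length(x) * (log(a1) + log(b1)) +
      (a1 - 1) * sum(log(1 - exp(-b1 * x))) - b1 * sum(x) +
    length(y) * (log(a2) + log(b2)) +
      (a2 - 1) * sum(log(1 - exp(-b2 * y))) - b2 * sum(y))
}

# L-BFGS-B with box constraints (bounds are illustrative)
fit <- optim(c(1, 1, 1, 1), negloglik, x = x, y = y, method = "L-BFGS-B",
             lower = rep(1e-4, 4), upper = rep(100, 4))
fit$par   # unconstrained estimates (alpha1, beta1, alpha2, beta2)
```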

2.3. The Integral Reliability Constraint

In this paper, the parameter of interest is the stress-strength reliability, R = P(Y < X), where X and Y are independently distributed as EE(α1, β1) and EE(α2, β2) respectively. The constraint function, R, is given in (2) and has no known closed form. Straightforward differentiation shows that R is an increasing function of α1 and β2 and a decreasing function of α2 and β1. Introducing the change of variables z = β1x, the integral constraint can be expressed equivalently as:
$$R = \int_0^{\infty} \alpha_1 (1 - e^{-z})^{\alpha_1 - 1} e^{-z} (1 - e^{-(\beta_2/\beta_1) z})^{\alpha_2} \, dz$$
This establishes that R is homogeneous of degree 0 in (β1, β2): the contours of R are constant along any line in parameter space satisfying β2 = cβ1 for fixed c > 0. The integral is therefore insensitive to equi-proportionate changes in the two scale parameters, which can complicate the numerical optimization process. It also provides a good reason to introduce the box constraints on the parameters discussed in Section 2.2.
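The homogeneity is easy to confirm numerically with the reliability() sketch from Section 2.1: scaling both scale parameters by the same factor leaves the integral unchanged up to quadrature error.

```r
reliability(2, 3, 5, 1.6015)     # baseline
reliability(2, 30, 5, 16.015)    # (beta1, beta2) scaled by c = 10: same R
```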

2.4. Constrained Optimization: Lagrange

We need to find the constrained maximum likelihood parameter estimates $\hat\theta_\psi = (\tilde\alpha_1, \tilde\alpha_2, \tilde\beta_1, \tilde\beta_2)$ for a given ψ0. This requires that we maximize the log-likelihood function in (3) subject to the constraint R = ψ(θ) = ψ0 in (2). Using the Lagrange approach, for given values of ψ0 and three of the likelihood parameters, the constraint R = ψ(θ) = ψ0 can be numerically integrated and the fourth parameter simultaneously chosen so that the integral constraint holds. The value of the fourth parameter and ψ0 are held fixed while the log-likelihood function is then maximized with respect to the three "free" likelihood parameters. This process is iterated until convergence.
There is much that can go wrong in the above iterative process. The likelihood function is not concave, the equality constraint is highly nonlinear in the parameters, the constraint integral must be evaluated numerically and then solved numerically and the integral R is homogeneous of degree 0 with respect to two of the parameters.
We discovered in simulation that this iterative Lagrange method sometimes gave relatively satisfactory results for some values of ψ0 but it often went astray. Typically it took a long time for the estimation process to converge. Performance degraded quickly when R was constrained to be closer to the boundary values of 0 or 1.
Table 1 records one of these situations for a simulated data set from EE(2, 3) and EE(5, 1.6015) distributions, each with a sample size of 10, and with ψ0 set equal to 0.1. Initial attempts to program the search algorithm in Matlab resulted in floating point overflow and division by zero errors.
We next tried Matlab's built-in macro, "fminsearch", which uses a simplex search method. This macro provided the correct unconstrained parameter estimates but failed when constraints were introduced. The unconstrained parameter estimates are recorded in Table 2. Not surprisingly, these unconstrained parameter estimates do not solve the constrained optimization problem. They yield a constraint value of 0.0141 as opposed to 0.1. Correct results were found using a penalized likelihood method (appearing as Penalty in the table). This method is introduced below.
We encountered similar problems with small samples and when sample sizes were unequal. We reprogrammed the Lagrange constrained optimization algorithm in the statistical package R. In addition to slow, inaccurate and sometimes failed convergence, we also encountered cases of singular matrices during the optimization process.

2.5. The Penalized Likelihood Approach

As noted in the previous section, we encountered a wide variety of serious numerical problems when we tried to implement the classical Lagrange approach to our integral equality constrained optimization problem. We now present a penalized likelihood method, which was also discussed in Smith et al. (2014) [3], that reliably solves the constrained optimization problems addressed in this paper. We begin by defining the penalized likelihood function, PL, as:
$$PL(\theta, \psi_0; x, y) = l(\theta; x, y) - k\,(R(\theta) - \psi_0)^2, \qquad k > 0$$
The function consists of the log-likelihood function l(θ; x, y) and a positive function, k(R(θ) − ψ0)², called the penalty function, that is subtracted from the likelihood function. Here we consider a penalty function that is the square of the divergence of R from the constraint value ψ0. The larger k is, the more the likelihood is penalized (decreased) for a given divergence of R from ψ0. Next, consider the unconstrained problem of maximizing PL(θ, ψ0; x, y) with respect to θ for a given value of k. Formally, we obtain an optimal vector $\theta_k^*$. When k = 0, we obtain the parameter vector that maximizes the unconstrained log-likelihood. As k increases and we successively solve the optimization problems, we obtain a sequence of optimal parameter vectors that converges to the solution of the constrained problem of choosing θ to maximize l(θ; x, y) subject to R(θ) = ψ0. In contrast to the Lagrange approach, we now have a sequence of unconstrained optimization problems instead of one constrained problem. This means the penalty approach may be theoretically more complicated. But, in practice, we found it is faster and more reliable than the Lagrange approach for our problem. First, the penalty approach does not require that the equality R(θ) = ψ0 hold at any stage of the optimization process. Instead, divergence of R(θ) from ψ0 is (increasingly) discouraged as k increases. Many of the numerical problems we encountered in applying the Lagrange technique arose from trying to impose R(θ) = ψ0 exactly. Second, in practice, as k increases we found that the estimated parameters quickly converged to optimal values and it was not necessary to continue to increase k and solve additional optimization problems. Finally, the Lagrange and penalty approaches are formally dual, so the Lagrange multiplier can be recovered as the limiting value of the partial derivative of PL with respect to the parameter ψ0.
A recent analysis of the rigorous foundations of the penalty (and other) approaches to constrained optimization can be found in [2], and the discussion that follows draws upon Byrne's presentation. Very few technical conditions need to hold in order for the sequence of PL-maximizing vectors $\theta_k^*$ to converge to the solution of the constrained optimization problem. First, we need the permissible parameter values to be drawn from a compact set. We satisfy this condition by imposing box constraints on the four likelihood parameters. Second, there is a straightforward restriction that l and R are continuous functions of the parameters. These conditions are satisfied. Finally, we require that the sequence of optimal vectors $\theta_k^*$ correspond to the global maximum of the penalized likelihood for each k. Our experience is that, for each k, there is only ever one optimum and that it occurs where the gradient of PL vanishes. In all cases the Hessian matrix at the optimum was negative definite.
When implementing the penalty approach we found that, when 0.1 < ψ0 < 0.9, there was little gain from increasing k beyond 10,000. For the extreme values of ψ0, k did not have to exceed 80,000. As well, short sequences were sufficient; it was not necessary to solve separate optimization problems for all possible values of k.
We obtained parameter estimates from the unconstrained optimization of PL using the optim function in the statistical software package R. Within optim, we adopted the L-BFGS-B algorithm. L-BFGS-B was developed by Byrd et al. [11] and is a quasi-Newton variant of the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm. L-BFGS-B conveniently allows for simple upper and lower search bounds (box constraints) on the parameters. As noted earlier, L-BFGS-B does not use an exact Hessian. Instead it updates an approximate Hessian that is guaranteed to be negative definite and symmetric; the approximate Hessian converges to the true one. Our code is available upon request.
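A minimal sketch of the whole scheme follows, assuming the negloglik() and reliability() functions sketched earlier; the k schedule, starting values and bounds are illustrative, not the authors' production settings.

```r
# Constrained MLE by penalized likelihood: maximize
#   PL = loglik - k * (R(theta) - psi0)^2
# for an increasing schedule of k, warm-starting at the previous optimum.
constrained_mle <- function(x, y, psi0, start = c(1, 1, 1, 1),
                            ks = c(10, 100, 1000, 10000)) {
  neg_pl <- function(par, k)
    negloglik(par, x, y) +
      k * (reliability(par[1], par[2], par[3], par[4]) - psi0)^2
  par <- start
  for (k in ks)
    par <- optim(par, neg_pl, k = k, method = "L-BFGS-B",
                 lower = rep(1e-4, 4), upper = rep(100, 4))$par
  par   # approximately maximizes loglik subject to R(theta) = psi0
}

theta_psi <- constrained_mle(x, y, psi0 = 0.5)
reliability(theta_psi[1], theta_psi[2], theta_psi[3], theta_psi[4])  # near 0.5
```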

3. Likelihood-based Inference for Any Scalar Parameter of Interest

In this section, we provide a brief review of likelihood-based inference for a scalar parameter of interest. Let y = (y1,…, yn) be a random sample from a distribution with log-likelihood function l(θ) = l(θ; y), where θ is a vector parameter with dim(θ) = p. Also let ψ = ψ(θ) be a scalar parameter of interest. Denote by $\hat\theta$ the overall maximum likelihood estimate, which satisfies
$$l_\theta(\hat\theta) = \left. \frac{\partial l(\theta)}{\partial \theta} \right|_{\theta = \hat\theta} = 0$$
Moreover, denote by $\hat\theta_\psi$ the constrained maximum likelihood estimate of θ for a given ψ(θ) = ψ0. Both $\hat\theta$ and $\hat\theta_\psi$ can be obtained from the penalized likelihood method discussed in Section 2, and, along with $\hat\theta_\psi$, the corresponding Lagrange multiplier $\hat{K}$ can be obtained. Define the tilted log-likelihood function as
$$\tilde{l}(\theta) = l(\theta) + \hat{K}\,[\psi(\theta) - \psi_0]$$
which has the property $\tilde{l}(\hat\theta_\psi) = l(\hat\theta_\psi)$.
Two widely used likelihood-based methods for obtaining an asymptotic confidence interval for ψ are based on the maximum likelihood estimator of θ and on the signed log-likelihood ratio statistic. It is well known that $(\hat\theta - \theta)'[\mathrm{var}(\hat\theta)]^{-1}(\hat\theta - \theta)$ is asymptotically χ²-distributed with p degrees of freedom, and that $\mathrm{var}(\hat\theta)$ can be estimated by the inverse of either the expected Fisher information matrix or the observed information matrix evaluated at $\hat\theta$. In practice, the latter,
$$\widehat{\mathrm{var}}(\hat\theta) \approx j_{\theta\theta'}^{-1}(\hat\theta) = [-l_{\theta\theta'}(\hat\theta)]^{-1} = \left[ -\left. \frac{\partial^2 l(\theta)}{\partial\theta \, \partial\theta'} \right|_{\theta = \hat\theta} \right]^{-1}$$
is preferred because of its simplicity of calculation. By applying the Delta method, we have
$$\widehat{\mathrm{var}}(\hat\psi) = \widehat{\mathrm{var}}(\psi(\hat\theta)) \approx \psi_{\theta'}(\hat\theta)\, \widehat{\mathrm{var}}(\hat\theta)\, \psi_\theta(\hat\theta)$$
where $\psi_\theta(\hat\theta) = \left. \partial\psi(\theta)/\partial\theta \right|_{\theta = \hat\theta}$. Hence $(\hat\psi - \psi)/\sqrt{\widehat{\mathrm{var}}(\hat\psi)}$ is asymptotically distributed as N(0, 1). Thus a (1 − γ)100% confidence interval for ψ based on the maximum likelihood estimator is
$$\left( \hat\psi - z_{\gamma/2}\sqrt{\widehat{\mathrm{var}}(\hat\psi)}, \;\; \hat\psi + z_{\gamma/2}\sqrt{\widehat{\mathrm{var}}(\hat\psi)} \right)$$
where zγ/2 is the (1 − γ/2)100th percentile of the standard normal distribution.
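As a sketch of the computation, this interval can be assembled from a numerical Hessian and gradient; here we use the numDeriv package (an assumption on tooling; any finite-difference routine would do) together with the negloglik() and reliability() sketches of Section 2.

```r
library(numDeriv)

# Wald-type (1 - gamma) interval for psi = R(theta) via the Delta method
wald_ci <- function(par_hat, x, y, gamma = 0.05) {
  ltt <- hessian(function(p) -negloglik(p, x, y), par_hat)  # l_{theta theta'}
  vtheta <- solve(-ltt)                                     # inverse observed information
  g <- grad(function(p) reliability(p[1], p[2], p[3], p[4]), par_hat)
  psi_hat <- reliability(par_hat[1], par_hat[2], par_hat[3], par_hat[4])
  se <- sqrt(drop(t(g) %*% vtheta %*% g))
  psi_hat + c(-1, 1) * qnorm(1 - gamma / 2) * se
}

wald_ci(fit$par, x, y)   # fit from the Section 2.2 sketch
```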
Alternatively, under the regularity conditions stated in [12], the signed log-likelihood ratio statistic
$$r(\psi) = \mathrm{sgn}(\hat\psi - \psi)\left\{ 2\left[ l(\hat\theta) - l(\hat\theta_\psi) \right] \right\}^{1/2} = \mathrm{sgn}(\hat\psi - \psi)\left\{ 2\left[ l(\hat\theta) - \tilde{l}(\hat\theta_\psi) \right] \right\}^{1/2}$$
is asymptotically distributed as N(0, 1). Therefore, a (1 − γ)100% confidence interval for ψ based on the signed log-likelihood ratio statistic is
$$\{\psi : |r(\psi)| \le z_{\gamma/2}\}$$
It should be noted that both methods have rates of convergence of only $O(n^{-1/2})$. While the maximum likelihood estimator based interval is often preferred because of its simplicity of calculation, the signed log-likelihood ratio method is invariant to reparametrization whereas the maximum likelihood estimator based method is not. Doganaksoy and Schmee [13] showed that the interval based on the signed log-likelihood ratio statistic has better coverage properties than the interval based on the maximum likelihood estimator in the cases they considered.
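In code, the r-based interval can be profiled over a grid of ψ0 values, reusing the penalty-method sketch of Section 2.5 for each constrained fit; the grid and its spacing are illustrative.

```r
# Signed log-likelihood ratio statistic r(psi0) and its confidence set
r_stat <- function(psi0, x, y, fit) {
  lhat <- -negloglik(fit$par, x, y)                        # l(theta.hat)
  theta_psi <- constrained_mle(x, y, psi0, start = fit$par)
  ltil <- -negloglik(theta_psi, x, y)                      # l(theta.hat_psi)
  psi_hat <- reliability(fit$par[1], fit$par[2], fit$par[3], fit$par[4])
  sign(psi_hat - psi0) * sqrt(pmax(2 * (lhat - ltil), 0))
}

psis <- seq(0.05, 0.95, by = 0.005)
rvals <- sapply(psis, r_stat, x = x, y = y, fit = fit)
range(psis[abs(rvals) <= qnorm(0.975)])   # approximate 95% interval for psi
```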
In recent years, various adjustments to the signed log-likelihood ratio statistic have been proposed to improve its accuracy. Reid [14] and Severini [15] gave detailed overviews of this development. In this paper, we consider the modified signed log-likelihood ratio statistic, introduced by [16,17], which has the form
$$r^*(\psi) = r(\psi) + \frac{1}{r(\psi)} \log\left\{ \frac{Q(\psi)}{r(\psi)} \right\}$$
where r(ψ) is the signed log-likelihood ratio statistic and Q(ψ) is a statistic based on the log-likelihood function and an ancillary statistic. Barndorff-Nielsen [16,17] showed that r*(ψ) is asymptotically distributed as N(0, 1) with rate of convergence $O(n^{-3/2})$. Hence a (1 − γ)100% confidence interval for ψ(θ) is given by
$$\{\psi : |r^*(\psi)| \le z_{\gamma/2}\} \qquad (11)$$
However, for a general model, an ancillary statistic might not exist, and even when it does, it might not be unique. Fraser and Reid [8] showed that Q(ψ) is a standardized maximum likelihood departure in the canonical parameter scale and Fraser et al. [18] derived the general formula for obtaining the statistic Q(ψ). More specifically, Fraser et al. [18] obtained the locally defined canonical parameter
$$\varphi(\theta) = \frac{\partial l(\theta; y)}{\partial y}\, V$$
where
$$V = -\left. \left( \frac{\partial z(y; \theta)}{\partial y} \right)^{-1} \left( \frac{\partial z(y; \theta)}{\partial \theta} \right) \right|_{\theta = \hat\theta}$$
is the ancillary direction and z(y; θ) = (z1(y; θ),…, zn(y; θ)) is a vector of pivotal quantities. Then the standardized maximum likelihood departure in the φ(θ) scale takes the form:
$$Q(\psi) = \mathrm{sgn}(\hat\psi - \psi)\, \frac{\left| \chi(\hat\theta) - \chi(\hat\theta_\psi) \right|}{\sqrt{\widehat{\mathrm{var}}\left( \chi(\hat\theta) - \chi(\hat\theta_\psi) \right)}}$$
where
$$\chi(\theta) = \psi_{\theta'}(\hat\theta_\psi)\, \varphi_{\theta'}^{-1}(\hat\theta_\psi)\, \varphi(\theta)$$
is the recalibrated parameter of interest in the φ(θ) scale, and $|\chi(\hat\theta) - \chi(\hat\theta_\psi)|$ is a measure of the maximum likelihood departure of $|\hat\psi - \psi|$ in the φ(θ) scale, with
$$\psi_\theta(\theta) = \frac{\partial \psi(\theta)}{\partial \theta} \quad \text{and} \quad \varphi_\theta(\theta) = \frac{\partial \varphi(\theta)}{\partial \theta}$$
Since the exact $\mathrm{var}(\chi(\hat\theta) - \chi(\hat\theta_\psi))$ is difficult to obtain, Fraser et al. [18] showed that it can be approximated by
$$\widehat{\mathrm{var}}\left( \chi(\hat\theta) - \chi(\hat\theta_\psi) \right) \approx \psi_{\theta'}(\hat\theta_\psi)\, \tilde{j}_{\theta\theta'}^{-1}(\hat\theta_\psi)\, \psi_\theta(\hat\theta_\psi)\, \frac{\left| \tilde{j}_{(\theta\theta')}(\hat\theta_\psi) \right|}{\left| j_{(\theta\theta')}(\hat\theta) \right|}$$
where
$$\left| j_{(\theta\theta')}(\hat\theta) \right| = \left| j_{\theta\theta'}(\hat\theta) \right| \left| \varphi_{\theta'}(\hat\theta) \right|^{-2} \quad \text{and} \quad \left| \tilde{j}_{(\theta\theta')}(\hat\theta_\psi) \right| = \left| \tilde{j}_{\theta\theta'}(\hat\theta_\psi) \right| \left| \varphi_{\theta'}(\hat\theta_\psi) \right|^{-2}$$
with $j_{\theta\theta'}(\theta)$ and $\tilde{j}_{\theta\theta'}(\theta)$ being the observed information matrices obtained from the log-likelihood function and the tilted log-likelihood function respectively. Hence, the confidence interval for ψ based on r*(ψ) can be obtained from Equation (11).
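Once Q(ψ) is available, the third-order interval requires only one more line on top of the r(ψ) profile sketched earlier; in this sketch q_stat() is a hypothetical helper implementing the standardized maximum likelihood departure Q(ψ) for the model at hand (its construction for the EE model is spelled out in Section 4.1).

```r
# Modified signed log-likelihood ratio statistic and its confidence set
rstar <- function(psi0, x, y, fit) {
  r <- r_stat(psi0, x, y, fit)       # from the earlier sketch
  q <- q_stat(psi0, x, y, fit)       # hypothetical: Q(psi) per Section 3
  r + log(q / r) / r
}

rsvals <- sapply(psis, rstar, x = x, y = y, fit = fit)
range(psis[abs(rsvals) <= qnorm(0.975)])   # third-order 95% interval
```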

4. Application to Stress-Strength Reliability with Independent EE Distributions

In Section 2 the EE distribution was introduced as was the reliability constraint. In Section 4.1, we consider the case where the scale parameters are different. The proposed penalized likelihood method is used to obtain the constrained maximum likelihood estimate. The likelihood-based third order method is then applied to obtain inference for R = P (Y < X). Section 4.2 presents results from some numerical studies. The special case where the scale parameters are assumed to be equal, and hence, R = P (Y < X) has a closed-form, is also examined.

4.1. Stress-Strength Reliability with Unequal Scale Parameters

Let X and Y be independently distributed as EE(α1, β1) and EE(α2, β2) respectively. R(θ) is given in (2), which does not have a closed form when β1 ≠ β2. Following the steps in Section 3, we first obtain the overall maximum likelihood estimate $\hat\theta = (\hat\alpha_1, \hat\alpha_2, \hat\beta_1, \hat\beta_2)$ and the constrained maximum likelihood estimate $\hat\theta_\psi = (\tilde\alpha_1, \tilde\alpha_2, \tilde\beta_1, \tilde\beta_2)$ by the penalized likelihood method discussed in Section 2.5. Then the observed information matrix $j_{\theta\theta'}(\hat\theta)$ can be obtained as follows.
$$j_{\theta\theta'}(\hat\theta) = -l_{\theta\theta'}(\hat\theta) = \begin{pmatrix} \dfrac{n}{\hat\alpha_1^2} & -\sum_{i=1}^{n} \dfrac{x_i e^{-\hat\beta_1 x_i}}{1 - e^{-\hat\beta_1 x_i}} & 0 & 0 \\ -\sum_{i=1}^{n} \dfrac{x_i e^{-\hat\beta_1 x_i}}{1 - e^{-\hat\beta_1 x_i}} & \dfrac{n}{\hat\beta_1^2} + A & 0 & 0 \\ 0 & 0 & \dfrac{m}{\hat\alpha_2^2} & -\sum_{j=1}^{m} \dfrac{y_j e^{-\hat\beta_2 y_j}}{1 - e^{-\hat\beta_2 y_j}} \\ 0 & 0 & -\sum_{j=1}^{m} \dfrac{y_j e^{-\hat\beta_2 y_j}}{1 - e^{-\hat\beta_2 y_j}} & \dfrac{m}{\hat\beta_2^2} + B \end{pmatrix}$$
with rows and columns ordered as (α1, β1, α2, β2), where $A = (\hat\alpha_1 - 1)\sum_{i=1}^{n} \frac{x_i^2 e^{-\hat\beta_1 x_i}}{(1 - e^{-\hat\beta_1 x_i})^2}$ and $B = (\hat\alpha_2 - 1)\sum_{j=1}^{m} \frac{y_j^2 e^{-\hat\beta_2 y_j}}{(1 - e^{-\hat\beta_2 y_j})^2}$.
The tilted log-likelihood function $\tilde{l}(\theta)$ is defined as
$$\tilde{l}(\theta) = l(\alpha_1, \beta_1, \alpha_2, \beta_2; x, y) + \hat{K}\,[\psi(\theta) - \psi_0]$$
where ψ(θ) = R is defined by (2). The constrained maximum likelihood estimate $\hat\theta_\psi = (\tilde\alpha_1, \tilde\alpha_2, \tilde\beta_1, \tilde\beta_2)$ and the Lagrange multiplier $\hat{K}$ are obtained by the penalized likelihood method, and the tilted observed information matrix $\tilde{j}_{\theta\theta'}(\hat\theta_\psi)$ can be written as
$$\tilde{j}_{\theta\theta'}(\hat\theta_\psi) = -\tilde{l}_{\theta\theta'}(\hat\theta_\psi) = \begin{pmatrix} \tilde{j}_{\alpha_1\alpha_1} & \tilde{j}_{\alpha_1\beta_1} & \tilde{j}_{\alpha_1\alpha_2} & \tilde{j}_{\alpha_1\beta_2} \\ \tilde{j}_{\beta_1\alpha_1} & \tilde{j}_{\beta_1\beta_1} & \tilde{j}_{\beta_1\alpha_2} & \tilde{j}_{\beta_1\beta_2} \\ \tilde{j}_{\alpha_2\alpha_1} & \tilde{j}_{\alpha_2\beta_1} & \tilde{j}_{\alpha_2\alpha_2} & \tilde{j}_{\alpha_2\beta_2} \\ \tilde{j}_{\beta_2\alpha_1} & \tilde{j}_{\beta_2\beta_1} & \tilde{j}_{\beta_2\alpha_2} & \tilde{j}_{\beta_2\beta_2} \end{pmatrix}(\hat\theta_\psi)$$
where
  • $\tilde{j}_{\alpha_1\alpha_1}(\hat\theta_\psi) = \frac{n}{\tilde\alpha_1^2} - \hat{K} R_{\alpha_1\alpha_1}(\hat\theta_\psi)$, where $R_{\alpha_1\alpha_1}(\hat\theta_\psi) = \left. \frac{\partial^2 R(\theta)}{\partial\alpha_1^2} \right|_{\theta = \hat\theta_\psi}$.
  • $\tilde{j}_{\alpha_1\alpha_2}(\hat\theta_\psi) = -\hat{K} R_{\alpha_1\alpha_2}(\hat\theta_\psi)$, where $R_{\alpha_1\alpha_2}(\hat\theta_\psi) = \left. \frac{\partial^2 R(\theta)}{\partial\alpha_1 \partial\alpha_2} \right|_{\theta = \hat\theta_\psi}$.
  • $\tilde{j}_{\alpha_1\beta_1}(\hat\theta_\psi) = -\sum_{i=1}^{n} \frac{x_i e^{-\tilde\beta_1 x_i}}{1 - e^{-\tilde\beta_1 x_i}} - \hat{K} R_{\alpha_1\beta_1}(\hat\theta_\psi)$, where $R_{\alpha_1\beta_1}(\hat\theta_\psi) = \left. \frac{\partial^2 R(\theta)}{\partial\alpha_1 \partial\beta_1} \right|_{\theta = \hat\theta_\psi}$.
  • $\tilde{j}_{\alpha_1\beta_2}(\hat\theta_\psi) = -\hat{K} R_{\alpha_1\beta_2}(\hat\theta_\psi)$, where $R_{\alpha_1\beta_2}(\hat\theta_\psi) = \left. \frac{\partial^2 R(\theta)}{\partial\alpha_1 \partial\beta_2} \right|_{\theta = \hat\theta_\psi}$.
  • $\tilde{j}_{\alpha_2\alpha_2}(\hat\theta_\psi) = \frac{m}{\tilde\alpha_2^2} - \hat{K} R_{\alpha_2\alpha_2}(\hat\theta_\psi)$, where $R_{\alpha_2\alpha_2}(\hat\theta_\psi) = \left. \frac{\partial^2 R(\theta)}{\partial\alpha_2^2} \right|_{\theta = \hat\theta_\psi}$.
  • $\tilde{j}_{\alpha_2\beta_1}(\hat\theta_\psi) = -\hat{K} R_{\alpha_2\beta_1}(\hat\theta_\psi)$, where $R_{\alpha_2\beta_1}(\hat\theta_\psi) = \left. \frac{\partial^2 R(\theta)}{\partial\alpha_2 \partial\beta_1} \right|_{\theta = \hat\theta_\psi}$.
  • $\tilde{j}_{\alpha_2\beta_2}(\hat\theta_\psi) = -\sum_{j=1}^{m} \frac{y_j e^{-\tilde\beta_2 y_j}}{1 - e^{-\tilde\beta_2 y_j}} - \hat{K} R_{\alpha_2\beta_2}(\hat\theta_\psi)$, where $R_{\alpha_2\beta_2}(\hat\theta_\psi) = \left. \frac{\partial^2 R(\theta)}{\partial\alpha_2 \partial\beta_2} \right|_{\theta = \hat\theta_\psi}$.
  • $\tilde{j}_{\beta_1\beta_1}(\hat\theta_\psi) = \frac{n}{\tilde\beta_1^2} + (\tilde\alpha_1 - 1)\sum_{i=1}^{n} \frac{x_i^2 e^{-\tilde\beta_1 x_i}}{(1 - e^{-\tilde\beta_1 x_i})^2} - \hat{K} R_{\beta_1\beta_1}(\hat\theta_\psi)$, where $R_{\beta_1\beta_1}(\hat\theta_\psi) = \left. \frac{\partial^2 R(\theta)}{\partial\beta_1^2} \right|_{\theta = \hat\theta_\psi}$.
  • $\tilde{j}_{\beta_1\beta_2}(\hat\theta_\psi) = -\hat{K} R_{\beta_1\beta_2}(\hat\theta_\psi)$, where $R_{\beta_1\beta_2}(\hat\theta_\psi) = \left. \frac{\partial^2 R(\theta)}{\partial\beta_1 \partial\beta_2} \right|_{\theta = \hat\theta_\psi}$.
  • $\tilde{j}_{\beta_2\beta_2}(\hat\theta_\psi) = \frac{m}{\tilde\beta_2^2} + (\tilde\alpha_2 - 1)\sum_{j=1}^{m} \frac{y_j^2 e^{-\tilde\beta_2 y_j}}{(1 - e^{-\tilde\beta_2 y_j})^2} - \hat{K} R_{\beta_2\beta_2}(\hat\theta_\psi)$, where $R_{\beta_2\beta_2}(\hat\theta_\psi) = \left. \frac{\partial^2 R(\theta)}{\partial\beta_2^2} \right|_{\theta = \hat\theta_\psi}$.
The remaining entries follow by symmetry.
Thus r(ψ) can be obtained accordingly.
For the problem we are considering, the natural choice of the pivotal quantity is
$$z = (z_1, \ldots, z_n, z_{n+1}, \ldots, z_{n+m}) = \left( \log F(x_1; \alpha_1, \beta_1), \ldots, \log F(x_n; \alpha_1, \beta_1),\ \log F(y_1; \alpha_2, \beta_2), \ldots, \log F(y_m; \alpha_2, \beta_2) \right)$$
Hence the ancillary direction V is
$$V = (V_1, V_2, V_3, V_4) = \begin{pmatrix} -\dfrac{\log(1 - e^{-\hat\beta_1 x_1})\,(1 - e^{-\hat\beta_1 x_1})}{\hat\alpha_1 \hat\beta_1 e^{-\hat\beta_1 x_1}} & -\dfrac{x_1}{\hat\beta_1} & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots \\ -\dfrac{\log(1 - e^{-\hat\beta_1 x_n})\,(1 - e^{-\hat\beta_1 x_n})}{\hat\alpha_1 \hat\beta_1 e^{-\hat\beta_1 x_n}} & -\dfrac{x_n}{\hat\beta_1} & 0 & 0 \\ 0 & 0 & -\dfrac{\log(1 - e^{-\hat\beta_2 y_1})\,(1 - e^{-\hat\beta_2 y_1})}{\hat\alpha_2 \hat\beta_2 e^{-\hat\beta_2 y_1}} & -\dfrac{y_1}{\hat\beta_2} \\ \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & -\dfrac{\log(1 - e^{-\hat\beta_2 y_m})\,(1 - e^{-\hat\beta_2 y_m})}{\hat\alpha_2 \hat\beta_2 e^{-\hat\beta_2 y_m}} & -\dfrac{y_m}{\hat\beta_2} \end{pmatrix}$$
Then we can calculate the locally defined canonical parameter φ(θ) as
$$\varphi(\theta) = \left( \sum_{i=1}^{n+m} \frac{\partial l(\theta)}{\partial w_i} V_{1i},\; \sum_{i=1}^{n+m} \frac{\partial l(\theta)}{\partial w_i} V_{2i},\; \sum_{i=1}^{n+m} \frac{\partial l(\theta)}{\partial w_i} V_{3i},\; \sum_{i=1}^{n+m} \frac{\partial l(\theta)}{\partial w_i} V_{4i} \right)$$
where w = (x1,…, xn, y1,…, ym) is the vector of observed data. From φ(θ) we also obtain φθ(θ). Therefore, for this unequal scale parameter case, χ(θ), $\widehat{\mathrm{var}}(\chi(\hat\theta) - \chi(\hat\theta_\psi))$, Q(ψ) and r*(ψ) can be obtained accordingly, and the (1 − γ)100% confidence interval can be obtained from the modified signed log-likelihood ratio statistic.

4.2. Numerical Examples

To illustrate the proposed third-order method for interval estimation, the following two data sets, with sample sizes of 11 and 9, were used: x = (2.1828, 0.5911, 1.0711, 0.9007, 1.7814, 1.3616, 0.8629, 0.2301, 1.5183, 0.8481, 1.0845) and y = (0.8874, 1.1482, 0.8227, 0.4086, 0.5596, 1.1978, 1.1324, 0.5625, 1.0679). By assumption, X and Y are independently distributed as EE(α1, β1) and EE(α2, β2) respectively. We are interested in testing H0 : R = 0.5 vs. H1 : R > 0.5, where R = P(Y < X) is given in (2). Table 3 presents the 90% and 95% confidence intervals (CI) for R based on the maximum likelihood method (MLE), the signed log-likelihood ratio statistic method (r) and the proposed third-order method (Proposed). In examining Table 3, we observe that the three methods give different interval estimates.
When the scale parameters are equal, R can be expressed in closed form as $\alpha_1/(\alpha_1 + \alpha_2)$, which simplifies the analysis. Following the same procedures introduced in Section 3, $\widehat{\mathrm{var}}(\chi(\hat\theta) - \chi(\hat\theta_\psi))$, Q(ψ) and r*(ψ) can be obtained.
To compare the accuracy of the three methods discussed in this paper, Monte Carlo simulation studies were conducted. The cases of unequal and equal scale parameters are both examined. For each parameter configuration and for each sample size, we generate 10,000 random samples from the EE distributions by using the following transformation:
$$T = -\frac{1}{\beta} \log\left( 1 - U^{1/\alpha} \right)$$
where U is a uniform variate between 0 and 1.
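In R, this transformation gives a one-line sampler; the function name ree is our own illustrative choice.

```r
# Inverse-CDF sampling from EE(alpha, beta): solve U = (1 - e^{-beta T})^alpha
ree <- function(n, alpha, beta) -log(1 - runif(n)^(1 / alpha)) / beta

set.seed(1)                    # illustrative seed
x_sim <- ree(10, 2, 3)         # e.g., an EE(2, 3) stress sample, as in Table 1's setting
y_sim <- ree(10, 5, 1.6015)    # and an EE(5, 1.6015) strength sample
```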
For each simulated setting, we report the proportion of samples for which the true ψ falls below the lower bound of the confidence interval (lower error), the proportion for which ψ falls above the upper bound of the confidence interval (upper error), the proportion for which ψ falls within the confidence interval (central coverage), and the average bias (Average Bias), which is defined as
$$\text{Average Bias} = \frac{|\text{lower error} - 0.025| + |\text{upper error} - 0.025|}{2}$$
The nominal values for the lower and the upper errors, the central coverage and the average bias are 0.025, 0.025, 0.95 and 0 respectively. These values reflect the desired properties of the accuracy and symmetry of the interval estimates of ψ.
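The bookkeeping for one simulation setting is a few lines; this sketch assumes a matrix of simulated intervals with one row per replicate.

```r
# Tail errors, central coverage and average bias for a fixed true psi,
# given a (replicates x 2) matrix of interval endpoints
coverage_summary <- function(psi_true, intervals) {
  lower_error <- mean(intervals[, 1] > psi_true)   # true psi below the interval
  upper_error <- mean(intervals[, 2] < psi_true)   # true psi above the interval
  c(lower_error  = lower_error,
    upper_error  = upper_error,
    central      = 1 - lower_error - upper_error,
    average_bias = (abs(lower_error - 0.025) + abs(upper_error - 0.025)) / 2)
}
```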
Tables 4–6 present simulation results for the unequal scale parameter case, i.e., α1 = 2, α2 = 5, β1 = 3 and R = 0.1(0.1)0.9, with (n, m) = (10, 10), (10, 50) and (50, 10). Note that we fixed R, so β2 is determined uniquely by Equation (2). It is clear that the MLE method gives poor coverage probabilities for R and extremely asymmetric two-tail error probabilities. Results from the signed log-likelihood ratio method are not satisfactory, especially when the two sample sizes are small or unequal, and they also show some evidence of asymmetry in the two-tail error probabilities. The proposed method, however, gives almost exact coverage probabilities with symmetric two-tail error probabilities, even for small or uneven sample sizes.
Note that if we had used the built-in macros / subroutines from Matlab or R, our simulation study could not have been completed because of difficulties in obtaining the constrained maximum likelihood estimates. Fortunately, we did not encounter any difficulties when the penalized likelihood method was used.
Tables 7–9 present simulation results for the equal scale parameter case, i.e., α1 = 4, β = 8 and R = 0.1(0.1)0.9, with (n, m) = (10, 10), (10, 50) and (50, 10). In estimation, we fixed R, and α2 is then determined uniquely by R = α1/(α1 + α2). Again, the proposed method outperformed the other two methods even when the sample sizes were small. In the equal scale parameter case, when sample sizes were small or when R was close to the boundary values, we used the penalized likelihood method to obtain the constrained maximum likelihood estimates. Further simulations, not reported here, show the same pattern of results and are available upon request.

5. Conclusions

A penalized likelihood method is proposed to overcome numerical difficulties that arose in using the Lagrange method to solve an optimization problem with an integral constraint. The proposed method is used to obtain constrained maximum likelihood estimates for the stress-strength reliability with independent exponentiated exponential distributions. In turn, these estimates are used to obtain inference for the stress-strength reliability via a likelihood-based third-order asymptotic method. In our simulation studies, the penalized likelihood method did not encounter any numerical difficulties in obtaining the constrained maximum likelihood estimates. Moreover, the likelihood-based third-order asymptotic method gives very accurate results even when the sample sizes are small. Although the paper is restricted to independent exponentiated exponential distributions, the method can easily be extended to other distributions.

Acknowledgments

The authors would like to thank two anonymous referees for their very helpful comments and suggestions. The corresponding author gratefully acknowledges the support of NSERC through his Discovery Grant.

Appendix

In this appendix, we derive the properties of the unconstrained likelihood function that we stated in Section 2 of the paper. Some or all of these results may be known in the literature but we have not found such a source.
Let x = (x1,…, xn) and y = (y1,…, ym) be independent random samples from EE(α1, β1) and EE(α2, β2) respectively. Then the log-likelihood function is:
$$l(\theta; x, y) = l(\alpha_1, \beta_1, \alpha_2, \beta_2; x, y) = n \log \alpha_1 + n \log \beta_1 + (\alpha_1 - 1) \sum_{i=1}^{n} \log(1 - e^{-\beta_1 x_i}) - \beta_1 \sum_{i=1}^{n} x_i + m \log \alpha_2 + m \log \beta_2 + (\alpha_2 - 1) \sum_{j=1}^{m} \log(1 - e^{-\beta_2 y_j}) - \beta_2 \sum_{j=1}^{m} y_j$$
where θ = (α1, α2, β1, β2). We note that, because of the independence of X and Y, the log-likelihood function is the sum of two functions that are identical except for the parameter and variable names. Parameters of one function do not enter the second function. For that reason, we limit our analysis to the generic log-likelihood function:
$$l(\alpha, \beta; x) = n \log \alpha + n \log \beta + (\alpha - 1) \sum_{i=1}^{n} \log(1 - e^{-\beta x_i}) - \beta \sum_{i=1}^{n} x_i \qquad (18)$$
The first and second partial derivatives are given, using subscript notation, as:
$$l_\alpha = \frac{n}{\alpha} + \sum_{i=1}^{n} \log(1 - e^{-\beta x_i}) \qquad (19)$$
$$l_\beta = \frac{n}{\beta} + (\alpha - 1) \sum_{i=1}^{n} \frac{x_i e^{-\beta x_i}}{1 - e^{-\beta x_i}} - \sum_{i=1}^{n} x_i \qquad (20)$$
$$l_{\alpha\alpha} = -\frac{n}{\alpha^2} \qquad (21)$$
$$l_{\beta\beta} = -\frac{n}{\beta^2} - (\alpha - 1) \sum_{i=1}^{n} \frac{x_i^2 e^{-\beta x_i}}{(1 - e^{-\beta x_i})^2} \qquad (22)$$
$$l_{\alpha\beta} = \sum_{i=1}^{n} \frac{x_i e^{-\beta x_i}}{1 - e^{-\beta x_i}} \qquad (23)$$
It is clear from (21) that lαα < 0, as stated in Section 2. In order to establish that lββ < 0, it is convenient to rewrite Equation (22) as:
$$l_{\beta\beta} = -\frac{n}{\beta^2} + \sum_{i=1}^{n} \frac{x_i^2 e^{-\beta x_i}}{(1 - e^{-\beta x_i})^2} - \alpha \sum_{i=1}^{n} \frac{x_i^2 e^{-\beta x_i}}{(1 - e^{-\beta x_i})^2} \qquad (24)$$
The first and third terms in Equation (24) are negative but the second term is positive. However, it is possible to show that the sum of the first two terms is negative. Note first that $n = \frac{(1 - e^{-\beta x_1})^2}{(1 - e^{-\beta x_1})^2} + \cdots + \frac{(1 - e^{-\beta x_n})^2}{(1 - e^{-\beta x_n})^2}$. This allows us to rewrite Equation (24) as:
$$l_{\beta\beta} = \frac{1}{\beta^2} \left( \sum_{i=1}^{n} \frac{\beta^2 x_i^2 e^{-\beta x_i} - (1 - e^{-\beta x_i})^2}{(1 - e^{-\beta x_i})^2} \right) - \alpha \sum_{i=1}^{n} \frac{x_i^2 e^{-\beta x_i}}{(1 - e^{-\beta x_i})^2} \qquad (25)$$
A representative numerator term from the first expression is $\beta^2 x_i^2 e^{-\beta x_i} - (1 - e^{-\beta x_i})^2$, which is a difference of squares and can be re-expressed as:
$$\beta^2 x_i^2 e^{-\beta x_i} - (1 - e^{-\beta x_i})^2 = \left( \beta x_i e^{-0.5 \beta x_i} - (1 - e^{-\beta x_i}) \right) \left( \beta x_i e^{-0.5 \beta x_i} + (1 - e^{-\beta x_i}) \right)$$
The second product term on the right-hand side is the sum of two positive terms and is therefore positive. The first product term is always negative when β > 0 and xi > 0. This is because
$$\beta x_i e^{-0.5 \beta x_i} - (1 - e^{-\beta x_i}) = e^{-0.5 \beta x_i} \left[ \beta x_i - \left( e^{0.5 \beta x_i} - e^{-0.5 \beta x_i} \right) \right]$$
and a Taylor expansion shows that $e^{0.5 \beta x_i} - e^{-0.5 \beta x_i} = \beta x_i + (\beta x_i)^3/24 + \cdots > \beta x_i$ for β > 0 and xi > 0, so that $\beta x_i - (e^{0.5 \beta x_i} - e^{-0.5 \beta x_i}) < 0$ and hence $\beta x_i e^{-0.5 \beta x_i} - (1 - e^{-\beta x_i}) < 0$. This completes the demonstration that lββ < 0 whenever β > 0 and xi > 0.
The log-likelihood function, l(α, β; x), is not a concave function of (α, β) over the domain of the function. However, it is possible to show that when n > 2, there is a unique maximum of l(α, β; x) that occurs where lα = 0 and lβ = 0. We begin by showing that there is a unique solution to lα = 0 and lβ = 0.
Equation (19) can be solved for α as:
$$\alpha(\beta) = -\frac{n}{\sum_{i=1}^{n} \log(1 - e^{-\beta x_i})} \qquad (26)$$
From (26) it is clear that α(β) is monotone increasing and that $\lim_{\beta \to 0} \alpha(\beta) = 0$. Substituting α(β) into Equation (20) yields:
$$\frac{n}{\beta} - \frac{n}{\sum_{i=1}^{n} \log(1 - e^{-\beta x_i})} \sum_{i=1}^{n} \frac{x_i e^{-\beta x_i}}{1 - e^{-\beta x_i}} - \sum_{i=1}^{n} \frac{x_i e^{-\beta x_i}}{1 - e^{-\beta x_i}} - \sum_{i=1}^{n} x_i = 0 \qquad (27)$$
The first three terms on the left-hand side define a function of β, call it lhs(β), with the property that lhs(β) is monotone decreasing with $\lim_{\beta \to 0} \mathrm{lhs}(\beta) = +\infty$ and $\lim_{\beta \to \infty} \mathrm{lhs}(\beta) = 0$. The fact that $\lim_{\beta \to \infty} \mathrm{lhs}(\beta) = 0$ is clear by inspection. As well, it is clear that the second term on the left-hand side approaches +∞ as β → 0. What is less obvious is that the sum of the first and third terms limits to $\frac{1}{2} \sum_{i=1}^{n} x_i$ as β → 0. However, this can be established by writing the sum of the first and third terms as $\frac{1}{\beta}\left( n - \sum_{i=1}^{n} \frac{\beta x_i e^{-\beta x_i}}{1 - e^{-\beta x_i}} \right)$. If we make the substitution $n = \frac{1 - e^{-\beta x_1}}{1 - e^{-\beta x_1}} + \cdots + \frac{1 - e^{-\beta x_n}}{1 - e^{-\beta x_n}}$ we obtain $\frac{1}{\beta}\left( n - \sum_{i=1}^{n} \frac{\beta x_i e^{-\beta x_i}}{1 - e^{-\beta x_i}} \right) = \frac{1}{\beta} \sum_{i=1}^{n} \frac{1 - e^{-\beta x_i} - \beta x_i e^{-\beta x_i}}{1 - e^{-\beta x_i}}$. The result above follows when the exponential terms are expressed as series and the limit is taken as β → 0. Alternatively, l'Hôpital's rule can be applied to each term of the sum.
Given that $\sum_{i=1}^{n} x_i > 0$, because X is a positive random variable, we see that there is a unique solution to Equation (27).
The final task in this Appendix is to show that the unique point (α, β)*, where the gradient vanishes, is the global maximum of the likelihood function. This can be accomplished by showing that the determinant of the Hessian matrix of l(α, β; x) is strictly positive at (α, β)*. Along the path α = α(β) defined by Equation (26), the determinant of the Hessian matrix, call it D, can be expressed equivalently as:
$$D = l_{\alpha\alpha} \left( A_1 + A_2 \right)$$
where:
$$A_1 = -\frac{n}{\beta^2} + \sum_{i=1}^{n} \frac{x_i^2 e^{-\beta x_i}}{(1 - e^{-\beta x_i})^2}$$
and
$$A_2 = B \sum_{i=1}^{n} \log(1 - e^{-\beta x_i}) \sum_{i=1}^{n} \frac{x_i^2 e^{-\beta x_i}}{(1 - e^{-\beta x_i})^2} + B \left( \sum_{i=1}^{n} \frac{x_i e^{-\beta x_i}}{1 - e^{-\beta x_i}} \right)^2$$
with $B = n \left( \sum_{i=1}^{n} \log(1 - e^{-\beta x_i}) \right)^{-2} > 0$. Since lαα < 0 everywhere, to establish that D > 0 it is sufficient to show that A1 < 0 and A2/B < 0, so that A1 + A2 < 0. Note that the first two terms in Equation (24) are exactly the same as A1, and we proved that the sum of those terms was negative in the discussion following Equation (24). It remains only to prove that:
$$\sum_{i=1}^{n} \log(1 - e^{-\beta x_i}) \sum_{i=1}^{n} \frac{x_i^2 e^{-\beta x_i}}{(1 - e^{-\beta x_i})^2} + \left( \sum_{i=1}^{n} \frac{x_i e^{-\beta x_i}}{1 - e^{-\beta x_i}} \right)^2 < 0 \qquad (31)$$
Recalling the inequality log(s) < s − 1, and therefore that log(1 − e−t) < −e−t < 0 for all t > 0, we may replace $\sum_{i=1}^{n} \log(1 - e^{-\beta x_i})$ by $-\sum_{i=1}^{n} e^{-\beta x_i}$ and obtain:
$$-\sum_{i=1}^{n} e^{-\beta x_i} \sum_{i=1}^{n} \frac{x_i^2 e^{-\beta x_i}}{(1 - e^{-\beta x_i})^2} + \left( \sum_{i=1}^{n} \frac{x_i e^{-\beta x_i}}{1 - e^{-\beta x_i}} \right)^2 < 0 \qquad (32)$$
and note that the inequality in Equation (31) will hold whenever the inequality in Equation (32) holds. The Cauchy-Schwarz inequality establishes that $\left( \sum_{i=1}^{n} u_i v_i \right)^2 \le \sum_{i=1}^{n} u_i^2 \sum_{i=1}^{n} v_i^2$. Equality holds only when $u_i / v_i = K$ for all i, where K is a constant different from 0, or when ui = 0 and/or vi = 0 for all i. Setting $u_i = e^{-0.5 \beta x_i}$ and $v_i = x_i e^{-0.5 \beta x_i} / (1 - e^{-\beta x_i})$, we see that the Cauchy-Schwarz inequality establishes the strict inequality in Equation (32), and therefore the determinant of the Hessian matrix of the log-likelihood function is positive at (α, β)*.

In fact, we have proved a stronger result. Suppose that we concentrate the log-likelihood function by substituting α(β) from Equation (26) into Equation (18). This function shows how the log-likelihood varies with β when α is always chosen according to lα = 0, as in Equation (26). The second derivative of the concentrated log-likelihood function with respect to β is $A_1 + A_2 = D / l_{\alpha\alpha}$, a positive multiple of the negative of the determinant of the Hessian of the log-likelihood function. We have proved that A1 + A2 is negative without requiring that β satisfy lβ = 0 as in Equation (27). As such, the concentrated log-likelihood function must be strictly concave as a function of β. We therefore have a straightforward algorithm for finding the maximum of the log-likelihood function: use a simple line search to find the unique β* that satisfies Equation (27), which is guaranteed to converge, and then obtain the corresponding unique α* from Equation (26). Alternatively, a search algorithm using second derivative information from the log-likelihood function could fail. Yee [19] notes possible problems in estimating EE models by maximum likelihood using the expexp() function in the VGAM package in R.
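This line-search algorithm is a few lines in R; the bracketing interval passed to uniroot() is an illustrative choice, and x here is a sample from a single EE distribution.

```r
# Concentrated-likelihood estimation for one EE(alpha, beta) sample:
# alpha from Equation (26), then a line search for beta* in Equation (27)
alpha_of_beta <- function(beta, x) -length(x) / sum(log(1 - exp(-beta * x)))

score_beta <- function(beta, x) {    # l_beta evaluated along alpha = alpha(beta)
  w <- x * exp(-beta * x) / (1 - exp(-beta * x))
  length(x) / beta + (alpha_of_beta(beta, x) - 1) * sum(w) - sum(x)
}

x <- c(2.1828, 0.5911, 1.0711, 0.9007, 1.7814, 1.3616,
       0.8629, 0.2301, 1.5183, 0.8481, 1.0845)   # Section 4.2 data
beta_star  <- uniroot(score_beta, interval = c(1e-3, 50), x = x)$root
alpha_star <- alpha_of_beta(beta_star, x)
```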

Author Contributions

The paper is the result of a joint effort by all of the authors, who contributed in equal ways to conceiving of the idea, carrying out all numerical calculations and writing the paper. All authors have read and approved the final manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bertsekas, D.P. Constrained Optimization and Lagrange Multiplier Methods; Athena Scientific: Belmont, MA, USA, 1996. [Google Scholar]
  2. Byrne, C.L. Sequential Unconstrained Minimization: A Survey. Available online: http://faculty.uml.edu/cbyrne/SUM.pdf (accessed on 11 June 2015).
  3. Smith, J.B.; Wong, A.; Zhou, X. Higher order inference for stress-strength reliability with independent Burr-type X distributions. J. Stat. Comput. Simul. 2014. [Google Scholar] [CrossRef]
  4. Gupta, R.D.; Kundu, D. Exponentiated exponential family: An alternative to gamma and Weibull distributions. Biom. J. 2001, 43, 117–130. [Google Scholar]
  5. Mudholkar, G.S.; Srivastava, D.K. Exponentiated Weibull family for analyzing bathtub failure-rate data. IEEE Trans. Reliab. 1993, 42, 299–302. [Google Scholar]
  6. Kundu, D.; Gupta, R.D. Estimation of P [Y < X] for generalized exponential distribution. Metrika 2005, 61, 291–308. [Google Scholar]
  7. Raqab, M.Z.; Madi, M.T.; Kundu, D. Estimation of P [Y < X] for the three-parameter generalized exponential distribution. Commun. Stat. Theory Methods 2008, 37, 2854–2864. [Google Scholar]
  8. Fraser, D.A.S.; Reid, N. Ancillaries and third order significance. Util. Math 1995, 47, 33–35. [Google Scholar]
  9. Chang, F.; Wong, A. Improved likelihood-based inference for the stationary AR(2) model. J. Stat. Plan. Inference 2010, 140, 2099–2110. [Google Scholar]
  10. She, Y.; Wong, A.; Zhou, X. Revisit the two sample t-test with a known ratio of variances. Open J. Stat. 2011, 1, 151–156. [Google Scholar]
  11. Byrd, R.H.; Lu, P.; Nocedal, J.; Zhu, C. A limited memory algorithm for bound constrained optimization. SIAM J. Sci. Comput. 1995, 16, 1190–1208. [Google Scholar]
  12. Cox, D.R.; Hinkley, D.V. Theoretical Statistics; Chapman and Hall: London, UK, 1974. [Google Scholar]
  13. Doganaksoy, N.; Schmee, J. Comparisons of approximate confidence intervals for distributions used in life-data analysis. Technometrics 1993, 35, 175–184. [Google Scholar]
  14. Reid, N. Likelihood and higher-order approximations to tail areas: A review and annotated bibliography. Can. J. Stat. 1996, 24, 141–166. [Google Scholar]
  15. Severini, T. Likelihood Methods in Statistics; Oxford University Press: New York, NY, USA, 2000. [Google Scholar]
  16. Barndorff-Nielsen, O.E. Inference on full or partial parameters based on the standardized signed log likelihood ratio. Biometrika 1986, 73, 307–322. [Google Scholar]
  17. Barndorff-Nielsen, O.E. Modified signed log likelihood ratio. Biometrika 1991, 78, 557–563. [Google Scholar]
  18. Fraser, D.A.S.; Reid, N.; Wu, J. A simple general formula for tail probabilities for frequentist and Bayesian inference. Biometrika 1999, 86, 249–264. [Google Scholar]
  19. Yee, T.W. Vector Generalized Linear and Additive Models: With an Implementation in R; Springer: Berlin/Heidelberg, Germany, 2015. [Google Scholar]
Table 1. Dataset1: Two Simulated Datasets.

Data Set | Observations | Sample Size
X | 0.4977, 0.6414, 0.0781, 0.2669, 0.3827, 0.1978, 0.2694, 0.1968, 0.4125, 0.2397 | 10
Y | 1.7057, 1.9481, 1.0191, 2.1290, 0.5899, 0.8109, 0.9031, 1.6463, 0.9207, 1.9842 | 10
Table 2. Parameter Estimates for Dataset1.

Method | α1 | α2 | β1 | β2 | R | Log-likelihood
Unconstrained | 4.4239 | 8.5227 | 6.7793 | 2.0262 | 0.0142 | −3.3191
Constrained: Penalty | 3.6028 | 3.2018 | 5.2707 | 1.5700 | 0.0989 | −5.1659
Table 3. Interval Estimates of ψ for the Example.

Method | 90% Confidence Interval | 95% Confidence Interval
MLE | (0.4223, 0.8179) | (0.3843, 0.8557)
r | (0.4151, 0.7966) | (0.3767, 0.8241)
Proposed | (0.4080, 0.7910) | (0.3698, 0.8188)
Table 4. β1 ≠ β2, α1 = 2, α2 = 5, β1 = 3 and β2 is obtained by Equation (2); (n, m) = (10, 10).

R | Method | Lower Error | Upper Error | Central Coverage | Average Bias
0.1 | MLE | 0.1602 | 0.0033 | 0.8365 | 0.07845
0.1 | r | 0.0401 | 0.0177 | 0.9422 | 0.01120
0.1 | Proposed | 0.0207 | 0.0252 | 0.9541 | 0.00255
0.2 | MLE | 0.1138 | 0.0125 | 0.8737 | 0.05065
0.2 | r | 0.0388 | 0.0235 | 0.9377 | 0.00765
0.2 | Proposed | 0.0218 | 0.0258 | 0.9524 | 0.00200
0.3 | MLE | 0.0857 | 0.0225 | 0.8918 | 0.03160
0.3 | r | 0.0372 | 0.0262 | 0.9366 | 0.00670
0.3 | Proposed | 0.0230 | 0.0259 | 0.9527 | 0.00135
0.4 | MLE | 0.0656 | 0.0362 | 0.8982 | 0.02590
0.4 | r | 0.0352 | 0.0294 | 0.9354 | 0.00730
0.4 | Proposed | 0.0244 | 0.0259 | 0.9497 | 0.00075
0.5 | MLE | 0.0505 | 0.0506 | 0.8989 | 0.02555
0.5 | r | 0.0317 | 0.0328 | 0.9355 | 0.00725
0.5 | Proposed | 0.0249 | 0.0255 | 0.9496 | 0.00030
0.6 | MLE | 0.0353 | 0.0670 | 0.8977 | 0.02615
0.6 | r | 0.0290 | 0.0359 | 0.9351 | 0.00745
0.6 | Proposed | 0.0244 | 0.0246 | 0.9510 | 0.00050
0.7 | MLE | 0.0235 | 0.0900 | 0.8865 | 0.03325
0.7 | r | 0.0257 | 0.0394 | 0.9349 | 0.00755
0.7 | Proposed | 0.0246 | 0.0238 | 0.9516 | 0.00080
0.8 | MLE | 0.0142 | 0.1234 | 0.8624 | 0.05460
0.8 | r | 0.0239 | 0.0419 | 0.9342 | 0.00900
0.8 | Proposed | 0.0261 | 0.0239 | 0.9500 | 0.00110
0.9 | MLE | 0.0035 | 0.1763 | 0.8202 | 0.08640
0.9 | r | 0.0198 | 0.0465 | 0.9337 | 0.01335
0.9 | Proposed | 0.0262 | 0.0240 | 0.9498 | 0.00110
Table 5. β1 ≠ β2, α1 = 2, α2 = 5, β1 = 3 and β2 is obtained by Equation (2); (n, m) = (10, 50).

R | Method | Lower Error | Upper Error | Central Coverage | Average Bias
0.1 | MLE | 0.1341 | 0.0046 | 0.8613 | 0.06475
0.1 | r | 0.0399 | 0.0186 | 0.9415 | 0.01065
0.1 | Proposed | 0.0231 | 0.0246 | 0.9523 | 0.00115
0.2 | MLE | 0.1007 | 0.0131 | 0.8862 | 0.04380
0.2 | r | 0.0370 | 0.0243 | 0.9387 | 0.00635
0.2 | Proposed | 0.0239 | 0.0251 | 0.9510 | 0.00060
0.3 | MLE | 0.0774 | 0.0249 | 0.8977 | 0.02625
0.3 | r | 0.0349 | 0.0289 | 0.9362 | 0.00690
0.3 | Proposed | 0.0228 | 0.0270 | 0.9502 | 0.00210
0.4 | MLE | 0.0615 | 0.0353 | 0.9032 | 0.02340
0.4 | r | 0.0327 | 0.0311 | 0.9362 | 0.00690
0.4 | Proposed | 0.0220 | 0.0260 | 0.9520 | 0.00200
0.5 | MLE | 0.0475 | 0.0488 | 0.9037 | 0.02315
0.5 | r | 0.0294 | 0.0323 | 0.9383 | 0.00585
0.5 | Proposed | 0.0229 | 0.0240 | 0.9531 | 0.00155
0.6 | MLE | 0.0351 | 0.0682 | 0.8967 | 0.02665
0.6 | r | 0.0278 | 0.0348 | 0.9374 | 0.00630
0.6 | Proposed | 0.0222 | 0.0222 | 0.9556 | 0.00280
0.7 | MLE | 0.0225 | 0.0962 | 0.8813 | 0.03685
0.7 | r | 0.0256 | 0.0388 | 0.9356 | 0.00720
0.7 | Proposed | 0.0226 | 0.0227 | 0.9547 | 0.00235
0.8 | MLE | 0.0126 | 0.1249 | 0.8625 | 0.05615
0.8 | r | 0.0217 | 0.0431 | 0.9352 | 0.01070
0.8 | Proposed | 0.0224 | 0.0242 | 0.9534 | 0.00170
0.9 | MLE | 0.0063 | 0.1779 | 0.8158 | 0.08580
0.9 | r | 0.0168 | 0.0473 | 0.9359 | 0.01525
0.9 | Proposed | 0.0215 | 0.0238 | 0.9547 | 0.00235
Table 6. β1 ≠ β2, α1 = 2, α2 = 5, β1 = 3 and β2 is obtained by Equation (2); (n, m) = (50, 10).

R | Method | Lower Error | Upper Error | Central Coverage | Average Bias
0.1 | MLE | 0.1257 | 0.0046 | 0.8697 | 0.06055
0.1 | r | 0.0392 | 0.0167 | 0.9441 | 0.01125
0.1 | Proposed | 0.0207 | 0.0203 | 0.9590 | 0.00450
0.2 | MLE | 0.0885 | 0.0154 | 0.8961 | 0.03655
0.2 | r | 0.0347 | 0.0223 | 0.9430 | 0.00620
0.2 | Proposed | 0.0226 | 0.0226 | 0.9548 | 0.00240
0.3 | MLE | 0.0657 | 0.0234 | 0.9109 | 0.02115
0.3 | r | 0.0332 | 0.0252 | 0.9416 | 0.00420
0.3 | Proposed | 0.0221 | 0.0233 | 0.9546 | 0.00213
0.4 | MLE | 0.0497 | 0.0335 | 0.9168 | 0.01660
0.4 | r | 0.0317 | 0.0286 | 0.9397 | 0.00515
0.4 | Proposed | 0.0228 | 0.0241 | 0.9531 | 0.00155
0.5 | MLE | 0.0368 | 0.0428 | 0.9204 | 0.01480
0.5 | r | 0.0285 | 0.0309 | 0.9405 | 0.00475
0.5 | Proposed | 0.0222 | 0.0236 | 0.9542 | 0.00210
0.6 | MLE | 0.0264 | 0.0499 | 0.9237 | 0.01315
0.6 | r | 0.0248 | 0.0348 | 0.9422 | 0.00410
0.6 | Proposed | 0.0215 | 0.0231 | 0.9554 | 0.00270
0.7 | MLE | 0.0184 | 0.0595 | 0.9221 | 0.02055
0.7 | r | 0.0222 | 0.0332 | 0.9446 | 0.00550
0.7 | Proposed | 0.0206 | 0.0235 | 0.9559 | 0.00295
0.8 | MLE | 0.0112 | 0.0715 | 0.9173 | 0.03015
0.8 | r | 0.0186 | 0.0324 | 0.9490 | 0.00690
0.8 | Proposed | 0.0196 | 0.0212 | 0.9592 | 0.00460
0.9 | MLE | 0.0055 | 0.0927 | 0.9018 | 0.04360
0.9 | r | 0.0149 | 0.0291 | 0.9560 | 0.00710
0.9 | Proposed | 0.0183 | 0.0209 | 0.9608 | 0.00540
Table 7. α1 = 4, β = 8 and α2 satisfies R = α1/(α1 + α2); (n, m) = (10, 10).

R | Method | Lower Error | Upper Error | Central Coverage | Average Bias
0.1 | MLE | 0.1608 | 0.0024 | 0.8368 | 0.07920
0.1 | r | 0.0526 | 0.0189 | 0.9285 | 0.01685
0.1 | Proposed | 0.0236 | 0.0267 | 0.9497 | 0.00155
0.2 | MLE | 0.1195 | 0.0121 | 0.8684 | 0.05370
0.2 | r | 0.0502 | 0.0236 | 0.9262 | 0.01330
0.2 | Proposed | 0.0269 | 0.0264 | 0.9467 | 0.00165
0.3 | MLE | 0.0944 | 0.0224 | 0.8832 | 0.03600
0.3 | r | 0.0428 | 0.0256 | 0.9316 | 0.00920
0.3 | Proposed | 0.0225 | 0.0234 | 0.9541 | 0.00205
0.4 | MLE | 0.0724 | 0.0347 | 0.8929 | 0.02855
0.4 | r | 0.0387 | 0.0277 | 0.9336 | 0.00820
0.4 | Proposed | 0.0256 | 0.0224 | 0.9520 | 0.00160
0.5 | MLE | 0.0523 | 0.0517 | 0.8960 | 0.02700
0.5 | r | 0.0335 | 0.0328 | 0.9337 | 0.00815
0.5 | Proposed | 0.0244 | 0.0230 | 0.9526 | 0.00013
0.6 | MLE | 0.0378 | 0.0753 | 0.8869 | 0.03155
0.6 | r | 0.0295 | 0.0400 | 0.9305 | 0.00975
0.6 | Proposed | 0.0234 | 0.0260 | 0.9506 | 0.00130
0.7 | MLE | 0.0245 | 0.0944 | 0.8811 | 0.03495
0.7 | r | 0.0282 | 0.0454 | 0.9264 | 0.01180
0.7 | Proposed | 0.0261 | 0.0262 | 0.9477 | 0.00115
0.8 | MLE | 0.0109 | 0.1187 | 0.8704 | 0.05390
0.8 | r | 0.0226 | 0.0467 | 0.9307 | 0.01205
0.8 | Proposed | 0.0239 | 0.0262 | 0.9499 | 0.00115
0.9 | MLE | 0.0027 | 0.1464 | 0.8509 | 0.07185
0.9 | r | 0.0211 | 0.0499 | 0.9290 | 0.01440
0.9 | Proposed | 0.0268 | 0.0252 | 0.9480 | 0.00100
Table 8. α1 = 4, β = 8 and α2 satisfies R = α1/(α1 + α2); (n, m) = (10, 50).

R | Method | Lower Error | Upper Error | Central Coverage | Average Bias
0.1 | MLE | 0.0722 | 0.0093 | 0.9185 | 0.03145
0.1 | r | 0.0302 | 0.0273 | 0.9425 | 0.00375
0.1 | Proposed | 0.0250 | 0.0261 | 0.9489 | 0.00055
0.2 | MLE | 0.0577 | 0.0206 | 0.9217 | 0.01855
0.2 | r | 0.0287 | 0.0268 | 0.9445 | 0.00275
0.2 | Proposed | 0.0260 | 0.0244 | 0.9496 | 0.00080
0.3 | MLE | 0.0431 | 0.0327 | 0.9242 | 0.01290
0.3 | r | 0.0276 | 0.0306 | 0.9418 | 0.00410
0.3 | Proposed | 0.0261 | 0.0257 | 0.9482 | 0.00090
0.4 | MLE | 0.0299 | 0.0412 | 0.9289 | 0.01055
0.4 | r | 0.0216 | 0.0290 | 0.9494 | 0.00370
0.4 | Proposed | 0.0218 | 0.0227 | 0.9555 | 0.00275
0.5 | MLE | 0.0475 | 0.0488 | 0.9037 | 0.02315
0.5 | r | 0.0234 | 0.0358 | 0.9408 | 0.00620
0.5 | Proposed | 0.0243 | 0.0269 | 0.9488 | 0.00130
0.6 | MLE | 0.0166 | 0.0688 | 0.9146 | 0.02610
0.6 | r | 0.0221 | 0.0350 | 0.9429 | 0.00645
0.6 | Proposed | 0.0249 | 0.0254 | 0.9497 | 0.00250
0.7 | MLE | 0.0105 | 0.0789 | 0.9106 | 0.03420
0.7 | r | 0.0206 | 0.0366 | 0.9428 | 0.00800
0.7 | Proposed | 0.0276 | 0.0229 | 0.9495 | 0.00235
0.8 | MLE | 0.0059 | 0.1021 | 0.8920 | 0.04810
0.8 | r | 0.0195 | 0.0426 | 0.9379 | 0.01155
0.8 | Proposed | 0.0259 | 0.0269 | 0.9472 | 0.00140
0.9 | MLE | 0.0016 | 0.1210 | 0.8774 | 0.05970
0.9 | r | 0.0160 | 0.0456 | 0.9384 | 0.01480
0.9 | Proposed | 0.0239 | 0.0249 | 0.9512 | 0.00060
Table 9. α1 = 4, β = 8 and α2 satisfies R = α1/(α1 + α2); (n, m) = (50, 10).

R | Method | Lower Error | Upper Error | Central Coverage | Average Bias
0.1 | MLE | 0.1120 | 0.0008 | 0.8872 | 0.05560
0.1 | r | 0.0412 | 0.0161 | 0.9427 | 0.01255
0.1 | Proposed | 0.0230 | 0.0247 | 0.9523 | 0.00115
0.2 | MLE | 0.1011 | 0.0048 | 0.8941 | 0.04815
0.2 | r | 0.0407 | 0.0202 | 0.9319 | 0.01025
0.2 | Proposed | 0.0261 | 0.0260 | 0.9479 | 0.00105
0.3 | MLE | 0.0805 | 0.0108 | 0.9087 | 0.03485
0.3 | r | 0.0366 | 0.0232 | 0.9402 | 0.00670
0.3 | Proposed | 0.0248 | 0.0275 | 0.9477 | 0.00135
0.4 | MLE | 0.0648 | 0.0140 | 0.9212 | 0.02540
0.4 | r | 0.0338 | 0.0215 | 0.9447 | 0.00615
0.4 | Proposed | 0.0238 | 0.0247 | 0.9515 | 0.00075
0.5 | MLE | 0.0551 | 0.0258 | 0.9191 | 0.01545
0.5 | r | 0.0317 | 0.0256 | 0.9427 | 0.00365
0.5 | Proposed | 0.0242 | 0.0276 | 0.9482 | 0.00170
0.6 | MLE | 0.0437 | 0.0296 | 0.9267 | 0.01165
0.6 | r | 0.0329 | 0.0228 | 0.9443 | 0.00505
0.6 | Proposed | 0.0247 | 0.0229 | 0.9524 | 0.00120
0.7 | MLE | 0.0332 | 0.0400 | 0.9268 | 0.01160
0.7 | r | 0.0307 | 0.0247 | 0.9446 | 0.00300
0.7 | Proposed | 0.0257 | 0.0241 | 0.9502 | 0.00080
0.8 | MLE | 0.0228 | 0.0548 | 0.9224 | 0.01600
0.8 | r | 0.0294 | 0.0266 | 0.9440 | 0.00300
0.8 | Proposed | 0.0256 | 0.0242 | 0.9502 | 0.00070
0.9 | MLE | 0.0098 | 0.0670 | 0.9232 | 0.02860
0.9 | r | 0.0247 | 0.0299 | 0.9454 | 0.00260
0.9 | Proposed | 0.0234 | 0.0261 | 0.9505 | 0.00135
