Article

Generating Optimal Discrete Analogue of the Generalized Pareto Distribution under Bayesian Inference with Applications

by Hanan Haj Ahmad 1,* and Ehab M. Almetwally 2,3

1 Department of Basic Science, Preparatory Year Deanship, King Faisal University, Hofuf 31982, Saudi Arabia
2 Department of Statistical, Faculty of Business Administration, Delta University for Science and Technology, Gamasa 11152, Egypt
3 The Scientific Association for Studies and Applied Research, Al Manzalah 35646, Egypt
* Author to whom correspondence should be addressed.
Submission received: 20 June 2022 / Revised: 13 July 2022 / Accepted: 14 July 2022 / Published: 16 July 2022

Abstract

This paper studies three discretization methods to formulate discrete analogues of the well-known continuous generalized Pareto distribution. The generalized Pareto distribution provides a wide variety of probability spaces, which support threshold exceedances, and hence, it is suitable for modeling many failure time issues. Bayesian inference is applied to estimate the discrete models with different symmetric and asymmetric loss functions. The symmetric loss function being used is the squared error loss function, while the two asymmetric loss functions are the linear exponential and general entropy loss functions. A detailed simulation analysis was performed to compare the performance of the Bayesian estimation using the proposed loss functions. In addition, the applicability of the optimal discrete generalized Pareto distribution was compared with other discrete distributions. The comparison was based on different goodness-of-fit criteria. The results of the study reveal that the discretized generalized Pareto distribution is quite an attractive alternative to other discrete competitive distributions.

1. Introduction

The amount of available data has grown, demanding new statistical distributions to improve the description of each phenomenon or experiment under study. Most lifetime data are continuous in nature but discrete in observation, which creates a need for appropriate methods to discretize continuous distributions so that they fit these data better. Almost always, the observed values are in fact discrete because they are recorded to only a finite number of decimal places and cannot take every point in a continuum. In other cases, because of the limited accuracy of the measuring apparatus or the need to save space, continuous variables are recorded as frequencies of separate class intervals whose union covers the whole range of the random variable, and the multinomial law is used to model this situation. Therefore, treating them as discrete values is more appropriate. Even for a continuous life experiment, recording observations over intervals of time results in a discrete model, which seems more suitable than a continuous one.
Recently, many discrete distributions have been identified, particularly in reliability and survival analyses. For a special description and the role of discrete distributions, one may refer to [1,2,3,4,5,6,7,8], among others. Hence, many authors have conducted much work to originate and develop discrete reliability theory from various points of view.
The characterization of continuous random variables can be performed either by their probability density function (pdf), cumulative distribution function (CDF), moments, hazard rate functions, or others. Usually, creating a discrete analogue from a continuous distribution is based on the principle of preserving one or more characteristic properties of the continuous one. Consequently, different ways to discretize a continuous distribution appear in the literature, depending on the property the researcher aims to preserve (see, for example, [9,10]). In [11], the author provided an extensive survey of different discretization methods that preserve different functions.
Discretizing continuous random variables has several benefits: through discretization, data can be summarized and simplified, and they also become easier for researchers to understand, use, and explain (see [12]). Other tests appearing in the literature are suitable for both discrete and continuous distributions (see, for example, [13,14]).
Therefore, it is desirable to study a suitable discrete distribution created from the underlying continuous models.
In the present paper, we discretize the continuous generalized Pareto distribution (GPD) using three different discretization methods. Most authors have used a single discretization method, which is based on the survival function. In [6,7], the discrete normal and discrete Rayleigh distributions were introduced, respectively, using the survival discretization approach. Using the same approach, the discrete Burr type III distribution was studied in [15]. Additionally, [16] introduced the discrete additive Weibull distribution (see also [17,18,19,20,21,22,23]). Further, [24] discussed the discrete odd Perks-G class of distributions, [25] introduced a novel discrete distribution with an application to COVID-19, and [26] obtained a discrete Weibull Marshall–Olkin family of distributions. However, there remains a need to improve discrete models and generate new ones in order to describe and fit the large amounts of data that arise in daily life.
We aim to discretize the GPD since it has extensive applications and can model many real-life distributions. Recently, many authors have studied the continuous GPD; for example, one may refer to [27], in which the authors discussed baseline methods for parameter estimation. The authors of [28] performed statistical inference of the dynamic conditional GPD with weather and air quality factors, and [29] discussed outlier-robust truncated maximum likelihood parameter estimators of the GPD. Reference [30] introduced risk analysis using the GPD.
The originality of this work stems from the fact that no earlier research has applied the suggested discretization methods to the GPD and compared them from a Bayesian point of view. Symmetric and asymmetric loss functions are employed in the Bayesian estimation method with different parameter values. The main objective of this paper is therefore to illustrate the efficiency and performance of discrete generalized Pareto distributions (DGPDs) for modeling COVID-19 daily death cases.
The rest of this paper is organized as follows: Section 2 contains the model description and the discretization methods. Section 3 presents Bayesian inference for unknown parameters, and both point and interval estimations are performed for the three DGPDs. In Section 4, the simulation study is described. Real data examples are provided in Section 5. Finally, conclusions are provided in Section 6.

2. Model Description and Discretization Methods

The generalized Pareto distribution is a continuous distribution with two parameters. However, its continuous form is limited in characterizing discrete data. Discretizing the GPD therefore produces a distribution that accommodates count data while preserving the vital tail-modeling feature of the GPD. In this paper, we construct three discrete versions of the two-parameter GPD and use these counterparts to model real-life data.
The probability density function (pdf) of the continuous GPD is given as
$$ f(x;\theta,\lambda) = \begin{cases} \dfrac{1}{\lambda}\left(1+\dfrac{\theta}{\lambda}x\right)^{-\left(1+\frac{1}{\theta}\right)}, & \theta \neq 0, \\ \dfrac{1}{\lambda}e^{-x/\lambda}, & \theta = 0, \end{cases} \tag{1} $$
and the cumulative distribution function (CDF) is given by
$$ F(x;\theta,\lambda) = \begin{cases} 1-\left(1+\dfrac{\theta}{\lambda}x\right)^{-\frac{1}{\theta}}, & \theta \neq 0, \\ 1-e^{-x/\lambda}, & \theta = 0, \end{cases} \tag{2} $$
where λ > 0 is the scale parameter and θ is the shape parameter, −∞ < θ < ∞. The support of x depends on the sign of θ: when θ > 0, x > 0, and when θ < 0, the support is bounded, i.e., 0 < x < −λ/θ. For θ > 0, the GPD is the well-known Pareto distribution. When θ → 0, the GPD reduces to the exponential distribution, as shown in Equation (1).
The GPD has a mean of λ/(1 − θ) and a variance of $\lambda^2 / \left[(1-\theta)^2(1-2\theta)\right]$, provided θ < 0.5. The survival function S(x; θ, λ) and the hazard rate function h(x; θ, λ) are given, respectively, as follows:
$$ S(x;\theta,\lambda) = \left(1+\frac{\theta x}{\lambda}\right)^{-\frac{1}{\theta}}, \tag{3} $$
and
$$ h(x;\theta,\lambda) = \frac{1}{\lambda}\left(1+\frac{\theta}{\lambda}x\right)^{-1}. \tag{4} $$
The three discretization methods are presented in the next subsections. The first method aims to preserve the survival function, while the second method preserves the pdf, and the third method preserves the hazard rate.
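The four continuous GPD functions above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' R code; the function names are hypothetical, and values outside the support are mapped to zero as a convenience.

```python
import math

def gpd_survival(x, theta, lam):
    """S(x) = (1 + theta*x/lam)^(-1/theta); exponential case when theta == 0."""
    if theta == 0.0:
        return math.exp(-x / lam)
    base = 1.0 + theta * x / lam
    if base <= 0.0:  # x beyond the upper bound -lam/theta when theta < 0
        return 0.0
    return base ** (-1.0 / theta)

def gpd_cdf(x, theta, lam):
    """F(x) = 1 - S(x)."""
    return 1.0 - gpd_survival(x, theta, lam)

def gpd_pdf(x, theta, lam):
    """f(x) = (1/lam) * (1 + theta*x/lam)^(-(1 + 1/theta))."""
    if theta == 0.0:
        return math.exp(-x / lam) / lam
    base = 1.0 + theta * x / lam
    if base <= 0.0:
        return 0.0
    return base ** (-(1.0 + 1.0 / theta)) / lam

def gpd_hazard(x, theta, lam):
    """h(x) = f(x)/S(x) = 1/(lam + theta*x), valid inside the support."""
    return 1.0 / (lam + theta * x)
```

The ratio f(x)/S(x) simplifies to 1/(λ + θx), which is the hazard rate in Equation (4) written over a common denominator.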

2.1. Survival Discretization Method

The probability mass function (pmf) of a discrete distribution is defined by [6,7] as follows:
$$ P(X=k) = S(k) - S(k+1), \quad k = 0, 1, 2, \ldots \tag{5} $$
where S(x) is the survival function given by Equation (3). Hence, the pmf of the first discrete generalized Pareto distribution (DGPD1) is
$$ P(X=k) = \left(1+\frac{\theta k}{\lambda}\right)^{-\frac{1}{\theta}} - \left(1+\frac{\theta (k+1)}{\lambda}\right)^{-\frac{1}{\theta}}. \tag{6} $$
The CDF of the DGPD1 distribution in the survival discretization method can be written as:
$$ P(X \le k) = F(k+1) = 1 - \left(1+\frac{\theta (k+1)}{\lambda}\right)^{-\frac{1}{\theta}}. \tag{7} $$
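As a small numerical illustration of the survival discretization, the DGPD1 pmf and CDF can be sketched as follows. The function names are hypothetical and θ > 0 is assumed for simplicity; the sketch is not taken from the authors' code.

```python
def gpd_survival(x, theta, lam):
    """Continuous GPD survival function S(x), assuming theta > 0."""
    return (1.0 + theta * x / lam) ** (-1.0 / theta)

def dgpd1_pmf(k, theta, lam):
    """P(X = k) = S(k) - S(k + 1) for k = 0, 1, 2, ..."""
    return gpd_survival(k, theta, lam) - gpd_survival(k + 1, theta, lam)

def dgpd1_cdf(k, theta, lam):
    """P(X <= k) = 1 - S(k + 1)."""
    return 1.0 - gpd_survival(k + 1, theta, lam)
```

Because the pmf is a difference of consecutive survival values, the partial sums telescope: summing P(X = j) over j = 0, ..., k leaves S(0) − S(k + 1) = 1 − S(k + 1), which is the CDF.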

2.2. Methodology II

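In this method, the pmf of the discrete random variable is derived as an analogue of the continuous random variable with pdf f(x) as
$$ P(X=k) = \frac{f(k)}{\sum_{j=0}^{\infty} f(j)}, \quad k = 0, 1, 2, \ldots \tag{8} $$
For more details and examples of this method, one can refer to [11]. Applying this method to the continuous GPD, we obtain a second discrete distribution, namely, DGPD2. Accordingly, the pmf can be written as:
$$ P(X=k) = \frac{\left(1+\frac{\theta k}{\lambda}\right)^{-\left(\frac{1}{\theta}+1\right)}}{\left(\frac{\theta}{\lambda}\right)^{-\left(\frac{1}{\theta}+1\right)} \xi\left(1+\frac{1}{\theta}, \frac{\lambda}{\theta}\right)}, \quad k = 0, 1, 2, \ldots \tag{9} $$
The corresponding CDF is derived as
$$ P(X \le k) = \frac{1}{\left(\frac{\theta}{\lambda}\right)^{-\left(\frac{1}{\theta}+1\right)} \xi\left(1+\frac{1}{\theta}, \frac{\lambda}{\theta}\right)} \sum_{x=0}^{k} \left(1+\frac{\theta x}{\lambda}\right)^{-\left(\frac{1}{\theta}+1\right)}, \tag{10} $$
where $\xi(s,a) = \sum_{\iota=0}^{\infty} (\iota + a)^{-s}$ represents the Hurwitz zeta function.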
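The normalizing constant of DGPD2 can be evaluated numerically. The sketch below approximates the Hurwitz zeta function by a direct sum plus an Euler-Maclaurin tail correction, which is an assumption made here to keep the example dependency-free (a library routine such as SciPy's `scipy.special.zeta` could be used instead). Function names are hypothetical, and θ > 0 is assumed so that the series converges.

```python
import math

def hurwitz_zeta(s, a, n_terms=50):
    """Approximate sum_{i>=0} (i + a)^(-s) for s > 1 and a > 0.

    First n_terms terms are summed directly; the remainder is estimated by
    the leading Euler-Maclaurin terms (integral, half term, derivative term).
    """
    head = sum((i + a) ** (-s) for i in range(n_terms))
    b = n_terms + a
    tail = b ** (1.0 - s) / (s - 1.0) + 0.5 * b ** (-s) + s * b ** (-s - 1.0) / 12.0
    return head + tail

def dgpd2_pmf(k, theta, lam):
    """P(X = k) = f(k) / sum_j f(j), written with the Hurwitz zeta function."""
    s = 1.0 + 1.0 / theta
    norm = (theta / lam) ** (-s) * hurwitz_zeta(s, lam / theta)
    return (1.0 + theta * k / lam) ** (-s) / norm

def dgpd2_cdf(k, theta, lam):
    """P(X <= k) as a finite sum of pmf values."""
    return sum(dgpd2_pmf(x, theta, lam) for x in range(k + 1))
```

The identity behind the constant is that (1 + θj/λ) = (θ/λ)(j + λ/θ), so the infinite sum of f(j) factors into (θ/λ) raised to the power −(1/θ + 1) times the Hurwitz zeta function evaluated at s = 1 + 1/θ and a = λ/θ.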

2.3. Methodology III (Hazard Rate)

This methodology preserves the hazard rate function. It is performed as a two-stage method. In the first stage, the continuous random variable X with CDF F(x) defined on [0, +∞) is used to construct a new continuous random variable $X_1$ with the hazard rate function $h_{X_1}(x) = e^{-F(x)}$, x ≥ 0. For more details about this methodology, a good reference is [11]. The survival function of the discrete analogue Y is given by
$$ P(Y \ge k) = \left(1-h_{X_1}(1)\right)\left(1-h_{X_1}(2)\right)\cdots\left(1-h_{X_1}(k-1)\right), \quad k = 1, 2, \ldots, m. \tag{11} $$
The corresponding pmf is then given by
$$ P(Y=k) = \begin{cases} h_{X_1}(0), & k = 0, \\ \left(1-h_{X_1}(1)\right)\left(1-h_{X_1}(2)\right)\cdots\left(1-h_{X_1}(k-1)\right) h_{X_1}(k), & k = 1, 2, \ldots, m, \\ 0, & \text{otherwise}. \end{cases} $$
Note that the upper limit m of the range of Y (m need not be finite) is determined so that the condition $0 \le h_{X_1}(y) \le 1$ is satisfied.
For the GPD model, the hazard rate function of $X_1$ is $h_{X_1}(y) = e^{-1+\left(1+\frac{\theta y}{\lambda}\right)^{-\frac{1}{\theta}}}$; hence, the above condition holds. The survival function in Equation (11) for the third version of the discrete GP distribution (DGPD3) is
$$ P(Y \ge k) = \prod_{i=1}^{k-1}\left(1-e^{-1+\left(1+\frac{\theta i}{\lambda}\right)^{-\frac{1}{\theta}}}\right). $$
Therefore, the CDF is
$$ P(Y < k) = 1-\prod_{i=1}^{k-1}\left(1-e^{-1+\left(1+\frac{\theta i}{\lambda}\right)^{-\frac{1}{\theta}}}\right). $$
The corresponding pmf is then given by
$$ P(Y=k) = \begin{cases} 1, & k = 0, \\ e^{-1+\left(1+\frac{\theta k}{\lambda}\right)^{-\frac{1}{\theta}}} \displaystyle\prod_{i=1}^{k-1}\left(1-e^{-1+\left(1+\frac{\theta i}{\lambda}\right)^{-\frac{1}{\theta}}}\right), & k = 1, 2, \ldots, m. \end{cases} \tag{13} $$
In Figure 1, Figure 2 and Figure 3, the pmfs of DGPD1, DGPD2, and DGPD3 are plotted, respectively, for different parameter values. They possess a decreasing trend with different selected parameter values.
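For the hazard-rate construction, the product form of the DGPD3 survival function and the pmf for k ≥ 1 can be sketched as below. The sketch is restricted to k ≥ 1 and θ > 0, and the function names are hypothetical rather than taken from the authors' implementation.

```python
import math

def h_x1(y, theta, lam):
    """First-stage hazard h_{X1}(y) = exp(-F(y)) = exp(-1 + (1 + theta*y/lam)^(-1/theta))."""
    return math.exp(-1.0 + (1.0 + theta * y / lam) ** (-1.0 / theta))

def dgpd3_survival(k, theta, lam):
    """P(Y >= k) = prod_{i=1}^{k-1} (1 - h_{X1}(i)) for k >= 1 (empty product is 1)."""
    prod = 1.0
    for i in range(1, k):
        prod *= 1.0 - h_x1(i, theta, lam)
    return prod

def dgpd3_pmf(k, theta, lam):
    """P(Y = k) for k >= 1: probability of reaching k times the hazard at k."""
    return dgpd3_survival(k, theta, lam) * h_x1(k, theta, lam)
```

For k ≥ 1 the two functions are linked by P(Y ≥ k) − P(Y ≥ k + 1) = P(Y ≥ k) h(k), so the pmf can be recovered from consecutive survival values.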

3. Parameter Estimation

In this section, we estimate the unknown parameters of the three versions of the DGPD using the Bayesian estimation method. Numerical techniques, such as the Markov chain Monte Carlo (MCMC) technique, are utilized for the Bayesian calculations.
In the Bayesian method, the parameters of the model are assumed to be random variables with a certain distribution called the prior distribution. Usually, the prior information is not available; hence, we need to specify a suitable choice of the prior. In this work, we decided to use a natural joint conjugate prior distribution for the parameters λ and θ, which is known as the modified Lwin prior; it is defined by assuming a gamma distribution for λ and the Pareto (I) distribution for θ. Hence,
$$ \lambda \sim \mathrm{Gamma}(a_1, b_1), $$
and
$$ \theta \mid \lambda \sim \mathrm{Pareto(I)}(\lambda a_2, b_2), $$
where $a_1$, $a_2$, $b_1$ and $b_2$ are nonnegative hyperparameters of the assumed distributions. The authors of [31] mentioned that it is more meaningful to express θ conditional on λ rather than vice versa. Moreover, they strongly believed that it is more appropriate to consider that the prior distributions for λ and θ are independent of each other.
Therefore, the prior distributions for λ and θ can be written as
$$ \pi_1(\lambda) = \frac{b_1^{a_1}}{\Gamma(a_1)} \lambda^{a_1-1} e^{-b_1\lambda}, $$
$$ \pi_2(\theta \mid \lambda) = \frac{\lambda a_2}{b_2}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda}. $$
Hence, the joint prior for λ and θ is
$$ \pi(\lambda,\theta) \propto \lambda^{a_1} e^{-b_1\lambda}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda}. \tag{14} $$
The joint posterior of λ and θ given the data is defined as
$$ p(\lambda,\theta \mid \underline{x}) = \frac{1}{K} L(\underline{x} \mid \lambda,\theta)\, \pi(\lambda,\theta), $$
where $L(\underline{x} \mid \lambda,\theta)$ is the likelihood function of the DGPD, $\pi(\lambda,\theta)$ is the joint prior given by Equation (14), and $K = \iint L(\underline{x} \mid \lambda,\theta)\, \pi(\lambda,\theta)\, d\lambda\, d\theta$.
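The step from the two prior densities to the joint prior in Equation (14) can be checked numerically: the product of the two priors should differ from the kernel in Equation (14) only by a factor that does not depend on λ or θ. The sketch below works on the log scale; the function names are hypothetical, and the restriction θ ≥ b2 reflects the usual Pareto (I) support, which is an assumption made here.

```python
import math

def log_pi1(lam, a1, b1):
    """log of the gamma prior density for lam."""
    return a1 * math.log(b1) - math.lgamma(a1) + (a1 - 1.0) * math.log(lam) - b1 * lam

def log_pi2(theta, lam, a2, b2):
    """log of the conditional prior (lam*a2/b2) * (theta/b2)^(-a2*lam)."""
    return math.log(lam * a2 / b2) - a2 * lam * math.log(theta / b2)

def log_joint_kernel(lam, theta, a1, a2, b1, b2):
    """log of lam^a1 * exp(-b1*lam) * (theta/b2)^(-a2*lam); -inf outside the support."""
    if lam <= 0.0 or theta < b2:  # assumed support: lam > 0, theta >= b2
        return -math.inf
    return a1 * math.log(lam) - b1 * lam - a2 * lam * math.log(theta / b2)

def log_constant(lam, theta, a1, a2, b1, b2):
    """Difference between the product of priors and the kernel (should be constant)."""
    product = log_pi1(lam, a1, b1) + log_pi2(theta, lam, a2, b2)
    return product - log_joint_kernel(lam, theta, a1, a2, b1, b2)
```

Evaluating `log_constant` at different (λ, θ) pairs should return the same value, the logarithm of the constant obtained by multiplying b1 to the power a1 by a2 and dividing by Γ(a1) times b2.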
The estimation for the parameters of the DGPD can be performed using different loss functions, such as (i) squared error (SE), (ii) LINEX, and (iii) general entropy (GE) loss functions. The performance of the estimators using the said loss functions was investigated using a simulation study. The bias, the mean square error (MSE), and the length of the credible interval were used as criteria for determining the superiority of the respective estimates.

3.1. Loss Functions

The following loss functions are used for posterior estimation.

3.1.1. Squared Error (SE) Loss Function

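Assuming the SE loss function, Bayesian estimation for the parameters λ and θ is defined as the mean or expected value with respect to the joint posterior:
$$ \hat{\lambda}_{SE} = \frac{1}{K} \iint \lambda\, L(\underline{x} \mid \lambda,\theta)\, \pi(\lambda,\theta)\, d\lambda\, d\theta, \tag{15} $$
and
$$ \hat{\theta}_{SE} = \frac{1}{K} \iint \theta\, L(\underline{x} \mid \lambda,\theta)\, \pi(\lambda,\theta)\, d\lambda\, d\theta. \tag{16} $$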

3.1.2. LINEX Loss Function

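With the LINEX loss function, Bayesian estimation for the parameters λ and θ is formulated as
$$ \hat{\lambda}_{LIN} = -\frac{1}{h} \ln\left[\frac{1}{K} \iint e^{-h\lambda} L(\underline{x} \mid \lambda,\theta)\, \pi(\lambda,\theta)\, d\lambda\, d\theta\right], \qquad \hat{\theta}_{LIN} = -\frac{1}{h} \ln\left[\frac{1}{K} \iint e^{-h\theta} L(\underline{x} \mid \lambda,\theta)\, \pi(\lambda,\theta)\, d\lambda\, d\theta\right]. \tag{17} $$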

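3.1.3. General Entropy (GE) Loss Function

Using the GE loss function, Bayesian estimation for the parameters λ and θ is given by
$$ \hat{\lambda}_{GE} = \left(\frac{1}{K} \iint \lambda^{-q} L(\underline{x} \mid \lambda,\theta)\, \pi(\lambda,\theta)\, d\lambda\, d\theta\right)^{-1/q}, \qquad \hat{\theta}_{GE} = \left(\frac{1}{K} \iint \theta^{-q} L(\underline{x} \mid \lambda,\theta)\, \pi(\lambda,\theta)\, d\lambda\, d\theta\right)^{-1/q}. \tag{18} $$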
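When the posterior is represented by simulated draws rather than by closed-form integrals, the three estimators above reduce to simple averages over the draws. The following sketch assumes a list of positive posterior draws of one parameter is already available; the function names and the values of h and q are illustrative assumptions.

```python
import math

def bayes_se(draws):
    """Squared error loss: the posterior mean."""
    return sum(draws) / len(draws)

def bayes_linex(draws, h):
    """LINEX loss: -(1/h) * ln E[exp(-h * parameter)], with h != 0."""
    m = sum(math.exp(-h * d) for d in draws) / len(draws)
    return -math.log(m) / h

def bayes_ge(draws, q):
    """General entropy loss: (E[parameter^(-q)])^(-1/q), with q != 0."""
    m = sum(d ** (-q) for d in draws) / len(draws)
    return m ** (-1.0 / q)

draws = [1.0, 2.0, 3.0]          # toy posterior draws, for illustration only
se = bayes_se(draws)             # arithmetic mean
lin = bayes_linex(draws, 0.5)    # lies below the mean for h > 0
ge = bayes_ge(draws, 1.0)        # harmonic mean when q = 1
```

For h > 0, Jensen's inequality implies that the LINEX estimate does not exceed the posterior mean, and for q = 1 the GE estimate is the harmonic mean of the draws.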

3.2. Bayesian Estimation

Numerical methods are essential for evaluating the above expected values and double integrals. We opted to use the Markov chain Monte Carlo (MCMC) technique with the Gibbs sampling method, implemented in suitable R code. For more details, one may refer to [32]. Many authors have used Bayesian estimation for different lifetime models with many real data applications (see, for example, [33,34,35]).
Since we implement three different discretization methods on the GP distribution, we have to deal with three cases of Bayesian inference based on the different pmfs of DGPDs that are written in Equations (6), (9), and (13).
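To make the MCMC step concrete, here is a minimal random-walk Metropolis sketch for the DGPD1 posterior. It is not the authors' R implementation; the proposal scale, starting values, hyperparameter values, and the support restriction θ ≥ b2 are all assumptions chosen for illustration.

```python
import math
import random

def log_posterior(data, lam, theta, a1, a2, b1, b2):
    """Unnormalized log posterior: DGPD1 log-likelihood plus log joint prior kernel."""
    if lam <= 0.0 or theta < b2:  # assumed support: lam > 0, theta >= b2
        return -math.inf
    lp = a1 * math.log(lam) - b1 * lam - a2 * lam * math.log(theta / b2)
    for x in data:
        p = ((1.0 + theta * x / lam) ** (-1.0 / theta)
             - (1.0 + theta * (x + 1) / lam) ** (-1.0 / theta))
        if p <= 0.0:
            return -math.inf
        lp += math.log(p)
    return lp

def metropolis(data, n_iter, start, step, hyper, seed=0):
    """Random-walk Metropolis with independent Gaussian steps on (lam, theta).

    The starting point must have a finite log posterior.
    """
    rng = random.Random(seed)
    lam, theta = start
    current = log_posterior(data, lam, theta, *hyper)
    chain = []
    for _ in range(n_iter):
        lam_new = lam + rng.gauss(0.0, step)
        theta_new = theta + rng.gauss(0.0, step)
        proposed = log_posterior(data, lam_new, theta_new, *hyper)
        # Accept with probability min(1, posterior ratio); -inf is always rejected.
        if rng.random() < math.exp(min(0.0, proposed - current)):
            lam, theta, current = lam_new, theta_new, proposed
        chain.append((lam, theta))
    return chain
```

After discarding an initial burn-in portion of the chain, averages of the retained draws approximate the posterior expectations that define the SE, LINEX, and GE estimators.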

3.2.1. Case 1

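When applying the survival discretization method, we obtain DGPD1 with the pmf given by Equation (6). The joint posterior density is
$$ p_1(\lambda,\theta \mid \underline{x}) = \frac{1}{K} \prod_{i=1}^{n}\left[\left(1+\frac{\theta x_i}{\lambda}\right)^{-\frac{1}{\theta}} - \left(1+\frac{\theta (x_i+1)}{\lambda}\right)^{-\frac{1}{\theta}}\right] \lambda^{a_1} e^{-b_1\lambda}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda} = G_\lambda(a_1+1, b_1)\, Q(\lambda,\theta), \tag{19} $$
where $Q(\lambda,\theta) = \frac{1}{K} \prod_{i=1}^{n}\left[\left(1+\frac{\theta x_i}{\lambda}\right)^{-\frac{1}{\theta}} - \left(1+\frac{\theta (x_i+1)}{\lambda}\right)^{-\frac{1}{\theta}}\right]\left(\frac{\theta}{b_2}\right)^{-a_2\lambda}$, and G(.,.) represents the gamma distribution.
Bayesian estimation for the parameters λ and θ using the SE loss function is performed using Equations (15) and (16), respectively, with the posterior density in Equation (19):
$$ \hat{\lambda}_{SE} = \frac{1}{K} \iint \prod_{i=1}^{n}\left[\left(1+\frac{\theta x_i}{\lambda}\right)^{-\frac{1}{\theta}} - \left(1+\frac{\theta (x_i+1)}{\lambda}\right)^{-\frac{1}{\theta}}\right] \lambda^{a_1+1} e^{-b_1\lambda}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda} d\lambda\, d\theta, $$
$$ \hat{\theta}_{SE} = \frac{1}{K} \iint \prod_{i=1}^{n}\left[\left(1+\frac{\theta x_i}{\lambda}\right)^{-\frac{1}{\theta}} - \left(1+\frac{\theta (x_i+1)}{\lambda}\right)^{-\frac{1}{\theta}}\right] \theta^{-a_2\lambda+1} \lambda^{a_1} e^{-b_1\lambda} (b_2)^{a_2\lambda}\, d\lambda\, d\theta. $$
For the LINEX loss function, Bayesian estimation is obtained by using Equation (17) and the posterior density in Equation (19):
$$ \hat{\lambda}_{LIN} = -\frac{1}{h} \ln\left[\frac{1}{K} \iint \prod_{i=1}^{n}\left[\left(1+\frac{\theta x_i}{\lambda}\right)^{-\frac{1}{\theta}} - \left(1+\frac{\theta (x_i+1)}{\lambda}\right)^{-\frac{1}{\theta}}\right] \lambda^{a_1} e^{-(b_1+h)\lambda}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda} d\lambda\, d\theta\right], $$
$$ \hat{\theta}_{LIN} = -\frac{1}{h} \ln\left[\frac{1}{K} \iint \prod_{i=1}^{n}\left[\left(1+\frac{\theta x_i}{\lambda}\right)^{-\frac{1}{\theta}} - \left(1+\frac{\theta (x_i+1)}{\lambda}\right)^{-\frac{1}{\theta}}\right] \lambda^{a_1} e^{-b_1\lambda-h\theta}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda} d\lambda\, d\theta\right]. $$
Bayesian estimation for the parameter λ using the GE loss function is obtained using Equations (18) and (19) and is given by
$$ \hat{\lambda}_{GE} = \left(\frac{1}{K} \iint \prod_{i=1}^{n}\left[\left(1+\frac{\theta x_i}{\lambda}\right)^{-\frac{1}{\theta}} - \left(1+\frac{\theta (x_i+1)}{\lambda}\right)^{-\frac{1}{\theta}}\right] \lambda^{a_1-q} e^{-b_1\lambda}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda} d\lambda\, d\theta\right)^{-1/q}; $$
the estimator of θ follows in the same way from Equation (18).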

3.2.2. Case 2

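For the second form of discrete GPD, namely, DGPD2, with the pmf given by Equation (9), the joint posterior density is given by
$$ p_2(\lambda,\theta \mid \underline{x}) = \frac{1}{K} \prod_{i=1}^{n}\left[\frac{\left(1+\frac{\theta x_i}{\lambda}\right)^{-\left(\frac{1}{\theta}+1\right)} \theta^{-a_2\lambda+\left(\frac{1}{\theta}+1\right)}\, b_2^{a_2\lambda}}{\xi\left(1+\frac{1}{\theta}, \frac{\lambda}{\theta}\right)}\right] \lambda^{a_1-\left(\frac{1}{\theta}+1\right)} e^{-b_1\lambda} = G_\lambda\left(a_1-\frac{1}{\theta}, b_1\right) R(\lambda,\theta), $$
where $R(\lambda,\theta) = \frac{1}{K} \prod_{i=1}^{n}\left[\frac{\left(1+\frac{\theta x_i}{\lambda}\right)^{-\left(\frac{1}{\theta}+1\right)} \theta^{-a_2\lambda+\left(\frac{1}{\theta}+1\right)}\, b_2^{a_2\lambda}}{\xi\left(1+\frac{1}{\theta}, \frac{\lambda}{\theta}\right)}\right]$.
Bayesian estimation for the parameters λ and θ using the SE loss function is given as
$$ \hat{\lambda}_{SE} = \frac{1}{K} \iint \prod_{i=1}^{n}\left[\frac{\left(1+\frac{\theta x_i}{\lambda}\right)^{-\left(\frac{1}{\theta}+1\right)} \theta^{-a_2\lambda+\left(\frac{1}{\theta}+1\right)}\, b_2^{a_2\lambda}}{\xi\left(1+\frac{1}{\theta}, \frac{\lambda}{\theta}\right)}\right] \lambda^{a_1-\frac{1}{\theta}} e^{-b_1\lambda}\, d\lambda\, d\theta, $$
$$ \hat{\theta}_{SE} = \frac{1}{K} \iint \prod_{i=1}^{n}\left[\frac{\left(1+\frac{\theta x_i}{\lambda}\right)^{-\left(\frac{1}{\theta}+1\right)} \theta^{-a_2\lambda+\left(\frac{1}{\theta}+2\right)}\, b_2^{a_2\lambda}}{\xi\left(1+\frac{1}{\theta}, \frac{\lambda}{\theta}\right)}\right] \lambda^{a_1-\left(\frac{1}{\theta}+1\right)} e^{-b_1\lambda}\, d\lambda\, d\theta. $$
For the LINEX loss function, Bayesian estimation is found by the following integrations:
$$ \hat{\lambda}_{LIN} = -\frac{1}{h} \ln\left[\frac{1}{K} \iint \prod_{i=1}^{n}\left[\frac{\left(1+\frac{\theta x_i}{\lambda}\right)^{-\left(\frac{1}{\theta}+1\right)} \theta^{-a_2\lambda+\left(\frac{1}{\theta}+1\right)}\, b_2^{a_2\lambda}}{\xi\left(1+\frac{1}{\theta}, \frac{\lambda}{\theta}\right)}\right] \lambda^{a_1-\left(\frac{1}{\theta}+1\right)} e^{-(b_1+h)\lambda}\, d\lambda\, d\theta\right], $$
$$ \hat{\theta}_{LIN} = -\frac{1}{h} \ln\left[\frac{1}{K} \iint \prod_{i=1}^{n}\left[\frac{\left(1+\frac{\theta x_i}{\lambda}\right)^{-\left(\frac{1}{\theta}+1\right)} \theta^{-a_2\lambda+\left(\frac{1}{\theta}+1\right)}\, b_2^{a_2\lambda}}{\xi\left(1+\frac{1}{\theta}, \frac{\lambda}{\theta}\right)}\right] \lambda^{a_1-\left(\frac{1}{\theta}+1\right)} e^{-b_1\lambda-h\theta}\, d\lambda\, d\theta\right]. $$
For the GE loss function, Bayesian estimation for parameters λ and θ is given by
$$ \hat{\lambda}_{GE} = \left(\frac{1}{K} \iint \prod_{i=1}^{n}\left[\frac{\left(1+\frac{\theta x_i}{\lambda}\right)^{-\left(\frac{1}{\theta}+1\right)} \theta^{-a_2\lambda+\left(\frac{1}{\theta}+1\right)}\, b_2^{a_2\lambda}}{\xi\left(1+\frac{1}{\theta}, \frac{\lambda}{\theta}\right)}\right] \lambda^{a_1-\left(\frac{1}{\theta}+1\right)-q} e^{-b_1\lambda}\, d\lambda\, d\theta\right)^{-1/q}, $$
$$ \hat{\theta}_{GE} = \left(\frac{1}{K} \iint \prod_{i=1}^{n}\left[\frac{\left(1+\frac{\theta x_i}{\lambda}\right)^{-\left(\frac{1}{\theta}+1\right)} \theta^{-a_2\lambda+\left(\frac{1}{\theta}+1\right)-q}\, b_2^{a_2\lambda}}{\xi\left(1+\frac{1}{\theta}, \frac{\lambda}{\theta}\right)}\right] \lambda^{a_1-\left(\frac{1}{\theta}+1\right)} e^{-b_1\lambda}\, d\lambda\, d\theta\right)^{-1/q}. $$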

3.2.3. Case 3

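The third discretization method of GP yields DGPD3 with the pmf described by Equation (13), and the joint posterior density is
$$ p_3(\lambda,\theta \mid \underline{x}) = \frac{1}{K} \prod_{j=1}^{n} e^{-1+\left(1+\frac{\theta x_j}{\lambda}\right)^{-\frac{1}{\theta}}}\left[\prod_{i=1}^{x_j-1}\left(1-e^{-1+\left(1+\frac{\theta i}{\lambda}\right)^{-\frac{1}{\theta}}}\right)\right] \lambda^{a_1} e^{-b_1\lambda}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda} = \frac{1}{K}\, G_\lambda(a_1+1, b_1)\, S(\lambda,\theta), $$
where $S(\lambda,\theta) = \prod_{j=1}^{n} e^{-1+\left(1+\frac{\theta x_j}{\lambda}\right)^{-\frac{1}{\theta}}}\left[\prod_{i=1}^{x_j-1}\left(1-e^{-1+\left(1+\frac{\theta i}{\lambda}\right)^{-\frac{1}{\theta}}}\right)\right]\left(\frac{\theta}{b_2}\right)^{-a_2\lambda}$.
Bayesian estimation for the parameters λ and θ using the SE loss function is given as
$$ \hat{\lambda}_{SE} = \frac{1}{K} \iint \prod_{j=1}^{n} e^{-1+\left(1+\frac{\theta x_j}{\lambda}\right)^{-\frac{1}{\theta}}}\left[\prod_{i=1}^{x_j-1}\left(1-e^{-1+\left(1+\frac{\theta i}{\lambda}\right)^{-\frac{1}{\theta}}}\right)\right] \lambda^{a_1+1} e^{-b_1\lambda}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda} d\lambda\, d\theta, $$
$$ \hat{\theta}_{SE} = \frac{1}{K} \iint \prod_{j=1}^{n} e^{-1+\left(1+\frac{\theta x_j}{\lambda}\right)^{-\frac{1}{\theta}}}\left[\prod_{i=1}^{x_j-1}\left(1-e^{-1+\left(1+\frac{\theta i}{\lambda}\right)^{-\frac{1}{\theta}}}\right)\right] b_2\, \lambda^{a_1} e^{-b_1\lambda}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda+1} d\lambda\, d\theta. $$
For the LINEX loss function, Bayesian estimation is found by the following integrations:
$$ \hat{\lambda}_{LIN} = -\frac{1}{h} \ln\left[\frac{1}{K} \iint \prod_{j=1}^{n} e^{-1+\left(1+\frac{\theta x_j}{\lambda}\right)^{-\frac{1}{\theta}}}\left[\prod_{i=1}^{x_j-1}\left(1-e^{-1+\left(1+\frac{\theta i}{\lambda}\right)^{-\frac{1}{\theta}}}\right)\right] \lambda^{a_1} e^{-(b_1+h)\lambda}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda} d\lambda\, d\theta\right], $$
$$ \hat{\theta}_{LIN} = -\frac{1}{h} \ln\left[\frac{1}{K} \iint \prod_{j=1}^{n} e^{-1+\left(1+\frac{\theta x_j}{\lambda}\right)^{-\frac{1}{\theta}}}\left[\prod_{i=1}^{x_j-1}\left(1-e^{-1+\left(1+\frac{\theta i}{\lambda}\right)^{-\frac{1}{\theta}}}\right)\right] \lambda^{a_1} e^{-b_1\lambda-h\theta}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda} d\lambda\, d\theta\right]. $$
For the GE loss function, Bayesian estimation for parameters λ and θ is given by
$$ \hat{\lambda}_{GE} = \left(\frac{1}{K} \iint \prod_{j=1}^{n} e^{-1+\left(1+\frac{\theta x_j}{\lambda}\right)^{-\frac{1}{\theta}}}\left[\prod_{i=1}^{x_j-1}\left(1-e^{-1+\left(1+\frac{\theta i}{\lambda}\right)^{-\frac{1}{\theta}}}\right)\right] \lambda^{a_1-q} e^{-b_1\lambda}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda} d\lambda\, d\theta\right)^{-1/q}, $$
$$ \hat{\theta}_{GE} = \left(\frac{1}{K} \iint \prod_{j=1}^{n} e^{-1+\left(1+\frac{\theta x_j}{\lambda}\right)^{-\frac{1}{\theta}}}\left[\prod_{i=1}^{x_j-1}\left(1-e^{-1+\left(1+\frac{\theta i}{\lambda}\right)^{-\frac{1}{\theta}}}\right)\right] b_2^{-q}\, \lambda^{a_1} e^{-b_1\lambda}\left(\frac{\theta}{b_2}\right)^{-a_2\lambda-q} d\lambda\, d\theta\right)^{-1/q}. $$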

4. Simulation Analysis

To evaluate the performance of the three discrete versions of the continuous GPD, we aim to compare the point estimation of the unknown parameters with respect to bias and MSE. Additionally, a comparison is conducted using the different loss functions described in Section 3. Some interesting conclusions and results are reported at the end of this section.
Random samples were generated with 10,000 iterations using suitable R code; the selected values of the parameters λ and θ were {0.5, 3}, and sample sizes n = {20, 50, 100} were considered.
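One way to generate such random samples is sketched below for DGPD1 only. It relies on the fact that taking the integer part of a continuous GPD variate yields the survival-discretized distribution, since P(⌊T⌋ = k) = S(k) − S(k + 1). The function name, the seed, and the assumption θ > 0 are illustrative and are not taken from the authors' R code.

```python
import math
import random

def rdgpd1(n, theta, lam, rng):
    """Draw n DGPD1 variates as floor(T), where T is a continuous GPD variate
    obtained by inverting the CDF: T = (lam/theta) * ((1 - U)^(-theta) - 1)."""
    out = []
    for _ in range(n):
        u = rng.random()  # uniform on [0, 1), so 1 - u is never zero
        t = (lam / theta) * ((1.0 - u) ** (-theta) - 1.0)
        out.append(math.floor(t))
    return out

rng = random.Random(1)
sample = rdgpd1(20000, 0.5, 3.0, rng)
freq_zero = sample.count(0) / len(sample)  # empirical estimate of P(X = 0)
```

With θ = 0.5 and λ = 3, the exact value is P(X = 0) = 1 − (7/6)^(−2) = 13/49, so the empirical frequency of zeros should be close to 0.265 for a large sample.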
The simulation results of point and interval estimations for the three discrete versions of the GPD are reported in Table 1, Table 2 and Table 3. Figure 4, Figure 5 and Figure 6 illustrate the MSE for the simulation results in Table 1, Table 2 and Table 3. The x-axis represents sample sizes, which take values of {20, 50, 100}. For a fixed sample size, six different parameter values are presented. Specifically, λ increases from 0.5 to 3 (the first six points) when θ is 0.5, and λ increases from 0.5 to 3 (the last six points) when θ is 3.
The main simulation analysis points are as follows:
  • It can be observed that the estimated values of the model parameters converge to their true values when increasing the sample size. This can be observed since the MSE and biases decrease as the sample size increases, which shows that the proposed estimators are consistent in nature.
  • For a small sample size, the LINEX loss function provides the lowest values of MSE and bias when estimating θ , while the GE loss function provides the lowest values of MSE and bias when estimating λ .
  • For a large sample size, the LINEX loss function provides the lowest values of MSE and bias when estimating both parameters λ and θ .
  • In almost all cases, the LINEX and GE loss functions produce minimum bias and MSE values, and this is true for different sample sizes. Hence, LINEX and GE are recommended over SE in this study.
  • For the credible interval, it is noted that the shortest interval length is obtained when using the LINEX loss function.
  • The SE loss function has some advantages over other loss functions under some conditions; for example, when λ = θ = 3 and for a small sample size (n = 20), the bias and MSE attain their minimum values when estimating θ .
  • For a fixed value of λ , the bias decreases when the shape parameter θ increases. Similarly, for a fixed value of θ , the bias decreases when λ   increases.
  • The length of the credible interval decreases when the sample size increases, and this is true for all loss functions under study.
When comparing the performance of the three DGPD analogues, we observe the following:
  • For almost all small-sample cases, the first discrete analogue, DGPD1, has the least bias and lowest MSE for different parameter values.
  • For a large sample size, it is observed that the MSE attains its minimum values when using the second analogue, DGPD2.
  • The advantage of using the third analogue, DGPD3, appears when finding the credible interval for the parameter θ using the GE loss function, where the interval length reaches its minimum value.
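The bias and MSE criteria used in the comparisons above can be computed from replicated estimates as in the following sketch. The function name and the toy numbers are hypothetical and serve only to show the arithmetic.

```python
def bias_and_mse(estimates, true_value):
    """Bias = mean(estimate) - truth; MSE = mean((estimate - truth)^2)."""
    n = len(estimates)
    bias = sum(estimates) / n - true_value
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return bias, mse

# Toy replicated estimates of a parameter whose true value is 0.5.
bias, mse = bias_and_mse([0.4, 0.5, 0.6, 0.7], 0.5)
```

The MSE decomposes into the variance of the estimates plus the squared bias, which is why both criteria tend to shrink together as the sample size grows.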

5. Real Data Examples

In this section, real data are used to demonstrate the efficiency of the discrete analogues of the GP distribution.
Several goodness-of-fit measures are used: the chi-square test, the Kolmogorov–Smirnov (KS) test, the Akaike information criterion (AIC), the Bayesian information criterion (BIC), the corrected Akaike information criterion (CAIC), and the Hannan–Quinn information criterion (HQIC). As a model selection rule, the researcher should choose the model with the minimum value of the above-mentioned measures of fit.
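The information criteria can be computed from the maximized log-likelihood, the number of parameters k, and the sample size n. The sketch below uses the standard textbook definitions, with CAIC taken as the small-sample corrected AIC; whether these exact forms match the ones used for the tables is an assumption, and the log-likelihood value in the example is made up.

```python
import math

def information_criteria(loglik, k, n):
    """Return AIC, BIC, CAIC (corrected AIC) and HQIC; smaller values are better."""
    aic = 2.0 * k - 2.0 * loglik
    bic = k * math.log(n) - 2.0 * loglik
    caic = aic + 2.0 * k * (k + 1) / (n - k - 1)
    hqic = 2.0 * k * math.log(math.log(n)) - 2.0 * loglik
    return {"AIC": aic, "BIC": bic, "CAIC": caic, "HQIC": hqic}

# Hypothetical two-parameter model fitted to n = 42 observations.
crit = information_criteria(loglik=-150.0, k=2, n=42)
```

All four criteria penalize the same log-likelihood with different functions of k and n, so they usually rank well-separated models in the same order.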
Data set 1: The first set of data represents a 42-day COVID-19 data set from the United States Virgin Islands, recorded between 19 April 2021 and 30 May 2021. These data comprise daily new deaths. The data are as follows: 11, 2, 3, 10, 10, 4, 12, 0, 10, 3, 5, 12, 6, 9, 13, 4, 10, 26, 0, 32, 0, 0, 13, 10, 3, 20, 5, 6, 0, 3, 18, 2, 18, 14, 24, 7, 0, 30, 16, 26, 17, 23. The data are available on the Worldometer website at [36].
Table 4 summarizes the values of the goodness-of-fit measures when comparing the DGPD with nine different discrete models, including those with one, two, and three parameters. The competing models are the discrete Marshall–Olkin inverted Topp–Leone (DMOITL) distribution introduced in [37], the discrete Burr (DB) distribution introduced in [38], the discrete Weibull (DW) distribution introduced in [39], the discrete inverse Weibull (DIW) distribution obtained in [40], the negative binomial (NB) distribution in [41], the Poisson distribution, the discrete generalized exponential (DGE) distribution introduced in [42], the discrete alpha power inverse Lomax (DAPIL) distribution in [19], and the discrete Lindley (DL) distribution in [43].
Table 4 reveals the efficiency and suitability of DGPD1 for modeling COVID-19 cases relative to the other discrete candidate models, while Figure 7 shows the pmf and CDF of the fitted DGPD1 for data set 1. The distribution with the smallest values of the key statistics (AIC, BIC, CAIC, HQIC, KS statistic, and chi-square statistic) is generally the one that fits the data best. Among all fitted models, DGPD1 has the lowest KS statistic, chi-square statistic, AIC, BIC, CAIC, and HQIC values. The p-values of the KS and chi-square tests are assessed at the 5% level of significance. For data set 1, Table 5 shows that Bayesian estimation performs marginally better than the well-known classical maximum likelihood estimation (MLE) with respect to minimizing SE.
To confirm this conclusion, we check the convergence of the MCMC results. Figure 7 shows the trace and convergence plots of MCMC for the parameter estimates of DGPD1. Figure 8 depicts the MCMC convergence of λ and θ. These plots confirm that the MCMC chains for the parameters of DGPD1 converge under the MH algorithm. Figure 9 shows the posterior density plots of MCMC for the parameter estimates of DGPD1 for data set 1, which are approximately normal in shape, in line with the proposal distribution of the MH algorithm.
Data set 2: The second set of data represents a 53-day COVID-19 data set from Italy, recorded between 13 June 2021 and 4 August 2021. These data comprise daily new deaths. The data are as follows: 52, 26, 36, 63, 52, 37, 35, 28, 17, 21, 31, 30, 10, 56, 40, 14, 28, 42, 24, 21, 28, 22, 12, 31, 24, 14, 13, 25, 12, 7, 13, 20, 23, 9, 11, 13, 3, 7, 10, 21, 15, 17, 5, 7, 22, 24, 15, 19, 18, 16, 5, 20, 27. The data are available on the Worldometer website at [36].
Figure 10 shows the pmf and CDF of the fitted DGPD for data set 2. The SE values of the DGPD parameters are shown in Table 6 to compare the MLE and Bayesian estimation methods for data set 2. From the SE results in Table 6, we note that Bayesian estimation is superior to MLE for data set 2. Figure 11 shows that the posterior density plots of MCMC for the parameter estimates of DGPD1 for data set 2 are approximately normal in shape, in line with the proposal distribution of the MH algorithm. To confirm this conclusion, we check the convergence of the MCMC results. Figure 12 shows the trace and convergence plots of MCMC for the parameter estimates of DGPD1 for data set 2 and confirms that the MCMC chains for the parameters of DGPD1 converge under the MH algorithm.

6. Conclusions

In this study, we propose and study new discrete distributions that have a decreasing probability mass function for all choices of their parameters. The new distribution is called the discrete generalized Pareto distribution (DGPD). We used three discretization methods, which yield three discrete analogues of the GPD. Point and interval estimations through the Bayesian method were obtained, and a simulation analysis was performed using R code to assess the efficiency of the three discrete models. The SE, LINEX, and GE loss functions were employed in this study. The tables presented in the simulation section show some good properties for each analogue. To check the validity of the DGPD, two real data examples were considered, which comprised COVID-19 death cases in two different regions. Our proposed DGPD1 was compared with other discrete candidates, and goodness-of-fit tests showed that DGPD1 fit the data very well. The tables and figures illustrate the efficiency of the new model as well. For further study, we suggest using other discretization methods and testing their performance and suitability using real-life data.

Author Contributions

Conceptualization, H.H.A. and E.M.A.; methodology, H.H.A.; software, E.M.A.; validation, H.H.A. and E.M.A.; formal analysis, H.H.A.; investigation, E.M.A.; resources, H.H.A. and E.M.A.; data curation, E.M.A.; writing—original draft preparation, H.H.A.; writing—review and editing, H.H.A. and E.M.A.; visualization, E.M.A.; supervision, H.H.A.; project administration, H.H.A.; funding acquisition, H.H.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported through the Annual Funding track by the Deanship of Scientific Research, Vice Presidency for Graduate Studies and Scientific Research, King Faisal University, Saudi Arabia [Project No. AN000537].

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are available in the text of this article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Xekalaki, E. Hazard function and life distributions in discrete time. Commun. Stat. Theory Methods 1983, 12, 2503–2509. [Google Scholar] [CrossRef]
  2. Hitha, N.; Nair, N.U. Characterization of some discrete models by properties of residual life function. Calcutta Stat. Assoc. Bull. 1989, 38, 219–223. [Google Scholar] [CrossRef]
  3. Roy, D.; Gupta, R.P. Classifications of discrete lives. Microelectron. Reliab. 1992, 32, 459–1473. [Google Scholar] [CrossRef]
  4. Roy, D.; Gupta, R.P. Stochastic modeling through reliability measures in the discrete case. Stat. Probab. Lett. 1999, 43, 197–206.
  5. Roy, D. On classifications of multivariate life distributions in the discrete set-up. Microelectron. Reliab. 1997, 37, 361–366.
  6. Roy, D. The discrete normal distribution. Commun. Stat. Theory Methods 2003, 32, 1871–1883.
  7. Roy, D. Discrete Rayleigh distribution. IEEE Trans. Reliab. 2004, 53, 255–260.
  8. Roy, D.; Ghosh, T. A New Discretization Approach with Application in Reliability Estimation. IEEE Trans. Reliab. 2009, 58, 456–461.
  9. Bracquemond, C.; Gaudoin, O. A survey on discrete life time distributions. Int. J. Reliab. Qual. Saf. Eng. 2003, 10, 69–98.
  10. Lai, C.D. Issues concerning constructions of discrete lifetime models. Qual. Technol. Quant. Manag. 2013, 10, 251–262.
  11. Chakraborty, S. Generating discrete analogues of continuous probability distributions—A survey of methods and constructions. J. Stat. Distrib. Appl. 2015, 2, 6.
  12. Liu, H.; Hussain, F.; Tan, C.L.; Dash, M. Discretization: An Enabling Technique. Data Min. Knowl. Discov. 2002, 6, 393–423.
  13. Arnastauskaitė, J.; Ruzgas, T.; Bražėnas, M. An Exhaustive Power Comparison of Normality Tests. Mathematics 2021, 9, 788.
  14. Korkmaz, S.; Goksuluk, D.; Zararsiz, G. MVN: An R Package for Assessing Multivariate Normality. R J. 2014, 6, 151–162.
  15. Al-Huniti, A.A.; AL-Dayian, G.R. Discrete Burr type III distribution. Am. J. Math. Stat. 2012, 2, 145–152.
  16. Bebbington, M.; Lai, C.D.; Wellington, M.; Zitikis, R. The discrete additive Weibull distribution: A bathtub-shaped hazard for discontinuous failure data. Reliab. Eng. Syst. Saf. 2012, 106, 37–44.
  17. Sarhan, A.M. A two-parameter discrete distribution with a bathtub hazard shape. Commun. Stat. Appl. Methods 2017, 24, 15–27.
  18. Yari, G.; Tondpour, Z. Discrete Burr XII-Gamma Distributions: Properties and Parameter Estimations. Iran. J. Sci. Technol. Trans. A Sci. 2018, 42, 2237–2249.
  19. Almetwally, E.M.; Ibrahim, G.M. Discrete Alpha Power Inverse Lomax Distribution with Application of COVID-19 Data. Int. J. Appl. Math. 2020, 9, 11–22.
  20. Eliwa, M.S.; Altun, E.; El-Dawoody, M.; El-Morshedy, M. A new three-parameter discrete distribution with associated INAR (1) process and applications. IEEE Access 2020, 8, 91150–91162.
  21. Almetwally, E.M.; Almongy, H.M.; Saleh, H.A. Managing risk of spreading “COVID-19” in Egypt: Modelling using a discrete Marshall-Olkin generalized exponential distribution. Int. J. Probab. Stat. 2020, 9, 33–41.
  22. Al-Babtain, A.A.; Ahmed, A.H.N.; Afify, A.Z. A New Discrete Analog of the Continuous Lindley Distribution, with Reliability Applications. Entropy 2020, 22, 603.
  23. Eldeeb, A.S.; Ahsan-Ul-Haq, M.; Babar, A. A Discrete Analog of Inverted Topp-Leone Distribution: Properties, Estimation and Applications. Int. J. Anal. Appl. 2021, 19, 695–708.
  24. Elbatal, I.; Alotaibi, N.; Almetwally, E.M.; Alyami, S.A.; Elgarhy, M. On Odd Perks-G Class of Distributions: Properties, Regression Model, Discretization, Bayesian and Non-Bayesian Estimation, and Applications. Symmetry 2022, 14, 883.
  25. Nagy, M.; Almetwally, E.M.; Gemeay, A.M.; Mohammed, H.S.; Jawa, T.M.; Sayed-Ahmed, N.; Muse, A.H. The new novel discrete distribution with application on covid-19 mortality numbers in Kingdom of Saudi Arabia and Latvia. Complexity 2021, 2021, 7192833.
  26. Gillariose, J.; Balogun, O.S.; Almetwally, E.M.; Sherwani, R.A.K.; Jamal, F.; Joseph, J. On the Discrete Weibull Marshall–Olkin Family of Distributions: Properties, Characterizations, and Applications. Axioms 2021, 10, 287.
  27. Martín, J.; Parra, M.I.; Pizarro, M.M.; Sanjuán, E.L. Baseline Methods for the Parameter Estimation of the Generalized Pareto Distribution. Entropy 2022, 24, 178.
  28. Huang, C.; Zhao, X.; Cheng, W.; Ji, Q.; Duan, Q.; Han, Y. Statistical Inference of Dynamic Conditional Generalized Pareto Distribution with Weather and Air Quality Factors. Mathematics 2022, 10, 1433.
  29. Shui, P.-L.; Zou, P.-J.; Feng, T. Outlier-robust truncated maximum likelihood parameter estimators of generalized Pareto distributions. Digit. Signal Process. 2022, 127, 103527.
  30. He, Y.; Peng, L.; Zhang, D.; Zhao, Z. Risk Analysis via Generalized Pareto Distributions. J. Bus. Econ. Stat. 2021, 40, 852–867.
  31. Arnold, B.C.; Press, S.J. Compatible Conditional Distributions. J. Am. Stat. Assoc. 1989, 84, 152.
  32. Karandikar, R.L. On the Markov Chain Monte Carlo (MCMC) method. Sadhana 2006, 31, 81–104.
  33. Wang, Y.; Zhang, J.; Cai, C.; Lu, W.; Tang, Y. Semiparametric estimation for proportional hazards mixture cure model allowing non-curable competing risk. J. Stat. Plan. Inference 2020, 211, 171–189.
  34. Xu, A.; Zhou, S.; Tang, Y. A unified model for system reliability evaluation under dynamic operating conditions. IEEE Trans. Reliab. 2021, 70, 65–72.
  35. Luo, C.; Shen, L.; Xu, A. Modelling and estimation of system reliability under dynamic operating environments and lifetime ordering constraints. Reliab. Eng. Syst. Saf. 2022, 218, 108136.
  36. Worldometers. Available online: https://www.worldometers.info/coronavirus (accessed on 1 June 2021).
  37. Almetwally, E.M.; Abdo, D.A.; Hafez, E.H.; Jawa, T.M.; Sayed-Ahmed, N.; Almongy, H.M. The new discrete distribution with application to COVID-19 Data. Results Phys. 2021, 32, 104987.
  38. Krishna, H.; Pundir, P.S. Discrete Burr and discrete Pareto distributions. Stat. Methodol. 2009, 6, 177–188.
  39. Khan, M.A.; Khalique, A.; Abouammoh, A.M. On estimating parameters in a discrete Weibull distribution. IEEE Trans. Reliab. 1989, 38, 348–350.
  41. Fisher, R.A. The Negative Binomial Distribution. Ann. Eugen. 1941, 11, 182–187.
  42. Nekoukhou, V.; Alamatsaz, M.H.; Bidram, H. Discrete generalized exponential distribution of a second type. Statistics 2013, 47, 876–887.
  43. Gómez-Déniz, E.; Calderín-Ojeda, E. The discrete Lindley distribution: Properties and applications. J. Stat. Comput. Simul. 2011, 81, 1405–1416.
Figure 1. Plots of the pmf of the DGPD1 distribution with different values of the parameters λ and θ.
Figure 2. Plots of the pmf of the DGPD2 distribution with different values of the parameters λ and θ.
Figure 3. Plots of the pmf of the DGPD3 distribution with different values of the parameters λ and θ.
Figure 4. MSE of Bayesian inference for DGPD1.
Figure 5. MSE of Bayesian inference for DGPD2.
Figure 6. MSE of Bayesian inference for DGPD3.
Figure 7. Plots of estimated pmf and CDF of DGPD1 for data set 1.
Figure 8. Trace and convergence plots of MCMC for parameter estimates of DGPD1 for data set 1.
Figure 9. Posterior density plots of MCMC for parameter estimates of DGPD1 for data set 1.
Figure 10. Plots of estimated pmf and CDF of DGPD1 for data set 2.
Figure 11. Posterior density plots of MCMC for parameter estimates of DGPD1 for data set 2.
Figure 12. Trace and convergence plots of MCMC for parameter estimates of DGPD1 for data set 2.
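Table 1. Bayesian inference for DGPD1 (bias, MSE, and length of CI) for different values of parameters.

| θ | λ | n | Par. | SE Bias | SE MSE | SE L.CCI | LINEX (−1.5) Bias | LINEX (−1.5) MSE | LINEX (−1.5) L.CCI | LINEX (1.5) Bias | LINEX (1.5) MSE | LINEX (1.5) L.CCI | GE (−1.5) Bias | GE (−1.5) MSE | GE (−1.5) L.CCI | GE (1.5) Bias | GE (1.5) MSE | GE (1.5) L.CCI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5 | 0.5 | 20 | θ | 0.0247 | 0.0887 | 0.5335 | 0.0601 | 0.0194 | 0.4371 | −0.0066 | 0.0167 | 0.4630 | 0.0450 | 0.0820 | 0.6443 | −0.0870 | 0.0245 | 0.4930 |
| 0.5 | 0.5 | 20 | λ | 0.2946 | 0.1284 | 0.7412 | 0.3597 | 0.1870 | 0.8541 | 0.2368 | 0.0866 | 0.6692 | 0.3190 | 0.1458 | 0.7567 | 0.1631 | 0.0586 | 0.6918 |
| 0.5 | 0.5 | 50 | θ | −0.0130 | 0.0155 | 0.4394 | 0.0034 | 0.0167 | 0.4526 | −0.0289 | 0.0150 | 0.4294 | −0.0020 | 0.0155 | 0.4396 | −0.0750 | 0.0215 | 0.4590 |
| 0.5 | 0.5 | 50 | λ | 0.2666 | 0.0952 | 0.6031 | 0.2901 | 0.1120 | 0.6405 | 0.2429 | 0.0796 | 0.5616 | 0.2764 | 0.1014 | 0.6067 | 0.2132 | 0.0465 | 0.5528 |
| 0.5 | 0.5 | 100 | θ | −0.0084 | 0.0112 | 0.4062 | −0.0025 | 0.0112 | 0.4070 | −0.0144 | 0.0113 | 0.4062 | −0.0041 | 0.0110 | 0.4023 | −0.0316 | 0.0136 | 0.4360 |
| 0.5 | 0.5 | 100 | λ | 0.1827 | 0.0424 | 0.3745 | 0.1923 | 0.0470 | 0.3914 | 0.1729 | 0.0381 | 0.3569 | 0.1872 | 0.0444 | 0.3781 | 0.1586 | 0.0326 | 0.3407 |
| 0.5 | 3 | 20 | θ | 0.0353 | 0.0150 | 0.4610 | 0.0680 | 0.0162 | 0.4588 | 0.0064 | 0.0178 | 0.4533 | 0.0537 | 0.0126 | 0.4755 | −0.0642 | 0.0159 | 0.4910 |
| 0.5 | 3 | 20 | λ | 0.0704 | 0.0555 | 0.8743 | 0.1615 | 0.0852 | 0.9396 | −0.0176 | 0.0469 | 0.8464 | 0.0803 | 0.0574 | 0.8773 | 0.0206 | 0.0500 | 0.8675 |
| 0.5 | 3 | 50 | θ | 0.0009 | 0.0116 | 0.4267 | 0.0080 | 0.0120 | 0.4324 | −0.0062 | 0.0114 | 0.4220 | 0.0057 | 0.0116 | 0.4274 | −0.0248 | 0.0131 | 0.4528 |
| 0.5 | 3 | 50 | λ | 0.0192 | 0.0276 | 0.6226 | 0.0301 | 0.0287 | 0.6274 | 0.0084 | 0.0269 | 0.6189 | 0.0204 | 0.0277 | 0.6217 | 0.0132 | 0.0274 | 0.6259 |
| 0.5 | 3 | 100 | θ | −0.0080 | 0.0079 | 0.3509 | −0.0042 | 0.0079 | 0.3511 | −0.0117 | 0.0079 | 0.3508 | −0.0054 | 0.0078 | 0.3480 | −0.0214 | 0.0087 | 0.3554 |
| 0.5 | 3 | 100 | λ | 0.0230 | 0.0143 | 0.4554 | 0.0285 | 0.0148 | 0.4601 | 0.0175 | 0.0139 | 0.4513 | 0.0236 | 0.0143 | 0.4552 | 0.0200 | 0.0141 | 0.4526 |
| 3 | 0.5 | 20 | θ | 0.0121 | 0.0751 | 0.3389 | 0.0512 | 0.0107 | 0.3506 | −0.0595 | 0.0779 | 0.3306 | 0.0164 | 0.0766 | 0.3387 | −0.0935 | 0.0674 | 0.3351 |
| 3 | 0.5 | 20 | λ | 0.2173 | 0.1645 | 0.8449 | 0.4302 | 0.1759 | 1.0064 | 0.2412 | 0.0961 | 0.7222 | 0.2527 | 0.1297 | 0.8780 | 0.2331 | 0.0514 | 0.7289 |
| 3 | 0.5 | 50 | θ | −0.0037 | 0.0098 | 0.3079 | 0.0348 | 0.0098 | 0.3335 | −0.0405 | 0.0087 | 0.2944 | 0.0005 | 0.0076 | 0.3612 | −0.0245 | 0.0080 | 0.2998 |
| 3 | 0.5 | 50 | λ | 0.2719 | 0.1411 | 0.5948 | 0.3410 | 0.1623 | 0.6569 | 0.2115 | 0.0745 | 0.5119 | 0.2499 | 0.1272 | 0.6932 | 0.1251 | 0.0499 | 0.5576 |
| 3 | 0.5 | 100 | θ | −0.0321 | 0.0097 | 0.3052 | 0.0018 | 0.0092 | 0.3060 | −0.0655 | 0.0061 | 0.2343 | −0.0284 | 0.0069 | 0.3509 | −0.0151 | 0.0061 | 0.2534 |
| 3 | 0.5 | 100 | λ | 0.1317 | 0.1330 | 0.5668 | 0.3723 | 0.1380 | 0.6074 | 0.2068 | 0.0598 | 0.5096 | 0.2338 | 0.0148 | 0.6809 | 0.1021 | 0.0371 | 0.4620 |
| 3 | 3 | 20 | θ | 0.0039 | 0.0705 | 0.3629 | 0.0430 | 0.0096 | 0.4776 | −0.0339 | 0.0090 | 0.3625 | 0.0082 | 0.0071 | 0.3986 | −0.0175 | 0.0724 | 0.4327 |
| 3 | 3 | 20 | λ | 0.0440 | 0.0524 | 0.8789 | 0.1402 | 0.0791 | 0.9525 | −0.0487 | 0.0489 | 0.8914 | 0.0545 | 0.0538 | 0.8868 | −0.0091 | 0.0496 | 0.8982 |
| 3 | 3 | 50 | θ | 0.0038 | 0.0575 | 0.3339 | 0.0421 | 0.0075 | 0.3526 | −0.0333 | 0.0083 | 0.3348 | 0.0080 | 0.0070 | 0.3368 | −0.0172 | 0.0167 | 0.3383 |
| 3 | 3 | 50 | λ | 0.0443 | 0.0522 | 0.8095 | 0.1370 | 0.0773 | 0.8957 | −0.0451 | 0.0409 | 0.8679 | 0.0544 | 0.0535 | 0.7960 | −0.0069 | 0.0497 | 0.8517 |
| 3 | 3 | 100 | θ | −0.0152 | 0.0170 | 0.3049 | −0.0080 | 0.0069 | 0.3489 | −0.0224 | 0.0073 | 0.4917 | −0.0144 | 0.0069 | 0.3049 | −0.0192 | 0.0072 | 0.2491 |
| 3 | 3 | 100 | λ | 0.0112 | 0.0233 | 0.5707 | 0.0197 | 0.0240 | 0.5772 | 0.0028 | 0.0228 | 0.5787 | 0.0122 | 0.0234 | 0.5702 | 0.0065 | 0.0231 | 0.5744 |
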
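
The column groups in Table 1 correspond to Bayes estimates under the squared error (SE), LINEX, and general entropy (GE) loss functions. As a minimal sketch, assuming posterior draws are already available from an MCMC run, the standard textbook forms of these estimators could be computed as below. The function names, the shape argument `c`, and the sample draws are illustrative assumptions rather than the authors' code, and the GE form as written requires positive draws.

```python
import math
from statistics import fmean


def bayes_se(draws):
    """Bayes estimate under squared error loss: the posterior mean."""
    return fmean(draws)


def bayes_linex(draws, c):
    """Bayes estimate under LINEX loss with shape c != 0:
    -(1/c) * log E[exp(-c * theta)]."""
    return -math.log(fmean(math.exp(-c * d) for d in draws)) / c


def bayes_ge(draws, c):
    """Bayes estimate under general entropy loss with shape c != 0:
    (E[theta ** -c]) ** (-1/c). Assumes every draw is positive."""
    return fmean(d ** (-c) for d in draws) ** (-1.0 / c)


# Made-up posterior draws, for illustration only.
draws = [0.42, 0.47, 0.50, 0.53, 0.61]

for label, value in [
    ("SE", bayes_se(draws)),
    ("LINEX (-1.5)", bayes_linex(draws, -1.5)),
    ("LINEX (1.5)", bayes_linex(draws, 1.5)),
    ("GE (-1.5)", bayes_ge(draws, -1.5)),
    ("GE (1.5)", bayes_ge(draws, 1.5)),
]:
    print(f"{label}: {value:.4f}")
```

By Jensen's inequality, a negative shape value pushes the LINEX and GE estimates to or above the posterior mean, and a positive shape value pushes them to or below it.
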
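Table 2. Bayesian inference for DGPD2 (bias, MSE, and length of CI) for different values of parameters.

| θ | λ | n | Par. | SE Bias | SE MSE | SE L.CCI | LINEX (−1.5) Bias | LINEX (−1.5) MSE | LINEX (−1.5) L.CCI | LINEX (1.5) Bias | LINEX (1.5) MSE | LINEX (1.5) L.CCI | GE (−1.5) Bias | GE (−1.5) MSE | GE (−1.5) L.CCI | GE (1.5) Bias | GE (1.5) MSE | GE (1.5) L.CCI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5 | 0.5 | 20 | θ | −0.1145 | 0.0668 | 0.7749 | −0.1110 | 0.0652 | 0.7740 | −0.1177 | 0.0681 | 0.7734 | −0.1086 | 0.0630 | 0.7578 | −0.1349 | 0.0793 | 0.7870 |
| 0.5 | 0.5 | 20 | λ | −0.4889 | 0.2491 | 0.0185 | −0.4856 | 0.2459 | 0.0247 | −0.4911 | 0.2412 | 0.0142 | −0.4684 | 0.2296 | 0.0421 | −0.4993 | 0.2493 | 0.0039 |
| 0.5 | 0.5 | 50 | θ | −0.0972 | 0.0525 | 0.7273 | −0.0951 | 0.0518 | 0.7252 | −0.0991 | 0.0531 | 0.7284 | −0.0945 | 0.0511 | 0.7180 | −0.1075 | 0.0574 | 0.7516 |
| 0.5 | 0.5 | 50 | λ | −0.4901 | 0.2402 | 0.0177 | −0.4878 | 0.2380 | 0.0204 | −0.4916 | 0.2417 | 0.0157 | −0.4732 | 0.2240 | 0.0291 | −0.4980 | 0.2480 | 0.0092 |
| 0.5 | 0.5 | 100 | θ | −0.0522 | 0.0186 | 0.4950 | −0.0515 | 0.0184 | 0.4941 | −0.0529 | 0.0187 | 0.4963 | −0.0516 | 0.0184 | 0.4927 | −0.0550 | 0.0192 | 0.4994 |
| 0.5 | 0.5 | 100 | λ | −0.4747 | 0.2254 | 0.0255 | −0.4696 | 0.2206 | 0.0293 | −0.4782 | 0.2288 | 0.0243 | −0.4505 | 0.2031 | 0.0335 | −0.4919 | 0.2420 | 0.0193 |
| 0.5 | 3 | 20 | θ | 0.1424 | 0.0378 | 0.4501 | 0.1906 | 0.0604 | 0.5041 | 0.1000 | 0.0234 | 0.4023 | 0.1647 | 0.0459 | 0.4579 | 0.0206 | 0.0143 | 0.4190 |
| 0.5 | 3 | 20 | λ | −0.0366 | 0.0591 | 0.8932 | 0.0534 | 0.0699 | 0.9843 | −0.1246 | 0.0681 | 0.8693 | −0.0265 | 0.0588 | 0.9016 | −0.0881 | 0.0641 | 0.8811 |
| 0.5 | 3 | 50 | θ | 0.0248 | 0.0145 | 0.4295 | 0.0328 | 0.0154 | 0.4373 | 0.0167 | 0.0138 | 0.4241 | 0.0300 | 0.0147 | 0.4286 | −0.0031 | 0.0137 | 0.4048 |
| 0.5 | 3 | 50 | λ | −0.0371 | 0.0312 | 0.6886 | −0.0256 | 0.0300 | 0.6766 | −0.0487 | 0.0327 | 0.6941 | −0.0358 | 0.0310 | 0.6877 | −0.0437 | 0.0323 | 0.6971 |
| 0.5 | 3 | 100 | θ | 0.0068 | 0.0077 | 0.3405 | 0.0104 | 0.0078 | 0.3384 | 0.0032 | 0.0075 | 0.3367 | 0.0092 | 0.0077 | 0.3346 | −0.0056 | 0.0079 | 0.3475 |
| 0.5 | 3 | 100 | λ | −0.0257 | 0.0113 | 0.4118 | −0.0213 | 0.0109 | 0.4001 | −0.0302 | 0.0117 | 0.4212 | −0.0252 | 0.0113 | 0.4104 | −0.0283 | 0.0116 | 0.4019 |
| 3 | 0.5 | 20 | θ | 0.0315 | 0.0547 | 0.9001 | 0.0348 | 0.0549 | 0.9038 | 0.0282 | 0.0543 | 0.8926 | 0.0318 | 0.0547 | 0.9004 | 0.0297 | 0.0546 | 0.8976 |
| 3 | 0.5 | 20 | λ | −0.4798 | 0.2309 | 0.0569 | −0.4715 | 0.2240 | 0.0845 | −0.4850 | 0.2355 | 0.0409 | −0.4503 | 0.2046 | 0.1119 | −0.4998 | 0.2498 | 0.0006 |
| 3 | 0.5 | 50 | θ | 0.0211 | 0.0169 | 0.4877 | 0.0234 | 0.0171 | 0.4883 | 0.0187 | 0.0166 | 0.4842 | 0.0214 | 0.0169 | 0.4879 | 0.0198 | 0.0168 | 0.4854 |
| 3 | 0.5 | 50 | λ | −0.3902 | 0.1568 | 0.2287 | −0.3676 | 0.1406 | 0.2581 | −0.4090 | 0.1710 | 0.1935 | −0.3388 | 0.1195 | 0.2469 | −0.4981 | 0.2490 | 0.0003 |
| 3 | 0.5 | 100 | θ | 0.0183 | 0.0085 | 0.3602 | 0.0199 | 0.0086 | 0.3628 | 0.0166 | 0.0083 | 0.3570 | 0.0184 | 0.0085 | 0.3605 | 0.0173 | 0.0084 | 0.3587 |
| 3 | 0.5 | 100 | λ | −0.3494 | 0.1252 | 0.1901 | −0.3246 | 0.1087 | 0.2058 | −0.3715 | 0.1407 | 0.1709 | −0.3002 | 0.0929 | 0.1893 | −0.4992 | 0.2049 | 0.0002 |
| 3 | 3 | 20 | θ | 0.0932 | 0.0255 | 0.6319 | 0.1333 | 0.0251 | 0.3280 | 0.0544 | 0.0195 | 0.5306 | 0.0975 | 0.0263 | 0.5319 | 0.0719 | 0.0219 | 0.5632 |
| 3 | 3 | 20 | λ | 0.0225 | 0.0618 | 0.9381 | 0.1175 | 0.0857 | 1.0463 | −0.0703 | 0.0608 | 0.8915 | 0.0330 | 0.0629 | 0.9506 | −0.0309 | 0.0606 | 0.8925 |
| 3 | 3 | 50 | θ | 0.0546 | 0.0203 | 0.5208 | 0.0640 | 0.0219 | 0.5218 | 0.0453 | 0.0189 | 0.5090 | 0.0556 | 0.0204 | 0.5209 | 0.0495 | 0.0196 | 0.5177 |
| 3 | 3 | 50 | λ | −0.0173 | 0.0281 | 0.6513 | −0.0059 | 0.0272 | 0.6505 | −0.0287 | 0.0293 | 0.6510 | −0.0160 | 0.0279 | 0.6504 | −0.0238 | 0.0290 | 0.6459 |
| 3 | 3 | 100 | θ | 0.0451 | 0.0115 | 0.3816 | 0.0495 | 0.0123 | 0.3872 | 0.0406 | 0.0107 | 0.3728 | 0.0455 | 0.0116 | 0.3835 | 0.0426 | 0.0111 | 0.3791 |
| 3 | 3 | 100 | λ | −0.0042 | 0.0115 | 0.4137 | 0.0000 | 0.0115 | 0.4164 | −0.0083 | 0.0117 | 0.4146 | −0.0037 | 0.0115 | 0.4138 | −0.0065 | 0.0116 | 0.4162 |
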
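Table 3. Bayesian inference for DGPD3 (bias, MSE, and length of CI) for different values of parameters.

| θ | λ | n | Par. | SE Bias | SE MSE | SE L.CCI | LINEX (−1.5) Bias | LINEX (−1.5) MSE | LINEX (−1.5) L.CCI | LINEX (1.5) Bias | LINEX (1.5) MSE | LINEX (1.5) L.CCI | GE (−1.5) Bias | GE (−1.5) MSE | GE (−1.5) L.CCI | GE (1.5) Bias | GE (1.5) MSE | GE (1.5) L.CCI |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5 | 0.5 | 20 | θ | 0.0231 | 0.0552 | 0.9405 | 0.0248 | 0.0552 | 0.9396 | 0.0214 | 0.0550 | 0.9399 | 0.0236 | 0.0552 | 0.9397 | 0.0208 | 0.0552 | 0.9422 |
| 0.5 | 0.5 | 20 | λ | −0.4908 | 0.2409 | 0.0134 | −0.4879 | 0.2381 | 0.0189 | −0.4927 | 0.2428 | 0.0097 | −0.4710 | 0.2219 | 0.0349 | −0.4998 | 0.2498 | 0.0005 |
| 0.5 | 0.5 | 50 | θ | 0.0067 | 0.0134 | 0.4749 | 0.0075 | 0.0135 | 0.4757 | 0.0058 | 0.0133 | 0.4719 | 0.0069 | 0.0134 | 0.4750 | 0.0055 | 0.0133 | 0.4718 |
| 0.5 | 0.5 | 50 | λ | −0.4505 | 0.2032 | 0.0495 | −0.4374 | 0.1916 | 0.0645 | −0.4603 | 0.2120 | 0.0397 | −0.4064 | 0.1655 | 0.0713 | −0.4999 | 0.2499 | 0.0002 |
| 0.5 | 0.5 | 100 | θ | 0.0291 | 0.0124 | 0.4307 | 0.0303 | 0.0134 | 0.4333 | 0.0028 | 0.0131 | 0.4255 | 0.0295 | 0.0130 | 0.4312 | 0.0274 | 0.0131 | 0.4246 |
| 0.5 | 0.5 | 100 | λ | −0.4204 | 0.1771 | 0.0642 | −0.4030 | 0.1628 | 0.0796 | −0.4342 | 0.1887 | 0.0521 | −0.3742 | 0.1405 | 0.0841 | −0.4946 | 0.2446 | 0.0097 |
| 0.5 | 3 | 20 | θ | 0.0783 | 0.1698 | 1.3450 | 0.1181 | 0.2003 | 1.3980 | 0.0418 | 0.1425 | 1.2318 | 0.0989 | 0.1747 | 1.3478 | −0.0302 | 0.1534 | 1.2360 |
| 0.5 | 3 | 20 | λ | −0.5967 | 0.4528 | 1.0853 | −0.4890 | 0.3254 | 0.9966 | −0.6941 | 0.5849 | 1.1111 | −0.5818 | 0.4329 | 1.0580 | −0.6704 | 0.5575 | 1.1327 |
| 0.5 | 3 | 50 | θ | −0.0389 | 0.0829 | 0.9246 | −0.0219 | 0.0866 | 0.9413 | −0.0558 | 0.0793 | 0.8868 | −0.0247 | 0.0803 | 0.9205 | −0.1120 | 0.1012 | 0.9059 |
| 0.5 | 3 | 50 | λ | −0.2242 | 0.0906 | 0.7755 | −0.1974 | 0.0726 | 0.6992 | −0.2507 | 0.1108 | 0.8376 | −0.2207 | 0.0881 | 0.7653 | −0.2414 | 0.1040 | 0.8219 |
| 0.5 | 3 | 100 | θ | −0.0457 | 0.0799 | 0.8656 | −0.0290 | 0.0828 | 0.9041 | −0.0622 | 0.0770 | 0.8300 | −0.0314 | 0.0769 | 0.8707 | −0.1192 | 0.0999 | 0.8511 |
| 0.5 | 3 | 100 | λ | −0.2203 | 0.0876 | 0.7755 | −0.1938 | 0.0700 | 0.6992 | −0.2466 | 0.1073 | 0.8376 | −0.2169 | 0.0851 | 0.7653 | −0.2373 | 0.1007 | 0.8219 |
| 3 | 0.5 | 20 | θ | −0.0119 | 0.0524 | 0.8664 | −0.0101 | 0.0523 | 0.8657 | −0.0137 | 0.0524 | 0.8661 | −0.0117 | 0.0524 | 0.8664 | −0.0129 | 0.0525 | 0.8665 |
| 3 | 0.5 | 20 | λ | −0.4911 | 0.2411 | 0.0129 | −0.4884 | 0.2385 | 0.0180 | −0.4929 | 0.2429 | 0.0099 | −0.4717 | 0.2226 | 0.0330 | −0.4998 | 0.2498 | 0.0005 |
| 3 | 0.5 | 50 | θ | 0.0029 | 0.0112 | 0.4081 | 0.0036 | 0.0112 | 0.4077 | 0.0022 | 0.0111 | 0.4081 | 0.0030 | 0.0112 | 0.4080 | 0.0025 | 0.0112 | 0.4082 |
| 3 | 0.5 | 50 | λ | −0.4495 | 0.2023 | 0.0525 | −0.4360 | 0.1905 | 0.0672 | −0.4596 | 0.2113 | 0.0404 | −0.4048 | 0.1643 | 0.0736 | −0.4999 | 0.2499 | 0.0002 |
| 3 | 0.5 | 100 | θ | 0.0034 | 0.0049 | 0.2857 | 0.0040 | 0.0049 | 0.2853 | 0.0029 | 0.0049 | 0.2860 | 0.0035 | 0.0049 | 0.2857 | 0.0031 | 0.0049 | 0.2859 |
| 3 | 0.5 | 100 | λ | −0.4238 | 0.1799 | 0.0538 | −0.4064 | 0.1654 | 0.0653 | −0.4375 | 0.1916 | 0.0439 | −0.3757 | 0.1415 | 0.0652 | −0.4986 | 0.2486 | 0.0026 |
| 3 | 3 | 20 | θ | −0.0261 | 0.0370 | 0.6937 | 0.0126 | 0.0297 | 0.6419 | −0.0640 | 0.0320 | 0.6453 | −0.0218 | 0.0317 | 0.5972 | −0.0478 | 0.0386 | 0.6255 |
| 3 | 3 | 20 | λ | −0.6123 | 0.4187 | 0.7591 | −0.5002 | 0.3003 | 0.8630 | −0.7168 | 0.5547 | 0.7219 | −0.5969 | 0.4002 | 0.7670 | −0.6900 | 0.5200 | 0.7443 |
| 3 | 3 | 50 | θ | −0.0277 | 0.0274 | 0.5896 | −0.0182 | 0.0268 | 0.5730 | −0.0372 | 0.0282 | 0.6052 | −0.0267 | 0.0273 | 0.5874 | −0.0331 | 0.0280 | 0.6017 |
| 3 | 3 | 50 | λ | −0.2226 | 0.0826 | 0.7089 | −0.2005 | 0.0687 | 0.6596 | −0.2449 | 0.0978 | 0.7456 | −0.2198 | 0.0807 | 0.7030 | −0.2368 | 0.0925 | 0.7363 |
| 3 | 3 | 100 | θ | −0.0252 | 0.0270 | 0.5209 | −0.0159 | 0.0264 | 0.5730 | −0.0345 | 0.0277 | 0.6052 | −0.0242 | 0.0269 | 0.5874 | −0.0305 | 0.0276 | 0.6017 |
| 3 | 3 | 100 | λ | −0.2207 | 0.0816 | 0.6709 | −0.1990 | 0.0680 | 0.6596 | −0.2423 | 0.0965 | 0.7456 | −0.2179 | 0.0798 | 0.7030 | −0.2344 | 0.0913 | 0.7363 |
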
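
Tables 1–3 report bias, MSE, and interval length (L.CCI) for each parameter and sample size. A minimal sketch of how such summaries are commonly computed over simulation replicates follows. Averaging the interval lengths across replicates is an assumption here, and the function name and all numbers are made up for illustration; they are not taken from the study.

```python
from statistics import fmean


def summarize(estimates, intervals, true_value):
    """Return (bias, MSE, mean interval length) over simulation replicates.

    estimates  : one point estimate per replicate
    intervals  : one (lower, upper) interval per replicate
    true_value : the parameter value used to generate the data
    """
    bias = fmean(estimates) - true_value
    mse = fmean((e - true_value) ** 2 for e in estimates)
    length = fmean(upper - lower for lower, upper in intervals)
    return bias, mse, length


# Hypothetical replicate results for a parameter whose true value is 0.5.
estimates = [0.48, 0.55, 0.51, 0.46]
intervals = [(0.30, 0.70), (0.35, 0.80), (0.32, 0.74), (0.28, 0.66)]

bias, mse, length = summarize(estimates, intervals, true_value=0.5)
print(f"bias={bias:.4f}  MSE={mse:.5f}  length={length:.4f}")
```
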
Table 4. MLE estimates with goodness-of-fit test and different measures for different alternative models.

| Model | Parameter | Estimate | KS-Test | Chi2-Test | AIC | CAIC | BIC | HQIC |
|---|---|---|---|---|---|---|---|---|
| DGP | θ | −0.4052 | 0.1429 | 35.2645 | 284.7945 | 285.1021 | 288.2698 | 286.0683 |
| | λ | 15.6070 | 0.3581 | 0.3164 | | | | |
| DMOITL | θ | 16.5627 | 0.1429 | 49.3821 | 297.3120 | 297.6197 | 300.7873 | 298.5859 |
| | λ | 1.8434 | 0.3581 | 0.0255 | | | | |
| DB | α | 1.6460 | 0.3209 | 94.9821 | 325.9139 | 326.2216 | 329.3892 | 327.1877 |
| | θ | 0.7401 | 0.0004 | 0.0000 | | | | |
| DW | λ | 0.9297 | 0.1429 | 38.7117 | 288.3261 | 288.6338 | 291.8014 | 289.6000 |
| | β | 1.0837 | 0.3581 | 0.1925 | | | | |
| DIW | λ | 0.0642 | 0.2034 | 64.6983 | 315.3363 | 315.6439 | 318.8116 | 316.6101 |
| | β | 0.7797 | 0.0618 | 0.0005 | | | | |
| NB | P | 0.8015 | 0.3072 | 28307.5450 | 431.9343 | 432.0343 | 433.6720 | 432.5712 |
| | | | 0.0007 | 0.0000 | | | | |
| Poisson | λ | 10.4048 | 0.3277 | 677700.3282 | 482.2590 | 482.3590 | 483.9967 | 482.8960 |
| | | | 0.0002 | 0.0000 | | | | |
| DGE | α | 0.9124 | 0.1595 | 38.3097 | 288.6633 | 288.9710 | 292.1386 | 289.9371 |
| | θ | 0.9986 | 0.2359 | 0.2049 | | | | |
| DAPL | α | 48.5629 | 0.1804 | 44.5099 | 305.8090 | 306.4406 | 311.0221 | 307.7198 |
| | θ | 3.1137 | 0.1301 | 0.0697 | | | | |
| | λ | 0.5752 | | | | | | |
| DL | θ | 0.8437 | 0.1231 | 51.3964 | 289.7677 | 289.8677 | 291.5054 | 290.4046 |
| | | | 0.5479 | 0.0163 | | | | |

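
The four information criteria in Table 4 can all be derived from a model's maximized log-likelihood. The sketch below uses the usual textbook definitions and treats CAIC as the small-sample corrected AIC, which is an assumption about the authors' convention. The sample size n = 42 is also an assumption: it is not stated in this section and was back-solved from the DGP row, as was the value of −2 log L (AIC minus 2k with k = 2). The function name is hypothetical.

```python
import math


def information_criteria(neg2_loglik, k, n):
    """Return (AIC, CAIC, BIC, HQIC) from -2 * log-likelihood,
    the number of parameters k, and the sample size n."""
    aic = neg2_loglik + 2 * k
    caic = aic + 2 * k * (k + 1) / (n - k - 1)  # corrected AIC (assumed meaning of CAIC)
    bic = neg2_loglik + k * math.log(n)
    hqic = neg2_loglik + 2 * k * math.log(math.log(n))
    return aic, caic, bic, hqic


# DGP row: AIC = 284.7945 with k = 2 gives -2 log L = 280.7945.
# n = 42 is an inferred value, not one quoted from the article.
aic, caic, bic, hqic = information_criteria(280.7945, k=2, n=42)
print(f"AIC={aic:.4f}  CAIC={caic:.4f}  BIC={bic:.4f}  HQIC={hqic:.4f}")
```

Under these assumptions the printed values agree with the DGP row of Table 4 to within rounding, and the same n also reproduces the one-parameter DL row.
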
Table 5. MLE and Bayesian estimates with SE for data set 1.

| Parameter | MLE Estimate | MLE SE | Bayesian Estimate | Bayesian SE |
|---|---|---|---|---|
| θ | −0.4052 | 0.1651 | −0.2337 | 0.1209 |
| λ | 15.6070 | 3.3902 | 15.5417 | 0.8679 |

Table 6. MLE and Bayesian estimates with SE for data set 2.

| Parameter | MLE Estimate | MLE SE | Bayesian Estimate | Bayesian SE |
|---|---|---|---|---|
| θ | −0.491911 | 0.103421 | −0.41147 | 0.093889 |
| λ | 33.312755 | 5.266817 | 33.34727 | 0.886706 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Haj Ahmad, H.; Almetwally, E.M. Generating Optimal Discrete Analogue of the Generalized Pareto Distribution under Bayesian Inference with Applications. Symmetry 2022, 14, 1457. https://0-doi-org.brum.beds.ac.uk/10.3390/sym14071457

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
