
Panel Data Estimation for Correlated Random Coefficients Models

1 Department of Economics, University of Southern California, Los Angeles, CA 90089, USA
2 Department of Quantitative Finance, NTHU and WISE, Xiamen University, Xiamen 361005, China
3 Department of Economics, Texas A&M University, College Station, TX 77843, USA
4 Department of Economics, University at Albany, SUNY, Albany, NY 12222, USA
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
Submission received: 30 January 2018 / Revised: 13 January 2019 / Accepted: 23 January 2019 / Published: 1 February 2019
(This article belongs to the Special Issue Celebrated Econometricians: Peter Phillips)

Abstract
This paper considers methods of estimating a static correlated random coefficients model with panel data. We focus on comparing two approaches to estimating the unconditional mean of the coefficients: the group mean estimator and the generalized least squares estimator. For the group mean estimator, we show that it asymptotically achieves the Chamberlain (1992) semiparametric efficiency bound. For the generalized least squares estimator, we show that when T is large, a generalized least squares estimator that ignores the correlation between the individual coefficients and the regressors is asymptotically equivalent to the group mean estimator. In addition, we give conditions under which the standard within estimator of the mean of the coefficients is consistent. Moreover, with additional assumptions on the known correlation pattern, we derive the asymptotic properties of panel least squares estimators. Simulations are used to examine the finite sample performance of the different estimators.
JEL Classification:
C13; C33

1. Introduction

One useful tool for reducing real-world detail in econometric modeling is "suitable" aggregation of micro data. For aggregation not to distort the fundamental behavioral relationships between the micro data and the aggregate data, certain "homogeneity" conditions must hold across the micro units (e.g., Hsiao et al. 2005; Pesaran 2003; Stoker 1993; Theil 1954). However, the "homogeneity" assumption is often rejected by empirical investigators (e.g., Kuh 1963; Hsiao and Tahmiscioglu 1997). On the other hand, most policy makers are interested only in the average relationships in the population, not in individual relationships. A random coefficients formulation can be a useful tool to accommodate both the "heterogeneity" among micro units and policy makers' desire to find the average relationship (e.g., Hsiao et al. 1993).
Standard random coefficients models assume that the variation of the coefficients is independent of the variation of the regressors (e.g., Hsiao 1996; Hsiao and Pesaran 2008). In recent years, a great deal of attention has been devoted to the correlated random coefficients model. For instance, in the human capital literature, let the dependent variable y denote the logarithm of earnings and the explanatory variable x denote years of schooling; the coefficient β then denotes the rate of return to schooling. It is possible that the return to schooling declines with the level of schooling. It is also plausible that there are unmeasured ability or motivational factors that affect the return to schooling and are also correlated with the level of schooling (e.g., Card 1995; Heckman and Vytlacil 1998; Heckman et al. 2006; Heckman et al. 2010). In particular, Heckman and Vytlacil (1998) propose an instrumental variable method for the population mean of the slope coefficients, but not the intercept, in the cross-sectional correlated random coefficients model. Their method requires the existence of instrumental variables for both the regressors and the random coefficients.
Many researchers have worked on correlated random coefficients panel data models. For instance, Chamberlain (1992) showed how to apply his general result on the semiparametric efficiency bound to a random coefficients panel model as an example. The model considered in Chamberlain (1992) also allows for time varying parameters, and is therefore more general than our model. However, the expression for the efficiency bound obtained using Chamberlain (1992)'s formulas differs in appearance from the expression obtained by direct derivation. We show, in this paper, that they are, indeed, exactly the same. Due to the inclusion of the time varying parameters, Chamberlain (1992) requires the number of time periods T to be greater than the number of random coefficients K; otherwise, the information matrix of the time varying coefficients is singular. Graham and Powell (2012) further considered the situation where T = K, and proposed a novel "irregular" method that leads to consistent estimation. Their approach assumes the existence of panel data with two subpopulations, one consisting of units whose regressor values do not change across periods and the other of units whose regressor values do change. Arellano and Bonhomme (2012) discuss the identification of the distribution of the random coefficients conditional on the values of the regressors, extending the idea of Chamberlain (1992). Chernozhukov et al. (2013) consider more general nonseparable panel models that include the correlated random coefficients model as a special case.
In this paper, we consider the parametric identification and estimation of the unconditional mean of the random coefficients using panel data when the regularity conditions hold. Two approaches are considered: one that ignores the correlations between the coefficients and the regressors, and one that explicitly models these correlations.
The rest of the paper is organized as follows. We discuss the estimation of the unconditional mean of the random coefficients with panel data in Section 2 and Section 3. Section 2 considers the approach without explicitly modeling the pattern of correlations. Section 3 considers the approach with explicit assumption about the correlations between the coefficients and regressors. Section 4 provides Monte Carlo results of the different estimators in a finite sample. Concluding remarks are in Section 5.

2. Panel Parametric Approaches without Explicit Assumption about the Correlations between Coefficients and Regressors

When only cross-sectional data are available, the identification conditions of average effects for a correlated random coefficients model require the existence of instrumental variables, which are very stringent and may not be satisfied for many data sets. However, when panel data are available, it is possible to obtain a consistent estimator of the population mean of random coefficients without the existence of instrumental variables.
Suppose there are $T$ time series observations of $(y_{it}, x_{it})$, $t = 1, \dots, T$, for each individual $i$. Let $y_i$ and $x_i$ be the $T \times 1$ vector and the $T \times K$ matrix with typical row elements $y_{it}$ and $x_{it}' = (x_{it,1}, \dots, x_{it,K})$, respectively, for $i = 1, 2, \dots, N$. Also, let $\beta_i = (\beta_{i1}, \dots, \beta_{iK})'$. We have
$$y_i = x_i \beta_i + u_i, \quad i = 1, \dots, N. \qquad (1)$$
Let $u_i = (u_{i1}, \dots, u_{iT})'$, and assume that $u_i$ is i.i.d. across $i$, with $E(u_i \mid x_i) = 0$ and $E(u_i u_i' \mid x_i) = \Sigma_{x_i}$ (a $T \times T$ matrix). We assume that $\beta_i$ is i.i.d. with mean $\beta$ and variance $\mathrm{Var}(\beta_i) = \Delta$. Then we can write
$$\beta_i = \beta + \alpha_i,$$
where
$$E(\alpha_i) = E(\beta_i - \beta) = 0,$$
and
$$\mathrm{Cov}(\beta_i, \beta_j) = E(\alpha_i \alpha_j') = \begin{cases} \Delta, & \text{if } i = j, \\ 0, & \text{if } i \neq j. \end{cases}$$
Substituting $\beta_i = \beta + \alpha_i$ into (1) yields
$$y_i = x_i \beta + x_i \alpha_i + u_i = x_i \beta + v_i,$$
where $v_i = u_i + x_i \alpha_i$.
The standard random coefficients model assumes that $\alpha_i$ is a random draw from a population with $E(\alpha_i \mid x_i) = 0$. Then
$$E(v_i \mid x_i) = E(x_i \alpha_i + u_i \mid x_i) = 0,$$
and
$$E(v_i v_i' \mid x_i) = x_i \Delta x_i' + \sigma_u^2 I_T.$$
Therefore, a consistent estimator of $\beta$ can be obtained by simply regressing $Y$ on $X$, where $Y$ and $X$ stack the $y_i$ and $x_i$ and are of dimensions $NT \times 1$ and $NT \times K$, respectively. An efficient estimator of $\beta$ can be obtained by applying the generalized least squares (GLS) estimator (or feasible GLS) (e.g., Hsiao 2003, chp. 6; Swamy 1970).
When $E(\alpha_i \mid x_i) = 0$ is violated, which is common in practice, the coefficients and the regressors are correlated; this is the main focus of the paper. We discuss different conditions and estimators in the following subsections.

2.1. Group Mean Estimator

In this subsection we impose the following mild conditional moment restriction:
$$E(u_i \mid x_i) = 0. \qquad (7)$$
Note that (7) is weaker than $E(u_i \mid x_i, \beta_i) = 0$, as we do not require that $\alpha_i$ and $u_i$ be orthogonal to each other. Equation (7) implies the following unconditional moment condition:
$$E\big((x_i' x_i)^{-1} x_i' u_i\big) = 0. \qquad (8)$$
When $(x_i' x_i)$ is invertible (which requires $T \geq K$), then from (1) one obtains $(x_i' x_i)^{-1} x_i' u_i = (x_i' x_i)^{-1} x_i' y_i - \beta_i$. Taking expectations yields the unconditional moment condition
$$E\big[(x_i' x_i)^{-1} x_i' y_i - \beta\big] = 0. \qquad (9)$$
Moment condition (9) leads to the estimator of $\beta$ given by
$$\hat{\beta}_{GM} = \frac{1}{N} \sum_{i=1}^{N} \hat{\beta}_i, \qquad (10)$$
where $\hat{\beta}_i = (x_i' x_i)^{-1} x_i' y_i$. Estimator (10) is the group mean (GM) estimator of Pesaran and Smith (1995) and Hsiao et al. (1999).
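As a computational illustration, the GM estimator only requires unit-by-unit least squares followed by a simple average. A minimal sketch in Python on simulated data (the function and variable names are ours, not from the paper):

```python
import numpy as np

def group_mean_estimator(y, x):
    """Group mean (GM) estimator (10): the simple average of the
    unit-by-unit OLS estimates beta_hat_i = (x_i'x_i)^{-1} x_i'y_i.

    y : (N, T) outcomes; x : (N, T, K) regressors, with T >= K.
    """
    N, T, K = x.shape
    betas = np.empty((N, K))
    for i in range(N):
        # unit-specific least squares estimate beta_hat_i
        betas[i], *_ = np.linalg.lstsq(x[i], y[i], rcond=None)
    return betas.mean(axis=0)

# Demo with coefficients deliberately correlated with the regressors:
rng = np.random.default_rng(0)
N, T, K = 500, 5, 2
beta = np.array([1.0, 2.0])
alpha = rng.normal(size=(N, K))                           # random coefficient part
x = rng.normal(size=(N, T, K)) + 0.5 * alpha[:, None, :]  # x correlated with alpha
y = np.einsum('itk,ik->it', x, beta + alpha) + rng.normal(size=(N, T))
print(group_mean_estimator(y, x))                         # close to [1, 2]
```

Note that the correlation between $\alpha_i$ and $x_i$ in the simulated data leaves the GM estimate centered on $\beta$, in line with Proposition 1 below.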
Under certain regularity conditions, we show that the GM estimator achieves the semiparametric efficiency bound derived in Chamberlain (1992). Note that (with $\alpha_i = \beta_i - \beta$)
$$\hat{\beta}_i - \beta = \alpha_i + (x_i' x_i)^{-1} x_i' u_i. \qquad (11)$$
Then
$$\mathrm{Var}(\hat{\beta}_i) = \mathrm{Var}(\alpha_i) + E\big[(x_i' x_i)^{-1} x_i' \Sigma_{x_i} x_i (x_i' x_i)^{-1}\big] + E\big[\alpha_i u_i' x_i (x_i' x_i)^{-1}\big] + E\big[(x_i' x_i)^{-1} x_i' u_i \alpha_i'\big] \equiv \Omega. \qquad (12)$$
In particular, in the uncorrelated case, we impose the restriction that
$$E(u_i \mid x_i, \alpha_i) = 0. \qquad (13)$$
Then the covariance terms in (12) drop out. Moreover, suppose we further impose the conditional homoskedastic error assumption
$$\mathrm{Var}(u_i \mid x_i) = \mathrm{Var}(u_i) = \sigma_u^2 I_T. \qquad (14)$$
Then $\mathrm{Var}(\hat{\beta}_i)$ simplifies to
$$\Delta + \sigma_u^2 E\big[(x_i' x_i)^{-1}\big], \qquad (15)$$
where $\Delta = \mathrm{Var}(\alpha_i)$.
The following proposition describes the asymptotic behavior of β ^ G M .
Proposition 1.
If $E(u_i \mid x_i) = 0$ and $T \geq K$, then
(i) 
The group mean estimator defined in (10) is $\sqrt{N}$-consistent and asymptotically normally distributed; specifically, we have
$$\sqrt{N}(\hat{\beta}_{GM} - \beta) \xrightarrow{d} N(0, \Omega), \qquad (16)$$
where Ω is defined in (12).
(ii) 
β ^ G M is semiparametrically efficient.
(iii) 
If conditions (13) and (14) also hold, then the asymptotic variance $\Omega$ simplifies to $\Omega = \Delta + \sigma_u^2 E[(x_i' x_i)^{-1}]$.
Proof. 
(i) $\hat{\beta}_{GM} = \beta + \frac{1}{N} \sum_{i=1}^{N} \big[\alpha_i + (x_i' x_i)^{-1} x_i' u_i\big]$. Hence $\sqrt{N}(\hat{\beta}_{GM} - \beta) = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} w_i$, where $w_i = \alpha_i + (x_i' x_i)^{-1} x_i' u_i$ is i.i.d. with mean zero and finite variance $\Omega$. Proposition 1(i) then follows from the Lindeberg–Lévy central limit theorem. (iii) follows directly from (i), (13) and (14). We postpone the proof of (ii) to Appendix A. ☐
Remark 1.
Note that (16) holds without imposing any restriction on the correlations between $x_i$ and $\alpha_i$, or between $u_i$ and $\alpha_i$. The random coefficient $\alpha_i$ can be correlated with both $x_i$ and $u_i$ with arbitrary correlation patterns. Also, since $x_i$ can contain a constant (an intercept), the conventional fixed effects model is included in the correlated random coefficients model as a special case.

2.2. Generalized Least Squares Estimator

In this subsection we consider a generalized least squares (GLS) estimator of $\beta$ under the assumption that $\mathrm{Cov}(\beta_i, x_i) = 0$ and compare the relative efficiency of the group mean estimator and the GLS estimator. Under the assumption that
$$E(\alpha_i \mid x_i) = 0, \qquad (17)$$
together with (13) and (14), i.e., $E(u_i \mid x_i, \alpha_i) = 0$ and $\mathrm{Var}(u_i \mid x_i) = \sigma_u^2 I_T$, the best linear unbiased estimator (BLUE) of $\beta$ is the generalized least squares estimator (e.g., Hsiao 2003, chp. 6):
$$\hat{\beta}_{GLS} = \left[\sum_{i=1}^{N} x_i' (x_i \Delta x_i' + \sigma_u^2 I_T)^{-1} x_i\right]^{-1} \sum_{i=1}^{N} x_i' (x_i \Delta x_i' + \sigma_u^2 I_T)^{-1} y_i = \sum_{i=1}^{N} W_i \hat{\beta}_i, \qquad (18)$$
where
$$W_i = \left\{\sum_{j=1}^{N} \big[\Delta + \sigma_u^2 (x_j' x_j)^{-1}\big]^{-1}\right\}^{-1} \big[\Delta + \sigma_u^2 (x_i' x_i)^{-1}\big]^{-1}$$
is a positive definite weight matrix satisfying $\sum_{i=1}^{N} W_i = I_K$.
In contrast to the group mean estimator (10), which takes the simple average of the individual least squares estimators $\hat{\beta}_i$, the GLS estimator takes a weighted average of the $\hat{\beta}_i$.
Noting that $y_i = x_i \beta_i + u_i = x_i \beta + \epsilon_i$, where $\epsilon_i = x_i \alpha_i + u_i$, define
$$Y_{(NT) \times 1} = (y_1', \dots, y_N')', \quad X_{(NT) \times K} = (x_1', \dots, x_N')', \quad \epsilon_{(NT) \times 1} = (\epsilon_1', \dots, \epsilon_N')'. \qquad (19)$$
Then $\mathrm{Var}(\epsilon) = \Omega_{(NT) \times (NT)} = \mathrm{Blockdiag}(\Sigma_i)$ is a block-diagonal matrix with $i$th diagonal block $\Sigma_i = x_i \Delta x_i' + \sigma_u^2 I_T$. Hence,
$$\hat{\beta}_{GLS} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y = \left[\sum_{i=1}^{N} x_i' \Sigma_i^{-1} x_i\right]^{-1} \sum_{i=1}^{N} x_i' \Sigma_i^{-1} y_i = \beta + \left[N^{-1} \sum_{i=1}^{N} x_i' \Sigma_i^{-1} x_i\right]^{-1} N^{-1} \sum_{i=1}^{N} x_i' \Sigma_i^{-1} \epsilon_i.$$
Then, by the law of large numbers and central limit theorem arguments, and noting that $\mathrm{Var}(\epsilon_i \mid x_i) = x_i \Delta x_i' + \sigma_u^2 I_T$, we obtain
$$\sqrt{N}(\hat{\beta}_{GLS} - \beta) \xrightarrow{d} N(0, A),$$
where
$$A = \left\{E\big[x_i' (x_i \Delta x_i' + \sigma_u^2 I_T)^{-1} x_i\big]\right\}^{-1}.$$
Clearly, $\hat{\beta}_{GLS}$ is not feasible. A feasible GLS estimator is obtained by replacing $\Delta$ and $\sigma_u^2$ with
$$\hat{\Delta} = N^{-1} \sum_{i=1}^{N} (\hat{\beta}_i - \hat{\beta}_{GM})(\hat{\beta}_i - \hat{\beta}_{GM})' - \frac{\hat{\sigma}_u^2}{N} \sum_{i=1}^{N} (x_i' x_i)^{-1}, \quad \hat{\sigma}_u^2 = N^{-1} (T - K)^{-1} \sum_{i=1}^{N} \sum_{t=1}^{T} (y_{it} - x_{it}' \hat{\beta}_i)^2,$$
where $\hat{\beta}_i$ is given in (10). The consistency of $\hat{\Delta}$ and $\hat{\sigma}_u^2$ can be proved similarly to that of $\hat{\Delta}$ and $\tilde{\sigma}_u^2$ in Appendix A.
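The feasible GLS estimator can be sketched directly from these formulas. The following Python sketch (our own naming; it assumes $T > K$ so that $\hat{\sigma}_u^2$ is well defined, and for simplicity does not guard against a non-positive-definite $\hat{\Delta}$ in small samples):

```python
import numpy as np

def feasible_gls(y, x):
    """Feasible GLS (18): a weighted average of the unit OLS estimates, with
    weights proportional to [Delta_hat + sigma2_hat (x_i'x_i)^{-1}]^{-1}.

    y : (N, T); x : (N, T, K), with T > K.
    """
    N, T, K = x.shape
    betas = np.empty((N, K))
    xtx_inv = np.empty((N, K, K))
    ssr = 0.0
    for i in range(N):
        xtx_inv[i] = np.linalg.inv(x[i].T @ x[i])
        betas[i] = xtx_inv[i] @ (x[i].T @ y[i])
        resid = y[i] - x[i] @ betas[i]
        ssr += resid @ resid
    sigma2 = ssr / (N * (T - K))                            # sigma_hat_u^2
    b_gm = betas.mean(axis=0)                               # group mean estimate
    dev = betas - b_gm
    delta = dev.T @ dev / N - sigma2 * xtx_inv.mean(axis=0)  # Delta_hat
    num, den = np.zeros(K), np.zeros((K, K))
    for i in range(N):
        w = np.linalg.inv(delta + sigma2 * xtx_inv[i])      # unnormalized W_i
        den += w
        num += w @ betas[i]
    return np.linalg.solve(den, num)

# Demo: random coefficients independent of x, where GLS is consistent.
rng = np.random.default_rng(1)
N, T, K = 500, 6, 2
beta = np.array([1.0, 2.0])
alpha = 0.7 * rng.normal(size=(N, K))                       # independent of x
x = rng.normal(size=(N, T, K))
y = np.einsum('itk,ik->it', x, beta + alpha) + rng.normal(size=(N, T))
print(feasible_gls(y, x))                                   # close to [1, 2]
```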
Remark 2.
Note that without condition (17), i.e., $E(\alpha_i \mid x_i) = 0$, $\hat{\beta}_{GM}$ is still a root-$N$ consistent estimator of $\beta$, as shown in Proposition 1, while $\hat{\beta}_{GLS}$ becomes inconsistent when $T$ is finite because $E(\epsilon_i \mid x_i) = x_i E(\alpha_i \mid x_i) \neq 0$. However, when $T$ is large, noting that $x_i' x_i / T = E(x_{it} x_{it}') + O_p(T^{-1/2})$ under the strong mixing condition and the conditions of Theorem 24.6 in Davidson (1994), the weight matrix $W_i$ is close to a constant matrix:
$$W_i = \left\{\frac{1}{N} \sum_{j=1}^{N} \Big[\Delta + \frac{\sigma_u^2}{T} (x_j' x_j / T)^{-1}\Big]^{-1}\right\}^{-1} \frac{1}{N} \Big[\Delta + \frac{\sigma_u^2}{T} (x_i' x_i / T)^{-1}\Big]^{-1} = \left\{\frac{1}{N} \sum_{j=1}^{N} \Big[\Delta + \frac{\sigma_u^2}{T} \big[E(x_{jt} x_{jt}')\big]^{-1}\Big]^{-1}\right\}^{-1} \frac{1}{N} \Big[\Delta + \frac{\sigma_u^2}{T} \big[E(x_{it} x_{it}')\big]^{-1}\Big]^{-1} + O_p\Big(\frac{1}{N T^{3/2}}\Big) = \frac{1}{N} I_K + O_p\Big(\frac{1}{N T}\Big).$$
It is then easy to see that $\hat{\beta}_{GLS} = N^{-1} \sum_{i=1}^{N} \hat{\beta}_i + o_p(1)$, which is a consistent estimator of $\beta = E(\beta_i)$.
The next proposition compares the relative efficiency of $\hat{\beta}_{GLS}$ and $\hat{\beta}_{GM}$ by comparing their asymptotic variances: $\mathrm{Avar}(\sqrt{N} \hat{\beta}_{GLS}) = \{E[x_i' (x_i \Delta x_i' + \sigma_u^2 I_T)^{-1} x_i]\}^{-1}$ and $\mathrm{Avar}(\sqrt{N} \hat{\beta}_{GM}) = \Omega = \Delta + \sigma_u^2 E[(x_i' x_i)^{-1}]$.
Proposition 2.
Assume that $T$ is small (but still $T \geq K$) and that conditions (13), (14) and (17) hold. Then $\mathrm{Avar}(\sqrt{N} \hat{\beta}_{GLS}) \leq \mathrm{Avar}(\sqrt{N} \hat{\beta}_{GM})$.
The proof of Proposition 2 is given in Appendix A.
Proposition 2 says that, under some additional assumptions, $\hat{\beta}_{GLS}$ is asymptotically more efficient than $\hat{\beta}_{GM}$. This does not contradict Proposition 1(ii), because the result of Proposition 1 does not require any of conditions (13), (14) and (17) to hold. Under these additional conditions, $\hat{\beta}_{GM}$ is no longer a semiparametrically efficient estimator of $\beta$. However, these additional conditions, especially condition (17), are quite restrictive.
It was shown by Hsiao et al. (1999) that when $T$ is large, $\hat{\beta}_{GLS}$ becomes a consistent estimator of $\beta$ without the restrictive condition (17).
Proposition 3.
Under conditions (13) and (14), if both $N, T \to \infty$ and $N^{1/2}/T \to 0$, then $\sqrt{N}(\hat{\beta}_{GLS} - \hat{\beta}_{GM}) = o_p(1)$.
In other words, if both $N$ and $T$ are large and $\lim_{N,T \to \infty} (N^{1/2}/T) = 0$, then, contrary to the case where only cross-sectional data are available, one can ignore the issue of possible correlations between $\alpha_i$ and $x_i$ (i.e., we allow $E(\alpha_i \mid x_i) \neq 0$) and simply treat the model as if $\beta_i$ and $x_i$ were uncorrelated, applying the conventional GLS (e.g., Hsiao 2003, eq. (6.2.6)).

2.3. Within Estimator

If $T < K$, neither the GM nor the GLS estimator can be implemented. However, the standard within estimator can still yield a consistent estimator of $\beta$ in certain cases. Let $\bar{y}_{i\cdot} = \frac{1}{T} \sum_t y_{it}$ and $\bar{x}_{i\cdot} = \frac{1}{T} \sum_t x_{it}$. The within estimator (or fixed effects estimator) first takes the deviation of each observation from its time series mean, then regresses $(y_{it} - \bar{y}_{i\cdot})$ on $(x_{it} - \bar{x}_{i\cdot})$ (e.g., Hsiao 2003, chp. 3). Model (1) leads to
$$(y_{it} - \bar{y}_{i\cdot}) = (x_{it} - \bar{x}_{i\cdot})' \beta + (x_{it} - \bar{x}_{i\cdot})' \alpha_i + (u_{it} - \bar{u}_{i\cdot}), \quad i = 1, \dots, N, \; t = 1, \dots, T, \qquad (23)$$
where $\bar{u}_{i\cdot} = T^{-1} \sum_t u_{it}$. The fixed effects (FE) estimator of $\beta$ is the least squares estimator of (23):
$$\hat{\beta}_{FE} = \left[\sum_i \sum_t (x_{it} - \bar{x}_{i\cdot})(x_{it} - \bar{x}_{i\cdot})'\right]^{-1} \sum_i \sum_t (x_{it} - \bar{x}_{i\cdot})(y_{it} - \bar{y}_{i\cdot}) = \beta + \left[\sum_i \sum_t (x_{it} - \bar{x}_{i\cdot})(x_{it} - \bar{x}_{i\cdot})'\right]^{-1} \times \left\{\sum_i \sum_t (x_{it} - \bar{x}_{i\cdot})(x_{it} - \bar{x}_{i\cdot})' \alpha_i + \sum_i \sum_t (x_{it} - \bar{x}_{i\cdot})(u_{it} - \bar{u}_{i\cdot})\right\}. \qquad (24)$$
In general, (24) is inconsistent. However, suppose the data generating process of $x_{it}$ takes the form
$$x_{it} = \mu_i + \sum_j B_j \epsilon_{i,t-j}, \quad \sum_j \|B_j\| < \infty, \qquad (25)$$
where $\mu_i$ is i.i.d. with mean $a$ and variance $\Sigma_\mu$, and $\epsilon_{it}$ is i.i.d. both across $i$ and over $t$ with
$$E(\epsilon_{it} \mid \alpha_i) = E(\epsilon_{is} \mid \alpha_i) \equiv d_i \quad \text{for all } t, s = 1, \dots, T. \qquad (26)$$
Then (24) is consistent. To see this, note that under (25) and (26), we have
$$E(x_{it} \mid \alpha_i) = E(\mu_i \mid \alpha_i) + \sum_j B_j E(\epsilon_{i,t-j} \mid \alpha_i) = \delta_i + \Big(\sum_j B_j\Big) d_i \equiv \mu_i^*, \qquad (27)$$
where $\delta_i = E(\mu_i \mid \alpha_i)$ and $d_i = E(\epsilon_{i,t-j} \mid \alpha_i)$.
Let
$$x_{it} = E(x_{it} \mid \alpha_i) + \eta_{it} \equiv \mu_i^* + \eta_{it}, \qquad (28)$$
where $\eta_{it} = x_{it} - \mu_i^* = \big(\mu_i - E(\mu_i \mid \alpha_i)\big) + \sum_j B_j \big(\epsilon_{i,t-j} - E(\epsilon_{i,t-j} \mid \alpha_i)\big)$. Then $x_{it} - \bar{x}_{i\cdot} = \eta_{it} - \bar{\eta}_{i\cdot}$, where $\bar{m}_{i\cdot} = \frac{1}{T} \sum_{t=1}^{T} m_{it}$ ($m$ can be $x$ or $\eta$). Also, from $E(\eta_{it} - \bar{\eta}_{i\cdot} \mid \alpha_i) = 0$, we know that $E(x_{it} - \bar{x}_{i\cdot} \mid \alpha_i) = E(\eta_{it} - \bar{\eta}_{i\cdot} \mid \alpha_i) = 0$. If, in addition, the following conditional homoskedastic error assumption holds:
E [ ( x i t x ¯ i · ) ( x i t x ¯ i · ) | α i ] = E [ ( η i t η ¯ i · ) ( η i t η ¯ i · ) | α i ] = C ,
where C = E [ ( η i t η ¯ i · ) ( η i t η ¯ i · ) ] is a K × K nonsingular constant matrix. Then
$$\frac{1}{NT} \sum_i \sum_t (x_{it} - \bar{x}_{i\cdot})(x_{it} - \bar{x}_{i\cdot})' \alpha_i \xrightarrow{p} C \, E[\alpha_i] = 0 \qquad (30)$$
as $N \to \infty$. Therefore:
Proposition 4.
Under (25) and (29), the conventional fixed effects estimator is $\sqrt{N}$-consistent and asymptotically normally distributed as $N \to \infty$. The asymptotic covariance matrix of (24) can be approximated using the Newey–West heteroskedasticity and autocorrelation consistent (HAC) formula.
When $(x_{it}, \alpha_i)$ has a joint elliptical distribution, the conditional homoskedasticity $E[(x_{it} - \bar{x}_{i\cdot})(x_{it} - \bar{x}_{i\cdot})' \mid \alpha_i] = C$ also holds (e.g., Fang and Zhang 1990; Gupta et al. 1993). Therefore:
Proposition 5.
When $(x_{it}, \alpha_i)$ are jointly elliptically distributed, or the conditional homoskedasticity condition (29) for $(x_{it} - \bar{x}_{i\cdot})$ holds, the FE estimator (24) is $\sqrt{N}$-consistent and asymptotically normally distributed.
Another case in which the fixed effects estimator can be consistent is when $(x_{it}, \alpha_i)$ are jointly symmetrically distributed. Since $x_{it} - \bar{x}_{i\cdot}$ has mean zero, $(x_{it} - \bar{x}_{i\cdot}, \alpha_i)$ is symmetrically distributed around $(0, 0)$, so $\frac{1}{NT} \sum_i \sum_t (x_{it} - \bar{x}_{i\cdot})(x_{it} - \bar{x}_{i\cdot})' \alpha_i \xrightarrow{p} 0$ even though $x_{it}$ has nonzero mean. We have:
Proposition 6.
Under (25) and (26), and if $(x_{it}, \alpha_i)$ are symmetrically distributed, the fixed effects estimator (24) is $\sqrt{N}$-consistent and asymptotically normally distributed.
Wooldridge (2005) also discusses conditions for the validity of the fixed effects estimator. Although the conventional FE estimator (24) can yield a consistent estimator of $\beta$, if $x_{it}$ contains time-invariant variables, the mean effects of those variables cannot be identified by the conventional fixed effects estimator. Moreover, the FE estimator makes use only of the within (group) variation. Since, in general, the between group variation is much larger than the within group variation, the FE estimator can also entail a loss of efficiency.
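The within transformation in (23)–(24) is a one-liner in array form. A Python sketch (our own naming; the joint normality of the simulated data is our illustrative choice, invoking the symmetry condition of Proposition 6):

```python
import numpy as np

def within_estimator(y, x):
    """Conventional within (fixed effects) estimator (24): pooled OLS on
    deviations from unit-specific time means. y : (N, T); x : (N, T, K)."""
    yd = y - y.mean(axis=1, keepdims=True)             # y_it - ybar_i.
    xd = x - x.mean(axis=1, keepdims=True)             # x_it - xbar_i.
    xtx = np.einsum('itk,itl->kl', xd, xd)
    xty = np.einsum('itk,it->k', xd, yd)
    return np.linalg.solve(xtx, xty)

# Demo: x_it and alpha_i jointly normal (hence jointly symmetric), so the
# FE estimator is consistent even though alpha_i is correlated with x_it.
rng = np.random.default_rng(2)
N, T = 2000, 3
alpha = rng.normal(size=N)
x = rng.normal(size=(N, T)) + 0.5 * alpha[:, None]     # correlated with alpha
y = (1.0 + alpha)[:, None] * x + rng.normal(size=(N, T))
print(within_estimator(y, x[:, :, None]))              # close to [1.0]
```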

3. Panel Least Squares or Generalized Least Squares Estimator

If $\alpha_i$ is correlated with $x_i$, i.e., $E(\alpha_i \mid x_i) \neq 0$, we can rewrite (1) as
$$y_{it} = x_{it}' \beta + x_{it}' E(\alpha_i \mid x_i) + v_{it}, \qquad (31)$$
where $v_{it} = u_{it} + x_{it}' w_i$ and $w_i = \alpha_i - E(\alpha_i \mid x_i)$. Equation (31) is no longer a linear function of $x_{it}$. For instance, suppose
$$E(\alpha_i \mid x_i) = a + B \, \mathrm{vec}(x_i), \qquad (32)$$
as assumed by Mundlak (1978). Noting that
$$E(\alpha_i) = E[E(\alpha_i \mid x_i)] = a + B \, E(\mathrm{vec}(x_i)) = 0, \qquad (33)$$
which implies that $a = -B \, E(\mathrm{vec}(x_i))$, (32) can be written as
$$E(\alpha_i \mid x_i) = B \big(\mathrm{vec}(x_i) - E(\mathrm{vec}(x_i))\big). \qquad (34)$$
Equation (31) then becomes
$$y_{it} = x_{it}' \beta + x_{it}' B \, \mathrm{vec}\big(x_i - E(x_i)\big) + v_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T. \qquad (35)$$
Let $v_i = (v_{i1}, \dots, v_{iT})'$. Then $E(v_i \mid x_i) = 0$ and $E(v_i v_i') = E(x_i \Delta x_i') + \sigma_u^2 I_T$, where $\Delta = E(w_i w_i') = E(w_i w_i' \mid x_i)$. Therefore, the least squares or the generalized least squares estimator of $\beta$ is $\sqrt{N}$-consistent provided that
$$\frac{1}{N} \sum_{i=1}^{N} E \begin{pmatrix} x_i' x_i & x_i' z_i \\ z_i' x_i & z_i' z_i \end{pmatrix}, \quad z_i \equiv \mathrm{vec}(x_i - \bar{x}_\cdot)' \otimes x_i, \qquad (36)$$
is a full rank matrix, where $\bar{x}_\cdot = N^{-1} \sum_{i=1}^{N} x_i$.
Similar reasoning applies if $E(\alpha_i \mid x_i)$ is a higher order polynomial of $x_i$, say,
$$E(\alpha_i \mid x_i) = a + B \, \mathrm{vec}(x_i) + C \, \mathrm{vec}(x_i' x_i). \qquad (37)$$
Then, from $E[E(\alpha_i \mid x_i)] = 0$, we get $a = -B \, E(\mathrm{vec}(x_i)) - C \, E[\mathrm{vec}(x_i' x_i)]$, so that
$$E(\alpha_i \mid x_i) = B \, \mathrm{vec}\big(x_i - E(x_i)\big) + C \, \mathrm{vec}\big(x_i' x_i - E(x_i' x_i)\big). \qquad (38)$$
Substituting (38) into (31), we have
$$y_{it} = x_{it}' \beta + \big(\mathrm{vec}(x_i - E(x_i))' \otimes x_{it}'\big) \mathrm{vec}(B) + \big(\mathrm{vec}(x_i' x_i - E(x_i' x_i))' \otimes x_{it}'\big) \mathrm{vec}(C) + v_{it}, \qquad (39)$$
where $v_{it} = x_{it}' w_i + u_{it}$ and $w_i = \alpha_i - E(\alpha_i \mid x_i)$. By construction, $v_{it}$ is uncorrelated with the regressors. Therefore, the least squares (LS) or the feasible generalized least squares (FGLS) estimator of (39) yields a $\sqrt{N}$-consistent and asymptotically normally distributed estimator of $\beta$ when $\bar{x}_\cdot$ and $N^{-1} \sum_{i=1}^{N} x_i' x_i$ are substituted for $E(x_i)$ and $E(x_i' x_i)$ in (39).
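Under the linear specification (34), the augmented Equation (35) can be estimated by pooled least squares. A Python sketch (our own naming; as in the text, sample means replace the population moments):

```python
import numpy as np

def panel_ls_mundlak(y, x):
    """Pooled least squares on the augmented Equation (35): regress y_it on
    x_it and the Kronecker interaction of x_it with vec(x_i - xbar), then
    read beta off the first K coefficients. y : (N, T); x : (N, T, K)."""
    N, T, K = x.shape
    xbar = x.mean(axis=0)                              # sample analog of E(x_i)
    rows, targets = [], []
    for i in range(N):
        v = (x[i] - xbar).ravel()                      # vec(x_i - xbar)
        for t in range(T):
            rows.append(np.concatenate([x[i, t], np.kron(v, x[i, t])]))
            targets.append(y[i, t])
    coef, *_ = np.linalg.lstsq(np.asarray(rows), np.asarray(targets),
                               rcond=None)
    return coef[:K]

# Demo: alpha_i depends linearly on vec(x_i), so assumption (34) holds.
rng = np.random.default_rng(3)
N, T, K = 1000, 4, 1
x = 1.0 + rng.normal(size=(N, T, K))
alpha = 0.5 * (x.mean(axis=(1, 2)) - 1.0) + 0.1 * rng.normal(size=N)
y = np.einsum('itk,ik->it', x, 1.0 + alpha[:, None]) + rng.normal(size=(N, T))
print(panel_ls_mundlak(y, x))                          # close to [1.0]
```

A pooled OLS of $y_{it}$ on $x_{it}$ alone would be biased here, since $E(\alpha_i \mid x_i) \neq 0$; the added interaction terms absorb that conditional mean.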
Next, we derive the asymptotic distribution of $\hat{\beta}_{M,PLS}$, the estimator of $\beta$ in (35). The feasible GLS type estimator $\hat{\beta}_{M,PLS}$ of $\beta$ in (35) can be constructed as
$$\hat{\beta}_{M,PLS} = \big[X' \hat{\Omega}^{-1} X - X' \hat{\Omega}^{-1} X_1 (X_1' \hat{\Omega}^{-1} X_1)^{-1} X_1' \hat{\Omega}^{-1} X\big]^{-1} \big[X' \hat{\Omega}^{-1} Y - X' \hat{\Omega}^{-1} X_1 (X_1' \hat{\Omega}^{-1} X_1)^{-1} X_1' \hat{\Omega}^{-1} Y\big],$$
where $X_1 = (z_1', \dots, z_N')'$ with $z_i = \mathrm{vec}(x_i - \bar{x}_\cdot)' \otimes x_i$, $\bar{x}_\cdot = N^{-1} \sum_{i=1}^{N} x_i$, $\hat{\Omega} = \mathrm{diag}\{\hat{\Sigma}_1, \dots, \hat{\Sigma}_N\}$, $\hat{\Sigma}_i = x_i \hat{\Delta} x_i' + \tilde{\sigma}_u^2 I_T$, $X$ and $Y$ are defined in (19), and
$$\tilde{\sigma}_u^2 = \frac{\tilde{\sigma}_{GM}^2 - (NT)^{-1} \sum_{i=1}^{N} \sum_{t=1}^{T} x_{it}' \tilde{V}_{GM} x_{it}}{1 - (NT)^{-1} \sum_{i=1}^{N} \sum_{t=1}^{T} x_{it}' \hat{E}[(x_i' x_i)^{-1}] x_{it}}, \quad \hat{E}[(x_i' x_i)^{-1}] = N^{-1} \sum_{i=1}^{N} (x_i' x_i)^{-1},$$
$$\tilde{\sigma}_{GM}^2 = (NT)^{-1} \sum_{i=1}^{N} \sum_{t=1}^{T} (y_{it} - x_{it}' \hat{\beta}_{GM})^2, \quad \tilde{V}_{GM} = N^{-1} \sum_{i=1}^{N} (\hat{\beta}_i - \hat{\beta}_{GM})(\hat{\beta}_i - \hat{\beta}_{GM})',$$
$$\hat{\Delta} = \tilde{V}_{GM} - \frac{\tilde{\sigma}_u^2}{N} \sum_{i=1}^{N} (x_i' x_i)^{-1} - \hat{B} \left[\frac{1}{N} \sum_{i=1}^{N} \Big(\mathrm{vec}(x_i) - \frac{1}{N} \sum_{j=1}^{N} \mathrm{vec}(x_j)\Big) \Big(\mathrm{vec}(x_i) - \frac{1}{N} \sum_{j=1}^{N} \mathrm{vec}(x_j)\Big)'\right] \hat{B}',$$
$\hat{\beta}_i$ is given in (10), and $\hat{B}$ is the OLS estimator of $B$ in (35).
We have the following proposition.
Proposition 7.
Under conditions (13), (14) and (34), we have
$$\sqrt{N}(\hat{\beta}_{M,PLS} - \beta) \xrightarrow{d} N(0, V_{M,PLS}),$$
where
$$V_{M,PLS} = \mathrm{Var}[E(\alpha_i \mid x_i)] + \Big(E[x_i' \Sigma_i^{-1} x_i] - E[x_i' \Sigma_i^{-1} z_i] \, E[z_i' \Sigma_i^{-1} z_i]^{-1} E[z_i' \Sigma_i^{-1} x_i]\Big)^{-1},$$
with $z_i = \mathrm{vec}(x_i - E(x_i))' \otimes x_i$, $\Sigma_i = x_i \Delta x_i' + \sigma_u^2 I_T$, $\Delta = E(w_i w_i')$, and $w_i = \alpha_i - E(\alpha_i \mid x_i)$.
The proof is given in Appendix A.
Remark 3.
From the total variance decomposition,
$$\mathrm{Var}(\alpha_i) = \mathrm{Var}[E(\alpha_i \mid x_i)] + E[\mathrm{Var}(\alpha_i \mid x_i)].$$
Furthermore,
$$\mathrm{Var}[E(\alpha_i \mid x_i)] + E[\mathrm{Var}(\alpha_i \mid x_i)] = \mathrm{Var}[E(\alpha_i \mid x_i)] + \Delta$$
under the assumption that $E(\alpha_i \mid x_i)$ has a known functional form, for example as given by (34) or (38), so that $\mathrm{Var}(\alpha_i \mid x_i) = E(w_i w_i' \mid x_i) = \Delta$ is a constant. Therefore, the asymptotic variance in Proposition 1(iii) can be rewritten as
$$\Omega = \mathrm{Var}(\alpha_i) + \sigma_u^2 E[(x_i' x_i)^{-1}] = \mathrm{Var}[E(\alpha_i \mid x_i)] + \Delta + \sigma_u^2 E[(x_i' x_i)^{-1}].$$
By contrast, from Proposition 7, after some algebra,
$$V_{M,PLS} = \mathrm{Var}[E(\alpha_i \mid x_i)] + \Big(E[x_i' \Sigma_i^{-1} x_i] - E[x_i' \Sigma_i^{-1} z_i] \, E[z_i' \Sigma_i^{-1} z_i]^{-1} E[z_i' \Sigma_i^{-1} x_i]\Big)^{-1} \geq \mathrm{Var}[E(\alpha_i \mid x_i)] + \big\{E[x_i' \Sigma_i^{-1} x_i]\big\}^{-1} = \mathrm{Var}[E(\alpha_i \mid x_i)] + \Big\{E\big[\big(\Delta + \sigma_u^2 (x_i' x_i)^{-1}\big)^{-1}\big]\Big\}^{-1},$$
where $z_i = \mathrm{vec}(x_i - E(x_i))' \otimes x_i$, since $E[x_i' \Sigma_i^{-1} z_i] \, E[z_i' \Sigma_i^{-1} z_i]^{-1} E[z_i' \Sigma_i^{-1} x_i]$ is positive semi-definite. Compared with the group mean estimator, it is not clear which estimator is more efficient, even though $\big\{E\big[\big(\Delta + \sigma_u^2 (x_i' x_i)^{-1}\big)^{-1}\big]\big\}^{-1} \leq \Delta + \sigma_u^2 E[(x_i' x_i)^{-1}]$ by the matrix version of Jensen's inequality.

4. Monte Carlo Studies

We consider several data generating designs for the following correlated random coefficients model:
$$y_{it} = \beta_i x_{it} + u_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T,$$
$$\beta_i = \beta + \alpha_i, \quad \beta = 1,$$
where $u_{it}$ is a random draw from the standard normal distribution and is independent of $(x_{it}, \alpha_i)$ in all simulation designs. The regressor $x_{it}$ and the random coefficient $\alpha_i$ are correlated with each other and are generated according to the following designs:
  • Design 1: Randomly draw
$$\begin{pmatrix} \alpha_i \\ v_{i0} \\ v_{i1} \\ v_{i2} \\ v_{i3} \end{pmatrix} \sim N \left( \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 0.2 & 0.2 & 0.2 & 0.2 \\ 0.2 & 0.5 & 0 & 0 & 0 \\ 0.2 & 0 & 0.5 & 0 & 0 \\ 0.2 & 0 & 0 & 0.5 & 0 \\ 0.2 & 0 & 0 & 0 & 0.5 \end{pmatrix} \right), \qquad (43)$$
    then generate $x_{it} = v_{it} + 0.3 v_{i,t-1}$.
  • Design 2: Randomly draw $(\alpha_i, v_{i0}, v_{i1}, v_{i2}, v_{i3})'$ from the multivariate normal distribution in (43), then generate $x_{it} = 2 + v_{it} + v_{i,t-1}$.
  • Design 3: Randomly draw $\alpha_i$ from a uniform distribution on $(-0.75, 0.75)$. Then generate
$$x_{it} = \alpha_i + w_{it}, \qquad (44)$$
    where $w_{it} = 1 + \chi^2(5)$, with $\chi^2(5)$ a random draw from a chi-square distribution with five degrees of freedom.
  • Design 4: Randomly draw $\alpha_i$ from a uniform distribution on $(-0.75, 0.75)$ and $w_{it}$ from $1 + \chi^2(5)$, where $\chi^2(5)$ is a chi-square distribution with five degrees of freedom. Then generate
$$x_{it} = \alpha_i + w_{it} + 0.3 w_{i,t-1}. \qquad (45)$$
  • Designs 5 and 6: Generate $\alpha_i$ from Gamma(1,1), then generate $x_{it}$ according to (44) and (45), respectively.
  • Designs 7 and 8: Generate α i from Beta(1,3), then generate x i t according to (44) and (45), respectively.
  • Designs 9 and 10: Generate x i t = 1 + α i w i t , where α i is from Gamma(1,1) and Beta(1,3), respectively.
Designs 1 and 2 generate $\beta_i$ and $x_{it}$ from jointly symmetric distributions with means (1, 0) and (1, 2), respectively. Designs 3–8 generate $\beta_i$ and $x_{it}$ from correlated but asymmetric distributions. Designs 9 and 10 yield nonlinearly correlated $\beta_i$ and $x_{it}$.
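Design 1 can be simulated directly from its specification. A sketch of the data generating code together with a GM estimate (our naming; the sample size is our illustrative choice):

```python
import numpy as np

def design1(N, T, rng):
    """Simulate Design 1: (alpha_i, v_i0, ..., v_i3) jointly normal with the
    Design 1 covariance matrix; x_it = v_it + 0.3 v_i,t-1; y generated with
    beta = 1 and standard normal u_it. Requires T <= 3 since only
    v_i0, ..., v_i3 are drawn."""
    cov = np.zeros((5, 5))
    cov[0, 0] = 1.0
    cov[0, 1:] = cov[1:, 0] = 0.2
    cov[1:, 1:] = 0.5 * np.eye(4)
    draw = rng.multivariate_normal(np.zeros(5), cov, size=N)
    alpha, v = draw[:, 0], draw[:, 1:]                # v columns are t = 0..3
    x = v[:, 1:T + 1] + 0.3 * v[:, :T]                # x_it = v_it + 0.3 v_i,t-1
    y = (1.0 + alpha)[:, None] * x + rng.standard_normal((N, T))
    return y, x

rng = np.random.default_rng(4)
y, x = design1(2000, 3, rng)
# group mean estimate: average of unit-by-unit OLS slopes (K = 1 here)
b_gm = np.mean([xi @ yi / (xi @ xi) for xi, yi in zip(x, y)])
print(b_gm)                                           # close to 1.0
```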
We examine the finite sample performances of the least squares estimator (LS), the conventional fixed effects estimator (24) (FE), the panel least squares estimators (35) (PLS1) and (39) (PLS2), the group mean estimator (10) (GM), and the generalized least squares estimator (18) (GLS).
We consider $T = 3$ and 20, and $N = 50$, 100, and 200. We replicate each experiment two thousand times. The simulation results are consistent with the theoretical results. Because the results for $T = 3$ and $T = 20$ are similar, we only report the results for $T = 3$; the results for $T = 20$ are available upon request. Table 1 and Table 2 provide the bias and mean squared errors of the estimators. As expected, when $(\alpha_i, x_{it})$ are generated from a symmetric distribution with mean (0, 0) (Design 1), the LS estimator is unbiased. However, if $(\alpha_i, x_{it})$ are generated from a symmetric distribution with mean (0, 2) (Design 2) or from an asymmetric distribution (Design 3), the LS estimator yields biased estimates of $\beta$ (= 1). All estimators other than LS work well when the correlations between the coefficients and regressors are linear; performances under nonlinear correlations (Designs 9 and 10) tell them apart. The GM estimator always enjoys the highest efficiency, which is consistent with our theory. The panel least squares estimators PLS1 and PLS2 pick up linear correlations well but fail to do so in the nonlinear cases; they also incur larger biases in exchange for smaller mean squared errors compared with the FE estimator. The GLS estimator ignoring the correlations between the coefficients and regressors works when there are no correlations or the correlations are linear, but fails when the correlation is nonlinear and $T$ is small, as can be seen from the results of Designs 9 and 10. The conventional FE estimator is nearly unbiased in Designs 1 to 8, but is inconsistent when the conditions for its consistency are not satisfied, and yields larger mean squared errors than GM or GLS. From Table 2 we see that, for Design 9, only the GM estimator exhibits consistency: the MSEs of all other estimators do not decrease as the sample size $N$ increases.

5. Concluding Remarks

Parameter heterogeneity among micro units is quite common, and a random coefficients model is a convenient way to take unobserved heterogeneity into account when pooling panel data (e.g., Hsiao and Tahmiscioglu 1997; Hsiao et al. 2005). However, as demonstrated by Card (1995), Heckman and Vytlacil (1998), and others, the parameter variation can often be correlated with the regressors. When only cross-sectional data are available, Heckman and Vytlacil (1998) showed that consistent estimation of the mean of the coefficients requires very stringent conditions. In this paper, we show that when panel data are available, there is no need to find separate instruments for $x_{it}$ and $\beta_i$. As long as the time series dimension $T$ is no smaller than the number of regressors $K$, we can accommodate the correlations between the random coefficients ($\alpha_i$) and the regressors ($x_i$). In particular, the group mean estimator is consistent and achieves Chamberlain's semiparametric efficiency bound. We also give conditions under which the conventional fixed effects estimator and the generalized least squares estimator lead to consistent estimates of the mean coefficient vector $\beta = E(\beta_i)$. The simulation results strongly support our theoretical analysis. In particular, our Monte Carlo studies show that the group mean estimator is, indeed, robust to a variety of patterns of correlation between the coefficients and regressors.

Author Contributions

The authors contributed equally to this work.

Funding

This research was funded by National Natural Science Foundation of China (71131008, 71631004, 71601130, 71501133).

Acknowledgments

We thank two anonymous referees and L.Q. Wang for helpful comments, and H. Wang for computation assistance. Cheng Hsiao wishes to thank National Natural Science Foundation of China, #71131008 and #71631004 for research support, and Qi Li thanks National Natural Science Foundation of China, #71601130 and #71501133 for research support.

Conflicts of Interest

The authors declare no conflict of interest. The funding sponsors had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

Appendix A

Proof of Proposition 1(ii).
Applying Chamberlain (1992)'s result to our case, the semiparametric lower bound for the asymptotic variance of a regular semiparametric estimator of $\beta$ is given by
$$V_\beta = \mathrm{Var}[E(\beta_i \mid x_i)] + E\Big\{\big[x_i' \big(\mathrm{Var}(y_i \mid x_i)\big)^{-1} x_i\big]^{-1}\Big\}.$$
Since E ( β i | x i ) = β + E ( α i | x i ) , it implies V a r [ E ( β i | x i ) ] = V a r [ E ( α i | x i ) ] . Also, V a r ( y i | x i ) = V a r ( x i α i + u i | x i ) = x i V a r ( α i | x i ) x i + V a r ( u i | x i ) + x i E ( α i u i | x i ) + E ( u i α i | x i ) x i because C o v ( α i , u i | x i ) = E ( α i u i | x i ) . Thus, we have
V β = V a r [ E ( α i | x i ) ] + E { ( x i [ x i V a r ( α i | x i ) x i + V a r ( u i | x i ) + x i E ( α i u i | x i ) + E ( u i α i | x i ) x i ] x i ) 1 } = V a r [ E ( α i | x i ) ] + E ( M i ) ,
where
M i = x i [ x i V a r ( α i | x i ) x i + V a r ( u i | x i ) + x i E ( α i u i | x i ) + E ( u i α i | x i ) x i ] 1 x i 1 .
Then from
M i M i 1 = M i x i [ x i V a r ( α i | x i ) x i + V a r ( u i | x i ) + x i E ( α i u i | x i ) + E ( u i α i | x i ) x i ] 1 x i = I T ,
post-multiplying both sides of the above equation by x i ( x i x i ) 1 , we get
M i x i [ x i V a r ( α i | x i ) x i + V a r ( u i | x i ) + x i E ( α i u i | x i ) + E ( u i α i | x i ) x i ] 1 = x i ( x i x i ) 1 .
Pre-multiplying both sides of the above equation by x i gives
x i M i x i [ x i V a r ( α i | x i ) x i + V a r ( u i | x i ) + x i E ( α i u i | x i ) + E ( u i α i | x i ) x i ] 1 = I T .
Then post-multiplying both sides of it by [ x i V a r ( α i | x i ) x i + V a r ( u i | x i ) + x i E ( α i u i | x i ) + E ( u i α i | x i ) x i ] leads to
x i M i x i = x i V a r ( α i | x i ) x i + V a r ( u i | x i ) + x i E ( α i u i | x i ) + E ( u i α i | x i ) x i .
Pre-multiplying both sides by ( x i x i ) 1 x i leads to
M i x i = V a r ( α i | x i ) x i + ( x i x i ) 1 x i V a r ( u i | x i ) + E ( α i u i | x i ) + ( x i x i ) 1 x i E ( u i α i | x i ) x i .
Post-multiplying both sides by x i ( x i x i ) 1 gives
M i = V a r ( α i | x i ) + ( x i x i ) 1 x i V a r ( u i | x i ) x i ( x i x i ) 1 + E ( α i u i | x i ) x i ( x i x i ) 1 + ( x i x i ) 1 x i E ( u i α i | x i ) .
Combining (A1) and (A2) we obtain
V β = V a r [ E ( α i | x i ) ] + E ( M i ) = V a r [ E ( α i | x i ) ] + E [ V a r ( α i | x i ) ] + E [ ( x i x i ) 1 x i V a r ( u i | x i ) x i ( x i x i ) 1 ] + E [ E ( α i u i | x i ) x i ( x i x i ) 1 + ( x i x i ) 1 x i E ( u i α i | x i ) ] = V a r ( α i ) + E [ ( x i x i ) 1 x i V a r ( u i | x i ) x i ( x i x i ) 1 ] + E [ α i u i x i ( x i x i ) 1 ] + E [ ( x i x i ) 1 x i u i α i ] ,
which is the same as A v a r ( N β ^ G M ) = Ω given in (12). This completes the proof of Proposition 1(ii). ☐
Proof of Proposition 2.
Under the conditions of Proposition 2, we have $M_i = \big(x_i'(x_i \Delta x_i' + \sigma_u^2 I_T)^{-1} x_i\big)^{-1}$. Then from $M_i M_i^{-1} = I_K$, one can show that $M_i = \Delta + \sigma_u^2 (x_i' x_i)^{-1}$ (e.g., Hsiao 2003, p. 325). Hence, $E(M_i) \equiv E\{[x_i'(x_i \Delta x_i' + \sigma_u^2 I_T)^{-1} x_i]^{-1}\} = \Delta + \sigma_u^2 E[(x_i' x_i)^{-1}]$.

Therefore, our asymptotic variance is $\mathrm{Avar}(\sqrt{N}\,\hat{\beta}_{GLS}) = A = [E(M_i^{-1})]^{-1} \le E(M_i) = \mathrm{Avar}(\sqrt{N}\,\hat{\beta}_{GM})$, which follows by diagonalizing and applying Jensen's inequality to each diagonal element. That is, $\hat{\beta}_{GLS}$ is asymptotically more efficient than $\hat{\beta}_{GM}$. This is because $\hat{\beta}_{GLS}$ uses the additional information (17), while $\hat{\beta}_{GM}$ does not.

However, if $T$ is large, we have $(x_i' x_i)/T = T^{-1} \sum_{t=1}^T x_{it} x_{it}' \to_p E(x_{it} x_{it}')$. Hence, $[(x_i' x_i)/T]^{-1} = [E(x_{it} x_{it}')]^{-1} + o_p(1)$, and $\{E[(x_i' x_i)/T]\}^{-1} = [E(x_{it} x_{it}') + o(1)]^{-1} = [E(x_{it} x_{it}')]^{-1} + o(1) = E\{[(x_i' x_i)/T]^{-1}\} + o(1)$. It follows that $E(C_i^{-1}) = [E(C_i)]^{-1} + o(1)$ for large $T$, where $C_i = x_i' x_i / T$. This explains why, as $T \to \infty$, we have $\mathrm{Avar}(\sqrt{N}\,\hat{\beta}_{GLS}) = \mathrm{Avar}(\sqrt{N}\,\hat{\beta}_{GM})$, and it gives an intuitive proof of Proposition 3. ☐
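The closed form for $M_i$ used above can be checked numerically. The following sketch (with arbitrary illustrative dimensions and an arbitrary positive definite $\Delta$, none taken from the paper) verifies that $\big(x_i'(x_i \Delta x_i' + \sigma_u^2 I_T)^{-1} x_i\big)^{-1} = \Delta + \sigma_u^2 (x_i' x_i)^{-1}$, an identity that follows from the Woodbury matrix identity.

```python
import numpy as np

rng = np.random.default_rng(1)
T, K, sigma2_u = 8, 3, 0.7                      # illustrative dimensions only

x = rng.normal(size=(T, K))
A = rng.normal(size=(K, K))
Delta = A @ A.T + np.eye(K)                     # an arbitrary positive definite Delta

Sigma = x @ Delta @ x.T + sigma2_u * np.eye(T)  # Sigma_i = x Delta x' + sigma_u^2 I_T
M = np.linalg.inv(x.T @ np.linalg.solve(Sigma, x))
M_closed = Delta + sigma2_u * np.linalg.inv(x.T @ x)

err = np.abs(M - M_closed).max()                # should be numerically negligible
```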
Proof of Proposition 7.
First, we show that $\tilde{\sigma}_u^2$ and $\hat{\Delta}$ are consistent estimators of $\sigma_u^2$ and $E(\Delta)$, respectively. Let $\tilde{u}_{it} = y_{it} - x_{it}' \hat{\beta}_{GM}$. Then it can be shown that

$$\tilde{\sigma}_{GM}^2 = (NT)^{-1} \sum_{i=1}^N \sum_{t=1}^T \tilde{u}_{it}^2 = (NT)^{-1} \sum_{i=1}^N \sum_{t=1}^T [x_{it}'(\beta_i - \hat{\beta}_{GM}) + u_{it}]^2 = (NT)^{-1} \sum_{i=1}^N \sum_{t=1}^T [x_{it}'(\beta_i - \hat{\beta}_{GM})(\beta_i - \hat{\beta}_{GM})' x_{it} + u_{it}^2] + o_p(1) = (NT)^{-1} \sum_{i=1}^N \sum_{t=1}^T [x_{it}'(\beta_i - \beta)(\beta_i - \beta)' x_{it} + u_{it}^2] + o_p(1) = T^{-1} \sum_{t=1}^T E[x_{it}'(\beta_i - \beta)(\beta_i - \beta)' x_{it}] + \sigma_u^2 + o_p(1) = T^{-1} \sum_{t=1}^T E\{x_{it}' E[(\beta_i - \beta)(\beta_i - \beta)' \mid X] x_{it}\} + \sigma_u^2 + o_p(1) = T^{-1} \sum_{t=1}^T E\{x_{it}' \Delta x_{it}\} + \sigma_u^2 + o_p(1), \tag{A4}$$

where the first $o_p(1)$ term comes from $E(u_i \mid x_i) = 0$, and the others use $\hat{\beta}_{GM} = \beta + O_p(N^{-1/2})$ together with the law of large numbers.

Furthermore, we have

$$\tilde{V}_{GM} = N^{-1} \sum_{i=1}^N (\hat{\beta}_i - \hat{\beta}_{GM})(\hat{\beta}_i - \hat{\beta}_{GM})' = N^{-1} \sum_{i=1}^N [\beta_i + (x_i' x_i)^{-1} x_i' u_i - \hat{\beta}_{GM}][\beta_i + (x_i' x_i)^{-1} x_i' u_i - \hat{\beta}_{GM}]' = N^{-1} \sum_{i=1}^N [\beta_i - \beta + (x_i' x_i)^{-1} x_i' u_i][\beta_i - \beta + (x_i' x_i)^{-1} x_i' u_i]' + o_p(1) = N^{-1} \sum_{i=1}^N \{(\beta_i - \beta)(\beta_i - \beta)' + (x_i' x_i)^{-1} x_i' u_i u_i' x_i (x_i' x_i)^{-1}\} + o_p(1) = E[(\beta_i - \beta)(\beta_i - \beta)'] + E[(x_i' x_i)^{-1} x_i' E(u_i u_i' \mid X) x_i (x_i' x_i)^{-1}] + o_p(1) = \mathrm{Var}(\alpha_i) + \sigma_u^2 E[(x_i' x_i)^{-1}] + o_p(1) = \mathrm{Var}[E(\alpha_i \mid x_i)] + E[\Delta] + \sigma_u^2 E[(x_i' x_i)^{-1}] + o_p(1). \tag{A5}$$

Combining (A4) and (A5), we have

$$\tilde{\sigma}_u^2 \to_p \sigma_u^2, \qquad \hat{\Delta} \to_p E(\Delta).$$
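As a sanity check on the limit of $\tilde{V}_{GM}$ displayed above, one can simulate a simplified design. The sketch below is our own simplification (with $K = 1$ and $\beta_i$ drawn independently of $x_i$, assumed here purely so the limit collapses to $\mathrm{Var}(\beta_i) + \sigma_u^2 E[(x_i' x_i)^{-1}]$) and compares the dispersion of the unit-by-unit OLS slopes with that limit.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, sigma_u = 5000, 6, 1.0

x = rng.normal(size=(N, T))                  # K = 1 regressor
beta_i = 1.0 + rng.normal(0.0, 0.5, size=N)  # Var(beta_i) = 0.25, independent of x
y = x * beta_i[:, None] + sigma_u * rng.normal(size=(N, T))

beta_hat_i = (x * y).sum(axis=1) / (x * x).sum(axis=1)
V_gm = np.var(beta_hat_i)                    # N^{-1} sum_i (beta_hat_i - beta_gm)^2
V_limit = 0.25 + sigma_u**2 * np.mean(1.0 / (x * x).sum(axis=1))
```

With $N = 5000$ units, `V_gm` and `V_limit` agree closely, consistent with (A5).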
Next, we look at $\hat{\beta}_{M,PLS}$. To keep the displays readable, let $z_i$ denote the $T$-rowed matrix of interactions between $x_i$ and $\mathrm{vec}(x_i - \bar{x}_\cdot)$, defined so that $z_i\,\mathrm{vec}(B) = x_i B\,\mathrm{vec}(x_i - \bar{x}_\cdot)$ for any conformable $B$, and let $z_i^0$ denote its analogue with $E(x_i)$ in place of $\bar{x}_\cdot$. Then

$$\hat{\beta}_{M,PLS} = \Big(\sum_{i=1}^N x_i' \hat{\Sigma}_i^{-1} x_i - \sum_{i=1}^N x_i' \hat{\Sigma}_i^{-1} z_i \Big[\sum_{i=1}^N z_i' \hat{\Sigma}_i^{-1} z_i\Big]^{-1} \sum_{i=1}^N z_i' \hat{\Sigma}_i^{-1} x_i\Big)^{-1} \Big(\sum_{i=1}^N x_i' \hat{\Sigma}_i^{-1} y_i - \sum_{i=1}^N x_i' \hat{\Sigma}_i^{-1} z_i \Big[\sum_{i=1}^N z_i' \hat{\Sigma}_i^{-1} z_i\Big]^{-1} \sum_{i=1}^N z_i' \hat{\Sigma}_i^{-1} y_i\Big).$$

Substituting $y_i = x_i \beta + z_i^0\,\mathrm{vec}(B) + v_i$, and noting that $z_i^0\,\mathrm{vec}(B) = z_i\,\mathrm{vec}(B) + x_i B\,\mathrm{vec}(\bar{x}_\cdot - E(x_i))$ while the partialling-out annihilates the $z_i\,\mathrm{vec}(B)$ component, we obtain

$$\hat{\beta}_{M,PLS} = \beta + B\,\mathrm{vec}(\bar{x}_\cdot - E(x_i)) + \Big(\sum_{i=1}^N x_i' \hat{\Sigma}_i^{-1} x_i - \sum_{i=1}^N x_i' \hat{\Sigma}_i^{-1} z_i \Big[\sum_{i=1}^N z_i' \hat{\Sigma}_i^{-1} z_i\Big]^{-1} \sum_{i=1}^N z_i' \hat{\Sigma}_i^{-1} x_i\Big)^{-1} \Big(\sum_{i=1}^N x_i' \hat{\Sigma}_i^{-1} v_i - \sum_{i=1}^N x_i' \hat{\Sigma}_i^{-1} z_i \Big[\sum_{i=1}^N z_i' \hat{\Sigma}_i^{-1} z_i\Big]^{-1} \sum_{i=1}^N z_i' \hat{\Sigma}_i^{-1} v_i\Big).$$

Therefore,

$$V_{M,PLS} = \lim_{N \to \infty} \mathrm{Var}(\sqrt{N}\,\hat{\beta}_{M,PLS}) = \mathrm{Var}[E(\alpha_i \mid x_i)] + \Big(E[x_i' \Sigma_i^{-1} x_i] - E[x_i' \Sigma_i^{-1} z_i^0]\,\big(E[z_i^{0\prime} \Sigma_i^{-1} z_i^0]\big)^{-1} E[z_i^{0\prime} \Sigma_i^{-1} x_i]\Big)^{-1},$$

where $\Sigma_i = x_i \Delta x_i' + \sigma_u^2 I_T$, $\Delta = E(w_i w_i')$ and $w_i = \alpha_i - E(\alpha_i \mid x_i)$, which implies

$$\sqrt{N}(\hat{\beta}_{M,PLS} - \beta) \to_d N(0, V_{M,PLS}).$$
This completes the proof of Proposition 7. ☐

Moreover, let $M_i = (x_i' \Sigma_i^{-1} x_i)^{-1} = \big(x_i'(x_i \Delta x_i' + \sigma_u^2 I_T)^{-1} x_i\big)^{-1}$. As in the proof of Proposition 2(i) above, we have

$$M_i = \Delta + \sigma_u^2 (x_i' x_i)^{-1}.$$

References

  1. Arellano, Manuel, and Stéphane Bonhomme. 2012. Identifying distributional characteristics in random coefficient panel data models. Review of Economic Studies 79: 987–1020.
  2. Card, David. 1995. Using geographic variation in college proximity to estimate the return to schooling. In Aspects of Labour Market Behavior: Essays in Honour of John Vanderkamp. Edited by Loizos Nicolaou Christofides, E. Kenneth Grant and Robert Swidinsky. Toronto: University of Toronto Press, pp. 201–22.
  3. Chamberlain, Gary. 1992. Efficiency bounds for semiparametric regression. Econometrica 60: 567–96.
  4. Chernozhukov, Victor, Iván Fernández-Val, Jinyong Hahn, and Whitney Newey. 2013. Average and quantile effects in nonseparable panel models. Econometrica 81: 535–80.
  5. Davidson, James. 1994. Stochastic Limit Theory: An Introduction for Econometricians. New York: Oxford University Press.
  6. Fang, Kai-Tai, and Yao-Ting Zhang. 1990. Generalized Multivariate Analysis. New York: Springer.
  7. Graham, Bryan S., and James L. Powell. 2012. Identification and estimation of average partial effects in 'irregular' correlated random coefficient panel data models. Econometrica 80: 2105–52.
  8. Gupta, Arjun K., Tamas Varga, and Taras Bodnar. 1993. Elliptically Contoured Models in Statistics. Norwell: Kluwer Academic Publishers.
  9. Heckman, James J., Daniel Schmierer, and Sergio Urzua. 2010. Testing the correlated random coefficient model. Journal of Econometrics 158: 177–203.
  10. Heckman, James J., Sergio Urzua, and Edward Vytlacil. 2006. Understanding instrumental variables in models with essential heterogeneity. Review of Economics and Statistics 88: 389–432.
  11. Heckman, James, and Edward Vytlacil. 1998. Instrumental variables methods for the correlated random coefficient model: Estimating the average rate of return to schooling when the return is correlated with schooling. Journal of Human Resources 33: 974–87.
  12. Hsiao, Cheng. 1996. Random coefficients models. In The Econometrics of Panel Data: A Handbook of the Theory with Applications. Edited by László Mátyás and Patrick Sevestre. Berlin: Springer, vol. 33, pp. 77–99.
  13. Hsiao, Cheng. 2003. Analysis of Panel Data, 2nd ed. Cambridge: Cambridge University Press.
  14. Hsiao, Cheng, Trent W. Appelbe, and Christopher R. Dineen. 1993. A general framework for panel data analysis—with an application to Canadian customer dialed long distance service. Journal of Econometrics 59: 63–86.
  15. Hsiao, Cheng, M. Hashem Pesaran, and A. Kamil Tahmiscioglu. 1999. Bayes estimation of short-run coefficients in dynamic panel data models. In Analysis of Panels and Limited Dependent Variables Models. Edited by Cheng Hsiao, M. Hashem Pesaran, Kajal Lahiri and Lung-Fei Lee. Berlin: Springer, pp. 185–213.
  16. Hsiao, Cheng, and M. Hashem Pesaran. 2008. Random coefficient models. In The Econometrics of Panel Data: Fundamentals and Recent Developments in Theory and Practice. Edited by László Mátyás and Patrick Sevestre. Berlin: Springer, vol. 46, pp. 185–213.
  17. Hsiao, Cheng, Yan Shen, and Hiroshi Fujiki. 2005. Aggregate versus disaggregate data analysis—a paradox in the estimation of money demand function of Japan under the low interest rate policy. Journal of Applied Econometrics 20: 579–601.
  18. Hsiao, Cheng, and A. Kamil Tahmiscioglu. 1997. A panel analysis of liquidity constraints and firm investment. Journal of the American Statistical Association 92: 455–65.
  19. Kuh, Edwin. 1963. Capital Stock Growth: A Micro-Econometric Approach. Amsterdam: North-Holland.
  20. Mundlak, Yair. 1978. On the pooling of time series and cross section data. Econometrica 46: 69–85.
  21. Pesaran, M. Hashem. 2003. Aggregation of linear dynamic models: An application of life-cycle consumption models under habit formation. Economic Modelling 20: 227–435.
  22. Pesaran, M. Hashem, and Ron Smith. 1995. Estimation of long-run relationships from dynamic heterogenous panels. Journal of Econometrics 68: 79–114.
  23. Stoker, Thomas M. 1993. Empirical approaches to the problem of aggregation over individuals. Journal of Economic Literature 31: 1827–74.
  24. Swamy, Paravastu A. V. B. 1970. Efficient inference in a random coefficient regression model. Econometrica 38: 311–23.
  25. Theil, Henri. 1954. Linear Aggregation of Economic Relations. Amsterdam: North Holland.
  26. Wooldridge, Jeffrey M. 2005. Fixed-effects related estimators for correlated random-coefficient and treatment-effect panel data models. The Review of Economics and Statistics 87: 385–90.
1. Note that when $\alpha_i$ is generated from Gamma(1,1) and Beta(1,3), the mean of $\alpha_i$ is 1 and 0.25, respectively. Therefore, $\beta$ in these cases will be 2 and 1.25, respectively.
Table 1. The Bias of the LS, FE, PLS1, PLS2, GM and GLS.

Design   N      LS        FE        PLS1      PLS2      GM        GLS
1        50    −0.0114   −0.0115   −0.0095   −0.0090   −0.0076   −0.0115
1        100    0.0048    0.0029    0.0033    0.0032    0.0057    0.0043
1        200   −0.0021   −0.0005   −0.0015   −0.0015    0.0004   −0.0015
2        50     0.3084   −0.0087   −0.0031   −0.0054   −0.0031    0.1613
2        100    0.3213    0.0034    0.0054    0.0039    0.0043    0.1835
2        200    0.3198   −0.0011    0.0016   −0.0002   −0.0003    0.1845
3        50     0.0452   −0.0043   −0.0006   −0.0006   −0.0020    0.0185
3        100    0.0485   −0.0004    0.0038    0.0036   −0.0001    0.0235
3        200    0.0477   −0.0016    0.0039    0.0036   −0.0007    0.0222
4        50     0.0376   −0.0012    0.0028    0.0022   −0.0021    0.0150
4        100    0.0396   −0.0017    0.0054    0.0051   −0.0002    0.0173
4        200    0.0397   −0.0007    0.0059    0.0054   −0.0007    0.0182
5        50     0.2600    0.0040   −0.0482   −0.0459   −0.0004   −0.0205
5        100    0.2610   −0.0020   −0.0518   −0.0516   −0.0022    0.0103
5        200    0.2669    0.0054   −0.0524   −0.0523    0.0002    0.0417
6        50     0.2143   −0.0023   −0.0240   −0.0243   −0.0005   −0.0315
6        100    0.2149   −0.0008   −0.0252   −0.0265   −0.0022    0.0059
6        200    0.2191    0.0015   −0.0233   −0.0249    0.0002    0.0310
7        50     0.0094   −0.0007    0.0011    0.0010    0.0005    0.0025
7        100    0.0102    0.0002    0.0013    0.0012    0.0006    0.0050
7        200    0.0097    0.0001    0.0011    0.0010    0.0001    0.0057
8        50     0.0075   −0.0021    0.0014    0.0013    0.0005    0.0016
8        100    0.0082   −0.0003    0.0016    0.0015    0.0006    0.0038
8        200    0.0080    0.0001    0.0013    0.0012    0.0002    0.0046
9        50     1.6667    1.6889    0.1461    0.1086    0.0004    0.0252
9        100    1.7484    1.7963    0.1763    0.1490   −0.0025    0.0262
9        200    1.8039    1.8913    0.2183    0.1856    0.0006    0.0237
10       50     0.1782    0.2316    0.0263    0.0218    0.0011    0.0899
10       100    0.1830    0.2446    0.0296    0.0254   −0.0002    0.0918
10       200    0.1846    0.2472    0.0329    0.0296   −0.0002    0.0923
Table 2. The Mean Squared Errors of the LS, FE, PLS1, PLS2, GM and GLS.

Design   N      LS        FE        PLS1      PLS2      GM        GLS
1        50     0.0622    0.0636    0.0423    0.0420    0.0564    0.0542
1        100    0.0322    0.0304    0.0208    0.0208    0.0281    0.0283
1        200    0.0168    0.0162    0.0108    0.0107    0.0148    0.0148
2        50     0.1231    0.0586    0.0235    0.0233    0.0215    0.0550
2        100    0.1178    0.0287    0.0119    0.0118    0.0113    0.0482
2        200    0.1097    0.0154    0.0061    0.0061    0.0055    0.0416
3        50     0.0073    0.0109    0.0045    0.0044    0.0040    0.0071
3        100    0.0051    0.0054    0.0024    0.0024    0.0021    0.0041
3        200    0.0036    0.0028    0.0011    0.0011    0.0010    0.0023
4        50     0.0063    0.0106    0.0043    0.0042    0.0039    0.0066
4        100    0.0042    0.0056    0.0023    0.0022    0.0020    0.0038
4        200    0.0029    0.0029    0.0011    0.0011    0.0009    0.0021
5        50     0.1344    0.0579    0.0236    0.0228    0.0208    0.0360
5        100    0.1004    0.0282    0.0150    0.0144    0.0105    0.0210
5        200    0.0877    0.0143    0.0087    0.0085    0.0050    0.0131
6        50     0.1036    0.0549    0.0220    0.0213    0.0207    0.0340
6        100    0.0739    0.0283    0.0123    0.0120    0.0105    0.0198
6        200    0.0621    0.0147    0.0062    0.0061    0.0050    0.0121
7        50     0.0014    0.0029    0.0011    0.0011    0.0010    0.0014
7        100    0.0008    0.0016    0.0005    0.0005    0.0005    0.0008
7        200    0.0004    0.0007    0.0003    0.0003    0.0002    0.0004
8        50     0.0012    0.0032    0.0010    0.0010    0.0009    0.0013
8        100    0.0006    0.0015    0.0005    0.0005    0.0004    0.0006
8        200    0.0003    0.0008    0.0002    0.0002    0.0002    0.0003
9        50     3.4034    3.6486    0.1102    0.0695    0.0214    0.0223
9        100    3.4720    3.7850    0.0943    0.0693    0.0108    0.0113
9        200    3.4851    3.9348    0.0983    0.0713    0.0051    0.0059
10       50     0.0350    0.0699    0.0034    0.0031    0.0026    0.0100
10       100    0.0353    0.0678    0.0024    0.0021    0.0014    0.0093
10       200    0.0349    0.0654    0.0018    0.0016    0.0007    0.0090

Hsiao, C.; Li, Q.; Liang, Z.; Xie, W. Panel Data Estimation for Correlated Random Coefficients Models. Econometrics 2019, 7, 7. https://0-doi-org.brum.beds.ac.uk/10.3390/econometrics7010007
