Article

Generalized Spatial Two Stage Least Squares Estimation of Spatial Autoregressive Models with Autoregressive Disturbances in the Presence of Endogenous Regressors and Many Instruments

1 School of Economics, Shanghai University of Finance and Economics, Shanghai 200433, China
2 Department of Economics, The Ohio State University, Columbus, OH 43210, USA
* Author to whom correspondence should be addressed.
Submission received: 25 March 2013 / Revised: 25 April 2013 / Accepted: 25 April 2013 / Published: 27 May 2013

Abstract: This paper studies the generalized spatial two stage least squares (GS2SLS) estimation of spatial autoregressive models with autoregressive disturbances when there are endogenous regressors with many valid instruments. Using many instruments may improve the asymptotic efficiency of estimators, but the bias might be large in finite samples, making inference inaccurate. We consider the case where the number of instruments K increases with, but at a rate slower than, the sample size, and derive approximate mean square errors (MSEs) that account for the trade-off between bias and variance, for both the GS2SLS estimator and a bias-corrected GS2SLS estimator. A criterion function for the optimal choice of K can be based on the approximate MSEs. Monte Carlo experiments illustrate the performance of our procedure for choosing K.

1. Introduction

This paper considers the instrumental variable (IV) estimation of the spatial autoregressive (SAR) model with SAR disturbances (SARAR model) in the presence of endogenous regressors and many instruments. We study the case where the number of instruments increases with the sample size and derive the asymptotic distributions of the generalized spatial two stage least squares (GS2SLS) estimator and of a bias-corrected GS2SLS (CGS2SLS) estimator based on the leading-order many-instrument bias. Using many moments may improve asymptotic efficiency but can make inference inaccurate in finite samples. Ref. [1] proposes to choose the number of instruments in a cross-sectional data model with endogenous regressors by minimizing an approximate mean square error (MSE), in the spirit of [2]. The MSE takes into account an important bias term, so the method avoids cases where asymptotic inference is poor because the bias is large relative to the standard deviation.
Ref. [3] derived the approximate MSEs of the two stage least squares (2SLS) and bias-corrected 2SLS (C2SLS) estimators for the SAR model with endogenous regressors and many instruments, but that SAR model does not include a SAR process in the disturbances. We extend the analysis to the SARAR model with endogenous regressors. The SARAR model combines spatial lag dependence with spatial error dependence. The latter reflects spatial autocorrelation in measurement errors or in variables that are otherwise not crucial to the model [4,5]. It has broader applicability than the simpler SAR model and has been used in empirical studies, e.g., Case's work [6,7,8,9,10]. Due to the presence of spatial error dependence in addition to spatial lag dependence, we consider the GS2SLS estimation of the model as in [11]. (Ref. [12] extended the estimation method in [11] to the SARAR model with endogenous regressors. Our focus here is on choosing the number of instruments by minimizing the approximated MSEs.) The estimation takes the spatial error structure into account via a transformed equation. Because the transformation uses an initial consistent estimator of the spatial error dependence parameter, the impact of this initial estimator creates extra complexity that has to be investigated. The analytical difficulty lies in determining the leading-order terms that depend on the number of instruments in the presence of the spatial error process, whose orders cannot be expressed using only terms that appear in a SAR model without SAR disturbances. The approximated MSEs of the GS2SLS and CGS2SLS estimators turn out to be more complicated than those of the corresponding 2SLS and C2SLS estimators for the SAR model but are still tractable for empirical use. For the GS2SLS, the expression for the approximate MSE is similar to that for the 2SLS in [3], except for the presence of the filter for spatial error dependence in various matrices. If the formula for the approximate MSE in [3] is used for SARAR models, the derived number of instruments will not be asymptotically optimal. For the CGS2SLS estimator, however, besides the filter, the approximate MSE has additional terms compared with that for the C2SLS in [3], which are generated by the asymptotic distributions of the first two stage estimators.
We consider the following SARAR model:
y_n = λ W_n y_n + Z_{2n} γ + u_n,  u_n = ρ M_n u_n + ϵ_n    (1)
where n is the number of spatial units, y_n is an n-dimensional vector of observations on the dependent variable, the n-dimensional disturbance vector ϵ_n = (ϵ_{n1}, …, ϵ_{nn})′ has i.i.d. elements with mean zero and variance σ_ϵ², Z_{2n} is an n × m matrix of variables that are possibly correlated with ϵ_n, W_n and M_n are n × n spatial weights matrices that can be equal to or different from each other, the scalars λ and ρ are spatial autoregressive parameters, and γ is a parameter vector for Z_{2n}. Let Z_{2n} = Z̄_{2n} + v_n, where Z̄_{2n} = E(Z_{2n}). The Z̄_{2n} is assumed to be an unknown function of X_n, an n × k_x matrix of exogenous variables, and of spatial lags of X_n: W_n X_n, W_n² X_n, and so on. Model (1) can be an equation of a spatial simultaneous system as in [13]. In this case, y_n is a vector of observations on one of, say, k_y endogenous variables, and the equation for y_n, like those for the other endogenous variables, is y_n = λ W_n y_n + X_{1n} γ_1 + Y_n γ_2 + u_n, where X_{1n} is the included exogenous variable matrix, Y_n is the matrix of observations on the other (k_y − 1) endogenous variables, and γ_1 and γ_2 are parameter vectors; then Z̄_{2n} = ∑_{i=0}^{∞} W_n^i X_n Π_i, where the Π_i's are parameter matrices. Alternatively, Z_{2n} or some of its elements may be generated by an unknown nonlinear model [14], so that the conditional mean Z̄_{2n} has an unknown nonlinear functional form [1]. For v_n = (v_{n1}, …, v_{nn})′, we assume that the v_{ni}'s are i.i.d. with mean zero and E(v_{ni} v_{ni}′) = Σ_v, that v_{ni} is independent of ϵ_{nj} for j ≠ i, but that E(v_{ni} ϵ_{ni}) = σ_{vϵ}. That is, v_{ni} and ϵ_{ni} are correlated, so the variables in Z_{2n} are generally endogenous; the ith variable in Z_{2n} is exogenous if the ith element of σ_{vϵ} is zero. Let Z_n = (W_n y_n, Z_{2n}) and δ = (λ, γ′)′; then y_n = Z_n δ + u_n.
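To fix ideas, the data generating process in model (1) can be simulated in a few lines. The following is a minimal numpy sketch; the sample size, parameter values, the circular weights matrix, and the linear form of Z̄_{2n} are illustrative assumptions, not the paper's design.

    import numpy as np

    rng = np.random.default_rng(0)
    n, lam0, rho0, gam0, sig_ve = 200, 0.4, 0.3, 1.0, 0.5

    # Row-normalized circular "two nearest neighbours" weights; W_n = M_n here.
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5
    M = W

    # (eps_ni, v_ni) i.i.d. with unit variances and covariance sig_ve (Assumption 1).
    eps, v = rng.multivariate_normal([0, 0], [[1, sig_ve], [sig_ve, 1]], size=n).T

    X = rng.standard_normal((n, 3))                  # exogenous variables X_n
    Z2 = X @ np.array([1.0, 0.5, 0.25]) + v          # Z_2n, with Zbar_2n = E(Z_2n) linear in X_n
    u = np.linalg.solve(np.eye(n) - rho0 * M, eps)   # u_n = (I_n - rho0 M_n)^{-1} eps_n
    y = np.linalg.solve(np.eye(n) - lam0 * W, gam0 * Z2 + u)  # equilibrium y_n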
We are interested in the parameter δ. As in [11], the final generalized estimator for δ is based on the Cochrane–Orcutt transformed equation:
R_n(ρ̃_n) y_n = R_n(ρ̃_n) Z_n δ + R_n(ρ̃_n) u_n    (2)
where R_n(ρ̃_n) = I_n − ρ̃_n M_n with ρ̃_n being a consistent estimator of ρ. We consider the problem of choosing the number of instruments for R_n(ρ̃_n) Z_n, which can be many due to the unknown functional form of Z̄_{2n} for its endogenous components. To derive ρ̃_n, we may first estimate the equation y_n = Z_n δ + u_n by 2SLS with a fixed number of instruments to obtain an initial estimator δ̌_n of δ, and then estimate ρ with a fixed number of quadratic moment equations of the form ϵ_n′(ρ, δ̌_n) D_{nj} ϵ_n(ρ, δ̌_n) = 0, where the n × n matrix D_{nj} has a zero trace and ϵ_n(ρ, δ̌_n) = R_n(ρ)(y_n − Z_n δ̌_n). (The equation ϵ_n′(ρ, δ̌_n) D_{nj} ϵ_n(ρ, δ̌_n) = 0 is a valid moment equation since E(ϵ_n′ D_{nj} ϵ_n) = σ_ϵ² tr(D_{nj}) = 0 and (1/n)[ϵ_n′(ρ_0, δ̌_n) D_{nj} ϵ_n(ρ_0, δ̌_n) − ϵ_n′ D_{nj} ϵ_n] = o_P(1) under regularity conditions.) The estimation thus involves three stages, and the derivation of the approximated MSEs is complicated by the presence of many terms of different orders. In [11], the asymptotic distribution of the third-stage estimator δ̂_{2sls,n} is not affected by the estimators in the first two stages as long as ρ̃_n is a consistent estimator of ρ. For the approximate MSE of our GS2SLS estimator in the third stage, one might expect the asymptotic distributions of the first two stage estimators to enter, since we use higher-order asymptotic theory for IV estimation. However, it turns out that the variance of the dominant component related to the first two stage estimators in the expression for the GS2SLS estimator has a smaller order than other terms because of the i.i.d. property of the ϵ_{ni}'s. As a result, the leading-order component of the MSE does not depend on the asymptotic distributions of the first two stage estimators, and the expression for the approximate MSE is similar to that in [3] except for the filter for spatial error dependence. For the CGS2SLS estimator, however, the expression for the approximate MSE is more complicated than that in [3], because the term resulting from the estimation error of the leading-order bias involves the asymptotic distributions of the first two stage estimators, and an additional term appears due to the estimation of the spatial autoregressive parameter in the error process.
As Z̄_{2n} is an unknown function of X_n, W_n X_n, W_n² X_n, etc., we may assume an infinite series approximation for Z̄_{2n} and, in practice, use a known n × q matrix ψ_{q,n} to approximate Z̄_{2n}, where ψ_{q,n} depends on X_n, W_n X_n and so on. To approximate Z̄_{2n} closely with a linear combination of ψ_{q,n}, we may need a large number of columns q as well as an appropriate form of ψ_{q,n}. The instruments for W_n y_n can be based on ψ_{q,n}. Denote the true values of δ and ρ by δ_0 and ρ_0, respectively. As model (1) represents an equilibrium model, (I_n − λ_0 W_n) can be assumed to be invertible, where I_n is the n × n identity matrix. (The SAR model is known as a simultaneous equation model in the spatial literature because the outcomes are determined by the interactions of spatial units. By assuming (I_n − λ_0 W_n) to be invertible, we have the equilibrium vector y_n.) Then, if ||λ_0 W_n|| < 1 for some matrix norm ||·||, the equilibrium vector y_n = (I_n − λ_0 W_n)^{-1}(Z_{2n} γ_0 + u_n) has the expansion ∑_{i=0}^{∞} λ_0^i W_n^i (Z_{2n} γ_0 + u_n). Therefore, the instruments for W_n y_n can be W_n ψ_{q,n}, W_n² ψ_{q,n} and so on, and the instruments for Z_n can be taken as the n × K matrix
F_{K,n} = [ψ_{q,n}, W_n ψ_{q,n}, …, W_n^p ψ_{q,n}]    (3)
where K = (p + 1)q ≥ m + 1. As an extension, we use the instrument matrix
Q_{K,n} = [F_{K,n}, M_n F_{K,n}]    (4)
for Z_n(ρ̃_n) = (I_n − ρ̃_n M_n) Z_n. (Due to technical difficulties in the presence of many IVs that involve estimated parameters, we do not use (I_n − ρ̃_n M_n) F_{K,n} as the instrument matrix for Z_n(ρ̃_n) (see [15]). If W_n = M_n, then M_n F_{K,n} generates some IVs identical to those in F_{K,n}; in this case, we can simply take Q_{K,n} = [F_{K,n}, W_n^{p+1} ψ_{q,n}].) The asymptotic variance of the 2SLS estimator decreases when a linear combination of IVs approximates the conditional mean of the endogenous variables more closely, and the efficiency lower bound of IV estimators is achieved when a linear combination of IVs equals the conditional mean [16]. Under regularity conditions, a linear combination of [I_n, W_n, W_n², …, W_n^p] can approximate (I_n − ρ W_n)^{-1} arbitrarily well as p → ∞. Thus, if a linear combination of ψ_{q,n} can approximate Z̄_{2n} well as n, q → ∞, then a linear combination of Q_{K,n} can approximate Z̄_n(ρ̃_n) arbitrarily well in probability as n, p, q → ∞. On the other hand, if the number of instruments increases too fast relative to the sample size, it leads to a bias of a certain order in the corresponding IV estimators. The trade-off between variance and bias can be summarized by the MSE of the estimator, so minimizing the (approximated) MSE can reduce inaccurate inference due to the presence of many instruments. Following [1], we consider the case where the number of instruments K increases with, but at a rate slower than, the sample size n, which facilitates the investigation of the higher-order asymptotics of the MSEs. A construction of F_{K,n} and Q_{K,n} is sketched below.
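For concreteness, the instrument matrices in Equations (3) and (4) can be assembled as follows. This is a sketch in which ψ_{q,n} is assumed to be a given matrix of exogenous variables; duplicate columns arising when W_n = M_n are only flagged in a comment.

    import numpy as np

    def build_instruments(psi, W, M, p):
        """F_{K,n} = [psi, W psi, ..., W^p psi] and Q_{K,n} = [F_{K,n}, M F_{K,n}]."""
        blocks, lag = [psi], psi
        for _ in range(p):
            lag = W @ lag              # next spatial lag W^j psi
            blocks.append(lag)
        F = np.hstack(blocks)          # n x (p+1)q matrix F_{K,n}
        return np.hstack([F, M @ F])   # Q_{K,n}; if W = M, use [F, W^{p+1} psi] instead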
The rest of the paper is organized as follows. Section 2 establishes asymptotic properties of the GS2SLS and CGS2SLS estimators. Section 3 derives the approximated MSEs of the estimators and gives a criterion function for choosing the optimal number of IVs based on them. Section 4 presents Monte Carlo results on the finite-sample performance of the instrument selection procedure. Section 5 concludes. Notation, lemmas, and proofs are collected in the appendices.

2. Properties of the GS2SLS and CGS2SLS Estimators

We establish the properties of the GS2SLS and CGS2SLS estimators in this section. Let R_n(ρ) = I_n − ρ M_n, G_n(λ) = W_n(I_n − λ W_n)^{-1}, and Z_n = Z̄_n + ζ_n with Z̄_n = E(Z_n), and let ||A|| = [tr(A′A)]^{1/2} be the Frobenius norm of a matrix A. UB stands for boundedness of the sequences of both row and column sum matrix norms for a sequence of matrices. For simplicity, denote y_n(ρ) = R_n(ρ) y_n, Z_n(ρ) = R_n(ρ) Z_n, u_n(ρ) = R_n(ρ) u_n, Z_{2n}(ρ) = R_n(ρ) Z_{2n}, R_n = R_n(ρ_0), and G_n = G_n(λ_0). As y_n = (I_n − λ_0 W_n)^{-1}(Z_{2n} γ_0 + R_n^{-1} ϵ_n), we have Z̄_n = [G_n Z̄_{2n} γ_0, Z̄_{2n}] and ζ_n = [G_n v_n γ_0 + G_n R_n^{-1} ϵ_n, v_n]. The following are some basic regularity conditions.
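In code, the two filters used throughout this section are direct transcriptions of these definitions (a minimal numpy sketch):

    import numpy as np

    def R(rho, M):
        """Spatial Cochrane-Orcutt filter R_n(rho) = I_n - rho M_n."""
        return np.eye(M.shape[0]) - rho * M

    def G(lam, W):
        """G_n(lambda) = W_n (I_n - lambda W_n)^{-1}."""
        return W @ np.linalg.inv(np.eye(W.shape[0]) - lam * W)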
Assumption 1. 
{(ϵ_{ni}, v_{ni}′)}, i = 1, …, n, are i.i.d. with mean zero, E(ϵ_{ni}²) = σ_ϵ², E(v_{ni} v_{ni}′) = Σ_v and E(v_{ni} ϵ_{ni}) = σ_{vϵ}. The moments E|ϵ_{ni}|^{4+τ}, E||v_{ni}||⁴ and E||v_{ni} ϵ_{ni}||² are finite, where τ is some positive constant.
Assumption 2. 
(i) The sequences of matrices {W_n}, {M_n}, {(I_n − λ_0 W_n)^{-1}} and {R_n^{-1}} are UB;
(ii) W_n and M_n have zero diagonals.
Since we use quadratic moments to estimate ρ in model (1), the existence of a moment of ϵ_{ni} higher than the fourth order is required in order to apply the central limit theorem for linear-quadratic forms of disturbances in [17]. Some moment conditions are also imposed on v_{ni} and v_{ni} ϵ_{ni} in Assumption 1. Assumption 2 (i), which originates in [11,18], bounds the degree of spatial dependence; Assumption 2 (ii) implies that no spatial unit is viewed as its own neighbor.
Let F_{0,n} be a full rank n × k_f instrument matrix for Z_n in the first stage of the GS2SLS estimation. The number k_f of IVs is at least as large as the number (m + 1) of columns of Z_n, but is fixed for all n. Denote P_{F_n} = F_{0,n}(F_{0,n}′F_{0,n})^{−}F_{0,n}′, where A^{−} is a generalized inverse of the matrix A. The first-stage 2SLS estimator for δ is δ̌_n = (Z_n′ P_{F_n} Z_n)^{-1} Z_n′ P_{F_n} y_n. The following assumption about F_{0,n} is maintained.
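The first-stage estimator is an ordinary 2SLS regression and can be computed as below (a sketch; F0 stands for a fixed instrument matrix satisfying Assumption 3, and Z = [W_n y_n, Z_{2n}] as above):

    import numpy as np

    def tsls(y, Z, F0):
        """delta_check_n = (Z' P_F Z)^{-1} Z' P_F y with P_F = F0 (F0'F0)^- F0'."""
        P = F0 @ np.linalg.pinv(F0.T @ F0) @ F0.T   # projector onto col(F0)
        return np.linalg.solve(Z.T @ P @ Z, Z.T @ P @ y)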
Assumption 3. 
The instrument matrix F_{0,n} has full column rank k_f ≥ m + 1 for all n, lim_{n→∞} (1/n) F_{0,n}′F_{0,n} is finite and nonsingular, and lim_{n→∞} (1/n) F_{0,n}′Z̄_n is finite and has full column rank, where Z̄_{2n} in Z̄_n has uniformly bounded elements.
Proposition 1. 
Under Assumptions 1–3, √n (δ̌_n − δ_0) = ((1/n) Z̄_n′ P_{F_n} Z̄_n)^{-1} (1/√n) Z̄_n′ P_{F_n} R_n^{-1} ϵ_n + O_P(n^{-1/2}) →_d N( 0, lim_{n→∞} ((1/n) Z̄_n′ P_{F_n} Z̄_n)^{-1} ((σ_ϵ²/n) Z̄_n′ P_{F_n} R_n^{-1} R_n^{-1}′ P_{F_n} Z̄_n) ((1/n) Z̄_n′ P_{F_n} Z̄_n)^{-1} ).
In the second stage of the GS2SLS estimation, we use a fixed number, say k_d, of quadratic moments to estimate ρ in model (1). Let g_n(ρ, δ̌_n) = (1/n)[ϵ_n′(ρ, δ̌_n) D_{n1} ϵ_n(ρ, δ̌_n), …, ϵ_n′(ρ, δ̌_n) D_{n,k_d} ϵ_n(ρ, δ̌_n)]′, where ϵ_n(ρ, δ̌_n) = R_n(ρ)(y_n − Z_n δ̌_n) and the n × n matrices D_{nj} have zero traces. The D_{nj}'s can be, e.g., M_n and M_n² − I_n tr(M_n²)/n. We maintain the following regularity condition on the D_{nj}'s.
Assumption 4. 
The sequences of matrices {D_{nj}}, j = 1, …, k_d, have zero traces and are UB.
Consider a generalized moments estimator ρ̃_n of ρ, which is
ρ̃_n = arg min_{ρ∈[−a,a]} g_n′(ρ, δ̌_n) g_n(ρ, δ̌_n)    (5)
for some a ≥ 1 so that [−a, a] contains ρ_0. It can be shown that g_n′(ρ, δ̌_n) g_n(ρ, δ̌_n) − E g_n′(ρ, δ_0) E g_n(ρ, δ_0) converges to zero in probability uniformly over [−a, a]. For the identification of ρ_0, it is required that E g_n′(ρ, δ_0) E g_n(ρ, δ_0) be zero uniquely at ρ_0. Let A^s = A + A′ for any square matrix A. Note that E g_n(ρ, δ_0) = (σ_ϵ²/2) Ξ_n [(ρ_0 − ρ), (ρ_0 − ρ)²]′, where
Ξ_n is the k_d × 2 matrix whose jth row is (1/n) ( tr[(M_n R_n^{-1})^s D_{nj}^s],  tr[(M_n R_n^{-1})′ D_{nj}^s (M_n R_n^{-1})] ),  j = 1, …, k_d    (6)
Assumption 5. 
The smallest eigenvalue of Ξ_n′Ξ_n is bounded away from zero.
Assumption 5 is satisfied if the limit of the 2 × 2 matrix Ξ_n′Ξ_n exists and is nonsingular. With Assumption 5, there exists some η > 0 such that E g_n′(ρ, δ_0) E g_n(ρ, δ_0) > η for any ρ ≠ ρ_0. Thus, for any ρ ≠ ρ_0, g_n′(ρ, δ̌_n) g_n(ρ, δ̌_n) > η/2 with probability approaching 1 as n → ∞.
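The estimator in Equation (5) is a one-dimensional minimization and can be computed by a bounded scalar search. The sketch below uses the two D_{nj} matrices suggested in the text; `delta_check` is the first-stage 2SLS estimate from the earlier snippet, and scipy's bounded minimizer stands in for the argmin.

    import numpy as np
    from scipy.optimize import minimize_scalar

    def rho_gmm(y, Z, M, delta_check, a=0.99):
        """Minimize g_n'(rho, delta_check) g_n(rho, delta_check) over [-a, a]."""
        n = M.shape[0]
        M2 = M @ M
        D = [M, M2 - np.eye(n) * np.trace(M2) / n]   # zero-trace quadratic-moment matrices
        u_hat = y - Z @ delta_check                  # residual y_n - Z_n delta_check_n
        Mu = M @ u_hat

        def obj(rho):
            e = u_hat - rho * Mu                     # eps_n(rho, delta_check) = R_n(rho) u_hat
            g = np.array([e @ (Dj @ e) for Dj in D]) / n
            return g @ g

        return minimize_scalar(obj, bounds=(-a, a), method="bounded").x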
Proposition 2. 
Under Assumptions 1–5, ρ̃_n is a consistent estimator of ρ_0, and
√n (ρ̃_n − ρ_0) = (1/√n)(ϵ_n′ D_n ϵ_n + F_n ϵ_n) + O_P(n^{-1/2})    (7)
is asymptotically normal with a finite variance, where
D_n = [ (σ_ϵ²/n²) ∑_{j=1}^{k_d} tr²(D_{nj}^s M_n R_n^{-1}) ]^{-1} ∑_{j=1}^{k_d} (1/n) tr(D_{nj}^s M_n R_n^{-1}) D_{nj}
and
F_n = [ (σ_ϵ²/n²) ∑_{j=1}^{k_d} tr²(D_{nj}^s M_n R_n^{-1}) ]^{-1} ∑_{j=1}^{k_d} (1/n) tr(D_{nj}^s M_n R_n^{-1}) (1/n) E(ϵ_n′ D_{nj}^s R_n ζ_n) ((1/n) Z̄_n′ P_{F_n} Z̄_n)^{-1} Z̄_n′ P_{F_n} R_n^{-1}
with E(ϵ_n′ D_{nj}^s R_n ζ_n) = [tr(D_{nj}^s R_n G_n) σ_{vϵ}′ γ_0 + σ_ϵ² tr(D_{nj}^s R_n G_n R_n^{-1}), tr(D_{nj}^s R_n) σ_{vϵ}′].
In the expression for √n (ρ̃_n − ρ_0) above, the term (1/√n) F_n ϵ_n, which is O_P(1), is due to the use of the first-stage estimator δ̌_n. That is, the asymptotic distribution of δ̌_n has implications for the asymptotic distribution of ρ̃_n.
We now consider the GS2SLS estimator based on the transformed Equation (2). With the instrument matrix Q_{K,n} in Equation (4), the GS2SLS estimator of δ is
δ̂_{2sls,n} = [Z_n′(ρ̃_n) P_{K,n} Z_n(ρ̃_n)]^{-1} Z_n′(ρ̃_n) P_{K,n} y_n(ρ̃_n)    (8)
where P_{K,n} = Q_{K,n}(Q_{K,n}′Q_{K,n})^{−} Q_{K,n}′.
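Given ρ̃_n and an instrument matrix Q_{K,n}, the third stage in Equation (8) is one more IV regression on the transformed data. A sketch, reusing the helpers from the previous snippets:

    import numpy as np

    def gs2sls(y, Z, M, Q_K, rho_t):
        """delta_hat_{2sls,n} = [Z(rho)' P_K Z(rho)]^{-1} Z(rho)' P_K y(rho)."""
        R_t = np.eye(M.shape[0]) - rho_t * M          # R_n(rho_tilde)
        yt, Zt = R_t @ y, R_t @ Z                     # spatial Cochrane-Orcutt transform
        P = Q_K @ np.linalg.pinv(Q_K.T @ Q_K) @ Q_K.T
        return np.linalg.solve(Zt.T @ P @ Zt, Zt.T @ P @ yt)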
Assumption 6. 
(i) H̄ = lim_{n→∞} H_n, where H_n = (1/n) Z̄_n′(ρ_0) Z̄_n(ρ_0), is a finite nonsingular (m + 1) × (m + 1) matrix; (ii) for each Q_{K,n} in Equation (4), there exists π_{K,n} such that (1/n) ||Z̄_n(ρ) − Q_{K,n} π_{K,n}||² → 0 as n, K → ∞.
Assumption 6 (i) gives a sufficient condition for the identification of δ_0 in Equation (2); Assumption 6 (ii) requires that Z̄_n(ρ) can be approximated arbitrarily well by a linear combination of Q_{K,n} for large enough K and n, which is implied by Lemma 1 in Section B under some other basic assumptions. For analytical tractability, we maintain the following assumption.
Assumption 7. 
The elements of Q_{K,n} in Equation (4) are uniformly bounded constants, and lim_{n→∞} (1/n) Q_{K,n}′Q_{K,n} exists and is nonsingular for each K.
The GS2SLS estimator δ̂_{2sls,n} is characterized by the first-order condition (1/√n) Z_n′(ρ̃_n) P_{K,n} [y_n(ρ̃_n) − Z_n(ρ̃_n) δ̂_{2sls,n}] = 0. By a Taylor expansion of this condition at δ_0, the first term is (1/√n) Z_n′(ρ̃_n) P_{K,n} u_n(ρ̃_n), which has the dominant component (1/√n) Z_n′(ρ_0) P_{K,n} ϵ_n by Lemma 8. The expectation of this dominant component is (1/√n) Υ_n(K), where
Υ_n(K) = E(ζ_n′ R_n′ P_{K,n} ϵ_n) = [tr(Γ_{nK,2}) σ_{vϵ}′ γ_0 + σ_ϵ² tr(Γ_{nK,3}), tr(Γ_{nK,1}) σ_{vϵ}′]′ = O(K)    (9)
with
Γ_{nK,1} = P_{K,n} R_n,  Γ_{nK,2} = P_{K,n} R_n G_n,  and  Γ_{nK,3} = P_{K,n} R_n G_n R_n^{-1}    (10)
Thus, when K/n → c ≠ 0, the GS2SLS estimator δ̂_{2sls,n} is generally inconsistent. When K/n → 0, δ̂_{2sls,n} is consistent, but if the number of instruments K grows fast relative to the sample size n, the asymptotic distribution may not be centered at the true δ_0. The following proposition provides more information on this issue.
Proposition 3. 
Under Assumptions 1–7,
(i) if K/n → c ≠ 0, then δ̂_{2sls,n} − δ_0 →_p lim_{n→∞} b̄_{nK,1}, where
b̄_{nK,1} = [Z̄_n′(ρ_0) Z̄_n(ρ_0) + Ω_{n1}(K)]^{-1} Υ_n(K) = O(K/n)
with
Ω_{n1}(K) = E(ζ_n′ R_n′ P_{K,n} R_n ζ_n), the symmetric matrix whose (1,1) block is γ_0′ Σ_v γ_0 tr(Γ_{nK,2}′ Γ_{nK,2}) + σ_ϵ² tr(Γ_{nK,3}′ Γ_{nK,3}) + 2 σ_{vϵ}′ γ_0 tr(Γ_{nK,3}′ Γ_{nK,2}), whose (2,1) block is Σ_v γ_0 tr(Γ_{nK,1}′ Γ_{nK,2}) + σ_{vϵ} tr(Γ_{nK,1}′ Γ_{nK,3}), and whose (2,2) block is Σ_v tr(Γ_{nK,1}′ Γ_{nK,1}),    (11)
and lim_{n→∞} b̄_{nK,1} might be a nonzero constant;
(ii) if K/n → 0, then √n (δ̂_{2sls,n} − δ_0 − b_{n,K}) →_d N(0, σ_ϵ² H̄^{-1}), where
b_{n,K} = [Z_n′(ρ̃_n) P_{K,n} Z_n(ρ̃_n)]^{-1} Υ_n(K) = b̄_{nK,2} + o_P(K/n)    (12)
with b̄_{nK,2} = [Z̄_n′(ρ_0) Z̄_n(ρ_0)]^{-1} Υ_n(K) = O(K/n).
From the above proposition, when K/n → 0, δ̂_{2sls,n} is consistent for δ_0, but whether its asymptotic distribution is centered at δ_0 depends on the ratio K/√n, as √n b_{n,K} = O_P(K/√n). The following corollary shows various scenarios.
Corollary 1. 
Under Assumptions 1–7,
(i) if K²/n → 0, then √n (δ̂_{2sls,n} − δ_0) →_d N(0, σ_ϵ² H̄^{-1});
(ii) if K²/n → c < ∞ and c ≠ 0, then √n (δ̂_{2sls,n} − δ_0 − b̄_{nK,2}) →_d N(0, σ_ϵ² H̄^{-1});
(iii) if K²/n → ∞ but K^{1+η}/n → 0 for some 0 < η < 1, then K^η (δ̂_{2sls,n} − δ_0) →_p 0.
When K²/n → 0, the number of instruments K increases slowly relative to the sample size n, and the asymptotic variance matrix σ_ϵ² H̄^{-1} achieves the efficiency lower bound for the class of IV estimators. When K²/n goes to a non-zero limit as n goes to infinity, √n (δ̂_{2sls,n} − δ_0) is centered at lim_{n→∞} √n b̄_{nK,2}, which might be a non-zero finite constant: a many-instrument bias. Due to the spatial error dependence, the matrices Γ_{nK,1}, Γ_{nK,2} and Γ_{nK,3} in Equation (10), which enter the bias component in Equation (9), play important roles; without spatial error dependence, these matrices reduce to P_{K,n} and P_{K,n} G_n. Although the GS2SLS estimation is based on the spatial Cochrane–Orcutt transformed model (2), the asymptotic distribution of the estimator ρ̃_n in the transformation does not affect the asymptotic distribution of δ̂_{2sls,n}, as is usual for GS2SLS estimation.
To correct the many-instrument bias, we consider a bias-corrected estimator based on the estimation of the leading-order bias b_{n,K} in Equation (12). Let Q_{0,n} be an instrument matrix with a fixed number of instruments and P_{0,n} = Q_{0,n}(Q_{0,n}′Q_{0,n})^{−}Q_{0,n}′.
Assumption 8. 
The instrument matrix Q_{0,n} has full column rank k_q ≥ m + 1 for all n, lim_{n→∞} (1/n) Q_{0,n}′Q_{0,n} is finite and nonsingular, and lim_{n→∞} (1/n) Q_{0,n}′Z̄_n(ρ_0) is finite and has full column rank.
The GS2SLS estimator
δ̃_n = [Z_n′(ρ̃_n) P_{0,n} Z_n(ρ̃_n)]^{-1} Z_n′(ρ̃_n) P_{0,n} y_n(ρ̃_n)    (13)
and ρ̃_n together can be used to estimate b_{n,K}. Let Γ̃_{nK,1} = P_{K,n} R_n(ρ̃_n), Γ̃_{nK,2} = P_{K,n} R_n(ρ̃_n) G_n(λ̃_n), Γ̃_{nK,3} = P_{K,n} R_n(ρ̃_n) G_n(λ̃_n) R_n^{-1}(ρ̃_n), σ̃_ϵ² = (1/n)(y_n − Z_n δ̃_n)′ R_n′(ρ̃_n) R_n(ρ̃_n)(y_n − Z_n δ̃_n) and σ̃_{vϵ}′ = (1/n)(y_n − Z_n δ̃_n)′ R_n′(ρ̃_n) Z_{2n}. A bias-corrected GS2SLS (CGS2SLS) estimator is
δ̂_{c2sls,n} = δ̂_{2sls,n} − b̃_{n,K}    (14)
where b̃_{n,K} = [Z_n′(ρ̃_n) P_{K,n} Z_n(ρ̃_n)]^{-1} Υ̃_n(K) with Υ̃_n(K) = [tr(Γ̃_{nK,2}) σ̃_{vϵ}′ γ̃_n + σ̃_ϵ² tr(Γ̃_{nK,3}), tr(Γ̃_{nK,1}) σ̃_{vϵ}′]′.
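The bias correction in Equation (14) only requires plugging pilot estimates into Υ_n(K). Below is a sketch for the scalar-Z_{2n} case (m = 1), where Z = [W_n y_n, Z_{2n}], and `delta_t` and `rho_t` are the pilot estimators δ̃_n and ρ̃_n from Equation (13) and Proposition 2.

    import numpy as np

    def cgs2sls(delta_hat, y, Z, W, M, Q_K, delta_t, rho_t):
        n = M.shape[0]
        lam_t, gam_t = delta_t                       # delta = (lambda, gamma)'
        I = np.eye(n)
        R_t = I - rho_t * M                          # R_n(rho_tilde)
        G_t = W @ np.linalg.inv(I - lam_t * W)       # G_n(lambda_tilde)
        P = Q_K @ np.linalg.pinv(Q_K.T @ Q_K) @ Q_K.T
        res = R_t @ (y - Z @ delta_t)                # transformed pilot residuals
        sig_e2 = res @ res / n                       # sigma_tilde_eps^2
        sig_ve = res @ Z[:, 1] / n                   # sigma_tilde_{v eps}; Z[:, 1] is Z_2n
        G1 = P @ R_t
        G2 = G1 @ G_t
        G3 = G2 @ np.linalg.inv(R_t)
        ups = np.array([np.trace(G2) * sig_ve * gam_t + sig_e2 * np.trace(G3),
                        np.trace(G1) * sig_ve])      # Upsilon_tilde_n(K)
        Zt = R_t @ Z
        b = np.linalg.solve(Zt.T @ P @ Zt, ups)      # b_tilde_{n,K}
        return delta_hat - b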
Proposition 4. 
Under Assumptions 1–8, if K/n → 0, then √n (δ̂_{c2sls,n} − δ_0) →_d N(0, σ_ϵ² H̄^{-1}).
Note that the asymptotic distribution of δ̂_{c2sls,n} in Equation (14) when K/n → 0 is the same as that of δ̂_{2sls,n} in Equation (8) when K²/n → 0. So the bias correction procedure effectively relaxes the requirement on K needed for the corrected estimator to have a properly centered asymptotic distribution. The asymptotic distributions of the initial estimators δ̃_n in Equation (13) and ρ̃_n in Proposition 2 used for the bias correction do not enter the asymptotic distribution of δ̂_{c2sls,n} when only the first-order asymptotic expansion is considered. But when we investigate the approximated MSE of δ̂_{c2sls,n} below, where higher-order asymptotic expansions are considered, the asymptotic distributions of δ̃_n and ρ̃_n will generate additional terms in the approximated MSE.

3. Approximated MSE and Optimal K

For an estimator δ̂_n satisfying √n (δ̂_n − δ_0) = Ĥ_n^{-1} ĥ_n, Ref. [1] derived a lemma that gives conditions on the decompositions of Ĥ_n and ĥ_n such that the leading-order term of the MSE depending on K is S_n(K), in the sense that
n (δ̂_n − δ_0)(δ̂_n − δ_0)′ = L̂_n(K) + r̂_n(K)    (15)
where E[L̂_n(K)] = σ_ϵ² H_n^{-1} + S_n(K) + T_n(K), and T_n(K) and r̂_n(K) are remainder terms that diminish faster than S_n(K), in the sense that [r̂_n(K) + T_n(K)]/tr(S_n(K)) = o_P(1) as K, n → ∞. A criterion function for the optimal K can be S_{n,ξ}(K) = ξ′ S_n(K) ξ, the leading-order MSE depending on K for a linear combination ξ′δ̂_n. In particular, one may use the unweighted version tr(S_n(K)) as a practical criterion. Let Ŝ_n(K) be an estimator of S_n(K); then K can be chosen by minimizing the function Ŝ_{n,ξ}(K) = ξ′ Ŝ_n(K) ξ.
In this section, we first derive the expression for S_n(K) for both the GS2SLS and CGS2SLS estimators and then show that the K chosen by minimizing Ŝ_{n,ξ}(K) is asymptotically optimal in the sense of Equation (20), which originates in [1]. Intuitively, this means that the error from using the feasible criterion Ŝ_{n,ξ}(K) in place of the ideal S_{n,ξ}(K) is asymptotically negligible.
Assumption 9. 
(i) (1/K) tr(Γ_{nK,1}) → c, where c ≠ 0, as n, K → ∞;
(ii) max_i |Γ_{nKj,ii}| → 0 for j = 1, 2, 3, as n, K → ∞, where Γ_{nKj,ii} is the (i,i)th element of Γ_{nK,j};
(iii) μ_3 = E(ϵ_{ni}³) = 0 and E(ϵ_{ni}² v_{ni}) = 0.
Assumption 9 (i) is for analytical tractability; Assumption 9 (ii) simplifies the expression for S_n(K) by imposing a restriction on the rate at which K increases with n; Assumption 9 (iii) is also a simplifying condition. These simplifications are adopted in [1,3]. (Without Assumption 9 (iii), S_n(K) for the GS2SLS would have the additional term (1/n) H_n^{-1} {Z̄_n′(ρ_0) [E(ϵ_{ni}² v_{ni}′) γ_0 vec_D(Γ_{nK,2}) + μ_3 vec_D(Γ_{nK,3}), vec_D(Γ_{nK,1}) E(ϵ_{ni}² v_{ni}′)]}^s H_n^{-1}, and S_n(K) for the CGS2SLS would have an additional term that is much more complicated, due to the estimator of ρ in the second stage of the GS2SLS estimation and its use in correcting the many-instrument bias. Without Assumption 9 (ii), S_n(K) for the GS2SLS is not affected, but S_n(K) for the CGS2SLS has an additional term. Those additional terms can be estimated along with the other terms, but they are not included here for simplicity.)
Proposition 5. 
Under Assumptions 1–9, if K²/n → 0 and σ_{vϵ} ≠ 0, then Equation (15) for the GS2SLS estimator δ̂_{2sls,n} is satisfied with
S_n(K) = (1/n) H_n^{-1} [σ_ϵ² Z̄_n′(ρ_0)(I_n − P_{K,n}) Z̄_n(ρ_0) + Ω_{n2}(K)] H_n^{-1}    (16)
where Ω_{n2}(K) = Υ_n(K) Υ_n′(K).
Note that S_n(K) above has a form similar to that in [3], except for the transformation R_n involved due to the spatial error dependence, and it has a similar interpretation: (σ_ϵ²/n) H_n^{-1} Z̄_n′(ρ_0)(I_n − P_{K,n}) Z̄_n(ρ_0) H_n^{-1} is a variance term, which becomes smaller as a linear combination of Q_{K,n} approximates the mean Z̄_n(ρ_0) better; (1/n) H_n^{-1} Ω_{n2}(K) H_n^{-1} is the leading-order term in the MSE of (1/√n) H_n^{-1} ζ_n′ R_n′ P_{K,n} ϵ_n, whose dominant component comes from its expectation; it represents the many-instrument bias and increases as K increases. The minimization of a criterion function ξ′ S_n(K) ξ thus takes into account the trade-off between bias and variance.
Proposition 6. 
Under Assumptions 1–9, if K/n → 0 and σ_{vϵ} ≠ 0, then Equation (15) for the CGS2SLS estimator δ̂_{c2sls,n} is satisfied with
S_n(K) = (1/n) H_n^{-1} [σ_ϵ² Z̄_n′(ρ_0)(I_n − P_{K,n}) Z̄_n(ρ_0) + Π_{n1}(K) + Π_{n2}(K) + Π_{n3}(K)] H_n^{-1}    (17)
where Π_{n1}(K), Π_{n2}(K) and Π_{n3}(K) are given in Equations (21), (22) and (25), respectively.
The first term in Equation (17) is the same as that in Equation (16). The second term (1/n) H_n^{-1} Π_{n1}(K) H_n^{-1} is the leading-order term in the variance of (1/√n) H_n^{-1} [ζ_n′ R_n′ P_{K,n} ϵ_n − E(ζ_n′ R_n′ P_{K,n} ϵ_n)]. The third term (1/n) H_n^{-1} Π_{n2}(K) H_n^{-1} is due to the estimation error of the leading-order bias of the GS2SLS estimator; this term becomes much more complicated than that for the SAR model because of the spatial error dependence. The last term (1/n) H_n^{-1} Π_{n3}(K) H_n^{-1} is an additional term compared with S_n(K) in [3], which is due to the estimation of ρ. (Thus, the F_n in Π_{n2}(K) comes from the ρ̃_n used for the bias correction, and the F_n in Π_{n3}(K) comes from the ρ̃_n in the spatial Cochrane–Orcutt transformation of the GS2SLS estimation.) The S_n(K) is a sum of different variance terms, because the bias terms have smaller orders than the variance terms.
We now consider the estimation of S_{n,ξ}(K) = ξ′ S_n(K) ξ. Estimators for the parameters in S_{n,ξ}(K) can be constructed from a GS2SLS estimation. For that estimation, let the first-stage IV matrix be F_{K̄,n} with K̄ instruments, the matrices for the quadratic moments in the second stage be D_{n1}, …, D_{n,k̄_d}, and the last-stage IV matrix be Q_{K̄,n} = [F_{K̄,n}, M_n F_{K̄,n}]. (The K̄ needs to increase with n so that the estimators for σ_{vϵ}, Σ_v and H̄ defined below are consistent.) Then the first-stage estimator for δ is δ̇_n = (Z_n′ P_{F̄_n} Z_n)^{-1} Z_n′ P_{F̄_n} y_n with P_{F̄_n} = F_{K̄,n}(F_{K̄,n}′ F_{K̄,n})^{−} F_{K̄,n}′, and the last-stage estimator for δ is δ̂_n = [Z_n′(ρ̂_n) P_{K̄,n} Z_n(ρ̂_n)]^{-1} Z_n′(ρ̂_n) P_{K̄,n} y_n(ρ̂_n) with P_{K̄,n} = Q_{K̄,n}(Q_{K̄,n}′ Q_{K̄,n})^{−} Q_{K̄,n}′ and ρ̂_n being the estimator for ρ from the second stage. Let the estimators for σ_ϵ², σ_{vϵ} and Σ_v be, respectively, σ̂_ϵ² = (1/n) ϵ̂_n′ ϵ̂_n, σ̂_{vϵ} = (1/n) ϵ̂_n′ v̂_n and Σ̂_v = (1/n) v̂_n′ v̂_n, where ϵ̂_n = y_n(ρ̂_n) − Z_n(ρ̂_n) δ̂_n and v̂_n = (I_n − P_{F̄_n}) Z_{2n}. An estimator Ω̂_{n2}(K) for Ω_{n2}(K) can be derived by replacing the parameters with their respective estimators. An estimator for H_n is Ĥ_n = (1/n) Z_n′(ρ̂_n) P_{K̄,n} Z_n(ρ̂_n). For (1/n) Z̄_n′(ρ_0)(I_n − P_{K,n}) Z̄_n(ρ_0), note that
(1/n) E[Z_n′(ρ_0)(I_n − P_{K,n}) Z_n(ρ_0)] = (1/n) E{[Z̄_n(ρ_0) + R_n ζ_n]′ (I_n − P_{K,n}) [Z̄_n(ρ_0) + R_n ζ_n]} = (1/n) Z̄_n′(ρ_0)(I_n − P_{K,n}) Z̄_n(ρ_0) + (1/n) E(ζ_n′ R_n′ R_n ζ_n) − (1/n) Ω_{n1}(K)
where Ω_{n1}(K) is given in Equation (11). Thus (1/n) Z̄_n′(ρ_0)(I_n − P_{K,n}) Z̄_n(ρ_0) can be estimated, up to an additive constant not depending on K, by (1/n) Z_n′(ρ̂_n)(I_n − P_{K,n}) Z_n(ρ̂_n) + (1/n) Ω̂_{n1}(K), where Ω̂_{n1}(K) is an estimator for Ω_{n1}(K) derived by replacing the parameters in Ω_{n1}(K) with their estimators. Hence, for the GS2SLS, S_{n,ξ}(K) = ξ′ S_n(K) ξ can be estimated, up to an additive constant not depending on K, by
Ŝ_{n,ξ}(K) = (1/n) ξ′ Ĥ_n^{-1} [σ̂_ϵ² Z_n′(ρ̂_n)(I_n − P_{K,n}) Z_n(ρ̂_n) + σ̂_ϵ² Ω̂_{n1}(K) + Ω̂_{n2}(K)] Ĥ_n^{-1} ξ    (18)
Similarly, for the CGS2SLS, S n , ξ ( K ) can be estimated, up to an additive constant not depending on K, by
Ŝ_{n,ξ}(K) = (1/n) ξ′ Ĥ_n^{-1} [σ̂_ϵ² Z_n′(ρ̂_n)(I_n − P_{K,n}) Z_n(ρ̂_n) + σ̂_ϵ² Ω̂_{n1}(K) + Π̂_{n1}(K) + Π̂_{n2}(K) + Π̂_{n3}(K)] Ĥ_n^{-1} ξ    (19)
where Π̂_{n1}(K) is an estimator of Π_{n1}(K) derived by replacing the parameters in Π_{n1}(K) with their estimators, Π̂_{n2}(K) is given in Equation (26) and Π̂_{n3}(K) is given in Equation (27).
The optimal choice of K is the minimizer K̂ of Ŝ_{n,ξ}(K). The K̂ is optimal in the sense that S_{n,ξ}(K̂) is asymptotically as small as min_K S_{n,ξ}(K), i.e.,
S_{n,ξ}(K̂) / min_K S_{n,ξ}(K) →_p 1    (20)
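Operationally, the selection step is a grid search over candidate instrument sets. A sketch, in which `criterion` is a placeholder for an implementation of whichever of Equations (18) or (19) applies, returning the matrix inside ξ′[·]ξ for a given instrument matrix:

    import numpy as np

    def select_K(candidates, criterion, xi):
        """candidates: dict {K: Q_{K,n}}; returns the K minimizing xi' S_hat_n(K) xi."""
        scores = {K: float(xi @ criterion(Q) @ xi) for K, Q in candidates.items()}
        return min(scores, key=scores.get)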
Assumption 10. 
(i) √n (ρ̂_n − ρ_0) = O_P(1), δ̂_n →_p δ_0, σ̂_ϵ² →_p σ_ϵ², σ̂_{vϵ} →_p σ_{vϵ} and Σ̂_v →_p Σ_v;
(ii) For the GS2SLS, |S_{n,ξ}(K)|/(K²/n + Δ_{nK,1}) > c, and for the CGS2SLS, |S_{n,ξ}(K)|/(K/n + Δ_{nK,1}) > c, for some constant c > 0, where Δ_{nK,1} = (1/n) tr[Z̄_n′(ρ_0)(I_n − P_{K,n}) Z̄_n(ρ_0)].
Assumption 11. 
For both the GS2SLS and CGS2SLS, K [n S_{n,ξ}(K)]^{-1} → 0.
We assume the √n-consistency of ρ̂_n and the consistency of the other preliminary estimators in Assumption 10 (i). Assumption 10 (ii) and Assumption 11 are similar to conditions in [3]. For the GS2SLS, from the proof of Proposition 5, the trace of the positive semi-definite matrix S_n(K) has exactly the same order as (K²/n + Δ_{nK,1}), so S_{n,ξ}(K) is of order O(K²/n + Δ_{nK,1}). Assumption 10 (ii) requires S_{n,ξ}(K) for the GS2SLS to have exactly the same order as (K²/n + Δ_{nK,1}); a similar condition is imposed on S_{n,ξ}(K) for the CGS2SLS. Assumption 11 restricts the set of possible values of K.
Proposition 7. 
Under Assumptions 1–11, for K̂ = arg min_K Ŝ_{n,ξ}(K), Equation (20) is satisfied for both the GS2SLS and CGS2SLS.

4. Monte Carlo Study

We demonstrate the finite-sample performance of our instrument selection procedure with Monte Carlo experiments. Except for the additional spatial error dependence, most parts of the experimental design follow [3]. The model considered is
y_n = λ_0 W_n y_n + γ_0 Z_{2n} + u_n,  u_n = ρ_0 M_n u_n + ϵ_n,  Z_{2n} = X_n β_0 + v_n
where ϵ_n = (ϵ_{n1}, …, ϵ_{nn})′, v_n = (v_{n1}, …, v_{nn})′ and Z_{2n} is a vector. The (ϵ_{ni}, v_{ni})'s are i.i.d. normal with mean zero, ϵ_{ni} and v_{ni} both have unit variance, and the correlation coefficient between ϵ_{ni} and v_{ni} is σ_{vϵ}, which is varied by design. In the experiments, γ_0 = 1, λ_0 = 0.6, and ρ_0 = 0.1 or 0.5. Elements of the n × q̄ matrix X_n are random draws from the standard normal distribution. The specification implies a theoretical first-stage coefficient of determination R_f² = β_0′β_0/(β_0′β_0 + 1) (with the spatial dependence being ignored), following [19]. The q̄ is specified below.
As in [3], we consider two models with different specifications of β_0. In Model 1, the coefficients are decreasing, i.e., the jth element of β_0 is
β_{0j} = c(q̄) (1 − j/(q̄ + 1))⁴,  for j = 1, …, q̄
where c(q̄) is chosen such that R_f² equals some specified value in the experiments; in Model 2, the coefficients are all equal, i.e.,
β_{0j} = ( R_f² / [q̄ (1 − R_f²)] )^{1/2},  for j = 1, …, q̄
These two specifications represent, respectively, the case where some instruments are more important than others and the case where no instrument should be preferred over others [1]. In the experiments, R_f² equals 0.02 or 0.1, σ_{vϵ} equals 0.1, 0.5 or 0.9, and n = 98 or 490. The W_n is a block-diagonal matrix with each diagonal block being the row-normalized matrix used in the study of crimes across 49 districts in Columbus, OH in [20]. The spatial weights matrix M_n in the error process is set equal to W_n. The number of Monte Carlo repetitions is 2000.
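The block-diagonal structure of W_n can be built as below. This is a sketch in which a random row-normalized contiguity block stands in for the 49 × 49 Columbus matrix of [20]; that substitution is an assumption made only to keep the snippet self-contained.

    import numpy as np
    from scipy.linalg import block_diag

    rng = np.random.default_rng(1)
    B = (rng.random((49, 49)) < 0.1).astype(float)       # stand-in 49 x 49 contiguity block
    np.fill_diagonal(B, 0.0)                             # no unit is its own neighbor
    B /= np.maximum(B.sum(axis=1, keepdims=True), 1.0)   # row-normalize
    W_n = block_diag(*[B] * 2)                           # n = 98; use 10 blocks for n = 490
    M_n = W_n                                            # error-process weights set equal to W_n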
Let X_{nq} be the matrix consisting of the first q columns of X_n, and Q_{p,q} = [X_{nq}, W_n X_{nq}, …, W_n^p X_{nq}], for p = 1, 2, …, p̄ and q = 1, 2, …, q̄. For n = 98, we set p̄ = 4 and q̄ = 5; for n = 490, we set p̄ = 10 and q̄ = 10. The following estimators are considered:
(i) GS2SLS-min: the GS2SLS with Q_{1,1} (as the instrument matrix in the third stage);
(ii) GS2SLS-max: the GS2SLS with Q_{p̄,q̄};
(iii) GS2SLS-op: the GS2SLS with Q_{p,q}, where (p, q) minimizes Ŝ_{n,ξ}(K) in Equation (18) with ξ = (1, 1)′;
(iv) CGS2SLS-max: the CGS2SLS with Q_{p̄,q̄};
(v) CGS2SLS-op: the CGS2SLS with Q_{p,q}, where (p, q) minimizes Ŝ_{n,ξ}(K) in Equation (19) with ξ = (1, 1)′.
The leading-order bias for the CGS2SLS and the approximated MSEs are estimated using the GS2SLS with Q_{2,q̄} as the instrument matrix in the third stage. For all the GS2SLS and CGS2SLS estimators considered, the instrument matrix used in the first stage is Q_{2,q̄}, and the matrices used for the quadratic moments in the second stage are W_n and W_n² − I_n tr(W_n²)/n. (As q̄ is relatively large compared with the sample size, for the first-stage estimator of the GS2SLS estimation and the estimator for the bias correction, we use p = 2 as suggested by [11].)
For each estimator, the following robust measures of central tendency and dispersion are reported: the median bias (MB), the median of the absolute deviations (MAD), the difference between the 0.1 and 0.9 quantiles (DQ) of the empirical distribution, and the coverage rate (CR) of a nominal 95% confidence interval. (There are some outliers in the GS2SLS and CGS2SLS estimates, so the mean and variance of the estimators are not reported.)
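These four measures are straightforward to compute from the Monte Carlo draws. The following sketch assumes `est` collects the 2000 estimates of a scalar parameter with true value `theta0` and `se` the corresponding standard errors; taking the MAD about the sample median is an assumption about the exact definition used.

    import numpy as np

    def summarize(est, se, theta0):
        mb = np.median(est) - theta0                          # median bias (MB)
        mad = np.median(np.abs(est - np.median(est)))         # median absolute deviation (MAD)
        dq = np.quantile(est, 0.9) - np.quantile(est, 0.1)    # 0.1-0.9 quantile range (DQ)
        cr = np.mean(np.abs(est - theta0) <= 1.96 * se)       # coverage of nominal 95% CI (CR)
        return mb, mad, dq, cr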
The summary statistics of the estimators for Model 1 are reported in Table 1, Table 2, Table 3 and Table 4. We first compare GS2SLS-min, GS2SLS-max and GS2SLS-op. The GS2SLS-max has the largest median bias in most cases; the GS2SLS-op has the smallest median bias in half of the cases when n = 98 but an intermediate median bias when n = 490. The GS2SLS-max has the smallest MAD and DQ in all cases; the GS2SLS-op estimator of λ_0 has intermediate MAD and DQ, and that of γ_0 has intermediate MAD and DQ when R_f² = 0.02 but the largest MAD and DQ when R_f² = 0.1. The CR of GS2SLS-op is closest to the nominal level in most cases, while the CR of GS2SLS-max is significantly below the nominal level in many cases. The CGS2SLS-max generally reduces the bias of GS2SLS-max significantly, has magnitudes of MAD and DQ similar to those of GS2SLS-max, and has a CR closer to the nominal level than GS2SLS-max but still significantly below the nominal level in many cases. Compared with the GS2SLS-op, in most cases the CGS2SLS-op has much larger MAD and DQ, similar CR, and smaller median bias for λ_0 but larger median bias for γ_0.
Table 1. Estimation of Model 1 with R_f² = 0.02 and n = 98.

                            λ_0 = 0.6                     |  γ_0 = 1.0
                            MB      MAD    DQ      CR     |  MB      MAD    DQ     CR
ρ_0 = 0.1
σ_vϵ = 0.1   GS2SLS-min     0.174   0.375  2.327   1.000  |  −0.011  0.612  3.618  1.000
             GS2SLS-max     0.242   0.083  0.323   0.810  |  −0.065  0.175  0.654  0.992
             GS2SLS-op      0.171   0.297  1.702   0.999  |  0.015   0.468  2.307  1.000
             CGS2SLS-max    −0.046  0.125  0.667   0.870  |  0.098   0.295  1.248  0.974
             CGS2SLS-op     −0.375  0.581  10.489  0.991  |  0.332   0.719  8.277  1.000
σ_vϵ = 0.5   GS2SLS-min     0.157   0.428  2.917   1.000  |  0.188   0.582  3.415  1.000
             GS2SLS-max     0.156   0.071  0.279   0.921  |  0.347   0.154  0.609  0.932
             GS2SLS-op      0.129   0.235  1.357   1.000  |  0.333   0.382  1.883  0.999
             CGS2SLS-max    −0.008  0.081  0.374   0.958  |  0.407   0.246  0.983  0.824
             CGS2SLS-op     −0.190  0.383  5.713   0.999  |  0.501   0.531  4.274  1.000
σ_vϵ = 0.9   GS2SLS-min     0.148   0.295  2.039   1.000  |  0.293   0.456  3.633  0.982
             GS2SLS-max     0.064   0.031  0.120   0.968  |  0.791   0.081  0.306  0.033
             GS2SLS-op      0.074   0.129  0.814   1.000  |  0.700   0.291  1.492  0.782
             CGS2SLS-max    0.032   0.034  0.136   0.997  |  0.723   0.152  0.608  0.189
             CGS2SLS-op     0.011   0.160  1.544   1.000  |  0.628   0.355  2.091  0.820
ρ_0 = 0.5
σ_vϵ = 0.1   GS2SLS-min     0.349   0.342  2.413   0.992  |  0.026   0.572  3.234  1.000
             GS2SLS-max     0.344   0.059  0.241   0.414  |  −0.070  0.176  0.696  0.992
             GS2SLS-op      0.310   0.272  1.755   0.984  |  0.031   0.428  2.375  1.000
             CGS2SLS-max    0.057   0.160  1.432   0.720  |  0.049   0.330  1.544  0.967
             CGS2SLS-op     −0.137  0.730  10.221  0.972  |  0.203   0.810  7.607  1.000
σ_vϵ = 0.5   GS2SLS-min     0.262   0.347  2.225   1.000  |  0.195   0.556  3.487  1.000
             GS2SLS-max     0.261   0.053  0.208   0.503  |  0.335   0.155  0.578  0.934
             GS2SLS-op      0.227   0.208  1.342   0.991  |  0.350   0.403  2.108  1.000
             CGS2SLS-max    0.092   0.077  0.401   0.855  |  0.400   0.254  1.022  0.848
             CGS2SLS-op     −0.085  0.368  6.524   0.996  |  0.464   0.565  4.658  1.000
σ_vϵ = 0.9   GS2SLS-min     0.228   0.219  1.672   0.992  |  0.339   0.460  3.461  0.973
             GS2SLS-max     0.181   0.027  0.103   0.224  |  0.775   0.066  0.264  0.023
             GS2SLS-op      0.191   0.114  0.659   0.952  |  0.690   0.262  1.390  0.768
             CGS2SLS-max    0.140   0.029  0.124   0.546  |  0.705   0.127  0.542  0.165
             CGS2SLS-op     0.115   0.133  1.600   0.966  |  0.639   0.310  2.055  0.809
MB: median bias; MAD: median of the absolute deviations; DQ: difference between the 0.1 and 0.9 quantiles; CR: coverage rate of a nominal 95% confidence interval.
Table 2. Estimation of Model 1 with R_f² = 0.1 and n = 98.

                            λ_0 = 0.6                     |  γ_0 = 1.0
                            MB      MAD    DQ      CR     |  MB      MAD    DQ     CR
ρ_0 = 0.1
σ_vϵ = 0.1   GS2SLS-min     0.081   0.283  1.528   1.000  |  −0.024  0.240  1.026  1.000
             GS2SLS-max     0.225   0.076  0.302   0.835  |  −0.062  0.135  0.548  0.995
             GS2SLS-op      0.116   0.254  1.437   1.000  |  0.001   0.288  1.365  1.000
             CGS2SLS-max    −0.033  0.109  0.558   0.887  |  0.039   0.197  0.829  0.979
             CGS2SLS-op     −0.212  0.379  5.158   0.997  |  0.211   0.461  3.997  1.000
σ_vϵ = 0.5   GS2SLS-min     0.082   0.231  1.368   1.000  |  −0.031  0.255  1.127  1.000
             GS2SLS-max     0.149   0.066  0.263   0.903  |  0.206   0.136  0.534  0.965
             GS2SLS-op      0.078   0.213  1.203   0.998  |  0.147   0.312  1.452  1.000
             CGS2SLS-max    −0.004  0.080  0.361   0.961  |  0.174   0.174  0.693  0.949
             CGS2SLS-op     −0.155  0.283  3.963   0.999  |  0.312   0.381  2.418  1.000
σ_vϵ = 0.9   GS2SLS-min     0.103   0.270  1.683   1.000  |  0.027   0.298  1.763  0.995
             GS2SLS-max     0.075   0.044  0.171   0.914  |  0.595   0.095  0.368  0.207
             GS2SLS-op      0.071   0.182  1.185   0.998  |  0.284   0.357  1.671  0.928
             CGS2SLS-max    0.022   0.049  0.210   0.985  |  0.407   0.153  0.605  0.598
             CGS2SLS-op     −0.034  0.190  2.795   1.000  |  0.374   0.394  2.175  0.913
ρ_0 = 0.5
σ_vϵ = 0.1   GS2SLS-min     0.253   0.313  1.991   0.996  |  0.021   0.273  1.243  1.000
             GS2SLS-max     0.327   0.059  0.237   0.412  |  −0.058  0.146  0.568  0.995
             GS2SLS-op      0.257   0.253  1.611   0.983  |  0.019   0.290  1.374  1.000
             CGS2SLS-max    0.055   0.127  1.086   0.766  |  0.020   0.229  1.014  0.981
             CGS2SLS-op     −0.159  0.472  6.950   0.972  |  0.144   0.557  5.132  1.000
σ_vϵ = 0.5   GS2SLS-min     0.197   0.280  1.646   0.997  |  0.002   0.278  1.253  1.000
             GS2SLS-max     0.268   0.055  0.213   0.444  |  0.214   0.138  0.527  0.959
             GS2SLS-op      0.217   0.232  1.415   0.991  |  0.166   0.316  1.515  1.000
             CGS2SLS-max    0.087   0.083  0.421   0.826  |  0.197   0.192  0.802  0.941
             CGS2SLS-op     −0.047  0.309  4.120   0.987  |  0.282   0.400  2.706  1.000
σ_vϵ = 0.9   GS2SLS-min     0.222   0.262  1.671   0.994  |  0.013   0.239  1.165  0.995
             GS2SLS-max     0.217   0.030  0.118   0.156  |  0.488   0.080  0.322  0.334
             GS2SLS-op      0.216   0.190  1.288   0.958  |  0.148   0.246  1.216  0.968
             CGS2SLS-max    0.140   0.043  0.185   0.669  |  0.310   0.129  0.527  0.753
             CGS2SLS-op     0.077   0.185  2.591   0.966  |  0.249   0.327  1.756  0.947
MB: median bias; MAD: median of the absolute deviations; DQ: difference between the 0.1 and 0.9 quantiles; CR: coverage rate of a nominal 95% confidence interval.
Table 3. Estimation of Model 1 with R_f² = 0.02 and n = 490.

                            λ_0 = 0.6                     |  γ_0 = 1.0
                            MB      MAD    DQ      CR     |  MB      MAD    DQ     CR
ρ_0 = 0.1
σ_vϵ = 0.1   GS2SLS-min     0.124   0.348  2.079   1.000  |  0.000   0.383  1.736  1.000
             GS2SLS-max     0.245   0.040  0.147   0.168  |  −0.116  0.094  0.360  0.958
             GS2SLS-op      0.158   0.210  1.203   0.995  |  −0.009  0.307  1.431  1.000
             CGS2SLS-max    −0.023  0.056  0.310   0.866  |  0.046   0.149  0.603  0.904
             CGS2SLS-op     −0.326  0.433  7.033   0.993  |  0.265   0.425  4.096  1.000
σ_vϵ = 0.5   GS2SLS-min     0.145   0.322  1.972   1.000  |  0.024   0.364  1.766  1.000
             GS2SLS-max     0.156   0.031  0.117   0.367  |  0.306   0.080  0.302  0.587
             GS2SLS-op      0.117   0.166  0.984   1.000  |  0.301   0.276  1.301  0.998
             CGS2SLS-max    0.014   0.035  0.138   0.978  |  0.299   0.135  0.502  0.587
             CGS2SLS-op     −0.128  0.269  4.360   1.000  |  0.364   0.360  1.961  0.999
σ_vϵ = 0.9   GS2SLS-min     0.143   0.271  1.772   0.999  |  0.016   0.295  1.569  0.995
             GS2SLS-max     0.067   0.016  0.061   0.514  |  0.757   0.041  0.155  0.000
             GS2SLS-op      0.089   0.183  1.014   0.998  |  0.348   0.274  1.361  0.898
             CGS2SLS-max    0.038   0.019  0.076   0.934  |  0.558   0.088  0.342  0.043
             CGS2SLS-op     −0.011  0.163  1.762   1.000  |  0.423   0.284  1.513  0.850
ρ_0 = 0.5
σ_vϵ = 0.1   GS2SLS-min     0.241   0.333  2.121   0.996  |  0.009   0.382  1.682  1.000
             GS2SLS-max     0.338   0.029  0.111   0.001  |  −0.111  0.098  0.370  0.948
             GS2SLS-op      0.248   0.220  1.452   0.978  |  0.015   0.331  1.472  1.000
             CGS2SLS-max    0.057   0.079  0.723   0.634  |  0.015   0.188  0.860  0.855
             CGS2SLS-op     −0.241  0.530  9.160   0.936  |  0.218   0.572  5.418  1.000
σ_vϵ = 0.5   GS2SLS-min     0.241   0.266  1.641   0.996  |  0.030   0.315  1.491  1.000
             GS2SLS-max     0.265   0.025  0.094   0.002  |  0.308   0.079  0.311  0.552
             GS2SLS-op      0.230   0.163  0.956   0.971  |  0.274   0.284  1.231  1.000
             CGS2SLS-max    0.106   0.038  0.179   0.575  |  0.302   0.140  0.551  0.572
             CGS2SLS-op     −0.077  0.292  4.467   0.984  |  0.344   0.358  2.332  0.999
σ_vϵ = 0.9   GS2SLS-min     0.218   0.263  1.765   0.994  |  0.075   0.294  1.820  0.995
             GS2SLS-max     0.184   0.012  0.046   0.000  |  0.754   0.037  0.138  0.000
             GS2SLS-op      0.204   0.161  0.961   0.963  |  0.377   0.256  1.220  0.887
             CGS2SLS-max    0.142   0.015  0.058   0.032  |  0.580   0.084  0.319  0.031
             CGS2SLS-op     0.111   0.151  2.019   0.950  |  0.421   0.287  1.530  0.836
MB: median bias; MAD: median of the absolute deviations; DQ: difference between the 0.1 and 0.9 quantiles; CR: coverage rate of a nominal 95% confidence interval.
Table 4. Estimation of Model 1 with R_f² = 0.1 and n = 490.

                            λ_0 = 0.6                     |  γ_0 = 1.0
                            MB      MAD    DQ      CR     |  MB      MAD    DQ     CR
ρ_0 = 0.1
σ_vϵ = 0.1   GS2SLS-min     0.032   0.154  0.801   0.999  |  −0.016  0.131  0.526  1.000
             GS2SLS-max     0.214   0.037  0.144   0.257  |  −0.068  0.076  0.274  0.984
             GS2SLS-op      0.126   0.211  1.258   0.999  |  −0.004  0.296  1.475  1.000
             CGS2SLS-max    0.007   0.044  0.176   0.970  |  0.009   0.093  0.362  0.979
             CGS2SLS-op     −0.172  0.301  3.807   0.999  |  0.209   0.388  2.928  1.000
σ_vϵ = 0.5   GS2SLS-min     0.045   0.150  0.834   0.999  |  −0.015  0.136  0.553  1.000
             GS2SLS-max     0.165   0.031  0.121   0.290  |  0.199   0.067  0.260  0.792
             GS2SLS-op      0.097   0.221  1.402   1.000  |  0.110   0.301  1.433  1.000
             CGS2SLS-max    0.029   0.035  0.139   0.967  |  0.112   0.086  0.328  0.922
             CGS2SLS-op     −0.113  0.258  3.874   1.000  |  0.248   0.338  2.203  1.000
σ_vϵ = 0.9   GS2SLS-min     0.053   0.147  0.975   1.000  |  −0.003  0.136  0.574  0.998
             GS2SLS-max     0.114   0.019  0.073   0.144  |  0.503   0.044  0.167  0.003
             GS2SLS-op      0.107   0.182  1.080   0.996  |  0.106   0.220  0.980  0.986
             CGS2SLS-max    0.060   0.026  0.103   0.861  |  0.217   0.075  0.273  0.643
             CGS2SLS-op     −0.046  0.216  2.924   1.000  |  0.280   0.364  2.083  0.957
ρ_0 = 0.5
σ_vϵ = 0.1   GS2SLS-min     0.072   0.189  1.255   0.996  |  0.003   0.131  0.525  1.000
             GS2SLS-max     0.316   0.030  0.115   0.006  |  −0.054  0.073  0.287  0.983
             GS2SLS-op      0.211   0.238  1.563   0.986  |  0.020   0.276  1.241  1.000
             CGS2SLS-max    0.079   0.054  0.277   0.718  |  0.014   0.108  0.431  0.957
             CGS2SLS-op     −0.137  0.382  5.654   0.967  |  0.205   0.453  3.517  1.000
σ_vϵ = 0.5   GS2SLS-min     0.097   0.173  1.275   0.993  |  −0.006  0.140  0.595  1.000
             GS2SLS-max     0.264   0.025  0.101   0.006  |  0.200   0.068  0.263  0.776
             GS2SLS-op      0.191   0.219  1.377   0.991  |  0.116   0.258  1.184  0.999
             CGS2SLS-max    0.110   0.034  0.150   0.570  |  0.108   0.092  0.361  0.910
             CGS2SLS-op     −0.028  0.291  4.901   0.985  |  0.206   0.330  2.157  1.000
σ_vϵ = 0.9   GS2SLS-min     0.098   0.156  1.341   0.989  |  −0.005  0.148  0.638  0.999
             GS2SLS-max     0.210   0.017  0.064   0.000  |  0.482   0.044  0.167  0.004
             GS2SLS-op      0.150   0.180  1.114   0.977  |  0.120   0.191  0.833  0.996
             CGS2SLS-max    0.138   0.022  0.088   0.183  |  0.195   0.078  0.300  0.702
             CGS2SLS-op     0.039   0.213  3.970   0.974  |  0.205   0.307  1.865  0.969
MB: median bias; MAD: median of the absolute deviations; DQ: difference between the 0.1 and 0.9 quantiles; CR: coverage rate of a nominal 95% confidence interval.
Table 5, Table 6, Table 7 and Table 8 report the summary statistics of the estimators for Model 2. Among GS2SLS-min, GS2SLS-max and GS2SLS-op, in most cases the GS2SLS-max has the largest median bias, the GS2SLS-op estimator of λ_0 has the smallest median bias, and the GS2SLS-op estimator of γ_0 has an intermediate median bias. The GS2SLS-max has the smallest MAD and DQ, and the GS2SLS-op has intermediate MAD and DQ. The CR of GS2SLS-op is closest to the nominal level, while the CR of GS2SLS-max is significantly below the nominal level in many cases. The performance of CGS2SLS-max for Model 2 is similar to that for Model 1. Compared with the GS2SLS-op, the CGS2SLS-op has much larger MAD and DQ in most cases and similar CR, with smaller median bias in more than half of the cases when ρ_0 = 0.5 but larger median bias in most cases when ρ_0 = 0.1.
Table 5. Estimation of Model 2 with R_f² = 0.02 and n = 98.

                            λ_0 = 0.6                     |  γ_0 = 1.0
                            MB      MAD    DQ      CR     |  MB      MAD    DQ     CR
ρ_0 = 0.1
σ_vϵ = 0.1   GS2SLS-min     0.246   0.611  3.434   1.000  |  0.046   0.774  4.234  1.000
             GS2SLS-max     0.250   0.082  0.317   0.797  |  −0.079  0.177  0.661  0.994
             GS2SLS-op      0.198   0.326  2.034   0.999  |  0.059   0.472  2.364  1.000
             CGS2SLS-max    −0.055  0.132  0.730   0.849  |  0.107   0.299  1.340  0.967
             CGS2SLS-op     −0.391  0.602  10.313  0.992  |  0.360   0.766  7.293  1.000
σ_vϵ = 0.5   GS2SLS-min     0.160   0.354  2.317   1.000  |  0.408   0.796  4.989  1.000
             GS2SLS-max     0.155   0.065  0.250   0.910  |  0.338   0.146  0.576  0.943
             GS2SLS-op      0.128   0.216  1.228   1.000  |  0.399   0.393  1.964  1.000
             CGS2SLS-max    −0.008  0.078  0.354   0.960  |  0.418   0.233  0.949  0.845
             CGS2SLS-op     −0.204  0.361  6.722   0.998  |  0.572   0.531  4.621  1.000
σ_vϵ = 0.9   GS2SLS-min     0.050   0.210  1.433   1.000  |  0.741   0.523  3.243  0.963
             GS2SLS-max     0.063   0.032  0.133   0.968  |  0.793   0.080  0.316  0.038
             GS2SLS-op      0.051   0.121  0.699   1.000  |  0.763   0.238  1.278  0.775
             CGS2SLS-max    0.030   0.036  0.155   0.993  |  0.721   0.147  0.609  0.193
             CGS2SLS-op     −0.006  0.148  1.445   1.000  |  0.714   0.286  2.544  0.800
ρ_0 = 0.5
σ_vϵ = 0.1   GS2SLS-min     0.289   0.363  2.264   0.994  |  0.059   0.829  5.260  1.000
             GS2SLS-max     0.342   0.061  0.238   0.367  |  −0.091  0.180  0.712  0.991
             GS2SLS-op      0.267   0.274  1.645   0.985  |  0.071   0.523  3.235  1.000
             CGS2SLS-max    0.063   0.160  1.484   0.698  |  0.023   0.356  1.665  0.958
             CGS2SLS-op     −0.167  0.675  9.228   0.966  |  0.254   0.807  7.433  1.000
σ_vϵ = 0.5   GS2SLS-min     0.277   0.342  2.408   0.997  |  0.303   0.694  4.551  1.000
             GS2SLS-max     0.264   0.052  0.203   0.449  |  0.330   0.151  0.585  0.934
             GS2SLS-op      0.226   0.196  1.324   0.986  |  0.356   0.394  2.001  0.999
             CGS2SLS-max    0.100   0.073  0.372   0.844  |  0.356   0.242  1.027  0.853
             CGS2SLS-op     −0.098  0.362  7.297   0.989  |  0.475   0.584  5.336  1.000
σ_vϵ = 0.9   GS2SLS-min     0.182   0.181  1.172   0.992  |  0.689   0.470  2.978  0.962
             GS2SLS-max     0.184   0.027  0.105   0.240  |  0.777   0.073  0.285  0.024
             GS2SLS-op      0.179   0.109  0.682   0.969  |  0.762   0.220  1.184  0.779
             CGS2SLS-max    0.144   0.030  0.130   0.568  |  0.710   0.137  0.559  0.183
             CGS2SLS-op     0.099   0.146  1.737   0.972  |  0.700   0.299  2.223  0.812
MB: median bias; MAD: median of the absolute deviations; DQ: difference between the 0.1 and 0.9 quantiles; CR: coverage rate of a nominal 95% confidence interval.
Table 6. Estimation of Model 2 with R_f² = 0.1 and n = 98.

                            λ_0 = 0.6                     |  γ_0 = 1.0
                            MB      MAD    DQ      CR     |  MB      MAD    DQ     CR
ρ_0 = 0.1
σ_vϵ = 0.1   GS2SLS-min     0.199   0.439  2.673   1.000  |  −0.001  0.482  2.470  1.000
             GS2SLS-max     0.230   0.076  0.295   0.804  |  −0.064  0.151  0.573  0.996
             GS2SLS-op      0.190   0.290  1.703   0.999  |  0.017   0.364  1.702  1.000
             CGS2SLS-max    −0.039  0.115  0.562   0.892  |  0.069   0.209  0.839  0.983
             CGS2SLS-op     −0.285  0.461  6.720   0.994  |  0.206   0.479  3.688  1.000
σ_vϵ = 0.5   GS2SLS-min     0.002   0.385  2.408   1.000  |  0.198   0.669  4.025  1.000
             GS2SLS-max     0.137   0.068  0.266   0.907  |  0.217   0.135  0.531  0.963
             GS2SLS-op      0.058   0.217  1.323   1.000  |  0.198   0.337  1.710  1.000
             CGS2SLS-max    −0.003  0.076  0.335   0.964  |  0.177   0.178  0.709  0.942
             CGS2SLS-op     −0.173  0.302  4.685   0.999  |  0.263   0.364  2.475  0.999
σ_vϵ = 0.9   GS2SLS-min     0.058   0.364  2.209   0.999  |  0.260   0.504  3.843  0.992
             GS2SLS-max     0.102   0.042  0.170   0.887  |  0.522   0.085  0.333  0.311
             GS2SLS-op      0.103   0.231  1.521   0.999  |  0.369   0.282  2.034  0.958
             CGS2SLS-max    0.039   0.053  0.220   0.982  |  0.331   0.134  0.528  0.728
             CGS2SLS-op     −0.023  0.197  2.793   1.000  |  0.339   0.268  1.738  0.955
ρ_0 = 0.5
σ_vϵ = 0.1   GS2SLS-min     0.290   0.364  2.446   0.998  |  0.053   0.601  3.690  1.000
             GS2SLS-max     0.319   0.068  0.265   0.454  |  −0.064  0.162  0.632  0.989
             GS2SLS-op      0.252   0.278  1.901   0.991  |  0.056   0.443  2.357  1.000
             CGS2SLS-max    0.063   0.120  1.088   0.777  |  0.016   0.251  1.169  0.966
             CGS2SLS-op     −0.149  0.495  7.195   0.970  |  0.220   0.642  5.969  1.000
σ_vϵ = 0.5   GS2SLS-min     0.244   0.309  1.949   0.997  |  0.329   0.728  4.353  1.000
             GS2SLS-max     0.268   0.051  0.203   0.440  |  0.233   0.129  0.507  0.961
             GS2SLS-op      0.243   0.214  1.317   0.986  |  0.222   0.366  1.924  1.000
             CGS2SLS-max    0.091   0.082  0.445   0.825  |  0.213   0.182  0.780  0.944
             CGS2SLS-op     −0.052  0.321  5.613   0.988  |  0.259   0.395  2.986  1.000
σ_vϵ = 0.9   GS2SLS-min     0.163   0.261  1.781   0.984  |  0.088   0.387  2.616  0.991
             GS2SLS-max     0.196   0.038  0.150   0.307  |  0.487   0.086  0.330  0.371
             GS2SLS-op      0.149   0.184  1.207   0.970  |  0.290   0.247  1.503  0.965
             CGS2SLS-max    0.117   0.050  0.220   0.774  |  0.291   0.141  0.556  0.787
             CGS2SLS-op     0.049   0.195  3.284   0.978  |  0.279   0.270  1.742  0.970
MB: median bias; MAD: median of the absolute deviations; DQ: difference between the 0.1 and 0.9 quantiles; CR: coverage rate of a nominal 95% confidence interval.
Table 7. Estimation of Model 2 with R_f² = 0.02 and n = 490.

                            λ_0 = 0.6                     |  γ_0 = 1.0
                            MB      MAD    DQ      CR     |  MB      MAD    DQ     CR
ρ_0 = 0.1
σ_vϵ = 0.1   GS2SLS-min     0.193   0.416  2.823   1.000  |  0.070   0.704  4.351  1.000
             GS2SLS-max     0.242   0.037  0.145   0.160  |  −0.100  0.093  0.351  0.967
             GS2SLS-op      0.169   0.218  1.220   0.998  |  0.014   0.347  1.673  1.000
             CGS2SLS-max    −0.017  0.056  0.273   0.893  |  0.040   0.144  0.580  0.919
             CGS2SLS-op     −0.348  0.468  11.234  0.992  |  0.321   0.517  5.647  1.000
σ_vϵ = 0.5   GS2SLS-min     0.130   0.354  2.272   1.000  |  0.259   0.652  3.957  1.000
             GS2SLS-max     0.154   0.031  0.120   0.389  |  0.316   0.078  0.297  0.557
             GS2SLS-op      0.104   0.171  0.980   1.000  |  0.349   0.288  1.408  0.999
             CGS2SLS-max    0.015   0.033  0.136   0.977  |  0.303   0.132  0.502  0.581
             CGS2SLS-op     −0.153  0.293  3.525   1.000  |  0.405   0.384  2.560  0.999
σ_vϵ = 0.9   GS2SLS-min     0.100   0.263  1.769   1.000  |  0.412   0.541  4.162  0.986
             GS2SLS-max     0.070   0.015  0.059   0.472  |  0.748   0.041  0.159  0.000
             GS2SLS-op      0.086   0.155  0.995   1.000  |  0.546   0.263  1.447  0.860
             CGS2SLS-max    0.041   0.020  0.074   0.925  |  0.538   0.083  0.338  0.051
             CGS2SLS-op     −0.008  0.169  2.335   1.000  |  0.490   0.247  1.964  0.863
ρ_0 = 0.5
σ_vϵ = 0.1   GS2SLS-min     0.322   0.398  2.574   0.997  |  −0.005  0.723  3.984  1.000
             GS2SLS-max     0.338   0.029  0.110   0.002  |  −0.115  0.099  0.382  0.940
             GS2SLS-op      0.271   0.243  1.508   0.976  |  −0.008  0.407  2.041  1.000
             CGS2SLS-max    0.060   0.082  0.657   0.634  |  0.014   0.189  0.862  0.855
             CGS2SLS-op     −0.300  0.587  11.233  0.939  |  0.252   0.675  6.906  1.000
σ_vϵ = 0.5   GS2SLS-min     0.251   0.281  1.692   0.997  |  0.291   0.661  3.651  1.000
             GS2SLS-max     0.263   0.025  0.096   0.004  |  0.306   0.082  0.307  0.553
             GS2SLS-op      0.239   0.172  1.055   0.971  |  0.337   0.295  1.385  0.998
             CGS2SLS-max    0.104   0.038  0.181   0.576  |  0.302   0.140  0.554  0.580
             CGS2SLS-op     −0.086  0.316  6.578   0.984  |  0.375   0.373  2.944  0.999
σ_vϵ = 0.9   GS2SLS-min     0.252   0.236  1.595   0.991  |  0.240   0.400  3.186  0.988
             GS2SLS-max     0.184   0.012  0.046   0.000  |  0.754   0.037  0.142  0.000
             GS2SLS-op      0.212   0.134  0.943   0.961  |  0.534   0.256  1.511  0.831
             CGS2SLS-max    0.142   0.015  0.059   0.035  |  0.584   0.082  0.320  0.026
             CGS2SLS-op     0.098   0.156  2.048   0.958  |  0.503   0.263  1.808  0.823
MB: median bias; MAD: median of the absolute deviations; DQ: difference between the 0.1 and 0.9 quantiles; CR: coverage rate of a nominal 95% confidence interval.
Table 8. Estimation of Model 2 with R_f² = 0.1 and n = 490.

                            λ_0 = 0.6                     |  γ_0 = 1.0
                            MB      MAD    DQ      CR     |  MB      MAD    DQ     CR
ρ_0 = 0.1
σ_vϵ = 0.1   GS2SLS-min     0.138   0.318  1.886   1.000  |  0.019   0.342  1.584  1.000
             GS2SLS-max     0.215   0.038  0.147   0.246  |  −0.073  0.075  0.282  0.983
             GS2SLS-op      0.123   0.241  1.339   1.000  |  0.000   0.295  1.242  1.000
             CGS2SLS-max    0.008   0.044  0.182   0.956  |  0.002   0.099  0.368  0.982
             CGS2SLS-op     −0.261  0.416  7.521   0.998  |  0.128   0.393  2.719  1.000
σ_vϵ = 0.5   GS2SLS-min     0.143   0.279  1.654   1.000  |  0.037   0.322  1.661  1.000
             GS2SLS-max     0.164   0.032  0.121   0.286  |  0.201   0.072  0.269  0.784
             GS2SLS-op      0.094   0.223  1.218   0.999  |  0.120   0.273  1.162  1.000
             CGS2SLS-max    0.028   0.035  0.138   0.970  |  0.108   0.091  0.343  0.912
             CGS2SLS-op     −0.181  0.357  7.336   1.000  |  0.129   0.384  2.341  1.000
σ_vϵ = 0.9   GS2SLS-min     0.193   0.206  1.298   1.000  |  0.056   0.316  1.821  0.997
             GS2SLS-max     0.118   0.019  0.075   0.117  |  0.476   0.045  0.182  0.005
             GS2SLS-op      0.117   0.185  1.127   0.997  |  0.236   0.196  1.018  0.981
             CGS2SLS-max    0.059   0.027  0.106   0.854  |  0.200   0.071  0.285  0.703
             CGS2SLS-op     −0.073  0.284  5.465   0.998  |  0.069   0.275  1.452  0.994
ρ_0 = 0.5
σ_vϵ = 0.1   GS2SLS-min     0.236   0.269  1.706   0.992  |  0.026   0.275  1.236  1.000
             GS2SLS-max     0.319   0.030  0.113   0.008  |  −0.057  0.071  0.275  0.982
             GS2SLS-op      0.224   0.220  1.365   0.988  |  0.030   0.237  1.036  1.000
             CGS2SLS-max    0.078   0.054  0.306   0.724  |  0.009   0.102  0.402  0.967
             CGS2SLS-op     −0.160  0.389  6.310   0.962  |  0.065   0.298  1.734  1.000
σ_vϵ = 0.5   GS2SLS-min     0.255   0.245  1.559   0.993  |  0.082   0.374  1.865  1.000
             GS2SLS-max     0.267   0.026  0.098   0.005  |  0.215   0.073  0.267  0.740
             GS2SLS-op      0.202   0.216  1.233   0.987  |  0.139   0.253  1.200  1.000
             CGS2SLS-max    0.109   0.037  0.162   0.566  |  0.136   0.097  0.385  0.882
             CGS2SLS-op     −0.112  0.372  7.572   0.988  |  0.091   0.406  2.593  1.000
σ_vϵ = 0.9   GS2SLS-min     0.250   0.200  1.235   0.986  |  0.060   0.271  1.456  0.996
             GS2SLS-max     0.211   0.015  0.059   0.000  |  0.492   0.042  0.158  0.001
             GS2SLS-op      0.186   0.160  0.987   0.973  |  0.247   0.172  0.857  0.978
             CGS2SLS-max    0.142   0.022  0.089   0.164  |  0.211   0.076  0.299  0.667
             CGS2SLS-op     0.022   0.249  4.416   0.985  |  0.078   0.266  1.447  0.993
MB: median bias; MAD: median of the absolute deviations; DQ: difference between the 0.1 and 0.9 quantiles; CR: coverage rate of a nominal 95% confidence interval.
From the Monte Carlo results for both models, we see that the proposed CGS2SLS estimator can effectively reduce the many-instrument bias, and that the estimators derived by choosing the number of instruments to minimize their respective approximated MSEs, GS2SLS-op and CGS2SLS-op, have coverage rates closer to the nominal level than the estimators using very few or very many instruments; i.e., GS2SLS-op and CGS2SLS-op make inference more reliable. Between GS2SLS-op and CGS2SLS-op, neither is always better than the other in terms of central tendency or coverage rate, but the GS2SLS-op has much smaller dispersion in most cases.
The summary statistics of the estimated p and q are presented in Table 9 and Table 10. Consistent with [3], in most cases for both models, only the first spatial lag (p = 1) is used. For Model 1, in most cases, q̂ is 1 or 2 with n = 98, and is larger with n = 490 but smaller than the maximum number of instruments q̄ = 10. For Model 2, q̂ tends to be larger, which might be due to the fact that the variables in X_n are equally important in Model 2, whereas the importance of the variables in X_n in Model 1 is decreasing. For both models, q̂ tends to be larger with a larger R_f².
Table 9. The Distributions of p̂ and q̂ in Model 1.

                                        GS2SLS                     |  CGS2SLS
                                        p̂             q̂            |  p̂             q̂
                                        MO LQ ME UQ   MO LQ ME UQ  |  MO LQ ME UQ   MO LQ ME UQ
n = 98
R_f² = 0.02, ρ_0 = 0.1, σ_vϵ = 0.1      1  1  1  4    1  1  1  5   |  1  1  2  4    1  1  2  5
                        σ_vϵ = 0.5      1  1  1  4    1  1  2  5   |  1  1  1  4    1  1  2  5
                        σ_vϵ = 0.9      1  1  1  4    1  1  1  5   |  1  1  1  4    1  1  1  5
R_f² = 0.02, ρ_0 = 0.5, σ_vϵ = 0.1      1  1  1  4    1  1  1  4   |  4  1  3  4    5  1  3  4
                        σ_vϵ = 0.5      1  1  1  4    1  1  1  5   |  1  1  1  4    1  1  2  5
                        σ_vϵ = 0.9      1  1  1  4    1  1  1  5   |  1  1  1  4    1  1  1  5
R_f² = 0.1,  ρ_0 = 0.1, σ_vϵ = 0.1      1  1  1  4    1  1  1  5   |  1  1  2  4    1  1  3  5
                        σ_vϵ = 0.5      1  1  1  4    1  1  2  4   |  1  1  2  4    2  1  2  4
                        σ_vϵ = 0.9      1  1  1  3    1  1  1  3   |  1  1  1  3    1  1  2  3
R_f² = 0.1,  ρ_0 = 0.5, σ_vϵ = 0.1      1  1  1  4    1  1  1  4   |  1  1  2  4    5  1  4  4
                        σ_vϵ = 0.5      1  1  1  3    1  1  2  4   |  1  1  2  3    1  1  2  4
                        σ_vϵ = 0.9      1  1  1  2    1  1  1  2   |  1  1  1  2    1  1  2  2
n = 490
R_f² = 0.02, ρ_0 = 0.1, σ_vϵ = 0.1      1  1  2  9    1  1  3  9   |  1  1  2  9    1  1  3  9
                        σ_vϵ = 0.5      1  1  2  9    2  1  3  9   |  1  1  1  9    2  1  3  9
                        σ_vϵ = 0.9      1  1  1  3    1  1  2  3   |  1  1  1  3    1  1  2  3
R_f² = 0.02, ρ_0 = 0.5, σ_vϵ = 0.1      1  1  1  6    1  1  2  7   |  1  1  2  6    1  1  3  7
                        σ_vϵ = 0.5      1  1  1  7    1  1  2  8   |  1  1  1  7    1  1  2  8
                        σ_vϵ = 0.9      1  1  1  3    1  1  2  3   |  1  1  1  3    1  1  2  3
R_f² = 0.1,  ρ_0 = 0.1, σ_vϵ = 0.1      1  1  1  7    3  2  4  8   |  1  1  2  7    3  2  4  8
                        σ_vϵ = 0.5      1  1  1  2    3  2  3  5   |  1  1  1  2    3  2  4  5
                        σ_vϵ = 0.9      1  1  1  1    2  2  2  3   |  1  1  1  1    3  2  4  3
R_f² = 0.1,  ρ_0 = 0.5, σ_vϵ = 0.1      1  1  1  3    3  1  3  5   |  1  1  1  3    3  2  4  5
                        σ_vϵ = 0.5      1  1  1  2    3  2  3  4   |  1  1  1  2    3  2  4  4
                        σ_vϵ = 0.9      1  1  1  1    2  2  2  3   |  1  1  1  1    3  2  3  3
MO: mode; LQ: 0.1 quantile; ME: median; UQ: 0.9 quantile.
Table 10. The Distributions of p̂ and q̂ in Model 2.
                                        GS2SLS                CGS2SLS
                                        p̂          q̂          p̂          q̂
                                        (each cell: MO, LQ, ME, UQ)
n = 98
R_f^2 = 0.02, ρ0 = 0.1, σvϵ = 0.1      1,1,1,4    1,1,1,5    1,1,2,4    5,1,3,5
                        σvϵ = 0.5      1,1,1,4    1,1,2,5    1,1,1,4    1,1,2,5
                        σvϵ = 0.9      1,1,1,4    1,1,2,5    1,1,1,4    1,1,1,5
R_f^2 = 0.02, ρ0 = 0.5, σvϵ = 0.1      1,1,1,4    1,1,1,5    4,1,2,4    5,1,3,5
                        σvϵ = 0.5      1,1,1,4    1,1,2,5    1,1,1,4    1,1,2,5
                        σvϵ = 0.9      1,1,1,4    1,1,1,5    1,1,1,4    1,1,1,5
R_f^2 = 0.1,  ρ0 = 0.1, σvϵ = 0.1      1,1,1,4    1,1,2,5    1,1,2,4    5,1,4,5
                        σvϵ = 0.5      1,1,1,3    1,1,3,5    1,1,1,3    5,1,4,5
                        σvϵ = 0.9      1,1,1,2    1,1,1,3    1,1,1,2    5,1,4,3
R_f^2 = 0.1,  ρ0 = 0.5, σvϵ = 0.1      1,1,1,3    1,1,2,5    1,1,2,3    5,2,5,5
                        σvϵ = 0.5      1,1,1,3    1,1,2,5    1,1,1,3    5,1,4,5
                        σvϵ = 0.9      1,1,1,2    1,1,1,3    1,1,1,2    5,1,4,3
n = 490
R_f^2 = 0.02, ρ0 = 0.1, σvϵ = 0.1      1,1,2,9    1,1,4,10   1,1,2,9    1,1,4,10
                        σvϵ = 0.5      1,1,2,9    1,1,4,10   1,1,1,9    1,1,4,10
                        σvϵ = 0.9      1,1,1,3    1,1,2,4    1,1,1,3    1,1,4,4
R_f^2 = 0.02, ρ0 = 0.5, σvϵ = 0.1      1,1,1,6    1,1,2,9    1,1,2,6    10,1,5,9
                        σvϵ = 0.5      1,1,1,8    1,1,3,10   1,1,1,8    1,1,4,10
                        σvϵ = 0.9      1,1,1,3.5  1,1,1,4    1,1,1,3.5  1,1,2,4
R_f^2 = 0.1,  ρ0 = 0.1, σvϵ = 0.1      1,1,1,5    10,3,9,10  1,1,1,5    10,8,10,10
                        σvϵ = 0.5      1,1,1,1    10,2,6,10  1,1,1,1    10,8,10,10
                        σvϵ = 0.9      1,1,1,1    3,1,2,5    1,1,1,1    10,8,10,5
R_f^2 = 0.1,  ρ0 = 0.5, σvϵ = 0.1      1,1,1,1    10,1,7,10  1,1,1,1    10,8,10,10
                        σvϵ = 0.5      1,1,1,1    4,2,4,10   1,1,1,1    10,8,10,10
                        σvϵ = 0.9      1,1,1,1    3,1,2,4    1,1,1,1    10,8,10,4
MO: mode; LQ: 0.1 quantile; ME: median; UQ: 0.9 quantile.

5. Conclusions

In this paper, we derive approximated MSEs of the GS2SLS estimator and a bias-corrected GS2SLS (CGS2SLS) estimator for the SARAR model in the presence of endogenous variables and many instruments. We propose an instrument selection procedure that minimizes the approximated MSEs. Our Monte Carlo experiments show that the CGS2SLS can effectively correct the many-instrument bias, and that the instrument selection procedure generally makes finite-sample inference more accurate.

Acknowledgements

We are grateful to the editor and one anonymous referee for helpful comments that have improved the presentation of the paper.

Appendix

A. Notations

A^s = A + A′ for a square matrix A.
‖A‖ = [tr(A′A)]^{1/2} is the Frobenius matrix norm of a matrix A.
vec_D(A) is a column vector whose elements are the diagonal elements of a square matrix A.
R_n = R_n(ρ_0) and G_n = G_n(λ_0), where R_n(ρ) = I_n − ρM_n and G_n(λ) = W_n(I_n − λW_n)^{-1}.
y_n(ρ) = R_n(ρ)y_n, Z_{2n}(ρ) = R_n(ρ)Z_{2n}, Z_n(ρ) = R_n(ρ)Z_n and u_n(ρ) = R_n(ρ)u_n.
Z_{2n} = Z̄_{2n} + v_n, where Z̄_{2n} = E(Z_{2n}).
Z_n = [W_ny_n, Z_{2n}] = Z̄_n + ζ_n, where Z̄_n = E(Z_n) = [G_nZ̄_{2n}γ_0, Z̄_{2n}] and ζ_n = [G_nv_nγ_0 + G_nR_n^{-1}ϵ_n, v_n].
P_{K,n} = Q_{K,n}(Q′_{K,n}Q_{K,n})^−Q′_{K,n}, where (Q′_{K,n}Q_{K,n})^− is a generalized inverse of Q′_{K,n}Q_{K,n}.
Γ_{nK,1} = P_{K,n}R_n, Γ_{nK,2} = P_{K,n}R_nG_n and Γ_{nK,3} = P_{K,n}R_nG_nR_n^{-1}.
Δ_{nK,1} = (1/n)tr[Z̄′_n(ρ_0)(I_n − P_{K,n})Z̄_n(ρ_0)] and Δ_{nK,2} = (1/n)tr[Z̄′_nM′_n(I_n − P_{K,n})M_nZ̄_n].
h_n = (1/√n)Z̄′_n(ρ_0)ϵ_n and H_n = (1/n)Z̄′_n(ρ_0)Z̄_n(ρ_0).
For the GS2SLS,
S_n(K) = (1/n) H_n^{-1} [ σ_ϵ² Z̄′_n(ρ_0)(I_n − P_{K,n})Z̄_n(ρ_0) + Ω_{n2}(K) ] H_n^{-1}
where
Ω_{n2}(K) = Υ_n(K)Υ′_n(K)
with
Υ_n(K) = E(ζ′_nR′_nP_{K,n}ϵ_n) = [ tr(Γ_{nK,2})σ′_vϵγ_0 + σ_ϵ² tr(Γ_{nK,3}), tr(Γ_{nK,1})σ′_vϵ ]′
For the CGS2SLS,
S_n(K) = (1/n) H_n^{-1} [ σ_ϵ² Z̄′_n(ρ_0)(I_n − P_{K,n})Z̄_n(ρ_0) + Π_{n1}(K) + Π_{n2}(K) + Π_{n3}(K) ] H_n^{-1}
where Π n 1 ( K ) is a symmetric matrix equal to
Π n 1 ( K ) = γ 0 σ v ϵ σ v ϵ γ 0 tr ( Γ n K , 2 2 ) + σ ϵ 2 γ 0 Σ v γ 0 tr ( Γ n K , 2 Γ n K , 2 ) + σ ϵ 4 tr ( Γ n K , 3 Γ n K , 3 s ) + 2 σ ϵ 2 σ v ϵ γ 0 tr ( Γ n K , 2 Γ n K , 3 s ) σ v ϵ σ v ϵ γ 0 tr ( Γ n K , 1 Γ n K , 2 ) + σ ϵ 2 Σ v γ 0 tr ( Γ n K , 1 Γ n K , 2 ) + σ ϵ 2 σ v ϵ tr ( Γ n K , 3 Γ n K , 1 s )
with the ( 2 , 2 ) th block being σ v ϵ σ v ϵ tr ( Γ n K , 1 2 ) + σ ϵ 2 Σ v tr ( Γ n K , 1 Γ n K , 1 ) ,
Π_{n2}(K) = [Π_{n2,1}(K), Π_{n2,2}(K)]^s − 2σ_ϵ²Ω_{n1}(K)
where
Ω n 1 ( K ) = E ( ζ n R n P K , n R n ζ n ) = γ 0 Σ v γ 0 tr ( Γ n K , 2 Γ n K , 2 ) + σ ϵ 2 tr ( Γ n K , 3 Γ n K , 3 ) + 2 σ v ϵ γ 0 tr ( Γ n K , 3 Γ n K , 2 ) Σ v γ 0 tr ( Γ n K , 1 Γ n K , 2 ) + σ v ϵ tr ( Γ n K , 1 Γ n K , 3 ) Σ v tr ( Γ n K , 1 Γ n K , 1 ) Π n 2 , 1 ( K ) = V 1 n [ σ ϵ 2 tr ( Γ n K , 3 M n R n 1 ) σ ϵ 2 tr ( P K , n M n G n R n 1 ) tr ( P K , n M n G n ) σ v ϵ γ 0 ] + V 2 n [ tr ( Γ n K , 2 G n ) σ v ϵ γ 0 + σ ϵ 2 tr ( Γ n K , 2 G n R n 1 ) , σ v ϵ tr ( Γ n K , 2 ) ] + V 3 n γ 0 tr ( Γ n K , 2 ) + V 4 n tr ( Γ n K , 3 )
and
Π n 2 , 2 ( K ) = V 1 n σ v ϵ tr ( P K , n M n ) + V 3 n tr ( Γ n K , 1 )
with V 1 n = σ ϵ 2 n Z ¯ n ( ρ 0 ) F n V 2 n = σ ϵ 2 I m + 1
V 3 n = σ ϵ 2 n Z ¯ n ( ρ 0 ) Z ¯ 2 n σ ϵ 2 n 2 Z ¯ n ( ρ 0 ) F n σ v ϵ tr ( M n R n 1 ) σ ϵ 2 n ( Z ¯ n R n Z ¯ 2 n + [ Σ v γ 0 tr ( R n G n ) + σ v ϵ tr ( G n ) , Σ v tr ( R n ) ] )
and
V 4 n = 2 σ ϵ 4 n 2 Z ¯ n ( ρ 0 ) F n tr ( M n R n 1 ) 2 σ ϵ 2 n [ tr ( R n G n ) σ v ϵ γ 0 + σ ϵ 2 tr ( G n ) , σ v ϵ tr ( R n ) ]
and
Π n 3 ( K ) = σ ϵ 2 n 2 { Z ¯ n ( ρ 0 ) F n [ E ( u n M n P K , n R n ζ n ) + E ( ϵ n P K , n M n ζ n ) ] } s
with
E ( u n M n P K , n R n ζ n ) = [ σ v ϵ γ 0 tr ( R n 1 M n Γ n K , 2 ) + σ ϵ 2 tr ( R n 1 M n Γ n K , 3 ) , σ v ϵ tr ( R n 1 M n Γ n K , 1 ) ]
and
E ( ϵ n P K , n M n ζ n ) = [ σ v ϵ γ 0 tr ( P K , n M n G n ) + σ ϵ 2 tr ( P K , n M n G n R n 1 ) , σ v ϵ tr ( P K , n M n ) ]
Let V ^ 1 n = σ ^ ϵ 2 n Z n ( ρ ^ n ) F ^ n , V ^ 2 n = σ ^ ϵ 2 I m + 1 ,
V ^ 3 n = σ ^ ϵ 2 n [ Z n ( ρ ^ n ) Z 2 n E ^ ( ζ n R n v n ) ] σ ^ ϵ 2 n 2 Z n ( ρ ^ n ) F ^ n σ ^ v ϵ tr ( M n R ^ n 1 ) σ ^ ϵ 2 n Z n R ^ n Z 2 n
and
V ^ 4 n = 2 σ ^ ϵ 4 n 2 Z n ( ρ ^ n ) F ^ n tr ( M n R ^ n 1 ) 2 σ ^ ϵ 2 n [ tr ( R ^ n G ^ n ) σ ^ v ϵ γ ^ n + σ ^ ϵ 2 tr ( G ^ n ) , σ ^ v ϵ tr ( R ^ n ) ]
where R ^ n = R n ( ρ ^ n ) , G ^ n = G n ( λ ^ n ) , E ^ ( ζ n R n v n ) = [ Σ ^ v γ ^ n tr ( R ^ n G ^ n ) + σ ^ v ϵ tr ( G ^ n ) , Σ ^ v tr ( R ^ n ) ] and F ^ n is an estimator of F n in (7) derived by replacing Z ¯ n by Z n and true parameters by their estimators. An estimator for Π n 2 ( K ) is
Π ^ n 2 ( K ) = [ Π ^ n 2 , 1 ( K ) , Π ^ n 2 , 2 ( K ) ] s 2 σ ^ ϵ 2 Ω ^ n 1 ( K )
where Π ^ n 2 , 1 ( K ) , Π ^ n 2 , 2 ( K ) and Ω ^ n 1 ( K ) are derived respectively from Π n 2 , 1 ( K ) , Π n 2 , 2 ( K ) and Ω n 1 ( K ) by replacing V j n ’s by V ^ j n ’s and the rest of involved parameters by their respective estimators.
An estimator for Π n 3 ( K ) is
Π ^ n 3 ( K ) = σ ^ ϵ 2 n 2 { Z n ( ρ ^ n ) F ^ n [ E ^ ( u n M n P K , n R n ζ n ) + E ^ ( ϵ n P K , n M n ζ n ) ] } s
where E ^ ( ζ n R n P K , n M n u n ) and E ^ ( ζ n M n P K , n ϵ n ) are derived by replacing the parameters in, respectively, E ( ζ n R n P K , n M n u n ) and E ( ζ n M n P K , n ϵ n ) by their estimators.
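To connect these notations to the estimators themselves, the following minimal numpy sketch (ours; all names are hypothetical) computes the GS2SLS estimator and its bias-corrected CGS2SLS version on the transformed equation, assuming an initial consistent ρ̃_n and a plug-in Υ̃_n(K) of Equation (9) are already available; the correction simply subtracts Υ̃_n(K) from the score, as in the decomposition used in the proof of Proposition 6.

```python
import numpy as np

def gs2sls_and_cgs2sls(y, Z, M, P_K, rho_tilde, upsilon_tilde):
    # GS2SLS on the transformed equation y_n(rho) = Z_n(rho) delta + eps_n,
    # and a bias-corrected version that subtracts the plug-in
    # Upsilon_tilde_n(K) from the score
    n = y.shape[0]
    R = np.eye(n) - rho_tilde * M          # R_n(rho_tilde)
    ys, Zs = R @ y, R @ Z                  # y_n(rho_tilde), Z_n(rho_tilde)
    A = Zs.T @ P_K @ Zs
    delta_gs2sls = np.linalg.solve(A, Zs.T @ P_K @ ys)
    delta_cgs2sls = np.linalg.solve(A, Zs.T @ P_K @ ys - upsilon_tilde)
    return delta_gs2sls, delta_cgs2sls
```

Both estimators share the same design matrix Z′_n(ρ̃_n)P_{K,n}Z_n(ρ̃_n); only the score is recentered, which is why the correction leaves the asymptotic variance part of the MSE unchanged.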

B. Lemmas

The following lemma gives sufficient conditions under which Z̄_n(ρ) can be approximated arbitrarily well by a linear combination of Q_{K,n} as n, K → ∞. As the approximation of Z̄_n(ρ) improves with the number of instruments K, the variance part of the MSE becomes smaller.
Lemma 1. 
Suppose that sup_n ‖λ_0W_n‖ < 1, elements of ψ_{q,n} are uniformly bounded constants, and there exists π_{q0} such that ‖Z̄_{2n} − ψ_{q,n}π_{q0}‖ → 0 as n, q → ∞. Then, for F_{K,n} = [ψ_{q,n}, W_nψ_{q,n}, …, W_n^pψ_{q,n}], where p, q → ∞ as n → ∞,
(i) 
there exists π_{K,n} such that (1/n)‖Z̄_n − F_{K,n}π_{K,n}‖² → 0 as n, K → ∞,
(ii) 
(1/n)‖Z̄_n(ρ) − Q_{K,n}π_{Kn,1}‖² ≤ (c/n)‖Z̄_n − F_{K,n}π_{K,n}‖² for some c > 0, where π_{Kn,1} = [π′_{K,n}, −ρπ′_{K,n}]′, which implies that (1/n)‖Z̄_n(ρ) − Q_{K,n}π_{Kn,1}‖² → 0 as n, K → ∞.
Proof. 
(i) is Lemma 2.1 in [3]. The argument is as follows. Let
π_{K,n} =
[ 0                       π_{q0} ]
[ π_{q0}γ_0               0      ]
[ λ_0π_{q0}γ_0            0      ]
[ ⋮                       ⋮      ]
[ λ_0^{p−1}π_{q0}γ_0      0      ]
Then, F_{K,n}π_{K,n} = [W_n Σ_{j=0}^{p−1} λ_0^j W_n^j ψ_{q,n}π_{q0}γ_0, ψ_{q,n}π_{q0}] = [(I_n − λ_0^p W_n^p)G_nψ_{q,n}π_{q0}γ_0, ψ_{q,n}π_{q0}] and
Z̄_n − F_{K,n}π_{K,n} = [λ_0^p W_n^p G_n Z̄_{2n}γ_0 + (I_n − λ_0^p W_n^p)G_n(Z̄_{2n} − ψ_{q,n}π_{q0})γ_0, Z̄_{2n} − ψ_{q,n}π_{q0}]
Thus
‖Z̄_n − F_{K,n}π_{K,n}‖ ≤ ‖λ_0W_n‖^p ‖G_n‖ ‖Z̄_{2n}γ_0‖ + (1 + ‖λ_0W_n‖^p) ‖G_n‖ ‖Z̄_{2n} − ψ_{q,n}π_{q0}‖ ‖γ_0‖ + ‖Z̄_{2n} − ψ_{q,n}π_{q0}‖ → 0,
as n, p, q → ∞. Since (1/n)‖Z̄_n − F_{K,n}π_{K,n}‖² ≤ ‖Z̄_n − F_{K,n}π_{K,n}‖², the result follows.
(ii) Let R′_n(ρ)R_n(ρ) = R_{1n}(ρ)R_{2n}(ρ)R′_{1n}(ρ) be an eigenvalue-eigenvector decomposition, where R_{2n}(ρ) is a diagonal matrix whose diagonal elements are the eigenvalues of R′_n(ρ)R_n(ρ) and R_{1n}(ρ) is an orthonormal matrix whose columns are eigenvectors of R′_n(ρ)R_n(ρ). Then,
(1/n)‖Z̄_n(ρ) − Q_{K,n}π_{Kn,1}‖² = (1/n)‖R_n(ρ)(Z̄_n − F_{K,n}π_{K,n})‖² = (1/n)tr[(Z̄_n − F_{K,n}π_{K,n})′R′_n(ρ)R_n(ρ)(Z̄_n − F_{K,n}π_{K,n})] = (1/n)tr[(Z̄_n − F_{K,n}π_{K,n})′R_{1n}(ρ)R_{2n}(ρ)R′_{1n}(ρ)(Z̄_n − F_{K,n}π_{K,n})] ≤ (r_{n,max}/n)tr[(Z̄_n − F_{K,n}π_{K,n})′(Z̄_n − F_{K,n}π_{K,n})]
where r_{n,max} is the largest eigenvalue of R′_n(ρ)R_n(ρ). By the spectral radius theorem,
r_{n,max} ≤ ‖R′_n(ρ)R_n(ρ)‖ ≤ ‖R′_n(ρ)‖ ‖R_n(ρ)‖ ≤ c
for some c > 0 and all n. Thus (ii) holds.       ☐
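As a small numerical illustration of Lemma 1 (our own sketch; the ring-shaped weights matrix and Gaussian regressors are arbitrary choices), the least-squares fit of Z̄_n = [G_nZ̄_{2n}γ_0, Z̄_{2n}] on F_{K,n} = [Z̄_{2n}, W_nZ̄_{2n}, …, W_n^pZ̄_{2n}] improves geometrically in p, at the rate ‖λ_0W_n‖^p suggested by the proof:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, lam0 = 200, 2, 0.6
gamma0 = np.array([1.0, 0.5])

# Row-normalized ring-shaped weights matrix (arbitrary choice)
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5

X = rng.normal(size=(n, m))                      # plays the role of Zbar_2n
G = W @ np.linalg.inv(np.eye(n) - lam0 * W)      # G_n = W_n (I - lam0 W_n)^{-1}
Zbar = np.column_stack([G @ X @ gamma0, X])      # Zbar_n = [G Zbar_2n gamma0, Zbar_2n]

F = X.copy()
for p in range(1, 7):
    F = np.hstack([F, np.linalg.matrix_power(W, p) @ X])  # F_{K,n} up to W^p X
    resid = Zbar - F @ np.linalg.lstsq(F, Zbar, rcond=None)[0]
    print(p, np.linalg.norm(resid) / np.sqrt(n))  # (1/sqrt(n)) ||Zbar - F pi||
```

With ‖λ_0W_n‖ = 0.6 here, the printed approximation error shrinks by roughly that factor with each additional spatial lag.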
The following lemma, Lemma A.1 in [1], gives conditions on the decomposition of an estimator, such that the dominant component of the MSE depending on the number of instruments can be derived.
Lemma 2. 
For an estimator given by n ( δ ^ n δ 0 ) = H ^ n 1 h ^ n , suppose that there is a decomposition, h ^ n = h n + T n h + Z n h , H ^ n = H n + T n H + Z n H ,
( h n + T n h ) ( h n + T n h ) h n h n H n 1 T n H T n H H n 1 h n h n = A ^ n ( K ) + Z n A ( K )
such that
(i) 
T n h = o P ( 1 ) , h n = O P ( 1 ) , H n = O ( 1 ) ,
(ii) 
the determinant of H n is bounded away from zero,
(iii) 
ρ K , n = tr ( S n ( K ) ) = o ( 1 ) ,
(iv) 
| | T n H | | 2 = o P ( ρ K , n ) , | | T n h | | | | T n H | | = o P ( ρ K , n ) , | | Z n h | | = o P ( ρ K , n ) , | | Z n H | | = o P ( ρ K , n ) , Z n A ( K ) = o P ( ρ K , n ) ,
(v) 
E [ A ^ n ( K ) ] = σ ϵ 2 H n + H n S n ( K ) H n + o ( ρ K , n ) .
Then (15) is satisfied.
Lemma 3. 
Let A n = [ a n , i j ] and B n = [ b n , i j ] be n × n matrices, then
(i) 
E ( ϵ n A n v n ) = σ v ϵ tr ( A n ) ,
(ii) 
E ( ϵ n A n ϵ n ) = σ ϵ 2 tr ( A n ) ,
(iii) 
E ( v n A n v n ) = Σ v tr ( A n ) ,
(iv) 
E ( v n A n ϵ n ϵ n B n v n ) = [ E ( ϵ n i 2 v n i v n i ) 2 σ v ϵ σ v ϵ σ ϵ 2 Σ v ] ve c D ( A n ) ve c D ( B n ) + σ v ϵ σ v ϵ [ tr ( A n ) tr ( B n ) + tr ( A n B n ) ] + σ ϵ 2 Σ v tr ( A n B n ) ,
(v) 
E ( ϵ n A n ϵ n ϵ n B n v n ) = [ E ( ϵ n i 3 v n i ) 3 σ ϵ 2 σ v ϵ ] ve c D ( A n ) ve c D ( B n ) + σ ϵ 2 σ v ϵ [ tr ( A n ) tr ( B n ) + tr ( A n B n s ) ] ,
(vi) 
E ( ϵ n A n ϵ n ϵ n B n ϵ n ) = ( μ 4 3 σ ϵ 4 ) ve c D ( A n ) ve c D ( B n ) + σ ϵ 4 [ tr ( A n ) tr ( B n ) + tr ( A n B n s ) ] ,
(vii) 
E ( v n A n v n v n B n v n ) = [ E ( v n i v n i ) 2 Σ v 2 E ( v n i v n j v n j v n i ) E ( v n i v n j ) 2 ] ve c D ( A n ) ve c D ( B n ) + Σ v 2 tr ( A n ) tr ( B n ) + E ( v n i v n j v n j v n i ) tr ( A n B n ) + E ( v n i v n j ) 2 tr ( A n B n ) .
Proof. 
For (i)–(iii), we only prove (i), as the other two follow similarly; for (iv)–(vii), we only prove (iv) for the same reason.
For (i), ϵ n A n v n = i = 1 n a n , i i ϵ n i v n i + i = 1 n j i a n , i j ϵ n i v n j . As E ( ϵ n i v n i ) = σ v ϵ and E ( ϵ n i v n j ) = 0 for i j , the result follows. For (iv),
E ( v n A n ϵ n ϵ n B n v n ) = i = 1 n j = 1 n r = 1 n s = 1 n a n , i j b n , r s E ( v n i v n s ϵ n j ϵ n r )
where E ( v n i v n s ϵ n j ϵ n r ) 0 in one of the following situations: i = j = r = s ; i = j and r = s , but i r ; i = r and j = s , but i j ; i = s and j = r , but i j . Then
E ( v n A n ϵ n ϵ n B n v n ) = i = 1 n a n , i i b n , i i E ( v n i v n i ϵ n i 2 ) + i = 1 n j i E [ ( a n , i i b n , j j + a n , i j b n , i j ) v n i ϵ n i v n j ϵ n j + a n , i j b n , j i v n i v n i ϵ n j 2 ] = [ E ( ϵ n i 2 v n i v n i ) 2 σ v ϵ σ v ϵ σ ϵ 2 Σ v ] ve c D ( A n ) ve c D ( B n ) + σ v ϵ σ v ϵ [ tr ( A n ) tr ( B n ) + tr ( A n B n ) ] + σ ϵ 2 Σ v tr ( A n B n )
Lemma 4. 
Suppose that n × n matrices { A n } and { B n } are UB, C n = P K , n A n = [ c n , i j ] and D n = P K , n B n = [ d n , i j ] . Then
(i) 
tr ( P K , n ) = K ,
(ii) 
| tr ( C n ) | = O ( K ) , | tr ( C n 2 ) | = O ( K ) and i = 1 n c n , i i 2 = O ( K ) ,
(iii) 
| tr ( C n D n ) | = O ( K ) and i = 1 n c n , i i d n , i i = O ( K ) .
Proof. 
(i) and (ii) are Lemma B.2 in [21]; (iii) By the Cauchy–Schwarz inequality,
tr 2 ( C n D n ) tr ( C n C n ) tr ( D n D n )
and
( i = 1 n c n , i i d n , i i ) 2 i = 1 n c n , i i 2 i = 1 n d n , i i 2
where tr ( C n C n ) = tr ( P K , n A n A n ) and tr ( D n D n ) = tr ( P K , n B n B n ) , thus the results follow by (ii). ☐
Lemma 5. 
Suppose that { A n } and { B n } are n × n matrices that are UB and C n = A n P K , n B n , then
(i) 
1 n ϵ n A n ϵ n = O P ( 1 ) , 1 n ϵ n A n v n = O P ( 1 ) , and 1 n v n A n v n = O P ( 1 ) ;
(ii) 
1 n [ ϵ n A n ϵ n E ( ϵ n A n ϵ n ) ] = O P ( 1 ) , 1 n [ ϵ n A n v n E ( ϵ n A n v n ) ] = O P ( 1 ) , and 1 n [ v n A n v n E ( v n A n v n ) ] = O P ( 1 ) ;
(iii) 
1 n [ ϵ n C n ϵ n E ( ϵ n C n ϵ n ) ] = O P ( K / n ) , 1 n [ ϵ n C n v n E ( ϵ n C n v n ) ] = O P ( K / n ) , and 1 n [ v n C n v n E ( v n C n v n ) ] = O P ( K / n ) .
Proof. 
All the results follow by Chebyshev’s inequality and Lemmas 3–4. We only prove the last result in (iii). Let e i be the ith column of the m × m identity matrix. Then the variance of the ( i , j ) th element of [ v n C n v n E ( v n C n v n ) ] is 1 n E { e i [ v n C n v n E ( v n C n v n ) ] e j e j [ v n C n v n E ( v n C n v n ) ] e i } , which is smaller than or equal to
1 n E { e i [ v n C n v n E ( v n C n v n ) ] [ v n C n v n E ( v n C n v n ) ] e i } = 1 n e i [ E ( v n C n v n v n C n v n ) E ( v n C n v n ) E ( v n C n v n ) ] e i = O ( K / n )
by Lemmas 3–4. Thus, the ( i , j ) th element of 1 n [ v n C n v n E ( v n C n v n ) ] is O P ( K / n ) by Chebyshev’s inequality. The result follows as i and j are arbitrary.     ☐
Lemma 6. 
Suppose that { A n } is a sequence of n × n matrices that are bounded in the column sum matrix norm, the elements of the n × k matrix C n are uniformly bounded, and ϵ n i ’s in ϵ n = ( ϵ n 1 , , ϵ n n ) are i.i.d. with zero mean and finite variance σ ϵ 2 . Then 1 n C n A n ϵ n = O P ( 1 ) .
Furthermore, if the limit of 1 n C n A n A n C n exists and is positive definite, then
1 n C n A n ϵ n d N ( 0 , lim n σ ϵ 2 n C n A n A n C n )
Proof. 
See [22].      ☐
The following two lemmas show the orders of relevant terms in deriving the decompositions for the GS2SLS and CGS2SLS estimators.
Lemma 7. 
Suppose that n × n matrices { A n } are UB, then
(i) 
Δ n K , 1 = 1 n tr [ Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) ] = o ( 1 ) ,
(ii) 
Δ n K , 2 = 1 n tr [ Z ¯ n M n ( I n P K , n ) M n Z ¯ n ] = o ( 1 ) ,
(iii) 
1 n | | Z ¯ n M n ( I n P K , n ) Z ¯ n ( ρ 0 ) | | = O ( Δ n K , 1 Δ n K , 2 ) and 1 n Z ¯ n M n ( I n P K , n ) A n Z ¯ n ( ρ 0 ) = O ( Δ n K , 2 ) ,
(iv) 
1 n Z ¯ n ( ρ 0 ) ( I n P K , n ) A n ϵ n = O P ( Δ n K , 1 1 / 2 ) , 1 n Z ¯ n ( ρ 0 ) ( I n P K , n ) A n v n = O P ( Δ n K , 1 1 / 2 ) , 1 n Z ¯ n M n ( I n P K , n ) A n ϵ n = O P ( Δ n K , 2 1 / 2 ) and 1 n Z ¯ n M n ( I n P K , n ) A n v n = O P ( Δ n K , 2 1 / 2 ) ,
(v) 
1 n tr [ Z ¯ n M n ( I n P K , n ) A n A n ( I n P K , n ) M n Z ¯ n ] = O ( Δ n K , 2 ) ,
(vi) 
Δ n K , 1 / n = o ( K / n + Δ n K , 1 ) , Δ n K , 2 / n = o ( K / n + Δ n K , 2 ) , K Δ n K , 1 / n = o ( K 2 / n + Δ n K , 1 ) and K Δ n K , 2 / n = o ( K 2 / n + Δ n K , 2 ) .
Proof. 
(i) By Assumption 6, there exists π K , n such that 1 n | | Z ¯ n ( ρ 0 ) Q K , n π K , n | | 2 0 as n , K . Then
Δ n K , 1 = 1 n tr [ ( Z ¯ n ( ρ 0 ) Q K , n π K , n ) ( I n P K , n ) ( Z ¯ n ( ρ 0 ) Q K , n π K , n ) ] 1 n tr [ ( Z ¯ n ( ρ 0 ) Q K , n π K , n ) ( Z ¯ n ( ρ 0 ) Q K , n π K , n ) ] = 1 n | | Z ¯ n ( ρ 0 ) Q K , n π K , n | | 2 = o ( 1 )
(ii) As ρ M n Z ¯ n = Z ¯ n ( 0 ) Z ¯ n ( ρ ) , there exists π n K , 1 such that 1 n | | M n Z ¯ n Q K , n π n K , 1 | | 2 0 as n , K . Then (ii) holds by an argument similar to that for (i).
(iii) By the Cauchy–Schwarz inequality,
| 1 n e i Z ¯ n M n ( I n P K , n ) Z ¯ n ( ρ 0 ) e j | 2 1 n e i Z ¯ n M n ( I n P K , n ) M n Z ¯ n e i · 1 n e j Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) e j Δ n K , 1 Δ n K , 2
where e i denotes the ith column of the ( m + 1 ) × ( m + 1 ) identity matrix. Thus the first result follows. The second result in (iii) follows by
| 1 n e i Z ¯ n M n ( I n P K , n ) A n Z ¯ n ( ρ 0 ) e j | 2 1 n e i Z ¯ n M n ( I n P K , n ) M n Z ¯ n e i · 1 n e j Z ¯ n ( ρ 0 ) A n A n Z ¯ n ( ρ 0 ) e j = O ( Δ n K , 2 )
(iv) By Chebyshev’s inequality,
P ( | 1 n e i Z ¯ n ( ρ 0 ) ( I n P K , n ) A n ϵ n | > η ) σ ϵ 2 n η 2 e i Z ¯ n ( ρ 0 ) ( I n P K , n ) A n A n ( I n P K , n ) Z ¯ n ( ρ 0 ) e i
for some η > 0 . Let A n A n = A 1 n A 2 n A 1 n , where A 1 n is an orthonormal matrix whose columns are A n A n ’s eigenvectors and A 2 n is a diagonal matrix with the diagonal elements being A n A n ’s eigenvalues. Then
1 n e i Z ¯ n ( ρ 0 ) ( I n P K , n ) A n A n ( I n P K , n ) Z ¯ n ( ρ 0 ) e i = 1 n e i Z ¯ n ( ρ 0 ) ( I n P K , n ) A 1 n A 2 n A 1 n ( I n P K , n ) Z ¯ n ( ρ 0 ) e i 1 n ι n e i Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) e i 1 n | | A n A n | | e i Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) e i = O ( Δ n K , 1 )
where ι n is the largest eigenvalue of A n A n and the last inequality follows by the spectral radius theorem. Thus 1 n e i Z ¯ n ( ρ 0 ) ( I n P K , n ) A n ϵ n = O P ( Δ n K , 1 1 / 2 ) and 1 n Z ¯ n ( ρ 0 ) ( I n P K , n ) A n ϵ n = O P ( Δ n K , 1 1 / 2 ) . Other results follow similarly.
(v) Use the expression A n A n = A 1 n A 2 n A 1 n as in the proof of (iv), then
1 n tr [ Z ¯ n M n ( I n P K , n ) A n A n ( I n P K , n ) M n Z ¯ n ] 1 n ι n tr [ Z ¯ n M n ( I n P K , n ) A 1 n A 1 n ( I n P K , n ) M n Z ¯ n ] = 1 n ι n tr [ Z ¯ n M n ( I n P K , n ) M n Z ¯ n ] = O ( Δ n K , 2 )
(vi) The first two results are Lemma A.3 (vi) in [1]. For the third result, either Δ n K , 1 = 0 , in which case K Δ n K , 1 / n / ( K 2 / n + Δ n K , 1 ) = 0 , or K Δ n K , 1 / n / ( K 2 / n + Δ n K , 1 ) = 1 K 2 / n K Δ n K , 1 + n Δ n K , 1 / K 1 2 K 0 , by the inequality of arithmetic and geometric means. Thus the result follows. The last result follows similarly.     ☐
Lemma 8. 
With Δ n K , 1 and Δ n K , 2 defined, respectively, in Lemma 7 (i) and (ii),
(i) 
1 n Z n ( ρ 0 ) P K , n Z n ( ρ 0 ) = H n + T 1 n H + T 2 n H + T 3 n H + T 4 n H ,
where
(a) 
H n = 1 n Z ¯ n ( ρ 0 ) Z ¯ n ( ρ 0 ) = O ( 1 ) ,
(b) 
T 1 n H = 1 n Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) = O ( Δ n K , 1 ) ,
(c) 
T 2 n H = 1 n [ Z ¯ n ( ρ 0 ) R n ζ n + ζ n R n Z ¯ n ( ρ 0 ) ] = O P ( n 1 / 2 ) ,
(d) 
T 3 n H = 1 n ζ n R n P K , n R n ζ n = O P ( K / n ) and
(e) 
T 4 n H = 1 n [ Z ¯ n ( ρ 0 ) ( I n P K , n ) R n ζ n + ζ n R n ( I n P K , n ) Z ¯ n ( ρ 0 ) ] = O P ( Δ n K , 1 / n ) = o P ( K / n + Δ n K , 1 ) .
(ii) 
1 n Z n ( ρ 0 ) P K , n M n Z n = 1 n Z ¯ n ( ρ 0 ) M n Z ¯ n + O ( Δ n K , 1 Δ n K , 2 ) + O P ( n 1 / 2 ) + O P ( K / n ) + o P ( K / n + Δ n K , 1 ) + O P ( Δ n K , 2 / n ) .
(iii) 
1 n Z n M n P K , n M n Z n = 1 n Z ¯ n M n M n Z ¯ n + O ( Δ n K , 2 ) + O P ( n 1 / 2 ) + O P ( K / n ) + O P ( Δ n K , 2 / n ) .
(iv) 
1 n Z n ( ρ 0 ) P K , n ϵ n = h n + T 1 n h + T 2 n h ,
where h n = 1 n Z ¯ n ( ρ 0 ) ϵ n = O P ( 1 ) , T 1 n h = 1 n Z ¯ n ( ρ 0 ) ( I n P K , n ) ϵ n = O P ( Δ n K , 1 1 / 2 ) and T 2 n h = 1 n ζ n R n P K , n ϵ n = O P ( K / n ) .
(v) 
1 n Z n ( ρ 0 ) P K , n M n u n = 1 n Z ¯ n ( ρ 0 ) M n u n + O P ( Δ n K , 1 1 / 2 ) + 1 n ζ n R n P K , n M n u n ,
where 1 n ζ n R n P K , n M n u n = O P ( K / n ) .
(vi) 
1 n Z n M n P K , n ϵ n = 1 n Z ¯ n M n ϵ n 1 n Z ¯ n M n ( I n P K , n ) ϵ n + 1 n ζ n M n P K , n ϵ n ,
where 1 n Z ¯ n M n ( I n P K , n ) ϵ n = O P ( Δ n K , 2 1 / 2 ) and 1 n ζ n M n P K , n ϵ n = O P ( K / n ) .
(vii) 
1 n Z n M n P K , n M n u n = 1 n Z ¯ n M n M n u n + O P ( Δ n K , 2 1 / 2 ) + O P ( K / n ) .
(viii) 
1 n [ ζ n R n P K , n R n ζ n E ( ζ n R n P K , n R n ζ n ) ] = O P ( K / n ) and 1 n [ ζ n A n ϵ n E ( ζ n A n ϵ n ) ] = O P ( K / n ) ,
where A n = M n P K , n , R n P K , n or R n P K , n M n R n 1 .
Proof. 
(i) Because Z n = Z ¯ n + ζ n , we have the decomposition that 1 n Z n ( ρ 0 ) P K , n Z n ( ρ 0 ) = H n + T 1 n H + T 2 n H + T 3 n H + T 4 n H . Since elements of Z ¯ n are uniformly bounded, H n = O ( 1 ) . By Lemma 7 (i), T 1 n H = O ( Δ n K , 1 ) . By Lemma 6, T 2 n H = O P ( n 1 / 2 ) . By Lemmas 3 and 4, E ( T 3 n H ) = O ( K / n ) , and, hence, T 3 n H = O P ( K / n ) by Markov’s inequality. By Lemma 7 (iv) and (vi), T 4 n H = O P ( Δ n K , 1 / n ) = o P ( K / n + Δ n K , 1 ) .
(ii) and (iii) follow similarly to (i).
(iv) Because Z n = Z ¯ n + ζ n , we have 1 n Z n ( ρ 0 ) P K , n u n = h n + T 1 n h + T 2 n h . By Lemma 6, h n = O P ( 1 ) . By Lemma 7 (iv), T 1 n h = O P ( Δ n K , 1 1 / 2 ) . As ζ n = [ G n v n γ 0 + G n R n 1 ϵ n , v n ] , by Lemmas 3 and 4, E ( T 2 n h T 2 n h ) = O ( K 2 / n ) , and, hence, T 2 n h = O P ( K / n ) by Chebyshev’s inequality.
(v), (vi) and (vii) follow similarly to (iv). (viii) follows directly by Lemma 5.  ☐
The following lemma shows the orders of some expectation terms, which are helpful in determining the approximated MSEs of the GS2SLS and CGS2SLS estimators.
Lemma 9. 
(i) E [ ( ϵ n D n ϵ n + F n ϵ n ) ϵ n ϵ n ] is UB, where D n and F n are given in Proposition 2.
(ii) 
Let the elements of n × n matrices { A n } be uniformly bounded, then elements of E ( ϵ n ϵ n A n v n ) and E ( ϵ n ϵ n A n ϵ n ) are uniformly bounded.
(iii) 
Let the elements of n × m matrices { B n } be uniformly bounded, where m is a finite fixed number, then E ( ϵ n ϵ n B n v n ) is UB.
(iv) 
Let the elements of n-dimensional vectors { C n } be uniformly bounded, then E ( ϵ n ϵ n C n ϵ n ) is UB.
Proof. 
(i) Let D n = [ d n , i j ] and F n = [ f n 1 , , f n n ] . The ( i , i ) th element of E [ ( ϵ n D n ϵ n + F n ϵ n ) ϵ n ϵ n ] is
E ( ϵ n i 4 ) d n , i i + μ 3 f n i + σ ϵ 4 [ tr ( D n ) d n , i i ] = [ E ( ϵ n i 4 ) σ ϵ 4 ] d n , i i + μ 3 f n i
and the ( i , j ) th element for i j is σ ϵ 4 ( d n , i j + d n , j i ) . Since D n is UB and elements of F n are uniformly bounded, E [ ( ϵ n D n ϵ n + F n ϵ n ) ϵ n ϵ n ] is UB.
(ii) Let A n = [ a n , i j ] , then the ith row of E ( ϵ n ϵ n A n v n ) is r = 1 n s = 1 n a n , r s E ( ϵ n i ϵ n r v n s ) = a n , i i E ( ϵ n i 2 v n i ) . Thus elements of E ( ϵ n ϵ n A n v n ) are uniformly bounded. Similarly, elements of E ( ϵ n ϵ n A n ϵ n ) are uniformly bounded.
(iii) Let B n = [ b n 1 , , b n n ] , then the ( i , j ) th element of E ( ϵ n ϵ n B n v n ) is l = 1 n b n l E ( ϵ n l ϵ n i v n j ) , which is equal to b n i E ( ϵ n i 2 v n i ) when i = j and 0 otherwise. Thus E ( ϵ n ϵ n B n v n ) is UB.
(iv) follows similarly to (iii).     ☐
Lemma 10. 
The sequence of matrices { ( I n λ W n ) 1 } is UB in a neighborhood of λ 0 and { R n 1 ( ρ ) } is UB in a neighborhood of ρ 0 .
Proof. 
See [22].      ☐
The following lemma shows the dominant components of estimation errors for parameters of the model or for Υ n ( K ) in (9), which help to derive the approximated MSE of the CGS2SLS estimator.
Lemma 11. 
Let L 1 n = 1 n ( ϵ n D n ϵ n + F n ϵ n ) as in Proposition 2, then
(i) 
the GS2SLS estimator δ ˜ n = [ Z n ( ρ ˜ n ) P 0 , n Z n ( ρ ˜ n ) ] 1 Z n ( ρ ˜ n ) P 0 , n y n ( ρ ˜ n ) satisfies n ( δ ˜ n δ 0 ) = L 2 n + o P ( 1 ) , where L 2 n = [ 1 n Z ¯ n ( ρ 0 ) P 0 , n Z ¯ n ( ρ 0 ) ] 1 1 n Z ¯ n ( ρ 0 ) P 0 , n ϵ n ,
(ii) 
n ( σ ˜ v ϵ σ v ϵ ) = L 3 n + o P ( 1 ) , where
L_{3n} = √n((1/n)v′_nϵ_n − σ_vϵ) + (1/√n)Z̄′_{2n}ϵ_n − (σ_vϵ/n)tr(M_nR_n^{-1})L_{1n} − (1/n)[Z̄′_{2n}R_nZ̄_n + E(v′_nR_nζ_n)]L_{2n}
with E ( v n R n ζ n ) = [ tr ( R n G n ) Σ v γ 0 + tr ( G n ) σ v ϵ , Σ v tr ( R n ) ] ,
(iii) 
n ( σ ˜ ϵ 2 σ ϵ 2 ) = L 4 n + o P ( 1 ) , where
L_{4n} = √n((1/n)ϵ′_nϵ_n − σ_ϵ²) − (2σ_ϵ²/n)tr(M_nR_n^{-1})L_{1n} − (2/n)E(ϵ′_nR_nζ_n)L_{2n}
with E ( ϵ n R n ζ n ) = [ tr ( R n G n ) σ v ϵ γ 0 + tr ( G n ) σ ϵ 2 , σ v ϵ tr ( R n ) ]
(iv) 
1 n [ Υ ˜ n ( K ) Υ n ( K ) ] = 1 n ( a 1 , a 2 ) + o P ( K / n ) = O P ( K / n ) , where
a 1 = [ σ ϵ 2 tr ( Γ n K , 3 M n R n 1 ) σ ϵ 2 tr ( P K , n M n G n R n 1 ) tr ( P K , n M n G n ) σ v ϵ γ 0 ] L 1 n + [ tr ( Γ n K , 2 G n ) σ v ϵ γ 0 + σ ϵ 2 tr ( Γ n K , 2 G n R n 1 ) , σ v ϵ tr ( Γ n K , 2 ) ] L 2 n + tr ( Γ n K , 2 ) γ 0 L 3 n + tr ( Γ n K , 3 ) L 4 n
and a 2 = σ v ϵ tr ( P K , n M n ) L 1 n + tr ( Γ n K , 1 ) L 3 n .
Proof. 
(i) The δ ˜ n satisfies
n ( δ ˜ n δ 0 ) = [ 1 n Z n ( ρ ˜ n ) Q 0 , n ( 1 n Q 0 , n Q 0 , n ) 1 n Q 0 , n Z n ( ρ ˜ n ) ] 1 1 n Z n ( ρ ˜ n ) Q 0 , n ( 1 n Q 0 , n Q 0 , n ) 1 1 n Q 0 , n u n ( ρ ˜ n ) .
Note that 1 n Q 0 , n u n ( ρ ˜ n ) = 1 n Q 0 , n ϵ n + 1 n Q 0 , n M n R n 1 ϵ n n ( ρ 0 ρ ˜ n ) and
1 n Q 0 , n Z n ( ρ ˜ n ) = 1 n Q 0 , n Z ¯ n ( ρ 0 ) + 1 n Q 0 , n R n ζ n + [ 1 n Q 0 , n M n Z ¯ n + 1 n Q 0 , n M n ζ n ] ( ρ 0 ρ ˜ n ) .
By Lemma 6, 1 n Q 0 , n R n ζ n = o P ( 1 ) , 1 n Q 0 , n M n ζ n = o P ( 1 ) and 1 n Q 0 , n M n R n 1 ϵ n = o P ( 1 ) . By Proposition 2, n ( ρ ˜ n ρ 0 ) = L 1 n + o P ( 1 ) = O P ( 1 ) . Furthermore, 1 n Q 0 , n Z ¯ n ( ρ 0 ) = O ( 1 ) and 1 n Q 0 , n M n Z ¯ n = O ( 1 ) . Thus
n ( δ ˜ n δ 0 ) = [ 1 n Z ¯ n ( ρ 0 ) P 0 , n Z ¯ n ( ρ 0 ) ] 1 1 n Z ¯ n ( ρ 0 ) P 0 , n ϵ n + o P ( 1 )
(ii) Write σ ˜ v ϵ as
σ ˜ v ϵ = 1 n Z 2 n [ R n + ( ρ 0 ρ ˜ n ) M n ] [ u n + Z n ( δ 0 δ ˜ n ) ] = 1 n Z 2 n ϵ n + 1 n Z 2 n M n u n ( ρ 0 ρ ˜ n ) + 1 n Z 2 n R n Z n ( δ 0 δ ˜ n ) + 1 n Z 2 n M n Z n ( δ 0 δ ˜ n ) ( ρ 0 ρ ˜ n )
By Lemmas 5 and 6,
(a)
1 n Z 2 n M n u n = 1 n E ( v n M n u n ) + o P ( 1 ) = σ v ϵ n tr ( M n R n 1 ) + o P ( 1 ) = O P ( 1 ) ,
(b)
1 n Z 2 n R n Z n = 1 n E ( Z 2 n R n Z n ) + o P ( 1 ) with E ( Z 2 n R n Z n ) = Z ¯ 2 n R n Z ¯ n + [ tr ( R n G n ) Σ v γ 0 + tr ( G n ) σ v ϵ , Σ v tr ( R n ) ] = O ( n ) , and
(c)
1 n Z 2 n M n Z n = O P ( 1 ) .
Then
n ( σ ˜ v ϵ σ v ϵ ) = n ( 1 n v n ϵ n σ v ϵ ) + 1 n Z ¯ 2 n ϵ n + σ v ϵ n tr ( M n R n 1 ) n ( ρ 0 ρ ˜ n ) + 1 n E ( Z 2 n R n Z n ) n ( δ 0 δ ˜ n ) + o P ( 1 )
The result follows as n ( ρ ˜ n ρ 0 ) = L 1 n + o P ( 1 ) by Proposition 2 and n ( δ ˜ n δ 0 ) = L 2 n + o P ( 1 ) by (i).
(iii) Note that
R n ( ρ ˜ n ) ( y n Z n δ ˜ n ) = [ R n + ( ρ 0 ρ ˜ n ) M n ] [ u n + Z n ( δ 0 δ ˜ n ) ] = ϵ n + ( ρ 0 ρ ˜ n ) M n u n + R n Z n ( δ 0 δ ˜ n ) + M n Z n ( δ 0 δ ˜ n ) ( ρ 0 ρ ˜ n )
then by an argument similar to that for (ii),
n ( σ ˜ ϵ 2 σ ϵ 2 ) = n ( 1 n ϵ n ϵ n σ ϵ 2 ) + 2 n ϵ n M n R n 1 ϵ n n ( ρ 0 ρ ˜ n ) + 2 n ϵ n R n Z n n ( δ 0 δ ˜ n ) + o P ( 1 )
where
(a)
1 n ϵ n M n R n 1 ϵ n = 1 n E ( ϵ n M n R n 1 ϵ n ) + o P ( 1 ) with E ( ϵ n M n R n 1 ϵ n ) = σ ϵ 2 tr ( M n R n 1 ) = O ( n ) , and
(b)
1 n ϵ n R n Z n = 1 n E ( ϵ n R n ζ n ) + o P ( 1 ) with E ( ϵ n R n ζ n ) = [ tr ( R n G n ) σ v ϵ γ 0 + tr ( G n ) σ ϵ 2 , σ v ϵ tr ( R n ) ] = O ( n ) .
The result follows by using the expressions for n ( δ ˜ n δ 0 ) and n ( ρ ˜ n ρ 0 ) .
(iv) By the mean value theorem,
1 n [ Υ ˜ n ( K ) Υ n ( K ) ] = 1 n tr ( P K , n R ¨ n G ¨ n 2 ) σ ¨ v ϵ γ ¨ n + σ ¨ ϵ 2 tr ( P K , n R ¨ n G ¨ n 2 R ¨ n 1 ) σ ¨ v ϵ tr ( P K , n R ¨ n G ¨ n ) 0 0 n ( δ ˜ n δ 0 ) + 1 n tr ( P K , n M n G ¨ n ) σ ¨ v ϵ γ ¨ n + σ ¨ ϵ 2 [ tr ( P K , n M n G ¨ n R ¨ n 1 ) + tr ( P K , n R ¨ n G ¨ n R ¨ n 1 M n R ¨ n 1 ) ] σ ¨ v ϵ tr ( P K , n M n ) n ( ρ ˜ n ρ 0 ) + 1 n tr ( P K , n R ¨ n G ¨ n R ¨ n 1 ) 0 n ( σ ˜ ϵ 2 σ ϵ 2 ) + 1 n tr ( P K , n R ¨ n G ¨ n ) γ ¨ n tr ( P K , n R ¨ n ) I m n ( σ ˜ v ϵ σ v ϵ )
where σ ¨ v ϵ is between σ ˜ v ϵ and σ v ϵ , σ ¨ ϵ 2 is between σ ˜ ϵ 2 and σ ϵ 2 , γ ¨ n is between γ ˜ n and γ 0 , R ¨ n = R n ( ρ ¨ n ) with ρ ¨ n being between ρ 0 and ρ ˜ n , and G ¨ n = G n ( λ ¨ n ) with λ ¨ n being between λ 0 and λ ˜ n . Let tr ( P K , n A ¨ n ) stand for a trace term that appeared in the above equation and tr ( P K , n A n ) be the term evaluated at the true δ 0 and ρ 0 . Using the mean value theorem once again, then 1 n [ tr ( P K , n A ¨ n ) tr ( P K , n A n ) ] = o P ( K / n ) by Lemmas 10 and 4. Thus by (ii), (iii) and Propositions 1 and 2, 1 n [ Υ ˜ n ( K ) Υ n ( K ) ] = 1 n ( a 1 , a 2 ) + o P ( K / n ) , where a 1 = O P ( K ) and a 2 = O P ( K ) .       ☐
The following lemma, Lemma A.9 in [1], gives a sufficient condition that the chosen K by the minimization of S ^ n , ξ ( K ) , say K ^ , is asymptotically optimal.
Lemma 12. 
If sup K | S ^ n , ξ ( K ) S n , ξ ( K ) | S n , ξ ( K ) p 0 , then S n , ξ ( K ^ ) inf K S n , ξ ( K ) p 1 .
The following is a central limit theorem for linear-quadratic forms of disturbances from [17].
Lemma 13. 
Suppose that { A n = [ a n , i j ] } is a sequence of symmetric n × n matrices that are UB, b n , K = ( b n K , 1 , , b n n ) is a vector such that sup n n 1 i = 1 n | b n i | 2 + η 1 < for some η 1 > 0 , and ϵ n i ’s in ϵ n = ( ϵ n 1 , , ϵ n n ) are mutually independent, with mean zero, variance σ n i 2 and finite moment of order higher than four such that E ( | ϵ n i | 4 + η 2 ) for some η 2 > 0 are uniformly bounded for all n and i. Let σ Q n 2 be the variance of Q n where Q n = ϵ n A n ϵ n + b n ϵ n i = 1 n a n , i i σ n i 2 . Assume that σ Q n 2 / n is bounded away from zero.
Then, Q n / σ Q n d N ( 0 , 1 ) .
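As a quick simulation check of Lemma 13 (our own sketch; the banded A_n, the vector b_n and Gaussian disturbances are arbitrary choices), the standardized form Q_n/σ_{Q_n} is already close to N(0, 1) at moderate n; for Gaussian ϵ with variance σ², the variance σ_{Q_n}² reduces to 2σ⁴ tr(A_n²) + σ² b′_n b_n:

```python
import numpy as np

rng = np.random.default_rng(1)
n, sig2, reps = 300, 1.5, 5000

# A symmetric, uniformly bounded (banded) matrix and a bounded vector
A = np.zeros((n, n))
idx = np.arange(n - 1)
A[idx, idx + 1] = A[idx + 1, idx] = 0.5
np.fill_diagonal(A, 0.3)
b = np.cos(np.arange(n))

# For Gaussian eps: Var(Q_n) = 2 sig^4 tr(A^2) + sig^2 b'b
sigQ = np.sqrt(2 * sig2**2 * np.trace(A @ A) + sig2 * b @ b)

draws = np.empty(reps)
for r in range(reps):
    eps = rng.normal(scale=np.sqrt(sig2), size=n)
    Q = eps @ A @ eps + b @ eps - sig2 * np.trace(A)  # centered LQ form
    draws[r] = Q / sigQ

print(draws.mean(), draws.std())  # approximately 0 and 1
```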

C. Proofs

Proof of Proposition 1. 
As δ ˇ n = δ 0 + ( Z n P F n Z n ) 1 Z n P F n R n 1 ϵ n ,
n ( δ ˇ n δ 0 ) = [ 1 n Z n F 0 , n ( 1 n F 0 , n F 0 , n ) 1 1 n F 0 , n Z n ] 1 1 n Z n F 0 , n ( 1 n F 0 , n F 0 , n ) 1 1 n F 0 , n R n 1 ϵ n .
By Lemma 6, 1 n F 0 , n ζ n = O P ( n 1 / 2 ) and 1 n F 0 , n R n 1 ϵ n d N ( 0 , lim n σ ϵ 2 n F 0 , n R n 1 R n 1 F 0 , n ) . Hence,
n ( δ ˇ n δ 0 ) = ( 1 n Z ¯ n P F n Z ¯ n ) 1 1 n Z ¯ n P F n R n 1 ϵ n + O P ( n 1 / 2 ) d N 0 , lim n ( 1 n Z ¯ n P F n Z ¯ n ) 1 σ ϵ 2 n Z ¯ n P F n R n 1 R n 1 P F n Z ¯ n ( 1 n Z ¯ n P F n Z ¯ n ) 1
by Slutsky’s lemma.     ☐
Proof of Proposition 2. 
The consistency of ρ ˜ n follows from the uniform convergence that
g n ( ρ , δ ˇ n ) g n ( ρ , δ ˇ n ) E g n ( ρ , δ 0 ) E g n ( ρ , δ 0 ) = o P ( 1 ) uniformly in ρ [ a , a ]
and the identification uniqueness condition ([23] [Theorem 3.4]).
To prove the uniform convergence, we first show that g n ( ρ , δ ˇ n ) g n ( ρ , δ 0 ) = o P ( 1 ) uniformly in ρ [ a , a ] . As ϵ n ( ρ , δ ˇ n ) = R n ( ρ ) ( y n Z n δ ˇ n ) = R n ( ρ ) [ u n + Z n ( δ 0 δ ˇ n ) ] ,
1 2 n ϵ n ( ρ , δ ˇ n ) D n j s ϵ n ( ρ , δ ˇ n ) 1 2 n ϵ n ( ρ , δ 0 ) D n j s ϵ n ( ρ , δ 0 ) = 1 n ϵ n ( ρ , δ 0 ) D n j s R n ( ρ ) Z n ( δ 0 δ ˇ n ) + 1 2 n ( δ 0 δ ˇ n ) Z n R n ( ρ ) D n j s R n ( ρ ) Z n ( δ 0 δ ˇ n )
Note that Z n = Z ¯ n + ζ n , 1 n ϵ n ( ρ , δ 0 ) D n j s R n ( ρ ) Z n = O P ( 1 ) and 1 n Z n R n ( ρ ) D n j s R n ( ρ ) Z n = O P ( 1 ) by Lemmas 5 and 6. Then 1 2 n ϵ n ( ρ , δ ˇ n ) D n j s ϵ n ( ρ , δ ˇ n ) 1 2 n ϵ n ( ρ , δ 0 ) D n j s ϵ n ( ρ , δ 0 ) = o P ( 1 ) , as δ ˇ n δ 0 = o P ( 1 ) . Since g n ( ρ , δ ) is quadratic in ρ, it follows that
g n ( ρ , δ ˇ n ) g n ( ρ , δ 0 ) = o P ( 1 ) uniformly in ρ [ a , a ]
By Lemma 5, g n ( ρ , δ 0 ) E g n ( ρ , δ 0 ) = o P ( 1 ) uniformly in ρ [ a , a ] . Thus,
g n ( ρ , δ ˇ n ) E g n ( ρ , δ 0 ) = o P ( 1 ) uniformly in ρ [ a , a ]
Furthermore, E g n ( ρ , δ 0 ) = O ( 1 ) uniformly in ρ [ a , a ] . Hence,
g n ( ρ , δ ˇ n ) g n ( ρ , δ ˇ n ) E g n ( ρ , δ 0 ) E g n ( ρ , δ 0 ) = 2 [ g n ( ρ , δ ˇ n ) E g n ( ρ , δ 0 ) ] E g n ( ρ , δ 0 ) + [ g n ( ρ , δ ˇ n ) E g n ( ρ , δ 0 ) ] [ g n ( ρ , δ ˇ n ) E g n ( ρ , δ 0 ) ] = o P ( 1 )
uniformly in ρ [ a , a ] .
We now show that the identification uniqueness condition holds. Note that E g n ( ρ , δ 0 ) = σ ϵ 2 2 Ξ n [ ( ρ 0 ρ ) , ( ρ 0 ρ ) 2 ] . Let τ n 1 and τ n 2 be the eigenvalues of Ξ n Ξ n . Write Ξ n Ξ n = Ξ 1 n Ξ 2 n Ξ 1 n , where Ξ 2 n is a 2 × 2 diagonal matrix with diagonal elements τ n 1 and τ n 2 , and Ξ 1 n is an orthonormal matrix containing the eigenvectors of Ξ n Ξ n . By Assumption 5, there exists some constant η > 0 such that τ n 1 > η and τ n 2 > η for all n. Obviously, E g n ( ρ 0 , δ 0 ) E g n ( ρ 0 , δ 0 ) = 0 . Then,
E g n ( ρ , δ 0 ) E g n ( ρ , δ 0 ) E g n ( ρ 0 , δ 0 ) E g n ( ρ 0 , δ 0 ) = σ ϵ 4 4 [ ρ 0 ρ , ( ρ 0 ρ ) 2 ] Ξ 1 n Ξ 2 n Ξ 1 n [ ρ 0 ρ , ( ρ 0 ρ ) 2 ] η σ ϵ 4 4 [ ρ 0 ρ , ( ρ 0 ρ ) 2 ] Ξ 1 n Ξ 1 n [ ρ 0 ρ , ( ρ 0 ρ ) 2 ] = η σ ϵ 4 4 [ ( ρ 0 ρ ) 2 + ( ρ 0 ρ ) 4 ] > 0
for any ρ ρ 0 . Thus the identification uniqueness condition holds.
The consistency of ρ ˜ n follows from the uniform convergence and identification uniqueness.
For the asymptotic distribution, by the mean value theorem, we have
0 = g n ( ρ ˜ n , δ ˇ n ) ρ g n ( ρ ˜ n , δ ˇ n ) = g n ( ρ ˜ n , δ ˇ n ) ρ [ g n ( ρ 0 , δ ˇ n ) + g n ( ρ ¨ n , δ ˇ n ) ρ ( ρ ˜ n ρ 0 ) ]
where ρ ¨ n is between ρ ˜ n and ρ 0 . Then
n ( ρ ˜ n ρ 0 ) = g n ( ρ ˜ n , δ ˇ n ) ρ g n ( ρ ¨ n , δ ˇ n ) ρ 1 g n ( ρ ˜ n , δ ˇ n ) ρ n g n ( ρ 0 , δ ˇ n )
The ith element of g n ( ρ ˜ n , δ ˇ n ) ρ is 1 n ϵ n ( ρ ˜ n , δ ˇ n ) D n j s M n ( y n Z n δ ˇ n ) , which can be expanded by using y n Z n δ ˇ n = u n + Z n ( δ 0 δ ˇ n ) and ϵ n ( ρ ˜ n , δ ˇ n ) = [ R n + ( ρ 0 ρ ˜ n ) M n ] [ u n + Z n ( δ 0 δ ˇ n ) ] . By Lemmas 5 and 6, the terms involving ( δ 0 δ ˇ n ) or ( ρ 0 ρ ˜ n ) are O P ( n 1 / 2 ) . Therefore,
1 n ϵ n ( ρ ˜ n , δ ˇ n ) D n j s M n ( y n Z n δ ˇ n ) = 1 n ϵ n D n j s M n R n 1 ϵ n + O P ( n 1 / 2 ) = 1 n σ ϵ 2 tr ( D n j s M n R n 1 ) + O P ( n 1 / 2 ) = O P ( 1 )
by Lemma 5. Thus, g n ( ρ ˜ n , δ ˇ n ) ρ = E g n ( ρ 0 , δ 0 ) ρ + O P ( n 1 / 2 ) = O P ( 1 ) , where
E g n ( ρ 0 , δ 0 ) ρ = σ ϵ 2 n [ tr ( D n 1 s M n R n 1 ) , , tr ( D n , k d s M n R n 1 ) ]
Similarly, g n ( ρ ¨ n , δ ˇ n ) ρ = E g n ( ρ 0 , δ 0 ) ρ + O P ( n 1 / 2 ) = O P ( 1 ) . Thus,
n ( ρ ˜ n ρ 0 ) = E g n ( ρ 0 , δ 0 ) ρ E g n ( ρ 0 , δ 0 ) ρ 1 E g n ( ρ 0 , δ 0 ) ρ n g n ( ρ 0 , δ ˇ n ) + O P ( n 1 / 2 )
For n g n ( ρ 0 , δ ˇ n ) , the ith element is
1 n [ ϵ n + R n Z n ( δ 0 δ ˇ n ) ] D n j [ ϵ n + R n Z n ( δ 0 δ ˇ n ) ] = 1 n ϵ n D n j ϵ n + 1 n ϵ n D n j s R n Z n n ( δ 0 δ ˇ n ) + O P ( n 1 / 2 )
By Lemmas 5 and 6,
1 n ϵ n D n j s R n Z n = 1 n ϵ n D n j s R n ( Z ¯ n + ζ n ) = 1 n E ( ϵ n D n j s R n ζ n ) + O P ( n 1 / 2 ) = O P ( 1 )
where
E ( ϵ n D n j s R n ζ n ) = [ tr ( D n j s R n G n ) σ v ϵ γ 0 + σ ϵ 2 tr ( D n j s R n G n R n 1 ) , tr ( D n j s R n ) σ v ϵ ]
By Proposition 1,
n ( δ ˇ n δ 0 ) = ( 1 n Z ¯ n P F n Z ¯ n ) 1 1 n Z ¯ n P F n R n 1 ϵ n + O P ( n 1 / 2 )
Then the ith element of n g n ( ρ 0 , δ ˇ n ) is A n i + O P ( n 1 / 2 ) , where
A n i = 1 n ϵ n D n j ϵ n 1 n E ( ϵ n D n j s R n ζ n ) ( 1 n Z ¯ n P F n Z ¯ n ) 1 1 n Z ¯ n P F n R n 1 ϵ n
Hence,
n ( ρ ˜ n ρ 0 ) = σ ϵ 2 n 2 j = 1 k d tr 2 ( D n j s M n R n 1 ) 1 j = 1 k d 1 n tr ( D n j s M n R n 1 ) A n j + O P ( n 1 / 2 ) = 1 n ( ϵ n D n ϵ n + F n ϵ n ) + O P ( n 1 / 2 )
where D n and F n are given in (6) and (7), respectively. Since tr ( D n ) = 0 , n ( ρ ˜ n ρ 0 ) is asymptotically normal with a finite variance by Lemma 13.   ☐
Proof of Proposition 3. 
The GS2SLS estimator δ ^ 2 s l s , n satisfies
n ( δ ^ 2 s l s , n δ 0 b n , K ) = [ 1 n Z n ( ρ ˜ n ) P K , n Z n ( ρ ˜ n ) ] 1 1 n [ Z n ( ρ ˜ n ) P K , n R n ( ρ ˜ n ) u n E ( ζ n R n P K , n ϵ n ) ]
where
E ( ζ n R n P K , n ϵ n ) = [ tr ( Γ n K , 2 ) σ v ϵ γ 0 + σ ϵ 2 tr ( Γ n K , 3 ) , tr ( Γ n K , 1 ) σ v ϵ ] = O ( K )
by Lemma 4. Write Z n ( ρ ˜ n ) = Z n ( ρ 0 ) + ( ρ 0 ρ ˜ n ) M n Z n , then
1 n Z n ( ρ ˜ n ) P K , n Z n ( ρ ˜ n ) = 1 n Z n ( ρ 0 ) P K , n Z n ( ρ 0 ) + 1 n [ Z n M n P K , n Z n ( ρ 0 ) ] s ( ρ 0 ρ ˜ n ) + 1 n Z n M n P K , n M n Z n ( ρ 0 ρ ˜ n ) 2
By Lemma 8 (i)–(iii),
(a)
if K / n c 0 , 1 n Z n ( ρ 0 ) P K , n Z n ( ρ 0 ) = H n + 1 n ζ n R n P K , n R n ζ n + o P ( 1 ) = O P ( 1 ) , 1 n Z n M n P K , n Z n ( ρ 0 ) = O P ( 1 ) and 1 n Z n M n P K , n M n Z n = O P ( 1 ) ;
(b)
if K / n 0 , 1 n Z n ( ρ 0 ) P K , n Z n ( ρ 0 ) = H n + o P ( 1 ) = O P ( 1 ) , 1 n Z n M n P K , n Z n ( ρ 0 ) = O P ( 1 ) and 1 n Z n M n P K , n M n Z n = O P ( 1 ) .
By Lemma 5, 1 n ζ n R n P K , n R n ζ n 1 n Ω n 1 ( K ) = O P ( K / n ) , where Ω n 1 ( K ) = O ( K ) by Lemma 4. By Proposition 1, n ( δ 0 δ ˇ n ) = O P ( 1 ) . Hence,
(c)
if K / n c 0 , 1 n Z n ( ρ ˜ n ) P K , n Z n ( ρ ˜ n ) = H n + 1 n Ω n 1 ( K ) + o P ( 1 ) = O P ( 1 ) and b n , K b ¯ n K , 1 = o P ( 1 ) ;
(d)
if K / n 0 , 1 n Z n ( ρ ˜ n ) P K , n Z n ( ρ ˜ n ) = H n + o P ( 1 ) = O P ( 1 ) and b n , K b ¯ n K , 2 = o P ( K / n ) .
As R n u n = ϵ n , Z n ( ρ ˜ n ) = Z n ( ρ 0 ) + ( ρ 0 ρ ˜ n ) M n Z n and R n ( ρ ˜ n ) = R n + ( ρ 0 ρ ˜ n ) M n ,
1 n [ Z n ( ρ ˜ n ) P K , n R n ( ρ ˜ n ) u n E ( ζ n R n P K , n ϵ n ) ] = 1 n Z ¯ n ( ρ 0 ) ϵ n 1 n Z ¯ n ( ρ 0 ) ( I n P K , n ) ϵ n + 1 n [ ζ n R n P K , n ϵ n E ( ζ n R n P K , n ϵ n ) ] + 1 n Z n M n P K , n ϵ n n ( ρ 0 ρ ˜ n ) + 1 n Z n ( ρ 0 ) P K , n M n R n 1 ϵ n n ( ρ 0 ρ ˜ n ) + 1 n Z n M n P K , n M n R n 1 ϵ n n ( ρ 0 ρ ˜ n ) 2 .
The terms on the right hand side have the following properties:
(1)
By Lemma 8 (iv), 1 n Z ¯ n ( ρ 0 ) ϵ n = O P ( 1 ) and 1 n Z ¯ n ( ρ 0 ) ( I n P K , n ) ϵ n = O P ( Δ n K , 1 1 / 2 ) = o P ( 1 ) ;
(2)
by Lemma 8 (viii), 1 n [ ζ n R n P K , n ϵ n E ( ζ n R n P K , n ϵ n ) ] = O P ( K / n ) ;
(3)
by Lemma 8 (vi), 1 n Z n M n P K , n ϵ n = O P ( n 1 / 2 ) + O P ( Δ n K , 2 / n ) + O P ( K / n ) ;
(4)
by Lemma 8 (v), 1 n Z n ( ρ 0 ) P K , n M n R n 1 ϵ n = O P ( n 1 / 2 ) + O P ( Δ n K , 1 / n ) + O P ( K / n ) ;
(5)
by Lemma 8 (vii), 1 n Z n M n P K , n M n R n 1 ϵ n = O P ( n 1 / 2 ) + O P ( Δ n K , 2 / n ) + O P ( K / n ) .
Therefore,
(e)
if K / n c 0 , 1 n [ Z n ( ρ ˜ n ) P K , n R n ( ρ ˜ n ) u n E ( ζ n R n P K , n ϵ n ) ] = o P ( 1 ) , and δ ^ 2 s l s , n δ 0 b ¯ n K , 1 p 0 ;
(f)
if K / n 0 , 1 n [ Z n ( ρ ˜ n ) P K , n R n ( ρ ˜ n ) u n E ( ζ n R n P K , n ϵ n ) ] = 1 n Z ¯ n ( ρ 0 ) ϵ n + o P ( 1 ) , and n ( δ ^ 2 s l s , n δ 0 b n , K ) d N ( 0 , σ ϵ 2 H ¯ 1 ) by Lemma 6.   ☐
Proof of Proposition 4. 
By Proposition 3, it is sufficient to show that n ( b ˜ n , K b n , K ) = o P ( 1 ) . Furthermore, as 1 n Z n ( ρ ˜ n ) P K , n Z n ( ρ ˜ n ) = O P ( 1 ) , we only need to show that
1 n Υ ˜ n ( K ) 1 n Υ n ( K ) = o P ( 1 )
By Lemma 4, as ρ ˜ n ρ 0 = O P ( n 1 / 2 ) ,
1 n [ tr ( Γ ˜ n K , 1 ) tr ( Γ n K , 1 ) ] = 1 n ( ρ 0 ρ ˜ n ) tr ( P K , n M n ) = O P ( K / n ) = o P ( 1 )
By the mean value theorem,
1 n [ tr ( Γ ˜ n K , 2 ) tr ( Γ n K , 2 ) ] = 1 n ( ρ 0 ρ ˜ n ) tr [ P K , n M n G n ( λ ¨ n ) ] + 1 n ( λ ˜ n λ 0 ) tr [ P K , n R n ( ρ ¨ n ) G n 2 ( λ ¨ n ) ] = O P ( K / n )
as ρ ˜ n ρ 0 = O P ( n 1 / 2 ) , λ ˜ n λ 0 = O P ( n 1 / 2 ) , and G n ( λ ¨ n ) is UB in probability by Lemma 10, where λ ¨ n is between λ 0 and λ ˜ n , and ρ ¨ n is between ρ 0 and ρ ˜ n . Similarly,
1 n [ tr ( Γ ˜ n K , 3 ) tr ( Γ n K , 3 ) ] = O P ( K / n ) = o P ( 1 )
By Lemma 11, σ ˜ ϵ 2 σ ϵ 2 = O P ( n 1 / 2 ) and σ ˜ v ϵ σ v ϵ = O P ( n 1 / 2 ) . Furthermore, tr ( Γ n K , 1 ) = O ( K ) , tr ( Γ n K , 2 ) = O ( K ) and tr ( Γ n K , 3 ) = O ( K ) . Then
1 n [ tr ( Γ ˜ n K , 2 ) σ ˜ v ϵ γ ˜ n + σ ˜ ϵ 2 tr ( Γ ˜ n K , 3 ) , tr ( Γ ˜ n K , 1 ) σ ˜ v ϵ ] 1 n [ tr ( Γ n K , 2 ) σ v ϵ γ 0 + σ ϵ 2 tr ( Γ n K , 3 ) , tr ( Γ n K , 1 ) σ v ϵ ] = o P ( 1 )
and the result in the proposition holds.   ☐
Proof of Proposition 5. 
We find a decomposition for n ( δ ^ 2 s l s , n δ 0 ) as in Lemma 2 and show that all the conditions in the lemma are satisfied.
Let ρ K , n = tr ( S n ( K ) ) , where S n ( K ) is given in Equation (16). We first establish some order properties for ρ K , n , which equals
ρ K , n = σ ϵ 2 n tr [ ( I n P K , n ) Z ¯ n ( ρ 0 ) H n 1 H n 1 Z ¯ n ( ρ 0 ) ( I n P K , n ) ] + 1 n Υ n ( K ) H n 1 H n 1 Υ n ( K ) σ ϵ 2 τ H , max n tr [ ( I n P K , n ) Z ¯ n ( ρ 0 ) Z ¯ n ( ρ 0 ) ( I n P K , n ) ] + τ H , max n Υ n ( K ) Υ n ( K )
where τ_{H,max} is the largest eigenvalue of H_n^{-1}, which is bounded from above because lim_{n→∞} H_n is finite and nonsingular. Furthermore, as Υ′_n(K)Υ_n(K) ≤ cK² for some constant c > 0 by Lemma 4, ρ_{K,n} = O(K²/n + Δ_{nK,1}). By a similar argument, but with a lower bound for ρ_{K,n} based on the smallest eigenvalue of H_n^{-1}, and by Assumption 9 (i), as σ_vϵ ≠ 0, lim_{n→∞} ρ_{K,n}/(K²/n + Δ_{nK,1}) > c for some constant c > 0. Together these mean that ρ_{K,n} has exactly the same order as (K²/n + Δ_{nK,1}). This order of ρ_{K,n}, together with K²/n → 0, helps to determine the orders of the terms in the decomposition of √n(δ̂_{2sls,n} − δ_0).
The δ ^ 2 s l s , n satisfies
n ( δ ^ 2 s l s , n δ 0 ) = H ^ n 1 h ^ n
where h ^ n = 1 n Z n ( ρ ˜ n ) P K , n u n ( ρ ˜ n ) , and
H ^ n = 1 n Z n ( ρ 0 ) P K , n Z n ( ρ 0 ) + 1 n ( ρ 0 ρ ˜ n ) [ Z n ( ρ 0 ) P K , n M n Z n + Z n M n P K , n Z n ( ρ 0 ) ] + 1 n ( ρ 0 ρ ˜ n ) 2 Z n M n P K , n M n Z n
because Z n ( ρ ˜ n ) = Z n ( ρ 0 ) + ( ρ 0 ρ ˜ n ) M n Z n . By Lemma 8 (i)–(iii) and Lemma 7 (vi),
1 n Z n ( ρ 0 ) P K , n Z n ( ρ 0 ) = H n + T 1 n H + T 2 n H + T 3 n H + T 4 n H
where H n = O ( 1 ) , T 1 n H = O ( Δ n K , 1 ) , T 2 n H = O P ( n 1 / 2 ) , T 3 n H = O P ( K / n ) = o P ( ρ K , n ) and T 4 n H = o P ( K / n + Δ n K , 1 ) = o P ( ρ K , n ) ;
1 n n Z n ( ρ 0 ) P K , n M n Z n = 1 n n Z ¯ n ( ρ 0 ) M n Z ¯ n + O ( Δ n K , 1 Δ n K , 2 / n ) + O P ( n 1 ) + O P ( K n 3 / 2 ) + o P ( K n 3 / 2 + n 1 / 2 Δ n K , 1 ) + O P ( Δ n K , 2 / n ) = 1 n n Z ¯ n ( ρ 0 ) M n Z ¯ n + o P ( ρ K , n )
and 1 n 2 Z n M n P K , n M n Z n = O P ( 1 / n ) + O P ( K / n 2 ) = o P ( ρ K , n ) . As
n ( ρ ˜ n ρ 0 ) = 1 n ( ϵ n D n ϵ n + F n ϵ n ) + O P ( n 1 / 2 ) = O P ( 1 )
it follows that
1 n ( ρ 0 ρ ˜ n ) [ Z n ( ρ 0 ) P K , n M n Z n + Z n M n P K , n Z n ( ρ 0 ) ] = T 5 n H + o P ( ρ K , n )
where T 5 n H = 1 n 2 ( ϵ n D n ϵ n + F n ϵ n ) [ Z ¯ n ( ρ 0 ) M n Z ¯ n + Z ¯ n M n Z ¯ n ( ρ 0 ) ] = O P ( n 1 / 2 ) . Then, H ^ n = H n + T n H + o P ( ρ K , n ) with T n H = T 1 n H + T 2 n H + T 5 n H .
For h ^ n , we have
h ^ n = 1 n Z n ( ρ 0 ) P K , n ϵ n + 1 n ( ρ 0 ρ ˜ n ) [ Z n M n P K , n ϵ n + Z n ( ρ 0 ) P K , n M n u n ] + 1 n ( ρ 0 ρ ˜ n ) 2 Z n M n P K , n M n u n ,
where, by Lemma 8 (iv)–(vii) and Lemma 7 (vi),
(a)
1 n Z n ( ρ 0 ) P K , n ϵ n = h n + T 1 n h + T 2 n h with h n = O P ( 1 ) , T 1 n h = O P ( Δ n K , 1 1 / 2 ) and T 2 n h = O P ( K / n ) ,
(b)
1 n Z n M n P K , n ϵ n = 1 n Z ¯ n M n P K , n ϵ n + O P ( K / n ) = O P ( n 1 / 2 ) ,
(c)
1 n Z n ( ρ 0 ) P K , n M n u n = 1 n Z ¯ n ( ρ 0 ) M n u n + o P ( ρ K , n ) , and
(d)
n 3 / 2 Z n M n P K , n M n u n = o P ( ρ K , n ) .
Thus, h ^ n = h n + T n h + o P ( ρ K , n ) , where T n h = T 1 n h + T 2 n h + T 3 n h + T 4 n h with
T 3 n h = n 3 / 2 ( ϵ n D n ϵ n + F n ϵ n ) Z ¯ n ( ρ 0 ) M n u n = O P ( 1 / n )
and
T 4 n h = n 3 / 2 ( ϵ n D n ϵ n + F n ϵ n ) Z ¯ n M n P K , n ϵ n = O P ( 1 / n )
Corresponding to the terms of the decomposition in Lemma 2, we have Z n h = h ^ n h n T n h , Z n H = H ^ n H n T n H ,
A ^ n ( K ) = ( h n + T 1 n h + T 2 n h ) ( h n + T 1 n h + T 2 n h ) + [ ( T 3 n h + T 4 n h ) h n ] s ( h n h n H n 1 T n H ) s
and
Z n A ( K ) = [ ( T 1 n h + T 2 n h ) ( T 3 n h + T 4 n h ) ] s + ( T 3 n h + T 4 n h ) ( T 3 n h + T 4 n h ) = o P ( ρ K , n )
We shall check that all conditions in Lemma 2 are satisfied and derive the explicit expression for E [ A ^ n ( K ) ] . As h n + T 1 n h + T 2 n h = 1 n Z ¯ n ( ρ 0 ) P K , n ϵ n + 1 n ζ n R n P K , n ϵ n , then under the assumption that μ 3 = E ( ϵ n i 2 v n i ) = 0 , we have
E [ ( h n + T 1 n h + T 2 n h ) ( h n + T 1 n h + T 2 n h ) ] = σ ϵ 2 n Z ¯ n ( ρ 0 ) P K , n Z ¯ n ( ρ 0 ) + 1 n E ( ζ n R n P K , n ϵ n ϵ n P K , n R n ζ n )
Since ζ n = [ G n v n γ 0 + G n R n 1 ϵ n , v n ] , the matrix E ( ζ n R n P K , n ϵ n ϵ n P K , n R n ζ n ) Ω n 2 ( K ) , where
Ω n 2 ( K ) = E ( ζ n R n P K , n ϵ n ) E ( ϵ n P K , n R n ζ n )
can be expanded as a 4 × 4 block matrix, with each block being of the order O ( K ) by Lemmas 3 and 4. Thus,
E ( ζ n R n P K , n ϵ n ϵ n P K , n R n ζ n ) = Ω n 2 ( K ) + O ( K )
Then,
E [ ( h n + T 1 n h + T 2 n h ) ( h n + T 1 n h + T 2 n h ) ] = σ ϵ 2 n Z ¯ n ( ρ 0 ) P K , n Z ¯ n ( ρ 0 ) + Ω n 2 ( K ) + o P ( ρ K , n )
Note that
E ( T 3 n h h n ) = 1 n 2 Z ¯ n ( ρ 0 ) M n R n 1 E [ ( ϵ n D n ϵ n + F n ϵ n ) ϵ n ϵ n ] Z ¯ n ( ρ 0 )
and
E ( T 4 n h h n ) = 1 n 2 Z ¯ n M n P K , n E [ ( ϵ n D n ϵ n + F n ϵ n ) ϵ n ϵ n ] Z ¯ n ( ρ 0 )
where E [ ( ϵ n D n ϵ n + F n ϵ n ) ϵ n ϵ n ] is UB by Lemma 9 (i), then E ( T 3 n h h n ) = O ( 1 / n ) = o ( ρ K , n ) , and
E ( T 4 n h h n ) = 1 n 2 Z ¯ n M n E [ ( ϵ n D n ϵ n + F n ϵ n ) ϵ n ϵ n ] Z ¯ n ( ρ 0 ) + 1 n 2 Z ¯ n M n ( I n P K , n ) E [ ( ϵ n D n ϵ n + F n ϵ n ) ϵ n ϵ n ] Z ¯ n ( ρ 0 ) = O ( 1 / n ) + O ( Δ n K , 2 / n ) = o ( ρ K , n )
by Lemma 7 (ii). As E ( h n h n H n 1 T 1 n H ) = σ ϵ 2 n Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) , and, by Lemma 9, E ( h n h n H n 1 T 2 n H ) = O ( 1 / n ) = o P ( ρ K , n ) and E ( h n h n H n 1 T 5 n H ) = O ( 1 / n ) = o P ( ρ K , n ) , we have
E [ A ^ n ( K ) ] = σ ϵ 2 n Z ¯ n ( ρ 0 ) P K , n Z ¯ n ( ρ 0 ) + 1 n Ω n 2 ( K ) + 2 σ ϵ 2 n Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) + o P ( ρ K , n ) = σ ϵ 2 H n + σ ϵ 2 n Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) + 1 n Ω n 2 ( K ) + o P ( ρ K , n )
Let S n ( K ) be given by (16), then all conditions of Lemma 2 are satisfied.  ☐
Proof of Proposition 6. 
The proof follows by modifying that of Proposition 5. Now ρ K , n = tr ( S n ( K ) ) = O ( K / n + Δ n K , 1 ) . The δ ^ c 2 s l s , n satisfies
n ( δ ^ c 2 s l s , n δ 0 ) = H ^ n 1 h ^ n with H ^ n = 1 n Z n ( ρ ˜ n ) P K , n Z n ( ρ ˜ n ) and h ^ n = 1 n [ Z n ( ρ ˜ n ) P K , n u n ( ρ ˜ n ) Υ ˜ n ( K ) ]
By Lemma 8 (viii), T 3 n H E ( T 3 n H ) = o P ( K / n ) , where T 3 n H = 1 n ζ n R n P K , n R n ζ n is defined in Lemma 8 and E ( T 3 n H ) = 1 n Ω n 1 ( K ) with Ω n 1 ( K ) given in Equation (11).
Define T 6 n H = E ( T 3 n H ) . Then, from the proof of Proposition 5,
H ^ n = H n + T n H + o P ( ρ K , n ) , where T n H = T 1 n H + T 2 n H + T 5 n H + T 6 n H
For h ^ n , we have
h ^ n = h n + T 1 n h + T 5 n h 1 n [ Υ ˜ n ( K ) Υ n ( K ) ] + 1 n ( ρ 0 ρ ˜ n ) [ Z n M n P K , n ϵ n + Z n ( ρ 0 ) P K , n M n u n ] + 1 n ( ρ 0 ρ ˜ n ) 2 Z n M n P K , n M n u n
where T 5 n h = 1 n [ ζ n R n P K , n ϵ n E ( ζ n R n P K , n ϵ n ) ] = O P ( K / n ) by Lemma 8 (viii). By Lemma 11 (iv),
1 n [ Υ ˜ n ( K ) Υ n ( K ) ] = T 6 n h + o P ( ρ K , n ) with T 6 n h = 1 n ( a 1 , a 2 ) = O P ( K / n )
where a 1 and a 2 are defined in Lemma 11 (iv). By Lemma 8 (v)–(viii), 1 n Z n M n P K , n ϵ n = 1 n Z ¯ n M n P K , n ϵ n + 1 n E ( ζ n M n P K , n ϵ n ) + o P ( ρ K , n ) ,
1 n Z n ( ρ 0 ) P K , n M n u n = 1 n Z ¯ n ( ρ 0 ) M n u n + 1 n E ( ζ n R n P K , n M n u n ) + o P ( ρ K , n )
and 1 n 3 / 2 Z n M n P K , n M n u n = o P ( ρ K , n ) . Therefore,
h ^ n = h n + T n h + o P ( ρ K , n ) , where T n h = T 1 n h + T 5 n h + T 6 n h + T 3 n h + T 4 n h + T 7 n h
with T 3 n h and T 4 n h defined in the proof of Proposition 5; and
T 7 n h = n 3 / 2 ( ϵ n D n ϵ n + F n ϵ n ) [ E ( ζ n M n P K , n ϵ n ) + E ( ζ n R n P K , n M n u n ) ] = O P ( K / n )
For the decomposition in Lemma 2, take Z n A ( K ) = ( h n + T n h ) ( h n + T n h ) ( h n h n H n 1 T n H ) s A ^ n ( K ) , and
A ^ n ( K ) = ( h n + T 1 n h ) ( h n + T 1 n h ) + [ h n ( T 3 n h + T 4 n h + T 5 n h + T 6 n h + T 7 n h ) ] s + ( T 1 n h T 5 n h ) s + T 5 n h T 5 n h ( h n h n H n 1 T n H ) s
Then Z n A ( K ) = o P ( ρ K , n ) . To check that the conditions in Lemma 2 are satisfied, we now investigate E ( A ^ n ( K ) ) . First,
E [ ( h n + T 1 n h ) ( h n + T 1 n h ) ] = 1 n E ( Z ¯ n ( ρ 0 ) P K , n ϵ n ϵ n P K , n Z ¯ n ( ρ 0 ) ) = σ ϵ 2 n Z ¯ n ( ρ 0 ) P K , n Z ¯ n ( ρ 0 )
By the proof of Proposition 5, E [ h n ( T 3 n h + T 4 n h ) ] = o P ( ρ K , n ) . Under the assumption that E ( ϵ n i 3 ) = E ( ϵ n i 2 v n i ) = 0 , we have E ( h n T 5 n h ) = 0 , E ( T 1 n h T 5 n h ) = 0 ,
E ( h n T 7 n h ) = σ ϵ 2 n 2 Z ¯ n ( ρ 0 ) F n [ E ( ζ n M n P K , n ϵ n ) + E ( ζ n R n P K , n M n u n ) ]
and
E ( h n T 6 n h ) = 1 n [ E ( h n a 1 ) , E ( h n a 2 ) ] = 1 n [ Π n 2 , 1 ( K ) , Π n 2 , 2 ( K ) ]
where Π n 2 , 1 ( K ) and Π n 2 , 2 ( K ) are given in (23) and (24) respectively. The expression for E ( T 5 n h T 5 n h ) can be derived by Lemma 3. Under Assumption 9 (ii), ve c D ( Γ n K , i ) ve c D ( Γ n K , j ) = o ( K ) for i , j = 1 , 2 , 3 . Then E ( T 5 n h T 5 n h ) = 1 n Π n 1 ( K ) + o ( K / n ) , where Π n 1 ( K ) is given in (21). By the proof of Proposition 5, E [ h n h n H n 1 ( T 2 n H + T 5 n H ) ] = o P ( ρ K , n ) . Furthermore,
E [ h n h n H n 1 ( T 1 n H + T 6 n H ) ] = σ ϵ 2 n Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) + σ ϵ 2 T 6 n H
Therefore,
E ( A ^ n ( K ) ) = σ ϵ 2 H n + σ ϵ 2 n Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) + 1 n Π n 1 ( K ) + 1 n Π n 2 ( K ) + 1 n Π n 3 ( K ) + o P ( ρ K , n )
where 1 n Π n 2 ( K ) = E [ ( h n T 6 n h ) s ] 2 σ ϵ 2 T 6 n H and 1 n Π n 3 ( K ) = E [ ( h n T 7 n h ) s ] . Let S n ( K ) be given by (17), then all conditions in Lemma 2 are satisfied and the result in the proposition holds.    ☐
Proof of Proposition 7. 
As Z n ( ρ ^ n ) = Z n ( ρ 0 ) + ( ρ 0 ρ ^ n ) M n Z n and Z n = Z ¯ n + ζ n , we have
Z n ( ρ ^ n ) ( I n P K , n ) Z n ( ρ ^ n ) = Z n ( ρ 0 ) ( I n P K , n ) Z n ( ρ 0 ) + ( ρ 0 ρ ^ n ) [ Z n M n ( I n P K , n ) Z n ( ρ 0 ) ] s + ( ρ 0 ρ ^ n ) 2 Z n M n ( I n P K , n ) M n Z n
where
Z n ( ρ 0 ) ( I n P K , n ) Z n ( ρ 0 ) = Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) + [ Z ¯ n ( ρ 0 ) ( I n P K , n ) R n ζ n ] s + ζ n R n R n ζ n ζ n R n P K , n R n ζ n
Let
S ˜ n , ξ ( K ) = S ^ n , ξ ( K ) σ ^ ϵ 2 n ξ H ^ n 1 [ ζ n R n R n ζ n + ( ρ 0 ρ ^ n ) ( ζ n M n R n ζ n ) s + ( ρ 0 ρ ^ n ) 2 ζ n M n M n ζ n ] H ^ n 1 ξ
As [ S ^ n , ξ ( K ) − S ˜ n , ξ ( K ) ] does not depend on K, arg min K S ^ n , ξ ( K ) = arg min K S ˜ n , ξ ( K ) . By Lemma 12, we only need to show that sup K | S ˜ n , ξ ( K ) − S n , ξ ( K ) | / S n , ξ ( K ) p 0 .
Let e i be the ith column of the ( m + 1 ) × ( m + 1 ) identity matrix. Since σ ^ ϵ 2 = σ ϵ 2 + o P ( 1 ) and H ^ n = H n + o P ( 1 ) = O P ( 1 ) , for the GS2SLS, by the triangle inequality, it is sufficient to show the following:
(i)
sup K | e i Ω n 2 ( K ) e j | / [ n S n , ξ ( K ) ] < c for some constant c > 0 and sup K | e i [ Ω ^ n 2 ( K ) Ω n 2 ( K ) ] e j | / [ n S n , ξ ( K ) ] 0 ;
(ii)
sup K | e i Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) e j | / [ n S n , ξ ( K ) ] 0 ;
(iii)
sup K | e i Ω n 1 ( K ) e j | / [ n S n , ξ ( K ) ] < c for some constant c > 0 , sup K | e i [ Ω ^ n 1 ( K ) Ω n 1 ( K ) ] e j | / [ n S n , ξ ( K ) ] p 0 and sup K | e i [ ζ n R n P K , n R n ζ n Ω n 1 ( K ) ] e j | / [ n S n , ξ ( K ) ] p 0 ;
(iv)
sup K | e i Z ¯ n ( ρ 0 ) ( I n P K , n ) R n ζ n e j | / [ n S n , ξ ( K ) ] p 0 , and
(v)
sup K | e i { ( ρ 0 ρ ^ n ) [ Z n M n ( I n P K , n ) Z n ( ρ 0 ) ζ n M n R n ζ n ] s + ( ρ 0 ρ ^ n ) 2 [ Z n M n ( I n P K , n ) M n Z n ζ n M n M n ζ n ] } e j | / [ n S n , ξ ( K ) ] p 0 .
For the CGS2SLS, we need to show (ii)–(v) and
(i’)
sup K | e i [ Π n 1 ( K ) + Π n 2 ( K ) + Π n 3 ( K ) ] e j | / [ n S n , ξ ( K ) ] < c for some constant c > 0 and sup K | e i [ Π ^ n 1 ( K ) + Π ^ n 2 ( K ) + Π ^ n 3 ( K ) Π n 1 ( K ) Π n 2 ( K ) Π n 3 ( K ) ] e j | / [ n S n , ξ ( K ) ] 0 .
We first show (i) and (i’). By Lemma 4,
sup K | e i Ω n 2 ( K ) e j | / [ n S n , ξ ( K ) ] < c 1 sup K K 2 / [ n S n , ξ ( K ) ]
for some constant c 1 > 0 . By Assumption 10 (ii), for the GS2SLS, S n , ξ ( K ) > K 2 c 2 / n for some c 2 > 0 . Then sup K | e i Ω n 2 ( K ) e j | / [ n S n , ξ ( K ) ] < c for some constant c > 0 . For tr [ P K , n R n ( ρ ^ n ) G n ( λ ^ n ) ] , by the mean value theorem,
| tr [ P K , n R n ( ρ ^ n ) G n ( λ ^ n ) ] tr ( Γ n K , 2 ) | = | ( λ ^ n λ 0 ) tr [ P K , n R n ( ρ ¨ n ) G n 2 ( λ ¨ n ) ] ( ρ ^ n ρ 0 ) tr [ P K , n M n G n ( λ ¨ n ) ] | K c ( λ ^ n λ 0 ) 2 + ( ρ ^ n ρ 0 ) 2
in probability for some constant c > 0 , by Lemmas 10 and 4. As all parameter estimates used in Ω ^ n 2 ( K ) are consistent, applying similarly the mean value theorem to other terms in Ω ^ n 2 ( K ) , we can see that | | Ω ^ n 2 ( K ) Ω n 2 ( K ) | | K 2 c n in probability, where c n = o P ( 1 ) does not depend on K. Thus (i) holds. For the CGS2SLS, (i’) holds similarly.
Result (ii) holds because | e i Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) e j | ≤ c 1 Δ n K , 1 for some c 1 > 0 , and n S n , ξ ( K ) > c Δ n K , 1 for some constant c > 0 by Assumption 10 (ii).
For (iii), the first two results are similar to those in (i), thus we only show that
sup K | e i [ ζ n R n P K , n R n ζ n Ω n 1 ( K ) ] e j | / [ n S n , ξ ( K ) ] p 0
By Chebyshev’s inequality, for any η > 0 ,
P ( sup K | e i [ ζ n R n P K , n R n ζ n Ω n 1 ( K ) ] e j | / [ n S n , ξ ( K ) ] η ) K E { e i [ ζ n R n P K , n R n ζ n Ω n 1 ( K ) ] e j e j [ ζ n R n P K , n R n ζ n Ω n 1 ( K ) ] e i } / [ η 2 n 2 S n , ξ 2 ( K ) ] K E { e i [ ζ n R n P K , n R n ζ n Ω n 1 ( K ) ] [ ζ n R n P K , n R n ζ n Ω n 1 ( K ) ] e i } / [ η 2 n 2 S n , ξ 2 ( K ) ] K K c / [ η 2 n 2 S n , ξ 2 ( K ) ] K c 1 / [ η 2 n S n , ξ ( K ) ]
for some constants c > 0 and c 1 > 0 , where the third inequality follows by Lemmas 3 and 4, and the last inequality holds since n S n , ξ ( K ) K c 2 for some constant c 2 > 0 by Assumption 10 (ii). The result then follows by Assumption 11.
For (iv), by Chebyshev’s inequality, for any η > 0 ,
P ( sup K | e i Z ¯ n ( ρ 0 ) ( I n P K , n ) R n ζ n e j | / [ n S n , ξ ( K ) ] > η ) K E [ e i Z ¯ n ( ρ 0 ) ( I n P K , n ) R n ζ n e j e j ζ n R n ( I n P K , n ) Z ¯ n ( ρ 0 ) ] e i / [ η 2 n 2 S n , ξ 2 ( K ) ] K e i Z ¯ n ( ρ 0 ) ( I n P K , n ) R n E ( ζ n ζ n ) R n ( I n P K , n ) Z ¯ n ( ρ 0 ) e i / [ η 2 n 2 S n , ξ 2 ( K ) ] K τ ζ , max e i Z ¯ n ( ρ 0 ) ( I n P K , n ) Z ¯ n ( ρ 0 ) e i / [ η 2 n 2 S n , ξ 2 ( K ) ] K c / [ η 2 n S n , ξ ( K ) ]
for some constant c, where τ ζ , max denotes the largest eigenvalue of R n E ( ζ n ζ n ) R n , and the last inequality holds because R n E ( ζ n ζ n ) R n is UB and S n , ξ ( K ) > c 1 Δ n K , 1 for some c 1 > 0 . Thus the result holds.
For (v), as Z n = Z ¯ n + ζ n and n ( ρ ^ n ρ 0 ) = O P ( 1 ) , we show the following:
(1)
sup K | e i Z ¯ n M n ( I n P K , n ) Z ¯ n ( ρ 0 ) e j | / [ n n S n , ξ ( K ) ] 0 and sup K | e i Z ¯ n M n ( I n P K , n ) M n Z ¯ n e j | / [ n 2 S n , ξ ( K ) ] 0 ;
(2)
sup K | e i A n ζ n e j | / [ n n S n , ξ ( K ) ] p 0 , where A n = Z ¯ n M n ( I n P K , n ) R n , Z ¯ n ( ρ 0 ) ( I n P K , n ) M n or Z ¯ n M n ( I n P K , n ) M n ;
(3)
sup K | e i E [ ζ n M n P K , n R n ζ n ] e j | / [ n n S n , ξ ( K ) ] 0 , sup K | e i E [ ζ n M n P K , n M n ζ n ] e j | / [ n n S n , ξ ( K ) ] 0 , sup K | e i { ζ n M n P K , n R n ζ n E [ ζ n M n P K , n R n ζ n ] } e j | / [ n n S n , ξ ( K ) ] p 0 , and sup K | e i { ζ n M n P K , n M n ζ n E [ ζ n M n P K , n M n ζ n ] } e j | / [ n n S n , ξ ( K ) ] p 0 .
By Lemma 7 (iii), we have
sup K | e i Z ¯ n M n ( I n P K , n ) Z ¯ n ( ρ 0 ) e j | / [ n n S n , ξ ( K ) ] sup K c Δ n K , 1 / S n , ξ ( K ) Δ n K , 2 / [ n S n , ξ ( K ) ]
for some c > 0 . Since sup K Δ n K , 2 / [ n S n , ξ ( K ) ] K Δ n K , 2 / [ n S n , ξ ( K ) ] and Δ n K , 2 = o ( 1 ) , the first result in (1) holds by Assumption 10 (ii). The second result in (1) holds since
sup K | e i Z ¯ n M n ( I n P K , n ) M n Z ¯ n e j | / [ n 2 S n , ξ ( K ) ] sup K c Δ n K , 2 / [ n S n , ξ ( K ) ]
for some c > 0 . For (2), similar to (iv), for any η > 0 , we have
P ( sup K | e i Z ¯ n M n ( I n P K , n ) R n ζ n e j | / [ n n S n , ξ ( K ) ] > η ) c η 2 K ( Δ n K , 2 / [ n S n , ξ ( K ) ] ) [ n S n , ξ ( K ) ] 1 , P ( sup K | e i Z ¯ n ( ρ 0 ) ( I n P K , n ) M n ζ n e j | / [ n n S n , ξ ( K ) ] > η ) c η 2 K ( Δ n K , 1 / [ n S n , ξ ( K ) ] ) [ n S n , ξ ( K ) ] 1 ,
and
P ( sup K | e i Z ¯ n M n ( I n P K , n ) M n ζ n e j | / [ n n S n , ξ ( K ) ] > η ) c η 2 K ( Δ n K , 2 / [ n S n , ξ ( K ) ] ) [ n S n , ξ ( K ) ] 1
for some c > 0 . (3) is similar to (i).      ☐

References

  1. S.G. Donald, and W.K. Newey. “Choosing the number of instruments.” Econometrica 69 (2001): 1161–1191.
  2. A.L. Nagar. “The bias and moment matrix of the general k-class estimators of the parameters in simultaneous equations.” Econometrica 27 (1959): 575–595.
  3. X. Liu, and L.F. Lee. “Two stage least squares estimation of spatial autoregressive models with endogenous regressors and many instruments.” Econ. Rev. 32 (2013): 734–753.
  4. M. Benirschka, and J.K. Binkley. “Land price volatility in a geographically dispersed market.” Am. J. Agric. Econ. 76 (1994): 185–195.
  5. L. Anselin, and A. Bera. “Spatial Dependence in Linear Regression Models with an Introduction to Spatial Econometrics.” In Handbook of Applied Economic Statistics. Edited by A. Ullah and D.E. Giles. New York, NY, USA: Marcel Dekker, 1998, pp. 237–289.
  6. A. Case. “On the use of spatial autoregressive models in demand analysis.” Discussion Paper 135, Research Program in Development Studies, Woodrow Wilson School, Princeton University, Princeton, NJ, USA, 1987.
  7. A. Case. “Spatial patterns in household demand.” Econometrica 59 (1991): 953–965.
  8. A. Case. “Neighborhood influence and technological change.” Reg. Sci. Urban Econ. 22 (1992): 491–508.
  9. A. Case, J. Hines Jr., and H. Rosen. “Budget spillovers and fiscal policy independence: Evidence from the states.” J. Public Econ. 52 (1993): 285–307.
  10. T. Besley, and A. Case. “Incumbent behavior: Vote-seeking, tax-setting, and yardstick competition.” Am. Econ. Rev. 85 (1995): 951–963.
  11. H.H. Kelejian, and I.R. Prucha. “A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances.” J. Real Estate Financ. Econ. 17 (1998): 99–121.
  12. D.M. Drukker, P. Egger, and I.R. Prucha. “On two-step estimation of a spatial autoregressive model with autoregressive disturbances and endogenous regressors.” Econ. Rev. 32 (2013): 686–733.
  13. H.H. Kelejian, and I.R. Prucha. “Estimation of simultaneous systems of spatially interrelated cross sectional equations.” J. Econom. 118 (2004): 27–50.
  14. H.H. Kelejian, and I.R. Prucha. “HAC estimation in a spatial framework.” J. Econom. 140 (2007): 131–154.
  15. L.F. Lee, and J. Yu. “Efficient GMM estimation of spatial dynamic panel data models with fixed effects.” Working paper, 2012.
  16. G. Chamberlain. “Asymptotic efficiency in estimation with conditional moment restrictions.” J. Econom. 34 (1987): 305–334.
  17. H.H. Kelejian, and I.R. Prucha. “On the asymptotic distribution of the Moran I test statistic with applications.” J. Econom. 104 (2001): 219–257.
  18. H.H. Kelejian, and I.R. Prucha. “A generalized moments estimator for the autoregressive parameter in a spatial model.” Int. Econ. Rev. 40 (1999): 509–533.
  19. J. Hahn, and J. Hausman. “A new specification test for the validity of instrumental variables.” Econometrica 70 (2002): 163–189.
  20. L. Anselin. Spatial Econometrics: Methods and Models. Boston, MA, USA: Kluwer Academic Publishers, 1988.
  21. X. Liu, and L.F. Lee. “GMM estimation of social interaction models with centrality.” J. Econom. 159 (2010): 99–115.
  22. L.F. Lee. “Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models.” Econometrica 72 (2004): 1899–1925.
  23. H. White. Estimation, Inference and Specification Analysis. New York, NY, USA: Cambridge University Press, 1994.
