Review

Four Measures of Association and Their Representations in Terms of Copulas

by Michel Adès 1, Serge B. Provost 2,* and Yishan Zang 2
1 Département de Mathématiques, Université du Québec à Montréal, Montréal, QC H2X 3Y7, Canada
2 Department of Statistical and Actuarial Sciences, The University of Western Ontario, London, ON N6A 5B7, Canada
* Author to whom correspondence should be addressed.
Submission received: 16 January 2024 / Revised: 9 February 2024 / Accepted: 22 February 2024 / Published: 2 March 2024

Abstract: Four measures of association, namely, Spearman’s ρ, Kendall’s τ, Blomqvist’s β and Hoeffding’s Φ², are expressed in terms of copulas. Conveniently, this article also includes explicit expressions for their empirical counterparts. Moreover, copula representations of the four coefficients are provided for the multivariate case, and several specific applications are pointed out. Additionally, a numerical study is presented with a view to illustrating the types of relationships that each of the measures of association can detect.

1. Introduction

Copula representations and sample estimates of the correlation measures attributed to Spearman, Kendall, Blomqvist and Hoeffding are provided in this paper. All these measures of association depend on the ranks of the observations on each variable. They can reveal the strength of the dependence between two variables that are not necessarily linearly related, as is required in the case of Pearson’s correlation. They can as well be applied to ordinal data. While the Spearman, Kendall and Blomqvist measures of association are suitable for observations exhibiting monotonic relationships, Hoeffding’s index can also ascertain the extent of the dependence between the variables, regardless of the patterns that they may follow. Thus, these four measures of association prove quite versatile when it comes to assessing the strength of various types of relationships between variables. Moreover, since they are rank-based, they are all robust with respect to outliers. What is more, they can be readily evaluated.
Copulas are principally utilized for modeling dependency features in multivariate distributions. They enable one to represent the joint distribution of two or more random variables in terms of their marginal distributions and a specific correlation structure. Thus, the effect of the dependence between the variables can be separated from the contribution of each marginal. As measures of dependence, copulas have found applications in numerous fields of scientific investigations, including reliability theory, signal processing, geodesy, hydrology, finance and medicine. We now review certain basic definitions and results on the subject.
In the bivariate framework, a copula function is a distribution whose support is the unit square $\mathbb{I}^2 = [0,1]^2$ and whose marginals are uniformly distributed. A more formal definition is now provided.
A function $C : \mathbb{I}^2 \to \mathbb{I}$ is a bivariate copula if it satisfies the two following properties:
  • For every $u, v \in \mathbb{I}$,
    $C(u, 1) = u$, $C(1, v) = v$, and $C(u, 0) = C(0, v) = 0$.
  • For every $u_1, u_2, v_1, v_2 \in \mathbb{I}$ such that $u_1 \le u_2$ and $v_1 \le v_2$,
    $C(u_2, v_2) - C(u_2, v_1) - C(u_1, v_2) + C(u_1, v_1) \ge 0$.
This last inequality, known as the 2-increasing property, implies in conjunction with the boundary conditions that $C(u, v)$ is nondecreasing in each variable.
We now state a result due to Sklar (Theorem 1) [1].
Theorem 1. 
Let $H(x, y)$ be the joint cumulative distribution function of the random variables X and Y whose continuous marginal distribution functions are denoted by $F(x)$ and $G(y)$. Then, there exists a unique bivariate copula $C : \mathbb{I}^2 \to \mathbb{I}$ such that
$$H(x, y) = C(F(x), G(y)), \tag{1}$$
where $C(\cdot,\cdot)$ is a joint cumulative distribution function having uniform marginals. Conversely, for any continuous cumulative distribution functions $F(x)$ and $G(y)$ and any copula $C(\cdot,\cdot)$, the function $H(\cdot,\cdot)$, as defined in (1), is a joint distribution function with marginal distribution functions $F(\cdot)$ and $G(\cdot)$.
Sklar’s theorem provides a technique for constructing copulas. Indeed, the function
$$C(u, v) = H(F^{-1}(u), G^{-1}(v))$$
is a bivariate copula, where the quasi-inverses $F^{-1}(\cdot)$ and $G^{-1}(\cdot)$ are defined by
$$F^{-1}(u) = \inf\{x \mid F(x) \ge u\}, \quad u \in (0, 1),$$
and
$$G^{-1}(v) = \inf\{y \mid G(y) \ge v\}, \quad v \in (0, 1).$$
Copulas are invariant with respect to strictly increasing transformations. More specifically, assuming that X and Y are two continuous random variables whose associated copula is $C(\cdot,\cdot)$, and letting $\alpha(\cdot)$ and $\beta(\cdot)$ be two strictly increasing functions and $C_{\alpha,\beta}(\cdot,\cdot)$ be the copula obtained from $\alpha(X)$ and $\beta(Y)$, then for all $(u, v) \in \mathbb{I}^2$, one has
$$C_{\alpha,\beta}(u, v) = C(u, v).$$
We shall denote the probability density function corresponding to the copula $C(u, v)$ by
$$c(u, v) = \frac{\partial^2}{\partial u\, \partial v}\, C(u, v).$$
The following relationship between $h(\cdot,\cdot)$, the joint density function of the random variables X and Y as defined in Sklar’s theorem, and the associated copula density function $c(\cdot,\cdot)$ can then be readily obtained from Equation (1) as
$$h(x, y) = f(x)\, g(y)\, c(F(x), G(y)),$$
where $f(x)$ and $g(y)$ denote the marginal density functions of X and Y, respectively. Accordingly, a copula density function can be expressed as follows:
$$c(u, v) = \frac{h(F^{-1}(u), G^{-1}(v))}{f(F^{-1}(u))\, g(G^{-1}(v))}.$$
Now, given a random sample ( x 1 , y 1 ) , , ( x n , y n ) generated from the continuous random vector ( X , Y ) , let
( u i , v i ) = ( F ( x i ) , G ( y i ) ) , i = 1 , , n ,
where F ( · ) and G ( · ) are the usually unknown marginal cumulative distribution functions (cdfs) of X and Y. The empirical marginal cdfs F ^ ( · ) and G ^ ( · ) are then utilized to determine the pseudo-observations:
( u ^ i , v ^ i ) = ( F ^ ( x i ) , G ^ ( y i ) ) , i = 1 , , n ,
where the empirical cdfs (ecdfs) are given by $\hat{F}(x) = \frac{1}{n}\sum_{i=1}^{n} I(x_i \le x)$ and $\hat{G}(y) = \frac{1}{n}\sum_{i=1}^{n} I(y_i \le y)$, with $I(\cdot)$ denoting the indicator function, which is equal to one if the condition is verified and zero otherwise. Equivalently, one has
( u ^ i , v ^ i ) = ( r i / n , s i / n ) ,
where r i is the rank of x i among { x 1 , , x n } , and s i is the rank of y i among { y 1 , , y n } .
The frequencies or probability mass function of an empirical copula can be expressed as
$$\hat{c}(u, v) = \frac{1}{n}\sum_{i=1}^{n} I(\hat{F}(x_i) = u)\, I(\hat{G}(y_i) = v) = \frac{1}{n}\sum_{i=1}^{n} I(r_i/n = u)\, I(s_i/n = v),$$
and the corresponding empirical copula (distribution function) is then given by
$$\hat{C}(u, v) = \frac{1}{n}\sum_{i=1}^{n} I(\hat{F}(x_i) \le u)\, I(\hat{G}(y_i) \le v) = \frac{1}{n}\sum_{i=1}^{n} I(r_i/n \le u)\, I(s_i/n \le v),$$
which is a consistent estimate of C ( u , v ) . We note that, in practice, the ranks are often divided by n + 1 instead of n in order to mitigate certain boundary effects, and that other adjustments that are specified in Section 2 may also be applied. As pointed out by [2], who refers to [3], “Empirical copulas were introduced and first studied by Deheuvels who called them empirical dependence functions”.
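The pseudo-observations and the empirical copula just defined can be sketched in a few lines of code; the function names (`ranks`, `empirical_copula`) are illustrative, and a sample without ties is assumed:

```python
def ranks(xs):
    """1-based rank of each value within the sample (distinct values assumed)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def empirical_copula(xs, ys):
    """Return C_hat(u, v) = (1/n) * sum_i I(r_i/n <= u) I(s_i/n <= v)."""
    n = len(xs)
    r, s = ranks(xs), ranks(ys)
    def C_hat(u, v):
        return sum(1 for i in range(n) if r[i] / n <= u and s[i] / n <= v) / n
    return C_hat
```

Since only the ranks enter the computation, applying any strictly increasing transformation to the observations leaves the resulting empirical copula unchanged.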
Additional properties of copulas that are not directly relevant to the results presented in this article are discussed for instance in [4,5,6].
This article contains certain derivations that do not seem to be available in the literature and also provides missing steps that complete the published proofs. It is structured as follows: Section 2, Section 3, Section 4 and Section 5, which, respectively, focus on Spearman’s, Kendall’s, Blomqvist’s and Hoeffding’s correlation coefficients, include representations of these measures of dependence in terms of copulas, in addition to providing sample estimates thereof and pointing out related distributional results of interest. The effectiveness of these correlation coefficients in assessing the trends present in five data sets exhibiting distinctive patterns is assessed in a numerical study that is presented in Section 6. Section 7 is dedicated to multivariate extensions of the four measures of association and their copula representations.
To the best of our knowledge, the four major dependence measures discussed here, along with their representations in terms of copulas, have not been previously covered in a single source.

2. Spearman’s Rank Correlation

Spearman’s rank correlation statistic, also referred to as Spearman’s ρ , measures the extent to which the relationship between two variables is monotonic—either increasing or decreasing.
First, Spearman’s ρ is expressed in terms of a copula denoted by C ( U , V ) . Then, some equivalent representations of Spearman’s rank correlation statistic are provided; one of them is obtained by replacing C ( U , V ) by its empirical counterpart.
Let (X, Y) be a bivariate continuous random vector having $h(x, y)$ as its joint density function, and let $F(x)$ and $G(y)$ denote the respective marginal distribution functions of X and Y.
Theoretically, Spearman’s correlation is given by
$$\rho_S = \frac{\mathrm{Cov}[F(X), G(Y)]}{\sqrt{\mathrm{Var}[F(X)]\,\mathrm{Var}[G(Y)]}}$$
$$= \frac{\int_{\mathbb{R}^2} F(x)\, G(y)\, h(x, y)\, dx\, dy - \big(\int_{\mathbb{R}} F(x)\, dF(x)\big)\big(\int_{\mathbb{R}} G(y)\, dG(y)\big)}{\sqrt{\big[\int_{\mathbb{R}} F(x)^2\, dF(x) - \big(\int_{\mathbb{R}} F(x)\, dF(x)\big)^2\big]\big[\int_{\mathbb{R}} G(y)^2\, dG(y) - \big(\int_{\mathbb{R}} G(y)\, dG(y)\big)^2\big]}}$$
$$= \frac{\int_0^1\!\int_0^1 u\, v\, c(u, v)\, du\, dv - (1/2)(1/2)}{\sqrt{(1/12)(1/12)}} \quad \text{in light of (8),}$$
with the transformation $\{x = F^{-1}(u) \text{ and } y = G^{-1}(v)\}$, whose Jacobian is the inverse of the Jacobian associated with the transformation $\{u = F(x) \text{ and } v = G(y)\}$, that is, $1/[f(F^{-1}(u))\, g(G^{-1}(v))]$,
$$= 12 \int_0^1\!\int_0^1 C(u, v)\, du\, dv - 3$$
$$= 12\, E[UV] - 3,$$
where $C(\cdot,\cdot)$ and $c(\cdot,\cdot)$, respectively, denote the copula and copula density function associated with (X, Y), and $\mathbb{R}$ represents the set of real numbers. In [7,8], it is taken as a given that the double integral appearing in (16) can be expressed as that appearing in (17). We now prove that this is indeed the case. First, recall that $\frac{\partial^2 C(u,v)}{\partial u\, \partial v} = c(u, v)$, the copula density function. On integrating by parts twice, one has
$$\begin{aligned}
\int_0^1\!\int_0^1 u\, v\, dC(u, v) &= \int_0^1\!\int_0^1 u\, v\, \frac{\partial^2 C(u, v)}{\partial u\, \partial v}\, dv\, du \\
&= \int_0^1 u \int_0^1 v\, \frac{\partial}{\partial v}\Big(\frac{\partial C(u, v)}{\partial u}\Big)\, dv\, du \\
&= \int_0^1 u \left[ v\, \frac{\partial C(u, v)}{\partial u} \Big|_0^1 - \int_0^1 \frac{\partial C(u, v)}{\partial u}\, dv \right] du \\
&= \int_0^1 u \left[ 1 - \int_0^1 \frac{\partial C(u, v)}{\partial u}\, dv \right] du, \quad \text{as } C(u, 1) = u \\
&= \int_0^1 u\, du - \int_0^1\!\int_0^1 u\, \frac{\partial C(u, v)}{\partial u}\, du\, dv \\
&= \frac{1}{2} - \int_0^1 \left[ u\, C(u, v) \Big|_0^1 - \int_0^1 C(u, v)\, du \right] dv \\
&= \frac{1}{2} - \frac{1}{2} + \int_0^1\!\int_0^1 C(u, v)\, du\, dv, \quad \text{as } C(1, v) = v \\
&= \int_0^1\!\int_0^1 C(u, v)\, du\, dv.
\end{aligned}$$
Now, let $(X_1, Y_1), \ldots, (X_n, Y_n)$ be a random sample generated from the random vector (X, Y), and denote by $\hat{F}(X)$ and $\hat{G}(Y)$ the respective empirical distribution functions of X and Y. Throughout this article, the sample size is assumed to be n. On denoting by $R_i$ and $S_j$ the rank of $X_i$ among $\{X_1, \ldots, X_n\}$ and the rank of $Y_j$ among $\{Y_1, \ldots, Y_n\}$, respectively, one has $\hat{F}(X_i) = R_i/n \equiv U_i$ and $\hat{G}(Y_j) = S_j/n \equiv V_j$, where $U_i$ and $V_j$ denote the canonical pseudo-observations on each component. Note that the rank averages $\bar{R}$ and $\bar{S}$ are both equal to $(n+1)/2$. Then, Spearman’s rank correlation estimator admits the following equivalent representations:
$$\hat{\rho}_S = \frac{\sum_{i=1}^n (R_i - \bar{R})(S_i - \bar{S})}{\sqrt{\sum_{i=1}^n (R_i - \bar{R})^2\, \sum_{i=1}^n (S_i - \bar{S})^2}}$$
$$= \frac{\big(\sum_{i=1}^n R_i S_i\big) - n \bar{R} \bar{S}}{\sqrt{\big[\big(\sum_{i=1}^n R_i^2\big) - n \bar{R}^2\big]\big[\big(\sum_{i=1}^n S_i^2\big) - n \bar{S}^2\big]}}$$
$$= \frac{\big(\sum_{i=1}^n \hat{F}(x_i)\, \hat{G}(y_i)\big) - (n+1)^2/(4n)}{\sqrt{\big[\big(\sum_{i=1}^n \hat{F}(x_i)^2\big) - (n+1)^2/(4n)\big]\big[\big(\sum_{i=1}^n \hat{G}(y_i)^2\big) - (n+1)^2/(4n)\big]}}$$
$$= \frac{\big(\sum_{i=1}^n U_i V_i\big) - (n+1)^2/(4n)}{\sqrt{\big[\big(\sum_{i=1}^n U_i^2\big) - (n+1)^2/(4n)\big]\big[\big(\sum_{i=1}^n V_i^2\big) - (n+1)^2/(4n)\big]}}$$
$$= \frac{\sum_{i=1}^n (U_i - \bar{U})(V_i - \bar{V})}{\sqrt{\sum_{i=1}^n (U_i - \bar{U})^2\, \sum_{i=1}^n (V_i - \bar{V})^2}},$$
where $\bar{U} = \sum_{i=1}^n U_i / n$ and $\bar{V} = \sum_{i=1}^n V_i / n$.
Of course, (24) readily follows from (20), and it is seen from either one of these expressions that Spearman’s rank correlation is not affected by any monotonic affine transformation, whether applied to the ranks or to the canonical pseudo-observations. As pointed out for instance in [9], the pseudo-observations are frequently taken to be
$$\hat{U}_i = \frac{R_i}{n+1} = \frac{n}{n+1}\, \hat{F}(x_i) = \frac{1}{n+1} \sum_{k=1}^n I(x_k \le x_i)$$
and
$$\hat{V}_j = \frac{S_j}{n+1} = \frac{n}{n+1}\, \hat{G}(y_j) = \frac{1}{n+1} \sum_{k=1}^n I(y_k \le y_j).$$
Alternatively, one can define the pseudo-observations so that they be uniformly—and less haphazardly—distributed over the unit interval as follows:
$$\tilde{U}_i = \frac{R_i}{n} - \frac{1}{2n} = \hat{F}(x_i) - \frac{1}{2n}$$
and
$$\tilde{V}_j = \frac{S_j}{n} - \frac{1}{2n} = \hat{G}(y_j) - \frac{1}{2n}.$$
In a simulation study, Dias (2022) [10] observed that such pseudo-observations have a lower bias than those obtained by dividing the ranks by n + 1. What is more, it should be observed that if we extend each of the pseudo-observations $\tilde{U}_i$, $i = 1, \ldots, n$, and $\tilde{V}_j$, $j = 1, \ldots, n$, by $\frac{1}{2n}$ on each side and assign their respective probability, namely, $\frac{1}{n}$, to each of the n resulting subintervals, the marginal distributions are then uniform on the interval [0, 1], which happens to be a requirement for a copula density function. However, this is not the case for any other affine transformation of the ranks. The alternative transformations $(\mathrm{rank} - 1/3)/(n + 1/3)$ and $(\mathrm{rank} - 1)/(n - 1)$ were also considered by [10,11], respectively. As established in [10], the pseudo-observation estimators resulting from any of the above-mentioned transformations, as well as the canonical pseudo-observations, are consistent estimators of the underlying distribution functions.
Kojadinovic and Yan (2010) [7] pointed out that $\hat{\rho}_S$, as specified in (21), can also be expressed as
$$\hat{\rho}_S = \frac{12}{n(n+1)(n-1)} \sum_{i=1}^n R_i S_i - 3\, \frac{n+1}{n-1},$$
where ρ ^ S is a consistent estimator of ρ S .
Moreover, it can be algebraically shown that, alternatively,
$$\hat{\rho}_S = 1 - \frac{6 \sum_{i=1}^n (R_i - S_i)^2}{n(n^2 - 1)}$$
when the ranks are distinct integers.
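As a quick illustration, the rank-difference formula above can be implemented directly; the helper names are illustrative and distinct observations (no ties) are assumed:

```python
def ranks(xs):
    """1-based rank of each value within the sample (distinct values assumed)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0] * len(xs)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman_rho(xs, ys):
    """rho_hat = 1 - 6 * sum_i (R_i - S_i)^2 / (n (n^2 - 1))."""
    n = len(xs)
    r, s = ranks(xs), ranks(ys)
    d2 = sum((ri - si) ** 2 for ri, si in zip(r, s))
    return 1 - 6 * d2 / (n * (n * n - 1))
```

The estimator returns 1 for comonotonic samples and −1 for countermonotonic ones, in line with the population interpretation of ρ_S.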
On writing (17) as
$$\rho_S = 12 \int_0^1\!\int_0^1 u\, v\, dC(u, v) - 3,$$
and replacing $C(u, v)$ by $\hat{C}(u, v)$ as defined in (13), the double integral becomes
$$\frac{1}{n} \sum_{i=1}^n \int_0^1 u\, d\big(I(r_i/n \le u)\big) \int_0^1 v\, d\big(I(s_i/n \le v)\big).$$
For instance, on integrating the first integral by parts, one has
$$u\, I(r_i/n \le u) \Big|_0^1 - \int_0^1 I(r_i/n \le u)\, du = 1 - (1 - r_i/n) = r_i/n.$$
Thus, the resulting estimator of Spearman’s rank correlation is given by
$$\hat{\rho}_S = \frac{12}{n^3} \sum_{i=1}^n R_i S_i - 3,$$
which is approximately equal to that given in (29).
Now, letting $C_\theta(u, v)$ be a copula whose functional representation is known, and assuming that it is a one-to-one function of the dependence parameter θ, it follows from (17) that
$$\rho_S(\theta) = 12 \int_{\mathbb{I}^2} C_\theta(u, v)\, du\, dv - 3,$$
which provides an indication of the extent to which the variables are monotonically related. Moreover, since $\hat{\rho}_S$, as defined in (21), (29) or (30), converges to $\rho_S(\theta)$, $\hat{\theta} = \rho_S^{-1}(\hat{\rho}_S)$ can serve as an estimate of θ.
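To make this concrete, the sketch below evaluates $\rho_S(\theta)$ by a midpoint-rule double integral for the Farlie–Gumbel–Morgenstern (FGM) copula $C_\theta(u, v) = uv + \theta uv(1-u)(1-v)$; the FGM family is our illustrative choice, not one singled out in the text, and for it $\rho_S(\theta) = \theta/3$, so inverting for θ is immediate:

```python
def rho_S_fgm(theta, m=200):
    """Spearman's rho for the FGM copula: 12 * double-integral of C_theta - 3,
    approximated with an m-by-m midpoint rule over the unit square."""
    h = 1.0 / m
    total = 0.0
    for i in range(m):
        u = (i + 0.5) * h
        for j in range(m):
            v = (j + 0.5) * h
            total += u * v + theta * u * v * (1 - u) * (1 - v)
    return 12 * total * h * h - 3
```

Since $\rho_S(\theta) = \theta/3$ on this family, $\hat{\theta} = 3\hat{\rho}_S$ whenever $|3\hat{\rho}_S| \le 1$.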
It follows from (17) that Spearman’s ρ can be expressed as
$$\rho_S = 12 \int_{\mathbb{I}^2} [C(u, v) - u v]\, du\, dv.$$
On replacing $[C(u, v) - u v]$ in (33) by $|C(u, v) - u v|$, one obtains a measure based on the $L_1$ distance between the copula C and the product copula $\Pi(u, v) = u v$ [5]. This is the so-called Schweizer–Wolff’s sigma as defined in [12], which is given by
$$\sigma_{X,Y} = \sigma_C = 12 \int_{\mathbb{I}^2} |C(u, v) - u v|\, du\, dv.$$
The expression (34) is a measure of dependence which satisfies the properties of Rényi’s axioms [13] for measures of dependence [12], [14] (p. 145).
Note that Pearson’s correlation coefficient,
r ^ = i = 1 n ( x i x ¯ ) ( y i y ¯ ) i = 1 n ( x i x ¯ ) 2 i = 1 n ( y i y ¯ ) 2 ,
only measures the strength of a linear relationship between X and Y, whereas Spearman’s rank correlation $\rho_S$ assesses the strength of any monotonic relationship between X and Y. The latter is always well-defined, which is not the case for the former. Both vary between −1 and 1, and $\rho_S = \pm 1$ indicates that Y is either an increasing or a decreasing function of X. Moreover, it should be noted that Pearson’s correlation coefficient cannot be expressed in terms of copulas since its estimator is a function of the observations themselves rather than their ranks.
The next three sections include results that were gleaned from the following books among others: [4,5,15,16].

3. Kendall’s Rank Correlation Coefficient

Kendall’s τ , also referred to as Kendall’s rank correlation coefficient, was introduced by [17]. Maurice Kendall also proposed an estimate thereof and published several papers as well as a monograph in connection with certain ordinal measures of correlation. Further historical details are available from [18].
Kendall’s τ is a nonparametric measure of association between two variables, which is based on the number of concordant pairs minus the number of discordant pairs. Consider two observations $(x_i, y_i)$ and $(x_j, y_j)$, with $i, j \in \{1, \ldots, n\}$ such that $i \ne j$, that are generated from a vector (X, Y) of continuous random variables. Then, for any such assignment of pairs, define each pair as being concordant, discordant or equal, as follows:
$(x_i, y_i)$ and $(x_j, y_j)$ are concordant if
$\{x_i < x_j \text{ and } y_i < y_j\}$ or if $\{x_i > x_j \text{ and } y_i > y_j\}$, or equivalently,
$(x_i - x_j)(y_i - y_j) > 0$, i.e., the slope of the line connecting the two points is positive.
$(x_i, y_i)$ and $(x_j, y_j)$ are discordant if
$\{x_i < x_j \text{ and } y_i > y_j\}$ or if $\{x_i > x_j \text{ and } y_i < y_j\}$, or equivalently,
$(x_i - x_j)(y_i - y_j) < 0$, i.e., the slope of the line connecting the two points is negative.
$(x_i, y_i)$ and $(x_j, y_j)$ are equal if $x_i = x_j$ or $y_i = y_j$. Actually, pair equality can be disregarded as the random variables X and Y are assumed to be continuous.

3.1. The Empirical Kendall’s τ

Let $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$ be a random sample of n pairs arising from the vector (X, Y) of continuous random variables. There are $\binom{n}{2} = n(n-1)/2$ possible ways of selecting distinct pairs $(x_i, y_i)$ and $(x_j, y_j)$ of observations in the sample, with each pair being either concordant or discordant.
Let $S_{ij}$ be defined as follows:
$$S_{ij} = \mathrm{sign}(X_i - X_j)\, \mathrm{sign}(Y_i - Y_j),$$
where
$$\mathrm{sign}(u) = \begin{cases} -1 & \text{if } u < 0 \\ 0 & \text{if } u = 0 \\ 1 & \text{if } u > 0. \end{cases}$$
Then, the values that $S_{ij}$ can take on are
$$s_{ij} = \begin{cases} -1 & \text{when the pairs are discordant} \\ 0 & \text{when the pairs are neither concordant nor discordant} \\ 1 & \text{when the pairs are concordant.} \end{cases}$$
Kendall’s sample $\hat{\tau}$ is defined as follows:
$$\hat{\tau} = \frac{\sum_{1 \le i < j \le n} s_{ij}}{\binom{n}{2}} = \frac{2}{n(n-1)} \sum_{1 \le i < j \le n} s_{ij}.$$
Alternatively, on letting c denote the number of concordant pairs and d the number of discordant pairs in a given sample of size n, one can express the estimate of Kendall’s τ as
$$\hat{\tau} = \frac{c - d}{c + d} = \frac{c - d}{\binom{n}{2}} = \frac{2(c - d)}{n(n-1)}.$$
As it is assumed that there can be no equality between pairs, $\binom{n}{2} = c + d$, so that
$$\hat{\tau} = \frac{4c}{n(n-1)} - 1 \quad \text{or, equivalently,} \quad \hat{\tau} = 1 - \frac{4d}{n(n-1)}.$$
In fact, $\hat{\tau}$ is an unbiased estimator of τ. As well, Kendall and Gibbons (1990) [19] (Chapter 5) established that, under independence, $\mathrm{Var}(\hat{\tau}) = \frac{2(2n+5)}{9n(n-1)}$. A coefficient related to that specified in (39) was discussed in [20,21,22] in the context of double time series.
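A direct O(n²) implementation of the concordance counts just described (illustrative names; ties are assumed absent, and the variance formula is the one under independence):

```python
def kendall_tau(xs, ys):
    """tau_hat = 2 (c - d) / (n (n - 1)), counting concordant and
    discordant pairs over all i < j (continuous data, so no ties)."""
    n = len(xs)
    c = d = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            if prod > 0:
                c += 1          # concordant pair
            elif prod < 0:
                d += 1          # discordant pair
    return 2 * (c - d) / (n * (n - 1))

def tau_null_variance(n):
    """Var(tau_hat) under independence, per Kendall and Gibbons (1990)."""
    return 2 * (2 * n + 5) / (9 * n * (n - 1))
```

For large samples, faster O(n log n) merge-sort-based counting is standard, but the quadratic loop mirrors the definition most transparently.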

3.2. The Population Kendall’s τ

Letting $(X_1, Y_1)$ and $(X_2, Y_2)$ be independent and identically distributed random vectors, with the joint distribution function of $(X_i, Y_i)$ being $H(x, y)$, $F(x)$ and $G(y)$ denoting the respective distribution functions of $X_i$ and $Y_j$, $i, j = 1, 2$, and the associated copula being $C(u, v) = H(F^{-1}(u), G^{-1}(v))$, the population Kendall’s τ is defined as follows:
$$\tau = \tau_{X,Y} = \Pr[\text{concordant pairs}] - \Pr[\text{discordant pairs}] \equiv p_c - p_d$$
$$= \Pr[(X_1 - X_2)(Y_1 - Y_2) > 0] - \Pr[(X_1 - X_2)(Y_1 - Y_2) < 0]$$
$$= 2 \Pr[(X_1 - X_2)(Y_1 - Y_2) > 0] - 1$$
$$= 4 \Pr[X_1 < X_2,\, Y_1 < Y_2] - 1$$
$$= 4 \int_{\mathbb{R}^2} \Pr(X_2 \le x,\, Y_2 \le y)\, dH(x, y) - 1, \quad \text{with } H(x, y) = C(F(x), G(y))$$
$$= 4 \int_{\mathbb{R}^2} H(x, y)\, c(F(x), G(y))\, f(x)\, g(y)\, dx\, dy - 1 = 4 \int_{\mathbb{I}^2} H(F^{-1}(u), G^{-1}(v))\, c(u, v)\, \frac{f(F^{-1}(u))\, g(G^{-1}(v))}{f(F^{-1}(u))\, g(G^{-1}(v))}\, du\, dv - 1$$
$$= 4 \int_0^1\!\int_0^1 C(u, v)\, dC(u, v) - 1$$
$$= 4\, E[C(U, V)] - 1,$$
where
U and V have a Uniform(0, 1) distribution, their joint cdf being $C(u, v)$;
$u = F_X(x)$ and $v = F_Y(y)$;
$\mathbb{R}^2 \equiv \{(x, y) \mid x \text{ and } y \text{ are real numbers}\}$;
$dC(u, v) = \frac{\partial^2 C(u, v)}{\partial u\, \partial v}\, du\, dv = c(u, v)\, du\, dv.$
Clearly, (41) follows from (40) since
$$\Pr[(X_1 - X_2)(Y_1 - Y_2) < 0] = 1 - \Pr[(X_1 - X_2)(Y_1 - Y_2) > 0].$$
We now state Theorem 5.1.1 from [5]:
Theorem 2. 
Let $(X_1, Y_1)$ and $(X_2, Y_2)$ be independent vectors of continuous random variables with joint distribution functions $H_1$ and $H_2$, respectively, with common marginals $F(\cdot)$ and $G(\cdot)$. Let $C_1$ and $C_2$ be the copulas of $(X_1, Y_1)$ and $(X_2, Y_2)$, respectively, so that $H_1(x, y) = C_1(F(x), G(y))$ and $H_2(x, y) = C_2(F(x), G(y))$. Let
$$Q = P[(X_1 - X_2)(Y_1 - Y_2) > 0] - P[(X_1 - X_2)(Y_1 - Y_2) < 0].$$
Then,
$$Q(C_1, C_2) = 4 \int_{\mathbb{I}^2} C_2(u_1, u_2)\, dC_1(u_1, u_2) - 1.$$
If X and Y are continuous random variables whose copula is C, then Equation (44) follows from (40), (46) and (47).

3.3. Marginal Probability of $S_{ij}$

The marginal probability of $S_{ij}$ is
$$p_{S_{ij}}(s_{ij}) = \begin{cases} p_c, & s_{ij} = 1 \\ p_d, & s_{ij} = -1 \\ 1 - p_c - p_d, & s_{ij} = 0. \end{cases}$$
Gibbons and Chakraborti (2003) [15] proved that
$$E(S_{ij}) = 1 \cdot p_c + (-1) \cdot p_d = \tau.$$

3.4. Certain Properties of τ

The correlation coefficient τ is invariant with respect to strictly increasing transformations.
If $X_1$ and $Y_1$ are independent, then the value of τ is zero:
$$\tau(X_1, Y_1) = 2 \Pr[(X_1 - X_2)(Y_1 - Y_2) > 0] - 1 = 2 \{\Pr[X_1 - X_2 > 0,\, Y_1 - Y_2 > 0] + \Pr[X_1 - X_2 < 0,\, Y_1 - Y_2 < 0]\} - 1 = 2 \left( \tfrac{1}{4} + \tfrac{1}{4} \right) - 1 = 0.$$
Kendall’s τ takes on values in the interval [−1, 1].
As stated in [4], when the number of discordant pairs is 0, the value of τ is maximal and equals 1, which indicates a perfect relationship; the variables are then comonotonic, i.e., one variable is an increasing transform of the other. If the variables are countermonotonic, i.e., one variable is a decreasing transform of the other, the correlation coefficient τ equals −1. Note that these two properties do not hold for Pearson’s correlation coefficient. Moreover, it proves more appropriate to make use of Kendall’s τ when the joint distribution is not Gaussian.

4. Blomqvist’s Correlation Coefficient

Blomqvist (1950) [23] proposed a measure of dependence that was similar in its structure to Kendall’s correlation coefficient, except that in this instance, medians were utilized. Blomqvist’s correlation coefficient can be defined as follows:
$$\beta = \beta_{X,Y} = P[(X - F_X^{-1}(1/2))(Y - G_Y^{-1}(1/2)) > 0] - P[(X - F_X^{-1}(1/2))(Y - G_Y^{-1}(1/2)) < 0],$$
where $F_X^{-1}(1/2) \equiv \tilde{x}$ and $G_Y^{-1}(1/2) \equiv \tilde{y}$ are the respective medians of X and Y, which explains why this coefficient is also known as the median correlation coefficient.
Now, letting X and Y be continuous random variables whose joint cdf is $H(\cdot,\cdot)$, $F(\cdot)$ and $G(\cdot)$ denote the respective marginal cdfs, and $C(\cdot,\cdot)$ be the associated copula, then
$$F(\tilde{x}) = F(F_X^{-1}(1/2)) = 1/2, \qquad G(\tilde{y}) = G(G_Y^{-1}(1/2)) = 1/2,$$
and
$$\beta = \beta_{X,Y} = 2 \Pr[(X - F_X^{-1}(1/2))(Y - G_Y^{-1}(1/2)) > 0] - 1$$
$$= 2 \{\Pr[X < F_X^{-1}(1/2),\, Y < G_Y^{-1}(1/2)] + \Pr[X > F_X^{-1}(1/2),\, Y > G_Y^{-1}(1/2)]\} - 1$$
$$= 4 H(F_X^{-1}(1/2), G_Y^{-1}(1/2)) - 1$$
$$= 4 C(1/2, 1/2) - 1.$$
In the development of these equations, the following relationships were utilized in addition to $H(x, y) = C(F(x), G(y))$:
$$P[(X - F_X^{-1}(1/2))(Y - G_Y^{-1}(1/2)) > 0] = P[X - F_X^{-1}(1/2) > 0,\, Y - G_Y^{-1}(1/2) > 0] + P[X - F_X^{-1}(1/2) < 0,\, Y - G_Y^{-1}(1/2) < 0];$$
$$P[X > F_X^{-1}(1/2),\, Y > G_Y^{-1}(1/2)] = P[X < F_X^{-1}(1/2),\, Y < G_Y^{-1}(1/2)].$$

4.1. Estimation of β

Let x ˜ n and y ˜ n be the respective medians of the samples x 1 , , x n and y 1 , , y n . The computation of Blomqvist’s correlation coefficient is based on a 2 × 2 contingency table that is constructed from these two samples.
According to Blomqvist’s suggestion, the x y -plane is divided into four regions by drawing the lines x = x ˜ n and y = y ˜ n . Let n 1 and n 2 be the number of points belonging to the first or third quadrant and to the second or fourth quadrant, respectively.
Blomqvist’s sample $\beta_n$, or the median correlation coefficient, is defined by
$$\beta_n = \frac{n_1 - n_2}{n_1 + n_2} = \frac{2 n_1}{n_1 + n_2} - 1.$$
If the sample size n is even, then clearly, no sample points fall on the lines x = x ˜ n and y = y ˜ n . Moreover, n 1 and n 2 are then both even. However, if n is odd, then one or two sample points must fall on the lines x = x ˜ n and y = y ˜ n . In the first case (a single point lying on a median), Blomqvist proposed that this point shall not be counted. For the second case, one point has to fall on each line; then, one of the points is assigned to the quadrant touched by both points, while the other is not counted.
Genest et al. (2013) [24] provided an accurate interpretation of β n as “the difference between the proportion of sample points having both components either smaller or greater than their respective medians, and the proportion of the other sample points”. Finally, as pointed out by [23], the definition of β n as given in (55) was not new [25]; however, its statistical properties had not been previously fully investigated.
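The estimator $\beta_n$ can be sketched as follows (the function name is illustrative; only the simple discard rule for a point lying on a median line is implemented, not the two-point reassignment rule described above):

```python
import statistics

def blomqvist_beta(xs, ys):
    """beta_n = (n1 - n2) / (n1 + n2), where n1 counts sample points in the
    first or third quadrant about the sample medians, and n2 counts those
    in the second or fourth quadrant."""
    xm, ym = statistics.median(xs), statistics.median(ys)
    n1 = n2 = 0
    for x, y in zip(xs, ys):
        dx, dy = x - xm, y - ym
        if dx == 0 or dy == 0:
            continue  # point on a median line: left uncounted (simplified rule)
        if dx * dy > 0:
            n1 += 1   # first or third quadrant
        else:
            n2 += 1   # second or fourth quadrant
    return (n1 - n2) / (n1 + n2)
```

Comonotonic samples yield 1 and countermonotonic samples yield −1, matching the population bounds of β.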

4.2. Some Properties of Blomqvist’s Correlation Coefficient

The coefficient β is invariant under strictly increasing transformations of X and Y.
The correlation coefficient β takes on values in the interval [−1, 1].
If X and Y are independent, then $C(1/2, 1/2) = (1/2)(1/2) = 1/4$, so that β = 0.

5. Hoeffding’s Dependence Index

To measure the strength of relationships that are not necessarily monotonic, one may make use of Hoeffding’s dependence coefficient. Letting $H(x, y)$ denote the joint distribution function of X and Y, and $F(x)$ and $G(y)$ stand for the marginal distribution functions of X and Y, Hoeffding’s nonparametric rank statistic for testing bivariate independence is based on
$$D(x, y) = H(x, y) - F(x)\, G(y),$$
which is equal to zero if and only if X and Y are independently distributed.
The nonparametric estimator of the quantity $D^2 = 30 \int_{\mathbb{R}^2} D^2(x, y)\, dH(x, y)$ results in the statistic
$$\hat{D}^2 = 30\, \frac{Q - 2(n-2)\,R + (n-2)(n-3)\,S}{n(n-1)(n-2)(n-3)(n-4)},$$
where
$$Q = \sum_{i=1}^n (R_i - 1)(R_i - 2)(S_i - 1)(S_i - 2),$$
$$R = \sum_{i=1}^n (R_i - 2)(S_i - 2)\, C_i,$$
and
$$S = \sum_{i=1}^n (C_i - 1)\, C_i,$$
with $R_i$ and $S_j$ representing the rank of $X_i$ among $\{X_1, \ldots, X_n\}$ and the rank of $Y_j$ among $\{Y_1, \ldots, Y_n\}$, respectively, and $C_i$ denoting the number of bivariate observations $(X_j, Y_j)$ for which $X_j < X_i$ and $Y_j < Y_i$.
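The statistic can be transcribed as follows (illustrative function name; ties are assumed absent, n must exceed 4, and $C_i$ is computed with strict inequalities, the convention under which the estimator attains 1 for perfectly monotone data):

```python
def hoeffding_D(xs, ys):
    """Hoeffding's D-hat statistic for testing bivariate independence.
    Assumes no ties and n >= 5 (the denominator involves n - 4)."""
    n = len(xs)
    # Ranks R_i, S_i (1-based, via pairwise comparison; O(n^2) but transparent).
    r = [sum(1 for xj in xs if xj <= xi) for xi in xs]
    s = [sum(1 for yj in ys if yj <= yi) for yi in ys]
    # C_i: number of points strictly below and to the left of point i.
    c = [sum(1 for j in range(n) if xs[j] < xs[i] and ys[j] < ys[i])
         for i in range(n)]
    Q = sum((r[i] - 1) * (r[i] - 2) * (s[i] - 1) * (s[i] - 2) for i in range(n))
    R = sum((r[i] - 2) * (s[i] - 2) * c[i] for i in range(n))
    S = sum((c[i] - 1) * c[i] for i in range(n))
    num = Q - 2 * (n - 2) * R + (n - 2) * (n - 3) * S
    return 30 * num / (n * (n - 1) * (n - 2) * (n - 3) * (n - 4))
```

Both increasing and decreasing functional dependence yield the value 1, in keeping with the symmetric nature of the index.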
We now state Hoeffding’s Lemma [26]: Let X and Y be random variables with joint distribution function $H(x, y)$ and marginal distribution functions $F(x)$ and $G(y)$. If $E(XY)$, $E(X)$ and $E(Y)$ are finite, then
$$\mathrm{Cov}(X, Y) = \int_{\mathbb{R}^2} [H(x, y) - F(x)\, G(y)]\, dx\, dy.$$
This result became known when it was cited by [27]. Refs. [28,29] discussed multivariate versions of this lemma.
The correlation coefficient is thus given by
$$\mathrm{Cor}(X, Y) = \frac{\int_{\mathbb{R}^2} [H(x, y) - F(x)\, G(y)]\, dx\, dy}{\sqrt{\mathrm{Var}(X)\, \mathrm{Var}(Y)}}$$
or
$$\mathrm{Cor}(X, Y) = \frac{\int_{\mathbb{R}^2} [C(F(x), G(y)) - F(x)\, G(y)]\, dx\, dy}{\sqrt{\mathrm{Var}(X)\, \mathrm{Var}(Y)}},$$
with (63) resulting from Sklar’s theorem.
Invoking Hoeffding’s lemma, Hofert et al. (2019) [16] (p. 47) pointed out two fallacies about the uniqueness and independence of random variables. Hoeffding appealed to his lemma to identify the bivariate distributions with given marginal distribution functions F ( x ) and G ( y ) , which minimize or maximize the correlation between X and Y.

Hoeffding’s Φ²

Hoeffding (1940) [26] defined the stochastic dependence index of the random variables X and Y as
$$\Phi_{X,Y}^2 = 90 \int_0^1\!\int_0^1 (C(u, v) - u v)^2\, du\, dv,$$
where
$$\Phi_{X,Y}^2 = \begin{cases} 0 & \text{in the case of independence, since then } C(u, v) = u v, \\ 1 & \text{in the case of monotone dependence,} \\ \Phi^2 \in (0, 1) & \text{otherwise.} \end{cases}$$
Hoeffding (1940) [26] showed that $\Phi_{X,Y}^2$ takes the value one in the cases of monotonically increasing and monotonically decreasing continuous functional dependence; it is otherwise less than one and greater than zero.
Let $\mathbf{X}_1, \ldots, \mathbf{X}_n$ be a simple random sample generated from the two-dimensional random vector $\mathbf{X}$ whose distribution function and copula are denoted by $H(\cdot)$ and $C(\cdot)$, respectively, and assumed to be unknown. The copula C is then estimated by the empirical copula $\hat{C}_n$, which is defined as
$$\hat{C}_n(\mathbf{u}) = \frac{1}{n} \sum_{j=1}^n \prod_{i=1}^2 I(\hat{U}_{ij} \le u_i) \quad \text{for } \mathbf{u} = (u_1, u_2) \in \mathbb{I}^2,$$
with the pseudo-observations $\hat{U}_{ij} = \hat{F}_i(X_{ij})$ for $i = 1, 2$ and $j = 1, \ldots, n$, and $\hat{F}_i(x) = \frac{1}{n} \sum_{j=1}^n I(X_{ij} \le x)$, $x \in \mathbb{R}$. Since $\hat{U}_{ij} = \frac{1}{n}\, (\text{rank of } X_{ij} \text{ in } X_{i1}, \ldots, X_{in})$, statistical inference is based on the ranks of the observations.
A nonparametric estimator of $\Phi^2$ is then obtained by replacing the copula $C(\cdot)$ in (64) by the empirical copula $\hat{C}_n(\cdot)$, i.e.,
$$\hat{\Phi}_n^2 := \Phi^2(\hat{C}_n) = 90 \int_{\mathbb{I}^2} \{\hat{C}_n(\mathbf{u}) - \Pi(\mathbf{u})\}^2\, d\mathbf{u},$$
where $\Pi(\mathbf{u}) = u_1 u_2$ denotes the independence copula.
As explained in [30], this estimator can be evaluated as follows:
$$\hat{\Phi}_n^2 = 90 \left[ \frac{1}{n^2} \sum_{j=1}^n \sum_{k=1}^n \prod_{i=1}^2 \big(1 - \max\{\hat{U}_{ij}, \hat{U}_{ik}\}\big) - \frac{1}{2n} \sum_{j=1}^n \prod_{i=1}^2 \big(1 - \hat{U}_{ij}^2\big) + \Big(\frac{1}{3}\Big)^2 \right].$$
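This closed form transcribes directly into code (illustrative function name; pseudo-observations taken as rank/n, with no ties assumed):

```python
def phi2_hat(xs, ys):
    """Nonparametric estimator of Hoeffding's Phi^2, obtained by plugging
    the empirical copula into 90 * int (C - uv)^2 and evaluating in
    closed form."""
    n = len(xs)
    u = [sum(1 for xj in xs if xj <= xi) / n for xi in xs]  # ranks / n
    v = [sum(1 for yj in ys if yj <= yi) / n for yi in ys]
    t1 = sum((1 - max(u[j], u[k])) * (1 - max(v[j], v[k]))
             for j in range(n) for k in range(n)) / n**2
    t2 = sum((1 - u[j] ** 2) * (1 - v[j] ** 2) for j in range(n)) / (2 * n)
    return 90 * (t1 - t2 + (1 / 3) ** 2)
```

Note the small-sample bias of the plug-in estimator: for a comonotone sample of size n = 3 it evaluates to 35/27 ≈ 1.30, exceeding the population bound of 1, and it approaches 1 only as n grows.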
The asymptotic distribution of Φ ^ n 2 can be deduced from the asymptotic behavior of the empirical copula process which, for instance, has been discussed by [31,32,33].
The quantity Φ X , Y 2 was introduced by [34] without the normalizing factor 90, as a distribution-free statistic for testing the independence of X and Y.
Referring to [12], Nelsen (2006) [5] (p. 210) states that “… any $L_p$ distance should yield a symmetric nonparametric measure of dependence”. For any p, $1 < p < \infty$, the $L_p$ distance between the copula $C(\cdot)$ and the product copula $\Pi(\cdot)$ is given by the following expression:
$$k_p \left( \int_{\mathbb{I}^2} |C(u, v) - u v|^p\, du\, dv \right)^{1/p},$$
where $k_p$ is the normalizing factor. On letting p = 2, one obtains $\Phi_{X,Y}$.

6. Illustrative Examples

In order to compare the measures of association discussed in the previous sections, five two-dimensional data sets exhibiting different patterns, identified by the letters A, B, C, D and E, are considered. The first one is linearly decreasing, in which case Pearson’s correlation ought to be the most appropriate coefficient. The strictly monotonic pattern of the second set ought to be readily detected by Spearman’s, Kendall’s and Blomqvist’s coefficients, whose performance is also assessed on the fourth pattern, which happens to be piecewise monotonic. In the case of patterns C and E, whose points exhibit distinctive patterns, Hoeffding’s measure of dependence is expected to be more suitable than any of the other measures of association.
First, 500 random values of x, denoted by $\mathcal{S}$, were generated within the interval (−3, 3). Now, let
  • $f_A(x) = -x/5 + 1 + \epsilon$,
  • $f_B(x) = x^5 + \epsilon$,
  • $f_C(x) = \sin(x) + \epsilon$,
  • $f_D(x) = -|x^3/2| + \epsilon$ and
  • $f_E(x) = \tan(x)/3 + \epsilon$,
where ϵ represents a slight perturbation consisting of a multiple of random values generated from a uniform distribution on the interval [ 1 , 1 ] . The five resulting data sets, A = { ( x , f A ( x ) ) | x S } , B = { ( x , f B ( x ) ) | x S } , C = { ( cos ( x ) , f C ( x ) ) | x S } , D = { ( x , f D ( x ) ) | x S } and E = { ( x , f E ( x ) ) | x S } are plotted in Figure 1, Figure 2, Figure 3, Figure 4 and Figure 5.
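The setup can be reproduced along the following lines; this is a hypothetical reconstruction, as the seed and the noise scale 0.1 are our assumptions (the text only describes ϵ as a slight perturbation):

```python
import math
import random

random.seed(1)  # assumption: no seed is specified in the text
S = [random.uniform(-3, 3) for _ in range(500)]

def eps(scale=0.1):
    """Slight perturbation: a multiple of a Uniform[-1, 1] draw."""
    return scale * random.uniform(-1, 1)

datasets = {
    "A": [(x, -x / 5 + 1 + eps()) for x in S],             # linearly decreasing
    "B": [(x, x ** 5 + eps()) for x in S],                 # strictly monotonic
    "C": [(math.cos(x), math.sin(x) + eps()) for x in S],  # circular pattern
    "D": [(x, -abs(x ** 3 / 2) + eps()) for x in S],       # piecewise monotonic
    "E": [(x, math.tan(x) / 3 + eps()) for x in S],        # non-monotonic swings
}
```

Each of the five measures of association can then be evaluated on each data set, as summarized in Table 1.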
We then evaluated Spearman’s, Kendall’s, Blomqvist’s and Hoeffding’s statistics, as well as Pearson’s sample correlation coefficient for each data set. Their numerical values and associated p-values are reported in Table 1.
Hoeffding’s statistic strongly rejects the null hypothesis of independence since the p-values are all virtually equal to zero. This correctly indicates that, in all five cases, the variables are functionally related.
As anticipated, Pearson’s correlation coefficient is larger in absolute value in the case of a linear relationship (data set A), with a value of −0.9964, than in the case of a monotonic relationship (data set B), with a value of −0.8207.
Spearman’s, Kendall’s and Blomqvist’s statistics readily detect the monotonic relationships that data sets A and B exhibit. Interestingly, in the case of data set D, which happens to be monotonically increasing and then decreasing, at the 5% significance level, both Spearman’s and Kendall’s statistics manage to reject the independence assumption.
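Computations of this kind can be reproduced along the following lines with SciPy; Hoeffding’s statistic is omitted here because SciPy does not implement it, and the noise level and seed are our own choices, so the numbers differ slightly from those in Table 1. Blomqvist’s statistic is computed directly from its definition as the average sign of agreement about the sample medians.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.uniform(-3.0, 3.0, 500)
y = -x / 5 + 1 + 0.1 * rng.uniform(-1.0, 1.0, 500)   # pattern A (decreasing line, assumed noise)

rho, p_rho = stats.spearmanr(x, y)
tau, p_tau = stats.kendalltau(x, y)
r, p_r = stats.pearsonr(x, y)

# Empirical Blomqvist beta: average sign of (x - median_x)(y - median_y).
beta = np.mean(np.sign((x - np.median(x)) * (y - np.median(y))))

print(rho, tau, r, beta)   # all close to -1 for this decreasing pattern
```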

7. Multivariate Measures of Association

7.1. Blomqvist’s β

Consider the random vector (X_1, X_2, …, X_n) whose joint distribution function is F(x) = P(X_1 ≤ x_1, X_2 ≤ x_2, …, X_n ≤ x_n) and whose continuous marginal distribution functions are F_i(x_i) = P(X_i ≤ x_i) for x_i ∈ ℝ, i ∈ {1, 2, …, n}. We now state Sklar’s Theorem for the multivariate case:
Let F be an n-dimensional continuous distribution function with continuous marginal distribution functions (F_1, F_2, …, F_n). Then, there exists a unique n-copula C: [0,1]^n → [0,1] such that
F(x_1, x_2, …, x_n) = C(F_1(x_1), F_2(x_2), …, F_n(x_n)).
Conversely, if C is an n-copula and F 1 , F 2 , , F n are continuous distribution functions, then the function F is an n-dimensional distribution function with marginal distribution functions ( F 1 , F 2 , , F n ) [5] (Theorem 2.10.9, p. 46).
Clearly, the copula C(·) in Equation (69) is the joint distribution function of the random variables U_i = F_i(X_i), i ∈ {1, 2, …, n}. Observe that C(u) = P(U ≤ u) = F(F_1^{−1}(u_1), F_2^{−1}(u_2), …, F_n^{−1}(u_n)) for all u = (u_1, …, u_n) ∈ [0,1]^n.
Letting W_n(u) = max(u_1 + u_2 + ⋯ + u_n − n + 1, 0) and M_n(u) = min(u_1, u_2, …, u_n), the Fréchet–Hoeffding inequality,
W_n(u) ≤ C(u) ≤ M_n(u),
provides lower and upper bounds for any copula. This inequality is attributed to [26,35]. We note that a related result appeared in [36].
The Fréchet–Hoeffding upper bound is a copula when the random variables are perfectly positively dependent, i.e., they are comonotonic. However, the lower bound is a copula only in the bivariate case [8].
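The inequality can be checked numerically, for instance at random points for the independence copula Π_n with n = 3 (a sanity check of our own, not a proof).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
u = rng.uniform(0.0, 1.0, (10_000, n))   # random points in [0,1]^3

W = np.maximum(u.sum(axis=1) - n + 1, 0.0)   # Fréchet-Hoeffding lower bound W_n
M = u.min(axis=1)                            # Fréchet-Hoeffding upper bound M_n
Pi = u.prod(axis=1)                          # independence copula Pi_n

print(np.all(W <= Pi) and np.all(Pi <= M))   # True
```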
Blomqvist’s β , as given in Equation (52), can also be expressed as
β = [C(1/2, 1/2) − Π(1/2, 1/2) + C̄(1/2, 1/2) − Π̄(1/2, 1/2)] / [M(1/2, 1/2) − Π(1/2, 1/2) + M̄(1/2, 1/2) − Π̄(1/2, 1/2)],
where Π_n(u) = u_1 u_2 ⋯ u_n and the survival function C̄(u) = P(U > u). When C is a copula involving n random variables, Equation (71) can be generalized as follows:
β = [C(1/2) − Π(1/2) + C̄(1/2) − Π̄(1/2)] / [M(1/2) − Π(1/2) + M̄(1/2) − Π̄(1/2)] = k_n (C(1/2) + C̄(1/2) − 2^{1−n}),
where 1/2 = (1/2, 1/2, …, 1/2), k_n = 2^{n−1}/(2^{n−1} − 1), Π(1/2) = 2^{−n}, and M(1/2) = 1/2. When n = 2, one has C(1/2, 1/2) = C̄(1/2, 1/2) for any copula; however, this is not the case for n ≥ 3. The coefficient β can be interpreted as the normalized distance between the copula C and the independence copula Π, evaluated at the center of the unit hypercube.
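An empirical version of this multivariate β can be sketched by estimating C(1/2) and C̄(1/2) with the proportions of observations lying entirely at or below, or strictly above, the componentwise sample medians (a minimal illustration of our own, not the estimator studied in [8]).

```python
import numpy as np

def blomqvist_beta(X):
    """Empirical multivariate Blomqvist beta: k_n * (C(1/2) + Cbar(1/2) - 2^(1-n))."""
    m, n = X.shape
    med = np.median(X, axis=0)
    c_half = np.mean(np.all(X <= med, axis=1))     # estimates C(1/2, ..., 1/2)
    cbar_half = np.mean(np.all(X > med, axis=1))   # estimates Cbar(1/2, ..., 1/2)
    k_n = 2 ** (n - 1) / (2 ** (n - 1) - 1)
    return k_n * (c_half + cbar_half - 2.0 ** (1 - n))

rng = np.random.default_rng(1)
Z = rng.standard_normal((5000, 3))
b_ind = blomqvist_beta(Z)                               # independent components: near 0
b_com = blomqvist_beta(np.column_stack([Z[:, 0]] * 3))  # comonotone components: near 1
print(b_ind, b_com)
```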
Ref. [37] utilized the multivariate Blomqvist measure of dependence to analyze main GDP (gross domestic product) aggregates per capita in the European Union, Germany and Portugal for the period 2008–2019.

7.2. Spearman’s ρ

In the bivariate case, Spearman’s rank correlation can be expressed as
ρ_S = [E(UV) − E(U)E(V)] / [Var(U) Var(V)]^{1/2},
where U and V are uniformly distributed, so that E ( U ) = E ( V ) = 1 / 2 and Var ( U ) = Var ( V ) = 1 / 12 . As previously established,
ρ_S = [∫_0^1 ∫_0^1 uv dC(u,v) − (1/2)^2] / (1/12) = 12 ∫_0^1 ∫_0^1 C(u,v) du dv − 3,
where C is the joint distribution of ( U , V ) .
Employing the same notation as in the previous section, we now present different versions of Spearman’s ρ for the multivariate case:
(i)
Kendall (1970) [38]:
ρ_S4 = h_2 (4 (C_n^2)^{−1} ∑_{i<j} ∫_{[0,1]^2} C_{ij}(u,v) du dv − 1);
(ii)
Ruymgaart and van Zuijlen (1978) [39]:
ρ_S1 = h_n (2^n ∫_{[0,1]^n} C(u) du − 1);
(iii)
Joe (1990) [40]:
ρ_S2 = h_n (2^n ∫_{[0,1]^n} Π(u) dC(u) − 1);
(iv)
Nelsen (2002) [2]:
ρ_S3 = h_n {2^{n−1} [∫_{[0,1]^n} C(u) dΠ(u) + ∫_{[0,1]^n} Π(u) dC(u)] − 1};
where h_n = (n + 1)/(2^n − (n + 1)), C_n^2 = n!/[2!(n − 2)!], and u = (u_1, u_2, …, u_n).
We observe that ρ_S3 appears in [41] (p. 227) as a measure of average upper and lower orthant dependence, and that ρ_S4 constitutes the population version of the weighted average pairwise Spearman’s rho given in Chapter 6 of [38], where C_{ij}(u,v) denotes the bivariate marginal copula of the pair (U_i, U_j) [6] (p. 22).
As obtained by [41] (p. 228), a lower bound for ρ_Si, i ∈ {1, 2, 3}, is given by
[2^n − (n + 1)!] / (n! {2^n − (n + 1)}) for n ≥ 2.
For n = 3, this lower bound is at least equal to −4/3, and for n = 2, we have ρ_S1 = ρ_S2 = ρ_S4. As noted by [42] (p. 787), the aforementioned lower bound may fail to be the best possible.
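Since ∫_{[0,1]^n} C(u) du = E[∏_{i=1}^n (1 − U_i)] for U ~ C, the coefficient ρ_S1 can be estimated from pseudo-observations; the identity and the implementation below are our own sketch, not the paper’s.

```python
import numpy as np
from scipy.stats import rankdata

def spearman_rho_s1(X):
    """Estimate rho_S1 = h_n (2^n * int C(u) du - 1) from pseudo-observations,
    using int_[0,1]^n C(u) du = E[ prod_i (1 - U_i) ] for U ~ C."""
    m, n = X.shape
    U = rankdata(X, axis=0) / (m + 1)   # pseudo-observations in (0, 1)
    h_n = (n + 1) / (2 ** n - (n + 1))
    return h_n * (2 ** n * np.mean(np.prod(1.0 - U, axis=1)) - 1.0)

rng = np.random.default_rng(3)
z = rng.standard_normal(2000)
rho_com = spearman_rho_s1(np.column_stack([z, z, z]))      # comonotone: near 1
rho_ind = spearman_rho_s1(rng.standard_normal((2000, 3)))  # independent: near 0
print(rho_com, rho_ind)
```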
Spearman’s rank correlation can also be expressed as follows for the bivariate case:
ρ_S = [∫_{[0,1]^2} C(u,v) du dv − ∫_{[0,1]^2} Π(u,v) du dv] / [∫_{[0,1]^2} M(u,v) du dv − ∫_{[0,1]^2} Π(u,v) du dv]
= [∫_{[0,1]^2} uv dC(u,v) − ∫_{[0,1]^2} uv dΠ(u,v)] / [∫_{[0,1]^2} uv dM(u,v) − ∫_{[0,1]^2} uv dΠ(u,v)],
where ∫_{[0,1]^2} M(u,v) du dv = 1/3 and ∫_{[0,1]^2} Π(u,v) du dv = 1/4. It is readily seen that representation (79) coincides with that given in (74). The coefficient ρ_S can be interpreted as the normalized average distance between the copula C and the independence copula Π.
Equation (79) suggests the following natural generalization for the multivariate case:
ρ_S = [∫_{[0,1]^n} C(u) du − ∫_{[0,1]^n} Π(u) du] / [∫_{[0,1]^n} M(u) du − ∫_{[0,1]^n} Π(u) du],
which, incidentally, agrees with the representation specified in Equation (76).
For instance, Liebscher (2021) [43] made use of the multivariate Spearman measure of correlation to determine the dependence of a response variable on a set of regressor variables in a nonparametric regression model.

7.3. Kendall’s τ

Joe (1990) [40] provides the following representation of Kendall’s τ for the multivariate case:
τ_{n,c} = (2^{n−1} − 1)^{−1} {2^n ∫_{[0,1]^n} C(u) dC(u) − 1},
which also appears in [5] (p. 231) as a measure of average multivariate total positivity. In fact, Equation (82) generalizes Equation (44), i.e.,
τ_{2,c} = 4 ∫_0^1 ∫_0^1 C(u_1, u_2) dC(u_1, u_2) − 1.
Nelsen (1996) [41] also notes that a lower bound for τ_{n,c} is given by
−(2^{n−1} − 1)^{−1}, since ∫_{[0,1]^n} C(u) dC(u) ≥ 0.
As shown by [42] (Theorem 5.1), and reported by [44] (p. 218), this lower bound is attained if at least one of the bivariate margins of the copula C equals W (the Fréchet–Hoeffding lower bound).
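Since ∫ C dC = E[C(U)] for U ~ C, the coefficient τ_{n,c} can be estimated by counting componentwise-ordered pairs of observations; the following O(m²) sketch is our own, not an estimator from the paper.

```python
import numpy as np

def kendall_tau_nc(X):
    """Empirical tau_{n,c} = (2^(n-1) - 1)^(-1) (2^n * int C dC - 1), with
    int C dC estimated by the fraction of pairs i != j such that X_i <= X_j componentwise."""
    m, n = X.shape
    # leq[i, j] is True iff X_i <= X_j in every coordinate
    leq = np.all(X[:, None, :] <= X[None, :, :], axis=2)
    int_cdc = (leq.sum() - m) / (m * (m - 1))   # remove the m diagonal pairs i = j
    return (2 ** n * int_cdc - 1) / (2 ** (n - 1) - 1)

rng = np.random.default_rng(5)
z = rng.standard_normal(400)
tau_com = kendall_tau_nc(np.column_stack([z, z, z]))     # comonotone: equals 1
tau_ind = kendall_tau_nc(rng.standard_normal((400, 3)))  # independent: near 0
print(tau_com, tau_ind)
```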
Kendall and Babington Smith (1940) [45] introduced an extension of Kendall’s τ as a coefficient of agreement among n ≥ 2 rankings. Another generalization is proposed in [46].
A test based on the nonparametric estimator of the multivariate extension of Kendall’s τ was utilized in [47] to establish links between innovation and higher education in certain regions.

7.4. Hoeffding’s Φ²

Using the same notation as in the previous sections, one can express Hoeffding’s dependence index as follows for the bivariate case:
Φ_{X,Y}² = 90 ∫_0^1 ∫_0^1 {C(u,v) − uv}^2 du dv = 90 ∫_{[0,1]^2} {C(u,v) − Π(u,v)}^2 du dv,
where Φ_{X,Y}² = 0 in the case of stochastic independence, Φ_{X,Y}² = 1 in the case of monotone dependence, and Φ_{X,Y}² ∈ (0, 1) otherwise.
Observe that Φ X , Y 2 = 0 if and only if C = Π .
For the multivariate case, Φ 2 is defined as
Φ² = h_n ∫_{[0,1]^n} {C(u) − Π(u)}^2 du,
where h_n = [∫_{[0,1]^n} {M(u) − Π(u)}^2 du]^{−1} is the normalizing constant.
Gaißer et al. (2010) [30] determined that the inverses of the normalizing constants for the upper and lower bounds are, respectively, given by
(h_n)^{−1} = ∫_{[0,1]^n} {M(u) − Π(u)}^2 du = 2/[(n + 1)(n + 2)] − n!/[2^n ∏_{i=0}^n (i + 1/2)] + (1/3)^n
and
(g_n)^{−1} = ∫_{[0,1]^n} {W(u) − Π(u)}^2 du = 2/(n + 2)! − 2 ∑_{i=0}^n C_n^i (−1)^i/(n + 1 + i)! + (1/3)^n,
where C_n^i = n!/[i!(n − i)!].
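Both closed forms can be evaluated directly; for n = 2, each inverse reduces to 1/90, which is consistent with the normalizing factor 90 appearing in the bivariate Φ_{X,Y}². A small verification of our own:

```python
import math

def h_inv(n):
    # (h_n)^(-1) = 2/((n+1)(n+2)) - n!/(2^n * prod_{i=0}^n (i + 1/2)) + (1/3)^n
    prod = math.prod(i + 0.5 for i in range(n + 1))
    return 2 / ((n + 1) * (n + 2)) - math.factorial(n) / (2 ** n * prod) + (1 / 3) ** n

def g_inv(n):
    # (g_n)^(-1) = 2/(n+2)! - 2 * sum_{i=0}^n C(n,i) (-1)^i / (n+1+i)! + (1/3)^n
    s = sum(math.comb(n, i) * (-1) ** i / math.factorial(n + 1 + i) for i in range(n + 1))
    return 2 / math.factorial(n + 2) - 2 * s + (1 / 3) ** n

print(h_inv(2), g_inv(2), 1 / 90)   # the first two agree with 1/90
```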
For instance, Medovikov and Prokhorov (2017) [48] made use of Hoeffding’s multivariate index to determine the dependence structure of financial assets and evaluate the risk of contagion.

7.5. Note

As pointed out by [49], Spearman’s ρ and Blomqvist’s β can be expressed as follows:
k_n(C) = α_n [∫_{[0,1]^n} (C + σ*) dμ_n − 1/2^{n−1}],
where μ_n is a probability measure on [0,1]^n, whereas Kendall’s τ has the following representation:
τ_n(C) = α_n [∫_{[0,1]^n} C dC − 1/2^n],
with
α_n = (1 + n) 2^{n−1}/[2^n − (1 + n)], for Spearman’s rho;   α_n = 2^{n−1}/(2^{n−1} − 1), for Blomqvist’s beta,
and
α_n = 2^n/(2^{n−1} − 1), for Kendall’s tau.

8. Conclusions

Bivariate and multivariate measures of dependence originally due to Spearman, Kendall, Blomqvist and Hoeffding, as well as related results of interest such as their sample estimators and representations in terms of copulas, were discussed in this paper. Various recent applications were also pointed out. Additionally, a numerical study corroborated the effectiveness of these coefficients of association in assessing dependence with respect to five sets of generated data exhibiting various patterns.
A potential avenue for future research would consist in studying matrix-variate rank correlation measures as was achieved very recently by [50] for the case of Kendall’s τ .

Author Contributions

Conceptualization, M.A. and S.B.P.; methodology, M.A. and S.B.P.; software, S.B.P. and Y.Z.; validation, M.A. and S.B.P.; formal analysis, M.A., S.B.P. and Y.Z.; investigation, M.A., S.B.P. and Y.Z.; resources, M.A. and S.B.P.; writing—original draft preparation, M.A. and S.B.P.; writing—review and editing, M.A. and S.B.P.; visualization, S.B.P. and Y.Z.; supervision, M.A. and S.B.P.; project administration, M.A. and S.B.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Natural Sciences and Engineering Research Council of Canada, grant number R0610A.

Data Availability Statement

No new data were created or analyzed in this study.

Acknowledgments

We would like to express our sincere thanks to two reviewers for their insightful comments and suggestions. The financial support of the Natural Sciences and Engineering Research Council of Canada is gratefully acknowledged by the second author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Sklar, A. Fonctions de répartition à n dimensions et leurs marges. Publ. l’Institut Stat. l’Université Paris 1959, 8, 229–231. [Google Scholar]
  2. Nelsen, R.B. Concordance and copulas: A survey. In Distributions with Given Marginals and Statistical Modelling; Cuadras, C.M., Fortiana, J., Rodriguez-Lallena, J.A., Eds.; Kluwer Academic Publishers: London, UK, 2002; pp. 169–177. [Google Scholar]
  3. Deheuvels, P. La fonction de dépendance empirique et ses propriétés, un test non-paramétrique d’indépendance. Bulletin l’Académie R. Belg. 1979, 65, 274–292. [Google Scholar] [CrossRef]
  4. Joe, H. Multivariate Models and Dependence Concepts; Chapman & Hall: Boca Raton, FL, USA, 1997. [Google Scholar]
  5. Nelsen, R.B. An Introduction to Copulas, 2nd ed.; Springer: New York, NY, USA, 2006. [Google Scholar]
  6. Cherubini, U.; Gobbi, F.; Mulinacci, S.; Romagnoli, S. Dynamic Copula Methods in Finance; John Wiley & Sons: New York, NY, USA, 2012. [Google Scholar]
  7. Kojadinovic, I.; Yan, J. Comparison of three semiparametric methods for estimating dependence parameters in copula models. Insur. Math. Econ. 2010, 47, 52–63. [Google Scholar] [CrossRef]
  8. Schmid, F.; Schmidt, R. Nonparametric inference on multivariate versions of Blomqvist’s beta and related measures of tail dependence. Metrika 2007, 66, 323–354. [Google Scholar] [CrossRef]
  9. Genest, C.; Ghoudi, K.; Rivest, L.-P. A semiparametric estimation procedure of dependence parameters in multivariate families of distributions. Biometrika 1995, 82, 543–552. [Google Scholar] [CrossRef]
  10. Dias, A. Maximum pseudo-likelihood estimation in copula models for small weakly dependent samples. arXiv 2022, arXiv:2208.01322. [Google Scholar]
  11. Kerman, J. A closed-form approximation for the median of the beta distribution. arXiv 2011, arXiv:1111.0433. [Google Scholar]
  12. Schweizer, B.; Wolff, E.F. On nonparametric measures of dependence for random variables. Ann. Stat. 1981, 9, 879–885. [Google Scholar] [CrossRef]
  13. Rényi, A. On measures of dependence. Acta Math. Hung. 1959, 10, 441–451. [Google Scholar] [CrossRef]
  14. Lai, C.D.; Balakrishnan, N. Continuous Bivariate Distributions, 2nd ed.; Springer: New York, NY, USA, 2009. [Google Scholar]
  15. Gibbons, J.D.; Chakraborti, S. Nonparametric Statistical Inference, 4th ed.; Revised and Expanded; Statistics Textbooks and Monographs; Marcel Dekker, Inc.: New York, NY, USA, 2003; Volume 168. [Google Scholar]
  16. Hofert, M.; Kojadinovic, I.; Mächler, M.; Yan, J. Elements of Copula Modeling with R; Springer: New York, NY, USA, 2019. [Google Scholar]
  17. Kendall, M.G. A new measure of rank correlation. Biometrika 1938, 30, 81–93. [Google Scholar] [CrossRef]
  18. Kruskal, W.H. Ordinal measures of association. J. Am. Stat. Assoc. 1958, 53, 814–861. [Google Scholar] [CrossRef]
  19. Kendall, M.G.; Gibbons, J.D. Rank Correlation Methods, 5th ed.; Griffin: London, UK, 1990. [Google Scholar]
  20. Lipps, G.F. Die Bestimmung der Abhängigkeit zwischen den Merkmalen eines Gegenstandes. Berichte Königlich Sächsischen Ges. Wiss. 1905, 57, 1–32. [Google Scholar]
  21. Lipps, G.F. Die Psychischen Massmethoden; (No. 10); F. Vieweg und Sohn: Braunschweig, Germany, 1906. [Google Scholar]
  22. Fechner, G.T. Kollektivmasslehre; Engelmann: Leipzig, Germany, 1897. [Google Scholar]
  23. Blomqvist, N. On a measure of dependence between two random variables. Ann. Math. Stat. 1950, 21, 593–600. [Google Scholar] [CrossRef]
  24. Genest, C.; Carabarín-Aguirre, A.; Harvey, F. Copula parameter estimation using Blomqvist’s beta. J. Société Française Stat. 2013, 154, 5–24. [Google Scholar]
  25. Mosteller, F. On some useful “inefficient” statistics. Ann. Math. Stat. 1946, 17, 377–408. [Google Scholar] [CrossRef]
  26. Hoeffding, W. Masstabinvariante Korrelationstheorie. Schriften Math. Inst. Inst. Angew. Math. Univ. Berlin 1940, 5, 181–233. [Google Scholar]
  27. Lehmann, E.L. Some concepts of dependence. Ann. Math. Stat. 1966, 37, 1137–1153. [Google Scholar] [CrossRef]
  28. Jogdeo, K. Characterizations of independence in certain families of bivariate and multivariate distributions. Ann. Math. Stat. 1968, 39, 433–441. [Google Scholar] [CrossRef]
  29. Block, H.W.; Fang, Z. A multivariate extension of Hoeffding’s lemma. Ann. Probab. 1988, 16, 1803–1820. [Google Scholar] [CrossRef]
  30. Gaißer, S.; Ruppert, M.; Schmid, F. A multivariate version of Hoeffding’s phi-square. J. Multivar. Anal. 2010, 101, 2571–2586. [Google Scholar] [CrossRef]
  31. Gänssler, P.; Stute, W. Seminar on Empirical Processes; DMV Seminar 9; Birkhäuser: Basel, Switzerland, 1987. [Google Scholar]
  32. Van der Vaart, A.W.; Wellner, J.A. Weak Convergence and Empirical Processes; Springer: New York, NY, USA, 1996. [Google Scholar]
  33. Tsukahara, H. Semiparametric estimation in copula models. Can. J. Stat. 2005, 33, 357–375. [Google Scholar] [CrossRef]
  34. Blum, J.R.; Kiefer, J.; Rosenblatt, M. Distribution free tests of independence based on the sample distribution function. Ann. Math. Stat. 1961, 32, 485–498. [Google Scholar] [CrossRef]
  35. Fréchet, M. Sur les tableaux de corrélations dont les marges sont données. Ann. L’Université Lyon 1951, 9, 53–77. [Google Scholar]
  36. Fréchet, M. Généralisations du théorème des probabilités totales. Fundam. Math. 1935, 25, 379–387. [Google Scholar] [CrossRef]
  37. Ferreira, H.; Ferreira, M. Multivariate medial correlation with applications. Depend. Model. 2020, 8, 361–372. [Google Scholar] [CrossRef]
  38. Kendall, M.G. Rank Correlation Methods; Griffin: London, UK, 1970. [Google Scholar]
  39. Ruymgaart, F.H.; van Zuijlen, M.C.A. Asymptotic Normality of Multivariate Linear Rank Statistics in the Non-I.I.D. Case. Ann. Stat. 1978, 6, 588–602. [Google Scholar] [CrossRef]
  40. Joe, H. Multivariate Concordance. J. Multivar. Anal. 1990, 35, 12–30. [Google Scholar] [CrossRef]
  41. Nelsen, R.B. Nonparametric measures of multivariate association. In Distributions with Fixed Marginals and Related Topics; Rüschendorf, L., Schweizer, B., Taylor, M.D., Eds.; Institute of Mathematical Statistics: Hayward, CA, USA, 1996; pp. 223–232. [Google Scholar]
  42. Úbeda-Flores, M. Multivariate versions of Blomqvist’s Beta and Spearman’s footrule. Ann. Inst. Stat. Math. 2005, 57, 781–788. [Google Scholar] [CrossRef]
  43. Liebscher, E. On a multivariate version of Spearman’s correlation coefficient for regression: Properties and Applications. Asian J. Stat. Sci. 2021, 1, 123–150. [Google Scholar]
  44. Schmid, F.; Schmidt, R.; Blumentritt, T.; Gaißer, S.; Ruppert, M. Copula-Based Measures of Multivariate Association in Copula Theory and Its Applications. In Lecture Notes in Statistics; Jaworski, P., Durante, F., Härdle, W., Rychlik, T., Eds.; Springer: Berlin, Germany, 2010; Volume 198. [Google Scholar]
  45. Kendall, M.G.; Babington Smith, B. On the method of paired comparisons. Biometrika 1940, 31, 324–345. [Google Scholar] [CrossRef]
  46. Genest, C.; Nešlehová, J.; Ghorbal, N.B. Estimators based on Kendall’s tau in multivariate copula models. Aust. N. Z. J. Stat. 2011, 53, 157–177. [Google Scholar] [CrossRef]
  47. Ascorbebeitia, J.; Ferreira, E.; Orbe, S. Testing conditional multivariate rank correlations: The effect of institutional quality on factors influencing competitiveness. TEST 2022, 31, 931–949. [Google Scholar] [CrossRef]
  48. Medovikov, I.; Prokhorov, A. A new measure of vector dependence, with applications to financial risk and contagion. J. Financ. Econom. 2017, 15, 474–503. [Google Scholar] [CrossRef]
  49. Taylor, M.D. Multivariate measures of concordance for copulas and their marginals. Depend. Model. 2016, 4, 224–236. [Google Scholar] [CrossRef]
  50. McNeil, A.J.; Nešlehová, J.G.; Smith, A.D. On Attainability of Kendall’s Tau Matrices and Concordance Signatures. J. Multivar. Anal. 2022, 191, 105033. [Google Scholar] [CrossRef]
Figure 1. Plot of data set A.
Figure 2. Plot of data set B.
Figure 3. Plot of data set C.
Figure 4. Plot of data set D.
Figure 5. Plot of data set E.
Table 1. Five statistics and their associated p-values.
Statistic    A                  B                  C                  D                  E
Spearman     {−0.9963, 0}       {−0.9218, 0}       {0.0022, 0.9602}   {0.1028, 0.0215}   {−0.0745, 0.0961}
Kendall      {−0.9456, 0}       {−0.8072, 0}       {0.0071, 0.8136}   {0.0919, 0.0021}   {−0.0350, 0.2419}
Blomqvist    {−0.9600, 0}       {−0.6160, 0}       {0.0320, 0.4209}   {0.0720, 0.0891}   {0.0640, 0.1283}
Hoeffding    {0.8679, 0}        {0.5302, 0}        {0.0472, 0}        {0.1902, 0}        {0.0104, 0}
Pearson      {−0.9964, 0}       {−0.8207, 0}       {0.0202, 0.6529}   {0.0555, 0.2152}   {−0.0092, 0.8377}

Adès, M.; Provost, S.B.; Zang, Y. Four Measures of Association and Their Representations in Terms of Copulas. AppliedMath 2024, 4, 363-382. https://0-doi-org.brum.beds.ac.uk/10.3390/appliedmath4010019
