In probability theory and statistics, a cross-covariance matrix is a matrix whose element in the i, j position is the covariance between the i-th element of a random vector and the j-th element of another random vector. When the two random vectors are the same, the cross-covariance matrix is referred to as the covariance matrix. A random vector is a random variable with multiple dimensions. Each element of the vector is a scalar random variable. Each element has either a finite number of observed empirical values or a finite or infinite number of potential values. The potential values are specified by a theoretical joint probability distribution. Intuitively, the cross-covariance matrix generalizes the notion of covariance to multiple dimensions.
The cross-covariance matrix of two random vectors $\mathbf{X}$ and $\mathbf{Y}$ is typically denoted by $\operatorname{K}_{\mathbf{X}\mathbf{Y}}$ or $\Sigma_{\mathbf{X}\mathbf{Y}}$.
For random vectors $\mathbf{X}$ and $\mathbf{Y}$, each containing random elements whose expected value and variance exist, the cross-covariance matrix of $\mathbf{X}$ and $\mathbf{Y}$ is defined by[1]: 336
$$\operatorname{K}_{\mathbf{X}\mathbf{Y}} = \operatorname{cov}(\mathbf{X},\mathbf{Y}) \mathrel{\stackrel{\mathrm{def}}{=}} \operatorname{E}\!\left[(\mathbf{X}-\mu_{\mathbf{X}})(\mathbf{Y}-\mu_{\mathbf{Y}})^{\mathrm{T}}\right] \qquad \text{(Eq.1)}$$
where $\mu_{\mathbf{X}} = \operatorname{E}[\mathbf{X}]$ and $\mu_{\mathbf{Y}} = \operatorname{E}[\mathbf{Y}]$ are vectors containing the expected values of $\mathbf{X}$ and $\mathbf{Y}$. The vectors $\mathbf{X}$ and $\mathbf{Y}$ need not have the same dimension, and either might be a scalar value.
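Eq.1 translates directly into a sample estimator: center each vector by its sample mean and average the outer products. Below is a minimal NumPy sketch, not taken from the source; the function name cross_covariance and the assumed array shapes are purely illustrative.

```python
import numpy as np

def cross_covariance(X, Y):
    """Sample cross-covariance matrix of two random vectors.

    X : array of shape (N, m) -- N observations of an m-dimensional vector
    Y : array of shape (N, n) -- N observations of an n-dimensional vector
    Returns an (m, n) matrix estimating E[(X - mu_X)(Y - mu_Y)^T].
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    Xc = X - X.mean(axis=0)          # subtract mu_X
    Yc = Y - Y.mean(axis=0)          # subtract mu_Y
    return Xc.T @ Yc / X.shape[0]    # average of outer products
```

This estimator divides by N; dividing by N − 1 instead gives the unbiased variant, which is the default normalization used by np.cov.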
The cross-covariance matrix is the matrix whose $(i,j)$ entry is the covariance
$$\operatorname{K}_{X_i Y_j} = \operatorname{cov}[X_i, Y_j] = \operatorname{E}[(X_i - \operatorname{E}[X_i])(Y_j - \operatorname{E}[Y_j])]$$
between the $i$-th element of $\mathbf{X}$ and the $j$-th element of $\mathbf{Y}$. This gives the following component-wise definition of the cross-covariance matrix.
$$\operatorname{K}_{\mathbf{X}\mathbf{Y}} = \begin{bmatrix}
\operatorname{E}[(X_1-\operatorname{E}[X_1])(Y_1-\operatorname{E}[Y_1])] & \operatorname{E}[(X_1-\operatorname{E}[X_1])(Y_2-\operatorname{E}[Y_2])] & \cdots & \operatorname{E}[(X_1-\operatorname{E}[X_1])(Y_n-\operatorname{E}[Y_n])] \\
\operatorname{E}[(X_2-\operatorname{E}[X_2])(Y_1-\operatorname{E}[Y_1])] & \operatorname{E}[(X_2-\operatorname{E}[X_2])(Y_2-\operatorname{E}[Y_2])] & \cdots & \operatorname{E}[(X_2-\operatorname{E}[X_2])(Y_n-\operatorname{E}[Y_n])] \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{E}[(X_m-\operatorname{E}[X_m])(Y_1-\operatorname{E}[Y_1])] & \operatorname{E}[(X_m-\operatorname{E}[X_m])(Y_2-\operatorname{E}[Y_2])] & \cdots & \operatorname{E}[(X_m-\operatorname{E}[X_m])(Y_n-\operatorname{E}[Y_n])]
\end{bmatrix}$$
For example, if $\mathbf{X} = (X_1, X_2, X_3)^{\mathrm{T}}$ and $\mathbf{Y} = (Y_1, Y_2)^{\mathrm{T}}$ are random vectors, then $\operatorname{cov}(\mathbf{X},\mathbf{Y})$ is a $3\times 2$ matrix whose $(i,j)$-th entry is $\operatorname{cov}(X_i, Y_j)$.
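To make the $3\times 2$ example concrete, the sketch below draws samples of $\mathbf{X}=(X_1,X_2,X_3)^{\mathrm{T}}$ and $\mathbf{Y}=(Y_1,Y_2)^{\mathrm{T}}$ and checks the shape of the estimated cross-covariance matrix; the particular construction of $\mathbf{Y}$ is only for illustration and is not from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# X = (X1, X2, X3)^T and Y = (Y1, Y2)^T, built so that Y1 is related to X1
X = rng.normal(size=(N, 3))
Y = np.column_stack([X[:, 0] + rng.normal(size=N),   # Y1 correlated with X1
                     rng.normal(size=N)])            # Y2 independent of X

Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
K_XY = Xc.T @ Yc / N          # estimated cross-covariance matrix
print(K_XY.shape)             # (3, 2)
print(K_XY)                   # first column close to (1, 0, 0)^T, second column close to 0
```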
For the cross-covariance matrix, the following basic properties apply:[2]
$$\operatorname{cov}(\mathbf{X},\mathbf{Y}) = \operatorname{E}[\mathbf{X}\mathbf{Y}^{\mathrm{T}}] - \mu_{\mathbf{X}}\mu_{\mathbf{Y}}^{\mathrm{T}}$$
$$\operatorname{cov}(\mathbf{X},\mathbf{Y}) = \operatorname{cov}(\mathbf{Y},\mathbf{X})^{\mathrm{T}}$$
$$\operatorname{cov}(\mathbf{X_1}+\mathbf{X_2},\mathbf{Y}) = \operatorname{cov}(\mathbf{X_1},\mathbf{Y}) + \operatorname{cov}(\mathbf{X_2},\mathbf{Y})$$
$$\operatorname{cov}(A\mathbf{X}+\mathbf{a},\, B^{\mathrm{T}}\mathbf{Y}+\mathbf{b}) = A\,\operatorname{cov}(\mathbf{X},\mathbf{Y})\,B$$
If $\mathbf{X}$ and $\mathbf{Y}$ are independent (or, less restrictively, if every random variable in $\mathbf{X}$ is uncorrelated with every random variable in $\mathbf{Y}$), then $\operatorname{cov}(\mathbf{X},\mathbf{Y}) = 0_{p\times q}$,
where $\mathbf{X}$, $\mathbf{X_1}$ and $\mathbf{X_2}$ are random $p\times 1$ vectors, $\mathbf{Y}$ is a random $q\times 1$ vector, $\mathbf{a}$ is a $q\times 1$ vector, $\mathbf{b}$ is a $p\times 1$ vector, $A$ and $B$ are $q\times p$ matrices of constants, and $0_{p\times q}$ is a $p\times q$ matrix of zeroes.
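These identities can be checked numerically. The sketch below is illustrative rather than from the source (dimensions and the constant matrices $A$, $B$ are chosen arbitrarily); it verifies the first, second, and fourth properties for sample cross-covariances, which satisfy the same linear identities exactly, up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(1)
N, p, q = 200_000, 3, 2

X = rng.normal(size=(N, p))
Y = X[:, :q] + rng.normal(size=(N, q))     # make Y correlated with X

def cov_xy(X, Y):
    """Sample estimate of cov(X, Y), shape (dim X, dim Y)."""
    return (X - X.mean(0)).T @ (Y - Y.mean(0)) / len(X)

A = rng.normal(size=(q, p))                # A and B are q x p constant matrices
B = rng.normal(size=(q, p))
a = rng.normal(size=q)                     # a is q x 1
b = rng.normal(size=p)                     # b is p x 1

# cov(X, Y) = E[X Y^T] - mu_X mu_Y^T
lhs = X.T @ Y / N - np.outer(X.mean(0), Y.mean(0))
print(np.allclose(lhs, cov_xy(X, Y)))                      # True

# cov(X, Y) = cov(Y, X)^T
print(np.allclose(cov_xy(X, Y), cov_xy(Y, X).T))           # True

# cov(A X + a, B^T Y + b) = A cov(X, Y) B
print(np.allclose(cov_xy(X @ A.T + a, Y @ B + b),
                  A @ cov_xy(X, Y) @ B))                   # True, up to float error
```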
Definition for complex random vectors

If $\mathbf{Z}$ and $\mathbf{W}$ are complex random vectors, the definition of the cross-covariance matrix is slightly changed. Transposition is replaced by Hermitian transposition:
$$\operatorname{K}_{\mathbf{Z}\mathbf{W}} = \operatorname{cov}(\mathbf{Z},\mathbf{W}) \mathrel{\stackrel{\mathrm{def}}{=}} \operatorname{E}\!\left[(\mathbf{Z}-\mu_{\mathbf{Z}})(\mathbf{W}-\mu_{\mathbf{W}})^{\mathrm{H}}\right]$$
For complex random vectors, another matrix called the pseudo-cross-covariance matrix is defined as follows:
$$\operatorname{J}_{\mathbf{Z}\mathbf{W}} = \operatorname{cov}(\mathbf{Z},\overline{\mathbf{W}}) \mathrel{\stackrel{\mathrm{def}}{=}} \operatorname{E}\!\left[(\mathbf{Z}-\mu_{\mathbf{Z}})(\mathbf{W}-\mu_{\mathbf{W}})^{\mathrm{T}}\right]$$
Two random vectors $\mathbf{X}$ and $\mathbf{Y}$ are called uncorrelated if their cross-covariance matrix $\operatorname{K}_{\mathbf{X}\mathbf{Y}}$ is a zero matrix.[1]: 337
Complex random vectors $\mathbf{Z}$ and $\mathbf{W}$ are called uncorrelated if their cross-covariance matrix and pseudo-cross-covariance matrix are both zero, i.e. if $\operatorname{K}_{\mathbf{Z}\mathbf{W}} = \operatorname{J}_{\mathbf{Z}\mathbf{W}} = 0$.
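In the complex case the two matrices differ only in whether the second factor is conjugated. A small NumPy sketch, with a construction of $\mathbf{Z}$ and $\mathbf{W}$ that is purely illustrative and not from the source, estimates both $\operatorname{K}_{\mathbf{Z}\mathbf{W}}$ and $\operatorname{J}_{\mathbf{Z}\mathbf{W}}$:

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100_000

# Complex random vectors Z (dim 2) and W (dim 3) with some shared structure:
# W1 = Z1, W2 = conjugate of Z1, W3 independent of Z
Z = rng.normal(size=(N, 2)) + 1j * rng.normal(size=(N, 2))
W = np.column_stack([Z[:, 0],
                     Z[:, 0].conj(),
                     rng.normal(size=N) + 1j * rng.normal(size=N)])

Zc = Z - Z.mean(axis=0)
Wc = W - W.mean(axis=0)

K_ZW = Zc.T @ Wc.conj() / N   # cross-covariance: Hermitian transpose on the second factor
J_ZW = Zc.T @ Wc / N          # pseudo-cross-covariance: plain transpose

print(K_ZW.round(2))          # (0,0) entry close to 2, (0,1) entry close to 0
print(J_ZW.round(2))          # (0,0) entry close to 0, (0,1) entry close to 2
```

In this construction the ordinary cross-covariance captures the relation between $Z_1$ and $W_1$, while the pseudo-cross-covariance instead captures the relation between $Z_1$ and $W_2 = \overline{Z_1}$, illustrating why both matrices are used to describe the second-order structure of complex random vectors.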