[FDA Book] Chapter 2.1 ~ 2.3 & 8.1 ~ 8.2
2. Tools for exploring functional data
2.1 Introduction
- Defines the notation and concepts used in FDA
- Defines the statistics used in FDA
- See the Appendix for details on matrix decompositions, projections, and the constrained maximization of quadratic forms
2.2 Some notation
2.2.1 Scalars, vectors, functions and matrices
- $x$ : a vector
    - $x_i$ : scalar (the elements of vector $x$)
- $x$ : a function
    - $x(t)$ : scalar (the values of function $x$)
    - $x(\mathbf{t})$ : vector of the values of $x$ at the argument vector $\mathbf{t}$ (the function evaluated at $p$ points)
- If $x_i$ or $x(t)$ is itself a vector, we write $\mathbf{x}_i$ or $\mathbf{x}(t)$
- Abbreviated standard notation is used for recurring quantities:
- $\texttt{Temp}$ : a temperature record
- $\texttt{Knee}$ : a knee angle
- $\texttt{LMSSE}$ : a squared error fitting criterion for a linear model
- $\texttt{RSQ}$ : a squared correlation measure.
2.2.2 Derivatives and integrals
- $D$ : an operator that maps a function $x$ to the function $Dx$
- $D^m x$ : the derivative of order $m$ of the function $x$ (the same as $\frac{d^m x}{dt^m}$)
- $D^0 x = x$, and when $D^{-1}x$ is an indefinite integral of $x$,
$$ D^{1}D^{-1}x=D^0x=x $$
- $\int x$ : shorthand for $\int_{a}^{b} x(t)dt$ when the range of integration over $t$ is clear
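As a quick numerical sketch of this notation (not from the book; the grid and test function are illustrative), `np.gradient` can play the role of $D$ and a cumulative trapezoidal sum the role of $D^{-1}$:

```python
import numpy as np

# Illustrative sketch: D as numerical differentiation, D^{-1} as an
# indefinite integral via a cumulative trapezoidal rule.
t = np.linspace(0.0, 1.0, 1001)
x = t**2                                   # x(t) = t^2, so Dx(t) = 2t

Dx = np.gradient(x, t)                     # numerical derivative D x

# indefinite integral D^{-1} x, anchored at D^{-1}x(0) = 0
Dinv_x = np.concatenate(([0.0], np.cumsum((x[1:] + x[:-1]) / 2 * np.diff(t))))

# D^1 D^{-1} x recovers x up to discretization error
recovered = np.gradient(Dinv_x, t)
```

Applying $D$ after $D^{-1}$ recovers the original function values up to discretization error, mirroring $D^{1}D^{-1}x = x$.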
2.2.3 Inner products
- Inner product for functions
$$ \langle x,y \rangle = \int x(t)y(t)dt $$
- $L_2$ norm
$$ \lVert x \rVert^2=\langle x,x \rangle = \int x^2(t)dt $$
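These inner products can be approximated on a fine grid with the trapezoidal rule; the sketch below (functions and grid are illustrative choices) checks two textbook values:

```python
import numpy as np

# Approximate <x, y> = ∫ x(t) y(t) dt on a fine grid.
t = np.linspace(0.0, np.pi, 2001)
x = np.sin(t)
y = np.cos(t)

def inner(u, v, t):
    """Trapezoidal approximation of the inner product ∫ u(t) v(t) dt."""
    w = u * v
    return float(np.sum((w[1:] + w[:-1]) / 2 * np.diff(t)))

ip = inner(x, y, t)        # ∫_0^π sin·cos dt = 0
sq_norm = inner(x, x, t)   # ||x||² = <x, x> = ∫_0^π sin² dt = π/2
```

Here `ip` is near $0$ and `sq_norm` is near $\pi/2$, matching $\int_0^\pi \sin t \cos t\, dt = 0$ and $\lVert \sin \rVert^2 = \pi/2$.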
2.2.4 Functions of functions
- Functional composition
$ x^*=x \circ h $
- Function value
$ x^*(t)=(x \circ h)(t)=x[h(t)] $
- Inverse function $h^{-1}$
$$ (h \circ h^{-1})(t)=(h^{-1} \circ h)(t)=t $$
- Functional transformations are called operations (or operators), e.g. $D$ : $x \rightarrow Dx$
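A minimal sketch of composition and inversion (the choices of $x$, $h$, and the interval are illustrative):

```python
import numpy as np

# Functional composition x* = x ∘ h: h warps time, x* reads x at warped time.
x = np.sin                               # a function x
h = lambda t: t**2                       # a warping function h on [0, 1]
h_inv = lambda t: np.sqrt(t)             # its inverse on [0, 1]

x_star = lambda t: x(h(t))               # x*(t) = (x ∘ h)(t) = x[h(t)]

t = np.linspace(0.0, 1.0, 5)
composed = x_star(t)                     # equals sin(t²)
identity = h(h_inv(t))                   # (h ∘ h^{-1})(t) = t
```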
2.3 Summary statistics for functional data
2.3.1 Functional means and variances
- Mean function
$$ \bar{x}(t)=\frac{1}{N}\sum_{i=1}^{N}x_i(t) $$
- Variance function
$$ Var_X(t)=\frac{1}{N-1}\sum_{i=1}^{N}[x_i(t)-\bar{x}(t)]^2 $$
2.3.2 Covariance and correlation functions
- Covariance function
$$ Cov_X(t_1, t_2)=\frac{1}{N-1}\sum_{i=1}^{N}[x_i(t_1)-\bar{x}(t_1)][x_i(t_2)-\bar{x}(t_2)] $$
- Correlation function
$$ Corr_X(t_1, t_2)=\frac{Cov_X(t_1, t_2)}{\sqrt{Var_X(t_1)Var_X(t_2)}} $$
2.3.3 Cross-covariance and cross-correlation functions
- Cross-covariance function
$$ Cov_{X,Y}(t_1, t_2)=\frac{1}{N-1}\sum_{i=1}^{N}[x_i(t_1)-\bar{x}(t_1)][y_i(t_2)-\bar{y}(t_2)] $$
- Cross-correlation function
$$ Corr_{X,Y}(t_1, t_2)=\frac{Cov_{X,Y}(t_1, t_2)}{\sqrt{Var_X(t_1)Var_Y(t_2)}} $$
- Contour plots are used to display correlation and cross-correlation functions
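Once curves are sampled on a common grid, the summary statistics of 2.3.1 through 2.3.3 reduce to array operations; the sketch below uses simulated curves (all choices illustrative), with rows as curves and columns as grid values:

```python
import numpy as np

# Simulated curves x_i(t), y_i(t) from two shared random amplitudes.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
N = 50
a = rng.normal(size=(N, 1))
b = rng.normal(size=(N, 1))
X = a * np.sin(2 * np.pi * t) + b * np.cos(2 * np.pi * t)   # x_i(t)
Y = b * np.sin(2 * np.pi * t) + a * np.cos(2 * np.pi * t)   # y_i(t)

x_bar = X.mean(axis=0)                              # mean function
y_bar = Y.mean(axis=0)
var_x = X.var(axis=0, ddof=1)                       # variance function
var_y = Y.var(axis=0, ddof=1)

Xc, Yc = X - x_bar, Y - y_bar
cov_x = Xc.T @ Xc / (N - 1)                         # Cov_X(t1, t2)
corr_x = cov_x / np.sqrt(np.outer(var_x, var_x))    # Corr_X(t1, t2)
cov_xy = Xc.T @ Yc / (N - 1)                        # Cov_{X,Y}(t1, t2)
corr_xy = cov_xy / np.sqrt(np.outer(var_x, var_y))  # Corr_{X,Y}(t1, t2)
```

The matrices `corr_x` and `corr_xy` are exactly the surfaces that the contour plots display.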
8. Principal components analysis for functional data
8.1 Introduction
- After preprocessing and visualization, PCA is used to explore the characteristics of the data
- In classical multivariate analysis, the variance-covariance and correlation structures are often hard to interpret directly
- PCA helps identify the covariance structure that carries useful information
- PCA also lets us anticipate problems that may arise in subsequent analyses (e.g. multicollinearity)
- Functional PCA (FPCA) reveals the data's features better when the curves have been smoothed (smoothing raises a regularization issue)
8.2 Defining functional PCA
8.2.1 PCA for multivariate data
Concept of multivariate PCA
- Linear combination of $X$
$$ f_i = \sum_{j=1}^{p}\beta_j x_{ij}, \ i=1,…,N $$
where $\beta_j$ is a weighting coefficient and $x_{ij}$ is the $i$th observation of the $j$th variable
- Vectorized form
$$ f_i = \boldsymbol{\beta}^{'} \mathbf{x}_i, \ i=1,…,N $$
where $ \boldsymbol{\beta}=(\beta_1,…,\beta_p)^{'}, ~ \mathbf x_i=(x_{i1},…,x_{ip})^{'} $
How to find the PCs
- Find the weight vector $\boldsymbol{\xi}_1 = (\xi_{11},…,\xi_{p1})^{'}$ for
$$ f_{i1}=\sum_j \xi_{j1}x_{ij}=\boldsymbol{\xi}_1^{'} \mathbf{x}_i $$
that maximizes $ \frac{1}{N}\sum_if_{i1}^2 $ subject to $\sum_j \xi_{j1}^2 = \lVert \boldsymbol{\xi}_1 \rVert^2 = 1 $
- Repeat the previous step, now finding $ \boldsymbol{\xi}_2,…,\boldsymbol{\xi}_m $ that additionally satisfy the orthogonality constraints
$$ \sum_j \xi_{jk}\xi_{jm} = \boldsymbol{\xi}_k^{'} \boldsymbol{\xi}_m=0, \ k<m $$
Summary
- Find the unit vector $\boldsymbol{\xi}_1 $ in the direction that maximizes the mean square (the variation across variables)
- From the 2nd PC on, find $ \boldsymbol{\xi}_2,…\boldsymbol{\xi}_k \ (k<p) $ that maximize the mean square while being orthogonal to the previous PC loadings $ \boldsymbol{\xi}_i $
- It is standard to subtract the mean of the data before PCA (centering $\Rightarrow \max MS(f_{ij}) = \max Var(f_{ij}) $)
- The weight vectors $\boldsymbol{\xi}_i$ are not unique (sign change)
- The PC scores $f_{im}$ help explain what the variation means in terms of the features of particular cases or replications
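The procedure above can be sketched on simulated data; as section 8.2.4 explains, the successive maximization is solved by an eigendecomposition of the sample variance-covariance matrix (data, seed, and variable names below are illustrative):

```python
import numpy as np

# Multivariate PCA sketch: the 1st PC weight vector is the top
# eigenvector of V = X'X / N, and its score mean square equals rho_1.
rng = np.random.default_rng(1)
N, p = 200, 4
X = rng.normal(size=(N, p)) * np.array([3.0, 1.0, 0.5, 0.1])
X = X - X.mean(axis=0)                    # center the data first

V = X.T @ X / N                           # p x p sample var-cov matrix
rho, Xi = np.linalg.eigh(V)               # eigh returns ascending order
rho, Xi = rho[::-1], Xi[:, ::-1]          # sort eigen pairs descending

xi1 = Xi[:, 0]                            # 1st PC weight vector, ||xi1|| = 1
f1 = X @ xi1                              # 1st PC scores f_{i1}
mean_square = np.mean(f1**2)              # equals the top eigenvalue rho_1
```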
8.2.2 Defining PCA for functional data
Concept of functional PCA
- The integral version of the inner product is defined by
$$ \int \beta x = \int \beta(s) x(s) ds $$
- PC score
$$ f_i = \int \beta x_i = \int \beta(s) x_i(s) ds $$
where $\beta$ : weight function
How to find the functional PCs
- Find the weight function $\xi_1(s)$ for
$$ f_{i1} = \int \xi_1(s) x_i(s) ds $$
that maximizes $ \frac{1}{N}\sum_i f_{i1}^2 = \frac{1}{N}\sum_i (\int \xi_1 x_i)^2 $
subject to $ \lVert \xi_1 \rVert^2 =\int \xi_1(s)^2ds = \int \xi_1^2 = 1 $
- Repeat the previous step, now finding $ \xi_2,…,\xi_m $ that additionally satisfy the orthogonality constraints
$$ \int \xi_k \xi_m=0, \ k<m $$
Summary
- Find the function $\xi_1(s) $ with $ \lVert \xi_1 \rVert^2=1 $ in the direction that maximizes the mean square
- From the 2nd PC on, find $ \xi_2(s),…\xi_k(s) \ (k<p) $ that maximize the mean square while being orthogonal to the previous PC loadings ($ \xi_1(s) $, etc.)
- It is standard to subtract the mean of the data before PCA (centering $\Rightarrow \max MS = \max Var $)
- The weight functions $\xi_i(s)$ are not unique (sign change)
8.2.3 Defining an optimal empirical orthonormal basis
- We want to find $K$ orthonormal functions $\xi_m$
- That is, we want the $K$ orthonormal basis functions whose expansion approximates each curve as closely as possible!
- Expansion in the orthonormal basis functions
$$ \hat x_i(t) = \sum_{k=1}^K f_{ik}\xi_k(t), $$
where $ f_{ik} $ is the principal component value $\int x_i\xi_k$
- Measure of approximation ($\texttt{PCASSE}$)
$$ \texttt{PCASSE} = \sum_{i=1}^N \lVert x_i-\hat{x}_i \rVert^2 $$
where $ \lVert x_i-\hat{x}_i \rVert^2=\int [x_i(s) - \hat{x}_i(s)]^2 ds $ (integrated squared error)
- Optimal orthonormal basis functions = weight functions $\xi_m$
$$ \xi_m = \arg\min_\xi \texttt{PCASSE} $$
where the $\xi_m$ are called the empirical orthonormal functions
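A sketch of this optimal-basis view on simulated curves (grid, modes, and noise level are illustrative assumptions): with curves built from two dominant modes, the first $K = 2$ empirical orthonormal basis functions leave only a small $\texttt{PCASSE}$:

```python
import numpy as np

# Curves from two dominant modes plus small noise; the top K = 2
# discretized eigenfunctions reconstruct them almost exactly.
rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 101)
dt = t[1] - t[0]
N, K = 40, 2
X = (rng.normal(size=(N, 1)) * np.sin(2 * np.pi * t)
     + rng.normal(size=(N, 1)) * np.cos(2 * np.pi * t)
     + 0.05 * rng.normal(size=(N, t.size)))
X = X - X.mean(axis=0)

rho, U = np.linalg.eigh(X.T @ X / N * dt)     # discretized eigen problem
Xi = U[:, ::-1][:, :K] / np.sqrt(dt)          # K basis functions, ∫ ξ_k² = 1

F = X @ Xi * dt                               # scores f_ik = ∫ x_i ξ_k
X_hat = F @ Xi.T                              # expansion x̂_i = Σ_k f_ik ξ_k
pcasse = np.sum((X - X_hat)**2) * dt          # Σ_i ∫ [x_i - x̂_i]²
total = np.sum(X**2) * dt                     # total variation for reference
```

Here `pcasse` is a tiny fraction of `total`, since the two-term expansion captures both modes and misses only the noise.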
8.2.4 PCA and eigenanalysis
Multivariate PCA
- Assumption : $x_{ij}$ is centered ($x_{ij} - \frac{1}{N}\sum_i x_{ij}$)
- Mean square criterion for finding the 1st PC
$$ \max_{\boldsymbol{\xi^{'}\xi}=1}\frac{1}{N} \boldsymbol{\xi^{'}X^{'}X\xi} $$
- Substitute the variance-covariance matrix
$$ \max_{\boldsymbol{\xi^{'}\xi}=1} \boldsymbol{\xi^{'}V\xi} $$
where $\mathbf{V} = N^{-1}\mathbf{X^{'}X} $ is a $p \times p$ sample var-cov matrix
$\Rightarrow$ We can solve the maximization problem using an eigendecomposition!
- Eigen equation
$$ \boldsymbol{V\xi} = \rho\boldsymbol{\xi} $$
where $ \rho $ is the largest eigenvalue
- Solving this equation yields the pairs $ (\rho_j,\boldsymbol{\xi}_j) $, and the $\boldsymbol{\xi}_j$ are mutually orthogonal
- $\mathbf{V}$ has at most $\min(p, N-1)$ nonzero eigenvalues $\rho_j$
$(\because \max(\mathrm{rank}(\mathbf{X}))=N-1$ after centering$)$
- $ \boldsymbol \xi_j $ satisfies the maximization problem and the orthogonality constraints ($ \boldsymbol{\xi}_j \perp (\boldsymbol{\xi}_1, \cdots, \boldsymbol{\xi}_{j-1}) $) for all $j$
$\Rightarrow$ the $\boldsymbol{\xi}_j$ are the solution of PCA
Functional PCA
- Assumption : $x_i(t)$ is centered ($x_i(t) - \frac{1}{N}\sum_i x_i(t)$)
- Covariance function
$$ v(s,t) = \frac{1}{N}\sum_{i=1}^N x_i(s)x_i(t) $$
- Each of the PC weight functions $\xi_j(s)$ satisfies
$$ \int v(s,t)\xi(t) dt = \rho \xi(s) $$
where the left-hand side is an integral transform $V$ of the weight function $\xi$, defined by $ V\xi = \int v(\cdot,t)\xi(t) dt $ (the covariance operator $V$)
- Eigen equation using the covariance operator $V$
$$ V\xi = \rho\xi $$
where $\xi$ is an eigenfunction
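Discretizing $v(s,t)$ on a grid with quadrature weight $dt$ turns the integral transform into an ordinary matrix eigenproblem; the sketch below (simulated curves on an illustrative grid) checks that the resulting $\xi_1$ satisfies the eigen equation:

```python
import numpy as np

# Discretize the covariance operator: the matrix v·dt stands in for
# the integral transform V, so eigh solves V ξ = ρ ξ numerically.
rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 201)
dt = t[1] - t[0]
N = 30
X = (rng.normal(size=(N, 1)) * np.sin(np.pi * t)
     + rng.normal(size=(N, 1)) * np.sin(2 * np.pi * t))
X = X - X.mean(axis=0)                    # centered curves

v = X.T @ X / N                           # covariance function v(s, t)

rho, U = np.linalg.eigh(v * dt)           # quadrature-discretized operator
rho, U = rho[::-1], U[:, ::-1]
xi1 = U[:, 0] / np.sqrt(dt)               # normalized so that ∫ ξ_1² = 1

lhs = v @ xi1 * dt                        # (V ξ_1)(s) = ∫ v(s,t) ξ_1(t) dt
rhs = rho[0] * xi1                        # ρ_1 ξ_1(s)
```

Only a handful of eigenvalues are nonzero here, illustrating the rank bound discussed below.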
Difference between the multivariate and functional eigenanalysis problems
- The maximum number of distinct eigen pairs differs:
    - multivariate : the number of variables, $p$
    - functional : infinite in principle ($\because$ the smooth functions live in an infinite-dimensional function space)
- However, if the $x_i$ are linearly independent, $rank(V)=N-1$ and only $N-1$ nonzero eigenvalues exist
Summary - Comparison between MPCA and FPCA
Multivariate PCA
- PC score
$$ f_i = \sum_{j=1}^p \xi_j x_{ij} $$
- Objective function
$$ \mathrm{Var}(f_j) = \frac{1}{N}\sum_i f_{ij}^2 $$
- Constraints
$$ \lVert\xi_i\rVert^2= \sum_j \xi_{ji}^2 = 1 $$
$$ \sum_j \xi_{jk}\xi_{jm}=\boldsymbol{\xi}_k^{'}\boldsymbol{\xi}_m=0 $$
Functional PCA
- PC score
$$ f_i = \int \xi(s) x_i(s) ds $$
- Objective function
$$ \frac{1}{N}\sum_i f_{ij}^2 = \frac{1}{N}\sum_i (\int \xi_j x_i)^2 $$
- Constraints
$$ \lVert\xi_i\rVert^2=\int \xi_i(s)^2 ds = 1 $$
$$ \int \xi_k\xi_m=0 $$
Reference
- Ramsay, J. O., & Silverman, B. W. (2005). *Functional Data Analysis* (2nd ed.). Springer.