Are there any relationships between PLS and Krylov subspace methods?

 

----A question posted by Jinbo Bi

 

 

A drug mining problem --- docking problem

 

Leads                                              Receptors

 

 

 

 

 


                                                             

 

 

 

 

 

DATA:

 

 

y1    y2    ...     yl

x1    x2     x3     x4     x5     …        xn

Lead1

 

Lead2

 

Lead3

 

*

*

*

 

LeadN

1.2   1.3   …   1.4

 

2.1   2.2   …   2.4

 

3.1   3.2   …   3.4

 

*

*

*

 

3.1   3.2         3.4

0.23  0.14  0.4    0.31  0.2    …    0.176

 

0.33  0.12  0.4    0.32  0.5    …    0.177

 

0.23  0.14  0.4    0.31  0.2    …    0.176

 

*

*

*

 

0.33  0.12  0.4    0.32  0.5    …    0.177

 

Y

X

 

 

 

 

*

*

 

 

 

 

Ordinary Least Squares

 

 

 

 

 

 

Principal Component Analysis (PCA)

 

 

 

This is a singular value decomposition

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


Each p (loading) vector is actually a projection operator, and the t (score) vector is the coordinate of each point along these projection directions p's.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 


What are those p vectors?  What are those q vectors?  And what are in the diagonal of the diagonal matrix S?

 

 

 

So p’s are the eigenvectors of matrix , and q’s are eigenvectors of matrix , diagonal elements of S are sqrt of eigenvalues.

 

 

 

Principal Component Regression

 

Cut off some eigenvectors corresponding to smaller eigenvalues, i.e., use fewer rank-one matrices to approximate original data X.

 

 

 

 

 

 

 

 


 

 

 

 

Partial Least Squares

 

Algorithm:

1.   do PCA for matrix X, find score Tx,

2.   do PCA for matrix Y, find score Ty,

3.   fit regression model between Tx and Ty, like

 

A real-life algorithm: (iteratively finding t’s and p’s for X and Y)

For each h=1,…,c, where A0=X'Y, M0=X'X, C0=I, and given,

1.   compute qh, the dominant eigenvector of AhTAh

2.   wh=ChAhqh, wh=wh/||wh||, and store wh into W as a column

3.   ph=Mhwh, ch=whTMhwh, ph=ph/ch, and store ph into P as a column

4.   qh=AhTwh/ch, and store qh into Q as a column

5.   Ah+1=Ah - chphqhT and Bh+1=Mh - chphphT

6.   Ch+1=Ch - whphT

A theorem is:

 

Consider one dependent y,

where

 

 

Krylov Subspace Methods

 

It is an equation solver, i.e., an iterative algorithm to solve an equation system like

 

 

A kth Krylov subspace generated by a vector c is like

 

 

The method is to find solutions of the following problem within the kth Krylov subspace generated by y at the kth iteration

 

 

Gram-Schmidt scheme to construct orthonormal basis for

 

 

 

Krylov method is shown can converge in super-linear rate under certain assumptions.

 

For a real-life algorithm, please check the paper

“The idea behind Krylov methods” by Ilse C.F. Ipsen and Carl D. Meyer

 

The real-life algorithms for PLS and Krylov method are pretty much similar to each other, so the questions are

 

1)   if PLS is really a kind of Krylov method if you consider fitting regression is somehow kind of solving equation system?

 

2)   if PLS is not Krylov method, can it be tailored to be able to apply any of Krylov methods, which be able to make solving PLS faster?