S675 class notes
From enfascination
10/26/09
Remarks on the previous lecture
- Fisher's best linear discriminator constructs a linear combination of the variables that is usually good for discrimination. This linear combination is not necessarily the linear combination found by PCA.
- The rule that assigns an unlabelled observation u the class label i for which (u - xbar_i)^t S^{-1} (u - xbar_i) is minimal has an obvious extension from the case of 2 classes (i in {1,2}) to the case of g classes (i in {1,2,...,g}). The general rule is called linear discriminant analysis (LDA).
- LDA relies on a pooled sample covariance matrix S.
- Let Sigma = E[S], where E denotes expectation.
- It may or may not be the case that each class has population covariance matrix Sigma.
- What if the classes have different covariance matrices, Sigma_1, ..., Sigma_g?
- One possibility: estimate each Sigma_i by S_i, then assign u the label i for which (u - xbar_i)^t S_i^{-1} (u - xbar_i) is minimal.
- This rule is called quadratic discriminant analysis (QDA). A small sketch of both assignment rules follows this list.
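The following is a minimal sketch of the LDA and QDA assignment rules above, assuming NumPy and a labelled training set; the names X, labels, fit_groups, and assign are illustrative, not from the lecture.

```python
import numpy as np

def fit_groups(X, labels):
    """Per-class means xbar_i, per-class covariances S_i, and the pooled S."""
    classes = np.unique(labels)
    n, p = X.shape
    means, covs = {}, {}
    pooled = np.zeros((p, p))
    for c in classes:
        Xc = X[labels == c]
        means[c] = Xc.mean(axis=0)
        covs[c] = np.cov(Xc, rowvar=False)        # S_i (unbiased, divides by n_i - 1)
        pooled += (len(Xc) - 1) * covs[c]
    pooled /= n - len(classes)                    # pooled sample covariance matrix S
    return means, covs, pooled

def assign(u, means, covs, pooled, rule="lda"):
    """Label i minimizing (u - xbar_i)^t C^{-1} (u - xbar_i), with C = S (LDA) or S_i (QDA)."""
    best_label, best_dist = None, np.inf
    for c, m in means.items():
        C = pooled if rule == "lda" else covs[c]
        d = (u - m) @ np.linalg.solve(C, u - m)   # squared Mahalanobis-type distance
        if d < best_dist:
            best_label, best_dist = c, d
    return best_label
```

With rule="lda" every class shares the pooled S; with rule="qda" each class uses its own S_i, which is what makes the resulting decision boundaries quadratic rather than linear.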
Discriminant Coordinates
Again (for motivation) assume that P_i = Normal(mu_i, Sigma) (common covariance, different means).
Let xbar_i and S_i denote the sample mean vector and sample covariance matrix for sample i in {1,2,...,g}.
Let xbar = sum_{i=1}^{g} (n_i/n) xbar_i   <-- sample grand mean vector
W = (n-g)^{-1} sum_{i=1}^{g} (n_i - 1) S_i   <-- pooled within-groups sample covariance matrix
B = (g-1)^{-1} sum_{i=1}^{g} n_i (xbar_i - xbar)(xbar_i - xbar)^t   <-- between-groups sample covariance matrix
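A small sketch of these three quantities, again assuming NumPy; X is an n x p data matrix and labels holds the class labels, both illustrative names.

```python
import numpy as np

def grand_mean_W_B(X, labels):
    """Compute xbar, the within-groups W, and the between-groups B defined above."""
    classes = np.unique(labels)
    n, p = X.shape
    g = len(classes)
    xbar = X.mean(axis=0)                         # equals sum_i (n_i/n) xbar_i
    W = np.zeros((p, p))
    B = np.zeros((p, p))
    for c in classes:
        Xc = X[labels == c]
        n_i = len(Xc)
        xbar_i = Xc.mean(axis=0)
        W += (n_i - 1) * np.cov(Xc, rowvar=False) # accumulate (n_i - 1) S_i
        d = (xbar_i - xbar).reshape(-1, 1)
        B += n_i * (d @ d.T)                      # accumulate n_i (xbar_i - xbar)(xbar_i - xbar)^t
    return xbar, W / (n - g), B / (g - 1)
```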
Given a in R^p, we perform a univariate analysis of variance to test the null hypothesis H_0: a^t mu_1 = a^t mu_2 = ... = a^t mu_g.
In ANOVA, the generalization of the t statistic is the F statistic. The t statistic compares the difference between the observed and hypothesized means with the variance within the sample.
F(a) = (a^t B a) / (a^t W a), i.e., the ratio of the variation between groups to the variation within groups: the between-groups variance of the linear combination a^t x divided by its within-groups variance.
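Given W and B from the sketch above, F(a) is a one-line computation; this helper is purely illustrative, not lecture code.

```python
import numpy as np

def F_ratio(a, B, W):
    """ANOVA-style ratio F(a) = (a^t B a) / (a^t W a) for a direction a in R^p."""
    a = np.asarray(a, dtype=float)
    return float(a @ B @ a) / float(a @ W @ a)
```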