S675 class notes


10/26/09

Remarks on the previous lecture

  1. Fisher's best linear discriminator constructs a linear combination of the variables that is usually good for discrimination. This linear combination is not necessarily the linear combination found by PCA.
  2. The rule that assigns an unlabelled u to the class label i for which (u-xbarsubi)^tS^-1(u-xbarsubi) is minimal has an obvious extension from the case of 2 classes (i in {1,2}) to the case of g classes (i in {1,2,...,g}). The general rule is called linear discriminant analysis (LDA).
  3. LDA relies on a pooled sample covariance matrix S.
    • Let Sigma = E(S), where E is the expectation.
    • It may or may not be the case that each class has population covariance matrix Sigma.
    • What if the classes have different covariance matrices, Sigmasub1, ... , Sigmasubg?
    • One possibility: estimate each Sigmasubi by Ssubi, then assign to u the label i for which (u-xbarsubi)^tSsubi^-1(u-xbarsubi) is minimal.
    • This rule is called quadratic discriminant analysis (QDA); both assignment rules are sketched in code after this list.
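
The sketch below, in Python/NumPy, illustrates the two assignment rules just described; the names (lda_label, qda_label, u, xbars, S, Ss) are illustrative assumptions, not notation from the course, and the class means and covariance estimates are assumed to be already computed.

  import numpy as np

  def lda_label(u, xbars, S):
      # LDA rule: return the (0-based) index i for which
      # (u - xbar_i)^t S^-1 (u - xbar_i) is minimal, using one pooled covariance S
      Sinv = np.linalg.inv(S)
      dists = [(u - xb) @ Sinv @ (u - xb) for xb in xbars]
      return int(np.argmin(dists))

  def qda_label(u, xbars, Ss):
      # QDA rule: same idea, but class i uses its own covariance estimate S_i
      dists = [(u - xb) @ np.linalg.inv(Si) @ (u - xb) for xb, Si in zip(xbars, Ss)]
      return int(np.argmin(dists))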

Discriminant Coordinates

Again (for motivation) assume that Psubi = Normal(musubi, Sigma) (common covariance, different means).

Let xbarsubi and Ssubi denote the sample mean vector and sample covariance matrix for sample i in {1,2,...,g}.

Let xbar = Sum of i=1 to g (nsubi/n)xbarsubi, <--sample grand mean vector

W = (n-g)^-1 Sum from i=1 to g (nsubi-1)Ssubi <---pooled within-groups sample covariance matrix

B = (g-1)^-1 Sum from i=1 to g nsubi(xbarsubi-xbar)(xbarsubi-xbar)^t <---between-groups sample covariance matrix
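
A minimal Python/NumPy sketch of these quantities, computing xbar, W, and B from a list of g samples; the function name pooled_stats and its argument are hypothetical, not something defined in the notes.

  import numpy as np

  def pooled_stats(samples):
      # samples: list of g arrays, sample i having shape (n_i, p)
      g = len(samples)
      ns = [X.shape[0] for X in samples]
      n = sum(ns)
      xbars = [X.mean(axis=0) for X in samples]               # xbarsubi
      Ss = [np.cov(X, rowvar=False) for X in samples]         # Ssubi (divisor n_i - 1)
      xbar = sum((ni / n) * xb for ni, xb in zip(ns, xbars))  # sample grand mean vector
      W = sum((ni - 1) * Si for ni, Si in zip(ns, Ss)) / (n - g)     # pooled within-groups
      B = sum(ni * np.outer(xb - xbar, xb - xbar)
              for ni, xb in zip(ns, xbars)) / (g - 1)                # between-groups
      return xbar, W, B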


Given a in Real^p, we perform a univariate analysis of variance to test the null hypothesis Hsub0: a^tmusub1 = a^tmusub2 = ... = a^tmusubg.

In ANOVA, the generalization of the t statistic is the F statistic. The t statistic compares the difference between the observed and predicted means with the variance within the sample.

F(a) = (a^tBa)/(a^tWa), the ratio of the variation between groups to the variation within groups, i.e. (between-group variance of the linear combination a^tx)/(within-group variance of the linear combination a^tx).
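
Continuing the sketch above, F(a) for a given direction a is just this ratio computed from B and W (names as in the hypothetical pooled_stats function, not course code):

  def F_stat(a, B, W):
      # ratio of between-group to within-group variance of the linear combination a^t x
      return (a @ B @ a) / (a @ W @ a)

Large values of F(a) indicate directions a along which the group means are well separated relative to the within-group spread.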