Saturday, September 12, 2009

Activity 15 - Probabilistic Classification (Linear Discriminant Analysis)

LDA 2-feature
Linear discriminant analysis (LDA) is another type of classification, in which the probability that a sample belongs to each class is estimated from previously observed training data sets. The technique assumes that the classes are linearly separable, that is, that some linear combination of the features distinguishes the classes. In this activity, two features are used to separate the P1 and 25c coins from the previous activity. The features used are the rg values, as in Activity 14.
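For reference, the discriminant function evaluated by the code listing at the end of this post can be written as follows. The notation is a shorthand read off that listing rather than taken from the handout: \mu_i is the row vector of feature means of class i, x_k is the row vector of features of a test sample, n_i is the number of training samples in class i, n is the total number of training samples, and c_i is the covariance matrix of class i (computed in the listing from data centered on the global mean).

```latex
f_i(\mathbf{x}_k) = \boldsymbol{\mu}_i C^{-1} \mathbf{x}_k^{T}
  - \tfrac{1}{2}\,\boldsymbol{\mu}_i C^{-1} \boldsymbol{\mu}_i^{T}
  + \ln p_i,
\qquad p_i = \frac{n_i}{n},
\qquad C = \sum_{i=1}^{2} \frac{n_i}{n}\, c_i
```

A test sample is assigned to the class whose f_i is larger.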

For a more detailed explanation of LDA, visit this site.
Shown above is the original plot of the features in rg space. The data are linearly separable, but we still need to find the line that separates them and to rotate the data so that the separation between the samples is clearly visible.
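The separating line can be made explicit. Assuming the discriminant form used in the code listing at the end of this post (class means \mu_i as row vectors, pooled covariance C, priors p_i = n_i/n), the boundary is the set of points where the two discriminants are equal:

```latex
f_1(\mathbf{x}) = f_2(\mathbf{x})
\;\Longleftrightarrow\;
(\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2)\, C^{-1} \mathbf{x}^{T}
= \tfrac{1}{2}\left(
    \boldsymbol{\mu}_1 C^{-1} \boldsymbol{\mu}_1^{T}
  - \boldsymbol{\mu}_2 C^{-1} \boldsymbol{\mu}_2^{T}
  \right)
  - \ln\frac{p_1}{p_2}
```

The left-hand side is linear in x, so the boundary is a straight line in rg space. When the samples are plotted by their discriminant values instead, the same boundary becomes the diagonal f1 = f2, which is the red line drawn by plot(x,x,'-r') in the plotting code.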

Applying the LDA algorithm, we obtained the following result:

As seen, LDA separates the P1 and 25c coins well. The algorithm classified 10 of 10 samples correctly (100%) using the two features.

From this I conclude that, for this data set, LDA performs better than minimum distance classification, since it gave a higher classification accuracy.

I give myself 10 pts for understanding and doing the activity correctly.

Reference:
A15 - Probabilistic Classification handout. M. Soriano. 2008.

Acknowledgement
I acknowledge Jaya for her help in understanding the LDA algorithm. ;)


Code:
LDA
chdir('E:\Documents and Settings\vergara\My Documents\acads\1st sem 09-10\186\act14\' + 'data');
fname = 'ben';
x1 = fscanfMat('piso_t.txt');
x2 = fscanfMat('ben_t.txt');
tst = fscanfMat(fname + '_s.txt');

// Training features of both classes, stacked: columns 1, 2 and 5 of each file
x = [x1(:,[1 2 5]); x2(:,[1 2 5])];

u1 = mean([x1(:,1) x1(:,2) x1(:,5)], 'r');
u2 = mean([x2(:,1) x2(:,2) x2(:,5)], 'r');

u = [mean(x(:,1)) mean(x(:,2)) mean(x(:,3))];

x1c = [x1(:,1) - u(1) x1(:,2) - u(2) x1(:,5) - u(3)];
x2c = [x2(:,1) - u(1) x2(:,2) - u(2) x2(:,5) - u(3)];

n1 = size(x1c,1);
n2 = size(x2c,1);
n = size(x,1);

c1 = (x1c'*x1c) / n1;
c2 = (x2c'*x2c) / n2;

// Pooled covariance: class covariances weighted by class size
C = (n1/n)*c1 + (n2/n)*c2;

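Cinv = inv(C);   // invert once instead of on every loop pass

h = [];    // class decisions, one row per test sample: [is_class1 is_class2]
hp = [];   // discriminant values, one row per test sample: [f1 f2]
for tst_num = 1:size(tst, 1)
    xk = cat(2, tst(tst_num,1:2), tst(tst_num,5));
    f1 = u1*Cinv*xk' - 0.5*u1*Cinv*u1' + log(n1/n);
    f2 = u2*Cinv*xk' - 0.5*u2*Cinv*u2' + log(n2/n);
    hp = cat(1, hp, [f1 f2]);
    h = cat(1, h, [1*(f1>f2) 1*(f1<f2)]);
end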

chdir('E:\Documents and Settings\vergara\My Documents\acads\1st sem 09-10\186\act15');
fprintfMat(fname + '_res3' + '.txt', hp, '%0.5f');
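
The listing above depends on the data files from Activity 14. As a minimal sketch that runs without them, the same steps can be tried on a handful of made-up feature values. All numbers below are hypothetical and are not the coin measurements; only the sequence of operations mirrors the listing.

```scilab
// Toy LDA sketch with made-up numbers (NOT the coin data from this activity).
x1 = [0.40 0.33; 0.42 0.34; 0.41 0.32; 0.43 0.33];   // hypothetical class 1 training features
x2 = [0.35 0.36; 0.34 0.37; 0.36 0.35; 0.33 0.36];   // hypothetical class 2 training features
x  = [x1; x2];

u1 = mean(x1, 'r');      // class means (row vectors)
u2 = mean(x2, 'r');
u  = mean(x, 'r');       // global mean

n1 = size(x1, 1);
n2 = size(x2, 1);
n  = n1 + n2;

// Center each class on the global mean, as in the listing above
x1c = x1 - ones(n1, 1) * u;
x2c = x2 - ones(n2, 1) * u;

// Pooled covariance, weighted by class size
C = (n1/n) * (x1c' * x1c) / n1 + (n2/n) * (x2c' * x2c) / n2;
Cinv = inv(C);

// Discriminant values for one hypothetical test sample
xk = [0.41 0.33];
f1 = u1*Cinv*xk' - 0.5*u1*Cinv*u1' + log(n1/n);
f2 = u2*Cinv*xk' - 0.5*u2*Cinv*u2' + log(n2/n);

// Assign the sample to the class with the larger discriminant
if f1 > f2 then
    disp('assigned to class 1');
else
    disp('assigned to class 2');
end
```

Replacing x1, x2 and xk with the columns read from the training and test files should reproduce the steps of the full listing.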


PLOTTING
chdir('E:\Documents and Settings\vergara\My Documents\acads\1st sem 09-10\186\act14\data');
ben_t = fscanfMat('ben_t.txt');
piso_t = fscanfMat('piso_t.txt');
scf(), plot(ben_t(:,1), ben_t(:,2), 'sb', 'markerface', 'b')
plot(piso_t(:,1), piso_t(:,2), 'sg', 'markerface', 'g')
xlabel(' r')
ylabel(' g')
legend('25c', 'P1',2)

chdir('E:\Documents and Settings\vergara\My Documents\acads\1st sem 09-10\186\act15');
ben_res2 = fscanfMat('ben_res2.txt');
ben_res3 = fscanfMat('ben_res3.txt');
ben_tres2 = fscanfMat('ben_tres2.txt');

piso_res2 = fscanfMat('piso_res2.txt');
piso_res3 = fscanfMat('piso_res3.txt');
piso_tres2 = fscanfMat('piso_tres2.txt');

x = 200:0.001:300;
scf(), plot(ben_tres2(:,1), ben_tres2(:,2), 'ob', 'markerface', 'b')
plot(ben_res2(:,1), ben_res2(:,2), 'sy', 'markerface', 'y')
plot(piso_tres2(:,1), piso_tres2(:,2), 'og', 'markerface', 'g')
plot(piso_res2(:,1), piso_res2(:,2), 'sm', 'markerface', 'm')
plot(x,x,'-r')

legend('train 25c', 'sample 25c','train P1', 'sample P1', 2)
xlabel('f1')
ylabel('f2')
//mtlb_axis([100 400 100 400])
