Probabilistic Neural Network
Consider the problem of multi-class classification. We are given a set of data points from each class. The objective is to classify any new data sample into one of the classes. A Probabilistic Neural Network (PNN) is well suited to this kind of multi-class classification.
Architecture
A PNN is an implementation of a statistical algorithm
called kernel discriminant analysis in which the operations are organized into a
multilayered feedforward network with four layers.
1) Input layer
The input layer holds the set of measurements for a case. Each neuron in the input layer represents a predictor variable. For a categorical variable with N categories, N-1 neurons are used. The layer standardizes the range of each variable by subtracting the median and dividing by the interquartile range. The input neurons then feed these values to each of the neurons in the pattern (hidden) layer.
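The standardization step described above can be sketched in C as follows. This is only an illustration: the function names are hypothetical, and the quartiles are computed by linear interpolation between sorted values, which is one of several common conventions and is an assumption here.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: ascending order of floats. */
static int cmp_float(const void *a, const void *b)
{
    float x = *(const float *)a, y = *(const float *)b;
    return (x > y) - (x < y);
}

/* Quantile q (between 0 and 1) of an already sorted array,
   using linear interpolation between neighbours (an assumed convention). */
static float quantile_sorted(const float *sorted, int n, float q)
{
    float pos = q * (float)(n - 1);
    int lo = (int)pos;
    int hi = (lo + 1 < n) ? lo + 1 : lo;
    return sorted[lo] + (pos - (float)lo) * (sorted[hi] - sorted[lo]);
}

/* Standardize values[0..n-1] in place: (x - median) / interquartile range.
   Returns 0 on success, -1 if n < 1, allocation fails, or the IQR is zero. */
int robust_standardize(float *values, int n)
{
    if (n < 1) return -1;
    float *sorted = malloc((size_t)n * sizeof *sorted);
    if (sorted == NULL) return -1;
    memcpy(sorted, values, (size_t)n * sizeof *sorted);
    qsort(sorted, (size_t)n, sizeof *sorted, cmp_float);

    float median = quantile_sorted(sorted, n, 0.5f);
    float iqr = quantile_sorted(sorted, n, 0.75f)
              - quantile_sorted(sorted, n, 0.25f);
    free(sorted);
    if (iqr == 0.0f) return -1;   /* constant variable: nothing to scale by */

    for (int i = 0; i < n; i++)
        values[i] = (values[i] - median) / iqr;
    return 0;
}
```

The function would be called once per predictor variable, for example robust_standardize(column, n) on an array holding that variable's values for all cases.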
2) Pattern layer
The pattern layer consists of Gaussian functions centered on the given data points. It contains one neuron for each case in the training data set, and each neuron stores the values of the predictor variables for its case along with the target value. A pattern neuron computes the Euclidean distance of the test case from the neuron's center point and then applies the RBF kernel function using the sigma values.
3) Summation layer
The summation layer sums the outputs of the pattern layer separately for each class.
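As a rough sketch, the summation layer could look like this in C. Dividing each class sum by the number of cases in that class follows the pseudo code in this article; the function name, the labels array and the 0-based class numbering are assumptions made for the example.

```c
/* Summation layer: average the pattern-layer outputs of each class.
   outputs[i] is the kernel value of training case i and labels[i] is its
   class, numbered 0..num_classes-1 (an assumption of this sketch).
   class_scores must have room for num_classes floats. */
void summation_layer(const float *outputs, const int *labels, int n,
                     int num_classes, float *class_scores)
{
    for (int k = 0; k < num_classes; k++) {
        float sum = 0.0f;
        int count = 0;
        for (int i = 0; i < n; i++) {
            if (labels[i] == k) {
                sum += outputs[i];
                count++;
            }
        }
        /* A class without training cases gets a score of 0. */
        class_scores[k] = (count > 0) ? sum / (float)count : 0.0f;
    }
}
```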
4) Output layer
The output layer performs a vote, selecting the largest of the class sums. The class label associated with that value is the prediction.
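The vote amounts to an argmax over the class scores. A minimal sketch, with a hypothetical function name and 0-based class indices:

```c
/* Output layer: return the index of the class with the largest score,
   or -1 if there are no classes. Ties go to the lowest index. */
int output_layer(const float *class_scores, int num_classes)
{
    if (num_classes < 1) return -1;
    int best = 0;
    for (int k = 1; k < num_classes; k++)
        if (class_scores[k] > class_scores[best])
            best = k;
    return best;
}
```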
Advantages
PNNs have several advantages and disadvantages.
· PNNs are much faster to train than multilayer perceptron networks.
· PNNs approach Bayes optimal classification.
· PNNs are guaranteed to converge to an optimal classifier as the size of the representative training set increases.
· PNNs have an inherently parallel structure.
· PNNs are relatively insensitive to outliers.
· PNNs can be more accurate than multilayer perceptron networks.
· PNNs generate accurate predicted target probability scores.
Disadvantages
· PNNs are slower than multilayer perceptron networks at classifying new cases.
· PNNs require more memory space to store the model.
· PNNs require a representative training set.
PNN Pseudo Code
// C is the number of classes, N is the number of examples
// Nk[k] is the number of examples from class k+1 (the counts add up to N)
// d is the dimensionality of the training examples, sigma is the smoothing factor
// test_example[d] is the example to be classified
// Examples[N][d] are the training examples, stored grouped by class
// (all examples of class 1 first, then class 2, and so on)
// The dot-product form of the kernel assumes that all vectors are
// normalized to unit length
// Returns the class label (1..C), or -1 if no class has a positive sum
#include <math.h>

int PNN(int C, int N, int d, float sigma, int Nk[C],
        float test_example[d], float Examples[N][d])
{
    int classify = -1;
    float largest = 0;
    float sum[ C ];
    int first = 0;   // index of the first example of the current class
    // The OUTPUT layer which computes the pdf for each class
    for ( int k=0; k<C; k++ )
    {
        sum[ k ] = 0;
        // The SUMMATION layer which accumulates the pdf
        // for each example from the particular class
        for ( int i=first; i<first+Nk[k]; i++ )
        {
            float product = 0;
            // The PATTERN layer that multiplies the test example by the weights
            for ( int j=0; j<d; j++ )
                product += test_example[j] * Examples[i][j];
            product = ( product - 1 ) / ( sigma * sigma );
            product = exp( product );
            sum[ k ] += product;
        }
        if ( Nk[k] > 0 )
            sum[ k ] /= Nk[k];
        first += Nk[k];
    }
    for ( int k=0; k<C; k++ )
        if ( sum[ k ] > largest )
        {
            largest = sum[ k ];
            classify = k + 1;   // classes are numbered from 1
        }
    return classify;
}
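The pattern step in the pseudo code uses a dot product rather than the Euclidean distance described under Architecture. The two agree when both the test example x and the stored example w are normalized to unit length, which is an assumption the pseudo code does not state explicitly. A short derivation under that assumption:

```latex
\[
\|x - w\|^2 = \|x\|^2 + \|w\|^2 - 2\,x \cdot w = 2 - 2\,x \cdot w
\qquad (\|x\| = \|w\| = 1)
\]
\[
\exp\!\left(-\frac{\|x - w\|^2}{2\sigma^2}\right)
= \exp\!\left(-\frac{2 - 2\,x \cdot w}{2\sigma^2}\right)
= \exp\!\left(\frac{x \cdot w - 1}{\sigma^2}\right)
\]
```

So the expression (product - 1) / (sigma * sigma) followed by exp is the Gaussian kernel of the Euclidean distance, provided the inputs are unit vectors.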
Example
Input Data Set:
X | Y | CLASS
1 | 0 | 1
0 | 1 | 1
1 | 1 | 1
-1 | 0 | 2
0 | -1 | 2
Test data: [0.5, 0.5]
PNN:
X | Y | CLASS | Count-1 | Count-2
1 | 0 | 1 | 3 | 2
0 | 1 | 1 | |
1 | 1 | 1 | |
-1 | 0 | 2 | |
0 | -1 | 2 | |
X1 | X2
0.5 | 0.5
X | Y | X-X1 | Y-X2 | (X-X1)^2 | (Y-X2)^2 | exp(-(X-X1)^2) | exp(-(Y-X2)^2) | exp(-(X-X1)^2)+exp(-(Y-X2)^2)
1 | 0 | 0.5 | -0.5 | 0.25 | 0.25 | 0.778800783 | 0.778800783 | 1.557601566
0 | 1 | -0.5 | 0.5 | 0.25 | 0.25 | 0.778800783 | 0.778800783 | 1.557601566
1 | 1 | 0.5 | 0.5 | 0.25 | 0.25 | 0.778800783 | 0.778800783 | 1.557601566
-1 | 0 | -1.5 | -0.5 | 2.25 | 0.25 | 0.105399225 | 0.778800783 | 0.884200008
0 | -1 | -0.5 | -1.5 | 0.25 | 2.25 | 0.778800783 | 0.105399225 | 0.884200008
SUM(CLASS1) | SUM(CLASS2) | Y1=SUM(CLASS1)/Count-1 | Y2=SUM(CLASS2)/Count-2
4.672804698 | 1.768400015 | 1.557601566 | 0.884200008
Since Y1 > Y2, the data point [0.5, 0.5] is assigned to class 1.