Sunday, May 12, 2019

Radial Basis Function Network (RBFN)


A radial basis function network can be used for function approximation and pattern recognition. It uses Gaussian potential functions.

Architecture

The architecture of a radial basis function network consists of three layers, the input, hidden and output layers, as shown in the figure below. The RBFN is a multilayer feed-forward network. There are ‘n’ input neurons and ‘m’ output neurons, with the hidden layer lying between the input and output layers. The connections between the input layer and the hidden layer are hypothetical (unweighted) connections, while those between the hidden layer and the output layer are weighted connections. The training algorithm updates the weights on these interconnections.


Training algorithm for RBFN

The training algorithm for the radial basis function network is given below. The radial basis function network uses the Gaussian activation function; the response of this function is non-negative for all values of x. The function is defined as $f(x)=e^{-x^2}$.

Step 1: Initialize the weights (set to small random values)
Step 2: While the stopping condition is false, do Steps 3-10
Step 3: For each input, do Steps 4-9
Step 4: Each input unit ($x_i$, i = 1, 2, ..., n) receives the input signal and transmits it to all units in the layer above (the hidden units)
Step 5: Calculate the radial basis function
Step 6: Choose the centres for the radial basis functions. The centres are chosen from the set of input vectors. A sufficient number of centres has to be selected to ensure adequate sampling of the input vector space.
Step 7: The output of the $i^{th}$ unit $v_i(x_i)$ in the hidden layer is
$$ v_i(x_i)=\exp\left(-\sum_{j=1}^{r}\frac{[x_{ji}-\bar{x}_{ji}]^2}{\sigma_i^2}\right) $$
Where $\bar{x}_{ji}$ = centre of the RBF unit for the input variables
$\sigma_i$ = width of the $i^{th}$ RBF unit
$x_{ji}$ = $j^{th}$ variable of the input pattern
$r$ = number of input variables
Step 8: Initialize the weights in the output layer of the network to some small random value.
Step 9: Calculate the output of the neural network
$$ y_{net}=\sum_{i=1}^{H}w_{im}\,v_i(x_i)+w_0 $$
Where, H = number of hidden layer nodes (RBF units)
$y_{net}$ = output value of the $m^{th}$ node in the output layer for the $n^{th}$ incoming pattern
$w_{im}$ = weight between the $i^{th}$ RBF unit and the $m^{th}$ output node
$w_0$ = biasing term at the output node
Step 10: Calculate the error and test the stopping condition.
The stopping condition may be the weight change, number of epochs, etc.             
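
To tie the steps together, here is a minimal NumPy sketch of an RBFN. It assumes a single common width sigma for all hidden units, centres drawn directly from the training inputs (Step 6), and a simple gradient-descent rule for the output weights, since the post does not fix the exact update for Step 10; all function and variable names are illustrative.

import numpy as np

def rbf_hidden(x, centres, sigma):
    # Step 7: Gaussian activations v_i(x) = exp(-||x - centre_i||^2 / sigma^2)
    d2 = np.sum((centres - x) ** 2, axis=1)
    return np.exp(-d2 / sigma ** 2)

def train_rbfn(X, T, n_hidden, sigma=1.0, lr=0.1, epochs=500):
    rng = np.random.default_rng(0)
    centres = X[rng.choice(len(X), n_hidden, replace=False)]  # Step 6: centres picked from the inputs
    w = rng.normal(scale=0.1, size=(n_hidden, T.shape[1]))    # Steps 1, 8: small random output weights
    w0 = np.zeros(T.shape[1])                                 # bias term w_0
    for _ in range(epochs):                                   # Steps 2-10: repeat until stopping condition
        for x, t in zip(X, T):                                # Step 3: present each training pattern
            v = rbf_hidden(x, centres, sigma)                 # Steps 5, 7: hidden-layer response
            y = v @ w + w0                                    # Step 9: network output
            err = t - y                                       # Step 10: output error
            w += lr * np.outer(v, err)                        # assumed gradient-descent update
            w0 += lr * err
    return centres, w, w0

# Usage: approximate XOR, using all four inputs as RBF centres.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])
centres, w, w0 = train_rbfn(X, T, n_hidden=4)
for x in X:
    print(x, np.round(rbf_hidden(x, centres, 1.0) @ w + w0, 2))  # outputs approach 0, 1, 1, 0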

Friday, May 10, 2019

Learning Vector Quantization (LVQ)


Vector quantization is a standard statistical clustering technique that divides the input space into regions, each represented by a "codebook" vector. In an LVQ network, target values are available for the input training patterns, so the learning is supervised. During training, the output units are positioned to approximate the decision surfaces. After training, an LVQ net classifies an input vector by assigning it to the same class as the output unit whose weight vector (reference or codebook vector) is closest to the input vector.
In order to understand the LVQ technique, it should be borne in mind that the weight vector wj closest to a pattern x may belong to a node j that has the wrong class label for x. This can happen because the initial node labels are based only on their most frequent class and are therefore not always reliable. The procedure for updating wj is given by,

                       1. If x is classified correctly, Δwj = α[x - wj]
                       2. If x is classified incorrectly, Δwj = -α[x - wj]

The negative sign in the misclassification case makes the weight vector move away from the cluster containing x, which, on average, tends to make the weight vectors move away from the class boundaries.
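
A minimal sketch of this update rule in Python (NumPy; the function name and arguments are illustrative):

import numpy as np

def lvq_update(w_j, x, same_class, alpha=0.1):
    # Winning unit j: attract toward x if the class label matches, repel otherwise
    if same_class:
        return w_j + alpha * (x - w_j)   # correct classification
    return w_j - alpha * (x - w_j)       # misclassification

# A winner with the wrong label is pushed away from x:
print(lvq_update(np.array([1.0, 0.0]), np.array([0.0, 0.0]), same_class=False))  # [1.1 0. ]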

Architecture

The architecture is shown in the figure below. It is similar to the architecture of a SOM, but without a topological structure assumed for the output units. In an LVQ net each output unit has a known class, since the learning is supervised; this differs from the SOM, which uses unsupervised learning.

Training Algorithm

The aim of the algorithm for the LVQ net is to find the output unit whose weight vector most closely matches the input vector. At the end of the process, if x and w belong to the same class, the weights are moved toward the input vector, and if they belong to different classes, the weights are moved away from it. As in the SOM, the winner unit is identified. The class of the winner unit is compared with the target, and based on the result of this comparison the weight update is performed, as shown in the algorithm given below. The iterations continue while the learning rate is gradually reduced.
The algorithm is as follows:
Step 1: Initialize the reference (weight) vectors. Initialize the learning rate.
Step 2: While the stopping condition is false, do Steps 3-7
Step 3: For each training input vector x, do Steps 4-5
Step 4: Compute the winning unit J using the squared Euclidean distance.
                       D(j) = ∑i (wij - xi)²
                       Find J such that D(J) is minimum.
Step 5: Update wJ as follows:
                       If T = cJ, then
                       wJ(new) = wJ(old) + α[x - wJ(old)]
                       If T ≠ cJ, then
                       wJ(new) = wJ(old) - α[x - wJ(old)]
Step 6: Reduce the learning rate.
Step 7: Test for the stopping condition.

The stopping condition may be a fixed number of iterations or the learning rate reaching a sufficiently small value.
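
The loop below is a minimal Python sketch of Steps 1-7. It assumes the squared Euclidean distance of Step 4 and a multiplicative decay of the learning rate for Step 6; the function and parameter names are illustrative.

import numpy as np

def train_lvq(X, targets, W, classes, alpha=0.1, alpha_min=0.05, decay=0.5):
    # W holds one reference vector per row; classes[j] is the class label of output unit j
    W = W.astype(float).copy()
    while alpha >= alpha_min:                   # Step 7: stop when alpha becomes small enough
        for x, t in zip(X, targets):            # Step 3: present each training vector
            D = np.sum((W - x) ** 2, axis=1)    # Step 4: squared Euclidean distances
            J = int(np.argmin(D))               # winning unit
            if t == classes[J]:                 # Step 5: weight update
                W[J] += alpha * (x - W[J])      # move toward x
            else:
                W[J] -= alpha * (x - W[J])      # move away from x
        alpha *= decay                          # Step 6: reduce the learning rate
    return W

Because alpha shrinks every epoch, later updates become progressively smaller and the reference vectors settle near their class regions.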

Example: Step by Step

Construct and test an LVQ net with four vectors assigned to two classes. Assume α = 0.1. Perform iterations until α = 0.05.

   Vector                         Class
(1 0 1 0)                          1
(0 0 1 1)                          2
(1 1 0 0)                          1
(1 0 0 1)                          2

Solution:

The first output unit represents class 1
The second output unit represents class 2.
c1= 1 and c2= 2
The first two of the four vectors are used to initialize the two reference vectors. The vectors (1 1 0 0) and (1 0 0 1) are used as the training vectors.

EPOCH-1

Step 1: Initialize weights w1= (1 0 1 0) and w2=(0 0 1 1)
             Initialize the learning rate: α = 0.1
Step 2: Begin the training process.
Step 3: For input vector x = (1 1 0 0) with T = 1, do Steps 4-5
Step 4: Calculate J
                       D(j) = ∑i (wij - xi)²
                       D(1) = (1 - 1)² + (0 - 1)² + (1 - 0)² + (0 - 0)² = 2
                       D(2) = (0 - 1)² + (0 - 1)² + (1 - 0)² + (1 - 0)² = 4
                       D(1) is minimum, so J = 1 → cJ = 1.
Step 5: Since T = 1 and cJ = 1, T = cJ, so
                       w1(new) = w1(old) + α[x - w1(old)]
                                    = (1 0 1 0) + 0.1[(1 1 0 0) - (1 0 1 0)]
                       w1(new) = (1  0.1  0.9  0)
Step 3: For the input vector x = (1 0 0 1) with T = 2, perform Steps 4-5.
Step 4: Calculate J
                       D(j) = ∑i (wij - xi)²
                       D(1) = (1 - 1)² + (0.1 - 0)² + (0.9 - 0)² + (0 - 1)² = 1.82
                       D(2) = (0 - 1)² + (0 - 0)² + (1 - 0)² + (1 - 1)² = 2
                       D(1) is minimum, so J = 1 → cJ = 1.
Step 5: Since T = 2 and cJ = 1, T ≠ cJ, so the weight update is,
                       w1(new) = w1(old) - α[x - w1(old)]
                                    = (1  0.1  0.9  0) - 0.1[(1  0  0  1) - (1  0.1  0.9  0)]
                       w1(new) = (1  0.11  0.99  -0.1)
Step 6: One epoch of training is completed. Reduce the learning rate
                       α(t + 1) = 0.5 α(t)
                       α(1) = 0.5 x 0.1
                       α(1) =0.05.
Step 7: Test the stopping condition. The stopping condition is not yet met, so perform a second epoch with α = 0.05.

EPOCH-2

Step 1: The weights carried over from Epoch 1 are w1 = (1  0.11  0.99  -0.1) and w2 = (0  0  1  1)
            The learning rate is now α = 0.05
Step 2: Begin the training process.
Step 3: For input vector x = (1 1 0 0) with T = 1, perform Steps 4-5.
Step 4: Calculate J
                       D(j) = ∑i (wij - xi)²
                       D(1) = (1 - 1)² + (0.11 - 1)² + (0.99 - 0)² + (-0.1 - 0)² = 1.7822
                       D(2) = (0 - 1)² + (0 - 1)² + (1 - 0)² + (1 - 0)² = 4
                       D(1) is minimum, so J = 1, hence cJ = 1.
Step 5: Since T = 1 and cJ = 1, T = cJ, so
                       w1(new) = w1(old) + α[x - w1(old)]
                                    = (1  0.11  0.99  -0.1) + 0.05[(1  1  0  0) - (1  0.11  0.99  -0.1)]
                       w1(new) = (1  0.1545  0.9405  -0.095)
Step 3: For input vector x = (1  0  0  1) with T = 2, perform Steps 4-5.
Step 4: Calculate J
                       D(j) = ∑i (wij - xi)²
                       D(1) = (1 - 1)² + (0.1545 - 0)² + (0.9405 - 0)² + (-0.095 - 1)² ≈ 2.107
                       D(2) = (0 - 1)² + (0 - 0)² + (1 - 0)² + (1 - 1)² = 2
                       D(2) is minimum, so J = 2, hence cJ = 2.
Step 5: Since T = 2 and cJ = 2, T = cJ, so the weight update is,
                       w2(new) = w2(old) + α[x - w2(old)]
                                    = (0  0  1  1) + 0.05[(1  0  0  1) - (0  0  1  1)]
                       w2(new) = (0.05  0  0.95  1)
Step 6: The second epoch of training is completed.
Step 7: Test the stopping condition.
            Since the learning rate has reached 0.05, the training process is complete.

Thus the LVQ  net is constructed and tested.
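
As a cross-check of the hand calculations above, the short script below replays the two epochs (a sketch using NumPy; it keeps full precision instead of rounding intermediate weights):

import numpy as np

# Reference vectors initialised from the first two vectors; output units labelled c1 = 1, c2 = 2
W = np.array([[1, 0, 1, 0],
              [0, 0, 1, 1]], dtype=float)
classes = [1, 2]
X = np.array([[1, 1, 0, 0],
              [1, 0, 0, 1]], dtype=float)   # remaining two vectors used for training
targets = [1, 2]

alpha = 0.1
while alpha >= 0.05:                         # two epochs: alpha = 0.1, then alpha = 0.05
    for x, t in zip(X, targets):
        D = np.sum((W - x) ** 2, axis=1)     # squared Euclidean distances
        J = int(np.argmin(D))                # winning unit
        print("alpha =", alpha, "x =", x, "D =", np.round(D, 4), "winner:", J + 1)
        if t == classes[J]:
            W[J] += alpha * (x - W[J])       # move toward x
        else:
            W[J] -= alpha * (x - W[J])       # move away from x
    alpha *= 0.5
print("final weights:")
print(np.round(W, 4))

It reproduces the distances (2, 4) and (1.82, 2) in the first epoch and (1.7822, 4) and (about 2.107, 2) in the second, ending with w1 = (1  0.1545  0.9405  -0.095) and w2 = (0.05  0  0.95  1).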

________________
Thanks for Learning.