Thursday, July 11, 2013

Basics of Neural Networks

Introduction to Artificial Neural Networks (ANN)




We will come to the details of ANNs later, but first of all we should know why we should go for an ANN at all. How will it benefit our engineering work?

The prompt answer to that question is that some real-life problems have no direct linear relationship between the inputs and the outputs, yet we sometimes need to solve exactly such estimation or classification problems: sales forecasting, industrial process control, customer research, data validation, risk management, target marketing, etc.

ANNs are mostly used for fuzzy, difficult problems that don't yield to traditional algorithmic approaches. In other words, many problems have more "suitable" solutions for computers, but sometimes those solutions don't work, and in those cases one approach is a neural network.

Basic Analogy 


What is the Human brain made of?

The bulk of the brain is made up of supporting cells termed glial cells (astrocytes are one such type). Lying amongst these cells are neurons, specialized cells that conduct electrical impulses along their processes. It has been estimated that the average human brain contains about 100 billion neurons and that, on average, each neuron is connected to 1,000 other neurons. This results in vast and complex neural networks that are the mainstay of the brain's processing capabilities.

What is a neuron?
Neurons are the basic data processing units, the 'chips', of the brain. Each neuron receives electrical inputs from about 1000 other neurons. Impulses arriving simultaneously are added together and, if sufficiently strong, lead to the generation of an electrical discharge, known as an action potential (a 'nerve impulse'). The action potential then forms the input to the next neuron in the network.

Recognition & Memory
Our brain is capable of recognizing different objects and keeping that information in memory. Neurons are responsible for all of this.

Example of how neuron works
If someone asks you what comes after 5, what will be your answer? It must be 6. How do you know that? That is where the neurons of your brain do the work. You were trained on the number system; that is why you are able to think that after 5 comes 6, not 7. But if you had been trained that 9 comes after 5, then your answer would be 9.


Similarly, if we can build a model that can be trained the way our brain is, then it can be used in many aspects of computational machine learning.

Artificial Neural Network
In machine learning and computational neuroscience, an artificial neural network, often simply called a neural network, is a mathematical model inspired by biological neural networks. A neural network consists of an interconnected group of artificial neurons, and it processes information using a connectionist approach to computation. In most cases a neural network is an adaptive system that changes its structure during a learning phase. Neural networks are used for modelling complex relationships between inputs and outputs or for finding patterns in data.



Neural networks do not perform miracles. But if used sensibly they can produce some amazing results.


A Simple Neuron
An artificial neuron is a device with many inputs and one output. The neuron has two modes of operation: the training mode and the using mode. In the training mode, the neuron can be trained to fire (or not) for particular input patterns. In the using mode, when a taught input pattern is detected at the input, its associated output becomes the current output. If the input pattern does not belong to the taught list of input patterns, the firing rule is used to determine whether to fire or not.
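The firing behaviour described above can be sketched in a few lines (an illustration of the idea only; the weights and threshold below are made up, not from any particular source):

```python
def neuron_fire(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of the inputs reaches the threshold."""
    total = sum(x * w for x, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# A neuron hand-"trained" to fire only when both inputs are on,
# so it behaves like a logical AND gate:
weights, threshold = [0.6, 0.6], 1.0
print(neuron_fire([1, 1], weights, threshold))  # fires -> 1
print(neuron_fire([1, 0], weights, threshold))  # stays silent -> 0
```

Training, in this picture, just means finding weight and threshold values that make the neuron fire on the right patterns.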



Network Architecture

The commonest type of artificial neural network consists of three groups, or layers, of units: a layer of "input" units is connected to a layer of "hidden" units, which is connected to a layer of "output" units. 

1. The activity of the input units represents the raw information that is fed into the network. 
2. The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and the hidden units. 
3. The behavior of the output units depends on the activity of the hidden units and the weights between the hidden and output units. 

This simple type of network is interesting because the hidden units are free to construct their own representations of the input. The weights between the input and hidden units determine when each hidden unit is active, and so by modifying these weights, a hidden unit can choose what it represents.
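A forward pass through such a three-layer network can be sketched as follows (a minimal illustration; the sigmoid activation and the specific weight values are assumptions, not taken from the text):

```python
import math

def sigmoid(x):
    """Squash an activity value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, w_ih, w_ho):
    """Propagate input activities through hidden units to output units.
    w_ih[j][i] is the weight from input i to hidden unit j;
    w_ho[k][j] is the weight from hidden unit j to output unit k."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in w_ih]
    return [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_ho]

# 2 input units -> 2 hidden units -> 1 output unit, weights chosen arbitrarily
w_ih = [[0.5, -0.4], [0.3, 0.8]]
w_ho = [[1.2, -0.7]]
print(forward([1.0, 0.0], w_ih, w_ho))  # a single output activity between 0 and 1
```

Changing the weights in `w_ih` changes what each hidden unit responds to, which is exactly the freedom described above.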

Network Training

After construction of the network, it has to be trained using training data, so that test data will produce the expected output. Training can be of two types:

1. Supervised Training
In supervised training, both the inputs and the outputs are provided. The network then processes the inputs and compares its resulting outputs against the desired outputs. Errors are then propagated back through the system, causing the system to adjust the weights which control the network. This process occurs over and over as the weights are continually tweaked. The set of data which enables the training is called the "training set." During the training of a network, the same set of data is processed many times as the connection weights are continually refined.
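As a toy illustration of this error-driven weight adjustment, here is a single linear neuron trained with a simple delta-rule update (the learning rate, the training set, and the number of passes are all made-up values for the sketch):

```python
def train_step(weights, inputs, target, lr=0.1):
    """Nudge each weight in proportion to the output error."""
    output = sum(w * x for w, x in zip(weights, inputs))
    error = target - output                 # desired minus actual output
    return [w + lr * error * x for w, x in zip(weights, inputs)]

# Learn y = 2*x; the same training set is processed many times,
# with the weight tweaked a little on every pass.
training_set = [([1.0], 2.0), ([2.0], 4.0), ([3.0], 6.0)]
weights = [0.0]
for _ in range(200):
    for x, y in training_set:
        weights = train_step(weights, x, y)
print(round(weights[0], 4))  # -> 2.0
```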

2. Unsupervised or Adaptive Training 
The other type of training is called unsupervised training. In unsupervised training, the network is provided with inputs but not with desired outputs. The system itself must then decide what features it will use to group the input data. This is often referred to as self-organization or adaptation.

Basically, there are two types of problems to be solved:

1. Classification Problems: Here the network is used to classify a set of data into several groups or classes.
  • Examples: PNN (Probabilistic Neural Network), SOM (Self-Organizing Map), Perceptron, MLP (Multilayer Perceptron), etc.
2. Function Approximation or Regression Problems: Here the network predicts/estimates the output for the input data.
  • Examples: GRNN (Generalized Regression Neural Network), RBF (Radial Basis Function) networks, etc.
Note: Some neural networks can be used for both classification and approximation.


Wednesday, July 10, 2013

GRNN : Generalized Regression Neural Networks

GRNN stands for Generalized Regression Neural Network.

1. What does a GRNN do?

This is basically a neural-network-based function approximation or function estimation algorithm: it predicts the output for a given input.

2. How does it work?


As per the basic principle of neural networks, a GRNN needs training data to train itself, and the training data should contain input-output mappings. If we train the network with the training data set and then feed it a new test sample, it will predict the output accordingly.

In the case of GRNN, the output is estimated using a weighted average of the outputs of the training dataset, where each weight is calculated from the Euclidean distance between the training sample and the test sample. If the distance is large, the weight is very small; if the distance is small, that training sample's output gets more weight.
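That weighted-average idea is essentially the whole algorithm, and fits in a few lines (a sketch for one-dimensional inputs; the data and σ below are invented for illustration):

```python
import math

def grnn_predict(x, train_x, train_y, sigma):
    """Estimate the output at x as a weighted average of the training outputs,
    with weight exp(-d**2 / (2 * sigma**2)) for a training sample at distance d."""
    w = [math.exp(-((x - xi) ** 2) / (2 * sigma ** 2)) for xi in train_x]
    return sum(wi * yi for wi, yi in zip(w, train_y)) / sum(w)

# Midway between two training points, the weights are equal, so the
# estimate is simply the average of their outputs:
print(round(grnn_predict(5.0, [0.0, 10.0], [0.0, 100.0], sigma=2.0), 6))  # -> 50.0
```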

3. Network Architecture


The network architecture contains four basic layers: the input layer, the pattern layer, the summation layer, and the output layer.
Input layer:
The input layer feeds the input to the next layer.
Pattern layer:
The pattern layer calculates the Euclidean distance to each training sample and applies the activation function.
Summation layer:
The summation layer has two subparts, a numerator part and a denominator part. The numerator part contains the summation of each training output multiplied by its activation value; the denominator is the summation of all activation values. This layer feeds both the numerator and the denominator to the output layer.
Output layer:
The output layer contains one neuron, which calculates the output by dividing the numerator part of the summation layer by the denominator part.

4. Main Principle

GRNN stands on the equation below:

Y(x) = Σ Yi · exp(−di² / (2σ²)) / Σ exp(−di² / (2σ²))

where di² = (x − xi)ᵀ(x − xi).

Here x is the input (test) sample and xi is the i-th training sample, whose output is Yi. di² is the squared Euclidean distance between x and xi, and exp(−di² / (2σ²)) is the activation function, which theoretically acts as the weight for that training sample.
Now if you look closely, the value of di² signifies how much the training sample contributes to the output for that particular test sample.
If di² is small, the sample contributes more to the output; if di² is large, it contributes very little.
The term exp(−di² / (2σ²)) decides exactly how much weight the training sample contributes:
If di² is small, the term exp(−di² / (2σ²)) returns a relatively large value.
If di² is large, the term exp(−di² / (2σ²)) returns a relatively small value.
If di² is zero, the term exp(−di² / (2σ²)) returns one, which means the test sample equals the training sample, and the output of the test sample will be the output of that training sample.
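A quick numeric check of this behaviour, with σ = 1 (the sample distances are arbitrary):

```python
import math

sigma = 1.0
for d2 in [0.0, 1.0, 9.0]:                      # squared distances di²
    weight = math.exp(-d2 / (2 * sigma ** 2))   # the activation term
    print(d2, round(weight, 4))
# d² = 0 gives weight 1.0 (identical samples); d² = 1 gives 0.6065;
# d² = 9 gives only 0.0111, so distant samples barely contribute.
```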


Here we have only one unknown parameter, the spread constant σ. It can be tuned by the training process to an optimum value where the error is very small.

5. Training Procedure

The training procedure is to find out the optimum value of σ. Best practice is to find the value of σ where the MSE (Mean Squared Error) is minimum.
First, divide the whole sample into two parts: a training sample and a test sample. Apply the GRNN to the test data based on the training data and find the MSE for different values of σ. Then pick the minimum MSE and the corresponding value of σ.
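A minimal sketch of this holdout search (the data, the split, and the candidate σ grid are all invented for illustration):

```python
import math

def grnn_predict(x, train_x, train_y, sigma):
    """GRNN estimate: weighted average of training outputs."""
    w = [math.exp(-((x - xi) ** 2) / (2 * sigma ** 2)) for xi in train_x]
    return sum(wi * yi for wi, yi in zip(w, train_y)) / sum(w)

# Split the available samples into a training part and a held-out part,
# then keep the sigma with the lowest MSE on the held-out part.
train_x, train_y = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
test_x,  test_y  = [1.5, 3.5],           [3.0, 7.0]

best_sigma, best_mse = None, float("inf")
for sigma in [0.1, 0.3, 0.5, 1.0, 2.0]:
    preds = [grnn_predict(x, train_x, train_y, sigma) for x in test_x]
    mse = sum((p - y) ** 2 for p, y in zip(preds, test_y)) / len(test_y)
    if mse < best_mse:
        best_sigma, best_mse = sigma, mse
print(best_sigma, best_mse)
```

On real, noisy data a finer grid (and V-fold cross-validation rather than a single split) would give a more reliable σ.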

6. Advantages of GRNN

1) The main advantage of GRNN is its fast training process; the network needs no iterative weight optimization.
2) The network is able to learn from the training data in a single "1-pass" training step, in a fraction of the time it takes to train standard feed-forward networks.
3) The spread, sigma (σ), is the only free parameter in the network, and it can often be identified by V-fold or split-sample cross-validation.
4) Unlike standard feed-forward networks, GRNN estimation always converges to a global solution and won't be trapped in a local minimum.

7. Example

input    output
2             3
4             5
6             7
8             9   
What will be the output for input 5?

Step 1
Calculate the squared distances: d1² = (5−2)² = 9, d2² = (5−4)² = 1, d3² = (5−6)² = 1, d4² = (5−8)² = 9.

Step 2
Calculate the weights using the activation function exp(−di² / (2σ²)).
Let's say σ = 1. The weights are then:
w1 = exp(−9/2) ≈ 0.011
w2 = exp(−1/2) ≈ 0.607
w3 = exp(−1/2) ≈ 0.607
w4 = exp(−9/2) ≈ 0.011

Step 3

Sum of the weights: W = w1 + w2 + w3 + w4 ≈ 1.236.
So the denominator is 1.236.
The numerator is YW = w1·y1 + w2·y2 + w3·y3 + w4·y4
                    ≈ 0.011·3 + 0.607·5 + 0.607·7 + 0.011·9
                    ≈ 7.416

Step 4

So the output is (numerator / denominator):

output = YW/W ≈ 7.416/1.236 = 6

So the predicted output is 6.
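The four steps above can be reproduced in code (same data, σ = 1; the variable names are mine):

```python
import math

train_x, train_y = [2, 4, 6, 8], [3, 5, 7, 9]
x, sigma = 5, 1.0

# Step 1: squared distances to each training input
d2 = [(x - xi) ** 2 for xi in train_x]              # [9, 1, 1, 9]

# Step 2: weights from the activation function exp(-d²/(2σ²))
w = [math.exp(-d / (2 * sigma ** 2)) for d in d2]

# Steps 3 and 4: weighted average of the training outputs
output = sum(wi * yi for wi, yi in zip(w, train_y)) / sum(w)
print(round(output, 4))  # -> 6.0
```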