Introduction
The Self-Organizing Map (SOM) was developed by Professor Teuvo Kohonen and has proven useful in many applications; it is one of the most popular neural network models. Self-Organizing Maps are aptly named: "Self-Organizing" because no supervision is required, and "Maps" because they attempt to map their weights to conform to the given input data. The SOM belongs to the category of competitive, unsupervised learning networks, which means that no human intervention is needed during learning. It can be used for clustering data without prior knowledge of the class memberships of the input data.
Architecture
This is probably the most useful neural network type when the goal is to simulate the learning process of the human brain. The "heart" of this type is the feature map, a neuron layer in which neurons organize themselves according to certain input values. The network is both feed-forward (from the input layer to the feature map) and feedback (within the feature map). It is built of an input layer whose neurons are connected to every neuron of another layer, called the "feature map". The feature map can be one- or two-dimensional, and each of its neurons is connected to all other neurons on the map. The SOM is mainly used for classification.
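To make this structure concrete, the feature map can be pictured as a grid of weight vectors, each fully connected to the N input neurons. The small sketch below only illustrates that layout; the grid size and names are my own assumptions and not tied to any specific SOM library:

```python
import numpy as np

N = 4                  # number of input neurons (features)
rows, cols = 3, 3      # a small two-dimensional feature map
# each map neuron stores one weight per input neuron, i.e. a full
# connection from the input layer to the feature map
feature_map = np.random.rand(rows, cols, N)
print(feature_map.shape)   # (3, 3, 4)
```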
Basic Principle:
The basic idea of the SOM is to make available a certain number of classificatory resources (we will call them neurons, although the term units is also widely used) which are organized based on the patterns available for classification (called here input patterns). A very relevant characteristic of the SOM, and in fact its trademark, is the development of an ordered network in which nearby neurons share similarities; in this way, similar patterns activate similar areas of the SOM. Thus different parts of the SOM are activated by specific types of input patterns, leading to a representation based on global organization and local specialization. The SOM is trained iteratively through a large number of epochs. An epoch is defined as the processing of all the input patterns once; thus each input pattern is processed as many times as there are epochs.
Variables
1. Input data: X [MxN], M rows of data with N features each.
2. Maximum number of classes: MC
3. Weights: W [MCxN]
4. Distance: DIST [1xMC], the calculated distances from a single data row to each weight vector.
5. Learning rate: LR
6. Minimum distance: MIND
7. Minimum distance class: MINDC
8. Best matching unit, or winning neuron: BMU
9. Epoch: E, the number of iterations over the full training set.
10. Output class: OUTC
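As a concrete illustration of these variables, a minimal Python setup might look like the sketch below. The array shapes follow the notation above, and the specific numbers are taken from the worked example later in this article:

```python
import numpy as np

M, N = 4, 4        # M input patterns, each with N features
MC = 2             # maximum number of classes (output neurons)
LR = 0.6           # learning rate
E = 100            # number of epochs

X = np.array([[1, 1, 0, 0],          # input data X [MxN]
              [0, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 0, 1, 1]], dtype=float)
W = np.random.rand(MC, N)            # weights W [MCxN], random values in [0, 1]
DIST = np.zeros(MC)                  # distances from one input row to every weight vector
```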
Algorithm & Pseudo Code
A. Training
1. Set the learning rate (LR).
2. Set the maximum number of classes (MC).
3. Initialize random weights (W) of dimension MCxN, where N is the number of features.
4. Set the number of epochs (E).
5. Iterate over the number of epochs (E):
   a. Iterate over the input data rows.
   b. Calculate the distance from an input row [X(i)] to all the weights.
   c. Find the minimum distance (out of the MC distances).
   d. Find the index of the minimum distance class [MINDC].
   e. The BMU [W(MINDC,:)] is the weight vector that gives the minimum distance.
   f. Update the weights of the BMU neuron using the formula below:
   g. W(MINDC,:) = W(MINDC,:) + LR * (X(i,:) - W(MINDC,:))
   h. End of the loop over input data.
6. End of the loop over epochs.
7. Save the final weights.
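A minimal Python sketch of this training procedure is shown below. It assumes squared Euclidean distance, which is what the worked example later in this article uses; the function name and default values are my own:

```python
import numpy as np

def train_som(X, MC, LR=0.6, E=100, seed=None):
    """Winner-take-all training loop following steps 1-7 above."""
    rng = np.random.default_rng(seed)
    M, N = X.shape
    W = rng.random((MC, N))                        # step 3: random weights W [MCxN]
    for epoch in range(E):                         # step 5: loop over epochs
        for i in range(M):                         # step 5a: loop over input rows
            DIST = np.sum((X[i] - W) ** 2, axis=1) # 5b: squared distance to every weight
            MINDC = np.argmin(DIST)                # 5c-d: minimum distance class (BMU index)
            W[MINDC] += LR * (X[i] - W[MINDC])     # 5f-g: move the BMU toward the input
    return W                                       # step 7: final weights
```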
B. Testing
1. Iterate over the test data rows [TEST(i)].
2. Calculate the distance from TEST(i) to all the final weights.
3. Find the minimum distance (out of the MC distances).
4. Find the index of the minimum distance class [MINDC].
5. The probable class for this row is MINDC.
6. End of the loop over test data.
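A matching sketch of the testing step (again illustrative; the classify name is my own):

```python
import numpy as np

def classify(TEST, W):
    """Assign each test row to the class of its nearest weight vector (BMU)."""
    OUTC = []
    for row in TEST:
        DIST = np.sum((row - W) ** 2, axis=1)  # step 2: distance to all final weights
        OUTC.append(int(np.argmin(DIST)))      # steps 3-5: minimum distance class
    return OUTC
```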
Example:
Training Samples
i1: (1, 1, 0, 0)
i2: (0, 0, 0, 1)
i3: (1, 0, 0, 0)
i4: (0, 0, 1, 1)
Initial weight matrix (random values between 0 and 1):
Class1: 0.8, 0.4, 0.7, 0.3
Class2: 0.2, 0.6, 0.5, 0.9
Iteration one (learning rate LR = 0.6):
For i1:
Distance1 = (1-0.8)^2 + (1-0.4)^2 + (0-0.7)^2 + (0-0.3)^2 = 0.98
Distance2 = (1-0.2)^2 + (1-0.6)^2 + (0-0.5)^2 + (0-0.9)^2 = 1.86
Minimum distance = 0.98 with Distance1, so we update the weights for Class1.
Class1 weights become:
0.8 → 0.8 + (1 - 0.8)*0.6 = 0.92
0.4 → 0.4 + (1 - 0.4)*0.6 = 0.76
0.7 → 0.7 + (0 - 0.7)*0.6 = 0.28
0.3 → 0.3 + (0 - 0.3)*0.6 = 0.12
New weights:
Class1: 0.92, 0.76, 0.28, 0.12
Class2: 0.2, 0.6, 0.5, 0.9
For i2:
Distance1 = (0-0.92)^2 + (0-0.76)^2 + (0-0.28)^2 + (1-0.12)^2 = 2.2768
Distance2 = (0-0.2)^2 + (0-0.6)^2 + (0-0.5)^2 + (1-0.9)^2 = 0.66
Minimum distance = 0.66 with Distance2, so we update the weights for Class2.
Class2 weights become:
0.2 → 0.2 + (0 - 0.2)*0.6 = 0.08
0.6 → 0.6 + (0 - 0.6)*0.6 = 0.24
0.5 → 0.5 + (0 - 0.5)*0.6 = 0.2
0.9 → 0.9 + (1 - 0.9)*0.6 = 0.96
New weights:
Class1: 0.92, 0.76, 0.28, 0.12
Class2: 0.08, 0.24, 0.2, 0.96
For i3:
Distance1 = (1-0.92)^2 + (0-0.76)^2 + (0-0.28)^2 + (0-0.12)^2 = 0.6768
Distance2 = (1-0.08)^2 + (0-0.24)^2 + (0-0.2)^2 + (0-0.96)^2 = 1.8656
Minimum distance = 0.6768 with Distance1, so we update the weights for Class1.
Class1 weights become:
0.92 → 0.92 + (1 - 0.92)*0.6 = 0.968
0.76 → 0.76 + (0 - 0.76)*0.6 = 0.304
0.28 → 0.28 + (0 - 0.28)*0.6 = 0.112
0.12 → 0.12 + (0 - 0.12)*0.6 = 0.048
New weights:
Class1: 0.968, 0.304, 0.112, 0.048
Class2: 0.08, 0.24, 0.2, 0.96
For i4:
Distance1 = (0-0.968)^2 + (0-0.304)^2 + (1-0.112)^2 + (1-0.048)^2 ≈ 2.72
Distance2 = (0-0.08)^2 + (0-0.24)^2 + (1-0.2)^2 + (1-0.96)^2 = 0.7056
Minimum distance = 0.7056 with Distance2, so we update the weights for Class2.
Class2 weights become:
0.08 → 0.08 + (0 - 0.08)*0.6 = 0.032
0.24 → 0.24 + (0 - 0.24)*0.6 = 0.096
0.2 → 0.2 + (1 - 0.2)*0.6 = 0.68
0.96 → 0.96 + (1 - 0.96)*0.6 = 0.984
New weights:
Class1: 0.968, 0.304, 0.112, 0.048
Class2: 0.032, 0.096, 0.68, 0.984
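The arithmetic of this first iteration can be checked by running a single epoch of the training sketch from the Algorithm section, starting from the fixed initial weights instead of random ones (an illustrative check, not part of the original example):

```python
import numpy as np

X = np.array([[1, 1, 0, 0],
              [0, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 0, 1, 1]], dtype=float)
W = np.array([[0.8, 0.4, 0.7, 0.3],    # Class1 initial weights
              [0.2, 0.6, 0.5, 0.9]])   # Class2 initial weights
LR = 0.6

for i in range(len(X)):                     # one epoch over the four patterns
    DIST = np.sum((X[i] - W) ** 2, axis=1)
    MINDC = np.argmin(DIST)
    W[MINDC] += LR * (X[i] - W[MINDC])

print(W)   # approximately [[0.968 0.304 0.112 0.048]
           #                [0.032 0.096 0.68  0.984]]
```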
Final weights:
After many iterations, the weights converge to:
Class1: 1, 0.5, 0, 0
Class2: 0, 0, 0.5, 1
TESTING
With the same training data and the final weights above:

Test Data | Distance1 | Distance2 | Winning Class
i1: (1, 1, 0, 0) | 0.25 | 3.25 | Class1
i2: (0, 0, 0, 1) | 2.25 | 0.25 | Class2
i3: (1, 0, 0, 0) | 0.25 | 2.25 | Class1
i4: (0, 0, 1, 1) | 3.25 | 0.25 | Class2
With new data:
Data: (1, 1, 1, 0)

New Data | Distance1 | Distance2 | Winning Class
(1, 1, 1, 0) | 1.25 | 3.25 | Class1
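The same result can be reproduced directly with the final weights (a short illustrative check):

```python
import numpy as np

W_final = np.array([[1.0, 0.5, 0.0, 0.0],   # Class1 final weights
                    [0.0, 0.0, 0.5, 1.0]])  # Class2 final weights
new_data = np.array([1, 1, 1, 0], dtype=float)

DIST = np.sum((new_data - W_final) ** 2, axis=1)
print(DIST)             # [1.25 3.25]
print(np.argmin(DIST))  # 0, i.e. Class1 wins
```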