
Spherical Topology Self-Organizing Map Neuron

Network for Visualization of Complex Data

Huajie Wu <[email protected]>

4 November 2011

A report submitted for the degree of Master of Computing of

the Australian National University

Supervisor: Prof. Tom Gedeon


Acknowledgements

Firstly, thanks to my supervisor Tom Gedeon and to Dingyun Zhu for their recommendations and support on this project. Thanks also to Uwe R. Zimmer for his suggestions on the technique of report writing. Finally, thanks to my family and my friends for their encouragement.


Abstract

The spherical SOM (SSOM) was proposed in order to remove the “border effect” of the conventional Self-Organizing Map (SOM). However, the SSOM still has limitations in representing a sequence of events. This report proposes the concentric spherical Self-Organizing Map (CSSOM), which can use an arbitrary number of spheres, a topology that can be applied to the analysis of sequential and time series data. I present a new method that extends the SSOM and reconstructs the neighborhoods in order to implement the concentric spherical Self-Organizing Map. Moreover, for ease of evaluation, I present the display schemas and several measurements of the quality of SOMs, together with experimental results. The results indicate that the quality of the SOM is improved by using a CSSOM configured for the characteristics of the dataset. However, sequence training as currently proposed needs improvement. Finally, the quality of clustering becomes worse as the number of spheres increases and the number of units in each sphere decreases.

Key words: Neural Networks, concentric spherical Self-Organizing Maps, time series data, clustering, sequence training

List of Abbreviations

NN      Neural Network
SOM     Self-Organizing Map
SSOM    Spherical Self-Organizing Map
CSSOM   Concentric Spherical Self-Organizing Map


Contents

Acknowledgements
Abstract
List of Abbreviations
List of Figures
List of Tables
1. Introduction
   1.1 Motivation
   1.2 Objectives
   1.3 Contribution
   1.4 Preview
2. Background
   2.1 Neural Networks and Unsupervised Learning
   2.2 Kohonen’s SOM
   2.3 Spherical SOMs
3. S-SOM
   3.1 The Algorithm in the Training Process
   3.2 Deformation of S-SOM
       3.2.1 The Arrangement & Neighborhood Structure
       3.2.2 The Representations of Distortions and Colors
4. Details of Concentric Spherical Self-Organizing Map Neuron Network
   4.1 Description
   4.2 Architecture of CSSOM
   4.3 Neighborhood Structure
   4.4 Display Schema
   4.5 Sequence Training
   4.6 Test Suite
       4.6.1 Purity of Clustering
       4.6.2 Quantization Error and Topological Error
5. Experiments and Results
   5.1 Experiment 1: The quality of SOMs
       5.1.1 Description of the experiment
       5.1.2 Experiment Process and Discussion of Results
   5.2 Experiment 2: Time sequence training
       5.2.1 Description of the experiment
       5.2.2 Experiment Process and Discussion of Results
   5.3 Experiment 3: The purity of clustering using CSSOM with different numbers of spheres
       5.3.1 Description of the experiment
       5.3.2 Experiment Process and Discussion of Results
6. Conclusion and Future Work
   6.1 Conclusion
   6.2 Future Work
References
Appendix A


List of Figures

Figure 1: Example of a basic NN
Figure 2: Conventional 1D and 2D arrangements
Figure 3: The process of updating weights for cluster units
Figure 4: 3D graphical object representing the Wisconsin cancer data
Figure 5: The main interface of CSSOM
Figure 6: Drop-down menu of “File” and pop-up window
Figure 7: The changes after selecting loaded files
Figure 8: The general flow of CSSOM
Figure 9: The sub-steps in Initialization
Figure 10: Options in Training
Figure 11: Options in Display Schema
Figure 12: Options in Evaluation
Figure 13: 2D output grid units map of 3 layers of CSSOM
Figure 14: Showing sphere 3 as center, 5 spheres of “Chain Glyph” display schema
Figure 15: Start from sphere 1, end with sphere 4, 5 spheres of “Equal Glyph” display schema
Figure 16: The distribution of items in clusters
Figure 17: The average number of neighborhoods per unit in SSOM and CSSOM
Figure 18: The number of units with different neighborhoods (details in Appendix A)
Figure 19: Visual view of SSOM from different angles
Figure 20: Visual view of CSSOM(15S) from sphere 1 to sphere 15
Figure 21: Visual view of CSSOM in sequence training from sphere 1 to sphere 15
Figure 22: Visual view of CSSOM in sequence training from sphere 5 to sphere 10
Figure 23: Visual view of CSSOM in parallel training in time order from sphere 1 to sphere 15


List of Tables

Table 1: Visual table for “Parallel Training”
Table 2: Visual table for “Sequence Training”
Table 3: Summary of datasets
Table 4: QE and TE of SOMs using dataset “IRIS”
Table 5: QE and TE of SSOM and CSSOM using dataset “ECSH”
Table 6: QE and TE of Parallel Training and Sequence Training using dataset “ECSH”
Table 7: QE and TE of Parallel Training in random order and in time order using dataset “ECSH”
Table 8: The purity of SSOMs with different sizes of sphere
Table 9: The purity of CSSOMs with different numbers of spheres


1. Introduction

In 1982, the Self-Organizing Map (SOM) was proposed by Kohonen, primarily for clustering, classification, sampling and visualizing high-dimensional data [3]. Since then, the technique has been widely applied, for example to clustering high-frequency financial data. The conventional arrangement is a planar SOM made of a two-dimensional rectangular or hexagonal lattice. However, the planar SOM has a disadvantage known as the “border effect” [15]. During training, the neurons compete with each other, and the weights of the winning neuron and its neighbors are updated. Ideally, all units should have the same chance of being updated. However, in a planar map the units at the border have fewer neighbors than the inside units. At the end of training, the map may therefore not form the expected similar regions of the data space, since many units have unequal chances of being modified during training [5]. Many spherical SOMs have been proposed to solve this problem. In this report, one of them, by Sangole & Leontitsis (2005), is described and extended as a base. The disadvantage of most of these methods is that the number of neurons in the map is not arbitrary [2]: for example, “ICOSAN can arrange 2+10*4^N” [2] neurons, where ICOSAN is that method's arrangement and N is the number of recursive subdivisions. The Concentric Spherical Self-Organizing Map implements multiple layers of SSOM, so compared with a single-layer SSOM the number of neurons can be varied more freely. More importantly, the standard SOM has no way to represent sequential data, which can be done with the multiple layers of a CSSOM.

1.1 Motivation

The motivation of this project is to provide a method that lets the user choose an arbitrary number of spheres, and to observe the clustering results as well as the quality of SOMs on multiple spheres (CSSOM). A further motivation is to compare the effects of time-sequence training and parallel training in the concentric spherical Self-Organizing Map.

1.2 Objectives

The aim of this project is to implement the concentric spherical Self-Organizing Map based on Sangole & Leontitsis's SSOM code, to allow a user to visualize data using an arbitrary number of spheres, and to investigate a possible way to train a CSSOM on time series data.


1.3 Contribution

The contribution of this project covers three areas. Firstly, the SSOM code written by Sangole & Leontitsis was simplified and modified to implement the concentric SSOM and to display it in different ways. Secondly, evaluation code was written to assess the accuracy of clustering and the quality of the SOMs. Finally, sequence training code was designed for the CSSOM and used to evaluate its differences from parallel training.

1.4 Preview

Chapter 2 gives an overview of the relevant techniques and basic concepts, including Neural Networks, unsupervised training, Self-Organizing Maps and spherical Self-Organizing Maps, as preparation for understanding the CSSOM and its evaluation. Chapter 3 gives a further interpretation and explanation of Sangole & Leontitsis's SSOM. Chapter 4 introduces the CSSOM in detail, including its basic description, architecture, neighborhood structure and display schemas. Chapter 5 evaluates the results of the SSOM and CSSOM for clustering the data and the quality of the SOMs, and observes the differences in effects and results between sequence training and parallel training on specific datasets. Finally, Chapter 6 concludes the report and indicates the weaknesses as well as suggestions for future work.

2. Background

2.1 Neural Networks and Unsupervised Learning

A Neural Network (NN) is a “mathematical model” or “computational model” which consists of an interconnected group of artificial neurons and simulates, in a small way, the structure and function of the human brain. Moreover, it is an “adaptive system” which changes its structure based on external and internal information [17]. In general, it consists of three types of layers, containing input neurons, hidden neurons and output neurons respectively. Figure 1 shows the basic principle of NNs.


Figure 1: Example of a basic NN

In Figure 1, X represents the inputs, which might come from other neurons; Y represents the outputs, which might be inputs to other neurons or the final outputs; and W represents the strengths of the connections between the inputs and the neuron. The neuron collects all the weighted inputs and uses the activation function to generate the output. The activation function could be a non-linear function such as the sigmoid, or a Gaussian function. The activation function formula is shown below.

y = f( Σ_{i=0}^{n} w_i x_i )

where y is the output of the neuron, n is the number of inputs, x_i is the ith input, w_i is the weight on the ith input, and f is the activation function, e.g. the sigmoid f(x) = 1 / (1 + e^{-x}).
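As a minimal sketch of this computation in MATLAB (the inputs, weights and choice of sigmoid below are illustrative assumptions, not values from this report), a single neuron can be written as:

    x = [0.2; 0.7; 0.1];          % inputs x_i
    w = [0.5; -0.3; 0.8];         % connection weights w_i
    f = @(s) 1 ./ (1 + exp(-s));  % sigmoid activation function
    y = f(w' * x)                 % neuron output y = f( sum_i w_i * x_i )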

There are three main learning methods: supervised learning, unsupervised learning and reinforcement learning. In this report, unsupervised learning is used in the SOM, so it is the one mainly described. With unsupervised learning, SOM neural networks can find patterns in data without known categories or labels. In other words, the SOM only uses the training dataset's input set to group the data; the training data is not organized as input-output patterns. If output patterns are available, we can compare them to the clusters of the SOM and estimate the similarity between the SOM clusters and the output labels, which is similar to the accuracy of a supervised classification method.


2.2 Kohonen’s SOM

The Self-Organizing Map algorithm was introduced by Teuvo Kohonen in 1982 [9]. Kohonen's later publications in 1995 [10] and 2001 [11] are generally regarded as the major references on the SOM. In Kohonen's description, “it is a tool of visualization and analysis of high-dimensional data”. Additionally, it is useful for clustering, classification and data mining in different areas. The SOM is an unsupervised learning method, the key feature of which is that “there are no explicit target outputs or environmental evaluations associated with each input” [6]: during the training process, there is no evaluation of the correctness of the output, and no ‘supervision’. Unlike many other neural networks, it has only two layers, the input layer and the output layer (also called the competition layer). Every input in the input space connects to all the output neurons in the map. The output arrangements are mostly two-dimensional. Figure 2 below shows conventional 1D and 2D arrangements.

[Figure 2 shows input neurons x_n connected to output neurons y_n in (a) a 1D line layout and (b) a 2D rectangular layout.]

Figure 2: Conventional 1D and 2D arrangements

In Figure 2, x_n represents the input neurons in the input space and y_n represents the outputs in the output space. Figure 2a shows a one-dimensional arrangement in the form of a line layout; Figure 2b shows a two-dimensional arrangement in the form of a rectangular layout. The figure shows that, compared to a general NN, the SOM has no hidden neurons, and the inputs map to the output space in a regular, discrete arrangement. Besides the rectangular layout, a 2D SOM can also have a hexagonal arrangement. Next, the main process of the Self-Organizing Map is introduced. The process is made up of three main phases: competition, cooperation and adaptation [17].

Competition: each neuron in the self-organizing map computes the (Euclidean) distance between its weight vector and the input vector. The competition among the neurons is then based on these outputs; with i(x) denoting the neuron that best matches the input vector x, it can be expressed as:

i(x) = arg min_j ||x − w_j||,  j = 1, 2, …, l   (2.1)

In formula 2.1, x is the input vector and w_j is the jth neuron's weight vector. This is a “nearest neighbor search”, also known as proximity search, similarity search or closest-point search, which consists of finding the closest points in a metric space [7]. The neuron j which satisfies the above condition is called the “winning neuron”.


Cooperation: the winning neuron is located at the center of a neighborhood of topologically cooperating neurons. The winning neuron tends to activate the set of neurons at lateral distances given by a special function. This distance function must satisfy two requirements: 1) it is symmetric; 2) it decreases monotonically as the distance increases [8]. A function h(j, i) which satisfies these requirements is the Gaussian:

h(j, i) = exp( −d_{j,i}² / 2σ² )   (2.2)

In formula 2.2, h(j, i) defines the topological area centered around the winning neuron i, d_{j,i} is the lateral distance between the winning neuron i and a cooperating neuron j, and σ is the radius of influence.

Adaptation: in this phase the synaptic weights adaptively change. Since these neural networks are self-adaptive, neuron j's synaptic weight w_j is updated toward the input vector x. All neurons in the neighborhood of the winner are updated as well, to ensure that adjacent neurons have similar weight vectors. The following formula states how the weight of each neuron in the neighborhood of the winner is updated:

w_j = w_j + η·h(j, i)·(x − w_j)   (2.3)

In formula 2.3, η is the learning rate, i is the index of the winning neuron, and w_j is the weight of neuron j. The function h(j, i) was given in equation 2.2. These three phases are repeated during training until the changes become smaller than a predefined threshold.
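To make the three phases concrete, the MATLAB fragment below sketches one training step; the variable names (W for the m×d weight matrix, x for a 1×d input, D for a precomputed m×m matrix of lateral grid distances, eta and sigma for η and σ) are assumptions for illustration, not the thesis code.

    [~, i] = min(sum((W - x).^2, 2));    % competition: winning neuron (Eq. 2.1)
    h = exp(-D(:, i).^2 / (2*sigma^2));  % cooperation: Gaussian neighborhood (Eq. 2.2)
    W = W + eta * h .* (x - W);          % adaptation: pull winner and neighbors toward x (Eq. 2.3)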

2.3 Spherical SOMs

Compared to the normal 2D SOM, spherical SOMs eliminate the “border effect”. Furthermore, spherical SOMs give more effective visualization, because all neurons receive equal geometrical treatment and people may prefer to read the maps from spheres. A number of spherical SOMs have been implemented and applied to various datasets. The main spherical SOM topologies are the following: GeoSOM, S-SOM, 3D-SOM and H-SOM. GeoSOM was proposed by Wu & Takatsuka [5] and uses a 2D rectangular grid data structure to store an icosahedron-based geodesic dome. In Sangole & Leontitsis's S-SOM [4], every grid unit stores the list of its immediate neighbors; the next chapter describes S-SOM in detail. Boudjemai [16] applied the idea to 3D modeling as 3D-SOM, whereas Hirokazu [2] developed H-SOM, which arranges the neurons along a helix divided into equal parts. Hirokazu's method allows arbitrary numbers of neurons, but calculating neighbors is quite difficult. The advantages and disadvantages of these spherical SOMs are discussed in detail by Wu & Takatsuka [5]. This project is mainly based on Sangole & Leontitsis's S-SOM, so the following chapter gives much more detail on S-SOM.


3. S-SOM

The spherical self-organizing feature map proposed by Sangole & Leontitsis is a tool that maps randomly organized N-dimensional data onto the lower-dimensional, almost-2D surface of a sphere, and is used for visual pattern analysis.

3.1 The Algorithm in the Training Process

Before training, the program requires the user to load the SSOM data structure and the data. Furthermore, parameters such as the epoch count (how many cycles), the learning rate and the neighborhood parameter should be set first. The weights of the cluster units are then updated during training. The following flow chart shows the training process.

Figure 3: The process of updating weights for cluster units

In Figure 3, in step 2, D^p_{i,j,k} is the difference between the current input vector and the weight vector of each cluster unit:

D^p_{i,j,k} = (Φ(u_{i,j,k}) + 1) · Σ_{n=1}^{N} (x^p_n − w_{n,i,j,k})²   (3.1)

where x^p_n is the nth component of the pth input vector, and w_{n,i,j,k} is the corresponding weight of the (i,j,k)th node. Φ(u_{i,j,k}) is a count-dependent, non-decreasing function used to prevent cluster under-utilization [13]. In step 3, the winning neuron is the unit at the node (i,j,k) with the minimum distance.


In step 4, the weights of the winning neuron and its neighborhood units will be updated. The formula is shown below:

w_{i,j,k}(new) = w_{i,j,k}(old) + a·[x^p_n − w_{i,j,k}(old)]   (3.2a)

Φ(new) = Φ(old) + h(r)   (3.2b)

h(r) = exp( −r² / 2R )   (3.2c)

In 3.2a, a is the learning rate. In 3.2b, Φ is the count-dependent parameter explained for equation 3.1. In 3.2b and 3.2c, h(r) is the neighborhood function, R is a radius “that span over a half of the spherical SOFM space” [4], and r is the current radius. Finally, when the stopping condition is satisfied, namely the “epoch” parameter (the number of cycles) set by the user, the training process terminates.
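As a hedged MATLAB sketch of this update loop (W, phi, x, a, R, Rmax and the ring() helper are illustrative assumptions, not the names used in Sangole & Leontitsis's code):

    d = (phi + 1) .* sum((W - x).^2, 2);    % count-biased distance to input x (Eq. 3.1)
    [~, win] = min(d);                      % winning unit at minimum distance
    for r = 0:Rmax                          % the winner (r = 0) and its neighbor rings
        u = ring(C, win, r);                % hypothetical helper: units exactly r away
        W(u,:) = W(u,:) + a * (x - W(u,:)); % move units toward the input (Eq. 3.2a)
        phi(u) = phi(u) + exp(-r^2/(2*R));  % count-dependent update with h(r) (Eqs. 3.2b-c)
    end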

3.2 Deformation of S-SOM

3.2.1 The Arrangement & Neighborhood Structure

The output space is made up of predefined grid units. Compared to the other Platonic polyhedra, the icosahedron is the most similar to a sphere, and the variance in its edge lengths is the smallest. Most of the vertices have 6 immediate neighbors (adjacent points), except the 12 original vertices of the icosahedron, which only have 5 immediate neighbors. After tessellation, the number of vertices is:

NN = 2 + 10·4^n   (3.3)

In equation 3.3, NN is the number of vertices in the output space and n is the number of recursive subdivisions. The program also has a data structure that stores all vertices' neighbors; in other words, every vertex has a neighborhood list recording its neighbors.
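A quick check of this formula for the first few subdivision levels (plain MATLAB, with no assumptions beyond the formula itself):

    n  = 0:4;
    NN = 2 + 10*4.^n   % -> 12, 42, 162, 642, 2562

These are exactly the sphere sizes (12, 42, 162, 642 and 2,562 units) used in the experiments of Chapter 5.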

3.2.2 The Representations of Distortions and Colors

As described by Sangole [12], distortions and colors reflect the magnitude of the similarity measure. The following figure shows the representation of distortions and colors in a visual view.

Dataset: Wisconsin breast cancer data; 683 fine-needle aspirate tissue samples, pre-classified into two categories, malignant and benign tissue.

Parameters: 642 grid units on a tessellated sphere; the network is trained for approximately 200 epochs.


Figure 4: 3D graphical object representing the Wisconsin cancer data

In Figure 4, it is easy to see two significantly different colors, red and yellow. Red indicates malignant tissue, whereas yellow represents benign tissue. Furthermore, there are two clumps at the boundaries between the colors. “Either shape or color, in the graphical form thereby indicates the presence of the distinct attributes as compared to the rest of the input vectors in the data set” [12].

4. Details of Concentric Spherical Self-Organizing Map Neuron Network

4.1 Description

This section mainly describes the user interface, and generally introduces the function of every component in the GUI.


Figure 5: The main interface of CSSOM

In Figure 5, the “Loading Data Structure” section displays which “input data file” and “SSOM structure data file” the user has selected. Instructions for loading these two files are shown later. Before training, the “Training Parameter” values should be set; otherwise each parameter keeps its default value (shown in Figure 5). “Epochs” is the number of cycles of parallel training, and must be set when parallel training is used. “Size” is the neighborhood parameter used in the h function (see equation 3.2c). “Spheres” is the number of layers of spheres. “Times” is the number of repetitions of sequence training (details in Chapter 4.5) and is not used in parallel training.

Before displaying the glyphs, the “Display Parameter” values should also be set; otherwise they remain at their defaults (shown in Figure 5). “CenterSph” should be set before plotting the “Chain” glyph (details in Chapter 4.4). “StartSph” and “EndSph” should be set before plotting the “Equal” glyph (details in Chapter 4.4). All of these have default values, so the system will work, but may not produce the best results.

After loading the input data file and the SSOM structure data file, the buttons in the “Training Buttons” section become visible. The “Train” button is used for parallel training after setting the “Epochs” parameter; the “Sequence Train” button is used for sequence training. After training, the buttons in the “Display Glyph Buttons” section become visible. The “Plot ‘Chain’ Glyph” button generates the glyph in the form of a chain (details in Chapter 4.4), and the “Plot ‘Equal’ Glyph” button generates the glyph consisting of several equal-size spheres (details in Chapter 4.4). Next, the following figure shows the sub-menu of “File”.


Figure 6: Drop-down menu of “File” and pop-up window

In Figure 6, selecting “Load data…” or “Load S-SOFM…” from the drop-down menu of “File” pops up a window showing the parent directory of the main “.m” file. After the user selects the files to load, the “Loading Data Structure” section in the main interface shows the names of the selected files. The following figure shows the changes:

Figure 7: The changes after selecting loaded files

In Figure 7, the “Training Buttons” are selectable after both files are loaded. In the drop-down menu of “Help”, there is an option “About”, which pops up a window with information about this software.

4.2 Architecture of CSSOM

The Concentric Spherical Self-Organizing Map can be interpreted as multiple layers of Sangole & Leontitsis's S-SOM, and as a tool for more effective visual representation of sequence data. The CSSOM is composed of four modules: the initialization module, the training module, the display schema module and the test suite module. The following diagrams show the flow and sequence of the modules and provide an overview of CSSOM's operation principles.

Figure 8: The general flow of CSSOM

Initialization:

Figure 9: The sub-steps in Initialization

In Figure 9, in step 1.1, the files include the input data and the SSOM structure, and the parameters include those shown in Figure 5. In step 1.2, all data are saved as variables in the workspace, such as X for the input vectors and C for the neighbor lists. In step 1.3, based on “X”, “C”, “P” (the Cartesian coordinates) and “spheres”, P is resized and the neighborhood lists are reorganized (details in Chapter 4.3).

Training:

Figure 10: Options in Training

Figure 10 shows that there are two types of training: “Parallel Training” and “Sequence Training”. “Parallel Training” was described in detail in the previous chapter (see Chapter 3.1), whereas “Sequence Training” is introduced in Chapter 4.5.


Display Schema:

Figure 11: Options in Display Schema

As Figure 11 shows, there are two different types of display schema, so the data can be analyzed from various views. Chapter 4.4 has more details.

Evaluation:

Figure 12: Options in Evaluation

In option 2 of Figure 12, QE stands for Quantization Error, while TE stands for Topological Error. For more details refer to Chapter 4.6.

4.3 Neighborhood Structure

Before training, based on “P”, “C” and “spheres” (the Cartesian coordinates, the neighborhood lists, and the number of layers respectively), “P” and “C” need to be resized and modified. This section focuses on the modifications of the neighborhood structure, demonstrated using the following diagram.

Figure 13: 2D output grid units map of 3 layers of CSSOM


In Figure 13, in order to distinguish the cluster units in different spheres, we use different colors, which are also used in the following explanation: blue represents Sphere 1, yellow Sphere 2, and red Sphere 3. In Sangole & Leontitsis's SSOM code there is only one sphere, which we take to be Sphere 1. If a in Sphere 1 is the winning neuron, then in a's neighborhood list, for r = 1 (r for radius) the immediate neighbors are b…g (from b to g); for r = 2 the neighbors exactly 2 away are h…s; for r = 3 the neighbors exactly 3 away are t…k2. Normally, the initial value of r depends on the number of units, and spreads over half of the sphere [4]. If this is implemented in a CSSOM with 3 layers, neighbors of a must be added which contain the units in Sphere 2 and Sphere 3; the units in different spheres are connected to each other. Therefore, for r = 1, besides b…g in Sphere 1, the neighbors of a include a in Sphere 2 and a in Sphere 3 (Sphere 3 is also adjacent to Sphere 1, because all the spheres are considered to form a loop). For r = 2, besides h…s in Sphere 1, the neighbors also include b…g in Sphere 2 and b…g in Sphere 3, and so on. The pseudo code for updating the neurons' neighbors in the data structure is below, followed by a minimal sketch of the r = 1 case:

Algorithm 1: updating neurons' neighbors in the data structure

initialize the neighborhood data structure;
assign n spheres of the original neighborhood data structure C to a new one, Cnew;
// rsize is the radius size, spheres is the number of spheres,
// nsize is the number of units per sphere, rsDfl is the default radius size of C
for i = 1 to rsize do
    for each sphere j do
        for each neuron k do
            update the index of its neighbors in the same sphere;
            // initialization only expands Cnew; the neighbor indices are not yet correct
        end for
    end for
end for
if rsize >= spheres then
    rsize := spheres − 1;
end if
if spheres is divisible by 2 then
    rsize := spheres/2 − 1;
else
    rsize := (spheres + 1)/2 − 1;
end if
if rsize > rsDfl then
    for i = 1 to (rsize − rsDfl) do
        for each sphere j do
            for k = 1 to nsize do
                assign the empty value to Cnew{k + nsize·(j − 1), i + size(C, 2)};
                // prevents subscripts exceeding bounds when values are assigned in the next loop
            end for
        end for
    end for
end if
assign Cnew to a temporary variable CC;
for each sphere l do
    for each neuron i do
        for j = 1 to rsize do
            for k = 1 to j do
                if j − k = 0 then
                    add the corresponding index in the k most adjacent spheres to the current Cnew;
                else
                    add the r = j − k neighbors (taken from CC) of the corresponding neuron in the k adjacent spheres to the current Cnew;
                end if
            end for
        end for
    end for
end for
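As a minimal MATLAB sketch of the r = 1 case only (the variable names are assumptions, and the full algorithm above also merges the higher-radius rings): each unit keeps its same-sphere neighbors and gains its “twin” unit on the two adjacent spheres, with the spheres forming a loop.

    nsize = numel(C);                        % units per sphere; C{u} lists unit u's r = 1 neighbors
    Cnew  = cell(nsize*nSph, 1);             % neighbor lists for all nSph spheres
    for s = 1:nSph
        for u = 1:nsize
            same = C{u} + nsize*(s-1);       % same-sphere neighbors, re-indexed for sphere s
            prev = u + nsize*mod(s-2, nSph); % twin unit on the previous sphere (loops around)
            next = u + nsize*mod(s,   nSph); % twin unit on the next sphere (loops around)
            Cnew{u + nsize*(s-1)} = [same, prev, next];
        end
    end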

4.4 Display Schema

This program provides users with two display schemas for analyzing data from different perspectives. The “Plot ‘Chain’ Glyph” schema focuses on the sphere which the user sets as the center; the sizes of the other spheres decrease as their adjacency to the center sphere decreases. This schema is suitable for concentrating the analysis on a specified sphere and its adjacent spheres, and it is also good for analyzing the spheres as a whole. Figure 14 shows this display schema.


Figure 14: Showing sphere 3 as center, 5 spheres of “Chain Glyph” display schema

In Figure 14, there are 5 spheres, and sphere 3 is set as the center. Sphere 3 is the largest; spheres 2 and 4 are smaller; spheres 1 and 5 are the smallest, because they are the farthest from sphere 3. However, as the number of spheres increases, all the spheres become smaller. That problem can be solved by using the “Plot ‘Equal’ Glyph” display schema. The “Equal Glyph” can show an arbitrary contiguous range of equal-size spheres, which gives the user a better visual effect. The following figure shows this.

Figure 15: Start from sphere 1, end with sphere 4, 5 spheres of “Equal Glyph” display schema

In Figure 15, the glyph shows spheres 1 to 4, out of 5 spheres, at the same size. Once users have identified the appropriate spheres, the “Equal Glyph” display schema provides a good visual effect for focusing on a sphere and its adjacent spheres. Whichever display schema is used, the spheres can be rotated simultaneously, so users can analyze the complex data from any preferred angle.


4.5 Sequence Training

As mentioned above, there are two training algorithms: “Parallel Training” and “Sequence Training”. “Parallel Training” is the traditional algorithm, whose basic procedure is described in detail in Chapter 3.1. It is called parallel training here because all the spheres “see” the data at the same time; that is, a winning unit is chosen from any of the spheres for P1 in the first epoch, as shown in the first line of Table 1. In general the patterns are actually presented in a random order. This section focuses on “Sequence Training”, which is proposed here for data in sequential or chronological order. The following tables visually demonstrate how parallel training works, followed by the new sequence training. Assume there are 5 patterns and a CSSOM of 3 layers.

Epochs | S1 | S2 | S3
1      | P1 | P1 | P1
1      | P2 | P2 | P2
1      | P3 | P3 | P3
1      | P4 | P4 | P4
1      | P5 | P5 | P5
…      | …  | …  | …

Table 1: Visual table for “Parallel Training”

Times | S1 | S2 | S3
1     | P1 | P2 | P3
2     | P2 | P3 | P4
3     | P3 | P4 | P5
4     | P4 | P5 | P1
5     | P5 | P1 | P2
…     | …  | …  | …

Table 2: Visual table for “Sequence Training”

In Tables 1 and 2, Pn represents the nth pattern (input vector). In Table 1 each pattern is trained on all the spheres S1…Sn at once, whereas in Table 2 each pattern is trained on one sphere at a time, in order. The “Times” parameter shown in the main interface (Figure 5 in Chapter 4.1) is set by the user before “Sequence Training”. It is not difficult to see that after 5 “Times”, every pattern has been involved in all the spheres. Therefore, if the number of “Times” equals the number of patterns, every pattern will be involved in all the spheres once, which is equivalent to parallel training with 1 epoch; the five “Times” in Table 2 amount to the same training as one epoch in Table 1. Next, the pseudo code shows the “Sequence Training” algorithm in detail; a compact sketch of its scheduling follows.

Algorithm 2: sequence training

initialize the weights;
initialize freqnew;
get the number of neurons in a single sphere as nsize;  // used later
for each repetition (Times) idx do
    start_idx := remainder of idx divided by Psize (the number of patterns);
    // start_idx is the index of the current training pattern
    if start_idx = 0 then
        start_idx := Psize;
    end if
    for j = 1 to the number of spheres do
        subtract all weights from the start_idx-th pattern;
        calculate the normalized distance matrix;
        obtain the minimum distance in the current (jth) sphere;
        calculate the new weight for the winning neuron;
        if a neighborhood matrix is given then
            for k = 1 to r do  // r: neighborhood radius
                calculate the new weights of the neighbors;
                update the count-dependent parameter of the neighbors;
                // including the neighbors in other spheres
            end for
        end if
        update the count-dependent parameter;
        increment the training pattern's index;
    end for
end for
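The scheduling part of this algorithm can be sketched compactly in MATLAB; train_on() below is a hypothetical stand-in for the per-sphere winner-and-neighbors update of Algorithm 2, and the variable names are assumptions.

    for idx = 1:Times
        p = mod(idx, Psize);        % index of the first pattern for this repetition
        if p == 0, p = Psize; end
        for j = 1:nSpheres
            train_on(j, p);         % sphere j is trained on pattern p
            p = mod(p, Psize) + 1;  % the next sphere sees the next pattern in time order
        end
    end

With Psize = 5 and nSpheres = 3 this reproduces the schedule of Table 2: repetition 1 trains S1 on P1, S2 on P2 and S3 on P3, repetition 2 starts from P2, and so on.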

4.6 Test Suite

There are three evaluation criteria for the CSSOM: (1) distortion and color evaluation; (2) purity of clustering; (3) quantization error and topological error. Distortions and colors are used to analyze the data visually; they are adopted in most implementations of spherical SOMs and were described in detail earlier (see Chapter 3.2.2). This section concentrates on the latter two evaluations.


4.6.1 Purity of Clustering

Purity is a measure of the quality of a clustering. Assume there are k clusters (the k in k-means); the total number of items in cluster C_j is denoted |C_j|, and |C_j|_{class=i} denotes the number of items of class i assigned to cluster j. The purity of a single cluster can then be expressed as:

purity(C_j) = max_i( |C_j|_{class=i} ) / |C_j|   (4.1a)

The overall purity of a clustering solution can be expressed as:

purity = (1/N) · Σ_{j=1}^{k} max_i( |C_j|_{class=i} )   (4.1b)

In 4.1b, N is the total number of items. The example below illustrates how to calculate the purity in detail.

Figure 16: The distribution of items in clusters

In Figure 16, the majority class sizes for the 3 clusters are 5 in cluster 1, 3 in cluster 2 and 4 in cluster 3; therefore, the purity is (1/21)·(5+3+4) ≈ 0.57.
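A hedged MATLAB sketch of equation 4.1b (clusterIdx and classIdx are assumed variable names giving, for each item, its assigned cluster and its true class):

    function p = clustering_purity(clusterIdx, classIdx)
        total = 0;
        for j = unique(clusterIdx(:))'
            members = classIdx(clusterIdx == j);                    % items in cluster j
            total = total + max(histcounts(categorical(members)));  % majority class size
        end
        p = total / numel(clusterIdx);                              % Eq. 4.1b
    end

For the 21 items of Figure 16 this returns (5+3+4)/21 ≈ 0.57.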

4.6.2 Quantization Error and Topological Error

Quantization error and topological error are two widely used measurements of the quality of a Self-Organizing Map. First, the quantization error evaluates how well the map fits the input patterns. It is the average distance between each data vector and its best matching unit (winning neuron):

QE = (1/n) · Σ_{j=1}^{n} || x_j − m_{x_j} ||   (4.2)

In formula 4.2, n is the number of input vectors, x_j is the jth input vector, and m_{x_j} is the best matching unit of the jth input vector. However, the quantization error cannot describe the topological order of the map; in other words, it cannot measure topology preservation. Topology preservation measures how continuous the mapping from the input space to the map grid is. The topological error is used to evaluate the complexity of the output space: “This error measures the proportion of all data vectors for which first and second best-matching units (BMU) are not adjacent vectors” [14]. The formula is expressed as:

TE = (1/n) · Σ_{j=1}^{n} u(x_j)   (4.3a)

u(x_j) = 0 if data vector x_j's first and second best matching units are adjacent; 1 if they are not adjacent.   (4.3b)

In 4.3a and 4.3b, x_j is the jth input vector and n is the number of input vectors. Therefore, a lower topological error means better topology preservation. Conversely, a high topological error indicates that the output space is complex and the topology is hard to preserve, which suggests reducing the size of the network [3]: as the network shrinks, the probability that the first- and second-best matching units are adjacent in the output space increases, so TE becomes lower.
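The two measures can be sketched together in MATLAB (a hedged sketch: X holds the n×d inputs, W the m×d unit weights, and C the per-unit neighbor lists; these names are assumptions, not the thesis code):

    function [qe, te] = som_errors(X, W, C)
        n = size(X, 1); qe = 0; te = 0;
        for j = 1:n
            d = sqrt(sum((W - X(j,:)).^2, 2));   % distance from x_j to every unit
            [ds, order] = sort(d);
            qe = qe + ds(1);                     % distance to the BMU (Eq. 4.2)
            if ~ismember(order(2), C{order(1)})  % second BMU not adjacent to the first
                te = te + 1;                     % u(x_j) = 1 (Eq. 4.3b)
            end
        end
        qe = qe / n; te = te / n;                % averages of Eqs. 4.2 and 4.3a
    end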

5. Experiments and Results

5.1 Experiment 1: The quality of SOMs

5.1.1 Description of the experiment

In this experiment, the main task is to analyze and evaluate the quality of different topologies. Two tables below are used to examine the SOMs' quality, each with a different emphasis. The first compares the quantization errors and topological errors of Kohonen's SOM, Sangole's S-SOM and the concentric SSOM using the dataset “IRIS”. The second focuses on comparing S-SOM and CSSOM using the more complex dataset “ECSH”.

Dataset Description:

IRIS: The dataset consists of 150 patterns, of which 50 are Iris Setosa, 50 Iris Versicolour and 50 Iris Virginica. Each input has four attributes: sepal length, sepal width, petal length and petal width. More detailed descriptions and the data files can be found in the UCI (University of California, Irvine) machine learning repository¹.

ECSH: The data set is for a participant in a larger study who read paragraphs in Easy, Calm, Stressful, Hard order. There are two sheets, one with data for a Calm paragraph and the other for a Stressful one; in this experiment the Calm paragraph data is used. It consists of 60 Hz recordings spanning one minute. There are 3,641 input vectors, each with 7 attributes:

xGaze - x gaze point on a 1680 (width) x 1050 (height) monitor
yGaze - y gaze point on a 1680 x 1050 monitor
pupilLDiam - pupil diameter of the left eye
pupilRDiam - pupil diameter of the right eye
ecg - ECG
gsr - GSR
bp - blood pressure

The table below summarizes the datasets.

Dataset | Number of input dimensions | Number of observations | Missing values | Data characteristics
IRIS    | 4 | 150   | no  | Multivariate
ECSH    | 7 | 3,641 | yes | Time sequence

Table 3: Summary of datasets

5.1.2 Experiment Process and Discussion of Results

First, we use the “IRIS” dataset to compare the quantization errors and topological errors of SOM, SSOM and CSSOM. The parameter “epoch” is set to 44 and the parameter “neighborhood size(s)” to 0.5. These parameters are set according to the nature of the dataset; “unfortunately, inferences in this regard cannot be made based on existing methods for estimating the SOFM parameters” [1], so the values selected are the best ones found in repeated experiments. For ease of comparison between the topologies, a similar number of units is selected in each. The appropriate number is 162; for the CSSOM this means 14 spheres of a 12-unit grid (168 units, the closest to 162). Every value in Table 4 is the average over 20 repeated runs. The results are shown below:

ERRORS | SOM   | SSOM  | CSSOM(14S)
QE     | 0.257 | 0.213 | 0.253
TE     | 0.027 | 0.018 | 0.018

Table 4: QE and TE of SOMs using dataset “IRIS”

¹ http://archive.ics.uci.edu/ml/index.html


Table 4 shows that both QE and TE of SSOM and CSSOM (14 spheres of a 12-unit grid) are lower than those of the conventional SOM. This is because SSOM and CSSOM solve the “border effect” problem, so every unit has much the same chance to update its weights, and the errors decrease as a result. However, the QE of the CSSOM is slightly higher than the SSOM's, probably because the neighborhood structure becomes more complex in the CSSOM. There is no difference in TE. To compare SSOM and CSSOM more thoroughly, we use more complex data: the “ECSH” dataset, which has 3,641 patterns. The appropriate number of units is therefore 2,562 in SSOM, as the count closest to the number of input patterns, and for each CSSOM we select a number of units as close as possible to 2,562. The parameter “epoch” is set to 20 and the parameter “neighborhood size(s)” to 0.5. The results are shown below:

ERRORS | SSOM   | CSSOM(4S) | CSSOM(15S) | CSSOM(61S) | CSSOM(214S)
QE     | 193.77 | 381.71    | 188.25     | 587.73     | 426.14
TE     | 0.101  | 0.328     | 0.179      | 0.092      | 0.105

Table 5: QE and TE of SSOM and CSSOM using dataset “ECSH”

In Table 5, SSOM is a single sphere with a 2,562-unit grid map; CSSOM(4S) is 4 spheres of a 642-unit grid (4×642 = 2,568); CSSOM(15S) is 15 spheres of a 162-unit grid (15×162 = 2,430); CSSOM(61S) is 61 spheres of a 42-unit grid (61×42 = 2,562); and CSSOM(214S) is 214 spheres of a 12-unit grid (214×12 = 2,568). It is interesting to find that the TE in Table 5 is directly related to the average number of neighbors per unit shown in Figure 17. CSSOM(61S) has the smallest topological error (TE) because its average number of neighbors per unit is the largest: as the average number of neighbors per unit increases, TE decreases. This is because TE measures the proportion of all data vectors for which the first and second best-matching units (BMUs) are not adjacent, and the probability of them being adjacent increases with the number of neighbors per unit (given the same or similar network size). At the same time, Table 5 shows that the QE of CSSOM(15S) is lower than SSOM's, whereas the other CSSOMs' are higher. Figure 18 shows that the number of units with differing neighborhood counts decreases as the number of units in a single layer decreases. All the units in CSSOM(214S) have the same number of neighbors, whereas the units in SSOM have 30 different neighborhood counts (see Appendix A). The larger the number of units with differing neighbor counts, the worse the uniformity; CSSOM(214S) is therefore the most uniform, and the others less so. The uniformity of the arrangement is one of the factors affecting QE: as the number of units with differing neighbor counts grows, more and more units have unequal chances of being updated, which hinders the formation of the expected similar regions of the data space. At the same time, other factors, such as the complexity of the unit-grid map and the connections between the units, also affect QE. This might be why CSSOM(15S) has the smallest QE.

Figure 17: The average number of neighborhoods per unit in SSOM and CSSOM

Figure 18: The number of units with different neighborhoods. The details are in Appendix A

Next, we analyze the quality of SSOM and CSSOM(15S) through visual representations. As mentioned in Chapter 3.2.2, the representations of distortions and colors reflect the magnitude of the similarity measure. The representation in Figure 19 is from the “SSOM” column of Table 5 above, from one of the runs, with a QE of 75.96 and a TE of 0.116.

[Data plotted in Figures 17 and 18, in the order SSOM, CSSOM(4S), CSSOM(15S), CSSOM(61S), CSSOM(214S): average number of neighborhoods per unit 1498, 376, 684, 1595, 1267; number of units with different neighborhoods 30, 7, 4, 2, 1.]


Figure 19: Visual view of SSOM from different angles

Compared to the SSOM view, which reflects the connections and similarity within a single sphere, the visual view of the CSSOM additionally gives a more visible representation of the connections between the spheres. Figure 20 shows the view of “CSSOM(15S)” from Table 5 above, from one of the runs, with a QE of 30.69 and a TE of 0.176.

Figure 20: Visual view of CSSOM(15S) from sphere 1 to sphere 15

Figure 20 clearly shows that there are more similarities among spheres 4 to 11.

5.2 Experiment 2: Time sequence training

5.2.1 Description of the experiment

Because the standard SOM has no way to represent sequential data, but multiple layers do, we propose sequence training for the CSSOM and observe its effects and results. Our expectations are twofold: (1) better TE or QE; (2) normal distributions of colors and distortions in the spheres, meaning that there are obvious areas with distinct colors and distortions in one sphere, or that the glyph shows consecutive spheres with similar distributions of colors and distortions. The experiment is therefore divided into two sections to evaluate the results and the effects. The dataset selected is “ECSH”, the same as in the previous experiment; Chapter 5.1.1 has the detailed description.

5.2.2 Experiment Process and Discussion of Results

Using the results from the traditional training (called parallel training here), we take the experiment above with the best results (Table 5) and effects (Figure 20), namely CSSOM(15S). For ease of comparison with that parallel training, the parameter “Epochs” is set to 20, “Times” (repetitions) is set to 3,641, and “Size” (the neighborhood size parameter) and “Spheres” are set to 0.5 and 15 respectively. The units-grid map selected is ICOSA2 (162 units). The table below shows the average TE and QE after equivalent parallel training and sequence training.

ERRORS   Parallel Training   Sequence Training
QE       188.25              1511.522
TE       0.179               0.496

Table 6: QE and TE of parallel training and sequence training using dataset “ECSH”

In Table 6, it is disappointing to find that neither QE nor TE shows any improvement with the sequence training. Furthermore, QE in sequence training is about 8 times higher than QE in parallel training, and TE in sequence training is approximately 3 times higher than TE in parallel training. Figure 20 above shows one of CSSOM's runs in parallel training, with a QE of 30.69 and a TE of 0.176, whereas, under the same conditions, Figure 21 shows one in sequence training, with a QE of 716.32 and a TE of 0.409. In Figure 20, we can find obviously different areas in the same sphere and corresponding similarities among the spheres. In Figure 21, however, there are no obvious similarities among the spheres. Figure 22 clearly shows that both colors and distortions are irregular and inconsistent from sphere 5 to sphere 10.


Figure 21: Visual view of CSSOM in sequence training from sphere 1 to sphere 15

Figure 22: Visual view of CSSOM in sequence training from sphere 5 to sphere 10

We could not obtain the desired results and effects in sequence training, possibly because of the number of epochs, the size of the units-grid map, or even the sequence training algorithm itself. An experiment is described below to validate the assumption that it is not the sequential order of the data as such that is the problem. In the default parallel training above, the patterns are randomly selected in every epoch. To observe the differences, we modify the selection so that the patterns are presented in the dataset's order. All the parameters keep the same values as in parallel training. Table 7 below shows the TE/QE averaged over 20 repeated runs:


ERRORS   Parallel Training (random order)   Parallel Training (time order)
QE       188.25                             443.47
TE       0.179                              0.186

Table 7: QE and TE of parallel training in random order and in time order using dataset “ECSH”

The QE of the parallel training in time order (data order) is much higher than that of the parallel training in random order, and there are only slight differences in TE between them. These results imply that selecting the patterns randomly during training is good for the quality of the SOM. Figure 23 shows an example of parallel training in time order, with a QE of 352.20 and a TE of 0.206.

Figure 23: Visual view of CSSOM in parallel training in time order, from sphere 1 to sphere 15

Compared to the training in random order, the training in time order (the dataset's order) has fewer spheres with similarities. Despite this, there are still obviously distinguishable color areas in each sphere. In short, these results suggest that the sequence training as proposed is not suitable for CSSOM. Since the QE of the parallel training in time order is still much lower than that of sequence training, this also suggests that the sequence training itself can be improved.
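The difference between the two presentation schemes compared in Table 7 amounts to a single shuffle per epoch. A minimal sketch, assuming a hypothetical "train_one" routine that performs the BMU search and weight update for one pattern:

import random

def train_epochs(patterns, train_one, epochs, random_order=True):
    """Present every pattern once per epoch, shuffled or in dataset (time) order."""
    for _ in range(epochs):
        order = list(range(len(patterns)))
        if random_order:
            random.shuffle(order)  # "parallel training (random order)"
        # Otherwise the dataset's own order is kept: "parallel training (time order)".
        for i in order:
            train_one(patterns[i])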


5.3 Experiment 3: The purity of clustering using CSSOM with different numbers of spheres

5.3.1 Description of the experiment

Purity is an evaluation measure of the quality of clustering; Chapter 4.6.1 describes it in detail. In this section, we conduct an experiment to observe how the quality of clustering varies as the number of spheres increases. The “IRIS” dataset is used in this experiment: 105 of its 150 patterns are used as the training set (35 Iris Setosa, 35 Iris Versicolour and 35 Iris Virginica), and the remaining 45 patterns are used for testing. The patterns in the testing set have labels for the output (0 for Setosa, 1 for Versicolour and 2 for Virginica). The k-value is set to 3 in all the following experiments.
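As a reminder of the measure, here is a minimal sketch of purity as used below: each cluster is credited with its majority label, and purity is the fraction of test patterns covered by those majorities. The clustering step that produces "cluster_ids" (presumably using the k = 3 setting above) is assumed rather than reproduced.

from collections import Counter

def purity(cluster_ids, labels):
    """Purity of a clustering over labeled test patterns (0, 1 or 2 for IRIS)."""
    per_cluster = {}
    # Count the label distribution inside each cluster.
    for c, y in zip(cluster_ids, labels):
        per_cluster.setdefault(c, Counter())[y] += 1
    # Each cluster contributes the size of its majority class.
    majority_total = sum(counts.most_common(1)[0][1] for counts in per_cluster.values())
    return majority_total / len(labels)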

5.3.2 Experiment Process and Discussion of Results

First, we use default SSOM structures with different sphere sizes (ICOSA0, ICOSA1, ICOSA2, ICOSA3 and ICOSA4) to test the purity of SOMs on the “IRIS” dataset.
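The ICOSAn grids appear to be geodesic subdivisions of an icosahedron; assuming so, the unit counts quoted throughout the report follow the closed form 10 * 4^n + 2, which a short check can verify:

def icosa_units(n):
    """Vertex count of an icosahedron subdivided n times: 10 * 4**n + 2."""
    return 10 * 4 ** n + 2

# ICOSA0..ICOSA4 -> [12, 42, 162, 642, 2562], matching the sphere sizes used here.
print([icosa_units(n) for n in range(5)])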

SSOM     ICOSA0   ICOSA1   ICOSA2   ICOSA3   ICOSA4
Purity   89.67%   94.11%   94.35%   91.89%   92.00%

Table 8: The purity of SSOMs with different sphere sizes

From Table 8, we find that ICOSA2 (162 units) is the most appropriate for clustering the “IRIS” dataset, rather than ICOSA4. In other words, as the number of units increases, the purity does not necessarily continue to increase. Therefore, the following experiment selects CSSOMs with the same or a similar total number of units, close to 162, but with different numbers of spheres. It focuses on comparing the purity of 1-sphere ICOSA2 (162 units), 4-sphere ICOSA1 (42 units each, 168 in total), and 14-sphere ICOSA0 (12 units each, 168 in total). The parameter “epoch” is set to 20, and the parameter “neighborhood size(s)” is set to 0.5. Table 9 shows the corresponding purity of clustering averaged over 20 runs.

CSSOM    ICOSA2(1S)   ICOSA1(4S)   ICOSA0(14S)
Purity   94.35%       92.52%       92.12%

Table 9: The purity of CSSOMs with different numbers of spheres

In Table 9, it is easy to see that the quality of clustering becomes worse as the number of spheres in CSSOM increases. The reason might be that, as the number of spheres increases, the distribution of the clusters becomes more and more complicated.


6. Conclusion and Future Work

6.1 Conclusion

In this report, we propose a new approach to the arrangement of neurons in spherical SOMs: CSSOM with an arbitrary number of spheres. Users can select an arbitrary number of spheres according to the objectives of their experiments. First, CSSOM has better quality than the conventional SOM, since it has lower QE and TE. Second, the QE and TE of SSOM and of CSSOMs with different numbers of spheres (but approximately the same total number of units) differ. If better topology preservation and greater uniformity of the units' neighborhoods are desired, a CSSOM with fewer units per sphere is recommended; however, the input patterns might then not fit the neural map well (high QE). Users should therefore strike a balance between the two according to their demands. Finally, the results of the experiments show that the sequence training we proposed at the present stage for CSSOM needs improvement.

6.2 Future Work

In this project, we focused on implementation rather than experiments, so the datasets used are not diverse enough; more experiments with diverse datasets should be conducted in the future. Moreover, the corresponding points' X (the Cartesian coordinates of the points on the sphere) are the same in every sphere in the initialization phase. This is not reasonable, because it might influence QE; X's initialization requires a small-scale adjustment according to the length of R (the neighborhood radius). It is also future work to make r (the neighborhood radius parameter) and the number of epochs automatically adjustable depending on the dataset's characteristics, in order to achieve optimum performance. Finally, although the experiments show that the sequence training we proposed for time series data in CSSOM does not work well, improvements are possible. For example, for each epoch the first pattern could be selected randomly, while the patterns' order (the time series) is kept sequential, as sketched below. Future work should also involve experiments with many datasets to provide a rule of thumb for choosing the sphere size and the number of units; for example, it may be better to find an appropriate sphere size first, and then try multiple spheres.
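The suggested improvement keeps the time-series order intact while removing the fixed starting point. A minimal illustration, where "train_one" is again a hypothetical single-pattern update routine:

import random

def train_epoch_rotated(patterns, train_one):
    """One epoch: random starting pattern, but the sequential (time) order is kept."""
    start = random.randrange(len(patterns))
    for offset in range(len(patterns)):
        # Circular shift through the dataset, beginning at the random start.
        train_one(patterns[(start + offset) % len(patterns)])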




Appendix A

The following tables show the units with different numbers of neighbors in SSOM, CSSOM(4S), CSSOM(15S), CSSOM(61S) and CSSOM(214S).

SSOM: 2,562-unit grid map²

neighbors  1320  1351  1374  1378  1400  1401  1420  1421  1422  1435
number       12    60    60    60   120    60    60    60   120    60

neighbors  1440  1442  1446  1453  1454  1456  1459  1461  1464  1470
number      120   120    60    60   120    30   120    60   120   120

neighbors  1472  1476  1480  1482  1486  1488  1490  1494  1496  1498
number      120   120   120    60   120    60   120   120    60    60

CSSOM(4S): 4 spheres of a 642-unit grid map (4 × 642 = 2,568)

neighbors  357  368  375  378  384  386  388
number     240  480  240  600  480  240  240

CSSOM(15S): 15 spheres of a 162-unit grid map (15 × 162 = 2,430)

neighbors  624  673  690  704
number     180  900  450  900

CSSOM(61S): 61 spheres of a 42-unit grid map (61 × 42 = 2,562)

neighbors  1457  1644
number      732  1830

CSSOM(214S): 214 spheres of a 12-unit grid map (214 × 12 = 2,568)

neighbors  1267
number     2568

² “neighbors” represents the number of neighbors; “number” is the number of units with that many neighbors. For example, in CSSOM(4S), there are 240 units with 375 neighbors, 480 units with 368 neighbors, and so on.