Dr. P. Nagabhushan
Dr. P. Nagabhushan, Director
Indian Institute of Information Technology, Allahabad-211 015
Email: [email protected]
Incremental Learning
Learning is the cognitive process of acquiring knowledge.
If learning is performed using the entire data in one stretch, it is one-shot learning, or learning at one go. This is possible only when the entire data is available.
However:
The volume of data could be so high that the entire data mass cannot be processed in one stretch.
The availability of data could itself depend on the data arriving at delayed time instants.
The required data might have to be pooled in from different sources.
Hence, one should move from traditional one-shot learning to Incremental Learning.
Concept of Incremental Cognition
The process of deriving knowledge in a phased manner, without re-indenting the past data, is conceived as Incremental Learning.
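To make the idea concrete, here is a minimal Python sketch of phased learning. It assumes, purely for illustration, that the "knowledge" is a small summary (count, mean, minimum, maximum) of one numeric feature; the class name `RunningSummary` and the chunking of the data are hypothetical and not taken from the slides. Each arriving chunk updates the summary, and earlier chunks are never read again.

```python
from dataclasses import dataclass


@dataclass
class RunningSummary:
    """Knowledge kept between phases (an illustrative choice, not the author's model)."""
    count: int = 0
    mean: float = 0.0
    low: float = float("inf")
    high: float = float("-inf")

    def update(self, chunk):
        """Fold one newly arrived chunk into the summary without storing it."""
        for x in chunk:
            self.count += 1
            self.mean += (x - self.mean) / self.count  # incremental mean update
            self.low = min(self.low, x)
            self.high = max(self.high, x)
        return self


# Hypothetical arrival of the same ten values in three phases.
chunks = [[5.1, 4.9, 4.7], [4.6, 5.0, 5.4, 4.6], [5.0, 4.4, 4.9]]
summary = RunningSummary()
for chunk in chunks:
    summary.update(chunk)

# One-shot reference, used only to check the incremental result.
one_shot = [x for chunk in chunks for x in chunk]
print(summary.count, summary.low, summary.high)
```

The incremental mean agrees with the one-shot mean up to floating-point rounding, which is the property the phased approach relies on.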
Proposed Incremental Learning Models
ZM-Large Volume Incremental Learning (LVIL) through Sequence Compulsive Incremental Learning for temporal arrival of data
PM-LVIL through Sequence Compulsive Incremental Learning (SCIL) model for temporal arrival of data
Sequence Optional Incremental Learning (SOIL) model using Kruskal's way of merging for distributed data
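The slides only name "Kruskal's way of merging" and do not spell it out. The following Python sketch shows one possible reading, under stated assumptions: pairwise distances between knowledge units from different sources are sorted in ascending order, and a pair is joined when its distance is within a threshold and the two units are not already in the same group. The function name `kruskal_merge`, the labels, the distances and the threshold are all hypothetical.

```python
def kruskal_merge(distances, threshold):
    """Group items by joining pairs in ascending order of distance.

    distances: dict mapping (item_i, item_j) to a distance.
    Assumption: pairs farther apart than `threshold` are never joined.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving; registers unseen items.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), d in sorted(distances.items(), key=lambda item: item[1]):
        root_i, root_j = find(i), find(j)
        if d <= threshold and root_i != root_j:
            parent[root_j] = root_i  # merge the two groups

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical distances between four knowledge units.
example = {("A", "B"): 0.05, ("B", "C"): 0.10, ("A", "C"): 0.12, ("C", "D"): 0.40}
groups = kruskal_merge(example, threshold=0.15)
print(groups)
```

Because the pairs are processed by distance rather than by arrival order, the result does not depend on the sequence in which the sources are visited, which is one plausible sense of "sequence optional".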
[Figure: NILE DELTA image, four quadrants / 4 blocks]
[Table: A typical symbolic table. Features f1 to f5; the cells hold mixed value types such as qualitative labels (good, excellent), intervals ((28-42), (36-49)) and single numbers (8.42, 28.21, 2).]
Symbolic Data Objects - Construction
A conventional feature table
Sample  f1   f2
1       5.1  3.5
2       4.9  3.0
3       4.7  3.2
4       4.6  3.1
5       5.0  3.6
6       5.4  3.9
7       4.6  3.4
8       5.0  3.4
9       4.4  2.9
10      4.9  3.1
Simple interval type
f1 = (4.4 - 5.4), f2 = (2.9 - 3.9)
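The simple interval is just the range of each feature over the ten samples. A minimal Python sketch, using the f1 and f2 values of the conventional feature table (the function name `simple_interval` is illustrative):

```python
# Feature values of the ten samples in the conventional feature table.
f1 = [5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9]
f2 = [3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1]


def simple_interval(values):
    """Summarise a feature by its (minimum, maximum) interval."""
    return (min(values), max(values))


print(simple_interval(f1))  # (4.4, 5.4)
print(simple_interval(f2))  # (2.9, 3.9)
```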
Interval type with the nominal feature values
Note: Nominal feature value = the most expected feature value
Nominal values: f1 = 4.9, f2 = 3.3
f1 = (4.4  4.9  5.4), f2 = (2.9  3.3  3.9)
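The slide defines the nominal value only as "the most expected feature value". The sketch below assumes it is the arithmetic mean rounded to one decimal place; that assumption happens to reproduce the values 4.9 and 3.3 quoted for this table, but the author may have used a different estimator. The function name is illustrative.

```python
f1 = [5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9]
f2 = [3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1]


def interval_with_nominal(values, digits=1):
    """Return (minimum, nominal, maximum).

    Assumption: the nominal value is the arithmetic mean, rounded.
    """
    nominal = round(sum(values) / len(values), digits)
    return (min(values), nominal, max(values))


print(interval_with_nominal(f1))  # (4.4, 4.9, 5.4)
print(interval_with_nominal(f2))  # (2.9, 3.3, 3.9)
```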
f1 = { (4.4 - 4.6, 0.08), (4.6 - 4.8, 0.25), (4.8 - 5.0, 0.33), (5.0 - 5.2, 0.25), (5.2 - 5.4, 0.08) }
f2 = { (2.8 - 3.1, 0.28), (3.1 - 3.4, 0.35), (3.4 - 3.7, 0.28), (3.7 - 4.0, 0.07) }
S = { f1, f2 }
More complex Symbolic object (in terms of probability distribution)
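A distribution-type feature can be sketched as a list of (interval, probability) pairs. The Python example below uses plain relative-frequency counting over fixed bin edges, which is an assumption: applied to the ten f1 samples it yields 0.1, 0.3, 0.2, 0.3, 0.1, not the probabilities shown on the slide, so the slide values presumably come from a different estimate. The function name `histogram_feature` is illustrative.

```python
def histogram_feature(values, edges):
    """Return [((low, high), relative_frequency), ...] for the given bin edges.

    Bins are half-open [low, high), except the last, which includes its upper edge.
    Values outside the edges are not assigned to any bin.
    """
    counts = [0] * (len(edges) - 1)
    for x in values:
        for k in range(len(counts)):
            is_last = k == len(counts) - 1
            if edges[k] <= x < edges[k + 1] or (is_last and x == edges[-1]):
                counts[k] += 1
                break
    total = len(values)
    return [((edges[k], edges[k + 1]), counts[k] / total) for k in range(len(counts))]


f1 = [5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9]
f1_hist = histogram_feature(f1, [4.4, 4.6, 4.8, 5.0, 5.2, 5.4])
for interval, prob in f1_hist:
    print(interval, prob)
```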
Symbolic features through histograms
A Histo Distance Measure for Distribution type of Features
Let us consider the case where an n-dimensional SYMBOLIC OBJECT has all ‘n’ features of histogram type / probability distribution type.
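A = [ a_i ], i = 1, ..., n
[Figure: probability distribution p of a feature, plotted over the feature range from a̲ to ā]
[Figure: Cumulative Probability Distribution P over the same range]
[Figure: Normalized Cumulative Probability Distribution P over the same range]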
What are the special advantages?
The cumulative probability function never decreases, so its slope is never negative.
It starts at (a̲, 0) and ends at (ā, 1), so that a̲ ≤ a ≤ ā and 0 ≤ P ≤ 1.
We can linearize the cumulative probability function through regression:
P = m a + c
This linear equation represents one feature of object A.
For the corresponding feature of object B, the linear equation is
P = m′ a + c′
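The linearization can be sketched in Python as follows. The sketch assumes that the regression is an ordinary least-squares fit through the points (upper bin edge, normalized cumulative probability), together with the starting point (lower limit, 0); the slides do not state which points are regressed, so this choice and the function names are illustrative. The bins used are the f1 histogram quoted earlier in the slides.

```python
def cumulative_points(bins):
    """bins: [((low, high), prob), ...] -> [(a, P), ...] running from P = 0 to P = 1."""
    total = sum(p for _, p in bins)          # normalise so the curve ends at exactly 1
    points = [(bins[0][0][0], 0.0)]
    running = 0.0
    for (_, high), p in bins:
        running += p
        points.append((high, running / total))
    return points


def fit_line(points):
    """Ordinary least-squares fit of P = m * a + c."""
    n = len(points)
    mean_a = sum(a for a, _ in points) / n
    mean_p = sum(p for _, p in points) / n
    sxx = sum((a - mean_a) ** 2 for a, _ in points)
    sxy = sum((a - mean_a) * (p - mean_p) for a, p in points)
    m = sxy / sxx
    return m, mean_p - m * mean_a


f1_bins = [((4.4, 4.6), 0.08), ((4.6, 4.8), 0.25), ((4.8, 5.0), 0.33),
           ((5.0, 5.2), 0.25), ((5.2, 5.4), 0.08)]
points = cumulative_points(f1_bins)
m, c = fit_line(points)
print(m, c)
```

Because this histogram is symmetric about 4.9, the fitted line passes through (4.9, 0.5).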
Both the straight lines may be inserted in a single P Vs feature value graph
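[Figure: the two lines on a common P versus feature value graph, enclosing the quadrilateral T U V W]
The distance between the two objects in the i-th feature is the area enclosed by the quadrilateral T U V W.
Therefore,
Distance(A, B) = [ Σ (Area_i)^p ]^(1/p), summed over i = 1, ..., n
If p = 1, it is the City Block distance; if p = 2, it is the Euclidean distance.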
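The combination of the per-feature areas can be written directly in Python. The areas used in the example are hypothetical numbers, chosen only to show the two values of p named above; the function name is illustrative.

```python
def object_distance(areas, p=1):
    """Combine per-feature areas: [sum(area_i ** p)] ** (1 / p)."""
    return sum(area ** p for area in areas) ** (1.0 / p)


areas = [3.0, 4.0]  # hypothetical areas for a two-feature object
print(object_distance(areas, p=1))  # 7.0 (City Block)
print(object_distance(areas, p=2))  # 5.0 (Euclidean)
```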
Different Possibilities
Exactly overlapping lines: D(A,B) = 0
Lines that do not intersect: D(A,B) = area enclosed
Lines that intersect, enclosing the areas a1 and a2: D(A,B) = a1 + a2
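Special Cases
If a_i = (a̲_i, ā_i) is an interval type of feature, then it is treated as an equiprobable distribution.
[Figure: probability p and cumulative probability P of an equiprobable distribution between a̲_i and ā_i]
If a_i is a single-valued numeric type, then
[Figure: probability p and cumulative probability P at the single value a_i]
In such a case, the distance between the two objects A and B is
[Figure: cumulative probability P at the single values a_i and b_i]
D(A, B) = area of the rectangle
= |b_i - a_i| * 1
= |b_i - a_i|
= conventional City Block distance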
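Fig 10(a): Line L1 and Line L2 between P = 0 and P = 1, without intersection; Dis1 and Dis2 are the separations between the two lines.
Distance(Line L1, Line L2) = (Dis1 + Dis2) / 2
Fig 10(b): Line L1 and Line L2 between P = 0 and P = 1, intersecting at (xi, yi).
Distance(Line L1, Line L2) = (Dis1 * yi + Dis2 * (1 - yi)) / 2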
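Both formulas can be combined in one small Python function. The sketch assumes that each line is given as (m, c) in P = m a + c with m > 0, that Dis1 is the separation of the lines at P = 0 and that Dis2 is the separation at P = 1; the figures are not available in this transcript, so that labelling is an assumption, as is the function name `line_distance`.

```python
def line_distance(line1, line2):
    """Area between two lines P = m * a + c over the band 0 <= P <= 1.

    Each line is a pair (m, c) with m > 0.
    """
    (m1, c1), (m2, c2) = line1, line2
    # Signed separation of the lines along the feature axis at P = 0 and P = 1.
    gap0 = (0 - c1) / m1 - (0 - c2) / m2
    gap1 = (1 - c1) / m1 - (1 - c2) / m2
    dis1, dis2 = abs(gap0), abs(gap1)
    if gap0 * gap1 >= 0:
        # Fig 10(a): no crossing inside the band, a trapezium of height 1.
        return (dis1 + dis2) / 2
    # Fig 10(b): the lines cross at P = yi, giving two triangles.
    yi = dis1 / (dis1 + dis2)
    return (dis1 * yi + dis2 * (1 - yi)) / 2


print(line_distance((1.0, 0.0), (1.0, 0.0)))    # 0.0, exactly overlapping
print(line_distance((1.0, 0.0), (1.0, -2.0)))   # 2.0, parallel lines two units apart
print(line_distance((1.0, 0.0), (0.5, 0.25)))   # 0.25, lines crossing at P = 0.5
```

The crossing height yi follows from the fact that the signed separation changes linearly with P, so it reaches zero at Dis1 / (Dis1 + Dis2).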
A simple distance measure, which is quite lucid to comprehend and computationally easy to implement, is devised for distribution/histogram type of symbolic objects.
An Image Processing Application
[Figures: sample images: NILE DELTA (four quadrants / 4 blocks), WALL PAPER, THAR DESERT, SIMULATED]
Map of Australia
Source: Google
Analysis of Image
The size is (4320 X 5250 X 3) pixels, which implies 4320 rows and 5250 columns of pixels in each of the Red, Green, Blue (RGB) layers of the image.
If each pixel is represented by an unsigned integer of, say, 1 byte, we need 6,80,40,000 (68,040,000) bytes!
Learning from such a huge image in one go is a heavy burden and needs very large computational resources.
An easier alternative could be to re-size the image and learn from it. But by re-sizing, the quality of the image is compromised.
In applications such as weather forecasting and medical image analysis, one cannot afford to lose the quality of images that have been obtained with great difficulty and at an exorbitant cost.
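The storage figure follows from the image dimensions. A short Python check, assuming 1 byte per pixel value as stated above (the variable names are illustrative):

```python
rows, cols, layers = 4320, 5250, 3
bytes_per_value = 1  # one unsigned 8-bit integer per pixel per layer

total_bytes = rows * cols * layers * bytes_per_value
print(total_bytes)  # 68040000
```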
Selecting the region of Interest for learning
Size: 4000X4800X3
Size of Each Frame
Resized Image: 4000 X 4800 X 3
For vertical fragments (frames), the number of columns is divided by the number of frames.
If we want, say, 8 frames, 4800/8 = 600.
The size of each frame will be 4000 X 600 X 3.
Vertical fragmentation of the Australia map into 8 frames, each of size (4000 X 600 X 3)
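The vertical fragmentation amounts to cutting the column range into equal strips. A minimal Python sketch, using 0-based half-open column ranges (an indexing convention chosen here, not stated in the slides) and an illustrative function name:

```python
def frame_column_ranges(total_cols, num_frames):
    """Return one (start, stop) column range per frame, 0-based and half-open."""
    if total_cols % num_frames != 0:
        raise ValueError("columns must divide evenly into frames")
    width = total_cols // num_frames
    return [(k * width, (k + 1) * width) for k in range(num_frames)]


ranges = frame_column_ranges(4800, 8)
print(ranges[0], ranges[-1])  # (0, 600) (4200, 4800)
```

Each frame keeps all 4000 rows and all 3 layers, so its size is 4000 X 600 X 3.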
Dividing each frame into blocks of fixed size
Size of each block
The choice of block size is not the central issue of this research, so it is treated as beyond the scope of the present work.
For the sake of simplicity, each frame is divided into blocks of fixed size, say (50 X 50 X 3) or (100 X 100 X 3).
The Process of SCIL from spatial data
Extracted knowledge from Frame 1:
No. of Clusters = 4
No. of Outliers = 8
Slope and intercept for ‘R’ layer
Slope     Intercept
128.9632    5.6244
224.7624  -60.8968
151.2128   58.4553
163.5785   -2.0205
Slope and intercept for ‘G’ layer
Slope     Intercept
129.2686    4.6854
228.2566  -66.0935
144.7173   42.2374
164.7278    4.5665
Slope and intercept for ‘B’ layer
Slope     Intercept
128.3291   25.2985
145.0664   13.1920
146.8216   20.8907
192.8587  -47.5536
The Process of SCIL from spatial data (Cont’d)
[Figure: clusters of Frame 1, clusters of Frame 2, and the distances between the clusters of Frame 1 and Frame 2]
Clusters 1 and 5 can be merged.
Clusters 3 and 6 can be merged.
As the minimum distance is > 0.15, the clusters cannot be merged.
Clusters 2 and 5 are nearest.
Final knowledge of Frame 1 and Frame 2
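The frame-by-frame update can be sketched in Python as follows. Only the threshold 0.15 comes from the slides. The rest is assumed for illustration: each cluster is represented by a (slope, intercept) pair, a merge replaces the known cluster by the average of the two pairs, and the distance function and all numbers are hypothetical stand-ins for the area-based distance described earlier.

```python
THRESHOLD = 0.15  # merging limit quoted on the slide


def merge_frame(known, incoming, distance, threshold=THRESHOLD):
    """Fold the clusters of a newly arrived frame into the known clusters.

    known, incoming: dict mapping a cluster label to a (slope, intercept) pair.
    Returns the updated knowledge and a log of the decisions taken.
    """
    merged = dict(known)
    log = []
    for label, line in incoming.items():
        nearest = min(merged, key=lambda k: distance(merged[k], line), default=None)
        if nearest is not None and distance(merged[nearest], line) <= threshold:
            old = merged[nearest]
            # Assumed merge rule: average the two representations.
            merged[nearest] = tuple((u + v) / 2 for u, v in zip(old, line))
            log.append((nearest, label, "merged"))
        else:
            merged[label] = line  # too far from everything known: keep as a new cluster
            log.append((nearest, label, "kept separate"))
    return merged, log


def toy_distance(u, v):
    """Hypothetical distance, used only to make the example runnable."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


frame1 = {1: (1.0, 0.0), 2: (2.0, 0.0)}    # hypothetical knowledge after Frame 1
frame2 = {5: (1.05, 0.0), 6: (3.0, 0.0)}   # hypothetical clusters of Frame 2
knowledge, decisions = merge_frame(frame1, frame2, toy_distance)
print(decisions)
```

Only the merged representation is carried forward, so the pixels of earlier frames are not needed when the next frame arrives.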
Result after processing the final frame
Second Image
Test Image No. 2
Size of the image = (3292 X 4939 X 3) pixels
The portion selected for learning is row pixels 1 to 3000 and column pixels 1 to 4800 of all the RGB planes.
The image is divided into 8 frames of size (3000 X 600 X 3) pixels.
Each frame is divided into blocks of (100 X 100 X 3) pixels. Hence each frame gets divided into 30 X 6 = 180 blocks.
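The block count can be checked with a few lines of Python. The function name `block_grid` is illustrative, and the sketch assumes square blocks that tile the frame exactly.

```python
def block_grid(frame_rows, frame_cols, block_size):
    """Return (rows of blocks, columns of blocks) for square blocks tiling a frame."""
    if frame_rows % block_size or frame_cols % block_size:
        raise ValueError("block size must divide the frame dimensions")
    return frame_rows // block_size, frame_cols // block_size


rows_of_blocks, cols_of_blocks = block_grid(3000, 600, 100)
print(rows_of_blocks, cols_of_blocks, rows_of_blocks * cols_of_blocks)  # 30 6 180
```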
Result of the test image
Quiz-wise incremental update of knowledge of Subject-1 using the SCIL model
[Figures: panels for Quiz-1 to Quiz-10, showing the updated knowledge as the quizzes arrive]