Dr. P. Nagabhushan

Dr. P. Nagabhushan, Director

Indian Institute of Information Technology, Allahabad-211 015

Email: [email protected]

Incremental Learning

Learning is the cognitive process of acquiring knowledge.

If learning is performed using the entire data in one stretch, then it is one-shot learning, or learning at one go, provided the entire data is available.

However,

The volume of data could be so high that the entire data mass cannot be processed in one stretch.

The availability of data could itself depend on the arrival of data at delayed time instants.

The data required might have to be pooled in from different sources.

Hence, one should move from traditional one-shot learning to Incremental Learning.

Concept of Incremental Cognition

The process of deriving knowledge in a phased manner, without re-indenting the past data, is conceived as Incremental Learning.
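As a minimal sketch of this idea (not one of the proposed models), the snippet below keeps a small summary as the "knowledge" and folds each newly arrived chunk into it, so earlier chunks never have to be fetched again. The class name `Knowledge`, the helper `learn_incrementally`, and the choice of mean and variance as the summary are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Knowledge:
    """Hypothetical summary kept between phases; raw data is not stored."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def update(self, chunk):
        """Fold one newly arrived chunk into the summary (pairwise merge)."""
        n_b = len(chunk)
        if n_b == 0:
            return self
        mean_b = sum(chunk) / n_b
        m2_b = sum((x - mean_b) ** 2 for x in chunk)
        n = self.count + n_b
        delta = mean_b - self.mean
        self.m2 += m2_b + delta ** 2 * self.count * n_b / n
        self.mean += delta * n_b / n
        self.count = n
        return self

    @property
    def variance(self):
        """Population variance of everything seen so far."""
        return self.m2 / self.count if self.count else 0.0


def learn_incrementally(chunks):
    """Process chunks in arrival order, one at a time."""
    knowledge = Knowledge()
    for chunk in chunks:
        knowledge.update(chunk)
    return knowledge


# Illustrative chunks arriving at three different time instants.
CHUNKS = [[5.1, 4.9, 4.7], [4.6, 5.0], [5.4, 4.6, 5.0, 4.4, 4.9]]
RESULT = learn_incrementally(CHUNKS)
```

The summary obtained after the last chunk agrees with what one-shot learning over the pooled data would give, which is the property an incremental scheme is expected to preserve.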

Proposed Incremental Learning Models

ZM-Large Volume Incremental Learning (LVIL) through Sequence Compulsive Incremental Learning for temporal arrival of data

PM-LVIL through Sequence Compulsive Incremental Learning (SCIL) Model for Temporal Arrival of Data

Sequence Optional Incremental Learning (SOIL) Model Using Kruskal’s way of merging for Distributed Data

[Figure: NILE DELTA, Four Quadrants / 4 Blocks]

A typical symbolic table

[Table: features f1 to f5 with symbolic entries, for example the qualitative values good and excellent, the intervals (28-42) and (36-49), and the numeric values 8.42 and 28.21]

Symbolic Data Objects - Construction

A conventional feature table

Sample   f1    f2
1        5.1   3.5
2        4.9   3.0
3        4.7   3.2
4        4.6   3.1
5        5.0   3.6
6        5.4   3.9
7        4.6   3.4
8        5.0   3.4
9        4.4   2.9
10       4.9   3.1


Simple interval type

$f_1 = (4.4 - 5.4), \quad f_2 = (2.9 - 3.9)$


Interval type with the nominal feature values

Note: Nominal feature value = the most expected feature value

$f_1^0 = 4.9, \quad f_2^0 = 3.3$

$f_1 = (4.4,\ 4.9,\ 5.4), \quad f_2 = (2.9,\ 3.3,\ 3.9)$


More complex Symbolic object (in terms of probability distribution)

$f_1 = \{(4.4 - 4.6)\ 0.08,\ (4.6 - 4.8)\ 0.25,\ (4.8 - 5.0)\ 0.33,\ (5.0 - 5.2)\ 0.25,\ (5.2 - 5.4)\ 0.08\}$

$f_2 = \{(2.8 - 3.1)\ 0.28,\ (3.1 - 3.4)\ 0.35,\ (3.4 - 3.7)\ 0.28,\ (3.7 - 4.0)\ 0.07\}$

$S = [f_1, f_2]$


Symbolic features through histograms
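The constructions on these slides can be sketched in a few lines. The snippet below is only an illustration: the function names `interval`, `nominal`, and `histogram` are hypothetical, the nominal value is assumed to be the mean rounded to one decimal, and the bin edges are the ones printed for f1. The relative frequencies it produces come from the ten rows of the conventional feature table only and are not meant to reproduce the probabilities printed on the slide.

```python
# The ten samples of the conventional feature table.
F1 = [5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9]
F2 = [3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1]


def interval(values):
    """Simple interval type: (lowest value, highest value)."""
    return (min(values), max(values))


def nominal(values):
    """Assumed nominal value: the mean, rounded to one decimal place."""
    return round(sum(values) / len(values), 1)


def histogram(values, edges):
    """Relative frequency per bin [edges[k], edges[k+1]).

    The last bin also includes its right edge. Values outside the
    edges are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for k in range(len(counts)):
            is_last = k == len(counts) - 1
            if edges[k] <= v < edges[k + 1] or (is_last and v == edges[-1]):
                counts[k] += 1
                break
    total = sum(counts)
    return [c / total for c in counts]


F1_EDGES = [4.4, 4.6, 4.8, 5.0, 5.2, 5.4]

f1_interval = interval(F1)                 # interval type
f1_with_nominal = (min(F1), nominal(F1), max(F1))  # interval with nominal
f1_histogram = histogram(F1, F1_EDGES)     # distribution type
```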



A Histo Distance Measure for Distribution type of Features


Let us consider the case where an n-dimensional SYMBOLIC OBJECT has all ‘n’ features of histogram type / probability distribution type.

$A = [\,a_i\,], \quad i = 1, \dots, n$

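[Figure: Cumulative Probability Distribution, P plotted against the feature value a]

[Figure: Normalized Cumulative Probability Distribution, P plotted against the feature value a]

What are the special advantages?

The cumulative probability function never has a negative slope.

It starts at $(\underline{a}, 0)$ and ends at $(\bar{a}, 1)$:

$\{(\underline{a} \le a \le \bar{a}),\ (0 \le P \le 1)\}$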

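We can linearize the cumulative probability function through Regression.

[Figure: regression line fitted to the cumulative probability function, P plotted against the feature value from $\underline{a}$ to $\bar{a}$]

$P = ma + c$

This linear equation represents one feature of object A.

For the corresponding feature of object B the linear equation is

$P = m'a + c'$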
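A minimal sketch of this linearization, assuming ordinary least squares over the cumulative points taken at the bin edges (the slides do not say which points are regressed). The helper names `cumulative_points` and `fit_line` are hypothetical; the bins and probabilities are the ones printed for f1 on the distribution-type slide.

```python
def cumulative_points(edges, probs):
    """Normalized cumulative probability at each bin edge.

    Starts at (lowest edge, 0) and ends at (highest edge, 1); dividing
    by the total absorbs rounding in the printed probabilities.
    """
    total = sum(probs)
    points = [(edges[0], 0.0)]
    running = 0.0
    for upper_edge, p in zip(edges[1:], probs):
        running += p
        points.append((upper_edge, running / total))
    return points


def fit_line(points):
    """Least-squares fit of P = m * a + c; returns (m, c)."""
    n = len(points)
    mean_a = sum(a for a, _ in points) / n
    mean_p = sum(p for _, p in points) / n
    sxx = sum((a - mean_a) ** 2 for a, _ in points)
    sxy = sum((a - mean_a) * (p - mean_p) for a, p in points)
    m = sxy / sxx
    c = mean_p - m * mean_a
    return m, c


F1_EDGES = [4.4, 4.6, 4.8, 5.0, 5.2, 5.4]
F1_PROBS = [0.08, 0.25, 0.33, 0.25, 0.08]

F1_POINTS = cumulative_points(F1_EDGES, F1_PROBS)
M, C = fit_line(F1_POINTS)  # slope and intercept of the fitted line
```

Each histogram feature is thereby reduced to two numbers, a slope and an intercept, which is the form in which the later slides report the extracted knowledge.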

Both the straight lines may be inserted in a single P vs. feature value graph.

[Figure: the two lines in one P vs. feature value graph, enclosing the quadrilateral T U V W]

The distance between the two objects in the ith feature is the area enclosed by the quadrilateral T U V W.

Therefore

$$\mathrm{Distance}(A,B) = \left[ \sum_{i=1}^{n} (\mathrm{Area}_i)^p \right]^{1/p}$$

If p = 1 it is City Block Distance; if p = 2 it is Euclidean Distance.
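The aggregation over features can be sketched as follows. The function name `histo_distance` is hypothetical, and the per-feature areas are assumed to have been computed already.

```python
def histo_distance(areas, p=1):
    """Combine per-feature areas: (sum of area_i ** p) ** (1 / p)."""
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(area ** p for area in areas) ** (1.0 / p)


# Illustrative areas for a two-feature object pair (made-up numbers).
AREAS = [3.0, 4.0]
city_block = histo_distance(AREAS, p=1)   # 3 + 4
euclidean = histo_distance(AREAS, p=2)    # sqrt(9 + 16)
```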

Different Possibilities

Exactly overlapping lines: D(A,B) = 0

Lines enclosing a single area: D(A,B) = Area Enclosed

Lines enclosing two areas a1 and a2: D(A,B) = a1 + a2

Special Cases

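If $a_i = (\underline{a}_i, \bar{a}_i)$ is an interval type of feature, then it is an Equiprobable Distribution.

[Figure: Equiprobable Distribution, density p and cumulative probability P over the interval from $\underline{a}_i$ to $\bar{a}_i$]

If $a_i$ is a single valued numeric type, then

[Figure: density p and cumulative probability P at the single value $a_i$]

In such a case the distance between the two objects A and B is

[Figure: P against feature value, with the single values $a_i$ and $b_i$]

$D(A,B)$ = Area of the rectangle $= (b_i - a_i) \times 1 = |b_i - a_i|$ = Conventional City Block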

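[Fig 10(a): Line L1 and Line L2 drawn on an axis from 0 to 1, separated by Dis1 and Dis2]

Distance (Line L1, Line L2) = (Dis1 + Dis2) / 2

[Fig 10(b): Line L1 and Line L2 on an axis from 0 to 1, crossing at (xi, yi), with Dis1 and Dis2]

Distance (Line L1, Line L2) = (Dis1 * yi + Dis2 * (1 - yi)) / 2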
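The two formulas of Fig 10 can be sketched as one function. This assumes that each line is given as (m, c) in P = m a + c, that Dis1 is the horizontal gap between the lines at P = 0 and Dis2 the gap at P = 1, and that yi is the P value at which the lines cross; the figure itself is not recoverable, so this reading is an assumption. Under it, the first formula is the area of a trapezoid of height 1 and the second is the sum of two triangles. The names `value_at` and `line_distance` are hypothetical.

```python
def value_at(line, p):
    """Feature value a at which the line P = m * a + c reaches P = p."""
    m, c = line
    return (p - c) / m


def line_distance(line1, line2):
    """Area enclosed between two lines for P between 0 and 1."""
    gap0 = value_at(line2, 0.0) - value_at(line1, 0.0)  # signed gap at P = 0
    gap1 = value_at(line2, 1.0) - value_at(line1, 1.0)  # signed gap at P = 1
    dis1, dis2 = abs(gap0), abs(gap1)
    if gap0 * gap1 >= 0:
        # Fig 10(a): the lines do not cross inside the band (trapezoid).
        return (dis1 + dis2) / 2
    # Fig 10(b): the lines cross at P = yi (two triangles).
    yi = dis1 / (dis1 + dis2)
    return (dis1 * yi + dis2 * (1 - yi)) / 2


# Illustrative lines (made-up slopes and intercepts).
L1 = (1.0, 0.0)      # a runs from 0 to 1 as P runs from 0 to 1
L2 = (1.0, -2.0)     # parallel to L1, shifted right by 2
L3 = (0.5, 0.25)     # crosses L1 at P = 0.5
```

With this reading, two parallel lines shifted by 2 are at distance 2, and a line is at distance 0 from itself, which matches the "exactly overlapping" case.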

A simple distance measure which is quite lucid to comprehend and computationally easy to implement is devised for distribution/histogram type of symbolic objects.

An Image Processing Application

[Figures: NILE DELTA (Four Quadrants / 4 Blocks); WALL PAPER; THAR DESERT; SIMULATED folder]

Map of Australia

Source: Google

Analysis of Image

Size is (4320 X 5250 X 3) pixels. This implies 4320 rows and 5250 columns of pixels in each of the Red, Green, Blue (RGB) layers of the image.

If each pixel is represented by an unsigned integer of, say, 1 byte, we need 4320 X 5250 X 3 = 6,80,40,000 bytes (about 68 million). Learning from such a huge image in one go is a heavy burden and demands considerable computational resources.

An easier alternative could be to re-size the image and learn from it. But by re-sizing, the quality of the image is compromised.

In applications such as weather forecasting and medical image analysis, one cannot afford to lose the quality of images that have been obtained with great difficulty and at an exorbitant cost.
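The storage figure quoted for the full image can be checked with a short calculation. The variable names are illustrative; the 1 byte per value is the assumption stated on the slide.

```python
ROWS, COLS, LAYERS = 4320, 5250, 3   # image size from the slide
BYTES_PER_VALUE = 1                  # one unsigned 8-bit integer per value

total_bytes = ROWS * COLS * LAYERS * BYTES_PER_VALUE
total_mib = total_bytes / (1024 * 1024)   # the same amount in mebibytes
```

Here `total_bytes` comes to 68,040,000, which is the slide's 6,80,40,000 written with Indian digit grouping.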

Selecting the region of Interest for learning

Size: 4000 X 4800 X 3

Size of Each Frame

Resized Image: 4000 X 4800 X 3

For vertical fragments (frames), the number of columns is divided by the number of frames.

If we want, say, 8 frames: 4800 / 8 = 600.

The size of each frame will be 4000 X 600 X 3.

Vertical Fragmentation of Australia Map into 8 frames each of size (4000X600X3)
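The vertical fragmentation can be sketched as a computation of column ranges, one per frame. The function name `frame_column_ranges` is hypothetical, and the sketch assumes the column count divides evenly, as 4800 / 8 does.

```python
def frame_column_ranges(n_cols, n_frames):
    """Half-open column ranges (start, stop) of equal-width vertical frames."""
    if n_frames <= 0 or n_cols % n_frames != 0:
        raise ValueError("columns must divide evenly into frames")
    width = n_cols // n_frames
    return [(k * width, (k + 1) * width) for k in range(n_frames)]


# The selected region is 4000 rows by 4800 columns by 3 layers.
RANGES = frame_column_ranges(4800, 8)
# Each frame keeps all 4000 rows and all 3 layers, and 600 columns.
FRAME_SHAPES = [(4000, stop - start, 3) for start, stop in RANGES]
```

Frames can then be read and processed one at a time in this order, so only one 4000 X 600 X 3 slice needs to be held in memory at once.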

Dividing each frame into blocks of fixed size

Size of each block

As the choice of block size is not the original issue in this research, it is considered beyond the scope of the present research.

For the sake of simplicity, it has been decided to divide each frame into blocks of fixed size, say (50 X 50 X 3) or (100 X 100 X 3).

The Process of SCIL from spatial data

Frame 1

Extracted Knowledge is

No. of Clusters = 4
No. of Outliers = 8

Slope and Intercept for each layer:

Layer   Slope      Intercept
R       128.9632     5.6244
R       224.7624   -60.8968
R       151.2128    58.4553
R       163.5785    -2.0205
G       129.2686     4.6854
G       228.2566   -66.0935
G       144.7173    42.2374
G       164.7278     4.5665
B       128.3291    25.2985
B       145.0664    13.1920
B       146.8216    20.8907
B       192.8587   -47.5536

The Process of SCIL from spatial data (Cont’d)


[Tables: Clusters of Frame 1; Clusters of Frame 2; Distance between clusters of Frame 1 and Frame 2]

Clusters 1 and 5 can be merged.

Clusters 3 and 6 can be merged.

As the minimum distance is > 0.15, the clusters cannot be merged.

Clusters 2 and 5 are nearest.

Final Knowledge of Frame 1 and Frame 2
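One way to read the merging step is sketched below. Only the threshold 0.15 comes from the slide; the distance values are made up, and the rule itself (visit cluster pairs in increasing order of distance and merge a pair only if both clusters are still unmerged) is an assumption, not the documented SCIL procedure. The name `merge_plan` is hypothetical.

```python
MERGE_THRESHOLD = 0.15  # taken from the slide


def merge_plan(distances, threshold=MERGE_THRESHOLD):
    """Pick cluster pairs to merge, closest first.

    distances maps (existing_cluster, new_cluster) to their distance.
    Each cluster takes part in at most one merge (an assumption).
    """
    merges = []
    used_existing, used_new = set(), set()
    for (existing, new), dist in sorted(distances.items(), key=lambda kv: kv[1]):
        if dist > threshold:
            break  # all remaining pairs are farther apart still
        if existing in used_existing or new in used_new:
            continue
        merges.append((existing, new))
        used_existing.add(existing)
        used_new.add(new)
    return merges


# Made-up distances between clusters 1-3 (Frame 1) and 5-6 (Frame 2).
EXAMPLE_DISTANCES = {
    (1, 5): 0.05, (1, 6): 0.40,
    (2, 5): 0.20, (2, 6): 0.35,
    (3, 5): 0.30, (3, 6): 0.10,
}
PLAN = merge_plan(EXAMPLE_DISTANCES)
```

With these illustrative numbers, clusters 1 and 5 and clusters 3 and 6 are merged, while cluster 2 stays separate because its nearest neighbour, cluster 5, lies farther away than 0.15.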

Result after processing the final frame

Second Image

Test Image No.2

Size of the Image = (3292 X 4939 X 3) pixels.

The selected portion for learning is row pixels 1 to 3000 and column pixels 1 to 4800 of all the RGB planes.

The image is divided into 8 frames of size (3000 X 600 X 3) pixels.

Each frame is divided into blocks of (100 X 100 X 3) pixels. Hence each frame gets divided into 30 X 6 = 180 blocks.
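The block count can be reproduced with a short sketch. The names `block_grid` and `iter_blocks` are hypothetical, and the sketch assumes the frame dimensions are exact multiples of the block size, as they are here.

```python
def block_grid(frame_rows, frame_cols, block_size):
    """Number of blocks along the rows and along the columns."""
    if frame_rows % block_size or frame_cols % block_size:
        raise ValueError("frame must divide evenly into blocks")
    return frame_rows // block_size, frame_cols // block_size


def iter_blocks(frame_rows, frame_cols, block_size):
    """Yield (row_start, row_stop, col_start, col_stop) for each block."""
    n_r, n_c = block_grid(frame_rows, frame_cols, block_size)
    for r in range(n_r):
        for c in range(n_c):
            yield (r * block_size, (r + 1) * block_size,
                   c * block_size, (c + 1) * block_size)


GRID = block_grid(3000, 600, 100)                 # blocks per frame, by axis
BLOCKS = list(iter_blocks(3000, 600, 100))        # all block extents
```

Each extent covers all three RGB layers, so one block corresponds to a (100 X 100 X 3) slice of the frame.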

Result of the test image

Quiz-wise incremental update of knowledge of Subject-1 using the SCIL Model

[Figures: Quiz-1, Quiz-2, Updated knowledge, Quiz-3 to Quiz-10]

e-mail: [email protected]
