1 Some Guidelines for Good Research Dr Leow Wee Kheng Dept. of Computer Science.


Page 1: Some Guidelines for Good Research

Dr Leow Wee Kheng
Dept. of Computer Science

Page 2: Research is...

- extending human understanding?
- publishing papers?
- solving a challenging puzzle?
- making things work?
- having fun?

Page 3: Research involves...

- selecting a topic
- studying existing work
- solving problems
- publishing results

Page 4: Problem Solving

- define the problem
- specify requirements/quality of solution
- formulate a solution method
- evaluate performance of the method

Page 5: Problem Definition

Purpose:
- clarify the problem
- focus attention

Define the problem precisely.
- It's usually difficult to define the problem. So, just skip it?
- The more difficult it is, the stronger the need, and the harder one should try.

Page 6: Problem Definition

Example: image retrieval

1st trial: Retrieve images that a user wants.

Problem: too vague. What does the user want?

Page 7: Problem Definition

2nd trial: Retrieve images that are similar to a specific example called the query.

Better, but still vague: what does “similar” mean?

Page 8: Problem Definition

3rd trial: Given a query Q, retrieve images I_i, i = 1,...,n, such that a similarity measure s(Q, I_i) is large.

Better, but what is this s(Q, I_i)?

Notice:
- There's an input: Q
- There are expected outputs: I_i
- There's a requirement specification: large s(Q, I_i)

Page 9: Problem Definition

4th trial: Given a query Q, retrieve images I_i, i = 1,...,n, such that a similarity measure s(Q, I_i) is large, and s(Q, I_i) is consistent with human perception.

Good try, but human perception is difficult to measure; it is still a difficult research topic.

Notice: we are now talking more about the quality of the solution.

Page 10: Problem Definition

5th trial: Given a query Q, retrieve images I_i, i = 1,...,n, such that precision p and recall rate r are maximized.
- each Q and I_i contains one or more regions
- p and r are performance indices
- p and r are defined in terms of similarity s(Q, I_i)

Now, the problem definition is more specific:
- the requirement of the solution is also given
- we still haven't said what s(Q, I_i), p, and r are
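The slide leaves p and r undefined. As a minimal sketch, assume the common set-based definitions: precision is the fraction of retrieved images that are relevant, and recall is the fraction of relevant images that are retrieved. The function name and the example image IDs are hypothetical, and the definitions the author has in mind (in terms of s(Q, I_i)) may differ.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: IDs of the images the system returned.
    relevant:  IDs of the images judged relevant (ground truth).
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Guard against empty sets so the ratios are always defined.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 images returned, 3 of them relevant, 6 relevant overall.
p, r = precision_recall(["a", "b", "c", "x"], ["a", "b", "c", "d", "e", "f"])
print(p, r)
```

Returning more images tends to raise recall and lower precision, which is why the two indices are reported together.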

Page 11: Problem Definition

A good problem definition should include:
- inputs
- expected outputs
- relationships between inputs and outputs
- requirements about outputs
- performance measures

Page 12: Formulate Solution Method

What it is not...
- not writing a program
- not thinking of an algorithm
- not thinking of what tools to use

So what is it?
- give more details to the problem definition
- use mathematics
- divide the problem into sub-problems
- map each sub-problem into a known problem with known solution methods

Page 13: Formulate Solution Method

Example: image retrieval

First sub-problem: define similarity s(Q, I_i)
- Q has regions R_k
- I_i has regions R_ij
- some R_k are identical/similar to some R_ij
- some R_k are not identical/similar to some R_ij
- need to find best matching pairs

How to say all these in math?

Page 14: Formulate Solution Method

One possibility: use a mapping function f: Q → I_i
- R_ij = f(R_k) means R_k corresponds to R_ij
- define region similarity s(R_k, R_ij)
  - s(R_k, R_ij) is large if R_k is similar to R_ij
  - this is another sub-problem
- s(R_k, f(R_k)) = match between R_k ∈ Q and corresponding f(R_k) ∈ I_i

Page 15: Formulate Solution Method

First, define s_f(Q, I_i): given a query Q, an image I_i, and a mapping function f, define s_f(Q, I_i) as

\[ s_f(Q, I_i) = \frac{1}{|Q|} \sum_{R_k \in Q} s(R_k, f(R_k)) \]

Then, define s(Q, I_i): given a query Q and an image I_i,

\[ s(Q, I_i) = \max_f s_f(Q, I_i) \]
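The two definitions on this slide can be sketched in code, taking s_f as the average region similarity under a mapping f and s as its maximum over all f. This is a sketch under stated assumptions, not the author's implementation: regions are represented as feature vectors, `region_similarity` is a hypothetical stand-in for s(R_k, R_ij), and f is allowed to map several query regions to the same image region.

```python
from itertools import product
from math import dist


def region_similarity(r, t):
    """Hypothetical s(R_k, R_ij): 1 for identical feature vectors, smaller as they move apart."""
    return 1.0 / (1.0 + dist(r, t))


def s_f(query_regions, mapped_regions):
    """s_f(Q, I_i): average similarity between each R_k and its image f(R_k).

    mapped_regions[k] plays the role of f(R_k).
    """
    pairs = zip(query_regions, mapped_regions)
    return sum(region_similarity(r, t) for r, t in pairs) / len(query_regions)


def s(query_regions, image_regions):
    """s(Q, I_i): maximum of s_f over all mappings f, found by exhaustive enumeration."""
    # Each element of the product assigns one image region to every query region.
    mappings = product(image_regions, repeat=len(query_regions))
    return max(s_f(query_regions, f) for f in mappings)


# Hypothetical query and image, each a list of 2-D region feature vectors.
Q = [(0.0, 0.0), (1.0, 1.0)]
I = [(1.0, 1.0), (0.0, 0.0), (5.0, 5.0)]
print(s(Q, I))
```

Exhaustive enumeration visits m^n mappings for n query regions and m image regions, so it only serves to make the definition concrete on small inputs.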

Page 16: Formulate Solution Method

Now, perform a “magic” which I call problem transformation.

- Given a query Q and an image I_i, find the mapping f such that s_f(Q, I_i) is maximized.
  - this is an optimization problem
- Given a query Q and a set of images I_i:
  - compute s(Q, I_i) by solving the optimization problem
  - return images I_i with large s(Q, I_i)

Now, we have an algorithm!
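The algorithm on this slide can be sketched as follows. It assumes that f may map several query regions to the same image region, in which case the maximization splits into an independent best match for each query region. The region features, `region_similarity`, and the `top_n` cut-off are hypothetical stand-ins, since the slides do not fix them.

```python
from math import dist


def region_similarity(r, t):
    """Hypothetical region similarity in (0, 1]; equals 1 for identical feature vectors."""
    return 1.0 / (1.0 + dist(r, t))


def image_similarity(query_regions, image_regions):
    """s(Q, I_i) under an unrestricted mapping f: mean of each query region's best match."""
    best = [max(region_similarity(r, t) for t in image_regions) for r in query_regions]
    return sum(best) / len(best)


def retrieve(query_regions, database, top_n=2):
    """Rank the images by s(Q, I_i) and return the names of the top_n most similar ones."""
    scores = {name: image_similarity(query_regions, regions)
              for name, regions in database.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical database: image name -> list of region feature vectors.
database = {
    "img1": [(0.0, 0.0), (1.0, 1.0)],
    "img2": [(0.0, 0.1), (4.0, 4.0)],
    "img3": [(9.0, 9.0)],
}
query = [(0.0, 0.0), (1.0, 1.0)]
print(retrieve(query, database))
```

If f were required to be one-to-one, the per-region maximum would no longer be valid and the inner step would become an assignment problem instead.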

Page 17: Formulate Solution Method

Sounds complicated. Do we really do all these? Yes. The more complicated, the stronger the need.

Further reading: my IJCAI 2001 paper on conceptual graphs
- map query and images to conceptual graphs
- define the problem as a subgraph matching problem (problem transformation)
- map subgraph matching to a search problem (transform once more)
- implement the search algorithm

Page 18: Formulate Solution Method

Some challenges:
- Define the problem that you’re working on using math.
- Try to find an interesting CS research problem that cannot be defined using math.

Page 19: Performance Evaluation

Purpose:
- evaluate the solution method
- understand its strengths and weaknesses (with the aim of improving it)
- determine values of system parameters

Page 20: Performance Evaluation

What’s your impression if you see this:

    The accuracy of the method is 90%.

The statement is not very meaningful. Is the method really good?
- maybe the problem is simple
- maybe the data is not representative

It is easy to differentiate an elephant from an apple; it is difficult to differentiate between African and Asian elephants.
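One way to see why a bare accuracy figure says little is to compare it with a trivial baseline. The sketch below uses a made-up, unbalanced test set (the labels and counts are hypothetical) and shows that always predicting the most frequent class already reaches 90% accuracy.

```python
from collections import Counter


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


# Hypothetical test set: 90 apples and 10 elephants.
labels = ["apple"] * 90 + ["elephant"] * 10

# Trivial baseline: always predict the most frequent class.
majority = Counter(labels).most_common(1)[0][0]
baseline_predictions = [majority] * len(labels)

print(accuracy(baseline_predictions, labels))
```

A method that reports 90% on such data has not been shown to do better than guessing, which is why the reported number needs a baseline and representative data next to it.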

Page 21: Performance Evaluation

So what makes a good evaluation?
- good test data
  - test various aspects of the method
- good test cases
  - expose strengths and weaknesses of the method
- good comparison
  - set baselines for assessing performance
  - compare with the state of the art

Page 22: Performance Evaluation

Example 1: Color histograms: finding a good operating condition (i.e., parameter values)
- 100 colorful images
- 2 parameters:
  - radius R
  - separation ratio
- 2 performance measures:
  - mean number of color bins
  - mean error of color
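A search for a good operating condition over two parameters can be organised as a grid sweep that records both performance measures for every setting. The sketch below is only an illustration of that bookkeeping: `toy_evaluate` is a hypothetical stand-in that returns made-up numbers, not measurements of the color histogram method, and the selection rule (fewest bins among settings with acceptable error) is an assumption.

```python
from itertools import product


def sweep(evaluate, radii, ratios):
    """Evaluate every (radius, ratio) pair; each row is (radius, ratio, mean_bins, mean_error)."""
    return [(radius, ratio, *evaluate(radius, ratio))
            for radius, ratio in product(radii, ratios)]


def good_operating_condition(rows, max_error):
    """Among settings whose mean error is acceptable, pick the one with the fewest bins."""
    acceptable = [row for row in rows if row[3] <= max_error]
    return min(acceptable, key=lambda row: row[2]) if acceptable else None


def toy_evaluate(radius, ratio):
    """Toy stand-in returning (mean_bins, mean_error); NOT results of any real method."""
    mean_bins = 100.0 / (radius * ratio)
    mean_error = 0.5 * radius
    return mean_bins, mean_error


rows = sweep(toy_evaluate, radii=[5, 10, 20], ratios=[1.0, 1.5, 2.0])
print(good_operating_condition(rows, max_error=5.0))
```

In a real study, `evaluate` would run the method on the test images and average the two measures over them.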

Page 23: Performance Evaluation

Page 24: Performance Evaluation

Page 25: Performance Evaluation

good operating condition

Page 26: Performance Evaluation

Example 2: Estimation of surface normals of 3D points
- 3 sets of data points
- 2 methods of getting good neighboring points:
  - point-based (P), mesh-based (M)
- 2 methods of computing surface normals:
  - PCA, linear extrapolation (LE)
- baseline: PCA on raw 3D points

Page 27: Performance Evaluation

What can you conclude from the result?
- M is better than P
- LE is better than PCA
- LE/M is the best

Page 28: Performance Evaluation

Example 3: Classification of images by color distributions
- 100 classes, 20 samples each
- 2 methods of computing color histograms:
  - clustered (c), adaptive (a)
- 4 dissimilarity measures:
  - L2, JD: only for c
  - EMD: too slow for c, so only for a
  - WC: ok for both c and a
- baseline: c + L2
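To make the role of a dissimilarity measure concrete, here is a minimal sketch of classification with the L2 (Euclidean) distance between histograms. It assumes all histograms share the same fixed bins and uses a nearest-neighbor rule; the slides do not say which classifier was used, and the class names and histogram values are hypothetical.

```python
from math import dist


def classify(histogram, training_set):
    """Return the label of the training histogram closest to `histogram` under L2 distance."""
    nearest_label, _ = min(training_set, key=lambda item: dist(histogram, item[1]))
    return nearest_label


# Hypothetical 3-bin color histograms (bin proportions), one sample per class.
training_set = [
    ("sunset", (0.7, 0.2, 0.1)),
    ("forest", (0.1, 0.8, 0.1)),
    ("ocean", (0.1, 0.2, 0.7)),
]
print(classify((0.2, 0.7, 0.1), training_set))
```

Swapping `dist` for another dissimilarity function is the only change needed to compare measures under the same classifier and data.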

Page 29: Performance Evaluation

What can you conclude from the result?

Page 30: final points...

- Always relate to the big picture.
  - good research is never performed in isolation
- Know the strengths & weaknesses of your tools.
  - nothing is perfect for everything
- Understand the problem first.
  - choose the solution method last
- Learn various methods and tools.
  - you never know when you need one

Page 31: good readings...

- H. S. Fogler & S. E. LeBlanc, Strategies for Creative Problem Solving, Prentice-Hall, 1995.
  - general ideas, no math
- D. Huff, How To Lie With Statistics, W. W. Norton, 1993.
  - know how not to get cheated