Some Guidelines for Good Research, Dr Leow Wee Kheng, Dept. of Computer Science.
1
Some Guidelines for
Good Research
Dr Leow Wee Kheng
Dept. of Computer Science
2
Research is...
  extending human understanding?
  publishing papers?
  solving a challenging puzzle?
  making things work?
  having fun?
3
Research involves...
  selecting a topic
  studying existing work
  solving problems
  publishing results
4
Problem Solving
  define the problem
  specify the requirements/quality of the solution
  formulate a solution method
  evaluate the performance of the method
5
Problem Definition
Purpose
  clarify the problem
  focus attention
Define the problem precisely.
  It is usually difficult to define the problem. So, just skip it?
  The more difficult it is, the stronger the need and the harder one should try.
6
Problem Definition
Example: image retrieval
1st trial: Retrieve images that a user wants.
Problem: too vague. What does the user want?
7
Problem Definition
2nd trial: Retrieve images that are similar to a specific example called the query.
Better, but still vague: What does “similar” mean?
8
Problem Definition
3rd trial: Given a query Q, retrieve images I_i, i = 1,...,n, such that a similarity measure s(Q, I_i) is large.
Better, but what is this s(Q, I_i)?
Notice:
  There is an input: Q
  There are expected outputs: I_i
  There is a requirement specification: large s(Q, I_i)
9
Problem Definition
4th trial: Given a query Q, retrieve images I_i, i = 1,...,n, such that a similarity measure s(Q, I_i) is large, and s(Q, I_i) is consistent with human perception.
Good try, but human perception is difficult to measure; it is still a difficult research topic.
Notice: We are now talking more about the quality of the solution.
10
Problem Definition
5th trial: Given a query Q, retrieve images I_i, i = 1,...,n, such that precision p and recall rate r are maximized.
  each Q and I_i contains one or more regions
  p and r are performance indices
  p and r are defined in terms of the similarity s(Q, I_i)
Now the problem definition is more specific:
  the requirement on the solution is also given
  but we still haven't said what s(Q, I_i), p, and r are
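The slide leaves p and r undefined. As a minimal sketch, assuming the standard information-retrieval definitions and a known set of relevant images for each query (neither is stated in the slides), they could be computed as follows. The function name and image labels are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    Assumes the standard definitions:
      precision = |retrieved AND relevant| / |retrieved|
      recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 images retrieved, 3 images truly relevant, 2 in common.
p, r = precision_recall(
    retrieved=["img1", "img2", "img3", "img4"],
    relevant=["img2", "img4", "img7"],
)
```

In this sketch the retrieved set would come from keeping the images whose s(Q, I_i) is large, which is how p and r end up depending on the similarity measure.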
11
Problem Definition
A good problem definition should include
  inputs
  expected outputs
  relationships between inputs and outputs
  requirements about outputs
  performance measures
12
Formulate Solution Method
What it is not...
  not writing a program
  not thinking of an algorithm
  not thinking of what tools to use
So what is it?
  give more details to the problem definition
  use mathematics
  divide the problem into sub-problems
  map each sub-problem into a known problem with known solution methods
13
Formulate Solution Method
Example: image retrieval
First sub-problem: define the similarity s(Q, I_i)
  Q has regions R_k
  I_i has regions R_ij
  some R_k are identical/similar to some R_ij
  some R_k are not identical/similar to some R_ij
  need to find the best matching pairs
How to say all this in math?
14
Formulate Solution Method
One possibility: use a mapping function f: Q → I_i
  R_ij = f(R_k) means R_k corresponds to R_ij
  define region similarity s(R_k, R_ij)
    s(R_k, R_ij) is large if R_k is similar to R_ij
    this is another sub-problem
  s(R_k, f(R_k)) = match between R_k ∈ Q and the corresponding f(R_k) ∈ I_i
15
Formulate Solution Method
First, define s_f(Q, I_i): given a query Q, an image I_i, and a mapping function f, define s_f(Q, I_i) as
$$ s_f(Q, I_i) = \frac{1}{|Q|} \sum_{R_k \in Q} s(R_k, f(R_k)) $$
Then, define s(Q, I_i): given a query Q and an image I_i,
$$ s(Q, I_i) = \max_f s_f(Q, I_i) $$
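To make the two definitions concrete, here is a minimal brute-force sketch. It assumes each region is summarised by a single number in [0, 1] and uses a made-up region similarity; both are illustrative assumptions, since the slides leave s(R_k, R_ij) as a separate sub-problem. It also assumes f may map several query regions to the same image region.

```python
from itertools import product


def region_similarity(a, b):
    # Hypothetical region similarity: regions are scalar features in [0, 1],
    # and similarity is 1 minus their absolute difference.
    return 1.0 - abs(a - b)


def s_f(query, image, f):
    # f is a tuple: f[k] is the index of the image region matched to query region k.
    total = sum(region_similarity(rk, image[f[k]]) for k, rk in enumerate(query))
    return total / len(query)


def s(query, image):
    # Enumerate every mapping f from query regions to image regions
    # and keep the best average match (exponential cost, for illustration only).
    mappings = product(range(len(image)), repeat=len(query))
    return max(s_f(query, image, f) for f in mappings)


query = [0.2, 0.8]        # regions R_k of Q
image = [0.1, 0.9, 0.5]   # regions R_ij of I_i
score = s(query, image)
```

Enumerating all mappings is only feasible for a handful of regions, which is one reason the next step recasts the definition as an optimization problem.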
16
Formulate Solution Method
Now, perform a “magic” which I call problem transformation.
  Given a query Q and an image I_i, find the mapping f such that s(Q, I_i) is maximized.
    this is an optimization problem
  Given a query Q and a set of images I_i,
    compute s(Q, I_i) by solving the optimization problem
    return images I_i with large s(Q, I_i)
Now, we have an algorithm!
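The resulting algorithm can be sketched as below. This is an illustration under stated assumptions, not the author's implementation: regions are scalar features, the region similarity is made up, and f is unconstrained, so the optimization reduces to letting each query region pick its best match. If f had to be one-to-one, this step would become an assignment problem instead.

```python
def region_similarity(a, b):
    # Hypothetical region similarity on scalar features in [0, 1].
    return 1.0 - abs(a - b)


def image_similarity(query, image):
    # With no constraint on f, the maximum over mappings is reached by
    # matching each query region to its most similar image region.
    best = [max(region_similarity(rk, rij) for rij in image) for rk in query]
    return sum(best) / len(query)


def retrieve(query, database, n):
    # Rank image names by decreasing s(Q, I_i) and return the top n.
    ranked = sorted(
        database,
        key=lambda name: image_similarity(query, database[name]),
        reverse=True,
    )
    return ranked[:n]


# Hypothetical database: image name -> list of region features.
database = {"a": [0.1, 0.9], "b": [0.5, 0.5], "c": [0.2, 0.8, 0.4]}
top = retrieve([0.2, 0.8], database, n=2)
```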
17
Formulate Solution Method
Sounds complicated. Do we really do all this? Yes.
The more complicated, the stronger the need.
Further reading: my IJCAI 2001 paper on conceptual graphs
  map query and images to conceptual graphs
  define the problem as a subgraph matching problem (problem transformation)
  map subgraph matching to a search problem (transform once more)
  implement the search algorithm
18
Formulate Solution Method
Some challenges:
  Define the problem that you're working on using math.
  Try to find an interesting CS research problem that cannot be defined using math.
19
Performance Evaluation
Purpose
  evaluate the solution method
  understand its strengths and weaknesses (with the aim of improving it)
  determine the values of system parameters
20
Performance Evaluation
What's your impression if you see this:
  The accuracy of the method is 90%.
The statement is not very meaningful. Is the method really good?
  maybe the problem is simple
  maybe the data is not representative
It is easy to differentiate an elephant from an apple; it is difficult to differentiate between African and Asian elephants.
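A small sketch of why a bare accuracy figure says little: on a made-up, unbalanced test set, a method can reach 90% while doing no better than always guessing the most common class. The labels, class proportions, and predictions below are invented purely for illustration.

```python
from collections import Counter


def accuracy(predictions, labels):
    # Fraction of test samples whose predicted label equals the true label.
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def majority_baseline(labels):
    # Trivial baseline: always predict the most frequent label.
    most_common_label, _ = Counter(labels).most_common(1)[0]
    return [most_common_label] * len(labels)


# Hypothetical test set: 90 elephants and 10 apples.
labels = ["elephant"] * 90 + ["apple"] * 10

# Hypothetical method output: 85 elephants and 5 apples classified correctly.
method_predictions = (
    ["elephant"] * 85 + ["apple"] * 5      # predictions for the 90 elephants
    + ["apple"] * 5 + ["elephant"] * 5     # predictions for the 10 apples
)

method_accuracy = accuracy(method_predictions, labels)
baseline_accuracy = accuracy(majority_baseline(labels), labels)
```

Here both figures are 0.9, so the reported accuracy alone cannot tell the method apart from a baseline that never looks at the image.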
21
Performance Evaluation
So what makes a good evaluation?
  good test data
    test various aspects of the method
  good test cases
    expose strengths and weaknesses of the method
  good comparison
    set baselines for assessing performance
    compare with the state of the art
22
Performance Evaluation
Example 1: Color histograms: finding a good operating condition (i.e., parameter values)
  100 colorful images
  2 parameters:
    radius R
    separation ratio
  2 performance measures:
    mean number of color bins
    mean error of color
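One way to look for a good operating condition with two parameters and two measures is a grid sweep followed by discarding dominated settings. The sketch below is only an illustration: `evaluate` is a toy stand-in with invented formulas (the real one would run the method on the 100 test images), the parameter grids are made up, and it assumes that both fewer bins and lower error are preferable.

```python
from itertools import product


def evaluate(radius, ratio):
    # Toy stand-in, NOT the real method: invented formulas that mimic a
    # trade-off between number of bins and color error.
    mean_bins = 100.0 / (radius * ratio)
    mean_error = 0.5 * radius + 2.0 * ratio
    return mean_bins, mean_error


def dominates(a, b):
    # a dominates b if it is no worse on every measure and better on at least one.
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


radii = [5, 10, 20]      # hypothetical values of radius R
ratios = [1.0, 2.0]      # hypothetical values of the separation ratio

results = {(r, q): evaluate(r, q) for r, q in product(radii, ratios)}

# Keep only parameter settings that no other setting dominates.
pareto = [
    params
    for params, score in results.items()
    if not any(dominates(other, score) for other in results.values())
]
```

The surviving settings are the candidates among which a good operating condition would be chosen, for example by inspecting plots of the two measures.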
23
Performance Evaluation
24
Performance Evaluation
25
Performance Evaluation
good operating condition
26
Performance Evaluation
Example 2: Estimation of surface normals of 3D points.
  3 sets of data points
  2 methods of getting good neighboring points: point-based (P), mesh-based (M)
  2 methods of computing surface normals: PCA, linear extrapolation (LE)
  baseline: PCA on raw 3D points
27
Performance Evaluation
What can you conclude from the result?
  M is better than P
  LE is better than PCA
  LE/M is the best
28
Performance Evaluation
Example 3: Classification of images by color distributions.
  100 classes, 20 samples each
  2 methods of computing color histograms: clustered (c), adaptive (a)
  4 dissimilarity measures:
    L2, JD: only for c
    EMD: too slow for c, so only for a
    WC: ok for both c and a
  baseline: c + L2
29
Performance Evaluation
What can you conclude from the result?
30
Final points...
  Always relate to the big picture.
    good research is never performed in isolation
  Know the strengths & weaknesses of your tools.
    nothing is perfect for everything
  Understand the problem first.
    choose the solution method last
  Learn various methods and tools.
    you never know when you need one
31
Good readings...
  H. S. Fogler & S. E. LeBlanc, Strategies for Creative Problem Solving, Prentice-Hall, 1995.
    general ideas, no math
  D. Huff, How To Lie With Statistics, W. W. Norton, 1993.
    know how not to get cheated