IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012


Transcript of IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Page 1: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

IT 499: Seminar Course, Week 3

Faculty: Dr. Afshan Jafri

28 February 2012

Page 2: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012


Outline

1. What is research?

2. How to prepare yourself for IT research?

3. How to identify and define a good IT research problem?

- Research Area

- Research Question / Topic

4. How to solve it?

- Research methods

- Research phases

5. How to write and publish an IT paper?

6. Research Ethics

Page 3: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

How to solve it?

• Understanding the problem

– Distinguishing the unknown, the data and the condition

• Devising a plan

– Connecting the data to the unknown, finding related problems, relying on previous findings

• Carrying out the plan

– Validating each step and, if possible, proving correctness

• Looking back

– Checking the results, contemplating alternative solutions, exploring further potential of result/method

Page 4: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Research Methods

Page 5: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

A Research Method Classification

• Scientific: understanding nature

• Engineering: providing solutions

• Empirical: data centric models

• Analytical: theoretical formalism

• Computing: hybrid of methods

From W. R. Adrion, "Research Methodology in Software Engineering," ACM SE Notes, Jan. 1993

Page 6: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Scientist vs. Engineer

• A scientist sees a phenomenon and asks “why?” and proceeds to research the answer to the question.

• An engineer sees a practical problem and wants to know “how” to solve it and “how” to implement that solution, or “how” to do it better if a solution exists.

• A scientist builds in order to learn, but an engineer learns in order to build

Page 7: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

The Scientific Method

• Observe the real world

• Propose a model or theory of some real-world phenomenon

• Measure and analyze the above

• Validate hypotheses of the model or theory

• If possible, repeat

Page 8: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

The Engineering Method

• Observe existing solutions

• Propose better solutions

• Build or develop the better solution

• Measure, analyze, and evaluate

• Repeat until no further improvements are possible

Page 9: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

The Empirical Method

• Propose a model

– Develop a statistical or other basis for the model

• Apply to case studies

• Measure and analyze

• Validate, then repeat

Page 10: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

The Analytical Method

• Propose a formal theory or set of axioms

• Develop a theory

• Derive results

• If possible, compare with empirical observations

• Refine the theory if necessary

Page 11: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Computing

Page 12: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Research Phases

Page 13: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Research Phases

• Informational: gathering information through reflection, literature survey, and people/organization survey

• Propositional: proposing/formulating a hypothesis, method, algorithm, theory, or solution

• Analytical: analyzing and exploring the proposition, leading to a formulation, principle, or theory

• Evaluative: evaluating the proposal

R. L. Glass, "A structure-based critique of contemporary computing research," Journal of Systems and Software, Jan. 1995

Page 14: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Method-Phase Matrix

Scientific

– Informational: observe the world

– Propositional: propose a model or theory of behavior

– Analytical: measure and analyze

– Evaluative: validate hypotheses of the model or theory; if possible, repeat

Engineering

– Informational: observe existing solutions

– Propositional: propose better solutions; build or develop

– Analytical: measure and analyze

– Evaluative: measure and analyze; repeat until no further improvements are possible

Empirical

– Propositional: propose a model; develop statistical or other methods

– Analytical: apply to case studies; measure and analyze

– Evaluative: measure and analyze; validate the model; repeat

Analytical

– Propositional: propose a formal theory or set of axioms

– Analytical: develop a theory; derive results

– Evaluative: derive results; compare with empirical observations if possible

Page 15: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Example - Software Engineering

• Informational phase - Gather or aggregate information via

– reflection

– literature survey

– people/organization survey

– case studies

• Propositional phase - Propose and build hypothesis, method or algorithm, model, theory or solution

• Analytical phase - Analyze and explore proposal leading to demonstration and/or formulation of principle or theory

• Evaluation phase - Evaluate proposal or analytic findings by means of experimentation (controlled) or observation (uncontrolled, such as case study or protocol analysis) leading to a substantiated model, principle, or theory.

Page 16: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Computing Schism …

• “ … computing research … is characterized largely by research that uses the analytical method and few of its alternatives … the evaluative phase is seldom included.” [Glass, 95]

• “Relative to other sciences, the data shows that computer scientists validate a smaller percentage of their claims.” [Tichy 98]

Page 17: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

SCIENTIFIC METHOD

Page 18: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Formulate Research Hypotheses

• Typical hypotheses:

– Hypothesis about user characteristics (tested with user studies or user-log analysis, e.g., clickthrough bias)

– Hypothesis about data characteristics (tested with fitting actual data)

– Hypothesis about methods (tested with experiments):

• Method A works (or doesn’t work) for task B under condition C by measure D (feasibility)

• Method A performs better than method A’ for task B under condition C by measure D (comparative)

• Introducing baselines naturally leads to hypotheses

• Carefully study existing literature to figure out where exactly you can make a new contribution (what do you want others to cite your work for?)

• The more specialized a hypothesis is, the more likely it’s new, but a narrow hypothesis has lower impact than a general one, so try to generalize as much as you can to increase impact

• But avoid over-generalizing (claims must be supported by your experiments)

• Tuning hypotheses

Page 19: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Hypothesis Procedure

• Clearly define the hypothesis to be tested (include any necessary conditions)

• Design the right experiments to test it (experiments must match the hypothesis in all aspects)

• Carefully analyze results (seek understanding and explanation rather than just description)

• Unless you’ve got a complete understanding of everything, always attempt to formulate a further hypothesis to achieve better understanding

Page 20: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Clearly Define a Hypothesis

• A clearly defined hypothesis helps you choose the right data and right measures

• Make sure to include any necessary conditions so that you don’t over claim

• Be clear about any justification for your hypothesis (testing a random hypothesis requires more data than testing a well-justified hypothesis)

Page 21: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Design the Right Experiments

• Flawed experiment design is a common cause of rejection of a paper (e.g., a poorly chosen baseline)

• The data should match the hypothesis

– A general claim like “method A is better than B” would need a variety of representative data sets to prove

• The measure should match the hypothesis

– Multiple measures are often needed (e.g., both precision and recall; see the sketch after this list)

• The experiment procedure shouldn’t be biased

– Comparing A with B requires using identical procedure for both

– Common mistake: baseline method not tuned or not tuned seriously

• Test multiple hypotheses simultaneously if possible (for the sake of efficiency)
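The sketch below is a minimal illustration of the precision/recall point above: it computes both measures for two methods evaluated on the same labelled test set. The item ids and relevance judgments are hypothetical, not from the slides.

```python
# Minimal sketch: computing precision and recall for two methods on the
# same labelled test set (all data here is hypothetical).

def precision_recall(retrieved, relevant):
    """Return (precision, recall) given the retrieved and relevant item ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

relevant_items = {1, 2, 3, 4, 5}          # ground truth for one query
method_a_results = [1, 2, 3, 9, 10]       # output of method A
method_b_results = [1, 2, 6, 7, 8, 9, 4]  # output of baseline B

for name, results in [("A", method_a_results), ("B (baseline)", method_b_results)]:
    p, r = precision_recall(results, relevant_items)
    print(f"Method {name}: precision={p:.2f}, recall={r:.2f}")
```

Reporting both numbers avoids the trap of a method that looks strong on one measure only (e.g., high precision from returning very few items).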

Page 22: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Carefully Analyze the Results

• Do a significance test if possible/meaningful (see the sketch after this list)

• Go beyond just getting a yes/no answer

– If positive: seek evidence to support your original justification of the hypothesis.

– If negative: look into reasons to understand how your hypothesis should be modified

– In general, seek explanations of everything!

• Get as much as possible out of the results of one experiment before jumping to run another

– Don’t throw away negative data

– Try to think of alternative ways of looking at data
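A minimal sketch of such a significance test, assuming SciPy is available; the per-case scores for method A and baseline B are hypothetical. A paired t-test is shown here, and a Wilcoxon signed-rank test is a common alternative when the score differences are not roughly normal.

```python
# Minimal sketch: paired significance test on per-case scores of two methods
# evaluated on the same test cases (the numbers are hypothetical).
from scipy import stats

scores_a = [0.62, 0.71, 0.58, 0.66, 0.75, 0.69, 0.60, 0.73]  # method A, per test case
scores_b = [0.55, 0.68, 0.57, 0.60, 0.70, 0.66, 0.59, 0.65]  # baseline B, same cases

t_stat, p_value = stats.ttest_rel(scores_a, scores_b)  # paired t-test
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("Difference is significant at the 0.05 level.")
else:
    print("No significant difference; don't over-claim.")
```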

Page 23: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Modify a Hypothesis

• Don’t stop at the current hypothesis; try to generate a modified hypothesis to further discover new knowledge

• If your hypothesis is supported, think about the possibility of further generalizing the hypothesis and test the new hypothesis

• If your hypothesis isn’t supported, think about how to narrow it down to some special cases to see if it can be supported in a weaker form

Page 24: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Derive New Hypotheses

• After you finish testing some hypotheses and reaching conclusions, try to see if you can derive interesting new hypotheses

– Your data may suggest an additional (sometimes unrelated) hypothesis; you get a by-product

– A new hypothesis can also logically follow a current hypothesis or help further support a current hypothesis

• New hypotheses may help find causes:

– If the cause is X, then H1 must be true, so we test H1

Page 25: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

ENGINEERING METHOD

Page 26: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Research Cycle

Page 27: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Validation

Page 28: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

EMPIRICAL METHODS

Page 29: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Components of Empirical Research

• Problem statement, research questions, purposes, benefits

• Theory, assumptions, background literature

• Variables and hypotheses

• Operational definitions and measurement

• Research design and methodology

• Instrumentation, Experiment

• Data analysis

• Conclusions, interpretations, recommendations

Page 30: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

What is an Experiment?

• Research method in which

– conditions are controlled

– so that one or more independent variables

– can be manipulated to test a hypothesis

– about a dependent variable

• Allows

– evaluation of causal relationships among variables

– while all other variables are eliminated or controlled.

Page 31: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Variables

• Dependent Variable

– Criterion by which the results of the experiment are judged.

– Variable that is expected to be dependent on the manipulation of the independent variable

• Independent Variable

– Any variable that can be manipulated, or altered, independently of any other variable

– Hypothesized to be the causal influence

Page 32: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Setup

• Experimental Treatments

– Alternative manipulations of the independent variable being investigated

• Experimental Group

– Group of subjects exposed to the experimental treatment

• Control Group

– Group of subjects exposed to the control condition

– Not exposed to the experimental treatment

– Serves as the standard for comparison

Page 33: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Testing

• Test Unit

– Entity whose responses to experimental treatments are being observed or measured

• Randomization

– Assignment of subjects and treatments to groups is based on chance

– Provides “control by chance”

– Random assignment allows the assumption that the groups are identical with respect to all variables except the experimental treatment
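A minimal sketch of random assignment ("control by chance"), using hypothetical subject ids:

```python
# Minimal sketch: random assignment of subjects to experimental and control
# groups ("control by chance"); the subject ids are hypothetical.
import random

subjects = [f"S{i:02d}" for i in range(1, 21)]  # 20 hypothetical subjects
random.seed(42)                                 # fixed seed makes the assignment reproducible
random.shuffle(subjects)

half = len(subjects) // 2
experimental_group = subjects[:half]   # receives the experimental treatment
control_group = subjects[half:]        # receives the control condition only

print("Experimental:", experimental_group)
print("Control:     ", control_group)
```

Fixing the seed keeps the assignment reproducible, which makes the experimental procedure easier to audit and repeat.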

Page 34: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Steps in Empirical Research

• Build Apparatus (integrate prototype and test conditions into experimental apparatus and software)

• Experiment Design (tweak software; establish experimental variables, procedure, and design; run pilot subjects)

• User Study (collect data, conduct interviews)

• Analyse Data (build models, check for significant differences, etc.)

• Publish Results

• Next iteration

Page 35: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

RECAP

Page 36: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Recap

• What are the claims?

• What are the factors and considerations?

• What is the evaluation approach?

• What are the metrics?

• How’s the data collected?

• What are the results compared to?

• How did you validate your results?

• What are your findings?

Page 37: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

The Claims

• At the outset, there’s an indicated purpose to the work

• Consequently, the purpose needs to be clearly defined

• Also, the evaluation needs to be aligned to fit the purpose

• So, what are your claims?

Page 38: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Types of Claims…

• Reduced requirements

• Better performance

• Ease of use

• Higher utility

• Cost reduction

• Best practice

• Insights

• Etc …

Page 39: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

The Considerations

• What are the inputs?

• What factors affect your results?

• Are they all variable?

• Are they all controllable?

• Is it possible to isolate their effects?

– Individually vs. collectively?

– Completely vs. partially?

Page 40: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Approach

• Carry the thrust from a clear purpose and a clear understanding of the factors

• Requires care as it will be subject to scrutiny

• Must be structured and logical

• Must exhibit an appropriate level of sophistication

• Must be clear

Page 41: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Metrics

• Define them

• How were they measured before (if applicable)?

• Are previous measures good enough?

Page 42: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Data Collection

• Could be made through …

– Measurements

– Simulation

– Survey

• Who/what collected the data?

• To what extent was the data collection process trustworthy?

• How is this “extent” verified?
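One way to keep the collection process auditable is to record provenance with every measurement. The sketch below is a hypothetical illustration (the field names and file name are made up), not a prescribed procedure.

```python
# Minimal sketch: recording provenance alongside collected measurements so the
# data collection process can be audited later (names and fields are hypothetical).
import csv
import datetime

def record_measurement(writer, value, source, collector):
    """Write one measurement together with who/what collected it and when."""
    writer.writerow({
        "timestamp": datetime.datetime.now().isoformat(),
        "value": value,
        "source": source,        # e.g., instrument, simulator, or survey id
        "collector": collector,  # person or script responsible
    })

with open("measurements.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["timestamp", "value", "source", "collector"])
    writer.writeheader()
    record_measurement(writer, 0.42, "simulation-run-01", "collect.py")
    record_measurement(writer, 0.39, "survey-batch-03", "research assistant")
```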

Page 43: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Baselines

• Is there a need for a baseline?

• Can you establish a reasonable baseline? (see the sketch after this list)

– Previous proposal(s)

– Random behavior

– Optimal behavior

– Current behavior

• Are you creating the baseline? (i.e., benchmarking?)
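A minimal sketch of two of the baselines listed above, "random behavior" and a simple stand-in for "current behavior" (always predicting the majority class), compared against a placeholder for the proposed method; all data here is hypothetical.

```python
# Minimal sketch: establishing baselines for a binary prediction task and
# comparing a proposed method against them (hypothetical data only).
import random

random.seed(0)
true_labels = [random.choice([0, 1]) for _ in range(200)]  # hypothetical ground truth

def accuracy(predictions, truth):
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)

# Baseline 1: random behavior
random_preds = [random.choice([0, 1]) for _ in true_labels]

# Baseline 2: "current behavior" stand-in -- always predict the majority class
majority = max(set(true_labels), key=true_labels.count)
majority_preds = [majority] * len(true_labels)

# Proposed method: placeholder that is right about 70% of the time
proposed_preds = [t if random.random() < 0.7 else 1 - t for t in true_labels]

print("Random baseline:  ", accuracy(random_preds, true_labels))
print("Majority baseline:", accuracy(majority_preds, true_labels))
print("Proposed method:  ", accuracy(proposed_preds, true_labels))
```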

Page 44: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Validating Your Results

• In part, it stems from the validity of everything leading up to the results section

• There’s also the breadth and depth of evaluation

• Breadth: The span of cases and considerations under which the evaluation was performed

• Depth: how far you go with each case or factor

• Scope of evaluation also raises credibility

Page 45: IT 499: Seminar Course Week 3 Faculty: Dr. Afshan Jafri 28 February 2012

Credits

• http://www.cs.usyd.edu.au/~info5993/

• http://www.cs.uiuc.edu/homes/czhai

• Abd Elhammid Taha, Research Methods, slides

• W. R. Adrion, "Research Methodology in Software Engineering," ACM SE Notes, Jan. 1993

• Matti Tedre, "Know Your Discipline: Teaching the Philosophy of Computer Science," Journal of Information Technology Education, 2007