Outline Of Confusion

Page 1: Outline Of Confusion

OUTLINE OF CONFUSION

Guide for the Perplexed

• Intelligent design primer
• Collecting CSI from people for search algorithms
  • Existing users of technique
• Search algorithm primer
  • Why use search algorithms in the first place?
  • How search algorithms work
  • The search algorithm dilemma
• Intelligent Agents to the rescue
  • How we help solve the dilemma through superior pattern recognition
  • Empirical evidence of human capability
• The Experiment
  • Hypothesis
  • Experiment design
  • Results
• Questions

Page 2: Outline Of Confusion

INTELLIGENT DESIGN PRIMER

What is ID? Is it just God of the gaps?

Page 3: Outline Of Confusion

FUNDAMENTALS

• The fundamental question of Intelligent Design Theory:
  • How do we know intelligent design when we see it?
• The fundamental claim of Intelligent Design Theory:
  • Only intelligent agents create information.

Page 4: Outline Of Confusion

IRREDUCIBLE COMPLEXITY

Page 5: Outline Of Confusion

EXPLANATORY FILTER

Page 6: Outline Of Confusion

COMPLEX SPECIFIED INFORMATION

Is this fractal complex specified information?

Page 7: Outline Of Confusion

• Creating irreducible complexity creates complex specified information

WHAT DOES IT MEAN TO CREATE INFORMATION?

Page 8: Outline Of Confusion

HOW IS ID DIFFERENT THAN ALL OF MODERN SCIENCE?

ID

Page 9: Outline Of Confusion

WHY IS ID FITTER THAN DARWINISM?

ID focuses on the creation of information rather than on the information itself.

Would you rather own…

THIS?

OR THIS?

Page 11: Outline Of Confusion

COLLECTING ACTIVE INFORMATION FOR SEARCH & OPTIMIZATION

?

Page 12: Outline Of Confusion

COLLECTING CSI FROM PEOPLE

• According to Intelligent Design theory, particularly in “Search for a Search”, intelligent agents such as people are capable of improving search algorithm performance beyond mathematical bounds.

• Goal: Create a generalized interface for people to contribute to an algorithmic search and optimization process, thus demonstrating human supra-computational capability.

Page 13: Outline Of Confusion

COLLECTING CSI FROM PEOPLE

Page 14: Outline Of Confusion

COLLECTING CSI FROM PEOPLE

? !

Page 15: Outline Of Confusion

COMMERCIAL CSI COLLECTION

• Mechanical Turk (http://www.mturk.com/)
  • Marketplace of simple web-based jobs for low-skill work
• reCAPTCHA (http://www.google.com/recaptcha)
  • Uses CAPTCHAs to correct OCR text recognition
• Foldit (http://fold.it/portal/)
  • Players fold proteins alongside an algorithm, achieving results superior to the protein-folding algorithm alone
• Google Image Labeler (http://images.google.com/imagelabeler/)
  • Players compete to label images

Page 18: Outline Of Confusion

SEARCH ALGORITHM PRIMER

What robots can do

Page 19: Outline Of Confusion

WHEN ARE SEARCH ALGORITHMS USED?

• Many problems can be solved by straightforward algorithms in an amount of time polynomial in the problem size. These problems are generally tractable to solve exactly with a computer, though significant computing power and space may be necessary.

• However, there is a much larger group of problems which, as far as we know, cannot be solved in polynomial time (NP-complete and harder, abbreviated NPC+). For these problems the best we can do is a best-effort attempt to get as close to the optimum as possible within our computation time and space limits.

• Numerous heuristic and approximation algorithms are used for NPC+ problems, and this is where search algorithms come in. Since we don’t know how to find the optimum solution directly, we have to search around in the problem space.

Page 20: Outline Of Confusion

HOW COMPLEXITY CLASSES SCALE

Blue = linear, green = polynomial, red = exponential
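The gap between the three growth rates on the chart can be reproduced with a few lines of Python. This is only an illustrative sketch: the slide does not say which polynomial was plotted, so n^2 is an assumption, and `growth_table` is a hypothetical helper name.

```python
def growth_table(sizes):
    """Return (n, linear, quadratic, exponential) step counts for each n."""
    # n ** 2 stands in for "polynomial"; the slide does not name the exponent.
    return [(n, n, n ** 2, 2 ** n) for n in sizes]


for n, linear, quadratic, exponential in growth_table([10, 20, 40, 80]):
    print(f"n={n:>3}  linear={linear:>3}  n^2={quadratic:>5}  2^n={exponential}")
```

Doubling n doubles the linear count and quadruples the quadratic count, but squares the exponential one, which is why exact solutions stop being practical so quickly.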

Page 21: Outline Of Confusion

SOME EXAMPLES OF NPC+ PROBLEMS

• Finding binding sites on proteins
• Delivery route planning
• Calculating cheap airline trips
• Stock market portfolio selection
• Packing your belongings for a move
• Making the Internet fast

Page 22: Outline Of Confusion

HOW SEARCH ALGORITHMS WORK

• Search is a process of hill climbing: it uses the information in previously found solutions to find even better solutions. One well-known example is the Newton-Raphson method of finding square roots.

• The problem is that an uneven search landscape will cause a search to become stuck on low-lying peaks and crags.

• To get out of these traps, the search algorithm has to have an element of exploration. Exploration consists of sampling areas of the landscape and hill climbing in promising sections.
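The three points above can be sketched in Python. This is a minimal illustration under stated assumptions, not the search algorithm used in the experiment: the landscape function, the integer search space, and all function names are hypothetical, and random restarts are just one simple form of exploration.

```python
import random


def newton_sqrt(a, iterations=20):
    """Approximate sqrt(a) for a > 0 with the Newton-Raphson update x <- (x + a/x) / 2."""
    x = max(a, 1.0)  # any positive starting guess works
    for _ in range(iterations):
        x = 0.5 * (x + a / x)  # each step reuses the previous estimate
    return x


def hill_climb(score, start, lo, hi):
    """Move to the better integer neighbour until no neighbour improves the score."""
    x = start
    while True:
        neighbours = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        best = max(neighbours, key=score)
        if score(best) <= score(x):
            return x  # a local peak, possibly a low one
        x = best


def search_with_exploration(score, lo, hi, samples, rng):
    """Sample random start points (exploration), then hill climb from each one."""
    starts = [rng.randint(lo, hi) for _ in range(samples)]
    peaks = [hill_climb(score, s, lo, hi) for s in starts]
    return max(peaks, key=score)


def landscape(x):
    """Hypothetical bumpy landscape: a low peak at x=10 and a higher peak at x=70."""
    return max(20 - abs(x - 10), 50 - abs(x - 70), 0)


print(hill_climb(landscape, 0, 0, 100))  # climbs only the nearest peak
print(search_with_exploration(landscape, 0, 100, 50, random.Random(0)))
```

A climb that starts near the low peak stays there, while sampling many start points makes it likely that at least one climb begins on the slope of the higher peak.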

Page 24: Outline Of Confusion

FINDING GOOD PLACES TO EXPLORE

• How do we know where to go?

x

x xxx x

x x x

Page 25: Outline Of Confusion

THE DILEMMA

• Unfortunately, selecting good areas to hill climb is itself a very difficult problem, and depending on how good a guess is desired, the selection algorithm will be NPC+.

• Consequently, using search effectively to solve an NPC+ problem ends up introducing a new problem of equal or greater complexity (as predicted by Dembski’s “Search for a Search” paper).

• Consider the following solution set from which a search algorithm needs to select a new space to explore.

Page 26: Outline Of Confusion

EXAMPLE

• Which solution signifies a new area to investigate?

1 v v v v v v a f j k [ / B D d e f b ] / = \ Z g h a b ] /
2 v v v v v v a f j k g / B D d e f b ] / = \ Z g h = b ] /
3 v v v v v v a f c d e f B D d e f b ] / = \ Z g h a b ] /
4 v v v v v v j k j k g / B D d e / = \ H = \ Z g h = b ] Z
5 v v v v v v b / = \ Z ? d e f i a H c B D Z k [ ] b > d e
6 v v v v v v a f c j k f B D / = \ b ] / = \ D B D a b ] /
7 v v v v v v a f j k [ / ] / = \ f b ] / = \ Z g h a b ] /
8 v v v v v v a d e f g / B D d e f b ] / = \ Z g h = b ] /
9 v v v v v v i > > > > h D Z ? j k f d g h a b ] d Z Z Z i

Page 28: Outline Of Confusion

INTELLIGENT AGENTS TO THE RESCUE

Our superior pattern recognition

Page 29: Outline Of Confusion

WHY CAN WE IMPROVE ALGORITHMS?

• In Dr. Dembski’s “Search for a Search” he shows that search algorithms are incapable of finding a search target any better than a random search, without the insertion of external information.

• Furthermore, he shows that such information cannot come from another search algorithm. It can only come from a non-algorithmic source.

• Intelligent design theory posits that intelligent agents are capable of creating this information, and consequently capable of improving the capabilities of search and optimization algorithms.

Page 30: Outline Of Confusion

CAN WE IMPROVE ALGORITHMS?

????????

??????

Page 31: Outline Of Confusion

HUMAN VS ALGORITHM

Shows human and algorithmic performance on an NP-complete problem (the Travelling Salesman Problem). The points and the O(n)/O(n ln n) plots show human capability; the O(n^2) and greater plots show algorithmic capability.

Page 32: Outline Of Confusion

HUMAN VS ALGORITHM

Page 33: Outline Of Confusion

HOW HUMANS HELP SOLVE THE DILEMMA

• If humans are capable of adding information to the search process, then we can assist the search algorithm in exploring the problem space more effectively than is algorithmically possible.

• Algorithms have trouble searching because they lack a good, generic pattern detection ability. They can’t effectively detect the patterns in solutions that would lead them to better solutions. We humans, however, are known for our pattern detection, and can use our superior ability to help out the algorithm.

• Let’s take another look at the search process.

Page 35: Outline Of Confusion

EXAMPLE

• Which solution signifies a new area to investigate?

• Must be both very unlike other good solutions, while being highly ranked.

1 v v v v v v a f j k [ / B D d e f b ] / = \ Z g h a b ] /
2 v v v v v v a f j k g / B D d e f b ] / = \ Z g h = b ] /
3 v v v v v v a f c d e f B D d e f b ] / = \ Z g h a b ] /
4 v v v v v v j k j k g / B D d e / = \ H = \ Z g h = b ] Z
5 v v v v v v b / = \ Z ? d e f i a H c B D Z k [ ] b > d e
6 v v v v v v a f c j k f B D / = \ b ] / = \ D B D a b ] /
7 v v v v v v a f j k [ / ] / = \ f b ] / = \ Z g h a b ] /
8 v v v v v v a d e f g / B D d e f b ] / = \ Z g h = b ] /
9 v v v v v v i > > > > h D Z ? j k f d g h a b ] d Z Z Z i

This solution is most unlike the rest, while also being highly ranked.
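For comparison, here is one way an algorithm might attempt the same selection. It is a sketch under assumptions the slides do not state: "unlike" is measured as mean Hamming distance, "highly ranked" means being among the first `top_k` entries of a best-first list, and the population shown is made up rather than taken from the slide. The human participants judged this by eye instead.

```python
def hamming(a, b):
    """Count positions where two equal-length symbol sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def pick_exploration_target(ranked, top_k):
    """Among the top_k best-ranked solutions, return the index of the one
    with the largest mean Hamming distance to all other solutions."""
    def novelty(i):
        others = [s for j, s in enumerate(ranked) if j != i]
        return sum(hamming(ranked[i], s) for s in others) / len(others)

    return max(range(min(top_k, len(ranked))), key=novelty)


# Hypothetical population, best-ranked first (not the slide's data).
population = ["aabbcc", "aabbcd", "xxyycc", "aabbdd", "zzzzzz"]
print(pick_exploration_target(population, top_k=3))
```

Restricting the choice to the top few entries keeps the pick highly ranked, and the distance score favours the entry least like the rest. A fixed distance measure like this only captures one notion of difference, which is where the slides argue human pattern recognition could help.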

Page 37: Outline Of Confusion

THE EXPERIMENT

In which things kind of work

Page 38: Outline Of Confusion

HYPOTHESES

• Grand hypothesis: humans can improve any improvable search algorithm beyond mathematical limits

• Actual hypothesis: humans can improve a particular search algorithm in a particular domain

• Criterion for verification: a human-generated solution displaces the best solutions found by the computer, using fewer solution samples

Page 39: Outline Of Confusion

EXPERIMENT

• Problem: find primes that generated RSA key pair
• The fitness function has access to an original plain text and its cypher text.
• Metric: two objectives to be maximized
  • 1) similarity between original plain text and cypher text generated by a given set of primes
  • 2) similarity between original cypher text and its decryption generated by a given set of primes
• Algorithm: multi-objective genetic algorithm
• Human involvement: users of Amazon’s Mechanical Turk service will select a set of solutions for one iteration of GA optimization
• Method of comparison: best solution found in proportion to number of solutions checked by humans/algorithm.
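The slide names a multi-objective genetic algorithm but does not say how it compares solutions on two objectives. A common choice is Pareto dominance, sketched below as an assumption rather than as the method actually used; the function names and the example scores are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a is at least as good as b on every objective
    and strictly better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (objective 1, objective 2) scores for four candidate solutions.
scores = [(10, 40), (30, 20), (25, 25), (5, 5)]
print(pareto_front(scores))
```

Under this scheme several solutions can be "best" at once, since each trades one objective against the other; only a solution that beats another on both objectives displaces it.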

Page 40: Outline Of Confusion

SCREENSHOT

Page 41: Outline Of Confusion

SCREENSHOT EXPLANATION

• Stars represent relative valuation of solution. 5 stars means one of best solutions found so far.

• Solution is really just a bit string (universal problem representation). However, to make patterns more discernable and more appealing to the eye, substrings are mapped to images.

• Checkbox selected by user to signify solution set for algorithm exploration.

Page 42: Outline Of Confusion

AMAZON TURK RESULTS

Plot axes: Objective #1 and Objective #2

• Best solution found by both the algorithm and an Amazon Turk user, with objective values of 45 and 121.

• Optimum: there exists an optimum solution with objective values of 64 and 236.

Page 43: Outline Of Confusion

CONCLUSION

• Actual hypothesis not verified. Humans (may have) contributed to, but did not improve, the search process.
  • Solution found did not displace solutions found by algorithm, since exact same solution was found by algorithm. Therefore, no human generated improvement observed.
  • However, human finding same solution shows definite contribution.
• Experiment shows slight promise. However, Amazon Turk users are known to script their responses. So, results may be output of a script, not a human.
• Many things can be improved in algorithm, GUI, data collection and mathematical analysis.

Page 44: Outline Of Confusion

?

Page 45: Outline Of Confusion

IMPROVEMENTS TO EXPERIMENT

• Add Captcha to submission form so Turkers cannot script form submission.

• More descriptive user interface. Describe experiment? Turn into a game? Other suggestions?

• Better comparison between human and algorithm?

Page 46: Outline Of Confusion

WHY IS THERE NO PROBLEM INFORMATION?

• This representation is all the search algorithm sees. It knows nothing about the nature of the problem.

• Consequently, to perform a fair comparison, the human user cannot be given any additional problem domain information.

1 v v v v v v a f j k [ / B D d e f b ] / = \ Z g h a b ] /
2 v v v v v v a f j k g / B D d e f b ] / = \ Z g h = b ] /
3 v v v v v v a f c d e f B D d e f b ] / = \ Z g h a b ] /
4 v v v v v v j k j k g / B D d e / = \ H = \ Z g h = b ] Z
5 v v v v v v b / = \ Z ? d e f i a H c B D Z k [ ] b > d e
6 v v v v v v a f c j k f B D / = \ b ] / = \ D B D a b ] /
7 v v v v v v a f j k [ / ] / = \ f b ] / = \ Z g h a b ] /
8 v v v v v v a d e f g / B D d e f b ] / = \ Z g h = b ] /
9 v v v v v v i > > > > h D Z ? j k f d g h a b ] d Z Z Z i

Page 47: Outline Of Confusion

WHY COMPARE ON THE NUMBER OF SOLUTIONS EVALUATED?

• Both human users and algorithms are allowed to do whatever they want with the solutions that have been found so far. Consequently, the number of solutions evaluated is the upper bound on information used by both parties to discover new search areas.
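Counting evaluations is straightforward to instrument. The sketch below is hypothetical and not the experiment's actual harness: `CountingFitness`, `best_after_budget`, and the bit-counting fitness are illustrative names chosen here. It wraps a fitness function so that a human-guided run and an algorithm-only run can be compared at the same evaluation budget.

```python
class CountingFitness:
    """Wrap a fitness function and count how many solutions it has scored."""

    def __init__(self, fitness):
        self.fitness = fitness
        self.evaluations = 0

    def __call__(self, solution):
        self.evaluations += 1
        return self.fitness(solution)


def best_after_budget(candidates, fitness, budget):
    """Score candidates in order until the evaluation budget is spent,
    returning the best one seen and the number of evaluations used."""
    counted = CountingFitness(fitness)
    best, best_score = None, None
    for candidate in candidates:
        if counted.evaluations >= budget:
            break
        score = counted(candidate)
        if best_score is None or score > best_score:
            best, best_score = candidate, score
    return best, counted.evaluations


def ones(bits):
    """Hypothetical fitness: number of 1 bits in a bit string."""
    return bits.count("1")


print(best_after_budget(["0001", "0111", "1111"], ones, budget=2))
```

Whichever party proposes the candidates, the counter only advances when a solution is actually scored, so both sides are charged for the same resource.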