Outline Of Confusion

• Intelligent design primer

• Collecting CSI from people for search algorithms
  • Existing users of technique

• Search algorithm primer
  • Why use search algorithms in the first place?
  • How search algorithms work
  • The search algorithm dilemma

• Intelligent Agents to the rescue
  • How we help solve the dilemma through superior pattern recognition
  • Empirical evidence of human capability

• The Experiment
  • Hypothesis
  • Experiment design
  • Results

• Questions

Guide for the Perplexed

OUTLINE OF CONFUSION

WHAT IS ID? IS IT JUST GOD OF THE GAPS?

INTELLIGENT DESIGN PRIMER

• The fundamental question of Intelligent Design Theory:
  • How do we know intelligent design when we see it?

• The fundamental claim of Intelligent Design Theory:
  • Only intelligent agents create information.

FUNDAMENTALS

IRREDUCIBLE COMPLEXITY

EXPLANATORY FILTER

COMPLEX SPECIFIED INFORMATION

Is this fractal complex specified information?

• Creating irreducible complexity creates complex specified information

WHAT DOES IT MEAN TO CREATE INFORMATION?

HOW IS ID DIFFERENT THAN ALL OF MODERN SCIENCE?

ID

WHY IS ID FITTER THAN DARWINISM?

ID focuses on the creation of information rather than on the information itself.

Would you rather own…

THIS?

OR THIS?







• Questions



COLLECTING ACTIVE INFORMATION FOR SEARCH & OPTIMIZATION


COLLECTING CSI FROM PEOPLE

• According to Intelligent Design theory, particularly in “Search for a Search”, intelligent agents such as people are capable of improving search algorithm performance beyond mathematical bounds.

• Goal: Create a generalized interface for people to contribute to an algorithmic search and optimization process, thus demonstrating human supra-computational capability.



COMMERCIAL CSI COLLECTION

• Mechanical Turk (http://www.mturk.com/)
  • Marketplace of simple web-based jobs for low-skill work

• reCAPTCHA (http://www.google.com/recaptcha)
  • Uses CAPTCHAs to correct OCR text transcription

• Foldit (http://fold.it/portal/)
  • Players fold proteins alongside an algorithm, achieving results superior to the protein-folding algorithm alone

• Google Image Labeler (http://images.google.com/imagelabeler/)
  • Players compete to label images








• Questions




WHAT ROBOTS CAN DO

SEARCH ALGORITHM PRIMER

WHEN ARE SEARCH ALGORITHMS USED?

• Many problems can be solved by straightforward algorithms in time polynomial in the problem size. These problems are generally tractable to solve exactly on a computer, though significant computing power and memory may be necessary.

• However, there is a much larger group of problems that, as far as we know, cannot be solved in polynomial time (NPC+, meaning NP-complete and harder). For these problems the best we can do is a best-effort attempt to get as close to the optimum as possible within our computation time and space limits.

• Many heuristic and approximation algorithms are used for NPC+ problems, and this is where search algorithms come in. Since we don’t know how to find the optimum solution directly, we have to search around in the problem space.

HOW COMPLEXITY CLASSES SCALE

BLUE = LINEAR, GREEN = POLYNOMIAL, RED = EXPONENTIAL
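The plot itself did not survive in this transcript, so here is a minimal sketch of the same comparison in numbers. The specific functions (n, n^2 and 2^n) and the problem sizes are illustrative assumptions, not values taken from the original slide.

```python
def growth_table(sizes=(10, 20, 40, 80)):
    """Return rows of (n, linear, polynomial, exponential) step counts.

    n, n**2 and 2**n stand in for the three curves; they are
    illustrative choices, not the functions plotted on the slide.
    """
    return [(n, n, n ** 2, 2 ** n) for n in sizes]


for n, linear, poly, expo in growth_table():
    print(f"n={n:>3}  linear={linear:>3}  n^2={poly:>5}  2^n={expo}")
```

Doubling n doubles the linear count and quadruples the quadratic one, but it squares the exponential count, which is why exact methods stop being practical so quickly.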

SOME EXAMPLES OF NPC+ PROBLEMS

• Finding binding sites on proteins

• Delivery route planning

• Calculating cheap airline trips

• Stock market portfolio selection

• Packing your belongings for a move

• Making the Internet fast

HOW SEARCH ALGORITHMS WORK

• Search is a process of hill climbing: it uses information in previously found solutions to find even better solutions. One well-known example is the Newton-Raphson method of finding square roots.

• The problem is that an uneven search landscape can cause a search to become stuck on low-lying peaks and crags.

• To get out of these traps, the search algorithm needs an element of exploration. Exploration consists of sampling areas of the landscape and hill climbing in promising sections.
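The three points above can be sketched in a few lines. This is a minimal illustration, assuming a one-dimensional landscape; the function `bumpy`, the step size and the number of restarts are all made up for the example. Plain hill climbing gets stuck on the low peak, while sampling several start points (the exploration step) reaches the high one.

```python
import random


def hill_climb(f, x, step=0.1, max_iters=1000):
    """Greedy local search: move to the better neighbour until none improves."""
    for _ in range(max_iters):
        best = max((x - step, x + step), key=f)
        if f(best) <= f(x):
            return x  # local peak: no neighbour is better
        x = best
    return x


def restart_search(f, low, high, restarts=20, seed=0):
    """Exploration: sample random start points, hill climb from each, keep the best."""
    rng = random.Random(seed)
    starts = [rng.uniform(low, high) for _ in range(restarts)]
    return max((hill_climb(f, s) for s in starts), key=f)


def bumpy(x):
    """Toy landscape: a low peak (height 1) at x = -2, a high peak (height 2) at x = 3."""
    return max(1 - (x + 2) ** 2, 2 - (x - 3) ** 2)


stuck = hill_climb(bumpy, -4.0)           # climbs the nearest peak only
explored = restart_search(bumpy, -5, 6)   # samples the landscape first
print(round(stuck, 1), round(explored, 1))
```

Random restarts are only one simple form of exploration; the hard part, as the next slides argue, is choosing where to restart.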


FINDING GOOD PLACES TO EXPLORE

• How do we know where to go?


THE DILEMMA

• Unfortunately, selecting good areas to hill climb is itself a very difficult problem, and depending on how good a guess is desired, the selection algorithm will itself be NPC+.

• Consequently, using search effectively to solve an NPC+ problem introduces a new problem of equal or greater complexity (as predicted by Dembski’s “Search for a Search” paper).

• Consider the following solution set from which a search algorithm needs to select a new space to explore.

EXAMPLE

• Which solution signifies a new area to investigate?

1 v v v v v v a f j k [ / B D d e f b ] / = \ Z g h a b ] /
2 v v v v v v a f j k g / B D d e f b ] / = \ Z g h = b ] /
3 v v v v v v a f c d e f B D d e f b ] / = \ Z g h a b ] /
4 v v v v v v j k j k g / B D d e / = \ H = \ Z g h = b ] Z
5 v v v v v v b / = \ Z ? d e f i a H c B D Z k [ ] b > d e
6 v v v v v v a f c j k f B D / = \ b ] / = \ D B D a b ] /
7 v v v v v v a f j k [ / ] / = \ f b ] / = \ Z g h a b ] /
8 v v v v v v a d e f g / B D d e f b ] / = \ Z g h = b ] /
9 v v v v v v i > > > > h D Z ? j k f d g h a b ] d Z Z Z i







• Questions



OUR SUPERIOR PATTERN RECOGNITION

INTELLIGENT AGENTS TO THE RESCUE

WHY CAN WE IMPROVE ALGORITHMS?

• In “Search for a Search”, Dr. Dembski shows that search algorithms cannot find a search target any better than a random search without the insertion of external information.

• Furthermore, he shows that such information cannot come from another search algorithm. It can only come from a non-algorithmic source.

• Intelligent design theory posits that intelligent agents are capable of creating this information, and can consequently improve the performance of search and optimization algorithms.


HUMAN VS ALGORITHM

Human and algorithmic performance on an NP-complete problem (the Travelling Salesman Problem). The points and the O(n) and O(n ln n) curves show human capability; the O(n^2) and greater curves show algorithmic capability.


HOW HUMANS HELP SOLVE THE DILEMMA

• If humans are capable of adding information to the search process, then we can assist the search algorithm in exploring the problem space more effectively than algorithmically possible.

• Algorithms have trouble searching because they lack a good, generic pattern detection ability: they can’t effectively detect patterns in the solutions that would lead them to better solutions. We humans, however, are known for our pattern detection, and can use that superior ability to help the algorithm.

• Let’s take another look at the search process.

EXAMPLE


• The solution to pick must be both highly ranked and very unlike the other good solutions.



This solution is most unlike the rest, while also being highly ranked.
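The selection rule described here (highly ranked, yet unlike the other good solutions) can be written down as a rough sketch. Everything below is an assumption made for illustration: the function names, the use of Hamming distance as the measure of "unlike", the `min_rank` cutoff, and the toy data, which is not the solution set shown on the slides.

```python
def hamming(a, b):
    """Number of positions at which two equal-length symbol strings differ."""
    return sum(x != y for x, y in zip(a, b))


def pick_exploration_seed(solutions, min_rank):
    """Pick the most novel solution among the highly ranked ones.

    solutions: list of (symbols, rank) pairs.
    Novelty is the mean Hamming distance to every other solution.
    """
    def novelty(i):
        symbols = solutions[i][0]
        others = [s for j, (s, _) in enumerate(solutions) if j != i]
        return sum(hamming(symbols, s) for s in others) / len(others)

    candidates = [i for i, (_, rank) in enumerate(solutions) if rank >= min_rank]
    return solutions[max(candidates, key=novelty)]


# Toy data: three similar good solutions, one distinct good one, one distinct poor one.
toy = [("aabbcc", 5), ("aabbcd", 5), ("aabbdd", 4), ("zzyyxx", 5), ("qqqqqq", 1)]
print(pick_exploration_seed(toy, min_rank=4))  # ('zzyyxx', 5)
```

A fixed distance measure like this is exactly the kind of generic pattern detector the talk argues is insufficient in general; the sketch only makes the selection criterion concrete.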







• Questions



IN WHICH THINGS KIND OF WORK

THE EXPERIMENT

HYPOTHESES

• Grand hypothesis: humans can improve any improvable search algorithm beyond mathematical limits

• Actual hypothesis: humans can improve a particular search algorithm in a particular domain

• Criterion for verification: a human-generated solution displaces the best solutions found by the computer, using fewer solution samples

EXPERIMENT

• Problem: find the primes that generated an RSA key pair

• The fitness function has access to an original plaintext and its ciphertext.

• Metric: two objectives to be maximized
  • 1) similarity between the original plaintext and the ciphertext generated by a given set of primes
  • 2) similarity between the original ciphertext and its decryption generated by a given set of primes

• Algorithm: multi-objective genetic algorithm

• Human involvement: users of Amazon’s Mechanical Turk service select a set of solutions for one iteration of GA optimization

• Method of comparison: best solution found in proportion to the number of solutions checked by humans/algorithm.
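A toy version of such a two-objective fitness function might look like the sketch below. It is one plausible reading of the metric, not the experiment's actual code: it assumes the candidate primes are used to re-encrypt the known plaintext and to decrypt the known ciphertext, and that "similarity" means the number of matching bits. The exponent `E`, the 16-bit width, the tiny primes and all function names are hypothetical. It needs Python 3.8 or later for `pow(E, -1, phi)`.

```python
from math import gcd

E = 17      # public exponent (hypothetical choice)
WIDTH = 16  # bits compared; assumes both primes are below 256


def bit_similarity(a, b, width=WIDTH):
    """Count matching bit positions between two integers below 2**width."""
    return width - bin(a ^ b).count("1")


def fitness(candidate, plain, cipher):
    """Score a candidate prime pair on two objectives, both to be maximized."""
    p, q = candidate
    n, phi = p * q, (p - 1) * (q - 1)
    if gcd(E, phi) != 1:
        return (0, 0)                # no valid private key for this pair
    d = pow(E, -1, phi)              # private exponent (modular inverse)
    encrypted = pow(plain, E, n)     # candidate's ciphertext for the known plaintext
    decrypted = pow(cipher, d, n)    # candidate's plaintext for the known ciphertext
    return (bit_similarity(encrypted, cipher), bit_similarity(decrypted, plain))


TRUE_PRIMES = (61, 53)
PLAIN = 65
CIPHER = pow(PLAIN, E, 61 * 53)

print(fitness(TRUE_PRIMES, PLAIN, CIPHER))  # the true primes score (16, 16)
print(fitness((59, 53), PLAIN, CIPHER))     # a wrong pair
```

Only the true prime pair is guaranteed the maximum score on both objectives. A wrong pair generally scores lower, although a chance match on one objective is not ruled out.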

SCREENSHOT

SCREENSHOT EXPLANATION

Stars represent the relative valuation of a solution. 5 stars means one of the best solutions found so far.

A solution is really just a bit string (a universal problem representation). However, to make patterns more discernible and more appealing to the eye, substrings are mapped to images.

A checkbox is selected by the user to mark the solution set for algorithm exploration.
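The substring-to-image mapping can be illustrated with a short sketch. The 4-bit chunk size and the 16-character symbol table are assumptions made for the example; the mapping actually used in the experiment is not given in the slides.

```python
# Hypothetical symbol table: one glyph per 4-bit value (16 entries).
SYMBOLS = "abcdefghijkZBDH="


def render(bits, chunk=4):
    """Map each fixed-width substring of a bit string to a display symbol."""
    if len(bits) % chunk:
        raise ValueError("bit string length must be a multiple of the chunk size")
    return " ".join(
        SYMBOLS[int(bits[i:i + chunk], 2)] for i in range(0, len(bits), chunk)
    )


print(render("000000011111"))  # a b =
```

Because the same substring always maps to the same glyph, repeated structure in the bit string shows up as repeated symbols, which is what a human viewer is meant to notice.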

AMAZON TURK RESULTS

Best solution found by both the algorithm and an Amazon Turk user: objective values of 45 and 121.

There exists an optimum solution with objective values of 64 and 236.

Plot axes: Objective #1 and Objective #2
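With two objectives, one common way to say that a solution "displaces" another is Pareto dominance. The sketch below assumes that reading, with both objectives maximized as the experiment states. The slides do not say which comparison rule the genetic algorithm actually used.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


found = (45, 121)    # best solution reported for both algorithm and Turk user
optimum = (64, 236)  # known optimum reported on the slide

print(dominates(optimum, found))  # True: the optimum would displace the found solution
print(dominates(found, found))    # False: an identical solution displaces nothing
```

Under this reading, a human who finds exactly the solution the algorithm found cannot count as an improvement, since an identical solution never dominates.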

CONCLUSION

• Actual hypothesis not verified. Humans (may have) contributed to, but did not improve, the search process.
  • The solution found did not displace solutions found by the algorithm, since the exact same solution was found by the algorithm. Therefore, no human-generated improvement was observed.
  • However, a human finding the same solution shows a definite contribution.

• The experiment shows slight promise. However, Amazon Turk users are known to script their responses, so the results may be the output of a script, not a human.

• Many things can be improved in the algorithm, GUI, data collection and mathematical analysis.

IMPROVEMENTS TO EXPERIMENT

• Add a CAPTCHA to the submission form so Turkers cannot script form submission.

• More descriptive user interface. Describe experiment? Turn into a game? Other suggestions?

• Better comparison between human and algorithm?

WHY IS THERE NO PROBLEM INFORMATION?

• This representation is all the search algorithm sees. It knows nothing about the nature of the problem.

• Consequently, to perform a fair comparison, the human user cannot be given any additional problem domain information.


WHY COMPARE ON THE NUMBER OF SOLUTIONS EVALUATED?

• Both human users and algorithms are allowed to do whatever they want with the solutions that have been found so far. Consequently, the number of solutions evaluated is the upper bound on information used by both parties to discover new search areas.
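One simple way to enforce this comparison is to wrap the fitness function in a counter, so that human-selected and algorithm-selected runs are charged for every solution they evaluate. The class below is a hypothetical sketch, and the bit-counting fitness is a stand-in for the real objective.

```python
class CountingFitness:
    """Wrap a fitness function and count how many solutions it has scored."""

    def __init__(self, fitness):
        self.fitness = fitness
        self.evaluations = 0

    def __call__(self, solution):
        self.evaluations += 1
        return self.fitness(solution)


# Stand-in objective: number of 1 bits in the solution's bit string.
score = CountingFitness(lambda bits: bits.count("1"))
best = max(["1011", "0000", "1111"], key=score)
print(best, score.evaluations)  # 1111 3
```

Comparing the best score reached against `evaluations` gives the "best solution found per number of solutions checked" measure named in the experiment design.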
