
  • Lecture 9: AI and the Brain

    Rachel Greenstadt, November 24, 2008

    From the New York Times:

    Robots have fared best, Dr. Teller said, in highly structured environments made just for them, and in the company of other robots. “If you took a welding bot from an assembly line and put it in your local body shop, it would end up killing somebody in about 30 seconds,” he said. “It would weld a person to a wall.”

    The reporter seems to imply this is a function of insufficient technological sophistication. But as anyone who's ever read sci-fi knows, this is a function of robots wanting to slaughter human beings.

    - Ezra Klein

  • Reminders

    • Bayesian learning exercise due NOW
    • Presentations and papers due next week
    • Final here 12/8, 6 pm

  • Grades/Finals

    Final practice problems (I suggest you also use the class exercises for practice). The final will cover material from the whole course, but will focus on material covered after the midterm. There will again be a few questions related to the readings. To recap, since the midterm we've covered logic, planning, and learning (decision trees, Bayesian learning, reinforcement learning, and neural networks). Practice problems: [7.2, 7.10, 7.12, 8.2, 8.6, 11.9, 11.11, 11.13, 17.3, 17.8, 18.1, 18.4, 18.8, 18.10, 20.3, 20.4, 20.11, 20.19, 21.5]

    The best thing you can do for yourself is to work on the project. If you do an awesome project, you will get a B or better in the course.

    If you are scared of exams, and you study hard by doing all the problems, you can turn them in as a mitigating circumstance/extra credit for effort.

  • Topics

    • Reverse engineering the brain
    • Neural networks
    • Strong AI debate

  • What is AI?

    • Thinking like a human
    • Thinking rationally
    • Acting like a human
    • Acting rationally

    Most of CS510: acting rationally. Today: thinking like a human.

  • Brains

    10^11 neurons of > 20 types, 10^14 synapses, 1 ms–10 ms cycle time
    Signals are noisy “spike trains” of electrical potential

    [Figure: schematic of a neuron showing the cell body (soma), nucleus, dendrites, axon, axonal arborization, and synapses with the axon from another cell (AIMA Chapter 20, Section 5)]

  • How to translate into computer terms?

    • Kurzweil, Moravec, Merkle
    • Number of synapses and firing rate ≈ 10^16 (a back-of-the-envelope sketch follows this list)
    • Extrapolation from the retina ≈ 10^12–10^14
    • Energy to propagate nerve impulses ≈ 10^15
    • Thagard (molecules, hormones, etc. make neurons more complex) ≈ 10^23

    • Quantum states argument (Penrose, etc)
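    A back-of-the-envelope sketch (my own, not from the slides) of the synapses-times-firing-rate estimate, using the slide's ~10^14 synapses and an assumed average firing rate on the order of 100 Hz:

        # Rough estimate of brain "operations" per second from synapse count and firing rate.
        synapses = 1e14            # synapse count from the Brains slide
        firing_rate_hz = 1e2       # assumed average firing rate (~100 Hz); not from the slides
        ops_per_second = synapses * firing_rate_hz
        print(f"{ops_per_second:.0e} synaptic events per second")  # ~1e+16, matching the slide's 10^16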

  • Intel says it can maintain Moore’s law until 2029...

  • IBM simulates cat cortex (100x slower)

    The simulation involves 1 billion spiking neurons and 10 trillion individual learning synapses, and was performed on an IBM Blue Gene/P supercomputer with 147,456 processors and 144 TB of main memory. (Nov 18, 2009)

  • [F]or several decades the computing power found in advanced Artificial Intelligence and Robotics systems has been stuck at insect brain power of 1 MIPS. While computer power per dollar fell [should be: rose] rapidly during this period, the money available fell just as fast. The earliest days of AI, in the mid 1960s, were fueled by lavish post-Sputnik defense funding, which gave access to $10,000,000 supercomputers of the time. In the post Vietnam war days of the 1970s, funding declined and only $1,000,000 machines were available. By the early 1980s, AI research had to settle for $100,000 minicomputers. In the late 1980s, the available machines were $10,000 workstations. By the 1990s, much work was done on personal computers costing only a few thousand dollars. Since then AI and robot brain power has risen with improvements in computer efficiency. By 1993 personal computers provided 10 MIPS, by 1995 it was 30 MIPS, and in 1997 it is over 100 MIPS. Suddenly machines are reading text, recognizing speech, and robots are driving themselves cross country. (Moravec 1997)

  • Artificial Neural Networks: History

    • Belief that it was necessary to model underlying brain architecture for AI

    • In contrast to encoded symbolic knowledge (best represented by expert systems)

    • Hebb - learning is altering strength of synaptic connections

  • Neural Networks

    • Attempt to build a computation system based on the parallel architecture of brains

    • Characteristics:
      • Many simple processing elements
      • Many connections
      • Simple messages
      • Adaptive interaction

  • Benefits of NN

    • User friendly (well, reasonably)
    • Non-linear
    • Noise tolerant
    • Many applications
      • Credit fraud/assignment
      • Robotic control

  • “Neurons”

    • Inputs (either from outside or other “neurons”)

    • Weighted connections that correspond to synaptic efficiency

    • Threshold values to weight the inputs
    • Passed through an activation function to determine output

  • Example Unit

    • Binary input/output

    • Rule

    • 1 if w0*I0 + w1*I1 + wb > 0
    • 0 if w0*I0 + w1*I1 + wb ≤ 0
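    As a concrete sketch of the rule above (my own illustration, not from the slides), a minimal two-input threshold unit in Python; the function name threshold_unit and the example weights are just for illustration:

        def threshold_unit(i0, i1, w0, w1, wb):
            """Binary threshold unit: outputs 1 when w0*I0 + w1*I1 + wb > 0, else 0."""
            activation = w0 * i0 + w1 * i1 + wb
            return 1 if activation > 0 else 0

        # With w0 = w1 = 1 and wb = -1.5, the unit computes logical AND of its binary inputs.
        for i0 in (0, 1):
            for i1 in (0, 1):
                print(i0, i1, threshold_unit(i0, i1, 1.0, 1.0, -1.5))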

  • Activation functions

    [Figure: plots of the activation g(in_i) versus the weighted input in_i for the two functions (a) and (b) described below, each saturating at +1]

    (a) is a step function or threshold function

    (b) is a sigmoid function 1/(1 + e^(-x))

    Changing the bias weight W0,i moves the threshold location

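    A small sketch of the two activation functions in Python (my own; the names g_step and g_sigmoid are hypothetical):

        import math

        def g_step(in_i):
            # (a) step / threshold function: 1 once the weighted input crosses 0
            return 1.0 if in_i >= 0 else 0.0

        def g_sigmoid(in_i):
            # (b) sigmoid 1/(1 + e^(-x)): smooth and differentiable, which back-propagation needs
            return 1.0 / (1.0 + math.exp(-in_i))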

  • How to Adapt?

    • Perceptron Learning Rule
      • Change the weight by an amount proportional to the difference between the desired output and the actual output
      • As an equation: ΔWi = η * (D - Y) * Ii, where D is the desired output and Y is the actual output
      • Stop when the weights converge (see the sketch below)
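    A minimal sketch of the perceptron learning rule in Python (my own illustration, assuming binary inputs and outputs, a bias input fixed at 1, and a learning rate eta):

        def train_perceptron(examples, eta=0.1, epochs=100):
            """examples: list of ((i0, i1), desired_output) pairs."""
            w = [0.0, 0.0, 0.0]  # w0, w1, and the bias weight wb
            for _ in range(epochs):
                converged = True
                for (i0, i1), d in examples:
                    y = 1 if w[0] * i0 + w[1] * i1 + w[2] > 0 else 0
                    error = d - y
                    if error != 0:
                        converged = False
                        # Perceptron rule: delta W_i = eta * (D - Y) * I_i
                        w[0] += eta * error * i0
                        w[1] += eta * error * i1
                        w[2] += eta * error  # bias input is 1
                if converged:
                    break
            return w

        # Learns AND (linearly separable); on XOR it would never converge.
        print(train_perceptron([((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]))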

  • Limits of Perceptrons

    • Minsky and Papert 1969
    • Fails on “linearly inseparable” instances
      • XOR
    • Linearly separable: the pattern space can be separated by a single hyperplane

    • SVMs also have this limitation, but they can often transform the pattern space by looking at higher dimensions (the kernel trick)
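    As a quick illustration of linear inseparability (my own sketch; a coarse grid search over weights, not a proof), no single threshold unit of the form above reproduces XOR, while AND is easy to find:

        import itertools

        def separable(truth_table, step=0.25, lo=-2.0, hi=2.0):
            """True if some unit (w0*I0 + w1*I1 + wb > 0) on the weight grid reproduces the table."""
            grid = [lo + step * k for k in range(int((hi - lo) / step) + 1)]
            for w0, w1, wb in itertools.product(grid, repeat=3):
                if all((1 if w0 * i0 + w1 * i1 + wb > 0 else 0) == out
                       for (i0, i1), out in truth_table.items()):
                    return True
            return False

        AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
        XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
        print(separable(AND))  # True
        print(separable(XOR))  # False: no single hyperplane separates XOR's pattern space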

  • Perceptrons vs Decision Trees

  • Multilayer Perceptrons (MLP)

  • Back Propagation

    • Start with a set of known examples (supervised approach)

    • Assign random initial weights
    • Run the examples through and calculate the mean-squared error
    • Propagate the error by making small changes to the weights at each level

    • Lather, rinse, repeat
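    Tying the pieces together (my own sketch, not from the slides): a tiny 2-2-1 multilayer perceptron with sigmoid units trained on XOR by back-propagation. The names (hidden, out, eta) and constants are mine; with an unlucky random initialization it can stall in a local minimum, but it usually learns XOR within a few thousand passes:

        import math, random

        random.seed(0)

        def sigmoid(x):
            return 1.0 / (1.0 + math.exp(-x))

        # Weights: hidden[j] = [w_j0, w_j1, bias], out = [v_0, v_1, bias]
        hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
        out = [random.uniform(-1, 1) for _ in range(3)]
        eta = 0.5
        data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

        for epoch in range(10000):
            for (i0, i1), target in data:
                # Forward pass
                h = [sigmoid(w[0] * i0 + w[1] * i1 + w[2]) for w in hidden]
                y = sigmoid(out[0] * h[0] + out[1] * h[1] + out[2])
                # Backward pass: output delta, then propagate it to the hidden layer
                delta_y = (target - y) * y * (1 - y)
                delta_h = [delta_y * out[j] * h[j] * (1 - h[j]) for j in range(2)]
                # Small weight changes proportional to the propagated error
                out[0] += eta * delta_y * h[0]
                out[1] += eta * delta_y * h[1]
                out[2] += eta * delta_y
                for j in range(2):
                    hidden[j][0] += eta * delta_h[j] * i0
                    hidden[j][1] += eta * delta_h[j] * i1
                    hidden[j][2] += eta * delta_h[j]

        for (i0, i1), target in data:
            h = [sigmoid(w[0] * i0 + w[1] * i1 + w[2]) for w in hidden]
            y = sigmoid(out[0] * h[0] + out[1] * h[1] + out[2])
            print((i0, i1), "->", round(y, 2), "target", target)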