Chapter 9: Computers

CHAPTER 9 LANGUAGE PROCESSING: HUMANS AND COMPUTERS. PowerPoint by Don L. F. Nilsen to accompany An Introduction to Language (8th or 9th edition, 2007/2011) by Victoria Fromkin, Robert Rodman, and Nina Hyams

Transcript of Chapter 9: Computers

Page 1: Chapter 9: Computers

CHAPTER 9 LANGUAGE PROCESSING: HUMANS AND COMPUTERS

PowerPoint by Don L. F. Nilsen to accompany

An Introduction to Language (8th or 9th edition, 2007/2011) by Victoria Fromkin, Robert Rodman, and Nina Hyams

Page 2: Chapter 9: Computers

BOTTOM-UP AND TOP-DOWN PROCESSING

Bottom-up processing relates to decoding. You start with the actual sounds, letters, morphemes, etc. and figure out the words, phrases, clauses, sentences, paragraphs, etc.

Top-down processing is based on reasoning. You make a generalization and see how well the sounds, letters, morphemes, etc. support your generalization.

(Fromkin Rodman Hyams [2007] 369)

Page 3: Chapter 9: Computers

Top-down reasoning is powerful, but it can be dangerous if it is not accompanied by bottom-up reasoning.

For example, Otto Jespersen assumed that men were better thinkers than women.

He conducted an experiment in which men and women read a story and were given a quiz.

Page 4: Chapter 9: Computers

The women responded more quickly and more accurately than the men, which was not what Jespersen had expected.

So he concluded that women’s minds have “vacant chambers” that men’s minds don’t have.

This allowed Jespersen to account for his evidence while at the same time not disproving his original hypothesis that men were better thinkers than women.

Page 5: Chapter 9: Computers

COMPUTER WORDS AND METAPHORS

COMPUTER WORDS: bits, bytes, code police, cyberspace, future shock, hackers, hard copy, menu, third wave, user-friendly

COMPUTER METAPHORS: 42, bug, cookies, GIGO, great runes, heavy wizardry, Lotus Software, Melvyl (California Library System), a sagan, snail mail, Sorcerer’s Apprentice mode, Trojan Horse, UTSL, Vulcan nerve pinch, web, YABA compatible

Page 6: Chapter 9: Computers

CONCORDANCE

A concordance checks for word frequency and word associations.

Word associations are determined by providing a window of three words (more or less) on each side of the targeted word.

(Fromkin Rodman Hyams [2007] 379)
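The two jobs of a concordance described above can be sketched in a few lines of Python. This is only a minimal illustration, not the program discussed in the textbook: the function names, the simple tokenizer, and the sample sentence are assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(text):
    """Count how often each word occurs."""
    return Counter(tokenize(text))


def concordance(text, target, window=3):
    """Return (left context, target, right context) for each occurrence,
    using a window of `window` words on each side of the target."""
    tokens = tokenize(text)
    lines = []
    for i, token in enumerate(tokens):
        if token == target:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            lines.append((left, token, right))
    return lines


sample = "This is the dog that worried the cat that killed the rat that ate the malt."
frequencies = word_frequencies(sample)
cat_lines = concordance(sample, "cat")
```

Changing `window` widens or narrows the context, which is the "more or less" in the three-word window mentioned above.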

Page 7: Chapter 9: Computers

ELIZA

ELIZA is a program designed to pass the Turing test. If a person communicating with ELIZA cannot tell whether or not ELIZA is a human, then ELIZA passes the Turing test.

ELIZA claims to be a psychiatrist talking to a patient. ELIZA, like a psychiatrist, is attempting to get rather than give information:

Page 8: Chapter 9: Computers

PATIENT: Men are all alike.

ELIZA: In what way?

PATIENT: They’re always bugging me about something or other.

ELIZA: Can you think of a specific example?

PATIENT: Well, my boyfriend made me come here.

ELIZA: Your boyfriend made you come here?

PATIENT: He says I’m depressed much of the time.

ELIZA: I am sorry to hear that you are depressed.

(Fromkin Rodman Hyams [2007] 389)

Page 9: Chapter 9: Computers

Victor Raskin fooled ELIZA by saying, “I don’t like my mother and I don’t want to talk about her.”

ELIZA responded, “Tell me more about your mother.”

(Nilsen Nilsen & Combs 199)
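The exchanges above are consistent with simple keyword matching and pronoun swapping, which would also explain why the word “mother” triggers a reply regardless of what the speaker asked for. The sketch below is a hypothetical reconstruction for illustration only, not the original ELIZA code: the rules, their order, the `REFLECTIONS` table, and the default reply are all assumptions.

```python
import re

# Pronoun swaps used when echoing the speaker's words back (assumed table).
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are"}

# (pattern, reply template) pairs, tried in order; the first match wins.
RULES = [
    (re.compile(r"\bmother\b", re.I), "Tell me more about your mother."),
    (re.compile(r"\bI(?:'m| am) (depressed|sad|unhappy)\b", re.I),
     "I am sorry to hear that you are {0}."),
    (re.compile(r"\balike\b", re.I), "In what way?"),
    (re.compile(r"\balways\b", re.I), "Can you think of a specific example?"),
    (re.compile(r"\bmy (.+?)[.!?]*$", re.I), "Your {0}?"),
]


def reflect(fragment):
    """Swap first-person words for second-person ones, word by word."""
    return " ".join(REFLECTIONS.get(word.lower(), word) for word in fragment.split())


def respond(utterance):
    """Return the reply of the first rule whose keyword pattern matches."""
    text = utterance.replace("\u2019", "'")  # normalize curly apostrophes
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(*(reflect(group) for group in match.groups()))
    return "Please go on."  # assumed fallback when no keyword is found
```

Because the program reacts to keywords rather than meaning, `respond("I don't like my mother and I don't want to talk about her.")` still asks about the mother.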

Page 10: Chapter 9: Computers

CIRCUIT FIX-IT SHOP

• Twenty-five years after ELIZA, a program named “Circuit Fix-It Shop” was developed:

• COMPUTER: This is the circuit fix-it shop. How may I help you?

• TECHNICIAN: I need to fix the RS111 circuit.

• COMPUTER: I am familiar with that circuit. The LED is supposed to be displaying alternatively blinking one and seven.

• TECHNICIAN: Where is the LED?

Page 11: Chapter 9: Computers

COMPUTER: In the middle right of the blue region is the LED.

TECHNICIAN: No light.

COMPUTER: What is the switch at when the LED is off?

TECHNICIAN: The switch is down.

COMPUTER: You put up the switch.

(Fromkin Rodman Hyams [2007] 390)

Page 12: Chapter 9: Computers

MORE SOPHISTICATED PROGRAMS

Today more sophisticated programs are needed. One such program is the animated paperclip assistant that answers questions in Microsoft Word.

Another sophisticated program is “Script Model Grammar” designed by Roger Schank and Robert Abelson and modified by linguist Victor Raskin and others at Purdue University and elsewhere.

Page 13: Chapter 9: Computers

SAM: SCRIPT APPLIER MECHANISM

Of course sentences need to be parsed in Artificial Intelligence. But constituents larger than a sentence must be parsed as well.

One of the devices for doing this discourse parsing is the “Script Applier Mechanism” devised by Roger Schank and Robert Abelson and modified for humor and language play by Victor Raskin and others.

Page 14: Chapter 9: Computers

Note that a play or a movie has a script for the actors to follow.

The script in Artificial Intelligence is the same, but it is much simpler. It is a “mundane script.”

The “Restaurant Script,” for example, involves a customer, a server, a cashier, etc.

Page 15: Chapter 9: Computers

Props in the “Restaurant Script” include the restaurant, the table, the menu, the food, the check, the payment, the tip, etc.

The sequence of actions is as follows:

1. Customer goes to restaurant.

2. Customer goes to table.

3. Server brings menu.

4. Customer orders food.

5. Server brings food.

6. Customer eats food.

7. Server brings check.

8. Customer leaves tip for server.

9. Customer gives payment to cashier.

10. Customer leaves restaurant.

(Hendrix and Sacerdoti 654)

(Nilsen Nilsen & Combs 199)

Page 16: Chapter 9: Computers

There are two exciting things about the Script Applier Mechanism. First, it is able to spot anything that is missing, added, or out of place in the sequence of events and ask, “What’s up?”

Second, it is able to handle two scripts at the same time, so that it is capable of dealing with jokes, language play, satire, irony, sarcasm, parody, paradox and double entendre in general.
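The first of these abilities can be sketched by storing the “Restaurant Script” as an ordered list and comparing a story against it. This is a simplified illustration under stated assumptions, not Schank and Abelson's actual mechanism: the event strings, the `check_story` function, and the sample story are hypothetical.

```python
RESTAURANT_SCRIPT = [
    "customer goes to restaurant",
    "customer goes to table",
    "server brings menu",
    "customer orders food",
    "server brings food",
    "customer eats food",
    "server brings check",
    "customer leaves tip for server",
    "customer gives payment to cashier",
    "customer leaves restaurant",
]


def check_story(events, script=RESTAURANT_SCRIPT):
    """Report events that are missing, added, or out of place
    relative to the expected sequence in the script."""
    missing = [step for step in script if step not in events]
    added = [event for event in events if event not in script]
    out_of_place = []
    furthest = -1  # position of the latest script step seen so far
    for event in events:
        if event not in script:
            continue
        position = script.index(event)
        if position < furthest:
            out_of_place.append(event)  # happened later than the script expects
        else:
            furthest = position
    return {"missing": missing, "added": added, "out_of_place": out_of_place}


story = [
    "customer goes to restaurant",
    "customer goes to table",
    "customer orders food",
    "server brings menu",
    "customer eats food",
    "customer juggles plates",
    "customer leaves restaurant",
]
report = check_story(story)
```

Any non-empty entry in `report` is a cue for the “What’s up?” question; handling two scripts at once, as jokes require, would need considerably more than this.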

Page 17: Chapter 9: Computers

PARSING PROBLEMS

GARDEN PATH: The horse raced past the barn fell. After the child visited the doctor prescribed a course of injections. The doctor said the patient will die yesterday.

(Fromkin Rodman Hyams 365, 373)

EMBEDDING: “Never imagine yourself not to be otherwise than what it might appear to others…to be otherwise.”

(Lewis Carroll’s Alice’s Adventures in Wonderland) (Fromkin Rodman Hyams [2007] 365)

Page 18: Chapter 9: Computers

RIGHT-BRANCHING VS. EMBEDDING

RIGHT BRANCHING: This is the dog that worried the cat that killed the rat that ate the malt that lay in the house that Jack built.

EMBEDDING: Jack built the house that the malt that the rat that the cat that the dog worried killed ate lay in.

NOTE: Multiple embedding is OK for a computer, but not OK for the human brain.

(Fromkin Rodman Hyams [2007] 373-374)
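To see why multiple embedding is no trouble for a computer, both versions of the sentence can be generated from the same list of nouns and verbs. The sketch below is illustrative only; the function names and the way the data are split into two lists are assumptions.

```python
def right_branching(nouns, verbs):
    """Each relative clause is attached at the right edge of the sentence."""
    sentence = f"This is the {nouns[0]}"
    for verb, noun in zip(verbs, nouns[1:]):
        sentence += f" that {verb} the {noun}"
    return sentence + " that Jack built."


def center_embedded(nouns, verbs):
    """All the subjects are stacked up first; their verbs follow in order."""
    subjects = "".join(f" that the {noun}" for noun in reversed(nouns[:-1]))
    return f"Jack built the {nouns[-1]}{subjects} {' '.join(verbs)}."


nouns = ["dog", "cat", "rat", "malt", "house"]
verbs = ["worried", "killed", "ate", "lay in"]

print(right_branching(nouns, verbs))
print(center_embedded(nouns, verbs))
```

The program only has to remember a list of pending subjects, whereas a human listener must hold four unfinished clauses in mind before the first verb arrives.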

Page 19: Chapter 9: Computers

ANOMALOUS WORDS: A sniggle blick is procking a slar.

(Fromkin Rodman Hyams [2007] 368)

METANALYSIS (incorrect phrase breaking): grade A vs. grey day; night rate vs. nitrate

(Fromkin Rodman Hyams [2007] 370)

NOTE: English “adder” and “apron” arose through metanalysis: “a nadder” (from Old English) and “a napron” (from Old French “naperon”) were rebracketed as “an adder” and “an apron.”
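The phrase-breaking problem can be sketched as a search for every way to cut a stream of sounds into known words. This is a toy illustration: the lexicon, the rough phonetic spellings used as keys, and the `segmentations` function are assumptions made for the example.

```python
# Rough phonetic spellings (assumed for illustration) mapped to words.
LEXICON = {
    "greɪd": "grade",
    "eɪ": "A",
    "greɪ": "grey",
    "deɪ": "day",
    "naɪt": "night",
    "reɪt": "rate",
    "naɪtreɪt": "nitrate",
}


def segmentations(sounds, lexicon=LEXICON):
    """Return every way of splitting the sound string into lexicon words."""
    if not sounds:
        return [[]]
    results = []
    for end in range(1, len(sounds) + 1):
        prefix = sounds[:end]
        if prefix in lexicon:
            for rest in segmentations(sounds[end:], lexicon):
                results.append([lexicon[prefix]] + rest)
    return results


print(segmentations("greɪdeɪ"))
print(segmentations("naɪtreɪt"))
```

Each input yields two analyses, so a listener or a program needs context to decide where the word boundary falls.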

Page 20: Chapter 9: Computers

AMBIGUOUS SYNTAX IN NEWSPAPER HEADLINES:

Teacher Strikes Idle Kids

Enraged Cow Injures Farmer with Ax

Killer Sentenced to Die for Second Time in 10 Years

Stolen Painting Found by Tree

(Fromkin Rodman Hyams [2007] 372)

Page 21: Chapter 9: Computers

REAL-WORLD KNOWLEDGE

Explain why the following sentences are ambiguous to a computer but not to a human:

A cheesecake was on the table. It was delicious and was soon eaten.

SIGN IN A CHURCH: For those of you who have children and don’t know it, we have a nursery downstairs.

NEWSPAPER AD: Our bikinis are exciting; they are simply the tops.

(Fromkin Rodman Hyams [2007] 403)

Page 22: Chapter 9: Computers

ANTISMOKING CAMPAIGN SLOGAN:

It’s time we make smoking history.

Do you know the time?

Concerned with spreading violence, the president called a press conference.

The ladies of the church have cast off clothing of every kind and they may be seen in the church basement Friday.

(Fromkin Rodman Hyams [2007] 403)

Page 23: Chapter 9: Computers

AMBIGUOUS NEWSPAPER HEADLINES

Red Tape Holds Up New Bridge

Kids Make Nutritious Snacks

Sex Education Delayed, Teachers Request Training

(Fromkin Rodman Hyams [2007] 403)

Page 24: Chapter 9: Computers

SEMANTIC PRIMING

In the human brain, the word “doctor” is more easily and more completely processed if it is preceded by “nurse” than if it is preceded by “flower.”

This is because “doctor” and “nurse” “are located in the same part of the mental lexicon.”

(Fromkin Rodman Hyams [2007] 371)

This same feature could easily be built into Artificial Intelligence.
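One very simple way such a feature might be built is to tag each word with a semantic field and treat a word as primed when the preceding word shares its field. The sketch below only illustrates the idea and makes no claim about how the mental lexicon works: the field labels and the extra entries “hospital” and “tree” are hypothetical.

```python
# Hypothetical mini-lexicon: each word is tagged with one semantic field.
SEMANTIC_FIELDS = {
    "doctor": "medicine",
    "nurse": "medicine",
    "hospital": "medicine",
    "flower": "plants",
    "tree": "plants",
}


def is_primed(prime, target, fields=SEMANTIC_FIELDS):
    """A target counts as primed when the preceding word shares its field."""
    target_field = fields.get(target)
    return target_field is not None and fields.get(prime) == target_field


print(is_primed("nurse", "doctor"))   # same field
print(is_primed("flower", "doctor"))  # different fields
```

A program could use the result to search the shared field first, which is one possible analogue of the faster processing described above.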

Page 25: Chapter 9: Computers

SPEECH RECOGNITION & SPEECH SYNTHESIS

“Computational phonetics and phonology has two concerns. The first is with programming computers to analyze the speech signal into its component phones and phonemes.

The second is to send the proper signals to an electronic speaker so that it enunciates the phones of the language and combines them into morphemes and words.

The first of these is speech recognition; the second is speech synthesis.”

(Fromkin Rodman Hyams [2007] 384)

Page 26: Chapter 9: Computers

“Machines which…imitate human speech, are the most difficult to construct, so many are the agencies engaged in uttering even a single word—so many are the inflections and variations of tone and articulation, that the mechanician finds his ingenuity taxed to the utmost to imitate them.”

(Fromkin Rodman Hyams [2007] 385)

Page 27: Chapter 9: Computers

TO SYNTHESIZE SPEECH:

1. Start with a tone at the same frequency as vibrating vocal cords (higher if a woman’s or child’s voice is being synthesized, lower for a man’s)

2. Emphasize the harmonics corresponding to the formants required for a particular vowel, liquid, or nasal quality.

3. Add hissing or buzzing for fricatives.

4. Add nasal resonances for nasal sounds.

5. Temporarily cut off sound to produce stops and affricates….

(Fromkin Rodman Hyams [2007] 386)
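Steps 1, 2, and 5 above can be sketched with nothing more than sine waves. This is a rough illustration rather than a real synthesizer: the function names, the resonance formula, the bandwidth, the sample rate, and the formant values in the usage lines are all assumptions chosen for the example.

```python
import math


def harmonic_gain(freq, formants, bandwidth=100.0):
    """Weight a harmonic more heavily the closer it lies to a formant."""
    return sum(1.0 / (1.0 + ((freq - f) / bandwidth) ** 2) for f in formants)


def synthesize_vowel(f0, formants, duration=0.1, sample_rate=8000):
    """Step 1: start from a tone at pitch f0 and its harmonics.
    Step 2: emphasize the harmonics that fall near the formants."""
    n_samples = int(duration * sample_rate)
    n_harmonics = int((sample_rate / 2) // f0)  # stay below the Nyquist limit
    gains = [harmonic_gain(k * f0, formants) for k in range(1, n_harmonics + 1)]
    samples = []
    for n in range(n_samples):
        t = n / sample_rate
        samples.append(sum(gain * math.sin(2 * math.pi * k * f0 * t)
                           for k, gain in enumerate(gains, start=1)))
    peak = max(abs(s) for s in samples) or 1.0
    return [s / peak for s in samples]  # normalize to the range -1..1


def insert_stop(samples, start, length):
    """Step 5: cut the sound off for a short stretch to suggest a stop."""
    return samples[:start] + [0.0] * length + samples[start + length:]


# Illustrative values only: a lower pitch and three rough vowel formants.
formants = (730.0, 1090.0, 2440.0)
vowel = synthesize_vowel(120.0, formants)
with_stop = insert_stop(vowel, 100, 50)
```

Raising `f0` gives the higher tone mentioned in step 1 for a woman's or child's voice; steps 3 and 4 (frication noise and nasal resonances) are left out of this sketch.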

A Sound Spectrogram will give an indication of some of the variables of analyzing or synthesizing speech:

Page 28: Chapter 9: Computers

SOUND SPECTROGRAM

(Fromkin Rodman Hyams [2007] 366)

Page 29: Chapter 9: Computers

SPELL CHECKER

I have a spelling checker.

It came with my PC.

It plane lee marks four my revue

Miss steaks aye can knot sea.

(Fromkin Rodman Hyams [2007] 381)

Explain why the spell checker is not working in the poem above.
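A starting point for the answer is to look at what a basic spell checker does: it asks only whether each word is in its dictionary. The sketch below assumes that simple design; the tiny word list and the `misspelled` function are hypothetical.

```python
import re

# Hypothetical dictionary: the words of the poem plus the words it meant.
DICTIONARY = {
    "it", "plane", "lee", "marks", "four", "my", "revue",
    "miss", "steaks", "aye", "can", "knot", "sea",
    "plainly", "for", "review", "mistakes", "i", "not", "see",
}


def misspelled(text, dictionary=DICTIONARY):
    """Return the words that do not appear in the dictionary."""
    words = re.findall(r"[a-z]+", text.lower())
    return [word for word in words if word not in dictionary]


poem_lines = "It plane lee marks four my revue Miss steaks aye can knot sea."
print(misspelled(poem_lines))                        # nothing is flagged
print(misspelled("It plainly marks for my reveiw"))  # only the non-word is flagged
```

Every word in the poem is a real word, so a checker of this kind finds nothing to flag; catching the errors would require looking at context, not just at the dictionary.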

Page 30: Chapter 9: Computers

THEORIES AND MODELS

In The Physicist’s Conception of Nature, Manfred Eigen said, “A theory has only the alternative of being right or wrong. A model has a third possibility: it may be right, but irrelevant.”

(Fromkin Rodman Hyams [2007] 397)

Explain why a theory for Artificial Intelligence must be rigorous and at the same time allow for language play. In AI, are rigor and language play compatible concepts or not?

Page 31: Chapter 9: Computers

TRANSLATION

“Translation is more than word-for-word replacement. Often there is no equivalent word in the target language, and the order of words may differ, as in translating from an SVO language like English to an SOV language like Japanese. There is also difficulty in translating idioms, metaphors, jargon, and so on.”

(Fromkin Rodman Hyams [2007] 382)
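The word-order point can be sketched with a three-word sentence. This is a toy example under stated assumptions: the romanized lexicon, the particles “ga” and “o,” and both function names are chosen for illustration, and articles, tense, and politeness are ignored.

```python
# Toy English-to-Japanese lexicon in romanized form (assumed for the example).
LEXICON = {"cat": "neko", "fish": "sakana", "eats": "taberu"}


def word_for_word(subject, verb, obj):
    """Replace each word in place, keeping the English SVO order."""
    return " ".join(LEXICON[word] for word in (subject, verb, obj))


def svo_to_sov(subject, verb, obj):
    """Reorder to SOV and mark the subject ("ga") and the object ("o")."""
    return f"{LEXICON[subject]} ga {LEXICON[obj]} o {LEXICON[verb]}"


print(word_for_word("cat", "eats", "fish"))  # keeps the English order
print(svo_to_sov("cat", "eats", "fish"))     # verb moves to the end
```

Even this small case needs a reordering rule and grammatical markers in addition to the dictionary, and it says nothing yet about idioms or metaphors.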

Page 32: Chapter 9: Computers

“Machine translation is often impeded by lexical and syntactic ambiguities, structural disparities between the two languages, morphological complexities, and other cross-linguistic differences.”

(Fromkin Rodman Hyams [2007] 382)

In the following examples consider what information must be taken into consideration for better machine translation:

Page 33: Chapter 9: Computers

BUCHAREST HOTEL: The lift is being fixed for the next day. During that time we regret that you will be unbearable.

SWISS NUNNERY HOSPITAL: The nuns harbor all diseases and have no respect for religion.

GERMAN HOTEL: All the water has been passed by the manager.

ZURICH HOTEL: Because of the impropriety of entertaining guests of the opposite sex in the bedroom, it is suggested that the lobby be used for this purpose.

TURKEY: The government bans the smoking of children.

(Fromkin Rodman Hyams [2007] 382)

Page 34: Chapter 9: Computers

Having Fun with Computer Terminology

Page 35: Chapter 9: Computers

1024

When Alan Schoenfeld of the University of California at Berkeley attended a conference on Artificial Intelligence, he was given Hotel Room Number 1024.

Wow! he said.

1024 is 2 to the tenth power. It is the number of bytes in a kilobyte.

(Nilsen & Nilsen 98)

Page 36: Chapter 9: Computers

ACRONYMS

Acronyms are so common in computer terminology that programmers make fun of them.

“TLA” stands for “Three Letter Acronym.”

“YABA” stands for “Yet Another Bloody Acronym.”

“YABA Compatible” means that the initials can be pronounced easily and are not obscene.

(Nilsen & Nilsen 99)

Page 37: Chapter 9: Computers

CHAT GROUPS

Linguist Susan Herring at the University of Texas, Arlington studied the humor in chat groups. Her results were as follows:

imaginary situations: 20 percent

a mock persona: 14 percent

teasing: 13 percent

irony: 6 percent

name play: 5 percent

silliness: 4 percent

real situations: 3 percent

riddles: 2 percent

pretended misunderstandings: 2 percent

puns: 1 percent

(Nilsen & Nilsen 167)

Page 38: Chapter 9: Computers

EMOTICONS

In conversation we can show our emotions, but on the internet this is difficult, so we use emoticons:

:-) Smiling

:-)))))))))) Really Smiling

;-) Winking

:-* Kissing

|-O Yawning

:-& Tongue-Tied

:’-{ Crying

:-/ Undecided

:-|| Angry

(Nilsen & Nilsen 100)

Page 39: Chapter 9: Computers

SCIENCE FICTION AND FANTASY

Many computer terms come from Science Fiction and Fantasy:

A huge network packet is a “Godzillagram” from Godzilla

Teenage hackers are “Munchkins” from The Wizard of Oz

A mischievous program is called a “wabbit” from Elmer Fudd’s “You wascawwy wabbit.”

A program that repeats itself indefinitely is said to be in “Sorcerer’s Apprentice Mode” from Fantasia

The answer to life, the universe, and everything is “42,” from the computer Deep Thought in Douglas Adams’ novel The Hitchhiker’s Guide to the Galaxy.

(Nilsen & Nilsen 99)

Page 40: Chapter 9: Computers

When someone goes onto the internet to get information that is easily available from a manual, etc., the Cyber Police might say, “UTSL.” This means “Use the Source, Luke!” from Star Wars.

Another term from Star Wars is an “Obi-Wan Error.” This comes from the name “Obi-Wan Kenobi” and refers to an “off-by-one error.” A similar one-letter shift appears in 2001: A Space Odyssey, where the computer is named “HAL”: each letter of “HAL” is the letter immediately before the corresponding letter of “IBM.”

(Nilsen & Nilsen 99)
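Both the HAL/IBM relationship and the programming error that the “Obi-Wan” pun refers to are easy to show in code. The sketch below is only an illustration; the function names are hypothetical.

```python
def shift_letters(word, offset):
    """Move each letter forward (or backward) in the alphabet by offset."""
    base = ord("A")
    return "".join(chr((ord(c) - base + offset) % 26 + base) for c in word.upper())


def count_to(n):
    """Correct loop bounds: 1, 2, ..., n."""
    return list(range(1, n + 1))


def count_to_off_by_one(n):
    """A typical off-by-one bug: the loop stops one step too early."""
    return list(range(1, n))


print(shift_letters("HAL", 1))
print(count_to(3), count_to_off_by_one(3))
```

Shifting “HAL” by one letter gives “IBM,” and the buggy counter shows how a loop bound that is off by one silently drops the last item.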

Page 41: Chapter 9: Computers

In computer terminology a soft boot refers to the hitting of “Control,” “Alternate” and “Delete” at the same time.

This is referred to as the “Vulcan Nerve Pinch” from Star Trek.

“Droid” from “Android” has become a suffix in such words as “trendroids,” who follow trends, and “sales droids,” who promise customers things that cannot be delivered or are useless.

The “code police” and “net police” are named after the “thought police” in George Orwell’s 1984.

Page 42: Chapter 9: Computers

SIGNATURES

People like to create enigmatic and puzzling signatures. One user named Eddie follows his signature with “Ceci n’est pas une signature.”

This is an allusion to a painting of a pipe by René Magritte with the disclaimer, “Ceci n’est pas une pipe.”

(Nilsen & Nilsen 166)

Page 43: Chapter 9: Computers

TEXT MESSAGING

Since numbers and letters require more than a single stroke on cell phones, acronyms are often used:

AFAIK: As far as I know

BTW: By the way

CUL or CUL8R: See you later

GIGO: Garbage In Garbage Out

GFR: Grim File Reaper

LOL: Lots of Laughs

OIC: Oh, I see

Page 44: Chapter 9: Computers

POS: Parent Over Shoulder

ROTF: Rolling on the Floor

ROTFLMAO: Rolling on the Floor Laughing My Ass Off

RUOK: Are you OK?

TIA: Thanks in Advance

WTF: Not translatable

WYSIWYG: What you See Is What You Get

BCNU: Be Seein’ You

(Nilsen & Nilsen 99)

Page 45: Chapter 9: Computers

TWENTE, NETHERLANDS

An annual workshop on Language Technology is held at the University of Twente.

In 1996 this workshop was devoted to “Automatic Interpretation and Generation of Verbal Humor.”

The papers at this conference had such titles as:

Page 46: Chapter 9: Computers

“Why do People Use Irony?”

“Password Swordfish: Verbal Humour in the Interface.”

“Computer Implementation of the General Theory of Verbal Humor.”

“Humor Theory beyond Jokes.”

“Speculations on Story Puns.”

“Relevance Theory and Humorous Interpretations.”

“What Sort of a Speech Act is the Joke?”

“A Neural Resolution of the Incongruity-Resolution Theory of Humor.”

“Humorous Analogy: Modeling the Devil’s Dictionary.”

“Why Is a Riddle Not Like a Metaphor?” and

“An Attempt at Natural Humor from a Natural Language Robot.”

(Nilsen and Nilsen 98)

Page 47: Chapter 9: Computers

VIRUS JOKES

AT&T Virus: Every three minutes it tells you what great service you are getting.

MCI Virus: Every three minutes it reminds you that you’re paying too much for the AT&T virus.

Page 48: Chapter 9: Computers

Paul Revere Virus: This revolutionary virus does not horse around. It warns you of impending hard disk attack—once if by LAN, twice if by C:>.

New World Order Virus: Probably harmless, but it makes a lot of people really mad just thinking about it.

(Nilsen & Nilsen 177)

Page 49: Chapter 9: Computers

KURT VONNEGUT ON THE INTERNET

In August of 1997 a piece appeared on the Internet by Kurt Vonnegut.

When Vonnegut’s wife was given a copy of the article she was so pleased with her clever husband that she forwarded a copy to their children.

Vonnegut said that it was “funny and wise and charming,” but he said he never wrote it.

Page 50: Chapter 9: Computers

The article had actually been published by Mary Schmich in the Chicago Tribune and then picked up and redistributed by a computer hacker.

Ian Fisher of The New York Times said that as long as readers thought the piece was Vonnegut’s, they viewed the Internet as a wonderful tool that could keep people in touch with each other.

But when they learned it was a hoax, their perception of the internet changed. The internet was now an unreliable hotbed of hoaxes and wild-eyed conspiracies.

Probably both opinions are true.

(Nilsen & Nilsen 168)

Page 51: Chapter 9: Computers

Computer-Humor Websites

ANIMATOR VS. ANIMATION II:

http://www.metacafe.com/watch/689540/animator_vs_animation_2/

THE THE IMPOTENCE OF PROOFREADING (TAYLOR MALI):

http://www.youtube.com/watch?v=p_rwB5_3PQc

TOP 50 POPULAR TEXT & CHAT ACRONYMS (NETLINGO):

http://www.netlingo.com/top50/popular-text-terms.php

Page 52: Chapter 9: Computers

References.

Clark, Virginia, Paul Eschholz, and Alfred Rosa. Language: Readings in Language and Culture, 6th Edition. New York, NY: St. Martin’s Press, 1998.

English, Katharine, ed. Most Popular Web Sites: The Best of the Net from A2Z. Indianapolis, IN: Lycos Press, 1996.

Fromkin, Victoria, Robert Rodman, and Nina Hyams. “Language Processing: Humans and Computers.” An Introduction to Language, 8th Edition. Boston, MA: Thomson Wadsworth, 2007; 9th Edition, 2011, 375-429.

Gralla, Preston. How the Internet Works. Emeryville, CA: Ziff-Davis Press, 1997.

Hempelmann, Christian F. “Computational Humor: Beyond the Pun?” in Raskin [2008]: 333-360.

Hendrix, Gary G., and Earl D. Sacerdoti. “Natural-Language Processing: The Field in Perspective.” in Language: Introductory Readings, 4th edition. Eds. Virginia P. Clark, Paul A. Eschholz and Alfred F. Rosa. New York, NY: St. Martin’s, 1985.

Page 53: Chapter 9: Computers

Hulstijn, J., and A. Nijholt eds. Twente Workshop on Language Technology 12: Automatic Interpretation and Generation of Verbal Humor. Twente, Netherlands: Univ of Twente Dept of Computer Science, 1996.

Nilsen, Alleen Pace, and Don L. F. Nilsen. “Computer Humor,” and “Internet Influences.” Encyclopedia of 20th Century American Humor. Westport, CT: Greenwood, 2000, 97-100 and 165-168.

Nilsen, Don L. F., Alleen Pace Nilsen, and Nathan H. Combs. “Teaching a Computer to Speculate.” Computers and the Humanities. 22 (1988): 193-201.

Nilsen, Kelvin, and Alleen Pace Nilsen. “Literary Metaphors and Other Linguistic Innovations in Computer Language” (Clark, 166-176).

Raskin, Victor. Semantic Mechanisms of Humor. Boston, MA: Reidel/Kluwer, 1985.

Raskin, Victor. The Primer of Humor Research. New York, NY: Mouton de Gruyter, 2008.

Page 54: Chapter 9: Computers

Raymond, Eric S. The New Hacker’s Dictionary, 2nd Edition. Cambridge, MA: MIT Press, 1993.

Roberts, Steven K. “Artificial Intelligence.” in Writing and Reading Across the Curriculum, 2nd Edition. Laurence Behrens and Leonard J. Rosen. Boston, MA: Little, Brown, 1985, 214-222.

Rosch, Eleanor. “On the Internal Structure of Perceptual and Semantic Categories.” in Cognitive Development and the Acquisition of Language. Ed. T. Moore. New York, NY: Academic Press, 1973.

Schank, Roger C., and Robert Abelson. Scripts, Plans, Goals, and Understanding: An Inquiry Into Human Knowledge Structures. Hillsdale, NJ: Lawrence Erlbaum, 1977.

Siegel, David. Creating Killer Web Sites. Indianapolis, IN: Hayden Books, 1996.