Replicating Semantic Connections Made by Visual Readers for a Scanning System for Nonvisual Readers...
Replicating Semantic Connections Made by Visual Readers for a Scanning System for Nonvisual Readers
Kathy McCoy (Debbie Yarrington)
Dept. of Computer and Information Sciences, University of Delaware
& Consultant for National Institute on Disability and Rehabilitation Research (NIDRR), US Department of Education
Goal
The goal of this system is to give nonvisual readers information similar to what visual readers get when skimming through a document in response to a question.
Motivation
Working with college students who were blind and visually impaired
Students took significantly longer to find homework question answers within documents than their visual-reading counterparts
Current screenreaders have limited search ability.
Approach
Use an eye tracker to see what sighted people look at when they are skimming to answer a question
Identify important paragraphs
Develop Natural Language Processing techniques to replicate the data
Work with people who are blind to develop appropriate interfaces using the results from above.
Part 1: Visual Skimming Data
Goal: To achieve an understanding of what information visual skimmers pay attention to when skimming through documents to answer questions
Procedure:
◦ Have visual readers skim through a document for a question answer while being tracked by an eye tracking system
Gathering Data
14 complex questions and accompanying documents
◦ 10 were 2 pages, 2 were 5 pages, and 2 were 8 pages or longer.
◦ Documents were text documents
  No images, few subtitles and lists
Example Questions Considered
“What effect does China’s rising oil prices have on other sectors of its economy?”
“According to Piaget, what techniques do children use to adjust to their environment?”
“How do people catch the West Nile Virus?”
Gathering Data
Individuals skimmed for a question answer in a document while being tracked by an eye tracking system.
◦ 43 subjects skimmed for answers to between 6 and 13 questions
  Total of 513 question-answer skimming results
Subjects then answered a multiple-choice question
Eye Tracker Data:
Tobii Eye Tracker:
AOIs:
  We could define areas of interest (AOIs) in the text document ahead of time
  We chose paragraphs, titles, subtitles, and the question as separate AOIs.
  We then counted the number of gaze points (gazes of over 100 ms duration) in each AOI
HotSpot and Duration File:
  The tracker gave us an image that showed “hot spots”, or locations and durations of where the eyes gazed
  A file with locations and durations of gaze points
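The gaze-point counting described on this slide can be sketched roughly as follows. This is a minimal illustration, not the Tobii software's own processing: the `AOI` and `Gaze` classes, the rectangular AOI shape, and the pixel coordinates are all assumptions made for the example. Only the 100 ms threshold and the choice of AOIs (paragraphs, titles, subtitles, question) come from the slides.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AOI:
    """A rectangular area of interest (hypothetical representation)."""
    name: str      # e.g. "question", "title", "paragraph_3"
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


@dataclass(frozen=True)
class Gaze:
    """One entry from the locations-and-durations file (assumed layout)."""
    x: float
    y: float
    duration_ms: float


MIN_DURATION_MS = 100  # the slides count gazes of over 100 ms


def count_gaze_points(aois, gazes, min_duration_ms=MIN_DURATION_MS):
    """Count gazes longer than the threshold that fall inside each AOI."""
    counts = Counter({aoi.name: 0 for aoi in aois})
    for gaze in gazes:
        if gaze.duration_ms <= min_duration_ms:
            continue  # too short to count as a gaze point
        for aoi in aois:
            if aoi.contains(gaze.x, gaze.y):
                counts[aoi.name] += 1
                break  # AOIs are assumed not to overlap
    return counts


aois = [AOI("question", 0, 0, 100, 20), AOI("paragraph_1", 0, 20, 100, 60)]
gazes = [
    Gaze(10, 10, 150),    # counted in "question"
    Gaze(10, 30, 90),     # too short
    Gaze(50, 40, 200),    # counted in "paragraph_1"
    Gaze(50, 50, 101),    # counted in "paragraph_1"
    Gaze(500, 500, 300),  # outside every AOI
]
counts = count_gaze_points(aois, gazes)
```

Summing such counts over all subjects for one document gives a per-paragraph measure of how much attention each AOI received.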
Results Analysis:
We examined AOIs most frequently focused on that did not have physical attributes that would explain the attraction of people’s gazes
Assumption is that these areas were focused on because of their connection to the question.
Subjects found question answer
Example:
“How do people catch the West Nile Virus?”
The paragraph with the most gaze points for the most subjects was:
“In the United States, wild birds, especially crows and jays, are the main reservoir of West Nile virus, but the virus is actually spread by certain species of mosquitoes. Transmission happens when a mosquito bites a bird infected with the West Nile virus and the virus enters the mosquito's bloodstream. It circulates for a few days before settling in the salivary glands. Then the infected mosquito bites an animal or a human and the virus enters the host's bloodstream, where it may cause serious illness. The virus then probably multiplies and moves on to the brain, crossing the blood-brain barrier. Once the virus crosses that barrier and infects the brain or its linings, the brain tissue becomes inflamed and symptoms arise.”
Subjects focused on areas that have a semantic relationship with the question
E.g., with the question, “Why was Monet’s work criticized by the public?”
the second most frequently focused on paragraph was:
In 1874, Manet, Degas, Cezanne, Renoir, Pissarro, Sisley and Monet put together an exhibition, which resulted in a large financial loss for Monet and his friends and marked a return to financial insecurity for Monet. It was only through the help of Manet that Monet was able to remain in Argenteuil. In an attempt to recoup some of his losses, Monet tried to sell some of his paintings at the Hotel Drouot. This, too, was a failure. Despite the financial uncertainty, Monet’s paintings never became morose or even all that somber. Instead, Monet immersed himself in the task of perfecting a style which still had not been accepted by the world at large. Monet’s compositions from this time were extremely loosely structured, with color applied in strong, distinct strokes as if no reworking of the pigment had been attempted. This technique was calculated to suggest that the artist had indeed captured a spontaneous impression of nature.
This paragraph does not contain the answer
Part 2:
► Next Step: Developing Natural Language Processing (NLP) techniques to automatically identify the areas of text visual readers focus on, as determined in Part 1.
Process:
1. Generate keywords from question
2. Weight keywords based on inverse of # of paragraphs in which they occur in the document
3. Generate matching score for each paragraph
   • # of occurrences of each keyword x keyword’s weight
4. Rank paragraph’s likelihood of being related to the question based on matching score
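The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tokenizer, the tiny `STOPWORDS` set used to pick keywords from the question, and the function names are all hypothetical, since the slides do not say how keywords were extracted. The weighting (inverse of the number of paragraphs containing the keyword) and the score (occurrences times weight, summed over keywords) follow the slide.

```python
import re
from collections import Counter

# Hypothetical stopword list; the slides do not specify one.
STOPWORDS = {"a", "an", "the", "how", "what", "why", "do", "does", "is",
             "are", "of", "to", "in", "by", "on", "was", "its", "their"}


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def generate_keywords(question):
    """Step 1: keep the question's non-stopword tokens, without repeats."""
    return list(dict.fromkeys(t for t in tokenize(question) if t not in STOPWORDS))


def keyword_weights(keywords, paragraphs):
    """Step 2: weight = 1 / (number of paragraphs containing the keyword)."""
    token_sets = [set(tokenize(p)) for p in paragraphs]
    weights = {}
    for kw in keywords:
        n = sum(1 for tokens in token_sets if kw in tokens)
        weights[kw] = 1.0 / n if n else 0.0  # keyword absent from the document
    return weights


def rank_paragraphs(keywords, paragraphs):
    """Steps 3 and 4: score each paragraph, then sort by descending score."""
    weights = keyword_weights(keywords, paragraphs)
    scored = []
    for index, paragraph in enumerate(paragraphs):
        counts = Counter(tokenize(paragraph))
        score = sum(counts[kw] * w for kw, w in weights.items())
        scored.append((index, score))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


paragraphs = [
    "Mosquitoes spread the virus. Mosquitoes bite birds.",
    "The virus causes fever.",
    "Birds migrate in winter.",
]
ranking = rank_paragraphs(["virus", "mosquitoes"], paragraphs)
```

In this toy document "mosquitoes" appears in one paragraph (weight 1.0) and "virus" in two (weight 0.5), so the first paragraph scores 2 x 1.0 + 1 x 0.5 = 2.5.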
What is it that we match?
Keyword Sets:
Directly using the words from the query did not work well; using words similar to the words in the query also did not work.
We needed to find a way to match the “loose semantic connections” found in the eye-tracking data.
Topically-Related Keywords
Our solution:
◦ use the World Wide Web to form clusters of topically-related words
Intuition: to find loosely related words, we want to find words that are discussed “with” the words in the question
Use a Google search to identify places on the web where the question words are discussed, and take words from those areas.
Procedure: Cluster formation
1. Use content words from question as search engine (Google) query terms
2. Search returns ordered list of relevant URLs with accompanying snippets
3. Retrieve web page from URL
4. Locate snippet within web page (stripped of html)
5. Include 50 content words before snippet and 50 content words after snippet and call that a snippet phrase
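Steps 4 and 5 can be sketched as below, assuming the page has already been retrieved and stripped of HTML. The search and download steps are deliberately left out. The `STOPWORDS` set, the definition of a "content word" as any non-stopword token, and the function names are assumptions for illustration; the slides do not define them.

```python
import re

# Hypothetical stopword list used to decide what counts as a content word.
STOPWORDS = {"a", "an", "and", "the", "is", "are", "by", "in", "of", "to", "it"}


def words(text):
    return re.findall(r"[a-z0-9']+", text.lower())


def find_sublist(haystack, needle):
    """Return the index where needle starts in haystack, or -1."""
    n = len(needle)
    if n == 0:
        return -1
    for i in range(len(haystack) - n + 1):
        if haystack[i:i + n] == needle:
            return i
    return -1


def content(tokens):
    return [t for t in tokens if t not in STOPWORDS]


def snippet_phrase(page_text, snippet, window=50):
    """Snippet plus `window` content words on each side, as content words.

    Returns None when the snippet cannot be located in the page text.
    """
    page = words(page_text)
    snip = words(snippet)
    start = find_sublist(page, snip)
    if start < 0:
        return None
    end = start + len(snip)
    before = content(page[:start])
    after = content(page[end:])
    before = before[max(0, len(before) - window):]
    return before + content(snip) + after[:window]


page_text = ("Birds carry the virus. The virus is spread by mosquitoes in "
             "summer. People should avoid standing water.")
phrase = snippet_phrase(page_text, "spread by mosquitoes", window=2)
```

The example uses `window=2` only to keep the output short; the procedure on the slide uses 50.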
Procedure: Cluster formation II
6. Take the top 50 snippet phrases containing the most search terms
7. Generate a word cluster with those phrases
8. Add a global meaning weight so as to eliminate words that are very common (Global Indirect Document Frequency seeded from a large list of words)
9. Take top 25% of cluster and use it to rank sentences
10. Rank paragraphs by the sentences
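Steps 6 through 10 might look roughly like the following. Several details are assumptions, because the slides do not spell them out: the cluster weight is taken here as word count times a global frequency weight supplied in a plain dictionary (`global_weight`, hypothetical), a sentence's score is the sum of the cluster weights of its words, and a paragraph is ranked by its best-scoring sentence.

```python
import math
from collections import Counter


def top_phrases(phrases, search_terms, k=50):
    """Step 6: keep the k phrases containing the most distinct search terms."""
    terms = set(search_terms)
    return sorted(phrases, key=lambda p: len(terms & set(p)), reverse=True)[:k]


def build_cluster(phrases, global_weight, default_weight=1.0, keep_fraction=0.25):
    """Steps 7 to 9: count words, down-weight common ones, keep the top 25%."""
    counts = Counter(word for phrase in phrases for word in phrase)
    weighted = {w: c * global_weight.get(w, default_weight) for w, c in counts.items()}
    ranked = sorted(weighted.items(), key=lambda kv: (-kv[1], kv[0]))
    keep = max(1, math.ceil(len(ranked) * keep_fraction))
    return dict(ranked[:keep])


def score_sentence(sentence_words, cluster):
    return sum(cluster.get(word, 0.0) for word in sentence_words)


def rank_paragraphs(paragraphs, cluster):
    """Step 10: order paragraphs by their best sentence (an assumed rule).

    `paragraphs` is a list of paragraphs, each a list of sentences,
    each sentence a list of lowercase words.
    """
    scores = [
        max((score_sentence(s, cluster) for s in paragraph), default=0.0)
        for paragraph in paragraphs
    ]
    order = sorted(range(len(paragraphs)), key=lambda i: (-scores[i], i))
    return order, scores


phrases = [
    ["virus", "mosquitoes", "bite", "the"],
    ["virus", "birds", "the"],
    ["weather", "the"],
]
# Hypothetical global weights: 0.0 removes a very common word entirely.
global_weight = {"the": 0.0, "virus": 1.0, "mosquitoes": 2.0, "bite": 1.5, "birds": 1.0}

selected = top_phrases(phrases, ["virus", "mosquitoes"], k=2)
cluster = build_cluster(selected, global_weight)
order, scores = rank_paragraphs(
    [[["birds", "fly"]], [["mosquitoes", "carry", "virus"], ["it", "rains"]]],
    cluster,
)
```

Other aggregation rules for step 10, such as summing sentence scores per paragraph, would fit the slide's wording equally well.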
Example Important Sentences: How do People catch the West Nile Virus?
1. west nile virus
2. it is spread by mosquitoes
3. transmission happens when a mosquito bites a bird infected with the west nile virus and the virus enters the mosquito bloodstream
4. most people infected with the west nile virus have no signs or symptoms
5. most people recover from west nile virus without treatment
6. to help control west nile virus eliminate standing water in your yard
7. about 20 percent of people develop a mild infection called west nile fever
8. some laboratory workers involved in west nile research have contracted the disease from infected animals
9. mosquitoes breed in pools of standing water
10. in rare cases it is possible for west nile virus to spread through other routes including
11. watch for sick or dying birds and report them to your local health department
12. west nile virus is common in areas such as africa west asia and the middle east
13. in the united states wild birds especially crows and jays are the main reservoir of west nile virus but the virus is actually spread by certain species of mosquitoes
14. your best bet for preventing the virus and other mosquito borne illnesses is to avoid exposure to mosquitoes and eliminate mosquito breeding sites
15. your overall risk of contracting west nile virus depends on these factors time of year
16. then the infected mosquito bites an animal or a human and the virus enters the host bloodstream where it may cause serious illness
17. even if you are infected your risk of developing a serious west nile virus related illness is extremely small
Results: Discounted Cumulative Gain with EyeTracking Results
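The figure for this slide did not survive in the transcript, so no numbers are reproduced here. For readers unfamiliar with the measure, the sketch below shows the standard Discounted Cumulative Gain formula, DCG = sum over ranks i of rel_i / log2(i + 1). How relevance was derived from the eye-tracking data is not stated in the transcript; using per-paragraph gaze counts as relevance, and the normalized variant `ndcg`, are assumptions added for illustration.

```python
import math


def dcg(relevances):
    """Discounted Cumulative Gain of relevance values listed in rank order."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(system_ranking, relevance_by_paragraph):
    """DCG of a ranking divided by the DCG of the ideal ordering.

    `relevance_by_paragraph` maps a paragraph id to its relevance, for
    example a gaze-point count (an assumed choice, not from the slides).
    """
    gains = [relevance_by_paragraph.get(p, 0.0) for p in system_ranking]
    ideal = sorted(relevance_by_paragraph.values(), reverse=True)[:len(gains)]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg else 0.0


# Made-up relevance values, purely to exercise the functions.
relevance = {"p1": 3, "p2": 2, "p3": 1}
perfect = ndcg(["p1", "p2", "p3"], relevance)
reversed_order = ndcg(["p3", "p2", "p1"], relevance)
```

A ranking that matches the eye-tracking order exactly scores 1.0 under the normalized form, and placing heavily viewed paragraphs lower reduces the score.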
Current Work on Skimming
Incorporating Physical Attributes in assessment of paragraphs
Developing a user interface in conjunction with potential users
  Important that it provide access to information like what a visual reader gets
  Read important sentences with indication of paragraph?
  Read word clusters with indication of paragraph?
  Allow user to stop and start skimming with a keypress