Look Around: Question Answering, Serendipity, and the Research Process of Scholars in the Humanities

Post on 08-May-2015

This talk, by Kim Martin, Victoria Rubin and Anabel Quan-Haase, was presented at Access 2012 in Montreal, on October 19th, 2012.

Transcript of Look Around: Question Answering, Serendipity, and the Research Process of Scholars in the Humanities

Look Around: Question Answering, Serendipity, and the Research Process of Scholars in the Humanities

Kim Martin, Victoria Rubin, & Anabel Quan-Haase

Faculty of Information and Media Studies, Western University

How this all Began

Table of Contents

1.  A History of Serendipity

2.  Technology & the research process of Humanists

3.  NLP and Question Answering

4.  Look Around

5.  Next Steps

Serendipity

Investigation of information encountering in the controlled research environment

Erdelez, 2004

Facets of serendipity in everyday chance encounters: a grounded theory approach to blog analysis

Rubin, Burkell & Quan-Haase, 2011

Coming across information serendipitously: Part 1 – A process model

Makri & Blandford, 2012

Why does Serendipity Matter?

•  Can lead to discovery = serendipitous discovery.

•  Creativity

•  Thinking outside the box

•  Trait of creative search (Race, 2012)

•  Original thinking

•  Link to distractions.

Do some tools “encourage serendipity”?

Technology & the research process of Humanists

•  Ten historians in SW Ontario

•  Interviews (30-60 mins)

•  Grounded theory approach

•  Interviews transcribed and coded

“I don’t know how to describe this, but it… removed the serendipity factor. You can browse online, but that’s always much more targeted, sometimes, most of us are very happy to have that, sort of, inadvertent discovery” - P3

In the words of Historians

“And then one wonders, well, what are the other ways we can leverage the digital realm to provide different kinds of serendipity that you wouldn’t have thought of?” - P7

“Googlebooks, however, has sort of just come into my life, because a Googlesearch is, you know you’re looking for a subject and then books come up and you can stumble across them that way” - P5

Findings

•  Historians recognize that chance, an important part of their historical research process, can take many forms and occur in many places

•  Facets A & B (Prepared Mind and Act of Noticing) were most prominent in historians’ understanding of serendipity

The Historical Research Process

Stage 1: Problem selection: generation of ideas; preliminary work (i.e., reading, discussion, exploration of funding); determining unanswered questions and hypothesizing.

Stage 2: Detailed planning of data collection: literature searching; refinement of hypothesis; detailed work on methodology.

Stage 3: Data collection.

Stage 4: Analysis and interpretation of data.

Stage 5: Presentation of findings: writing, rewriting and evaluation.

Uva, 1977

“Planned Chaos”

“that every piece of historical writing shares an analogous set of accidental factors shaping its genesis”

- McClellan III (2005)

“serendipity and its relations do not come uninvited to the scholar’s table. Rather, serendipity visits those scholars and researchers who set out with open minds and the flexibility of plan that allows them both to recognize the fortuitous discovery and to pursue it to its logical end”

- Hoeflich (2007)

Serendipity and the (Digital?) Library

Harvard Library Innovation Lab

GoodReader

EverGreen (Leddy Library)

So, What’s the Problem?

•  The physical library does not need to be mirrored, or even represented online.

•  The colours, smells, and touch of a book are no longer relevant in the digital world.

•  The web allows us to do so much more with text than the format of the book ever did.

Natural Language Processing & Question Answering

QA is “an interactive human computer process that encompasses understanding a user information need, typically expressed in a natural language query; retrieving relevant documents, data, or knowledge from selected sources; extracting, qualifying and prioritizing available answers from these sources; and presenting and explaining responses in an effective manner.”

From Maybury, M. T. (Ed.). (2004). New Directions in Question Answering. Menlo Park, CA: MIT Press.

What QA does

Unlike the typical search engine, the desired purpose of a QA system is to come up with ONE correct answer.

A QA system has to determine two things: what type of information it is looking for, and where to look for the answers.

There are 3 main modules of QA:

•  Question-Processing

•  Document-Processing

•  Answer Extraction and Formulation
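The three modules can be illustrated with a toy pipeline. This is only a minimal sketch under stated assumptions, not the system discussed in the talk: the function names, the question-word table, the stopword list, and the regular expressions are all hypothetical, and a real QA system would use far richer NLP at every step.

```python
import re

# Hypothetical mapping from a leading question word to an expected answer type.
ANSWER_TYPES = {"who": "PERSON", "when": "DATE", "where": "PLACE"}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"who", "when", "where", "what", "was", "is", "the",
             "a", "an", "of", "in", "did", "by"}


def process_question(question):
    """Question processing: guess the answer type and pull out keywords."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    answer_type = ANSWER_TYPES.get(tokens[0], "OTHER") if tokens else "OTHER"
    keywords = [t for t in tokens if t not in STOPWORDS]
    return answer_type, keywords


def process_documents(keywords, documents):
    """Document processing: rank passages by how many keywords they contain."""
    scored = []
    for doc in documents:
        words = set(re.findall(r"[a-z0-9]+", doc.lower()))
        score = sum(1 for k in keywords if k in words)
        if score:
            scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


def extract_answer(answer_type, passages):
    """Answer extraction: return ONE candidate that fits the expected type."""
    patterns = {
        "DATE": r"\b(1[0-9]{3}|20[0-9]{2})\b",          # a four-digit year
        "PERSON": r"\b([A-Z][a-z]+ [A-Z][a-z]+)\b",     # two capitalized words
    }
    pattern = patterns.get(answer_type)
    if pattern is None:
        # No pattern for this type: fall back to the best-ranked passage.
        return passages[0] if passages else None
    for passage in passages:
        match = re.search(pattern, passage)
        if match:
            return match.group(1)
    return None


def answer(question, documents):
    """Run the three modules in order and return a single answer."""
    answer_type, keywords = process_question(question)
    passages = process_documents(keywords, documents)
    return extract_answer(answer_type, passages)
```

Given a small set of passages, `answer("When was the word serendipity coined?", passages)` first decides it needs a date, then ranks passages by keyword overlap, and finally returns the first year it finds in the best passage.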

The Stages of Question Answering

Question Processing

Answer Formulation

Five Capabilities of QA systems

Those that are capable of:

1.  processing factual questions.

2.  enabling simple reasoning mechanisms.

3.  enabling fusion from different documents.

4.  enabling analogical reasoning.

5.  being interactive.

Library Search

Then …

•  Looks up information online.

•  Writes down location information.

•  Goes to library to get book from shelf.

•  Uses shelf call numbers and headings to locate text.

•  Retrieves required text.

•  Looks around.

•  Sees book "of interest" and stumbles upon information that may or may not positively affect their work.

And now.

1.  Question is asked of a search engine.

2.  It may or may not retrieve the correct results.

3.  Steps 1 and 2 are repeated until desired results are achieved.

Look Around

•  Is an add-on for virtual library catalogs that integrates a user’s “find” with visualizations of Library of Congress Classifications.

•  Allows for another layer of access to library material.

•  Will encourage users to penetrate the Long Tail of information, instead of looking only at the most commonly used texts/journals.

•  Allows the user to set the preference for the visualization, creating a more personalized browsing experience.
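One way to picture the idea is a function that takes the item a user has found and returns its neighbours in call-number order, plus a count of records per class that a visualization could draw on. The sketch below is an assumption-laden illustration, not the Look Around design: the records and titles are placeholders, the function names are hypothetical, and the sort key ignores Cutter numbers and the other finer points of Library of Congress call-number ordering.

```python
import re
from collections import Counter

# Placeholder catalogue records: (LC-style call number, title). Not real data.
CATALOG = [
    ("Z711 .B35", "Title A"),
    ("D16 .H55", "Title B"),
    ("D13 .M33", "Title C"),
    ("QA76.9 .N38", "Title D"),
    ("D16.2 .T46", "Title E"),
    ("P98 .Q47", "Title F"),
]


def lcc_sort_key(call_number):
    """Split a call number into (class letters, class number) for shelf order.

    Simplification: Cutter numbers and dates are ignored.
    """
    match = re.match(r"([A-Z]{1,3})\s*(\d+(?:\.\d+)?)", call_number)
    if match is None:
        raise ValueError(f"unrecognized call number: {call_number!r}")
    return match.group(1), float(match.group(2))


def look_around(call_number, catalog, radius=1):
    """Return titles shelved within `radius` positions of the found item."""
    shelf = sorted(catalog, key=lambda rec: lcc_sort_key(rec[0]))
    positions = [i for i, rec in enumerate(shelf) if rec[0] == call_number]
    if not positions:
        return []
    i = positions[0]
    lo, hi = max(0, i - radius), min(len(shelf), i + radius + 1)
    return [title for num, title in shelf[lo:hi] if num != call_number]


def records_by_class(catalog):
    """Count records per class letter(s), e.g. as input for a visualization."""
    return Counter(lcc_sort_key(num)[0] for num, _ in catalog)
```

Here `look_around("D16 .H55", CATALOG)` returns the two titles shelved on either side of the found item, and a larger `radius` stands in for the user-set preference about how widely to browse.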

Break Down the Book

Look Around

By Alisewski http://www-958.ibm.com/software/data/cognos/manyeyes/visualizations/bib-records-by-class-without-law-o

Explore within the Metadata

Created by Lianne http://www-958.ibm.com/software/data/cognos/manyeyes/

Next Steps

•  Make decisions about the proper types of visualizations. Important to allow for choice.

•  Design of program with a computer scientist or information visualization professional.

•  Work with a small digital collection to pre-test with a group of humanist scholars.

Explore beyond the shelves

By Jeffrey Heer http://prefuse.org/gallery/datamountain/

References

•  http://www.thegraphicrecorder.com/2012/04/09/visual-vocabulary-the-basics/

•  Erdelez, S. (1999). Information Encountering: It’s More Than Just Bumping into Information. Bulletin of the American Society for Information Science, February/M.

•  Hoeflich, M. H. (2007). Serendipity in the Stacks, Fortuity in the Archives. Law Library Journal, 99(4), 813–827.

•  Makri, S., & Blandford, A. (2012). Coming across information serendipitously: Part 1 – A process model. Journal of Documentation, 68(5).

•  McClellan III, J. E. (2005). Accident, Luck, and Serendipity in Historical Research. Proceedings of the American Philosophical Society, 149(1), 1–21.

•  Race, T. M. (2012). Resource Discovery Tools: Supporting Serendipity. In Planning and Implementing Resource Discovery Tools in Academic Libraries. DLTS Faculty Publications, Paper 22.

•  Rubin, V. L., Burkell, J., & Quan-Haase, A. (2011). Facets of serendipity in everyday chance encounters: a grounded theory approach to blog analysis. Information Research, 16(3).