CNI 2003/Herlocker, Jung, and Webster 1

Collaborative Filtering: Possibilities for Digital Libraries

Jon Herlocker, Janet Webster, Seikyung Jung

Oregon State University

Current search engines are insufficient.

Two important search engine problems

• They don’t understand:
  – Quality
  – Context

But First: Our Context

• Why are we standing up here?

• We think we can improve the digital library experience.

Today’s Context

1. Research questions & hypotheses

2. Collaborative filtering

3. Our approach to CF in the Library

4. Challenges of collaborative filtering for library search

5. Initial lessons learned

The Librarian’s Questions

• As electronic information increases in amount and value, how to provide access to it?

• How to change digital libraries from disconnected collections to integrated systems?

• How to integrate the expertise of librarians into the development process?

• How to adapt traditional library values to new opportunities?

The Computer Scientist’s Questions

• What is the next big leap in document search technology?

• How to overcome the limitations of software’s ability to understand language?

• How can we build a search engine that learns by observing searchers?

Our Research Hypotheses

• Enabling the entire community to participate in organizing and recommending information will add value to the digital library

• In other words: Collaborative Filtering will increase the value of a digital library

What is Collaborative Filtering?

• Communities of people sharing their evaluations of content

• Recommendations are transferred between people of like interest

• Examples:
  – MovieLens.org
  – Epinions.com
  – Launchcast (launch.yahoo.com)
  – Amazon.com
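The idea of transferring recommendations between people of like interest can be illustrated with a minimal sketch. This is not the method used by any of the systems named above; it is one common textbook variant (user-based CF with cosine similarity), and the user names, document IDs, and ratings are hypothetical.

```python
from math import sqrt

# Hypothetical ratings (user -> {item: rating on a 1-5 scale}); illustrative data only.
ratings = {
    "ann": {"doc_a": 5, "doc_b": 4, "doc_c": 1},
    "bob": {"doc_a": 4, "doc_b": 5, "doc_d": 5},
    "cat": {"doc_a": 1, "doc_c": 5, "doc_d": 2},
}

def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def predict(user, item, ratings):
    """Predict a rating as the similarity-weighted average of other users' ratings."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, their_ratings in ratings.items():
        if other == user or item not in their_ratings:
            continue
        weight = cosine_similarity(ratings[user], their_ratings)
        weighted_sum += weight * their_ratings[item]
        weight_total += weight
    # None signals that no other user has rated the item.
    return weighted_sum / weight_total if weight_total else None

# ann has not rated doc_d; her tastes resemble bob's more than cat's,
# so the prediction lands closer to bob's rating than to cat's.
prediction = predict("ann", "doc_d", ratings)
```

In this sketch the prediction for a user leans toward the ratings of whoever agreed with them most in the past, which is the "like interest" transfer the slide describes.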

CF and Libraries

• Search is central to user experience of digital library

• Collaborative Filtering:
  – Could overcome the limitations of current search technology
  – CF already exists in libraries.
    • Not search, but cataloguing (OCLC)

• Adapting CF for document searching is not trivial.
  – Information needs are dynamic.

Our Approach

• OSU Libraries Recommender System
  – Perform CF at query level
    • Match similar queries in addition to similar users
  – Generate results based on past user recommendations
  – Infer recommendations from user behavior
  – Integrate with existing library systems and traditions
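Query-level CF, as outlined above, might be sketched as follows. This is an illustration under stated assumptions, not the OSU system's implementation: the session log, the +1/-1 vote encoding, the word-overlap (Jaccard) query similarity, and the `min_similarity` threshold are all hypothetical choices made for the example.

```python
from collections import defaultdict

# Hypothetical log of past queries and the documents users marked as
# useful (+1) or not useful (-1). In a real system these votes might be
# inferred from behavior; here they are simply given.
past_sessions = [
    ("salmon habitat restoration", {"doc_1": 1, "doc_2": 1}),
    ("restoration of salmon streams", {"doc_2": 1, "doc_3": -1}),
    ("library opening hours", {"doc_9": 1}),
]

def jaccard(query_a, query_b):
    """Word-overlap similarity between two queries (assumed measure)."""
    words_a = set(query_a.lower().split())
    words_b = set(query_b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0

def recommend(query, sessions, min_similarity=0.2):
    """Rank documents by votes from past sessions with similar queries."""
    scores = defaultdict(float)
    for past_query, votes in sessions:
        similarity = jaccard(query, past_query)
        if similarity < min_similarity:
            continue  # ignore sessions about unrelated information needs
        for doc, vote in votes.items():
            scores[doc] += similarity * vote
    # Keep only documents with a net positive score, best first.
    positive = [(doc, score) for doc, score in scores.items() if score > 0]
    return sorted(positive, key=lambda pair: -pair[1])

results = recommend("salmon restoration", past_sessions)
```

Matching on queries rather than only on users is one way to respect the point that information needs are dynamic: the same person can ask about unrelated topics, and only sessions with a similar query contribute votes.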

The Benefits of CF

• Quality is considered.
  – Recommendations are based on human evaluations.

• Context is considered.

• The system gets better as it’s used.

• Doesn’t require significant, centralized human resources

CS Challenges

• How to collect evaluations?

• How to identify the “useful” element of recommendations?

• How to represent the information needs of searchers?

• How to rank results?

• How to design the interface?

Library Challenges

• How to balance privacy with personalization & involvement?

• How to maintain authority of recommended information?

• How to deal with timeliness of information?

• How to integrate with existing library systems?

• How to fund research in the library setting?

What We’ve Learned

• Weakness of “old” search technology affects perception of new

• Wrapper technology minimizes IT commitment

• Existing internal data can be used to jumpstart system

• Controlled experiments show
  – Increased performance
  – Increased perception of non-tangibles

CF and Digital Libraries

• Helps handle more electronic information

• Improves search results

• Shapes direction of digital libraries

• Supports collaboration on many levels

Nothing ventured, nothing gained.

Funding

• OSU Libraries Gray Chair for Innovative Technologies

• National Partnership for Advanced Computing Infrastructure (NSF)

• Georgia Pacific HMSC internship

More information

– The Science of the Sleeper
  • Malcolm Gladwell, The New Yorker, October 4th, 1999 (gladwell.com)

– System for Electronic Recommendation Filtering Prototype (SERF) for OSU Libraries
  • http://dl.nacse.org/osu

Contacts

Janet Webster
– Oregon State University Libraries, Hatfield Marine Science Center
– [email protected]

Jon Herlocker
– Oregon State University, School of Electrical Engineering & Computer Science
– [email protected]