
05/06/2007 CRIS 2008

Personalizing Information Retrieval in CRISs with Fuzzy Sets and Rough Sets

Germán Hurtado Martín 1,2

Chris Cornelis 2

Helga Naessens 1

1. University College Ghent, 2. Ghent University (Belgium)


Overview

- Problems in CRISs
- Fuzzy sets and rough sets
- PAS project


Problems in CRISs

Fuzzy

Rough

Term = Term


Fuzzy sets and rough sets

Traditional approach: crisp sets

Young people = {x ∈ People | 0 < age(x) < 27}


Fuzzy sets and rough sets

Fuzzy approach: fuzzy sets

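\[
\mathrm{Young}(x) =
\begin{cases}
0 & \text{if } \mathrm{age}(x) \ge 30 \\
1 & \text{if } \mathrm{age}(x) \le 20 \\
(30 - \mathrm{age}(x)) / 10 & \text{otherwise}
\end{cases}
\]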
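As a minimal sketch, the crisp definition (0 < age < 27) and the fuzzy membership function for "young" can be compared side by side in Python. The function names `is_young_crisp` and `young` are illustrative and do not come from the slides.

```python
def is_young_crisp(age: float) -> bool:
    """Crisp set from the slides: a person is young iff 0 < age < 27."""
    return 0 < age < 27


def young(age: float) -> float:
    """Fuzzy membership degree of 'young' as defined on the slide."""
    if age >= 30:
        return 0.0
    if age <= 20:
        return 1.0
    return (30 - age) / 10


# The crisp set flips abruptly at 27; the fuzzy degree decreases gradually.
for age in (18, 25, 27, 29, 35):
    print(age, is_young_crisp(age), young(age))
```

With the crisp set, a 26-year-old is young and a 27-year-old is not; with the fuzzy set, both are young to some degree.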


Fuzzy sets and rough sets

Rough approach: rough sets

Upper approximation (R↑A)

A = {Numerical Analysis}
B = {Compilers}

R↑A = {Num. Analysis, Ex. Sciences, Statistics, ..., Coding Theory}
R↑B = {Compilers, Programming, GCC, YACC}
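The upper approximation can be sketched in Python as the set of all terms related to at least one term of the query set. The relation below is hypothetical: it is chosen only so that the result for B = {Compilers} matches the slide, and the function name `upper_approximation` is an assumption.

```python
from __future__ import annotations


def upper_approximation(relation: dict[str, set[str]], a: set[str]) -> set[str]:
    """Return every term that is related to at least one term in `a`."""
    result: set[str] = set()
    for x in a:
        result |= relation.get(x, set())
    return result


# Hypothetical relation: each term maps to the terms it is related to.
related = {
    "Compilers": {"Compilers", "Programming", "GCC", "YACC"},
}

b = {"Compilers"}
print(sorted(upper_approximation(related, b)))
```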


Fuzzy rough sets

Fuzzy approach on rough sets

- Fuzzy set A
- Fuzzy relation R: R(x,y)
- Upper approximation:

\[
(R{\uparrow}A)(y) = \sup_{x \in X} \min\bigl(R(x,y),\, A(x)\bigr)
\]
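As a quick sanity check, assume that A and R are crisp, that is, they only take the values 0 and 1. Then min(R(x,y), A(x)) equals 1 exactly when x belongs to A and x is related to y, so the fuzzy upper approximation reduces to the crisp one:

```latex
\[
(R{\uparrow}A)(y) = \sup_{x \in X} \min\bigl(R(x,y),\, A(x)\bigr) =
\begin{cases}
1 & \text{if there is an } x \in A \text{ with } R(x,y) = 1 \\
0 & \text{otherwise}
\end{cases}
\]
```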


Fuzzy rough sets: application

Query expansion

Allows more results by using R↑A

| R | Programming | Hardware | C++ | Java | Laptop | Algorithm |
|---|---|---|---|---|---|---|
| Programming | 1.0 | | 0.8 | 0.8 | | 0.6 |
| Hardware | | 1.0 | | | 0.4 | |
| C++ | 0.8 | | 1.0 | 0.7 | | 0.2 |
| Java | 0.8 | | 0.7 | 1.0 | | 0.2 |
| Laptop | | 0.4 | | | 1.0 | |
| Algorithm | 0.6 | | 0.2 | 0.2 | | 1.0 |

- Query: “Programming”
- Expanded query: {(“Programming”,1.0), (“C++”,0.8), (“Java”,0.8), (“Algorithm”,0.6)}
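The expansion of the query “Programming” can be reproduced with a short Python sketch of the fuzzy upper approximation, taking the supremum over query terms of the minimum of the relation degree and the query weight. The blank cells of the relation are assumed to be 0, and the names `TERMS`, `R` and `expand_query` are illustrative.

```python
from __future__ import annotations

TERMS = ["Programming", "Hardware", "C++", "Java", "Laptop", "Algorithm"]

# Relation R from the slide; missing entries are assumed to mean degree 0.
R = {
    "Programming": {"Programming": 1.0, "C++": 0.8, "Java": 0.8, "Algorithm": 0.6},
    "Hardware": {"Hardware": 1.0, "Laptop": 0.4},
    "C++": {"Programming": 0.8, "C++": 1.0, "Java": 0.7, "Algorithm": 0.2},
    "Java": {"Programming": 0.8, "C++": 0.7, "Java": 1.0, "Algorithm": 0.2},
    "Laptop": {"Hardware": 0.4, "Laptop": 1.0},
    "Algorithm": {"Programming": 0.6, "C++": 0.2, "Java": 0.2, "Algorithm": 1.0},
}


def expand_query(query: dict[str, float]) -> dict[str, float]:
    """Fuzzy upper approximation of `query` under R, keeping non-zero degrees."""
    expanded = {}
    for y in TERMS:
        degree = max(
            (min(R.get(x, {}).get(y, 0.0), weight) for x, weight in query.items()),
            default=0.0,
        )
        if degree > 0:
            expanded[y] = degree
    return expanded


print(expand_query({"Programming": 1.0}))
# {'Programming': 1.0, 'C++': 0.8, 'Java': 0.8, 'Algorithm': 0.6}
```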


PAS-project

What is the PAS-project?

- Personal Alert System (HoGent)
- Goal: to draw the researcher’s attention to funding possibilities that match his/her profile
- Information: about researchers, projects, funding possibilities (grants etc.) → matching/collaboration
- Automation and intelligence


PAS – How does it work?

-Name

-Staff number

-Department(s)

-Group

-Date of creation of the profile

-Last update of the profile

-Percentage research time

-Skills description

-Diplomas

-Publications

-IWETO-keywords

-Free keywords

[Diagram labels: User, Fill in, IWETO Thesaurus, HoGent Thesaurus]


PAS – How does it work?

-Reference

-Title

-Content

-Attachment(s)

-Level

-Duration

-Institution

-Deadline

-Address

-Contact person

-IWETO-keywords

-Free keywords

[Diagram labels: Messages, IWETO Thesaurus, HoGent Thesaurus]


PAS – How does it work?

The IWETO-classification has 641 research fields:

5 at the 1st level, 31 at the 2nd level, 605 at the 3rd level

[Figure: classification tree with levels 1, 2 and 3]


PAS – How does it work?

By adding “free keywords” we can refine the classification

[Figure: classification tree with levels 1, 2 and 3, with weights 0.6, 0.7 and 0.8]


PAS – How does it work?

Query: A = {k3}

Expanded query: R↑A = {(k1,0.8), (k3,1.0), …}

M1 → R2
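A rough sketch of how an expanded message query might lead to a match such as M1 → R2. Everything in it is hypothetical and not taken from the PAS system: the relation values other than R(k1,k3) = 0.8, the profiles of R1 and R2, and the scoring rule (the best expanded weight among the keywords a profile shares with the expanded query).

```python
# Hypothetical symmetric fuzzy relation between keywords.
RELATION = {
    ("k1", "k3"): 0.8,
    ("k2", "k3"): 0.6,
}


def related(x, y):
    """Degree to which x and y are related; every keyword relates to itself."""
    if x == y:
        return 1.0
    return RELATION.get((x, y), RELATION.get((y, x), 0.0))


def expand(keywords, vocabulary):
    """Upper approximation of a crisp keyword set under the fuzzy relation."""
    expanded = {}
    for y in vocabulary:
        degree = max((related(x, y) for x in keywords), default=0.0)
        if degree > 0:
            expanded[y] = degree
    return expanded


def best_match(message_keywords, profiles, vocabulary):
    """Score each profile by its best-matching keyword in the expanded query."""
    expanded = expand(message_keywords, vocabulary)
    scores = {
        name: max((expanded.get(k, 0.0) for k in keywords), default=0.0)
        for name, keywords in profiles.items()
    }
    return max(scores, key=scores.get), scores


VOCAB = ["k1", "k2", "k3", "k4"]
PROFILES = {"R1": {"k4"}, "R2": {"k1"}}  # hypothetical researcher profiles

winner, scores = best_match({"k3"}, PROFILES, VOCAB)
print(winner, scores)
```

Without expansion, the message keyword k3 matches neither profile; after expansion, R2 is reached through k1.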


PAS – How does it work?

[Figure: weights 0.6, 0.7, 0.8 and 0.7]


PAS – Current implementation

- Prototype that will be used as skeleton for the final system
- Basic algorithm using weights and their products, and basic fuzzy rough query expansion [1]
- Basic profiles and messages
- Manual processing of feedback and manual data extraction from text files

[1] P. Srinivasan, M. E. Ruiz, D. H. Kraft, J. Chen: Vocabulary mining for information retrieval: rough sets and fuzzy sets. Information Processing and Management 37(1) (2001) 15-38.
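The slides do not spell out how the weights are multiplied. One possible reading, shown here purely as an assumption, is that the relatedness of two keywords on the same branch of the classification tree is the product of the edge weights between them. The tree, the node names and the weights below are hypothetical.

```python
# Hypothetical tree: child -> (parent, weight of the edge to the parent).
PARENT = {
    "free keyword": ("level 3 field", 0.8),
    "level 3 field": ("level 2 field", 0.7),
    "level 2 field": ("level 1 field", 0.6),
}


def path_weight(descendant, ancestor):
    """Product of edge weights from `descendant` up to `ancestor`.

    Returns 0.0 when `ancestor` does not lie above `descendant`.
    """
    weight = 1.0
    node = descendant
    while node != ancestor:
        if node not in PARENT:
            return 0.0
        parent, edge = PARENT[node]
        weight *= edge
        node = parent
    return weight


print(round(path_weight("free keyword", "level 2 field"), 2))
```

Under this reading, the relatedness decreases with every step up the tree, so distant fields contribute less to a match than nearby ones.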


PAS – Future work

- Richer representation of profiles and messages
- Automation of the feedback mechanism
- Dealing with imprecision and words from different thesauri
- Dealing with ambiguity and incomplete profiles
- Tracking research activities for collaboration
- Automatic extraction of information from text files
- Search engine


Thank you