05/06/2007CRIS 2008 Personalizing Information Retrieval in CRISs with Fuzzy Sets and Rough Sets...
-
Upload
jasmine-sutton -
Category
Documents
-
view
216 -
download
0
Transcript of 05/06/2007CRIS 2008 Personalizing Information Retrieval in CRISs with Fuzzy Sets and Rough Sets...
05/06/2007 CRIS 2008
Personalizing Information Retrieval in CRISs with Fuzzy Sets and Rough Sets
Germán Hurtado Martín1,2
Chris Cornelis2
Helga Naessens1
1. University College Ghent, 2. Ghent University (Belgium)
CRIS 2008 205/06/2007
Overview
Problems in CRISs Fuzzy sets and Rough sets PAS project
CRIS 2008 305/06/2007
Overview
Problems in CRISs Fuzzy sets and Rough sets PAS project
CRIS 2008 405/06/2007
Problems in CRISs
Fuzzy
RoughTerm = Term
CRIS 2008 505/06/2007
Overview
Problems in CRISs Fuzzy sets and Rough sets PAS project
CRIS 2008 605/06/2007
Fuzzy sets and rough sets
Traditional approach: crisp sets
Young people = {x People | 0<age(x)<27}
CRIS 2008 705/06/2007
Fuzzy sets and rough sets
Fuzzy approach: fuzzy sets
0 if age(x) ≥ 301 if age(x) ≤ 20(30 – age(x)) / 10 otherwise
Young(x) =
CRIS 2008 805/06/2007
Fuzzy sets and rough sets Rough approach: rough sets
Upper approximation (R↑A)
A = {Numerical Analysis} B = {Compilers}
R↑A = {Num. Analysis, Ex. Sciences, Statistics, ... , Coding Theory} R↑B = {Compilers, Programming, GCC, YACC}
CRIS 2008 905/06/2007
Fuzzy rough sets
Fuzzy approach on rough sets Fuzzy set A Fuzzy relation R
R (x,y) Upper approximation
(R↑A)(y) = min(R(x,y),A(y))Xx∈
sup
CRIS 2008 1005/06/2007
Fuzzy rough sets: application
Query expansionAllows more results by using R↑A
R Programming Hardware C++ Java Laptop Algorithm
Programming 1.0 0.8 0.8 0.6
Hardware 1.0 0.4
C++ 0.8 1.0 0.7 0.2
Java 0.8 0.7 1.0 0.2
Laptop 0.4 1.0
Algorithm 0.6 0.2 0.2 1.0
- Query: “Programming”- Expanded query: {(“Programming”,1.0), (“C++”,0.8), (“Java”,0.8), (“Algorithm”,0.6)}
CRIS 2008 1105/06/2007
Overview
Problems in CRISs Fuzzy sets and Rough sets PAS project
CRIS 2008 1205/06/2007
PAS-project
What is the PAS-project? Personal Alert System (HoGent) Goal: to get the researcher’s attention on funding
possibilities that match his/her profile Information: about researchers, projects, funding
possibilities (grants etc.) → matching/collaboration Automation and intelligence
CRIS 2008 1305/06/2007
PAS – How does it work?
-Name
-Staff number
-Department(s)
-Group
-Date of creation of the profile
-Last update of the profile
-Percentage research time
-Skills description
-Diplomas
-Publications
-IWETO-keywords
-Free keywords
Fill in
IWETO
Thesaurus
HoGent
Thesaurus
User
CRIS 2008 1405/06/2007
PAS – How does it work?
-Reference
-Title
-Content
-Attachment(s)
-Level
-Duration
-Institution
-Deadline
-Address
-Contact person
-IWETO-keywords
-Free keywords
IWETO
Thesaurus
Messages
HoGent
Thesaurus
CRIS 2008 1505/06/2007
PAS – How does it work?
The IWETO-classification has 641 research fields:
5 at the 1st level, 31 at the 2nd level, 605 at the 3rd level
1
2
3
CRIS 2008 1605/06/2007
PAS – How does it work?
By adding “free keywords” we can refine the classification
1
2
3
0.6
0.7
0.8
CRIS 2008 1705/06/2007
PAS – How does it work?
Query:A = {k3}
Expanded query:R↑A = {(k1,0.8), (k3,1.0), …}
M1 → R2
CRIS 2008 1805/06/2007
PAS – How does it work?
0.6
0.7
0.80.7
CRIS 2008 1905/06/2007
CRIS 2008 2005/06/2007
CRIS 2008 2105/06/2007
CRIS 2008 2205/06/2007
CRIS 2008 2305/06/2007
CRIS 2008 2405/06/2007
CRIS 2008 2505/06/2007
PAS – Current implementation
Prototype that will be used as skeleton for the final system
Basic algorithm using weights and their products and basic fuzzy rough query expansion1
Basic profiles and messages Manual processing of feedback and manual data
extraction from text files.
1 P. Srinivasan, M. E. Ruiz, D. H. Kraft, J. Chen: Vocabulary mining for information retrieval: rough sets and fuzzy sets, Information Processing and Management, 37(1) (2001) 15-38
CRIS 2008 2605/06/2007
PAS – Future work
Richer representation of profiles and messages Automation of the feedback mechanism Dealing with imprecision and words from different thesauri Dealing with ambiguity and incomplete profiles Tracking research activities for collaboration Automatic extraction of information from text files Search engine
CRIS 2008 2705/06/2007
Thank you