CIDR 2007, Asilomar, California
Predicate-Based Indexing of Enterprise Web Applications
Cristian Duda, David Graf, Donald Kossmann
ETH Zurich
Enterprise Search: Possible Approaches
“Do It Yourself” (e.g., SAP, Oracle)
+ App vendors know the semantics of their application
- Everybody implements their own search engine
- Cross-application search is difficult
“Google for Web Applications” (generic ESE)
+ Generic (for all applications)
+ Enables cross-application search
- Need to teach the semantics of the app to the search engine
- Nobody knows how to do it
Enterprise Search: Current Status
Search up to 50,000 documents for just $1,995.
Search up to 30 million documents
New! Improved search results relevance, security and access to more content.
The Google Mini delivers cost-effective, high-quality search for your public website, intranet, and file servers – and you can be up and running in less than an hour. Supports from 50,000 to 300,000 documents. Learn more.
The Google Search Appliance provides robust, scalable and secure search across virtually all the information in your company. Starts at $30,000 for search across 500,000 documents. Learn more.
Enterprise Application Search
Data
• JSP file
• Database
    id  name    type
    1   parrot  green
    2
• Property file
    title.english=PetStore
• XML Message
    <item part="1">
      <name>Snake</name>
      <quantity>1</quantity>
      <USPrice>60.30</USPrice>
    </item>
• SAP, ...
User View
Enterprise Search Engine (ESE)
Challenges:
1. User view assembled in a non-trivial way (not WYSIWYG)
2. References to Web pages are complex:
• URL
• function
• parameters
• context (workflow, security)
This is not Google!
1. Google is WYSIWYG
2. Google references are simple URIs
This is not Hidden Web!
1. The app developer collaborates and teaches the semantics of the app to the ESE
2. The ESE has full access to all data sources
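To make the second challenge concrete, a reference to a dynamic page could be modelled as a small record rather than a bare URI. This is only a sketch under assumptions: the class name `PageReference`, the `action` query parameter, and the example URL are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode


@dataclass
class PageReference:
    """Hypothetical record for a reference to a dynamic Web page."""
    url: str                                        # base URL of the page
    function: str = ""                              # application function to invoke
    parameters: dict = field(default_factory=dict)  # e.g. {"catid": "2"}
    context: dict = field(default_factory=dict)     # e.g. workflow step, security role

    def to_link(self) -> str:
        # Assumption: the function is passed as an "action" query parameter.
        # The context is kept on the record but not encoded into the link.
        query = dict(self.parameters)
        if self.function:
            query["action"] = self.function
        return f"{self.url}?{urlencode(query)}" if query else self.url


# Hypothetical URL, for illustration only.
ref = PageReference("http://example.com/petstore/category.jsp",
                    parameters={"catid": "2"})
print(ref.to_link())
```

A plain Web search engine only needs the `url` field; the other three fields are what makes enterprise application references harder to store and replay.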
Enterprise Search Engine:
• Rules and Patterns
• a handful of patterns are enough to describe the mapping from raw view to user view declaratively (semi-automatic)
• Crawl the data sources (automatic)
• Normalize the data (automatic)
• Predicate-based indexing (automatic)
• Predicate-based query processing (automatic)
Predicate-based Index
Google...: Doc Id, Keyword, Score
ESE: Predicate

Doc Id  Keyword      Score  Predicate
d1      java         7      true
d1      pet          1      true
d1      store        1      true
d1      parrot       1      $catid=1
d1      finch        1      $catid=1
d1      iguana       1      $catid=2
d1      rattlesnake  1      $catid=2
d2      male         1      $itemid=1
d2      female       1      $itemid=1
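The index above can be queried by intersecting postings per document and keeping only keyword combinations whose predicates do not contradict each other. The following is a minimal sketch, not the authors' implementation: the postings are copied from the slide, predicates are modelled as parameter bindings (an empty dict stands for `true`), and summing the scores is an assumption made purely for illustration.

```python
from itertools import product

# Postings copied from the slide: (doc_id, keyword, score, predicate).
# A predicate is modelled as a dict of bindings; {} stands for `true`.
POSTINGS = [
    ("d1", "java", 7, {}),
    ("d1", "pet", 1, {}),
    ("d1", "store", 1, {}),
    ("d1", "parrot", 1, {"catid": "1"}),
    ("d1", "finch", 1, {"catid": "1"}),
    ("d1", "iguana", 1, {"catid": "2"}),
    ("d1", "rattlesnake", 1, {"catid": "2"}),
    ("d2", "male", 1, {"itemid": "1"}),
    ("d2", "female", 1, {"itemid": "1"}),
]


def merge(a, b):
    """Conjoin two predicates; return None if they bind a parameter differently."""
    for key in a.keys() & b.keys():
        if a[key] != b[key]:
            return None
    return {**a, **b}


def search(query):
    """Return (doc_id, bindings, score) for each satisfiable match of all keywords."""
    keywords = query.lower().split()
    if not keywords:
        return []
    results = []
    for doc in sorted({d for d, _, _, _ in POSTINGS}):
        # One list of (score, predicate) hits per query keyword.
        per_keyword = [
            [(s, p) for d, k, s, p in POSTINGS if d == doc and k == kw]
            for kw in keywords
        ]
        if any(not hits for hits in per_keyword):
            continue  # some keyword never occurs in this document
        for combo in product(*per_keyword):
            bindings, score = {}, 0
            for s, p in combo:
                bindings = merge(bindings, p)
                if bindings is None:
                    break  # contradictory predicates, e.g. $catid=1 and $catid=2
                score += s  # assumption: scores are simply summed
            else:
                results.append((doc, bindings, score))
    return results


print(search("java iguana"))
print(search("parrot iguana"))
```

In this sketch, "java iguana" matches d1 under the binding catid=2, whereas "parrot iguana" yields nothing because no single page instance shows both words. A keyword-only index without the predicate column would report d1 for both queries.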
Demo!
• Indexing
• Query Processing
• Result Generation
Use Case: Sun’s Java Pet Store Application
The Application
• JSP Application developed by Sun
• Uses Dynamic JSP Pages + Database
• Sun uses it to showcase the capabilities of their J2EE platform
Indexing (using our GUI)
JSP Files
Rules from app. developer
Index location
Indexed files
Query Processing (using our GUI)
The queried index
Query
Results (URL + additional info)
Result presentation
Query: java iguana
1. Double-click on a query result.
2. The Web page (user view) is displayed in the browser.
Result presentation
Query: java iguana
• "java": only appears in the JSP file
• "iguana": only appears in the database
• Our ESE understood how the two data sources are combined!
• The ESE combined the two data sources just as the application would have done
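One way to picture how a single document entry can cover both sources is to index static page text with the predicate `true` and database values with a predicate on the page parameter that selects them. The sketch below is an illustration under assumptions: the static text, the rows, and the function name `build_postings` are hypothetical, chosen only to agree with the index shown earlier, and scores are left out.

```python
# Hypothetical inputs, chosen to agree with the index shown earlier in the talk.
JSP_STATIC_TEXT = "Java Pet Store"
PRODUCT_ROWS = [
    {"catid": "1", "name": "parrot"},
    {"catid": "1", "name": "finch"},
    {"catid": "2", "name": "iguana"},
    {"catid": "2", "name": "rattlesnake"},
]


def build_postings(doc_id, static_text, rows, param):
    """Return (doc_id, keyword, predicate) triples for one dynamic page."""
    postings = []
    # Static text is visible for every parameter value, so its predicate is `true`.
    for word in static_text.lower().split():
        postings.append((doc_id, word, "true"))
    # Database values are visible only when the page parameter selects their row.
    for row in rows:
        for word in row["name"].lower().split():
            postings.append((doc_id, word, f"${param}={row[param]}"))
    return postings


postings = build_postings("d1", JSP_STATIC_TEXT, PRODUCT_ROWS, "catid")
for entry in postings:
    print(entry)
```

With entries of this shape, a query such as "java iguana" can be answered from one index even though neither source contains both words.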
Something funny
The application also has a search functionality, but…
Something funny
No Results!
The application’s search box is broken
Details: http://www.dbis.ethz.ch/research/current_projects/appdata
Contacts:
Cristian Duda
ETH Zurich, Switzerland
cristian.duda at inf.ethz.ch
Donald Kossmann
ETH Zurich, Switzerland
kossmann at inf.ethz.ch