Meta Search Engines
Taly Sharon
T.Sharon Search Engine Seminar
Contents
Search Engines (SEs) generations
Meta Search Engine (MSE)
Why use several SEs (Motivation)?
Highlighted MSEs (Mamma, Dogpile, Vivisimo, Ixquick, KartOO)
Hebrew MSEs
MSE comparison
When to use MSE – pros and cons
How to choose MSE?
Search Engines Generations
1st Generation - Basic SEs:
2nd Generation - Meta SEs:
3rd Generation - Popularity SEs:
2nd Generation SEs - MetaSEs
Using several SEs in parallel. The results are filtered, ranked and presented to the user as a uniform list.
The ranking is a combination of the number of sources each page appeared in, and the ranking in each source.
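One way such a combined score could be computed is sketched below. The scoring formula (a Borda-style sum of position points) is an assumption chosen for illustration, not the method of any particular MSE, and the engine names and URLs are made up.

```python
from collections import defaultdict

def merge_rankings(results_by_engine, depth=10):
    """Merge per-engine ranked URL lists into a single ranked list.

    Score = sum over engines of (depth - position), so a page gains both
    from appearing in many sources and from ranking high in each of them.
    """
    scores = defaultdict(int)   # URL -> accumulated position points
    sources = defaultdict(int)  # URL -> number of engines that returned it
    for engine, urls in results_by_engine.items():
        for position, url in enumerate(urls[:depth]):
            scores[url] += depth - position
            sources[url] += 1
    # Highest score first; ties broken by source count, then alphabetically.
    return sorted(scores, key=lambda u: (-scores[u], -sources[u], u))

# Hypothetical result lists from three engines.
results = {
    "engine_a": ["a.example", "b.example", "c.example"],
    "engine_b": ["b.example", "d.example", "a.example"],
    "engine_c": ["b.example", "c.example"],
}
merged = merge_rankings(results, depth=3)
print(merged)
```

Here b.example comes first because all three engines return it near the top, while d.example, returned by a single engine, comes last.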
Meta SE is a Meta-Service
It doesn’t use an Index/database of its own.
It uses other external search services that provide the information necessary to fulfill user queries.
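The meta-service idea can be sketched as a parallel fan-out with no local index. The two engine functions below are hypothetical stand-ins that return canned results; a real MSE would call each external service over the network instead.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for external search services.
def engine_one(query):
    return [f"one.example/{query}/1", f"one.example/{query}/2"]

def engine_two(query):
    return [f"two.example/{query}/1"]

ENGINES = {"one": engine_one, "two": engine_two}

def fan_out(query, engines=ENGINES):
    """Send the query to every engine in parallel and collect results by engine."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in engines.items()}
        return {name: future.result() for name, future in futures.items()}

responses = fan_out("jaguar")
print(responses)
```

The meta-service stores nothing itself: everything it returns comes from the engines it queried.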
Meta Search Engine
MetaCrawler
Yahoo · WebCrawler · Open Text · Lycos · InfoSeek · Inktomi · Galaxy · Excite
Google · Yahoo · Ask Jeeves · About · LookSmart · Overture · FindWhat
Premises of a Meta SE
No single search is sufficient.
Problem in expressing the query.
Low quality references can be detected.
Why use Several SEs?
Search Engines differ more than we think!
Overlap between Google and Yahoo
Source: Jux2 analysis of 500 top search terms, April 2004
http://www.jux2.com/stats.php
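An overlap figure of this kind can be computed as follows. This is a minimal sketch assuming overlap means the share of one engine's top results that also appear in the other's; the exact measure used by Jux2 is not stated here, and the URLs are invented.

```python
def overlap_percent(results_a, results_b):
    """Share of results_a's URLs that also appear in results_b, in percent."""
    set_a, set_b = set(results_a), set(results_b)
    if not set_a:
        return 0.0
    return 100.0 * len(set_a & set_b) / len(set_a)

# Hypothetical top results for the same query on two engines.
top_a = ["w.example", "x.example", "y.example", "z.example"]
top_b = ["x.example", "p.example", "q.example", "r.example", "s.example"]

a_in_b = overlap_percent(top_a, top_b)
b_in_a = overlap_percent(top_b, top_a)
print(a_in_b, b_in_a)
```

The measure is asymmetric when the two lists differ in length, which is why it matters which engine's list is taken as the base.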
Who Overlaps Whom?
Try it yourself @ jux2
MSE - Motivation
1. The number and variety of SEs.
2. Each SE provides an incomplete snapshot of the Web.
3. Users are forced to try and retry their queries across different SEs.
4. Each SE has its own interface.
5. Irrelevant, outdated or unavailable responses.
6. Each query is independent.
7. No individual customization.
8. The result is not homogenized.
Problems of MSEs
No advanced search options.
Using the lowest common denominator.
Sponsored results from the SEs are not highlighted.
Highlighted MSEs
Mamma
rSort: Mamma’s Ranking Algorithm
Each duplicate search result is considered a 'vote' for that result.
Pages with the highest number of votes go at the top of our result set.
The method of voting we use is a simplified version of the "Condorcet Method", named after the mathematician Marquis de Condorcet who invented this voting procedure in the 18th century.
One of the big advantages of this ranking method is the elimination of search engine spam.
Spammers often have difficulty spamming more than one engine at the same time, as different spamming methods must be used for each search engine.
Spam results will tend to receive fewer votes from multiple sources.
A spammer may have top ranking on one search engine, but they won't achieve it on Mamma unless they're able to spam ALL of our sources, an insurmountable task for even the best spammer.
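The vote-counting idea can be sketched as below. This is a plain vote count with a best-position tie-break, assumed for illustration; it is neither Mamma's actual rSort code nor a full Condorcet method, and the URL normalization and sample data are hypothetical.

```python
from urllib.parse import urlsplit

def normalize(url):
    """Crude duplicate detection: ignore scheme, 'www.' prefix and trailing slash."""
    parts = urlsplit(url if "//" in url else "//" + url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")

def rank_by_votes(results_by_engine):
    """Order pages by how many engines returned them (one vote per engine)."""
    votes = {}      # normalized URL -> set of engines that returned it
    best_rank = {}  # normalized URL -> best (lowest) position seen
    for engine, urls in results_by_engine.items():
        for position, url in enumerate(urls):
            key = normalize(url)
            votes.setdefault(key, set()).add(engine)
            best_rank[key] = min(best_rank.get(key, position), position)
    # Most votes first; ties broken by best position, then alphabetically.
    return sorted(votes, key=lambda k: (-len(votes[k]), best_rank[k], k))

# Hypothetical results: a spam page tops one engine only.
results = {
    "engine_a": ["http://spam.example/", "http://www.good.example/page"],
    "engine_b": ["https://good.example/page/", "http://other.example"],
    "engine_c": ["http://other.example/", "http://good.example/page"],
}
ranked = rank_by_votes(results)
print(ranked)
```

The spam page holds first place on engine_a but collects a single vote, so it drops to the bottom of the merged list.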
Dogpile
Dogpile Advanced
Dogpile Preferences
Vivisimo/Clusty
Vivísimo supports the most advanced features of the major search engines using one Vivísimo syntax, which follows the most standard conventions.
Vivísimo translates your query into the corresponding syntax of each underlying search engine.
Also, Vivísimo only queries the search engines that support your chosen syntax.
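The translate-and-filter behaviour described above might look roughly like this. The capability table, engine names and operator spellings are all hypothetical; they do not describe Vivísimo's implementation or any real engine's syntax.

```python
# Hypothetical capability table: the features each engine supports and how it
# spells them. None of these entries describe a real engine's syntax.
ENGINE_SYNTAX = {
    "engine_a": {"phrase": '"{}"', "require": "+{}", "domain": "site:{}"},
    "engine_b": {"phrase": '"{}"', "require": "{}", "domain": "domain:{}"},
    "engine_c": {"phrase": '"{}"', "require": "+{}"},  # no domain filter
}

def translate(query, syntax):
    """Render a parsed query in one engine's syntax, or None if unsupported."""
    parts = []
    for feature, value in query:
        template = syntax.get(feature)
        if template is None:
            return None  # this engine lacks a feature the query needs
        parts.append(template.format(value))
    return " ".join(parts)

def plan(query):
    """Return {engine: translated query} for engines supporting every feature."""
    translated = {name: translate(query, syntax)
                  for name, syntax in ENGINE_SYNTAX.items()}
    return {name: q for name, q in translated.items() if q is not None}

# A query already parsed from the unified syntax into (feature, value) pairs.
parsed = [("domain", "edu"), ("require", "literature"), ("phrase", "pablo neruda")]
queries = plan(parsed)
print(queries)
```

Because engine_c has no domain filter in this table, it is skipped rather than sent a query it cannot honour.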
Clusty
Vivisimo Advanced
Ixquick
KartOO – Visual MSE
MetaSEs in Hebrew: Clusty Start
Clusty
When to use an MSE?
When a single basic SE fails to provide good results.
One-stop shopping: you prefer to search multiple SEs/sites at once to get blended ranked results (so as to save effort/time).
Searching for multi-faceted topics.
You want clustered results to focus the search on the relevant keywords.
Looking for current events/news.
For quick and dirty searches. If you want an answer fast, you may have better luck querying multiple engines simultaneously.
For broad and shallow searches. Meta searching is an excellent approach if the purpose of your search is to get an overview of a topic.
To assess potential keywords for an unfamiliar subject. What better way to discover search terms than to see how they appear in a cross section of documents across the web?
To see how different engines handle the same query. This is an excellent way to get to know the "personalities" of different search engines -- their strengths, weaknesses, and types of queries they handle best.
MSE pros
Useful when you want to retrieve a relatively small number of relevant results.
An excellent choice for obscure topics.
A good option when you are not having luck finding what you want when you search.
Appropriate when you want to get an overall picture of what is available on the Web on your topic.
MSE cons
Use is limited primarily to simple queries.
Little or no field searching is available.
Most services return a limited number of results that do not represent the totality of results from any source engine.
Sponsored results may not be highlighted (even though they are probably not listed first).
How to Choose your MetaSE
Search engines used
Operators supported
Special features
Speed
Presentation
Meta-SEs Features Chart
Red – not working
Vivisimo: link: Not supported?
Practical Recommendations
Use Ixquick for fast results and maximal syntax flexibility.
Use Vivisimo/Clusty (start) for clustering and/or Hebrew.
Use Dogpile to include Google+Yahoo!, date range, or spelling corrections.
Use none for non-MSE tasks (see MSE cons)…
Exercises
Find a presentation by Mary Ellen Bates.
Learn about the literature of Pablo Neruda from a research/educational point of view.
Hint, query: +domain:edu +literature +"pablo neruda"
Explore the different meanings of Jaguar.