Post on 01-Jan-2016

1

Enhancements in Query Evaluation and Page Summarization of The Thinking Algorithm

M. Shoaib Jameel, Amar Akshat, Chingtham Tejbanta Singh

Department of Computer Science and Engineering

Sikkim Manipal Institute of Technology

INDIA

ITSim’2008 Kuala Lumpur Malaysia

2

Discussion Flow

Introduction

Query Parsing

Page Summarization

URL Sorting

Results Evaluation Mechanism

Conclusion

3

Introduction – The Thinking Algorithm

Web search engine algorithm.

Tries to answer the following question for every query: “Sitting in a particular region, WHY have you entered such a query?”

4

Query Parsing

Determine the query context.

Ambiguity removal.

Determine the user’s competence from the query, e.g. [hacking] versus [how do I do hacking].

Compounded Uniqueness Level (C.U.L.): geo-location searching.
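The competence example above can be illustrated with a small sketch. The slides do not say which signals the algorithm actually uses, so the rule here is only an assumption: a query phrased as a natural-language question is guessed to come from a beginner. The names `QUESTION_OPENERS` and `looks_like_beginner` are hypothetical.

```python
import re

# Hypothetical cue words; the slides do not list the actual signals used.
QUESTION_OPENERS = {"how", "what", "why", "where", "which", "who",
                    "can", "do", "does", "is"}


def tokenize(query: str) -> list[str]:
    """Lowercase the query and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", query.lower())


def looks_like_beginner(query: str) -> bool:
    """Assumed heuristic: a query that opens like a question suggests a beginner."""
    tokens = tokenize(query)
    return bool(tokens) and tokens[0] in QUESTION_OPENERS


if __name__ == "__main__":
    for q in ["hacking", "how do I do hacking"]:
        label = "beginner" if looks_like_beginner(q) else "undetermined"
        print(f"[{q}] -> {label}")
```

A bare keyword such as [hacking] is left undetermined on purpose: the heuristic only asserts what the question form suggests and does not claim that a short query implies expertise.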

5

Page Summarization

PageTags and query expansion, e.g. {demo, evaluation} and {tutorials, courses}.

Understanding Page Format – Amount of textual information, number of images

How rich a web page is in a particular context.
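As a rough illustration of the page-format idea, the sketch below counts the visible text characters and the images of an HTML page using Python's standard `html.parser`. How the algorithm actually measures or weights these quantities is not stated in the slides; `PageStats` and `page_format` are hypothetical names.

```python
from html.parser import HTMLParser


class PageStats(HTMLParser):
    """Collect the visible text length and the image count of an HTML page."""

    def __init__(self) -> None:
        super().__init__()
        self.text_chars = 0
        self.images = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "img":
            self.images += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Count only text that a reader would see on the page.
        if self._skip_depth == 0:
            self.text_chars += len(data.strip())


def page_format(html: str) -> dict:
    """Return the amount of textual information and the number of images."""
    parser = PageStats()
    parser.feed(html)
    parser.close()
    return {"text_chars": parser.text_chars, "images": parser.images}


if __name__ == "__main__":
    sample = "<p>GCC tutorial</p><img src='logo.png'>"
    print(page_format(sample))
```

These two counts could then feed whatever richness measure is chosen for a given context, for example favouring text-heavy pages for tutorial queries.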

6

URL Page Sorting

Important Feature: Considers user’s Internet connection speed.

Queries targeted: [news], [download doom]
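One way to picture speed-aware sorting is sketched below. The slides do not give the sorting formula, so this assumes that each result carries a relevance score and a page size, and that relevance is discounted by the estimated load time on the user's connection. `Result`, `load_seconds`, `sort_for_connection` and the example URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    relevance: float  # hypothetical relevance score in [0, 1]
    size_bytes: int   # hypothetical total page weight


def load_seconds(size_bytes: int, speed_kbps: float) -> float:
    """Estimate the transfer time of a page on a link of the given speed."""
    return size_bytes * 8 / (speed_kbps * 1000)


def sort_for_connection(results: list[Result], speed_kbps: float) -> list[Result]:
    """Order results by relevance discounted by estimated load time (assumed rule)."""
    def score(r: Result) -> float:
        return r.relevance / (1.0 + load_seconds(r.size_bytes, speed_kbps))
    return sorted(results, key=score, reverse=True)


if __name__ == "__main__":
    results = [
        Result("http://heavy.example.com/news", 0.9, 5_000_000),
        Result("http://light.example.com/news", 0.5, 100_000),
    ]
    for speed in (56, 100_000):  # dial-up versus a fast link, in kbit/s
        order = [r.url for r in sort_for_connection(results, speed)]
        print(speed, "kbit/s:", order)
```

Under this assumed rule a slow connection pushes heavy pages down, while on a fast connection the ordering approaches plain relevance ordering.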

7

Query Evaluation Mechanism – Query [download gcc]

Step 1: Parsing of the query.

Step 2: ‘download’ found. Conveys that the user wants to download something.

Step 3: Convert gcc to gcc.exe, gcc.tar.gz, gcc.zip etc.

Step 4: Search in the indexes of software download sites like download.com, tucows.com etc.

Step 5: Apply C.U.L.

Step 6: Sort pages according to the user’s Internet connection speed.

Step 7: Results
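Steps 1 to 3 above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the suffix list contains only the three expansions named on the slide (the "etc." is left open), and `parse_query` and `expand_download_query` are hypothetical names. Steps 4 to 7 (index lookup, C.U.L., speed-based sorting) are not modelled here.

```python
# Intent keyword and file suffixes taken from the slide; the slide's "etc."
# means the real list is longer, but its contents are not given.
DOWNLOAD_KEYWORD = "download"
FILE_SUFFIXES = (".exe", ".tar.gz", ".zip")


def parse_query(query: str) -> list[str]:
    """Step 1: split the query into lowercase tokens."""
    return query.lower().split()


def expand_download_query(query: str) -> list[str]:
    """Steps 2 and 3: detect download intent and expand targets to file names."""
    tokens = parse_query(query)
    if DOWNLOAD_KEYWORD not in tokens:
        return []  # no download intent detected, nothing to expand
    targets = [t for t in tokens if t != DOWNLOAD_KEYWORD]
    return [target + suffix for target in targets for suffix in FILE_SUFFIXES]


if __name__ == "__main__":
    print(expand_download_query("download gcc"))
    # prints ['gcc.exe', 'gcc.tar.gz', 'gcc.zip']
```

The expanded file names would then be the terms looked up in the indexes of software download sites in Step 4.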

8

Conclusion

A unique algorithm that considers human factors, especially the user’s competence.

Addresses some of the major issues in search, such as geo-location searches and the WHY factor.

Future of search technology.

9

Thank You
