Restricted Search Engine
Laurent Balat
Christophe Decis
Thomas Forey
Sebastien Leclercq
ESSI2 Project
Supervisor: Johny BOND
June 2002
Introduction (1)
• What is a search engine?
• 3 types:
  – disciplinary
  – global
  – thematic
• Internet users spend more than 50% of their time searching!
Introduction (2)
• Lots of pages can’t be reached.
[Figure: diagram labelled "WEB", "Indexable WEB" and "Google"]
How does it work?
• The search engine is composed of two parts.
• First part: the web site spider.
[Figure: spider-side architecture. Labels: WEB, Spider, Processing units (PDF, DOC, HTML), indexing, DATABASE, Constraint]
How does it work?
• User part architecture
[Figure: user-side architecture. Labels: User, Query Interface, Query engine, DATABASE]
Constraints
• Domain Restriction.
• Search depth.
• Theme: words accepted or not.
• Document type.
• Time delay.
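The five constraints above could be grouped into one configuration object that the spider consults. The sketch below is only an illustration in Python, not the project's code: the class name, field names and default values are hypothetical, and "time delay" is assumed here to mean a pause between downloads.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class CrawlConstraints:
    """Hypothetical container for the five constraints (all names are assumptions)."""
    allowed_domains: set = field(default_factory=set)   # domain restriction
    max_depth: int = 3                                   # search depth
    start_words: set = field(default_factory=set)        # theme: accepted words
    stop_words: set = field(default_factory=set)         # theme: rejected words
    allowed_types: set = field(default_factory=lambda: {"text/html"})  # document type
    delay_seconds: float = 1.0                           # time delay (assumed: pause between downloads)

    def allows_link(self, url, depth):
        """True if the link is within the depth limit and the allowed domains."""
        if depth > self.max_depth:
            return False
        if not self.allowed_domains:          # an empty set means no domain restriction
            return True
        host = urlparse(url).hostname or ""
        return any(host == d or host.endswith("." + d) for d in self.allowed_domains)

    def allows_type(self, content_type):
        """Compare the MIME type only, ignoring parameters such as charset."""
        return content_type.split(";")[0].strip().lower() in self.allowed_types

    def allows_word(self, word):
        """Reject stop words; if a start list is given, accept only its words."""
        word = word.lower()
        if word in self.stop_words:
            return False
        return not self.start_words or word in self.start_words
```

Keeping every rule in one object means the spider, the processing units and the indexer can all be handed the same instance.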
The Spider Part
[Figure: spider flowchart. Labels: link priority queue, link, HTTP HEAD, check if link already visited, check data type against constraints, download, download error, push page, page data stack]
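The spider loop suggested by the labels above might look roughly like the following Python sketch. It is not the project's implementation: the function name, the callback parameters (`head`, `download`, `extract_links`) and the choice of link depth as the queue priority are all assumptions made for illustration. Network access is passed in as callbacks so the loop can be tried without a real web server.

```python
import heapq


def crawl(seeds, head, download, extract_links, allowed_types, max_depth):
    """Visit links in priority order; return (page stack, failed links).

    head(url)          -> content type string, or None on error (stands in for HTTP HEAD)
    download(url)      -> page data, or None on error
    extract_links(data) -> iterable of URLs found in the page
    """
    queue = [(0, url) for url in seeds]    # priority = link depth (an assumption)
    heapq.heapify(queue)
    visited = set()
    pages = []                             # stack of (url, data) awaiting processing
    errors = []
    while queue:
        depth, url = heapq.heappop(queue)
        if url in visited:                 # check if link already visited
            continue
        visited.add(url)
        content_type = head(url)
        if content_type is None:           # HEAD request failed
            errors.append(url)
            continue
        if content_type not in allowed_types:   # check data type against constraints
            continue
        data = download(url)
        if data is None:                   # download error
            errors.append(url)
            continue
        pages.append((url, data))          # push page onto the stack
        if depth < max_depth:
            for link in extract_links(data):
                if link not in visited:
                    heapq.heappush(queue, (depth + 1, link))
    return pages, errors
```

Issuing the HEAD request first lets the spider discard unwanted document types without downloading their bodies.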
Document Processing
• Analyse the document type.
• Send it to the appropriate unit.
• Extract words and links.
• Try to resolve bad links.
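As an illustration of the processing steps above, here is a minimal Python sketch of a dispatcher with a single HTML unit. The names (`HtmlUnit`, `UNITS`, `process_document`) are hypothetical, and "resolving bad links" is assumed here to mean turning relative links into absolute ones; the project may have done more.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class HtmlUnit(HTMLParser):
    """Collects the words and the absolute links of one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.words = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))

    def handle_data(self, data):
        self.words.extend(w.lower() for w in re.findall(r"[A-Za-z0-9]+", data))


def process_html(base_url, text):
    unit = HtmlUnit(base_url)
    unit.feed(text)
    unit.close()
    return unit.words, unit.links


# One entry per processing unit; PDF or DOC units would be registered here too.
UNITS = {"text/html": process_html}


def process_document(content_type, base_url, text):
    """Send the document to the unit registered for its type."""
    unit = UNITS.get(content_type)
    if unit is None:
        raise ValueError("no processing unit for " + content_type)
    return unit(base_url, text)
```

With a table of units, supporting a new file format only requires registering one more function.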
Indexing
• Binary search tree:
  – quick to build
  – efficient to use
• Check constraints:
  – start list and stop list
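A word index built on a binary search tree, with start-list and stop-list checks, could be sketched in Python as follows. This is an assumption-laden illustration rather than the project's code: the class names are hypothetical, the tree is not balanced, and the start list is assumed to mean "index only these words when it is non-empty".

```python
class Node:
    """One tree node: a word and the set of URLs where it was found."""

    def __init__(self, word):
        self.word = word
        self.urls = set()
        self.left = None
        self.right = None


class WordIndex:
    def __init__(self, stop_list=(), start_list=()):
        self.root = None
        self.stop_list = set(stop_list)
        self.start_list = set(start_list)

    def _accepted(self, word):
        if word in self.stop_list:
            return False
        return not self.start_list or word in self.start_list

    def add(self, word, url):
        """Insert (word, url); return False if the word is rejected by the lists."""
        word = word.lower()
        if not self._accepted(word):
            return False
        if self.root is None:
            self.root = Node(word)
        node = self.root
        while node.word != word:
            side = "left" if word < node.word else "right"
            child = getattr(node, side)
            if child is None:              # word not in the tree yet: create its node
                child = Node(word)
                setattr(node, side, child)
            node = child
        node.urls.add(url)
        return True

    def lookup(self, word):
        """Return the set of URLs recorded for the word (empty if unknown)."""
        word = word.lower()
        node = self.root
        while node is not None and node.word != word:
            node = node.left if word < node.word else node.right
        return set(node.urls) if node else set()
```

Lookups and insertions follow a single root-to-leaf path, which is fast as long as the tree stays reasonably balanced.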
Database
• MySQL database.
• General structure:
  – Keywords
  – Web links
  – Correspondence between keywords and links
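The general structure above maps naturally onto three tables. The Python sketch below uses the standard-library `sqlite3` module purely as a stand-in so that it runs without a MySQL server; the table names, column names and helper functions are hypothetical and are not taken from the project.

```python
import sqlite3

# Assumed schema: one table per box of the general structure.
SCHEMA = """
CREATE TABLE keywords (id INTEGER PRIMARY KEY, word TEXT UNIQUE NOT NULL);
CREATE TABLE links    (id INTEGER PRIMARY KEY, url  TEXT UNIQUE NOT NULL);
CREATE TABLE keyword_links (
    keyword_id INTEGER NOT NULL REFERENCES keywords(id),
    link_id    INTEGER NOT NULL REFERENCES links(id),
    PRIMARY KEY (keyword_id, link_id)
);
"""


def open_database(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def _get_id(conn, table, column, value):
    # Table and column names come from this module only, never from user input.
    conn.execute(f"INSERT OR IGNORE INTO {table} ({column}) VALUES (?)", (value,))
    row = conn.execute(f"SELECT id FROM {table} WHERE {column} = ?", (value,)).fetchone()
    return row[0]


def store(conn, word, url):
    """Record that the keyword appears on the page at url."""
    keyword_id = _get_id(conn, "keywords", "word", word)
    link_id = _get_id(conn, "links", "url", url)
    conn.execute("INSERT OR IGNORE INTO keyword_links VALUES (?, ?)", (keyword_id, link_id))
```

Storing each keyword and each link once, and only pairs of ids in the correspondence table, keeps the database compact when the same word appears on many pages.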
User interface and query engine
• The web page is generated by a CGI script.
• The query engine queries the database.
• The results are formatted.
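The query and formatting steps could be sketched in Python as below. This is only an illustration: it assumes hypothetical tables named `keywords`, `links` and `keyword_links`, takes an already open database connection (here `sqlite3` as a stand-in for MySQL), and returns an HTML fragment of the kind a CGI script might print.

```python
import html
import sqlite3

# Assumed table and column names; the project's real schema is not shown in the slides.
QUERY = """
SELECT l.url
FROM links AS l
JOIN keyword_links AS kl ON kl.link_id = l.id
JOIN keywords AS k ON k.id = kl.keyword_id
WHERE k.word = ?
ORDER BY l.url
"""


def search(conn, word):
    """Return the URLs recorded for one keyword, in alphabetical order."""
    return [row[0] for row in conn.execute(QUERY, (word.lower(),))]


def format_results(word, urls):
    """Build an HTML fragment listing the results, escaping all user-visible text."""
    items = "".join(
        '<li><a href="{0}">{0}</a></li>'.format(html.escape(u, quote=True)) for u in urls
    )
    return "<h1>Results for {}</h1><ul>{}</ul>".format(html.escape(word), items)
```

Passing the keyword as a query parameter and escaping the output keeps user input out of both the SQL text and the generated page.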
Demonstration (1)
• Fill the database
Demonstration (2)
• How to search pages?
Conclusion
• Results and perspectives
  – Original search engine.
  – Easy to improve by adding units to process different file formats (ps, doc, xls, …).
• Teamwork and division of tasks.
• This project showed us how to use the different tools seen this year.
References
http://www.w3c.org
http://www.mysql.com
http://www.sgi.com/tech/stl
http://www.searchengineshowdown.com