Web Search and Advanced Internet Services

Post on 06-Jan-2016

290N Class Introduction

Tao Yang, 2014

Introduction

• Web Search/Traffic

– User interests

– Web content

– Importance of search engine traffic

– Online advertisement

• Class Topics

Internet Users

Sales of Mobile Devices/PCs

http://www.businessinsider.com/the-future-of-mobile-deck-2012-3?op=1

More Mobile Search (2012 Survey)

Users’ Interests

Content trend and ownership

• Content consumption is fragmenting – nobody owns more than 10% of worldwide page views (WW PVs)

• No single place will own all the content

[Ramakrishnan and Tomkins 2007]

Web Search Engine Market in USA (Jan 2012)

• Google: 66.2%

• Bing: 15.2%

• Yahoo: 14.1%

• Ask: 3%

• AOL: 1.6%

Search Traffic is Important for Business

2012 Survey: Web Search Importance for Business

Online advertising market, Worldwide

Search query

Ad

Questions

• Do you think an “average” user knows the difference between sponsored search links and algorithmic search results?

Course Objectives

• Practice and experience for building search services and developing related mining applications

– Broad topics in web mining and search engines, advertisement

– Algorithms & system support

• Workload:

– 1 take-home exam

– Group project (2 persons): paper reviewing and presentation; implementation/evaluation; report

– 2 group HW exercises (Lucene/Solr search, Hadoop log analysis)

Course Topics

• Web Search

– Indexing, compression, and online search

– Ranking methods with text/link/click analysis. Machine learning.

• Text Mining

– Duplicate analysis

– Text categorization and clustering

– Recommendation

• Advertisement

• Systems Support

– Online servers and offline computation. MapReduce.

– Caching. Crawling and document parsing.

– Open source systems
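
As a rough illustration of the "indexing" and "online search" topics listed above, the following Python sketch builds a toy in-memory inverted index and answers boolean AND queries. It is only a minimal sketch under stated assumptions: the function names (`tokenize`, `build_index`, `search`) and the three sample documents are hypothetical, and it does not reflect how Lucene/Solr or the course assignments implement indexing.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def build_index(docs):
    """Map each term to the sorted list of ids of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    """Return ids of documents that contain every query term (boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical sample documents, made up for this illustration.
docs = {
    1: "Web search engines index crawled pages",
    2: "MapReduce supports offline index construction",
    3: "Caching speeds up online search",
}
index = build_index(docs)
print(search(index, "index"))          # [1, 2]
print(search(index, "online search"))  # [3]
```

A production index would also store term positions and frequencies, compress its postings lists, and rank the matches, which is where the compression and ranking topics come in.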

Expected Work

• Tentative grading: project 50%, take-home exam 40%, HW exercises 10%.

• Timeline

– Jan 29: 1-page project proposal (plain email text).

– Jan 30-Feb 6: meet with me and select paper(s) for reviewing; demo for HW 1.

– Feb 15 week: project progress and related papers presentation.

– Feb 27: HW2. Then schedule a second meeting with me on HW2 and the project.

– March 15 week or earlier: project demo/interview; final project slides/report.

– Take-home exam: problems based on class presentation/references/HW.

References

• Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze (MRS), Introduction to Information Retrieval, Cambridge University Press, 2008.

• Croft, Metzler, and Strohman (CMS), Search Engines: Information Retrieval in Practice, Addison-Wesley, 2010.

• Selected papers

• www.cs.ucsb.edu/~tyang/class/290N13

Class Computing Resource

• Triton supercomputer accounts: Week 2 (Jan 13). Get a class account on Triton by emailing your name, UCSB email, and ssh public key with subject "CS290N ssh key" to scc@oit.ucsb.edu. Instructions on generating ssh keys can be found at http://cs.ucsb.edu/~hnielsen/cs140/ssh-keypair.html

• CSIL sandbox disk space:

– /cs/sandbox/class/cs290n

– /cs/sandbox/student/<username>

• 290N class discussion group at Google.com (we will send an invitation based on the class list).