SMS-Based Web Search for Low-end Mobile Devices

Jay Chen

New York University

jchen@cs.nyu.edu

Lakshmi Subramanian

New York University

lakshmi@cs.nyu.edu

Eric Brewer

University of California, Berkeley

brewer@cs.berkeley.edu

SMS-based web services are a rapidly growing market

Over 12 million subscribers in July 2008

A significant fraction of mobile devices in developing regions are still low-cost devices

Motivation(1)

Existing SMS-based web services perform poorly:

Low accuracy: Google SMS 22.2%, Yahoo! oneSearch 27.8% (limited to verticals and pre-defined topics)

Long median response time: ChaCha 227.5 seconds (it hires humans to search the web and answer questions)

Motivation(2)

SMS search suffers from the long-tail phenomenon

In ChaCha, 21% of the queries are verticals and 79% are long-tail

None of the existing automated SMS search services is a complete solution for search queries across arbitrary topics

The search queries are inherently ambiguous

Challenges

We seek to build an automated system that is:

Fast (unlike ChaCha)

Accurate (unlike Google SMS and Yahoo! oneSearch)

Able to return a disambiguated result for queries across arbitrary topics

Problem

Related work

Mobile search is different from conventional desktop search:

Click-through rates and search page views were significantly lower

Persistence of mobile users was very low

Diversity of search topics for low-end phone users was much lower

SMS search differs from the TREC tracks in at least one of three dimensions:

The nature of the input query

The document collection set

The nature of the search result in the query response

System architecture

Run algorithm and return a snippet

Vertical: topics are pre-defined or popular

Long tail: topics are not popular

A snippet: any continuous stream of text that fits within an SMS message (within 140 bytes)

Hint: a term or a collection of consecutive terms that determine what kind of information the user is looking for

Introduction of definitions

SMSFind algorithm

The SMSFind search problem can be characterized as:

★ Given an unstructured SMS search query in the form of <query, hint> and the top-k pages returned by a search engine, extract a condensed set of text snippets from the response pages that provide an appropriate search response to the query.

This problem definition assumes that the hint is specified for every query. Google SMS has a similar explicit requirement, where a keyword is specified as the last term. (In this paper, the hint is arbitrary.)

SMSFind algorithm

Consider a search query (Q, H), where Q is the search query containing the hint term H.

Let P1, ..., PN represent the textual content of the top N search response pages for Q. Given (Q, H) and P1, ..., PN, the SMSFind snippet extraction algorithm consists of three main steps:

Neighborhood Extraction

N-gram Ranking

Snippet Ranking

Process of SMSFind

Filtering n-grams

Neighborhood extraction

Ranking n-grams

Split snippet tiles

Snippet ranking

Generate n-grams

Filter the set of n-grams based on three dimensions: frequency (3), mean rank (ignore n-grams with low PageRank), and minimum distance (10)

Rank(s) = freq(s) + meanrank(s) + mindist(s)

Snippets are ranked by the cumulative rank of the top-k (5) ranked n-grams within the snippet

Using a 140-byte sliding window
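The sliding-window step can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the function name `best_snippet`, the word-aligned windows, and the exact-phrase matching are all hypothetical choices; only the 140-byte limit and the top-k (5) cumulative score come from the slide.

```python
def best_snippet(text, ngram_scores, max_bytes=140, top_k=5):
    """Return (snippet, score) for the best window of at most max_bytes bytes.

    ngram_scores maps lowercase n-grams to scores (higher is better).
    Assumption: windows start and end on word boundaries.
    """
    # Keep only the top_k highest-scoring n-grams.
    top = sorted(ngram_scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    words = text.split()
    best, best_score = "", float("-inf")
    for start in range(len(words)):
        end = start
        # Grow the window one word at a time while it still fits.
        while end < len(words):
            candidate = " ".join(words[start:end + 1])
            if len(candidate.encode("utf-8")) > max_bytes:
                break
            end += 1
        if end == start:
            continue  # a single word longer than max_bytes is skipped
        window = " ".join(words[start:end])
        padded = " " + window.lower() + " "
        # Cumulative score of the top n-grams that occur in this window.
        score = sum(s for gram, s in top if " " + gram + " " in padded)
        if score > best_score:
            best, best_score = window, score
    return best, best_score


page = ("filler " * 30
        + "ernest hemingway quote courage is grace under pressure "
        + "filler " * 30)
scores = {"grace under pressure": 3.0, "courage": 2.0, "filler": 0.1}
snippet, score = best_snippet(page, scores)
```

Measuring the window in encoded bytes rather than characters keeps the sketch faithful to the 140-byte limit when the text contains non-ASCII characters.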

Generate n-grams

n-gram: 1-5 words

N-gram               Frequency   Min. distance
"the"                2           1
"the brown"          1           3
"the brown cow"      1           2
"brown cow jumped"   1           1

Table 1: Slicing example for the text “the brown cow jumped over the moon”. Hint = “over”
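The slicing in Table 1 can be reproduced with a short sketch. The function name `slice_ngrams` is hypothetical, and several details are assumptions rather than the paper's definitions: the hint is a single word, distance is counted in words from the nearest edge of the n-gram to the hint, and an n-gram that contains the hint gets distance 0.

```python
from collections import defaultdict


def slice_ngrams(text, hint, max_n=5):
    """Return {ngram: (frequency, min_distance_to_hint)} for one text.

    Assumptions: single-word hint, distances counted in words,
    distance 0 when the n-gram itself contains the hint.
    """
    words = text.lower().split()
    hint_positions = [i for i, w in enumerate(words) if w == hint.lower()]
    freq = defaultdict(int)
    min_dist = {}
    for n in range(1, max_n + 1):
        for start in range(len(words) - n + 1):
            end = start + n - 1
            gram = " ".join(words[start:end + 1])
            freq[gram] += 1
            for h in hint_positions:
                if h < start:
                    d = start - h      # hint lies before the n-gram
                elif h > end:
                    d = h - end        # hint lies after the n-gram
                else:
                    d = 0              # n-gram overlaps the hint
                if gram not in min_dist or d < min_dist[gram]:
                    min_dist[gram] = d
    # min_distance is None when the hint never occurs in the text.
    return {g: (freq[g], min_dist.get(g)) for g in freq}


table = slice_ngrams("the brown cow jumped over the moon", "over")
```

Under these assumptions, `table["the"]` is `(2, 1)` and `table["the brown"]` is `(1, 3)`, matching the rows of Table 1.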

N-gram Ranking

Three metrics:

Frequency: the number of times the n-gram occurs across all snippets

Mean rank: the sum, across every occurrence of an n-gram, of the PageRank of the page in which it occurs, divided by the n-gram’s raw frequency

Minimum distance: the minimum distance between an n-gram and the hint across any occurrences of both
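A minimal sketch of how the three metrics might be aggregated and then used for filtering. The names `aggregate` and `filter_ngrams` are hypothetical. The thresholds 3 and 10 come from the slides, but reading them as "frequency at least 3" and "minimum distance at most 10" is an assumption, as is treating the page rank value as a number where lower is better; the slides give no mean-rank cutoff, so it is left as an optional parameter.

```python
from collections import defaultdict


def aggregate(occurrences):
    """occurrences: iterable of (ngram, page_rank, distance_to_hint).

    Returns {ngram: (frequency, mean_rank, min_distance)}.
    """
    freq = defaultdict(int)
    rank_sum = defaultdict(float)
    min_dist = {}
    for gram, page_rank, dist in occurrences:
        freq[gram] += 1
        rank_sum[gram] += page_rank
        min_dist[gram] = min(dist, min_dist.get(gram, dist))
    # Mean rank = sum of page ranks over occurrences / raw frequency.
    return {g: (freq[g], rank_sum[g] / freq[g], min_dist[g]) for g in freq}


def filter_ngrams(stats, min_freq=3, max_dist=10, max_mean_rank=None):
    """Drop n-grams that fail the thresholds (interpretation is assumed)."""
    kept = {}
    for gram, (f, mean_rank, dist) in stats.items():
        if f < min_freq or dist > max_dist:
            continue
        if max_mean_rank is not None and mean_rank > max_mean_rank:
            continue
        kept[gram] = (f, mean_rank, dist)
    return kept


# Made-up occurrences, for illustration only.
occurrences = [
    ("1899", 1, 2), ("1899", 2, 4), ("1899", 3, 1),
    ("born", 1, 12), ("born", 2, 15), ("born", 5, 11),
    ("oak park", 4, 3),
]
stats = aggregate(occurrences)
kept = filter_ngrams(stats)
```

In this made-up input, "born" is frequent enough but never occurs within 10 words of the hint, and "oak park" occurs only once, so only "1899" survives the filter.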

An example of how these metrics are used to rank n-grams:

If two n-grams s and t have the same frequency measure, but s has a much lower web frequency than t, then s should be ranked higher than t (the TF-IDF intuition)

Rank(s) = freq(s) + meanrank(s) + mindist(s), a linear combination of three normalized ranks
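The linear combination can be illustrated with a small sketch. The slides do not say how each rank is normalized or weighted, so min-max scaling to [0, 1], equal weights, and the orientation of each metric (higher frequency is better, lower mean rank and lower minimum distance are better) are all assumptions; `normalize` and `rank_ngrams` are hypothetical names.

```python
def normalize(values, higher_is_better=True):
    """Min-max scale values to [0, 1] so that 1.0 is always the best."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]  # all equal: no n-gram is penalized
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]


def rank_ngrams(stats):
    """stats: {ngram: (freq, mean_rank, min_dist)}.

    Returns a list of (ngram, score), best first, where
    score = freq + meanrank + mindist after normalization.
    """
    grams = list(stats)
    f = normalize([stats[g][0] for g in grams], higher_is_better=True)
    r = normalize([stats[g][1] for g in grams], higher_is_better=False)
    d = normalize([stats[g][2] for g in grams], higher_is_better=False)
    scores = {g: f[i] + r[i] + d[i] for i, g in enumerate(grams)}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up statistics, for illustration only.
example = {"a": (5, 1.0, 1), "b": (1, 3.0, 10), "c": (3, 2.0, 5)}
ranked = rank_ngrams(example)
```

Here "a" is best on all three metrics and "b" is worst on all three, so they land at the top and bottom of `ranked`. If the page rank value is a score where higher is better, the second `normalize` call would need `higher_is_better=True`.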

Snippet Ranking

How to extract a hint

Source data analysis:

95% of 100,000 queries from ChaCha contain 14 terms or fewer

Several common query structures can be observed, each with a corresponding transformation rule

For example, 45% of the queries began with “what”, and over 80% of those are in standard forms (e.g. “what is”, “what was”, “what are”, “what do”, “what does”)

“what is a quote by Ernest Hemingway” satisfies the structure “what is X”; ignoring the stop word “a”, the final <query, hint> is <“ernest hemingway”, quote>
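One way such a transformation rule could look in code is sketched below. It covers only the “what is X” family from the example; the rule "first non-stop-word of X is the hint, the rest is the query", the stop-word list, and the name `extract_hint` are assumptions made for illustration, not the paper's actual rules.

```python
import re

# Standard "what ..." forms listed on the slide.
WHAT_PREFIX = re.compile(r"^what\s+(is|was|are|do|does)\s+", re.IGNORECASE)

# Hypothetical, deliberately tiny stop-word list.
STOP_WORDS = {"a", "an", "the", "by", "of"}


def extract_hint(question):
    """Return (query, hint) for a 'what is X' style question, else None."""
    question = question.strip()
    match = WHAT_PREFIX.match(question)
    if not match:
        return None  # structure not covered by this sketch
    rest = question[match.end():].lower()
    terms = [t for t in re.findall(r"[a-z0-9']+", rest) if t not in STOP_WORDS]
    if not terms:
        return None
    # Assumed rule: first remaining term is the hint, the rest is the query.
    hint, query_terms = terms[0], terms[1:]
    return " ".join(query_terms), hint


result = extract_hint("what is a quote by Ernest Hemingway")
```

For the slide's example, `result` is `("ernest hemingway", "quote")`. Questions with other structures return `None` here and would need their own rules.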

Implementation

Language: 600 lines of Python, using publicly available parsing libraries

Deployment: a front-end to send and receive SMS messages

Setup: an SMS short code with a local telco in Kenya, routing all SMS requests and responses to and from our server machine

Vertical interfaces: implemented for several basic verticals as part of the service, including weather, definitions, local business results, and news (each interface is under 150 lines of Python code)

Evaluation

Uses the sub-topics in ChaCha to focus on long-tail topics

Variety of topics

Using n-grams to rank the snippets is important

Returning a snippet rather than an n-gram is critical

Modifying the queries has a significant effect

The readability of our snippets is poor

Conclusion

A combination of simple information retrieval algorithms, in conjunction with existing search engines, can provide reasonably accurate search responses for SMS queries

Experiments with queries across arbitrary topics show that SMSFind can answer 57.3% of the queries in the test set

This work represents a foray into an open and practical research domain
