Finding things on the web with BOSS
Posted: 19-Oct-2014
Christian Heilmann | http://wait-till-i.com | http://scriptingenabled.org
Open Hack Day 2009 - Bangalore, India
What is innovation?
Innovation is improving the current state of affairs.
In the best of all cases this means that it gets easier for the person using a product to reach a goal.
This can be achieved by connecting several different products and turning them into one. (pssst... Mashup).
I’ve seen this followed cleverly on several levels here in India.
BOSS is your chance to make the web an easier-to-navigate space.
You can help us turn a search engine into a find engine.
Web search is broken!
http://luckyrobot.com/2009/02/06/search-is-broken-%E2%80%93-really-broken/
http://www.borthwick.com/weblog/2009/02/05/creative-destruction-google-slayed-by-the-notificator/
Back in the day the web was small and consisted largely of static documents.
Nowadays it is huge, and both the amount of content and the speed of publication have increased a lot.
This is one challenge for search engines these days.
The others are old, but also increasing.
Now, say you have the most awesome idea for a search engine that works around these issues.
Where you will get stuck is the overhead of indexing, storing and filtering the data of the web.
And this is where BOSS comes in.
It is an API to our search data stores.
BOSS stands for Build your Own Search Service:
http://developer.yahoo.com/search/boss/
To use it, you need an Application ID:
https://developer.yahoo.com/wsregapp/
And there is full documentation available:
http://developer.yahoo.com/search/boss/boss_guide/
Happy Hacking!
... oh alright then ...
You can get the code examples I will show here:
http://isithackday.com/hacks/bangalore/bosscode.zip
Say you want to search the web for donkeys.
... oh alright then ...
Because donkeys rock!
Using BOSS you can do this with a REST API and display the results any way you want!
The REST API: boss.yahooapis.com/ysearch/{type}/v1/{search}
type is what you want to search:
web: the interwebs
news: new stuff
images: pictures
search is the term to look for (url-encoded):
Put “” around terms to ensure the right order, e.g. “donkey fur” (you don’t want to see cats, do you?)
Filter with a -, e.g. donkey -shrek
Restrict to a site using site:, e.g. donkey site:flickr.com
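The operators above can be combined into one search term. Here is a minimal sketch, assuming a hypothetical helper called buildSearchTerm; the name and the shape of its options object are made up for illustration and are not part of BOSS:

```javascript
// Hypothetical helper: composes a BOSS search term from the operators
// described above and URL-encodes it for the {search} part of the URL.
function buildSearchTerm(options) {
  var parts = [];
  if (options.phrase) {
    // Quotes keep the words in the given order
    parts.push('"' + options.phrase + '"');
  }
  if (options.words) {
    parts.push(options.words);
  }
  if (options.exclude) {
    // A leading minus filters a term out
    parts.push('-' + options.exclude);
  }
  if (options.site) {
    // site: restricts the search to one site
    parts.push('site:' + options.site);
  }
  return encodeURIComponent(parts.join(' '));
}

var term = buildSearchTerm({ words: 'donkey', exclude: 'shrek', site: 'flickr.com' });
var phrase = buildSearchTerm({ phrase: 'donkey fur' });
```

Using encodeURIComponent here is a design choice: it escapes spaces, quotes and the colon, so the term is safe to drop into the path.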
Other parameters:
appid: your app ID (needed)
count: number of results
start: where to start the counting
region / lang: country and language
format: xml or json
sites: restrict to certain sites (comma separated)
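Putting the endpoint and the parameters together could look roughly like this. It is only a sketch: the endpoint pattern and parameter names come from the slides, while the function name buildBossUrl, its argument order and the YOUR_APP_ID placeholder are assumptions for illustration.

```javascript
// Hypothetical helper: assembles a BOSS request URL from a search type,
// a search term and an object of query parameters.
function buildBossUrl(type, search, params) {
  var url = 'http://boss.yahooapis.com/ysearch/' + type + '/v1/' +
            encodeURIComponent(search);
  var pairs = [];
  for (var name in params) {
    if (Object.prototype.hasOwnProperty.call(params, name)) {
      pairs.push(encodeURIComponent(name) + '=' +
                 encodeURIComponent(params[name]));
    }
  }
  // Only add the question mark when there is something to append
  return pairs.length ? url + '?' + pairs.join('&') : url;
}

var donkeyUrl = buildBossUrl('web', 'donkeys', {
  appid: 'YOUR_APP_ID', // placeholder, not a real application ID
  format: 'json',
  count: 5
});
```

The same call works for the other types by swapping 'web' for 'news' or 'images'.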
Web search REST API: boss.yahooapis.com/ysearch/web/v1/{search}
Extra parameters:
filter: to filter out nasties, use filter=-porn-hate
type: to search different document types. You can use html, text, pdf, xl, msword, ppt or groups like msoffice and nonhtml. You can also exclude a type from a group, e.g. type=msoffice,-xl
Image search REST API: boss.yahooapis.com/ysearch/images/v1/{search}
Extra parameters:
filter: no nudies
dimensions: all, small, medium, large, wallpaper, widewallpaper
refererurl: all images in that url
url: image at that url
News search REST API: boss.yahooapis.com/ysearch/news/v1/{search}
Extra parameters:
age: how old the news is, in days. The last five days would be “5d”
There are restrictions on how to display results, and information on what data comes back.
For this, read the guide! http://developer.yahoo.com/search/boss/boss_guide/
Everybody Duck!
There will be code
The easiest way to use BOSS is using JavaScript.
http://boss.yahooapis.com/ysearch/web/v1/donkeys?format=json&appid={id}
{"ysearchresponse":{"responsecode":"200","nextpage":"\/ysearch\/web\/v1\/donkeys?format=json&appid=[...]&start=10","totalhits":"492215","deephits":"15700000","count":"10","start":"0","resultset_web":[{"abstract":"Hyperlinked description of the domesticated mammal discussing its appearance, relationship to horses, economic <b>...<\/b> horses and <b>donkeys<\/b> were brought back <b>...<\/b>","clickurl":"http:\/\/lrd.yahooapis.com\/_ylc=X3oDMTU4b2NoaDR2BF9TAzIwMjMxNTI3MDIEYXBwaWQDVFg2YjRYSFYzNEVuUFhXMHNZRXI1MWhQMXBuNU84S0FHcy5MUVNYZXIxWjdSbW1WclpvdXo1U3Z5WGtXc1ZrLQRwb3MDMARzZXJ2aWNlA1lTZWFyY2hXZWIEc2xrA3RpdGxlBHNyY3B2aWQDR3lDaEgwU081cTlmSktUNG1ndTVUUUJNdlNjaS4wa1ZUVndBQVF5Sw--\/SIG=11820sato\/**http%3A\/\/en.wikipedia.org\
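The response carries paging information next to the results, and the numbers arrive as strings. A small sketch of reading them, assuming a hypothetical readPaging helper; the field names are the ones visible in the response above, while prefixing nextpage with the API host is my assumption:

```javascript
// Sketch: pull the paging information out of a BOSS web search response.
function readPaging(data) {
  var r = data.ysearchresponse;
  return {
    ok: r.responsecode === '200',
    // The API returns numbers as strings, so convert them explicitly
    totalHits: parseInt(r.totalhits, 10),
    count: parseInt(r.count, 10),
    start: parseInt(r.start, 10),
    // nextpage is a path; assume it lives on the same host as the request
    nextUrl: r.nextpage ? 'http://boss.yahooapis.com' + r.nextpage : null
  };
}

// Shortened sample response with a placeholder application ID
var sample = {
  ysearchresponse: {
    responsecode: '200',
    nextpage: '/ysearch/web/v1/donkeys?format=json&appid=YOUR_APP_ID&start=10',
    totalhits: '492215',
    count: '10',
    start: '0',
    resultset_web: []
  }
};
var paging = readPaging(sample);
```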
To use this across domains, simply define a callback parameter:
http://boss.yahooapis.com/ysearch/web/v1/donkeys?format=json&callback=founddonkeys&appid={id}
founddonkeys({"ysearchresponse":{"responsecode":"200","nextpage":"\/ysearch\/web\/v1\/donkeys?format=json&callback=founddonkeys&appid=TX6b4XHV34EnPXW0sYEr51hP1pn5O8KAGs.LQSXer1Z7RmmVrZouz5SvyXkWsVk-&start=10","totalhits":"492215","deephits":"15700000","count":"10","start":"0","resultset_web":[{"abstract":"Hyperlinked description of the domesticated mammal discussing its appearance, relationship to horses, economic <b>...<\/b> horses and <b>donkeys<\/b> were brought back <b>...<\/b>","clickurl":"http:\/\/lrd.yahooapis.com\/_ylc=X3oDMTU4cG05cXJwBF9TAzIwMjMxNTI3MDIEYXBwaWQDVFg2YjRYSFYzNEVuUFhXMHNZRXI1MWhQMXBuNU84S0FHcy5MUVNYZXIxWjdSbW1WclpvdXo1U3Z5WGtXc1ZrLQRwb3MDMARzZX
All you then need to do is put this url in a script node and write the founddonkeys function:
<div id="searchresults"></div>
<script type="text/javascript">
// Called by the BOSS response once the script node below has loaded
function founddonkeys(o){
  var donkeys = o.ysearchresponse.resultset_web;
  var results = document.createElement('ul');
  // One list item per result: a linked title followed by the abstract
  for(var i=0,j=donkeys.length;i<j;i++){
    var item = document.createElement('li');
    var link = document.createElement('a');
    var abstract = document.createElement('p');
    link.setAttribute('href',donkeys[i].clickurl);
    link.innerHTML = donkeys[i].title;
    item.appendChild(link);
    abstract.innerHTML = donkeys[i].abstract;
    item.appendChild(abstract);
    results.appendChild(item);
  }
  document.getElementById('searchresults').appendChild(results);
}
</script>
<script type="text/javascript" charset="utf-8" src="http://boss.yahooapis.com/ysearch/web/v1/donkeys?format=json&callback=founddonkeys&appid=xxx"></script>
Two problems though:
First of all - without JavaScript there are no donkeys!
Secondly - you can only find donkeys!
The solution: Event Handling and dynamic script generation.
<p>Warning: this is terrible code, USE A LIBRARY INSTEAD!</p>
<ul id="searches">
  <li><a href="http://search.yahoo.com/search?va=donkeys">Search for Donkeys</a></li>
  <li><a href="http://search.yahoo.com/search?va=kittens">Search for kittens</a></li>
</ul>
<div id="searchresults"></div>
<script type="text/javascript" charset="utf-8">
function founddonkeys(o){
  var donkeys = o.ysearchresponse.resultset_web;
  var results = document.createElement('ul');
  for(var i=0,j=donkeys.length;i<j;i++){
    var item = document.createElement('li');
    var link = document.createElement('a');
    var abstract = document.createElement('p');
    link.setAttribute('href',donkeys[i].clickurl);
    link.innerHTML = donkeys[i].title;
    item.appendChild(link);
    abstract.innerHTML = donkeys[i].abstract;
    item.appendChild(abstract);
    results.appendChild(item);
  }
  // Replace the previous results instead of appending to them
  var resultsbox = document.getElementById('searchresults');
  resultsbox.innerHTML = '';
  resultsbox.appendChild(results);
}
var APIkey = 'TX6b4XHV34EnPXW0sYEr51hP1pn5O8KAGs'+
             '.LQSXer1Z7RmmVrZouz5SvyXkWsVk-';
var searchlinks = document.getElementById('searches').getElementsByTagName('a');
for(var i=0;searchlinks[i];i++){
  searchlinks[i].onclick = function(){
    // The search term is whatever follows va= in the link
    var searchterm = this.href.split('va=')[1];
    var url = 'http://boss.yahooapis.com/ysearch/web/v1/' + searchterm +
              '?format=json&' + 'callback=founddonkeys' +
              '&appid=' + APIkey;
    // A new script node triggers the request; BOSS calls founddonkeys
    var s = document.createElement('script');
    s.setAttribute('type','text/javascript');
    s.setAttribute('src',url);
    document.getElementsByTagName('head')[0].appendChild(s);
    // Stop the browser from following the link
    return false;
  }
}
</script>
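One reason to use a library is that the code above leaves a new script node in the head for every click. What a library does for you can be sketched like this. Everything here is hypothetical (the loadJsonp name, the bosscallback prefix, the doc and scope arguments); it is one way to do dynamic script generation with cleanup, not how any particular library implements it:

```javascript
// Sketch of a reusable JSONP loader with a unique callback per request.
// doc and scope default to the browser's document and window; they are
// arguments so the function can also be exercised outside a browser.
var jsonpCounter = 0;
function loadJsonp(url, onData, doc, scope) {
  doc = doc || document;
  scope = scope || window;
  var name = 'bosscallback' + (jsonpCounter++);
  var script = doc.createElement('script');
  scope[name] = function (data) {
    // Clean up so repeated searches do not pile up nodes and globals
    delete scope[name];
    if (script.parentNode) {
      script.parentNode.removeChild(script);
    }
    onData(data);
  };
  script.setAttribute('type', 'text/javascript');
  // Append the callback parameter with ? or & depending on the URL so far
  script.setAttribute('src', url + (url.indexOf('?') === -1 ? '?' : '&') +
                             'callback=' + name);
  doc.getElementsByTagName('head')[0].appendChild(script);
  return name;
}
```

In the page above, the click handler could then call loadJsonp(url, founddonkeys) with a url that has no callback parameter of its own.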
*click*
Using the YUI library (YUI3 JavaScript and CSS grids) you can easily make this much cooler:
To make using BOSS with JavaScript easier, I’ve written a BOSS wrapper called YBOSS:
http://icant.co.uk/sandbox/yboss/
<div id="results"></div>
<script type="text/javascript" src="yboss-lib.js"></script>
<script type="text/javascript">
// Ask YBOSS for web and news results and hand them to seedpics
YBOSS.get({
  searches:'search,news',
  query:'obama',
  count:10,
  callback:seedpics
});
function seedpics(o){
  var all = '<h4>Web Sites</h4>' + o.webHTML +
            '<h4>News</h4>' + o.newsHTML;
  var out = document.getElementById('results');
  out.innerHTML = all;
}
</script>
For server-side code there are code examples available at: http://www.saurabhsahni.com/boss-examples.zip
There’s also the Python mashup framework, which lets you use SQL-like syntax to remix arbitrary XML/JSON sources:
http://developer.yahoo.com/search/boss/mashup.html
And there’s an easy way to deploy BOSS apps on Google App Engine: http://zooie.wordpress.com/2008/08/04/yahoo-boss-google-app-engine-integrated/
All this has been around for a while.
Here are some new things added lately:
People are trying to make the web a less messy place by adding semantic data to HTML documents.
Using SearchMonkey technology BOSS now lists this information in the results.
http://www.flickr.com/photos/glenscott/3273401181/
view=searchmonkey_rdf
view=searchmonkey_feed
http://developer.yahoo.com/search/boss/structureddata.html
Using the view=keyterms parameter you can get keywords associated with each result.
In order to get longer descriptions of each result you can now use the abstract=long parameter to get up to 300 characters instead of 130.
Another thing we’ve done is take the Yahoo Site Explorer and bundle it with BOSS.
This way you can now get page information and page inlink information with two new BOSS services.
http://boss.yahooapis.com/ysearch/se_inlink/v1/{URL}?appid={APPID}
Gets you a list of pages that linked to the URL provided.
http://boss.yahooapis.com/ysearch/se_pagedata/v1/{URL}?appid={APPID}
Gets you a list of all pages belonging to a domain in the Yahoo! index.
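Building the request URLs for these two services could look like the sketch below. The service names and the URL pattern are taken from the slides; the helper name buildSiteExplorerUrl, the YOUR_APP_ID placeholder and the decision to URL-encode the target URL before putting it in the path are assumptions, so check the guide before relying on them.

```javascript
// Hypothetical helper for the two Site Explorer services in BOSS.
function buildSiteExplorerUrl(service, targetUrl, appid) {
  // Only the two services named in the slides are accepted
  if (service !== 'se_inlink' && service !== 'se_pagedata') {
    throw new Error('Unknown service: ' + service);
  }
  return 'http://boss.yahooapis.com/ysearch/' + service + '/v1/' +
         encodeURIComponent(targetUrl) +
         '?appid=' + encodeURIComponent(appid);
}

var inlinksUrl = buildSiteExplorerUrl('se_inlink', 'http://wait-till-i.com', 'YOUR_APP_ID');
var pagesUrl = buildSiteExplorerUrl('se_pagedata', 'wait-till-i.com', 'YOUR_APP_ID');
```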
So what has been done using BOSS?
And on a lighter note:
The client side is where strange things happen.
http://isithackday.com/hacks/web-the-adventure/
The motherlode of BOSS implementations:
http://mashable.com/boss/
http://delicious.com/tag/bossmashup
Add yours by tagging it with “bossmashup” on Del.icio.us!
Keep in touch:
Christian Heilmann
http://wait-till-i.com
http://scriptingenabled.org
http://twitter.com/codepo8
T H A N K S !