Post on 25-Feb-2016
Caching XML Web Services to Support Disconnected Operation
Venugopalan Ramasubramanian, Cornell University
Doug Terry, Microsoft Research, Silicon Valley
Web Services
• method of providing and accessing services on the Internet
  – consumer services
    • hotmail, orbitz, mapquest, ebay, …
  – B to B services
    • supply chain management
• request-response paradigm
  – RPCs on the internet
XML Web Services
• W3C (World Wide Web Consortium) standards
  – Microsoft, IBM, HP, …
  – Microsoft .Net web services (HailStorm)
    • mycontacts, myprofile, myfavoritewebsites
  – TerraServer, CoolRooster
• SOAP (Simple Object Access Protocol)
  – standard representation of web service requests/responses (SOAP-RPC)
• WSDL (Web Services Description Language)
  – description of web services
Availability of Web Services
GOAL: make web services available despite frequent disconnections and limited bandwidth!
• web service clients reside on all kinds of devices
  – desktop, laptop, PDA, smart phone
• network outages (especially wireless)
• bandwidth restriction
Governing Principles
• cannot modify web services
• cannot modify access protocols
• can perhaps modify client
  – must also comply with existing clients
• can interpose storage and computation
client-side caching is a solution to improve availability!
XML Standards: SOAP
• SOAP-RPC standard
  – encoding definitions for data types
  – success, failure definitions
• SOAP-Envelope
  – outer-most element
• SOAP-Body
  – obligatory
  – request operation: name, parameters
  – response status: return value, failure
• SOAP-Header
  – optional, multiple header blocks
  – supplementary information: kerberos ticket
• HTTP binding
  – HTTP request and response messages
example: soap request
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:m="http://schemas.microsoft.com/hs/2001/10/myContacts"
    xmlns:c="http://schemas.microsoft.com/hs/2001/10/core"
    xmlns:mp="http://schemas.microsoft.com/hs/2001/10/myProfile">
  <s:Header>
    <licenses xmlns="http://schemas.xmlsoap.org/soap/security/2000-12">
      <c:identity> <c:kerberos>3240</c:kerberos> </c:identity>
    </licenses>
    <path xmlns="http://schemas.xmlsoap.org/rp/">
      <action>http://schemas.microsoft.com/hs/2001/10/core#request</action>
      <to>http://terry.microsoft.com</to>
      <fwd><via /></fwd>
      <rev><via /></rev>
      <id>b55528a4-5d63-49f1-87a2-5fab8d76f658</id>
    </path>
    <c:request service="myContacts" document="content" method="insert" genResponse="always">
      <key puid="3240" instance="1" cluster="1" />
    </c:request>
  </s:Header>
  <s:Body>
    <c:insertRequest select="/m:myContacts/m:contact[mp:name/mp:givenName = 'Terry']/mp:emailAddress">
      <mp:email>terry@microsoft.com</mp:email>
    </c:insertRequest>
  </s:Body>
</s:Envelope>
XML Standards: WSDL
• concrete definition of the web service
  – data structures
  – interface offered by the web service
    • operation names and parameters
  – message formats (components of a message)
  – protocol binding (SOAP)
• automatic generation of client-side stubs
  – Visual Studio .Net
Experiments with Web Cache
• experiment with existing clients and services (Microsoft .Net web services)
• check feasibility by building a cache to store HTTP requests/responses
[Figure: MyContacts, MyServices, and MyProfile, with a cache]
Issues in Caching
• web services are active
  – default HTTP cache directive is No Cache!
• web services are diverse
  – unlike files and databases, web services have custom interfaces
• fundamental questions
  – which requests are cacheable?
  – which operations have permanent side effects?
  – how to understand requests/responses?
    • services use different formats for requests/responses
Issues in Caching contd.
• consistency
  – later requests might invalidate responses cached earlier
    • read/write, write/write conflicts
  – how to specify consistency requirements for generic web services?
request 1: query request
<queryRequest select="myContacts/contact[name='terry']" />
request 2: delete request
<deleteRequest select="myContacts/contact[name='terry']/phone[@cat='cell']" />
More Issues…
• user experience
  – user unaware of web service cache
  – operations reportedly successful could fail!
• hoarding
  – keeping the cache hot
  – user controlled hoard requests
• security
  – enforce access control
Our Approach
• annotate WSDL description of web services to define cache properties
  – published by service providers or third party
  – no changes to server side code required
• transparent cache for web services
  – acts as a web proxy on the client machine
  – no modifications of the client program necessary
• custom cache managers for each web service
  – generated automatically from the annotated WSDL description
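Architecture
[Figure: Web Client 1 and Web Client 2 talk to the Proxy Server, which holds the Cache, the custom cache managers CCM1, CCM2, and CCM3, and the WBQ; the proxy reaches Web Service 1, Web Service 2, and Web Service 3 over the INTERNET.]
CCM1: Custom Cache Manager 1
WBQ: Write Back Queue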
WSDL Annotations: for each Operation
• cacheable: the operation can be cached
• lifetime: the duration for which replies are cached
• play-back: the operation has side effects and must be played back when connection is restored
• default-response: a default response will be sent when connection is not available
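The four per-operation annotations above suggest a simple decision procedure for the proxy. The following is a minimal Python sketch of one way it could work, not the prototype's actual code: the `Proxy` and `OperationAnnotation` classes, the `send` callback, and the placeholder default reply are all hypothetical names introduced for illustration, and lifetimes are assumed to be in seconds.

```python
import time
from collections import deque
from dataclasses import dataclass, field


@dataclass
class OperationAnnotation:
    """Hypothetical in-memory form of the per-operation WSDL annotations."""
    cacheable: bool = False
    lifetime: float = 0.0          # assumed unit: seconds a cached reply stays valid
    playback: bool = False         # operation has side effects; replay on reconnect
    default_response: bool = False


@dataclass
class Proxy:
    annotations: dict                                      # operation name -> annotation
    connected: bool = True
    cache: dict = field(default_factory=dict)              # key -> (reply, stored_at)
    write_back_queue: deque = field(default_factory=deque)

    def handle(self, op, key, request, send, now=None):
        now = time.time() if now is None else now
        ann = self.annotations.get(op, OperationAnnotation())
        # 1. Serve a fresh cached reply if the operation is cacheable.
        if ann.cacheable and key in self.cache:
            reply, stored_at = self.cache[key]
            if now - stored_at <= ann.lifetime:
                return reply
        # 2. Otherwise forward to the service when a connection exists.
        if self.connected:
            reply = send(request)
            if ann.cacheable:
                self.cache[key] = (reply, now)
            return reply
        # 3. Disconnected: queue side-effecting operations for later play-back.
        if ann.playback:
            self.write_back_queue.append(request)
            if ann.default_response:
                return f"<default response for {op}>"  # placeholder reply
        return None  # nothing useful can be returned while disconnected
```

This sketch chooses to drop expired entries from consideration even while disconnected; serving stale replies in that case would trade consistency for availability.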
WSDL Annotations: for each Service
• identify the operation (operationName)
  – xpath (xml query language) expression to extract the name of the operation
• extract the request message (identifier)
  – portions of the request message should be ignored while caching (date)
  – xpath expression to extract relevant parts of the message for identification
<binding name="myContactsBinding" type="tns:myContactsPort"
    operationName="substring-before(local-name(/senv:Envelope/senv:Body/*[1]), 'Request')"
    Identifier="/senv:Envelope/senv:Header/s0:licenses | /senv:Envelope/senv:Header/s1:request | /senv:Envelope/senv:Body">
  <s:binding transport="http://schemas.xmls.org/s/http" style="document" />
  <operation name="insert" cacheable="false" playback="true" defaultResponse="true" cacheHeader="true">
    <s:operation sAction="http://schemas.microsoft.com/hs/2001/10/c#request" />
snippet from annotated myContacts.wsdl
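To make the two service-level annotations concrete, here is a rough Python sketch of what they compute for a SOAP request: the operation name (the local name of the first Body child with the `Request` suffix removed) and a cache identifier built only from the licenses header, the request header, and the Body, so that per-message parts such as the routing `<id>` are ignored. The helper names and the use of a SHA-256 digest as the key are assumptions for illustration; the standard-library `xml.etree.ElementTree` cannot evaluate XPath functions such as `local-name()`, so the logic is written by hand.

```python
import hashlib
import xml.etree.ElementTree as ET

SOAP = "http://schemas.xmlsoap.org/soap/envelope/"


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def operation_name(envelope):
    """Mimic substring-before(local-name(Body/*[1]), 'Request')."""
    body = envelope.find(f"{{{SOAP}}}Body")
    if body is None or len(body) == 0:
        return ""
    head, sep, _ = local_name(body[0].tag).partition("Request")
    return head if sep else ""   # XPath returns '' when the marker is absent


def identifier(envelope, header_names=("licenses", "request")):
    """Hash only the message parts named by the identifier annotation."""
    parts = []
    header = envelope.find(f"{{{SOAP}}}Header")
    if header is not None:
        for child in header:
            if local_name(child.tag) in header_names:
                parts.append(ET.tostring(child))
    body = envelope.find(f"{{{SOAP}}}Body")
    if body is not None:
        parts.append(ET.tostring(body))
    # Note: serialization is whitespace-sensitive; a real cache would
    # probably canonicalize the XML first.
    return hashlib.sha256(b"".join(parts)).hexdigest()
```

Two requests that differ only in an ignored header block (for example the message `<id>`) then map to the same cache entry.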
Annotations for Consistency
• when does request 2 invalidate the response of an earlier request 1 in the cache?
  – an insert could invalidate an earlier query response
• consider requests to be functions with signatures
  req1: op1(param_{1,1}, param_{1,2}, …, param_{1,n})
  req2: op2(param_{2,1}, param_{2,2}, …, param_{2,m})
• the invalidation condition is an expression over req1 and req2
  f(op1, op2, param_{1,1}, …, param_{2,1}, …)
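As an illustration of such a condition f, the sketch below treats each request as an operation name plus a single `select` parameter and declares a cached query invalid when a later write touches an overlapping path. This is a deliberately conservative rule assumed for the example, not the rule used by the prototype: predicates are stripped (nested brackets are not handled), so it may invalidate more than strictly necessary.

```python
import re

WRITES = {"insert", "delete", "replace"}


def steps(xpath):
    """Split a simple XPath into steps after dropping [...] predicates."""
    without_predicates = re.sub(r"\[[^\]]*\]", "", xpath)
    return without_predicates.strip("/").split("/")


def paths_overlap(select1, select2):
    """True when one path is a prefix of the other (same subtree)."""
    s1, s2 = steps(select1), steps(select2)
    n = min(len(s1), len(s2))
    return s1[:n] == s2[:n]


def invalidates(op1, select1, op2, select2):
    """f(op1, op2, params): does later request 2 invalidate the cached
    reply of earlier request 1?"""
    return op1 == "query" and op2 in WRITES and paths_overlap(select1, select2)
```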
Annotations for Consistency: XSL Transformations
• Extensible Stylesheet Language (XSL)
  – transforms XML documents into html/text/xml
  – Turing-complete language
• cache transform: transforms a cached response
  – input: request1, reply1, request2, reply2
  – output: transformed reply1 (null if invalidated)
• more powerful than just specifying invalidations
  – can actually transform the old response
Cache Transform Example
request 1: query request
<queryRequest select="myContacts/contact[name='terry']" />
request 2: delete request
<deleteRequest select="myContacts/contact[name='terry']/phone[@cat='cell']" />
smart cache transform would delete the cell phone number from the cached query response
<xsl:template match="/">
  <xsl:variable name="service1" select="$req1/s:Header/c:request/@service"/>
  <xsl:variable name="service2" select="$req2/s:Header/c:request/@service"/>
  <xsl:variable name="opName1" select="substring-before(local-name($req1/s:Body/*[1]), 'Request')"/>
  <xsl:variable name="opName2" select="substring-before(local-name($req2/s:Body/*[1]), 'Request')"/>
  <xsl:choose>
    <xsl:when test="$service1 = $service2">
      <xsl:choose>
        <xsl:when test="$opName2 = 'query' and ($opName1 = 'insert' or $opName1 = 'delete' or $opName1 = 'replace')">
          <xsl:variable name="cleanQuery1">
            <xsl:call-template name="StripSegment">
              <xsl:with-param name="xpQuery" select="substring-after($req1/s:Body/c:*/@select, '/')"/>
            </xsl:call-template>
          </xsl:variable>
          <xsl:variable name="cleanQuery2">
            <xsl:call-template name="StripSegment">
              <xsl:with-param name="xpQuery" select="substring-after($req2/s:Body/c:queryRequest/c:xpQuery/@select, '/')"/>
            </xsl:call-template>
          </xsl:variable>
          <xsl:call-template name="CheckIntersection">
            <xsl:with-param name="xpQuery1" select="$cleanQuery1"/>
            <xsl:with-param name="xpQuery2" select="$cleanQuery2"/>
          </xsl:call-template>
        </xsl:when>
        <xsl:otherwise> <xsl:value-of select="$rep2"/> </xsl:otherwise>
      </xsl:choose>
    </xsl:when>
    <xsl:otherwise> <xsl:value-of select="$rep2"/> </xsl:otherwise>
  </xsl:choose>
</xsl:template>
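The "smart" behaviour described in the example, deleting the cell phone number from the cached query response instead of discarding it, can be sketched in a few lines of Python. This is an illustrative stand-in for an XSL cache transform, under two assumptions that are not from the slides: the cached reply is taken to be a plain `myContacts` document without namespaces, and the delete request's `select` uses only the XPath subset that `xml.etree.ElementTree` supports. The contact data in the usage example is made up.

```python
import xml.etree.ElementTree as ET


def apply_delete(cached_reply_xml, delete_select):
    """Return the cached reply with the nodes matched by delete_select removed."""
    root = ET.fromstring(cached_reply_xml)
    # ElementTree paths are relative, so wrap the document in a holder
    # element; a select that starts at the root name then resolves.
    holder = ET.Element("holder")
    holder.append(root)
    # ElementTree has no parent pointers; build a child -> parent map.
    parent_of = {child: parent for parent in holder.iter() for child in parent}
    for node in holder.findall(delete_select):
        parent_of[node].remove(node)
    return ET.tostring(root, encoding="unicode")


cached = (
    "<myContacts><contact><name>terry</name>"
    "<phone cat='cell'>555-0100</phone>"
    "<phone cat='home'>555-0101</phone>"
    "</contact></myContacts>"
)
updated = apply_delete(
    cached, "myContacts/contact[name='terry']/phone[@cat='cell']"
)
```

After the call, `updated` still contains the home number but no longer the cell number, so a later query can be answered from the cache while the delete waits in the write-back queue.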
Picking Level of Consistency
• user-freedom in choosing consistency guarantees
  – multiple consistency transforms
• strong consistency
  – less availability
  – better user experience
• weak consistency
  – user experience could deteriorate
    • operations reportedly successful could fail!
    • optional cache header
  – better availability
More Transforms
• response transform
  – response from the cache may have to be changed before returning to the client
  – adding time-stamp, unique identifiers etc.
• default response transform
  – generates a default response for a request
  – default responses are returned when disconnected but request is queued for play-back
Optional Cache Header
• cache provides information to the client using cache header
  – response from cache or server
  – age of cached response
  – request will be played back in the future
• no changes to the definition of WSDL
  – would not affect existing clients in any way
• cache aware clients can provide additional information to the user
example: default response and cache header
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:hs="http://schemas.microsoft.com/hs/2001/10/core">
  <s:Header>
    <path xmlns="http://schemas.xmlsoap.org/rp/">
      <action>http://schemas.microsoft.com/hs/2001/10/core#response</action>
      <from>http://terry.microsoft.com</from>
      <relatesTo>d978b559-aceb-4e9e-9747-b8a306234bc8</relatesTo>
    </path>
    <response xmlns="http://schemas.microsoft.com/hs/2001/10/core" />
    <cacheHeader defaultResponse="true" toPlayback="true"
        xmlns="http://localhost/wsdlannotation" />
  </s:Header>
  <s:Body>
    <hs:insertResponse status="success" selectedNodeCount="1" newChangeNumber="0" />
  </s:Body>
</s:Envelope>
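A proxy could attach such a cache header to any SOAP reply before handing it to the client. The Python sketch below shows one possible way, using the `cacheHeader` element, its `defaultResponse` and `toPlayback` attributes, and the namespace from the example above; the function name and the optional `age` attribute (for the age of a cached response) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SOAP = "http://schemas.xmlsoap.org/soap/envelope/"
ANNOT = "http://localhost/wsdlannotation"


def add_cache_header(envelope_xml, *, default_response, to_playback,
                     age_seconds=None):
    """Return the envelope with a cacheHeader block added to its SOAP Header."""
    env = ET.fromstring(envelope_xml)
    header = env.find(f"{{{SOAP}}}Header")
    if header is None:
        # The SOAP Header is optional; create it as the first child if missing.
        header = ET.Element(f"{{{SOAP}}}Header")
        env.insert(0, header)
    attrs = {
        "defaultResponse": str(default_response).lower(),
        "toPlayback": str(to_playback).lower(),
    }
    if age_seconds is not None:
        attrs["age"] = str(int(age_seconds))  # hypothetical attribute name
    ET.SubElement(header, f"{{{ANNOT}}}cacheHeader", attrs)
    return ET.tostring(env, encoding="unicode")
```

Because the information travels in an extra header block, clients that do not know about the cache can ignore it, which matches the point that existing clients are unaffected.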
Conclusion
• built a prototype web services cache
• experimented with Hailstorm web services and clients
• annotated Hailstorm WSDL files
• the prototype demonstrates custom cache managers in action for Hailstorm
• couldn’t give a demo
Work for the Future
• WSDL annotations for more web services
  – hard to find interesting web services with WSDL descriptions yet!
• hoarding to enhance availability
  – specify user controlled hoard queries
  – hoard transform to obtain response from cached hoard requests
• incorporate security constraints
• tune cache performance