Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new...

26
Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B. Topol K. Tracey The promise of e-business is coming true: both businesses and individuals are using the Web to buy products and services. Both want to extend the reach of e-business to new environments. Customers want to check accounts, access information, and make purchases with their cellular phones, pagers, and personal digital assistants (PDAs). Banks, airlines, and retailers are competing to provide the most ubiquitous, convenient service for their customers. Web applications designed to take advantage of the rich rendering capabilities of advanced desktop browsers on large displays do not generally render effectively on the small screens available on phones and PDAs. Some devices have little or no graphics capability, or they require different markup languages, such as Wireless Markup Language (WML), for text presentation. Transcoding is technology for adapting content to match constraints and preferences associated with specific environments. This paper compares and contrasts different approaches to content adaptation, including authoring different versions to accommodate different environments, using application server technology such as JavaServer pages TM (JSP TM ) to create multiple versions of dynamic applications, and dynamically transcoding information generated by a single application. For dynamic transcoding, the paper describes several different transcoding methodologies employed by the IBM WebSphere TM Transcoding Publisher product, including HyperText Markup Language (HTML) simplification, Extensible Markup Language stylesheet selection and application, HTML conversion to WML, WML deck fragmentation, and image transcoding. The paper discusses how to decide whether transcoding should be performed at the content source or in a network intermediary. It also describes a means of identifying the device and network characteristics associated with a request and using that information to decide how to transcode the response. Finally, the paper discusses the need for new networking benchmarks to characterize the server load and performance characteristics for dynamic transcoding. T he phenomenon of the Internet has caused bus- inesses to rethink their business models, cus- tomer relationships, and internal processes. Tech- nology advances have created new opportunities to reach employees and customers, wherever they are, with information that is tailored to their needs and preferences. This information is often already stored and used in the business, but the delivery system must be re-engineered to exploit the new technology and tailor the content so it is more usable. A common problem with which this re-engineering must deal is that data or information is stored in some form on some system, when it is needed somewhere else in a different form. The problem of adapting or customizing existing con- tent to new applications and deliveries is not new. The user interaction model advanced by the Inter- net browser, along with portable and interoperable features of new technologies such as the Java** lan- guage 1 and Extensible Markup Language (XML) 2 have created a new opportunity to address this prob- lem with some common techniques. In contrast, the rapid appearance of wireless and other new networks with widely varying characteristics and the prepon- derance of new devices with a wide variety of capa- bilities creates new constraints on the solution. De- vices that are designed to be easily carried and used in the field trade off some capabilities to gain this portability. To be easily carried, they must be light rCopyright 2001 by International Business Machines Corpora- tion. Copying in printed form for private use is permitted with- out payment of royalty provided that (1) each reproduction is done without alteration and (2) the Journal reference and IBM copy- right notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 0018-8670/01/$5.00 © 2001 IBM BRITTON ET AL. 153

Transcript of Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new...

Page 1: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

Transcoding: Extendinge-business to newenvironments

by K. H. BrittonR. CaseA. CitronR. Floyd

Y. LiC. SeekampB. TopolK. Tracey

The promise of e-business is coming true: bothbusinesses and individuals are using the Web tobuy products and services. Both want to extendthe reach of e-business to new environments.Customers want to check accounts, accessinformation, and make purchases with theircellular phones, pagers, and personal digitalassistants (PDAs). Banks, airlines, and retailersare competing to provide the most ubiquitous,convenient service for their customers. Webapplications designed to take advantage of therich rendering capabilities of advanced desktopbrowsers on large displays do not generallyrender effectively on the small screens availableon phones and PDAs. Some devices have little orno graphics capability, or they require differentmarkup languages, such as Wireless MarkupLanguage (WML), for text presentation.Transcoding is technology for adapting contentto match constraints and preferences associatedwith specific environments. This paper comparesand contrasts different approaches to contentadaptation, including authoring different versionsto accommodate different environments, usingapplication server technology such as JavaServerpagesTM (JSPTM) to create multiple versions ofdynamic applications, and dynamicallytranscoding information generated by a singleapplication. For dynamic transcoding, the paperdescribes several different transcodingmethodologies employed by the IBMWebSphereTM Transcoding Publisher product,including HyperText Markup Language (HTML)simplification, Extensible Markup Languagestylesheet selection and application, HTMLconversion to WML, WML deck fragmentation,and image transcoding. The paper discusses howto decide whether transcoding should beperformed at the content source or in a networkintermediary. It also describes a means ofidentifying the device and network characteristicsassociated with a request and using thatinformation to decide how to transcode theresponse. Finally, the paper discusses the needfor new networking benchmarks to characterizethe server load and performance characteristicsfor dynamic transcoding.

The phenomenon of the Internet has caused bus-inesses to rethink their business models, cus-

tomer relationships, and internal processes. Tech-nology advances have created new opportunities toreach employees and customers, wherever they are,with information that is tailored to their needs andpreferences. This information is often already storedand used in the business, but the delivery system mustbe re-engineered to exploit the new technology andtailor the content so it is more usable. A commonproblem with which this re-engineering must deal isthat data or information is stored in some form onsome system, when it is needed somewhere else ina different form.

The problem of adapting or customizing existing con-tent to new applications and deliveries is not new.The user interaction model advanced by the Inter-net browser, along with portable and interoperablefeatures of new technologies such as the Java** lan-guage1 and Extensible Markup Language (XML)2

have created a new opportunity to address this prob-lem with some common techniques. In contrast, therapid appearance of wireless and other new networkswith widely varying characteristics and the prepon-derance of new devices with a wide variety of capa-bilities creates new constraints on the solution. De-vices that are designed to be easily carried and usedin the field trade off some capabilities to gain thisportability. To be easily carried, they must be light

rCopyright 2001 by International Business Machines Corpora-tion. Copying in printed form for private use is permitted with-out payment of royalty provided that (1) each reproduction is donewithout alteration and (2) the Journal reference and IBM copy-right notice are included on the first page. The title and abstract,but no other portions, of this paper may be copied or distributedroyalty free without further permission by computer-based andother information-service systems. Permission to republish anyother portion of this paper must be obtained from the Editor.

IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 0018-8670/01/$5.00 © 2001 IBM BRITTON ET AL. 153

Page 2: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

and small. This requirement limits the types of userinterfaces they can support. Large screens and fullkeyboards are impossible; a small screen and tele-phone keypad are more realistic, although some de-vices may have only a voice interface. To run onbattery power for useful periods of time, power con-sumption must be carefully managed, forcing the useof designs with little storage and processing capa-bility. To be connected from anywhere requires wire-less or intermittent wired connections, which limitthe types of interactions and bandwidth available foraccessing content.3

All of these constraints create difficult challenges indesigning a useful system for delivering content toa wide array of devices. However, if such an infor-mation delivery system can be created quickly andcost-effectively and if it integrates with existing in-formation systems, the value to customers can be im-mense. Transcoding, or adapting content from oneform to another, is a key part of meeting these re-quirements for rapid, inexpensive deployment of newways to access existing content.

The media signal processing industry first used theterm “transcoding” to refer to the task of convert-ing a signal, say a television program, from one for-mat to another (for example, converting the NTSC(or National Television System Committee) stan-dard, used in America and Japan, to the PAL (orPhase Alternating Line) standard, used in much ofthe rest of the world), while preserving the contentof the program. Although the term has lately beenused to mean many different things, we use the termtranscoding to refer to the tasks of summarizing orfiltering (which modify content without changing itsrepresentation) and translating, or converting, con-tent from one representation to another.

Every Web application is associated with a varietyof costs, including those associated with the designof business logic, creating an attractive layout, de-ciding which parts of the application to make secure,publishing the various resources associated with theapplication to a Web server or Web applicationserver, testing the visual appeal of the resulting pages,testing to see whether the outcome of the applica-tion is correct, and testing to see whether the ap-plication works correctly with various browsers andvarious different browser versions. Over time, thereare also costs associated with maintaining an appli-cation, including operational costs for server process-ing, storage, and network bandwidth, and develop-ment costs for adding new features, updating the

presentation, retesting, and keeping track of the re-sources.

The proliferation of wireless devices with very dif-ferent form factors multiplies the costs associatedwith Web applications. Different devices requirepages to be presented in different markup languages,such as HyperText Markup Language (HTML), com-pact HTML4 (including a version called imode thatis popular in Japan), Unwired Planet’s Handheld De-vices Markup Language (HDML),5 Wireless MarkupLanguage (WML),6 and VoiceXML. Even with a par-ticular markup language such as WML, different pre-sentations may be required on different specificdevices. For example, for one WML-based phone,choices are most appropriately presented as buttons.For another phone, they might best be presented ashyperlinks. Other constraints come into play. For ex-ample, different phones can accept different sizedecks, where a deck is a set of information that canbe sent to the phone at one time. Sending a phonea deck that is too large will cause the application tofail to display properly. Some forms of content maybe much more expensive for a server or network toprocess than others.7 The existence of a growingnumber of client devices with different presentationrequirements presents new problems for maintain-ing an e-business site.

In addition, there are business opportunities for net-work intermediaries, such as Internet service pro-viders (ISPs), to make existing Web applications avail-able to a greater market by making them accessiblefrom different devices. ISPs do not necessarily ownthe content that they serve to their clients. Instead,they provide value-added services, such as perfor-mance improvements via local caching. Making in-formation available in different forms for differentclient devices is an additional possible value-addedservice for them.

Content adaptation

In the following discussion, application refers to anyform of program that generates information in re-sponse to requests from Web users, content refersto the information returned in answer to a request,and rendering refers to the way the content is pre-sented to the user, for example, how it is laid out ona Web page and how the user navigates through theinformation. Some markup languages, includingmost XML dialects, contain content along with tagsthat identify the meaning of different fields in thecontent. Other markup languages, such as HTML and

BRITTON ET AL. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001154

Page 3: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

WML, carry both content and rendering instructions,since the tags are primarily concerned with the waythe content is presented. Transcoding can modify ei-ther the content or the rendering associated with anapplication. For example, a subset of the content canbe selected for a small-screen device, and it can berendered differently to achieve a more pleasing pre-sentation.

Static Web content includes pages that are presentedto all users in the same way, generally retrieved fromthe file system of a Web server. Dynamic Web con-tent is generated at run time based on informationprovided by the user. Dynamic Web content includespages that are generated by programs, such as serv-lets and JavaServer Pages** (JSP**) that run on Webapplication servers. In both cases, documents—gen-erally in some markup language—are produced tobe sent back to the requesting user.

There are two main approaches to handling the needto present the content delivered by an applicationdifferently in different circumstances, for differentdevices, and for different levels of network service.The first is to design each application for each setof presentation requirements, transcoding the ap-plication statically at design time. The second is tomodify the application output automatically on thefly, essentially transcoding the application dynam-ically at run time. These two approaches can be com-bined in various ways to match the requirements ofparticular applications. Both static and dynamicforms of transcoding can be performed with bothstatic and dynamic Web content. Figure 1 shows asimple comparison between static and dynamictranscoding.

Static transcoding. With static transcoding of staticcontent, different versions of the same content are

Figure 1 Static and dynamic transcoding

CONTENT AND RENDERING

CONTENT AND RENDERING

CONTENT AND RENDERING

CONTENT AND RENDERING

STATIC TRANSCODING

DYNAMIC TRANSCODING

CONTENT

BACK-ENDAPPLICATION

BACK-ENDAPPLICATION

RENDERING

RENDERING

RENDERING

RENDERING

IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 BRITTON ET AL. 155

Page 4: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

produced by a designer, generally using various stu-dio tools, and stored in the file system of the Webserver. The main problem to be solved is selectingthe right version of the content for a particular userwith a particular device. One way to perform this se-lection is to provide the user of a specialized device

with a special home page, often preconfigured in thedevice. This home page contains a set of links to con-tent in the appropriate form for the device. All thelinks in this content point to other content appro-priate for the device. If the users of phones are us-ing them for specific applications rather than for gen-eral Web surfing, this method provides appropriatecontent with no user actions.

With static transcoding of dynamic content, eitherdifferent versions of the application are provided orthe application contains logic that produces differ-ent versions of the content. In any case, the choiceof version is based on information from the incom-ing request that describes the device. For example,there can be a version of a particular JSP that pro-duces HTML for a full-function browser, another thatproduces WML, another that produces VoiceXML,and so on. Once again, the JSP for the different pre-sentations are designed by humans who are cogni-zant of the presentation on the specific target de-vices. The Web application server may provideassistance by selecting the correct JSP for the incom-ing request. For example, IBM’s WebSphere* Appli-cation Server8 3.5 includes an extensible file calledclient_types.xml that it uses to map patterns foundin the HyperText Transfer Protocol (HTTP) headersto specific output markup language requirements.

The primary advantages of static transcoding arequality of presentation and performance.9 Each pageis constructed by a human designer, generally work-ing with tools that show how the application will lookor sound and thus allowing the presentation to behand-tailored for each environment. Also, since thetranscoding is done at design time, the informationis generated directly in the form required by the us-

er’s device. This reduces the run-time processing re-quired to generate the information.

The primary disadvantage of static transcoding is thenumber of different pages and applications that haveto be created, tested, organized, and maintained. Asthe number of devices increases, the burden of hand-generating different versions of each application orpage for different presentations will become oner-ous.

Dynamic transcoding can also work with both staticand dynamic content. In fact, by the time dynamictranscoding occurs, the content is just content, andthe means of generating it is not important.

Dynamic transcoding. Dynamic transcoding prom-ises to allow the problem of creating content to beseparated from the problem of creating different pre-sentations. Dynamic transcoding consists of a set oftechniques for tailoring the information generatedfor a user to match the specific presentation require-ments of a given user on a given device on a givenlevel of network service. Various dynamic transcod-ing mechanisms are possible. Several examples aredescribed later in this paper. They include a mech-anism to select the proper stylesheet to convert anincoming XML document into a presentation appro-priate for a particular output device, a mechanismto translate from HTML into various other markuplanguages, including HDML, compact HTML, andWML, and clipper mechanisms that extract the sub-set of the content that is most appropriate for dis-play on a small-screen device. To the extent that thesemechanisms can be used across a wide variety of con-tent for a wide variety of devices, they reduce theoverall development and maintenance costs of Webapplications.

Both static and dynamic transcoding have a place increating effective e-business sites. Some pages willbe accessed frequently and will be seen as the sig-nature pages for a business. For these, hand designwill often be important. However, there are a num-ber of considerations that will make dynamictranscoding more cost-effective. These include:

1. Is information to be presented that is not underthe user’s direct control? This situation can oc-cur for various reasons, such as (1) an organiza-tion has separate groups working on content au-thorship and its presentation to different devices;(2) an ISP or a portal provider has legal agree-

Both static and dynamictranscoding have a place

in creating effectivee-business sites.

BRITTON ET AL. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001156

Page 5: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

ments with the content owners that allow themto tailor it for presentation to different devices.

2. What skills does the user have that can be usedto develop the complete system? Static transcod-ing may need to be done by persons skilled in Webpage design or dynamic content generation. Dy-namic transcoders may need to be built by per-sons skilled in Extensible Stylesheet Language(XSL) stylesheet development or clipper develop-ment. The most successful projects use a designthat allows for a clean delineation of roles andresponsibilities. That is, the content experts cangenerate the content without having to under-stand the details of page design, server admin-istration, or device support. The site designers cancreate a user experience without understandingall the details of the content or the many differ-ent client device types. The presentation customi-zation experts can focus on the required devicesand their capabilities without the need to under-stand either the content or site design in detail.

3. Is there only concern about preparing content fora single device, such as a WML-enabled phone, orare there other target content types or device re-quirements? For example, some Wireless Appli-cation Protocol (WAP) predecessors send HDML.In Japan, another HTML variant called imode ispopular. Some customers want to present infor-mation in a speech markup language, such asVoiceXML. The more different target formats re-quired, the more work there is to author contentor content generation programs directly for eachone.

4. With WAP phones, are multiple types of phonesbeing served with different deck sizes? Transcod-ing can include “deck fragmentation” that takescare of breaking a page into pieces that can beaccepted by a specific phone, and creating nav-igation links between them. This facility makesthe job of generating the original content—evenif already in WML—simpler for a variety of phoneswith different deck sizes.

5. Can the need be foreseen to support new devices,such as automotive browsers, within days of theirappearance on the network? If there are a largenumber of applications that involve statictranscoding, it may be necessary to upgrade ev-ery application to support the new device. Anytime an application is modified, the existing sup-port may need to be retested.

6. How much content maintenance is there? Arepages or content applications modified fre-quently? Rapidly changing content may precludestatic transcoding.

7. Is any of the content produced in XML, wherethere is a need to apply stylesheets to generatethe right output content for the client device?Where and how is the linkage between devicesand stylesheets to be maintained? Is it necessaryto be able to send XML documents to browsersthat are not able to apply stylesheets?

Use of XML and XSL stylesheets for content adap-tation. XML is the universal format for structureddocuments and data on the Web and is a standardrecommendation maintained by the World WideWeb Consortium (W3C**). Because XML is an open,standard data representation language with ever-in-creasing support from both business and educationalinstitutions, it is becoming an almost universal lan-guage used for a growing number of purposes.

Unlike HTML, which mixes data and display infor-mation, XML is intended to represent the content andstructure of data, without describing how to displaythe information. Although XML allows the structureof a set of data to be defined, it does not usually pro-vide information for changing the structure of thedata to conform to a different set of requirements.For instance, if business A uses one structure to rep-resent a set of data and business B uses a differentstructure to represent the same type of data, XMLitself does not provide the information to transformthe data from one structure to the other. However,the World Wide Web Consortium maintains a stan-dard recommendation called XSL Transformations(XSLT). XSLT is a language for transforming XML doc-uments into other documents. It can be used to trans-form XML into markup languages suitable for dis-play (like HTML), or into XML documents having adifferent structure. Documents written for use byXSLT engines are called stylesheets.

One of the major design issues related to the eco-nomics of maintaining Web applications is whetherto use XML in place of rendering-based markup lan-guages, such as HTML or WML. For many businesses,it makes sense to extract content into a single-useformat that describes the semantics of the informa-tion. This format is more widely useful than the samecontent in an HTML format or WML format thatcannot be easily manipulated for other data accessrequirements, such as business-to-business interac-tions. Reduced information technology infrastruc-ture costs can come from writing the data extractionprograms once into a single form, in this case XML,that can be used for many different data processingrequirements. Once the decision to move to an XML

IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 BRITTON ET AL. 157

Page 6: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

deployment model has been made, transcoding viaXSL stylesheets can transform the data into a widevariety of business interchange and rendering for-mats, including HTML and WML.

However, there are costs involved in writing andmaintaining XSLT stylesheets. For these costs to belower than maintaining different versions of the ap-plication, stylesheets have to be written intelligently,with parameters that allow a single stylesheet to workfor multiple presentations and with reusable chunksthat can be used across a large number of applica-tions. Otherwise, an organization will end up need-ing to write and maintain a large number ofstylesheets. Moreover, as XML and XSL mature, toolsthat reduce the effort to create stylesheets will be-come readily available, thus further reducing theamount of time required to be dedicated toward writ-ing stylesheets. Finally, well-known techniques forrepresenting user interfaces in a generic fashion suchas those employed by Dharma10–13 can be leveragedto make XSL a more general-purpose language.These approaches cause the original XML contentto be first transcoded to an XML document that rep-resents a user interaction. From there, general-pur-pose stylesheets can transcode the content to the ren-dering markup language appropriate for a specificrequester.

Dynamic transcoding as implemented inWebSphere Transcoding Publisher

In March 2000, IBM released WebSphere Transcod-ing Publisher (WTP),14 a product that performs sev-eral kinds of dynamic transcoding and can be de-ployed in different ways. WTP is designed as a“backbone” and “plug-ins.” This structure is shownin Figure 2. The backbone includes the function todynamically route individual requests to the propertranscoders, based on the document source URL (uni-form resource locator), device characteristics, net-work capabilities, and user preferences.

Specific types of dynamic transcoding implementedin WTP are described in this paper, including:

● HTML simplification: modifying HTML documentsto tailor them for reduced function devices, for ex-ample, by removing unsupported features

● HTML to WML transcoding: transforming HTMLdocuments into Wireless Markup Language doc-uments suitable to be sent to WML-based phones

● XML stylesheet selection and application: select-ing the right stylesheet to convert a particular XMLdocument for presentation on a particular device

● Fragmentation: breaking a document into frag-ments that can be handled by a particular device

Figure 2 The structure of the WebSphere Transcoding Publisher

TRANSCODING FRAMEWORK

ADMINISTRATION

• COMMON RULES FOR PERFORMING TRANSFORMATIONS• COMMON FRAMEWORK FOR PLUG-IN INTEROPERABILITY

• FRAMEWORK CONFIGURATION• USER PREFERENCE AND DEVICE PROFILES• REGISTRATION OF STYLESHEETS

DEVELOPERTOOLKIT

IMAGEENGINE

TEXTENGINE:HTML

TEXTENGINE:XML

OTHER IBMTRANSCODERS

THIRD-PARTYTRANSCODERS

• EXISTING PLUG-IN MODIFICATION• CUSTOM PLUG-IN DEVELOPMENT• NEW PROFILE DEVELOPMENT

• KNOWLEDGE OF DEVICE, USER, APPLICATIONOR NETWORK PREFERENCES TO CONTROLTRANSFORMS

• CUSTOMER-PROVIDEDTRANSFORMATIONPLUG-INS

• READILY AVAILABLE TRANSFORMATION PLUG-INS

PROFILES

BRITTON ET AL. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001158

Page 7: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

and creating the navigation links between the frag-ments

● Image transcoding: modifying images by convert-ing them to different formats or reducing scale orquality, or both, to tailor the image for presenta-tion on a particular device

The paper also explains the different deployment op-tions, including criteria for selecting a particular op-tion. WTP may be deployed:

● As a stand-alone proxy that is interposed betweenthe client device and the Web server

● As a proxy that pulls transcoded results througha separate caching proxy product, such as Web-Sphere Traffic Express or the National ScienceFoundation’s Squid

● As a MIME (Multipurpose Internet Mail Exten-sions)-filter servlet running in WebSphere Appli-cation Server

● As a set of JavaBeans** transcoders that can becalled by servlets, JSP, or other programs.

Finally, the paper discusses some future transcod-ing possibilities that could be implemented as Web-Sphere Transcoding Publisher extensions or as sep-arate products.

HTML simplification. Although powerful full-func-tion workstation browsers such as Netscape Com-municator** 4.5 (and later) and Microsoft InternetExplorer** 4.5 (and later) are widely used, many lesspowerful devices have browsers that support only asubset of the HTML functions supported by the morepowerful browsers. In addition, because some features,although supported, might so strain the resources ofthe device that they are troublesome, a user might pre-fer to have a less expensive construct substituted.

For example, there are numerous handheld comput-ers running the Windows CE** operating system,which uses a reduced function version of Microsoft’sInternet Explorer browser. Early versions of thebrowser do not support Java applets. Devices with evenmore constrained resources, especially memory andprocessing power, may have browsers with even lesssupport for HTML functions, such as frames, Java-Script**, and perhaps even tables. As the different typesof somewhat limited devices that provide some typeof HTML browser to allow accessibility to the Internetgrow in number, the permutations of HTML capabil-ities that are and are not supported also grow.

Despite the large number of limited-function devices,much of the HTML content being written today is pro-duced to take advantage of the advanced capabil-ities of powerful workstation browsers. Rather thaneach content provider producing multiple versionsbased on the capabilities of each possible client de-vice, it is desirable to allow the content to be cre-ated once. Transcoding can be used to transform thecontent into a form suitable for the client device. InWebSphere Transcoding Publisher, this process oftransforming the HTML content based on the capa-bilities of the client HTML browser is called HTML sim-plification. This transcoding could be done in two ways:(1) create transcoding function unique to each differ-ent device type, or (2) characterize the capabilities ofdevices in some way and use these characteristics asparameters to direct the transcoding function.

For the reasons discussed earlier, and consideringthe rapidly growing number of different types of de-vices supporting HTML browsers with differing lev-els of support, the WebSphere Transcoding Pub-lisher took the second approach. By using a set ofproperties that describe the capability of a givenbrowser to render various HTML elements, a profiledescribing the browser can be generated. This pro-file provides sufficient information to direct thetranscoding process. In WTP these descriptive prop-erties are referred to as preferences, because in somecases these parameters are truly what is preferred inthe context of the environment in which such devicesmight be used, such as using a wireless connectionto access the Internet. In a wireless environment, thetextLinksPreferredToImages preference might be setto true, indicating that the HTML simplification pro-cess should generate a link to a separate page con-taining an image rather than automatically down-loading it and rendering it in-line in the page, thusgiving the user the choice of whether to spend thetime and byte transmission cost of downloading theimage to the device.

As another example, some limited-function brows-ers are unable to render HTML TABLE elements.For this case, WTP uses a convertTablesToUnor-deredLists preference that can have a Boolean trueor false value. For a limited browser that does notsupport tables, this value is set to true to indicate thatthe table should be converted into an unordered(nested) list. This preserves the relationship betweennested tables by representing them as nested unor-dered lists.

IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 BRITTON ET AL. 159

Page 8: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

Currently, approximately 20 preferences in WTP canbe used to guide the HTML simplification process,including preferences that indicate which types ofimage formats are not supported, whether framesare supported, and whether images should be re-moved entirely.

HTML simplification is only applicable if the con-tent is marked in HTML. The HTML simplificationdescribed here is available in WebSphere Transcod-ing Publisher. It is customizable using the prefer-ences, but flexibility is currently limited to thosepreferences. Table 1 shows an example of HTMLsimplification. Figures 3 and 4 show the beforeand after HTML files from Table 1 rendered in abrowser.

XML stylesheet selection and application. An in-creasingly used approach with XML is to generatecontent in XML and use XSLT stylesheets to transformthe information into a form suitable for display, or totransform the structure to a different one suitable foranother party. Thus, XSLT stylesheets can be writtento support display of XML content as well as to supportthe use of XML in electronic data interchange (EDI).

WebSphere Transcoding Publisher provides a frame-work to aid in using XSLT stylesheets to transformXML documents. The WebSphere Transcoding Pub-lisher includes a version of the Xalan XSLT proces-sor, an open-source XSLT processor supported by the

Apache XML Project.15 The major benefits that WTPprovides in dealing with XSLT stylesheets and theirapplication to XML documents is a structure for or-ganizing stylesheets and support for defining the cri-teria used for selecting the proper stylesheet to beused in the transformation of XML documents. Forexample, an organization might generate content inXML and create different stylesheets to render thecontent for display on different device types. Theymight create one stylesheet for generating an HTMLversion, another for a WML version, and so on. A WTPadministrator registers a given stylesheet with WTP,providing the location of the stylesheet, the contenttype generated (for example, text/HTML), and selec-tion criteria that must exist in the context of an in-coming XML document request in order for thestylesheet to be used in transforming that XML doc-ument. For example, key-value pairs representingproperties associated with a given device, such as thedescriptive name of the device, can be used as cri-teria, guaranteeing that the stylesheet will only beused on requests originating from such devices. Ad-ditionally, conditions matching all or a portion of agiven document URL can be specified, guaranteeingthat the stylesheet will only be used on one or a smallnumber of documents originating from a given serverlocation.

As more and more enterprises use XML andstylesheets, many interesting issues arise. For exam-ple, the document type definitions (DTDs) defining

Table 1 HTML simplification with preferences indicating that images should be converted to images to links and that tablesshould be converted to unordered lists

HTML Fragment Simplified HTML

,HTML. ,HTML.,body. ,body.,img src5"images/house.gif" ,br.alt5"Picture of a House".,/a. ,A,table. HREF5"images/house.gif/PVC_PHANTOM_GENERATOR".,tr.,td.991105001PB,/td. Picture of a House,/a.,br.,/a.,td.Ranch,/td. ,ul.,td.$207,000,/td. ,li.991105001PB,/li.,td.4,/td.,td.2,/td.,/tr.,tr. ,li.Ranch,/li.,td.991105002PB,/td. ,li.$207,000,/li.,td.Split Level,/td. ,li.4,/li.,li.2,/li.,td.$237,000,/td. ,li.991105002PB,/li.,td.5,/td. ,li.Split Level,/li.,td.2.5,/td. ,li.$237,000,/li.,/tr. ,li.5,/li.,/table. ,li.2.5,/li.,/body. ,/ul.,/HTML. ,/body.

,/HTML.

BRITTON ET AL. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001160

Page 9: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

a given type of structure of an XML document andthe stylesheets used to manipulate such documentsmight be considered important proprietary resourcesand need to be kept secure. Also, as the number andtypes of devices allowing Internet access increase,an approach based on different stylesheets for eachdevice type could cause the number of stylesheetsan organization needs to create and maintain to in-crease precipitously. As WTP and stylesheet toolsevolve, there will be increasing support for stylesheetparameterization and stylesheet internalization, andincreasingly complex criteria for selecting the rightset of stylesheets for a given purpose. Stylesheets arestandard, general, and extensible, but using stylesheettranscoding requires that an XSL stylesheet be writ-ten to perform the desired transcoding.

Transformation of HTML into WML. The WirelessMarkup Language (WML) is an XML-based markuplanguage intended for use in specifying content anduser interface information for narrowband devices,including pagers and cellular phones. It is part of theWireless Application Protocol (WAP) standard fromthe WAP Forum.16 WAP-enabled phones are becom-ing more and more prevalent, especially in Europe,with the Nokia Group and others manufacturingphones that support the WAP wireless standard. Manyenterprises that now provide information access tousers using desktop browsers wish to support theirmobile users who would like to access their infor-mation by using WAP devices, but the enterprises donot wish to expend the resources to provide contentin both HTML for workstation browsers and WAP de-

vices. WTP provides transcoding support for trans-forming HTML into WML.

WTP transforms HTML into WML in a two-step pro-cess, with an optional third step to further refine theresult. The first step takes the original HTML doc-ument and converts it into a Document ObjectModel (DOM) representation using an HTML parserdeveloped by a team at the IBM Tokyo Research Lab-oratory. The DOM is essentially a tree structure, con-forming to a World Wide Web Consortium require-ments standard.17 Using standard Java applicationprogramming interfaces (APIs) for traversing andmanipulating the DOM, the second step modifies theHTML DOM, creating a DOM containing WML ele-ments, conforming to the WML standard. This trans-formation is done on an element-by-element basis.For example, an HTML document starts with an HTMLelement (node), whereas a WML document starts witha WML element. Thus, the transformation logic re-sponsible for processing the HTML node replaces thisnode with a WML node in the DOM.

The logic for dealing with a particular HTML elementfalls into one of three categories:

1. The transformation requires unique logic, suchas the HTML example described above.

2. The source element, and all the children under it,are simply deleted. An example is the APPLET el-ement, which represents a Java applet. Since it isnot possible to transform the Java applet into some

Figure 3 Original HTML page displayed in a browser Figure 4 Transcoded HTML page showing the simplifiedlayout

IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 BRITTON ET AL. 161

Page 10: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

other form for execution on the WAP device, theAPPLET element and its contents are simply deleted.

3. The source element is replaced by its children.That is, everything under the top element replacesthe top element itself. For example, the SUPelement, which indicates text contained withinit should be rendered in a superscript font, isreplaced by the text it contains, since superscriptfonts are not supported in WML. In text form,this would be seen as replacing “,SUP.specialtext,/SUP.” with simply “special text.”

The first case above requires special logic for eachtransformed element, whereas the second and thirdcases can each be handled by a single processingmodule, with the HTML elements to be handled sim-ply by providing them to the module in the form ofsome list. Thus, the module responsible for deletingelements and their children is given a list of elementnames that should be transformed in this way.

Once a document has been converted to the targetmarkup language, it can be further tailored by a stepcalled text clipping. There are a couple of reasons whythis might be beneficial:

1. The application of generic transformation logicto a specific document, although resulting in avalid WML document, may still not result in themost visually pleasing results.

2. The transcoded document may contain a largeamount of unnecessary content that tends to ob-fuscate the key information the user was request-ing. For example, although it may be reasonableto provide links to every product and departmentin a company in a table at the top of a page ren-dered on a powerful desktop browser, these linksmay just frustrate the user as he or she has to scrollpast themtofindthe informationactually requested.

There are three ways in which the text clipping pro-cess might be accomplished:

1. Because WML is an XML-based language, XMLstylesheets can be used.

2. The transcoding logic can work with the text formof the document, using traditional string manip-ulation functions to locate the points at which toremove text.

3. The transcoding logic can operate on the DOMdirectly.

All of these approaches require that specialized logicbe written, in either Java or XSL, to perform the de-sired clipping.

The first approach, applying stylesheets to XML, wasexplained in the previous subsection and is not ex-plained further here. Experience has shown that thesecond approach can have serious limitations unlessthe content never changes. Assume that the contentis a weather forecast and that the main forecast de-scription is preceded by the string “Local Forecast.”Suppose, “Local Forecast” was enclosed in a B (boldtypeface) element:

,B.Local Forecast,/B.

Note that if we search simply for the string “LocalForecast” and remove the text prior to this point,the result is

Local Forecast,/B.

which is invalid, because there is an ending tag(,/B.) without the beginning tag (,B.). The so-lution to this problem is to search for the exact text,including tags. However, even a small change to thecontent, such as the B tags being replaced with I (ital-ic) tags, would make the transformation fail.

The third text clipping approach, working directlywith the DOM, has distinct advantages. Individual nodesin the DOM tree represent the beginning and endingtags of the element, so that when a node is removed,there is no problem with removing the beginning tagwithout the ending tag. Also, it is possible to search aportion of the DOM for text regardless of the elementnodes of which they are children. For example, it wouldbe possible to search for the text node “Local Fore-cast” regardless of the fact that the text was a childof a B or an I node. To aid in the creation of textclippers that operate on the DOM, WTP provides func-tions such as findNodeMatchingPattern, whichsearches under a particular node in the DOM for atext node whose text matches a given regular expres-sion pattern, and findSiblingMatchingPattern, whichsearches under siblings of a given node and their chil-dren for a text node matching a given pattern. Withuse of such functions to allow searching and mod-ification of the DOM based on string patterns, it ispossible to write text clipping transformations thatcan withstand minor changes to the source contentwithout requiring updates to the text clipping logic.

Figures 5 and 6 show tree representations of a DOMbefore and after clipping. The DOM in Figure 6 iscreated from the DOM in Figure 5 by a Java clipperperforming the following steps:

BRITTON ET AL. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001162

Page 11: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

1. Find the paragraph (P) node in the document.There should only be one of these generated bythe previous HTML to WML transcoding step.

2. Search for a child node of the P node that has atext node somewhere beneath it containing thestring “Stock Quote.”

Figure 5 DOM created by an HTML to WML transcoding step

NYSE:IBM

STOCKQUOTE

WML

CARD

P

B

AAABR DO

BR

STRONG STRONG

LASTUPDATE

••• ••• ••• •••

Figure 6 “Clipped” DOM, created from the previous DOM by a Java clipper

NYSE:IBM

STOCKQUOTE

WML

CARD

P

B

ABR DO

BR

STRONG STRONG

LASTUPDATE

• • • • • •

IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 BRITTON ET AL. 163

Page 12: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

3. Remove all siblings before the node found instep 2.

4. Look at all siblings following the node found instep 2 until we reach the first anchor (A) node.

5. Starting with the anchor node found in step 4, de-lete all nodes until we find the node that containsa text node beneath it containing “Last Update.”

6. Starting with the node found in step 5, keep allsibling nodes until the next anchor node isreached. (Note that the last node before the firstanchor node found is a STRONG node.)

7. Starting with the anchor node found in step 6, de-lete all following sibling nodes until a DO nodeis found.

The HTML to WML transcoding approach describedabove can be used for transforming HTML to othermarkup languages, such as HDML, compact HTML,or imode.

WML deck fragmentation. WML content, as definedby the WAP Forum, is organized for rendering in acard, similar to a page in HTML. A collection of cardsis known as a deck. A single WML file contains a deckand its component cards, allowing several cards tobe delivered to the phone in one request, eventhough only one card is displayed at a time. Manydevices such as WML phones have a limited memorycapacity. Thus they present a special challenge fortranscoding, since the transcoded content deliveredto the device must not exceed the capacity of the de-vice. In general, HTML pages that are converted toWML as described above will not fit in the limitedstorage available on these devices. Even content thatis generated as WML originally may exceed the de-vice limit. All such content must be subdivided sothat each subdivided piece will fit in the device. InWTP, the process of breaking up such pages into ac-ceptably sized pieces is called WML deck fragmenta-tion. This procedure is performed by the WML frag-mentor. Although the initial implementation workswith WML, the same techniques are easily applied toother markup languages such as imode.

If the WML fragmentor needs to fragment a WMLdeck generated by the HTML-to-WML transcodingprocess, it can retrieve and operate on the DOM cre-ated for the document during that process. Other-wise the fragmentor parses the WML text itself in or-der to create a DOM. In either case, the fragmentorperforms all of its operations on the deck as repre-sented by a DOM, not on the plain WML text.

When presented with a WML deck that may need tobe fragmented, the fragmentor must first determineboth the maximum size deck that may be deliveredto the device, and the size of the input deck as it willbe presented to the device. The first piece of datais a configuration parameter associated with the user-agent field present in the HTTP request that producedthe deck. The second is determined by “walking” theDOM that represents the deck and counting the sizeof each node as it will be seen by the device. Whendoing this counting operation, the fragmentor takesinto account the binary encoding for WML tags thatwill be performed by the WAP gateway.

If the deck exceeds the maximum allowed size, thenthe fragmentation process begins. First, the individ-ual cards are removed from the deck so that eachmay be considered independently. If a single cardby itself exceeds the maximum card size, it must befragmented into smaller subcards. A card is frag-mented by walking the DOM that represents the card,keeping track of places in the tree where the cardmay be fragmented. The fragmentor will fragmenton either ,p. or ,br. tags. As the fragmentor vis-its each node in the tree, it determines the size ofthe card that would be produced if the main cardwas fragmented at this node. The fragmentor con-tinues walking the DOM until this new card size ex-ceeds the maximum. Once this limit is reached, thefragmentor ceases walking the DOM and fragmentsthe card at the last noted fragmentation point. Thisprocess may be repeated for the card remaining af-ter fragmentation, if its size still exceeds the max-imum allowed. Each card that is oversized is frag-mented in this fashion.

Note that as a card is fragmented, links are addedso that the user may navigate from one card to an-other. Each fragment contains a ,prev. elementto allow navigation to the previous card, and a ,go.element that will move the user to the next card insequence. Figure 7 shows an example of linking frag-ments together in this way.

Once all of the individual cards are small enough,they are arranged into decks so that each deck is notlarger than the maximum deck size. Unique decknames are generated by the fragmentor based on theoriginal request URL. The final step in the fragmen-tation process is called link rebinding. During thisstep, the text of the HREF (the parameter specifyingthe URL of a linked resource) attributes of all linksfound in the decks is adjusted so that the target name

BRITTON ET AL. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001164

Page 13: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

is correct, based on the name of the deck in whicheach card was placed.

At this point the fragmentor sends the first gener-ated deck to the device and stores the remaining gen-erated fragments in the resource repository. The re-source repository is simply a general-purpose storagefacility maintained by WTP that allows for retrievalof one of the decks when a request is received forit. Such a request will be generated when a user fol-lows any of the links in the first deck that point toa card in one of the other generated decks. Decksin the resource repository must eventually time outand be removed. Any one of a variety of cache al-gorithms may be used to determine when to removedecks from the repository. If a request is receivedfor a deck that has been removed from the repos-itory, then a deck is generated that contains an er-ror message indicating that the fragment has expiredand the original deck must be requested.

The WTP WML fragmentor described here works onany WML deck without any customization.

Image transcoding. The final type of dynamictranscoding we discuss is image transcoding. Imagesthat view well and render (reasonably) quickly on adesktop browser connected to a local area network(LAN) may not appear in a usable form on the low-

quality and small-sized screens of handheld devices.In addition, they may take unacceptably long todownload and render on such devices. Finally, somedevices do not support all image formats; therefore,to display an image at all it may be necessary to con-vert from the original image format to one supportedby the device before transmitting the image to thedevice.

Image transcoding is performed in WTP by the im-age engine. Figure 8 is an example of an image pro-cessed by the image engine. The image on the rightin the figure has been scaled and converted to grayscale to reduce the image size for display on a wire-less device. Given an input image to transcode, theimage engine must determine three things: the out-put format that should be used for the image, thescale factor to be applied to the image, and the out-put quality of the image. “Output quality” is a some-what vague term. It is a measure of how “good” theimage will appear on the screen. High-quality im-ages will look sharp and clear, but will generally beconsiderably larger (in bytes) than lower-quality im-ages. Also, note that low-quality screens may be in-capable of displaying a high-quality image. In thiscase it is preferable to use a lower-quality image asoutput, since that will speed both transmission andrendering time.

Figure 7 Example of WML deck fragmentation; arrows represent links, either within or between fragments

Deck1

Card1

TARGET

LINK

FragDeck1

FragCard1

TARGET

CONTINUE

FragDeck2

FragCard2

RETURN

LINK

IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 BRITTON ET AL. 165

Page 14: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

Deciding the output format for an image is straight-forward. If the input image is in a format acceptedby the device, the format is not changed. Otherwise,the output format is set to be the first of the formatssupported by the device (as determined by the user-agent, or UA, field in the HTTP request).

Scale factor may also be a simple configuration pa-rameter associated with the requesting device. Gen-erally, it will be set so that images appear relativelyas big on the screen of the requesting device as theywould on a desktop screen. Windows CE devices,however, do not have a fixed screen size. Rather, thescreen size is indicated by the UA-pixels header in-cluded in the HTTP request. When this header ispresent, the image engine can use this informationto dynamically calculate a scale factor based on thedifference in size between the screen described bya UA-pixels header and some (configurable) standardscreen size.

Finally, output quality is determined by combininginformation about the quality of the screen of therequesting device and the configured trade-off thatshould be made between quality and size for this re-quest. The screen quality is a configured parameterfor the device, and may be set to “high,” “medium,”

or “low.” The trade-off configuration parameter willgenerally be associated with a network type: Overa slow wireless link it would be better to favor smallsize over high-quality images, whereas over a LANconnection high quality would be preferred, sincethere is little to be gained in reducing the image size.Over a dial-up connection it would be appropriateto compromise between the two. The image enginefirst chooses a tentative output quality based on thescreen capability, and then adjusts this size down-ward if the trade-off parameter is set to either com-promise or favor small size.

The WTP image engine described here works with-out customization using defaults for preferencesbased on the client device type.

Source identification and preferenceaggregation

One of the major goals of WTP is to tailor the con-tent sent to the user based on the user’s environ-ment—especially the device and connectivity lim-itations. The following two major steps are neededto accomplish this:

Figure 8 Example of image transcoding–a screen capture from the “Transform Tool” in WTP

BRITTON ET AL. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001166

Page 15: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

1. Identify the characteristics, constraints, and pref-erences for each environment source. Suchsources include device types, network types, andusers. This step is a configuration step. In WTPthe characteristics, constraints, and preferencesare all given the single name preferences and aredefined as key-value pairs consisting of a nameand a value. For a given incoming request forsome document or image from a user, identify theenvironment sources. For example, when a userrequests www.ibm.com via a browser running onsome client device, WTP tries to determine sourceinformation such as device type and networktype18 to determine what information is neces-sary to provide the most suitable transformationof the requested document. This step is calledsource identification.

2. Given multiple sources of preference information,provide the value of a given preference to usewhen requested by some transcoding logic. Thisissue is trivial when only one of the sources pro-vides a value for a given preference. However, itis possible, for example, that both the devicesource and the network source provide a valuefor a given preference. In such a case, it is de-sirable to have some single mechanism respon-sible for taking multiple values for a given pref-erence and providing a single value. With use of suchan approach, numerous transcoding modules canbe written using preference values, but withoutregard to how these values were determined. InWTP, the mechanism that provides this functionis called the preference aggregator, because it “ag-gregates” multiple values into a single value.

Source identification. Identifying the device sourcetype for an incoming HTTP request is based on theuser-agent field in the HTTP header. For example, theuser-agent field for an English language version ofthe Netscape Communicator 4.7 browser running ona Windows NT** system is “Mozilla/4.7 [en] (WinNT;U).” The user-agent field for the Netscape Commu-nicator 4.6 version was identical, except that it spec-ified “4.6” rather than “4.7.” Rather than requiringsource identification on exact matches on the entireuser-agent string, WTP supports source identificationbased on a portion of the user-agent string. For ex-ample, specifying the pattern “*Mozilla/4.*” wouldmean that any browser with a user-agent string con-taining the string “Mozilla/4.” would match the pat-tern. Additionally, WTP supports device source iden-tification via pattern expressions combined via logicaloperators. Thus, the expression:

(user-agent5*Mozilla/4.*)u(user-agent5*Mozilla/5.*)

means that any user-agent string containing“Mozilla/4.” or “Mozilla/5.” would match the pat-tern expression.

For network source identification, WTP currentlydoes not have the benefit of an HTTP header fieldlike the user-agent field used for device source iden-tification. In the future, if WTP is used with some typeof (perhaps wireless) gateway, it may be possible toreceive some network connection identification in-formation. Currently, WTP provides a much more ba-sic mechanism. When operating as a transcodingproxy, WTP supports the use of separate Transmis-sion Control Protocol ports to identify the type ofnetwork connection being used. That is, WTP can beconfigured to understand that port 8089 is a wirelessnetwork connection. Thus, if the browser of a clientdevice is configured to connect to the WTP proxy viaport 8089, a request from that device will then be as-sumed to be coming in via a wireless connection. There-fore, the wireless network profile will be used. The stan-dard WTP configurationcomeswithdefault,wireless, and14.4 profiles configured on different ports, but thesecan be changed or additional ones added.

Preference aggregation. For a given request, if morethan one preference source has a value for a givenpreference requested by a transcoding module, thepreference aggregator must provide a single value.In most cases, for a given preference name, the valuespecified in one source profile should take prece-dence over the value in another source profile. Forexample, the value of compressSource preferencefor the Windows CE device profile is false, meaningthat no attempt should be made to reduce the sizeof HTML elements (by reducing noncritical attributes)sent to such devices. However, the wireless net-work profile specifies a value of true for the com-pressSource preference. For a request from a Win-dows CE device using a wireless connection, howshould the value of compressSource be resolved?The preference aggregator allows a precedence or-dering to be specified to help resolve such conflicts.The default precedence ordering that is configured is:

user, network, device, user-default,network-default, device-default

This ordering states that for a given preference name,if there is a specific (not default) user profile and it

IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 BRITTON ET AL. 167

Page 16: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

contains the preference value, it should be used. Ifnot, if there is a specific (not default) network pro-file and it contains the value, it should be used, andso on. Thus, in our example, since WTP does not cur-rently have a way of identifying specific users, thereis a default user profile in addition to the wirelessnetwork profile and Windows CE device profile.Thus, given the above ordering, since there is a spe-cific network profile (wireless) with a value of com-pressSource, its value (true) takes precedence. Be-cause the same generic precedence ordering may notbe appropriate for all preference values, WTP allowsa different precedence ordering to be specified forpreference names for which such an ordering is ap-propriate.

In some cases, rather than choosing a single valuebased on a precedence ordering, it might instead bedesirable to combine the multiple values in some way.For example, it might be desirable to say that for agiven preference whose value is Boolean, if eitherthe network or device value is true, the value shouldbe true. WTP also supports this capability on a pref-erence-by-preference basis.

In order to provide such flexibility, the preferenceaggregator employs a rules mechanism. Genericrules, such as the one providing the default prece-dence ordering, are supplied, but preference-specificrules can be supplied as needed. As is typical withany rules engine, once a given preference has beenresolved, the rules are not needed to resolve the valueagain. In WTP, a preference value remains for thelife of a request because each request is distinct andcan have different source information. The rulesthemselves, although actually Java classes, can bewritten in text form and then parsed and compiled.The rules language uses a compiled scripting lan-guage, NetRexx,19 as an underlying scripting lan-guage so that powerful expressions can be supportedwithout having to use a totally new scripting lan-guage. NetRexx, which is compiled into Java, can beused on any platform on which Java can be used.

At initialization time, and when preference profilesare updated because of configuration changes, rulesbased on the key-value pairs contained in the pref-erence profiles are dynamically generated. Althoughthe rules are compiled, there is still a run-time costassociated with interpreting the rules. In order tominimize the performance impact of using theserules, an analysis of the rules is done immediatelyafter the dynamic generation of rules from prefer-

ence profiles is done, and those rules that are de-rived based solely on source identification are cachedso that their values can be “looked up” based onsource identification information from the requestwithout having to actually interpret the rules.

The rules-based mechanism provides almost unlim-ited flexibility in extending the preference aggrega-tion mechanism, while allowing transcoding modulewriters to use preferences without worrying abouthow the values are resolved. In addition, the pref-erence aggregation mechanism can be updated byan administrator without installing a new version ofthe product and without having to be a Java program-mer.

Deployment models

Transcoding must be interposed at some point be-tween the generation of the initial content and thefinal rendering on the client device. Three locationsare available in the current Web architecture:

1. The client device2. An intermediate point in the network, such as a

proxy or firewall3. The source of the content

Figure 9 shows a simplified picture of a network con-figuration and the potential transcoding placementlocations. Each has its strengths and weaknesses.Therefore, combinations of the approaches may beuseful in some situations.

The Web experience is, today, still largely orientedtoward a client model that assumes a high-bandwidthnetwork connection and workstation capabilities.Content that is designed based on these assumptionsis not usually appropriate for display on lower-ca-pability devices or for delivery on low-bandwidth orhigh-cost connections. This is the crux of the prob-lem when performing client transcoding. When wefirst started looking at client transcoding options forWebSphere Transcoding Publisher, we tried visitinga variety of sites on the Web. It could and did take20 minutes or more to view some pages using adial-up or wireless connection. Even over faster links,the very limited rendering speeds and capabilitiesof handheld devices will usually make pure client-side transcoding an unworkable solution. Imagetranscoding can be particularly problematic for cli-ents because of its computationally intensive nature,

BRITTON ET AL. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001168

Page 17: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

large image sizes, and the potentially very high com-pression rates.20

Independent of the transcoding placement issue,there still needs to be a Web-capable presence onthe client device. Additional client-side presence canbe useful in providing transcoding-specific user op-tions (such as image-specific transcoding options) ora controlled client environment.21 Such a client canalso provide additional information about client de-vice capabilities, although emerging standards suchas composite capability/preference profiles (CC/PP)22

are expected to supplant this role.

WTP assumes only that there is Web rendering soft-ware of some sort on the client device and that theHTTP headers generated by the device either specifythe capabilities of the device or allow it to be inferred.Currently available Web-capable devices meet thisrequirement through the use of user-agent and otherdescriptive headers. This allows WTP to effectivelytranscode for a wide variety of devices without ad-ditional client presence.

A second deployment option is at an intermediatepoint in the network, such as a firewall or proxy. Inthis model, all client requests are directed to the in-termediate representative that forwards them on tothe specified destination server. Most Web clientssupport use of proxies for at least HTTP. WTP can be

configured to run as an HTTP proxy, and in this modecan transparently transcode content for these clients.

In the proxy model, all HTTP responses flowing tothe client can be transcoded, regardless of the serverproducing the content. This approach, combinedwith the preference-based transcoding of WTP, al-lows new client devices to be easily added to a net-work. For example, a business that wishes to give em-ployees with WAP phones access to internal Webcontent can do so by using a WAP gateway (for pro-tocol conversion) and WTP proxy (for contenttranscoding), without requiring content to be spe-cifically authored for the phones, and without im-pacting servers generating content.

There are some drawbacks to the proxy approachfor transcoding. The use of encryption such as se-cure sockets layer (SSL) to protect data precludestranscoding via a proxy. This drawback could be ad-dressed by allowing the proxy to act as both an end-point and a source for SSL connections. However,this additional point of exposure may not be accept-able to some users. It also imposes a significant per-formance penalty at the proxy.

Another potential concern of the proxy-based ap-proach is legal, rather than technical. Some provid-ers of content also wish to control the presentation

Figure 9 Placement options for transcoding function in a network

CLIENT HTTP REQUEST

POTENTIAL TRANSCODING POINTS

HTTP RESPONSE

WEB SERVER(CONTENT SOURCE)

HTTPPROXY

WEB SERVER(CONTENT SOURCE)

••

IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 BRITTON ET AL. 169

Page 18: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

of this content, and so will not want it to betranscoded outside their control. Although this re-striction will not be a concern in the corporate in-tranet example presented above, it can limit deploy-ment in the external Internet environment.

A final area of concern with this approach is in ex-tending the proxy with new transcoders. In the Webserver environment, de facto standards such as theservlet API provide a widely supported method forextending function. There is no similar API yet inplace for proxies. WTP is based on the Web Inter-mediaries (WBI) framework,23 and supports exten-sion via the WBI API. In addition, WTP allowstranscoders written to the servlet API to be deployedin a WTP proxy.

Dynamic transcoding can be a relatively resource-intensive activity. Fortunately, techniques availablefor spreading requests across Web servers or prox-ies may also be used with a transcoding proxy. IBM’sNetwork Dispatcher, for example, can serve as a frontend for multiple transcoding proxies. This configura-tion is shown in Figure 10. WTP instances do not sharetranscoded data, and so session affinity mechanismsin Network Dispatcher are used to ensure that re-quests depending on a previous state (for example,deck fragmentation) are routed back to the correctinstance of WTP.

A final option is to provide transcoding at the sourceof the data. This option, content source transcod-ing, is attractive for a number of reasons. It requiresno client configuration since there is now no inter-vening proxy. For clients that do not support HTTPproxies, such as some WAP phones, it may be the onlyworkable option. Since the transcoding is coresidentwith the content, the content provider has tightercontrol over what is transcoded and how it is pre-sented. Content source transcoding allows transcod-ing to be done before content is encrypted, and soit can support secure connections. If transcoding isbeing done to control who sees what data (for ex-ample, only allowing users from certain locations fullaccess to data), performing transcoding at the sourceenforces these views.

The major drawback to content source transcodingis its limited scope. Deploying transcoding in the con-tent source only affects targeted content from thatserver. It is not appropriate when a client is viewingcontent from a range of servers. Performing transcod-ing on the content source server will also impose anadditional load on the resources of the server, andso may require additional infrastructure enhance-ments.

The options for deploying content source transcod-ing depend on the configuration capabilities of the

Figure 10 Using Network Dispatcher

CLIENTS

WEBSERVERS

WEBSERVERS

WTPPROXY

NETWORKDISPATCHER

WTPPROXY

••

••

••

••

••

BRITTON ET AL. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001170

Page 19: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

server hosting the content. For this reason, WTP pro-vides two models for content source transcoding:

1. Servlet filtering2. Transcoding JavaBeans

In the first option the output from an existing serv-let or JSP is run through a filter servlet after it is gen-erated. This option allows content generated by serv-lets to be transcoded without change to the existingservlet. Not all Web servers support servlet filter-ing, since it is not yet part of the servlet API spec-ification (although this is expected to change). Serv-let filtering does, however, enjoy enough widespreadsupport to make it a useful deployment option. WTPrequires both the output from the existing servlet andinformation on the original request to successfullytranscode content in this model.

If the Web server does not support servlet filtering,or if more control is desired over the transcodingprocess, the content provider may wish to or needto explicitly invoke transcoding on generated con-tent. WTP provides JavaBean transcoders that canbe used to perform image or text transcoding on datastreams.

All Web servers in common use today include ex-tension APIs that allow the requests and responsesflowing through the server to be intercepted andmodified at various points. A transcoding extensionwritten to these APIs provides another option for con-tent source transcoding. However, since these APIsare Web server specific, they are less portable andmore difficult to deploy on a range of platforms. WTPdoes not currently provide support for deploymentat this level.

Application scenarios

Transcoding is used to help make different types ofcontent available in different formats, allowing newapplications of the content that may be different fromthe application for which the content was originallydeveloped. We provide some example applicationscenarios to describe how transcoding can be usedin practice.

Web browsing. One application of transcoding is tomake general Web content available to devices withlimited or specialized user interfaces. In this appli-cation, the transcoding function is deployed as aproxy so that no changes are required to any of theWeb servers that are hosting the content. Many of

the transcoding functions described above—such asHTML simplification, image transcoding, conversionof HTML to other markup languages like WML, anddeck fragmentation—are relevant in constructingthis application. Since typical Web pages have manydifferent layouts and semantic information about thetype and value of the content is scarce, it may be dif-ficult to create a set of transformations that work wellenough for all content of interest. For a Web con-tent transcoding application to work well in practice,it is usually necessary to limit the viewed pages toa selected set and to provide specialized “clippers”that customize the transcoding or clipping for spec-ified content. These provisions allow a great deal ofcontrol over how specific pages are transcoded forspecific devices.

Rendering vertical XML. We expect more and morecontent to be published in XML dialects that are spe-cifically designed for a given application domain.These dialects are called “vertical XML” since theyare specialized for specific applications. These ver-tical XML dialects have a great deal of semantic in-formation about the content, but little or no infor-mation about how it is to be rendered or displayed.Such rendering information can be defined in XSLstylesheets, and these stylesheets can be applied tothe vertical XML dynamically to customize the ren-dering for specific devices or users at the time thecontent is delivered. The XSL stylesheet selection andapplication function discussed earlier are the key op-erations for this application.

Host integration. An important application oftranscoding technology is to leverage the existing in-vestment in legacy applications of an enterprise whileextending its reach to pervasive devices and businesspartners via the Internet. Many companies executetheir business processes on host systems (S/390* andAS/400*) running applications that use 3270, 5250, orVT protocols. Original legacy applications were builtin an environment where users employed dumb ter-minals (and later, terminal emulators) to interactwith host programs. The potential risks and costs pro-hibit companies from porting these existing appli-cations to any new technology for “competitive ad-vantage.” Using the transcoding technology willenhance host integration solutions for use by a va-riety of devices and business partners.

Following are several ways to extend the reach ofexisting host applications to the Internet environ-ment without modifying existing host applications.

IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 BRITTON ET AL. 171

Page 20: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

● Develop a downloadable host application emula-tor using the traditional graphical user interface(GUI) emulator. These emulators are usually Javaapplets24 or ActiveX** controls25 that run insidea desktop browser. This approach does not fit intopervasive devices that do not support Java and Ac-tiveX control.

● Develop a customized user interface by program-ming against host applications. This approach issometimes called “screen-scraping.” Programmerscan write code to interact with host applicationsand generate a customized user interface for theend user. The customized GUI could be in HTMLor WML that can be supported by different devices.The disadvantage of this approach is its cost, sinceprograms have to be written for presentations ondifferent devices.

● Apply transcoding technology as a third approach.First, we convert host application data into a com-mon XML format that is display-independent. ThenXSL stylesheets are developed to transform the hostapplication data to different presentation formatsthat fit into different devices. Figure 11 shows thearchitecture for this approach. Figure 12 shows anexample of a screen from a host application. Fig-ure 13 shows the same screen, transcoded for useon a WAP-enabled phone.

At the design time, the transcoding service is usedfor stylesheet association, stylesheet deployment, and

stylesheet storage. At run time, the transcoding ser-vice retrieves the stylesheet and applies the trans-formation.

There are many advantages to incorporating transcod-ing technology in a host integration solution. The re-engineering of host applications is needed only once.After the XML data have been generated, data trans-formations can be performed to generate the properformat for different client devices and business part-ners. Great flexibility comes from the fact that the workof extracting host content is done only once and thehost application can be delivered to any type of clientby applying a proper style transformation. When newdevices are used to access host applications, only newstylesheets need to be developed. The result is quickdelivery with a minimum development cost of legacyapplications to clients using different devices or to bus-iness partners.

Performance considerations

Transcoding a document from one format to anotheris a computationally intensive operation. To convertHTML to WML or any other format involves parsingthe HTML document. Converting an image file fromone with the JPG extension (the JPEG, or Joint Pho-tographic Experts Group, format) to one with theGIF (graphics interchange format) extension or lowerresolution JPG involves reading and processing the

Figure 11 Architecture of the IBM host integration solution

HOSTAPPLICATIONDATACONVERTER

32705250 XML HTML

WMLXML

XSLTSTYLESHEETS

IBM WEBSPHERETRANSCODINGPUBLISHERVT

BRITTON ET AL. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001172

Page 21: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

entire image and performing floating-point opera-tions on the data.

The payoff is that Web-based files can be viewed bydevices that normally would not be able to view thosefiles. An additional benefit for image files is that theresulting file is much smaller than the original file.Sometimes the resulting image is less than one-tenththe size of the original file. This is particularly im-portant if the user is connected over a slow wirelessconnection.

To handle a large number of end users, Transcod-ing Publisher employs a variety of methods to dis-tribute the load and reduce the number of transform-ing documents that it has already transformed.

Transcoding Publisher will use an external cache ifit is configured to do so. The cache interface is an in-dustry standard, so any caching product that supportsthe industry standard will work. When using an exter-nal cache, the transcoder attempts to fetch the already-transformed page from the cache. If the page, alreadytransformed for the requesting device, is found in thecache, it is returned to the requester. This fetch op-eration avoids the potentially costly step of convert-ing the document. If the data are not in the cache,the act of requesting the data through the cachemechanism causes the transformed file to be storedin the cache for the next time data are requested.

Even with an external cache, there will be cachemisses. A processor can become saturated transform-

Figure 12 Example of an original screen of a host application

IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 BRITTON ET AL. 173

Page 22: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

ing documents. Several instances of WebSphereTranscoding Publisher can be run in conjunction withIBM Network Dispatcher,26 which distributes the loadamong them. The client devices do not have to knowthat there is more than one transcoder in the net-work since they are all configured to use the sameproxy Internet Protocol address. With Network Dis-patcher, more transcoding computers can be addedwhen a site adds more customers. This permits scal-ing in a modular fashion.

Benchmarks. There are a number of issues with us-ing existing Web server or Web proxy benchmarkssuch as WebStone27 or SPECweb**28 to evaluatetranscoding performance. Those benchmarks weredeveloped with the wired Web and full-functionbrowsers in mind and may not be relevant to the usersof small devices with slow Internet access. The au-dience targeted by Transcoding Publisher is oftenaccessing the Web wirelessly using devices that donot handle the same formats or input devices. The

standard Web server and proxy measurement toolsassume a certain workload or set of workloads. It isnot clear that any of these workloads will be gen-erated by users of pervasive devices. For example,the user of an Internet-capable cellular phone prob-ably does not have the same “think time” charac-teristics as the user of a desktop Web browser. Adesktop browser has a comparatively large screen.Therefore, there is more information to read beforeanother request for a page is required. In contrast,it is difficult to input information on a cellular phone,which may lower the request rate.

The user of a desktop browser with high-speed In-ternet access often does not think twice about fetch-ing a new page. However, the user of a wireless de-vice will quickly realize that it takes a long time todownload data from the Web. This realization islikely to affect usage patterns. For example, some-one using a desktop browser will not hesitate to usea search engine to find sites of interest. The user ofa cellular phone is more likely to perform one or twosimple queries. Cache hit rates may also be affectedby different usage patterns. For cellular phones, alarger percentage of requests is likely to be param-eterized queries such as “What is the current valueof xxx stock?” Since the answer to the query maychange every time it is asked, caching is not usuallybeneficial. Therefore, the mix of simple page fetchesto parameterized queries has a big impact ontranscoding throughput. Transcoding is sensitive tothe type of device that is requesting the informationbecause different transformations are performed fordifferent devices. The standard benchmark tests donot take the requesting device type into account.

The current Web measurement tools need to bechanged to include sets of pages that are typicallyfetched by pervasive devices, an appropriate mix ofqueries and simple fetches, and an appropriate mixof different devices. Based on these changes, a newset of benchmarks needs to be developed to makeit possible to compare competing products, whichtoday present performance information in very dif-ferent ways. In addition to standard benchmarks, ca-pacity planning algorithms are needed in order tohelp network designers plan the number of serversrequired to support certain workloads.

Some performance measurements. Despite the lackof relevant, standardized benchmarks, it is still nec-essary to have some idea of the capacity of theTranscoding Publisher. To do this, the WebStonesource was modified to send the HTTP headers that

Figure 13 Example of application in Figure 12transcoded for presentation on a smart phone

BRITTON ET AL. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001174

Page 23: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

the transcoder uses to identify different devices.Workloads were also defined that might be typicalfor users of devices that have small screens and slowInternet connections.

In Table 2, the results for two scenarios are shown:a WAP scenario and a Palm Pilot** scenario. For theWAP scenario, all the clients are WAP phones fetch-ing HTML pages. The HTML pages range from 1000to 3000 bytes. No images are fetched. For the PalmPilot scenario, all the clients are Palm Pilot devices.The same HTML pages are fetched as in the WAP sce-nario. In addition, some images (JPG and GIF) arefetched. They range from 500 bytes to 16000 bytes.Images are fetched one-tenth as frequently as HTML.

The table shows the number of users that can be sup-ported while maintaining a four-second responsetime. All measurements were made on single-pro-cessor, and two-way and four-way multiprocessorIBM RS/6000* systems running the Advanced Inter-active Executive (AIX*) 4.3.3 operating system. Ver-sion 1.1.2 of Transcoding Publisher was tested.

Because WebStone clients have no think time, thetests tend to drive processor utilization very high, andthe apparent scalability is less than we expected. Inother experiments we added up to 15 seconds thinktime between requests. For those runs, the scalingfrom one to two to four processors was almost lin-ear. The results shown here are for the zero thinktime test. The number of concurrent users is an es-timate of how many real users (who do think aboutor read a page before requesting another) can beusing the system concurrently.

Related work

There are several commercial products and researchprototypes whose capabilities are similar to that pro-vided by IBM WebSphere Transcoding Publisher.29

In this section we describe these systems.

The ProxiNet** transcoding engine30 focuses on de-livering existing Web (i.e., HTML) information to mo-bile handheld devices. It uses an ultra-thin client andrelies on a middle tier to perform any necessarytranscoding.31 The ProxiNet engine can filter outcontent not compatible with the capabilities of ahandheld device, reformat HTML tables and framesas lists, and transcode images for these devices byperforming color depth reduction and scaling.

The Spyglass Prism** transcoding engine32 also sup-ports the transcoding of existing Web content forWeb-enabled non-PC devices. The Spyglass Prism en-gine is server-based and translates richly formattedWeb content such as tables, JPEGs, and frames intoformats that match the relatively limited display ca-pabilities of non-PC devices. The engine is also ca-pable of transcoding images for these devices byperforming color depth reduction and scaling. In ad-dition, Spyglass Prism is capable of transcoding HTMLto WML.

The Online Anywhere transcoding engine is capa-ble of transforming content to a variety of informa-tion appliances, including Web-connected TVs, PDAs,wireless devices (pagers, data phones), and voice-only products.

A portal is a gateway that serves as a starting site fromwhich a user begins Web-browsing activity. There

Table 2 Performance measurements made on an RS/6000 44P Model 270, 375 MHz processor with 2GB RAM running AIX;Java tuning parameters: ms 5 256, mx 5 512

Test Case Number ofProcessors

Number ofConcurrent

Users

ResponseTime

Throughput(megabits/sec)

Pages/Sec

WAP clients only 1 1000 4.32 .33 22.97(HTML to WML; 2 1400 4.26 .46 32.581K to 3K size text files;no images)

4 2000 3.93 .72 50.43

PDA clients only 1 1300 4.38 .61 29.44(HTML simplification; 2 1600 4.05 .81 39.131K to 3K size text files;images up to 16K)

4 2500 4.15 1.24 59.77

IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 BRITTON ET AL. 175

Page 24: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

are general portals such as the Netscape and Ya-hoo!** Web sites, as well as specialized portals suchas Garden.com for gardeners and Fool.com for in-vestors. Typically, a portal has many links to manyother Web sites. Portals can be a significant sourceof revenue because of income provided by sellingadvertising space (e.g., for banner advertisements)on the portal site or payments received for directingsignificant user traffic to other sites from the portal.

A portal can also provide a means to provide accessto transcoded content. That is, it can serve as a start-ing point for a set of sites, all of which have beentranscoded for a certain class of devices. For exam-ple, one can design a WML portal as a starting pointto sites that have been transcoded for WML devices.Rather than having to transcode all possible HTML(Web) sites, one can then limit the number of pagesthat need to be transcoded into WML to those pagesthat are accessible from the portal. WTP can be usedto implement such portals, for example, by using dy-namic transcoding in conjunction with text clippersfor those sites that can be accessed from the portal.There are other products in the marketplace, suchas the Oracle Portal-to-Go**, that provide supportfor generating such portals, especially for wirelessdevices.

Han et al.33 present an analytical framework for de-termining whether to transcode and how much totranscode an image for the two cases of store-and-forward transcoding as well as streamed transcod-ing, and also provide practical adaptation policiesbased upon transcoding delay, transcoded image size(in bytes), and the estimation of network bandwidth.

The InfoPyramid transcoding engine34,35 is capableof transcoding content to a variety of client devices.InfoPyramid is able to perform transformations onimages, video, text (i.e., HTML), and audio to reducethe amount of content sent to client devices. In ad-dition, InfoPyramid can also perform modality-basedtransformations. InfoPyramid can convert video toimages, video to audio, video to text, text to audio,and audio to text.

Conclusions and future work

As new types of devices become more prevalent, theneed to adapt existing content for display on a va-riety of devices is becoming greater. The techniquespresented in this paper are valuable in constructingsolutions that reuse existing content for the new ap-plications that are made possible by the new device

and network types. We expect these technologies tocontinue to evolve to address more applications andenvironments and to make it easier to customize thesolutions. The extensive use of standards such asHTML, XML, and the Java language has been a keyfactor in enabling us to quickly build and deliver aproduct—WebSphere Transcoding Publisher—thatincorporates these technologies and delivers on thepromised value.

There are several aspects of WebSphere Transcod-ing Publisher that are excellent avenues for futurework. Transcoding Publisher currently does not pro-vide automatic conversion of HTML content to a for-mat suitable for voice-recognition-based browsers.Providing this type of function (e.g., HTML toVoiceXML conversion) would enable TranscodingPublisher to reach a greater range of devices.

The ability of Transcoding Publisher to transformcontent in a variety of ways could be leveraged toadapt content to enable it to be more suitable forusers who have accessibility issues. For example,Transcoding Publisher could be used to increase thefont size of content to enable it to be read by userswith limited vision capabilities, and content couldbe modified such that it was presented in a simplerform, thus making it suitable for users with atten-tion deficit disorder or dyslexia.

The clipping and transformation capabilities ofTranscoding Publisher are well suited for the cre-ation of portals for wireless devices. AdaptingTranscoding Publisher to serve in this capacity is avaluable avenue for related work.

Transcoding Publisher currently relies upon useragent identifiers transmitted by browsers to deter-mine the capabilities of the client. As more formalmethodologies for sending information regarding cli-ent capabilities such as CC/PP mature, the policies ofTranscoding Publisher for determining how totranscode for a device should be updated to exploitthis detailed description of capabilities.

Transcoding Publisher could be extended to dynam-ically adapt its transcoding policies as the environ-ment in which it is being used changes over time.For example, image reduction policies could changewhen network bandwidth is suddenly reduced.

Finally, Transcoding Publisher needs to be extendedto support emerging formats such as the MPEG-7(Moving Picture Experts Group) multimedia format

BRITTON ET AL. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001176

Page 25: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

to allow it to continue to provide first-class supportfor cutting-edge wireless devices.

In addition to functional enhancements, it would beuseful for comparing the performance of differentproducts if standardized performance benchmarksfor mobile devices accessing the Internet were de-veloped.

Acknowledgments

The authors acknowledge the large internationalteam that designed, developed, tested, documented,and marketed WebSphere Transcoding Publisher,including the core IBM team in Research TrianglePark, North Carolina; an IBM team in Austin, Texas,led by Matt Rutkowski that developed the cachingsupport, especially Andrew Hately who endured foursnowed-in days in North Carolina to make sure itworked; the IBM Tokyo Research team that devel-oped the HTML parser vital to the HTML to WMLtranscoding, especially Goh Kondoh and ShinichiHirose; the IBM Hawthorne Research team that con-tributed the image transcoding technology, especiallyJohn R. Smith, Chung-Sheng Li, and Richard La-Maire; the Lotus XSL team that developed the XalanXSL transformation engine, especially Scott Boagwho also worked with our customers to improve theirstylesheets; and the IBM Almaden Research team ledby Paul Maglio and Rob Barrett that contributed theWBI framework at the core of WebSphere Transcod-ing Publisher and is still working with us to enhanceit. We thank every individual on this extended teamfor an exceptional degree of long-distance cooper-ation.

*Trademark or registered trademark of International BusinessMachines Corporation.

**Trademark or registered trademark of Sun Microsystems, Inc.,Massachusetts Institute of Technology, Netscape Communica-tions Corporation, Microsoft Corporation, Standard PerformanceEvaluation Corporation, Palm, Inc., ProxiNet, Inc., OpenTV, Inc.,Yahoo! Inc., or Oracle Corporation.

Cited references and notes

1. Java, Sun Microsystems, Inc., http://java.sun.com.2. XML, World Wide Web Consortium, http://www.w3.

org/XML.3. K. F. Eustice, T. J. Lehman, A. Morales, M. C. Munson,

S. Edlund, and M. Guillen, “A Universal Information Ap-pliance,” IBM Systems Journal 38, No. 4, 575–601 (1999).

4. Compact HTML for Small Information Appliances, WorldWide Web Consortium, http://www.w3.org/TR/1998/NOTE-compactHTML-19980209/.

5. Proposal for a Handheld Device Markup Language, World

Wide Web Consortium, http://www.w3.org/TR/NOTE-Sub-mission-HDML.HTML.

6. Wireless Markup Language is part of the Wireless Applica-tion Protocol standard from the WAP Forum, http://www.wapforum.org.

7. S. Chandra, C. S. Ellis, and A. Vahdat, “Differentiated Mul-timedia Web Services Using Quality Aware Transcoding,”INFOCOM 2000—Nineteenth Annual Joint Conference of theIEEE Computer and Communications Societies (March 2000),http://www.cs.duke.edu/;surendar/infocom00.pdf.

8. IBM WebSphere Application Server, IBM Corporation,http://www.ibm.com/software/webservers/appserv/.

9. T. F. Abdelzaher and N. Bhatti, “Web Content Adaptationto Improve Server Overload Behavior,” 8th InternationalWorld Wide Web Conference (1999). http://www8.org/w8-papers/4c-server/web/web.pdf.

10. F. Kitayama, S. Hirose, and K. Kuse, “Dharma: A Frame-work for Development of Web Applications for Pervasive Ter-minals—Systems Overview and Application Objects,” IPSJ57th Annual Convention, in Japanese (1998).

11. S. Hirose, F. Kitayama, and K. Kuse, “Dharma: A Frame-work for Development of Web Applications for Pervasive Ter-minals—View Object Generation and HTML GenerationMechanism,” IPSJ 57th Annual Convention, in Japanese(1998).

12. F. Kitayama, S. Hirose, and K. Kuse, “Design and Implemen-tation of Web-Application Development System for BusinessObjects and Pervasive Terminals,” IPSJ ’98 Object OrientedSymposium, in Japanese (1998).

13. F. Kitayama, S. Hirose, K. Kuse, and G. Kondoh, “Designof a Framework for Dynamic Content Adaptation to Web-Enabled Terminals and Enterprise Applications,” IPSJ/IEEEAPSEC ’99 (December 1999).

14. IBM WebSphere Transcoding Publisher, IBM Corporation,http://www.ibm.com/software/webservers/transcoding/.

15. The Apache Software Foundation, http://xml.apache.org.16. WAP Forum, http://www.wapforum.org.17. Document Object Model, http://www.w3c.org/DOM.18. Currently WTP does not have a mechanism to identify in-

dividual users. In the future, WTP will be used in environ-ments where user identification is possible. In this case, pref-erences from the user source will also be used.

19. The NetRexx Language, IBM Corporation, http://www2.hursley.ibm.com/netrexx/.

20. R. Han, “Factoring a Mobile Client’s Effective ProcessingSpeed into the Image Transcoding Decision,” WOWMOM’99 (August 1999).

21. A. Fox, S. Gribble, Y. Chawathe, and E. Brewer, “Adaptingto Network and Client Variation Using Infrastructural Prox-ies: Lessons and Perspectives,” IEEE Personal Communica-tions 5, No. 4, 10–19 (August 1998).

22. CC/PP, World Wide Web Consortium, specifications athttp://www.w3.org/TR/NOTE-CCPPexchange;http://www.w3.org/Mobile/IG.

23. R. Barrett and P. Maglio, “Intermediaries: An Approach toManipulating Information Streams,” IBM Systems Journal 38,No. 4, 629–641 (1999).

24. IBM SecureWay Host On-Demand 4.0: Enterprise Communi-cations in the Era of Network Computing, SG24-2149-01, IBMCorporation.

25. Attachmate Corporation, http://www.attachmate.com.26. IBM Network Dispatcher; IBM Corporation, http://www.

ibm.com/software/network/dispatcher/.27. WebStone benchmark, http://www.mindcraft.com/webstone/.28. SPECweb99, http://www.specbench.org/osg/web99/.

IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001 BRITTON ET AL. 177

Page 26: Transcoding: Extending e-business to new … · Transcoding: Extending e-business to new environments by K. H. Britton R. Case A. Citron R. Floyd Y. Li C. Seekamp B ... buy products

29. R. C. Henderson, B. Topol, C.-S. Li, R. Mohan, and J. R.Smith, “Taxonomy of Network Transcoding, in MultimediaComputing and Networking 2000,” K. Nahrstedt and W.-C.Feng, Editors, Proceedings of SPIE 3969, pp. 65–72 (2000).

30. ProxiNet, http://www.proxinet.com/technology/, now part ofPuma Technology, Inc., http://www.pumatech.com.

31. A. Fox, S. D. Gribble, E. A. Brewer, and E. Amir, “Adaptingto Network and Client Variability via On-Demand DynamicDistillation,” ASPLOS-VII, Cambridge, MA (October 1996).

32. Spyglass, Inc., http://www.spyglass.com/, now part of Open-TV, Inc., http://opentv.com/.

33. R. Han, P. Bhagwat, R. LaMaire, T. Mummert, V. Perret,and J. Rubas, “Dynamic Adaptation in an Image Transcod-ing Proxy for Mobile WWW Browsing,” IEEE Personal Com-munications Magazine 5, No. 6, 8–17 (December 1998).

34. J. R. Smith, R. Mohan, and C.-S. Li, “Transcoding InternetContent for Heterogeneous Client Devices,” Proceedings ofthe IEEE International Symposium on Circuits and Systems(ISCAS), special session on Next Generation Internet (June1998).

35. J. R. Smith, R. Mohan, and C.-S. Li, “Content-BasedTranscoding of Images in the Internet,” Proceedings of theIEEE International Conference, Image Processing (ICIP-98)(October 1998).

Accepted for publication September 22, 2000.

Kathryn Heninger Britton IBM Application Integration Middle-ware Division, P.O. Box 12195, Research Triangle Park, North Caro-lina 27515 (electronic mail: [email protected]). Ms. Britton isa Senior Technical Staff Member in the Application IntegrationMiddleware Division. She is the technical leader of the world-wide research and development team that produced the IBMWebSphere Transcoding Publisher product, first shipped in March2000. Since joining IBM in 1981, she has contributed to IBM’snetwork architecture, with a focus on two-phase commit proto-cols and multiprotocol networking. She has also worked on thedevelopment of several software products for networking and mo-bile computing. She has served as IBM’s representative to theX/Open XNET working group. Prior to joining IBM, Ms. Brit-ton worked for the Naval Research Laboratory on software en-gineering research and technology transfer. She holds a bache-lor’s degree in English from Stanford University and two master’sdegrees, one in computer science from the University of NorthCarolina at Chapel Hill.

Ralph Case IBM Software Group, 4205 South Miami Boulevard,Research Triangle Park, North Carolina 27709 (electronic mail:[email protected]). Mr. Case is a senior programmer develop-ing transcoding strategy and technology. He also works on APPNw

architecture and chairs the APPN Implementers Workshop(AIW). Previously, he helped develop S/390 hardware, special-izing in channel subsystems, ESCONw, and total systems test. Hehas experience building automated tools and cooperative process-ing systems using communications protocols, knowledge-basedsystems, and object-oriented programming. He holds a B.S. inelectrical engineering from Rensselaer Polytechnic Institute.

Andrew Citron IBM Software Group, 4205 South Miami Boule-vard, Research Triangle Park, North Carolina 27709 (electronic mail:[email protected]). Mr. Citron is currently the team lead forthe Transcoding Publisher and Host Publisher performance team.He has also been a software developer on Web Express and

Mwavew. Prior to that he was lead architect for IBM’s APPC ar-chitecture.

Rick Floyd IBM Software Group, 4205 South Miami Boulevard,Research Triangle Park, North Carolina 27709 (electronic mail:[email protected]). Dr. Floyd received the B.S. degree from IowaState University in 1977 and M.S. and Ph.D. degrees from theUniversity of Rochester in 1982 and 1989. He was a member ofthe technical staff at the Clinton P. Anderson Meson Physics Fa-cility from 1978 to 1981, and at BBN Laboratories Incorporatedfrom 1987 to 1989. He has been with IBM in Research TrianglePark since 1989. His research interests include distributed sys-tems and mobile computing.

Yongcheng Li IBM Application Integration Middleware Division,4205 South Miami Boulevard, Research Triangle Park, North Caro-lina 27709 (electronic mail: [email protected]). Dr. Li is a softwareengineer in the Advanced Design and Technology Group of theApplication Integration Middleware Division. He currently workson XML-based application integration and data transformation.After joining IBM in 1997, he worked on Web-host integration.Dr. Li received his Ph.D. degree in computer science from Tsing-hua University in 1993. He is a coinventor on 16 filed patents.

Christopher Seekamp IBM Software Group, 4205 South MiamiBoulevard, Research Triangle Park, North Carolina 27709 (elec-tronic mail: [email protected]). Mr. Seekamp is a program-ming consultant and has been the primary developer and devel-opment team leader for the text-related portions of theWebSphere Transcoding Publisher since early in its development.He has held numerous key positions working on the developmentof the various communications software projects. For most of thelast several years he has worked on issues dealing with providingcontent to various types of mobile devices, both inside and out-side of IBM. Over 10 years ago, Mr. Seekamp was an early adopterof object-oriented design and development techniques and hasbeen employing object-oriented design and development ap-proaches in his work ever since.

Brad Topol IBM Software Group, 4205 South Miami Boulevard,Research Triangle Park, North Carolina 27709 (electronic mail:[email protected]). Dr. Topol is a member of the AIM/BusinessConnection Advanced Technology Group at IBM in ResearchTriangle Park. He received the B.S. and M.S. degrees in com-puter science from Emory University in 1993 and the Ph.D. de-gree in computer science from the Georgia Institute of Technol-ogy in 1998. Currently, he is actively involved in advancedtechnology projects in the areas of content adaptation, distrib-uted systems, networking, and graphical user interfaces.

Karen Tracey IBM Software Group, 4205 South Miami Boule-vard, Research Triangle Park, North Carolina 27709 (electronic mail:[email protected]). Dr. Tracey received B.S., M.S., and Ph.D. de-grees in electrical engineering from the University of Notre Damein 1987, 1989, and 1991, respectively. She joined IBM at ResearchTriangle Park in 1991. She has worked in development for var-ious products, most recently for Communications Server/390 andWebSphere Transcoding Publisher.

BRITTON ET AL. IBM SYSTEMS JOURNAL, VOL 40, NO 1, 2001178