Integration of Accessible Documents into Digital Libraries of Tomorrow

Alexander Haffner

Technische Universität Dresden

01062 Dresden, Germany

[email protected]

Gerhard Weber

Technische Universität Dresden

01062 Dresden, Germany

[email protected]

Digital libraries mainly process digital text resources and aim to ensure long-term preservation. Future systems must additionally focus on accessible multimedia dissemination and move beyond traditional distribution channels. We have developed two approaches towards a client/server distribution model for an accessible digital library, based either on an extension of DAISY or on the transformation of MultiReader documents. Both approaches rely on the ingest of comprehensive contents with enhanced semantic richness. We consider the changing roles of authors and their corresponding responsibilities for improving the actual resource quality. Furthermore, we discuss the resulting benefits for end users from multimedia and the related modalities of use.

1. Introduction

Recently, the meaning of the term “digital library” has expanded considerably: development has moved from simple online catalogues to archival information systems. Archival information systems are archives consisting of an organization of people and systems that have accepted the responsibility to preserve information and make it available for a designated community [OAIS02].

Each reader community is affected by the availability of digital resources. According to the European Disability Forum (EDF), approx. 50 million people in the European Union have a disability [EDF08]. For example, people who depend on a wheelchair can overcome mobility barriers by visiting digital libraries instead of conventional libraries. In contrast, print-disabled people only gain access to contents that are explicitly available as digital assets. For the majority of people with a disability, digital resources offer a variety of advantages for comfortable document use. Instead of delivery by mail, digital libraries support near-instant access. Reengineering the ingest, archival and dissemination processes is the challenge for a modern library-for-all.

This paper discusses access strategies for rich multimedia contents. Multimedia solutions may include synchronized equivalent media streams that already match user needs or are adaptable to them. The production and ingest responsibilities of authors and librarians may be supported by a distributed system that serves archiving and also enables high-quality resource distribution.

2. Access by consumers

Consumers search for documents online when the bibliographic metadata helps them. Often they intend to read, or at least browse, a document that suits their interest immediately. Digital library search functionality is mostly offered through a web based user interface in which forms assist users in a targeted search. Therefore, techniques and guidelines for web accessibility may already cover the accessibility aspects of searching. In contrast, access to the actual resources is not necessarily covered by such guidelines.

[PET05] determined the requirements of print-disabled users (blind, partially sighted, dyslexic, and hearing impaired users) in document handling. The results show that each group has problems using standard print but also has specific requirements for the use of digital documents. For example, blind users are not able to use graphics, whereas graphics support understanding for all other readers; partially sighted readers need scalable text and adjustable contrast; audio without any captioning is useless for deaf users; and line spacing that is too small causes dyslexic readers to lose their place in the text. In particular, video and audio demand different accessibility approaches to serve each target group.

The use of multimedia documents by mainstream readers as well as by print-disabled readers requires contents that match the constrained modalities of use and offer personalization of the document presentation.

2.1 Digital Talking Books in DAISY format

A Digital Talking Book (DTB) is a multimedia document developed with disabled users in mind. DTBs use the Digital Accessible Information System (DAISY) as a smart format to integrate and synchronize text, audio and images. The ANSI/NISO Z39.86-2005 standard [DTB05] defines the format and content of the electronic file set that comprises a Digital Talking Book.

The XML based specification provides producers with the ability to structure a book in great detail. Compared to HTML mark-up, XML increases the mark-up options and makes more detailed structure and some nesting possible. The corresponding structure allows readers both global navigation (through pages, chapters, headings etc.) and local navigation within a document at a very fine granularity (on a paragraph, sentence or word level as well as within a table).

DAISY players offer readers personalization of the text display as well as adjustment of the audio playback speed within a certain range. Additionally, a player can continuously highlight text phrases. Typically, DTBs contain synthetic or human narrated voice. An example of synchronized media playback is shown in Figure 1. The illustration demonstrates the dependencies of the permitted DTB items. A maximum of three media objects (one of which is time-dependent) may be presented in parallel at any moment of reading.
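The constraint stated above (at most three media objects in parallel, only one of them time-dependent) can be illustrated with a minimal sketch. It is not taken from any DAISY player; the names MediaObject and SyncPoint, the treatment of audio as the only time-dependent kind, and the file names are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Assumption for this sketch: audio is the only time-dependent media kind.
TIME_DEPENDENT = {"audio"}


@dataclass(frozen=True)
class MediaObject:
    kind: str  # "text", "audio" or "image"
    src: str   # illustrative file name or fragment reference


@dataclass
class SyncPoint:
    """One group of media objects that are presented in parallel."""
    objects: list = field(default_factory=list)

    def add(self, obj: MediaObject) -> None:
        # Enforce the limit of three parallel media objects.
        if len(self.objects) >= 3:
            raise ValueError("at most three media objects in parallel")
        # Enforce the limit of one time-dependent object per group.
        if obj.kind in TIME_DEPENDENT and any(
            o.kind in TIME_DEPENDENT for o in self.objects
        ):
            raise ValueError("only one time-dependent object per group")
        self.objects.append(obj)
```

A group holding one text fragment, one audio clip and one image is accepted; a second audio clip or a fourth object is rejected with a ValueError.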

Figure 1: Example of synchronized multimedia in DAISY (tracks dtb:audio, dtb:text and dtb:image; axes labelled t and m)

The deficit of DAISY is its lack of video integration. People with a hearing loss in particular suffer from this. For deaf people, sign language is the native language. Consequently, navigable sign language videos are produced by specialized publishers to make documents more readable, utilizing a variety of technical approaches and typically without XML-based mark-up.

The MultiReader project noticed the demand for corresponding media alternatives as early as 2005 and developed its own format-for-all, utilizing the same industry standard file formats as DAISY. Recently, the DAISY Consortium likewise observed that several media streams matching the needs of hearing-impaired users in particular were missing.

Alongside the design of SMIL 3.0, a SMIL 3.0 DAISY profile was developed. The profile meets a variety of additional requirements. Consequently, the DAISY Consortium announced a revision of ANSI/NISO Z39.86. The revised standard will address both authoring (master creation) and distribution requirements [DAYNL08].

2.2 MultiReader Project

The MultiReader document model is based on enriched media documents and the separation of content from presentation. A MultiReader document consists of a main file and several source files, which are rich media documents containing both XHTML and a kind of SMIL mark-up to identify single media objects. XHTML+TIME is a development combining the well known Hypertext Mark-up Language (HTML) with the timing and synchronization mechanisms of SMIL. Using these XML languages makes further document processing possible with readily available XML parsers and XSLT engines. Furthermore, XHTML+TIME documents can be displayed by industry standard web browsers. MultiReader documents are read through a reading program with novel user interface elements.

Media objects are identified using a mechanism in XHTML that links the description of contents with a description of presentation (Cascading Stylesheet, CSS). In XHTML each stylesheet is identified through a name, which also serves as a microformat. In addition, MultiReader specifies classes of media objects to describe contents that will be transformed according to user needs. Hierarchical nesting of MultiReader classes is possible through nested mark-up (<span>-tags). Figure 2 shows such a nesting for a video played together with background sound, followed by music. The narration of the video is enriched by audio description for blind users. Captions and subtitles may be offered to deaf users instead.
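How nested media classes could be reduced according to a reader profile can be sketched as follows. The class names are those shown in Figure 2; the concrete nesting and the rules about which classes are dropped for which reader group are assumptions for illustration and are not taken from the MultiReader specification.

```python
# Nested media classes as (class_name, children) tuples.
# The nesting is illustrative only.
DOCUMENT = ("cvideo", [
    ("cvbgsound", []),
    ("cvmusic", []),
    ("cvnarration", [("cvaudiodesc", [])]),
    ("cvcaption", []),
    ("cvsubtitle", []),
])

# Assumed profile rules: classes that are dropped for a reader group.
DROP = {
    "blind": {"cvcaption", "cvsubtitle"},
    "deaf": {"cvbgsound", "cvmusic", "cvnarration", "cvaudiodesc"},
}


def select(node, profile):
    """Return a copy of the tree without the classes dropped for this profile."""
    name, children = node
    if name in DROP.get(profile, set()):
        return None  # the whole subtree is removed
    kept = [c for c in (select(child, profile) for child in children)
            if c is not None]
    return (name, kept)
```

For the assumed "deaf" profile, only the video with its captions and subtitles remains; for an unknown profile, the tree is returned unchanged.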

Figure 2: Example of synchronized multimedia in MultiReader documents [SPI08] (nested classes cvideo, cvbgsound, cvmusic, cvnarration, cvsubtitle, cvcaption, cvaudiodesc, ctsynth and cthigh mapped to video, audio, set and text elements; axes labelled t and m)

In essence, formats for accessible multimedia documents do exist. But a digital library of tomorrow should not only support search and download mechanisms; it should also support platform independent, web based playback of multimedia resources. Neither MultiReader nor DAISY has been developed with this intention.

3. Web based playback solutions

This section introduces two enhancements for improved access to multimedia documents-for-all that do not depend on particular player technologies. In both approaches, support for a multimodal usage concept for users with disabilities plays an important role.

3.1 Timesheet based MultiReader solution

In contrast to the traditional MultiReader concept, [SPI08] replaces the HTML+TIME technologies with a different approach to multimedia synchronisation that delivers content independently of the browser. The redeveloped system is based more strictly on SMIL, but implements SMIL through SMIL Timesheets [STS08] and allows validation as an XHTML file. Timesheets provide absolute, relative and event based time control of multimedia items in a web page. The advantage of Timesheets lies in their external and modular specification, similar to CSS. All timing specifications can be held in one file and are attached to embedded page elements at runtime. Timesheets are based on JavaScript, a technology that screenreaders can handle successfully if it is used carefully. A Timesheet engine supports client side events and the corresponding media handling. It is dynamically integrated into a distributed MultiReader solution by relying on AJAX.

The distributed MultiReader system consists of an application server that maintains a user profile. The implementation is based on Cocoon. According to the user profile, the application server composes the mandatory media objects into a MultiReader document. These documents are located in a separate archival storage system that supports streaming functionality. The Cocoon application server transforms media containers by XSLT and arranges all necessary components in an accessible web page. This means that personalization is completed on the server. As a result, users read web pages that meet their needs more adequately. Adjustment of visual media is provided by individual CSS. The multimedia items are simply embedded as link objects. Streaming of multimedia data is handled by an accessible Flash player, which is controlled by the timesheet with respect to temporal requirements and by mouse or keyboard operated buttons with respect to user intervention for extra reading time.

Furthermore, besides standard browser navigation, documents contain navigation support through a table of contents and indices for exploring the hierarchical document structures. A variety of assistance features offer quick help. The enhanced MultiReader system has been tested successfully for WCAG 1.0 AA conformance as well as for usability in a pilot test by a blind user.
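The relative time control mentioned in this section can be illustrated with a simplified model of the seq and par time containers known from SMIL. The sketch below only resolves fixed durations into absolute begin times; event based timing and the actual Timesheet engine of [SPI08] are not modelled, and the dictionary layout and identifiers are assumptions.

```python
def schedule(node, start=0.0, out=None):
    """Assign absolute begin times to media items.

    Returns a tuple (end_time, begin_times), where begin_times maps
    each media id to its absolute begin time in seconds.
    """
    if out is None:
        out = {}
    kind = node["type"]
    if kind == "media":
        out[node["id"]] = start
        return start + node["dur"], out
    end = start
    for child in node["children"]:
        if kind == "seq":
            # Children play one after another.
            end, _ = schedule(child, end, out)
        else:
            # "par": children start together; the container ends with the longest.
            child_end, _ = schedule(child, start, out)
            end = max(end, child_end)
    return end, out


# Illustrative timing tree: a video with captions, followed by music.
TREE = {"type": "seq", "children": [
    {"type": "par", "children": [
        {"type": "media", "id": "video", "dur": 10.0},
        {"type": "media", "id": "caption", "dur": 10.0},
    ]},
    {"type": "media", "id": "music", "dur": 5.0},
]}
```

Calling schedule(TREE) starts the video and its caption at 0 seconds and the music at 10 seconds, with the whole sequence ending at 15 seconds.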

3.2 Enhanced DAISY web player

The enhanced DAISY format of [EBE08] contains, besides the traditional dtbook textual content file and the recorded audio files, a variety of specific audio tracks, an additional video stream, a sign language stream, and different subtitles and captions for videos as primary media. Combinations of single media objects address the needs of different user groups.

Each user has a client-side user profile that determines an initial arrangement of the media needed for synchronization. To this end, the client requests the original DAISY-DOM from an archival storage system and rebuilds it into an adapted, personalized document. Afterwards, every user has the opportunity to undertake further unassisted personalization in the player.

The client side player application is completely embedded in a Flex [FLEX08] based Flash environment in order to support heterogeneous systems through proper media stream playback. The actual text of the dtbook file is displayed in an HTML area. This approach allows the reuse of standard web accessibility features, so the text can also be read by a screenreader if equivalent audio files are missing. Furthermore, the display can be adapted by CSS. The major advantage of Flash is its comfortable and accessible audio and video playback, similar to the timesheet based MultiReader.

3.3 Comparison

Unlike static web pages, both approaches support optional placement and plasticity for all included media objects. Plasticity of the user interface primarily ensures resizing, but aspects of overlaying and free positioning (e.g. of subtitles on the display) are also addressed through a client/server system.

On the system side, every media object is individually selectable and streamed. Consequently, a client just selects the necessary media from the streaming server. Users will experience a better degree of controllability, also with respect to additional time for reading. The streaming approach of both [SPI08] and [EBE08] enables partial playback of large files with almost no delay; in particular, streamed Flash requires only little buffering before playback can start. In contrast, approaches that download video or audio files would cause undesired pauses.

The main difference arises from the support for screenreaders. The MultiReader-based approach is based on HTML and JavaScript. Many screenreaders support HTML sufficiently well and reliably follow techniques for accessible web pages. In contrast, Flex supports screenreaders only within the Windows operating system, by addressing MSAA. Table 1 summarizes this comparison.

Table 1: Comparison of the two client/server approaches

Client/Server                   MultiReader                   Enhanced DAISY
Web-based distribution          personalized by server        personalized by client
W3C standards                   XHTML incl. JavaScript        requires standard update
Readers                         blind, partially sighted,     blind, partially sighted,
                                dyslexic, deaf and hearing    dyslexic, deaf and hearing
                                impaired users                impaired users
Streamed time-dependent media   Flash                         Flash
Screenreader support            independent                   independent
Pipeline                        Cocoon                        DAISY Pipeline
Applicable operating systems    independent                   MS Windows family

4. Collaborative accessible multimedia production

Accessible multimedia document production in either DAISY or MultiReader format demands highly skilled experts in accessible document processing but can be assisted by automatic processing technologies.

4.1 Textual source generation

The usual book author mainly produces text, including several images, for print or electronic publication. However, a generated source document should always be the starting point for a fully accessible multimedia publication, no matter whether textual content, audio or video is used as the primary media. Our considerations concentrate on textual content as the primary media.

Structured information is the first big step towards high-quality accessible information. A document whose internal structure can be defined and its elements isolated and classified, without losing sight of the overall structure of the document, is a document that can be navigated [DPA08]. Formats with the potential to create such structured content should ideally be based on XML.

So how can an author without any particular knowledge in the field of accessibility reach these objectives?

Many office applications such as Microsoft Office 2007 or OpenOffice internally use XML based formats. The author’s responsibility lies only in marking up content properly by applying adequate style templates. For example, if an author uses headings instead of big, bold fonts, the authoring tool can perform semantic reasoning for internal document structuring. Of course, styling is not the only task: additional tasks such as alternative image descriptions allow graphical content to be verbalised in the words of the actual author. Accessible textual content production in specific authoring environments has already been discussed in a variety of publications, so authors will always get help in their particular environment through mutually agreed guidelines.

Our interest lies more in the resulting ‘digital born’ XML document and its opportunities for reuse in a library-for-all and its document processing.
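The two author-side tasks named above, proper heading structure and alternative image descriptions, lend themselves to a simple automatic check on the resulting XML document. The following sketch uses Python's standard xml.etree.ElementTree module on a small XHTML-like sample; the sample document and the wording of the findings are assumptions, and the check is far less thorough than a real accessibility validator.

```python
import xml.etree.ElementTree as ET

# Illustrative source document with two typical problems.
SAMPLE = """<body>
  <h1>Title</h1>
  <p>Intro</p>
  <h3>Skipped level</h3>
  <img src="chart.png"/>
  <img src="photo.png" alt="Library reading room"/>
</body>"""


def check_structure(xml_text):
    """Return a list of findings for a structured source document."""
    findings = []
    last_level = 0
    # iter() walks all elements in document order.
    for elem in ET.fromstring(xml_text).iter():
        tag = elem.tag
        if len(tag) == 2 and tag[0] == "h" and tag[1].isdigit():
            level = int(tag[1])
            # A heading may go at most one level deeper than its predecessor.
            if level > last_level + 1:
                findings.append(
                    f"heading level jumps from {last_level} to {level}")
            last_level = level
        elif tag == "img" and not elem.get("alt"):
            findings.append(
                f"image {elem.get('src')} has no alternative description")
    return findings
```

Applied to SAMPLE, the check reports the jump from heading level 1 to level 3 and the missing description of chart.png, while the described photo passes.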

4.2 Automatic vs. manual accessible document preparation

The preparation of generated source documents should not be part of the author's work in a specific authoring tool. To achieve results of identically high quality, authors, or publishers as their representatives, have to ingest the XML based source documents for common processing in an adapted environment. This environment is our archival information system. Different import filters allow consistent transformation of source documents into specific resources for adequate archiving in a simple and economic manner. For instance, a qualified OpenDocument-to-PDF filter ensures that all document structures and tagging are carried over, which most solutions on the market cannot offer.

What about filters for the generation of accessible multimedia documents? Here, too, the first step is a transformation of the produced source documents into corresponding mark-up-based textual content files and related exports of multimedia items. Of course, in the context considered here, multimedia primarily refers to graphical contents.

In the past, DTBs only contained human narrated voice and no related textual content. Today, textual content is automatically convertible into synthetic voice by text-to-speech solutions. Synthetic speech is accepted by users in some domains, for example for the timely production of a TV guide or for items that are not in high demand among users with disabilities.

The DAISY Pipeline offers a variety of filters and additional validation components to safeguard high-quality transformation. Consequently, generated source documents are transcribed into DAISY master documents, which may need to be transformed into a specific delivery format (e.g. electronic or printed Braille, E-text, DAISY Text-Only DTB) [DAPI08].

For audio production, a narrator component realises a text-to-speech transformation and the corresponding mark-up in a DTB for the synchronisation of textual content and audio.
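The synchronisation of text and synthetic audio can be reduced to simple bookkeeping if the text-to-speech step reports how long each sentence takes to speak. The sketch below pairs sentence identifiers with cumulative clip positions; it does not use the DAISY Pipeline, and the function name, the sentence identifiers and the assumption that per-sentence durations are available are all hypothetical.

```python
def build_sync_map(sentence_ids, durations):
    """Pair each sentence id with its (clip_begin, clip_end) in seconds.

    Assumes all sentences are rendered back to back into one audio file.
    """
    if len(sentence_ids) != len(durations):
        raise ValueError("one duration per sentence is required")
    sync = []
    position = 0.0
    for sid, dur in zip(sentence_ids, durations):
        sync.append((sid, position, position + dur))
        position += dur  # the next clip begins where this one ends
    return sync
```

For three sentences lasting 2.0, 3.5 and 1.5 seconds, the clips span 0.0 to 2.0, 2.0 to 5.5 and 5.5 to 7.0 seconds. With human narration no such durations are known in advance, which is why that case requires manual work.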
Furthermore, the DAISY Pipeline converts DTBs between different DAISY standards. This approach can support long-term preservation in archival information systems.

Combining text based sources with human narrated voice is much more difficult. Studio recording relies on highly skilled narrators, and synchronisation is then mostly done by hand. As a promising development, speech-to-text transformation could bypass the laborious synchronisation work and be a step towards almost automatic production. Currently, no application provides such functionality for accessible multimedia production.

The most difficult and probably most expensive part of accessible multimedia production is the generation of sign language videos. Deaf people do not yet accept artificial sign language performed by avatars. The recording effort is similar to that for human narration of text, but synchronisation is much more expensive because recognition and transformation tools are missing. The only applicable approach with sustainable costs is the use of a lexicon to describe written words in sign language.

We also want to mention the issue of subtitling videos or audio as primary media. In the library of tomorrow, video and audio productions will be substantial assets. The actual spoken text can likewise be extracted by speech recognition approaches. But separating single audio tracks without the creator's support is almost impossible. Furthermore, blind and hearing impaired users need additional descriptions, which definitely have to be added by highly skilled accessibility staff.

4.3 Ingest strategies

It is obvious that fully accessible multimedia production currently still involves a lot of manual and time-intensive work. Additionally, we have illustrated the need for a large number of contributors in the production of resources (or parts of resources). Therefore, distributed specialists have to be provided with a single access point for common authoring. The MultiReader project introduced the MultiWriter as a web based authoring tool for progressive and collaborative document generation [SZC04]. Unfortunately, this approach does not start with document production by ordinary authors who pay little attention to accessibility.

A more advanced approach is an iterative ingest of items produced in a distributed manner, in combination with filter techniques. Consequently, automatic resource production can take place on generated source documents as well as in additional processing steps. Accessibility experts just extend the resource with additionally produced media items and ensure their synchronisation.

Similar to MultiWriter, an archival information system should achieve distributed and iterative ingest processes by providing a web based interface that supports collaborative work by contributors.

Another innovative approach to increasing document quality is contribution by the actual readers. Web 2.0 experience shows the success brought about when end users change roles and become authors. Probably the most famous example is Wikipedia. It is conceivable that disabled users know best what fits the requirements of others with the same disability. If those users are able to supply further information, give them a chance to do so!

4.4 Metadata enrichment

Digital libraries already cover a variety of metadata in the fields of descriptive, structural, administrative, and long-term preservation metadata. Currently, metadata enrichment is provided almost exclusively by librarians. Only little metadata is attached by authors or publishers during the ingest phase.

In the context of accessible multimedia production and dissemination, digital libraries also face new challenges in distributed metadata enrichment, on the level of single items as well as on the level of item dependencies.

But what particular metadata arises when archiving accessible multimedia resources? First of all, it is necessary to group resources into two possible categories: primary resources and equivalent alternative resources. The primary resource is the initial or default resource. An equivalent alternative resource provides equivalent semantic and behavioural functionality [ACCMD04]. An equivalent alternative resource can cover the whole primary resource or only parts of it.

Primary resources have to declare global access modalities (sight, sound, and touch, with an additional special content property of 'text' to denote the need for text literacy [ACCMD04]) and a local modality of use for the included sub-items. Additionally, a primary resource needs metadata about its adaptability regarding display transformability and control flexibility. Furthermore, this metadata must cover information about the existence of equivalent alternative resources.

Equivalent alternative resources can supplement (e.g. captions for a video) or substitute (e.g. a DTB substitutes a PDF) a primary resource. The corresponding metadata declares the nature of the resource equivalence. It refers to the actual primary media and specifies the kind of alternative equivalent and its modality of use. For example, video equivalents are captions (visual) and audio descriptions (auditory).

Consequently, an archival information system is able to match equivalent alternatives to the needs or preferences of a user. Needs and preferences can be set by users before searching. Based on these settings, the system pre-assembles accessible items into an accessible multimedia document. If users do not specify preferred modalities, they get a list of all available media sources and choose the best fit.
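The matching of equivalent alternatives to user preferences described above can be sketched with a few lines of code. The record layout below is a simplification made for illustration and does not reproduce the IMS AccessForAll schema [ACCMD04]; the field names, resource identifiers and modality labels are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alternative:
    resource_id: str  # identifier of the equivalent alternative resource
    primary_id: str   # identifier of the primary resource it relates to
    modality: str     # e.g. "visual", "auditory", "tactile"
    relation: str     # "supplement" or "substitute"


def match(alternatives, primary_id, preferred_modalities=None):
    """Select the alternatives of a primary resource for a user.

    Without stated preferences, all available alternatives are returned
    so that the user can choose the best fit.
    """
    candidates = [a for a in alternatives if a.primary_id == primary_id]
    if not preferred_modalities:
        return candidates
    return [a for a in candidates if a.modality in preferred_modalities]


# Illustrative catalogue: one video with captions and an audio description.
CATALOGUE = [
    Alternative("cap-1", "video-1", "visual", "supplement"),
    Alternative("ad-1", "video-1", "auditory", "supplement"),
    Alternative("dtb-7", "pdf-7", "auditory", "substitute"),
]
```

A user who prefers the visual modality receives only the captions for video-1, whereas a user without stated preferences receives both the captions and the audio description.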
The interesting issue in distributed metadata enrichment is: Who specifies which metadata at what time?

1. Authors, or publishers as their representatives, have to declare the main descriptive metadata of generated textual source documents and additionally, for example, a verbalization of included graphical contents as the simplest alternative.

2. Afterwards, a librarian does the usual job of adding conventional metadata to the ingested resources. With respect to accessibility, librarians have to specify missing access modalities on resource and sub-item level and pay special attention to structural metadata for primary resources.

3. For accessible resource publishing, the contribution of accessibility experts as resource producers to metadata enrichment is indispensable. They have to declare the alternative access modalities and the relation to a primary resource. Furthermore, it is important to specify information about resource retrieval. For example, a tactile graphic is not available online.

The subsequent steps require collaborative work between librarians and accessibility experts to meet the needs of long-term archiving and the most suitable resource distribution. In particular, enhanced structural and administrative metadata optimize the corresponding workflows.

5. Conclusions

Digital libraries will be able to deliver personalized books if authors, narrators and transcribers work together with librarians. Such a workflow requires a distributed and asynchronous approach while preserving the author’s intention through quality assurance methods and tools. Readers will benefit from the improved readability of these books, but authors may find it difficult to write ‘for’ their readers. Both mainstream users and print-disabled people enjoy more comfortable document handling.

We are not proposing a new Kindle, but focus on the plasticity of the reading experience in order to ensure accessibility. We have described reading programs for similar types of rich multimedia documents. Key to their accessibility is the use of industry formats such as XHTML and Flash. Both preserve the user’s identity by storing a reader profile locally with the reading program. The main difference arises from the support of existing quality assurance tools, such as manual and automatic tools for checking web accessibility. Future work will have to show how a digital library benefits from such tools in order to address more readers.

6. References

[ACCMD04] IMS AccessForAll Meta-data (http://www.imsglobal.org/accessibility/accmdv1p0/imsaccmd_oviewv1p0.html). July 2004.

[DPA08] Document Processing for Accessibility. CEN Workshop Agreement, CWA 15778, February 2008.

[DAPI08] DAISY Pipeline (http://www.daisy.org/projects/pipeline/). Retrieved Sept 8, 2008.

[DAYNL08] DAISY Planet Newsletter August 2008 (http://www.daisy.org/news/newsletters/planet-2008-08.shtml). Retrieved Sept 8, 2008.

[DTB05] Specifications for the Digital Talking Book. ANSI/NISO Z39.86-2005, April 2005.

[FLEX08] http://www.adobe.com/products/flex/. Retrieved Sept 8, 2008.

[EDF08] http://www.edf-feph.org/Page_Generale.asp?DocID=12534. Retrieved Sept 8, 2008.

[EBE08] Eberius, W.: Multimodale Erweiterung und Distribution von Digital Talking Books. Diploma Thesis, Dept. Computer Science, TU Dresden, 2008.

[OAIS02] Consultative Committee for Space Data Systems: Reference Model for an Open Archival Information System (OAIS). CCSDS 650.0-B-1 Blue Book, January 2002.

[PET05] Petrie, H.; Weber, G.; Fisher, W.: Personalisation, interaction and navigation in rich multimedia documents for print-disabled users. IBM Systems Journal, 44 (3), 2005, 629-636.

[SPI08] Spindler, M.: Verteilte barrierearme multimediale Dokumente. Diploma Thesis, Dept. Computer Science, TU Dresden, 2008.

[STS08] SMIL Timesheets 1.0, W3C Working Draft 10 (http://www.w3.org/TR/timesheets/). January 2008.

[SZC04] Szczepaniak, A.: Authoring System for a XML-based Multimedia eBook. Master Thesis, Multimedia Campus Kiel, 2004.

[WTT08] Timed Text (TT) Authoring Format 1.0 – Distribution Format Exchange Profile (DFXP) (http://www.w3.org/TR/2006/CR-ttaf1-dfxp-20061116/). November 2006.
