Digital libraries: an overview

9
Digital Libraries: An Overview by Candy Schwartz Digital libraries are complex systems that stretch institutional resources and capabilities, but also offer unparalleled opportunities for new and improved user services. An overview of the basic components of a digital library looks at the challenges and the potentials. Candy Schwartz is Professor at the Graduate School of Library and Information Science, Simmons College, 300 The Fenway, Boston, Massachusetts 02115-5898 ,[email protected].. I t is an understatement to say that the phrase “digital library” (DL) means different things to different people. As part of a class project, students in a digital library course recently found 64 different formal and informal definitions of “digital library.” 1 These range from what one could call the “strict” definitions underlying much of the federally funded DL research, such as: Digital libraries are organizations that pro- vide the resources, including the specialized staff, to select, structure, offer intellectual access to, interpret, distribute, preserve the integrity of, and ensure the persistence over time of collections of digital works so that they are readily and economically available for use by a defined community or set of communities. 2 to much looser characterizations, as seen in: A digital library is a distributed electronic collection that covers virtually all fields of human endeavor including art, music, medi- cine, science, movies, videos, books, product literature, newspapers, brochures, and cata- logs. 3 An analysis of the multitude of defini- tions, as well as mission statements and project proposals, indicates that digital li- braries are seen as resources that ideally: Serve a defined community or set of communities; May not be a single entity; Are underpinned by a unified and log- ical organizational structure; Incorporate learning as well as access; Make the most of human (“librarian”) as well as technological resources; Provide fast and efficient accessing, with multiple access modes; Provide free access (perhaps just to the specified community); Own and control their resources (some of which may be purchased); and Have collections that: • Are large, and persist over time, • Are well organized and managed, • Contain many formats, • Contain objects, not just representa- tions, • Contain objects that may be other- wise unobtainable, and • Contain some objects that are digital ab origine. These elements, especially as related to ownership and control of collections of digital objects, position a digital library as a separated set of resources and activities within (or perhaps not associated with) a traditional library. However, in an envi- ronment of e-books, e-journals, wired communities, and remote services, tradi- tional libraries conduct their business in a sometimes uneasy space lying between the tangible and the virtual. The concept of the “hybrid library,” which seems to have originated in Europe, reflects the realities being faced by libraries as they attempt to integrate externally purchased electronic monographs, serials, and ac- cess services with digital collections pro- duced in-house. Pinfield, Eaton, Edwards, Russell, Wissenburg, and Wynne feel that the hybrid library is more than a tempo- rary state of affairs: The hybrid library is on the continuum be- tween the conventional and digital library, where electronic and paper-based informa- tion sources are used alongside each other. The challenge associated with the manage- ment of the hybrid library is to encourage end-user resource discovery and information use, in a variety of formats and from a num- ber of local and remote sources, in a seam- lessly integrated way. The hybrid library should be “designed to bring a range of tech- The Journal of Academic Librarianship, Volume 26, Number 6, pages 385–393 November 2000 385

Transcript of Digital libraries: an overview

Page 1: Digital libraries: an overview

Digital Libraries: An Overviewby Candy Schwartz

Digital libraries are complexsystems that stretch

institutional resources andcapabilities, but also offer

unparalleled opportunities fornew and improved user

services. An overview of thebasic components of a digital

library looks at the challengesand the potentials.

Candy Schwartz is Professor at the GraduateSchool of Library and Information Science,

Simmons College, 300 The Fenway, Boston,Massachusetts 02115-5898

,[email protected]..

I t is an understatement to say that thephrase “digital library” (DL) meansdifferent things to different people.

As part of a class project, students in adigital library course recently found 64different formal and informal definitionsof “digital library.”1 These range fromwhat one could call the “strict” definitionsunderlying much of the federally fundedDL research, such as:

Digital libraries are organizations that pro-vide the resources, including the specializedstaff, to select, structure, offer intellectualaccess to, interpret, distribute, preserve theintegrity of, and ensure the persistence overtime of collections of digital works so thatthey are readily and economically availablefor use by a defined community or set ofcommunities.2

to much looser characterizations, as seenin:

A digital library is a distributed electroniccollection that covers virtually all fields ofhuman endeavor including art, music, medi-cine, science, movies, videos, books, productliterature, newspapers, brochures, and cata-logs.3

An analysis of the multitude of defini-tions, as well as mission statements andproject proposals, indicates that digital li-braries are seen as resources that ideally:

● Serve a defined community or set ofcommunities;

● May not be a single entity;

● Are underpinned by a unified and log-ical organizational structure;

● Incorporate learning as well as access;

● Make the most of human (“librarian”)as well as technological resources;

● Provide fast and efficient accessing,with multiple access modes;

● Provide free access (perhaps just to thespecified community);

● Own and control their resources (someof which may be purchased); and

● Have collections that:

• Are large, and persist over time,

• Are well organized and managed,

• Contain many formats,

• Contain objects, not just representa-tions,

• Contain objects that may be other-wise unobtainable, and

• Contain some objects that are digitalab origine.

These elements, especially as relatedto ownership and control of collections ofdigital objects, position a digital library asa separated set of resources and activitieswithin (or perhaps not associated with) atraditional library. However, in an envi-ronment of e-books, e-journals, wiredcommunities, and remote services, tradi-tional libraries conduct their business in asometimes uneasy space lying betweenthe tangible and the virtual. The conceptof the “hybrid library,” which seems tohave originated in Europe, reflects therealities being faced by libraries as theyattempt to integrate externally purchasedelectronic monographs, serials, and ac-cess services with digital collections pro-duced in-house. Pinfield, Eaton, Edwards,Russell, Wissenburg, and Wynne feel thatthe hybrid library is more than a tempo-rary state of affairs:

The hybrid library is on the continuum be-tween the conventional and digital library,where electronic and paper-based informa-tion sources are used alongside each other.The challenge associated with the manage-ment of the hybrid library is to encourageend-user resource discovery and informationuse, in a variety of formats and from a num-ber of local and remote sources, in a seam-lessly integrated way. The hybrid libraryshould be “designed to bring a range of tech-

The Journal of Academic Librarianship, Volume 26, Number 6, pages 385–393 November 2000385

Page 2: Digital libraries: an overview

nologies from different sources together inthe context of a working library, and also tobegin to explore integrated systems and ser-vices in both the electronic and print envi-ronments.” The hybrid library should not,then, be seen as nothing more than an uneasytransitional phase between the conventionallibrary and digital library but, rather, as aworthwhile model in its own right, whichcan be usefully developed and improved.4

The hybrid library is the context withinwhich most academic digital libraries arefound—the ecosystem of the digital li-brary, as it were.

Whatever words are used, those creat-ing and maintaining digital libraries ofone kind or another are probably less con-cerned with shades of meaning than theyare with coping with the sheer amount oforganizational, technological, and humaneffort involved. Even the digitization of asmall collection requires thinking aboutfunding, conversion methods, informationarchitecture, metadata, preservation, in-terface and presentation design, evalua-tion, and integration into existing struc-tures. Decisions, drawing on theknowledge and expertise of many, need tobe made from the general policy level(e.g., deciding which segments of thecommunity will have access to the digitallibrary) down to such fundamental thingsas file type (e.g., choosing the format inwhich sound files will be stored and de-livered to users). Other authors in thisissue look at projects and institutions thathave managed to navigate successfullythrough these complexities, or look athow DLs might evolve. The purpose ofthis article is to provide context throughan overview of the components of digitallibrary work, and also to point to re-sources for further exploration. This is thetip of the iceberg.

SETTING THE WHEELS IN MOTION

Many libraries and other institutionsmaintain collections of purchased or con-verted digitized information. However,the notion of separate entities called “dig-ital libraries,” and the now substantialbody of research investigating them,emerged primarily because of concertedefforts in the mid- to late 1990s by na-tional government agencies or large-scalecollaboratives. Examples include:

● The federally-funded Digital LibraryInitiatives (DLI) administered by theNational Science Foundation (NSF) inthe United States;

● The Electronic Libraries Program (e-

Lib) of the United Kingdom’s JointInformation Systems Committee(JISC);

● The joint JISC-NSF International Dig-ital Libraries Research Program;

● The Digital Library Federation (DLF)in the United States;

● Australia’s Distributed Systems Tech-nology Center (DSTC);

● The Canadian Initiative on Digital Li-braries (CIDL);

● The Nordic Council for Scientific In-formation (NORDINFO); and

● The DELOS Working Group of theEuropean Research Consortium for In-formatics and Mathematics (ERCIM).

White papers, conference reports, jour-nal articles, research reports, and a greatdeal of experience sharing continue toresult from these activities as well as fromthe experiences gained from many indi-vidual institutionally supported endeav-ors. Consequently, our collective under-standing of what makes digital librarieswork well (or poorly) has grown, as havethe number of digital libraries, and theinventory of useful technological tools.Digital libraries of one kind or another arenow deployed by (individually and in col-laborative arrangements with each other)academic institutions (within and outsideof academic libraries), public libraries, lo-cal and national government agencies, in-ternational organizations, national librar-ies, corporations, publishers, andprofessional associations. They rangefrom specialized collections focusing onone file type in one subject area, to com-plex aggregations of multiple formatssupporting interdisciplinary research.

Why Digital Libraries?

The focus of the Digital Library Initia-tive is “to dramatically advance the meansto collect, store, and organize informationin digital forms, and make it available forsearching, retrieval, and processing viacommunication networks—all in user-friendly ways.”5 Why have so many li-braries and other institutions, with orwithout federal funding, turned theirhands to creating digital libraries? Someof the reasons have to do with advances ininformation technology—increasinglyboth scholarly and recreational materialsare being made available in electronic for-mats, either in addition to or instead ofprint. The costs of creating, storing, andtransmitting digital information have de-

creased, and the technology to supportdistribution and access is widespread.Rising acquisition and subscription fees(not to mention shelving and processingcosts) have forced libraries to seek otherways to make information available, andcontent aggregators and e-book publish-ers are providing the means.

Perhaps more importantly, digital li-braries support service improvement. In-formation search and navigation acrosselectronic information resources is faster,with enriched points of access, and alter-native methods for browsing and explora-tion. The resources themselves can besegmented, rearranged, annotated, and en-hanced in ways not possible before, andcan be directly integrated with desktopproductivity tools for local analysis andprocessing. A digital environment enablescross-community interactivity and collab-oration, regardless of physical location.Also, digitization presents opportunitiesfor long-term preservation of bodies ofknowledge, if not of the original carriersof that knowledge.

UNDERNEATH THE HOOD

The planning, creation, and execution of adigital library draws on the expertise ofmany, and requires making difficultchoices at almost every stage. Most DLprojects, although they may have officialdirectors, are underpinned by teams thatbring together (or draw on as needed)people with capabilities in collection de-velopment, cataloging and representation,networking, software engineering, preser-vation, imaging, human computer interac-tion, copyright law, project management,and other related areas. Options, alterna-tives, and questions lurk around everycorner of the path to the realization of adigital library.

The Resources

Library collections, digital or other-wise, are formed by decisions about whatto acquire, from which sources, in whichformat(s), and how to preserve materialsonce acquired. In digital libraries, thesedecisions are complicated by a number offactors.

Digital objects can be text, image,sound, moving image, multimedia, data-sets, software, three-dimensional files,and possibly other as yet unknown types.Even within one type, such as text, thereis a plethora of formats (e.g., plain text,rich text, Word, Postscript, PDF, andmarked up in HTML or another SGML orXML document type). Additionally, there

386 The Journal of Academic Librarianship

Page 3: Digital libraries: an overview

are questions as to what constitutes anobject—for example, is a digital book oneobject, or is each chapter, illustration, ta-ble, table of contents, title page, and indexa separate digital object? If individualcomponents of an intellectual documentare disaggregated, then can they be re-aggregated with components of other in-tellectual documents? Answering ques-tions of this nature depends on anunderstanding of the eventual use of thematerial, and requires complex objectmanagement capabilities, as well as con-siderations of intellectual property rights.

Selection and acquisition of materialsfor library collections is a thoughtful pro-cess, involving a balance between meet-ing user needs and observing fiscal con-straints. In digital libraries, the costs ofacquisition may include not only purchaseor subscription, but also conversion todigital format. Considerations of what tocollect may, therefore, be influenced bywhether an item is already owned by theinstitution, or in the public domain, hasalready been converted, or is digital tobegin with. Conversion of print text andimage materials (leaving aside sound, orvideo) is costly and time-consuming, in-volving document preparation, scanning,inspection for quality control, probablysome degree of optical character recogni-tion, more inspection, editing, and index-ing—not to mention storage, transmis-sion, and display of the object onceconverted.

Preservation and archiving is one ofthe thornier problems for libraries withelectronic content, whether acquiredthrough subscription services, purchase,or conversion. There are serious concernswith the archiving and access policies (orlack thereof) of journal publishers andcontent aggregators. However, in the con-text of an institution’s created digital con-tent, preservation is up to the institution.Apart from questions about which for-mats and versions should be archived,collection managers need to worry aboutmedia deterioration over time, and tech-nological obsolescence as software andhardware platforms change. Relying onstandards to be persistent is risky, as isattempting to preserve hardware and soft-ware specific to points in time. Migratingfiles as platforms change is generally rec-ommended, and requires diligent planningand good record keeping, but CliffordLynch suggests that this too has its risks,and he supports an algorithmic approach.6

A Place for Everything, And. . .

To be of use in a collection, digitalobjects must be more than the collectionof bits that result from creation or conver-sion. They need to be identified, de-scribed, stored, and disseminated. Theymay be rendered in different formats fordifferent circumstances. Stored in reposi-tories, each digital object must have itsown unique identification, or name. Thename is used to retrieve the item, associ-ate metadata with it, associate it withother objects, and act as a reference incitation. The name should persist overtime and over migration from one physi-cal location to another, and should resolvequickly. The Corporation for National Re-search Initiatives (CNRI) “handle” sys-tem7 is one of the best known and widelyused naming and object management sys-tems.

Many types of metadata, serving dif-ferent purposes, might be associated witha digital object (or subset of objects) in arepository:

● Descriptive metadata include content,or “bibliographic,” description (in-cluding subject access) and possiblycontent ratings or evaluation informa-tion;

● Administrative metadata relate toterms and conditions of use, prove-nance (e.g., original source, date ofcreation, and version), processing(e.g., file format or internal structure),transactions (e.g., date of deposit orrequest), and object interrelationships;and

● Structural metadata concern informa-tion that makes the object usable, forexample, to enable page turning in anelectronic book.

Work on metadata in specific communi-ties may suggest more, depending on thenature of the object.8 Some metadata canbe created automatically, but much re-quires human labor.

The applications software underlying adigital library can be completely custom-ized, purchased from one vendor, or puttogether from existing products (prefera-bly conforming to accepted standards).Most digital libraries take the last route,which then entails some work in systemintegration. Underlying any digital libraryis, of course, a collection of servers, net-work technologies, storage devices, andattendant software tools and staff dedi-cated to distributed information manage-

ment. Making many objects available to,for instance, a large campus community,requires strong capacious networks andsizeable active storage and delivery capa-bilities. Deployment of streaming media,familiar to most of us through Real Audioand Real Video, will make even moredemands on capacity and expertise. Addto this the need for security and authenti-cation management (some of which islikely to be already in place), and distrib-uted transaction processing, and it be-comes clear that the impact of digital li-brary activity on institutional computingresources is far from trivial.

AT THE WHEEL —WELCOME TO THE

L IBRARY

A digital library is more than the re-sources and the aggregation of hardwareand software that manages them. It is alsoa collection of services, developed (onehopes) with users in mind, although thoseusers may be hard to define. What userswill remember about a library will be theservices it provided, and the meansthrough which those services were avail-able, whether those means were librarystaff, self-guided instruction, architecturaland interior design elements, or a ma-chine-driven interface.

Services

Digital libraries can potentially supporta range of traditional and not-so-tradi-tional library services. Although ad-vanced DL research projects exploit thepossibilities of a digital-only world, work-ing digital libraries for the most part sup-port functions that closely resemble theirbrick-and-mortar library counterparts, inslightly different guises.

Information Seeking and RetrievalAll digital libraries are searchable,

whether through direct query entry orthrough some form of browsing, or both.LC’s American Memory project, which isone of the more well-developed and com-plex digital libraries, offers direct searchthrough the entire collection, or throughselected collections, with the same kindsof options (match any words, all words,this exact phrase) as Internet search en-gines. Users may also browse throughcollections, and within a collection canbrowse subject and author indexes.

Reference Query FulfillmentVery few digital libraries are suffi-

ciently advanced to have incorporated in-teractive reference services. Most offer ane-mail contact for queries, and one or

November 2000 387

Page 4: Digital libraries: an overview

more frequently asked question (FAQ)files, which typically focus on either de-scriptions of the technology or informa-tion about the resources. For example,digitized collections at Duke Universityare listed on the home page of the RareBook, Manuscript, and Special Collec-tions Library, along with a link to a pagethat provides e-mail and telephone con-tact to reference services for those doingresearch in the collections, whether printor digital.

User TrainingUser training (i.e., bibliographic in-

struction) can take many forms—guides,tutorials, manuals, pathfinders, work-shops, videos, or one-on-one guidance.Few digital libraries have well developeduser training as yet. In addition to FAQs,many digital libraries carry guides to us-ing the collections. As a typical example,the Lester S. Levy Collection of SheetMusic at Johns Hopkins University in-cludes several useful pages on searchingand browsing. The William Blake Ar-chive goes further than most in providinga very well-developed virtual tour, exten-sive help documentation, a history of theproject, an FAQ, a bibliography, and in-formation on editorial principles, the edi-tors, and the technology.

Current AwarenessThe closest most digital libraries get to

current awareness is a “what’s new” page,or a regular newsletter available at theWeb site. This is hardly a personalizedcurrent awareness service. Some (e.g.,CalFlora hosted at the University of Cal-ifornia) will subscribe users to an e-mailannouncement service. In the future, dig-ital library services will include the devel-opment of personalized informationspaces that will customize the user expe-rience based on knowledge about individ-uals, their past behaviors and preferences,and their connecting technologies. Thiskind of interaction is on the researchagenda of DLI projects at institutionssuch as Cornell University.9

Learning Facilitation andCollaborative ActivitiesPublic libraries traditionally work

closely with local schools, and school andacademic libraries are directly involved inthe curricula of their parent instructions.Digital libraries as well are concernedwith helping educators make the most ofelectronic resources and services. TheUniversity of Michigan Digital LibraryTeaching and Learning Project has re-

sulted so far in online learning resourcesin a number of different subject areas forhigh school and middle school students.The Perseus Project, at Tufts University(and one of the oldest digital libraries)devotes a section of the site to “Teachingwith Perseus,” including syllabi and classnotes, comments from instructors, andteacher and student guides. Similarly, theColorado Digitization Project includes asection on “Working with Schools,” withlesson plans and links to additional toolsand resources. Classes using these re-sources could meet entirely in cyberspace,although it is likely that most of them donot. Digital libraries can, of course, sup-port collaborative research activitiesamong individuals who may be far re-moved from each other, and from thedigital library. This has been a subject ofresearch interest—for example, the Uni-versity of Michigan Digital Library siteillustrates a visual collaborative informa-tion space in research on user interfaces—but little has surfaced in actual workingdemonstrations.

Users and DL Interaction

On the other end of these services areusers, who may be a restricted community(when access is limited to authenticatedusers), or may be Internet surfers at large.For at least the primary community, con-sideration must be given to expected ac-cess capabilities. Technology and connec-tivity at the user end will vary withrespect to speed, capacity, device, andgraphic display, and this should be takeninto account in interface design. Usersmay also face physical challenges, re-stricting their ability to use a keyboard, orrequiring the use of an assistive devicethat translates information into spokenwords.10 For the most part, digital librar-ies rely on pointing devices (e.g., mice,styli, etc.) and keyboards for interaction,and are primarily designed for display ondesktop or laptop computers, although thehuge growth of interest in wireless com-puting, and the development of standardsfor transmitting and displaying Web-ac-cessible information on handheld devices,is likely to lead to more work along thelines of Cornell University’s NomadicDigital Libraries noted earlier.

In addition to accommodating techni-cal constraints and special needs in usercommunities, a digital library should ac-commodate varying levels of expertise,and should offer flexibility in methods ofexploration. Users should be able to findobjects directly (analogous to finding a

specific author or subject in a library cat-alog), and should also be able to browsesystematically. The interface, togetherwith the underlying infrastructure, inter-prets digital objects for presentation to theuser, manages the relationships amongobjects, and may have also to negotiateterms and conditions for use of objects.

Direct SearchEnd users of library online public ac-

cess catalogs (OPACs) and commercialonline secondary services (e.g., Silver-Platter) usually can choose to search bykeyword (that normally includes wordsfrom title, subject heading, and other con-tent describing fields), or by a particularfield (e.g., title, subject, author). Under“advanced” options, such systems offerBoolean searching, truncation, and per-haps proximity functions (i.e., this wordwithin so many words of this other word).Retrieval is usually based on exact matchto the query, that is, the results containexactly what the user specified. If thesystem cannot match to a query, either theuser is told that nothing exists, or an indexof alphabetically close words may be dis-played for browsing.

On the other hand, a look at the worldof Internet search engines demonstratesalmost infinite variations on the theme of“type something and we will try to findthe best answer.” Many include advancedoptions similar to those just mentioned,but retrieval is based on primarily statis-tical algorithms that take into account theamount and distribution of query words inWeb page representations (as well asmany other factors). Query words are usu-ally stemmed (suffixes are removed, andother transformations may occur), and al-most invariably something is retrieved(although in some cases the results mayhave very little relevance).

Direct search in digital libraries is usu-ally similar to the OPAC or online search-ing model, employing search templates orindex navigation tools. This takes advan-tage of the fact that digital libraries holdwell structured data and metadata, provid-ing the basis for fielded search and brows-ing through indexes. Also, many digitallibraries contain non-text objects, whichdo not lend themselves to text-based sta-tistical retrieval algorithms (althoughsome interesting work is being done inimage pattern matching). Digital librariesfrequently employ one of the prevailingstandards for managing search—these in-clude the Z39.50, structured query lan-guage (SQL), and tools developed for

388 The Journal of Academic Librarianship

Page 5: Digital libraries: an overview

Web search engine applications (e.g.,Open Text). Experimentation with sophis-ticated statistical and linguistic algorithmsis thus far usually confined to researchsettings.

Browsing and NavigationDigital libraries can take advantage of

the flexibility of electronic presentation tooffer far more in the way of browsingthan traditional library catalogs. The samecollection can be navigated under differ-ent views, depending on the nature of theinformation. The following examples il-lustrate possible approaches.

● New York Public Library (NYPL).The NYPL’s Digital Library Collec-tions include collections of stereo-scopes, prints and photographs, text,and archival finding aids. Each ofthese uses different browsing methods.In addition to direct search, the stereo-scopic collections organize images forbrowsing by theme (e.g., “Niagara inwinter”). After choosing a theme, theinitial screen describes the collectionand suggests related subject headings,and the “tour” shows nine thumbnailsto a screen, with options to view thefront, back, full color view, and cata-log record. Digital Schomburg visualand text resources (about AfricanAmericans in the 19th century) includedirect search as well as browse by title,author, and literary form for text ma-terials, and by subject category orgenre for images. Archival findingaids are presented using DynaText/DynaWeb (now owned by Enigma,Inc.), which uses the underlying EAD(Encoded Archival Description)markup to generate expandable tablesof contents for browsing.

● Alexandria Digital Library Project.The Alexandria Digital Library (ADL)at the University of California SantaBarbara was developed with DLIfunding from 1994 through 1999 (fur-ther NSF funding is now supportingthe extended Alexandria Digital EarthPrototype—ADEPT). The ADL fo-cuses on geospatially referenced data,that is, all objects in the library areassociated with one or more geo-graphic locations. Library objectsinclude data sets, gazetteers, bibliog-raphies, reference tools, aerial photo-graphs, Landsat images, and manyother types. Navigation exploits theunderlying geospatial metadata—theuser specifies location through a dy-

namic map browser, and then cansearch within location by collection,data type, subject, and a variety ofdifferent parameters depending on thecollection. Results are displayed intext descriptions as well as being footprinted on a map, which can be used tozoom in to a fine level. The ADL isalso accessible by map browsingthrough the California Digital Library.

● Berkeley Digital Library Project. Likethe ADL, the University of Californiaat Berkeley was one of the projectsfunded under the first Digital LibrariesInitiatives. The collections at Berkeleyare diverse, including botanical, zoo-logical, and geographical data andphotographs, more than 300,000 pagesof environmental documents, and tensof thousands of images, mostly of Cal-ifornia. The entire library is outlined ina “Quick Access to the Collections”matrix. All Berkeley-held informationcan be searched by keyword through asingle interface based on Berkeley’sCheshire II search engine and Javascripting. Each result is listed at thebottom of an automatically generatedcollection hierarchy, showing its loca-tion, and all levels of the hierarchy areactive links. In addition, collectionshave their own search and navigationtools, including simple search tem-plates, dynamic maps, browsable cat-egories, scannable alphabetical in-dexes for almost all searchable fields,and even a foray into image patternmatching based on “Blobworld” re-search. The TileBar interface usedwith the California environmentaldocument collection is particularly in-novative—it presents a visual methodof assessing which part of a long doc-ument is relevant to a query.

MEANWHILE , BACK IN THE FACTORY

Apart from the underlying architecture,and the development of services for users,managers of digital libraries have to copewith practical matters such as ongoingfunding and maintenance, proper manage-ment of intellectual property rights, andassessing the effectiveness of digital ser-vices with a view to continuation, expan-sion, and improvement.

Economic Support

Most large-scale digital libraries gottheir start with a combination of grantfunding and institutional support. Grants,of course, terminate, and although funded

research is usually completed by the timethis happens, it is not quite so easy toclose the doors of a funded digital libraryonce the money runs out. Like most li-braries, digital libraries grow, and even inthe case where digitization of a discretecollection may have been completed, re-sources will still be needed for ongoingmaintenance and access provision.Wealthier institutions might be able tosupport such activities in the normalcourse of business, and in some casesadditional grant monies can be obtained.For many others, cooperation is a com-mon avenue—being virtual rather thantied to a specific brick and mortar loca-tion, a digital library can be deployed by aconsortium of institutions sharing thecosts of equipment, resource acquisition,and personnel. Cooperative ventures havethe additional advantage of broadeningthe base of owned digital resources.

Other possibilities include revenuesfrom advertising (not completely unheardof on academic Web sites, although thiscan be contentious), support in the formof money or technology from informationindustry partnerships, and user charges.This last is not an attractive option forlibraries of any kind, and would requirerobust and secure transaction manage-ment. On the other hand, this type ofinfrastructure is likely to already be inplace for rights management purposes,and will become increasingly necessaryas libraries provide access to e-booksowned by publishers.

Maintenance

Maintenance of a digital library in-volves both the equipment and the collec-tion. As with any computer-based service,hardware and software upgrades andmodifications need to be carefullyplanned. In the case where that infrastruc-ture is managing one or more repositoriesof digital objects, attention must be givento the effect of any change on object stor-age and access. The problems of, andapproaches to, obsolescence managementwere mentioned earlier.

Collection management, the set of ac-tivities which is intended to ensure that alibrary’s internally held and externallyprovided resources meet the needs of itsusers, leads to weeding and acquisition.Dealing with withdrawn or newly ac-quired materials in a digital library ismore than a matter of modifying cata-loguing records and adjusting shelf space.Adding and deleting digital objects re-quires control of all versions of the object,

November 2000 389

Page 6: Digital libraries: an overview

and adjustment of pointers to objects, andis complicated by inter-object relation-ships.

Rights Management

Intellectual property rights pertain totext, images, data, and other media indigital libraries as much as in any othersetting. Managing property rights neces-sitates coping with licensing agreements,payment systems, secure transactions,user authentication, and usage tracking. Insome cases it may be difficult to deter-mine intellectual ownership—historicallyinteresting photographs found in a little-used filing cabinet may lack any identify-ing notes, but they are still someone’sproperty. Even when ownership is known,and permission is granted for use, thatpermission may be restricted (for exam-ple, only so many accesses at one time).Further complications arise when an in-tellectual object is decomposed into mul-tiple physical objects (e.g., the chapters ofa book, each as a separate object), and canbe reaggregated with other objects withtheir own permissions. It is also possiblethat rights might be granted for membersof a restricted community to use an ob-ject, but the object could be part of adigital library that is a cooperative ven-ture among a number of different commu-nities.

Clearly rights management is a chal-lenge, and our understanding of the legal,administrative, and technological aspectsof intellectual property and copyright indigital environments is still evolving. Li-brary and information science organiza-tions, including the American Library As-sociation (ALA), the Association ofResearch Libraries (ARL), and the Coali-tion for Networked Information (CNI),have been active in influencing legislationand policy development, as well as inhelping the library community at large tounderstand both national and internationalcopyright law.11

Evaluation

“If you build it, they will come.” Well,probably, but once they arrive, will theyfind what they need, easily, and will theycome back? And if not, why not? Theseare the kinds of questions that can beaddressed by systematic evaluation. Thepurposes of digital library evaluationcould be, among other thing:

● To make a case for increases in (or atthe least continuation of) financing,personnel, or equipment;

● To measure achievements againstgoals (assuming goals have been enu-merated);

● To test a system component;

● To compare several competing solu-tions to a problem; and

● To determine whether and whereproblems exist, or, to assess the suc-cess of an attempt to address a prob-lem.

Measurement implies the existence ofrulers, or metrics—to say that somethingperforms better implies that it performsbetter with respect to some measurableeffect. Evaluation might be concernedwith usability, looking at the efficiency,effectiveness, and satisfaction with whichusers can achieve specified goals. Al-though measuring efficiency might be rel-atively straightforward, there is littleagreement as to how to measure effective-ness and satisfaction in library contexts.Alternatively, evaluation might be con-cerned with the relevance of retrieved in-formation—another area of ongoing re-search and few accepted standards.Digital library evaluation is further com-plicated by a plethora of data gatheringtools (e.g., surveys, interviews, focusgroups, transaction logs, and observa-tion), channels (e.g., in person, online,“snail” mail, e-mail), and settings (e.g.,data could be captured during sessions inprototype digital libraries, or in controlledlaboratories, or in real working DLs). Addto this the very nature of the setting, andthe challenges are even more pronounced,as Ann Bishop observes.

The design and evaluation of digital libraries. . . are complicated by the newness of thesystems, their ability to integrate a range offunctions that were previously designed andevaluated separately, the heterogeneity oftheir user population, the physically distrib-uted nature of usage, the ability to fragmentand rearrange previously integrated docu-ments and images, and the rapid versioningof digital objects.12

In this scenario, it is not surprising thatconferences have been devoted to digitallibrary evaluation,13 and that D-Lib hascreated a Working Group on Digital Li-brary Metrics. In connection with a work-shop sponsored by the Working Group,Paul Kantor, who has been concernedwith performance measurement and li-braries for more than 20 years, under-scored the complexity of digital libraryevaluation.

The “performance” of a digital library is acomplex relation among inputs, outputs andconstraints. Realistic evaluation of alterna-tive technologies and designs for digital li-braries will result in a wealth of data. Typi-cally, configuration A will seem better thanconfiguration B with regard to some, but notother scales or criteria. This will be truewhether the measures are measures of input(or cost, bandwidth, operating expense) oroutput (activity, timeliness, reliability, us-ability. . . ). In such situations it is not pos-sible to arrive at a single realistic measure ofvalue, which takes all of these aspects intoaccount.14

THE OPEN ROAD

One of the problems of a “library withoutwalls” is just that—the absence of physi-cal dividing lines that separate the libraryfrom the rest of the world, and that alsogive us some sense of being able to con-trol, or at least see, our collections and ourusers. Digital libraries, and all librariesthat take advantage of digital resources,have to operate in a sometimes-nebulousspace that is populated with invisibleplayers, all of whom have some stake inDL activities. Authors (i.e., creators ofintellectual property), publishers, contentaggregators, e-book retailers, hardwareand software vendors, library administra-tors, DL researchers, and users—each hasan agenda, and each is trying to makesense (and in some cases dollars) out ofan unpredictable environment with an un-precedented rate of change. Managers ofdigital libraries have to pay proper atten-tion to the rights of content creators andsellers, figure out where to acquire con-tent from competing providers, select thebest affordable technological infrastruc-ture, placate fiscally conservative direc-tors, encourage ongoing DL research,help users find their way through complexand growing collections, ensure preserva-tion of content for future generations, andcompete against other information ser-vices for user attention.15

On the other hand, it’s a very excitingtime to be a librarian, especially in aca-demic settings, where much of the digitalaction is happening. Internet 2 opens thedoors to research in using such tools asreal-time dynamic visualization to sup-port information retrieval, or multicastingfor user interaction through audio andvideo. Remote learning is on the increase,and is obviously well served by organizedand structured libraries that are equallyoblivious to physical location. Tools thatpromote interoperability, such as theZ39.50,16 as well as work on interoper-

390 The Journal of Academic Librarianship

Page 7: Digital libraries: an overview

ability of metadata,17 point to a futurewhere it will be commonplace to usemore than one library or digital library ata time (and the distinction between thetwo will be unimportant to users). So fas-ten your digital seatbelts and keep bothhands on the wheel, because it’s going tobe an interesting ride—not one to sit backand relax on, lots of curves and fallingrocks, but a spectacular view at the nextrest area (and there is no end).

APPENDIX

RESOURCES

Links to all of the Web-accessible itemslisted below (and mentioned in the article)can be found on the author’sDigital Li-braries Web page, ,http://web.sim-mons.edu/;schwartz/mydigital.html..This page also contains a long list ofinitiatives and projects.

Monographs

● Tatjana Aparac (Ed.),Digital Librar-ies: Interdisciplinary Concepts, Chal-lenges and Opportunities: Proceed-ings of the Third InternationalConference on Conceptions of Libraryand Information Science, Dubrovnik,Croatia, May 23–26, 1999(Lovke,Croatia: Benja, 1999).

● William Y. Arms, Digital Libraries(Cambridge, MA: MIT Press, 2000).Especially useful for its historic per-spective and overview of related con-cepts; not intended as a practical guidefor DL creation.

● Donald L. DeWitt (Ed.), “Going Dig-ital: Strategies for Access, Preserva-tion, and Conversion of Collections toa Digital Format,”Collection Manage-ment, 22(3/4) (1998). Also publishedas a monograph (New York: Haworth,1999).

● Michael Lesk, Practical Digital Li-braries: Books, Bytes, and Bucks(SanFrancisco, CA: Morgan Kaufmann,1997). A good overall introduction in-cluding details on technical structuresas well as chapters on issues such asevaluation, economics, and intellec-tual property rights.

● Peter Noerr, The Digital LibraryTool Kit, 2d ed. (PDF file, March2000). Available:,http://www.sun.com/products-n-solutions/edu/librar-ies/digitaltoolkit.html. (August 25,2000). A practical well-writtenguide, exploring alternatives for the

various components of DL construc-tion, and offering decision makingadvice.

● David Stern (Ed.), “Digital Libraries:Philosophies, Technical Design Con-siderations, and Example Scenarios,”Science & Technology Libraries,17(3/4) (1999). Also published as amonograph (New York: Haworth,1999).

Journals

Although many library and informa-tion science journals and listserv listshave content related to digital libraries,the following titles have a DL focus.

● Ariadne ,http://www.ariadne.ac.uk/..E-journal, noteworthy for understand-able overviews of new technologies thathave an impact on digital libraries.

● D-Lib Magazine ,http://www.dlib.org/.. Research e-journal, also con-taining editorials, news, reviews.

● Digital Library News ,http://cimic.rutgers.edu/;ieeedln/.. IEEE e-news-letter.

[email protected] ,http://www.ifla.org/II/lists/diglib.htm.. List-serv list focusing on research.

[email protected],http://sunsite.berkeley.edu/DigLibns/.. List-serv list concerned with practical as-pects.

● International Journal on Digital Li-braries,http://link.springer.de/link/service/journals/00799/index.htm..Print scholarly journal.

● LJ Digital: Digital Libraries ,http://www.ljdigital.com/articles/infotech/digitallibraries/digitallibrariesarchive.asp.. The archives of RoyTennant’s e-column.

● National Digital Library Periodic Re-ports ,http://lcweb.loc.gov/ndl/per.html.. Library of Congress e-news-letter.

● RLG DigiNews,http://www.rlg.org/preserv/diginews/.. Research Librar-ies Group e-newsletter.

Web Sites

● ARL Digital Initiatives Database,http://www.arl.org/did/.. A registryof descriptions of digital initiatives,maintained in collaboration by theUniversity of Illinois at Chicago andthe Association of Research Libraries(ARL).

● Arts and Humanities Data Service(AHDS) ,http://ahds.ac.uk/.. Al-though primarily a gateway to arts andhumanities networked resources, theAHDS also maintains an excellent col-lection of general resources, includingInformation Packs, Guides to GoodPractice, a series on Managing DigitalCollections, and other publications.

● D-Lib ,http://www.dlib.org/.. Oneof the first, and still the foremost DLWeb site, includingD-Lib Magazine,D-Lib Working Groups, other activi-ties, and an excellent ready referencecollection.

● DELOS,http://www.iei.pi.cnr.it/DE-LOS/.. The ERCIM Working Groupconcerned with digital libraries haspages on their own work as well asgeneral DL conferences, journals,Websites, and so forth.

● Digital Libraries Initiative, Phase 2,http://www.dli2.nsf.gov/.. NSF homefor information on the national DigitalLibraries Initiatives program, co-spon-sored by a number of different fundingagencies.

● Digital Libraries Resources,http://www.albany.edu/;ls973/digital.html..Long and well annotated list of re-sources in various categories, main-tained by Lorre Smith.

● Digital Library Federation ,http://www.clir.org/diglib/dlfhomepage.htm.. Under the umbrella of theCouncil on Library and InformationResources (CLIR), the Federation in-cludes the Library of Congress (LC),the National Archives and RecordsAdministration (NARA), the NewYork Public Library (NYPL), and anumber of university libraries. The siteincludes a number of presentations(slideshows and papers) on fundamen-tal concepts and issues.

● Digital Library Information and Re-sources ,http://www.canis.uiuc.edu/;bgross/dl/.. Very current and wellannotated resource list, kept up to dateby Ben Gross, doctoral student andvisiting scholar at the University ofIllinois, Urbana Champaign.

● Digital Library Information Resources,http://sunsite.berkeley.edu/Info/..Rich collection, well-annotated, fromthe Berkeley Digital Library SunSITE.

● Digital Toolbox ,http://coloradodigi-tal.coalliance.org/toolbox.html..From the Colorado Digitization

November 2000 391

Page 8: Digital libraries: an overview

Project, including general resources,as well as pages on administrative,technical, metadata, intellectual prop-erty, funding, and vendors.

● IFLA Electronic Collections and Ser-vice ,http://www.ifla.org/II/.. TheInternational Federation of LibraryAssociations and Institutions (IFLA)maintains 30 or more Web pages thatare well organized guides to extensivecollections of library-related links.Five or six of these concern DLs.

● PADI: Preserving Access to DigitalInformation ,http://www.nla.gov.au/padi/.. A National Library of Austra-lia (NLA) subject gateway to digitalpreservation resources.

Regular Conferences

● Advances in Digital Libraries (ADL),http://cimic.rutgers.edu/;adl/..Sponsored by the Technical Commit-tee on Digital Libraries (TCDL) of theIEEE Computer Society. Selected pa-pers published in theInternationalJournal of Digital Libraries.

● American Society for Information Sci-ence (ASIS) Annual Meeting,http://www.asis.org/.. Usually contains pa-pers and sessions of relevance todigital libraries. Print proceedings; ab-stracts and some papers online at theASIS Web site.

● European Conference on Digital Li-braries (ECDL) ,http://www.iei.pi.cnr.it/DELOS/DLC/dlc.htm.. Sup-ported by ERCIM through its DELOSWorking Group.

● International Conference on DigitalLibraries (DL) ,http://www.acm.org/pubs/contents/proceedings/series/dl/.. Sponsored by several Special In-te res t Groups (SIGs) o f theAssociation for Computing Machinery(ACM). Print proceedings; full textonline in the ACM Digital Library;some content at the site for each an-nual conference.

● International ACM SIGIR Conferenceon Research and Development in In-formation Retrieval,http://www.acm.org/pubs/contents/proceedings/series/sigir/.. Always has content related todigital libraries—SIGIR is the SpecialInterest Group on Information Re-trieval. Print proceedings; full text on-line in the ACM Digital Library; somecontent at the site for each annual con-ference.

NOTES AND REFERENCES

1. Candy Schwartz, “LIS 530Z, DigitalLibraries: Definitions” (January 28,2000). Available: http://web.simmons.edu/;schwartz/530-defs.html (August 25,2000).

2. Digital Library Federation, “A WorkingDefinition of Digital Library” (April 21,1999). Available: http://www.clir.org/diglib/dldefinition.htm (August 25, 2000).

3. Nabil R. Adam & Yelena Yesha, “An-nouncement and Call for Papers:Interna-tional Journal of Digital Libraries–JODL” (1996). Available: http://www.springer-ny.com/compsci/ i jodl.htm(August 25, 2000).

4. Stephen Pinfield, Jonathon Eaton, Cathe-rine Edwards, Rosemary Russell, AstridWissenburg, & Peter Wynne, “Realizingthe Hybrid Library,”D-Lib Magazine4(9)(1998). Available: http://www.dlib.org/dlib/october98/10pinfield.html. Pinfield etal. quote Chris Rusbridge, “Towards theHybrid Library,” D-Lib Magazine, 4(7)(1998). Available: http://www.dlib.org/dlib/july98/rusbridge/07rusbridge.html(August 25, 2000).

5. Digital Libraries Initiative Phase One.(1999, April 29). Available: http://www.dli2.nsf.gov/dlione/ (August 25, 2000).

6. Clifford Lynch, “Canonicalization: AFundamental Tool to Facilitate Preserva-tion and Management of Digital Informa-tion,” D-Lib Magazine5(9) (1999). Avail-able:http://www.dlib.org/dlib/september99/09lynch.html (August 25, 2000).

7. A good summary of the handles systemcan be found in William Y. Arms, “KeyConcepts in the Architecture of the DigitalLibrary,” D-Lib Magazine 1(1) (1995).Available: http://www.dlib.org/dlib/July95/07arms.html (August 25, 2000).

8. See, for instance, metadata types dis-cussed in Murtha Baca, editor,Introduc-tion to Metadata: Pathways to Digital In-formation (Los Angeles, CA: J. PaulGetty Trust, 1998).

9. Nomadic Digital Libraries, a recentproject of the Cornell Digital Library Re-search Group, is concerned in part withsynchronizing personal data and digitallibrary data to create personalized infor-mation spaces that are “responsive to anindividual’s current environment, includ-ing differing computing devices, commu-nications capabilities, andinformationneeds[italics mine].” Nomadic digital li-braries: Proposal from Cornell Universityto Intel Corporation (September 20,1999) . Ava i lab le : h t tp : / /www.cs.cornell.edu/wya/nomad/proposal.html(August 25, 2000).

10. Laura Reiner’s Web site on Adaptive &Assistive Technology in Libraries offersa glimpse at the world of devices thatmay be at the user end of a digital library,http://web.simmons.edu/;reiner/

adapt/index.html., and the World WideWeb Consortium (W3C) leads a WebAccessibility Initiative ,http://www.w3.org/WAI/. (August 25, 2000).

11. Several Web sites provide excellent col-lections of resource links on copyright inthe library context, including texts of le-gal documents as well as explanations inlay terms. A few of the best are ARL’sCopyright and Intellectual Propertypage,http://www.arl.org/info/frn/copy/copytoc.html., the Berkeley Digital Li-brary’s SunSITE onCopyright, Intellec-tual Property Rights, and LicensingIssues ,http://sunsite.berkeley.edu/Copyright/., Yale University Library’sLib l icense ,ht tp: / /www. l ibrary.yale.edu/;llicense/index.shtml., and awonderful “but what if” Copyright FAQ,http://ahds.ac.uk/bkgd/copyright-faq.html. co-produced by the Arts andHumanities Data Service (AHDS) andthe Technical Advisory Service for Im-ages (TASI). For international resources,the International Federation of LibraryAssociations (IFLA) maintainsInforma-tion Policy: Copyright and IntellectualProperty ,http://www.ifla.org/II/cpyright.htm..

12. Ann Peterson Bishop, “Working To-wards an Understanding of Digital Li-brary Use,”D-Lib Magazine, 1 (October1995). Available: http://www.dlib.org/dlib/october95/10bishop.html (August25, 2000).

13. In 1995, NSF and the Graduate School ofLibrary and Information Science, Uni-versity of Illinois at Urbana-Champaign,sponsored the 37th Allerton Institute,with the title “How We Do User-Cen-tered Design and Evaluation of DigitalLibraries: A Methodological Forum”(1995). The 1996 Allerton Institute was afollow-up on the same general theme.The programs and some papers are avail-able at: http://edfu.lis.uiuc.edu/allerton/. Atthe same institution, the 35th Clinic on Li-brary Applications of Data Processing,held in 1998, had the theme “Successesand Failures of Digital Libraries” (the pro-ceedings were published in mid-2000: Su-san Harum & Michael Twidale (Eds.),Successes and Failures of Digital Librar-ies, Urbana, IL: Graduate School of Li-brary and Information Science, Universityof Illinois at Urbana-Champaign, 2000).

14. Paul Kantor, “Dealing with ComplexEvaluations of Digital Libraries” [Ab-stract for a presentation at a Workshopon Digital Library Metrics, June 27,1998]. Available http://www.dlib.org/metrics/public/6–98-workshop/presenta-tions.html (August 25, 2000).

15. Ronald J. Heckart discusses the impor-tance of paying attention to tools devel-oped for user interaction in e-commercesettings in “Imagining the Digital Li-brary in a Commercialized Internet,”

392 The Journal of Academic Librarianship

Page 9: Digital libraries: an overview

Journal of Academic Librarianship25(1999): 274–280.

16. Distributed search across multiple data-bases using Z39.50 can be seen inprojects such as COPAC,http://www-.copac.ac.uk/. and Europagate,http://europagate.dtv.dk/., as well as commer-cial products like OCLC’s SiteSearch,http://www.ocl.org/oclc/menu/site.htm.andBookwhere2000,http://www.

bookwhere.com/.. Many digital librar-ies use Z39.50 compliant systems.

17. For example, the School of InformationManagement and Systems at UniversityCalifornia, Berkeley, supports a researchproject on Search Support for UnfamiliarMetadata Vocabularies,http://ww-w.sims.berkeley.edu/research/metadata/index.html., and directions in handlingdiffering metadata requirements are re-

ported in David Bearm, Eric Miller, God-frey Rust, Jennifer Trant, and Stuart Wei-bel, “A Common Model to SupportInteroperable Metadata: Progress Reporton Reconciling Metadata Requirementsfrom the Dublin Core and INDECS/DOICommunities,” D-Lib Magazine, 5(1)(1999). Available: http://www.dlib.org/dlib/january99/bearman/01bearman.html(August 25, 2000).

November 2000 393