Digital Libraries: An Overview
President’s Board Room, 210 Burruss, Jan. 18, 1999
Edward Fox, John Carroll, Gail McMillan, Clifford Shaffer, Robert Williges
Digital Libraries
• Definitions
• Why of global interest?
• Why of interest in universities?
• Why of interest in computing?
• NSF Digital Libraries Initiative
DLs: Definitions
• Super information systems
• Knowledge management systems with persistence, organization, and usability
• Collections of digital objects with expanded services, for distributed user communities, without limitations of space, time, physical copies
• Latest implementation of visions of Bush, Licklider, Nelson, and previous scholars
• Systems, services, institutions, enterprises, and projects of the digital library community
DLs: Why of Global Interest?
• National projects can preserve antiquities and heritage: cultural, historical, linguistic, scholarly
• Knowledge and information are essential to economic and technological growth, education
• DL - a domain for international collaboration
– wherein all can contribute and benefit
– which leverages investment in networking
– which provides useful content on Internet & WWW
– which will tie nations and peoples together more strongly and through deeper understanding
Why of Interest in Universities?
• Source of funding for research groups
• DLs can be used for research and teaching
– California Digital Library
– NSF’s Science, Mathematics, Engineering and Technology Education (SMETE) initiative to build a digital library to support undergraduates
• DLs can support outreach, dissemination
– Scholarly Publishing: ETDs, e-journals, tech reports
– Extension information, public relations, discourse, ...
• Students can find jobs in area if trained suitably
Interested Universities?
• 1994 NSF DLI had 73 proposals, funded 6
– 3 in CA: Stanford, Berkeley, UC Santa Barbara
– CMU, U. Illinois, U. Michigan
• 1998 NSF DLI2 had over 200 proposals, will fund a number of small, medium, + a few large
– large proposals from: CA, Cornell, UNC CH, ...
• Many universities have R&D teams, some with substantial internal funding (e.g., Columbia):
– NC State received funds for a staff of 11
– U. Michigan, Illinois have big teams / spinoffs (e.g., JSTOR)
• Consortia: DL Federation, MESL, ...
Why of Interest in Computing?
• Presents exciting challenges in key fields like: database management, multimedia, hypertext, information retrieval
• Efficiency requires advances in, e.g.,
– software: programs, algorithms
– hardware: storage, computers
– networking: faster, more reliable, quality multimedia
• Effectiveness requires advances in, e.g.,
– HCI (ex., visualization, DLs embedded in dist. ed.)
• Computing can help others who want DLs
DIGITAL LIBRARIES INITIATIVE
Funded through a joint initiative of:
National Science Foundation
Defense Advanced Research Projects Agency
National Aeronautics and Space Administration
Stephen M. Griffin
Division of Information and Intelligent Systems
National Science Foundation
http://dli.grainger.uiuc.edu/national.htm
National Synchronization Home Page
[Figure: Locating Digital Libraries in Computing and Communications Technology Space. Axes: Computing (flops); Digital content; Communications (bandwidth, connectivity); scale labels: less, more. Plotted: Digital Libraries technology trajectory: intellectual access to globally distributed information.]
Digital Libraries Initiative - Phase 2
Program Goals: new DL research, technologies and applications to advance the use of distributed, networked information of all types around the nation and the world
• Core Sponsors: NSF, DARPA, NLM, LoC, NASA, NEH
• ~$8-10 million/yr for 4-5 years (beginning FY98)
• sponsor a full-spectrum of activities
– fundamental research, content & collections development, domain applications, testbeds, operational environments, new resources for education and preserving America’s cultural heritage
• address topics over entire DL lifecycle
– information creation, dissemination, access, use, preservation, impact, contexts
• implement a modular, open program structure
– add new sponsors, performers, projects at any time
Goals for the Future
• Gather information and build collections (to understand the incompleteness of our knowledge)
• Create new communities (to communicate and collaborate)
• Make technology disappear (from our awareness and experience)
Prior/Current Grants
• NCSTRL
• ENVISION
• EI
• CSTC, CRIM, NRG
• NDLTD
NCSTRL: CS TECHNICAL REPORTS
• CS TR project supported by ARPA (Berkeley, CMU, Cornell, MIT, Stanford)
• WATERS project funded by NSF and led by ODU, SUNY Buffalo, UVA, VPI&SU
• Merger summer 1995 to www.NCSTRL.org (Networked CS Tech Report Library)
• Most large departments now have joined
• “Central” server: UVA, “backup”: VPI&SU
• 1998 extension to preprint service, with LANL
Larger NSF Grants
• 1991-1993 ENVISION: one of first DL projects
• 1993-1998 “Interactive Learning with a Digital Library in CS”: http://ei.cs.vt.edu (11M accesses to over 45 courses)
• 1998-2000 “Computer Science Teaching Center” by NSF and ACM Education Board: http://ei.cs.vt.edu/~cstc/
• 1998-2000 “Curriculum Resources in Interactive Multimedia”: http://ei.cs.vt.edu/~crim/
ENVISION
A User-Centered Database from the Computer Science Literature (1991-93)
• Collected bib. data, converted to SGML
• Converted typesetter data to SGML
• Scanned thousands of page images
• MARIAN search engine (also applied to the Virginia Tech library catalog) used as part of a prototype object-based DL, with tailored visualization interface (L. Nowell dissertation)
Envision Results Window
NSF Education Innovation (EI)
NSF “Interactive Learning with a Digital Library in Computer Science” (1993-98)
• 45 online courses, 100+K accesses/wk, plus: DL courseware, overall EI project pages
• Tools: SWAN (visualization), QUIZIT
• Evaluation
– traditional
– network logging and analysis
– tools for visualization
PAPERLESS COURSES
CS1604: Introduction to the Internet
CS3604: Professionalism in Computing
CS4624: Multimedia, Hypertext and Information Access (MHIA)
CS5604: Information Storage and Retrieval
CS6604: Digital Libraries
Self Study Course on Digital Libraries
CS Teaching Center (CSTC)
• Instead of building large, expensive multimedia packages that become obsolete and are difficult to re-use, concentrate on small knowledge units.
• Learners benefit from having well-crafted modules that have been reviewed and tested.
• Use digital libraries to build a powerful base of support for learners, upon which a variety of courses, self-study tutorials & reference resources can be built. (See NSF SMETE-Lib Study at http://www.dlib.org/smete/public/smete-public.html)
CSTC -> CRIM
• CSTC will have a variety of focused centers so that different types of resources can be collected, tested, and suitably packaged:
– laboratory exercises, activities, assignments (CNJ)
– visualizations/visualization tools (U Ill. Springfield)
– interactive multimedia resources (VT & GWU)
• CRIM focuses on interactive multimedia
– repository of materials (like SMETE-lib)
– curriculum for courses in CS and other areas, sequences, undergraduate and graduate programs
Network Research Group
• NSF 3 year grant on WWW logging, characterization, and optimization: Abrams, Fox, Pollard (CNS)
• Core member of Web Characterization Activity of World-Wide Web Consortium
• Providing DL to support WCA:
– logs
– tools
– publications
NDLTD
• News, Background/History
• Vision, Benefits, Approach, Possibilities
• Concerns, Problems, Opposition
• Solutions, Implementation, Results, Plans
What led to today’s situation?
• 1987 mtg in Ann Arbor: UMI, VT, …
• 1992 mtg in Washington: CNI, CGS, UMI, VT and 10 universities with 3 reps each
• 1993 mtg in Atlanta to start Monticello Electronic Library (MEL): SURA, SOLINET
• 1994 mtg in Blacksburg re ETD project: std of PDF + SGML + multimedia objects
• 1996 funding by SURA and US Dept. of Education (FIPSE) for regional, national projects (NDLTD)
What are we doing?
• Aiding universities to enhance grad educ., publishing and IPR efforts: to help improve the availability and content of theses and dissertations
• Educating ALL future scholars so they can publish electronically and effectively use digital libraries (i.e., are Information Literate and can be more expressive)
• Demonstrating how, for other organizations
A Digital Library Case Study
Electronic theses and dissertations (ETDs)
Submission: http://etd.vt.edu
Collection: http://www.theses.org
Networked Digital Library of Theses and Dissertations (NDLTD) http://www.ndltd.org (formerly “National” because of Fed. funds, before international members started joining)
Something for Everyone
• Students - contribute -> gain acclaim
• Universities - join -> help your students, gain increased DL experience + visibility
• Researchers - use, encourage -> content
• Publishers - liaise, support -> have more knowledgeable authors + backup details
• DL enthusiasts - adapt resources / ideas -> have exemplary pilot / model project
What are the key ideas?
• People can switch to electronic documents
– Becoming more expressive with hypermedia
• Mandating ETDs will change all future scholarship (for 100’s of thousands/yr)
• Scalability
– Empower authors to submit to DL, as a natural part of the educational process
– Study workflow & apply automation, so institutions streamline processing and build their part of the DL
– Federate along most suitable cultural/political lines
Key Ideas: Networked infrastructure
Scalability
Education is the rationale
University collaboration
Workflow, automation
Authors must submit
Maximal access
PDF, SGML, MM
Standards
Federated search
8th graders vs. grads
MARC, DC, URNs
User Search Support
NDLTD World Federated Search
• Virginia Tech ... (univ)
• US universities (30 other)
• Australia (7 univ group), national funds
• Portuguese NL ... (national lib)
• Russia (NW region), proposals in
• German group, based in Berlin
• User Interface
Note: There are 51 members worldwide, growing
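The slides name federated search across member sites but do not say how it is implemented. As a minimal sketch of the general idea only, the Python below sends one query to several sites and merges what comes back. The site names, sample titles, and the functions `make_site` and `federated_search` are all hypothetical stand-ins, not part of NDLTD.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Hit:
    site: str    # which member site returned the record
    title: str   # title of the thesis or dissertation


# A "site" here is just a function from a query string to matching titles.
# Real member sites would be remote services; these are in-memory stand-ins.
SearchFn = Callable[[str], List[str]]


def make_site(titles: List[str]) -> SearchFn:
    """Build a toy site that matches a query against its titles, ignoring case."""
    def search(query: str) -> List[str]:
        q = query.lower()
        return [t for t in titles if q in t.lower()]
    return search


def federated_search(query: str, sites: Dict[str, SearchFn]) -> List[Hit]:
    """Send the query to every site and merge the results into one list."""
    hits: List[Hit] = []
    for name, search in sites.items():
        try:
            titles = search(query)
        except Exception:
            # One failing site should not fail the whole search.
            continue
        hits.extend(Hit(site=name, title=t) for t in titles)
    # Sort by title, then site, for a stable presentation order.
    return sorted(hits, key=lambda h: (h.title.lower(), h.site))


# Hypothetical sample data, for illustration only.
SITES = {
    "site_a": make_site(["Digital Library Usability", "Parallel Sorting"]),
    "site_b": make_site(["Visualizing Digital Library Collections"]),
}
```

Calling `federated_search("digital library", SITES)` returns one hit from each toy site. Skipping a site that raises an error is a design choice of this sketch, so that a single unavailable member does not block results from the others.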
ETD Initiative (and UMI)
• Students: learn about DL, EPub
• TDs become more expressive
• N. Amer. (T)Ds are accessible, archived
• Global TDs become more accessible, archived
UMI
Universities
Support Services Developed
• CD/WWW site with > 300M: student guidelines, listservs, FAQs, press info, multimedia training materials
• Automated submission system
• SGML DTD for ETDs, SGML to HTML (web generator)
• Donations: Adobe, Microsoft
• Evaluation: instruments, analysis
NDLTD Future Work
• Recruit more members, support current ones: May 17-18 Blacksburg workshop
• Interoperability tests among universities and with UMI to provide integrated services
• Study with testbed that emerges, to improve information retrieval, browsing, interface, and other types of user support
• Evaluation, improving learning experience, spread to worldwide initiative, sustainable support and coordination
Pending/Future Proposals
• NUDL, NATO, other int’l
• PetaPlex (RI, MRI)
• EI2, Smaller DLI2
• SMETE-lib
• DL4U
NUDL/NATO
• July 1998: Proposal to help Russia establish NDLTD, with assistance from US, Portugal
• 1/15/99: Proposal to NSF under DLI2 international program for $.5M
– Fox, Kleiner, McMillan, Eaton
– Partners: UK (2), Singapore, Russia, Korea, Germany, plus Iberoamerican group (Spain, Portugal, Argentina, Brazil, Chile)
– Multilingual search, multimedia submissions, requirements/usability, ...
PetaPlex
• High-performance “superstore”
• 1000 to 1,000,000 gigabytes (terabyte/petabyte)
• Parallel computer, video WWW server, …
• Part of NSF CISE proposal of 11/98 (40 Tbytes)
• Preproposal submitted for NSF MRI
– CWT: wireless connections for flexibility
– CPES: low cost power
– DLRL/ITIC@VTZ (CS): software, applications, experiments for digital library server
EI2, Other DLI2
• EI2: successor to 1993-98 effort, to be led by Osman Balci, focusing on DL of models and simulations to help learners
• DLI2 second competition supports small, medium and large grants
– Recommender for a library of software
– Experiments using PetaPlex for DL applications
– Possible joint efforts, e.g., evaluation methods with UNC Chapel Hill
SMETE-lib
• Central coordination: manage NSF’s DL for undergraduates
• Content coordination: help collect and provide access to body of material by topic or genre
– CS, Math, …
– Models/simulations, algorithms/programs, laboratory materials, …
• Partnering necessary: OCLC? ...
DL4U
• Submitted 7/15/98 for $4M (5 years)
• Virtual corporation (similar model to efforts of ECpE but with different type of activity)
• 5 Divisions
– User Support: Local, Remote
– Collection and Testbed
– User Interface & Environments
– Evaluation & Usability
– Business
DL4U Organization
Project Director (CEO), Project Manager (COO): 100U, 1 staff, 11G, 5G (other funds)
• User Support: 31U, 2G (other funds)
– Local: 12U
  – Students - 3U: Eaton, Seamans, Husser
  – Faculty - 5U: Hatfield, Luke, Moore, Mosser, Stone, Wildman; G3
  – BEV - 4U (Blacksburg Electronic Village): Cohill
– Remote: 19U, 2G (other funds)
  – NDLTD - 6U: Eaton, McMillan, Fox
  – Navy - 3U: Balci, Nance; NPS
  – Heritage - 2U: Fox; Battelle
  – Singapore - 2U: Fox; KRDL, NLB, NTU, SingaREN
  – Chemistry - 3U, 1G (Dow): Gandour, Dessy; Dow
  – Computing - 3U, 1G (NSF): Fox; G3: CSTC + interactive multimedia (CRIM)
• Collection & Testbed: 22U, 2G, 3G (other funds)
– Collection Development - 5U, 1G: McMillan; ISOGEN, NTU
– Ontologies & Cataloging - 5U, 1G (VT): Fox, France, Husser; ISI, OCLC, VTLS, NatLibPortugal
– Plus Other Partners: Chungnam, Gyeongsang, NTU, St. Petersburg St Tech U
– Docs & Search - 8U, 1G (NLM), 1G (VTLS): Fox, France, Powell; G3, IBM, KRDL, OCLC, VTLS
– Theoretical Foundations - 4U, 1G: Watson, Fox; Battelle, KRDL
• User Interface & Environment: 21U, 5G
– Environment Development - 6U, 1G: Shaffer; FXPAL, G3
– Participatory Design - 4U, 1G: Carroll, Rosson; UDLA, NTU
– Collaboration & Decision Support - 4U, 1G: Rosson, Kleiner
– Virtual Environments & Visualization - 3U, 1G: Hix, France; Battelle, NTU
– Simulation - 4U, 1G: Balci, Nance; NPS
• Evaluation & Usability: 16U, 3G, 1 staff
– Laboratory Studies - 5U, 1G: Williges; SCT
– Formative Evaluation - 5U, 1G: Hartson; First Virtual
– Survey Research - 3U, 1 staff: Bayer; NTU
– Policy & Social Issues - 3U, 1G: Downey
• Business: 10U, 1G; Co-Heads: Sheetz, Sirgy
– Human Resources - 3U, 1/3G
– Public Relations & Marketing - 4U, 1/3G
– Information Systems - 3U, 1/3G
DL4U Organization - 1
Project Director (CEO), Project Manager (COO): 100U, 1 staff, 11G, 5G (other funds)
• User Support: 31U, 2G (other funds)
– Local: 12U
  – Students - 3U: Eaton, Seamans, Husser
  – Faculty - 5U: Hatfield, Luke, Moore, Mosser, Stone, Wildman; G3
  – BEV - 4U (Blacksburg Electronic Village): Cohill
– Remote: 19U, 2G (other funds)
  – NDLTD - 6U: Eaton, McMillan, Fox
  – Navy - 3U: Balci, Nance; NPS
  – Heritage - 2U: Fox; Battelle
  – Singapore - 2U: Fox; KRDL, NLB, NTU, SingaREN
  – Chemistry - 3U, 1G (Dow): Gandour, Dessy; Dow
  – Computing - 3U, 1G (NSF): Fox; G3: CSTC + interactive multimedia (CRIM)
DL4U Organization - 2
Project Director (CEO), Project Manager (COO): 100U, 1 staff, 11G, 5G (other funds)
• Collection & Testbed: 22U, 2G, 3G (other funds)
– Collection Development - 5U, 1G: McMillan; ISOGEN, NTU
– Ontologies & Cataloging - 5U, 1G (VT): Fox, France, Husser; ISI, OCLC, VTLS, NatLibPortugal
– Plus Other Partners: Chungnam, Gyeongsang, NTU, St. Petersburg St Tech U
– Docs & Search - 8U, 1G (NLM), 1G (VTLS): Fox, France, Powell; G3, IBM, KRDL, OCLC, VTLS
– Theoretical Foundations - 4U, 1G: Watson, Fox; Battelle, KRDL
DL4U Organization - 3
Project Director (CEO), Project Manager (COO): 100U, 1 staff, 11G, 5G (other funds)
• User Interface & Environment: 21U, 5G
– Environment Development - 6U, 1G: Shaffer; FXPAL, G3
– Participatory Design - 4U, 1G: Carroll, Rosson; UDLA, NTU
– Collaboration & Decision Support - 4U, 1G: Rosson, Kleiner
– Virtual Environments & Visualization - 3U, 1G: Hix, France; Battelle, NTU
– Simulation - 4U, 1G: Balci, Nance; NPS
DL4U Organization - 4
Project Director (CEO), Project Manager (COO): 100U, 1 staff, 11G, 5G (other funds)
• Evaluation & Usability: 16U, 3G, 1 staff
– Laboratory Studies - 5U, 1G: Williges; SCT
– Formative Evaluation - 5U, 1G: Hartson; First Virtual
– Survey Research - 3U, 1 staff: Bayer; NTU
– Policy & Social Issues - 3U, 1G: Downey
DL4U Organization - 5
Project Director (CEO), Project Manager (COO): 100U, 1 staff, 11G, 5G (other funds)
• Business: 10U, 1G; Co-Heads: Sheetz, Sirgy
– Human Resources - 3U, 1/3G
– Public Relations & Marketing - 4U, 1/3G
– Information Systems - 3U, 1/3G
DL4U VT Investigators
PI: Edward A. Fox
Co-PIs: John Carroll, H. Rex Hartson, Clifford Shaffer, Robert Williges
Investigators: Osman Balci, Alan Bayer, Andrew Cohill, Raymond Dessy, Gary Downey, John Eaton, Robert France
DL4U VT Investigators cont’d
Rich Gandour, Len Hatfield, Eileen Hitchingham, Deborah Hix, John Husser, Brian Kleiner, Timothy Luke, Gail McMillan, John Moore, Daniel Mosser, Richard Nance, James Powell, Mary Beth Rosson, Nancy Seamans, Steven Sheetz, Joseph Sirgy, Nick Stone, Layne Watson, Terry Wildman
DL4U Partners
Partner or Supporter    Low Value   Full Value
Battelle                1,173,000    1,855,000
Chungnam Nat. U.          360,000      600,000
Dow                       495,000      750,000
First Virtual             102,830      102,830
FXPAL                     125,000      625,000
G3 Systems                100,000    1,000,000
Heritage                   17,320       17,320
IBM                       120,000      450,000
ISOGEN                    100,000      100,000
ITIC                                   100,000
KRDL                      470,000      470,000
OCLC                      245,000      245,000
Naval Post. School         55,760       55,760
SCT                       247,500      247,500
Solinet                    20,000       20,000
VTLS                      292,711      292,711
Total                   3,924,121    6,931,121
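The totals in the partners table can be checked with a few lines of arithmetic. The Python sketch below re-adds both columns from the figures on the slide. ITIC is listed with a single figure, and placing it in the Full Value column only is an assumption; it is the placement under which both stated totals come out exactly. The names `PARTNERS` and `column_total` are illustrative.

```python
# Figures as listed on the DL4U Partners slide: (low value, full value).
# None marks a cell with no figure. ITIC shows a single value, assumed
# here to belong to the Full Value column.
PARTNERS = {
    "Battelle": (1_173_000, 1_855_000),
    "Chungnam Nat. U.": (360_000, 600_000),
    "Dow": (495_000, 750_000),
    "First Virtual": (102_830, 102_830),
    "FXPAL": (125_000, 625_000),
    "G3 Systems": (100_000, 1_000_000),
    "Heritage": (17_320, 17_320),
    "IBM": (120_000, 450_000),
    "ISOGEN": (100_000, 100_000),
    "ITIC": (None, 100_000),
    "KRDL": (470_000, 470_000),
    "OCLC": (245_000, 245_000),
    "Naval Post. School": (55_760, 55_760),
    "SCT": (247_500, 247_500),
    "Solinet": (20_000, 20_000),
    "VTLS": (292_711, 292_711),
}


def column_total(rows, column):
    """Sum one column (0 = low, 1 = full), counting empty cells as zero."""
    return sum(values[column] or 0 for values in rows.values())


LOW_TOTAL = column_total(PARTNERS, 0)    # 3,924,121, matching the slide
FULL_TOTAL = column_total(PARTNERS, 1)   # 6,931,121, matching the slide
```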
DL4U Activities
DL4U Environments
DL4U Theory
Theory informing DL4U.
Conclusions
• Spread of efforts on campus
• Integrative theme for collaboration
Spread of Efforts on Campus
• Digital Library Research Laboratory
• Departments: CS, ISE
• Centers: Internet TIC, HCI Center, CWT, CPES
• Scholarly Communications Project (McMillan)
• Distributed Information Systems (Powell)
• Center for Applied Technologies in the Humanities (CATH: Mosser, Hatfield)
• Digital Discourse Center (Luke, Hatfield)
• VT Multimedia Users Group
Integrative Theme for Collaboration
• To compete with other universities re DL & to serve our community effectively, collaboration is necessary in the DL area, maybe via ACITC:
• 1/3 of ACITC is for Electronic Library
• Specific programs in ACITC include
– DLRL
– HCI Center
– CATH
– New Media Center