Models for Digital Libraries

CSC 9010 - week 2

The 5S model is the work of Edward A. Fox and his students at Virginia Tech. These slides rely heavily on that work.

Schedule notes

• We will briefly review the slides from week 1 to be sure all is well.

• I will be traveling again next week. I apologize for the disruption. This is the last trip this semester that will affect the class.

• I will have an online class for you, so we will just have a week of distance learning.

Week 2 goals

• Discuss reading of “As we may think”

• Review of points in the discussion of What is a Digital Library?

– You read the definition. I hope you gave it some thought. Let’s explore a bit more now.

• Introduce a formal model of digital libraries

• Use the 5S model to direct thinking about the design of a DL

• Briefly introduce systems to be used

– More will come about the computer systems and the software available. Start with your own goals.

Our systems

• Several Linux machines, remotely accessible

• Bare machines with just basic system software.

• We will install Apache and the rest of the web infrastructure, as well as the DL software.

• Detailed instructions will be available next week or the week after that.

Using your own?

• If you prefer to use your own computer for the class projects, that is fine IF

– (Required) It can be put online so that others can access the DL you produce during the semester

– (Preferred, but optional) You will agree to transfer your DL to a VU system for long term demonstration use if requested.

• The best projects are good inspiration for other students and we would like your work to remain visible if it is a good project that might interest other people.

Discussion - Reading 1

• “As we may think” by Vannevar Bush

• Published: July 1945

• Bush had been part of the Manhattan Project, which produced the atomic bomb

• After the war, he wanted to direct attention to other possibilities for science

• His work led directly to the establishment of the National Science Foundation

Concern with access to knowledge

• Mendel's concept of the laws of genetics was lost to the world for a generation because his publication did not reach the few who were capable of grasping and extending it; and this sort of catastrophe is undoubtedly being repeated all about us, as truly significant attainments become lost in the mass of the inconsequential.

Memex - Hypermedia

• It affords an immediate step, however, to associative indexing, the basic idea of which is a provision whereby any item may be caused at will to select immediately and automatically another. This is the essential feature of the memex. The process of tying two items together is the important thing.

memex

• Vannevar Bush’s vision

– What did you notice about this article: style, content, background, or anything else?

– How far have we come?

– Did the article suggest anything you would not want to see happen?

Image source: kelty.rice.edu/375/images/memex/camera.jpg, http://www.knowledgesearch.org/presentations/etcon/images/memex.gif

MyLifeBits

• Gordon Bell and Microsoft

• http://www.guardian.co.uk/science/story/0,3605,1674359,00.html

“Gordon Bell doesn't need to remember, but has no chance of forgetting. At the age of 71, he is recording as much of his life as modern technology will allow, storing it all on a vast database: a digital facsimile of a life lived.

If he goes for a walk, a miniature camera that dangles from his neck snaps pictures every minute or so, immediately committing the scene to a memory built not of neurons but ones and noughts. If he wanders into a cafe, sensors note the change in light, the shift of temperature and squirrel the information away. Conversations are recorded and steps logged thanks to a GPS receiver carried with him.”

How does this compare to Vannevar Bush’s vision?

Related work

• Walden’s Path

– http://www.csdl.tamu.edu/walden/

– “When the user is building a trail, he names it, inserts the name in his code book, and taps it out on his keyboard.” - Vannevar Bush, “As We May Think”

– System used by itself or as a service within a digital library

– Allows a user to make a path through a set of related resources and save the path for reuse at a later time.

• Used to allow a teacher to “blaze a trail” through a collection of materials to help students find their way from a starting point to a goal.

• Also for recording personal trips through a collection of material to be revisited.

Moving Forward

• Last week

– Looked at what a library is

• Now

– How do we translate that to a digital entity?

• Information resources, including digital libraries, are very complex systems.

– A formal model helps to capture the essence of the system and give special attention to specific areas

– The model also gives developers of digital libraries a checklist of areas to consider and develop well.

The 5S model - An informal summary

• Streams

– The flow of information in various formats

• Structures

– Organizational aspects of the DL

• Spaces

– Views of components; real or abstract images

• Scenarios

– Services and behaviors

• Societies

– Communities and relationships among them

5S summary

| Model | Primitives | Formalisms | Objectives |
|---|---|---|---|
| Stream | Text; video; audio; software program | Sequences; types | Describes properties of the DL content, encoding and textual material or particular forms of multimedia data. |
| Structure | Collection; catalog; hypertext; document; metadata; organizational tools | Graphs; nodes; links; labels; hierarchies | Specifies organizational aspects of the DL content |
| Space | User interface; index; retrieval model | Sets; operations; vector space; measure space; probability space | Defines logical and presentational views of several DL components |
| Scenarios | Service; event; condition; action | Sequence diagrams; collaboration diagrams | Details the behavior of DL services |
| Societies | Community; managers; actors; classes; relationships; attributes; operators | Object-oriented modeling constructs; design patterns | Defines managers responsible for running DL services, actors that use those services, and relationships among them |

Source: http://www.dlib.vt.edu/projects/5S-Model/

Etana - A DL for archeology

An example application of 5S - Etana: A DL for an archeological site

[Figure: 5S model of Etana. Panels: Stream model, Structure model, Space model, Scenario model, Society model. Other labels recoverable from the diagram: Text, Video, Audio, Drawing, Photo, 3D; Site, Partition, Sub-partition, Locus, Container, Artifact, Region; Taxonomies; Spatial, Temporal, Artifact-specific; Metadata; Geographic space, Metric space, User interface; Archaeologist, General public, Service Manager; Services, Information Satisfaction, Value added, Repository building, Domain specific.]

Source: E. A. Fox http://feathers.dlib.vt.edu/

An exercise - Subjects of interest for creating a DL:

• Indian Music

• Video games

• Photos

• Books

• Animals

• Cultures

• National Parks

• Example to use

– Music - 1

– Video games - 3

– Photos - 2

– Books - 2

– Animals - 3

– Cultures - 3

– National Parks - 2

Applying the model, informally

Subjects from class interest list

• Stream - What types of data? Gif, jpg, avi?

• Structure - How are the elements organized? Is there a hierarchy? Are there multiple structures?

• Spaces - How will we index the items? How will we divide them into related groups?

• Scenarios - What services will we provide? What information do we need to provide those services?

• Societies - Who is the library intended to serve? Remember to include agents and other processes as well as users.

In your group, choose one or the other. Start with stream, scenarios, societies.

More formally: Definitions

• Definition: A stream is a sequence whose codomain is a nonempty set.

• Definition: A structure is a tuple (G, L, F) where G = (V,E) is a directed graph with vertex set V and edge set E, L is a set of label values, and F is a labeling function.
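As a rough illustration of these two definitions, the Python sketch below treats a stream as a finite sequence over a nonempty set of symbols and a structure as a labeled directed graph. The names (`is_stream`, `Structure`, the sample vertices and labels) are hypothetical, and storing F as a dictionary over vertices and edges is an assumption; the slide only says that F is a labeling function.

```python
from dataclasses import dataclass


def is_stream(seq, codomain):
    """A finite stream: a sequence whose every element lies in a nonempty codomain."""
    return len(codomain) > 0 and all(item in codomain for item in seq)


@dataclass
class Structure:
    """A structure (G, L, F): directed graph G = (V, E), label set L, labeling F."""
    vertices: set
    edges: set      # set of (source, target) pairs
    labels: set
    labeling: dict  # assumption: maps vertices and edges to labels

    def is_valid(self):
        # Every edge must join two known vertices.
        edges_ok = all(u in self.vertices and v in self.vertices
                       for (u, v) in self.edges)
        # Every labeled item must be a vertex or an edge of the graph.
        items_ok = all(k in self.vertices or k in self.edges
                       for k in self.labeling)
        # Every label used must come from the label set L.
        labels_ok = all(v in self.labels for v in self.labeling.values())
        return edges_ok and items_ok and labels_ok


# A text stream: a sequence of characters drawn from a small alphabet.
text_stream = list("memex")
alphabet = set("abcdefghijklmnopqrstuvwxyz")

# A tiny document structure: a book with two chapters (hypothetical example).
book = Structure(
    vertices={"book", "ch1", "ch2"},
    edges={("book", "ch1"), ("book", "ch2")},
    labels={"document", "chapter", "contains"},
    labeling={
        "book": "document",
        "ch1": "chapter",
        "ch2": "chapter",
        ("book", "ch1"): "contains",
        ("book", "ch2"): "contains",
    },
)

print(is_stream(text_stream, alphabet), book.is_valid())
```

The same `Structure` class could describe a catalog hierarchy or a hypertext; only the vertices, edges, and labels change.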

Definitions, cont’d

• Definition: A space is a measurable space, measure space, probability space, vector space, topological space, or metric space

– A vector space is a representation for the set of elements in a collection. The vector representing each element records the characteristics held by that element. It both connects that element to others that are similar and distinguishes it from those that are different.

– We will do an exercise to illustrate
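One way to make the vector space idea concrete is sketched below in Python. Each item in a made-up collection is a vector of 0/1 characteristics, and cosine similarity is used to compare items. The items, the feature list, and the choice of cosine similarity are all assumptions for illustration; the slides do not prescribe a particular similarity measure.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical characteristics, one vector position each.
features = ["photo", "park", "animal", "music"]

# Hypothetical collection items: 1 if the item has the characteristic, else 0.
items = {
    "yellowstone_bison.jpg": [1, 1, 1, 0],
    "yosemite_falls.jpg":    [1, 1, 0, 0],
    "raga_recording.mp3":    [0, 0, 0, 1],
}

# Compare every item with one chosen item; similar items score closer to 1.
query = items["yellowstone_bison.jpg"]
for name, vector in items.items():
    print(name, round(cosine_similarity(query, vector), 3))
```

Items that share characteristics score close to 1 and items with nothing in common score 0, which is the "connecting" and "distinguishing" role the definition describes.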

Definitions - 3

• Definition: A scenario is a sequence of related transition events (e_1, e_2, …, e_n) on state set S such that e_k = (s_k, s_{k+1}) for 1 ≤ k ≤ n.

– More easily visualized, a scenario is a path in a directed graph, G = (S, Σ_e), where vertices correspond to states in the state set S and directed edges are equivalent to events in a set of events, Σ_e, and correspond to transitions between states.

– Scenarios must be implemented to make a working system.
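The path view of a scenario can be checked mechanically. The following Python sketch represents each event as a (source state, target state) pair and verifies that consecutive events connect. The function name, the state names, and the example search service are hypothetical, and requiring at least one event is an assumption.

```python
def is_scenario(events, states):
    """Check that events e_1..e_n chain: each e_k = (s_k, s_{k+1}) with states in S."""
    if not events:
        # Assumption: a scenario contains at least one transition event.
        return False
    for source, target in events:
        if source not in states or target not in states:
            return False
    # Consecutive events must connect: the target of e_k is the source of e_{k+1}.
    return all(events[k][1] == events[k + 1][0] for k in range(len(events) - 1))


# Hypothetical state set for a simple search service.
states = {"idle", "query_entered", "results_shown", "item_displayed"}

search_scenario = [
    ("idle", "query_entered"),
    ("query_entered", "results_shown"),
    ("results_shown", "item_displayed"),
]

broken = [
    ("idle", "query_entered"),
    ("results_shown", "item_displayed"),  # does not start where the last event ended
]

print(is_scenario(search_scenario, states), is_scenario(broken, states))
```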

Definitions - 4

• Definition: A society is a tuple (C, R) where

– C = {c_1, c_2, …, c_n} is a set of conceptual communities, each community referring to a set of individuals of the same class or type (e.g. actors, activities, components, hardware, software, data);

– R = {r_1, r_2, …, r_m} is a set of relationships, each relationship being a tuple r_j = (e_j, i_j) where e_j is a Cartesian product c_{k_1} × c_{k_2} × … × c_{k_{n_j}}, 1 ≤ k_1 < k_2 < … < k_{n_j} ≤ n, which specifies the communities involved in the relationship, and i_j is an activity.
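A small Python sketch may help in reading the society definition. Communities are sets of individuals, and each relationship pairs the communities involved with an activity. All community names, individuals, and activities are hypothetical. Storing e_j as a tuple of community names and expanding it into a Cartesian product on demand is a simplification chosen here, not part of the formal model.

```python
from itertools import product

# Communities C: each is a set of individuals of one type (all names hypothetical).
communities = {
    "teachers": {"ms_lee"},
    "students": {"ana", "raj"},
    "search_services": {"keyword_search"},
}

# Relationships R: each r_j = (e_j, i_j) pairs the communities involved with an activity.
# Here e_j is stored as a tuple of community names and expanded on demand.
relationships = [
    (("teachers", "students"), "blazes a trail for"),
    (("students", "search_services"), "submits a query to"),
]


def participants(community_names):
    """Expand e_j: the Cartesian product of the named communities."""
    return set(product(*(communities[name] for name in community_names)))


# The society is the pair (C, R).
society = (communities, relationships)

for names, activity in relationships:
    for combo in sorted(participants(names)):
        print(combo, "->", activity)
```

Note that the communities include a software service as well as people, matching the definition's remark that communities can be actors, components, hardware, software, or data.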

Summary - Week 2

• Continued to explore what a digital library really is

• Introduced some formal concepts for modeling a DL

• Briefly discussed the installation and operation of our own DLs.