Models for Digital Libraries

CSC 9010 Digital Libraries - week 2

The 5S model is the work of Edward A. Fox and his students at Virginia Tech. These slides rely heavily on that work.

Week 2 goals

• Expand our understanding of concept maps
• Discuss reading of “As we may think”
• Expand and confirm understanding of the 5 S model
• Use the 5 S model to direct thinking about the design of a DL
• Pointer to digital library software instructions
• Begin plans for initial DL installation
– What subject?
– Form teams

“As we may think”

• An exercise

– A Focus question: What is Vannevar Bush’s vision of the future of knowledge management?

• First, what is his summary of the current state? How does the reference to Mendel summarize the problem?

• In his discussion of photography, what does he get right? What does he miss?

– Provide random words or phrases that come to mind as you recall this article

• aka “the parking lot”
– <Jotted on the board as they are mentioned>
– In groups, make a concept map to address the focus question, incorporating as many of the ideas reflected in those words or phrases as you can. (Other concepts may be added.)

Specifically, memex

• Vannevar Bush’s vision
– How far have we come?
– What did you notice about this article -- style or content or background or anything else?
– Did the article suggest anything you would not want to see happen?

Image source: kelty.rice.edu/375/images/memex/camera.jpg
http://www.knowledgesearch.org/presentations/etcon/images/memex.gif

Some modern versions

• Gordon Bell and the Microsoft “My Life Bits” project

• Walden’s Path at Texas A & M

MyLifeBits

• Gordon Bell and Microsoft
http://www.guardian.co.uk/science/story/0,3605,1674359,00.html

“Gordon Bell doesn't need to remember, but has no chance of forgetting. At the age of 71, he is recording as much of his life as modern technology will allow, storing it all on a vast database: a digital facsimile of a life lived.

If he goes for a walk, a miniature camera that dangles from his neck snaps pictures every minute or so, immediately committing the scene to a memory built not of neurons but ones and noughts. If he wanders into a cafe, sensors note the change in light, the shift of temperature and squirrel the information away. Conversations are recorded and steps logged thanks to a GPS receiver carried with him.”

This article is now a few years old. Look for an update on this project and report next class.

The Guardian, Tuesday 27 December 2005

Related work

• Walden’s Path
– http://www.csdl.tamu.edu/walden/
– System used by itself or as a service within a digital library
– Allows a user to make a path through a set of related resources and save the path for reuse at a later time.

• Used to allow a teacher to “blaze a trail” through a collection of materials to help students find their way from a starting point to a goal.

• Also for recording personal trips through a collection of material to be revisited.

Reviewing the 5 S model

• Last week
– Looked at what a library is and introduced the 5S model

• Information resources, including digital libraries, are very complex systems.
– The 5S model helps to capture the essence of the system and give special attention to specific areas
– The model also allows developers of digital libraries to have a check list of areas to consider and develop well.

– Let’s review the 5 S model

The 5S model

• Streams
– The flow of information in various formats

• Structures
– Organizational aspects of the DL

• Spaces
– Views of components; real or abstract images

• Scenarios
– Services and behaviors

• Societies
– Communities and relationships among them

5S summary

| Model | Primitives | Formalisms | Objectives |
|---|---|---|---|
| Stream | Text; video; audio; software program | Sequences; types | Describes properties of the DL content, encoding and textual material or particular forms of multimedia data. |
| Structure | Collection; catalog; hypertext; document; metadata; organizational tools | Graphs; nodes; links; labels; hierarchies | Specifies organizational aspects of the DL content |
| Space | User interface; index; retrieval model | Sets; operations; vector space; measure space; probability space | Defines logical and presentational views of several DL components |
| Scenarios | Service; event; condition; action | Sequence diagrams; collaboration diagrams | Details the behavior of DL services |
| Societies | Community; managers; actors; classes; relationships; attributes; operators | Object-oriented modeling constructs; design patterns | Defines managers responsible for running DL services, actors that use those services, and relationships among them |

Source: http://www.dlib.vt.edu/projects/5S-Model/

Recall the application of the 5 S model to the Etana DL for archeology

[Figure: 5S diagram of the Etana DL with five panels: Stream model, Structure model, Space model, Scenario model, and Society model. Labels in the figure include Text, Video, Audio, Drawing, Photo, 3D; Site, Partition, Sub-partition, Locus, Container, Artifact, Region; Taxonomies (Spatial, Temporal, Artifact-specific); Metadata; Geographic space, Metric space, User interface; Archaeologist, General public, Service Manager; Services (Information Satisfaction, Value added, Repository building, Domain specific).]

Source: E. A. Fox http://feathers.dlib.vt.edu/

A ‘hands-on’ exercise

• Let’s look at a publicly available digital library and construct a 5 S diagram to define its design.

• (This is an existing library; we are not deciding how we would construct it. We are trying to identify its design features. We may decide that we would have designed it differently.)

http://www.geocaching.com

5S analysis template

Society model Scenario model

Space model

Structure model

Stream model

Use more or less space as needed

Fill in the template

• For the geocaching.com library, fill in the template.

• Work in groups again.

• We will compare the results

Getting ready for projects

• What are the interests? Let’s brainstorm about possible projects.

• Note – there are two stages
– A basic digital library design and implementation. Everyone does this – in teams or alone
– A project for most of the semester.
• This can be an extension of the initial DL or a new DL or some other digital library relevant project.

• Initial project plan is due next week. You can take a bit longer to decide on your big project.

Applying the model, informally

Choose a subject area – then answer the questions

• Stream - What types of data? gif, jpg, avi, docx, pdf, html?
• Structure - How are the elements organized? Is there a hierarchy? Are there multiple structures?
• Spaces - How will we index the items? How will we divide them into related groups?
• Scenarios - What services will we provide? What information do we need to provide those services? What events might happen that we need to plan for?

• Societies - who is the library intended to serve? Remember to include agents and other processes as well as users.

This is the first deliverable for your first project.

More formally: Definitions

• Definition: A stream is a sequence whose codomain is a nonempty set.

• Definition: A structure is a tuple (G, L, F) where G = (V, E) is a directed graph with vertex set V and edge set E, L is a set of label values, and F is a labeling function, F : (V ∪ E) → L.

See http://www.mathsisfun.com/sets/domain-range-codomain.html for a nice description of domain, range, codomain if you need it.
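The stream definition can be made concrete with a minimal Python sketch. It assumes a finite stream and stores the codomain explicitly so that the two conditions (nonempty codomain, every item drawn from it) can be checked. The class name `Stream`, its fields, and the sample data are illustrative assumptions, not part of the 5S formalism.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Stream:
    """A finite stream: a sequence whose items all come from a nonempty codomain."""
    items: Sequence
    codomain: frozenset

    def __post_init__(self):
        # The definition requires the codomain to be a nonempty set.
        if not self.codomain:
            raise ValueError("the codomain must be a nonempty set")
        # Every position in the sequence must map into the codomain.
        if any(item not in self.codomain for item in self.items):
            raise ValueError("every item must belong to the codomain")


# A text stream: a sequence of characters drawn from a small alphabet (assumed).
alphabet = frozenset("abcdefghijklmnopqrstuvwxyz ")
text_stream = Stream(items="as we may think", codomain=alphabet)

# A byte stream (e.g. image or audio data): items drawn from 0..255.
byte_stream = Stream(items=bytes([0, 17, 255]), codomain=frozenset(range(256)))
```

Text, audio, and video differ only in the codomain chosen (characters, samples, frames); the sequence idea stays the same.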

Structure illustration

[Figure: a Collection node connected by three edges, each labeled “includes”, to the nodes Images, Audio files, and Books.]

A very simple structure. How might it be enhanced? How would an index be included? What substructures might be added?

What are the G, L, F, V, E parts of this example?
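One possible reading of the illustration as a tuple (G, L, F), sketched in Python. The edge direction (from Collection to each content type) and the choice to label every vertex with its own name are assumptions made for this sketch; other labelings are equally valid. The helper `is_structure` is hypothetical and only checks the conditions stated in the definition.

```python
# G = (V, E): a directed graph.
V = {"Collection", "Images", "Audio files", "Books"}
E = {
    ("Collection", "Images"),
    ("Collection", "Audio files"),
    ("Collection", "Books"),
}

# L: the set of label values (vertex names plus the single edge label).
L = {"Collection", "Images", "Audio files", "Books", "includes"}

# F : (V ∪ E) -> L, stored as a dict. Assumption: each vertex is labeled
# with its own name, and every edge is labeled "includes".
F = {v: v for v in V}
F.update({e: "includes" for e in E})


def is_structure(V, E, L, F):
    """Check that (G, L, F) with G = (V, E) satisfies the definition."""
    edges_ok = all(u in V and v in V for (u, v) in E)   # E connects vertices of V
    total = set(F) == V | E                             # F is defined on all of V ∪ E
    labels_ok = all(label in L for label in F.values()) # F maps into L
    return edges_ok and total and labels_ok


def included_in(vertex):
    """Vertices reachable from `vertex` by one edge labeled "includes"."""
    return {v for (u, v) in E if u == vertex and F[(u, v)] == "includes"}
```

Adding an index or a substructure would mean adding vertices and edges (and possibly new label values such as "indexes") while keeping `is_structure` true.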

Definitions, cont’d

• Definition: A space is a measurable space, measure space, probability space, vector space, topological space, or metric space.
– A vector space is a representation for the set of elements in a collection. The vector representing each element is a set of characteristics held by that element. It both connects that element to others that are similar and distinguishes it from those that are different.

– We will do an exercise to illustrate

Vector space illustration

• Consider a car. What are the characteristics that you associate with a car? If you want to compare one car to another, what characteristics would you choose?

• Make a vector of those characteristics.

• Then, fill in the vector for several specific cars.
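The car exercise can be sketched in Python as follows. The characteristics chosen (`FEATURES`) and every number in `cars` are made-up values for illustration only, not data about real vehicles, and cosine similarity is just one common way to compare vectors.

```python
import math

# The characteristics (dimensions) of the vector. These are assumed for the sketch.
FEATURES = ("seats", "doors", "horsepower", "mpg")

# One vector per car, in the order given by FEATURES. Values are invented.
cars = {
    "compact": (5, 4, 130, 35),
    "minivan": (7, 4, 280, 22),
    "roadster": (2, 2, 180, 30),
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(name):
    """Name of the other car whose vector is closest in direction to `name`."""
    others = (other for other in cars if other != name)
    return max(others, key=lambda other: cosine_similarity(cars[name], cars[other]))
```

Because the raw values sit on very different scales, a characteristic such as horsepower dominates the comparison. Rescaling each characteristic to a common range before comparing is a usual refinement.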

Definitions - 3

• Definition: A scenario is a sequence of related transition events (e1, e2, …, en) on state set S such that ek = (sk, sk+1) for 1 <= k <= n.
– More easily visualized, a scenario is a path in a directed graph, G = (S, ∑e), where vertices correspond to states in the state set S and directed edges are equivalent to events in a set of events, ∑e, and correspond to transitions between states.

– Scenarios must be implemented to make a working system.
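The path view of a scenario can be sketched in Python. The states and events below describe a hypothetical search service and are assumptions for illustration. The function `is_scenario` checks that each event is an allowed edge and that consecutive events chain, i.e. the end state of ek is the start state of ek+1.

```python
# S: the state set (hypothetical search service).
STATES = {"idle", "query entered", "results shown", "item displayed"}

# ∑e: the set of events, each a directed edge (from_state, to_state).
EVENTS = {
    ("idle", "query entered"),
    ("query entered", "results shown"),
    ("results shown", "item displayed"),
    ("results shown", "query entered"),
    ("item displayed", "results shown"),
}


def is_scenario(events, states=STATES, allowed=EVENTS):
    """True if `events` is a nonempty path in the graph G = (states, allowed)."""
    if not events:
        return False
    for src, dst in events:
        # Each event must join two known states and be an allowed transition.
        if src not in states or dst not in states or (src, dst) not in allowed:
            return False
    # e_k = (s_k, s_{k+1}): each event starts where the previous one ended.
    return all(events[k][1] == events[k + 1][0] for k in range(len(events) - 1))


search_then_view = [
    ("idle", "query entered"),
    ("query entered", "results shown"),
    ("results shown", "item displayed"),
]
broken = [
    ("idle", "query entered"),
    ("results shown", "item displayed"),  # does not start at "query entered"
]
```

Here `is_scenario(search_then_view)` holds, while `is_scenario(broken)` does not because the two events do not chain.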

Definitions - 4

• Definition: A society is a tuple (C, R) where
– C = {c1, c2, …, cn} is a set of conceptual communities, each community referring to a set of individuals of the same class or type (e.g. actors, activities, components, hardware, software, data);
– R = {r1, r2, …, rm} is a set of relationships, each relationship being a tuple rj = (ej, ij) where ej is a Cartesian product c_{k1} × c_{k2} × … × c_{knj}, 1 <= k1 < k2 < … < knj <= n, which specifies the communities involved in the relationship, and ij is an activity.
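A minimal Python sketch of a society (C, R). The community names, their members, and the activity "uses" are hypothetical. The helper `relationship` builds rj = (ej, ij), where ej is the Cartesian product of the chosen communities, and enforces the index condition k1 < k2 < … by requiring communities to be listed in their fixed order, each at most once.

```python
from itertools import product

# C: the communities, in a fixed index order c1, c2, c3 (all names are assumed).
COMMUNITY_ORDER = ["patrons", "librarians", "services"]
communities = {
    "patrons": {"alice", "bob"},
    "librarians": {"carol"},
    "services": {"search", "browse"},
}


def relationship(names, activity):
    """Build r = (e, i): e is the Cartesian product of the named communities."""
    indices = [COMMUNITY_ORDER.index(name) for name in names]
    # Enforce 1 <= k1 < k2 < ... <= n: nonempty, strictly increasing indices.
    if not indices or indices != sorted(set(indices)):
        raise ValueError("communities must be distinct and in index order")
    e = set(product(*(communities[name] for name in names)))
    return (e, activity)


# R: the set of relationships, here just one: patrons use services.
uses = relationship(("patrons", "services"), "uses")
R = [uses]
society = (communities, R)
```

With these sample members, `uses[0]` contains pairs such as `("alice", "search")`, one for every combination of a patron and a service.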

Projects in our DL laboratory

• Mendel 289 is the center of activity for projects related to digital libraries and similar projects.

• Summary of the projects under way, which may present opportunities for class projects or for independent study

• NSDL, CITIDEL, CSTA, Ensemble, Distributed Expertise, Computing Ontology, Interdisciplinary Computing and its relationship to the libraries ….

Our systems

• Now available
– Fedora Linux machines, remotely accessible (use the gateway)
– Bare machines with just basic system
– We can install Drupal either from the Drupal site (doing things for ourselves) or from the Bitnami site (builds the stack for us)

• If you have a computer of your own and want to use it,
– Fine, but you must be able to demonstrate it to the class at the end of the semester. I will need to be able to see what you are doing from time to time during the semester.
– That means you need a static IP address.

Summary - Week 2

• Further developed our understanding of concept maps

• Explored the vision of Vannevar Bush
• Applied the 5S model to an existing Digital Library
• Began planning for the first DL project
• Learned about existing projects that may provide ideas for class or independent study projects.