Where are Repository's Going?

Where are repositories going? Ben O’Steen (ORA Software Developer) Sally Rumsey (ORA Service & Development Manager) Repository Fringe, Edinburgh 2009

Keynote talk by Sally Rumsey and Ben O'Steen, given at the Repository Fringe 2009, Edinburgh.

Transcript of Where are Repository's Going?

Page 1: Where are Repository's Going?

Where are repositories going?

Ben O’Steen (ORA Software Developer)

Sally Rumsey (ORA Service & Development Manager)

Repository Fringe, Edinburgh 2009

Page 2: Where are Repository's Going?

Growth of repositories & historical parallel

Page 3: Where are Repository's Going?

Sir Thomas Bodley

Page 4: Where are Repository's Going?

Storage

Content

Page 5: Where are Repository's Going?

“…and you having built an Ark to save learning from deluge, deserve propriety in any new instrument or engine, whereby learning should be improved or advanced.”

http://novels.mobi/create/out_mobi/pg/1/2/5/1/12515/12515/4.php

Francis Bacon to Thomas Bodley Nov 1605

Page 6: Where are Repository's Going?

Reproduced for this presentation with kind permission of King's College London, Foyle Special Collections Library

www.kcl.ac.uk/.../exhibitions/marsex/mcoll.html

Search function

Library catalogue 1620

Page 7: Where are Repository's Going?

Jeffrey Keefer http://www.flickr.com/photos/jeffreykeefer/773540725/ CC licence: Attribution-Non-Commercial-Share Alike 2.0 Generic

Page 8: Where are Repository's Going?

Radcliffe Camera 1749

Radcliffe Science Library 1861

New Bodleian 1940

Page 9: Where are Repository's Going?

Artist’s impression of new Bodleian book depository at Swindon. Details may change.

Page 10: Where are Repository's Going?

Bodleian Stats 2009

8.5M volumes

1.6M visitors each year

65,000 registered readers*

5.4M requests for full-text journal articles

1.8M requests for e-books

* 37,000 University card holders plus 28,000 external readers

Page 11: Where are Repository's Going?

Bodleian Library declaration: I hereby undertake not to remove from the Library, or to mark, deface, or injure in any way, any volume, document or other object belonging to it or in its custody; not to bring into the Library, or kindle therein, any fire or flame, and not to smoke in the Library; and I promise to obey all rules of the Library.

Library terms and conditions

Page 12: Where are Repository's Going?

QUOD FELICITER VORTAT

ACADEMICI OXONIENS

BIBLIOTHECAM HANC

VOBIS REIPUBLICAEQUE

LITERATORUM T.B.P.

That it might turn out happily, Oxonian academics, for you and for the republic of lettered men Thomas Bodley placed this library

Page 13: Where are Repository's Going?

Growth in numbers of digital repositories

Source: Tim Brody. ROAR Registry of Open Access Repositories. http://roar.eprints.org/

Page 14: Where are Repository's Going?

Some overarching themes

Page 15: Where are Repository's Going?

Theme

Realisation as a catalyst for change

Page 16: Where are Repository's Going?

Theme

Repositories as a concept

Page 17: Where are Repository's Going?

Institutional repository

Repository as a box

Paper in Paper out

Page 18: Where are Repository's Going?

Integration with other hard and soft systems

enderisnotmyrealname http://www.flickr.com/photos/enderisnotmyrealname/3586300347/

Attribution-Non-Commercial-Share Alike 2.0 Generic

Page 19: Where are Repository's Going?

“An effective institutional repository of necessity represents a collaboration among librarians, information technologists, archives and records managers, faculty, and university administrators and policymakers.”

Clifford Lynch. ARL Bi-monthly report No. 226 Feb 2003 http://www.arl.org/resources/pubs/br/br226/br226ir.shtml

Page 20: Where are Repository's Going?

“… a university-based institutional repository is a set of services … an institutional repository is not simply a fixed set of software and hardware.”

Clifford Lynch. ARL Bi-monthly report No. 226 Feb 2003 http://www.arl.org/resources/pubs/br/br226/br226ir.shtml

Page 21: Where are Repository's Going?

Sally Ben

Page 22: Where are Repository's Going?

The most successful repository is the internet.

Embrace it.

Page 27: Where are Repository's Going?

Some pointers then:

• Distributed across a number of nodes.

• The services and storage should be separate.

• There should be multiple ways to search the content.

• Any service or storage can disappear, be added or upgraded without affecting the other systems unduly.

Just think how you might make your IR work more like the web does.

Page 28: Where are Repository's Going?

"The future is here. It's just not evenly distributed yet."

William Gibson, NPR Talk of the Nation, 30 November 1999. Timecode: 11min 55sec

Link: discover.npr.org/features/feature.jhtml?wfId=1067220

Also: www.npr.org/rundowns/rundown.php?prgld=5&prgDate=30-Nov-1999

Page 30: Where are Repository's Going?
Page 31: Where are Repository's Going?

For those who want to follow along:

http://bit.ly/89AtD

Page 32: Where are Repository's Going?

Using Google to assay the forms of usage

Page 33: Where are Repository's Going?

Using Amazon's in-book search and browse to find the phrase.

Page 34: Where are Repository's Going?

Tim O'Reilly checked with Cory Doctorow who checked with Lorna Toolis who checked with Barry Wellman who checked with Ren Reynolds and Ellen Pozzi who point out that there's an NPR Talk of the Nation broadcast from 1999 where Gibson says,

"As I've said many times, the future is already here. It's just not very evenly distributed."

William Gibson, NPR Talk of the Nation, 30 November 1999. Timecode: 11min 55sec

Link: discover.npr.org/features/feature.jhtml?wfId=1067220

Also: www.npr.org/rundowns/rundown.php?prgld=5&prgDate=30-Nov-1999

Page 35: Where are Repository's Going?

NPR has changed their site since then, breaking the link to the metadata about that recording...

whoops...

Page 36: Where are Repository's Going?

But the link to the actual broadcast works:

http://discover.npr.org/features/feature.jhtml?wfId=1067220

Notice anything interesting in that URL?

Page 37: Where are Repository's Going?

Hint Hint:

feature.jhtml?wfId=1067220
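One way to read the hint: the query string carries an identifier for the broadcast itself. A minimal Python sketch, using only the standard library, that pulls that identifier out of the URL quoted on the slide. The helper name `extract_id` is ours and purely illustrative; nothing here reflects how NPR's site works internally.

```python
from urllib.parse import urlparse, parse_qs

# URL quoted on the slide.
URL = "http://discover.npr.org/features/feature.jhtml?wfId=1067220"


def extract_id(url, key="wfId"):
    """Return the first value of `key` in the URL's query string, or None."""
    query = parse_qs(urlparse(url).query)
    values = query.get(key)
    return values[0] if values else None


if __name__ == "__main__":
    print(extract_id(URL))
```

The page around it can change; the identifier is the part that names the Thing.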

Page 38: Where are Repository's Going?

Realisation:

People search for Things; the fact that they can only retrieve documents concerning those Things is incidental to them.

Page 39: Where are Repository's Going?

Things:

• People

• Places

• Dates

• Books

• CDs

• Performances/Events

• Topics/subjects

• … etc, etc

Page 41: Where are Repository's Going?

Things:

• What did they all have in common?

• They all have 'names' of one sort in real life.

• But there are plenty of those Things that don't have names on the web...

• How about we give them names?

Page 42: Where are Repository's Going?

Provide documents that directly relate to, rather than simply mention, a Thing the person is searching for.

Page 45: Where are Repository's Going?

“Relate” is a fluffy word.

The key is knowing how a Document relates to a Thing.

Does it describe it, comment on it, refer to it, locate it, disagree with it?

Page 46: Where are Repository's Going?

The types of relationships between Named Things are very important.
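A minimal sketch of what "knowing how" might look like in practice: each statement is a (document, relationship, thing) triple, and a search asks for one kind of relationship rather than any mention. The example.org names and the relationship labels are made up for illustration; a real system would more likely use RDF and a published vocabulary.

```python
from collections import namedtuple

Triple = namedtuple("Triple", ["document", "relationship", "thing"])

# Hypothetical names; example.org is a placeholder domain.
BODLEY = "http://example.org/person/thomas-bodley"
TRIPLES = [
    Triple("http://example.org/doc/biography", "describes", BODLEY),
    Triple("http://example.org/doc/review", "comments on", BODLEY),
    Triple("http://example.org/doc/footnote", "refers to", BODLEY),
]


def documents_that(relationship, thing, triples=TRIPLES):
    """Documents holding the given relationship to the given Thing."""
    return [t.document for t in triples
            if t.relationship == relationship and t.thing == thing]


if __name__ == "__main__":
    print(documents_that("describes", BODLEY))
```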

Page 47: Where are Repository's Going?

Realisation – we have been avidly giving ourselves HTTP names for some time now

xkcd.com/262/

Page 48: Where are Repository's Going?
Page 49: Where are Repository's Going?

Realisation – we have been avidly giving ourselves HTTP names for some time now

• http://facebook.com/benosteen

• http://twitter.com/benosteen

• http://oxfordrepo.blogspot.com

• Etc

• Etc

Page 50: Where are Repository's Going?

Realisation – we have been avidly giving ourselves HTTP names for some time now

• Things can have multiple, simultaneous names in real-life and online.

• The real power comes from relating names:

• “This named thing {is the same as} that other named thing”
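A rough sketch of relating names, assuming the simplest possible model: a list of "is the same as" pairs, merged into groups of names that all point at one Thing. The function name and the pairing of these particular URLs are our illustration, not a description of any existing service.

```python
def same_as_groups(pairs):
    """Group names linked, directly or indirectly, by 'is the same as'."""
    parent = {}

    def find(name):
        # Follow links up to the group's representative name.
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # shorten the chain as we go
            name = parent[name]
        return name

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for name in list(parent):
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Illustrative statements only.
PAIRS = [
    ("http://facebook.com/benosteen", "http://twitter.com/benosteen"),
    ("http://twitter.com/benosteen", "http://oxfordrepo.blogspot.com"),
]

if __name__ == "__main__":
    for group in same_as_groups(PAIRS):
        print(sorted(group))
```

Two statements are enough to tie three names together; nobody had to assert the first and third directly.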

Page 51: Where are Repository's Going?

And when it comes to people on the web, there has been a social sea-change

Page 52: Where are Repository's Going?

“Have you got a profile page on friendsreunited?”

Page 53: Where are Repository's Going?

“Are you on facebook?”

“Are you on twitter?”

Page 54: Where are Repository's Going?

Linked Data

Linked Data and HTTP names

Page 55: Where are Repository's Going?

Real Data

• http://data.gov

• http://recovery.gov

Repositories of US Federal Data and Federal Funding information, being published in a re-usable manner using Atom and RDF.

Page 56: Where are Repository's Going?

Real Data

• http://id.loc.gov

– Library of Congress publishing their authority lists as Linked Data in RDF.

Page 57: Where are Repository's Going?

Yahoo and Google indexing RDF embedded in HTML pages (as RDFa)

O'Reilly post on Google's “adoption”: http://radar.oreilly.com/2009/05/google-announces-support-for-m.html

A piece from the RDFa.info site about Yahoo and SearchMonkey's use of RDFa: http://rdfa.info/2008/03/14/yahoo-into-semantic-web/

Page 58: Where are Repository's Going?

Ben Sally

Page 59: Where are Repository's Going?

Theme

Policies

Page 60: Where are Repository's Going?
Page 62: Where are Repository's Going?

1. Accession

Page 63: Where are Repository's Going?

“Dedicated to the freeing of the refereed research literature online through author/institution self-archiving”

November 2000

Page 64: Where are Repository's Going?

“Recognised as the easiest and fastest way to set up repositories of:

• research literature

• scientific data

• student theses

• project reports

• multimedia artefacts

• teaching materials

• scholarly collections

• digitised records

• exhibitions and performances”

July 2009

Page 65: Where are Repository's Going?

“Resources range from simple materials such as Word documents or Powerpoint presentations, to complex learning packages, IMS, SCORM and VLE course modules that combine various multimedia formats such as video, audio and animation.”

http://www.jorum.ac.uk/

Page 66: Where are Repository's Going?

Success story

Community-specific user interfaces for deposit, discovery and access

Page 67: Where are Repository's Going?

Deposit in multiple repositories

More needs to be done!

Institutional repository

Subject repository

Other repository

Single deposit

Page 68: Where are Repository's Going?

Sally Ben

Page 69: Where are Repository's Going?

Realisation:

We've reinvented too many “wheels”

Page 70: Where are Repository's Going?

The Web exists

and it works.

Page 71: Where are Repository's Going?

Don't fight it.

Work with it.

Page 72: Where are Repository's Going?

Using the de facto standards of the web gives you a massive advantage.

Page 73: Where are Repository's Going?

De facto standards

• Transfer:

– Files (HTTP)

– Lists of things (Atom, RSS)

• Create, Read, Update, Delete:

– HTTP PUT, GET, POST, DELETE

• Names:

– URIs

• Lookups:

– DNS resolvers
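A minimal sketch of the verb-to-operation pairing listed above, using an in-memory stand-in rather than a real web server. The class name `TinyRepository` and the choice of status codes are our assumptions; the slide pairs PUT, GET, POST and DELETE with create, read, update and delete in that order, and real services often vary (POST is frequently used to create).

```python
class TinyRepository:
    """In-memory stand-in for a web server: URIs name things, verbs act on them."""

    def __init__(self):
        self._things = {}

    def handle(self, method, uri, body=None):
        """Return a (status_code, body) pair for one request."""
        if method == "PUT":              # create at a known name
            created = uri not in self._things
            self._things[uri] = body
            return (201 if created else 200), body
        if uri not in self._things:      # every other verb needs an existing thing
            return 404, None
        if method == "GET":              # read
            return 200, self._things[uri]
        if method == "POST":             # update
            self._things[uri] = body
            return 200, body
        if method == "DELETE":           # delete
            del self._things[uri]
            return 204, None
        return 405, None                 # verb not supported


if __name__ == "__main__":
    repo = TinyRepository()
    print(repo.handle("PUT", "/objects/1", "first draft"))
    print(repo.handle("GET", "/objects/1"))
    print(repo.handle("DELETE", "/objects/1"))
    print(repo.handle("GET", "/objects/1"))
```

Anything that already speaks HTTP (browsers, command-line clients, crawlers) can work with an interface shaped like this without custom client code.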

Page 74: Where are Repository's Going?

The big advantages

• Instant community

• Lots of tried and tested software that you don't have to write from scratch!

• No wheels need be re-invented

• May not be perfect, but it works

Page 75: Where are Repository's Going?

The big disadvantage

• If you really are doing something unique or new (which is really unlikely) then try to get a community to help you.

• If no one else wants to do it like you, then think about what you are truly accomplishing.

Page 76: Where are Repository's Going?

What is new?

• Doing something new does not mean using a more refined vocabulary to describe things.

Page 80: Where are Repository's Going?

What is new?

• I mean something that we don't have de facto standards for:

– Real-time event notifications through the browser

– Simultaneous collaborative document editing

– Data qualified and ranked by evidence

Page 81: Where are Repository's Going?

Name some repositories.

Participation!

Page 82: Where are Repository's Going?

Some things I consider to be Repositories

• Flickr

• Facebook

• Google Docs

• A filesystem of BagIts

• Scribd

• Slideshare

• Blogs (WP/etc)

• Wikis

• Twitter/Identi.ca

• IRs

• Domain Rs (Jorum, PubMed, etc)

• Publisher sites

• Forums

• CVS/SVN/Git/Hg

• WebDAV

• FTP

Page 84: Where are Repository's Going?

So, what do these repositories have in common? Standards? APIs?

Erm.... not much, but they all contain sets of things.

Page 85: Where are Repository's Going?

“Trying to get stuff into your repository?

No one gives a SIP...”

Page 88: Where are Repository's Going?

Realisation: Object transfer is still in a divergent state

• Lots of containers, lots of formats, too many conventions that you just have to know...

• There is no negotiation for the format of a SIP; you deal with what you are given.

• And sometimes, you just have to go and harvest what you can.

Page 89: Where are Repository's Going?

Don't Panic

Page 90: Where are Repository's Going?

Normal Archival Process

(paraphrased by an observer...)

Page 91: Where are Repository's Going?

Normal Archival Process

• Accept delivery of boxes of stuff, and record roughly what was received.

• THINGS GET PERMANENT IDS NOW!

• Even if it is just on a per-box basis. The item carries that provenance throughout.
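A sketch of the "permanent IDs now" step, assuming a random UUID is an acceptable identifier scheme (the talk does not prescribe one). The function name `accession` and the record fields are hypothetical.

```python
import uuid


def accession(description, item_labels):
    """Record a delivered box; give it a permanent ID that every item inherits."""
    box_id = uuid.uuid4().urn  # a 'urn:uuid:...' string, assigned once at the door
    items = [{"label": label, "provenance": box_id} for label in item_labels]
    return {"id": box_id, "description": description, "items": items}


if __name__ == "__main__":
    box = accession("Two boxes of papers, roughly sorted",
                    ["letters", "floppy disk"])
    print(box["id"])
    for item in box["items"]:
        print(item["label"], "came from", item["provenance"])
```

Nothing has been catalogued yet, but each item can already be traced back to the box it arrived in.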

Page 95: Where are Repository's Going?

Normal Archival Process

• Accept delivery of boxes of stuff, and record roughly what was received.

• Triage the contents within a stable environment.

– Deal with the fragile things first, things that will deteriorate.

– Try to sort out issues that arise, with the next of kin/donor.

– Some things may stay in the box for a long time...

Page 96: Where are Repository's Going?

Normal Archival Process

• Accept delivery of boxes of stuff, and record roughly what was received.

• Triage the contents within a stable environment.

– Deal with the fragile things first

– Try to sort out issues that arise

– Some things may stay in the box

– Identify actions that need to be taken to ensure future access.

Page 98: Where are Repository's Going?

Normal Archival Process

• Accept delivery of boxes of stuff, and record roughly what was received.

• Triage the contents within a stable environment.

• Characterise and catalogue the contents, using relevant tools.

• Update archival records so that people can find the content (if they are allowed to).

Page 99: Where are Repository's Going?

Digital Process

• Accept delivery of boxes of stuff, and record roughly what was received.

• Triage the contents within a stable environment.

• Characterise and catalogue the contents, using relevant tools.

• Update archival records so that people can find the content (if they are allowed to).

Page 100: Where are Repository's Going?

Digital Process

The media may be different.

And the tools may be too.

You are still likely to be doing something like this for a while.

Page 101: Where are Repository's Going?

Not all storage is the same

The absolute biggest benefit to any repository is to separate out the concerns of storage and services.

• It will make your life so, so much easier.

• Trust me.
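A small sketch of what that separation could look like, assuming storage is reduced to a two-method interface (`put` and `get`). The class and function names are hypothetical; the point is only that the service never learns which backend it is talking to.

```python
import hashlib
import os
import tempfile


class MemoryStore:
    """Keeps content in a dictionary."""

    def __init__(self):
        self._data = {}

    def put(self, key, content):
        self._data[key] = content

    def get(self, key):
        return self._data[key]


class DirectoryStore:
    """Keeps content as files under one directory."""

    def __init__(self, root):
        self._root = root

    def put(self, key, content):
        with open(os.path.join(self._root, key), "wb") as handle:
            handle.write(content)

    def get(self, key):
        with open(os.path.join(self._root, key), "rb") as handle:
            return handle.read()


def checksum_service(store, key):
    """A service that only knows the put/get interface, not the backend."""
    return hashlib.sha256(store.get(key)).hexdigest()


if __name__ == "__main__":
    content = b"Your content is constant."
    with tempfile.TemporaryDirectory() as root:
        for store in (MemoryStore(), DirectoryStore(root)):
            store.put("item-1", content)
            print(type(store).__name__, checksum_service(store, "item-1"))
```

Swapping the storage backend leaves the service untouched, and the same content yields the same checksum either way.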

Page 102: Where are Repository's Going?

Hardware, software, people and storage will come and go.

Your content is constant.

Page 103: Where are Repository's Going?

Ben Sally

Page 104: Where are Repository's Going?

Medical scientists at Oxford

“When it's one click deposit, I'll do it”

Page 105: Where are Repository's Going?

Bill Hubbard: Institutional Policies and Processes for Mandate Compliance. May 2009. http://www.sherpa.ac.uk/documents/OA%20Choices%20-%20researcher%27s%20view.ppt

Page 106: Where are Repository's Going?
Page 107: Where are Repository's Going?

What we need is:

Deposit by stealth and other easy solutions

Multiple Repository Deposit Regime (MuRDeR)

Answers to related problems that worry people such as multiple versions

Automation, automation, automation

Page 108: Where are Repository's Going?

Copyright ©

Page 109: Where are Repository's Going?

“There is a supreme irony that just as technology is allowing greater access to books and other creative works than ever before for education and research, new restrictions threaten to lock away digital content in a way we would never countenance for printed material.”

Dame Lynne Brindley, CEO The British Library

Copyright for Education and Research: Golden Opportunity or Digital Black Hole? http://www.bl.uk/ip/pdf/copyrightresearchreport.pdf

Page 110: Where are Repository's Going?

Legal Deposit as a parallel to repository mandates

Page 111: Where are Repository's Going?

Bodley’s agreement 1610

“That one Book of every sort that is new Printed, or Re-printed with Additions, be sent to the University of Oxford for the Use of the publick Library there, … to be sent to the Library at Oxford accordingly, upon pain of Imprisonment”

Order of the Star Chamber, 1637

http://www.british-history.ac.uk/report.aspx?compid=74953

Page 112: Where are Repository's Going?

“…there tower the few, the very few, Libraries of Deposit. These are the super-Dreadnoughts of the literary world, and the Bodleian claims to be among them … a really great library should have Universal scope, Independences, Size, Permanence, Wealth, and multiform Utility.”

The Bodleian Library at Oxford. Falconer Madan. 1919

http://www.archive.org/stream/bodleianlibrarya00mada/bodleianlibrarya00mada_djvu.txt

Page 113: Where are Repository's Going?

2. Management and preservation

Page 114: Where are Repository's Going?

“Preservation aims towards preserving access”

alancleaver_2000, Attribution 2.0 Generic, http://www.flickr.com/photos/alancleaver/2638883650/

Page 115: Where are Repository's Going?

Assured secure storage and permanent access need to be well-managed

Aided by intra- and inter-institutional advisory and support services

Page 116: Where are Repository's Going?

Shared and distributed expertise

Page 117: Where are Repository's Going?

Sally Ben

Page 119: Where are Repository's Going?

Why do people choose to interact with systems?

Disproportionate Feedback loops

Page 121: Where are Repository's Going?

Disproportionate Feedback Loop =>

The perception that a small effort leads to a very great benefit.

Which leads to more “little efforts” which add up!

Page 122: Where are Repository's Going?
Page 123: Where are Repository's Going?
Page 124: Where are Repository's Going?

High Scores

Technically trivial, but...

Psychologically addictive and drives a lot of replay

Page 125: Where are Repository's Going?

High Scores

IR High scores?

● Usage stats

● Re-usage stats (trackbacks, tweets, references)
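As the slide says, the mechanics are trivial. A sketch of an IR "high score" table, assuming each item already has a download count and a count of re-use mentions; the item names, the numbers and the tie-break rule are all invented for illustration.

```python
def high_scores(stats, top=3):
    """Rank items by downloads, breaking ties by re-use mentions."""
    ranked = sorted(
        stats.items(),
        key=lambda entry: (entry[1]["downloads"], entry[1]["reuse"]),
        reverse=True,
    )
    return [item for item, _ in ranked[:top]]


# Invented figures, for illustration only.
STATS = {
    "thesis-1": {"downloads": 120, "reuse": 4},
    "article-7": {"downloads": 120, "reuse": 9},
    "dataset-3": {"downloads": 45, "reuse": 30},
}

if __name__ == "__main__":
    for position, item in enumerate(high_scores(STATS), start=1):
        print(position, item)
```

The interesting part is not the sort but showing the table to depositors, so that a small effort visibly pays off.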

Page 126: Where are Repository's Going?
Page 127: Where are Repository's Going?

Ben Sally

Page 128: Where are Repository's Going?

Peer review is used in the UK for 3 main purposes:

1. Allocation of research funding

2. Publication of research in scientific journals. To assess the quality of research submitted for publication and to assess its importance.

3. Assess the research rating of university departments

http://www.parliament.uk/post/pn182.pdf

Parliamentary Office for Science and Technology

Page 129: Where are Repository's Going?

3. Dissemination

Page 130: Where are Repository's Going?
Page 131: Where are Repository's Going?

Links to: Actionable raw data; Data fusion

Links to: Interactive version; Google maps; additional visualisation

Page 132: Where are Repository's Going?

Open Access

Page 133: Where are Repository's Going?
Page 135: Where are Repository's Going?

“What many people fail to realise is that the uplands of this country once belonged to us, open common land, free for all to walk at will. They were only enclosed and parcelled off to the rich by acts of parliament pushed through by the rich. An old rhyme goes:

They hang the man and flog the woman
That steals the goose from off the common
Yet leave the greater villain loose
That steals the common from the goose.”

Mike Harding, The Guardian, Wednesday 18 April 2007 http://www.guardian.co.uk/environment/2007/apr/18/society.guardiansocietysupplement1

Page 136: Where are Repository's Going?

Free!

Page 137: Where are Repository's Going?

Green Open Access

Gold Open Access

Mandated open access

(as reported by SHERPA/Romeo)

Open options

Open access journals

Too complicated!

Some journals in UKPMC allow harvesting of the full text of all items, others allow it for only some items, and many do not allow it at all. See the PMC Open Access List for specifics.

Open Archives (OAI) Service

Operates two different open access models

Page 138: Where are Repository's Going?

Preview Time

Page 139: Where are Repository's Going?

You are here

Page 140: Where are Repository's Going?

[Figure: four charts of Evolution against Time, labelled Seismic change, Step change, Incremental change and Rapid change]

Page 141: Where are Repository's Going?

Trends

1. Entering a period of steady growth and change

2. Repositories as a set of services embedded within institutional systems

3. Names

Page 142: Where are Repository's Going?

Trends: Still waiting…

• Easy multiple deposit

• Collaboration between publishers, repositories, HEIs, government and research funders as a group

• Common policies

• Less complexity

Page 143: Where are Repository's Going?

Crystal ball by Hamachi! CC license. Attribution-Non-Commercial-No Derivative Works 2.0 Generic

Available at http://www.flickr.com/photos/mawari/2091456761/

Page 144: Where are Repository's Going?

Print-on-demand is going to be big

And I don't mean printed facsimiles.

Page 145: Where are Repository's Going?

What does having a book mean if you can print one in 5 minutes for £2?

http://www.ondemandbooks.com/home.htm

Page 149: Where are Repository's Going?

How about this scenario then? It's all technically possible now

You print off a set of articles into a book on the library's book-printer.

Your colleagues' comments, tweets and reviews are interleaved with the text.

Your colleagues were found from your professional social network.

Page 150: Where are Repository's Going?

How about this scenario?

You create a bookmark list of plates from 18th century books online which you believe to be the work of one anonymous artist.

This list with your comments is a new resource in and of itself, and can be commented on or printed as a book.

Page 151: Where are Repository's Going?

Permanent books, temporary magazines?

Is this true?

How about facsimiles that are printed, so that a student can study with their coffee and donuts?

And why print to paper? Why not print to digital paper, once it arrives...