Semantic Web for
Libraries & Publishers
Charleston Conference
111103
Monday, November 21, 11
so, what’s the problem?
The Problem Set
Silos
More silos
Lots of different silos
Blue silos
Old silos
We in the library and publishing trades force readers, some of whom are authors as well, to search iteratively for information they want, need, or think might exist, in many different silos, using many different search engines, forms, and vocabularies. We do not make it easy for them to discover what is locally available, what is more or less easy to get, or everything that might be available. No wonder the young and foolish depend upon and believe in Google’s searches. Google is quick...and, in terms of relevance, very, very dirty.
We give them better interfaces, ones that permit refinement of results, to our holdings at the title level, BUT...
Simultaneously, we show them many other tools, each excellent in some ways, to continue their exploration of the literature. No single tool is comprehensive. We do not refer our clients to the Web, at least not on our own web sites!
Our OPACs refer to our holdings, while indexes and abstracts refer our readers to articles in journals that we may have licensed. SFX and similar services provide readers with links to the titles revealed to which we have subscribed. Neither our OPACs nor the secondary databases refer directly to more than a tiny percentage of the vast collection of pages that is the World Wide Web. The Web, of course, refers in fragmentary fashion to information resources we might, I emphasize, MIGHT have on hand for our readers.
And the results of using other, often very good, discovery tools differ in relevance ranking, format, and options from those we provide for our OPACs, thus adding confusion.
Some of us provide our readers with lots of databases to search. Too many, really, since all but a few of our readers are not forensic-level scholars.
Selecting a licensed database is an art in itself!
Once again notice that we rarely offer a web search engine as an option, and for good reasons. Nevertheless, the discoverable relevant information resources on the web apparently are not part of our repertory.
!!!
We have not conspired to make the search for relevant information objects difficult. We just have not yet had the tools, the methods, the vision, and yes, the gumption to try something new.
Ntl Cntr for Biotech Info
NSF CyberInfrastructure
quake engineering simulation
ATLAS at LHC -- 150 × 10^6 sensors
Here’s a teensy slice of the information and communication environment in which our faculty and students find themselves. And it gets more complex every day. Alas, the larger the number of websites indexed by Bing or Google or whatever search engine du jour, the more likely it is that the returns will be less pointed and less precisely matched to what the searcher hoped to find.
Too many silos.
Here’s the biggest of the lot...
One size fits all???
Does one size fit all?
Not quite. Even Google has silos and uses, as do others, clever interfaces to hide the fact of the silos.
Given all these silos and search engines, our users (authors, readers, teachers, students, people on the street, our nations) need us to find a better way. Facts about the information objects we have acquired or leased, and facts about the books, articles, films, and so forth that we have published, need to be found in the wild, on the web. Ideally, we librarians and publishers will make the facts about what we have and what we are making public, for fun or profit, discoverable on the Web.
Discovery & Access
... the problems
Let’s dwell on the problems briefly...
1. Too many stovepipe systems
2. Too little precision with inadequate recall
3. Too far removed from the World Wide Web (W3)
1. Too many stovepipe systems
The landscape of discovery & access services is a shambles
It can’t be mapped in any logical way
• not by us (the supposed information pros)
• not by the faculty & students who must navigate the chaos
This state of affairs shouldn’t be a surprise
2. Too little precision with inadequate recall
Some of the problem ... too many stovepipe systems
• dumbing-down effects of federation often hinder explicit searches
• each interface has its own search-refinement tricks
• numerous, overlapping discovery paths hamper full recall
Most of the problem ... limitations in the design & execution of infrastructure that supports discovery & access
the 1st limiting factor ... ambiguity
Most of our metadata uses a string of bytes to label a semantic entity [person, place, thing, event, ...]
• discovery based on matching text labels
• not on the gist of semantic entities
For libraries, the fix is authorities
• authoritative forms of strings (names, organizations, titles, places, events, topics, etc.) work to improve precision and recall
hold on ... what about cases where no one-to-one relationship exists between a string-of-text label & the underlying semantic entity?
Take for example the text string: Jaguar
byte string: 4a 61 67 75 61 72
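The ambiguity can be made concrete with a minimal Python sketch. It decodes the byte string from the slide and looks the label up in a small table. The `ex:` identifiers, the entity kinds, and the `candidates` helper are all invented for illustration; no real authority file is being quoted.

```python
# The six bytes shown on the slide, written as hexadecimal.
label_bytes = bytes.fromhex("4a 61 67 75 61 72")
label = label_bytes.decode("ascii")  # the bytes spell "Jaguar"

# Hypothetical entity identifiers and kinds, invented for this sketch.
ENTITIES_BY_LABEL = {
    "jaguar": [
        ("ex:animal/jaguar", "animal"),
        ("ex:company/jaguar-cars", "company"),
        ("ex:software/mac-os-x-10.2", "software"),
        ("ex:instrument/fender-jaguar", "musical instrument"),
    ],
}

def candidates(text_label):
    """Return every entity that shares the given text label."""
    return ENTITIES_BY_LABEL.get(text_label.lower(), [])

matches = candidates(label)
print(label, "->", len(matches), "candidate entities")
for identifier, kind in matches:
    print(" ", identifier, "(" + kind + ")")
```

Matching on the string alone cannot choose among the candidates. Only the identifier behind the label tells a reader, or a machine, which Jaguar is meant.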
... a rose is a rose is a rose
[Figure: things that share the label “Jaguar”, grouped by kind]
• company: Ltd.
• cars: E-Type (UK) or XK-E (US), mftg 1961 to 1974; XK series, in production since 1996
• hardware & software: Macintosh OS X 10.2; Atari videogame console
• music: Fender electric guitar, introduced in 1962; heavy metal band formed in Bristol, England, Dec 1979; Philadelphia-based singer/songwriter Jaguar Wright
• military: type 140 Jaguar class fast attack craft [torpedo], Germany WWII; XF10F prototype swing-wing fighter, early 1950s, Grumman; Anglo-French ground attack aircraft
• heroes: The Jaguar is a superhero published by Archie Comics; DC Comics' Impact series, ... loosely based on Archie Comics' character
• pro football: Jacksonville
• Prrrrr
• etc.
John Giannandrea, CTO, Metaweb
Imagine this keyword search and realize the ambiguity of the term “jaguar”.
Inspired by John Giannandrea, CTO, Metaweb ... from his presentation at PARC in April, 2008.
the 2nd limiting factor ... instance-based metadata
Most of our metadata focuses on publication artifacts
• identify responsibility for its creation
• list topical headings
For simple cases ... few worries
• as with ambiguity, one-to-one relationships pose few problems
• things work for authors with a few books in several editions
But, as complexity increases, precision & recall suffer
Prolific authors ...
search: Shakespeare’s Hamlet 811 entries
Wading thru search results for authors like Shakespeare shows clearly the effects that instance-based metadata has on precision & recall
Unflagging patience marks the task of flipping back & forth between hundreds of brief and full records to sort thru the varied instances of a single entity, e.g.
• critical editions based on primary sources
• 18th & 19th century collections of the plays
• social, historical and literary essays
• histories & critiques of such writings
• video and audio recordings of performances
• reviews and indices of the same
• treatments of stagecraft, costumes, music
• life & works of notables associated with the plays (e.g., performers, directors)
• other art forms inspired by the plays
A Socrates (Stanford Libraries OPAC) keyword search for the terms shakespeare and hamlet
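To see what instance-based metadata costs the reader, here is a small Python sketch, under loud assumptions: the four records, the `work` identifiers, and the `group_by_work` helper are hypothetical and are not drawn from the Socrates catalog. It shows how a flat list of artifact records collapses once each record points at the entity it instantiates.

```python
from collections import defaultdict

# Hypothetical instance-level records, one per publication artifact.
RECORDS = [
    {"title": "Hamlet", "kind": "critical edition",
     "work": "ex:work/hamlet"},
    {"title": "Hamlet, Prince of Denmark", "kind": "video recording",
     "work": "ex:work/hamlet"},
    {"title": "The Tragedy of Hamlet", "kind": "19th-century collection",
     "work": "ex:work/hamlet"},
    {"title": "Hamlet and Oedipus", "kind": "critical essay",
     "work": "ex:work/hamlet-and-oedipus"},
]

def group_by_work(records):
    """Collect instance records under the work entity each one points to."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["work"]].append(record)
    return dict(grouped)

for work, instances in group_by_work(RECORDS).items():
    kinds = ", ".join(r["kind"] for r in instances)
    print(work, "->", len(instances), "instance(s):", kinds)
```

A keyword match on "hamlet" returns all four records as equals. Grouping by work identifier separates the play from writing about the play, which is the sorting the reader otherwise does by hand.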
3. Too far removed from the World Wide Web (W3)
Together, our metadata & collections make up a big chunk of the “dark web”
[ info resources that search-engine spiders can’t see ]
It’s clear that visibility on the web promotes dramatic increases in discovery and access
• Library of Congress & Smithsonian images (Flickr)
• SULAIR’s HighWire Press ( > 2x increase via Google)
The state of affairs is well known ...
Our Working Environment
library
academy
produce
provide
publisher
Scholars & students
Here is a schematic to suggest how our ecosystem works. It is more complex, of course, but the basics are embodied here.
internet
Once upon a time…the Internet
And here is the way the e-discovery and e-communication environment is developing. First there was the Internet. Prophets such as Vannevar Bush, Ted Nelson, and Doug Engelbart showed us the way.
internet
Then…the World Wide Web
web of pages
Thanks to another prophet, Tim Berners-Lee, the Internet, a network of communicating computers, became a web of pages of information. Scholarly journal publishers and some librarians realized early on that there were functional advantages to scholarship and to publishing in the web of pages. Yahoo, Google, and others realized that mining the web of pages by the words on those pages could make the rapidly growing web of pages reveal more, through indexing and cataloging the web. Indexing, as we now know, won out over cataloging.
The next thing is the subject of this talk. It is the web of data. It is the web of relationships constructed and expressed so that both computers and humans can identify and understand relationships in that web. The web of data lives with the web of pages and is carried on the Internet, the global carrier.
internet
web of pages
web of data
Under construction
This web of data is the next big thing in discovering relevant information objects and the next big thing in empowering individuals, communities, and industries in making better use of information that they or others create. What distinguishes this web of data, this linked data environment, is the principle of identifying entities, virtual and real, by statements of relationships and descriptions in machine-readable form. More about this as we go along.
internet
web of pages
web of data
aka Linked Data
Under construction
We are calling this next phase the Linked Data phase, because it is entirely dependent upon statements of relationships and descriptions in machine-readable form, but this phase may be only a precursor to another, more complex and more difficult web world to engineer. The next phase is the Semantic Web, which in theory allows the machine-readable relationships and descriptions to interoperate to satisfy a person’s requirements, albeit without constant interaction. In short, in the Semantic Web, the machines will understand meaning and presumably act on it. Scary, eh?
Construction Tools
How do we work to alleviate our problems as information professionals, librarians and publishers?
Recipe for creating the web of data
• identify people, places, things, events, and other entities embedded in the knowledge resources that a research university consumes and produces
• tie those facts together with named connections
• publish the relationships as crawl-able links on the web
Build/use apps supporting discovery via the web of data
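The recipe can be sketched in a few lines of Python. This is a rough illustration, not a production pipeline: every URI under `example.org`, the relationship names, and the `to_ntriples` helper are assumptions made up for the sketch. The output uses the W3C N-Triples line format, one statement per line.

```python
# Hypothetical URIs under example.org, invented for this sketch.
BASE = "http://example.org/"

# Steps 1 and 2: identified entities, tied together with named connections.
FACTS = [
    (BASE + "person/shakespeare", BASE + "rel/authorOf", BASE + "work/hamlet"),
    (BASE + "work/hamlet", BASE + "rel/hasEdition", BASE + "edition/hamlet-1604"),
]

def to_ntriples(facts):
    """Step 3: render each fact as one N-Triples line of three URIs."""
    return "\n".join("<%s> <%s> <%s> ." % fact for fact in facts)

print(to_ntriples(FACTS))
```

Publishing a file of such lines at a stable web address is one simple way to make the relationships crawl-able. Real deployments usually add richer serializations and content negotiation.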
Here is a pile of words representing all the words on the web that most search engines index constantly. Good search engines today can do a lot with this pile. BUT the search engines create the perception of relationships based not on meaning but on other factors, such as the number of links to a site containing the words of interest or the traffic to a site.
From this pile of words, structure!
The Linked Data approach attempts to structure the pile in anticipation of the need for discovery. That structure is based on meaning, on relationships. I will make this clearer in the next slides.
Here’s a graph of a very few relationships to Yo-Yo Ma, the great cellist.
Linked Data Web
Here’s a graph of relationships to Haggis, just a fun one I could not resist throwing in. Meaning is provided by understanding relationships.
RDF triples & URIs
• RDF triples = subject – predicate – object
– A way to describe objects or even ideas on the web
– An object or idea might have many RDF triples describing it
– Objects or ideas need not exist on the web!
• URIs = Uniform Resource Identifiers
– Allow machine interaction among Web objects
– Various syntactical schemes & protocols used to construct URIs
– At least 3 needed to support an RDF triple (subject – predicate – object)
Geek ingredients for the construction of the Linked Data Web. RDF means Resource Description Framework. Each RDF statement is expressed as a simple sentence, though multiple such statements might attach to a single entity. In fact, we need multiple RDFs in this scheme.
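As a minimal sketch of these ingredients, the Python below models a triple as a three-part tuple and finds statements by pattern. The `Triple` type, the `match` helper, and every `example.org` URI are hypothetical. One statement carries a plain text value in the object position, which RDF allows alongside URIs.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str

# Hypothetical URIs, invented for this sketch.
YO_YO_MA = "http://example.org/person/yo-yo-ma"
PLAYS = "http://example.org/rel/plays"
NAME = "http://example.org/rel/name"
MEMBER_OF = "http://example.org/rel/memberOf"
CELLO = "http://example.org/instrument/cello"

GRAPH = [
    Triple(YO_YO_MA, PLAYS, CELLO),
    Triple(YO_YO_MA, NAME, "Yo-Yo Ma"),  # object is a literal text value
    Triple(CELLO, MEMBER_OF, "http://example.org/family/strings"),
]

def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]

for triple in match(GRAPH, subject=YO_YO_MA):
    print(triple.predicate, "->", triple.obj)
```

Several triples share the same subject here, which is the point made above: one entity gathers many statements, and each statement is a single short sentence.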
A graph of RDF statements and URIs
The Linked Data Principles
1. Use URIs as names of things (people, places, times, objects, ideas...anything really)
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful RDF information
4. Include RDF statements that link to other URIs so that they can discover related things
The really great aspect of RDFs is that they can refer to ideas, not just to physical or virtual entities. Any kind of idea could be treated.
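Principles 3 and 4 can be imitated without a network. In the Python sketch below, an in-memory table stands in for what a web server would return when each HTTP URI is looked up. The URIs, the table contents, and the helper names `look_up` and `discover` are assumptions made for illustration only.

```python
# Hypothetical HTTP URIs and descriptions, standing in for the statements
# a web server would return when each URI is looked up.
DESCRIPTIONS = {
    "http://example.org/dish/haggis": [
        ("http://example.org/rel/label", "Haggis"),
        ("http://example.org/rel/originatesIn",
         "http://example.org/place/scotland"),
    ],
    "http://example.org/place/scotland": [
        ("http://example.org/rel/label", "Scotland"),
    ],
}

def look_up(uri):
    """Principle 3: looking up a URI yields useful statements about it."""
    return DESCRIPTIONS.get(uri, [])

def discover(start_uri):
    """Principle 4: follow links to other URIs to find related things."""
    seen, queue = [], [start_uri]
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.append(uri)
        for _predicate, value in look_up(uri):
            if value.startswith("http://"):  # a link, not a text value
                queue.append(value)
    return seen

print(discover("http://example.org/dish/haggis"))
```

Starting from one name, the walk reaches a related thing only because the first description links to it. That is the whole mechanism of discovery in the web of data, here in miniature.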
Library Metadata
• Library metadata standards closed
• “Passive” metadata, searchable, but…
• In silos
• Readable, but not actionable
• Search results refinable, but final
These are some of the edges of the problem of library metadata.
Library Metadata
• Library metadata standards closed
• “Passive” metadata, searchable, but…
• In silos
• Readable, but not actionable
• Search results refinable, but final
Semantic Web Metadata
• Open
• Dynamic, contextualized
• In the wild
• Interactive, responsive
• Leading to other queries & views
And here is the comparison between the library metadata scene now and the one we advocate for the Linked Data/Semantic Web. Library metadata in the Linked Data Web should be freely available, constantly updated, and often reconciled with RDF triple statements from non-library sources. Library Linked Data should be entirely open on the web.
Make Library bibliographic facts into RDFs & URIs;
Release them into the wild.
Make Library Linked Data OPEN.
I should add that accounting for physical objects in our collections, locating them, making our collections auditable, and managing our collections seems to be possible using Linked Data too, at least in principle.
What about Publishers?
Publishers & Societies making use of Linked Data
• Aggregate content in their own realms & beyond
• Aggregate information about
– Conferences
– Career building & employment opportunities
– Communities in collaboration
– Commercial & other services supporting research with specimens, source material, processing, trials
– Productive relationships with others
• Provide actionable, constantly updated links in support of scholars, teachers, and learners
• Provide compelling services tying users to them
Libraries too can use Linked Data to reveal and advertise compelling services offered to their clients.
Semantic Web adopters
Here are some of the big players in the Linked Data / Semantic Web world. The British Library has released RDFs/URIs for the entire British National Bibliography. The Library of Congress has released the same for LCSH & Name Authority Files. LCSH includes links to AGROVOC, RAMEAU, DNB, GLIN Subject Thesaurus, and the National Agriculture Library's Subject Index. Every Personal and Corporate entry in LC/NAF links to VIAF, the Virtual International Authority File based at OCLC. The N Y Times 18 months ago made all 500,000 (and growing) of its index terms available in the wild as RDFs and URIs.
For publishers and libraries...though we should not neglect services.
...if users can find it in their own context
Context
Content
Users
Users = readers, authors, teachers, students
Context
Content
Users
Publishers must make content VISIBLE
I am using the imperative here, because invisible published content means invisible benefit to the author and/or the publisher.
Here is a recent PLoS article from PLoS Neglected Tropical Diseases.
And here is the semantically enhanced version of this article, with enhancements provided by David Shotton et al. in the form of links to further information, interactive figures, a re-orderable reference list, citations in context, and tag trees. These enhancements took 10 man-weeks in 2009! However, with the growing ecology of linked data, much of this could be accomplished by auto-tagging and algorithmic construction of the basic RDFs & URIs for the unique article. Microdata submitted by some publishers and their supporting services to schema.org lead to these exciting possibilities.
aggregation
Aggregation counts, but think how much more we would get if we could aggregate from libraries, publishers, and the wild and weird variety of sources on the web.
Disambiguation
RDFs and URIs can operate in many languages, and relationships can be expressed across languages, a potentially big benefit to research and to collaboration in research.
Web of Data Progress
2007
FOAF = Friend of a Friend. Hundreds of millions of RDFs/URIs. Fortunately they do not take much space in memory!
2011
This is the 2011 graph of entities supplying RDFs and URIs. Now the population is in the hundreds of billions, heading to trillions.
http://inkdroid.org/lod-graph/
Encouragement
Examples
Linked Open Data Value Proposition
• Linked open data (LOD) puts information where people are looking for it: on the Web;
• LOD can expand discoverability of our content;
• LOD opens opportunities for creative innovation in digital scholarship and participation;
• LOD allows for open, continuous improvement of data;
• LOD creates a store of machine-actionable data on which improved services can be built;
• Library linked open data might facilitate the breakdown of the tyranny of domain silos;
• LOD can provide direct access to data in ways that are not currently possible;
• LOD provides unanticipated benefits that will emerge later as the stores of LOD expand exponentially.
A product of the Stanford/CLIR Linked Data Workshop, June 2011.
25 participants from the British Library, the Bibliothèque nationale de France, the Deutsche Nationalbibliothek, the Royal Library of Denmark, Aalto University in Finland, the Library of Congress, the Bibliotheca Alexandrina, the National Institute of Informatics of Japan, Google, Seme4, Emory, University of Virginia, University of Michigan, California Digital Library, Knowledge Motifs, CLIR, and Stanford.
Google using Stanford bib facts + web resources
This is a movie of a live interaction with Freebase using bibliographic facts from Stanford, and linked information resources from the web. It shows in a limited way the potential for discovery and retrieval in the Linked Data Web.
BnF using data only from its catalogs & Gallica
This is another movie of the Linked Data prototype based entirely on bibliographic facts from the BnF catalogs and digital texts in Gallica. There are no other web resources drawn into this prototype...yet.
A Bibliographic Framework for the Digital Age (October 31, 2011)
• “The new bibliographic framework project will be focused on the Web environment, Linked Data principles and mechanisms, and the Resource Description Framework (RDF) as a basic data model. The protocols and ideas behind Linked Data are natural exchange mechanisms for the Web that have found substantial resonance even beyond the cultural heritage sector. Likewise, it is expected that the use of RDF and other W3C (World Wide Web Consortium) developments will enable the integration of library data and other cultural heritage data on the Web for more expansive user access to information.”
Deanna Marcum, Associate Librarian of Congress, introducing a transition from MARC.
We in the cultural heritage and knowledge management institutions are discovering better ways of publishing, sharing, and using information by linking data and helping others do the same. Through this work, we have come to value and to promote the following practices:
1. Publishing data on the web for discovery and use, rather than preserving it in dark, more or less unreachable archives that are often proprietary and profit driven;
2. Continuously improving data and Linked Data, rather than waiting to publish “perfect” data;
3. Structuring data semantically, rather than preparing flat, unstructured data;
4. Collaborating, rather than working alone;
5. Adopting Web standards, rather than domain-specific ones;
6. Using open, commonly understood licenses, rather than closed and/or local licenses.
Value Proposition for LAMs
from the Stanford/CLIR Workshop on Linked Data, June 2011
In each couplet, we emphasize the first half, before “rather than”, admitting that sometimes the second half of the couplet has to be operative.
DARPA Internet
This is where we started 2.5 decades ago.
World Wide Web
Thanks to Tim Berners-Lee and many others, we advanced in this environment from the early 1990s until today.
SOCIAL WEB
We cannot ignore the social web that exists in the current WWW, but think how much more, some of it scary, could be done in the Linked Data Web with the behaviors of the Social Web.
Linked Data Web
Just that funny reminder of the fundamental nature of the Linked Data Web: expressing machine-actionable relationships.
Semantic Web
And in the next web, the Semantic Web, who knows what may be possible.
Ubiquitous computing
To the progression of network types, we need to add a couple of enormously important environmental factors. Ubiquitous computing is a very important one. Having lots of computers on the net makes the possibility of an open global linked data web very strong.
Mobility
And our ability to communicate by voice (how about that Siri?) and by bits/bytes from everywhere, is, perhaps, just another aspect of ubiquitous computing.
Ubiquitous Computing
Mobile
Internet
Web
Social Web
Linked Web
The black box in the upper right corner is the Semantic Web, a level of sophistication yet to be achieved. The linked data web is at hand, though.
Will librarians and publishers join the development of the Linked Open Data web? I certainly think we should.
NO MORE SILOS ARE NEEDED or wanted.
W3C Library Linked Data Incubator Group
http://www.w3.org/2005/Incubator/lld/
A Bibliographic Framework Initiative General Plan for the Digital Age (October 31, 2011)
http://www.loc.gov/marc/transition/news/framework-103111.html
Linked Data Survey & Workshop, June 2011
http://www.clir.org/pubs/archives/linked-data-survey/