BioHealth Informatics Group

<title>Introducing Metadata</title>

<speaker> <n> <sn>ROGERS</sn> <fn>JEREMY</fn> </n> <em>[email protected]</em> </speaker>

Outline

► What is metadata
  ► Dublin Core
► Who is using it?
  ► Case study: RHI
► Technologies
  ► Topic Maps and stand-off markup

What is metadata?

► Data about data
  ► Provenance (who wrote it?)
  ► How data is organised
  ► Subject matter
► Some analogies:
  ► Labels on a can
  ► A name tag
  ► Index of a book

Thanks to Andrew Salop, Adobe.com

(A metametadata slide)

Each metadata item comprises:

► A label to say what type of metadata follows
  ► On paper wrapped around a tin can we can expect (in some order):
    ► Name of product
    ► Country of origin
    ► Ingredients – expressed in E-numbers (or not)
    ► Illustration
    ► Quantity contained
    ► Best before date
► A reference to agreed meaning and scope of type
► A value
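The three parts of a metadata item can be sketched as a small record type. This is only an illustration: the class name, the field names, the example.org URIs and the tin-can values are all assumptions made up for the sketch, not part of any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataItem:
    label: str       # says what type of metadata follows
    definition: str  # reference to the agreed meaning and scope of the label
    value: str       # the value itself


# Hypothetical tin-can label; the definition URIs are placeholders.
can_label = [
    MetadataItem("Name of product", "https://example.org/terms/product-name", "Baked beans"),
    MetadataItem("Country of origin", "https://example.org/terms/origin", "United Kingdom"),
    MetadataItem("Best before date", "https://example.org/terms/best-before", "2005-10-31"),
]


def describe(item: MetadataItem) -> str:
    """Render one item the way it might be printed on the label."""
    return f"{item.label}: {item.value}"


for item in can_label:
    print(describe(item))
```

The point of the middle field is that two labels with the same wording only mean the same thing if they refer to the same agreed definition.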

Some examples: provenance

► Author
► Date authored
► Authoring tool used
► File format
► Etc…

Dublin Core Metadata
http://dublincore.org/documents/dces/

Title – the name of the resource
Creator – author
Subject – subject as keywords, classification codes
Description – can of worms
Publisher – entity responsible for making available
Contributor – responsible entity
Date – usually creation or publication date
Type – nature or genre of the content
Format – physical or digital manifestation
Identifier – unambiguous URI
Source – reference to contributing material
Language
Relation – a reference to a related resource
Coverage – jurisdiction, temporal validity
Rights – information about rights over the resource
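A minimal sketch of how the fifteen elements might be written into an HTML page header as `<meta>` tags with a `DC.` prefix. The function name `dc_meta_tags` and the sample record are assumptions for illustration; only the element names come from the list above.

```python
from html import escape

# The fifteen Dublin Core element names listed on the slide.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def dc_meta_tags(record):
    """Turn {element: value} into a list of <meta name="DC.x"> strings."""
    tags = []
    for element, value in record.items():
        name = element.lower()
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element!r}")
        # escape() protects quotes and angle brackets inside the value
        tags.append(f'<meta name="DC.{name}" content="{escape(value, quote=True)}">')
    return tags


# Hypothetical record describing this talk.
record = {"title": "Introducing Metadata", "creator": "Jeremy Rogers", "language": "en"}
for tag in dc_meta_tags(record):
    print(tag)
```

Rejecting unknown element names is a design choice of the sketch; real pages often carry qualified names such as `DC.Date.Created` as well.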

Metadata: a real example

<!-- Copyright DSTC, 1997, 1998. -->
<html>
<head>
<META name="DC.Title" content="Metadata.Net Home Page">
<META name="DC.Creator" content="Renato Iannella">
<META name="DC.Subject" content="metatdata, registry, editor">
<META name="DC.Date.Created" content="1997-03-12">
<META name="DC.Date.Modified" content="1998-07-23">
<META name="DC.Rights" content="Copyright DSTC Pty Ltd 1998">
<title>Metadata.Net Home Page</title>
</head>

<body bgcolor=#f7feff text=#000000 link=#0000ff vlink=#0000e0>
<h1> <img src="./images/md-title.gif" alt="Welcome to Metadata.Net Home Page"> </h1>

<font size=-1> Metadata.Net Locations: <a href="http://metadata.net">Australia</a>, <a href="http://www.ilrt.bris.ac.uk/discovery/mirrors/metadata.net/">United Kingdom</a> </font>

<h1>Metadata Projects</h1>
<UL>
<DT><img src="./images/tri.gif" align=center> <a href="http://metadata.net/harmony/">The HARMONY Project</a>
</UL>

…and another example

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US" xml:lang="en-US">
<head>

<title>Dublin Core Metadata Initiative (DCMI)</title>
<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/" />
<meta name="DC.date" content="2004-10-05" />
<meta name="DC.format" content="text/html" />
<meta name="DC.contributor" content="Dublin Core Metadata Initiative" />
<meta name="DC.language" content="en" />
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<meta name="DC.title" content="Dublin Core Metadata Initiative (DCMI) Home Page" />
<meta name="DC.description" content="The Dublin Core Metadata Initiative is an open forum engaged in the development of interoperable online metadata standards that support a broad range of purposes and business models. DCMI's activities include consensus-driven working groups, global conferences and workshops, standards liaison, and educational efforts to promote widespread acceptance of metadata standards and practices." />

<link rel="meta" href="index.shtml.rdf" />
<link rel="alternate" type="application/rss+xml" title="RSS" href="http://dublincore.org/news.rss" />
<link rel="stylesheet" href="/css/default.css" type="text/css" />
<script src="/js/default.js" type="text/javascript"></script>

</head>
<body>

<div class="headerhome">
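Embedded metadata of this kind is easy for software to read back. The sketch below uses Python's standard `html.parser` module to collect `<meta name=… content=…>` pairs and keep the ones whose name starts with `DC.`. The class name `MetaCollector`, the function `dublin_core` and the shortened sample page are assumptions for illustration.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags, keyed by lower-cased name."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name and content is not None:
            self.meta.setdefault(name.lower(), []).append(content)


def dublin_core(html_text):
    """Return only the metadata whose name starts with 'dc.'."""
    parser = MetaCollector()
    parser.feed(html_text)
    parser.close()
    return {k: v for k, v in parser.meta.items() if k.startswith("dc.")}


# Shortened sample mixing the two styles shown on the slides.
SAMPLE = """<html><head>
<META name="DC.Title" content="Metadata.Net Home Page">
<meta name="DC.date" content="2004-10-05" />
<meta name="robots" content="nofollow">
</head></html>"""

print(dublin_core(SAMPLE))
```

Names are lower-cased before comparison because the two example pages spell the prefix and element names with different capitalisation.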

Meanwhile… In medical publishing:

INFORMATION IN PRACTICE: Gunther Eysenbach, Thomas L Diepgen, J A Muir Gray, Maurizio Bonati, Piero Impicciatore, Chiara Pandolfini, and Subbiah Arunachalam. Towards quality management of medical information on the internet: evaluation, labelling, and filtering of information. BMJ, Nov 1998; 317: 1496 - 1502.

...we discuss a common measure that could solve both aspects at the same time: assigning "metadata" to internet information; both evaluative metadata to help consumers assess reliability and descriptive metadata to...

...A prerequisite for this approach is that internet information is labelled with "metadata" in a standardised format to allow software to search for and check information that is...

<meta name="robots" content="nofollow">

<META NAME="ROBOTS" CONTENT="NOARCHIVE">

So who is using metadata?

Top 10 medical journals by impact factor (1999)
Source: Journal Citation Reports (JCR) on CD-ROM, 1999 Science Edition, Journal Rankings Sorted by Impact Factor

[Figure labels: Robots, Date, Description, None]

Subject Metadata: What’s it all about?

Dublin Core Element Name: Subject
Label: Subject and Keywords
Definition: A topic of the content of the resource.
Comment: Typically, Subject will be expressed as keywords, key phrases or classification codes that describe a topic of the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme.

Issues:

What controlled medical vocabularies are there? More than 100
Which were designed for describing narrative? None of them
How many are formal classifications? Two
How do you use them in this context? Nobody knows

Case Study:

► Journalists write ~100 health related articles daily
► Delivered to clients en bloc or filtered into news ‘verticals’
► News verticals are dynamic
► How to filter?
  ► Including retrospectively if new vertical created

An example RHI story

Clinical

HRT effect on breast density not the same for all women

Last Updated: 2004-10-18 14:56:25 -0400 (Reuters Health)

By Anthony J. Brown, MD

NEW YORK (Reuters Health) - The extent to which hormone replacement therapy (HRT) increases breast density depends on the age at which it is started, the duration of use, and the patient's reproductive history, according to findings presented this week at the American Association for Cancer Research meeting in Seattle.

Increased "breast density can make tumors more difficult to detect on mammogram," lead author Erin J. Aiello, from the Group Health Cooperative in Seattle, told Reuters Health. "Identifying factors that influence the link between HRT and breast density may help clinicians and patients make a more informed decision about such therapy."

The new findings are based on a study of 46,436 women who underwent a total of 100,525 screening mammograms between January 1996 and June 2002. Questionnaires were administered at each mammogram session to assess HRT use as well as the presence of breast cancer risk factors.

Compared with never users of HRT, current and former users were 70% and 18% more likely, respectively, to have dense breasts.

In current HRT users, the odds of dense breasts were heightened by long duration of use, older age, rising parity, younger age at first birth, and natural menopause, the investigators state. In former users, the odds were increased by long duration of use, older age, and high BMI.

Further analysis revealed that combined estrogen-progestin HRT was more strongly associated with dense breasts than estrogen-only HRT, the researchers point out.

"Older age, having more children, and younger age at first birth are generally associated with a decrease in breast density," Aiello said. "Our findings indicate that HRT may actually eliminate those protective effects."

Aiello said her group is beginning a randomized trial to investigate whether stopping HRT a few weeks prior to mammography can reduce breast density and ease radiologic interpretation.

RHI News Verticals

Consumer Health News: AIDS News, Addiction News, Allergy News, Alternative Medicine News, Arthritis News, Asthma News, Blood Cancer News, Blood News, Cancer News, Cardiology News, Children's Health News, Cholesterol News, Dental News, Dermatology News, Diabetes News, Digestive Diseases News, Disease News, Drug & Device News, Fitness News, Genetics News, Gynecologic Oncology, Headache News, Health Care News, Health eLine, Health eLine - Important, Health eLine - Priority, Health eLine - Spanish, Health eLine - Portuguese, Heartburn News, Human Interest News, Hypertension News, Infectious Diseases News, Law & Ethics News, Liver News, Men's Health News, Mental Health News, Multiple Sclerosis, Neurology News, Nutrition News, Obesity News, Orthopedics News, Osteoporosis News, Pain News, Public Health News, Reproductive News, Research News, Respiratory News, Schizophrenia News, Senior News, Sexuality News, Sports Medicine News, Surgery News, Terminal Care News, Transplant News, Travel News, Urology News, Vaccine News, Wellness News, Women's Health News

Professional: AIDS, Addiction, All Professional News, Allergy, Alzheimer's Disease, Anxiety, Arthritis, Asthma, Bipolar Disorder, Cardiology, Child Psychology, Clinical, Depression, Dermatology, Diabetes, Drug & Device Development, Economic, Epidemiology, Ethics, Exercise, GERD, Gastroenterology, Genetics, Geriatrics, Gynecologic Oncology, Hematologic Cancer, Hematology, Human Interest, Hypertension, Industry, Industry Briefing - Priority, Infectious Diseases, Legal, Legislative, Liver, Managed Care, Media, Men's Health, Mental Health, Mental Health - Spanish, Migraine, Multiple Sclerosis News, Neurology, Nutrition, Obesity, Oncology, Ophthalmology, Orthopedics, Osteoporosis, Pain Management, Pediatrics, Policy, Political, Professional Development, Professional Medical News, Professional Medical News - Priority, Professional Medical News - Spanish, Public Health, Radiology, Regulatory, Reproductive Health, Respiratory Medicine, Schizophrenia, Science, Stroke, Surgery, Transplant, Travel, Urology, Vaccine, Vitamins, Women's Health

Available formats

► Text
  ► No markup, except organised into paragraphs
► (X)HTML
  ► Paragraphs marked up to indicate function
► RHI RDF
  ► As (X)HTML but with RDF header including semantic codes
► NewsML
  ► eNews industry standard format, includes RDF metadata

Under the bonnet

1. Journalist writes article
2. Neural net software suggests SNOMED terms
3. Lexical match pulls terms from other vocabularies
   E.g. drug lexicons, company names etc.
4. Typically 5-10 terms suggested
5. 50% of suggestions wrong: journalist deletes
6. Journalist adds further 2-4 codes
   3 minutes per article
7. Software adds all parent codes
   more general terms
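Step 7 can be sketched as a walk up a parent hierarchy. The sketch below is an assumption about how such a step might work, not a description of the RHI software: the `PARENTS` table uses made-up plain-English names, not real SNOMED codes, and the function name `with_ancestors` is hypothetical.

```python
# Hypothetical child -> parents table; a real vocabulary would supply this.
PARENTS = {
    "hormone replacement therapy": ["drug therapy"],
    "drug therapy": ["therapy"],
    "breast density": ["breast finding"],
    "breast finding": ["clinical finding"],
}


def with_ancestors(codes, parents):
    """Return the given codes plus every more general code above them."""
    expanded = set(codes)
    stack = list(codes)
    while stack:
        code = stack.pop()
        for parent in parents.get(code, ()):
            if parent not in expanded:  # also guards against cycles
                expanded.add(parent)
                stack.append(parent)
    return expanded


journalist_codes = {"hormone replacement therapy", "breast density"}
print(sorted(with_ancestors(journalist_codes, PARENTS)))
```

Adding the more general terms at indexing time means a vertical defined on a broad term will also pick up stories that were coded only with a narrow one.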

RHI Process Model

[Diagram: Incoming or existing News Stories → Vertically Specialized Vocabularies → Route to specific areas of interest]

Vocabulary labels: SNOMED, AIDS, NAICS, Pah, Companies, Glaxo, Business terms, regulatory

Areas of interest: Oncology, Urology, Proctology, Cardiology, Pathology, Neurology, Kidney Disease, Heart Disease, Alzheimer, AIDS

Flexible definition of vertical content packages to meet demands of many customers
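A minimal sketch of the routing step, assuming each vertical is defined as a set of codes and a story belongs to a vertical when the two code sets overlap. The `Story` class, the `route` function, the code names and the second story are all hypothetical; the first headline is the example story shown earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    headline: str
    codes: frozenset  # codes after the parent codes have been added


def route(stories, verticals):
    """Map each vertical name to the headlines whose codes overlap its code set."""
    return {
        name: [s.headline for s in stories if s.codes & wanted]
        for name, wanted in verticals.items()
    }


archive = [
    Story("HRT effect on breast density not the same for all women",
          frozenset({"hormone replacement therapy", "breast density", "women's health"})),
    Story("Hypothetical story about a new asthma inhaler",
          frozenset({"asthma", "respiratory medicine"})),
]

verticals = {
    "Women's Health": {"women's health"},
    "Respiratory Medicine": {"respiratory medicine"},
}
routed = route(archive, verticals)

# A new vertical can be applied retrospectively to the whole archive.
verticals["Radiology"] = {"breast density", "mammography"}
rerouted = route(archive, verticals)
print(rerouted["Radiology"])
```

Because the codes are stored with each story, defining a new vertical needs no re-reading of the article text, only a new query over the stored codes.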

Metadata technologies:

Topic Maps

What’s in an index?

La Bohème, 10, 70, 197-198, 326
Cavalleria Rusticana, 71, 203-204
The Girl of the Golden West, see La fanciulla del West
Leoncavallo, Ruggiero
    I Pagliacci, 71-72, 122, 247-249, 326
Madama Butterfly, 70-71, 234-236, 326
Manon Lescaut, 294
Mascagni, Pietro
    Cavalleria Rusticana, 71, 203-204
Puccini, Giacomo, 69-71
    La Bohème, 10, 70, 197-198, 326
    La fanciulla del West, 291
    Madama Butterfly, 70-71, 234-236, 326
    Manon Lescaut, 294
    Tosca, 26, 70, 274-276, 326
    Turandot, 70, 282-284, 326
Rustic Chivalry, see Cavalleria Rusticana
singers, 39-52, See also individual names
    baritone, 46
    bass, 46-47
    soprano, 41-42, 337
    tenor, 44-45
soprano, 41-42, 337
tenors, 44-45
Tosca, 26, 70, 274-276, 326
Turandot, 70, 282-284, 326

different types of topic (the names of operas are shown in italic)
different types of occurrence (references to synopses are in bold)
synonyms for the same topic
links to associated topics (see also)
associations between different topics (e.g. between a composer and his works)
supertype and subtype information

What’s a topic map?

► Very similar to an index except…
► Implicit information made explicit
  ► Not represented in typographical conventions
► Consistent use of topics within indexes
  ► Consistent use between indexes?

What is a topic map?

www.topicmaps.org
http://www.ontopia.net/topicmaps/materials/tao.html

► Means to convey knowledge about resources
  ► Knowledge embodied in a superimposed layer
  ► A ‘map’ of the resources
► Captures
  ► the subjects of which resources speak
  ► the relationships between those subjects
  ► implementation-independent

What is a Topic Map?

The central concepts in topic maps are:

Occurrences, e.g. a physical book, a web page (= the page numbers)
Associations, e.g. ‘is-author-of’ (= ‘see also’ links)
Topics, e.g. ‘Shakespeare’, ‘Hamlet’ (= the list of subjects in an index)
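The three concepts can be sketched as plain data structures. This is a toy model for illustration only, not an implementation of the Topic Maps standard; the class names, the `works_by` helper and the occurrence locator are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    occurrences: list = field(default_factory=list)  # where the topic is discussed


@dataclass(frozen=True)
class Association:
    kind: str
    roles: tuple  # (role name, topic name) pairs


topics = {
    "Shakespeare": Topic("Shakespeare"),
    "Hamlet": Topic("Hamlet", occurrences=["mybook, page 101"]),  # hypothetical locator
}

associations = [
    Association("is-author-of", (("author", "Shakespeare"), ("work", "Hamlet"))),
]


def works_by(author):
    """Follow 'is-author-of' associations from an author to the works."""
    found = []
    for association in associations:
        roles = dict(association.roles)
        if association.kind == "is-author-of" and roles.get("author") == author:
            found.append(roles["work"])
    return found


print(works_by("Shakespeare"))
print(topics["Hamlet"].occurrences)
```

Naming the roles (author, work) is what lets the association be read in either direction, which a plain 'see also' link in a printed index cannot do.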

History of Topic Maps

► 1993 first description of topic maps
► 2000 ISO/IEC standard 13250:2000
  ► Used HyTM
► 2001 minor update
  ► To include XTM
► Main protagonists today
  ► (a small but active community)
  ► Ontopia (Oslo)
  ► Techquila (Oxford)
  ► InfoLoom (New York)
  ► TopicMaps.Org

An example

Topics (numbered as in the diagram):

1 Shakespeare
2 Hamlet
3 Jonson
4 London
5 Stratford
6 Volpone
7 Play
8 Town
9 Person

[Diagram labels: Subjects, Occurrences, Type, Associations, Topics]

An Example in XTM

[Diagram labels: Shakespeare 1, Hamlet 2, Play 7, writtenBy, author, work]

<topic id="2">
  <instanceOf><topicRef xlink:href="#7"/></instanceOf>
  <baseName>
    <baseNameString>Hamlet, Prince of Denmark</baseNameString>
  </baseName>
  <baseName>
    <baseNameString>The Scottish Play</baseNameString>
  </baseName>
  <occurrence>
    <instanceOf><topicRef xlink:href="#plain-text-format"/></instanceOf>
    <resourceRef xlink:href="#mybook.hamlet.page101"/>
  </occurrence>
</topic>

<association>
  <instanceOf><topicRef xlink:href="#written-by"/></instanceOf>
  <member>
    <roleSpec><topicRef xlink:href="#author"/></roleSpec>
    <topicRef xlink:href="#1"/>
  </member>
  <member>
    <roleSpec><topicRef xlink:href="#work"/></roleSpec>
    <topicRef xlink:href="#2"/>
  </member>
</association>
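A minimal sketch of reading such an association back with Python's standard `xml.etree.ElementTree`. It assumes the fragment sits inside a `<topicMap>` root that declares the XTM 1.0 and XLink namespaces, which the slide omits; the ids `t1` and `t2` replace the slide's numeric ids because XML ids may not begin with a digit. The function name `read_associations` is hypothetical.

```python
import xml.etree.ElementTree as ET

XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"
NS = {"xtm": XTM}
HREF = f"{{{XLINK}}}href"  # ElementTree spells namespaced attributes as {uri}name

DOC = """<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/"
                   xmlns:xlink="http://www.w3.org/1999/xlink">
  <topic id="t1"><baseName><baseNameString>Shakespeare</baseNameString></baseName></topic>
  <topic id="t2"><baseName><baseNameString>Hamlet, Prince of Denmark</baseNameString></baseName></topic>
  <association>
    <instanceOf><topicRef xlink:href="#written-by"/></instanceOf>
    <member><roleSpec><topicRef xlink:href="#author"/></roleSpec><topicRef xlink:href="#t1"/></member>
    <member><roleSpec><topicRef xlink:href="#work"/></roleSpec><topicRef xlink:href="#t2"/></member>
  </association>
</topicMap>"""


def read_associations(xml_text):
    """Return [(association type, {role: player name})] for each association."""
    root = ET.fromstring(xml_text)
    names = {}
    for topic in root.findall("xtm:topic", NS):
        name = topic.find("xtm:baseName/xtm:baseNameString", NS)
        names["#" + topic.get("id")] = name.text if name is not None else topic.get("id")
    result = []
    for association in root.findall("xtm:association", NS):
        kind = association.find("xtm:instanceOf/xtm:topicRef", NS).get(HREF)
        roles = {}
        for member in association.findall("xtm:member", NS):
            role = member.find("xtm:roleSpec/xtm:topicRef", NS).get(HREF)
            # direct child topicRef is the player; the one in roleSpec is the role
            player = member.find("xtm:topicRef", NS).get(HREF)
            roles[role] = names.get(player, player)
        result.append((kind, roles))
    return result


print(read_associations(DOC))
```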

What are Topic Maps for?

► Indexing large resources you can’t write in directly
  ► Original ‘back of book index’ motivation
► Organising web sites
  ► Some elements on a web page generated from topic map
  ► E.g. www.ontopia.net/operamap/
► Expert systems
► Way of storing information flows & audit trails
  ► Or any other network of info, really

How do I make one?

► By hand
  ► Very labour intensive, but usually highest quality
  ► Some tools e.g. Ontopia, TMTab for Protégé, TMDesigner
► Semi-automagically
  ► If the original data is already well structured
  ► Some language processing tools can help
► Automatic Transformation
  ► Re-write other sources of the same information
► Useful websites
  ► www.ontopia.net
  ► www.techquila.com

Strengths of Topic Maps…

► Good for large or dynamic information sources
  ► E.g. ‘2001 Tax Products’ CDROM from US IRS
  ► E.g. www.quid.fr
► Better querying than a static index
  ► ‘Find all composers who have written an opera that was not first performed in Italy but was based on a work originally written by Shakespeare’
► Making it possible/easier to merge knowledge resources
  ► allegedly

…and limitations

► Not so good for very large information sources
  ► E.g. The entire world wide web (= ‘The Semantic Web’)
► The ontology problem
  ► Different topics used differently
  ► at different times & by different authors
► No inferencing (yet)
  ► E.g. IF Aria sung-in Act I AND Act I part-of Tosca THEN Aria sung-in Tosca
► Limitations in what can be said
  ► E.g. Child must have exactly one mother
  ► E.g. Mother of a baby kangaroo must also be a kangaroo
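The inferencing example can be sketched outside the topic map itself. The code below applies the single rule from the slide to a set of (subject, relation, object) facts until nothing new appears. It illustrates what an inference layer would add and makes no claim about any topic map tool; the function name `infer_sung_in` and the fact layout are assumptions.

```python
def infer_sung_in(facts):
    """Apply: IF x sung-in y AND y part-of z THEN x sung-in z, until stable."""
    inferred = set(facts)
    changed = True
    while changed:
        changed = False
        sung_in = [(s, o) for (s, p, o) in inferred if p == "sung-in"]
        part_of = [(s, o) for (s, p, o) in inferred if p == "part-of"]
        for thing, place in sung_in:
            for part, whole in part_of:
                new_fact = (thing, "sung-in", whole)
                if place == part and new_fact not in inferred:
                    inferred.add(new_fact)
                    changed = True
    return inferred


facts = {
    ("Aria", "sung-in", "Act I"),
    ("Act I", "part-of", "Tosca"),
}
for fact in sorted(infer_sung_in(facts)):
    print(fact)
```

Without such a rule the association 'Aria sung-in Tosca' would have to be entered by hand for every aria.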

Metadata technologies:

Stand-off markup

Stand-off vs Direct

► Direct markup
  ► Metadata is physically merged with data
  ► E.g. metadata tags within XML documents
► Stand-off
  ► Metadata is physically in a different data file
  ► Metadata tags include pointer to region of another file that is being marked up
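A minimal sketch of stand-off markup, assuming the pointer is a pair of character offsets into an untouched source text. The `StandoffTag` class, the file name `story.txt` and the labels are hypothetical; the text is the headline of the example story.

```python
from dataclasses import dataclass

# The marked-up document is never modified; it is only read.
DOCUMENTS = {
    "story.txt": "HRT effect on breast density not the same for all women",
}


@dataclass(frozen=True)
class StandoffTag:
    source: str  # which document is being marked up
    start: int   # offset of the first character of the region
    end: int     # offset just past the last character
    label: str   # the metadata attached to the region


def tag_phrase(source, phrase, label):
    """Create a tag pointing at the first occurrence of phrase in the source."""
    start = DOCUMENTS[source].index(phrase)
    return StandoffTag(source, start, start + len(phrase), label)


def marked_text(tag):
    """Resolve the pointer back to the text it marks up."""
    return DOCUMENTS[tag.source][tag.start:tag.end]


tags = [
    tag_phrase("story.txt", "HRT", "therapy"),
    tag_phrase("story.txt", "breast density", "finding"),
]
for tag in tags:
    print(tag.label, "->", marked_text(tag))
```

The tags could be stored in a separate file, or by a different person, without the author of `story.txt` being involved.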

Stand-off: Strengths & Weaknesses

► Strength
  ► You don’t have to have permission to write in the document being marked up
► Weakness
  ► The owner of the document you’ve marked up can change or move it without telling you
  ► Very difficult (but not impossible) to provide stand-off mark-up for dynamically created documents
► See http://cohse.semanticweb.org/
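One possible way to soften the weakness is to store a fingerprint of the document alongside each stand-off record, so that a later change can at least be detected. This is a sketch of one such approach under that assumption, not something described in the talk; the function names and record layout are hypothetical.

```python
import hashlib


def fingerprint(text):
    """SHA-256 digest of the document text as it was when marked up."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def make_standoff(text, start, end, label):
    """Build a stand-off record that remembers which version it points into."""
    return {"start": start, "end": end, "label": label, "sha256": fingerprint(text)}


def still_valid(record, current_text):
    """True if the document is unchanged since the record was made."""
    return record["sha256"] == fingerprint(current_text)


original = "HRT effect on breast density not the same for all women"
record = make_standoff(original, 0, 3, "therapy")

edited = "Update: " + original  # the owner changes the document without telling us
print(still_valid(record, original))
print(still_valid(record, edited))
```

Detection is all this gives: once the check fails, the offsets still have to be repaired by hand or by re-running whatever produced them.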