SMS Management & Technology
Semantic Analysis in IA
Matthew Hodgson
ACT regional-lead, Web and Information Management
23 Sept 2007
Jeffrey Veen on analysing content
“a mind-numbingly detailed odyssey through your web site...
…this process…is a relatively straightforward process of clicking through your web site and recording what you find.”
Source: http://www.adaptivepath.com/ideas/essays/archives/000040.php
Content overview – first take
Medical restrictions text
  Free-text built in Word and hand-crafted (*grrr*)
  Unclassified
  Varied consistency within and between texts
  Highly complex sentence structures in pseudo-legalese
  Style reflects the author rather than the meaning in the communication
Content needed for re-use
  Content output was needed for reuse by others
  Multiple audiences
  Multiple purposes for re-use
Codification
  Codification (after authoring) takes too long
  Need to reduce timeframes!
The task... analyse and codify
[Diagram: boxes labelled Concept 1 to Concept 5]
Linguistics… a whole discipline devoted to the study of language
“You’re joking!?”
  All language has structure – even someone’s pseudo-legal English
  Analysing language is actually easier than you might think
The approach
Analyse semantics of content
  There is a predictable structure
  It’s all just Lego™ building blocks (nouns, verbs, adjectives, etc.)
  Implied meaning can be made overt
New tools for IAs to play with!
  Understand semantics, the structure of sentences, and you can analyse, categorise and codify English!
Language as Lego™
Building blocks
  Subject (S)
  Verb (V)
  Object (O)
Order of blocks
  Differs depending on the language
Order from chaos
SVO languages
  English, French, Chinese, Bulgarian, Swahili
SOV
  Japanese, Turkish, Korean
VSO
  Classical Arabic, Celtic and Hawaiian
VOS
  Fijian, Yoda’s amusing phrases
Subjects, verbs and objects
Subject: The apple
Verb: is
Object: a red apple
Sometimes, though, the SVO structure is hidden: “The apple is red” or “The apple is a red apple”?
Uncovering the hidden structure helps to differentiate between the subject and the object and identify the who and what
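The three building blocks can be sketched in a few lines of Python. This is only an illustration, not something from the talk: the `Clause` class and its `arrange` method are hypothetical names, and the sketch simply stores the subject, verb and object as strings so the blocks can be laid out in any of the word orders listed earlier (SVO, SOV, VSO, VOS).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Clause:
    """A clause split into its three building blocks (hypothetical helper)."""
    subject: str
    verb: str
    obj: str

    def arrange(self, order: str = "SVO") -> List[str]:
        # Lay the blocks out in the requested order, e.g. "SOV" or "VSO".
        blocks = {"S": self.subject, "V": self.verb, "O": self.obj}
        return [blocks[letter] for letter in order]

    def sentence(self) -> str:
        # English uses subject-verb-object order.
        return " ".join(self.arrange("SVO"))


# Surface form: the object noun is left implicit.
surface = Clause(subject="The apple", verb="is", obj="red")

# Overt form: the implied noun is restored, so the object is a full noun phrase.
overt = Clause(subject="The apple", verb="is", obj="a red apple")

print(surface.sentence())
print(overt.sentence())
print(overt.arrange("SOV"))
```

Keeping the blocks separate is what makes the who (subject) and the what (object) easy to pull out later.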
Sentences as (apple) trees
The apple is a red apple
Root
  NounPhrase (SUBJECT): Det "The", Noun "apple"
  VerbPhrase
    Verb(be) (VERB): "is"
    NounPhrase (OBJECT): Det "a", Adj "red", Noun "apple"
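A sentence tree like this one can be held as nested data. The sketch below is an assumption about one convenient representation (nested tuples, with hypothetical helper names `leaves` and `show`), and it assumes the object noun phrase hangs under the verb phrase. It is not the output of any particular parser.

```python
# Each node is (label, children); a leaf is (tag, word).
TREE = (
    "Root",
    [
        ("NounPhrase", [("Det", "The"), ("Noun", "apple")]),
        (
            "VerbPhrase",
            [
                ("Verb(be)", "is"),
                ("NounPhrase", [("Det", "a"), ("Adj", "red"), ("Noun", "apple")]),
            ],
        ),
    ],
)


def leaves(node):
    """Return the words under a node, left to right."""
    _label, content = node
    if isinstance(content, str):
        return [content]
    words = []
    for child in content:
        words.extend(leaves(child))
    return words


def show(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    label, content = node
    if isinstance(content, str):
        return ["  " * depth + f"{label}: {content}"]
    lines = ["  " * depth + label]
    for child in content:
        lines.extend(show(child, depth + 1))
    return lines


subject, predicate = TREE[1]
print(" ".join(leaves(subject)))   # words of the subject noun phrase
print("\n".join(show(TREE)))       # the whole tree, indented
```

Walking the tree this way gives the subject and object phrases as separate units, which is the starting point for codifying them.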
Semantic analysis
Medical restrictions wording:
  Restricted benefit: Gastro-oesophageal reflux disease; Scleroderma oesophagus;
  Authority required: Peptic ulcer
Semantic analysis (cont.)
Actual sentence
  Peptic ulcer
Implied sentence
  The prescription of medicine is restricted to the initial treatment of patients with peptic ulcer
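Making the implied meaning overt can be sketched as filling a sentence template. The template text comes from the implied sentence above; the function name `implied_sentence`, the `stage` parameter and the lower-casing rule are assumptions made for this illustration only.

```python
TEMPLATE = (
    "The prescription of medicine is restricted to the "
    "{stage} treatment of patients with {condition}"
)


def implied_sentence(condition: str, stage: str = "initial") -> str:
    """Expand terse restriction wording into the full implied sentence."""
    # Drop surrounding whitespace and any trailing list separator.
    condition = condition.strip().rstrip(";").strip()
    if not condition:
        raise ValueError("condition must not be empty")
    # Lower-case the first letter because the condition now sits mid-sentence.
    # (Assumption: conditions named after a person would need an exception.)
    condition = condition[0].lower() + condition[1:]
    return TEMPLATE.format(stage=stage, condition=condition)


print(implied_sentence("Peptic ulcer"))
```

The same function applied to "Scleroderma oesophagus;" gives the matching sentence for that condition, which shows how a single template covers a whole family of terse entries.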
the prescription of medicine is restricted to the initial treatment of patients with peptic ulcer
[Sentence tree: Root branches into a NounPhrase (SUBJECT) and a VerbPhrase, with nested PreposPhrase and NounPhrase nodes; words are tagged DET, N, P, AUX, V and ADJ]
WHO TREATED? + WHAT CONDITION? = ACTION REQUIRED
[Diagram: many restriction sentences mapped onto one Subject, Verb, Object structure. Key: ADJ, NOUN, PREP, VERB]
Subject: the prescription of medicine
Verb: is restricted to the
Object: initial treatment of patients
WHO TREATED?
  Initial or continuing: initial; continuing
  Patient descriptors (population/group): 70 year old; mother; pregnant; female; male; other ADJ
  Existing treatment descriptors of population: receiving cytotoxic chemotherapy; receiving a 5HT3 antagonist; receiving radiotherapy; not responding to analgesics; not responding to other anti-epileptic drugs; receiving treatment 2 years; having surgery; unable to take solid form of topiramate
  PBS subsidised: receiving PBS-subsidised bDMARD treatment; not previously receiving PBS-subsidised bDMARD treatment
  Limitation of prescribing to a specific specialist group: treated by dermatologist; treated by clinical immunologist
WHAT CONDITION?
  Condition being treated: with nausea and vomiting; with advanced psoriasis; with peptic ulcer; with malignant tumor; with scleroderma oesophagus; with chronic pain; with seizures; with hormone dependent metastatic breast cancer
  Measures/descriptors of condition severity (ADJ): partial; incomplete resolution; no indication of
ACTION REQUIRED
  Practical aspects: complete Authority action sheet; include whole body area diagrams; treat for period of time; provide previous history; prescribe number of repeats; contact Medicare; obtain Authority number; record details of doctor; record date; sign form
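The WHO TREATED, WHAT CONDITION and ACTION REQUIRED parts can be pictured as one codified record. The sketch below is an assumption about how such a record might look, not the system that was built: the class and field names are hypothetical, and the example values are taken from the phrases on this slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WhoTreated:
    stage: str = ""                                        # "initial" or "continuing"
    descriptors: List[str] = field(default_factory=list)  # e.g. ["pregnant"]


@dataclass
class WhatCondition:
    condition: str       # e.g. "psoriasis"
    severity: str = ""   # e.g. "advanced"


@dataclass
class Restriction:
    who: WhoTreated
    what: WhatCondition
    actions: List[str] = field(default_factory=list)      # ACTION REQUIRED items

    def sentence(self) -> str:
        """Rebuild the implied sentence from the codified parts."""
        stage = f"{self.who.stage} " if self.who.stage else ""
        people = " ".join(self.who.descriptors + ["patients"])
        condition = " ".join(p for p in (self.what.severity, self.what.condition) if p)
        return (
            "The prescription of medicine is restricted to the "
            f"{stage}treatment of {people} with {condition}"
        )


rule = Restriction(
    who=WhoTreated(stage="initial"),
    what=WhatCondition(condition="peptic ulcer"),
    actions=["contact Medicare", "obtain Authority number"],
)
print(rule.sentence())
print(rule.actions)
```

Once the parts are stored separately, the same record can serve several audiences: the sentence for readers, the individual fields for search, metadata or re-use by other systems.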
“Who Treated” semantic model
Age
Patient Group
Documented history
[mg ...etc]
[CLINICIAN] Requiring special expertise in
Requiring no special expertise
[EXPERTISE]
[SEVERITY] [CONDITION]
Sex
PBS subsidised
PBS non-subsidised
At a dose of
Weekly
Daily
Monthly
Yearly
Fortnightly
Hourly
Hours
Days
Weeks
Months
Years
Vocation Veteran
Male
Female
All
Ethnicity [ETHNICITY]
Entitlement [?]
[LIST]
[LIST]
Pregnant
Breastfeeding
[ADJECTIVE]
Veteran
?
[MEASURED AS]?
Co-administered with
That meet a specific definition/criteria as set out in [LIST of references]
General schedule of Lipid-lowering Drugs
and
[DEFINED BY]
Treatments
Within timeframe of
Over a period of
Trials
Treatment with
Treatment of
Treatment for
Initial
Continuing
Maintenance
Effective
Ineffective
Inappropriate
Initiation
Stabilisation
In conjunction with
Not in conjunction with
Following
Preceding
Received
Has not received
Not responding
Responding
Failed to qualify for
Qualified for
Not indicated
Indicated
Has had
Has not had
Can have
Can not have
Can not receive
Disease progression
Disease regression
Treated by
Diagnosis confirmed by
=
[NUMBER]
Over
Under
Exactly
Between
At least
[DRUG]
[TREATMENT]
Diet
Exercise
Surgery [TYPE]
[THERAPY]
Evidence of
[PROCEDURE]
in
[DISORDER]
Symptoms?
Clinical findings
Starts new prepositional-phrase in the same text-block
As measured by?
As evidenced by
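The quantity slots in this model (Over, Under, Exactly, Between, At least, followed by a [NUMBER]) lend themselves to a small check. The sketch below is an assumption for illustration: the function name `meets` is hypothetical, "Between" is treated as inclusive at both ends, and the numbers in the example are made up.

```python
import operator

# Comparator words from the model, mapped to standard comparisons.
COMPARATORS = {
    "Over": operator.gt,
    "Under": operator.lt,
    "Exactly": operator.eq,
    "At least": operator.ge,
}


def meets(value, comparator, bound, upper=None):
    """Check a value (an age, a dose, a duration) against a codified limit."""
    if comparator == "Between":
        if upper is None:
            raise ValueError("'Between' needs an upper bound")
        # Assumption: both ends of the range are included.
        return bound <= value <= upper
    if comparator not in COMPARATORS:
        raise ValueError(f"Unknown comparator: {comparator}")
    return COMPARATORS[comparator](value, bound)


print(meets(70, "Over", 65))         # hypothetical age limit
print(meets(12, "Between", 6, 12))   # hypothetical range in months
```

Restricting authors to a fixed list of comparator words is what lets free text such as an age limitation be checked by a machine.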
“Authority Action” semantic model
Authority Action
(allow) Maximum
Therapy
Supply
(allow) Minimum
In writing
By telephone
[TIME]
days
weeks
months
Therapy
Supply [AMOUNT]
Repeats [AMOUNT]
Repeats [AMOUNT]
Initial
Subsequent
Ongoing
Initial
Subsequent
Ongoing
Initial
Subsequent
Ongoing
In writing
By telephone
To complete
Followed by
In writing
By telephone
Within timeframe of [TIME]
days
weeks
months
Treatment
Treatment
Electronically
Electronically
Electronically
Remaining
Remaining
Remaining
In writing
By telephone
Electronically
Initial
Subsequent
Ongoing
Remaining
Where approval
[TIMEFRAME]
To [AUTHORITY]
Medicare
To [AUTHORITY]
Medicare
To [AUTHORITY]
Medicare
...etc...
...etc...
...etc...
Repeats [AMOUNT]
Starts new prepositional-phrase in the same text-block
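The “Authority Action” model is essentially a set of slots, each with a short list of allowed values. A minimal sketch of that idea, assuming Python enums for the controlled vocabularies: the class names, the `phrase` wording and the 7-day figure are illustrative assumptions, while the slot values themselves are the labels from the model.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    INITIAL = "Initial"
    SUBSEQUENT = "Subsequent"
    ONGOING = "Ongoing"
    REMAINING = "Remaining"


class Channel(Enum):
    IN_WRITING = "In writing"
    BY_TELEPHONE = "By telephone"
    ELECTRONICALLY = "Electronically"


class TimeUnit(Enum):
    DAYS = "days"
    WEEKS = "weeks"
    MONTHS = "months"


@dataclass(frozen=True)
class AuthorityAction:
    stage: Stage
    channel: Channel
    authority: str    # e.g. "Medicare"
    timeframe: int    # hypothetical number, not taken from the slides
    unit: TimeUnit

    def phrase(self) -> str:
        # Assemble the slots into one readable phrase (wording is illustrative).
        return (
            f"{self.stage.value} treatment, {self.channel.value.lower()}, "
            f"to {self.authority}, within timeframe of {self.timeframe} {self.unit.value}"
        )


action = AuthorityAction(Stage.INITIAL, Channel.BY_TELEPHONE, "Medicare", 7, TimeUnit.DAYS)
print(action.phrase())
```

Looking up a value outside the vocabulary, such as `Channel("By fax")`, raises a `ValueError`, which is the kind of guard rail that brings structure to authoring.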
High-level semantic overview
[Diagram: labels joined by "+" and "="]
Headings: WHO TREATED; WHAT CONDITION; HOW AUTHORISED; Notes and Cautions
Parts: Foreword; Definitions; Patient Group; Condition; Authority Action
Details shown under the parts: Definitions; Age limitations; Patient groups; Prior treatments; Grandfathering clauses; Condition; Severity; Clinical initiation or continuation criteria; Prescribing clinicians; Prescribing advice; Contact information
How did the ‘trees’ help?
Inferred
  How people think about and structure content
Described
  Business processes that produce content
Identified
  Where content quality is poor so it can be improved
  Critical components of the sentence for codification
Designed
  Taxonomies and describe folk taxonomies
Built
  Systems to help bring some structure to content authoring
How can I do this stuff too?! (a side-step)
Theory is important
  An understanding of semantics - sentence trees and grammar
  Text books by authors like Fromkin and Rodman can help through the tricky bits
Need good tools
  Connexor: www.conexor.fi/demo/syntax
  Big sheets of paper (and an electronic whiteboard)
  Visio (not PowerPoint!)
Demo
Connexor www.conexor.fi/demo/syntax
Introducing ways to codify restrictions
How are we actually going to codify the stuff?!
  Give people Lego™ or ‘fridge-magnets’ to build sentences
  Build a prototype to explore and demonstrate conceptual design
Communicate
  Talk about ideas with business owners
  Explore possibilities with end-users
  Build ‘no surprises’ into change management
Iterate
  Iterate and refine concepts and design before building
Inform
  Developers of intent and requirements
  The building of a ‘tool’ for codifying content (hooray for Axure!)
Demo
Prototyping with Axure
Why should I care about this?
Google uses semantic analysis to index content
Translation software uses semantic analysis to identify ‘components’ for translation
Good sentence structure equals:
  Accurate indexing
  Higher rank relevance of content
  Happy people (they find what they’re looking for)
Summing up
Content is still king, but:
  Is its quality any good?
  Does it match your website’s categories?
  Is your metadata ok?
  Can people find the content they need?
  Do you need to understand your content better?
Semantic analysis can:
  Make your content audits more objective
  Inform processes to improve the quality of the content
  Inform processes to improve search engine indexing
  Inform metadata creation
  Improve website navigation design
Please Sir, can I have some more…?
email: [email protected]
www.smsmt.com
blog: magia3e.wordpress.com
twitter: magia3e
community: iacanberra.org
cartoons: © Gary Larson
Fin