Traversing Documents by Using Semantic Relationships

Posted on 21-Feb-2016

Presenter: Bilal Gonen

Semantic Browser

• A semantic browser is a tool that supports browsing and navigating a document space by using semantic relationships.

Physical Links vs. Virtual Links

[Figure: documents connected by physical links (href), contrasted with documents connected by virtual links labeled with semantic relationships: affects, analyzes, co_occurs_with, is_result_of.]
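
The contrast between the two kinds of link can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the document names, the terms they contain, and the triples are made up for the example, and `physical_links` and `virtual_links` are hypothetical helpers.

```python
# Physical links: explicit href targets stored in each document.
# Virtual links: derived at browse time from named relationships
# between terms that the documents mention.
# All documents, terms, and triples below are illustrative assumptions.

documents = {
    "doc_a": {"hrefs": ["doc_b"], "terms": {"melanoma"}},
    "doc_b": {"hrefs": [], "terms": {"skin cancer"}},
    "doc_c": {"hrefs": [], "terms": {"aneuploidy"}},
}

# (subject term, relationship, object term)
triples = [
    ("melanoma", "is_result_of", "aneuploidy"),
    ("melanoma", "co_occurs_with", "skin cancer"),
]


def physical_links(doc_id):
    """Return the documents reachable through stored href links."""
    return list(documents[doc_id]["hrefs"])


def virtual_links(doc_id):
    """Return (relationship, target document) pairs implied by the triples."""
    links = []
    for subject, relationship, obj in triples:
        if subject not in documents[doc_id]["terms"]:
            continue
        for other_id, other in documents.items():
            if other_id != doc_id and obj in other["terms"]:
                links.append((relationship, other_id))
    return links
```

In this toy data, doc_a has no href to doc_c, yet a virtual link labeled is_result_of still connects them because of the terms they mention.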

A Real Example

How are these articles related?

How do we find other documents related to “melanoma”?

One common option is to use statistical techniques.

Recommendation Systems

Amazon, the best-known commercial recommender system, recommends books to customers based on the statistical similarity between customers' previous purchases.

The product: Digital Camera

Customers who bought this item also often bought

Digital Memory Card

Statistical proximity

A Real Example

Such a statistical technique may return these terms.

sun's harmful rays, skin, skin cancer, leg, ankle, skin pigment, melanin, aneuploidy

There are no named relationships.
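
A co-occurrence count is one simple statistical technique of this kind. The sketch below assumes a toy corpus in which each document is already reduced to a set of terms; the corpus and the helper name `co_occurring_terms` are illustrative, not taken from the talk.

```python
from collections import Counter

# Toy corpus: each document is reduced to the set of terms it mentions.
# The contents are illustrative assumptions, not data from the talk.
corpus = [
    {"melanoma", "skin", "skin cancer", "melanin"},
    {"melanoma", "skin cancer", "aneuploidy"},
    {"melanoma", "skin", "leg"},
    {"ankle", "leg"},
]


def co_occurring_terms(query, docs):
    """Count how often each other term appears in documents containing query."""
    counts = Counter()
    for terms in docs:
        if query in terms:
            counts.update(terms - {query})
    return counts


related = co_occurring_terms("melanoma", corpus)
```

The result is only a table of terms and counts. Nothing in it says how "skin" or "aneuploidy" is related to "melanoma", which is the limitation the slide points out.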

Ontology

[Figure: a two-level ontology.
Schema level: the classes Pathologic Function, Body Substance, Substance, Diagnostic Procedure, Amino Acid, Peptide, or Protein, Disease or Syndrome, and Neoplastic Process, connected by rdfs:subClassOf and by the named relationships affects and analyzes.
Instance level: entities such as Ureteral Calculi and Kidney Neoplasms, attached to classes by InstanceOf.]

Ontologies are at the heart of the Semantic Web.
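
The two levels can be sketched as plain triples. The class and instance names below appear on the slide, but which class is a subclass of which, and which class Kidney Neoplasms belongs to, are assumptions made for this example; `types_of` is a hypothetical helper and not part of any real ontology API.

```python
# Triples as (subject, predicate, object). The class and instance names come
# from the slide; the specific subclass and InstanceOf assignments are
# assumptions made for this example.
schema_triples = [
    ("Neoplastic Process", "rdfs:subClassOf", "Disease or Syndrome"),
    ("Disease or Syndrome", "rdfs:subClassOf", "Pathologic Function"),
]
instance_triples = [
    ("Kidney Neoplasms", "InstanceOf", "Neoplastic Process"),
]


def types_of(instance):
    """Return the direct class of an instance followed by its superclasses."""
    # Assumes each class has at most one parent (single inheritance).
    parents = {s: o for s, p, o in schema_triples if p == "rdfs:subClassOf"}
    found = []
    for s, p, o in instance_triples:
        if s == instance and p == "InstanceOf":
            current = o
            while current is not None and current not in found:
                found.append(current)
                current = parents.get(current)
    return found
```

Walking up rdfs:subClassOf from the direct class gives every schema-level type that applies to an instance-level term.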

Relationships In Ontology

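[Figure: two groups of terms from the ontology.
cancers: breast cancer, bone cancer, blood cancer, skin cancer, melanoma, non-melanoma
chromosomal disorder: aneuploidy, euploidy, monoploidy
An is_result_of relationship appears between the two groups.]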

A Real Example

Our approach is to offer several relationships to the user.

aneuploidy, allelic imbalance, chromosome aberrations

This is what the user is interested in.

Relationships offered: affects, co_occurs_with, occurs_in, is_result_of
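
Offering relationships to the user amounts to grouping the terms related to the current term by relationship name. The sketch below assumes a small hand-written list of triples; only the relationship names come from the slides, and `relationships_for` is a hypothetical helper.

```python
from collections import defaultdict

# Illustrative triples; only the relationship names are taken from the slides.
triples = [
    ("melanoma", "is_result_of", "aneuploidy"),
    ("melanoma", "is_result_of", "allelic imbalance"),
    ("melanoma", "co_occurs_with", "skin cancer"),
    ("sun's harmful rays", "affects", "skin"),
]


def relationships_for(term):
    """Group the terms related to `term` by relationship name."""
    grouped = defaultdict(list)
    for subject, relationship, obj in triples:
        if subject == term:
            grouped[relationship].append(obj)
    return dict(grouped)
```

A user browsing "melanoma" would then see one list per relationship and could follow only the one of interest, for example is_result_of.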

Return the files that include “aneuploidy”

Chromosomal Aneuploidies
Identification of Aneuploidy
Definition of Aneuploidy
Aneuploidy and Deletions

Names of the files in which “aneuploidy” occurs.

[Architecture diagram components: user, JavaScript, AJAX, JSP (JavaServer Pages), Lucene index for documents, PubMed dataset, ontology, SemDis API.]

A Lucene index records which of the 21,945 MeSH terms occur in each document.

User Interface (HTML page)

The advantage of AJAX is that only the needed information is sent and received between the client and the server.

SemDis API: built at the LSDIS Lab. This API is used to process the triples in the ontology.

Ontology: contains 135 classes and 49 relationships at the schema level, and 21,945 entity instances at the instance level.

PubMed dataset: contains 48,252 documents.

Because we also used the synonyms of the 21,945 MeSH terms, we used ~104,000 terms to index the documents.
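
The effect of indexing with synonyms can be sketched with a small inverted index. This is a pure-Python stand-in and does not use the Lucene API: the synonym table, the second document, and the naive substring matching are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical synonym table: every surface form maps to its preferred term.
synonyms = {
    "aneuploidy": "aneuploidy",
    "aneuploid": "aneuploidy",
    "melanoma": "melanoma",
    "malignant melanoma": "melanoma",
}

# Toy documents (title -> text); the texts are invented for the example.
documents = {
    "Definition of Aneuploidy": "Aneuploidy is an abnormal chromosome number.",
    "Skin tumours": "Malignant melanoma cells are often aneuploid.",
}


def build_index(docs, surface_forms):
    """Map each preferred term to the sorted titles of documents mentioning it."""
    index = defaultdict(set)
    for title, text in docs.items():
        lowered = text.lower()
        for form, preferred in surface_forms.items():
            # Naive substring match; a real indexer would tokenize the text.
            if form in lowered:
                index[preferred].add(title)
    return {term: sorted(titles) for term, titles in index.items()}


index = build_index(documents, synonyms)
```

Because every synonym maps to one preferred term, the second document is found under "aneuploidy" even though it only contains the form "aneuploid".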

AJAX (Asynchronous JavaScript And XML)

Only this part is loaded to the client side.

User Interface(HTML page)

The MeSH term is sent to the server as a request to get its types from the ontology through the SemDis API.

The SemDis API gets the types of the instance term from the ontology.

request: keyword
response: related documents

The list of documents is returned from the Lucene index.
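
The request and response round trip can be sketched as a single server-side function. The talk's server is a JSP page; the Python below only illustrates the shape of the exchange. The two lookup tables stand in for the ontology and the document index, the type assigned to "aneuploidy" is an assumption, and `handle_request` is a hypothetical name.

```python
import json

# Stand-ins for the two back-end components; contents are illustrative.
TYPES_BY_TERM = {"aneuploidy": ["Pathologic Function"]}  # ontology lookup
DOCS_BY_TERM = {  # document index lookup
    "aneuploidy": ["Chromosomal Aneuploidies", "Definition of Aneuploidy"],
}


def handle_request(term):
    """Build the small JSON payload sent back for one asynchronous request."""
    key = term.strip().lower()
    payload = {
        "term": key,
        "types": TYPES_BY_TERM.get(key, []),
        "documents": DOCS_BY_TERM.get(key, []),
    }
    return json.dumps(payload)


response = json.loads(handle_request(" Aneuploidy "))
```

Only this small payload crosses the network; the client-side script can then update the relevant part of the page without reloading it.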

Questions, Comments

Thank you…

Email: bilalgonen@gmail.com
Web: www.bilalgonen.com