Project Update Matt Williams XML Document Visualization and Retrieval.
Project Update
Matt Williams
XML Document Visualization and Retrieval
Background
- XML vs. web documents: added structure
  <book>
    <title>My First XML</title>
    <prod id="33-657" media="paper"></prod>
    <chapter>Introduction to XML
      <para>What is HTML</para>
      <para>What is XML</para>
    </chapter>
    <chapter>XML Syntax
      <para>Elements must have a closing tag</para>
      <para>Elements must be properly nested</para>
    </chapter>
  </book>
Can we take advantage of this structure when searching for documents?
Information Retrieval
- Standard Information Retrieval (IR): tf*idf
  - tf: frequency of a term in a document
  - idf: inverse document frequency, derived from the number of documents containing the term
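As a rough illustration of the tf*idf weighting named on this slide, here is a minimal Python sketch. The slide only names tf and idf; the log-scaled form idf = log(N / df) used below is one common variant and is an assumption, as are the function name `tf_idf` and the sample documents.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = raw count of the term in the document
    idf = log(N / df), where df is the number of documents containing
          the term (an assumed variant; the slide does not fix a formula)
    """
    n = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    # Document frequency: count each term once per document.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights

# Hypothetical three-document collection.
docs = ["what is xml", "what is html", "xml syntax xml"]
weights = tf_idf(docs)
```

A term that is frequent in one document but rare across the collection (here "syntax") receives a higher weight per occurrence than a term spread over many documents (here "xml").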
Information Retrieval
- A fair bit of previous work exists on adding structure to IR queries.
- Examples
  - XIRQL (Fuhr and Großjohann): //book/chapter[heading $cw$ "InfoVis"]
  - XXL (Theobald and Weikum): Select Z From Index Where zoos.~animal.~cougar as Z
- But...
  - What if we are unsure of the structure?
  - What if we have variability in the structure?
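The structure problem raised above can be illustrated with a small Python sketch. It does not implement XIRQL or XXL; it uses the limited path support in Python's standard `xml.etree.ElementTree` to show how a query tied to one fixed path misses a document whose structure differs. Both sample documents and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents about the same topic with different structure.
DOC_A = "<book><chapter><heading>InfoVis basics</heading></chapter></book>"
DOC_B = "<book><part><chapter><title>InfoVis tools</title></chapter></part></book>"

def strict_match(xml_text, word):
    """True if the word occurs in a heading found at exactly book/chapter/heading."""
    root = ET.fromstring(xml_text)
    return any(word in (h.text or "") for h in root.findall("./chapter/heading"))

def loose_match(xml_text, word):
    """True if the word occurs in the text of any element, ignoring structure."""
    root = ET.fromstring(xml_text)
    return any(word in (node.text or "") for node in root.iter())

strict_hits = [strict_match(d, "InfoVis") for d in (DOC_A, DOC_B)]
loose_hits = [loose_match(d, "InfoVis") for d in (DOC_A, DOC_B)]
```

The strict query finds only the first document, while the structure-free search finds both but gives up the benefit of the markup. An interface that shows which elements actually occur in the collection sits between these two extremes.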
Information Retrieval
- My goal is to provide an interface to explore the XML collection with limited information:
  - Meta-schema information: element index
  - Visual clustering: multidimensional scaling
  - Visual queries: element selection
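One way to picture the element index is as a map from each element path to the documents in which that path occurs. The sketch below builds such a map with Python's standard library. It is an illustration only and makes no claim about how eXist stores its index; the function name `element_index` and the sample documents are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def element_index(docs):
    """Map each element path (e.g. 'book/chapter/para') to the ids of
    the documents that contain at least one element at that path."""
    index = defaultdict(set)

    def walk(node, parent_path, doc_id):
        path = f"{parent_path}/{node.tag}" if parent_path else node.tag
        index[path].add(doc_id)
        for child in node:
            walk(child, path, doc_id)

    for doc_id, xml_text in docs.items():
        walk(ET.fromstring(xml_text), "", doc_id)
    return dict(index)

# Hypothetical two-document collection.
docs = {
    "a": "<book><title>T</title><chapter><para>x</para></chapter></book>",
    "b": "<book><chapter><heading>h</heading></chapter></book>",
}
index = element_index(docs)
```

Selecting an entry such as `book/chapter/para` in a user interface would then narrow the collection to the documents listed for that path, which is the kind of element selection the visual queries above rely on.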
Related Work
- Visual Information Seeking
  - HomeFinder / Periodic Table (Ahlberg and Shneiderman)
Related Work
- Galaxies (Wise et al.)
- Visual Web Retrieval: Lighthouse (Leuski)
Related Work
- ZUI: Pad, Jazz, and Piccolo (Ben Bederson)
- SpaceTree (Jesse Grosjean et al.)
- TreeMaps ?? (Ben Shneiderman)
Multidimensional Scaling
- Document similarity
- Dimensionality reduction: from a full-dimensional distance measure to a 2-dimensional distance measure
- Problems: speed?
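To make the reduction from full-dimensional distances to 2-dimensional distances concrete, here is a minimal pure-Python sketch of metric multidimensional scaling using the SMACOF (Guttman transform) update. The slides do not say which MDS algorithm the project uses, so this choice is an assumption, as are the function names and the sample term-weight vectors.

```python
import math
import random

def pairwise(points):
    """Euclidean distance matrix for a list of coordinate tuples."""
    return [[math.dist(p, q) for q in points] for p in points]

def stress(target, layout):
    """Sum of squared differences between target and layout distances."""
    d = pairwise(layout)
    n = len(target)
    return sum((target[i][j] - d[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))

def smacof_step(target, layout):
    """One Guttman-transform update; it never increases the stress."""
    n = len(layout)
    d = pairwise(layout)
    new_layout = []
    for i in range(n):
        x = y = 0.0
        for j in range(n):
            if i == j or d[i][j] == 0.0:
                continue  # coincident points contribute nothing
            ratio = target[i][j] / d[i][j]
            x += ratio * (layout[i][0] - layout[j][0])
            y += ratio * (layout[i][1] - layout[j][1])
        new_layout.append((x / n, y / n))
    return new_layout

def mds_2d(target, iterations=100, seed=0):
    """Start from a random 2-D layout and apply repeated SMACOF updates."""
    rng = random.Random(seed)
    layout = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in target]
    for _ in range(iterations):
        layout = smacof_step(target, layout)
    return layout

# Hypothetical term-weight vectors for four documents.
vectors = [(1.0, 0.0, 2.0, 0.0), (1.0, 0.1, 1.8, 0.0),
           (0.0, 3.0, 0.0, 1.0), (0.0, 2.5, 0.2, 1.2)]
target = pairwise(vectors)
layout = mds_2d(target)
```

Each update compares every pair of documents, so one iteration costs on the order of n² distance computations for n documents. That cost is one plausible reading of the speed concern noted above.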
Test Environment
- eXist: open source native XML database (Wolfgang M. Meier, http://exist-db.org/)
- I am working on a front end to the database that provides:
  - A selectable element index
  - Interactive results that dynamically cluster and zoom
Thus Far
- Lots of learning!
  - XML databases
  - Multidimensional scaling
  - XML queries
  - XML information retrieval
  - Zoomable interfaces
  - Treemaps
- Added a basic GUI to eXist
- Added a service to offer the element index as part of the API