B. Nguyen BDA 2002 1
Construction and Maintenance of a SPIN (Set of Pages of Interest) using Active XML
Serge Abiteboul, Grégory Cobena, Benjamin Nguyen, Antonella Poggi
INRIA-FUTURS, Gemo project
Who?
Work carried out within the former Verso project (now Gemo):
  Serge Abiteboul
  Grégory Cobena
  Antonella Poggi
  Benjamin Nguyen
Collaboration on the RNTL e.dot project with the BIA laboratory at INRA, on food risk
What?
Develop an approach that:
  Is flexible, generic, and declarative for specifying a warehouse of Web data
  Simplifies the acquisition of this Web data, including through the use of services
Provide a platform for developing data warehouses.
How?
Presentation outline
1- A new problem…
2- SPIN
  Basic ideas
  Architecture
  Example
3- Future goals
1- A new problem…
The general data warehousing problem (1/2)
“The topic of data warehousing encompasses architectures, algorithms and tools for bringing together selected data from multiple databases or other information sources into a single repository, called a Data Warehouse.”
J.Widom, Research Problems in Data Warehousing, CIKM 1995
The general data warehousing problem (2/2)
[Figure: three information sources, each behind its own wrapper, feed an integrator that populates the data warehouse.]
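The wrapper and integrator roles named on this slide can be sketched in a few lines. This is only an illustration under stated assumptions, not code from WHIPS or SPIN: the record format, the two toy sources, and the names `csv_wrapper`, `dict_wrapper`, and `integrate` are all hypothetical. Each wrapper translates one source into a shared record shape, and the integrator merges the wrapped records into a single repository.

```python
from typing import Dict, Iterable, List

# Hypothetical common record format shared by all wrappers: a plain dict.
Record = Dict[str, str]


def csv_wrapper(lines: Iterable[str]) -> List[Record]:
    """Wrap a 'name;city' text source into common records."""
    records = []
    for line in lines:
        name, city = line.split(";")
        records.append({"name": name.strip(), "city": city.strip()})
    return records


def dict_wrapper(rows: Iterable[Dict[str, str]]) -> List[Record]:
    """Wrap a source that uses different field names ('label', 'town')."""
    return [{"name": r["label"], "city": r["town"]} for r in rows]


def integrate(wrapped_sources: Iterable[List[Record]]) -> List[Record]:
    """Integrator: merge wrapped records, dropping exact duplicates."""
    warehouse: List[Record] = []
    seen = set()
    for records in wrapped_sources:
        for rec in records:
            key = (rec["name"], rec["city"])
            if key not in seen:
                seen.add(key)
                warehouse.append(rec)
    return warehouse


# Two toy sources describing overlapping data.
warehouse = integrate([
    csv_wrapper(["Town hall; Sèvres", "Museum; Sèvres"]),
    dict_wrapper([{"label": "Museum", "town": "Sèvres"}]),
])
```

The duplicate check stands in for the much harder reconciliation work a real integrator performs.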
Research topics
Wrappers/Monitors
Integrator
Warehouse specification (WHIPS, SIGMOD 1997)
Various optimizations
SPIN: the differences
Work around the document rather than inside the document
Integration
  A larger number of sources (each web document or service is a source)
  Less structure within each page
  Highly varied topics
Ergonomics and simplicity
  A simple, modular architecture
  An approach aimed at 'novice' users
Brief reminders
XML (W3C)
WSDL (W3C)
  XML format for describing services
  Document-oriented or procedural
  Used together with other protocols (SOAP)
ActiveXML
ActiveXML
Ongoing work: S. Abiteboul, T. Milo, O. Benjelloun, I. Manolescu, A. Bonifati, L. Segoufin… + the SPIN team!
AXML = XML + service calls
Declarative language
Peer-to-peer
Very simple implementation of web services
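The idea "AXML = XML + service calls" can be illustrated with a small sketch: an XML document embeds a call element, and evaluating the document replaces the call with the XML the service returns. Everything below is an assumption made for illustration, not the Active XML implementation: the namespace URI, the `getTemperature` service and its fixed result, and the local `SERVICES` registry that stands in for real web services reached over the network.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI; the slides do not give one.
AXML_NS = "http://example.org/axml"
NS = {"axml": AXML_NS}

DOC = f"""
<newspaper xmlns:axml="{AXML_NS}">
  <title>Sèvres news</title>
  <axml:sc name="getTemperature">
    <axml:params>
      <axml:param name="city">Sèvres</axml:param>
    </axml:params>
  </axml:sc>
</newspaper>
"""


def get_temperature(city: str) -> ET.Element:
    """Stand-in for a remote web service; returns a fixed, made-up result."""
    result = ET.Element("temperature", {"city": city})
    result.text = "15"
    return result


# Local registry playing the role of the network: service name -> callable.
SERVICES = {"getTemperature": get_temperature}


def materialize(root: ET.Element) -> None:
    """Replace every embedded service call with the XML its service returns."""
    sc_tag = f"{{{AXML_NS}}}sc"
    for parent in list(root.iter()):
        for index, child in enumerate(list(parent)):
            if child.tag != sc_tag:
                continue
            # Gather the call parameters as keyword arguments.
            params = {
                p.get("name"): (p.text or "")
                for p in child.findall("axml:params/axml:param", NS)
            }
            result = SERVICES[child.get("name")](**params)
            result.tail = child.tail
            parent[index] = result  # swap the call for its result in place


root = ET.fromstring(DOC)
materialize(root)
```

After `materialize`, the document is plain XML again: the call element is gone and a `temperature` element sits in its place.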
2- SPIN
The project
A drive for genericity and simplicity in building a warehouse
A declarative language for specifying a SPIN
Implementation of the services (modules) that form the basis of the system
Modular approach
  Implementation in Java, XML, XSLT (B. Zhu)
  'High-level' specification in Active XML (a 'data-centric' language, distributed computation)
Warehouse = Intension + Extension
Intension
  Declarative definition built on existing services (SOAP, WSDL, UDDI… AXML)
  Writing of custom services
Extension
  Web pages stored in an XML repository
  Continuous enrichment of the extension
  Querying through XOQL queries (V. Aguillera)
Architecture
[Figure: architecture diagram. Components shown: AXML processor, XOQL engine, XOQL service, XyDiff, Xyleme services, SPIN services, web service application, Internet, web services, crawler, AXML client, XML repository.]
Example: Sèvres
A user wants to create a data warehouse about the town of Sèvres…
How can this be done in a few lines?
Walkthrough
Description of the warehouse header
Description of the intension
  Stated in a very general way
  Reused as a parameter
Description of the services
  Generic services
  Services specific to the warehouse
Data model: header
<spin:warehouse name="Sèvres">
  <spin:head>
    <spin:owner id="Serge" />
    <spin:title>Sèvres Warehouse</spin:title>
    <spin:accessControlList>
      <spin:access group="friends" mode="call"/>
      <spin:access group="all" mode="read"/>
    </spin:accessControlList>
  </spin:head>
  <spin:spin name="sevres">
    <spin:intension> ... </spin:intension>
    <spin:extension> ... </spin:extension>
    <spin:services> ... </spin:services>
  </spin:spin>
  <spin:spin name="sevres-sculpture"> ... </spin:spin>
</spin:warehouse>
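As an illustration of how the header's access control list might be read, here is a minimal sketch. It rests on several assumptions that the slides do not state: the namespace URI is invented (the slide uses the `spin:` prefix without declaring it), the helper `allowed_modes` is hypothetical, and the rule that a caller receives the union of the modes granted to each of its groups is a guess at the intended semantics.

```python
import xml.etree.ElementTree as ET

SPIN_NS = "http://example.org/spin"  # hypothetical URI, not given in the slides
NS = {"spin": SPIN_NS}

HEADER = f"""
<spin:warehouse xmlns:spin="{SPIN_NS}" name="Sèvres">
  <spin:head>
    <spin:owner id="Serge" />
    <spin:title>Sèvres Warehouse</spin:title>
    <spin:accessControlList>
      <spin:access group="friends" mode="call"/>
      <spin:access group="all" mode="read"/>
    </spin:accessControlList>
  </spin:head>
</spin:warehouse>
"""


def allowed_modes(warehouse: ET.Element, groups: set) -> set:
    """Collect the access modes granted to any of the caller's groups."""
    modes = set()
    path = "spin:head/spin:accessControlList/spin:access"
    for access in warehouse.findall(path, NS):
        if access.get("group") in groups:
            modes.add(access.get("mode"))
    return modes


warehouse = ET.fromstring(HEADER)
# Assumption: every caller belongs to "all"; friends belong to both groups.
visitor_modes = allowed_modes(warehouse, {"all"})
friend_modes = allowed_modes(warehouse, {"all", "friends"})
```

Under these assumptions an anonymous visitor may only read the warehouse, while a member of `friends` may also call its services.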
Data model: intension ('pure' XML)
<spin:spin name="sevres">
  <spin:intension>
    <spin:bound>3000</spin:bound>
    <keywords>
      <keyword>Sèvres</keyword>
      <keyword>92310</keyword>
    </keywords>
    <interestingSites>
      <site>http://www.ville-sevres.fr/</site>
      <site>http://www.vertsdesevres.com/</site>
    </interestingSites>
  </spin:intension>
  ...
</spin:spin>
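Because the intension is plain XML, any XML library can read it. The sketch below loads the Sèvres intension into a small Python structure. The namespace URI, the `Intension` dataclass, and the `read_intension` helper are hypothetical names introduced only for this example; the element names and values are those shown on the slide.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

SPIN_NS = "http://example.org/spin"  # hypothetical URI, not given in the slides
NS = {"spin": SPIN_NS}

SPIN_DOC = f"""
<spin:spin xmlns:spin="{SPIN_NS}" name="sevres">
  <spin:intension>
    <spin:bound>3000</spin:bound>
    <keywords>
      <keyword>Sèvres</keyword>
      <keyword>92310</keyword>
    </keywords>
    <interestingSites>
      <site>http://www.ville-sevres.fr/</site>
      <site>http://www.vertsdesevres.com/</site>
    </interestingSites>
  </spin:intension>
</spin:spin>
"""


@dataclass
class Intension:
    bound: int
    keywords: List[str]
    sites: List[str]


def read_intension(spin: ET.Element) -> Intension:
    """Extract the bound, keywords, and seed sites from a spin element."""
    intension = spin.find("spin:intension", NS)
    return Intension(
        bound=int(intension.findtext("spin:bound", namespaces=NS)),
        keywords=[k.text for k in intension.findall("keywords/keyword")],
        sites=[s.text for s in intension.findall("interestingSites/site")],
    )


intension = read_intension(ET.fromstring(SPIN_DOC))
```

The resulting values are exactly the parameters that the services on the following slides pull out of the intension through XPath expressions.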
…the services use the data from the intension
<spin:services>
% Keyword query
let askGoogle($name) be {
  for each $X in
    <axml:sc name="http://www.google.com/googleSearch">
      <axml:params>
        <axml:param name="keyword" xpath="//spin:spin[name=$name]/spin:intension/keywords" />
      </axml:params>
    </axml:sc>
  do insert (//spin:spin[name=$name]/spin:extension/<spin:url id=$X>)
}
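The keyword service follows a simple pattern: read the keywords from the intension, call a search service, and insert one `spin:url` element per result into the extension. The sketch below mimics that pattern in Python. It is not the SPIN or AXML runtime: the namespace URI is invented, `fake_search` is a made-up stand-in that returns placeholder URLs instead of calling any real search engine, and skipping URLs that are already present is an added assumption.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

SPIN_NS = "http://example.org/spin"  # hypothetical URI, not given in the slides
NS = {"spin": SPIN_NS}

WAREHOUSE = f"""
<spin:warehouse xmlns:spin="{SPIN_NS}" name="Sèvres">
  <spin:spin name="sevres">
    <spin:intension>
      <keywords><keyword>Sèvres</keyword><keyword>92310</keyword></keywords>
    </spin:intension>
    <spin:extension />
  </spin:spin>
</spin:warehouse>
"""


def fake_search(keywords: List[str]) -> List[str]:
    """Stand-in for the remote search service; returns placeholder URLs."""
    return [f"http://example.org/result/{i}" for i, _ in enumerate(keywords)]


def ask_search(warehouse: ET.Element, name: str,
               search: Callable[[List[str]], List[str]]) -> int:
    """Insert one spin:url per search result; return how many were added."""
    spin = warehouse.find(f"spin:spin[@name='{name}']", NS)
    keywords = [k.text for k in spin.findall("spin:intension/keywords/keyword", NS)]
    extension = spin.find("spin:extension", NS)
    known = {u.get("id") for u in extension.findall("spin:url", NS)}
    added = 0
    for url in search(keywords):
        if url not in known:  # assumption: do not insert the same URL twice
            ET.SubElement(extension, f"{{{SPIN_NS}}}url", {"id": url})
            known.add(url)
            added += 1
    return added


root = ET.fromstring(WAREHOUSE)
first = ask_search(root, "sevres", fake_search)
second = ask_search(root, "sevres", fake_search)  # nothing new the second time
urls = root.findall("spin:spin/spin:extension/spin:url", NS)
```

Passing the search function as a parameter keeps the sketch testable and mirrors the way the AXML call names its service by URL.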
Services (continued)
% Interesting sites
let crawlInterestingSites($name) be {
  for each $X in
    <axml:sc name="http://www.myservices.com/getSite">
      <axml:params>
        <axml:param name="url" xpath="//spin:spin[name=$name]/spin:intension/interestingSites/site/" />
        <axml:param name="depth">5</axml:param>
        <axml:param name="bound" xpath="//spin:spin[name=$name]/spin:intension/spin:bound/" />
      </axml:params>
    </axml:sc>
  do insert (//spin:spin[name=$name]/spin:extension/<spin:url id=$X opinion="yes">)
}
</spin:services>
</spin:warehouse>
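The slides do not show how the `getSite` service works internally. One plausible reading of its `url`, `depth`, and `bound` parameters is a breadth-first crawl that starts from the seed sites, follows links for at most `depth` hops, and stops after `bound` pages. The sketch below implements that reading over a made-up in-memory link graph; the graph, the function name `get_site`, and the interpretation of `bound` as a page limit are all assumptions.

```python
from collections import deque
from typing import Dict, List

# Made-up link graph standing in for the live web.
LINKS: Dict[str, List[str]] = {
    "http://example.org/": ["http://example.org/a", "http://example.org/b"],
    "http://example.org/a": ["http://example.org/c"],
    "http://example.org/b": [],
    "http://example.org/c": ["http://example.org/d"],
    "http://example.org/d": [],
}


def get_site(seeds: List[str], depth: int, bound: int,
             links: Dict[str, List[str]] = LINKS) -> List[str]:
    """Breadth-first crawl: at most `depth` hops from a seed, at most `bound` pages."""
    pages: List[str] = []
    visited = set()
    queue = deque((seed, 0) for seed in seeds)
    while queue and len(pages) < bound:
        url, level = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        pages.append(url)
        if level < depth:  # only expand pages that are not yet at the depth limit
            for target in links.get(url, []):
                queue.append((target, level + 1))
    return pages


everything = get_site(["http://example.org/"], depth=5, bound=3000)
shallow = get_site(["http://example.org/"], depth=1, bound=3000)
capped = get_site(["http://example.org/"], depth=5, bound=2)
```

Each URL returned would then become one `spin:url` element in the extension, as in the `do insert` clause.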
Additional services
Classification
User annotations
Temporal evolution
(Pre-written) queries on the result document using XOQL
An advanced service: transparent update management
let aggregate($name, $D1, $D2) be {
  insert
    //spin:spin[name=$name]/spin:extension[date=$D1]/
      <delta from=$D1 to=$D2> ... %the delta </delta>
    </spin:extension>
    //spin:spin[name=$name]/spin:extension[date=$D2]/
      <axml:sc name="applyDelta">
        <axml:params>
          <axml:param name="from" xpath="../spin:extension[date=$D1]" />
          <axml:param name="delta-loc" xpath="../delta[from=$D1 && to=$D2]" />
        </axml:params>
        <validity>CLONE VALUE</validity>
        <refreshPolicy>ON DEMAND</refreshPolicy>
      </axml:sc>
    </spin:extension>
  delete //spin:spin[name=$name]/spin:extension[date=$D2]/spin:url
}
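The `aggregate` service appears to keep the older extension plus a delta, and to rebuild the newer extension only on demand through an `applyDelta` call. The sketch below shows that store-a-delta, rebuild-later idea on the simplest possible data: each extension version is reduced to a set of URLs. This is a deliberate simplification and an assumption; the slides list a Diff service for XML documents, whereas the `Delta` class and the two functions here are hypothetical and set-based.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set


@dataclass(frozen=True)
class Delta:
    """What changed between two versions of an extension (URL sets only)."""
    added: FrozenSet[str]
    removed: FrozenSet[str]


def compute_delta(old: Set[str], new: Set[str]) -> Delta:
    """Record which URLs appeared and which disappeared between versions."""
    return Delta(added=frozenset(new - old), removed=frozenset(old - new))


def apply_delta(old: Set[str], delta: Delta) -> Set[str]:
    """Rebuild the newer version from the older one plus the stored delta."""
    return (set(old) - delta.removed) | delta.added


# Made-up snapshots for two dates D1 and D2.
d1 = {"http://example.org/a", "http://example.org/b"}
d2 = {"http://example.org/b", "http://example.org/c"}

delta = compute_delta(d1, d2)
# Only d1 and delta need to be stored; d2 is recomputed when asked for.
rebuilt_d2 = apply_delta(d1, delta)
```

Storing the delta instead of the full second version is what lets the service drop the `spin:url` entries of the later date, as the final `delete` clause does.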
Extension (result)
<spin:extension date="31 jul 2001">
  <spin:url id="http://www.mysite.com/mypage.html">
    % Using other services
    <content>...</content>
    <link>http://www.yahoo.com/</link>
    <link>http://www-rocq.inria.fr/</link>
    <type>HTML</type>
    <last_update>28 jul 2001</last_update>
    <classification>Resume</classification>
    <site>http://www.inria.fr/</site>
  </spin:url>
  ...
</spin:extension>
Implementation
Library of 'generic' web services that help create warehouses
  Crawler
  Classification (THESUS)
  Diff (temporal evolution of the warehouse)
  Query engine
  Presentation (XSLT)
AXML (O. Benjelloun)
3- Future goals
Some directions…
Methodological: a 'UML-like' approach
  Definition of simple concepts
  Understandable, ergonomic graphical presentation
  Direct route to an implementation
  Which conceptual model or language?
Improved services
  More advanced
  More interdependent
Cooperative work management
  User management
  Security issues
Questions?