VIVO Multi-site search Structure and function overview.
![Page 1: VIVO Multi-site search Structure and function overview.](https://reader035.fdocuments.us/reader035/viewer/2022081008/56649e4d5503460f94b43a1e/html5/thumbnails/1.jpg)
VIVO Multi-site search: Structure and function overview
![Page 2: VIVO Multi-site search Structure and function overview.](https://reader035.fdocuments.us/reader035/viewer/2022081008/56649e4d5503460f94b43a1e/html5/thumbnails/2.jpg)
What is it?
• A search tool for ISF-compatible sites
  • VIVO, Profiles, Loki…
• Search index is built from all client sites
  • Provides relative ranking of results across sites
• Two pieces of software
  • An application that builds a Solr search index
  • A web-app that presents a GUI for searching
• Configurable
  • Decide which sites to index, and which classes of individuals
• Open source, built from open source components
![Page 3: VIVO Multi-site search Structure and function overview.](https://reader035.fdocuments.us/reader035/viewer/2022081008/56649e4d5503460f94b43a1e/html5/thumbnails/3.jpg)
Data flow
[Diagram. Components: user browser, MSS web server, MSS Solr server (search index), MSS Indexer (several indexer instances), client sites. Labeled flows: web page; search request and search result (AJAX); search records; RDF.]
![Page 4: VIVO Multi-site search Structure and function overview.](https://reader035.fdocuments.us/reader035/viewer/2022081008/56649e4d5503460f94b43a1e/html5/thumbnails/4.jpg)
Data flow
[Diagram. The MSS Indexer sends a discovery request to a client site and receives a list of URIs. It then sends a series of LOD requests and receives RDF in response to each.]
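The exchange on this slide can be sketched as a simple loop. This is a minimal illustration, not the MSS indexer's actual code: `harvest_site`, `discover` and `fetch_rdf` are hypothetical names standing in for the discovery request and the LOD (linked open data) request, and the sample site data is invented so the sketch runs without a network.

```python
from typing import Callable, Dict, List


def harvest_site(
    discover: Callable[[], List[str]],
    fetch_rdf: Callable[[str], str],
) -> Dict[str, str]:
    """Issue one discovery request, then one LOD request per returned URI."""
    models: Dict[str, str] = {}
    for uri in discover():
        models[uri] = fetch_rdf(uri)
    return models


# Invented stand-in for a client site: URI -> RDF (N-Triples) text.
FAKE_SITE = {
    "http://example.org/individual/n1":
        '<http://example.org/individual/n1> '
        '<http://www.w3.org/2000/01/rdf-schema#label> "Ada" .',
    "http://example.org/individual/n2":
        '<http://example.org/individual/n2> '
        '<http://www.w3.org/2000/01/rdf-schema#label> "Grace" .',
}

models = harvest_site(lambda: list(FAKE_SITE), FAKE_SITE.__getitem__)
```

Passing the two requests in as callables keeps the loop independent of how a particular type of client site answers discovery and LOD requests.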
![Page 5: VIVO Multi-site search Structure and function overview.](https://reader035.fdocuments.us/reader035/viewer/2022081008/56649e4d5503460f94b43a1e/html5/thumbnails/5.jpg)
Scalable
• Search index is a standard Solr webapp
  • Compatible with any standard JEE server
• Indexer is multi-threaded
  • For small number of client sites, using standard Java threads
  • For large number of client sites, using the Apache Hadoop framework for distributed processing
  • Interleaves requests among clients, for reduced load
• Front-end GUI uses AJAX Solr client
  • GUI server serves static HTML and AJAX-based JavaScript
  • Presentation is accomplished by JavaScript in the browser
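The slide does not say how requests are interleaved among clients. One plausible reading is a round-robin order, sketched below under that assumption. The function name `interleave` and the site and URI names are hypothetical.

```python
from itertools import zip_longest
from typing import Dict, Iterator, List, Tuple

_MISSING = object()  # padding marker for sites that run out of URIs early


def interleave(pending: Dict[str, List[str]]) -> Iterator[Tuple[str, str]]:
    """Yield (site, uri) pairs round-robin, one request per site per round."""
    columns = [[(site, uri) for uri in uris] for site, uris in pending.items()]
    for round_ in zip_longest(*columns, fillvalue=_MISSING):
        for item in round_:
            if item is not _MISSING:
                yield item


order = list(interleave({
    "site-a": ["a1", "a2", "a3"],
    "site-b": ["b1"],
    "site-c": ["c1", "c2"],
}))
```

With this ordering, consecutive requests go to different sites whenever more than one site still has pending URIs, so no single client receives a burst.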
![Page 6: VIVO Multi-site search Structure and function overview.](https://reader035.fdocuments.us/reader035/viewer/2022081008/56649e4d5503460f94b43a1e/html5/thumbnails/6.jpg)
For the community
• Get the software
• Configure for your sites and your classes
• Install Solr on a server
• Install and run the indexer
• Install the front end GUI on a server
![Page 7: VIVO Multi-site search Structure and function overview.](https://reader035.fdocuments.us/reader035/viewer/2022081008/56649e4d5503460f94b43a1e/html5/thumbnails/7.jpg)
Ready for enhancement
• The indexer is assembled from components at runtime
  • Improve a component
  • Contribute to the community
  • Site admins may configure their indexer to use your component.
• The front end is based on the AJAX Solr toolkit
  • Create your own front end look and feel
  • Contribute to the community
  • Site admins may install your front end, instead of the default front end
![Page 8: VIVO Multi-site search Structure and function overview.](https://reader035.fdocuments.us/reader035/viewer/2022081008/56649e4d5503460f94b43a1e/html5/thumbnails/8.jpg)
The Indexer - Configuration
[Sidebar listing the indexer stages: Configuration, Evaluation, Scheduling, Discovery, Synchronization, Population, Prioritization, Assembly, Modeling, Indexing]
• Assemble the application
  • Use standard components or contributed alternatives
• Create the site list
  • Name
  • Type of installation (e.g. VIVO 1.5, Profiles)
  • Classes to be indexed
• Get runtime options
Configuration: Built on the Digester component from Apache Commons. Processed like server.xml file in Tomcat.
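To make the site list concrete, here is a rough sketch of reading name, installation type and classes from an XML file. It is not the real MSS configuration format and it does not use Digester: the element and attribute names (`indexer`, `site`, `class`, `name`, `type`, `uri`), the `Site` class and the sample sites are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List

# Hypothetical configuration file; the schema is invented for this sketch.
CONFIG = """
<indexer>
  <site name="Example University" type="VIVO 1.5">
    <class uri="http://xmlns.com/foaf/0.1/Person"/>
    <class uri="http://xmlns.com/foaf/0.1/Organization"/>
  </site>
  <site name="Example Institute" type="Profiles">
    <class uri="http://xmlns.com/foaf/0.1/Person"/>
  </site>
</indexer>
"""


@dataclass
class Site:
    name: str
    install_type: str
    classes: List[str] = field(default_factory=list)


def load_sites(xml_text: str) -> List[Site]:
    """Build one Site object per <site> element, in document order."""
    root = ET.fromstring(xml_text.strip())
    return [
        Site(
            name=el.get("name"),
            install_type=el.get("type"),
            classes=[c.get("uri") for c in el.findall("class")],
        )
        for el in root.findall("site")
    ]


sites = load_sites(CONFIG)
```

Digester works in a similar spirit on the Java side, mapping XML elements to objects through rules, which is why the slide compares the file to Tomcat's server.xml.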
![Page 9: VIVO Multi-site search Structure and function overview.](https://reader035.fdocuments.us/reader035/viewer/2022081008/56649e4d5503460f94b43a1e/html5/thumbnails/9.jpg)
The Indexer - Evaluation
• Scheduling
  • Check to see which sites are due for discovery
• Discovery
  • Ask each site for its list of URIs
    • With “last modified” dates, if available
• Synchronization
  • Create stub records for new URIs
  • Remove expired records
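The scheduling and synchronization steps reduce to a date comparison and two set differences. The sketch below assumes a fixed discovery interval per run and treats a record as expired when its URI no longer appears in the site's discovery list. Both are assumptions, and the names `sites_due` and `synchronize` are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Set, Tuple


def sites_due(
    last_discovery: Dict[str, datetime],
    interval: timedelta,
    now: datetime,
) -> List[str]:
    """Scheduling: sites whose last discovery is at least `interval` old."""
    return sorted(
        site for site, when in last_discovery.items() if now - when >= interval
    )


def synchronize(
    indexed: Set[str], discovered: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Synchronization: return (URIs needing stub records, expired URIs)."""
    found = set(discovered)
    return found - indexed, indexed - found


now = datetime(2024, 1, 10)
due = sites_due(
    {"site-a": datetime(2024, 1, 1), "site-b": datetime(2024, 1, 9, 12)},
    interval=timedelta(days=7),
    now=now,
)
new_stubs, expired = synchronize({"u1", "u2"}, ["u2", "u3"])
```

Where a site supplies “last modified” dates, the same comparison could also flag existing records that need re-indexing; that case is left out here.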
![Page 10: VIVO Multi-site search Structure and function overview.](https://reader035.fdocuments.us/reader035/viewer/2022081008/56649e4d5503460f94b43a1e/html5/thumbnails/10.jpg)
The Indexer - Population
• Prioritization
  • Create an ordered list of URIs for indexing.
• Modeling
  • For each URI, ask the site for RDF statements to build the individual model
• Indexing
  • Translate the individual model into a record in the search index
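As a rough illustration of the population steps, the sketch below orders URIs and flattens an individual's RDF statements into a search record. The ordering policy (never-indexed stubs first, then least recently indexed) and the record fields (`id`, `site`, `name`, `types`) are assumptions, not the MSS schema. The model is represented as plain (subject, predicate, object) tuples rather than a real RDF library.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def prioritize(last_indexed: Dict[str, Optional[int]]) -> List[str]:
    """Order URIs: stubs (None) first, then oldest index timestamp first."""
    return sorted(
        last_indexed,
        key=lambda uri: (last_indexed[uri] is not None, last_indexed[uri] or 0, uri),
    )


def to_search_record(uri: str, site: str, model: List[Triple]) -> Dict[str, object]:
    """Translate an individual's statements into a flat search record."""
    labels = [o for s, p, o in model if s == uri and p == RDFS_LABEL]
    types = [o for s, p, o in model if s == uri and p == RDF_TYPE]
    return {
        "id": uri,
        "site": site,
        "name": labels[0] if labels else uri,  # fall back to the URI
        "types": sorted(types),
    }


queue = prioritize({"u1": 50, "u2": None, "u3": 10})
record = to_search_record(
    "http://example.org/individual/n1",
    "Example University",
    [
        ("http://example.org/individual/n1", RDFS_LABEL, "Ada"),
        ("http://example.org/individual/n1", RDF_TYPE, "http://xmlns.com/foaf/0.1/Person"),
    ],
)
```

A dictionary of this shape is the kind of flat document a Solr index stores; tagging each record with its site is what lets results from different sites be ranked together.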