DSpace 1.6 usage statistics: How does it work? Ben Bosman ... · @mire Contribution to DSpace 1.6...
DSpace 1.6 usage statistics: How does it work?
Ben Bosman - @mire
Outline
1 - Introduction
2 - Technical Overview
3 - User Interface Additions
4 - Advanced Use Cases
5 - High Load Installations
6 - Content & Usage Analysis Add-on Module
1 - Introduction
Community Survey
  Highest rated feature request
  36% of requests in survey
@mire Contribution to DSpace 1.6
  Core of @mire’s Content & Usage Analysis
  Logging usage events in search index
  Querying usage events to provide statistics
  On-the-fly queries instead of predefined reports
  For XMLUI & JSPUI
In-house storage of usage data:
  No dependency on external services (availability, long-term support, ...)
  No privacy issues
  Create your own back-ups
Storing original usage events
  No limitations on views of the data
  View usage data in detail
  Full history available
  Not only aggregated (e.g. per month)
2 - Technical Overview
Usage events:
  Community homepage visits
  Collection homepage visits
  Item visits
  Bitstream downloads
Data per usage event:
  Timestamp
  IP address
  Location: continent, country, city
  and much more ...
Usage event logging
Apache Solr
  Open source enterprise search platform from the Apache Lucene project
  New web application added to DSpace
Performance
  Fast logging in search index
  Can easily be deployed on a separate server
  Advanced solutions for fast querying based on caching
How to store statistics data
Define the fields to store:
  dspace/solr/statistics/conf/schema.xml
Actual storage:
  org.dspace.statistics.SolrLogger.post()
  doc.addField("fieldname", fieldvalue);
<field name="type" type="integer" indexed="true" stored="true" required="true" />
<field name="id" type="integer" indexed="true" stored="true" required="true" />
<field name="ip" type="string" indexed="true" stored="true" required="false" />
<field name="time" type="date" indexed="true" stored="true" required="true" />
<field name="epersonid" type="integer" indexed="true" stored="true" required="false" />
<field name="country" type="string" indexed="true" stored="true" required="false" />
<field name="city" type="string" indexed="true" stored="true" required="false" />
<field name="owningComm" type="integer" indexed="true" stored="true" required="false" multiValued="true" />
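To make the role of the required attribute in the schema excerpt more concrete, here is a minimal Java sketch that models one usage event as a plain map and reports which required fields are missing. It does not reproduce the real SolrLogger code: UsageEventSketch, newEvent and missingRequired are hypothetical names, and a java.util.Map stands in for the Solr input document.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Solr input document; not part of DSpace.
final class UsageEventSketch {

    // Fields declared with required="true" in the schema excerpt.
    static final List<String> REQUIRED_FIELDS = List.of("type", "id", "time");

    // Builds the field map for one usage event; ip is optional and may be null.
    static Map<String, Object> newEvent(int type, int id, Date time, String ip) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("type", type);
        doc.put("id", id);
        doc.put("time", time);
        if (ip != null) {
            doc.put("ip", ip);
        }
        return doc;
    }

    // Returns the required fields that are absent from the document.
    static List<String> missingRequired(Map<String, Object> doc) {
        List<String> missing = new ArrayList<>();
        for (String field : REQUIRED_FIELDS) {
            if (!doc.containsKey(field)) {
                missing.add(field);
            }
        }
        return missing;
    }
}
```

A document that lacks one of the required fields would presumably be rejected by Solr at indexing time, so a check like this is only useful as an early diagnostic.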
Display of statistics data
Depends on the user interface
  StatisticsTransformer for XMLUI
  DisplayStatisticsServlet for JSPUI
Display Statistics - XMLUI
StatisticsTransformer generates the DRI with the statistics information
Uses multiple StatisticsDisplay objects
  StatisticsListing for a two column table
  StatisticsTable for a multiple column table
Display Statistics - JSPUI
DisplayStatisticsServlet
  Uses StatisticsBean objects to store the information
  display-statistics.jsp builds tables from StatisticsBean objects
3 - User Interface Additions
3 small changes in detail
  Create repository wide statistics
  Specify a timespan
  Separate downloads and page visits
Create repository wide statistics
Add a Transformer to create overview of e.g. top 3 items.
StatisticsListing statListing = new StatisticsListing(new StatisticsDataVisits(dso));
statListing.setTitle("Top 3 items");
statListing.setId("list-top-items");
DatasetDSpaceObjectGenerator dsoAxis = new DatasetDSpaceObjectGenerator();
dsoAxis.addDsoChild(Constants.ITEM, 3, false, -1);
statListing.addDatasetGenerator(dsoAxis);
addDisplayListing(division, statListing);
Specify a timespan
Limit displayed statistics data on e.g. item display page to specific timespan.
StatisticsSolrDateFilter dateFilter = new StatisticsSolrDateFilter();
dateFilter.setStartDate(startDate);
dateFilter.setEndDate(endDate);
dateFilter.setTypeStr("month");
statListing.addFilter(dateFilter);
Determine timespan in User Interface
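A timespan filter ultimately has to restrict the time field of the Solr index. The sketch below shows one way such a restriction could be expressed as a Solr range query. The slides do not state which query StatisticsSolrDateFilter actually generates, so DateRangeFilterSketch and toFilterQuery are hypothetical names and the output format is an assumption based on standard Solr range syntax.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Hypothetical helper; not part of DSpace or Solr.
final class DateRangeFilterSketch {

    // Builds an inclusive Solr range filter on the "time" field,
    // formatting both bounds as ISO-8601 UTC timestamps.
    static String toFilterQuery(Instant start, Instant end) {
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end must not be before start");
        }
        DateTimeFormatter iso = DateTimeFormatter.ISO_INSTANT;
        return "time:[" + iso.format(start) + " TO " + iso.format(end) + "]";
    }
}
```

For the start of January 2010 and the start of February 2010, toFilterQuery returns time:[2010-01-01T00:00:00Z TO 2010-02-01T00:00:00Z].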
Separate downloads and page visits
Split up item views and file downloads in item display statistics
StatisticsListing statListing = new StatisticsListing(new StatisticsDataVisits(dso));
DatasetDSpaceObjectGenerator dsoAxis = new DatasetDSpaceObjectGenerator();
dsoAxis.addDsoChild(Constants.BITSTREAM, 10, false, -1);
statListing.addDatasetGenerator(dsoAxis);
4 - Advanced Use Cases
Extend the data being stored
Harvest usage data
Store popular searches
Recommendations
Extend the data being stored
Referrer
  Store referrer to visualize incoming links from other websites, and internal navigation
  Referrer URL retrieved from the browser
Fields to be stored are defined in dspace/solr/statistics/conf/schema.xml
<field name="referrer" type="string" indexed="true" stored="true" required="false" />
Storage is handled by the post() method in the SolrLogger class
doc1.addField("referrer", request.getHeader("referer"));
(the HTTP request header is spelled "Referer", with a single "r")
Extend the user interface to use this new data
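Once the referrer is stored, a user interface extension needs to tell incoming links from other websites apart from internal navigation. The following Java sketch shows one possible classification based on the host name. ReferrerSketch, Kind and classify are hypothetical names, and comparing only the host against a configured repository host is a simplifying assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper; not part of DSpace.
final class ReferrerSketch {

    enum Kind { NONE, INTERNAL, EXTERNAL }

    // Classifies a referrer URL relative to the repository's own host name.
    // Missing, relative or unparsable values are treated as NONE.
    static Kind classify(String referrer, String repositoryHost) {
        if (referrer == null || referrer.isBlank()) {
            return Kind.NONE;
        }
        try {
            String host = new URI(referrer.trim()).getHost();
            if (host == null) {
                return Kind.NONE;
            }
            return host.equalsIgnoreCase(repositoryHost) ? Kind.INTERNAL : Kind.EXTERNAL;
        } catch (URISyntaxException e) {
            return Kind.NONE;
        }
    }
}
```

Storing the raw referrer and classifying at query time keeps the original usage event intact, in line with the idea of storing original events rather than aggregates.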
Harvest Usage Data
Goals
  Mining usage data from partner institutions
  Compare usage data amongst different institutions
  Construct cross-institution recommendations based on usage data
Example set-ups
  NEEO
  PIRUS
Store popular searches
Goals
  Display top searches within your repository
  Display relevant additional search terms to be included or excluded
  Rank search results based on usage by other visitors
Requirements
  Store executed search terms
  Store relation amongst search terms
  Store relation between search terms and opened items
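The first goal, displaying top searches, can be sketched in a few lines of Java once executed searches are stored. The example below is only an illustration: TopSearchesSketch and topTerms are hypothetical names, each stored query is treated as a single term, and normalisation is limited to trimming and lower-casing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; not part of DSpace.
final class TopSearchesSketch {

    // Returns up to 'limit' queries, most frequent first; ties are broken alphabetically.
    static List<String> topTerms(List<String> executedQueries, int limit) {
        Map<String, Integer> counts = new HashMap<>();
        for (String query : executedQueries) {
            String term = query.trim().toLowerCase(Locale.ROOT);
            if (!term.isEmpty()) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            if (top.size() >= limit) {
                break;
            }
            top.add(entry.getKey());
        }
        return top;
    }
}
```

In a real installation the counting would more likely be delegated to a Solr facet query on a stored search field; the in-memory version only shows the intended result.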
Recommendations
Build recommendations solution based on the concept of ‘users who visited this item, also visited …’
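A minimal sketch of the "also visited" idea, assuming the usage events have already been grouped per visitor (for example per IP address): count how often other items occur alongside the item of interest and return the most frequent ones. CoVisitSketch and alsoVisited are hypothetical names, and the grouping key is an assumption, not something the slides prescribe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; not part of DSpace.
final class CoVisitSketch {

    // visitsByVisitor maps a visitor key (for example an IP address) to the item ids visited.
    // Returns up to 'limit' other item ids, ordered by co-visit count, then by id.
    static List<Integer> alsoVisited(Map<String, Set<Integer>> visitsByVisitor, int itemId, int limit) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (Set<Integer> items : visitsByVisitor.values()) {
            if (!items.contains(itemId)) {
                continue;
            }
            for (Integer other : items) {
                if (other != itemId) {
                    counts.merge(other, 1, Integer::sum);
                }
            }
        }
        List<Map.Entry<Integer, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : Integer.compare(a.getKey(), b.getKey());
        });
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, Integer> entry : entries) {
            if (result.size() >= limit) {
                break;
            }
            result.add(entry.getKey());
        }
        return result;
    }
}
```

Because the original usage events are kept rather than monthly aggregates, this kind of per-visitor grouping remains possible after the fact.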
5 - High Load Installations
High load installations
  Generate large amounts of usage data
  Slower query execution
  Response time increases
  Request time-outs if response time is too long
Solution
Optimization using Solr server features
  Autocommit system
  Query warmup system
Autocommit
Autocommit feature optimizes storage of usage events
Out of the box, synchronous commits of usage events are used
The autocommit feature enables asynchronous commits of these events
Remove solr.commit() from SolrLogger
Add the AutoCommit code to the solrconfig.xml:
<autoCommit>
  <maxDocs>10000</maxDocs>
  <maxTime>900000</maxTime>
</autoCommit>
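The two thresholds mean that pending usage events are committed once 10000 documents have accumulated or once the oldest pending document is 900000 ms (15 minutes) old, whichever comes first. The Java sketch below is a simplified model of that policy, written only to illustrate the behaviour. It is not Solr's implementation, and AutoCommitSketch with its add and tick methods are hypothetical names.

```java
// Simplified model of a maxDocs / maxTime commit policy; not Solr code.
final class AutoCommitSketch {

    private final int maxDocs;
    private final long maxTimeMillis;
    private int pendingDocs = 0;
    private long oldestPendingMillis = -1;
    private int commits = 0;

    AutoCommitSketch(int maxDocs, long maxTimeMillis) {
        this.maxDocs = maxDocs;
        this.maxTimeMillis = maxTimeMillis;
    }

    // Records one added document at the given clock time and commits if a threshold is reached.
    void add(long nowMillis) {
        if (pendingDocs == 0) {
            oldestPendingMillis = nowMillis;
        }
        pendingDocs++;
        commitIfDue(nowMillis);
    }

    // Simulates the passing of time without new documents.
    void tick(long nowMillis) {
        commitIfDue(nowMillis);
    }

    private void commitIfDue(long nowMillis) {
        if (pendingDocs == 0) {
            return;
        }
        boolean tooMany = pendingDocs >= maxDocs;
        boolean tooOld = nowMillis - oldestPendingMillis >= maxTimeMillis;
        if (tooMany || tooOld) {
            commits++;
            pendingDocs = 0;
            oldestPendingMillis = -1;
        }
    }

    int commits() {
        return commits;
    }

    int pendingDocs() {
        return pendingDocs;
    }
}
```

The trade-off is that recently logged events only become visible in the statistics after the next commit, so the thresholds bound how stale the displayed numbers can be.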
Query warmup
Query warmup is used to optimize the query execution time
Preheated queries are cached by the Solr server based on the filter query
Queries are being warmed up for the current month
Query warmup required at:
  Server startup
  End of each month
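Why warming helps can be illustrated with a small cache keyed by filter query: once a query has been executed, later requests with the same filter are answered from the cache instead of hitting the index. The Java sketch below is a toy model under that assumption. FilterCacheSketch, warm and count are hypothetical names, and Solr's own caches are more elaborate than this LRU map.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model of a result cache keyed by filter query; not Solr code.
final class FilterCacheSketch {

    private final Map<String, Long> cache;
    private final Function<String, Long> backend;
    private int backendCalls = 0;

    // 'backend' stands for the expensive index lookup; 'capacity' bounds the cache size.
    FilterCacheSketch(int capacity, Function<String, Long> backend) {
        this.backend = backend;
        // Access-ordered map that evicts the least recently used entry when full.
        this.cache = new LinkedHashMap<String, Long>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns the cached result for the filter query, computing it on a miss.
    long count(String filterQuery) {
        Long cached = cache.get(filterQuery);
        if (cached != null) {
            return cached;
        }
        backendCalls++;
        long result = backend.apply(filterQuery);
        cache.put(filterQuery, result);
        return result;
    }

    // Executes the expected queries up front so that later requests are cache hits.
    void warm(Iterable<String> filterQueries) {
        for (String filterQuery : filterQueries) {
            count(filterQuery);
        }
    }

    int backendCalls() {
        return backendCalls;
    }
}
```

Since the filter query contains the month, the cached entries stop matching when the month changes, which is why a second warmup round is needed at the end of each month.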
Server startup:
<listener event="firstSearcher" class="solr.QuerySenderListener">
End of month:
Execute all expected queries for next month
More detailed information about how to improve your performance
Performance improvements tips page:
  Coming soon at http://atmire.com/statistics_performance.php
  Register at http://atmire.com/contact.php to be notified when the explanation has been completed
  Or email [email protected]
6 - Content and Usage Analysis Module
CUA module
  Designed and developed by @mire
  Module core contributed to DSpace 1.6
  Same data logging
  Improved interface
Statlets
Configurable display in repository
  Determine the displayed data
  Separate configuration for item, collection, community, repository homepage
Graphs
  Generate various types of graphs, and integrate them in the display pages
Administrator interface
Wide range of reports
  Created instantaneously in the web interface
  Configure type of report to be requested
  Fast access to results to verify the configuration
  Generate data in a few clicks
View report as:
  Data table
  Downloadable spreadsheet
  Various graph types
Content analysis
Visualize repository growth
  Display number of records per year
  Compare growth amongst various communities
Visualize distribution
  Display number of records per type, language, …