Initial proposal for DSpace statistics application

Federico Paparoni


Page 1: Initial proposal for DSpace statistics application

Statistics application for DSpace (initial proposal)

Federico Paparoni

Page 2: Initial proposal for DSpace statistics application

Architecture

Page 3: Initial proposal for DSpace statistics application

Request Layer

Information is collected from different sources. The main points that will log the statistics data are:

1. Filter: a filter that logs information about general hits and visitors of the DSpace platform
2. View: every JSP tag (ItemTag, CollectionListTag and so on) will log information about hits and everything that can be read from the HttpRequest
3. Search: searches on the DSpace platform will log the searched queries

Page 4: Initial proposal for DSpace statistics application

Request Layer/2

These logging points will use a dedicated logger, defined in a new Log4j properties file.

So the structure of DSpace logging will not change, and a more detailed log, for statistics purposes, will be created.

Page 5: Initial proposal for DSpace statistics application

Request Layer/3

A possible layout of this log file:

2007-04-19 17:10:28,031 INFO [Filter] Page hits from 151.100.41.12
2007-04-19 17:10:30,031 INFO [Filter] Page hits from 151.100.41.12
2007-04-19 17:11:40,031 INFO [View] ItemView : NameOfItem : 151.100.41.12
2007-04-19 17:11:41,031 INFO [View] ItemView : HttpReferer : http://www.somehost.com
2007-04-19 17:11:42,031 INFO [View] CollectionView : HttpReferer : http://www.somehost.com
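The layout above can be produced by a small formatting helper. The following is only a sketch: the class name StatsLogFormat and its method names are hypothetical and not part of DSpace or of this proposal, and in practice the timestamp and level would normally be written by the Log4j layout rather than by hand.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper that renders one statistics event in the proposed layout. */
final class StatsLogFormat {
    // Same timestamp shape as the sample lines: date, time, comma, milliseconds.
    private static final DateTimeFormatter TIMESTAMP =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    private StatsLogFormat() {}

    /** Builds "<timestamp> INFO [<source>] <message>". */
    static String line(LocalDateTime when, String source, String message) {
        return when.format(TIMESTAMP) + " INFO [" + source + "] " + message;
    }

    /** A general page hit, as the Filter would record it. */
    static String pageHit(LocalDateTime when, String remoteAddress) {
        return line(when, "Filter", "Page hits from " + remoteAddress);
    }

    /** A view event made of three parts, e.g. "ItemView : HttpReferer : <url>". */
    static String viewEvent(LocalDateTime when, String view, String key, String value) {
        return line(when, "View", view + " : " + key + " : " + value);
    }

    public static void main(String[] args) {
        LocalDateTime when = LocalDateTime.of(2007, 4, 19, 17, 10, 28, 31_000_000);
        System.out.println(pageHit(when, "151.100.41.12"));
        System.out.println(viewEvent(when, "ItemView", "HttpReferer", "http://www.somehost.com"));
    }
}
```

Keeping the three-part "view : key : value" shape for every View event would make the file easier to parse later in the Core Layer.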

Page 6: Initial proposal for DSpace statistics application

Request Layer/4

The work that has to be done for this layer is:

1. Identify the logging points
2. Identify the information to be logged
3. Create a log file with a suitable format

Page 7: Initial proposal for DSpace statistics application

Core Layer

This layer, like the old stats application, will parse the log file and store the information in the DB.

The communication between the Core Module and the DB can be implemented in different ways: the DSpace library (org.dspace.storage.rdbms), Hibernate, or iBatis.
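As an illustration of the parsing step, here is a minimal sketch that splits one line of the proposed log layout into its fields. The class StatsLogParser, the nested Entry type and the regular expression are assumptions made for this example, not an existing DSpace API. The sketch stops before the database write, since the persistence technology is still an open choice.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for lines such as "2007-04-19 17:10:28,031 INFO [Filter] Page hits from ...". */
final class StatsLogParser {

    /** One parsed log line; the field names are illustrative. */
    static final class Entry {
        final String timestamp;
        final String level;
        final String source;
        final String message;

        Entry(String timestamp, String level, String source, String message) {
            this.timestamp = timestamp;
            this.level = level;
            this.source = source;
            this.message = message;
        }
    }

    // Groups: timestamp, level, source in square brackets, rest of the line.
    private static final Pattern LINE = Pattern.compile(
            "(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\s+(\\w+)\\s+\\[(\\w+)\\]\\s+(.*)");

    private StatsLogParser() {}

    /** Returns the parsed entry, or an empty Optional when the line does not follow the layout. */
    static Optional<Entry> parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new Entry(m.group(1), m.group(2), m.group(3), m.group(4)));
    }

    public static void main(String[] args) {
        parse("2007-04-19 17:11:40,031 INFO [View] ItemView : NameOfItem : 151.100.41.12")
                .ifPresent(e -> System.out.println(e.source + " -> " + e.message));
    }
}
```

Returning an empty Optional for unrecognized lines lets the Core Module skip them instead of failing on the whole file.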

Page 8: Initial proposal for DSpace statistics application

Core Layer/2

Tables created and managed by the Core Module will maintain the statistics information.

There will also be a “Cleanup module” that will aggregate old information in some tables, to avoid wasting resources.

This “Cleanup module” can also be executed from the Web interface.
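One way to picture the aggregation done by the “Cleanup module” is to collapse individual hits older than a cutoff date into one count per day. The sketch below works on an in-memory list only; the class StatsCleanup, the method aggregateBefore and the daily granularity are assumptions for illustration, since the proposal does not define the tables or the aggregation rule.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical aggregation step: many old hit rows become one row per day. */
final class StatsCleanup {

    private StatsCleanup() {}

    /** Counts hits per day for every hit dated strictly before the cutoff; newer hits are ignored. */
    static Map<LocalDate, Integer> aggregateBefore(List<LocalDate> hitDates, LocalDate cutoff) {
        Map<LocalDate, Integer> daily = new TreeMap<>();
        for (LocalDate day : hitDates) {
            if (day.isBefore(cutoff)) {
                daily.merge(day, 1, Integer::sum);
            }
        }
        return daily;
    }

    public static void main(String[] args) {
        List<LocalDate> hits = Arrays.asList(
                LocalDate.of(2007, 4, 18),
                LocalDate.of(2007, 4, 18),
                LocalDate.of(2007, 4, 19),
                LocalDate.of(2007, 4, 20));
        // Only hits before 2007-04-20 are aggregated.
        System.out.println(aggregateBefore(hits, LocalDate.of(2007, 4, 20)));
    }
}
```

In a real module the detailed rows that were aggregated would then be deleted, which is where the saving in resources would come from.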

Page 9: Initial proposal for DSpace statistics application

Web Layer

The Web Layer will create different views containing the collected information.

Communication with the DB will use a library (as in the Core Module).

Information will be organized in different views, using different formats.

Page 10: Initial proposal for DSpace statistics application

Open questions

1. Private access or public access?
2. Configuration-based views?
3. JMS-based logging?