
United Nations Economic Commission for Africa

African Centre for Statistics

Handbook on Major Statistical Data Management Platforms

Addis Ababa, October 2011


Contents

I. BACKGROUND INFORMATION
  1.1. Statistical data
    1.1.1. Microdata
    1.1.2. Macrodata
  1.2. Statistical data management system
  1.3. Justification of the assignment
  1.4. Organization of this document

II. PROJECT DEFINITION
  2.1. Objective
  2.2. Mode of operation
  2.3. Scope of work

III. MAJOR REQUIREMENTS OF STATISTICAL DATA MANAGEMENT SYSTEMS
  3.1. Data capturing
  3.2. Data storage and retrieval
  3.3. Data processing and dissemination
  3.4. Standard data sharing and exchange
  3.5. Metadata management
  3.6. Indicators management
  3.7. Integration with other systems
  3.8. Data security
    3.8.1. Backup and restore features
    3.8.2. Access control
    3.8.3. User management
    3.8.4. Users and data auditing
  3.9. GIS support
  3.10. Reporting features
  3.11. Training
  3.12. User interface
  3.13. Alerting feature
  3.14. Analysis tools
  3.15. Scalability
  3.16. Extendibility
  3.17. System environment

IV. AVAILABLE STATISTICAL DATA MANAGEMENT SYSTEMS
  4.1. List of statistical data management systems
  4.2. Product descriptions
    4.2.1. CountrySTAT (FAO)
    4.2.2. DevInfo (UNICEF)
    4.2.3. Eurotrace (Eurostat)
    4.2.4. LABORSTA (ILO)
    4.2.5. Live database (World Bank)
    4.2.6. Nesstar
    4.2.7. StatBase (UNECA)
    4.2.8. StatWorks (OECD)
  4.3. Feature comparisons

V. SOFTWARE SELECTION GUIDELINES
  5.1. Hidden factors for software selection
    5.1.1. Vendor history and experience
    5.1.2. Cost
    5.1.3. Ease of use/adoption
    5.1.4. Maintenance
    5.1.5. Familiarity
    5.1.6. Security
    5.1.7. Software as a service (SaaS)
  5.2. Important steps in selecting the right SDMS
    5.2.1. Needs analysis
    5.2.2. Management support
    5.2.3. Requirements specification
    5.2.4. RFP preparation
    5.2.5. Software demonstration
    5.2.6. System selection and contract negotiation

VI. CONCLUSION

VII. RECOMMENDATIONS

ANNEX
  1. Questionnaire to NSOs
  2. Questionnaire to experts
  3. Questionnaire to vendors

REFERENCES


I. BACKGROUND INFORMATION

1. There is broad consensus among African Governments and development partners about the need for better statistics in support of sound policy formulation for the achievement of internationally-agreed goals, including the Millennium Development Goals (MDGs). Governments of African States realize that the right use of better statistics is essential for good policies and development outcomes. This recognition requires more accurate and timely statistics supported by a robust and integrated information technology environment.

2. National statistical offices (NSOs) on the continent, however, are providing limited statistical products and services in terms of quantity, type and quality, and are therefore unable to respond adequately to the increasing demand by their Governments and the international community for better development statistics.

3. One of the recommendations put forward by the Data Management Working Group during the first and second meetings of the Statistical Commission for Africa was to set up a group of experts made up of statisticians, and data management and geo-information experts, to evaluate the major statistical data management platforms available and compare their features so that member States and their partners can make informed decisions on the selection of platforms for statistical data collection, production and dissemination. The recommendation was prompted by the plethora of offers of data management platforms that member States receive. Some of these offers are at no or reduced cost as part of assistance projects, while others are at commercial rates. Even when there are no financial costs, accepting all offers would result in duplication of efforts with associated wastage of scarce human capacity, and the possibility of data inconsistencies. Feature documentation of such statistical data management systems, together with selection guidelines or a handbook, will therefore facilitate the right platform selection to enhance the sustainability of information infrastructures and associated tools for the effective management and dissemination of statistical data, applications and services.

1.1. Statistical data

4. The notion of statistical data encompasses all the facts and estimated values of a certain specific entity. In the context of this handbook, “statistical data” refers to sequences of observations or estimated values of social, economic, political and environmental entities. Although there are various ways of classifying and differentiating statistical data, micro- and macrodata are worth mentioning in order to understand the scope of this assignment.

1.1.1. Microdata

5. Microdata are data about individual objects such as a person, event, transaction, etc. Every object can be characterized by properties. The values of these properties are considered as microdata. In microdata sets, each row typically represents an individual object and each column an attribute or characteristic feature of the object. Microdata are often collected from each object through a survey or individual measurement.

1.1.2. Macrodata

6. Macrodata are estimated values of statistical characteristics of sets of objects. Macrodata can be generated by combining, aggregating, or summarizing microdata or by direct observation and estimation of a group of entities. Macrodata comprise files containing tabulations, counts and frequencies.

© 2011 African Centre for Statistics (UNECA)
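The relationship between microdata and macrodata described above can be sketched in a few lines of Python. This is an illustration only: the field names (person_id, region, age, employed) and the values are hypothetical and are not taken from any survey discussed in this handbook.

```python
from collections import Counter, defaultdict

# Hypothetical microdata: one row per surveyed person, one key per attribute.
microdata = [
    {"person_id": 1, "region": "North", "age": 34, "employed": True},
    {"person_id": 2, "region": "North", "age": 51, "employed": False},
    {"person_id": 3, "region": "South", "age": 27, "employed": True},
    {"person_id": 4, "region": "South", "age": 45, "employed": True},
]


def aggregate_by_region(rows):
    """Summarize person-level rows into one macrodata record per region."""
    counts = Counter()
    employed = Counter()
    age_totals = defaultdict(int)
    for row in rows:
        region = row["region"]
        counts[region] += 1
        employed[region] += int(row["employed"])
        age_totals[region] += row["age"]
    return {
        region: {
            "persons": counts[region],
            "employed": employed[region],
            "mean_age": age_totals[region] / counts[region],
        }
        for region in sorted(counts)
    }


# Macrodata: counts and averages per region rather than per person.
macrodata = aggregate_by_region(microdata)
```

Each record in macrodata describes a set of objects (a region) rather than an individual object, which is the distinction drawn in paragraphs 5 and 6.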

1.2. Statistical data management systems

7. A statistical data management system (SDMS) is a system that can model, store and manipulate data in a manner well suited to the needs of users who want to perform statistical analyses on the data. SDMSs offer process-oriented feature sets which help users traverse from data capture through the process of statistical data validation and production and information dissemination.

8. Statistical data analysis functionalities, including data validation, standardization support, metadata management and indicator management, are some of the core features of SDMSs which differentiate them from ordinary database systems.

9. Statistical data management systems are expected to:

(a) Increase the quality of the statistical information produced;

(b) Improve processes of statistical data analysis; and

(c) Modernize and increase the quality of data dissemination.

1.3. Justification of the assignment

10. National, regional and subregional statistical offices and organizations often need to compile data from various sources and disseminate the data to diverse user communities. That need should be a determining factor in choosing a statistical data management platform for such offices and organizations, which presupposes that the officers responsible for the selection have adequate knowledge of the capabilities of the various offerings. That is not always the case and some offices, therefore, end up with systems that may not fully satisfy their needs. Some have implemented multiple systems to benefit from system complementarity.

11. While it is not necessarily wrong to implement multiple systems if the situation warrants it, member States have expressed the need for guidance on the capabilities of the various options to make informed decisions with regard to the optimum platform (or platforms) for their particular environments. This handbook is therefore intended to address this need by documenting feature descriptions of the existing platforms. It also presents guidelines to be followed in selecting the required platform for the task at hand.

1.4. Organization of this document

12. This document is organized as follows. Section 2 presents the project definition, in which the objective, mode of operation and scope are described. Section 3 outlines the critical requirements of a statistical data management system. This is by no means an exhaustive list of features, but is intended to serve as a reference for organizations. Section 4 documents the features of major statistical data management systems which are currently in use in member States and partner institutions. This is simple feature documentation of SDMSs which should not be considered as a feature comparison. Section 5 outlines the system selection guidelines and describes the major factors which influence the process of SDMS selection. This section also presents the steps to be followed in selecting the right SDMS for an organization. Finally, concluding remarks and recommendations are presented in Sections 6 and 7 respectively.

II. PROJECT DEFINITION

2.1. Objective

13. The main objective of this initiative is to produce a publication that documents characteristic features of major statistical data management platforms to serve as a guide for member States implementing data management services.

2.2. Mode of operation

14. In order to achieve the above objective, participatory design principles were strictly followed in the implementation of this initiative. Participatory design is an approach that emphasizes the active involvement of all stakeholders throughout the implementation of an initiative. The approach promotes communication and learning among stakeholders (including system vendors, experts, system users and management) and reduces last-minute surprises by keeping participants informed gradually and continuously.

15. To that end, the following operations were performed in the course of the initiative:

(a) An expert group, comprising individuals from different countries and institutions, was formed to support the initiative;

(b) An online discussion forum was set up to communicate ideas around selecting a suitable statistical data management platform;

(c) An expert group meeting was held and valuable feedback and suggestions on the draft handbook were forwarded after the discussions;

(d) Questionnaires were designed and distributed to three different types of stakeholders, namely national statistical offices, experts and system vendors (see annex);

(e) Physical observation of a selected site was conducted. This was to gauge how comfortable users were in using the system. Other working environments for the system were also taken into consideration;

(f) A review of technical specifications for selected statistical data management and dissemination platforms was conducted; and

(g) Demonstrations of selected statistical data management and dissemination systems were undertaken.

16. In general, intensive communications and discussions with all stakeholders were conducted to produce this document, including via an online discussion forum, emails, telephone discussions and the distribution of questionnaires.


2.3. Scope of work

17. This initiative focused on macrodata management systems, identified as the area of immediate need by member States. Microdata management platforms will be dealt with separately as the needs in that area are different.

18. The project is also limited to analysing and documenting statistical data management platforms which are currently in use in the national statistical offices of member States and/or partner institutions. Systems deployed elsewhere are not given much attention in this document.

III. MAJOR REQUIREMENTS OF STATISTICAL DATA MANAGEMENT SYSTEMS

3.1. Data capture

19. A statistical data management system must allow users to capture statistical data; at a minimum, it should capture all the data that users intend to store. The system should also offer appropriate data entry schemes. Some users compile their data in other software, such as MS Excel, and need to import them into the system in batch mode.

20. The system is also expected to validate the data at the time of entry. Data validation is a critical feature for SDMSs.
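As a minimal sketch of validation at the time of entry, the Python example below checks each row of a batch import (for instance, a CSV file exported from a spreadsheet) before accepting it. The column names, the list of valid regions and the range checks are assumptions made for illustration; an actual SDMS would define its own rules.

```python
import csv
import io

# Hypothetical validation rules for one indicator table.
VALID_REGIONS = {"North", "South"}


def validate_row(row):
    """Return a list of problems found in one imported row (empty if clean)."""
    problems = []
    if row.get("region") not in VALID_REGIONS:
        problems.append(f"unknown region: {row.get('region')!r}")
    try:
        year = int(row.get("year"))
        if not 1900 <= year <= 2100:
            problems.append(f"year out of range: {year}")
    except (TypeError, ValueError):
        problems.append(f"year is not a number: {row.get('year')!r}")
    try:
        value = float(row.get("value"))
        if value < 0:
            problems.append(f"negative value: {value}")
    except (TypeError, ValueError):
        problems.append(f"value is not a number: {row.get('value')!r}")
    return problems


def import_batch(csv_text):
    """Split a CSV batch into accepted rows and (line number, problems) pairs."""
    accepted, rejected = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        problems = validate_row(row)
        if problems:
            rejected.append((line_number, problems))
        else:
            accepted.append(row)
    return accepted, rejected


batch = "region,year,value\nNorth,2010,41.5\nEast,2010,38.0\nSouth,20x0,-3\n"
accepted, rejected = import_batch(batch)
```

Rejected rows are reported with their line number and the reasons for rejection, so that the person entering the data can correct them at the source instead of discovering the errors after dissemination.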

21. Some commercial word processing packages, such as Microsoft Word, offer AutoText, since expanded into Building Blocks, to facilitate data entry. In the word processing context, building blocks are stored snippets that can contain formatted or unformatted text, graphics and other objects, which the user can define and insert into a document when needed. The building block concept can also be implemented in SDMSs to speed up data entry and reduce errors.

22. Pulling data through web services from third-party database systems is also a crucial data capture feature that most SDMSs are required to possess.

3.2. Data storage and retrieval

23. Statistical organizations are responsible for collecting and storing huge amounts of statistical data in order to provide decision makers, researchers and the general public with timely and accurate information. Given the volume of data maintained and users’ expectations and demands for quality data, storage and access need to be supported by a robust statistical database system.

24. Storage and retrieval is, therefore, one of the major requirements of any statistical database system. Database systems need to store huge amounts of data in a systematic manner. They should also offer a flexible, intuitive and simple retrieval module which assists decision makers, the general public, and other users with limited system manipulation expertise to access the information from the database.


3.3. Data processing and dissemination

25. Any statistical data management system is expected to perform data processing activities such as coding, editing and data harmonization, to name just a few. Once data are processed and the required adjustments are made, the database system should provide a dissemination facility.

26. Nowadays, the Internet is the most widely used dissemination medium. This technology is composed of a number of functional features:

(a) Electronic mail serves as a common platform for sending electronic messages. It is most appropriate for sending periodic reports to a selected and predefined user community;

(b) Websites are used to publish statistical information at a specified location on the Internet for the general public; and

(c) Websites also furnish features for transferring statistical data files in different formats (Excel, PDF, Word, etc.). They are increasingly becoming dissemination channels for statistical data, offering a simple, comparatively cheap and efficient way to provide timely information to the core users of statistics as well as to a broader audience.

27. Most statistical database systems therefore possess a facility to publish information in a web readable format. Accordingly, web publishing capability is a critical SDMS requirement.

3.4. Standard data sharing and exchange

28. National statistical offices face tremendous pressure to provide reports to other organizations including Government offices, international development organizations, and partners. At the same time, NSOs need to capture data from various sources, including partner institutions, with different formats. It is also abundantly clear that these activities are performed frequently and entail a huge amount of data flow. Keying in such data manually is mostly a resource-intensive, tedious and error-prone activity which needs to be reduced as far as possible.

29. Synergies, standardization and optimization of processes and infrastructures are the only solution to this challenge. Standard exchange formats such as Statistical Data and Metadata Exchange (SDMX) can help by improving quality and efficiencies in the exchange and dissemination of data and metadata through:

(a) Harmonization and coherence of data;

(b) Preservation of meaning by coupling data with metadata that defines and explains it accurately;

(c) Use of an open format such as XML rather than a proprietary one; and

(d) Facilitating and standardizing the use of new technologies such as XML and Web services. Many NSOs are already using, or are planning to use, XML as the basis for their data management and dissemination systems. By choosing SDMX, the proliferation of many XML grammars could be avoided.
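The idea in item (b), coupling data with the metadata that explains them in an open XML format, can be illustrated with a short Python sketch. The element and attribute names below are invented for this example and are not the SDMX-ML schema; the values are also hypothetical. The point is only that a receiving system can read both the observations and their description from the same message without manual re-keying.

```python
import xml.etree.ElementTree as ET

# Illustrative layout only: these element and attribute names are invented
# for this sketch and are NOT the SDMX-ML schema.
MESSAGE = """
<dataset indicator="PUPIL_TEACHER_RATIO" unit="pupils per teacher" source="NSO (hypothetical)">
  <obs area="North" period="2010" value="41.5"/>
  <obs area="South" period="2010" value="38.0"/>
</dataset>
"""


def read_message(xml_text):
    """Parse the message into (metadata dict, list of observation dicts)."""
    root = ET.fromstring(xml_text.strip())
    metadata = dict(root.attrib)
    observations = [
        {
            "area": obs.get("area"),
            "period": obs.get("period"),
            "value": float(obs.get("value")),
        }
        for obs in root.findall("obs")
    ]
    return metadata, observations


metadata, observations = read_message(MESSAGE)
```

A shared standard such as SDMX plays the role that the invented layout plays here, with the difference that every sender and receiver agrees on the same grammar in advance.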

3.5. Metadata management

30. Metadata are defined as data about data, and refer to the definitions, descriptions of procedures, methodologies, system parameters and operational results that characterize and summarize statistical data. Metadata are data describing different quality aspects of statistical data, such as file contents, and definitions of objects, populations, variables, etc. This includes details on data accuracy, for example descriptions of the differences between the observed/estimated and true values of variables and statistical characteristics. Metadata can include information on which statistical data are available, where they are located, and how they can be accessed. Metadata also might contain a description of the content and layout, and a description of validation, aggregation and reports preparation rules. In other words, metadata can be considered as an entity describing the meaning, accuracy, availability and other important characteristics of the underlying data. These characteristic features of the underlying data are essential for correctly identifying and retrieving relevant statistical data for a specific problem as well as for correctly interpreting and reusing the data.

31. Metadata are critical because data are only made accessible through their accompanying documentation. Without a description of their various elements, data resources will manifest themselves to the end user as more or less meaningless collections of numbers. Metadata provide the bridge between the producers of data and their users and convey information that is essential for secondary analysis.

32. Because metadata are critical, metadata management is one of the core requirements of SDMSs. It is this feature which manages the metadata required for defining the content, quality, security, accessibility and other aspects of the actual database. The system, through the metadata management module, is expected to present a description of data content and layout, as well as a description of validation, aggregation and report preparation rules.

33. Standardization of metadata elements makes information sharing more reliable and universal. The use of metadata standards enables producers to describe data sets fully and coherently. Such standards also facilitate data discovery, retrieval and use. The Data Documentation Initiative (DDI) is an example of a metadata standard which is used for documenting data sets and is designed to be fully machine readable and machine processable. Compliance with metadata standards is another critical requirement that an SDMS needs to demonstrate.

3.6. Indicators management

34. Statistical indicators are any quantitative data that provide evidence about the quantity, quality or standard of an entity. The following are some examples of indicators collected by the World Bank (http://data.worldbank.org/indicator):

(a) Expenditure per student, primary (% of GDP per capita);

(b) Public spending on education, total (% of government expenditure);

(c) Expenditure per student, secondary (% of GDP per capita); and

(d) Pupil-teacher ratio (primary).

35. In most cases, SDMSs should enable users to create new indicators and manage existing ones. The management might include operations such as categorizing indicators into thematic groups, deleting existing indicators, or any other modifications.
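The operations listed in paragraph 35 (creating indicators, grouping them by theme, modifying and deleting them) can be sketched as a small in-memory registry. The class names, indicator codes and themes below are hypothetical and serve only to illustrate the idea; they do not describe the data model of any platform covered in this handbook.

```python
from dataclasses import dataclass, field


@dataclass
class Indicator:
    code: str
    name: str
    unit: str
    theme: str


@dataclass
class IndicatorRegistry:
    """In-memory store supporting create, recategorize, delete and grouping."""

    indicators: dict = field(default_factory=dict)

    def create(self, indicator):
        if indicator.code in self.indicators:
            raise ValueError(f"indicator {indicator.code!r} already exists")
        self.indicators[indicator.code] = indicator

    def recategorize(self, code, theme):
        self.indicators[code].theme = theme

    def delete(self, code):
        del self.indicators[code]

    def by_theme(self):
        """Group indicator codes under their thematic category."""
        groups = {}
        for indicator in self.indicators.values():
            groups.setdefault(indicator.theme, []).append(indicator.code)
        return groups


registry = IndicatorRegistry()
registry.create(
    Indicator("PTR_PRIMARY", "Pupil-teacher ratio (primary)", "ratio", "Education")
)
registry.create(
    Indicator(
        "EDU_SPEND",
        "Public spending on education, total",
        "% of government expenditure",
        "Finance",
    )
)
# Move an existing indicator into a different thematic group.
registry.recategorize("EDU_SPEND", "Education")
themes = registry.by_theme()
```

Rejecting a duplicate code at creation time is one simple way to keep indicator definitions consistent across data sets.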

3.7. Integration with other systems

36. Organizations rarely rely on a single software system to manage all of their processes. For different reasons, most organizations deploy multiple technology solutions over time to manage their day-to-day activities. Ultimately, however, as those systems work towards the vision of a single organization, the need for integration arises. The same requirement might arise with statistical data management systems.

37. System integration deals with making two or more systems communicate. Such communication can happen at different levels of closeness. For example, a standard import/export facility can be used to transfer data from one system to another, or to refer to data held in another database.

3.8. Data security

38. Data security is a broad concept and can be defined from various perspectives, each giving rise to separate SDMS requirements. Some of these perspectives are presented in the subsections below.

3.8.1. Backup and restore features

39. An SDMS should provide an automatic backup feature for all inputs made to the system. It should also furnish a restore facility, which will enable the system to recover lost data. A manual backup and restore feature is also a crucial component of any database system. Users (administrators) should be allowed to configure periodic backups or run one-time backup processes.
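A one-time backup followed by a restore can be sketched with Python's built-in sqlite3 module, whose Connection.backup method copies one database into another. SQLite and the table layout are assumptions chosen to keep the example self-contained; the platforms described in this handbook may use other database engines, and scheduling of periodic backups is left out.

```python
import sqlite3


def backup_database(source, destination):
    """Copy the full contents of the source database into the destination."""
    source.backup(destination)


# Hypothetical live database with a single observation.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE observations (area TEXT, period INTEGER, value REAL)")
live.execute("INSERT INTO observations VALUES ('North', 2010, 41.5)")
live.commit()

# One-time backup into a second database.
snapshot = sqlite3.connect(":memory:")
backup_database(live, snapshot)

# Simulate data loss, then restore by copying the snapshot back.
live.execute("DELETE FROM observations")
live.commit()
backup_database(snapshot, live)

restored = live.execute("SELECT area, period, value FROM observations").fetchall()
```

In practice the snapshot would be written to a file on separate storage, and an administrator would configure how often the backup runs.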

3.8.2. Access control

40. In most database systems, access control is defined through roles, which determine permissions. Roles are job functions within the context of an organization, with some associated semantics regarding the authority and responsibility conferred on the user assigned to the role. A role can be configured to consolidate the users’ responsibilities and the permissions that users require to perform a specific function.

41. Permissions can be granted to access functions such as data editing, data approval and other administrative functions, or to access restricted data, such as data for a specific geographic area. Role-based access control is required because it simplifies mass updates of user permissions; an organization need only change the permissions of a role, and the users assigned that role will inherit the new set of permissions automatically.
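The inheritance of permissions through roles can be shown with a minimal Python sketch. The role names, permission names and user names are hypothetical.

```python
# Hypothetical roles, permissions and users for illustration.
role_permissions = {
    "data_entry": {"edit_data"},
    "supervisor": {"edit_data", "approve_data"},
}
user_roles = {"amina": "data_entry", "kofi": "data_entry", "lindiwe": "supervisor"}


def is_allowed(user, permission):
    """A user holds a permission only through the role assigned to them."""
    role = user_roles.get(user)
    return permission in role_permissions.get(role, set())


before = is_allowed("amina", "export_data")

# One change to the role updates every user assigned to it.
role_permissions["data_entry"].add("export_data")
after = [is_allowed(user, "export_data") for user in ("amina", "kofi", "lindiwe")]
```

Because permissions are attached to the role rather than to individual users, a single change reaches both users assigned to the data_entry role, while the supervisor role is unaffected.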


3.8.3. User management

42. An SDMS, especially if it runs in a multi-user environment, is expected to provide a feature that enables organizations to define administrative functions and manage users based on specific requirements such as job role or geographic location. Depending on the nature of the organization, different approaches can be followed to create users, as indicated below:

(a) User registration by centralized administration: In this approach, a system administrator is responsible for creating and managing all users of the system. This approach is appropriate in cases where the database has a small number of users;

(b) Delegating administration: Instead of relying on a centralized administrator to manage all users, an organization can create local administrators and grant them sufficient privileges to manage a specific subset of the organization's users. This provides the organization with a more granular level of security, and the ability to make the most effective use of its administrative capabilities; and

(c) Self-service requests: This approach enables end users to request initial access or additional access to the system. Access requests are either approved automatically by the system (with minimal privileges) or reviewed by the system administrator before approval. A self-service registration process is an ideal approach when the number of system users is large and in cases where users are not known to the administrator before requesting data. This approach is mostly used to grant access privileges for websites.

3.8.4. Users and data auditing

43. An SDMS should possess a feature to audit users and changes they make to the database. It should allow the tracking of users' activities. Audit reports should give detailed historical information on users' activities. Some applications offer real-time information on user activities.

44. Audit trails also help to keep a history of changes to important data. With an audit trail, it is easy to determine how data elements obtained their current value.
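An audit trail of this kind can be sketched as follows. The example is a simplified in-memory illustration under assumed names (the class names, users, key and values are all hypothetical); a production system would persist the trail in the database itself.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """One recorded change to a data element."""
    timestamp: datetime
    user: str
    key: str
    old_value: object
    new_value: object


class AuditedStore:
    """A key-value store that logs every change it accepts."""

    def __init__(self) -> None:
        self._data: dict[str, object] = {}
        self.trail: list[AuditEntry] = []

    def set(self, user: str, key: str, value: object) -> None:
        # Record who changed what, and the value before and after.
        entry = AuditEntry(datetime.now(timezone.utc), user, key,
                           self._data.get(key), value)
        self.trail.append(entry)
        self._data[key] = value

    def get(self, key: str) -> object:
        return self._data[key]

    def history(self, key: str) -> list[AuditEntry]:
        """Return every change made to `key`, oldest first."""
        return [e for e in self.trail if e.key == key]


# Hypothetical users, key and values, for illustration only.
store = AuditedStore()
store.set("alice", "gdp_2010", 100.0)
store.set("bob", "gdp_2010", 104.5)

changes = [(e.user, e.old_value, e.new_value) for e in store.history("gdp_2010")]
```

Reading `changes` from first to last shows how the data element obtained its current value.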

3.9. GIS support

45. Geographical Information System (GIS) technology is now a mature technology which is used to present attractive and intuitive reports using maps. Because most statistical data are tied to a geographic location, GIS support is a critical feature requirement of an SDMS.

3.10. Reporting features

46. An SDMS should have a reporting engine which allows users to generate different types of reports. The reporting engine is expected to have predefined report templates as well as allow users to design new and ad hoc reports on the fly.

3.11. Training

47. The critical factor in the success of a major system implementation project is the knowledge transfer that takes place before and during implementation. This can be accomplished using a combined approach where the main objectives are both to educate and to train.

48. Hence, training, though not directly considered as a statistical system requirement, is a major factor to be considered when evaluating a specific platform. Questions such as the following should be asked:

(a) Does the vendor have a sound training strategy?

(b) What is the training approach?

(c) Is there separate training for ordinary end users and key users?

3.12. User interface

49. The SDMS user interface is the medium which helps the user to communicate with the system. In order for the user to fully utilize the system’s functionalities, the system must have a simple, attractive and intuitive user interface. Generally, graphical user interfaces are preferable to their command line counterparts. Items to consider when evaluating a user interface include:

(a) Ease of customization of the look and feel of the user interface by database managers without the intervention of the developer. These are simple modifications, such as increasing/decreasing font size, changing colours of buttons, menus and texts, of the user interface;

(b) Validating data entry - when users enter invalid data, the system should return an error message so that the user can correct the invalid entry;

(c) Error reporting/feedback - the system should offer a facility to report unexpected errors to the developer; and

(d) Wizards - the system should guide the user step by step to complete processes.

50. The system should also offer an expert mode whereby expert users can use shortcuts to operations.
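The data entry validation described in item (b) above can be sketched as follows. The field names, the accepted year range and the wording of the messages are assumptions made for illustration; each SDMS defines its own validation rules.

```python
def validate_observation(record: dict) -> list[str]:
    """Return a list of error messages; an empty list means the record is valid."""
    errors = []

    year = record.get("year")
    if isinstance(year, bool) or not isinstance(year, int) or not 1900 <= year <= 2100:
        errors.append("year must be a whole number between 1900 and 2100")

    value = record.get("value")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        errors.append("value must be a number")
    elif value < 0:
        errors.append("value must not be negative")

    indicator = record.get("indicator")
    if not isinstance(indicator, str) or not indicator.strip():
        errors.append("indicator is required")

    return errors


# Hypothetical records, for illustration only.
valid = validate_observation({"indicator": "population", "year": 2010, "value": 82.9})
invalid = validate_observation({"indicator": "", "year": "20x0", "value": -1})
```

Returning all messages at once, rather than stopping at the first problem, lets the user correct every invalid entry in a single pass.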

3.13. Alerting feature

51. In organizational applications, users, mostly managers, commonly want to be notified when a predefined event occurs in the database. This event could be a new inclusion in the database, an approval request, or a threshold being exceeded. Such a function is known as an alerting feature.

52. An SDMS is required to have an alerting feature, which will assist users to configure alert types and alert recipient groups. Once configured, the system should automatically send alerts at the time of occurrence of the predetermined event.
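The configuration of alert types and recipient groups can be sketched as follows. This is an illustrative design only; the class names, the event structure, the threshold of 1000 and the recipient address are hypothetical, and delivery is simulated by appending to a list.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AlertRule:
    """An alert type: a condition on an event plus the group to notify."""
    name: str
    condition: Callable[[dict], bool]
    recipient_group: str


class AlertingService:
    def __init__(self, groups: dict[str, list[str]]) -> None:
        self.groups = groups                     # group name -> recipient addresses
        self.rules: list[AlertRule] = []
        self.sent: list[tuple[str, str]] = []    # (recipient, alert name)

    def add_rule(self, rule: AlertRule) -> None:
        self.rules.append(rule)

    def handle_event(self, event: dict) -> None:
        """Check every configured rule as soon as an event occurs."""
        for rule in self.rules:
            if rule.condition(event):
                for recipient in self.groups.get(rule.recipient_group, []):
                    # A real system would send an e-mail or SMS here.
                    self.sent.append((recipient, rule.name))


# Hypothetical group, rule and events, for illustration only.
service = AlertingService({"managers": ["manager@example.org"]})
service.add_rule(AlertRule(
    name="threshold exceeded",
    condition=lambda e: e["type"] == "new_value" and e["value"] > 1000,
    recipient_group="managers",
))
service.handle_event({"type": "new_value", "value": 950})    # below threshold
service.handle_event({"type": "new_value", "value": 1200})   # triggers the alert
```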

3.14. Analysis tools

53. Data analysis is an integral part of SDMSs. The major requirement is that an SDMS should possess an easy-to-use analysis tool. Users should be able to easily understand the tool and interpret the results.

3.15. Scalability

54. Scalability is the ability of software to handle a growing amount of work. In statistical database systems this is generally related to the increasing amount of data. If the system quickly reaches a point where it cannot support new additions of data, users, and/or nodes of operation, the system is not scalable.

55. An example of a scalability requirement can be described as follows:

The system should be capable of supporting up to five years of growth, at the maximum expected increase in database size, number of terminals/workstations, and/or activity levels, without a server upgrade or a significant decrease in system response or performance.
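A requirement of this kind can be checked with a simple projection. The sketch below assumes compound annual growth in database size, which is a modelling assumption and not part of the requirement; the starting size, growth rate and server capacities are hypothetical figures.

```python
def projected_size(current_gb: float, annual_growth: float, years: int) -> float:
    """Project database size after compound annual growth."""
    return current_gb * (1 + annual_growth) ** years


def fits_without_upgrade(current_gb: float, annual_growth: float,
                         capacity_gb: float, years: int = 5) -> bool:
    """True if the projected size after `years` stays within server capacity."""
    return projected_size(current_gb, annual_growth, years) <= capacity_gb


# Hypothetical figures: 50 GB today, growing by 20 per cent a year.
size_in_five_years = projected_size(50, 0.20, 5)       # roughly 124.4 GB
enough = fits_without_upgrade(50, 0.20, capacity_gb=200)
too_small = fits_without_upgrade(50, 0.20, capacity_gb=100)
```

The same projection can be repeated for the number of workstations or for activity levels.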

3.16. Extendibility

56. Extendibility is the extent to which software can be adapted to new requirements. It refers to the magnitude of the effort required to add features after implementation of an SDMS. Database systems developed on the basis of component-based architecture are generally highly extendible. They use plug-and-play components for new features.

57. As the addition of new features after implementation of a statistical database system is inevitable, the SDMS is required to be extendible so that the owning organization can incorporate new features with minimal effort and expense.

3.17. System environment

58. The system environment in which an SDMS is running should be given due attention. It is quite difficult to strictly identify a specific environment setting as this varies from organization to organization.

59. The system environment refers to the operating system and the relational database engine an SDMS is running on. Consideration should be given to whether the software can run on a network or is a stand-alone product.

IV. AVAILABLE STATISTICAL DATA MANAGEMENT SYSTEMS

4.1. List of statistical data management systems

60. By employing different data collection methods such as the administration of questionnaires, discussion forums and a literature review, we have come to understand that the following statistical data management systems are in use on the continent. It should be noted, however, that the following list only includes those systems which deal with macrodata management:

(a) CountrySTAT (Food and Agriculture Organization of the United Nations - FAO);

(b) DevInfo (United Nations Children’s Fund - UNICEF);

(c) Eurotrace (Eurostat);

(d) LABORSTA (International Labour Organization - ILO);

(e) Live Database (World Bank);

(f) Nesstar;

(g) StatBase (United Nations Economic Commission for Africa - UNECA); and

(h) StatWorks (Organization for Economic Cooperation and Development - OECD).

4.2. Product descriptions

4.2.1. CountrySTAT (FAO)

61. CountrySTAT is a statistical database system for food and agriculture statistics at the national and subnational levels. It provides access to statistics across thematic areas such as production, prices, trade and consumption. CountrySTAT is the country-specific version of a statistical data management system called FAOSTAT which is deployed at FAO. CountrySTAT serves as a complementary system to FAOSTAT, in that the two systems can seamlessly integrate for data sharing and consolidation. FAOSTAT is designed to consolidate data transferred from specific CountrySTAT deployments to generate quality international statistics on food and agriculture.

62. CountrySTAT has two data categories, namely core and details. The core data category consists of national data shared with the FAOSTAT database. The design of the core data category enables both FAO and country-level statistical offices to easily transfer data between their respective STAT databases. On the other hand, the details category provides more detailed data with subnational relevance and with the lowest levels of disaggregation.

63. CountrySTAT can work with many popular data formats: HTML, XML, Microsoft Excel, Microsoft Access, Comma-Separated Values (CSV) files and others. In addition, SDMX Technical Standards Version 2.0 is supported for the exchange of data and metadata based on a common information model. As it is a web-based system, there is no need to build costly new computer networks to link government offices for the purpose of data exchange.

64. Deploying CountrySTAT requires a Windows operating system, Microsoft Internet Information Server, and PC-Axis and PC-Web family software. Depending on the implementation environment, CountrySTAT can be deployed with a PC-Axis database or can be extended to utilize popular database engines including Oracle, Sybase or MS-SQL.

4.2.2. DevInfo (UNICEF)

65. DevInfo is an integrated desktop and web-enabled tool that supports both standard and user-defined indicators. A standard set of MDG indicators is at the core of the DevInfo package. In addition, at the regional and country levels, database administrators have the option to add local indicators to their databases. The software supports an unlimited number of levels of geographical coverage: from the global level to regional, subregional, national and subnational levels down to subdistrict and village levels (including data on schools, health centres, water points, and other infrastructure).

66. DevInfo has simple and user-friendly features that can be used to query the database and generate tables, graphs and maps. The system provides an ideal tool for evidence-based planning, results-focused monitoring, and advocacy. It allows data to be organized, stored and displayed in a uniform way to facilitate data sharing at the country level across government departments, United Nations agencies and development partners.

67. Data from DevInfo can be exported to XLS, HTML, PDF, CSV and XML files and imported from spreadsheets in a standardized format. DevInfo also has a data exchange module for importing data from industry-standard statistics software packages such as SPSS, SAS, Stata, Redatam, and CSPro.

68. DevInfo is distributed royalty-free to all member States and United Nations agencies for deployment on both desktops and the Internet. The user interface of the system and the contents of the databases it supports include country-specific branding and packaging options which have been designed to ensure broad ownership by national authorities. UNICEF places no restrictions on the database or its use.

69. The most common DevInfo users include United Nations country teams, national statistical offices, planning ministries and district planners. Frequent users also comprise members of the media (for reporting and tracking human development data), educational institutions (for analysing data and helping students gain data access), as well as DevInfo administrators (for customizing the system and adding data through advanced database administration modules).

4.2.3. Eurotrace suite

70. Eurotrace is statistical data processing software for external trade statistics, designed by Eurostat. It has end-to-end features which enable users to capture, process, store and disseminate statistical data. More specifically, it has tools for data entry, data transfers, data checking, data editing, validation, and dissemination. Eurotrace can also be used as a companion package with ASYCUDA (Automated System for Customs Data).

71. It is composed of the following three main software modules:

(a) Eurotrace Editor - is designed to allow users to enter data efficiently and export to Eurotrace DBMS. With Eurotrace, data can be grouped into manageable subsets that can be distributed to many people for correction and adjustment. It tracks which subsets are produced and matches the corrected subsets to the originals, ensuring control of the complete distribution of the data processing effort. Eurotrace also provides wizards for translation and aggregation of data;

(b) Eurotrace DBMS - permits complete preparation of statistical data, including metadata management, management of validation rules, data aggregation and transformation, and management of data import/export. Manual data correction and editing is also possible by exporting data to Eurotrace Editor; and

(c) Comext Browser - is a system for the storage, analysis and retrieval of statistical data, which is used to view, extract and do calculations on external trade data. Comext also offers facilities to assist the dissemination of data. The browser has both server and client versions. The main characteristics of the Comext Browser include its multidimensional and virtual spreadsheet, integration with Microsoft Excel, exporting to HTML and XML formats, online analytical processing (OLAP) engine to perform aggregation and/or combine data among different nomenclatures on the fly, and multilingual nomenclatures for successor-predecessor relationships.

72. The following figure presents the three basic components of Eurotrace and their integration:

Fig. 1. Eurotrace Suite Modules (Source: Eurotrace brochure)

73. From the technical point of view, the Eurotrace suite of programmes is built with the Microsoft Visual Basic and C++ programming languages and supports the Data Access Objects (DAO) and Open Database Connectivity (ODBC) standards.

4.2.4. LABORSTA (ILO)

74. LABORSTA is the statistical data management platform of the International Labour Organization, operated by the ILO Department of Statistics. It was developed to manage labour-related statistical data such as:

(a) Total and economically active population;

(b) Employment;

(c) Unemployment;

(d) Hours of work;

(e) Wages;

(f) Labour costs;

(g) Consumer price indices;

(h) Occupational injuries;

(i) Strikes and lockouts;

(j) Household income and expenditure; and

(k) International labour migration.

75. It also has a predefined metadata definition which can be accessed by users of the system. Users can submit their requests to the system through a query form and can download the query results in a format of their choice.

4.2.5. Live database (World Bank)

76. The Live Database (LDB) is a user-friendly computer-based data tool that consists of:

(a) A Local Database - a tool for in-depth economic work;

(b) Query - a tool for storing and manipulating economic and sectoral variables; and

(c) Africa Briefings - presorted ready-to-use data.

77. The system was developed by the World Bank’s Africa Region with two complementary goals in mind: in the short term, to provide staff in the region with an efficient means of collecting, analysing and manipulating economic and sectoral data, and in the long term, to become the linchpin of a major capacity-building effort in African countries, aimed at upgrading local capacity in statistical data collection and analysis.

78. LDB is a fully web-based system with an intuitive and friendly user interface. It utilizes web services to allow seamless integration and data sharing with third-party systems.

79. LDB is also equipped with OLAP technology. This means that users can perform complex calculations on the fly. Such capabilities were not previously available or required expensive programmers to execute. At the same time, the system is designed as a toolkit, using off-the-shelf technology that allows it to be replicated, transferred and installed anywhere with little software and hardware know-how.

80. LDB has the built-in flexibility to allow the addition of new indicators, customization of standard reports, change in the methodology used to calculate growth rates, etc. It is a system, not simply a database with current data from the World Bank.

4.2.6. Nesstar

81. Nesstar is a suite of software tools for publishing, locating and accessing statistical data. Its software architecture helps users to create, locate, access and operate on statistical data. Nesstar extends existing web technology with a web server geared towards statistical data manipulation and based on widely adopted data documentation standards. Accordingly, recognized standards such as the Data Documentation Initiative and open source technologies such as JBoss are key components of the Nesstar suite of products.

82. Though there are quite a number of tools available, the main Nesstar components are the following:

(a) Nesstar Publisher - is a data management programme, which consists of data and metadata conversion and editing tools, enabling the user to prepare materials for publication to a Nesstar server. It can also be used as a stand-alone tool for the preparation of data and metadata. The Publisher enables users to enhance data sets by combining a wide range of catalogue and contextual information, which can then be viewed within the Nesstar web client called Nesstar WebView;

(b) Nesstar Server - is built as an extension to a normal web server by incorporating statistical data management features. As well as providing all the usual facilities for publishing web content, this server provides the ability to publish statistical information that can be searched, browsed, analysed and downloaded by users. This is done either by using a standard web browser or using Nesstar WebView; and

(c) Nesstar WebView - is a web-based system for the dissemination of statistical data which can be used to view tabular (cube) data as well as metadata that have been published using Nesstar Publisher and made available on a Nesstar server. The WebView allows users to search for, locate, browse, analyse, and download a wide variety of statistical and related data within a web browser. With the help of third-party mapping solutions such as GeoServer, it can also display statistical data in maps. It also helps users to perform data analysis including cross-tabulation, correlation and regression.

4.2.7. StatBase

83. StatBase is a statistical data management platform developed by the African Centre for Statistics of UNECA as a central database system to manage all macro time series at subregional offices and NSOs of member States. StatBase aims at sound and proper management of macrodata and easy access to statistical information by users of all categories.

84. StatBase is developed using the latest web-based architecture in order to benefit from technological advancements, and is based on stable database management systems. The back-end component of the StatBase application runs on Windows as server operating system, MS SQL as database server, and Internet Information Server as application server.

85. As StatBase is a web-enabled system, it follows all the web-based user-interface standards which make access to the system intuitive and simple. Major features of StatBase are:

(a) Multi-user functionality;

(b) Multisector data management capability;

(c) Document management functions;

(d) Structured generically to allow management of indicators;

(e) Metadata management capability;

(f) Import/export functionality;

(g) Role-based access control;

(h) Storage, retrieval and dissemination of data at national and subnational levels (up to four levels in addition to cities/towns) and at annual, quarterly and monthly periodicity;

(i) Parameter driven application;

(j) A centralized database system;

(k) Latest relational database technology;

(l) Complete scalability for any size of data and number of users; and

(m) Facilitates textual data management.

4.2.8. StatWorks (OECD)

86. StatWorks is a generic software toolkit for statistical database management designed and implemented by OECD. It uses MS SQL as a database engine where statistical data is stored and managed. The platform manages statistical production processes including initial data migration, database administration, security management, data capture and validation, indicators management, metadata management, data querying, and data export.

87. StatWorks is designed to be fully integrated with other statistical data management tools developed by OECD. The statistical information system architecture of OECD has the following major tools which are vital for StatWorks:

(a) OECD.stat - a data-sharing and dissemination environment of OECD. It is a data warehouse platform designed to store and disseminate corporate statistical data. Third-party OLAP tools can also be used to analyse the data stored in the warehouse;

(b) MetaStore - a web-based system designed to manage metadata which describes characteristic features of data sets including structure, collection methods, manipulation techniques, quality attributes, etc.; and

(c) OECD eXplorer - a web-based interface to explore, analyse and visualize statistics. It has mapping features and visual presentations such as bubble charts and a parallel coordinates plotter, which enable users to analyse groups of areas of interest.

88. The overall OECD statistical data analysis environment is depicted in the following figure:

Fig. 2. OECD statistical data analysis environment (Source: [5])

89. StatWorks developers are in the process of replacing existing CD-ROM-based data exchange services with a web-based warehouse-to-warehouse SDMX enabled system. In addition, StatWorks intensively utilizes spreadsheets, most notably Excel, for data computation, presentation and visualization.

4.3. Feature comparisons

The following table presents a summary of the features of the statistical database systems described above:

| Features | CountrySTAT | DevInfo | Eurotrace | LABORSTA | LDB | Nesstar | StatBase | StatWorks |
|---|---|---|---|---|---|---|---|---|
| Data storage and retrieval | yes | yes | yes | yes | yes | yes | yes | yes |
| Data entry features | yes | yes | yes | yes | yes | yes | yes | yes |
| Data processing and dissemination | yes | yes | yes | yes | yes | yes | yes | yes |
| SDMX support | yes | yes | yes | NIF | NIF | yes | no | yes |
| Metadata management | yes | yes | yes | NIF | NIF | yes | yes | yes |
| Indicators management | NIF | yes | NIF | NIF | NIF | NIF | yes | NIF |
| User management | yes | yes | yes | yes | yes | yes | yes | yes |
| Multi-user support | yes | yes | yes | yes | yes | yes | yes | yes |
| GIS support | no | yes | no | no | yes | no | yes | yes |
| Data security features | yes | yes | yes | yes | yes | yes | yes | yes |
| Graphical user interface | yes | yes | yes | yes | yes | yes | yes | yes |
| Customization capabilities | no | no | no | no | no | no | no | no |
| Availability of wizards that guide users through a series of steps necessary to complete a defined process, without the use of commands or traditional menus | no | no | yes | no | no | no | no | no |
| Ability to perform checking and validation of user input before sending data to the server | NIF | yes | NIF | NIF | NIF | NIF | yes | NIF |
| Alerting system | NIF | NIF | no | no | NIF | no | no | NIF |
| Audit trail management | NIF | yes | NIF | NIF | NIF | NIF | yes | yes |

NIF: No information found

| Features | CountrySTAT | DevInfo | Eurotrace | LABORSTA | LDB | Nesstar | StatBase | StatWorks |
|---|---|---|---|---|---|---|---|---|
| Web-based | yes | yes | yes | yes | yes | yes | yes | yes |
| Multi-language support | yes | yes | yes | yes | yes | yes | yes | yes |
| Multisector support | no | yes | yes | no | yes | yes | yes | yes |
| Platform independence | NIF | yes | yes | no | NIF | yes | yes | NIF |
| Creation of new report templates | no | no | no | no | no | no | no | no |
| Continuous and automatic backup | no | no | no | no | no | no | no | no |
| End user training | yes | yes | yes | NIF | NIF | yes | yes | yes |
| Availability of context-sensitive help | no | no | no | no | no | no | no | no |
| Document management | NIF | NIF | NIF | NIF | NIF | NIF | yes | yes |

Availability of ongoing support and maintenance services after implementation: yes yes NIF NIF yes yes NIF

V. SOFTWARE SELECTION GUIDELINES

90. Successful software selection and implementation begins with a comprehensive project domain specification and planning. However, there are issues which are mostly not visible or underestimated at the planning phase but have a huge negative influence on the project if not properly handled. Such hidden factors are discussed in the following section, which is followed by a presentation of statistical data management system selection guidelines.

5.1. Hidden factors for software selection

91. Software is an integral part of day-to-day activities in an organization. It is no less important than the products and services acquired. This also applies to statistical data management systems. In many ways, selecting an SDMS is no different from selecting a product or service. Naturally, some of the same purchase criteria apply – brand, service, and maintenance costs. In spite of the obviousness of the above, SDMS or, for that matter, any software selection is a grey zone, an underdeveloped arena. This accounts for the high incidence of “shelfware” – software that is bought with grand intentions, but ends up on dusty shelves. This mainly happens because purchases are made based on what immediately meets the eye – technical features. This mistake is understandable, because technical features are well documented and advertised, and easy for the buyer to use as purchasing criteria. But with this approach, factors that are equally, if not more important, but not as immediately obvious, are neglected. Some of these critical factors are presented in the subsequent paragraphs.

5.1.1. Vendor history and experience

92. Vendor background is essential because vendors, directly or indirectly, are likely to be responsible for working with the sensitive statistical data of an organization. A background check is crucial as the investment needs to be made with a dependable vendor with a proven track record. Some questions to ask about the vendor would be:

(a) How long has the vendor operated?

(b) How long has the vendor been in the field of software development?

(c) Is the vendor the software developer, or are they merchandising the software?

(d) What is the vendor’s niche? Does the vendor understand our organization’s niche well enough to know our needs?

(e) Who are the customers of this vendor?

(f) Who is using this SDMS?

(g) What did the customers say about the vendor/SDMS?

5.1.2. Cost

93. There is no denying the importance of cost effectiveness in software choices. Yet costs should be seen in a broad perspective; low entry costs may well result in higher total costs over the life of the system. Both one-time fixed costs and subsequent recurring costs should be considered when selecting an SDMS.

94. A cost-benefit analysis is a critical activity to determine investment feasibility. Costs should be compared with the system’s range of features and functionalities. A system may not be the cheapest, but it may allow you to perform many functions. On the other hand, opting for many features can be a trap, because users never get around to using half of them. Many features may not relate to the needs to be addressed.
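The effect of recurring costs on the total cost over the life of a system can be illustrated with a short calculation. The two systems and all cost figures below are hypothetical and serve only to show how a low entry cost can lead to a higher total cost.

```python
def total_cost(one_time: float, annual_recurring: float, years: int) -> float:
    """One-time fixed costs plus recurring costs over the life of the system."""
    return one_time + annual_recurring * years


# Hypothetical figures, for illustration only.
# System A: low entry cost, high yearly licence and maintenance fees.
# System B: higher entry cost, low yearly fees.
cost_a = total_cost(one_time=10_000, annual_recurring=8_000, years=5)
cost_b = total_cost(one_time=30_000, annual_recurring=2_000, years=5)

cheaper_at_entry = "A" if 10_000 < 30_000 else "B"
cheaper_over_five_years = "A" if cost_a < cost_b else "B"
```

With these figures, System A is cheaper to acquire but costs 50,000 over five years, against 40,000 for System B.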

5.1.3. Ease of use and adoption

95. A system should have an intuitive interface, and the use of features should be self-evident. The shorter the learning curve in training a new user, the better. The software should also have the ability to easily fit into existing systems with which it will have to communicate.

96. Adoption strategy and the vendor’s experience in assisting the organization’s training of end users are key areas to be taken into consideration. One of the key selection criteria would be the knowledge transfer approach the vendor is following with regard to the system. This could be end user training, key user training in system implementation and configuration, and continuous or one-off training. Free training seminars, or their online equivalent, webinars, greatly help users to get up to speed with software at no extra cost. In some cases the company might offer paid training, which may be essential.

97. If users are not equipped with the necessary skills and knowledge to operate the system, and if the system is too complex, it will become “shelfware”. Hence, ease of use and adoption approaches need to be given due attention in the SDMS selection process.

5.1.4. Maintenance

98. Maintenance costs and effort have a major impact on the performance and adoptability of an SDMS, and hence, form an important criterion of the buying decision. If the system is hosted by the vendor, it is of utmost importance that it be available online (“uptime”) at all times. A minimum uptime of 99 per cent should be sought. Signing a service level agreement is a common practice to ensure a contractual agreement with the vendor that quantifies performance indicators such as uptime.
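To make an uptime figure concrete when negotiating a service level agreement, it helps to translate it into permitted downtime. The sketch below assumes a year of 365 days; the function name is hypothetical.

```python
def allowed_downtime_hours(uptime_percent: float,
                           period_hours: float = 365 * 24) -> float:
    """Hours of downtime permitted in a period at a given uptime percentage."""
    return period_hours * (1 - uptime_percent / 100)


# At 99 per cent uptime, a system may be offline for about 87.6 hours
# (roughly 3.65 days) in a 365-day year.
downtime_99 = allowed_downtime_hours(99)

# At 99.9 per cent uptime, the allowance drops to about 8.76 hours a year.
downtime_999 = allowed_downtime_hours(99.9)
```

A minimum of 99 per cent therefore still permits several days of unavailability each year, which is worth weighing against the organization's needs.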

99. The vendor’s upkeep of the system is also important. Efforts exerted by the vendor to constantly improve the system indicate commitment to providing quality services. The frequency of bug fixes and upgrade releases, and the availability of strong user communities, indicate whether the vendor is active and dependable for post-implementation maintenance. It is a good habit to review the vendor’s newsletter, release notes or the “what’s new” section on its website, as frequent updates are indicative of a dynamic vendor.

5.1.5. Familiarity

100. The “look and feel” of the system is a major selection criterion. The new system should keep the basic layout and navigation schemes that are already familiar, as this makes for a quicker transition for the users. A comparison with the operating system in which the system is to be implemented is recommended. For instance, a system designed around Mac interface conventions would feel out of place on Windows. It is also customary to look for an SDMS which can run in a familiar environment in terms of operating system, development tools, reporting tools, etc.

5.1.6. Security

101. Security is a top consideration for statistical data management systems. The organization needs to be assured that its data are secure and that there are no risks of data being compromised. The extent of the security consideration might vary from one organization to another depending on the sensitivity of data.

102. As discussed in the previous sections, security has several elements: data security, function security and system security. All of these should be given due attention.

5.1.7. Software as a service (SaaS)

103. With the emergence and maturity of cloud computing, services such as SaaS are gaining popularity. SaaS is a scheme whereby the services of a software system, in this case an SDMS, are acquired without purchasing and installing the software. The software is hosted or deployed at the site of the developer or vendor, and access privileges are given to the client upon subscription. Once the client is granted access to the system, its functionalities can be used through a network or the Internet.

104. SaaS avoids most hardware investments, which in turn drastically reduces the initial investment cost. The scheme also removes separate maintenance costs, as the vendor is responsible for maintaining the system, normally as part of the subscription. There is also less need to hire dedicated support staff at the organization’s site, as most support-related activities are handled by the vendor. System updates and upgrades can be performed automatically at the vendor’s site.

105. However, some organizations might not feel secure with their critical data being stored on the server of an external company. The vendor should be trusted and there must be binding agreements to cover the entire business. The cost and quality of network connections are also issues to be considered if an organization opts to implement SaaS.

5.2. Important steps in selecting the right SDMS

106. As in any project, a well-planned and researched approach must be adopted to ensure success in SDMS selection. SDMS selection requires a significant investment of time and resources, involvement of the entire organization, and a considerable amount of research, planning and re-evaluation along the way.

107. The following are important steps in the SDMS selection/acquisition process. It should be noted, however, that these steps may not exactly fit the requirements of every organization; rather they can serve as guidelines for the SDMS selection process. Organizations may modify the steps presented below depending on their culture, size and environmental settings.

5.2.1. Needs analysis

108. A needs analysis normally starts with a review of the current system being used to manage statistical data in the organization. This will be followed by a process of identifying the problems or shortcomings of the current system, leading to documentation of improvement requirements.

109. Interviewing users of the existing system (manual or automated), managers and other stakeholders is a common method of conducting a needs analysis. The following are some of the questions that stakeholders should be asked in order to collect data for the needs assessment:

(a) What is the level of dependence on manual forms?

(b) What is the level of support from the current SDMS supplier?

(c) Does the current software support the organization’s mission statement? If not, what improvements could be made?

(d) Are there any areas of waste or possible inefficiencies in the current system that need to be tackled urgently?

(e) Do you feel you receive adequate reporting from the existing system? What additional areas of reporting or type of information would you like to see with the new system?

(f) Do you feel users spend a significant amount of time producing reports?

(g) How easy is the user interface to use?

(h) Do you feel the current system captures all the required data?

(i) Do you feel the current system handles the growing user and data volume?

(j) Do you feel your critical data are well secured in the current system?

110. The main objective of the needs analysis is to identify and document the gaps between the current system and the needs of the organization to fulfil its mission.

5.2.2. Management support

111. Once the need is identified and justified, it must be presented to the management for approval of the project plan and resources. A budget needs to be allocated and, most importantly, management commitment should be secured. A system development project is likely to fail without management support.

112. Assigning a manager to lead the project is a crucial step. He or she will serve as a mentor and sponsor for the project and will also be an invaluable resource in the event that the project team struggles with difficult users and managers.

5.2.3. Requirements specification

113. Once the needs are identified and acquisition of the new SDMS is justified, a detailed list of requirements must be prepared. The major requirements of an SDMS are discussed in Section 3 above. However, the system requirements presented in Section 3 are not prescriptive and may not be relevant for every statistical organization. Rather, they are intended to serve as a springboard for producing more detailed requirement specifications for a specific project based on the results of the needs assessment.

114. It is good practice to focus on the key differentiating criteria of the system in order to identify the most critical requirements. This makes it possible to evaluate system vendors quickly yet thoroughly.

115. It is also a common practice to prioritize required features as:

(a) “Must have” features;

(b) Desired features; and

(c) “Wish list” features.

116. The degree of fit of the SDMS to each required feature should be analysed and classified as one of the following:

(a) System fully meets the requirement;

(b) System meets the requirement with customization;

(c) System meets the requirement with third party add-on products; and

(d) System does not meet the requirement.
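The prioritization in paragraph 115 and the degree-of-fit categories in paragraph 116 can be combined into a simple weighted score for comparing candidate systems. The sketch below is only one possible way of doing so: the numeric weights, the fit scores, the requirement names and the function name are all hypothetical values chosen for illustration, and none of them come from the handbook.

```python
# Hypothetical weights and scores; adjust them to the organization's needs.
PRIORITY_WEIGHT = {"must": 3, "desired": 2, "wish": 1}
FIT_SCORE = {"full": 1.0, "customization": 0.7, "add_on": 0.5, "none": 0.0}


def evaluate(requirements, fits):
    """Score one candidate system against a prioritized requirement list.

    requirements: dict mapping requirement name -> priority key
    fits:         dict mapping requirement name -> degree-of-fit key
    Returns (score between 0 and 1, list of unmet "must have" requirements).
    """
    if not requirements:
        raise ValueError("at least one requirement is needed")
    total = sum(PRIORITY_WEIGHT[p] for p in requirements.values())
    earned = sum(PRIORITY_WEIGHT[p] * FIT_SCORE[fits[name]]
                 for name, p in requirements.items())
    unmet = [name for name, p in requirements.items()
             if p == "must" and fits[name] == "none"]
    return earned / total, unmet


# Example data (invented for illustration).
REQUIREMENTS = {
    "SDMX export": "must",
    "Backup and restore": "must",
    "Ad hoc reports": "desired",
    "Multi-language interface": "wish",
}
VENDOR_A_FITS = {
    "SDMX export": "full",
    "Backup and restore": "customization",
    "Ad hoc reports": "add_on",
    "Multi-language interface": "none",
}

if __name__ == "__main__":
    score, unmet = evaluate(REQUIREMENTS, VENDOR_A_FITS)
    print(f"Weighted fit: {score:.2f}; unmet must-have features: {unmet}")
```

Reporting unmet “must have” features separately from the score is a deliberate choice in this sketch: a high overall score should not hide the fact that a mandatory requirement is missing.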

117. An extract of the requirements specification with minor modification constitutes the terms of reference (ToR) which can be used in the request for proposal (RFP) document.

5.2.4. RFP preparation

118. After initially reviewing the available SDMS vendors (see Section 4 above), it is necessary to prepare the RFP, which is the best means of communicating the full project requirements to the potential vendors.

119. The RFP might contain, among others, the following items:

(a) Cover letter summarizing the request for proposal;

(b) General information and scope of work:

(i) Introduction;

(ii) Overview and background of the organization;

(iii) Objective of the project;

(iv) Scope of the project;

(v) Relationship to other systems;

(vi) Project schedule and deadline for vendor response;

(c) ToR;

(d) Instructions to bidders:

(i) Other binding information with regard to the bid;

(ii) Evaluation criteria;

(iii) Proposal response format;

(e) Vendor profile;

(f) Proposed statistical data management solution;

(g) Implementation services;

(h) Training services;

(i) Data migration services;

(j) Warranty period and annual maintenance;

(k) Cost breakdown;

(l) Available references; and

(m) General and specific conditions.

5.2.5. Software demonstration

120. Some SDMS vendors offer online demonstrations of their systems, and exploring such a demo is a helpful way to evaluate a system. Once the RFP has been sent out to potential vendors, it is good practice to invite them for on-site demos. To keep vendors from falling back on a generic sales presentation, they should be requested to prepare structured demo scripts based on the requirements of the organization.

121. The following are some of the questions the evaluation team should answer during on-site demo sessions:

(a) Did the demonstration follow the format or demo script provided?

(b) Did the representative review all of the “must-have” items?

(c) Did the system appear easy to use?

(d) What is the level of confidence in the system’s capacity to fulfil the majority of the requirements?

(e) Is the system a significant improvement on what is currently used by the organization?

5.2.6. System selection and contract negotiation

122. The processes outlined in Sections 3 and 4 of this document, the vendor’s response to the RFP, and evaluation of both online and on-site demos facilitate selection of the right statistical data management system and vendor.

123. As most system/software contracts are written by the software vendor, it is important to negotiate the contract to protect the organization’s interests and save effort, time and cost that might be incurred during and after system implementation.

124. Implementation issues such as project management, scheduling, staffing, data migration, and training should be well articulated and thought out at the commencement of the project.

VI. CONCLUSION

125. This handbook has been prepared with the intention of guiding African statistical organizations in their SDMS selection process. Accordingly, a list of the core features of a generic statistical data management system is outlined. Major statistical data management systems for macrodata processing which are currently deployed in statistical offices of member States are also presented. This is followed by system selection guidelines and tips which should be given utmost consideration when selecting a statistical data management platform.

126. The list of features is not ordered according to level of importance, because the importance of features varies from one organization to another depending on the scope, nature and overall environment of the statistical information infrastructure. The features discussed are not prescriptive; rather, their level of importance is determined by the needs of the organization looking for a specific SDMS. A thorough needs assessment exercise is therefore a critical success factor in selecting the right statistical data management system.

127. Technical features are the most visible selection criteria. In most cases, they are well documented, which simplifies SDMS selection on the basis of features. However, just as crucial as, or even more important than, the technical features are the hidden, or so-called soft, factors which must be taken into consideration when selecting statistical data management systems. These hidden factors are difficult to measure but have a great impact on the success of implementation. As direct measurement is not always possible, some research may be required to gauge the impact of such factors.

128. The steps to be followed in selecting a statistical data management system are also discussed in this handbook. Formal system acquisition procedures help to avoid unnecessary waste of time, money and other resources. A critical part of this exercise is to secure management willingness and approval. A statistical data management system deployment exercise without the support of high-level management is very likely to fail.

VII. RECOMMENDATIONS

129. As mentioned in the section on “Mode of Operation”, considerable effort was made to gather as much information as possible to document the characteristic features of the major statistical data management platforms described. However, in the case of some platforms, notably Live Database and LABORSTA, it was difficult to obtain all the required information. It is strongly recommended that further investigation be carried out into other possibilities, such as acquisition and configuration of these platforms on a local server, to fully understand and document their features.

130. Secondly, in most African countries there are a number of government departments, semi-government organizations, private institutions and non-governmental organizations providing various statistics. For instance, in addition to the national statistical office, which is in most cases the main official statistical data provider, there are dozens of other government offices, including the national bank and ministries of finance, trade, tourism, agriculture, health, and education. Most sector associations also have data that are available to the public. Private institutions and non-governmental organizations compile statistical data on a daily basis. Just as different parts of a country’s economic and socio-demographic entities are interconnected, data released by different institutions are also interrelated and need to be consistent with each other. Strategic deployment of statistical data management platforms plays a significant role in promoting the consistency and the seamless integration of data. It is highly recommended, therefore, that a similar initiative be commissioned in order to document the current status of national statistical systems and to investigate and suggest a way forward to achieve a robust and integrated national statistical data management architecture.

ANNEX

1. Questionnaire to NSOs

Questionnaire – Handbook Development for the Selection of Statistical Data Management Platform

1. Your organization:

Name: ________

2. Please list what you consider as the essential functional features of a statistical data management system.

________

3. What are the data management platforms or systems that you have worked on or know for statistical data management? Please complete one system features sheet (make copies as necessary) for each system/platform you have used or know about. Thank you.

System Details Number: ________ (please complete one sheet for each system)

4. Name of data management system:

________

5. Is this system currently used in your office?

[ ] Yes   [ ] No

6. Please list the major features/functions of this data management system:

________

7. Support for periodical data backup/restore?

[ ] Fully supported   [ ] Partially supported   [ ] Not supported

8. Support for data import/export?

[ ] Fully supported   [ ] Partially supported   [ ] Not supported

9. How do you evaluate the user interface?

[ ] Attractive and simple   [ ] Attractive but complex

[ ] Unattractive but simple   [ ] Unattractive and complex

10. Please list the dissemination media supported by this data management system (e.g. www, CD, etc.):

________

11. Please list the international data dissemination formats supported by this data management system (e.g. SDMX):

________

12. Vendor:

________

13. How would you rate the training provided by the system’s vendor?

[ ] Very good   [ ] Satisfactory   [ ] Not adequate   [ ] None

14. How would you rate other support given by the system’s vendor?

[ ] Very good   [ ] Satisfactory   [ ] Not adequate   [ ] None

15. How would you rate the security of the system (intruder access control)?

[ ] Strong security feature   [ ] Moderate security feature

[ ] Weak security feature   [ ] I don’t know

16. Sectoral support:

[ ] Multi-sectoral   [ ] Single sector

17. If the system is multi-sectoral, please list the statistical sectors supported:

________

2. Questionnaire to experts

Questionnaire – Handbook Development for the Selection of Statistical Data Management Platform

1. Your organization:

Name: ________

2. Please list what you consider as the essential functional features of a statistical data management system.

________

3. What do you think are the critical functional requirements of a statistical data management system?

________

What are the data management platforms or systems that you have worked on or know for statistical data management? Please complete one system features sheet (make copies as necessary) for each system/platform you have used or know about. Thank you.

4. Name of data management system:

________

5. Is this system currently used in your office?

[ ] Yes   [ ] No

6. Please list the major features/functions of this data management system:

________

7. Support for periodical data backup/restore?

[ ] Fully supported   [ ] Partially supported   [ ] Not supported

8. Support for data import/export?

[ ] Fully supported   [ ] Partially supported   [ ] Not supported

9. How do you evaluate the user interface?

[ ] Attractive and simple   [ ] Attractive but complex

[ ] Unattractive but simple   [ ] Unattractive and complex

10. Please list the dissemination media supported by this data management system (e.g. www, CD, etc.):

________

11. Please list the international data dissemination formats supported by this data management system (e.g. SDMX):

________

12. Vendor:

________

13. How would you rate the training provided by the system’s vendor?

[ ] Very good   [ ] Satisfactory   [ ] Not adequate   [ ] None

14. How would you rate other support given by the system’s vendor?

[ ] Very good   [ ] Satisfactory   [ ] Not adequate   [ ] None

15. How would you rate the security of the system (intruder access control)?

[ ] Strong security feature   [ ] Moderate security feature

[ ] Weak security feature   [ ] I don’t know

16. Sectoral support:

[ ] Multi-sectoral   [ ] Single sector

17. If the system is multi-sectoral, please list the statistical sectors supported:

________

3. Questionnaire to vendors

Questionnaire – Statistical Data Management Platform

1. Name of statistical data management system:

________

2. Vendor:

________

3. Availability of data entry function?

[ ] Yes   [ ] No

4. Batch data entry function (e.g. through Excel data sheet)?

[ ] Yes   [ ] No

5. Ability to attach documents/other resources?

[ ] Yes   [ ] No

6. Ability to delete records?

[ ] Yes   [ ] No

7. Ability to view/undelete deleted entries?

[ ] Yes   [ ] No

8. Availability of user manual?

[ ] Yes   [ ] No

9. Availability of online/context help?

[ ] Yes   [ ] No

10. Facility to create report template?

[ ] Available   [ ] Not available

11. Facility to generate standard (predefined) reports?

[ ] Supported   [ ] Not supported

12. Facility to generate ad hoc (on the fly) reports?

[ ] Available   [ ] Not available

13. Facility to publish reports to web pages?

[ ] Available   [ ] Not available

14. List major metadata items that can be defined in the database (e.g. unit of measure, indicators, scales, etc.):

________

15. Please list all other major features/functions of this data management system:

________

16. Support for periodical data backup/restore?

[ ] Fully supported   [ ] Partially supported   [ ] Not supported

17. Support for data import/export?

[ ] Fully supported   [ ] Partially supported   [ ] Not supported

18. How do you evaluate the user interface?

[ ] Attractive and simple   [ ] Attractive but complex

[ ] Unattractive but simple   [ ] Unattractive and complex

19. List the dissemination media supported by this data management system (e.g. www, CD, etc.):

________

20. List the international data dissemination formats supported by this data management system (e.g. SDMX):

________

21. Language support:

[ ] Single language   [ ] Dual language   [ ] Multi-language

Languages supported:

________

22. How would you rate the security of the system (intruder access control)?

[ ] Strong security feature   [ ] Moderate security feature

[ ] Weak security feature   [ ] I don’t know

23. Sectoral support:

[ ] Multi-sectoral   [ ] Single sector

24. If the system is multi-sectoral, please list the statistical sectors supported:

________

25. User responsibility management:

________

26. Database management engine used to store data (Oracle, MS SQL, MySQL, etc.):

________

27. Operating system the database is running on:

________

28. The database system is:

[ ] Desktop application   [ ] Server based – accessible through network

REFERENCES

1. Nesstar documentation. Available from http://www.nesstar.com/. Accessed 26 April 2011.

2. Sen, P., Key Issues in Managing and Utilizing IT as a Strategic Resource for NSOs. Available from http://unstats.un.org/unsd/dnss/kf/it_country_docs.aspx. Accessed 26 April 2011.

3. The Eurotrace Suite Workshop on Updated and New Recommendations for IMTS and their Implementation in the Sub-Saharan Region, 1-5 November 2010, Lusaka, Zambia.

4. Fletcher, T., StatWorks – an IT Toolkit for Statistical Data Management. Available from www.oecd.org/dataoecd/50/38/18247342.pdf. Accessed 20 April 2011.

5. Fletcher, T., Sharing Statistical Software – an Update on the OECD Experience, Meeting on the Management of Statistical Information Systems (MSIS 2010), Daejeon, Republic of Korea, 26-29 April 2010.

6. Committee on Statistics, Integrated National Statistical System and BPS Information Technology Development, twelfth session of United Nations Economic and Social Commission for Asia and the Pacific, Bangkok, Thailand, 29 November-1 December 2000.

7. Lukhwareni, T.J., S.F. Madonsela, D.E. Mokhuwa and L.M. Podile, Management of Metadata in National Statistical Agency. Fourteenth Conference of Commonwealth Statisticians, 5–9 September 2005.

8. Technology Group International, Software Selection Process Steps. Available from www.tgiltd.com. Accessed 21 April 2011.

9. Rizzo, F., The SDMX Service Architecture for the Perspective of a National Statistical Institute. Meeting of the Joint OECD/UNECE Expert Group on Statistical Data and Metadata Exchange, Palais des Nations, United Nations Economic Commission for Europe, Geneva, Switzerland, 8-9 March 2010.
