TURKISH STATISTICAL INSTITUTE INFORMATION TECHNOLOGIES DEPARTMENT 1 HARZEMLI, The DDI Based Statistical Production Platform İlker GÜVEN Head of Data Management Group [email protected]

Page 1:

TURKISH STATISTICAL INSTITUTE

INFORMATION TECHNOLOGIES DEPARTMENT 1

HARZEMLI, The DDI Based Statistical Production Platform

İlker GÜVEN

Head of Data Management Group

[email protected]

Page 2:

The QUESTION

Is there a way to establish a standardized approach to statistics production that raises quality and reduces workload?

Why should this QUESTION be handled?

The role of statistical institutes includes conducting surveys and disseminating data on statistical subjects.

Statistical institutes need to integrate the data of different subject-matter units with each other.

Using common, standardized classifications helps in implementing GSBPM and GSIM.

Page 3:

Why should this QUESTION be handled? (cont’d…)

Standardized data entry screens and standardized methods for sending and receiving data in data collection.

Standardized data model for dissemination.

Page 4:

The SOLUTION

DDI + Rule Files + Classification Server => for the survey part

Uniform data model + Classification Server => for the dissemination part

Page 5:

The RESULT

HARZEMLI, the DDI-based statistical production platform

Metadata-driven, dynamically created web-based surveys with a similar look and feel

Shortened and standardized IT processes, driven by metadata

Metadata is actively used in the phases of GSBPM

Standard names for all variables, which supports data integrity

Shorter data collection, leaving more time for analysis

Easy integration when compiling private sector data in related surveys

Similar database tables whose structures are generated automatically
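The idea of database tables whose structures are generated automatically from metadata can be sketched as follows. This is only an illustration, not the Harzemli implementation: the variable list, the type mapping, and the table name `labour_survey` are assumptions made up for the example, and SQLite stands in for whatever database the platform actually uses.

```python
import sqlite3

# Hypothetical variable metadata; names and types are illustrative only.
VARIABLES = [
    {"name": "respondent_id", "type": "integer"},
    {"name": "region_code", "type": "text"},
    {"name": "monthly_income", "type": "numeric"},
]

# Assumed mapping from metadata types to SQLite column types.
SQL_TYPES = {"integer": "INTEGER", "text": "TEXT", "numeric": "REAL"}


def build_create_table(table_name, variables):
    """Build a CREATE TABLE statement from a list of variable descriptions."""
    columns = ", ".join(
        '"{}" {}'.format(var["name"], SQL_TYPES[var["type"]]) for var in variables
    )
    return 'CREATE TABLE "{}" ({})'.format(table_name, columns)


ddl = build_create_table("labour_survey", VARIABLES)

# Run the generated statement against an in-memory database to check it.
conn = sqlite3.connect(":memory:")
conn.execute(ddl)
column_names = [row[1] for row in conn.execute("PRAGMA table_info(labour_survey)")]
print(column_names)
```

Because every survey table comes from the same generator, two surveys that share a variable name end up with the same column name and type, which is the kind of uniformity the slide describes.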

Page 6:

The RESULT (cont’d…)

MEDAS (Central Dissemination System – A subsystem of Harzemli Platform)

Reduced reporting and interface burden in dissemination

Much easier comparability of data

Correlation detection between any subjects for any end user

Page 7:

HARZEMLI (al-Khwarizmi)

Al-Khwarizmi (c. 780 – c. 850), formerly Latinized as Algoritmi or Algaurizin, was a Persian[1][5] mathematician, astronomer and geographer.

"Algebra" is derived from al-jabr, one of the two operations he used to solve quadratic equations.

Algorism and algorithm stem from Algoritmi, the Latin form of his name.[7] His name is also the origin of (Spanish) guarismo[8] and of (Portuguese) algarismo, both meaning digit.

Page 8:

BEFORE HARZEMLI

Page 9:

BEFORE HARZEMLI

• Developer 1 + Subject matter unit 1 staff

• Developer 2 + Subject matter unit 2 staff

• Developer 3 + Subject matter unit 3 staff

• Developer n + Subject matter unit n staff

Page 10:

BEFORE HARZEMLI

Different Java applications for each survey

Different data designs for each survey

No convention in metadata between subjects => lack of data integrity

Long survey completion times because of paper forms

Page 11:

WITH HARZEMLI

No need to write new Java code

Similar data design for each subject

Totally metadata-driven, automatically generated web applications

Shorter survey completion times, since there is no paper

Page 12:

HARZEMLI

Consists of:

Metadata Editor (currently Nesstar; our own editor is under development)

Rule Editor (generates Rule XML files)

Desktop, Mobile, and Web editions

Management Console with modules (Analysis, Visualization, etc.)

MEDAS
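To make the role of the Rule XML files more concrete, here is a minimal sketch of how an application could load rules from XML and check a record against them. The slides do not show the actual Rule XML format, so the `<rules>` and `<rule>` elements, their attributes, and the function names below are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule file; element and attribute names are invented for this sketch.
RULE_XML = """
<rules>
  <rule variable="age" min="0" max="120" message="Age out of range"/>
  <rule variable="household_size" min="1" max="30" message="Household size out of range"/>
</rules>
"""


def load_rules(xml_text):
    """Parse range rules into a list of dictionaries."""
    root = ET.fromstring(xml_text)
    return [
        {
            "variable": rule.get("variable"),
            "min": float(rule.get("min")),
            "max": float(rule.get("max")),
            "message": rule.get("message"),
        }
        for rule in root.findall("rule")
    ]


def check_record(record, rules):
    """Return the messages of all rules that the record violates."""
    errors = []
    for rule in rules:
        value = record.get(rule["variable"])
        if value is None or not rule["min"] <= value <= rule["max"]:
            errors.append(rule["message"])
    return errors


rules = load_rules(RULE_XML)
valid_errors = check_record({"age": 34, "household_size": 4}, rules)
invalid_errors = check_record({"age": 150, "household_size": 4}, rules)
print(valid_errors, invalid_errors)
```

Keeping the rules in a file rather than in application code is what would let subject matter staff change validation without a new software release.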

Page 13:

HARZEMLI (SURVEY PART)

[Diagram labels: Metadata Editor; Classification Server; Rule Editor; DDI Document; Rule XML File; Harzemli Application; Code Lists; Variables; Web-based survey interface]
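The step from a DDI document to a web-based survey interface might look roughly like the sketch below. It is an assumption-laden illustration rather than the Harzemli code: the XML fragment is loosely modeled on DDI Codebook element names (`var`, `labl`, `catgry`, `catValu`), simplified and without namespaces, and the HTML produced is deliberately minimal. A variable with a code list becomes a drop-down; any other variable becomes a text box.

```python
import xml.etree.ElementTree as ET
from html import escape

# Simplified, namespace-free fragment loosely modeled on DDI Codebook elements.
DDI_FRAGMENT = """
<codeBook>
  <dataDscr>
    <var name="sex">
      <labl>Sex of respondent</labl>
      <catgry><catValu>1</catValu><labl>Male</labl></catgry>
      <catgry><catValu>2</catValu><labl>Female</labl></catgry>
    </var>
    <var name="age">
      <labl>Age in years</labl>
    </var>
  </dataDscr>
</codeBook>
"""


def render_field(var):
    """Render one variable as a labelled HTML form control."""
    name = escape(var.get("name"))
    label = escape(var.findtext("labl", default=name))
    options = []
    for category in var.findall("catgry"):
        value = escape(category.findtext("catValu"))
        text = escape(category.findtext("labl"))
        options.append('<option value="{}">{}</option>'.format(value, text))
    if options:
        control = '<select name="{}">{}</select>'.format(name, "".join(options))
    else:
        control = '<input type="text" name="{}">'.format(name)
    return "<label>{} {}</label>".format(label, control)


def render_form(ddi_text):
    """Render every variable in the fragment into a single HTML form."""
    root = ET.fromstring(ddi_text)
    fields = [render_field(var) for var in root.iter("var")]
    return "<form>" + "".join(fields) + "</form>"


form_html = render_form(DDI_FRAGMENT)
print(form_html)
```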

Page 14:

HARZEMLI MILESTONES

No.  Milestone                     Development date
1    Harzemli Desktop              2012
2    Harzemli Rule Editor          2012
3    Harzemli Management Console   2012
4    IDM                           2013
5    Harzemli Web                  2013
6    MEDAS                         2014
7    Harzemli Mobile               2014
8    Harzemli Analysis             2014

Page 15:

Survey Migration to HARZEMLI

Year    Harzemli Desktop   Harzemli Web   Harzemli Mobile   TOTAL
2013    6                  26             *                 32
2014    4                  46             *                 50
2015    2                  13             7                 22
TOTAL   12                 85             7                 104
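The totals in the migration table can be cross-checked with a few lines of code. The counts below are copied from the table; treating the cells marked "*" as zero is an assumption, since the slide does not say what the asterisk means.

```python
# Counts copied from the migration table; None marks the cells shown as "*".
MIGRATED = {
    2013: {"desktop": 6, "web": 26, "mobile": None, "total": 32},
    2014: {"desktop": 4, "web": 46, "mobile": None, "total": 50},
    2015: {"desktop": 2, "web": 13, "mobile": 7, "total": 22},
}
REPORTED_TOTALS = {"desktop": 12, "web": 85, "mobile": 7, "total": 104}
EDITIONS = ("desktop", "web", "mobile")


def count(value):
    """Treat a "*" cell (None) as zero surveys; this is an assumption."""
    return 0 if value is None else value


# Each yearly total should equal the sum of the three editions.
rows_consistent = all(
    sum(count(row[edition]) for edition in EDITIONS) == row["total"]
    for row in MIGRATED.values()
)

# Each column total should equal the sum over the three years.
column_sums = {
    key: sum(count(row[key]) for row in MIGRATED.values())
    for key in REPORTED_TOTALS
}

print(rows_consistent, column_sums == REPORTED_TOTALS)
```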

Page 16:

PRODUCTIVITY GAINS (Time)

Decreased time for software engineers to develop data entry applications

4 WEEKS => 1 WEEK

Decreased time for data collection

8% decrease in data collection time

50% increase in the time available for regional offices to analyze their data, thanks to instant access to data

12% decrease in the analysis time needed by central office staff

The time needed to prepare press releases has been shortened. For example, preparation of the monthly Labor Statistics press release now takes 4 days less.

Page 17:

PRODUCTIVITY GAINS (Quality)

Software Process Standardization

Data Integrity

Common code development

Common Components Available

Page 18:

PRODUCTIVITY GAINS (Costs)

In 2015 alone, 870 trees would have been cut to produce 10 million A4 sheets if the paper editions of the surveys had continued.

58% reduction in pollsters’ travel expenses

Page 19:

MEDAS (Central Dissemination Project)

Dissemination System of Harzemli Platform

Before MEDAS, just as in the survey part, the dissemination databases had:

Different database designs

Different applications for each statistical subject

Lack of standard codes

No data integrity, making correlations hard to detect

Page 20:

BEFORE MEDAS

• Database Specialist 1 + Subject matter unit 1 staff

• Database Specialist 2 + Subject matter unit 2 staff

• Database Specialist 3 + Subject matter unit 3 staff

• Database Specialist n + Subject matter unit n staff

Page 21:

MEDAS

Before MEDAS: IT was too heavily involved in data dissemination because of manual processes

Our old dissemination technology posed security threats

Page 22:

MEDAS

Dissemination of different statistical subjects through a generic data model: a single data source and a single application

Makes it possible to compare any number of subjects in the same report => the classification server plays a major role

Reduced reporting burden thanks to the single application

Modern pivot tables

Central Dissemination Database

Comparability
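One way to picture a generic data model is a single observation layout shared by every subject, with a common classification code as the join key. The sketch below is only an illustration under that assumption: the tuple layout, the subject names, the region codes `R1` and `R2`, and all values are invented and are not MEDAS data or the MEDAS schema.

```python
from collections import defaultdict

# Invented observations in one generic layout: (subject, region_code, year, value).
OBSERVATIONS = [
    ("unemployment_rate", "R1", 2014, 10.0),
    ("unemployment_rate", "R2", 2014, 8.0),
    ("population", "R1", 2014, 1000),
    ("population", "R2", 2014, 500),
    ("population", "R1", 2013, 900),
]


def build_report(observations, subjects, year):
    """Pivot the selected subjects into one row per shared region code."""
    report = defaultdict(dict)
    for subject, region, obs_year, value in observations:
        if subject in subjects and obs_year == year:
            report[region][subject] = value
    return dict(report)


report = build_report(OBSERVATIONS, {"unemployment_rate", "population"}, 2014)
for region in sorted(report):
    print(region, report[region])
```

The comparison works only because both subjects use the same region codes, which is why a shared classification server matters for a report that mixes subjects.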

Page 23:

MEDAS

IT only builds the pipeline; the related units load the data

Reduction in person-dependency => good for managers

Easy to develop web services thanks to the generic data model

Easier database administration, since the number of schemas (workspaces) drops dramatically

Page 24:

MEDAS

MEDAS Report Screen – See the results for different subjects on the same report

Page 25:

MEDAS

MEDAS Report Screen – See the results for different subjects on the same report

Page 26:

MEDAS

In use since April 2014

26 of 62 statistical subjects have been migrated to MEDAS

46 by the end of 2015 and 62 by the end of 2016

Page 27:

FINAL WORDS

So far, the Harzemli Platform has been a great success and has received strong support from both the presidency and the staff of the units.

The Harzemli Platform will grow as new modules, currently under development, are added in the near future.

Page 28:

Thank you

Page 29:

APPENDIX

Page 30:

HARZEMLI WEB

Mainly used for:

Business surveys

Surveys applied to public bodies and universities

No paper-based business surveys since 2014

Page 31:

HARZEMLI DESKTOP

Conducting surveys (e.g., household surveys) without an internet connection

Designed for netbooks / mini laptops

Extra control mechanisms for data integrity (we call them edit codes)

Sending data to and receiving data from the central databases via web services when connected to the Internet
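The offline-first behavior described for Harzemli Desktop, collecting without a connection and sending data once the Internet is available, can be sketched as a local queue that is flushed when uploads succeed. Everything here is hypothetical: `OfflineQueue`, `FakeService`, and the use of `ConnectionError` to signal "offline" are stand-ins, not the platform's actual web service interface.

```python
class FakeService:
    """Stand-in for the central web service; rejects uploads while offline."""

    def __init__(self):
        self.online = False
        self.received = []

    def upload(self, record):
        if not self.online:
            raise ConnectionError("no internet connection")
        self.received.append(record)


class OfflineQueue:
    """Keeps records locally and uploads them in order once a connection exists."""

    def __init__(self, send):
        self._send = send  # callable that uploads a single record
        self.pending = []

    def save(self, record):
        self.pending.append(record)

    def sync(self):
        """Try to upload pending records; stop at the first connection failure."""
        sent = 0
        while self.pending:
            try:
                self._send(self.pending[0])
            except ConnectionError:
                break
            self.pending.pop(0)
            sent += 1
        return sent


service = FakeService()
queue = OfflineQueue(service.upload)
queue.save({"household_id": 1, "members": 4})
queue.save({"household_id": 2, "members": 2})

sent_offline = queue.sync()   # still in the field, nothing is uploaded
service.online = True
sent_online = queue.sync()    # back online, the queue is flushed
print(sent_offline, sent_online, len(queue.pending))
```

A record is removed from the queue only after its upload succeeds, so a dropped connection in the middle of a sync does not lose data.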

Page 32:

HARZEMLI MOBILE

Android application designed for tablets

Mechanisms similar to those of Harzemli Desktop

Takes advantage of mobile operating systems and lightweight devices in the field

Page 33:

HARZEMLI MANAGEMENT CONSOLE

Menu for adding a new study to the Harzemli system

Menu where the current-status and analysis reports of all studies carried out in Harzemli are created, added, and displayed

Screen that reports the field and desk workload of the surveys conducted by the regional offices

SMS module: menu through which households and firms are informed about surveys

Menu where authentication is performed for studies carried out in Harzemli Desktop and Mobile

Lets the IT department manage the studies in the Harzemli system (for example, creating database tables)

Menu where user and form authorization is configured for studies carried out in Harzemli Desktop and Mobile

Page 34:

HARZEMLI ANALYSIS

Users create and run their own error-finder rules, run special rules, or trigger and run analyses on streams/files from other statistical systems (SPSS, SAS, R)

Backstage data mining techniques => Suspicious records database

Smart reports provide the users (subject matter unit staff) with the reasons

Subject matter unit staff approve the record if the reason is satisfactory
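The flow from error-finder rules to a list of suspicious records with reasons could be sketched as below. Note that this uses plain rule checks, not the data mining techniques mentioned on the slide, and that the rule names, record fields, and the `approved` flag are all assumptions made for the example.

```python
# Hypothetical user-defined error-finder rules: name -> predicate on a record.
RULES = {
    "negative_income": lambda r: r["income"] < 0,
    "child_employed": lambda r: r["age"] < 15 and r["employed"],
}

# Invented sample records.
RECORDS = [
    {"id": 1, "age": 40, "employed": True, "income": 3000},
    {"id": 2, "age": 10, "employed": True, "income": 500},
    {"id": 3, "age": 30, "employed": False, "income": -50},
]


def find_suspicious(records, rules):
    """Collect every record that triggers at least one rule, with the reasons."""
    suspicious = []
    for record in records:
        reasons = [name for name, rule in rules.items() if rule(record)]
        if reasons:
            suspicious.append(
                {"record_id": record["id"], "reasons": reasons, "approved": False}
            )
    return suspicious


suspicious = find_suspicious(RECORDS, RULES)
for entry in suspicious:
    print(entry["record_id"], entry["reasons"])
```

Each entry carries its reasons, so a reviewer in the subject matter unit could inspect them and set `approved` once the explanation is satisfactory.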

Page 35:

HARZEMLI ANALYSIS

Page 36:

HARZEMLI ANALYSIS – Data Visualization

The Visual Analysis module, built with R, contributes to more effective analysis. An SQL statement is generated according to the table selected by the user, and the column values of this table are sent to the R server.
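Generating an SQL statement from a user's table selection and collecting the column values might be sketched like this. It is a rough illustration only: the `household` table, its columns, and the whitelist are invented, SQLite stands in for the real database, and the final dictionary is simply printed where the described system would send the values to the R server.

```python
import sqlite3

# Hypothetical whitelist of tables and columns a user may select.
ALLOWED = {"household": ["age", "income", "region"]}


def build_select(table, columns, allowed):
    """Build a SELECT for a user-chosen table and columns, checked against a whitelist."""
    if table not in allowed:
        raise ValueError("unknown table: " + table)
    unknown = [c for c in columns if c not in allowed[table]]
    if unknown:
        raise ValueError("unknown columns: " + ", ".join(unknown))
    column_list = ", ".join('"{}"'.format(c) for c in columns)
    return 'SELECT {} FROM "{}"'.format(column_list, table)


# Small in-memory table with invented rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE household (age INTEGER, income REAL, region TEXT)")
conn.executemany(
    "INSERT INTO household VALUES (?, ?, ?)",
    [(34, 1200.0, "R1"), (51, 900.0, "R2")],
)

selected = ["age", "income"]
sql = build_select("household", selected, ALLOWED)
rows = conn.execute(sql).fetchall()

# Column-wise values, the shape a plotting backend would typically expect.
column_values = {name: [row[i] for row in rows] for i, name in enumerate(selected)}
print(sql)
print(column_values)
```

Checking names against a whitelist before building the statement matters here because table and column names cannot be passed as bound parameters.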

Page 37:

Data Process of MEDAS