Data Processing of the 2010 Population and Housing Census 15-19 September 2008, Bangkok, Thailand National Statistical Office, Thailand

Page 1:

Data Processing of the 2010 Population and Housing Census

15-19 September 2008, Bangkok, Thailand

National Statistical Office, Thailand

Page 2:

CONTENT

- Hardware & Software of ICR System
- TELEform / ABBYY Functions
- Step of ICR System in NSO
- Specific questionnaires for ICR System

DATA CAPTURING

Page 3:

ICR for The Population Census 2000

NSO first used the ICR system in 2000 to process the Population Census questionnaires, scanning the forms of 16 million households (16 million forms). Processing the raw data took only 8 months, instead of the 18 months needed with the key-in data entry system.

DATA CAPTURING

Page 4:

TELEform Hardware & Software System in 2000

TELEform Hardware System

- NetServer for TELEform: Server (1)
- NetServer for Database: Server (1)
- Reader Modules: Workstations (21)
- Verifier Modules: Workstations (55)
- Scanner Control: Workstations (6)
- Scanner Fujitsu M4099D (6)

TELEform Software System

TELEform 6.2 Elite Enterprise Edition, components:

- TELEform Designer
- TELEform Reader
- TELEform Verifier

DATA CAPTURING

Page 5:

ICR System in 2003

The ICR system at NSO (Thailand) can be divided into 2 parts:

- TELEform Software System
- ABBYY Software System

DATA CAPTURING

Page 6:

ICR for The Agricultural Census 2003

NSO hired ABBYY software to process about 25% of the Agricultural Census 2003 questionnaires, which totalled 5.8 million households (24 million forms).

DATA CAPTURING

Page 7:

TELEform Hardware & Software System in 2003

TELEform Hardware System

- NetServer for TELEform: Server (1)
- NetServer for Database: Server (1)
- Reader Modules: Workstations (21)
- Verifier Modules: Workstations (30)
- Scanner Control: Workstations (6)
- Scanner Fujitsu M4099D (6)

TELEform Software System

TELEform 6.2 Elite Enterprise Edition, components:

- TELEform Designer
- TELEform Reader
- TELEform Verifier

DATA CAPTURING

Page 8:

ABBYY Hardware & Software System in 2003

ABBYY Hardware System

- IBM Server X Series 225 (1)
- Correction Station (1)
- Verifier Modules: Workstations (25)
- Scanner Control: Workstations (4)
- Scanner Fujitsu M4099D (4)
- Storageflex LT707 (1)

ABBYY Software System

ABBYY FormReader 6.0 Enterprise Edition, components:

- Form Design
- Administration Station
- Recognition Station
- Correction Station

DATA CAPTURING

Page 9:

DATA CAPTURING

TELEform & ABBYY Functions

Page 10:

TELEform / ABBYY Designer Function

Creates a form template by fixing the field boxes on the questionnaire.

Questionnaire

Template

DATA CAPTURING

Page 11:

TELEform Reader / ABBYY Administration Function

- Evaluate the questionnaires.
- Export the questionnaires that were read correctly to a data file.
- Send the unclear questionnaires to the TELEform / ABBYY Verifier function, where they are corrected and then transferred to a data file.
- Store scanned images.

DATA CAPTURING

Page 12:

TELEform / ABBYY Verifier Function

- Correct questionnaires that contain mismarked or illegible fields.
- The corrected questionnaires are automatically exported to a data file.

DATA CAPTURING

Page 13:

Functions Speed

Scanning speed (the scanner supports A7 to A3 paper sizes):

- Simplex: 90 sheets / minute (A4 portrait)
- Duplex: 180 images / minute (A4)

NSO questionnaires are mostly printed on A3 paper (297 x 420 mm).

Function    Estimated speed (sheets/minute)
Scanner     45
Reader      17
Verifier    5
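The estimated speeds above can be combined with the workstation counts from the TELEform 2000 configuration to get a rough idea of capacity per function. This is only a sketch: it assumes the quoted speeds apply to each individual scanner or workstation, which the slides do not state, and the function names are illustrative.

```python
# Estimated speeds from the slide, in sheets per minute.
SPEED = {"Scanner": 45, "Reader": 17, "Verifier": 5}

# Unit counts from the TELEform 2000 hardware slide.
UNITS = {"Scanner": 6, "Reader": 21, "Verifier": 55}


def stage_capacity(speed, units):
    """Aggregate sheets/minute per function.

    ASSUMPTION: each quoted speed is per unit, so capacity = speed * units.
    """
    return {name: speed[name] * units[name] for name in speed}


def minutes_to_process(sheets, rate):
    """Minutes needed to process `sheets` at `rate` sheets per minute."""
    return sheets / rate


capacity = stage_capacity(SPEED, UNITS)
for name, rate in capacity.items():
    print(f"{name}: {rate} sheets/minute")
```

Only the unclear questionnaires reach the Verifier workstations, so the Verifier figure is not directly comparable with the other two without knowing what share of sheets needs manual correction.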

DATA CAPTURING

Page 14:

DATA CAPTURING

Step of ICR System in NSO

Page 15:

Step of ICR System in NSO

Scan and Forms Distribution:

The questionnaires are scanned by Block / Village, creating multi-page image files.

Forms Evaluation:

The questionnaire images are evaluated. Questionnaires that were read correctly skip the Verifier workstations and are exported directly to the Database server.

DATA CAPTURING

Page 16:

Step of ICR System in NSO (cont’)

Forms Verification:

Unclear questionnaires must be reviewed and corrected at the Verifier workstations before being transferred to the Database server.

Data Export:

- Link a data file from the Database server to the IBM Mainframe system.
- Store scanned image files on CD.
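The evaluation and verification steps amount to a routing decision: clear forms go straight to export, unclear ones go to a Verifier queue. The sketch below illustrates that idea only. The `Form` class, the per-field confidence scores, and the 0.9 threshold are hypothetical and are not taken from TELEform or ABBYY.

```python
from dataclasses import dataclass

# Hypothetical cut-off below which a recognized field counts as "unclear".
THRESHOLD = 0.9


@dataclass
class Form:
    form_id: str
    values: dict      # field name -> recognized text
    confidence: dict  # field name -> recognition confidence in [0, 1] (assumed)


def unclear_fields(form, threshold=THRESHOLD):
    """Forms Evaluation: list the fields whose confidence is below the threshold."""
    return [name for name, score in form.confidence.items() if score < threshold]


def route(forms, threshold=THRESHOLD):
    """Split forms into (export_queue, verify_queue).

    Forms with no unclear field skip verification and are exported directly;
    the others wait for manual correction before export.
    """
    export_queue, verify_queue = [], []
    for form in forms:
        if unclear_fields(form, threshold):
            verify_queue.append(form)
        else:
            export_queue.append(form)
    return export_queue, verify_queue


forms = [
    Form("B001-01", {"age": "34", "sex": "1"}, {"age": 0.99, "sex": 0.97}),
    Form("B001-02", {"age": "7?", "sex": "2"}, {"age": 0.41, "sex": 0.95}),
]
export_queue, verify_queue = route(forms)
print([f.form_id for f in export_queue], [f.form_id for f in verify_queue])
```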

DATA CAPTURING

Page 17:

Scan & Forms Distribution

[Figure: Questionnaire, Scan, Image File]

DATA CAPTURING

Page 18:

Forms Verification and Data Export

[Figure: Verify; Storage (image files); Export data for processing; CD]

DATA CAPTURING

Page 19:

Input – Output of ICR System

Input: Questionnaire

ICR

Output: ASCII files, image files

DATA CAPTURING

Page 20:

ICR Linkage System

[Diagram: linkage between the TELEform system, the ABBYY system, and the mainframe. Labels recovered from the diagram:]

- TELEform Software: Questionnaires; Scanners (6 units); controller PCs (6 units); Readers (21 units); Verifications (30 units)
- ABBYY Software: Questionnaires; Scanners (4 units); controller PCs (4 units); Correction station (1 unit); Verifications (25 units)
- Servers: IBM Server, HP Server, COMPAQ Server
- Administration, Export, Recognition
- Backup Data, Software, Database
- Storage (HD 880 GB)
- CD
- Transfers to Mainframe: Processing (Editing & Reporting)

Page 21:

DATA CAPTURING

Specific Questionnaires for ICR System

Page 22:

Specific questionnaires for ICR System

- Questionnaires must be designed and printed on quality paper, with answer field boxes in specific colours (blue, green, red).
- Questionnaires should be filled in using at least a 2HB pencil.
- Questionnaires should be distributed, collected, and returned with care.

DATA CAPTURING

Page 23:

ICR Benefits

Reduce Cost

Reduce Time

Efficient Data Capture

Increase Data Accuracy

DATA CAPTURING

Page 24:

Major Problems Encountered in 2000 Census

- Strictly designed questionnaires: paper, size, colour, figure, and answer field boxes.
- Questionnaires must be filled in with a prescribed pencil and handwriting style.
- Distribution and return of questionnaires must be done carefully.

DATA CAPTURING

Page 25:

DATA EDITING

EDITING & TABULATION

Page 26:

DATA PROCESSING STEP

[Flowchart: Questionnaires; ICR; Machine Edit; Error? (Yes / No); Listing; Checking Error & List Data; Comparing with questionnaire images; Validate Data (Manual, Cold deck, Hot deck); Tabulation & Report; Table Checking; Accept? (Yes / No); Tab/Report]
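The flowchart names hot deck and cold deck as validation options. In general terms, a hot deck fills a missing value from a donor record in the data set being processed, while a cold deck takes the value from an external source such as an earlier census. The following is a minimal sketch of a sequential hot deck, not NSO's actual procedure; the field names and the choice of `sex` as the imputation class are assumptions.

```python
def hot_deck_impute(records, target, class_key):
    """Sequential hot deck imputation.

    A missing `target` value (None) is replaced by the most recent reported
    value from an earlier record in the same class. Imputed values are
    flagged and are not reused as donors.
    """
    last_seen = {}  # class value -> most recent reported target value
    result = []
    for record in records:
        record = dict(record)  # copy, so the input is left unchanged
        cls = record[class_key]
        if record.get(target) is None:
            if cls in last_seen:
                record[target] = last_seen[cls]
                record["imputed"] = True
        else:
            last_seen[cls] = record[target]
        result.append(record)
    return result


people = [
    {"sex": "F", "occupation": "farmer"},
    {"sex": "M", "occupation": None},      # no male donor yet: stays missing
    {"sex": "F", "occupation": None},      # takes "farmer" from the first record
    {"sex": "M", "occupation": "clerk"},
]
imputed = hot_deck_impute(people, target="occupation", class_key="sex")
for row in imputed:
    print(row)
```

A record with no earlier donor in its class stays missing here; in practice it would need an initial (cold deck) value or manual validation.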

Page 27:

DATA PROCESSING STEP

Editing Process

Possible code:

Check each field; out-of-range field values are shown with an asterisk (*) code.

Validity:

Check the characteristics of the message structure.
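The possible-code check can be pictured as a range test per field, with failing values replaced by an asterisk in the listing. A minimal sketch follows; the field names and valid ranges are made up for illustration and are not the census code book.

```python
# Hypothetical valid ranges per field (inclusive); not the real code book.
RANGES = {
    "sex": (1, 2),
    "age": (0, 98),
    "marital_status": (1, 5),
}


def possible_code_check(record, ranges=RANGES):
    """Return a copy of `record` with out-of-range values shown as '*'.

    Values that are not numeric also fail the check. Fields without a
    defined range are passed through unchanged.
    """
    checked = {}
    for name, value in record.items():
        if name not in ranges:
            checked[name] = value
            continue
        low, high = ranges[name]
        try:
            in_range = low <= int(value) <= high
        except (TypeError, ValueError):
            in_range = False
        checked[name] = value if in_range else "*"
    return checked


listing = possible_code_check({"sex": "3", "age": "25", "marital_status": "x"})
print(listing)
```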

Page 28:

DATA PROCESSING STEP

Editing Process (cont’)

Consistency:

Check for inconsistent values within a record and across records. Messages show the related condition codes. All errors are printed on continuous paper forms to be considered and validated by subject-matter staff until no error messages remain.

Imputation:

Automatic editing programs.
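A consistency edit of this kind can be sketched as a set of rules that each return a condition code, some applied within one person record and some across the records of a household. The two rules and the codes `C01` and `C02` below are invented for illustration and are not NSO's edit specifications.

```python
def within_record_errors(person):
    """Rules that look at one person record at a time."""
    errors = []
    # Hypothetical rule C01: a child under 13 should be recorded as never married.
    if person["age"] < 13 and person["marital_status"] != "never married":
        errors.append("C01")
    return errors


def across_record_errors(household):
    """Rules that compare the records of one household."""
    errors = []
    # Hypothetical rule C02: a household should have exactly one head.
    heads = [p for p in household if p["relationship"] == "head"]
    if len(heads) != 1:
        errors.append("C02")
    return errors


def edit_household(household):
    """Return (person_number, condition_code) messages; 0 means household level."""
    messages = []
    for number, person in enumerate(household, start=1):
        for code in within_record_errors(person):
            messages.append((number, code))
    for code in across_record_errors(household):
        messages.append((0, code))
    return messages


household = [
    {"relationship": "head", "age": 40, "marital_status": "married"},
    {"relationship": "child", "age": 8, "marital_status": "married"},
]
print(edit_household(household))
```

The edit would be rerun after each round of corrections until it returns no messages, which mirrors the "until no error messages remain" loop described above.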

Page 29:

DATA PROCESSING STEP

Tabulation Process

Tabulation:

Report summary data, which can be produced once the data are completely cleaned, so that subject-matter staff can analyze the output results.
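As an illustration of the tabulation step, the sketch below counts cleaned person records by two variables to produce a simple cross-tabulation. The variables (`province`, `sex`) and the sample records are hypothetical; NSO's own tabulations were produced with the mainframe and PC tools listed in this presentation.

```python
from collections import Counter


def cross_tab(records, row_key, col_key):
    """Count records for every (row value, column value) pair."""
    return Counter((r[row_key], r[col_key]) for r in records)


def print_table(table):
    """Print the cross-tabulation as plain text with one row per row value."""
    rows = sorted({row for row, _ in table})
    cols = sorted({col for _, col in table})
    print("\t".join([""] + cols))
    for row in rows:
        print("\t".join([row] + [str(table[(row, col)]) for col in cols]))


cleaned = [
    {"province": "Bangkok", "sex": "F"},
    {"province": "Bangkok", "sex": "F"},
    {"province": "Bangkok", "sex": "M"},
    {"province": "Chiang Mai", "sex": "M"},
]
table = cross_tab(cleaned, "province", "sex")
print_table(table)
```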

Page 30:

DATA PROCESSING STEP

Mainframe: IBM Multiprise 2000 Model 206

1.1 Operating System - OS/390 v.2 release 8

1.2 Compiler - PL/I

1.3 Statistic Program - Base SAS

1.4 Application Development Tools - Performance Reporter for OS/390

Page 31:

DATA PROCESSING STEP

Personal Computer (PC)

2.1 Operating System - Windows XP

2.2 Package

- MS Office 2003
- MS Studio v.6 (Visual FoxPro)
- SPSS
- CSPro 3.3

Page 32:

THANK YOU

FOR YOUR ATTENTION