Data Processing of the 2010 Population and Housing Census 15-19 September 2008, Bangkok, Thailand...
Data Processing of the 2010 Population and
Housing Census
15-19 September 2008, Bangkok, Thailand
National Statistical Office, Thailand
CONTENT
Hardware & Software of ICR System
TELEform / ABBYY Functions
Steps of ICR System in NSO
Specific questionnaires for ICR System
DATA CAPTURING
ICR for The Population Census 2000
NSO first used the ICR system in 2000 to process the Population Census questionnaires, scanning forms for 16 million households (16 million forms). Processing the raw data took only 8 months, compared with 18 months using the key-in data entry system.
TELEform Hardware & Software System in 2000
TELEform Hardware System
NetServer for TELEform: Server (1)
NetServer for Database: Server (1)
Reader Modules: Workstations (21)
Verifier Modules: Workstations (55)
Scanner Control: Workstations (6)
Scanner: Fujitsu M4099D (6)
TELEform Software System
TELEform 6.2 Elite Enterprise Edition components:
TELEform Designer
TELEform Reader
TELEform Verifier
ICR System in 2003
The ICR system in NSO (Thailand) can be divided into 2 parts:
TELEform Software System
ABBYY Software System
ICR for The Agricultural Census 2003
NSO hired ABBYY software to process about 25% of the Agricultural Census 2003 questionnaires, which covered a total of 5.8 million households (24 million forms).
TELEform Hardware & Software System in 2003
TELEform Hardware System
NetServer for TELEform: Server (1)
NetServer for Database: Server (1)
Reader Modules: Workstations (21)
Verifier Modules: Workstations (30)
Scanner Control: Workstations (6)
Scanner: Fujitsu M4099D (6)
TELEform Software System
TELEform 6.2 Elite Enterprise Edition components:
TELEform Designer
TELEform Reader
TELEform Verifier
ABBYY Hardware & Software System in 2003
ABBYY Hardware System
IBM Server X Series 225 (1)
Correction Station (1)
Verifier Modules: Workstations (25)
Scanner Control: Workstations (4)
Scanner: Fujitsu M4099D (4)
Storageflex LT707 (1)
ABBYY Software System
ABBYY FormReader 6.0 Enterprise Edition components:
Form Design
Administration Station
Recognition Station
Correction Station
TELEform & ABBYY Functions
TELEform / ABBYY Designer Function
Creates a form template by fixing the positions of the field boxes on the questionnaire.
Figure: Questionnaire and Template
TELEform Reader / ABBYY Administration Function
Evaluates the questionnaires
Exports correctly read questionnaires to a data file
Sends unclear questionnaires to the TELEform/ABBYY Verifier Function, where they are corrected and then transferred to a data file
Stores scanned images
TELEform / ABBYY Verifier Function
Corrects questionnaires that contain mismarked or illegible fields
The corrected questionnaires are automatically exported to a data file
Functions Speed
The scanner supports paper sizes from A7 to A3.
Simplex scanning: 90 sheets / minute (A4 portrait)
Duplex scanning: 180 images / minute (A4)
NSO questionnaires are mostly printed on A3 (297 x 420 mm) paper.
Function    Estimated speed (sheets/minute)
Scanner     45
Reader      17
Verifier    5
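As a rough illustration of how the estimated speeds combine with the number of workstations, the sketch below multiplies each speed by the unit counts of the 2003 TELEform configuration. It assumes the speeds are per unit, which the slides do not state, and the names STAGES and stage_capacity are hypothetical.

```python
# Estimated speed (sheets/minute) and unit counts taken from the 2003
# TELEform configuration. Treating the speeds as per-unit figures is an
# assumption; the slides do not say so.
STAGES = {
    "Scanner": {"speed": 45, "units": 6},
    "Reader": {"speed": 17, "units": 21},
    "Verifier": {"speed": 5, "units": 30},
}


def stage_capacity(stages):
    """Return total sheets/minute per stage (speed x units)."""
    return {name: s["speed"] * s["units"] for name, s in stages.items()}


capacity = stage_capacity(STAGES)
for name, sheets_per_minute in capacity.items():
    print(f"{name}: {sheets_per_minute} sheets/minute")
```

Only unclear questionnaires reach the Verifier, so its lower total does not by itself indicate a bottleneck.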
Steps of ICR System in NSO
Scan and Forms Distribution:
Questionnaires are scanned for each block / village, and multi-page image files are created.
Forms Evaluation:
The questionnaire images are evaluated. Correctly read questionnaires skip the Verifier workstations and are exported directly to the Database server.
Forms Verification:
Unclear questionnaires must be reviewed and corrected at the Verifier workstations before being transferred to the Database server.
Steps of ICR System in NSO (cont'd)
Data Export:
The data file is linked from the Database server to the IBM Mainframe system.
Scanned image files are stored on CD.
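The evaluation, verification, and export steps can be sketched as a simple routing function. This is a minimal illustration only: the Form class, its unclear_fields attribute, and the function names are hypothetical, and no TELEform or ABBYY API is used.

```python
from dataclasses import dataclass, field


@dataclass
class Form:
    """A scanned questionnaire (hypothetical structure)."""
    form_id: str
    values: dict
    unclear_fields: list = field(default_factory=list)


def verify(form, corrections):
    """Forms Verification: an operator supplies values for unclear fields."""
    for name in form.unclear_fields:
        form.values[name] = corrections[name]
    form.unclear_fields = []
    return form


def process(forms, corrections_by_form):
    """Forms Evaluation, then Verification if needed, then Data Export."""
    database = []
    for form in forms:
        if form.unclear_fields:
            # Unclear forms go to a Verifier workstation first.
            form = verify(form, corrections_by_form[form.form_id])
        # Clean forms skip verification and are exported directly.
        database.append((form.form_id, form.values))
    return database


forms = [
    Form("A001", {"age": "34", "sex": "1"}),
    Form("A002", {"age": "?", "sex": "2"}, unclear_fields=["age"]),
]
exported = process(forms, {"A002": {"age": "27"}})
```

Here form A001 is exported as scanned, while A002 passes through verify before export.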
Figure: Scan & Forms Distribution (Questionnaire, Scan, Image File)
Figure: Forms Verification and Data Export (Verify, Storage of image files on CD, Export data for processing)
Input – Output of ICR System
Input: Questionnaire
Process: ICR
Output: ASCII files, image files
ICR Linkage System
Figure: ICR linkage diagram. Components shown:
ABBYY Software: Questionnaires; Scanners (4 units); Controller PCs (4 units); IBM Server (Administration, Export, Recognition); Correction station (1 unit); Verifications (25 units)
TELEform Software: Questionnaires; Scanners (6 units); Controller PCs (6 units); Readers (21 units); Verifications (30 units)
Other components: HP Server; COMPAQ Server; Backup Data, Software, Database; Storage (HD 880 GB); CD; Transfers; Mainframe; Processing (Editing & Reporting)
Specific Questionnaires for ICR System
The questionnaires must be designed and printed on good-quality paper, with answer field boxes in specific colours (blue, green, red).
Questionnaires should be filled in with a pencil of at least 2HB.
Questionnaires should be distributed, collected, and returned with care.
ICR Benefits
Reduce Cost
Reduce Time
Efficient Data Capture
Increase Data Accuracy
Major Problems Encountered in 2000 Census
Strictly designed questionnaires: paper, size, colour, figures, and answer field boxes
Questionnaires had to be filled in with a specified pencil and handwriting style
Distribution and return of questionnaires had to be done carefully
DATA EDITING
EDITING & TABULATION
DATA PROCESSING STEP
Figure: data processing flowchart. Elements shown: Questionnaires; ICR; Machine Edit; Listing; Error? (Yes / No); Checking Error & List Data; Comparing with questionnaire images; Validate Data (Manual, Cold deck, Hot deck); Tabulation & Report; Table Checking; Accept? (No / Yes); Tab/Report
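The Validate Data step names hot deck as one option. The slides do not describe how NSO implements it, so the following is only a generic sketch of a sequential hot deck: a missing value is filled from the most recent valid record, and the deck starts from a fixed cold-deck value. The function name, field names, and sample values are hypothetical.

```python
def hot_deck_impute(records, field_name, cold_deck_value):
    """Fill missing values with the last valid value seen (sequential hot deck).

    The deck starts with cold_deck_value, a fixed value from an earlier
    source, and is updated by each valid record in file order.
    """
    deck = cold_deck_value
    imputed = []
    for record in records:
        record = dict(record)  # copy so the input is left untouched
        if record.get(field_name) is None:
            record[field_name] = deck   # borrow from the current donor
        else:
            deck = record[field_name]   # this record becomes the donor
        imputed.append(record)
    return imputed


records = [
    {"id": 1, "age": None},
    {"id": 2, "age": 41},
    {"id": 3, "age": None},
]
result = hot_deck_impute(records, "age", cold_deck_value=30)
```

Record 1 receives the cold-deck value because no donor has been seen yet; record 3 receives the value of record 2.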
Editing Process
Possible code:
Check each field; out-of-range field values are shown with an asterisk (*) code.
Validity:
Check the characteristics of the message structure.
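The possible-code check can be sketched as a lookup against the allowed codes for each field, with out-of-range values replaced by an asterisk. The field names and code lists in VALID_CODES are hypothetical; the actual code lists are not given in the slides.

```python
# Hypothetical valid codes per field; the real code lists are not given.
VALID_CODES = {
    "sex": {"1", "2"},
    "marital_status": {"1", "2", "3", "4", "5"},
}


def flag_out_of_range(record, valid_codes=VALID_CODES):
    """Return a copy of record with out-of-range values replaced by '*'."""
    flagged = {}
    for name, value in record.items():
        allowed = valid_codes.get(name)
        if allowed is not None and value not in allowed:
            flagged[name] = "*"
        else:
            flagged[name] = value
    return flagged


checked = flag_out_of_range({"sex": "3", "marital_status": "2"})
```

Fields without an entry in VALID_CODES are passed through unchanged.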
Editing Process (cont'd)
Consistency:
Check for inconsistent values within a record and across records. Messages show the related condition codes. All errors are printed on continuous paper forms to be considered and validated by subject-matter staff until no error messages remain.
Imputation:
Automatic editing programs.
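A within-record consistency check can be sketched as a list of rules, each paired with the message code printed when it is violated. Both rules, the message codes, and the field names below are illustrative assumptions, not NSO's actual edit rules.

```python
def consistency_errors(record):
    """Return messages for within-record rules that the record violates.

    Both rules are illustrative assumptions, not NSO's edit rules.
    """
    rules = [
        ("E01: married but under 10 years old",
         lambda r: r["marital_status"] == "married" and r["age"] < 10),
        ("E02: age below years of schooling",
         lambda r: r["age"] < r["years_of_schooling"]),
    ]
    return [message for message, violated in rules if violated(record)]


errors = consistency_errors(
    {"age": 7, "marital_status": "married", "years_of_schooling": 2}
)
```

A record that passes every rule returns an empty list, which corresponds to the state in which no error messages remain.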
Tabulation Process
Tabulation:
Summary data are reported once the data have been completely cleaned, so that subject-matter staff can analyze the output.
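As a minimal sketch of tabulation on cleaned data, the example below counts records by two fields to form a cross-tabulation. The field names and sample records are hypothetical; NSO's tabulations run on the mainframe and PC tools listed in the slides, not on this code.

```python
from collections import Counter


def tabulate(records, row_field, col_field):
    """Count cleaned records by the pair (row_field, col_field)."""
    return Counter((r[row_field], r[col_field]) for r in records)


cleaned = [
    {"region": "North", "sex": "F"},
    {"region": "North", "sex": "M"},
    {"region": "South", "sex": "F"},
    {"region": "North", "sex": "F"},
]
table = tabulate(cleaned, "region", "sex")
for (region, sex), count in sorted(table.items()):
    print(region, sex, count)
```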
1. Mainframe: IBM Multiprise 2000 Model 206
1.1 Operating System: OS/390 v.2 release 8
1.2 Compiler: PL/I
1.3 Statistic Program: Base SAS
1.4 Application Development Tools: Performance Reporter for OS/390
2. Personal Computer (PC)
2.1 Operating System: Windows XP
2.2 Packages: MS Office 2003, MS Studio v.6 (Visual FoxPro), SPSS, CSPro 3.3
THANK YOU
FOR YOUR ATTENTION