Data Conversion Managing the Human Side of a Technical Process CS474 – Mar 2, 2006.
-
Upload
moris-townsend -
Category
Documents
-
view
212 -
download
0
Transcript of Data Conversion Managing the Human Side of a Technical Process CS474 – Mar 2, 2006.
Data Conversion
Managing the Human Side of a Technical Process
CS474 – Mar 2, 2006
Introduction and Objectives
Primary objectives– Help you to understand the process of
converting from one database form to another– Add some strategies to your toolkit for handling
data conversion issues– Provide you with an appreciation of the
challenges of implementing a database system.
What is data conversion?
Conversion is change. Data conversion is
change from one data form or format to another.
Conversion is not magic.
Data conversion
Requires thought and planning. Requires asking some difficult questions
– Value– Cost– Effort– Time
Requires work!
Data conversion vs data migration
Migration usually involves moving files/tables from one computer platform to another or one operating system to another.
Migration usually leaves the data intact. For example, if I was storing the name of a school as a 30 character value, it will remain a 30 character value.
Usually, underlying data structure remains the same.
Data conversion
Data conversion involves changing some of those underlying values.
Data conversion may pull apart or draw together data stored in different data files or different tables within the same database.
Data conversion may involve substantial changes to the way the data looks and behaves
Data conversion is involved when moving from one database application to another or one application system to another.
Data versus Information
Data is the raw values stored on the computer. Information has meaning. A programmer can write a program to move
data. An knowledgeable user must guide the
programmer to insure that the data is moved in a way that has meaning.
Consider the datum – 3.172
What is it? In an academic setting it might be a GPA. In weather, it might be a temperature. In physics, it might be a measurement.
Data is simply a value…Information gives that value meaning.
Conversion
Involves manipulating data in one system and trying to preserves its information value in another.
Exercise
Do a manual data conversion.
What do you need to design the conversion?
1. Legacy (or source) data.
2. New (or target) database.
3. What else?
Tasks
1. Review the legacy data. Look at each of the samples and look for similarities but also differences in the data.
2. Begin with the setup tables. Assume that where appropriate we have foreign key constraints on the data.
3. Then fill in the other tables. Id and StudentId are synonyms.
4. Key fields are underlined. Notice the only surrogate key is the id number.
5. Ask questions where you are not sure of the semantic (or meaning) of the data.
What do you need for design?
1. Legacy (or source) data.
2. New (or target) database.
3. Data dictionary for both systems.
4. Knowledgeable user on the legacy system.
5. System support people on the legacy system.
Challenges you encountered.
Data in several different format. Different values for the same item. Data “forced” because it didn’t fit. We had description, but what codes to use? Or maybe we had codes, but what descriptions
to use?
Data Conversion is a Cooperative Effort
The vendor who is charged with carrying out the conversion brings in a thorough knowledge of the new software system and its data requirements.
The user community brings in a (sometimes) thorough knowledge of what data they capture and how they use it.
Vendor goals in a conversion
Control size of project Keep the process profitable Keep the customer satisfied
Organization goals in a conversion
Keep costs low Have minimal impact on the organization Come out with a much better system Have all the needed data in place
Conversion truism
The very best job the conversion specialist can do is the very minimum that the customer expects.
A generic organization chart – Administrative Staff
Executive StaffPresident/VP level
Departmental Directors – functional
offices Departmental Directors – IT Support
StaffOffice Managers
Clerical Staff Key UserKey IT staffers
Project Manager
Executive Staff
What do they want out of the project?
$$$$ Service/effectiveness increases
Must authorize the expenditures for the major projects. Usually are directly involved in the purchase decision.
Director Level
Need more detailed information
Want little disruption in operation of their areas Will be held to their budgets Must authorize overtime
Usually involved in the purchase decision
Office manager
Usually is very close to the data.
Bridge between money people and “line staff” Concerned about getting the day to day work
done.
May not be involved in the purchase decision
Clerical staff
Limited resources. Limited time. WIIFM
Typically not involved in the purchase decision
How do we get them all to work together
Initial presentation (Conversion and project)– All are invited to attend– Includes president on down to individual end user– All hear the same message(s)
Schedule time for each group individually– Provide attention– Listen to their needs– Provide some solutions
Follow-up with written reports – There is no question later on about who heard what.
What are the common messages?
Data conversion provides a unique opportunity
To make data more useful– consistent code use– cross department data sharing
To clean house– remove data that you are not using– remove inconsistent data– change values which are undesirable (i.e..
random number vs. SSN as id’s).
Message
Cooperation among departments is necessary Loss of data is okay if it is planned, and leads
to a good outcome for the school.
Types of conversion
Manual (Time and effort)– Records are re-keyed into the system
Mechanical ($$)– Programs move and manipulate data
No conversion– Conscious decision
Typical (balance)– All three in some part
Message
In some cases you will need to re-key information. (Workload effect)
In some cases, you may decide to pay someone to write programs. (Cost effect)
And in some cases not converting some data is the best choice.
These are all possible choices to handle the needs across the institution.
Who does conversion?
You Hired temps 3rd party programmers Combination
MESSAGE – There will be work involved of the line staff, the people that know the data and how it is used.
Some important terms
Legacy system – The old system or systems which are being converted. May be called “source” data.
New system is called the “target”. Mapping – Process of developing the
specification for the conversion. The Design Translation – Process of changing one value to
a new value on the basis of a cross referencing table.
Conversion life cycle
Determine scope – under organization control Mapping (specification) – roadmap of the
process Programming (implementation) – done off-site First full pass – validation activities involve the
entire staff Validation and correction – its okay to find
errors Final “go-live” run – we also talk about
downtime, again so that all players know the ground rules
Follow-up support
Message
Show the places in the process where the organization must plan resources.
Remind people of “down-time”. Sometimes go-live is not planned around all of the offices activity.
Primary message is work! Plan for people to be needed to perform tasks periodically in the process.
How do we decide what to convert?
Cost effectiveness Data integrity Cross institutional
consistency and concerns
Cost effectiveness
How many records are in each file/table? Do you really use all of your data? Do you really need all of your data? How much time/money are you willing to
invest?$$$
Time
Scope
This message goes a long way
primarily with the executive and director level. puts the customer in control each person who is concerned begins to
understand that they will have a say in how things are done
each person also begins to realize that they may have to trade off some data for value in the process
Do you have data integrity?
Standards for names, addresses, use of id numbers, etc.?
Standards enforced? Cross-institutional consistency?
– Major codes– School codes– Prefix/suffix
System enforced consistency (codes, foreign keys).
What can we do about problems?
Data removal – Identify the bad values and do not bring those over.
Data combination and data translation – Use translation tables (old value to new value mapping) to take differing values and bring them under one approved set of values.– Example, multiple major codes or free format
values.– Different uses of prefixes in different offices.
In Summary
To co-opt the user community – invite all to become involved.
Provide some consistent simple messages that they can hold on to throughout the process. Reinforce those messages
Provide good rationale for why the hard decisions need to be made (provide instruction and ammunition for the arguments).
Put the responsibility for decisions into the hands of the organization.
This is not being done TO them but WITH them.
Technical solutions to some problems
Mixture of values – translation tables. – Example, course names – put out all of the variants and let
users choose. Duplicate rows – algorithm for determining potential
duplicates and combining their information. Columns used for multiple purposes – translation
tables Setup data inconsistent in tables – usually we do setup
as a manual conversion and then build translations to fit.
Go live process considerations
Criticality of the system Normal “downtime” of the organization Plan for the worst case
– What do we do if?..... Backup and recovery
Preparation – confidence level with process time
Phased or all at once?
Questions?