8/2/2019 ion and Structure of Data
Organisation and Structure
of Data
Introduction
What is a file?
A collection of data stored in one unit
This unit is identified by a filename
File Structure
A file is made up of a number of records
A record is made up of a number of fields
Each line in a file is a record
Each field in a record holds a piece of data
File Structure
Record Type  Credit/Debit  Date        Description      Amount
B            C             01/05/2011  Counter credit   250.00
B            D             03/05/2011  Chq No: 176534    26.34
B            D             04/05/2011  Chq No: 176535   134.28
B            C             12/05/2011  Counter credit    50.00
Each row is a record; each column is a field
File Structure
When all the records are the same length, the file is said to be a:
fixed length file
ID     Forename  Surname  DOB
92013  Sidney    JONES    23/04/1978
92017  Lorraine  LAIDLAW  12/11/1979
92114  Mandy     MERRITT  08/03/1979
File structure: ID: 5, Surname: 15, Forename: 15, DOB: 8
Fixed Length Files
Advantage: Reading them can be very fast because the computer knows where each record/field is
Disadvantage: Lots of unused space in each record, therefore larger file sizes
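The speed advantage comes from simple arithmetic: record n starts at byte n x record length, so the computer can jump straight to it. A minimal Python sketch, assuming a layout of ID 5, Forename 15, Surname 15 and DOB 8 bytes, space padding, ASCII text and DOB stored as DDMMYYYY. The names pack_record and read_record are hypothetical, and an in-memory buffer stands in for a real file:

```python
import io

# Assumed layout: (field name, width in bytes), 43 bytes per record in total.
FIELDS = [("id", 5), ("forename", 15), ("surname", 15), ("dob", 8)]
RECORD_SIZE = sum(width for _, width in FIELDS)


def pack_record(values):
    """Pad (or cut) every field to its fixed width and join them."""
    text = "".join(values[name].ljust(width)[:width] for name, width in FIELDS)
    return text.encode("ascii")


def read_record(f, n):
    """Jump straight to record n (0-based) without reading earlier records."""
    f.seek(n * RECORD_SIZE)
    raw = f.read(RECORD_SIZE).decode("ascii")
    record, pos = {}, 0
    for name, width in FIELDS:
        record[name] = raw[pos:pos + width].rstrip()  # drop the padding
        pos += width
    return record


people = [
    {"id": "92013", "forename": "Sidney", "surname": "JONES", "dob": "23041978"},
    {"id": "92017", "forename": "Lorraine", "surname": "LAIDLAW", "dob": "12111979"},
    {"id": "92114", "forename": "Mandy", "surname": "MERRITT", "dob": "08031979"},
]
data_file = io.BytesIO(b"".join(pack_record(p) for p in people))

third = read_record(data_file, 2)
print(third["surname"])  # MERRITT
```

The padding added by ljust is the unused space named as the disadvantage: every record takes 43 bytes however short the names are.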
File Structure
If records can have different lengths, then the file is said to be a:
variable length file
Each field is separated by a delimiter character, such as an *
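A variable length record can be sketched in Python as fields joined by the * delimiter. The sample values and the names format_record and parse_record are assumptions made for illustration only:

```python
def format_record(fields, delimiter="*"):
    """Join fields into one variable length record."""
    if any(delimiter in field for field in fields):
        raise ValueError("a field must not contain the delimiter")
    return delimiter.join(fields)


def parse_record(line, delimiter="*"):
    """Split one record back into its fields."""
    return line.rstrip("\n").split(delimiter)


line = format_record(["92013", "Sidney", "JONES", "23/04/1978"])
print(line)                # 92013*Sidney*JONES*23/04/1978
print(parse_record(line))  # ['92013', 'Sidney', 'JONES', '23/04/1978']
```

No space is wasted on padding, but the reader has to scan for each * to find where a field ends.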
File Sizes
Can only be worked out for fixed length files
1. Work out how long each field is:
ID: 5, Surname: 15, Forename: 15, YearGroup: 2, FormNo: 1, DOB: 8
2. Add the field lengths together to get the record size:
5 + 15 + 15 + 2 + 1 + 8 = 46 bytes
3. Finally, multiply by the number of records in the whole file (1000 in this example):
46 x 1000 = 46,000 bytes
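The three steps can be sketched in Python. This assumes one byte per character and no extra overhead in the file; the function name file_size is hypothetical:

```python
# Step 1: the length of each field, in bytes.
field_lengths = {
    "ID": 5, "Surname": 15, "Forename": 15,
    "YearGroup": 2, "FormNo": 1, "DOB": 8,
}


def file_size(field_lengths, record_count):
    record_size = sum(field_lengths.values())  # step 2: record size
    return record_size * record_count          # step 3: whole file


print(file_size(field_lengths, 1000))  # 46000
```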
File Structure
Fixed length files: Very good when you have lots of data that is always the same length
Eg: transaction details (transaction ID, account number, bank sort code, amount)
Variable length files: Very good when you have fields that differ in size
Eg: customer details (title, name, address, gender, DOB, etc)
File Types
Serial file:
Records in this type of file are read from top to bottom
Contains data in no particular order; records are usually stored in the order they are received
File Types
Sequential file:
File is still read from top to bottom, but is sorted first
Records are stored in some sort of order, for example account number order
File Types
Indexed sequential file:
The file can be:
Read sequentially from top to bottom
Searched for a specific record by using the index
Each record is given a key which uniquely identifies it, and the file is kept in key order
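Both kinds of access can be sketched in Python. This assumes the index is simply the sorted list of keys, searched with a binary search; real indexed sequential files usually keep a separate, often multi-level, index. The sample records and function names are made up for illustration:

```python
import bisect

# Records kept in key order (assumed sample data).
records = [(1001, "Ali"), (1004, "Beth"), (1009, "Carl"), (1015, "Dina")]
# The index: keys[i] is the key of records[i].
keys = [key for key, _ in records]


def read_all():
    """Sequential access: top to bottom, in key order."""
    return [name for _, name in records]


def find(key):
    """Direct access: use the index instead of reading every record."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return records[i]
    return None


print(read_all())
print(find(1009))
```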
File Types
Random access file:
Because of the key, the file does not need to be kept in order
The file is accessed based on the key, not by reading from top to bottom
Each record is given a unique key, which is generated by an algorithm
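One common way to access a file by key is a hashing algorithm that turns the key into a storage position. The slides do not name the algorithm, so the modulo hash, the slot count and the linear probing below are all assumptions. A minimal Python sketch, with a list standing in for the file:

```python
SLOTS = 11             # assumed number of record slots in the file
table = [None] * SLOTS


def address(key):
    """Hypothetical hashing algorithm: remainder after dividing by SLOTS."""
    return key % SLOTS


def store(key, record):
    start = address(key)
    for step in range(SLOTS):
        probe = (start + step) % SLOTS  # on a collision, try the next slot
        if table[probe] is None or table[probe][0] == key:
            table[probe] = (key, record)
            return probe
    raise RuntimeError("file is full")


def fetch(key):
    start = address(key)
    for step in range(SLOTS):
        probe = (start + step) % SLOTS
        if table[probe] is None:
            return None                 # empty slot: the key is not stored
        if table[probe][0] == key:
            return table[probe][1]
    return None


store(92013, "JONES")
store(92024, "LAIDLAW")  # same address as 92013, so it lands in the next slot
print(fetch(92024))      # LAIDLAW
```

This simple probing scheme does not support deleting records; a real file would need a marker for deleted slots.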
Adding Data to Files
Serial files: New records are appended to the bottom of the file
Sequential files: Records are read in one at a time and the new record is slotted in where appropriate
Old file: 1 2 3 5
New record: 4
New file: 1 2 3 4 5
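The sequential insert can be sketched in Python, with lists standing in for the old and new files. The function name insert_sequential is hypothetical:

```python
def insert_sequential(old_records, new_record):
    """Copy the old file to a new one, slotting the new record in key order."""
    new_file = []
    inserted = False
    for record in old_records:  # read the old file one record at a time
        if not inserted and new_record < record:
            new_file.append(new_record)
            inserted = True
        new_file.append(record)
    if not inserted:            # new record belongs at the very end
        new_file.append(new_record)
    return new_file


print(insert_sequential([1, 2, 3, 5], 4))  # [1, 2, 3, 4, 5]
```

For a serial file the same job is a single append to the end of the list.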
Adding Data to Files
Indexed sequential files: The record is slotted into the correct place based on its key, the same as for sequential files
Random access files: The records don't need to be stored in any order. The index that points to where the record is located is updated
Deleting Data from Files
Serial files: Each record is copied across one by one to a new file. The deleted record is just missed out
Sequential files: Same as with serial files
Old file: 1 2 3 4 5
New file: 1 2 3 5
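Deleting by copying can be sketched in Python, again with lists standing in for the old and new files. The function name delete_record is hypothetical:

```python
def delete_record(old_records, key_to_delete):
    """Copy every record to a new file except the one being deleted."""
    new_file = []
    for record in old_records:
        if record == key_to_delete:
            continue  # the deleted record is simply missed out
        new_file.append(record)
    return new_file


print(delete_record([1, 2, 3, 4, 5], 4))  # [1, 2, 3, 5]
```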
Deleting Data from Files
Indexed sequential files: Both the key and the record must be deleted
Random access files: As the record can be stored anywhere, only the key needs to be deleted, without reorganising the file structure
File Types
All of the file types described previously are relatively old, though they are still used
A newer method for storing data is a:
database
Databases
What is a database?
A collection of related data organised in a structured way so as to allow easy management of the data
What do we mean by management of the data?
Selection (retrieval) of data
Updating data already there
Inserting new data
Deleting data we don't want anymore
Databases
What is data?
Raw facts or figures
What is information?
Data that has a meaning
Databases
So, is this data or information?
20/07/1969
Data
Whilst it is obviously a date, you don't know what the date relates to
Date of first moon landing = 20/07/1969
Now it is information
Databases
A database is managed by:
Database management system (DBMS)
A DBMS is:
A collection of programs that provides the necessary tools to create and manipulate the data in a database
Database Management System
The DBMS sits between the applications you are running and the files that are holding the data
Applications -> DBMS -> File System
The DBMS will manage this data by:
Checking for data inconsistencies
Minimising duplicated data
Retrieving related data from different files
Databases
Advantages:
Data can be accessed quickly and manipulated to create new data
A single database can be shared by many users
Data validation can ensure good quality data
Data duplication can be avoided
Security of data can be centralised
Disadvantages:
Requires time to set up
Centralised data can be easier to steal
If data is not correct then all users will see the wrong data
Elements of a Database
Table: A complete set of data, the equivalent of a file. Also known as an entity
Record: One row of related data within a table
Field: A property or characteristic of a table. Also known as an attribute
Types of Database
Flat-file database: A single table
Relational database: Multiple tables
Designing a Database
The first thing to do is to work out what data we have to store
To do this you would carry out the analysis part of the software development lifecycle, using:
Questionnaires
Interviews
Any current documentation
Designing a Database
Next, we need to decide what is data and what is information
21 is data
Age: 21 is information
How could we store this?
Date of birth
Designing a Database
Next, we need to break everything down into:
Atomic data
"Mr Brad Pitt" could be held as a name
But we have 3 separate pieces of data:
Title
Forename
Surname
Each of these is atomic data
Data Types
Standard data types:
String / Text
Integer / Number / Short / Long
Decimal / Real / Single / Double
Date and Time
Boolean
Why don't we have currency?
Databases
To help organise the data we need a:
Primary key
This is:
A field in which a unique piece of data can be held for each record
Primary Keys
What sort of field would be suitable for:
Students at college? StudentID
Customers for an online shop? Email address / CustomerID
Cars at a garage? Car registration
Getting Data Back
One of the advantages of using a database is the ability to get back just the data you want
For example:
All the names of students over 18
Registrations of cars with no road tax
How much tax a specific employee paid within a certain timeframe
We do this with the use of queries
Queries
Queries involve using a special language called Structured Query Language (SQL) to get back and manipulate data from the database
Queries
Queries can be used in a number of ways:
SELECTing records from the database that meet specific criteria (filtering)
COUNTing the number of records that meet specific criteria
Performing calculations on fields that meet certain criteria (eg: SUM)
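The three kinds of query can be tried with Python's built-in sqlite3 module. The student table, its columns and the sample rows are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary in-memory database
conn.execute(
    "CREATE TABLE student ("
    "student_id INTEGER PRIMARY KEY, name TEXT, age INTEGER, fees_paid REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Sidney", 19, 100.0), (2, "Lorraine", 17, 50.0), (3, "Mandy", 21, 75.5)],
)

# SELECT: filter the records that meet the criteria.
names = [row[0] for row in conn.execute(
    "SELECT name FROM student WHERE age > 18 ORDER BY student_id")]

# COUNT: how many records meet the criteria.
count = conn.execute(
    "SELECT COUNT(*) FROM student WHERE age > 18").fetchone()[0]

# SUM: a calculation over the matching records.
total = conn.execute(
    "SELECT SUM(fees_paid) FROM student WHERE age > 18").fetchone()[0]

print(names, count, total)  # ['Sidney', 'Mandy'] 2 175.5
conn.close()
```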
Validation and Verification
When a user enters data into a system there is a chance that it could be wrong
GIGO: Garbage In, Garbage Out
Validation and Verification
Errors can occur
When data is captured
When hardcopy data is copied onto a computer
When data is transmitted within a computer system
When data is being processed by software
To prevent this we use validation and verification
Verification - Validation
Verification: Checking by comparison that no alterations have been made to data as it is transferred from one system to another, or on first entry onto a computer
EG: keying data twice and comparing input/outputs
Validation: Checking that data is sensible, rejecting data that is not
EG: presence check / format check
Transcription errors
Occur when data is manually copied
Transcription errors
Usually due to:
Bad handwriting
Misreading
Mishearing
Long strings of meaningless numbers
Transcription errors
Main type: transposition errors
When two characters are swapped over
Eg: 134638 entered as 136438
Activity
For each of these transcription errors, explain what is wrong and how the error is likely to have been made:
SO23 5RT entered as SO23 SRT
Leeming entered as Lemming
419863 entered as 419683
2000000 entered as 200000
238.591 entered as 2385.91
23/05/89 entered as 23/05/07
199503 entered as 195503
Transmission errors
Data that is already entered correctly on a computer becomes corrupted when transferred to another computer
Processing Errors
Errors due to incorrectly written software
Could be due to:
Incorrect calculations
Records being ignored
Wrong records being updated
Verification
Remember this is:
Checking that data is correct
For example:
Double-entry verification
Using check digits
On-screen verification
Verification
Double-Entry Verification:
This is the entering of data twice
For example:
Passwords
Email addresses
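Double-entry verification comes down to comparing the two entries. A minimal Python sketch; the function name and the sample email addresses are made up for illustration:

```python
def double_entry_matches(first_entry, second_entry):
    """Accept the data only if both entries are identical."""
    return first_entry == second_entry


print(double_entry_matches("sid@example.com", "sid@example.com"))  # True
print(double_entry_matches("sid@example.com", "sid@exmaple.com"))  # False
```

A match shows only that the same thing was typed twice, not that it is right: an error made identically in both entries still gets through.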
Verification
On-screen verification:
This is where the user is presented with the entered data for them to manually check over
Often used in online forms when signing up to things
Verification
Check digit:
A digit in a numerical field used to check that the overall sequence of numbers is valid
Done by automatically performing a calculation on the other numbers
EG: Barcodes
Bank account numbers
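As one concrete case, the check digit on an EAN-13 barcode can be sketched in Python. The slides do not say which scheme they have in mind, so EAN-13 is an assumption; bank account numbers use different calculations. The function names are hypothetical:

```python
def ean13_check_digit(first_12_digits):
    """Work out the 13th digit from the first 12."""
    if len(first_12_digits) != 12 or not first_12_digits.isdigit():
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... from the left-hand digit.
    total = sum(int(d) * (3 if i % 2 else 1)
                for i, d in enumerate(first_12_digits))
    return (10 - total % 10) % 10


def is_valid_ean13(code):
    """Recalculate the check digit and compare it with the one supplied."""
    return (len(code) == 13 and code.isdigit()
            and ean13_check_digit(code[:12]) == int(code[12]))


print(is_valid_ean13("4006381333931"))  # True
print(is_valid_ean13("4006381333391"))  # False: two digits swapped over
```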
Validation
Remember this is:
Checking that data follows the rules of the program and is sensible
For example:
Character check
Validation
Character check:
Checks for the appropriate range of characters
For example:
Upper / lower case characters
Numeric / alpha-numeric in correct fields
Validation
Format check:
Checks to see if data is in a valid format
For example: Dates being DD/MM/YYYY
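A format check for DD/MM/YYYY can be sketched in Python with a regular expression. The second step, which rejects impossible calendar dates, goes slightly beyond a pure format check and is included as an assumption about what is wanted; the function name is hypothetical:

```python
import re
from datetime import datetime

# Two digits, slash, two digits, slash, four digits.
DATE_PATTERN = re.compile(r"\d{2}/\d{2}/\d{4}")


def is_valid_date_format(text):
    if not DATE_PATTERN.fullmatch(text):
        return False                       # wrong shape
    try:
        datetime.strptime(text, "%d/%m/%Y")
    except ValueError:
        return False                       # right shape, impossible date
    return True


print(is_valid_date_format("23/04/1978"))  # True
print(is_valid_date_format("23/041978"))   # False: missing slash
print(is_valid_date_format("31/02/2011"))  # False: no 31 February
```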
Validation
Length check:
Checks to see the length of the data entered is correct
For example:
Account numbers
Credit card numbers
Validation
Range check:
Checks that data is neither too large nor too small and fits within certain parameters
For example: Age is over 18 after DOB entered
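The age example can be sketched in Python as a range check on the age worked out from the date of birth. The limits are assumptions: "over 18" is read here as 18 or older, and the upper limit of 120 is made up to show the "too large" side. The function names are hypothetical:

```python
from datetime import date


def age_on(dob, today):
    """Whole years between the date of birth and today."""
    years = today.year - dob.year
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1  # birthday not reached yet this year
    return years


def passes_range_check(dob, today, minimum_age=18, maximum_age=120):
    return minimum_age <= age_on(dob, today) <= maximum_age


today = date(2019, 8, 2)
print(passes_range_check(date(1990, 1, 1), today))  # True
print(passes_range_check(date(2005, 1, 1), today))  # False: too young
```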
Validation
Look-up lists:
Only allowing the user to select values from a list
For example:
Titles when entering a name (Mr, Mrs, etc)
Months when entering a date
Validation
Presence check:
Checks to see whether any data has been entered into a field
For example: When filling out online forms
Validation
Type check:
Checks to see if the appropriate data type has been entered
For example:
Checking that text isn't entered into a number field
Checking that only dates are entered into