GIS Data Models GEOG 370 Christine Erlien, Instructor.
GIS Data Models
GEOG 370
Christine Erlien, Instructor
GIS Data Models: Why?
Knowing how GIS data are structured helps us to use GIS programs more effectively
– Basic computer file structures
– Database structures
Basic computer file structures
What is where?
– Computer file structures allow the computer to store, order, & search data
Types:
– Simple list
– Ordered sequential
– Indexed file
Basic computer file structures: Simple list
Simple List
– Most basic
– No order, no organization
– Input is simple: just add on
– Searching difficult & inefficient
– Example: If my class roster were ordered based on when you added this class
Basic computer file structures: Ordered sequential files
Ordered sequential files
– Records ordered by alphabetic or numerical character sequence
  • How? Algorithm: divide and conquer
    – Record compared to records preceding & following to determine which half to search
    – Repeat until done
– Inserting a record is slow
– Searching more efficient than simple list
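The divide-and-conquer search described above can be sketched as a binary search. This is a minimal illustration, not a specific GIS package's implementation; the function name `binary_search` and the sample record IDs are assumptions made for the example.

```python
def binary_search(records, target):
    """Return the position of target in a sorted list, or -1 if it is absent."""
    low, high = 0, len(records) - 1
    while low <= high:
        mid = (low + high) // 2          # compare against the middle record
        if records[mid] == target:
            return mid
        if records[mid] < target:
            low = mid + 1                # target can only be in the upper half
        else:
            high = mid - 1               # target can only be in the lower half
    return -1


# Hypothetical record IDs, already in numerical order.
record_ids = [3, 8, 15, 21, 42, 57]
found_at = binary_search(record_ids, 21)
missing = binary_search(record_ids, 4)
```

Each comparison discards half of the remaining records, which is why searching an ordered file takes far fewer steps than scanning a simple list.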
Basic computer file structures: Ordered sequential files
Example file:
Cary
Chapel Hill
Durham
Graham
Greensboro
Raleigh
To add: Maggie Valley
What’s the process?
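One way to answer the question is sketched here using Python's standard `bisect` module, with the slide's towns placed in strict alphabetical order. The helper name `insert_record` is an assumption for illustration only.

```python
import bisect

towns = ["Cary", "Chapel Hill", "Durham", "Graham", "Greensboro", "Raleigh"]


def insert_record(records, new_record):
    """Insert new_record so the list stays sorted; return its position."""
    # Step 1: a divide-and-conquer search finds where the record belongs.
    position = bisect.bisect_left(records, new_record)
    # Step 2: every later record shifts down one slot to make room.
    records.insert(position, new_record)
    return position


position = insert_record(towns, "Maggie Valley")
```

Finding the slot is quick, but shifting the records that follow it is what makes insertion into an ordered sequential file slow.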
Basic computer file structures
Indexed files
– Database index
  • Can be built for a field that uniquely identifies a record (primary key) or for other fields
  • Used to determine the location of rows in a file that satisfy some condition
  • Keys & indexes can be extracted & sorted, and the original file accessed through them, faster than the original file itself could be sorted
– Types
  • Direct: Each record searched for particular properties
  • Inverted: Index based on anticipated search criteria
Indexed files
[Figure: inverted index and direct index]
Basic computer file structures: Indexed files
Advantages
– Quicker (i.e., reduces computational time)
Disadvantages
– Inverted
  • Requires knowledge of likely search criteria
  • Data additions require recalculation of index
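The difference between searching every record and using an inverted index can be sketched as follows. The records, field names, and function names are illustrative assumptions, not taken from any particular GIS.

```python
# Hypothetical attribute records; the list position stands in for file location.
records = [
    {"id": 1, "town": "Durham", "county": "Durham"},
    {"id": 2, "town": "Cary", "county": "Wake"},
    {"id": 3, "town": "Raleigh", "county": "Wake"},
    {"id": 4, "town": "Chapel Hill", "county": "Orange"},
]


def direct_search(rows, field, value):
    """Examine every record for the property of interest."""
    return [pos for pos, row in enumerate(rows) if row[field] == value]


def build_inverted_index(rows, field):
    """Map each value of an anticipated search field to the record positions holding it."""
    index = {}
    for pos, row in enumerate(rows):
        index.setdefault(row[field], []).append(pos)
    return index


county_index = build_inverted_index(records, "county")
wake_positions = county_index.get("Wake", [])
```

The index answers county queries with a single lookup, but only because "county" was anticipated as a search criterion, and the index has to be rebuilt or updated whenever records are added.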
Databases & Database Structures
What is where?
– Geographic searches require data retrieval
– Data retrieval requires data organization
Databases & Database Structures
Database: Collection of multiple files
– Requires more elaborate structure for management
DBMS: Database Management System
Database structure types
– Hierarchical data structures
– Network systems
– Relational database systems
Database Structures: Hierarchical
Hierarchical data structures
– One-to-many (parent-child) relationship
– Requires relationship be defined before structure & decision rules developed
– Advantage:
  • Easy to search
– Disadvantage:
  • Knowledge of all questions that might be asked necessary
    – Unanticipated criteria make search impossible
  • Large index files are memory intensive, with slow access
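A parent-child structure of this kind can be sketched with nested dictionaries. The state/county/town hierarchy and the function name `find_path` are assumptions chosen for illustration.

```python
# Each parent maps to its children: a one-to-many relationship.
tree = {
    "North Carolina": {
        "Orange County": {"Chapel Hill": {}, "Carrboro": {}},
        "Wake County": {"Raleigh": {}, "Cary": {}},
    }
}


def find_path(node, target, path=()):
    """Follow parent-child links downward; return the path to target, or None."""
    for name, children in node.items():
        current = path + (name,)
        if name == target:
            return current
        found = find_path(children, target, current)
        if found:
            return found
    return None


cary_path = find_path(tree, "Cary")
```

Questions that follow the predefined hierarchy (which county contains Cary?) are easy to answer; a question the hierarchy was not designed for, such as grouping towns by population, has no branch to follow.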
Hierarchical Database Structures
Database Structures: Network Systems
Network Systems
– Allow users to move from data item to data item through a series of pointers
  • Pointers: Computer structures that direct a piece of data to all others to which it relates (connect one file location to another)
– Pointers indicate relationships among data items
Database Structures: Network Systems
Advantages:
– Less rigid than hierarchical structure
– Can handle many-to-many relationships
– Reduce data redundancy
– Greater search flexibility
Disadvantages:
– In very complex GIS databases, the number of pointers can get quite large, requiring more storage space
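A pointer-based, many-to-many structure might look like the sketch below. The `Node` class, the `connect` and `related` helpers, and the road/county data are hypothetical and serve only to illustrate the idea.

```python
class Node:
    """A data item that stores pointers to every item it relates to."""

    def __init__(self, name):
        self.name = name
        self.pointers = []


def connect(a, b):
    """Record the relationship in both directions."""
    a.pointers.append(b)
    b.pointers.append(a)


def related(node):
    """Names of all items reachable through one pointer."""
    return sorted(other.name for other in node.pointers)


wake, orange = Node("Wake County"), Node("Orange County")
i40, i440 = Node("I-40"), Node("I-440")

# One road can cross many counties, and one county can contain many roads.
connect(i40, wake)
connect(i40, orange)
connect(i440, wake)

pointer_count = sum(len(n.pointers) for n in (wake, orange, i40, i440))
```

Three relationships already require six stored pointers, which hints at how pointer storage grows in a complex database.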
Database Structures: Relational Databases
Predominant in GIS
Tuples: Ordered records/rows of attribute values
Primary Key: Unique identifier for each record in a relational table
| Lu_code | Crop type | Status | Cost |
| 010001 | Row crops | Active | 1000/ha |
| 020001 | Orchards | Dormant | 1500/ha |
| 021001 | Rangeland | Active | 900/ha |
| 010001 | Row crops | Active | 1100/ha |
| 010404 | Garden farms | Active | 1250/ha |
| 010001 | Row crops | Dormant | 1050/ha |
Database Structures: Relational Databases
Joining tables
Relational join
– Matching data from one table to corresponding data in another table
– How? Link the primary key to the foreign key
  • Primary Key: Unique identifier in 1st table
  • Foreign key: Column in 2nd table to which primary key is linked
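A relational join can be sketched in plain Python as a lookup from the foreign key to the primary key. The two tables, their column names, and the function `relational_join` are assumptions for illustration; a real DBMS would perform this with its own join operation.

```python
# 1st table: lu_code is the primary key (one row per land-use code).
land_use = [
    {"lu_code": "010001", "crop_type": "Row crops"},
    {"lu_code": "020001", "crop_type": "Orchards"},
]

# 2nd table: lu_code is a foreign key pointing back to land_use.
parcels = [
    {"parcel_id": 1, "lu_code": "010001"},
    {"parcel_id": 2, "lu_code": "020001"},
    {"parcel_id": 3, "lu_code": "010001"},
]


def relational_join(primary_table, foreign_table, key):
    """Attach the matching primary-table row to each foreign-table row."""
    lookup = {row[key]: row for row in primary_table}
    return [
        {**row, **lookup[row[key]]}
        for row in foreign_table
        if row[key] in lookup
    ]


joined = relational_join(land_use, parcels, "lu_code")
```

Each parcel row now carries the crop type, even though the crop type is stored only once per land-use code.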
Database Structures: Relational Databases
Relational DB & Normal Forms
Normal forms: A set of rules established to indicate the form tables should take
Goal: Reduce database redundancy so that database performance is better
First normal form
– Table must contain columns & rows
– Columns will be used for searches, so only one value per cell
Second normal form
– Every column that is not the primary key should be dependent on the primary key
  • On the entire primary key if the primary key is composed of more than one column
Relational DB & Normal Forms
| PART | WAREHOUSE | QUANTITY | WAREHOUSE-ADDRESS |
Key: Part & Warehouse together
Address only dependent on warehouse portion of key
| PART | WAREHOUSE | QUANTITY |
| WAREHOUSE | WAREHOUSE-ADDRESS |
Example from William Kent, "A Simple Guide to Five Normal Forms in Relational
Database Theory", Communications of the ACM 26(2), Feb. 1983, 120-125.
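The split into two tables can be sketched as follows. The column layout follows Kent's example, but the sample rows and variable names are invented for illustration.

```python
# Unnormalized table: the address repeats for every part stored in a warehouse.
inventory = [
    {"part": "bolt", "warehouse": "W1", "quantity": 100, "warehouse_address": "12 Main St"},
    {"part": "nut", "warehouse": "W1", "quantity": 250, "warehouse_address": "12 Main St"},
    {"part": "bolt", "warehouse": "W2", "quantity": 40, "warehouse_address": "9 Oak Ave"},
]

# Table 1: quantity depends on the whole key (part + warehouse).
stock = [
    {"part": r["part"], "warehouse": r["warehouse"], "quantity": r["quantity"]}
    for r in inventory
]

# Table 2: address depends on the warehouse alone, so it is stored once per warehouse.
warehouses = {r["warehouse"]: r["warehouse_address"] for r in inventory}
```

After the split, each address is stored once, and `stock` rows reach it through the warehouse column.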
Relational DB & Normal Forms
Third Normal Form
– Non-key columns must depend on the primary key
– Primary key does not depend on any non-key column
| EMPLOYEE | DEPARTMENT | LOCATION |
Key field: Employee
Location is redundant: it depends on Department, not directly on the key field
| EMPLOYEE | DEPARTMENT |
| DEPARTMENT | LOCATION |
Normalization of Database Tables
Normalization: Process of organizing data in a database
– Creating tables & establishing relationships between them according to rules of normal form
– Goal: Make the database more flexible by eliminating redundancy and inconsistent dependency
Normalization of Database Tables
Problem with data redundancy:
– Wastes disk space
– Creates maintenance problems
  • If data that exist in more than one place must be changed, they must be changed the same way in each place
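The maintenance problem can be illustrated with a small sketch that compares one redundant table against a normalized pair. All names and values here are hypothetical.

```python
# Redundant design: the department's location is repeated on every employee row.
employees_flat = [
    {"employee": "Avery", "department": "Mapping", "location": "Building A"},
    {"employee": "Blake", "department": "Mapping", "location": "Building A"},
    {"employee": "Casey", "department": "Survey", "location": "Building B"},
]


def move_department_flat(rows, department, new_location):
    """Every matching row must be updated; missing one leaves the data inconsistent."""
    changed = 0
    for row in rows:
        if row["department"] == department:
            row["location"] = new_location
            changed += 1
    return changed


rows_changed = move_department_flat(employees_flat, "Mapping", "Building C")

# Normalized design: the location is stored once per department.
departments = {"Mapping": "Building A", "Survey": "Building B"}
departments["Mapping"] = "Building C"  # a single change covers every employee
```

In the redundant table, the number of rows to change grows with the number of employees; in the normalized tables it is always one.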
Normalization & Normal Forms
Describing databases
– If the 1st rule is observed, the database is said to be in "first normal form."
– If the first 3 rules are observed, the database is considered to be in "third normal form."
Additional levels of normalization are possible, but 3rd normal form is considered the highest level necessary for most applications
Recap
File types
– Simple list
– Ordered sequential
– Indexed
Databases: Many files
– Structure necessary to make access to data in 1 or more files easier
Database types
– Hierarchical
– Network
– Relational