Post on 13-Jul-2015
Lecture 5, Wednesday 17th September 2014
DEPARTMENT OF GEOGRAPHY AND ENVIRONMENT
UNIVERSITY OF DHAKA
According to the NCDCDS (the US National Committee for Digital Cartographic Data Standards), there are five dimensions of geographic data quality. In addition, the ICA (International Cartographic Association) proposed two more dimensions.
1. Lineage of Geographic data
2. Positional Accuracy of Geographic data
3. Attribute Accuracy of Geographic data
4. Logical consistency
5. Completeness of Geographic data
6. Temporal accuracy
7. Semantic accuracy
Lineage refers to the source materials from which a specific set of geographic data was derived.
Lineage answers the following questions for a data user:
1. Who collected the data?
2. When were the data collected?
3. How were the data collected?
4. How were the data converted?
5. What algorithms were used to process the data?
6. What was the precision of computation?
Positional accuracy is the “closeness” of coordinate values to the “true” positions in the real world.
Generally, maps are accurate to roughly one line width, or 0.5 mm. This is known as the minimum mapping unit. A 0.5 mm resolution is equivalent to 5 m on 1:10000 scale maps and 125 m on 1:250000 scale maps.
Positional accuracy of data can be measured in two ways:
1. Planimetric accuracy
2. Height accuracy
Scale        Effective resolution (m)
1:2500       1.25
1:10000      5
1:24000      12
1:50000      25
1:100000     50
1:250000     125
1:500000     250
1:1000000    500
1:10000000   5000
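The table values follow from the 0.5 mm rule stated above. A minimal sketch, assuming that line width; the function name `effective_resolution_m` is made up for illustration:

```python
def effective_resolution_m(scale_denominator, line_width_mm=0.5):
    """Ground distance (m) covered by one line width on a 1:scale_denominator map."""
    # Millimetres on the map times the scale gives millimetres on the ground;
    # dividing by 1000 converts to metres.
    return line_width_mm * scale_denominator / 1000.0


for denominator in (2500, 10000, 24000, 50000, 250000):
    print(f"1:{denominator}  {effective_resolution_m(denominator)} m")
```

Passing a different `line_width_mm` shows how the effective resolution changes if a map is drawn with finer or coarser lines.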
Attribute accuracy is defined as the “closeness” of the descriptive data in the geographic database to the true or assumed values of the real-world features that they represent.
Attribute accuracy is measured in different ways:
For metric attributes (e.g. elevations in a DEM or TIN), accuracy can be expressed simply as measurement error.
For categorical attributes (e.g. a land use classification), accuracy is very difficult to measure. In such cases, attribute accuracy is usually evaluated in terms of other factors, such as:
1. The classification scheme
2. The amount of gross error
3. The degree of heterogeneity of the polygons
An error matrix is defined as a square array of values, denoted C, which cross-tabulates the number of sample spatial data units assigned to a particular category against the actual category as verified by the reference data.
It is constructed to show the frequency of discrepancies between encoded values and their corresponding reference values at a sample of locations.
In the error matrix, rows represent the categories of the classification of the database obtained by the user.
The columns indicate the classification of the reference data, obtained from source data or a field visit.
Diagonal elements represent correctly classified spatial data
Off-diagonal elements represent the frequencies of misclassification of various categories
If in a particular error matrix, all the non-zero entries lie on the diagonal, it indicates that no misclassification at the sample locations has occurred and an overall accuracy of 100% is obtained
When a misclassification occurs, it is either an error of commission (error of inclusion), which is measured through user's accuracy, or an error of omission (error of exclusion), which is measured through producer's accuracy.
Overall Accuracy
Overall accuracy (OA) is computed by dividing the total number of correctly classified pixels by the total number of reference pixels.
The maximum value of the overall accuracy is 100%, when there is perfect agreement between the database and the reference data. The minimum value is 0%.
OA is also termed PCC (Percent Correctly Classified). The following equation can be used:
PCC or OA = (Sd / n) * 100%
Where,
Sd = sum of values along the diagonal
n = total number of sample locations
Error matrix (rows: sample data; columns: reference data)
Sample data      Exposed soil  Cropland  Range  Sparse woodland  Forest  Water  Total
Exposed soil                1         2      0                0       0      0      3
Cropland                    0         5      0                2       3      0     10
Range                       0         3      5                1       0      0      9
Sparse woodland             0         0      4                4       0      0      8
Forest                      0         0      0                0       4      0      4
Water                       0         0      0                0       0      1      1
Total                       1        10      9                7       7      1     35
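The overall accuracy of the error matrix above can be checked with a short script. This is a minimal sketch; the names `C` and `overall_accuracy` are chosen here for illustration, and the matrix is copied from the table with rows as sample data and columns as reference data.

```python
# Class order: Exposed soil, Cropland, Range, Sparse woodland, Forest, Water
C = [
    [1, 2, 0, 0, 0, 0],
    [0, 5, 0, 2, 3, 0],
    [0, 3, 5, 1, 0, 0],
    [0, 0, 4, 4, 0, 0],
    [0, 0, 0, 0, 4, 0],
    [0, 0, 0, 0, 0, 1],
]


def overall_accuracy(matrix):
    """Percent correctly classified: diagonal sum divided by total sample count."""
    diagonal = sum(matrix[i][i] for i in range(len(matrix)))  # Sd
    total = sum(sum(row) for row in matrix)                   # n
    return 100.0 * diagonal / total


print(f"OA = {overall_accuracy(C):.1f}%")  # 20 of 35 correct -> OA = 57.1%
```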
Producer’s accuracy (PA) is computed by dividing the number of correctly classified pixels in each category (on the major diagonal) by the number of training set pixels used for that category (the column total).
Producer’s accuracy = (Ci / Ct) * 100%
Where,
Ci = correctly classified sample locations in the column
Ct = total number of sample locations in the column
Error of omission (EO) = 100% - producer’s accuracy
Calculation of PA
Exposed soil =1/1 =100%
Cropland =5/10 =50%
Range =5/9 =55.6%
Sparse woodland =4/7 =57.1%
Forest =4/7 =57.1%
Water body =1/1 =100%
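The producer's accuracy figures above can be reproduced from the column totals. A minimal sketch, assuming every column contains at least one sample; `producers_accuracy` is an illustrative name, not part of any library.

```python
# Error matrix: rows = sample data, columns = reference data.
# Class order: Exposed soil, Cropland, Range, Sparse woodland, Forest, Water
C = [
    [1, 2, 0, 0, 0, 0],
    [0, 5, 0, 2, 3, 0],
    [0, 3, 5, 1, 0, 0],
    [0, 0, 4, 4, 0, 0],
    [0, 0, 0, 0, 4, 0],
    [0, 0, 0, 0, 0, 1],
]


def producers_accuracy(matrix):
    """Diagonal element divided by its column total, as a percentage per class."""
    size = len(matrix)
    result = []
    for j in range(size):
        column_total = sum(matrix[i][j] for i in range(size))  # Ct
        result.append(100.0 * matrix[j][j] / column_total)     # Ci / Ct
    return result


pa = producers_accuracy(C)
print([round(value, 1) for value in pa])
# Error of omission is the complement of producer's accuracy.
print([round(100.0 - value, 1) for value in pa])
```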
User’s accuracy (UA) is computed by dividing the number of correctly classified pixels in each category by the total number of pixels that were classified in that category (the row total).
This figure is a measure of commission error and indicates the probability that a pixel classified into a given category actually represents that category on the ground.
UA = (Ri / Rt) * 100%
Where,
Ri = correctly classified sample locations in the row
Rt = total number of sample locations in the row
Error of commission = 100% - user’s accuracy
Calculation of UA
Exposed soil =1/3 =33.3%
Cropland =5/10 =50%
Range =5/9 =55.6%
Sparse woodland =4/8 =50%
Forest =4/4 =100%
Water body =1/1 =100%
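The user's accuracy figures above come from the row totals in the same way. A minimal sketch, assuming every row contains at least one sample; `users_accuracy` is an illustrative name.

```python
# Error matrix: rows = sample data, columns = reference data.
# Class order: Exposed soil, Cropland, Range, Sparse woodland, Forest, Water
C = [
    [1, 2, 0, 0, 0, 0],
    [0, 5, 0, 2, 3, 0],
    [0, 3, 5, 1, 0, 0],
    [0, 0, 4, 4, 0, 0],
    [0, 0, 0, 0, 4, 0],
    [0, 0, 0, 0, 0, 1],
]


def users_accuracy(matrix):
    """Diagonal element divided by its row total, as a percentage per class."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]  # Ri / Rt


ua = users_accuracy(C)
print([round(value, 1) for value in ua])
# Error of commission is the complement of user's accuracy.
print([round(100.0 - value, 1) for value in ua])
```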
4. Logical consistency
Logical consistency describes the fidelity of the relationships between the real world and the encoded geographic data.
In GIS, the topological model is an example of enforcing logical consistency:
>> consistency of the data model
>> consistency of the positional and attribute data
>> consistency between data files
5. Completeness of Geographic data
Are all possible objects included within the database?
A. Spatial completeness
B. Thematic completeness
6. Temporal accuracy
Temporal accuracy is a measure of data quality with respect to the representation of time in a geographic database.
A. World time
B. Database time
7. Semantic accuracy
>> how correctly spatial objects are labeled or named
>> correct encoding in accordance with a set of features
Datum
A geodetic datum (plural datums, not data) is a reference from
which measurements are made.
In surveying and geodesy, a datum is a set of reference points
on the Earth's surface against which position measurements are
made.
Horizontal datums are used for describing a point on the earth's surface, in latitude and longitude or another coordinate system.
Vertical datums are used to measure elevations or underwater depths.
A coordinate system defines the location of a point on a planar or spherical surface.
Types of coordinate system
A. Based on Nature
B. Based on Extent
A. Based on Nature
1. Plane coordinate system
2. Geographic coordinate system
B. Based on Extent
1. Global coordinate system
2. Local coordinate system
Some coordinate systems
1. Cartesian coordinate system
2. Universal Transverse Mercator (UTM)
3. WGS 84
The World Geodetic System 1984 (WGS84) is the datum used
by the Global Positioning System (GPS). The datum is defined
and maintained by the United States National Geospatial-
Intelligence Agency (NGA).
Coordinates computed from GPS receivers are likely to be
provided in terms of the WGS84 datum and the heights in
terms of the WGS84 ellipsoid.
4. Everest 1830
India and other countries made measurements within their own
territories and defined a reference surface to serve as the
datum for mapping.
In India the reference surface was defined by Sir George
Everest, who was Surveyor General of India from 1830 to
1843.
It has served as the reference for all mapping in India. The
Indian system can be called the Indian Geodetic System, as all
coordinates are referred to it. The reference surface is
called the Everest Spheroid.
Geoid
An imaginary surface that coincides with mean sea level in the ocean and its extension through the continents.
A hypothetical surface that corresponds to mean sea level and extends at the same level under the continents.
The geoid is used as a reference surface for astronomical measurements and for the accurate measurement of elevation on the Earth's surface.
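Because GPS heights refer to the WGS84 ellipsoid while elevations refer to the geoid, the two are linked by the standard relation H = h - N, where h is the ellipsoidal height, N is the geoid undulation (geoid height above the ellipsoid) and H is the height above the geoid. A minimal sketch; the function name and the numeric values are hypothetical and are not taken from the lecture.

```python
def orthometric_height(ellipsoidal_height_m, geoid_undulation_m):
    """Height above the geoid: H = h - N."""
    return ellipsoidal_height_m - geoid_undulation_m


h = -45.0  # hypothetical GPS height above the WGS84 ellipsoid (m)
N = -54.0  # hypothetical geoid undulation at the same point (m)
print(orthometric_height(h, N))  # 9.0
```

A negative N means the geoid lies below the ellipsoid at that point, so the elevation above mean sea level is larger than the GPS height.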
Ellipsoid
A geometric surface, symmetrical about the three coordinate axes, whose plane sections are ellipses or circles
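For the ellipsoid of revolution used in geodesy, two of the three axes are equal, so the surface is fixed by a semi-major axis a and a flattening f. The sketch below derives the semi-minor axis and the squared eccentricity from the published WGS84 defining parameters; the function name `ellipsoid_axes` is an assumption made for illustration.

```python
WGS84_A = 6378137.0          # semi-major (equatorial) axis in metres
WGS84_INV_F = 298.257223563  # inverse flattening, 1/f


def ellipsoid_axes(a, inverse_flattening):
    """Return (semi-minor axis b, squared first eccentricity e^2)."""
    f = 1.0 / inverse_flattening
    b = a * (1.0 - f)    # polar radius is shorter than the equatorial radius
    e2 = f * (2.0 - f)   # equivalent to (a^2 - b^2) / a^2
    return b, e2


b, e2 = ellipsoid_axes(WGS84_A, WGS84_INV_F)
print(f"b = {b:.3f} m, e^2 = {e2:.9f}")
```

The equatorial section of this surface is a circle of radius a, while any section through the poles is an ellipse with semi-axes a and b.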