Post on 15-Jan-2016

“Duplicate” Entries in Gazetteers

Jordan Hastings
Department of Geography
University of California, Santa Barbara

Gazetteer “Duplicates”: Names & Features (1)

Naming Features in the Environment
    Linguistic Necessity
    Identity and Ownership
    Navigation and Wayfinding

Features Cover a Large Territory
    Crisp or Diffuse
    Compact or Extended
    Tangible or Abstract

Gazetteer “Duplicates”: Names & Features (2)

Locations are Numerous & Various
    Multiscale
    Generalized
    Dis-coordinated
    Time-variant

Gazetteer “Duplicates”: Names & Features (3)

Names are Numerous & Various
    Polynymous
    Misspelled
    Multilingual
    Time-variant

Gazetteer “Duplicates”: Names & Features (4)

Lake Bigler, thru 1920s
Lake Bonpland (also Bondland), thru 1890s
Da-ow-a-ga, thru 1850s

[Map with place labels: Dollar Point, Kings Beach, South Lake Tahoe, Sunnyside-Tahoe City, Tahoe Vista, Carson, Incline Village-Crystal Bay, Indian Hills, Johnson Lane, Kingsbury, Minden, Stateline, Zephyr Cove-Round Hill Village]

Gazetteer “Duplicates”: Feature Types (1)

Dependable Type System
    Because Features are “Objects”
    Because Human Mind Categorizes

Types present in Taxonomy
    Hierarchy is Natural in Environment
    Because Human Mind Categorizes

Gazetteer “Duplicates”: Feature Types (2) – Examples

Cultural Environment
    Nations -> States -> Provinces -> Districts

Gazetteer “Duplicates”: Feature Types (2) – Examples

Physical Environment
    Watersources: Springs -> Seeps
    Watercourses: Rivers -> Streams -> Creeks
    Waterbodies: Lakes -> Ponds -> Sloughs
    ?Glaciers
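The example chains above can be read as a small type hierarchy. The following is a minimal sketch, not the presenter's implementation: it assumes that each arrow links a type to the type directly below it, and the names `PARENT`, `ancestors`, and `type_distance` are hypothetical. It counts the hierarchy steps between two feature types, which is one possible way to express how "close" two types are.

```python
# Hypothetical parent links taken from the example chains on the slides.
# Reading "A -> B" as "B sits directly below A" is an assumption.
PARENT = {
    "States": "Nations",
    "Provinces": "States",
    "Districts": "Provinces",
    "Seeps": "Springs",
    "Streams": "Rivers",
    "Creeks": "Streams",
    "Ponds": "Lakes",
    "Sloughs": "Ponds",
}


def ancestors(ftype):
    """Return ftype followed by its ancestors, nearest first."""
    chain = [ftype]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def type_distance(a, b):
    """Count hierarchy steps between two types; None if they share no ancestor."""
    up_a = ancestors(a)
    up_b = ancestors(b)
    for steps_a, ftype in enumerate(up_a):
        if ftype in up_b:
            return steps_a + up_b.index(ftype)
    return None
```

Under these assumptions, `type_distance("Creeks", "Rivers")` is 2 and `type_distance("Lakes", "Rivers")` is `None`, because the example chains never join lakes and rivers under a common parent.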

Gazetteer “Duplicates”: Fundaments (1)

Definition: Gazetteer
    A spatial dictionary of named & typed features in the environment

Implications
    Features uniquely identified
    Searchable by name and type
    Also searchable geospatially
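The three implications above can be sketched as a tiny in-memory structure. This is an illustration only, under stated assumptions: the class and method names (`Feature`, `Gazetteer`, `by_name`, `by_type`, `in_box`) are hypothetical, locations are reduced to (latitude, longitude) points, the coordinates are illustrative values that do not come from the slides, and treating the three historical names from the Names & Features (4) slide as names of one lake feature is also an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    feature_id: int                                 # unique identifier
    ftype: str                                      # feature type
    names: list = field(default_factory=list)       # any number of names
    locations: list = field(default_factory=list)   # (lat, lon) points


class Gazetteer:
    def __init__(self):
        self._features = {}

    def add(self, feature):
        # Enforce "features uniquely identified".
        if feature.feature_id in self._features:
            raise ValueError(f"duplicate feature id {feature.feature_id}")
        self._features[feature.feature_id] = feature

    def by_name(self, name):
        # Exact, case-insensitive match against any of a feature's names.
        wanted = name.casefold()
        return [f for f in self._features.values()
                if any(n.casefold() == wanted for n in f.names)]

    def by_type(self, ftype):
        return [f for f in self._features.values() if f.ftype == ftype]

    def in_box(self, south, west, north, east):
        # Features with at least one location inside the bounding box.
        return [f for f in self._features.values()
                if any(south <= lat <= north and west <= lon <= east
                       for lat, lon in f.locations)]


# Illustrative data; the coordinates are made up for the example.
demo = Gazetteer()
demo.add(Feature(1, "Lakes",
                 ["Lake Bigler", "Lake Bonpland", "Da-ow-a-ga"],
                 [(39.1, -120.0)]))
demo.add(Feature(2, "Populated Places", ["Kings Beach"], [(39.24, -120.03)]))
```

A linear scan is enough for a sketch; a real gazetteer would presumably index names and use a spatial index for the bounding-box query.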

Gazetteer “Duplicates”: Fundaments (2)

Duplicates: An approximate notion
    Firm types, ±close in hierarchy
    Locations ±close dependent on scale
    Names ±close dependent on language … or not at all

All aspects variant in time

Gazetteer “Duplicates”: Fundaments (3)

Database Implications / Support
    Custom Datatypes
        Hierarchy
        Geometry
    Multiple Attribution (unlimited)
        Names
        Locations
    Efficient Geospatial Processing

Gazetteer “Duplicates”: Approach (1)

Independent Measures of Duplicates
    1. Type Thesaurus Metrics
        Inter-feature: hierarchy, explicit linkages
    2. Geospatial Metrics
        Intra-feature: size, compactness, …
        Inter-feature: distance, overlap, …
    3. Geonomial Metrics
        Intra-feature: NL translation [not considered yet]
        Intra-feature: stemming, soundex, substitution
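Two of the measures listed above can be illustrated with standard techniques. The sketch below is an assumption about how such measures might look, not the method used in the talk: `haversine_km` stands in for the inter-feature distance between two point locations, `soundex` is the classic American Soundex code, and `name_ratio` uses Python's `difflib` similarity ratio as a stand-in for the "substitution" measure. All three function names are hypothetical, and stemming is not covered.

```python
import difflib
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(min(1.0, math.sqrt(a)))


# Soundex digit for each coded consonant; vowels, H, W and Y have no digit.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Four-character American Soundex code, or "" if name has no letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    out = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            out += code
        if ch not in "HW":       # H and W do not separate equal codes
            prev = code
    return (out + "000")[:4]


def name_ratio(a, b):
    """Case-insensitive similarity in [0, 1] from difflib.SequenceMatcher."""
    return difflib.SequenceMatcher(None, a.casefold(), b.casefold()).ratio()
```

The two spellings from the Names & Features (4) slide show why more than one name measure is useful: `soundex("Bonpland")` gives `B514` while `soundex("Bondland")` gives `B534`, so Soundex alone would not pair them, whereas `name_ratio("Bonpland", "Bondland")` is 0.875.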

Gazetteer “Duplicates”: Approach (2)

Unified Assessment of Duplicates
    Weighted Combination of Measures
        1 Type
        2 Location(s)
        3 Name(s)
    Geographic Visualization, over Maps
    Final Authority of Human Cataloger
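A weighted combination of the three measures could be as simple as a weighted mean. The sketch below assumes that each measure has already been scaled to a similarity in [0, 1]; the function name `duplicate_score`, the scaling, and the default weights are all hypothetical, since the slides do not give a formula.

```python
def duplicate_score(type_sim, location_sim, name_sim, weights=(1.0, 1.0, 1.0)):
    """Weighted mean of type, location and name similarities, each in [0, 1]."""
    scores = (type_sim, location_sim, name_sim)
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("each similarity must lie in [0, 1]")
    if len(weights) != 3 or any(w < 0 for w in weights):
        raise ValueError("need three non-negative weights")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return sum(w * s for w, s in zip(weights, scores)) / total
```

For example, `duplicate_score(1.0, 0.0, 0.5, weights=(2, 1, 1))` evaluates to 0.625. In keeping with the slide, such a score would only rank candidate pairs for display; the human cataloger keeps the final decision.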

Gazetteer “Duplicates”: Processing Cycle

[Figure: processing-cycle diagram, built up over five slides. Labels in the final build: random features, grouped features, accepted, suspended, trash, feature database, prep, weigh, review, rework, reject, post.]

[end]