CIVIUM: GIS For Everyone, The Information Commons, and The Universal Database CMU HCII 4/16/2003 Peter Lucas MAYA Design


Page 1:

CIVIUM: GIS For Everyone,

The Information Commons, and

The Universal Database

CMU HCII

4/16/2003

Peter Lucas

MAYA Design

Page 2:

or:

Gnutella meets

Encyclopedia Galactica

Page 3:

Convergence of three lines of research:
• The Universal Database
• The Information Commons
• Information-Centric GIS

Page 4:

Toward the Universal Database

Topic: Persistent State in a Distributed World

Premise: If the Net is becoming One Huge Computer, don’t we also need One Huge Database?

Requirement: Information Liquidity

Page 5:

Toward the Universal Database

Impediments to Data Liquidity:
• Conflicting Identity Spaces
• Conflicting Schemata

Page 6:

Toward the Universal Database

Two new ideas:
• U-forms
• Shepherds

One old idea: Layered Semantics

The VIA Repository

Page 7:

U-forms

• A generic “container” for mobile data
• The u-form is an abstract data type, not a representational format.
• A u-form is simply a bundle of name-value pairs associated with a universally-unique identifier (UUID).

<UUID>
attribute name 1    value 1
attribute name 2    value 2
…                   …
attribute name n    value n
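As a rough illustration of the definition above, a u-form can be modeled in a few lines of Python. This is only a sketch: the `UForm` class, its method names, and the attribute names `LATITUDE` and `WITHIN` are hypothetical and are not taken from the VIA bindings. The only things drawn from the slide are the UUID and the bundle of name-value pairs.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class UForm:
    """Hypothetical model of a u-form: a UUID plus name-value pairs."""
    # A fresh random UUID is assigned unless the caller supplies one.
    uid: uuid.UUID = field(default_factory=uuid.uuid4)
    attributes: Dict[str, Any] = field(default_factory=dict)

    def set(self, name: str, value: Any) -> None:
        self.attributes[name] = value

    def get(self, name: str, default: Any = None) -> Any:
        return self.attributes.get(name, default)


# No schema is declared anywhere: attributes are simply added as needed.
county = UForm(attributes={"NAME": "Allegheny County"})
pittsburgh = UForm()
pittsburgh.set("NAME", "Pittsburgh")
pittsburgh.set("LATITUDE", 40.44)       # approximate, for illustration
pittsburgh.set("WITHIN", county.uid)    # a relation is just another u-form's UUID
```

Because the model is an abstract data type rather than a file format, the same object could be written out as CSV, XML, or anything else without changing its identity.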

Page 8:

Shepherds

“Shepherd”

Page 9:

Shepherds

A Shepherd Space

Page 10:

VIA Repository Ultrapeer Server

A terabyte-scale distributed repository server (JOSHUA.MAYA.COM)

Data stored uniformly according to the VIA data model:
• Each item has a universally-unique identifier
• Non-relational storage model
• Extremely modular
• Not dependent on fixed schema
• Rich, mature type system with many language bindings
• Supports arbitrary collections and entity/relation style programming

Data may be accessed many ways:
• Web browser
• CSV
• XML
• Native language bindings (C/C++, Java, Python, Palm OS)
• Future: web services

Peer-to-peer “Shepherds” replication architecture permits real-time distributed replication of some or all of the database for high-availability access

Distributed indexing supports disconnected operations
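The slides do not describe the Shepherds protocol itself, so the following is only a toy sketch of the general idea of replicating "some or all" of a UUID-keyed store to a peer. The `Repository` class, `replicate_to`, and the `KIND` attribute are all hypothetical names invented for this illustration.

```python
import uuid
from typing import Callable, Dict, Optional

Attributes = Dict[str, object]


class Repository:
    """Hypothetical in-memory store of u-forms keyed by UUID."""

    def __init__(self) -> None:
        self.items: Dict[uuid.UUID, Attributes] = {}

    def put(self, attributes: Attributes,
            uid: Optional[uuid.UUID] = None) -> uuid.UUID:
        uid = uid or uuid.uuid4()
        self.items[uid] = dict(attributes)
        return uid

    def replicate_to(self, other: "Repository",
                     wanted: Callable[[Attributes], bool] = lambda a: True) -> int:
        """Copy every matching u-form to `other`, keeping its UUID."""
        copied = 0
        for uid, attributes in self.items.items():
            if wanted(attributes):
                other.items[uid] = dict(attributes)
                copied += 1
        return copied


server = Repository()
airport = server.put({"NAME": "Pittsburgh International Airport", "KIND": "airport"})
park = server.put({"NAME": "Schenley Park", "KIND": "park"})

# A disconnected device carries only the subset it needs.
laptop = Repository()
count = server.replicate_to(laptop, wanted=lambda a: a["KIND"] == "airport")
```

Since the copy keeps the original UUID, the replica on the laptop refers to the same item as the server's copy, which is what lets data move between venues without being renamed.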

Page 13:

The Information Commons

What if we wanted to do a large, public works project in Cyberspace?

Obvious answer: The Digital Commons (AKA virtual encyclopedia)

Page 14:

Civium Information Space

Goals:
• Open, completely distributed public information space
• Information liberated from the machines (don’t lose data when a web page goes away)
• Resolve tension between rigid editorial control (e.g., Yahoo!) and spontaneous, user-contributed chaos (e.g., Everything2)

Page 15:

Civium Information Space

3 Rivers Connect -- regional nonprofit whose mission is to prototype the “Information Commons” in SW Pennsylvania

CIVIUM will be a worldwide generalization of this concept

Intended to create an enduring public information resource

Probably for-profit and not-for-profit aspects

Page 16:

Bulk Imports of Open-source Data

Provides consistent points of reference for users (Places, Airlines, Corporations, Schools, Governments, etc, etc)

Uniform identifiers vastly simplify data fusion problems.

Real world data come with properties (populations, geolocations, census data…) and relationships (distances, transportation networks…) that impose structure and texture on the information space.

Each import enriches the semantic “web of facts”
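To see why uniform identifiers simplify data fusion, consider two imports that describe the same places. If both are keyed by the same UUID, fusing them is a dictionary merge with no name matching at all. The sketch below is illustrative only: the `fuse` helper and the attribute names are hypothetical, and every value is made up.

```python
import uuid
from collections import defaultdict
from typing import Dict

Attributes = Dict[str, object]

# Hypothetical identifiers shared by every import that mentions these places.
place_a = uuid.uuid4()
place_b = uuid.uuid4()

# Two independent imports; all values are invented for the example.
census_import = {place_a: {"POPULATION": 1000}, place_b: {"POPULATION": 2500}}
gazetteer_import = {place_a: {"LATITUDE": 40.0, "LONGITUDE": -80.0}}


def fuse(*sources: Dict[uuid.UUID, Attributes]) -> Dict[uuid.UUID, Attributes]:
    """Merge per-UUID attribute bundles; later sources win on name clashes."""
    fused: Dict[uuid.UUID, Attributes] = defaultdict(dict)
    for source in sources:
        for uid, attributes in source.items():
            fused[uid].update(attributes)
    return dict(fused)


web_of_facts = fuse(census_import, gazetteer_import)
```

Without a shared identifier, the same merge would need fuzzy matching on names and locations, which is where most of the effort in data fusion usually goes.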

Page 17:

Civium Information Space

Existing data

Places (5.5 million items)
• All worldwide geopolitical entities (every last village!)
  – Locations (lat/long)
  – Populations (cities > 100,000)
  – Administrative units (countries, provinces)
  – Alternate feature names
• Physical features
  – Schools, parks, mountains, churches, cemeteries, etc.
  – Marked with feature type, lat/long, often nearby city
  – Worldwide coverage uneven
  – Not exhaustive (e.g., not all schools)

Airports
• Essentially all commercial and military airports, many airstrips
• Runway length
• ICAO/FAA codes
• Locality; lat/long/elevation
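Once items carry a lat/long, relationships such as "nearby city" can be derived rather than stored by hand. The sketch below uses the standard haversine great-circle formula on a spherical Earth (mean radius 6371 km, an approximation). The record layout, the `nearest_city` helper, and the sample feature are hypothetical, and the coordinates are rounded.

```python
import math
from typing import Dict, List

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_city(feature: Dict, cities: List[Dict]) -> Dict:
    """Return the city record closest to the feature."""
    return min(cities, key=lambda c: haversine_km(
        feature["lat"], feature["lon"], c["lat"], c["lon"]))


cities = [
    {"name": "Pittsburgh", "lat": 40.44, "lon": -80.00},
    {"name": "Philadelphia", "lat": 39.95, "lon": -75.17},
]
school = {"name": "Example School", "lat": 40.50, "lon": -79.90}  # made-up feature
closest = nearest_city(school, cities)
```

A linear scan like this is fine for a handful of cities; at the scale of millions of places a spatial index would be needed, which is outside the scope of this sketch.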

Page 18:

Civium Information Space

Existing data (cont’d)

U.S. Military Bases

U.S. National Parks

Amtrak Stations

Sample of detailed regional data (SW Pennsylvania)

• Roads

• Hydrography

• Points of interest

• Zip codes

• Landmark buildings

• Cultural/recreational items

• much more

Page 19:

Civium Information Space

Will have soon:
• Complete US Census data
  – Block-level demographics
    » Incomes
    » Ethnicity
    » Population density
  – All roads/railroads, etc.
  – Landmark buildings
  – Street address ranges
  – Zip codes & Zip Code Tabulation Areas (ZCTAs)
  – Will form basis of ability to geocode by address and lat/long
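Street address ranges support geocoding by interpolation: find the street segment whose range contains the house number, then place the point proportionally along the segment. The sketch below is a minimal version under simplifying assumptions (straight segments, evenly spaced addresses, no odd/even side of street). The `Segment` type and the sample data are hypothetical, with coordinates chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

LatLon = Tuple[float, float]


@dataclass
class Segment:
    """Hypothetical street segment with the address range it covers."""
    street: str
    from_number: int   # house number at `start`
    to_number: int     # house number at `end`
    start: LatLon
    end: LatLon


def geocode(number: int, street: str, segments: List[Segment]) -> Optional[LatLon]:
    """Interpolate a house number along the first segment whose range covers it."""
    for seg in segments:
        low, high = sorted((seg.from_number, seg.to_number))
        if seg.street == street and low <= number <= high:
            span = seg.to_number - seg.from_number
            t = 0.0 if span == 0 else (number - seg.from_number) / span
            lat = seg.start[0] + t * (seg.end[0] - seg.start[0])
            lon = seg.start[1] + t * (seg.end[1] - seg.start[1])
            return (lat, lon)
    return None  # no segment covers this address


segments = [Segment("Example St", 100, 200, (40.0, -80.0), (40.5, -79.5))]
midpoint = geocode(150, "Example St", segments)   # halfway along the segment
missing = geocode(250, "Example St", segments)    # outside every range
```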

Page 20:

Civium Information Space

Will have soon (cont’d):
• All US schools/colleges and libraries
  – Public/private
  – Demographics
  – District data
  – Budget data
• Worldwide transportation network
  – Highways
  – Rail
  – Ship
• Worldwide population density data
  – Square kilometer resolution
  – Independent of political borders
• Corporations
• Ships at sea
• Real-time and historical weather reports

Page 21:

Schemata

Underlying storage mechanism completely schema-free

Schemata are “layered” using VIA “Roles” mechanism

A “Role” is a specification of the semantics of specified attribute names.

A u-form may play any number of roles (so long as they do not conflict with each other)

Roles may be added to a u-form at any time

Introspective: roles are represented as u-forms (and therefore have UUIDs)

U-forms may have overlapping role-sets, so ontology compatibility and evolution can be negotiated by partners incrementally
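The layering described above can be sketched as follows. The slides do not define what makes two roles "conflict", so this example assumes one plausible reading: two roles conflict when they give the same attribute name different types. The `Role` and `UForm` classes and the sample roles are hypothetical, not the VIA API.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Role:
    """Hypothetical role: attribute names with the type each must hold."""
    name: str
    attribute_types: Dict[str, type]
    # Roles are themselves identified by UUID, as u-forms are.
    uid: uuid.UUID = field(default_factory=uuid.uuid4)


def conflicts(a: Role, b: Role) -> Set[str]:
    """Names both roles define with different types (assumed meaning of conflict)."""
    shared = a.attribute_types.keys() & b.attribute_types.keys()
    return {n for n in shared if a.attribute_types[n] is not b.attribute_types[n]}


@dataclass
class UForm:
    attributes: Dict[str, object] = field(default_factory=dict)
    roles: Dict[uuid.UUID, Role] = field(default_factory=dict)
    uid: uuid.UUID = field(default_factory=uuid.uuid4)

    def add_role(self, role: Role) -> None:
        """Attach a role at any time, unless it conflicts with one already played."""
        for existing in self.roles.values():
            clash = conflicts(existing, role)
            if clash:
                raise ValueError(
                    f"{role.name} conflicts with {existing.name} on {sorted(clash)}")
        self.roles[role.uid] = role


place = Role("place", {"NAME": str, "LATITUDE": float, "LONGITUDE": float})
school = Role("school", {"NAME": str, "ENROLLMENT": int})
numbered = Role("numbered", {"NAME": int})  # disagrees with both on NAME

item = UForm({"NAME": "Example School"})
item.add_role(place)
item.add_role(school)  # overlapping but compatible role-sets
```

Calling `item.add_role(numbered)` would raise `ValueError`, while the underlying storage stays schema-free: the roles only describe how attribute names should be interpreted.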

Page 22:

Schemata

Examples of Roles: Entity

Attributes    TYPE     Value
LABEL         string   A one-word description of the contents of this u-form
NAME          string   A multi-word description of the contents of this u-form
DESCRIPTION   string   A short text description of the contents of this u-form

Page 23:

Schemata

Examples of Roles: Person (simplified)

inherits from: ENTITY

Attributes    TYPE     Value
NAME          string   The person’s full name
FAMILYNAME    string   The person’s family name
GIVENNAME     string   The person’s given name
BIRTHDATE     date     The person’s birthdate
ADDRESS       UUID     Relation to a u-form of role “address”
TELEPHONE     UUID     Relation to a u-form of role “telephone_number”
TITLE         string   The person’s job title
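The Entity and Person tables can be turned into a small checkable sketch. The attribute names come from the slides; everything else is an assumption made for illustration, including the `Role` class, the mapping of slide types to Python types (`string` to `str`, `date` to `datetime.date`, `UUID` to `uuid.UUID`), and the rule that a child role's definition overrides its parent's.

```python
import datetime
import uuid
from typing import Dict, List, Optional


class Role:
    """Hypothetical role with single inheritance of attribute definitions."""

    def __init__(self, name: str, attributes: Dict[str, type],
                 parent: Optional["Role"] = None) -> None:
        self.name = name
        self.own = attributes
        self.parent = parent

    def all_attributes(self) -> Dict[str, type]:
        """Inherited attributes first, then this role's own (which override)."""
        inherited = self.parent.all_attributes() if self.parent else {}
        return {**inherited, **self.own}

    def problems(self, values: Dict[str, object]) -> List[str]:
        """Names of supplied attributes whose value has the wrong type."""
        spec = self.all_attributes()
        return [n for n, v in values.items()
                if n in spec and not isinstance(v, spec[n])]


ENTITY = Role("entity", {"LABEL": str, "NAME": str, "DESCRIPTION": str})
PERSON = Role("person", {
    "NAME": str, "FAMILYNAME": str, "GIVENNAME": str,
    "BIRTHDATE": datetime.date,
    "ADDRESS": uuid.UUID,      # relation to a u-form of role "address"
    "TELEPHONE": uuid.UUID,    # relation to a u-form of role "telephone_number"
    "TITLE": str,
}, parent=ENTITY)

# A made-up person; ADDRESS points at another u-form by UUID.
good = {"NAME": "Ada Example", "BIRTHDATE": datetime.date(1970, 1, 1),
        "ADDRESS": uuid.uuid4(), "LABEL": "Ada"}
bad = {"NAME": "Ada Example", "BIRTHDATE": "1970-01-01"}
```

Here `PERSON.problems(good)` returns an empty list and `PERSON.problems(bad)` returns `["BIRTHDATE"]`, because the date was supplied as a string. Attributes outside the role are ignored rather than rejected, in keeping with the schema-free storage underneath.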

Page 24:

Schemata

Relations to Formal Ontologies

Role mechanism is not a substitute for formal ontology efforts (e.g., Cycorp)

MAYA limits its data characterization efforts to the most basic roles; the motivation is primarily to provide a controlled space of attribute names, not to organize knowledge

Roles provide a syntax for capturing external ontologies

Conflicting schemata can be incrementally reconciled by creating redundant attributes and mapping rules between their values
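One way to read "redundant attributes and mapping rules" is shown below: when two partners record the same fact under different attribute names or units, each record keeps its original attribute and gains the other partner's attribute, computed by a rule. This is a sketch of that reading only. The `reconcile` helper and the attribute names `ELEVATION_FT` and `ELEVATION_M` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# (source attribute, target attribute, conversion from source to target)
MappingRule = Tuple[str, str, Callable[[float], float]]

FEET_TO_METRES = 0.3048  # exact by definition of the international foot

rules: List[MappingRule] = [
    ("ELEVATION_FT", "ELEVATION_M", lambda feet: feet * FEET_TO_METRES),
    ("ELEVATION_M", "ELEVATION_FT", lambda metres: metres / FEET_TO_METRES),
]


def reconcile(attributes: Dict[str, float],
              rules: List[MappingRule]) -> Dict[str, float]:
    """Return a copy with a redundant target attribute for each applicable rule."""
    result = dict(attributes)
    for source, target, convert in rules:
        # Never overwrite a value a partner supplied directly.
        if source in result and target not in result:
            result[target] = convert(result[source])
    return result


partner_a = reconcile({"ELEVATION_FT": 1000.0}, rules)
partner_b = reconcile({"ELEVATION_M": 304.8}, rules)
```

After reconciliation both records carry both attributes, so either partner can keep querying with its own vocabulary, and rules can be added one pair at a time as partners agree on them.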

Page 25:

GeoBrowser Demo

Page 26:

GeoBrowser Demo

Page 27:

Timeline Visualization

Page 28:

Encouraging Input

Must define a clear value proposition. Candidate:
• Give us your bits
• We will aggregate, organize, and visualize
• You get back topsight

Other vital considerations:
• Must establish reputation as a venue that “matters”
• Low transaction costs essential: people will contribute if it is easy

Page 29:

Encouraging Input

Potential model for user contributions:

“Every item a discussion forum”

Each data display (metaphorically) has a “front side” and a “back side”

Front side contains the data and/or visualization

Back side contains user comments, reviews, ratings, etc. (similar to Amazon user reviews)
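The front side / back side metaphor maps naturally onto a small data structure. The sketch below is one possible shape, not a description of the Civium design: the `Item` and `Comment` classes and the 1 to 5 rating scale are assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Comment:
    author: str
    text: str
    rating: int  # hypothetical 1-5 scale


@dataclass
class Item:
    """Hypothetical data display with a front (data) and back (discussion)."""
    front: Dict[str, object]
    back: List[Comment] = field(default_factory=list)
    uid: uuid.UUID = field(default_factory=uuid.uuid4)

    def add_comment(self, comment: Comment) -> None:
        self.back.append(comment)

    def average_rating(self) -> Optional[float]:
        """Mean rating of the back-side comments, or None if there are none."""
        if not self.back:
            return None
        return sum(c.rating for c in self.back) / len(self.back)


park = Item(front={"NAME": "Example Park", "KIND": "park"})
park.add_comment(Comment("visitor1", "Trail map is out of date.", 4))
park.add_comment(Comment("visitor2", "Coordinates look right.", 5))
```

Keeping the discussion attached to the item, rather than in a separate forum, is what keeps the cost of contributing low: the place to comment is wherever the data is being viewed.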

Page 30:

Zoom In to South-central Asia

Page 31:

View w/o Geo-political Borders

Page 32:

Drag-select Region

Page 33:

Pop-up Display: Refugee Data


Page 34:

Desktop “Front Side”


Page 35:

Desktop “Back Side”

Page 36:

Palmtop “Front Side”


Page 37:

Palmtop “Back Side”


Page 38:

Summary

Civium will comprise an open, public information space not controlled by anyone

Peer-to-peer architecture and replication decouple the data from any particular set of storage venues

Centrally-maintained “armature” of bulk-imported public data serves as trellis upon which user-contributed data will accrue

Ultra-peer network of terabyte-scale machines provides framework for access.