UCL DEPARTMENT OF GEOGRAPHY
CensusGIV: Geographic Information Visualisation of Census Data
Pablo Mateos
Oliver O’Brien
Department of Geography
University College London
CASA Seminar, 9 December 2009
www.censusprofiler.org
Contents
• Context & Justification
• CensusGIV Aims & Objectives
• Design Considerations
• System Architecture
• Demo
Context & Justification
The Google Generation
• Those born after 1993 have only known life with the Internet
  – A generation whose first port of call for knowledge is the internet, through Google’s search engine, as opposed to books, libraries or traditional (off-line) information sources (CIBER, 2008)
Moving Beyond “Traditional Web-GIS”
Geographic Visualisation at UCL
• www.londonprofiler.org
• www.maptube.org
• www.publicprofiler.org/WorldNames
• www.nationaltrustnames.org
• atlas.publicprofiler.org
• Coming soon: www.censusprofiler.org
London Profiler – KML Search & Feeds
Geoweb 2.0 in Teaching
UCL Geography undergraduate field course in London
Geovisualisation (GVis)
• Refers to the visual representation of spatial data.
• GVis as a research tool is used for:
  – Hypothesis generation, knowledge discovery, analysis, presentation and evaluation (Buckley, 2000)
• Increasing realisation of the potential for ‘geography’ to provide the primary basis for innovative visualisation and knowledge exploration (Dodge, McDerby and Turner, 2006)
• Recognised potential of GVis
  – To make sense of increasingly large datasets
  – To produce alternative representations of space
Current Census Thematic Maps by ONS
• Neighbourhood Statistics (NeSS)
  – 11 steps to view a census thematic map!
• Mapping in CASWEB: not present
NeSS maps via SVG/ Flex applications
The need for Census mapping is clear!
gCensus: A First Approach
• Query-based KML maps of 2000 US Census variables
• http://gecensus.stanford.edu
CensusGIV Aims & Objectives
CensusGIV: Objectives
1. Develop a prototype to provide innovative geographical visualisation of the Census small area statistics datasets.
2. Provide an extensive technical evaluation of the different technological alternatives.
3. Propose scaling up to a full service in 2011.
4. Promote the use of innovative geographic visualisation of population datasets using mapping mashups.
CensusGIV: Plan
• ESRC Census Development grant: £80,000
• Timeframe: 15 months (2009/10)
• Develop a geovisualisation prototype of the UK 2011 Census using “Geoweb 2.0” technologies
• Mapping mashups based on data feeds from an ONS “Census hypercube” or NeSS data stream
People
• UCL Geography
  – Pablo Mateos (P.I.)
  – Paul Longley (co-P.I.)
  – Oliver O’Brien
• UCL CASA
  – Mike Batty (co-P.I.)
  – Richard Milton (consultant)
• User Panel
  – Jointly with EDINA DIaD project
CensusGIV: Requirements and Issues
• User not faced with queries or complex questions!
  – Start with a map (e.g. population density)
  – Automatic scale-determined geographical units
  – Base map backdrop
• Available to the general public & “mashable”
• Issues:
  – Intellectual Property Rights
    • Geographic boundaries & Census datasets
  – Data size: over 3,000 Census variables × 300k geographic units
  – Managing a large number of concurrent users
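To put the data-size issue in numbers, here is a back-of-envelope sketch. The variable and unit counts are the rounded figures from the slide; the 4 bytes per value is purely an assumption for illustration, and real storage (indexes, metadata, boundaries) would add to it.

```python
# Rough size of the full data matrix quoted on the slide.
CENSUS_VARIABLES = 3_000    # "over 3,000" on the slide, so a lower bound
GEOGRAPHIC_UNITS = 300_000  # "300k" on the slide
BYTES_PER_VALUE = 4         # assumption: one 32-bit number per cell


def matrix_cells(variables: int, units: int) -> int:
    """Number of (variable, area) values that would need storing."""
    return variables * units


def raw_size_gb(cells: int, bytes_per_value: int = BYTES_PER_VALUE) -> float:
    """Uncompressed size in gigabytes (10**9 bytes), ignoring indexes."""
    return cells * bytes_per_value / 1e9


cells = matrix_cells(CENSUS_VARIABLES, GEOGRAPHIC_UNITS)
print(f"{cells:,} values, roughly {raw_size_gb(cells):.1f} GB uncompressed")
```

Under these assumptions the matrix holds about 900 million values, which is why serving it interactively is listed as an issue.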
Evaluation Criteria for Final Solution
• Scalability
• Response time
• Maximum number of concurrent users
• Data storage and retrieval
• Flexibility of geovisualisation options
• Ease of use and simplicity
• Intellectual Property Rights (IPR) issues
• Cost of development and implementation
Geovisualisation Prototypes
• Different technologies have been explored:
  – WMS/WFS
  – Adobe Flash (Flex) vector maps
  – SVG vector maps
  – KML vector maps with Google Maps API
  – Raster maps with OpenLayers
CensusGIV: Timeline
• October 2008 – February 2009
  – Evaluation phase (completed)
• October 2009 – June 2010
  – Developing prototype
• Trade-offs to be made between:
  – response time, storage space, concurrent users, IPR protection, ease of navigability, flexible visualisation, back-end/front-end solutions, cost
• First version of prototype to be tested this month
• ONS / Census Programme to decide full implementation for 2011 Census
Design Considerations
Fundamental Design Decisions
• Server-based rasters
  – Faster on the client side
  – Fast enough on the server side
  – Not delivering restricted data to the client
• Open Source software
  – Leverage the powerful OpenLayers mapping API
  – More powerful than Google Maps API
  – An active development community
  – Full access to the source – can do “cool stuff”
• “Slippy” map
  – Intuitive
  – Encourages exploration
Maps of Population Data
• Cartograms
  – Fairer representation
  – Multiple variables can be shown together
• Choropleth Maps
  – Easier to relate to
• Surface mapping
  – Interpolation
Accessing the Census Data
• Neighbourhood Statistics
  – Hunter vs Gatherer
  – NeSS Data API (SOAP)
  – CSV Downloads
• Still tedious – for each UV:
  – Download files for each GOR
  – Stitch them together (has been automated)
  – Create corresponding tables in the database, add data
  – Add ranking scores
  – Add metadata
  – NeSS Data API (REST) coming February 2010
• CASWEB
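The “stitch them together” step could look something like the sketch below. It is not the project's script: the function name, the column names and the sample rows are all made up for illustration, and it assumes every regional CSV shares the same header row.

```python
import csv
import io
from typing import Iterable, List, Optional, TextIO


def stitch_region_csvs(region_files: Iterable[TextIO], out: TextIO) -> int:
    """Concatenate per-region CSVs into one file with a single header row.

    Returns the number of data rows written. Raises ValueError if a file's
    header differs from the header of the first non-empty file.
    """
    writer = csv.writer(out, lineterminator="\n")
    header: Optional[List[str]] = None
    rows_written = 0
    for handle in region_files:
        reader = csv.reader(handle)
        file_header = next(reader, None)
        if file_header is None:
            continue  # empty file, nothing to add
        if header is None:
            header = file_header
            writer.writerow(header)
        elif file_header != header:
            raise ValueError("CSV headers differ between regions")
        for row in reader:
            writer.writerow(row)
            rows_written += 1
    return rows_written


# Two in-memory "regional" files; area codes and values are invented.
north = io.StringIO("area,value\nA1,10\nA2,20\n")
south = io.StringIO("area,value\nB1,30\n")
combined = io.StringIO()
rows = stitch_region_csvs([north, south], combined)
print(rows, "rows stitched")
```

Checking the header on every file is a cheap guard against a download for one region arriving in a different layout.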
Structure of The Web-App
• OpenLayers “slippy” map
  – Fully opaque grey base layer
    • Could be switched for aerial imagery from Google/Microsoft
  – Opaque choropleth overlay
    • Variable translucency if aerial imagery underneath
  – Context overlay
    • Points, lines and names
    • Sea area in lighter grey
    • Otherwise transparent
  – POIs
    • e.g. schools, hospitals
    • SVG vectors rather than tiles
    • “Clickable”
Screenshot
Why a Custom Context Layer?
• Having full control is a definite advantage
  – Underlay
    • Google colours/features can clash with choropleths
    • Lose the context if choropleth is fully opaque
  – Overlay
    • Google labels can obscure information
• Google’s cartography recently changed (for the better)
• But no control over future changes
Cartography of the Context Layer
• Difficult to get right
  – Urban vs Rural
• Strictly Black & White
• Few point features
  – Hospitals, airports, place names
• Fewer areal features
  – Lakes, sea
• Mainly a network of roads/rivers/railways
• Less is more
Creating the Context Layer
• PostGIS database
  – Using the OpenStreetMap dataset for the UK
  – Relatively slow to create the images from the data
    • ~50 database queries for each image tile
    • Higher zoom levels have tiles with smaller extent, but we include more detail at these levels, which cancels out the speed increase
  – Render on demand “unimportant” tiles at zoom levels 16-18
  – Pre-render everything else
• Painter’s Algorithm
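The remark about tile extent shrinking with zoom can be made concrete with the usual “slippy map” tile arithmetic. The sketch below assumes the standard XYZ scheme on Spherical Mercator and uses the UK bounding box quoted for pre-rendering in this deck; the function names are illustrative only. Exact counts depend on how the box edges are rounded, so they need not match any figures reported elsewhere in the deck.

```python
import math
from typing import Tuple

# Bounding box quoted in the deck for the UK (degrees; west is negative).
UK_WEST, UK_SOUTH, UK_EAST, UK_NORTH = -10.7, 49.8, 1.8, 60.9


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Column and row of the tile containing a point (row 0 is northernmost)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so points on the far edges still map to a valid tile.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_in_bbox(west: float, south: float, east: float, north: float,
                  zoom: int) -> int:
    """Number of tiles needed to cover a bounding box at one zoom level."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)
    x_max, y_max = lonlat_to_tile(east, south, zoom)
    return (x_max - x_min + 1) * (y_max - y_min + 1)


def tile_width_degrees(zoom: int) -> float:
    """Longitude span of one tile; it halves with every zoom level."""
    return 360.0 / 2 ** zoom


for zoom in (10, 11, 12):
    count = tiles_in_bbox(UK_WEST, UK_SOUTH, UK_EAST, UK_NORTH, zoom)
    print(zoom, count, round(tile_width_degrees(zoom), 4))
```

Each extra zoom level roughly quadruples the tile count, so the saving from smaller tiles is quickly used up if more detail is drawn per tile.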
Painter’s Algorithm
• Two hierarchies of layering
  – Feature-based layering
    • Land, water, road/railway casings & cores, place names
  – Intra-feature level z-ordering
    • Complex road junctions
    • Railway/road crossing
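A minimal sketch of the two-level ordering, assuming the feature classes are painted in the order listed on the slide and that each feature carries an integer z-order within its class. The class names, the `Feature` type and the sample features are hypothetical, and real junction handling is more involved than a single sort.

```python
from dataclasses import dataclass
from typing import List

# Back-to-front order of feature classes, following the slide's list.
LAYER_ORDER = ["land", "water", "casing", "core", "place_name"]


@dataclass
class Feature:
    name: str
    layer: str        # one of LAYER_ORDER
    z_order: int = 0  # order within the layer, e.g. a bridge above a road


def paint_order(features: List[Feature]) -> List[Feature]:
    """Sort features back to front: by feature class first, then by z-order."""
    return sorted(features,
                  key=lambda f: (LAYER_ORDER.index(f.layer), f.z_order))


features = [
    Feature("town label", "place_name"),
    Feature("railway bridge core", "core", z_order=1),
    Feature("road core", "core", z_order=0),
    Feature("lake", "water"),
    Feature("road casing", "casing"),
    Feature("coastline fill", "land"),
]
for feature in paint_order(features):
    print(feature.layer, feature.name)
```

Whatever is drawn later covers what was drawn earlier, so labels end up on top and the railway bridge is painted over the road it crosses.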
Pre-rendering of Context Layer
• Rendered on “gibin”, a quad-core computer running Linux
• Utilising the Python “threading” module – 4 tiles created at once
• The image “tiles” are PNGs with an alpha layer
• Bounding box: 10.7° W to 1.8° E, 49.8° N to 60.9° N (all of the UK)

Zoom Level | Scale | No of Tiles | Size | Detail | Time
6-9 | < 1:1M | 790 | 5 MB | Cities, motorways | < 1 min
10 | 1:600,000 | 2,146 | 15 MB | + towns, trunk roads, lakes | 1 min
11 | 1:300,000 | 8,208 | 40 MB | + main roads, rivers, airfields | 2 min
12 | 1:150,000 | 32,318 | 156 MB | + minor roads, railways, villages | 7 min
13 | 1:72,000 | 128,250 | 500 MB | + main road, water & area names | 24 min
14 | 1:36,000 | 510,962 | 1.4 GB | + paths | 1 h 28 min
15 | 1:18,000 | 2,041,572 | 4.4 GB | + minor road names | 5 h 34 min
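Creating four tiles at once with the `threading` module might be arranged as below. This is a sketch rather than the project's code: `render_all` and the tile tuples are hypothetical, and the rendering function is passed in so the example stays self-contained. Threads help most when rendering waits on the database or on native imaging code.

```python
import queue
import threading
from typing import Callable, Iterable, Tuple

Tile = Tuple[int, int, int]  # (zoom, x, y)


def render_all(tiles: Iterable[Tile],
               render_tile: Callable[[Tile], None],
               workers: int = 4) -> int:
    """Render every tile using a fixed number of worker threads.

    Returns the number of tiles rendered.
    """
    work: "queue.Queue[Tile]" = queue.Queue()
    for tile in tiles:
        work.put(tile)

    done = 0
    lock = threading.Lock()

    def worker() -> None:
        nonlocal done
        while True:
            try:
                tile = work.get_nowait()
            except queue.Empty:
                return  # queue drained, this worker is finished
            render_tile(tile)
            with lock:
                done += 1

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return done


# Stand-in renderer that just records which tiles it was asked for.
rendered = []
count = render_all([(10, x, 0) for x in range(8)], rendered.append)
print(count, "tiles rendered")
```

Filling the queue before starting the workers keeps the shutdown logic simple: an empty queue means there is nothing left to do.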
The Context Layer (Levels 6-11)
The Context Layer (Levels 12-17)
[Figure: context layer at zoom levels 12-17; two panels are labelled “On Demand”]
Creating the Choropleth Layers
• PostGIS database of census data
• Would never want to pre-render all the choropleths at all zoom levels
  – 1000+ metrics × 10 groupings × 30 colour schemes × 2 colour orders × 13 zoom levels × 1,000s of tiles per zoom
  – Makes sense to cache most popular zoom levels, metrics, colours
  – Most people will never “explore” the map at a greater zoom level – usage decreases exponentially with the number of clicks in a web app.
• Specially crafted URL
  – Boundary Table, Data Table, Metric
  – Bounding Box, Zoom
  – Colour Scheme, No of Groups
  – Range Type, Range Attributes
    • Min/Max
    • Average/Deviation
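The sketch below works through the arithmetic behind “would never want to pre-render” and shows one way such a URL could be assembled. The multipliers are the rounded figures from the slide. The host name, parameter names and sample values are invented for illustration; the deck does not give the real URL format.

```python
from math import prod
from urllib.parse import urlencode


def style_combinations(metrics: int = 1000, groupings: int = 10,
                       colour_schemes: int = 30, colour_orders: int = 2,
                       zoom_levels: int = 13) -> int:
    """Distinct choropleth renderings, before multiplying by tiles per zoom."""
    return prod([metrics, groupings, colour_schemes, colour_orders, zoom_levels])


def build_tile_url(base: str, **params: object) -> str:
    """Encode the tile request as a query string (parameter names are made up)."""
    return base + "?" + urlencode(sorted(params.items()))


print(f"{style_combinations():,} combinations before counting tiles")

url = build_tile_url(
    "http://tiler.example.org/choropleth",  # hypothetical host
    boundary="oa_boundaries", data="uv_table", metric="density",
    bbox="-0.2,51.4,0.1,51.6", zoom=12,
    scheme="Blues", groups=5,
    range_type="minmax", range_min=0, range_max=100,
)
print(url)
```

Sorting the parameters gives every distinct request exactly one URL, which makes the URL usable as a cache key.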
The Modifiable Areal Unit Problem
Boundary Type | Zoom Level | Average No of Vertices (Simplified) | Number
MSOA | 6, 7, 8 | 145 | 7,196 (Eng & Wal)
LSOA | 9, 10, 11 | 62 | 40,884 (not N.I.)
OA | 12 - 18 | 26 | 223,131
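The table amounts to a lookup from zoom level to boundary type, which is how “automatic scale-determined geographical units” could be implemented. A minimal sketch; the function name is hypothetical and the zoom ranges are taken directly from the table.

```python
def boundary_for_zoom(zoom: int) -> str:
    """Boundary type drawn at a given zoom level, following the table."""
    if 6 <= zoom <= 8:
        return "MSOA"
    if 9 <= zoom <= 11:
        return "LSOA"
    if 12 <= zoom <= 18:
        return "OA"
    raise ValueError(f"zoom level {zoom} is outside the supported range 6-18")


for zoom in (6, 9, 12, 18):
    print(zoom, boundary_for_zoom(zoom))
```

Switching to smaller units only as the user zooms in keeps the number of polygons per tile manageable.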
The Modifiable Areal Unit Problem
MSOA
The Modifiable Areal Unit Problem
LSOA
The Modifiable Areal Unit Problem
OA
The Modifiable Areal Unit Problem
Colour Theory
• “practical guidance to colour mixing and the visual impacts of specific colour combinations”
• Formal considerations
  – Colour harmony (complementary colours – pink vs blue)
  – Colour context (bright colours beside subdued colours)
  – Colour blindness
• Very subjective
Colour Considerations
• Colour should relate to data type:
  – Sequential
  – Diverging
  – Qualitative
• The “most of the UK is countryside” problem
  – Try not to use bright colours for the countryside.
• Colour connotations: Hot, Bad, High, Cold, Natural, Good, Neutral, Girls, Boys
Colour Harmony
• Colour Harmony
  – Complementary Colours
  – Analogous Colours
• Colour Variation
  – Hue
  – Saturation
  – Lightness
ColorBrewer
• Cynthia Brewer’s colorbrewer2.com
  – Provides a set of “good” colour schemes which can be incorporated easily into Python scripts, ArcMap, etc.
  – Generally vary by hue and/or lightness
• Sequential
  – Lightness should be varied; use analogous colours if varying hue
  – Plenty of “good” maps that don’t follow this rule
• Diverging
  – Mid-point should be a light colour
  – Extremes should have darker colours with complementary hues
• Qualitative
  – Hues should vary
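The sequential and diverging rules above can be illustrated with Python's standard `colorsys` module. These ramps are generated on the spot and are not ColorBrewer's published schemes; the hue, saturation and lightness defaults are arbitrary choices for the sketch.

```python
import colorsys
from typing import List


def _hex(hue: float, lightness: float, saturation: float) -> str:
    """Convert an HLS colour (all components in 0-1) to a #rrggbb string."""
    r, g, b = colorsys.hls_to_rgb(hue % 1.0, lightness, saturation)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255),
                                        round(b * 255))


def sequential(n: int, hue: float = 0.6, saturation: float = 0.6,
               lightest: float = 0.9, darkest: float = 0.3) -> List[str]:
    """Single-hue ramp in which only lightness varies, light to dark."""
    if n < 2:
        raise ValueError("need at least two classes")
    step = (lightest - darkest) / (n - 1)
    return [_hex(hue, lightest - i * step, saturation) for i in range(n)]


def diverging(n: int, hue: float = 0.0, saturation: float = 0.6,
              lightest: float = 0.95, darkest: float = 0.35) -> List[str]:
    """Dark extremes in complementary hues around a light, neutral mid-point."""
    if n < 3 or n % 2 == 0:
        raise ValueError("need an odd number of classes, at least three")
    half = n // 2
    step = (lightest - darkest) / half
    low = [_hex(hue, darkest + i * step, saturation) for i in range(half)]
    mid = [_hex(hue, lightest, 0.0)]  # zero saturation gives a light grey
    high = [_hex(hue + 0.5, lightest - (i + 1) * step, saturation)
            for i in range(half)]
    return low + mid + high


print(sequential(5))
print(diverging(5))
```

Adding 0.5 to the hue picks the colour on the opposite side of the colour wheel, which is what makes the two extremes complementary.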
Aerial Imagery
• Very easy!
• OpenLayers
  – Google Maps imagery layer
  – Microsoft Virtual Earth layer
• Only useful when zoomed in
• Need to be mindful that colour imagery interferes with choropleth colours
• No longer self-contained
layerAerial = new OpenLayers.Layer.Google("Aerial Imagery", {numZoomLevels: 16, type: G_SATELLITE_MAP, sphericalMercator: true});
Points of Interest (POIs)
• PostgreSQL (or MySQL) database
• Can be a completely separate server
• Client’s OpenLayers does the work
• Aim is to provide even more context
  – School names & performance indicators
User Interface
• How do you get people to explore the maps?
  – Maptube “visual directory”
  – Hierarchical drop-down lists
  – Tag cloud of keywords, maybe with a hierarchy
• Less is more vs Choice is good
User Interface – Tag Cloud
• Useful for exploring if you don’t know what you want
• More structured alternative needed for specific research
System Architecture
A Note on Python
• If you don’t use it already, you will!
  – ArcGIS 9.4
    • “Python is now integrated directly into ArcMap [9.4]. I say it every year, but if you are an ArcGIS Desktop user, you need to take a close look at python as your scripting language.” - James Fee
• The best thing about Python is:
  – Tidy scripts!
Server Room
[Diagram: servers tiler1, tiles1, tiler2, tiles2, www, tiler3, tiles3, blog, pois, tba, dev; web browsers]
System Architecture – Website
[Diagram: Web browser and Apache (www)]
System Architecture – Context
[Diagram: Web browser, Apache (www) and Apache (tiles2). Tile exists? Yes: tile. No: 404.]
• No Python involved
  – Less strain on the server
• Web browser may have to request image twice
  – Slow for the client
System Architecture – Context
[Diagram: Web browser, Apache (www) and Apache (tiler2) with mod_python; Python scripts renderer.py and gen_tile.py; Cache; XML]
System Architecture – Choropleth
[Diagram: Web browser, Apache (www) and Apache (tiler3) with mod_python; Python scripts renderer.py and gen_tile.py; ColorBrewer; Cache (low zoom). Tile exists? Yes: tile. No: generate.]
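The “tile exists?” decision in the diagram can be sketched as a cache-or-render function. This is not the project's renderer.py or gen_tile.py: the function name, the directory layout and the `cache_max_zoom` threshold are assumptions made for illustration, with the threshold standing in for the “low zoom” cache.

```python
import os
import tempfile
from typing import Callable


def serve_tile(cache_dir: str, zoom: int, x: int, y: int,
               render: Callable[[], bytes],
               cache_max_zoom: int = 11) -> bytes:
    """Return a cached tile if one exists, otherwise render it.

    Only tiles at or below cache_max_zoom (a hypothetical threshold) are
    written back to the cache.
    """
    path = os.path.join(cache_dir, str(zoom), str(x), f"{y}.png")
    if os.path.exists(path):
        with open(path, "rb") as handle:
            return handle.read()
    data = render()
    if zoom <= cache_max_zoom:
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as handle:
            handle.write(data)
    return data


# Usage with a stand-in renderer that counts how often it is called.
calls = []


def fake_render() -> bytes:
    calls.append(1)
    return b"png-bytes"


with tempfile.TemporaryDirectory() as cache:
    serve_tile(cache, 8, 1, 2, fake_render)   # rendered and cached
    serve_tile(cache, 8, 1, 2, fake_render)   # served from the cache
    serve_tile(cache, 15, 1, 2, fake_render)  # rendered, not cached
print(len(calls), "render calls")
```

Caching only the low zoom levels keeps disk use bounded while still covering the views most visitors will request.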
Scalability
• OpenLayers allows multiple servers to be specified for retrieving image tiles
• Different servers for different tasks
• Random server chosen per-tile
• So should scale?
• Process is still processor intensive if generating the tiles at the same time
• Stress testing needed
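Per-tile server selection can be illustrated in a few lines. This is a Python illustration of the idea, not OpenLayers code, and it deliberately swaps the random choice for a deterministic hash so that a given tile always goes to the same server and can reuse that server's cache. The host names are hypothetical.

```python
import zlib
from typing import Sequence


def pick_server(servers: Sequence[str], zoom: int, x: int, y: int) -> str:
    """Choose a tile server from the tile's coordinates.

    The same tile always maps to the same server; different tiles are
    spread across all servers.
    """
    key = f"{zoom}/{x}/{y}".encode("ascii")
    return servers[zlib.crc32(key) % len(servers)]


servers = ["tiler1.example.org", "tiler2.example.org", "tiler3.example.org"]
for x in range(4):
    print((12, x, 7), pick_server(servers, 12, x, 7))
```

Spreading requests this way helps with throughput, but it does not reduce the cost of generating any single tile, which is why stress testing is still needed.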
Prototype: Current State & Next Steps
• On-demand tile generation
• Fast (enough)
• Will scale (hopefully!)
• OSM not quite “complete” but getting there
• Context layer finished
• Some data added
× Legend
× Automated data updates
× Tag cloud
× Improve cartography
× Internet Explorer 6 (www.savethedevelopers.org)
× Other census & ONS data
× Interactive data combination
× Scotland & Northern Ireland
× Points of Interest
Running until June 2010
Live Demo
• http://www.censusprofiler.org/prototype/
Google Earth
Q&A
www.censusprofiler.org
www.oliverobrien.co.uk
Google Street View and Google Earth POI data is Copyright Google. Google Maps mapping data is Copyright Tele Atlas. Google aerial imagery is Copyright Digital Globe, Infoterra Ltd, Bluesky, GeoEye, Getmapping plc, The Geoinformation Group. OpenStreetMap data is CC-BY-SA OpenStreetMap and contributors. Logos depicted are generally Copyright of their respective organisations. Some image tiles include boundary information supplied by EDINA’s UKBORDERS service. The Census data is supplied by the Office for National Statistics.
The Word Cloud was produced with Wordle. The colour wheel diagrams are from worqx.com. The Painter’s Algorithm picture and the HSL colour diagram are from Wikipedia. The cartogram was produced by James Cheshire. The corresponding choropleth was produced by the BBC.
The following references were used in the first part of this presentation:
CIBER (2008) Information behaviour of the researcher of the future. A report commissioned by The British Library and JISC, 11 January 2008. http://www.bl.uk/news/pdf/googlegen.pdf
Goodchild (2007) Citizens as Sensors: The world of Volunteered Geography. Workshop on Volunteered Geographic Information, Santa Barbara, CA, December 13-14, 2007. http://www.ncgia.ucsb.edu/projects/vgi/docs/position/Goodchild_VGI2007.pdf
O’Reilly, T (2005) What Is Web 2.0: Design Patterns and Business Models for the Next Generation of Software. http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/whatisWeb20.html
Turner, A (2007) Introduction to Neogeography. O’Reilly Media Short Cuts.