Building Geospatial Business Intelligence Solutions with Free and Open Source Components
![Page 1: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/1.jpg)
Building Geospatial Business Intelligence Solutions with Free and Open Source Components
FOSS4G 2007
Etienne Dubé, Thierry Badard, Yvan Bédard
Centre for Research in Geomatics, Université Laval, Québec, Canada
![Page 2: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/2.jpg)
Outline
1. BI for dummies.
2. Merging BI and GIS.
3. Open source software for Geospatial BI.
   GeoKettle: a Spatial ETL tool for data warehousing.
   Doing Spatial OLAP with Mondrian.
4. Conclusion, thanks and questions.
![Page 3: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/3.jpg)
What is BI (Business Intelligence)?
“Business intelligence (BI) is a business management term, which refers to applications and technologies that are used to gather, provide access to, and analyze data and information about company operations.”
– Wikipedia
Examples of components and applications:
Data warehousing
Reporting tools
Dashboards
Data mining
On-line Analytical Processing (OLAP)
Something your boss or client is possibly interested in, and has asked you to investigate.
© 2005, United Feature Syndicate
![Page 4: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/4.jpg)
The Data Warehouse
Repository of an organization’s historical data, for analysis purposes.
Primarily intended for analysts and decision makers.
Separate from operational (OLTP) systems (source data).
Contents are often presented in a summarized form (e.g. key performance indicators, dashboards).
Optimized for:
  Large volumes of data (up to terabytes);
  Fast response to analytical queries (vs. update speed), through:
    de-normalized data schemas,
    summary (aggregate) data,
    dimensional modeling.
![Page 5: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/5.jpg)
Why merge BI and GIS software?
Because …
“About eighty percent of all data stored in corporate databases
has a spatial component” [Franklin 1992]
Franklin, C. 1992. An Introduction to Geographic Information Systems: Linking Maps to Databases. Database, April, pp. 13-21
![Page 6: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/6.jpg)
Why merge BI and GIS software?
Imagine you are a decision maker in public health policy…
You will certainly have difficulty answering questions like:
Where are the urban spots that are more sensitive to heat waves, intense rain, flooding or droughts in a specific geographic area?
How many people with cardiovascular, respiratory, neurological and psychological diseases will there be in 2025 and 2050 in a specific geographic area?
How many people with low income live alone in a building requiring major repairs in a specific geographic area?
![Page 7: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/7.jpg)
To answer these questions …
You can use:
GIS
  Implies writing very complex SQL queries.
  Sometimes a long and hard job, which requires dedicated human resources.
  Has to be done anew every time the data change or new analyses are needed.
Classical BI tools (OLAP clients, reporting tools)
  Unable to handle the spatial dimension of data (or only very basic support).
Merging GIS and BI tools (e.g. Spatial OLAP)
  Fully exploits the spatial component.
  No need to write any SQL statements, just click away!
![Page 8: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/8.jpg)
# of people with respiratory diseases, by sex, at a specific spatial level
![Page 9: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/9.jpg)
Temporal evolution of heat waves (for 2001, 2025 and 2050)
![Page 10: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/10.jpg)
Spatial drill down operation
![Page 11: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/11.jpg)
# of people 55 to 84 years old who live alone
3 cartographic representations of the same analysis
The previous screenshots come from a prototype developed with the JMap Spatial OLAP software from Kheops Technology in the SII-41 project “An innovative interactive web tool to better understand climate-related health vulnerabilities” (co-leaders: Profs. Pierre Gosselin and Thierry Badard), funded by the GEOIDE NCE in Geomatics.
![Page 15: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/15.jpg)
Components of a BI infrastructure
[Diagram: Data sources (OLTP systems) → Data extraction → ETL systems → Data loading → Data Warehouse → OLAP, Reporting tools, Data mining]
![Page 16: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/16.jpg)
Introduction to ETL
A type of software used to populate the data warehouse, from one or many OLTP data sources.
ETL:
Extract data from operational sources;
Transform it, to correct errors, conform it to defined standards and restructure contents to fit the target schema;
Load data into the warehouse.
ETL handles both the insertion of new data and the update of existing data.
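The three stages can be sketched in plain Java. This is only a minimal in-memory illustration, not how Kettle is implemented: the `EtlSketch` class, the row layout, the sample rows and the cleanup rules are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory ETL pipeline; names and data are illustrative only.
final class EtlSketch {

    // Extract: read raw rows from an operational source (here, a hard-coded list).
    static List<String[]> extract() {
        List<String[]> rows = new ArrayList<String[]>();
        rows.add(new String[] {" quebec city ", "2005-11", "582"});
        rows.add(new String[] {"Quebec City", "2005-12", "n/a"}); // unparsable measure
        rows.add(new String[] {"MONTREAL", "2005-11", "910"});
        return rows;
    }

    // Transform: correct errors and conform values to the target schema.
    static List<Object[]> transform(List<String[]> raw) {
        List<Object[]> clean = new ArrayList<Object[]>();
        for (String[] r : raw) {
            int units;
            try {
                units = Integer.parseInt(r[2].trim());
            } catch (NumberFormatException e) {
                continue; // reject rows whose measure cannot be parsed
            }
            clean.add(new Object[] {titleCase(r[0]), r[1], Integer.valueOf(units)});
        }
        return clean;
    }

    // Load: insert new keys and update existing ones in the "warehouse" map.
    static void load(List<Object[]> rows, Map<String, Integer> warehouse) {
        for (Object[] r : rows) {
            warehouse.put(r[0] + "|" + r[1], (Integer) r[2]);
        }
    }

    // Normalizes a place name, e.g. " quebec city " becomes "Quebec City".
    static String titleCase(String s) {
        StringBuilder sb = new StringBuilder();
        for (String w : s.trim().toLowerCase().split("\\s+")) {
            if (w.isEmpty()) continue;
            if (sb.length() > 0) sb.append(' ');
            sb.append(Character.toUpperCase(w.charAt(0))).append(w.substring(1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> warehouse = new LinkedHashMap<String, Integer>();
        load(transform(extract()), warehouse);
        System.out.println(warehouse);
    }
}
```

Because `load` writes through a keyed map, running it again with changed source rows updates existing entries and inserts new ones, which mirrors the insert/update behaviour described above.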
![Page 17: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/17.jpg)
Pentaho Data Integration (Kettle project)
Free software (LGPL) ETL tool, built with Java.
Originally developed by Matt Casters (www.ibridge.be).
LGPL since December 2005.
Acquired by Pentaho Corp. (an open source BI company) in April 2006.
Runs on Windows, Linux, Mac OS X and any other platform supporting Java & SWT.
http://kettle.pentaho.org
![Page 18: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/18.jpg)
GeoKettle: a geo-enabled version of Kettle
Kettle handles typical SQL data types:
  Number, String, Date, Boolean, Integer, BigNumber, Binary
What do we need to do to add support for geospatial vector data?
A native Geometry data type.
Some I/O support for vector GIS files and DBMS.
Transformation steps for:
  topological predicates (intersects, contains, …)
  spatial analysis (overlays, buffers, …)
Scripting support for Geometry objects (JavaScript).
![Page 19: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/19.jpg)
Kettle’s GUI
Using Spoon to create a GeoKettle ETL transformation:
![Page 20: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/20.jpg)
Geometry data type
Kettle data types apply to Value objects, each value corresponding to a field in a row.
We added a new Geometry data type, based on the GeOxygene framework (http://oxygene-project.sourceforge.net).
![Page 21: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/21.jpg)
I/O of geospatial data
We have implemented native support for PostGIS [1], using its PostgreSQL JDBC wrapper.
Values read from/written to GEOMETRY columns are transparently converted back and forth between PGgeometry and GeoKettle’s native Geometry objects.
No need to use AsText() and GeomFromText()!
Also read-only support for Shapefiles (using GeoTools [2]).
Geometries are converted to the Geometry type, and other alphanumeric fields (in the DBF file) are converted to appropriate basic types.
[1] PostGIS is Refractions Research’s spatial extension for PostgreSQL: postgis.refractions.net
[2] GeoTools is an open source Java GIS toolkit: geotools.codehaus.org
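For contrast, here is a rough sketch of the manual text round trip that a native geometry type makes unnecessary: geometries serialized to WKT on one side and parsed back on the other. The `WktPointSketch` class is hypothetical, handles only `POINT(x y)`, and is not part of GeoKettle or PostGIS.

```java
// Hypothetical helper showing a manual WKT round trip for a single point.
final class WktPointSketch {

    // Serializes a coordinate pair to WKT text, the kind of string AsText() returns.
    static String toWkt(double x, double y) {
        return "POINT(" + x + " " + y + ")";
    }

    // Parses "POINT(x y)" back into {x, y}; other geometry types are not handled.
    static double[] fromWkt(String wkt) {
        String s = wkt.trim();
        if (!s.toUpperCase().startsWith("POINT")) {
            throw new IllegalArgumentException("Only POINT is supported: " + wkt);
        }
        int open = s.indexOf('(');
        int close = s.lastIndexOf(')');
        if (open < 0 || close < open) {
            throw new IllegalArgumentException("Malformed WKT: " + wkt);
        }
        String[] parts = s.substring(open + 1, close).trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected two coordinates: " + wkt);
        }
        return new double[] {Double.parseDouble(parts[0]), Double.parseDouble(parts[1])};
    }

    public static void main(String[] args) {
        String wkt = toWkt(-71.2, 46.8);
        double[] xy = fromWkt(wkt);
        System.out.println(wkt + " -> " + xy[0] + ", " + xy[1]);
    }
}
```

Every step that reads or writes a geometry would otherwise need this kind of conversion, for every geometry type, which is the overhead the transparent conversion removes.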
![Page 22: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/22.jpg)
Spatial analysis and scripting functionalities
Topological predicates for “Filter rows” step (e.g. intersects, contains, is disjoint from…).
Exposing Geometry objects in JavaScript.
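A row filter driven by a topological predicate might look like the sketch below. It is an assumption-laden stand-in: rectangles from the JDK's `java.awt.geom` package replace real GIS geometries so the example runs without external libraries, and `SpatialFilterSketch` and its `Predicate` enum are hypothetical names, not GeoKettle classes.

```java
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

// Stand-in for a "Filter rows" step with a topological predicate.
// Rectangles replace real GIS geometries (an assumption made for brevity).
final class SpatialFilterSketch {

    enum Predicate { INTERSECTS, CONTAINS, DISJOINT }

    // Evaluates "zone <predicate> geom". Note that Rectangle2D.intersects
    // compares interiors, so shapes that only touch along an edge do not
    // intersect here, unlike the usual GIS definition.
    static boolean test(Predicate p, Rectangle2D zone, Rectangle2D geom) {
        switch (p) {
            case INTERSECTS: return zone.intersects(geom);
            case CONTAINS:   return zone.contains(geom);
            case DISJOINT:   return !zone.intersects(geom);
            default: throw new IllegalArgumentException("Unknown predicate: " + p);
        }
    }

    // Keeps only the rows whose geometry satisfies the predicate against the zone.
    static List<Rectangle2D> filter(List<Rectangle2D> rows, Predicate p, Rectangle2D zone) {
        List<Rectangle2D> kept = new ArrayList<Rectangle2D>();
        for (Rectangle2D geom : rows) {
            if (test(p, zone, geom)) {
                kept.add(geom);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Rectangle2D zone = new Rectangle2D.Double(0, 0, 10, 10);
        List<Rectangle2D> rows = new ArrayList<Rectangle2D>();
        rows.add(new Rectangle2D.Double(2, 2, 3, 3));   // inside the zone
        rows.add(new Rectangle2D.Double(8, 8, 5, 5));   // overlaps the zone
        rows.add(new Rectangle2D.Double(20, 20, 1, 1)); // outside the zone
        for (Predicate p : Predicate.values()) {
            System.out.println(p + ": " + filter(rows, p, zone).size() + " row(s) kept");
        }
    }
}
```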
![Page 23: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/23.jpg)
Upcoming features for GeoKettle
Read/write support for more GIS file formats (supported by GeoTools) and DBMS (e.g. Oracle Spatial).
A GUI transformation step for spatial analysis.
Enforcement of SRIDs and native support for coordinate system transformations.
Embedded map viewer (for transformation preview).
![Page 25: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/25.jpg)
Intro to OLAP and Spatial OLAP
OLAP – On-Line Analytical Processing
“… is an approach to quickly providing answers to analytical queries that are multidimensional in nature.”
– Wikipedia
Emphasis on “quickly”: response time < 5 seconds
OLAP server and query languages (MDX).
OLAP clients:
  Cross-tabs
  Charts (histograms, pie charts, graphs)
Spatial OLAP (SOLAP) adds support for geospatial data (map displays and interaction).
![Page 26: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/26.jpg)
OLAP and SOLAP vocabulary
Cube
Dimension:
  Temporal
  Thematic
  Geospatial
Hierarchy
Level
Member
Measure:
  Descriptive
  Geospatial
Fact
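To make the hierarchy, level and member terms concrete, here is a small Java sketch of a two-level geospatial hierarchy with roll-up and drill-down. Everything in it is an assumption for illustration: the `HierarchySketch` class, the Province and City levels, and all figures are invented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hierarchy sketch: Province -> City. All figures are invented.
final class HierarchySketch {

    static final class Member {
        final String level;   // e.g. "Province" or "City"
        final String name;
        final int ownValue;   // measure stored on leaf members only
        final List<Member> children = new ArrayList<Member>();

        Member(String level, String name, int ownValue) {
            this.level = level;
            this.name = name;
            this.ownValue = ownValue;
        }

        Member add(Member child) {
            children.add(child);
            return this;
        }

        // Roll-up: a member's measure is the sum over its leaf descendants.
        int total() {
            if (children.isEmpty()) {
                return ownValue;
            }
            int sum = 0;
            for (Member c : children) {
                sum += c.total();
            }
            return sum;
        }

        // Drill-down: list the members one level below, with their totals.
        List<String> drillDown() {
            List<String> out = new ArrayList<String>();
            for (Member c : children) {
                out.add(c.level + " " + c.name + ": " + c.total());
            }
            return out;
        }
    }

    static Member sample() {
        return new Member("Province", "Quebec", 0)
            .add(new Member("City", "Quebec City", 582))
            .add(new Member("City", "Montreal", 400));
    }

    public static void main(String[] args) {
        Member province = sample();
        System.out.println(province.level + " " + province.name + ": " + province.total());
        for (String line : province.drillDown()) {
            System.out.println("  " + line);
        }
    }
}
```

In a SOLAP client the same drill-down would be triggered by clicking a region on the map, with each member carrying a geometry in addition to its name.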
![Page 27: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/27.jpg)
OLAP and SOLAP vocabulary
[Figure labels: Store sales, Warehouse inventory, Suppliers orders]
![Page 28: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/28.jpg)
OLAP and SOLAP vocabulary
[Figure labels: Geospatial, Temporal, Thematic]
![Page 30: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/30.jpg)
OLAP and SOLAP vocabulary
[Figure: cube with Product, Place and Time axes; labels 2005-11, Cross-country skis, Quebec City]

Fact (dimensions: Place, Time, Product; measures: Sold units, Sales price):

| Place | Time | Product | Sold units | Sales price |
|---|---|---|---|---|
| Quebec City | 2005-11 | XC skis | 582 | $145,500 |
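The fact above can be modeled in a few lines of Java to show how measures are aggregated along a dimension. This is a sketch under stated assumptions: `CubeSketch` is a hypothetical class, only the first sample fact comes from the slide, and the other two facts are invented so that the aggregation has something to sum.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of facts, dimensions and measures in a sales cube.
final class CubeSketch {

    enum Dimension { PLACE, TIME, PRODUCT }

    // One fact: a combination of dimension members plus its measures.
    static final class Fact {
        final String place, time, product; // dimension members
        final int soldUnits;               // measure
        final long salesPrice;             // measure, in dollars

        Fact(String place, String time, String product, int soldUnits, long salesPrice) {
            this.place = place;
            this.time = time;
            this.product = product;
            this.soldUnits = soldUnits;
            this.salesPrice = salesPrice;
        }

        String member(Dimension d) {
            switch (d) {
                case PLACE: return place;
                case TIME:  return time;
                default:    return product;
            }
        }
    }

    // Sums the "sold units" measure for each member of one dimension.
    static Map<String, Integer> soldUnitsBy(List<Fact> facts, Dimension d) {
        Map<String, Integer> totals = new TreeMap<String, Integer>();
        for (Fact f : facts) {
            String key = f.member(d);
            Integer prev = totals.get(key);
            totals.put(key, prev == null ? f.soldUnits : prev + f.soldUnits);
        }
        return totals;
    }

    static List<Fact> sampleFacts() {
        List<Fact> facts = new ArrayList<Fact>();
        facts.add(new Fact("Quebec City", "2005-11", "XC skis", 582, 145500L)); // from the slide
        facts.add(new Fact("Quebec City", "2005-12", "XC skis", 700, 175000L)); // invented
        facts.add(new Fact("Montreal", "2005-11", "XC skis", 400, 100000L));    // invented
        return facts;
    }

    public static void main(String[] args) {
        System.out.println(soldUnitsBy(sampleFacts(), Dimension.PLACE));
        System.out.println(soldUnitsBy(sampleFacts(), Dimension.TIME));
    }
}
```

An OLAP server does the same kind of grouping, but over precomputed or SQL-generated aggregates rather than an in-memory list.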
![Page 31: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/31.jpg)
Mondrian (Pentaho Analysis Services)
Mondrian is an open source (Common Public License) OLAP server, written in Java.
Originally developed by Julian Hyde, since 2001. Acquired by Pentaho Corp. in November 2005.
Uses MDX as its query language.
JDBC connections to data sources (ROLAP).
FOSS projects using Mondrian:
  JPivot (JSP-based web OLAP client)
  Other Pentaho BI components
  JRubik (desktop OLAP client, with Swing GUI)
http://mondrian.pentaho.org
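To give an idea of what an MDX query looks like, the sketch below builds one as a Java string. The cube name `[Sales]`, the `[Place]` and `[Time]` dimensions and the `[Sold units]` measure are assumptions modeled on the fact example earlier in this talk, not an actual schema, and the code only assembles the query text without calling Mondrian.

```java
// Hypothetical helper that assembles an MDX query string; it does not call Mondrian.
final class MdxSketch {

    // Sold units for every member of the Place dimension, sliced on one month.
    static String soldUnitsByPlace(String month) {
        return "SELECT {[Measures].[Sold units]} ON COLUMNS, "
             + "{[Place].Members} ON ROWS "
             + "FROM [Sales] "
             + "WHERE ([Time].[" + month + "])";
    }

    public static void main(String[] args) {
        System.out.println(soldUnitsByPlace("2005-11"));
    }
}
```

The axes (`ON COLUMNS`, `ON ROWS`) map directly onto the cross-tab an OLAP client displays, while the `WHERE` clause acts as a slicer on the remaining dimension.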
![Page 32: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/32.jpg)
Using geospatial data with Mondrian
We have a data warehouse based on PostgreSQL + PostGIS. Let’s serve Spatial OLAP cubes from that!
Solution: use PostGIS JDBC wrapper with Mondrian:
We can define spatial member properties for GEOMETRY columns in the cube schema.
The client application retrieves the spatial property value and casts it to org.postgis.PGgeometry.
Display it on a map, do spatial analysis and other funky stuff.
As far as we know, and unlike other projects combining GIS and OLAP, this approach is the first to integrate geo objects as part of the cube (instead of fetching them from an external spatial DBMS or GIS file).
![Page 33: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/33.jpg)
Upcoming work: towards GeoMondrian
Implement a native geospatial MDX data type in Mondrian…
… to standardize the handling of geodata, regardless of source DBMS (PostGIS, Oracle Spatial).
… to enable the development of Geospatial MDX extensions (spatial analysis and aggregate functions).
To achieve a complete Geospatial BI solution, develop graphical and web front-ends such as dashboards combining cross-tabs, charts and map displays.
![Page 34: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/34.jpg)
Conclusion
Open source BI is still in its infancy…
Open source Geospatial BI is even younger…
But now is your chance to participate in the growth of this new and exciting segment of FOSS!
Stay tuned for an alpha release of GeoKettle, at http://geosoa.scg.ulaval.ca.
A video file which illustrates the capabilities of GeoKettle is already available at: http://geosoa.scg.ulaval.ca/fr.
![Page 35: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/35.jpg)
Acknowledgments
NSERC Industrial Research Chair in Geospatial Databases for Decision Support (Prof. Yvan Bédard, Université Laval)
http://mdspatialdb.chair.scg.ulaval.ca
GeoSOA research group (Prof. Thierry Badard, Université Laval) on Geospatial Service Oriented Architectures for mobile decision-support
http://geosoa.scg.ulaval.ca
Canadian Institute of Geomatics Scholarship Award
http://www.cig-acsg.ca
![Page 36: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/36.jpg)
Appendices
![Page 37: Building Geospatial Business Intelligence Solutions with Free and Open Source Components](https://reader036.fdocuments.us/reader036/viewer/2022081623/613d7a08736caf36b75dc902/html5/thumbnails/37.jpg)
Code snippet
How to retrieve a PGgeometry object from a Mondrian member property…
    // ...
    // m is an existing Member object (mondrian.olap.Member)
    for (mondrian.olap.Property prop : m.getProperties()) {
        pw.println(" property: " + prop.getName());
        Object pval = m.getPropertyValue(prop.getName());
        String pvalstr;
        if (pval instanceof org.postgis.PGgeometry) {
            // property is a PostGIS geometry
            org.postgis.PGgeometry pggeom = (org.postgis.PGgeometry) pval;
            // convert geometry to WKT string
            pvalstr = pggeom.toString();
            // We could also do something else with the PostGIS
            // geometry from the member, e.g. convert it to a GIS
            // framework object (JTS, GeOxygene, ...), then use it
            // for displaying a web map or doing spatial analysis.
        }
        else {
            // ...