Post on 29-Dec-2015
Digital Data Visualization
May 1, 2001
Hwan-Seung Yong
Dept. of Computer Science & Eng
Ewha Womans Univ.
hsyong@ewha.ac.kr
Data Visualization 2
Contents
• Background
• Visualization Example
– OLAP
– Data Mining
• Multimedia Data Mining
• Spatial Data Mining
• Text Mining
• New Visual Approach
– Visual ICON Language
– Visual Language
• Future Trend
Data Visualization 3
Database View of Data Visualization
• File Processing
– Record-by-record navigation
• Network/Hierarchical Data Model
– Record-based interface using text
– Records have a network/hierarchical structure
• Conceptual Modeling, Database Design
– Entity-Relationship Model
– ER Diagram
• Relational Model
– 2-dimensional table
– QBE user interface: 2-dimensional
Data Visualization 4
ERwin Database Designer
Data Visualization 5
Access Query Interface: QBE
Data Visualization 6
Definition of Visualization
• To form a mental vision, image, or picture of something not visible (an abstraction)
– To make visible to the mind or imagination
– [Oxford Dictionary, 1989]
• Visualization is a method of computing
– It transforms the symbolic into the geometric, enabling researchers to observe their simulations and computations
– Enriches the process of scientific discovery
– Fosters profound and unexpected insights
– In many fields, it is already revolutionizing the way scientists do science
– [MCC89]
Data Visualization 7
Scientific Visualization/Goals
• exploration/exploitation of data and information
• enhancing understanding of concepts and processes
• gaining new (unexpected, profound) insights
• making invisible visible
• effective presentation of significant features
• quality control of simulations, measurements
• increasing scientific productivity
• medium of communication/collaboration
Data Visualization 8
Visualization and adjacent disciplines
• Computer Graphics: Efficiency of algorithms (CG) vs effectiveness of use (V).
• Computer Vision: Mapping from pictures to abstract description (CV) vs mapping from abstract description to pictures (V).
• Image Processing: Mapping from data domain to data domain (IP) vs mapping from data domain to picture domain (V).
• Art and Design: Aesthetics and style (AD) versus expressiveness and effectiveness (V).
Data Visualization 9
Kinds of Digital Data
• Atomic Value (Numeric, String, Boolean)
• Multimedia Data
– Sound & Audio, Video, Text
• Complex Data Structure
– Tuple, Set, Array, Stack, Queue, Tree, Graph, etc.
• Large Set of Data
– Database
• What to visualize?
• Why?
• How?
Data Visualization 10
New Data Processing Techniques
• Object-Oriented/Relational Data Model
– Complex data: graph style
• Multimedia
– Visual interface is required
– Time/Space/Sound and 3 dimensions
• Data Warehousing and OLAP
– Multi-dimensional modeling and cube browser
• Data Mining
– Visual interface for mining
– Visual data mining
• Data pattern analysis
• Clustering
Data Visualization 11
Why Visualization?
• Development of H/W and S/W
– Computer graphics and visualization technology
• Interactive and Windows Age
• Visual programming language
– Visual Basic, Visual C++, etc.
• Visual ICON language
– Emoticon
• Multimedia and Animation
Data Visualization 12
Scientific Data Visualization
Data Visualization 13
Boxplot Analysis
• Five-number summary of a distribution: Minimum, Q1, M (median), Q3, Maximum
• Boxplot
– Data is represented with a box
– The ends of the box are at the first and third quartiles, i.e., the height of the box is the interquartile range (IQR = Q3 - Q1)
– The median is marked by a line within the box
– Whiskers: two lines outside the box extend to Minimum and Maximum
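The five numbers behind a boxplot can be computed directly. The sketch below is a minimal illustration, assuming the "median of each half" convention for Q1 and Q3 (statistical packages use several slightly different quartile rules); the function name and sample values are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a non-empty list of numbers."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]             # values below the median position
    upper = data[half + (n % 2):]   # values above the median position
    q1 = median(lower) if lower else data[0]
    q3 = median(upper) if upper else data[-1]
    return data[0], q1, median(data), q3, data[-1]

# Hypothetical sample values.
summary = five_number_summary([7, 15, 36, 39, 40, 41])
iqr = summary[3] - summary[1]   # height of the box
```

The whiskers then run from `summary[0]` to `summary[4]`, and the box spans `summary[1]` to `summary[3]`.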
Data Visualization 14
A Boxplot
Data Visualization 15
Visualization of Data Dispersion: Boxplot Analysis
Data Visualization 16
Data Visualization Systems
• AVS, IBM Visualization Data Explorer, SGI Explorer
• Khoros, SciAn, other PD vis packages
• NetMap
• S-Plus, SPSS, MatLab, Mathematica, MAPLE
• XmdvTool, Xgobi
• Xsauci
Data Visualization 17
From Tables and Spreadsheets to Data Cubes
• A data warehouse is based on a multidimensional data model which views data in the form of a data cube
• A data cube, such as sales, allows data to be modeled and viewed in multiple dimensions
– Dimension tables, such as item (item_name, brand, type), or time (day, week, month, quarter, year)
– Fact table contains measures (such as dollars_sold) and keys to each of the related dimension tables
• In data warehousing literature, an n-D base cube is called a base cuboid. The topmost 0-D cuboid, which holds the highest level of summarization, is called the apex cuboid. The lattice of cuboids forms a data cube.
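To make the lattice concrete, the sketch below builds one cuboid per subset of dimensions from a handful of fact rows. It is a toy illustration only: the rows are made up, and the count of 2**n cuboids assumes no concept hierarchies on the dimensions.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical base-cuboid rows: (item, time, location, dollars_sold)
DIMENSIONS = ("item", "time", "location")
FACTS = [
    ("TV", "Q1", "Vancouver", 400.0),
    ("TV", "Q2", "Toronto", 350.0),
    ("PC", "Q1", "Vancouver", 900.0),
    ("PC", "Q1", "Toronto", 700.0),
]

def cuboid(group_by):
    """Sum dollars_sold grouped by the given subset of dimensions."""
    positions = [DIMENSIONS.index(d) for d in group_by]
    totals = defaultdict(float)
    for row in FACTS:
        key = tuple(row[p] for p in positions)
        totals[key] += row[-1]
    return dict(totals)

# The lattice holds one cuboid per subset of dimensions: 2**3 = 8 here.
lattice = {
    dims: cuboid(dims)
    for k in range(len(DIMENSIONS) + 1)
    for dims in combinations(DIMENSIONS, k)
}

apex = lattice[()]           # 0-D cuboid: a single grand total
base = lattice[DIMENSIONS]   # 3-D base cuboid: no summarization
```

Every other entry of `lattice`, for example `lattice[("item",)]`, is an intermediate level of summarization between the base and apex cuboids.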
Data Visualization 18
Visualization of OLAP Model using Star Schema
Sales Fact Table
– Keys: time_key, item_key, branch_key, location_key
– Measures: units_sold, dollars_sold, avg_sales
Dimension tables
– time: time_key, day, day_of_the_week, month, quarter, year
– item: item_key, item_name, brand, type, supplier_type
– branch: branch_key, branch_name, branch_type
– location: location_key, street, city, province_or_state, country
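A star schema is queried by joining the fact table to whichever dimension tables the question needs. The sketch below uses Python's built-in sqlite3 module with only two of the four dimensions and a reduced column set; all row values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time (time_key INTEGER PRIMARY KEY, day INTEGER, month INTEGER,
                   quarter TEXT, year INTEGER);
CREATE TABLE item (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT, type TEXT);
-- Fact table: one foreign key per dimension plus the measures.
CREATE TABLE sales (time_key INTEGER REFERENCES time(time_key),
                    item_key INTEGER REFERENCES item(item_key),
                    units_sold INTEGER, dollars_sold REAL);
INSERT INTO time VALUES (1, 15, 1, 'Q1', 2001), (2, 20, 4, 'Q2', 2001);
INSERT INTO item VALUES (10, 'laptop', 'Acme', 'computer'), (11, 'mouse', 'Acme', 'accessory');
INSERT INTO sales VALUES (1, 10, 2, 2400.0), (1, 11, 5, 100.0), (2, 10, 1, 1200.0);
""")

# Total dollars_sold per quarter and item type.
rows = conn.execute("""
    SELECT t.quarter, i.type, SUM(s.dollars_sold)
    FROM sales AS s
    JOIN time AS t ON s.time_key = t.time_key
    JOIN item AS i ON s.item_key = i.item_key
    GROUP BY t.quarter, i.type
    ORDER BY t.quarter, i.type
""").fetchall()
```

Adding the branch and location dimensions would follow the same pattern: one more key column in the fact table and one more join.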
Data Visualization 19
A Concept Hierarchy: Dimension (location)
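Levels, from most general to most specific: all, region, country, city, office
– all: all
– region: Europe, ..., North_America
– country: Germany, ..., Spain, Canada, ..., Mexico
– city: Frankfurt, ..., Vancouver, ..., Toronto
– office: L. Chan, ..., M. Wind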
Data Visualization 20
View of Warehouses and Hierarchies
Specification of hierarchies
• Schema hierarchy
day < {month < quarter; week} < year
• Set_grouping hierarchy
{1..10} < inexpensive
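Both kinds of hierarchy can be written as small mapping functions. The sketch below is illustrative only: the ISO week numbering and the price ranges other than {1..10} are assumptions, not part of the slide.

```python
from datetime import date

def time_levels(d):
    """Map a day to its ancestors in the schema hierarchy
    day < {month < quarter; week} < year."""
    iso_year, iso_week, _ = d.isocalendar()
    return {
        "day": d.isoformat(),
        "month": f"{d.year}-{d.month:02d}",
        "quarter": f"{d.year}-Q{(d.month - 1) // 3 + 1}",
        "week": f"{iso_year}-W{iso_week:02d}",   # ISO week numbering: an assumption
        "year": d.year,
    }

# Set-grouping hierarchy: value ranges grouped under a named concept.
# Only {1..10} < inexpensive comes from the slide; the other ranges are made up.
PRICE_GROUPS = [(1, 10, "inexpensive"), (11, 100, "moderate"), (101, 1000, "expensive")]

def price_group(price):
    """Return the concept whose range contains price, or "unknown"."""
    for low, high, label in PRICE_GROUPS:
        if low <= price <= high:
            return label
    return "unknown"

levels = time_levels(date(2001, 5, 1))
```

Week sits on its own branch because a week can straddle two months, so it rolls up to year but not to month or quarter.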
Data Visualization 21
Multidimensional Data
• Sales volume as a function of product, month, and region
• Dimensions: Product, Location, Time
• Hierarchical summarization paths
– Product: Industry, Category, Product
– Location: Region, Country, City, Office
– Time: Year, Quarter, Month or Week, Day
[Figure: cube with axes Product, Region, Month]
Data Visualization 22
A Star-Net Query Model
[Figure: star-net with one radial line per dimension and its abstraction levels]
– Shipping Method: AIR-EXPRESS, TRUCK
– Customer Orders: ORDER, CONTRACTS
– Customer
– Product: PRODUCT GROUP, PRODUCT LINE, PRODUCT ITEM
– Organization: SALES PERSON, DISTRICT, DIVISION
– Promotion
– Location: CITY, COUNTRY, REGION
– Time: DAILY, QTRLY, ANNUALLY
Each circle is called a footprint
Data Visualization 23
OLAP User Interface: Drilling Down
• Drilling down to the lowest level of the Customer dimension
Data Visualization 24
Examples: Discovery-Driven Data Cubes
Data Visualization 25
Browsing a Data Cube
• Visualization
• OLAP capabilities
• Interactive manipulation
Data Visualization 27
OLAP (Summarization) Display Using MS/Excel 2000
Data Visualization 28
3D Cube Browser
Data Visualization 29
Data Mining Result Visualization
• Presentation of the results or knowledge obtained from data mining in visual forms
• Examples
– Scatter plots and boxplots (obtained from descriptive data mining)
– Decision trees
– Association rules
– Clusters
– Outliers
– Generalized rules
Data Visualization 30
Visualization of Association
Data Visualization 36
Market-Basket-Analysis (Association)—Ball graph
Data Visualization 37
Display of Association Rules in Rule Plane Form
Data Visualization 38
Display of Decision Tree (Classification Results)
Data Visualization 39
Output: A Decision Tree for “buys_computer”
age?
– <=30: student?
• no: no
• yes: yes
– 30..40: yes
– >40: credit rating?
• excellent: no
• fair: yes
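A decision tree of this kind reads as nested tests from the root to a leaf. The sketch below encodes the "buys_computer" tree as a plain function; the leaf labels follow the usual textbook version of this example, which is an assumption where the extracted slide is ambiguous, and an age of exactly 30 is sent down the "<=30" branch.

```python
def buys_computer(age, student, credit_rating):
    """Classify one customer with the "buys_computer" tree.

    age: int; student: bool; credit_rating: "fair" or "excellent".
    Leaf labels are assumed from the common textbook version of the tree.
    """
    if age <= 30:
        # Young customers: the outcome depends on student status.
        return "yes" if student else "no"
    if age <= 40:
        # Middle branch (30..40) is a pure leaf.
        return "yes"
    # Older customers: the outcome depends on credit rating.
    return "yes" if credit_rating == "fair" else "no"
```

For example, `buys_computer(25, True, "fair")` follows the path age <= 30, student = yes and returns `"yes"`.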
Data Visualization 40
Visualization of a decision tree in MineSet 3.0
Data Visualization 41
Display of Clustering (Segmentation) Results
Data Visualization 42
C-BIRD: Content-Based Image Retrieval from Digital libraries
Search
by image colors
by color percentage
by color layout
by texture density
by texture Layout
by object model
by illumination invariance
by keywords
Data Visualization 43
Multi-Dimensional Search in Multimedia Databases
Color layout
Data Visualization 44
Multi-Dimensional Analysis in Multimedia Databases
Color histogram, Texture layout
Data Visualization 45
Mining Multimedia Databases
Refining or combining searches
• Search for “blue sky” (top layout grid is blue)
• Search for “blue sky and green meadows” (top layout grid is blue and bottom is green)
• Search for “airplane in blue sky” (top layout grid is blue and keyword = “airplane”)
Data Visualization 46
Multidimensional Analysis of Multimedia Data
• Multimedia data cube
– Design and construction similar to that of traditional data cubes from relational data
– Contain additional dimensions and measures for multimedia information, such as color, texture, and shape
• The database does not store images but their descriptors
– Feature descriptor: a set of vectors for each visual characteristic
• Color vector: contains the color histogram
• MFC (Most Frequent Color) vector: five color centroids
• MFO (Most Frequent Orientation) vector: five edge orientation centroids
– Layout descriptor: contains a color layout vector and an edge layout vector
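As a rough illustration of what a color descriptor holds, the sketch below quantizes pixels into coarse color bins, builds a normalized histogram, and picks the most frequent bins. It is a simplified stand-in, not the descriptor format of any particular system: the bin count, the function names, and the sample image are assumptions, and true MFC centroids would be computed by clustering rather than by counting bins.

```python
from collections import Counter

def quantize(pixel, levels=4):
    """Map an (r, g, b) pixel with 0-255 channels to a coarse color bin."""
    step = 256 // levels
    return tuple(channel // step for channel in pixel)

def color_histogram(pixels, levels=4):
    """Fraction of pixels in each occupied color bin (pixels must be non-empty)."""
    counts = Counter(quantize(p, levels) for p in pixels)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}

def most_frequent_colors(pixels, k=5, levels=4):
    """Up to k most common bins: a crude stand-in for the MFC vector."""
    counts = Counter(quantize(p, levels) for p in pixels)
    return [bin_ for bin_, _ in counts.most_common(k)]

# Hypothetical six-pixel image: mostly blue, some green, one white pixel.
image = [(10, 20, 250), (12, 25, 240), (8, 30, 245),
         (20, 200, 30), (25, 210, 35), (250, 250, 250)]
hist = color_histogram(image)
top = most_frequent_colors(image, k=2)
```

Storing `hist` and `top` instead of the pixels is what lets the database answer color queries without keeping the images themselves.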
Data Visualization 47
Mining Multimedia Databases in
Data Visualization 49
Mining Multimedia Databases
The Data Cube and the Sub-Space Measurements
[Figure: a group-by on Colour (RED, WHITE, BLUE, Sum); a cross-tab of Format (GIF, JPEG) by Colour (RED, WHITE, BLUE) with sums By Format and By Colour; and a data cube over Format (JPEG, GIF), Colour (RED, WHITE, BLUE), and Size (Small, Medium, Large, Very Large) with sub-space aggregates By Format, By Colour, By Size, By Format & Colour, By Format & Size, By Colour & Size, and Sum; axis label: Measurement]
• Format of image
• Duration
• Colors
• Textures
• Keywords
• Size
• Width
• Height
• Internet domain of image
• Internet domain of parent pages
• Image popularity
Data Visualization 50
Mining Multimedia Databases
Spatial Relationships from Layout
• property P1 next-to property P2
• property P1 on-top-of property P2
Different Resolution Hierarchy
Data Visualization 54
Classification in MultiMediaMiner
Data Visualization 55
Mining Associations in Multimedia Data
• Special features:
– Need the number of occurrences besides Boolean existence, e.g.,
• “Two red squares and one blue circle” implies theme “air-show”
– Need spatial relationships
• Blue on top of white squared object is associated with brown bottom
– Need multi-resolution and progressive refinement mining
• It is expensive to explore detailed associations among objects at high resolution
• It is crucial to ensure the completeness of search at multi-resolution space
Data Visualization 57
Text Miner: Feature Extraction example from IBM Intelligent Miner
Data Visualization 58
Visual Data Mining & Data Visualization
• Integration of visualization and data mining
– data visualization
– data mining result visualization
– data mining process visualization
– interactive visual data mining
• Visual Data Mining: the process of discovering implicit but useful knowledge from large data sets using visualization techniques
• Data visualization
– Data in a database or data warehouse can be viewed
• at different levels of granularity or abstraction
• as different combinations of attributes or dimensions
– Data can be presented in various visual forms
Data Visualization 59
Boxplots from Statsoft: multiple variable combinations
Data Visualization 60
Visualization of data mining results in SAS Enterprise Miner: scatter plots
Data Visualization 61
Visualization of association rules in MineSet 3.0
Data Visualization 62
Visualization of cluster groupings in IBM Intelligent Miner
Data Visualization 63
GeoMiner Visualization Example
Data Visualization 64
Spatial Clustering
Data Visualization 65
Spatial Association
• Association Rules
– isa(X, "Golf Course") -> closeto(X, "Man-Made Channel") (61%, 61%)
– isa(X, "Golf Course") & closeto(X, "Secondary road") -> closeto(X, "Open space") (64%, 78%)
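Each rule carries a pair of percentages. The slide does not say what they are; a common reading is (support, confidence), and the sketch below computes those two measures under that assumption. The objects are hypothetical and are not the data behind the numbers on the slide.

```python
def rule_stats(objects, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    objects: non-empty list of sets of predicate strings, one set per spatial object.
    """
    matches_lhs = [o for o in objects if antecedent <= o]
    matches_both = [o for o in matches_lhs if consequent <= o]
    support = len(matches_both) / len(objects)
    confidence = len(matches_both) / len(matches_lhs) if matches_lhs else 0.0
    return support, confidence

# Hypothetical spatial objects described by the predicates that hold for them.
objects = [
    {"isa:Golf Course", "closeto:Man-Made Channel"},
    {"isa:Golf Course", "closeto:Man-Made Channel", "closeto:Open space"},
    {"isa:Golf Course", "closeto:Secondary road"},
    {"isa:Park", "closeto:Open space"},
]
support, confidence = rule_stats(
    objects, {"isa:Golf Course"}, {"closeto:Man-Made Channel"})
```

Here two of the four objects satisfy both sides of the rule, and two of the three golf courses are close to a man-made channel.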
Data Visualization 66
Data Mining Process Visualization
• Presentation of the various processes of data mining in visual forms so that users can see
– How the data are extracted
– From which database or data warehouse they are extracted
– How the selected data are cleaned, integrated, preprocessed, and mined
– Which method is selected for data mining
– Where the results are stored
– How they may be viewed
Data Visualization 67
Visualization of Data Mining Processes by Clementine
Data Visualization 68
Interactive Visual Data Mining
• Using visualization tools in the data mining process to help users make smart data mining decisions
• Example
– Display the data distribution in a set of attributes using colored sectors or columns (depending on whether the whole space is represented by a circle or a set of columns)
– Use the display to decide which sector should be selected first for classification and where a good split point for this sector may be
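In interactive visual mining the user chooses the split point by eye. For comparison, the sketch below shows how an automatic method might score candidate split points on one attribute using information gain. This is a swapped-in, standard criterion, not the procedure the visual tool itself uses, and the sample values are invented.

```python
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

def best_split(values, labels):
    """Return (threshold, information_gain) of the best test value <= threshold."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        gain = base - remainder
        if gain > best[1]:
            best = (threshold, gain)
    return best

# Hypothetical attribute values with their class labels.
ages = [22, 25, 35, 45, 50, 60]
classes = ["no", "no", "no", "yes", "yes", "yes"]
threshold, gain = best_split(ages, classes)
```

A visual display of the same attribute, sorted and colored by class, would show the color change at the same position that the gain computation finds.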
Data Visualization 69
Interactive Visual Mining by Perception-Based Classification (PBC)
Data Visualization 70
Visual ICON Language
• Video Annotation Problem
– In the past, video data was used only once
– Experts annotated it for storage and retrieval
– Today is the era of repeatedly reused video
• How should video data be retrieved?
• Limits of the keyword-based approach
– Does not describe the temporal structure of video
– Not a semantic representation
• ‘dog’ and ‘German shepherd’
– Does not describe relations between descriptions
• Only ‘man’, ‘dog’, ‘bite’, not “dog bites man”
– Does not scale: the set of new keywords keeps growing
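Two of these limits are easy to demonstrate. In the sketch below, a keyword set cannot tell "dog bites man" from "man bites dog", while a structured (subject, action, object) tuple can, and a small is-a table lets a query for 'dog' match 'German shepherd'. The data structures are hypothetical illustrations, not the annotation format of any system named in the slides.

```python
def keywords(description):
    """Keyword annotation: an unordered set of terms."""
    return frozenset(description)

# Structured annotation: (subject, action, object) keeps who does what to whom.
dog_bites_man = ("dog", "bite", "man")
man_bites_dog = ("man", "bite", "dog")

same_keywords = keywords(dog_bites_man) == keywords(man_bites_dog)
same_relation = dog_bites_man == man_bites_dog

# A tiny, hypothetical is-a table linking specific terms to general ones.
IS_A = {"German shepherd": "dog", "dog": "animal"}

def is_a(term, concept):
    """True if term equals concept or reaches it by following IS_A links."""
    while term is not None:
        if term == concept:
            return True
        term = IS_A.get(term)
    return False
```

The two keyword sets compare equal even though the events differ, which is the loss of relational information noted above.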
Data Visualization 71
Language for representation of Video content
• ICON Annotation Language, why?
– Quick recognition and browsing of annotation
– Accurate and readable
– Global, international use
• Example
– ‘Arnold, an adult male, wears a jacket’
– ‘scene is located inside a bar in United States of America’
– Character action: full body actions, head actions, arm actions, and leg actions
Data Visualization 72
Language for representation of Video content
• Number of objects
– single object, two objects, or groups of objects
• Media Timeline Editor
– Timeline annotation of icon sentences
• Icon Space, icon palette
– a utility for constructing and retrieving iconic sentences
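One way to picture timeline annotation is as a list of time intervals, each carrying an iconic sentence. The sketch below is a hypothetical data structure for illustration, with icons written as plain strings; it does not reproduce the actual editor's storage format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    start: float   # seconds from the start of the video
    end: float     # exclusive end of the interval
    icons: tuple   # an iconic sentence, e.g. ("adult-male", "wears", "jacket")

def annotations_at(timeline, t):
    """Return every annotation whose interval covers time t."""
    return [a for a in timeline if a.start <= t < a.end]

def find(timeline, icon):
    """Return the (start, end) intervals whose sentence uses the given icon."""
    return [(a.start, a.end) for a in timeline if icon in a.icons]

# Hypothetical annotations for one short clip.
timeline = [
    Annotation(0.0, 12.5, ("scene", "inside", "bar")),
    Annotation(3.0, 8.0, ("adult-male", "wears", "jacket")),
    Annotation(8.0, 12.5, ("adult-male", "walks")),
]
```

Because annotations overlap in time, a query at a single instant can return both the scene description and the character action, which is the temporal structure that plain keywords lack.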
Data Visualization 73
Media Timeline Editor
Data Visualization 74
Icon Space
Data Visualization 75
ICONS used (sample)
Data Visualization 76
MIT Visual Language Project
Data Visualization 77
Some Words: Integration of Text and Visual Icon
Data Visualization 78
Future Trend
• Animated Visualization vs static visualization
• 3D Visualization vs 2D Visualization
• 3D with Animated Visualization
• Cinematic technique is becoming more and more important for user interfaces
– Lev Manovich, Professor of UCSD
– The Language of New Media, 2000, MIT Press
• Find New metaphor – Spiral Curve etc.