Data Mining and Data Visualization – Tools to Allow Students to do BIG STUFF with BIG DATA -...
Students Doing BIG STUFF with BIG DATA Dan Matthews – Trine University
Trine University – Angola, Indiana
Department of Informatics and Cybersecurity
INFORMATICS – OUR WAY
“The success of computing is in the resolution of problems found in areas that are predominantly outside of computing.”
Data Mining AKA:
Information Harvesting
Knowledge Mining
Knowledge Discovery
Data Dredging
Data Pattern Processing
Data Archaeology
Database Mining
Siftware Analytics
Business Intelligence
And more…
A DECENT DEFINITION
• The process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of stored data, using pattern recognition technologies and statistical and mathematical techniques.
A number of technology skills are needed:
Data Mining
Database Management
Machine Learning
Artificial Intelligence
Analysis of Algorithms
Statistics
Visualization
Data Warehousing
Security
Technology Ethics
“In order to discover anything, you must be looking for something.”
Laws of Serendipity
I had to mine this data the hard way.
Concepts I won’t talk about today, but which are important to learn in a class on data mining.
Having fun “playing” with and mining data!
Visualization to gain insight and knowledge
David McCandless Data Visualization TED Talk
WEKA: the software
• Machine learning/data mining software written in Java (distributed under the GNU General Public License)
• Used for research, education, and applications
• Complements “Data Mining” by Witten & Frank
• Main features:
– Comprehensive set of data pre-processing tools, learning algorithms and evaluation methods
– Graphical user interfaces (incl. data visualization)
– Environment for comparing learning algorithms
@relation heart-disease-simplified

@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}

@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
...
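To make the ARFF layout above concrete, here is a minimal sketch of how such a file could be read outside WEKA. It is not WEKA's own loader: `parse_arff` is a hypothetical helper written for this illustration, and it assumes unquoted names, only numeric and nominal attributes, and dense comma-separated rows. The rows elided by "..." on the slide are left out rather than guessed.

```python
ARFF_TEXT = """\
@relation heart-disease-simplified

@attribute age numeric
@attribute sex { female, male}
@attribute chest_pain_type { typ_angina, asympt, non_anginal, atyp_angina}
@attribute cholesterol numeric
@attribute exercise_induced_angina { no, yes}
@attribute class { present, not_present}

@data
63,male,typ_angina,233,no,not_present
67,male,asympt,286,yes,present
67,male,asympt,229,yes,present
38,female,non_anginal,?,no,not_present
"""


def parse_arff(text):
    """Parse a simple ARFF string into (relation, attributes, rows).

    Hypothetical helper: handles only the subset of ARFF used on the slide.
    """
    relation = None
    attributes = []  # (name, "numeric") or (name, [nominal values])
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            row = {}
            for (name, kind), value in zip(attributes, values):
                if value == "?":              # "?" marks a missing value
                    row[name] = None
                elif kind == "numeric":
                    row[name] = float(value)
                else:
                    row[name] = value
            rows.append(row)
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                kind = [v.strip() for v in spec.strip("{}").split(",")]
            else:
                kind = spec.lower()
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(ARFF_TEXT)
print(relation, len(attributes), len(rows))
```

The header declares each column's name and type once, and every data row is then just a comma-separated record in that column order, which is what makes the format easy to load.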
WEKA only deals with “flat” files
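In practice, "flat" means that data spread across related tables has to be joined into a single table before WEKA can read it. The sketch below shows one way that could look. The two tables, the `patient_id` key, the relation name `patients-flat`, and the helpers `flatten` and `to_arff` are all hypothetical and exist only for this illustration.

```python
# Hypothetical normalized tables, linked by patient_id.
patients = [
    {"patient_id": 1, "age": 63, "sex": "male"},
    {"patient_id": 2, "age": 38, "sex": "female"},
]
lab_results = [
    {"patient_id": 1, "cholesterol": 233},
    {"patient_id": 2, "cholesterol": None},  # value not recorded
]


def flatten(patients, lab_results):
    """Join the two tables on patient_id into one row per lab result."""
    by_id = {p["patient_id"]: p for p in patients}
    flat = []
    for result in lab_results:
        patient = by_id[result["patient_id"]]
        flat.append({
            "age": patient["age"],
            "sex": patient["sex"],
            "cholesterol": result["cholesterol"],
        })
    return flat


def to_arff(relation, attributes, rows):
    """Render flat rows as ARFF text; None becomes the missing marker '?'."""
    lines = [f"@relation {relation}", ""]
    for name, spec in attributes:
        lines.append(f"@attribute {name} {spec}")
    lines += ["", "@data"]
    for row in rows:
        cells = ["?" if row[name] is None else str(row[name])
                 for name, _ in attributes]
        lines.append(",".join(cells))
    return "\n".join(lines)


flat_rows = flatten(patients, lab_results)
flat_attributes = [
    ("age", "numeric"),
    ("sex", "{female, male}"),
    ("cholesterol", "numeric"),
]
arff_output = to_arff("patients-flat", flat_attributes, flat_rows)
print(arff_output)
```

The join key itself is dropped from the output because it identifies a record rather than describing it, so it is usually not useful as a mining attribute.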
Tableau 8
Visual Analytics
Business Integration
Any Data
Fast Performance
Web & Mobile Authoring
Forecasting
Sets and visual groups
Shared Filters
Treemaps, bubble charts, word clouds
New marks card
Freeform dashboards
Data Blending improvements
Parallelized dashboards
Faster quick filters
Data Engine & Extract performance
Fast graphics and calculations
Performance recorder
Salesforce.com
Google Analytics & Google BigQuery
Cloudera Impala, Cassandra, HortonWorks, Hadapt, Karmasphere
SAP HANA
Data Extract API
JavaScript API
Data Server Security
Server Auditing
Distributed Data Engine
Web Authoring
iPad and Android authoring
Local rendering
Subscriptions
Tableau for Academia
Time to play!
Dan Matthews – Associate Professor – Trine University
[email protected]