MaxQDPro Team
Anjan.K, Harish.R
II Sem M.Tech CSE
04/12/23 MCSE 202 : Topics in DB Systems 1
Introduction to GNOME
Need for Mining
Mining Challenges
GNOME Data Access
◦ Database usage grid
◦ Components
◦ Features
GNOME Data Miner
Summary
References
GNOME is an acronym for GNU Network Object Model Environment.
An international project that provides software development frameworks, initially developed for the desktop environment.
A GNU project compatible with Unix-like operating systems; it sits on top of the kernel.
GNOME-DB
◦ Aims to provide a free, unified data access architecture to GNOME projects.
◦ Known for its good data management APIs.
The explosive growth of data: from terabytes to petabytes
◦ Data collection and data availability
  Automated data collection tools, database systems, Web, computerized society
◦ Major sources of abundant data
  Business: Web, e-commerce, transactions, stocks, …
  Science: remote sensing, bioinformatics, scientific simulation, …
  Society and everyone: news, digital cameras, YouTube
We are drowning in data, but starving for knowledge!
Data mining: automated analysis of massive data sets, also known as Knowledge Discovery in Databases (KDD).
Copyright © Han et al., “Data Mining and Warehousing” [2]
Data growth exceeds the designers' expectations
Data warehouses typically grow asynchronously.
Establishing the scalability of a system across its lifetime
Data is everywhere
Data is inconsistent
◦ Records are different in each system
◦ Noisy data
Performance issues
◦ Running queries that summarize data over a long stipulated period can push the operating system to maximum load
GNOME has its own tool for data access, similar to Microsoft's proprietary OLE DB.
The key issue in data access is the heterogeneity of data sources and the variety of access methods that each of them requires.
Access methods and SQL are not de facto standards across data sources.
GNOME-DB is middleware for accessing various data sources.
Libgda is the actual tool used for this purpose.
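The idea of middleware over heterogeneous sources can be sketched in a few lines. The example below is only an illustration in Python's standard library and does not use Libgda: the names `DataSource`, `SqliteSource`, `CsvSource`, and `count_rows` are hypothetical, chosen for this sketch. It assumes two backends, one reached through SQL and one that is a flat file, and shows client code working against a single interface.

```python
import csv
import io
import sqlite3
from typing import Iterable, Protocol


class DataSource(Protocol):
    """Hypothetical common interface; not part of Libgda."""

    def rows(self, table: str) -> Iterable[dict]:
        ...


class SqliteSource:
    """Provider for a relational source, reached through SQL."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self.connection = connection
        self.connection.row_factory = sqlite3.Row

    def rows(self, table: str) -> Iterable[dict]:
        # Table names cannot be bound as parameters, so validate first.
        if not table.isidentifier():
            raise ValueError(f"unsafe table name: {table!r}")
        for row in self.connection.execute(f"SELECT * FROM {table}"):
            yield dict(row)


class CsvSource:
    """Provider for a flat-file source, which has no SQL at all."""

    def __init__(self, tables: dict[str, str]) -> None:
        self.tables = tables  # table name -> CSV text

    def rows(self, table: str) -> Iterable[dict]:
        yield from csv.DictReader(io.StringIO(self.tables[table]))


def count_rows(source: DataSource, table: str) -> int:
    """Client code sees one interface, whatever the backend is."""
    return sum(1 for _ in source.rows(table))


# Illustrative data only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (item TEXT, qty INTEGER)")
db.executemany("INSERT INTO sales VALUES (?, ?)", [("pen", 3), ("ink", 1)])

flat = CsvSource({"sales": "item,qty\npen,3\nink,1\nnib,7\n"})

sqlite_count = count_rows(SqliteSource(db), "sales")
csv_count = count_rows(flat, "sales")
```

The design choice here is that each backend hides its own access method behind a provider class, so `count_rows` never needs to know whether it is talking to a database or a file.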
Consists of three major components
◦ Libgda (Library GNOME Data Access)
  Data abstraction layer
  Manages data stored in databases
  Interfaces with GLib and LibXML
  Can be used for non-GNOME applications
◦ Libgnomedb
  DB widget library
  Depends on GTK+
◦ Mergeant
  Front end for DB administrators and application developers
Easier access to several database engines
Metadata extractor
Easy-to-use APIs
Comes with console and graphical UI
Open source (General Public License)
Direct editing of DB data
Compatible with most programming languages
Distributed transactions are supported
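To make the "metadata extractor" feature concrete, here is a minimal sketch of what extracting schema metadata looks like. It assumes an SQLite database and uses only Python's standard `sqlite3` module, not Libgda; the function name `describe_schema` and the sample tables are hypothetical.

```python
import sqlite3


def describe_schema(connection: sqlite3.Connection) -> dict[str, list[tuple[str, str]]]:
    """Map each table name to its (column name, declared type) pairs."""
    schema = {}
    tables = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        columns = connection.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(name, col_type) for _, name, col_type, *_ in columns]
    return schema


# Illustrative tables only.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
schema = describe_schema(db)
```

Other engines expose the same information differently (for example through an information schema), which is the kind of variation a data access layer is meant to hide.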
Open source data mining tools: a collection of experimental GUI-based tools written in Python and GTK by Togaware
Uses GDA to access heterogeneous data sources
Builds the warehouse after the essential processing and transformation steps, with the help of flexible GNOME APIs
The GUI can be used for visual checks.
Used on Unix-variant systems such as Debian, Red Hat, and Ubuntu.
The mining system is generic, so it can be used for most routine work.
A newer GNOME-based data mining tool is Rattle.
Greening is a decision tree builder with stochastic boosting and random forests.
Some of the applications associated with GNOME Data Miner (GDM)
◦ Decision trees
◦ Apriori association rules for identifying frequent item sets
◦ Bayes classification: building a classifier from training data and using it to classify data
◦ Bar chart and binning chart
◦ GDM plot utility for Q-Q plots, histogram analysis, and correlation plots
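As an illustration of the Apriori item in the list above, the following is a minimal, unoptimized sketch of frequent item set mining in plain Python. It is not the GDM implementation; the function name `apriori`, the sample baskets, and the use of an absolute support count (rather than a fraction) are assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent item sets."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1 candidates: every distinct single item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    size = 1
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join step: merge frequent sets into candidates one item larger.
        joined = {a | b for a in level for b in level if len(a | b) == size}
        # Prune step: every (size - 1)-subset must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return frequent


# Illustrative market baskets only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
frequent_sets = apriori(baskets, min_support=2)
```

With a minimum support of 2, every single item and every pair is frequent in these baskets, while the three-item set appears in only one basket and is dropped. Association rules would then be derived from the frequent sets in a separate step.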
Introduction to GNOME
Need for mining
◦ KDD
◦ Challenges
GNOME Data Access
◦ Components
◦ Features
GNOME Data Mining
[1] An article at http://www.gnome.org
[2] Han et al., “Data Mining and Warehousing”, 2nd Edition
[3] An article at http://www.gnomedb.org
[4] An article on wikipedia.org