Business Intelligence



Page 1: Business Intelligence

Enterprise Data Warehouse

ABC University

Final Course Project

MIS 563 – Business Intelligence Systems

Professor Miriam Masullo

Ting Yin

February 22, 2015


Table of Contents

Introduction

Business Intelligence System Procedure

Project Requirements

Assumptions

Technical Infrastructure Enhancements

Project Requirements Definition Activities

Project Plan

Database Design

Snowflake Schema

Data Model

Extract/Transform/Load

Data Mining Tool

Conclusion


Introduction

ABC University has asked for a data warehouse that provides a unified view of information about its students, staff, and instructors. The school’s data is currently stored in multiple databases. The objective is to create an effective and efficient way to store, maintain, and retrieve the data. This paper describes the proposed data warehouse.

Data modeling, the database, and the modeling tools are examined in terms of Bill Inmon’s approach. The paper also discusses Informatica’s extract, transform, and load (ETL) tool, which produces clean data, and Oracle Data Mining (ODM), which was selected as the data mining tool.

The goal is to transition to a larger, unified system. A business intelligence (BI) system is expected to bring about the changes that will allow the school to stay competitive in the market.

BI Procedure

This section explains the BI procedures that will be followed during implementation and that will facilitate further improvement. The procedure begins with the business opportunity that must be addressed, and the discussion of the system continues throughout this paper.

Project Requirements

The enterprise data warehouse for ABC University will use the BI Application Release Concept. This BI model is used for software development and proceeds systematically from one phase to the next in a downward-flowing fashion. The model follows ten steps in order: 1) Business Opportunity, 2) Decision-Support Strategy, 3) Project Planning, 4) Strategic Information Requirements, 5) Business Analysis, 6) Design, 7) Development, 8) Testing, 9) Implementation, and 10) Release Evaluation (Atre, 8).


Educational business opportunities are the primary drivers for this academic BI application. The proposed BI applications are implemented under organization-wide BI design and development plans, incorporating and analyzing data across similar organizations and departments. BI decision-support requirements are strategic information requirements more than operational functional requirements, so analysis in BI projects emphasizes educational business analysis. Each BI application release is assessed and evaluated to promote iterative development.

Assumptions

It is assumed that all the computers involved in this project have access to the Internet. The databases and warehouses will not be accessible from computers without online access. As a second assumption, the participants have at least some basic training in business intelligence or related studies. They can follow directions and keep up with plans on their own without further training in BI.

Technical Infrastructure Enhancement (Atre, 120)

1) New database management system (DBMS) or upgrades to the existing DBMS

2) New development tools

3) New data access or reporting tools

4) New data mining tool

5) New metadata repository or enhancements to it

6) New network requirements


Project Requirements Definition

The BI activities will follow the path described in the diagram below. The BI project scope will be reviewed continuously to ensure that the objective remains achievable within the defined timeframe. Items 1 and 2 define the technical and non-technical enhancements. The requirements announcement will inform the participants in the BI project about the types of software and hardware that are needed. Items 3 and 4 address the reporting requirements and data sources requested by the business analysts. The data model and service level agreement will be developed after the scope is reviewed. Together, these items are compiled into a detailed requirements document that will be referred to throughout the initial release (Atre, 120).

Project Plan

A project plan has been organized to show the timeframe for carrying out this project. The specific business intelligence tasks are listed in the following table.


Database Design

I will apply Bill Inmon’s approach to designing the database. Inmon’s techniques and requirements can accommodate the needs of ABC University’s BI project. Inmon uses a “top-down” approach to data warehouse architecture. The dimensional data within the data warehouse will contain information about specific business processes. Using data marts to retrieve reports rapidly fits well with the university’s requirements.

Data marts that gather information from a centralized data repository will allow the

school to effectively and efficiently use the warehouse. There will be specific data marts for

students, faculty, and non-instructional staff. Each student record must contain a unique student

identification number that allows information about that student to be accessed. Each student

should have only one student identification number.

Unique student, faculty, and employee identification numbers will be used to connect the databases for reporting purposes. Because each number is unique, the Oracle database can identify and track the individuals who access data through the database. Additionally, the system can set up specific keys for these connections. Oracle is the database of choice for the school system, as it is well established and offers many modern tools.
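As a minimal sketch of how unique identification numbers might be enforced, the following example uses Python's built-in sqlite3 module as a stand-in for the Oracle database. The table name, column names, and sample values are hypothetical and are not part of the proposed design.

```python
import sqlite3


def create_student_table(conn):
    """Create a hypothetical student table keyed by a unique student ID."""
    conn.execute(
        """
        CREATE TABLE student (
            student_id INTEGER PRIMARY KEY,  -- exactly one ID per student
            full_name  TEXT NOT NULL
        )
        """
    )


def add_student(conn, student_id, full_name):
    """Insert a student; return False if the ID is already in use."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO student (student_id, full_name) VALUES (?, ?)",
                (student_id, full_name),
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key constraint rejected a duplicate ID.
        return False


conn = sqlite3.connect(":memory:")
create_student_table(conn)
first = add_student(conn, 1001, "Ada Example")
duplicate = add_student(conn, 1001, "Another Person")
student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(first, duplicate, student_count)
```

Declaring the identification number as the primary key lets the database itself refuse a second record with the same ID, rather than relying on application code to check.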

Snowflake Schema

A snowflake schema can be used to track internal data. The structure consists of a centralized database, which contains all information about students, staff, and faculty and points to other related structures that hold specific information. The links are as follows:

1. Centralized DB: a link that uses a primary identification key to access general information:

a. Name, address, phone number, and student, staff, and faculty ID numbers can be accessed in this way

2. A link that uses a primary identification key to access financial information:

a. A secondary key can be created to allow access to departmental information

b. The database will store information related to each department’s employees

3. A link that uses a primary instructor identification key to access faculty information:

a. Employee title, grade level, salary, start and end dates, office, and courses taught can be accessed in this way

b. A second key can be created to connect with a database that stores lists of students, classroom locations, and assigned textbooks

4. A link that uses the USI to access employee information:

a. Employee position, start date, and salary can be accessed in this way

b. A second key can be created to connect to departmental databases
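The linked structure listed above can be sketched as a small set of tables joined by keys. This is only an illustration under stated assumptions: sqlite3 stands in for Oracle, and every table, column, and sample row (person, employment, department, course) is hypothetical rather than the actual ABC University schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE department (
    department_id INTEGER PRIMARY KEY,
    name          TEXT NOT NULL
);
CREATE TABLE person (                 -- central table: general information
    person_id INTEGER PRIMARY KEY,
    full_name TEXT NOT NULL,
    address   TEXT,
    phone     TEXT,
    role      TEXT NOT NULL CHECK (role IN ('student', 'staff', 'faculty'))
);
CREATE TABLE employment (             -- employee details, linked by person_id
    person_id     INTEGER PRIMARY KEY REFERENCES person(person_id),
    department_id INTEGER NOT NULL REFERENCES department(department_id),
    title         TEXT NOT NULL,
    salary        REAL NOT NULL,
    start_date    TEXT NOT NULL
);
CREATE TABLE course (                 -- courses taught, linked to the instructor
    course_id     INTEGER PRIMARY KEY,
    title         TEXT NOT NULL,
    instructor_id INTEGER NOT NULL REFERENCES employment(person_id),
    classroom     TEXT
);
"""


def build_sample_db():
    """Create the hypothetical schema and load one made-up instructor."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the key links
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO department VALUES (1, 'Information Systems')")
    conn.execute(
        "INSERT INTO person VALUES (10, 'Pat Example', '1 Main St', '555-0100', 'faculty')"
    )
    conn.execute(
        "INSERT INTO employment VALUES (10, 1, 'Professor', 90000.0, '2010-08-15')"
    )
    conn.execute(
        "INSERT INTO course VALUES (563, 'Business Intelligence Systems', 10, 'Room 101')"
    )
    conn.commit()
    return conn


def courses_by_instructor(conn, person_id):
    """Follow the keys from the central table out to department and course."""
    return conn.execute(
        """
        SELECT p.full_name, d.name, c.title
        FROM person p
        JOIN employment e ON e.person_id = p.person_id
        JOIN department d ON d.department_id = e.department_id
        JOIN course c     ON c.instructor_id = e.person_id
        WHERE p.person_id = ?
        ORDER BY c.title
        """,
        (person_id,),
    ).fetchall()


conn = build_sample_db()
rows = courses_by_instructor(conn, 10)
print(rows)
```

Each outer table holds only its own attributes and reaches the rest through a key, which is the property that keeps redundancy low in this kind of layout.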

Data marts will be created based on the data about students, staff, and faculty that will be needed for reports. SQL will be the tool of choice for generating reports for upper management. These reports will help the school grow and develop by identifying areas that could be improved and changes that would reduce redundancy. Many of the potential improvements can come from the improved database structure or from physical changes.
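One way to picture a data mart that feeds SQL reports is as a pre-aggregated view over a warehouse table. The sketch below assumes sqlite3 as a stand-in for Oracle; the enrollment table, the student_mart view, and the sample figures are made up for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE enrollment (          -- hypothetical warehouse table
        student_id INTEGER NOT NULL,
        department TEXT NOT NULL,
        term       TEXT NOT NULL,
        credits    INTEGER NOT NULL
    );
    INSERT INTO enrollment VALUES
        (1, 'MIS',     '2015SP', 3),
        (1, 'MIS',     '2015SP', 3),
        (2, 'MIS',     '2015SP', 3),
        (3, 'Finance', '2015SP', 4);

    -- The "data mart": a report-friendly summary of the warehouse data
    CREATE VIEW student_mart AS
    SELECT department,
           term,
           COUNT(DISTINCT student_id) AS students,
           SUM(credits)               AS total_credits
    FROM enrollment
    GROUP BY department, term;
    """
)


def department_report(conn, term):
    """Return one summary row per department for the given term."""
    return conn.execute(
        "SELECT department, students, total_credits "
        "FROM student_mart WHERE term = ? ORDER BY department",
        (term,),
    ).fetchall()


report = department_report(conn, "2015SP")
for department, students, total_credits in report:
    print(f"{department}: {students} students, {total_credits} credits")
```

A report for upper management then reads from the summary rather than scanning the detailed records each time.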

Data Model

The following diagram shows a sample Entity Relationship Diagram (ERD) data model that can be used. The model shows the relationships among the data entities.

http://www.assignmenthelp.net/assignment_help/ER-diagram-for-institute

Extract/Transform/Load

Extract, transform, and load (ETL) is the process of extracting data from one database, manipulating that data, and then placing the resulting dataset into another database. After the data have been arranged, sorted, and analyzed, they can become an important tool for helping ABC University make better decisions. This makes the ETL process an integral part of any decision support system.
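The three ETL steps can be sketched in a few lines. This is a simplified illustration, not Informatica's implementation: two in-memory sqlite3 databases stand in for the source and target systems, and the table names (legacy_students, dim_student), columns, and sample rows are hypothetical.

```python
import sqlite3


def extract(source):
    """Extract: read raw rows from the source system."""
    return source.execute(
        "SELECT id, name, enrolled_on FROM legacy_students"
    ).fetchall()


def transform(rows):
    """Transform: tidy names and convert M/D/YYYY dates to YYYY-MM-DD."""
    cleaned = []
    for student_id, name, enrolled_on in rows:
        month, day, year = enrolled_on.split("/")
        iso_date = f"{year}-{month:0>2}-{day:0>2}"  # zero-pad month and day
        cleaned.append((student_id, name.strip().title(), iso_date))
    return cleaned


def load(target, rows):
    """Load: write the transformed rows into the target table."""
    with target:  # one transaction for the whole batch
        target.executemany(
            "INSERT INTO dim_student (student_id, full_name, enrolled_on) "
            "VALUES (?, ?, ?)",
            rows,
        )


# Hypothetical source system with inconsistently formatted data.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE legacy_students (id INTEGER, name TEXT, enrolled_on TEXT)")
source.executemany(
    "INSERT INTO legacy_students VALUES (?, ?, ?)",
    [(1, "  ada EXAMPLE ", "9/1/2014"), (2, "grace sample", "01/15/2015")],
)

# Hypothetical target warehouse table.
target = sqlite3.connect(":memory:")
target.execute(
    "CREATE TABLE dim_student (student_id INTEGER PRIMARY KEY, full_name TEXT, enrolled_on TEXT)"
)

load(target, transform(extract(source)))
loaded = target.execute("SELECT * FROM dim_student ORDER BY student_id").fetchall()
print(loaded)
```

Keeping the three steps as separate functions mirrors how an ETL tool separates source connections, mapping logic, and target loading.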


Many large organizations have accumulated many years of data. The data may have been derived from customer information that was originally gathered by old COBOL applications. These older systems can now be upgraded and their data combined into a series of data marts. The data need to be reconciled and organized so that the new systems can accept the information.

The ETL process follows a defined set of procedures to bring the data into a uniform format. Reformatting requires taking all the data sources and arranging the data in a format that can be used later for analysis. The process can combine the data and reduce the large number of duplicates that organizations have accumulated. Data cleansing is needed to eliminate incomplete data, orphan records, and any other dirty data. An ETL tool can provide a structured design, data cleansing, and support for operational resilience. Before any automated tool can be used, it is important to have a map of the target database to be created.
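The three kinds of dirty data named above (incomplete data, orphan records, and duplicates) can be illustrated with a short cleansing routine. This is a sketch in plain Python under stated assumptions; the record layout, the cleanse function, and the sample rows are hypothetical, and a real ETL tool would apply far richer rules.

```python
def cleanse(enrollments, known_student_ids):
    """Split rows into clean rows and rejected rows with a reason.

    A row is rejected when it is incomplete (missing ID or course),
    an orphan (its student ID is not in the master list), or a
    duplicate of a row that was already accepted.
    """
    clean, rejected, seen = [], [], set()
    for row in enrollments:
        student_id = row.get("student_id")
        course = row.get("course")
        if student_id is None or not course:
            rejected.append((row, "incomplete"))
        elif student_id not in known_student_ids:
            rejected.append((row, "orphan"))
        elif (student_id, course) in seen:
            rejected.append((row, "duplicate"))
        else:
            seen.add((student_id, course))
            clean.append(row)
    return clean, rejected


raw_rows = [
    {"student_id": 1, "course": "MIS 563"},
    {"student_id": 1, "course": "MIS 563"},   # duplicate of the first row
    {"student_id": 2, "course": ""},          # incomplete: no course
    {"student_id": 99, "course": "MIS 563"},  # orphan: unknown student
    {"student_id": 2, "course": "MIS 563"},
]

clean_rows, rejected_rows = cleanse(raw_rows, known_student_ids={1, 2})
reasons = [reason for _, reason in rejected_rows]
print(len(clean_rows), reasons)
```

Returning the rejected rows with a reason, instead of silently dropping them, makes it possible to review what was removed before the load.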

ETL Tool Selection

The ETL process is both demanding and intensive. An organization that performs an extraction may need a tool to cleanse the data and then complete the ETL process. However, no single product is best at everything. The database environment presents a wide spectrum of challenges, and those challenges may require ETL tools that focus on specific areas. Cost, experience, support, user interface (UI), and scalability are just some of the factors that the University must address when selecting an ETL tool.

Informatica

For as long as data have existed, there has been a need for data integration. As employers transitioned from traditional mainframes to a client/server setup and now to cloud computing, the BI process really began to grow. Informatica answered the call to provide solutions for organizations’ data integration. Informatica’s core business is focused on ETL, data masking, data quality, data replication, and information lifecycle management. The common thread is that all of these require an understanding of data. Informatica covers different aspects of data integration and has been involved in the development of cloud computing.

Informatica’s PowerCenter Express Enterprise

Given Informatica’s versatility and scalability, my selection is Informatica’s PowerCenter Express Enterprise. The PowerCenter Express solution comes in two editions, standard and enterprise. PowerCenter offers an end-to-end solution that can help organizations transfer data from older databases, ranging from old COBOL-based databases on the mainframe to current SQL databases. It then consolidates the data into one target data warehouse or data mart. PowerCenter offers a GUI that is simple enough for the novice and has been deemed extremely capable by experienced users. Informatica can read and clean data from multiple platforms. When Informatica is compared to Ascential, the performance results are impressive.

Summary: ETL Tools’ Effectiveness

Most companies that integrate databases find it necessary to use ETL tools. Even when a developer has in-depth ETL experience, using a tool can provide consistency. Developing a custom tool also carries custom costs. When the developer’s custom setup meets the users’ actual needs, issues may surface that in turn increase the cost. With a standard tool, the developer can follow industry standards that the next developer can build upon. As an organization grows and includes more data in its data warehouse, the savings can accumulate rather quickly.

Data Mining Tool

Once all the data are in one place, the next step is to find the useful information that can be extracted from the database. It is very important to understand how these data can be manipulated and used for the benefit of the University’s mission. To achieve this goal, we should choose the right tool to help users find patterns, hidden knowledge, and useful information within the data. Informatica has been chosen for ETL (extraction, transformation, and loading), and an Oracle tool has been chosen for data mining.

Choosing Oracle Data Mining (ODM) over other tools is a more cost-effective decision for both software and labor. ODM offers data mining functionality as native SQL functions within the Oracle database. As the tool of choice for data mining and reporting, it offers many algorithms that can be used to address a wide range of business problems. ABC University’s database already uses Oracle, so ODM fits the existing environment.

ODM also offers a graphical user interface (GUI) that shows users the data patterns and relationships resulting from data mining. The GUI enables data analysts to work with the data already stored in the university’s Oracle database, and it can assist with the university’s community initiatives by offering predictions and recommendations. The GUI offers user-friendly tools that help people explore the data graphically and create and evaluate multiple data mining models. It also applies ODM models to new data and delivers insights and predictions throughout the enterprise.
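ODM's algorithms run inside the Oracle database and are not reproduced here. As a rough sketch of the build-then-apply workflow described above, the following plain Python example learns a deliberately simple one-rule model from made-up historical records and applies it to new ones. The function names, the attendance and outcome fields, and all sample data are hypothetical.

```python
from collections import Counter, defaultdict


def build_model(history, feature, label):
    """Learn the most common label for each value of one feature."""
    counts = defaultdict(Counter)
    for record in history:
        counts[record[feature]][record[label]] += 1
    rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    # Fall back to the overall most common label for unseen feature values.
    default = Counter(r[label] for r in history).most_common(1)[0][0]
    return {"feature": feature, "rules": rules, "default": default}


def apply_model(model, record):
    """Predict a label for a new record using the learned rules."""
    return model["rules"].get(record[model["feature"]], model["default"])


# Made-up historical records, for illustration only.
history = [
    {"attendance": "high", "outcome": "completed"},
    {"attendance": "high", "outcome": "completed"},
    {"attendance": "high", "outcome": "withdrew"},
    {"attendance": "high", "outcome": "completed"},
    {"attendance": "low", "outcome": "withdrew"},
    {"attendance": "low", "outcome": "withdrew"},
    {"attendance": "low", "outcome": "completed"},
]

model = build_model(history, feature="attendance", label="outcome")
new_students = [
    {"attendance": "low"},
    {"attendance": "high"},
    {"attendance": "medium"},  # value not seen during model building
]
predictions = [apply_model(model, s) for s in new_students]
print(predictions)
```

The point of the sketch is the separation between building a model on existing data and applying it to new records; a production tool would use stronger algorithms and validate the model before use.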

Oracle’s SQL application programming interface (API) mines data inside the Oracle database and returns results in real time. The data, models, and results remain in the Oracle database, so data movement is minimized, data security is maximized, and information latency is shortened. Oracle serves over 400,000 (400 × 10³) customers in more than 100 countries and also provides cloud computing solutions. ODM comes with many other applications to handle the high technology demands of ABC University. ODM’s GUI is free of charge and is delivered with Oracle SQL Developer. The tool provides visibility into stored data, visualizes data graphically, and gives access to multiple data models. To help users learn the software faster, Oracle’s Live Virtual Class online training shortens the learning curve.

Conclusion

The BI system proposed for an enterprise-wide data warehouse has now been presented. This structure will offer a unified view of the school’s information, combining departmental information into a seamless data retrieval system. Newly developed tools can provide users with quick and reliable information in half the time of the current system. The proposed BI system is targeted to meet the school’s needs, and the prices of the additions are affordable for the University.

The proposal discussed here will position ABC University to gain a competitive advantage in business-related and academic processing. The BI system can provide the reporting, prediction, and analytics needed for the school to stay competitive in today’s market. It is intended to meet the school’s objectives and goals for upcoming students, instructors, and employees. The BI system will give users the tools they need to handle the situations that participants in the university may face, such as course registration, tuition, and billing.


References

Atre, S., & Moss, L. T. (2003). Business Intelligence Roadmap: The Complete Project Lifecycle for Decision-Support Applications. Pearson Learning Solutions. VitalBook file.

Berger, C. (2012). “Oracle Data Mining Blog.” Retrieved from https://blogs.oracle.com/datamining/tags/virtual

George, S. (2012). “Inmon vs. Kimball: Which approach is suitable for your data warehouse?” Retrieved from http://searchbusinessintelligence.techtarget.in/tip/Inmon-vs-Kimball-Which-approach-is-suitable-for-your-data-warehouse

Informatica. (2014). “Why Informatica? Why now?” Retrieved from http://www.informatica.com/Images/03045_6485_why-informatica.pdf

Oracle. (2014). “Oracle Data Mining.” Retrieved from http://www.oracle.com/

Oracle. (2014). “Oracle Database Developer Data Modeler.” Retrieved from http://www.oracle.com/technetwork/developer-tools/datamodeler/overview/index.html

TechTarget. (2014). “Informatica PowerCenter Real Time Edition (PowerCenter RTE).” Retrieved from http://searchdatamanagement.techtarget.com/review/Informatica-PowerCenter-Real-Time-Edition-PowerCenter-RTE
