Chapitre de Montréal du TDWI - montrealtdwi.org · Chapitre de Montréal du TDWI ... a broad array...


TDWI Montreal Chapter

November 18, 2010

What is TDWI?

TDWI provides education, training, certification, news, and research for executives and information technology (IT) professionals worldwide.

Founded in 1995, TDWI is the premier educational institute for business intelligence and data warehousing.

Montreal Chapter Mission

Objectively inform the community about BI best practices and emerging trends in business intelligence.

Offer tailored training and promote networking and knowledge sharing, taking Quebec’s business culture into account.

Montreal Chapter Board

• Alain Bond, President

• Elise Lacoste, Vice President

• Yvan Dupont, Sponsor Coordinator

• Sylvie Fréchette, Board Member

• Stéphan Robitaille, Board Member

• David-Marc Petit, Board Member

• Frédéric Gingras, Board Member

Thank you to our sponsors!

Please turn off your cell phone ringer.

Agenda

08:00 - 08:10 Opening remarks

08:10 - 09:10 Introduction to Cloud Computing

By Simon Castonguay, CA, CISA, DIFA

Manager, Risk, Performance, Technology and Compliance

KPMG LLP

09:10 - 09:20 Break

09:20 - 10:20 BI in the Cloud: Risks, Benefits & Examples from the Real World

By Simon Castonguay

10:20 - 10:30 Parting words and door prizes

Simon Castonguay, CA, CISA, DIFA

Manager, Risk, Performance, Technology and Compliance

KPMG LLP

Simon Castonguay is a manager in the Forensic Accounting group of KPMG in Montreal. His work is primarily focused on data mining, financial modeling and IT-related support solutions in the context of investigations.

Simon is a Chartered Accountant and a Certified Information Systems Auditor. In August 2010, he completed the Diploma in Investigative and Forensic Accounting at UofT and conducted research on Cloud Computing and its impact on today’s business.

In addition to his Forensic & IT work, Simon has conducted internal audit, valuation, loss quantification and strategic risk assessment engagements.

Simon is on many boards, including the Montreal Youth Symphonic Orchestra, Force Jeunesse and the Fonds de développement du Collège Jean-de-Brébeuf.

He lives in Montreal with his fiancée.

BI in the Cloud: Concepts, Risk and Opportunities

Simon Castonguay, CA, CISA, DIFA

Manager, RPTC, KPMG LLP

[email protected]

Presentation Overview – Part 1

• What We All Know of Cloud Computing

• What is Cloud Computing?

– Is the Cloud Old News?

– Is It Different From What We Know?

• Proposed Definition

• Characteristics of the Cloud

• Types of Clouds

• How Can BI Leverage the Cloud?

Presentation Overview – Part 2

• Another Example

• Let’s Not Get too Excited

• Risks Associated with Cloud Computing

– Technological/Security

– Legal

• What Can be Done?

• What is the Future of BI in the Cloud?

• Conclusion and Question Period

What We All Know of Cloud Computing (1/2)

• Currently at its peak as a buzzword

• Popular topic on Google:

What We All Know of Cloud Computing (2/2)

• Very trendy at the moment

– Even Microsoft offers a “Cloud” version of Microsoft Windows:

Windows Azure http://www.microsoft.com/windowsazure

– Most popular web apps are cloud-based:

• Something to do with the Internet…

What is Cloud Computing?

• Wikipedia:

– Cloud computing is Internet-based computing, whereby shared resources, software, and information are provided to computers and other devices on demand, like the electricity grid.

• What experts say:

– “… a broad array of web-based services aimed at allowing users to obtain a wide range of functional capabilities…” – Jeff Kaplan

– “…the user-friendly version of Grid Computing.” – Trevor Doerksen

– “The interesting thing about cloud computing is that we've redefined cloud computing to include everything that we already do…” – Larry Ellison, CEO of Oracle

Is This All New?

• Not according to Larry Ellison…

“(…) The computer industry is the only industry that is more fashion-driven than women’s fashion. Maybe I’m an idiot, but I have no idea what anyone is talking about. What is it? It’s complete gibberish. It’s insane. When is this idiocy going to stop?”

• New solution for an old problem?

The CTSS (IBM 7094), at MIT in 1963: the first time-sharing system in the history of computer engineering…

Same Problem, Different Reality: CRAY XT5

Is It Different than What We Know?

Central computer (“node”) dispatches the processing to other computers and gets the results back (1 to n). Ideal for large tasks that require a lot of power.

No central computer: the resources are virtualized, generated and transformed dynamically to execute the tasks (n to n). Optimized for tasks of any nature and any size.

Ok… But What is Cloud Computing?

• Vaquero et al. (2009):

– Large pool of easily usable and accessible virtualized resources (such as hardware, development platforms and/or services).

– These resources can be dynamically reconfigured to adjust to a variable load (scale), allowing also for an optimum resource utilization.

– This pool of resources is typically exploited by a pay-per-use model…

– … in which guarantees are offered by the Infrastructure Provider by means of customized Service Level Agreements.

Hillary Clinton’s Schedule and the Cloud

Hillary Clinton’s schedule as First Lady was a 17,481-page non-searchable PDF. A Senior Engineer at the Washington Post tried to OCR the document to make it fully searchable. He soon realized that, with the newspaper’s computer, this would take a year to process.

Using Amazon’s EC2, he was able to gain access to more processing power and perform the task in a reasonable amount of time.

• 200 servers were allocated to the task.

• 1,407 virtual machine hours for processing

• Overall delay to give the document to the journalists: 26 hours

• Total cost for the whole operation: $144.62
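The figures above imply a rough unit cost and degree of parallelism. The following is a minimal arithmetic sketch that uses only the numbers quoted on the slide; the derived per-hour rate and per-server average are computed here and are not stated in the source, and the variable names are illustrative.

```python
# Figures quoted on the slide.
pages = 17_481
servers = 200
vm_hours = 1_407
total_cost_usd = 144.62

# Derived values (assumptions of this sketch, not stated in the source).
cost_per_vm_hour = total_cost_usd / vm_hours   # average price of one VM-hour
avg_hours_per_server = vm_hours / servers      # average busy time per server
pages_per_vm_hour = pages / vm_hours           # average OCR throughput

print(f"Cost per VM-hour:      ${cost_per_vm_hour:.4f}")
print(f"Avg hours per server:  {avg_hours_per_server:.2f}")
print(f"Pages per VM-hour:     {pages_per_vm_hour:.1f}")
```

Roughly ten cents per VM-hour and about seven hours of work per server is what made a job estimated at a year on one machine fit inside the 26-hour turnaround.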

The “Cloud Model”

Source: National Institute of Standards and Technology.


Characteristics of the Cloud (1/2)

• Service-oriented

– Interfaces abstract the implementation and enable a completely automated response to provide service to the customer

• Scalable and elastic

– Service can scale up or down.

– Elasticity is also an economic model…

• Shared (“pool of resources”)

– Focus on using IT services with maximum efficiency and economy of scale

Characteristics of the Cloud (2/2)

• Metered by use

– Implies a usage-accounting model for measuring the use of services, which could be used later to create different pricing plans and models.

– Pay-as-you-go, subscriptions, fixed plans, etc.

• Uses Internet Technology

– Uses Internet protocols, identifiers, formats, etc.

– IP, URL, HTTP, etc.
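The “metered by use” characteristic can be sketched as a small usage-accounting model: usage is recorded once, and different pricing plans are then computed from the same records. All rates, plan parameters and names below are hypothetical and chosen only for illustration; they are not any provider’s actual prices.

```python
from dataclasses import dataclass


@dataclass
class UsageRecord:
    customer: str
    resource: str    # e.g. "vm_hours" or "storage_gb_hours"
    quantity: float


# Hypothetical unit prices in USD.
RATES = {"vm_hours": 0.10, "storage_gb_hours": 0.0002}


def pay_as_you_go(records, customer):
    """Bill a customer strictly for metered usage."""
    return sum(RATES[r.resource] * r.quantity
               for r in records if r.customer == customer)


def fixed_plan(records, customer, monthly_fee, included_vm_hours, overage_rate):
    """Flat fee plus an overage charge on VM-hours beyond the included amount."""
    used = sum(r.quantity for r in records
               if r.customer == customer and r.resource == "vm_hours")
    overage = max(0.0, used - included_vm_hours)
    return monthly_fee + overage * overage_rate


records = [
    UsageRecord("acme", "vm_hours", 120),
    UsageRecord("acme", "storage_gb_hours", 50_000),
    UsageRecord("other", "vm_hours", 10),
]

payg = pay_as_you_go(records, "acme")
plan = fixed_plan(records, "acme", monthly_fee=15.0,
                  included_vm_hours=100, overage_rate=0.12)
print(f"Pay-as-you-go: ${payg:.2f}")
print(f"Fixed plan:    ${plan:.2f}")
```

The same usage records feed both plans, which is the point of separating metering from pricing.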


Service Models

USER

• APPLICATION (SaaS): the applications

• PLATFORM (PaaS): the computer

• INFRASTRUCTURE (IaaS): the components

SERVER

Types of Clouds

• Public Cloud: Google Apps, IBM Blue Cloud, Amazon EC2

• Private Cloud: unique provider, controlled resources, known controls

• Hybrid: processing of private data through public applications

Trick Question

Do you see a difference between a PRIVATE CLOUD and OUTSOURCING (as we know it)?

Cloud Computing Maturity Model

Source: GTSI

Enough with the Theory…

Let’s try to figure out how we could have BI in the Cloud! (?)

BI in the Cloud: A Growing Market

• Predicted annual growth rate: 22.4% through 2013

• Growth comes from two SaaS types:

– Specific analytical tools

• Optimization, Planning, Inventory analysis, etc.

– Self-service analytical tools

• Shared data, creation and visualization of customized reports

• Data warehouse in the Cloud: not there yet

• ETL in the Cloud: not quite either

Classical BI Model and Cloud

[Figure: classical BI architecture. Source systems (Financial, HR, CRM and SCM systems, operational systems, external sources, other reference data) feed data integration (ETL): data extract, data transfer, transport (pull/push), staging (raw data), standardize and cleanse, transform, load. The data lands in the Enterprise DW (« single version of the truth », with archived data older than 3 years), then in data marts (DM - cubes) sharing common dimensions and metadata. Access is through analytics and reporting tools: enterprise information portal, dashboards and scorecards, analytics and statistical tools, ad hoc analytical reports, pre-defined analytical reports and analytical applications.]

Cloud Computing and Data Warehouses (1/2)

Characteristic | Public Cloud | Conventional Data Center

Infinite computing resources on demand | Yes | No

Elimination of an up-front commitment by Cloud users | Yes | No

Ability to pay for use of computing resources on a short-term basis as needed | Yes | No

Economies of scale due to very large data centers | Yes | Usually not

Higher utilization by multiplexing of workloads from different organizations | Yes | Depends on the size of the company

Simplify operation and increase utilization via resource virtualization | Yes | No

Source: ARMBRUST, Michael et al. A View of Cloud Computing. Communications of the ACM. Vol. 53, No. 4, April 2010, pp. 50-58.

Cloud Computing and Data Warehouses (2/2)

Data warehouse limitation*: Because data must be extracted, transformed and loaded into the warehouse, there is an element of latency in the data warehouse.

How the Cloud can help: IaaS. Transfer and processing of data from source systems to the integration area to the data warehouse can be accelerated at will with proper virtualized components.

Data warehouse limitation*: Over their life, data warehouses can have high costs (hardware, maintenance, licenses, etc.).

How the Cloud can help: Pay-as-you-go & SOA. The client only pays for its usage of the resources in the Cloud; the provider is responsible for keeping the resources up to date at no additional cost.

Data warehouse limitation*: Data warehouse technologies can get outdated relatively quickly. There is a cost of delivering suboptimal information to the organization.

How the Cloud can help: Elasticity. An IT and economic concept where the right amount of resources is allocated to the task, no more than needed.

Data warehouse limitation*: Data warehouses are not the optimal environment for unstructured data.

How the Cloud can help: Data Analytics in the Cloud. Extensive capabilities allow for powerful and quick data analyses.

* According to Wikipedia (“Data Warehouse”)

DW in the Cloud: Challenges

• Get your data into the Cloud

• Traditional OLAP DWs don’t translate well into the Cloud

– But there are success stories:

• Netflix moved to the Cloud: from Oracle to S3 & SimpleDB

• Facebook: 15 PB in DW, uses Hive (Hadoop) to support BI needs

• Cloud is not a good platform for I/O-intensive applications like BI

– Solution: Solid-State Drives? SolidFire…

Evolution of BI Infrastructure

[Figure: resources needed and resources used, plotted over time. Annotations: “Too many resources vs. what is really needed” and “Just the right infrastructure”.]

• BLUE LINE: in-house infrastructure (as updated according to budgets, priorities, etc.)

• GREEN LINE: resources allocated by the Cloud (as updated according to… well… your needs…)
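The contrast between the two lines can be sketched numerically. The sketch below assumes a made-up monthly demand series, an in-house capacity that is resized only at periodic budget reviews (with headroom over the period’s peak), and a cloud capacity that simply tracks demand. The numbers, the review period and the headroom factor are all hypothetical.

```python
# Hypothetical monthly demand, in arbitrary capacity units.
demand = [10, 12, 15, 30, 22, 18, 40, 35, 25, 45, 50, 38]


def in_house_capacity(demand, review_every=6, headroom=1.5):
    """Capacity bought in steps: at each review, size for the period's peak plus headroom.

    Assumes, optimistically, that the peak of each period is forecast correctly.
    """
    capacity = []
    for start in range(0, len(demand), review_every):
        period = demand[start:start + review_every]
        level = max(period) * headroom
        capacity.extend([level] * len(period))
    return capacity


def elastic_capacity(demand):
    """Capacity allocated on demand: it tracks what is needed."""
    return list(demand)


def unused(capacity, demand):
    """Total capacity provisioned but not used over the whole series."""
    return sum(c - d for c, d in zip(capacity, demand))


blue = in_house_capacity(demand)
green = elastic_capacity(demand)
print("Unused in-house capacity:", unused(blue, demand))
print("Unused cloud capacity:   ", unused(green, demand))
```

Under these assumptions the stepwise line carries a large unused margin in most months, while the elastic line carries none; real cloud allocation would sit somewhere between the two.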

Break…


Another Example

Staging/ETL: Pentaho + Amazon EC2

PENTAHO

• Uses clusters to process ETL

• 1 main cluster + slaves (as necessary)

• Clustering is software-based and can be used with any infrastructure

AMAZON EC2

• Provides compute resources when needed, including:

− Servers

− IP addresses, etc.

• Low cost: 20 TB of Elastic Block Store (≈ SAN) for 1 hour = $2.80

• Ideal for large ETL batch processing (small window of time; lots of data to process)

• In practical terms (according to a Bayon Technologies study):

− Processing of 1G rows of raw data in 1 hour

− Overall cost: <$6.00
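The main-node-plus-workers pattern described above can be sketched in a few lines. This is not Pentaho’s API: it is a generic illustration in which threads stand in for worker servers, and the function names and the toy cleansing rule are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def transform(row):
    """Toy 'standardize and cleanse' step: tidy a name and parse an amount."""
    name, amount = row
    return name.strip().upper(), float(amount)


def process_chunk(chunk):
    """Work done by one worker on its share of the rows."""
    return [transform(row) for row in chunk]


def split(rows, n_workers):
    """Partition rows into at most n_workers contiguous chunks."""
    size = max(1, -(-len(rows) // n_workers))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def run_etl(rows, n_workers=4):
    """Main node: split the batch, fan out to workers, merge results in order."""
    chunks = split(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(process_chunk, chunks))
    return [row for chunk in results for row in chunk]


raw = [(" alice ", "10.5"), ("bob", "3"), (" carol", "7.25"), ("dave ", "0")]
loaded = run_etl(raw, n_workers=2)
print(loaded)
```

Because the batch is split into independent chunks, adding workers for a short processing window and releasing them afterwards is what makes the hourly pricing attractive for this kind of job.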

This is all nice…

BUT! (There’s always a “but”, isn’t there?)


Gartner’s Hype Cycle


The Cloud & BI (1/3)

• Where would the Cloud fit in the BI Framework?

[Figure: Enterprise BI Framework, from business to technical layers: Business Strategy Alignment; Governance; Performance Management Process & Reporting; Integrated Information Management; Business Intelligence Platform; Infrastructure. Shown beside the service-model stack: USER; APPLICATION (SaaS), the applications; PLATFORM (PaaS), the computer; INFRASTRUCTURE (IaaS), the components; SERVER.]

Gartner’s Hype Cycle: Details

Risks to be Considered (1/12)

Risks to be Considered (2/12)

• The CIA Triad

– Confidentiality

– Integrity

– Availability

• Privacy and Compliance

– Notice

– Choice

– Access

– Security

– Enforcement
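Of the three properties in the CIA triad, integrity is the easiest to illustrate in a few lines. The following is a minimal sketch, assuming the data owner records a digest before sending a file to the Cloud and compares it after retrieval; the function names are illustrative and are not part of the presentation.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_integrity(expected_digest: str, retrieved: bytes) -> bool:
    """Check that retrieved data still matches the digest recorded before upload."""
    return sha256_hex(retrieved) == expected_digest


original = b"customer_id,amount\n42,19.99\n"
digest_before_upload = sha256_hex(original)

# Later, after the data comes back from the provider:
intact = verify_integrity(digest_before_upload, original)
tampered = verify_integrity(digest_before_upload, original + b"43,0.01\n")
print(intact, tampered)
```

A digest only detects modification; it says nothing about confidentiality or availability, which need their own controls.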


Risks to be Considered (3/12)

• Segregation of duties & virtualization issues

– Modifications by unauthorized users

– Unauthorized or unintentional modifications by authorized users

– Authentication: “whodunit?”

– Policies and procedures governance

Risks to be Considered (4/12)


Risks to be Considered (5/12)

• Data lock-in

– 2009: Coghead case

*Source: http://www.zdnet.com/blog/collaboration/cloud-bursts-as-coghead-calls-it-quits/349

Risks to be Considered (6/12)

• Data deletion

– How can we be sure that the data that resides in the Cloud has been deleted according to our data life cycle management policies?

*Source: www.bbc.co.uk

Risks to be Considered (7/12)

• Confidentiality of data – from A to Z

– How can we be sure that sufficient and appropriate controls exist to ensure that the confidentiality of the data is not compromised at any time?

“The confidential records of millions of British gamblers who bet with top bookmaker Ladbrokes have been offered for sale to The Mail on Sunday.

The huge data theft is now at the centre of a criminal investigation after this newspaper was given the personal information of 10,000 Ladbrokes customers and offered access to its database of 4.5 million people in the UK and abroad.

Last night we alerted Ladbrokes to the damaging security breach and handed the customer files to the Information Commissioner's Office (ICO), Britain's data watchdog, which immediately began to investigate.”

Risks to be Considered (8/12)

• Applicable legislation

– Massachusetts General Law, Chapter 93H (“Privacy Law”)

• Applies to anyone who holds personal information on a Massachusetts resident

• Required controls: administrative, technical and physical

– European Union Directive 95/46/EC (“Data Protection Directive”)

• Applies whenever the “controller” of the data uses installations located within the European Union

• Information can be transferred to third countries only if they offer sufficient protection

Are we aware of our legal obligations when the data moves around the world?

Risks to be Considered (9/12)

• E.U. 95/46/EC – Penalties

– Enforced by countries of the E.U.

– E.g.: Netherlands*:

• “Certain violations of the DPA, such as non-compliance with a notification obligation, qualify as a criminal offence. The sanction is a penal fine up to a maximum of EUR 3,700. If the criminal offence was deliberate, the sanction is a penal fine of up to a maximum of EUR 7,400 or a prison sentence of up to a maximum of six months.”

• “In 2009, the Dutch Independent Post and Telecommunications Authority (“OPTA”) imposed a fine of EUR 250,000 on a natural person for sending unsolicited email in violation of the TA (see below). Together with this fine, OPTA issued an administrative order for a penalty sum of EUR 5,000 per day.”

*https://clientsites.linklaters.com/Clients/dataprotected/Pages/TheNetherlands.aspx

Risks to be Considered (10/12)

• E.U. 95/46/EC – Penalties (cont’d)

– France*:

• The Commission Nationale de l’Informatique et des Libertés may issue a wide array of penalties including: (i) a warning; (ii) a formal demand; (iii) the issuing of an injunction to cease processing; and (iv) financial sanctions of up to EUR 150,000 for the first breach (and up to EUR 300,000 in the case of a repeat breach within five years).

– U.K.**:

• The Information Commissioner will soon have the power to impose an administrative fine of up to £500,000 if (i) there is a serious breach of the data protection principles, (ii) this is likely to cause substantial damage or substantial distress, and (iii) the breach is deliberate or reckless. This power is expected to come into force in April 2010.

*https://clientsites.linklaters.com/Clients/dataprotected/Pages/France.aspx

** https://clientsites.linklaters.com/Clients/dataprotected/Pages/UnitedKingdom.aspx

Risks to be Considered (11/12)

• Service Level Agreement (“SLA”)

– Part of the solution?

• A well-designed, customized SLA should address an important part of the risks of doing business in the Cloud

• Think about the jurisdictions where your data could go: do you cover everything?

• Controls, controls, controls…

– Part of the problem?

Risks to be Considered (12/12)

Windows Azure Privacy Policy

Security of Your Personal Information

Microsoft is committed to protecting the security of your personal information and information collected for advertising purposes. We use a variety of security technologies and procedures to help protect your personal information from unauthorized access, use, or disclosure. For example, we store the personal information you provide on computer systems with limited access, which are located in controlled facilities.

Sharing of Your Personal Information

(…)

We occasionally hire other companies to provide limited services on our behalf, such as handling the processing and delivery of mailings, providing customer support, hosting websites, processing transactions, or performing statistical analysis of our services. Those service providers will be permitted to obtain only the personal information they need to deliver the service. They are required to maintain the confidentiality of the information and are prohibited from using it for any other purpose.

What Can Be Done? (1/4)

• Strategic thinking… at every stage!

– Doing business in the Cloud can be a game changer. Make sure you have the right people to perform the right tasks

– Follow best practices in business change implementation

• Get senior management on board

• Identify a champion

• Consider user acceptance

• Etc.

– It’s not just about IT!

[Figure: Enterprise BI Framework, from business to technical layers: Business Strategy Alignment; Governance; Performance Management Process & Reporting; Integrated Information Management; Business Intelligence Platform; Infrastructure.]

What Can Be Done? (2/4)

• Make sure accountability is clear

– The last thing you want to put in the Cloud is accountability

– Internal / External

– It’s 3:00 am and you have to make an important call… who do you want to pick it up?

• Follow the data

– Use access and processing logs

– Monitor (not fully mature, according to Gartner…)

– Make sure your provider is with you on that (SLA)
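“Follow the data” can be sketched as a simple review of access logs against an access policy. The log format, the user and dataset names, and the policy table below are all hypothetical; a real provider’s log format would have to be agreed on, for instance in the SLA.

```python
from collections import Counter

# Hypothetical log entries: (timestamp, user, action, dataset).
ACCESS_LOG = [
    ("2010-11-18T03:00:00", "alice", "read", "sales_dw"),
    ("2010-11-18T03:05:00", "mallory", "write", "sales_dw"),
    ("2010-11-18T03:07:00", "bob", "write", "sales_dw"),
    ("2010-11-18T03:09:00", "bob", "write", "hr_dw"),
]

# Hypothetical access policy: which users may touch which dataset.
AUTHORIZED = {"sales_dw": {"alice", "bob"}, "hr_dw": {"carol"}}


def unauthorized_entries(log, authorized):
    """Return the log entries whose user is not authorized for the dataset."""
    return [entry for entry in log
            if entry[1] not in authorized.get(entry[3], set())]


def writes_per_user(log):
    """Count modifications per user, to help answer 'whodunit?' after the fact."""
    return Counter(user for _, user, action, _ in log if action == "write")


flagged = unauthorized_entries(ACCESS_LOG, AUTHORIZED)
writes = writes_per_user(ACCESS_LOG)
for entry in flagged:
    print("Unauthorized access:", entry)
print("Writes per user:", dict(writes))
```

The review only works if the provider actually exposes such logs, which is why the slide ties monitoring back to the SLA.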


What Can Be Done? (3/4)

• Your provider should follow you… not the opposite!

– The Cloud is an enabler, but it shouldn’t change the core of your business strategy

– The Cloud should be an extension of your way of doing business: make sure it follows your policies and procedures (SLA)

• Plan for the unthinkable

– Cloud makes it easier to be litigation ready

– Have you thought of a disaster recovery plan?

What Can Be Done? (4/4)

• Manage the access

– Categorize your data: not everything is suited to the Cloud

• Public data

• Private data

• Sensitive data

• Confidential data

– A bad access management policy outside of the Cloud will not do better once the data is in the Cloud

• Have that thing audited: SAS70/CICA5970

– Are all the controls there and are they working well?

– Do they fit your needs?
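The data categories listed under “Manage the access” can be turned into a simple placement rule. The sketch below assumes the four categories are ordered from least to most restricted and that each hosting option has a maximum category it may hold; that ordering and the policy table are assumptions made for illustration, not recommendations from the presentation.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    """The four categories from the slide, ordered here by assumed restriction."""
    PUBLIC = 1
    PRIVATE = 2
    SENSITIVE = 3
    CONFIDENTIAL = 4


# Hypothetical policy: the most restricted category each hosting option may hold.
MAX_LEVEL = {
    "public_cloud": Sensitivity.PRIVATE,
    "private_cloud": Sensitivity.SENSITIVE,
    "on_premises": Sensitivity.CONFIDENTIAL,
}


def allowed_targets(level: Sensitivity) -> list:
    """List the hosting options permitted for data of the given category."""
    return [target for target, limit in MAX_LEVEL.items() if level <= limit]


for level in Sensitivity:
    print(f"{level.name:<13} -> {allowed_targets(level)}")
```

Encoding the rule once makes it auditable: the same table can be checked against where each dataset actually resides.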


Conclusion (1/2)

• Companies can now benefit from limitless IT resources for a small price. Some might even outsource important parts of their IT function into the cloud.

• Even for small companies, data will be stored in the cloud: the size of a company will no longer be linked to its capacity to spread data all around the world.

– You don’t have to be Wal-Mart to benefit from the Cloud and, therefore, from BI solutions

– Wider market for BI experts/consultants

Conclusion (2/2)

• From the insider’s perspective:

– Two questions to consider at any stage:

• Are you ready for the Cloud?

• Is the Cloud ready for you?

– Approach the Cloud like any other project:

• Business case

• Financial analysis

• Change management

• Etc.


Questions

Simon Castonguay, CA, CISA, DIFA Manager

Risk, Performance, Technology and Compliance

KPMG LLP

514-840-2570

[email protected]

Next meeting

We are pleased to announce that our next meeting will be

On the agenda:

To be announced...

TDWI Membership Offer

• 10% discount on TDWI Membership

– For TDWI Chapter attendees; Limited time only

– You can renew or extend your existing TDWI membership

• Benefits

– Discounts on conferences, seminars, certification, books

– Valuable reports and publications from TDWI Research

– Web archive of research

– Inquiry Service for Team Members

• Individual and team membership options

• Join today!

Environment

We recycle plastic name tags

Announcements

Let us know in advance and we can announce your BI event.

The presentation

The presentation will only be sent to people who are present today.

Parting words

Send comments and suggestions to

[email protected]

Thank you!