Ying-Jie Chen/ SAP Taiwan July, 2012 · 7/19/2012 · 63% are exploring in-memory databases 50%...
Heterogeneous Data Integration in the Big Data Era
Ying-Jie Chen/ SAP Taiwan
July, 2012
© 2012 SAP AG. All rights reserved. 2
Legal Disclaimer
The information in this presentation is confidential and proprietary to SAP and may not be disclosed without
the permission of SAP. This presentation is not subject to your license agreement or any other service or
subscription agreement with SAP. SAP has no obligation to pursue any course of business outlined in this
document or any related presentation, or to develop or release any functionality mentioned therein. This
document, or any related presentation and SAP's strategy and possible future developments, products
and/or platforms directions and functionality are all subject to change and may be changed by SAP at any
time for any reason without notice. The information on this document is not a commitment, promise or legal
obligation to deliver any material, code or functionality. This document is provided without a warranty of any
kind, either express or implied, including but not limited to, the implied warranties of merchantability, fitness
for a particular purpose, or non-infringement. This document is for informational purposes and may not be
incorporated into a contract. SAP assumes no responsibility for errors or omissions in this document, and
shall have no liability for damages of any kind including without limitation direct, special, indirect, or
consequential damages that may result from the use of this document. This limitation shall not apply in
cases of intent or gross negligence.
All forward-looking statements are subject to various risks and uncertainties that could cause actual results
to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-
looking statements, which speak only as of their dates, and they should not be relied upon in making
purchasing decisions.
Big Data: how should large volumes of heterogeneous data be handled? The current state: everyone is trying their own approach
9 out of 10 organizations use relational databases
-- 93% are considering other options
63% are exploring in-memory databases
50% are exploring columnar databases
50% are exploring Hadoop
Source: The Challenge of Big Data: Benchmarking Large-Scale Data Management, Ventana Research, January 2012
Present: Big data applications?
Process: Azkaban, Oozie, Pig, Hive, Hadoop MapReduce, S4, Storm
Store: Voldemort, Cassandra, HBase
Ingest: Kafka, Flume, Scribe
Solutions to the "Big Data" challenge: a Warring States era of open-source software?
SAP real-time data platform
Layers: Present, Process, Store, Ingest
SAP Big Data Applications, SAP Analytics, SAP Mobile, Custom Apps, 3rd Party BI Client
SAP Business Suite, SAP Business Warehouse
SAP NetWeaver (On Premise / Cloud)
Open developer APIs and protocols
SAP HANA platform, SAP Sybase IQ, SAP Sybase ASE, SAP Sybase SQL Anywhere, SAP Sybase Event Stream Processor
MPP scale-out
Apache Hadoop, 3rd Party DB
SAP Solutions for Enterprise Information Management (EIM): SAP Sybase Replication Server, SAP Data Services, SAP Master Data Governance, SAP Master Data Management
Common landscape management
Common modeling: SAP Sybase PowerDesigner
SAP Real-time Data Platform: a hybrid architecture that combines the strengths of open-source and enterprise software
EIM = Air Traffic Control For Data
MOVE
IMPROVE
UNLOCK
GOVERN
One Runtime Architecture & Services
Business UI (Information Steward)
Unified Metadata
Technical UI (Data Services)
SAP BusinessObjects Data Services
ETL
Data Quality
Profiling
Text Analytics
One Administration Environment
(Scheduling, Security, User Management)
One Set of Source/Target Connectors
First and only, all-in-one solution for Data Integration, Data Quality, Data
Governance and Unstructured Data Processing.
A one-stop solution covering all data management needs: the SAP BusinessObjects Data Services single platform
Scorecard to measure DQ from a Data Steward’s perspective
Key Quality Dimensions (KPI for data)
Drill into scorecard details
Data quality score metrics
Latest quality score
Quality trend
Data quality dashboard integrated with ETL: volumes, quality KPIs, and trend analysis for all transformed data
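The scorecard idea on this slide (a score per quality dimension, the latest score, and its trend) can be sketched in a few lines. This is only an illustration, not how SAP Information Steward computes its scores: the rule names, the sample records, and the 0 to 10 scale are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical validation rules, one per quality dimension (illustrative only).
RULES: dict[str, Callable[[dict], bool]] = {
    "completeness": lambda r: all(
        r.get(k) not in (None, "") for k in ("id", "city", "postcode")
    ),
    "conformity": lambda r: str(r.get("postcode", "")).isdigit()
    and len(str(r.get("postcode", ""))) == 3,
}


def score_batch(records: list[dict]) -> dict[str, float]:
    """Return a 0-10 score per dimension: the share of records passing its rule."""
    if not records:
        return {name: 0.0 for name in RULES}
    return {
        name: round(10 * sum(rule(r) for r in records) / len(records), 2)
        for name, rule in RULES.items()
    }


def trend(previous: dict[str, float], latest: dict[str, float]) -> dict[str, float]:
    """Return the change in score per dimension between two ETL runs."""
    return {name: round(latest[name] - previous[name], 2) for name in latest}


# Two made-up ETL batches; the second one is cleaner than the first.
batch_1 = [
    {"id": 1, "city": "Taipei", "postcode": "106"},
    {"id": 2, "city": "", "postcode": "1O6"},  # empty city, letter O in postcode
]
batch_2 = [
    {"id": 3, "city": "Taipei", "postcode": "106"},
    {"id": 4, "city": "Taipei", "postcode": "110"},
]

s1 = score_batch(batch_1)
s2 = score_batch(batch_2)
print("previous:", s1)
print("latest:  ", s2)
print("trend:   ", trend(s1, s2))
```

Scoring each dimension as a pass ratio keeps the numbers comparable across batches of different sizes, which is what makes a trend line meaningful.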
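SAP Data Services data governance platform
Business lines: Banking, Insurance, Credit cards / Funds
Source systems: OS390, AS400, ERP, CRM, text reports / interchange files
Data flow components: ETL & Data Source, ODS, ETL, DW, Target DM, BO Universe, BO Deski/Webi reports
SAP BO BI Platform
Modeling: Sybase PD
End-to-end data governance solution: data dictionary extraction along the whole path from front-office transaction systems to the reporting layer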
Data lineage and impact analysis: fast, multidimensional lookup of report relationships
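Lineage and impact analysis are, at heart, two traversals of the same dependency graph: impact analysis walks downstream from a source, lineage walks upstream from a report. The sketch below illustrates that idea only. The object names are hypothetical, loosely modeled on the ERP/CRM, ODS, DW, Universe, and report layers named in this deck, and it does not reflect how SAP's metadata repository is actually implemented.

```python
from collections import deque

# Hypothetical lineage edges: each key feeds the objects listed as its value.
LINEAGE = {
    "ERP.orders": ["ODS.orders"],
    "CRM.customers": ["ODS.customers"],
    "ODS.orders": ["DW.sales_fact"],
    "ODS.customers": ["DW.customer_dim"],
    "DW.sales_fact": ["Universe.Sales"],
    "DW.customer_dim": ["Universe.Sales"],
    "Universe.Sales": ["Report.MonthlySales"],
}


def downstream(graph: dict[str, list[str]], start: str) -> list[str]:
    """Impact analysis: every object reachable from start, in breadth-first order."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order


def invert(graph: dict[str, list[str]]) -> dict[str, list[str]]:
    """Reverse every edge so that targets point back to their sources."""
    inverted: dict[str, list[str]] = {}
    for src, targets in graph.items():
        for tgt in targets:
            inverted.setdefault(tgt, []).append(src)
    return inverted


def upstream(graph: dict[str, list[str]], start: str) -> list[str]:
    """Lineage: every object that directly or indirectly feeds start."""
    return downstream(invert(graph), start)


print("Impact of ERP.orders:", downstream(LINEAGE, "ERP.orders"))
print("Lineage of Report.MonthlySales:", upstream(LINEAGE, "Report.MonthlySales"))
```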
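Enterprise business glossary wiki: business definitions and technical specifications for report fields and KPIs, with global linking and search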
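Extracted entities:
Date: August 2
City: Taipei
Nouns: especially serious fire accident, criminal punishment, cause of the accident
Numbers: 11.15, 6 (cases), one, 26 (people), 500,000, 5 (people)
Person names: 高偉忠, 姚亞明, 黃佩信
Government bodies: district court, state-owned enterprise
Other: construction project
Source text:
On August 2, first-instance verdicts were handed down in Taipei City in 6 criminal cases related to the "11.15" especially serious fire accident.
The Taipei District Court sentenced 高偉忠 and the other defendants, 26 in total, to penalties ranging from 16 years' imprisonment to exemption from criminal punishment.
The court held that abuse of power by 高偉忠, 姚亞明 and others was one of the causes of the accident. The court also found that
between 2004 and 2010, 高偉忠 and others used their official positions to help others win construction contracts and accepted
bribes: 高偉忠 took more than 500,000 yuan and 周建民 took 600,000 yuan, while 黃佩信 and 馬義鎊, using
their positions as public officials in state-owned enterprises to help others win construction contracts, took more than 2.5 million
yuan and more than 3.6 million yuan respectively. In addition, 支上邦 and 4 others (5 people in total) paid bribes to win construction contracts.
Extracting key information from unstructured text: a built-in text analytics engine for multiple languages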
106台北市信易路五段2306號四十五樓102室
106-台-北-市-信-易-路-五-段-2306-號-四-十-五-樓-102-室
106-台-北-市-信-義-區-信-義-路-5-段-2306-號-45-樓-102-室
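Last line process:
PostCode: 106
Country: 台灣 (Taiwan)
Region: 台灣 (Taiwan)
Reg Desc: 省 (province)
Locality1: 台北 (Taipei)
Loc1 Desc: 市 (city)
Locality2: 信義 (Xinyi)
Loc2 Desc: 區 (district)
Primary address process:
Street Name: 信義路5段
Street Type: 路 (road)
Street Num: 2306
Num Desc: 號 (number)
Building Name: 幸福大厦
Secondary address process:
Floor Num: 45
Floor Desc: 樓 (floor)
Unit Num: 102
Unit Desc: 室 (room)
Steps: Word Break, Normalization, Search & Match
Street address cleansing and householding: combined with location services and electronic maps to strengthen the company's operational advantage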
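The normalization and component-splitting steps shown on this slide can be illustrated with a small sketch. It assumes the address has already been corrected to the normalized form shown above, and it uses a single hand-written regular expression plus a tiny Chinese-numeral converter. Both are assumptions made for the example; a real address cleanser such as the one in SAP Data Services relies on postal reference data rather than one pattern.

```python
import re

# Illustrative pattern for one Taiwanese address layout (an assumption, not SAP's rules).
ADDRESS = re.compile(
    r"(?P<postcode>\d{3})"           # 3-digit postcode
    r"(?P<locality1>.+?[市縣])"       # city or county
    r"(?P<locality2>.+?區)"           # district
    r"(?P<street>.+?路(?:\d+段)?)"    # road, with an optional section number
    r"(?P<street_num>\d+)號"          # street number
    r"(?:(?P<floor>\d+)樓)?"          # optional floor
    r"(?:(?P<unit>\d+)室)?"           # optional unit
)

DIGITS = {"零": 0, "一": 1, "二": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}


def cn_to_int(numeral: str) -> int:
    """Convert a Chinese numeral below 100 (for example 四十五) to an int."""
    if "十" in numeral:
        tens, _, ones = numeral.partition("十")
        return (DIGITS[tens] if tens else 1) * 10 + (DIGITS[ones] if ones else 0)
    return DIGITS[numeral]


def parse_address(text: str) -> dict:
    """Split a normalized address into named components, or raise ValueError."""
    match = ADDRESS.fullmatch(text)
    if match is None:
        raise ValueError(f"address does not fit the expected layout: {text!r}")
    return match.groupdict()


parsed = parse_address("106台北市信義區信義路5段2306號45樓102室")
for field, value in parsed.items():
    print(f"{field}: {value}")
print(cn_to_int("五"), cn_to_int("四十五"))
```

Note that the pattern only splits components. Correcting the misspelled street name and adding the missing district, as the slide shows, needs a lookup against reference data, which is what the Search & Match step stands for.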
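1. Collect and store
2. Analyze and process
3. Relate and consume
On Demand Services
SAP Data Services Hadoop adapter: link unstructured information to existing enterprise applications and speed up the realization of Big Data value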
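Do the calculations and definitions comply with the latest regulations, regulator rules, customer requirements, and company policies?
Are the sales-analysis dimensions derived from website visits sufficiently representative?
How timely is the data? Does its quality meet the standard?
Which original transactions does this data come from? Are the relationships between them correct?
Regain confidence in enterprise reports: Big Data needs a new-generation SAP EIM enterprise data governance platform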