WhamTech SmartData Fabric® - Technical Overview

SmartData Fabric®: security-centric distributed virtual data, master data and graph data management, and analytics

SmartData Fabric®, aka Distributed Data Virtualization Platform (DDVP)

Technical Overview

January 2019

Revision 4.6 Copyright 2019 WhamTech, Inc.


Overview Sections


1. “It’s all about the data” – a general discussion of data-related business issues

2. Comparison among the three main conventional approaches to data integration

3. SmartData Fabric® EIQ Adapters™ for unconventional federated data access

4. SmartData Fabric® Architecture

5. Comparison between conventional approaches and vendors, and SmartData Fabric®


Overview Section 1 - “It’s all about the data” – a general discussion of data-related business issues


Most organizations face major data-related hurdles


Data is difficult

Dirty, Typo/Transposition, Missing, Meaning, Duplication, Obfuscation, Governance, Location, System, Access, Security, Container, Format, Age


Regardless, applications* need clean and understood data in a specific format

*Reporting, BI, analytics, CDI-MDM, CRM, SCM, fraud detection, anti-money laundering, ERP, etc.


New World (1 of 2)

• Analytics is king
  − Problems are:
    − On one side… cannot copy or move ALL data to a data warehouse, Big Data and/or Cloud
    − On the other side… cannot work with multiple, disparate and difficult data sources - mainframes, legacy, unstructured, etc.
• Big Data can be defined as any and all data that an enterprise owns or has access to
  − Volume, velocity, variety, veracity and value
• Big Data and Cloud are here to stay – lower cost, indefinite scalability and ease of access
  − Problems are:
    − Requires new and specialized applications
    − New forms of data warehouse?
    − Now, Big Data silos
    − How to integrate and interact with enterprise operational/transactional systems?
    − Hybrid Cloud elusive


New World (2 of 2)

• Real-time

− Streaming, IoT devices, operational BI/analytics, event processing, event correlation, anomaly detection, DoD, intelligence, etc.

• Increasing efficiencies

− Cost savings, single 360 customer/patient/employee/etc. views, fraud prevention, etc.

• Increasing sales

− Up-sell and cross-sell existing customers and gain new customers

• Increasing regulations and compliance

− SOX, HIPAA, Dodd-Frank, SARs, GDPR, etc.

• Increasing M&A/consolidation


Three main approaches to data access and integration

1. Copy all data to a single data store – DATA WAREHOUSE/BIG DATA/DATA LAKE
  − Structured with clean-up and schema transform
    • Data warehouse, data mart and some Big Data
    • After Data Lake/Reservoir -> Data Refinery -> Analytics Database
  − Unstructured
    • Big Data, e.g., Hadoop

2. Leave data where it is and submit queries that attempt to accommodate system, access and data issues – FEDERATED DATA ACCESS
  − Also Web/data services, etc.

3. Leave data where it is or copy it to a repository and provide a search index – SEARCH/BIG DATA
  − Enterprise search, Web search, Elasticsearch, Solr, etc.


Each approach has its advantages and disadvantages

Any approach has the same hurdles to overcome


Three main hurdles to overcome

1. Cultural

2. Security and Privacy

3. Technical

Ideal approach helps lower all hurdles


Notes on Big Data

• At some point, unstructured data has to be “structured”, i.e., meaning has to be assigned for reporting, BI and analytics
  − Either by a machine through parsing and mapping, entity extraction, machine learning, etc.
  − Or by a human, e.g., data engineer, scientist or analyst
• Schemaless data still has structure/columns – maybe just one large table
  − Either with assigned meaning headers
  − Or without, and a machine or human has to assign meaning
• Big Table storage is inefficient, simplistic and leads to large storage volumes vs. relational storage
  − Fast for simple parallel query processing
  − Slow for more complex queries involving joins/relationships – therefore the rise of separate analytics databases/appliances and graph databases
  − Slow for updating existing data
• Big Data “Lakes/Reservoirs”, “Refineries” and “Analytics Databases” are similar to operational data stores, ETL and data warehouses/data marts, respectively


Overview Section 2 - Comparison among the three main conventional approaches to data integration


Three main approaches to data access and integration

1. DATA WAREHOUSE

2. FEDERATED DATA ACCESS

3. SEARCH/BIG DATA


Search/Big Data

• OF VALUE, in addition to structured data approaches, even search on structured data

• Not the focus of the following discussion, as it is LESS OF AN OPTION FOR STRUCTURED DATA

► For comparison with WhamTech SmartData Fabric® EIQ Adapters, FOCUS ON DATA WAREHOUSE and FEDERATED DATA ACCESS WITH CONVENTIONAL ADAPTERS


Data warehouse

[Diagram: each Data Source passes through Extract, Data and Schema Transform, and Load into the Data Warehouse, which serves the Application(s).]

Queries resolved in the Data Warehouse

Expensive in terms of time and cost to implement and maintain


Data warehouse

ADVANTAGES

• High query success
  – Data clean and usable
  – Consistent and multiple indexes across data
  – Complete control over query processing
• No load on, or interference with, data source systems
• High performance
• Pre-aggregated and pre-calculated fields
• Good for archive
• Security access – row, column (and data element)
• Data sources not aware of queries

DISADVANTAGES

• Data stored elsewhere
  – Responsibility, accountability, security, privacy, regulatory and legal issues
• One-size-fits-all data schema
• Expensive and time-intensive ETL
• Updates – frequency & cost?
• Typically, cannot actively monitor data sources
• Drill-down may not be possible
• Complete additional system cost, including storage
• Usually need additional data marts/analytic databases


Federated data access with conventional adapters

[Diagram: Application(s), Middleware, and an Adapter/Connector for each Data Source.]

Queries resolved mainly at the data source

Expensive to implement and limits capabilities


Federated data access with conventional adapters

ADVANTAGES

• Data remains at source
• Latest data
• Can be OK for standard app data sources or well-governed systems with good control
• Plug-and-play in existing architectures
• Little or no storage required

DISADVANTAGES

• Low query success
  – Data not clean and in some cases, unusable
  – Available indexes only
  – Limited query processing
• Load on data source systems
• Query performance
• Queries simplified/hard-wired
• No pre-aggregated, pre-calculated or other indexes
• Cannot actively monitor data sources
• Expensive and time-intensive adapters
• Time (and cost) to add new data sources
• Data sources aware of queries
• No archive
• No results if data source unavailable


Overview Section 3 - SmartData Fabric® EIQ Adapters™ for unconventional federated data access


What is SmartData Fabric®?

• Primarily consists of External, Index and Query (EIQ) Adapters™, which are DATALESS data source-specific indexed adapters

… that combine the best of and overcome the worst of…

data warehouse, conventional federated adapters and search

… AND offer MORE

• Capabilities
• Rapid implementation
• Cost effectiveness
• Flexibility
• Plug-and-play in existing IT architectures
• Complement and leverage existing IT systems, tools and applications
• Data security layer for all data sources and access


Initial EIQ Adapter configuration, index build and data view mapping

[Diagram. Recoverable labels:]

• Automated Data Discovery and Classification (ADDC) thus far
• Network Asset and Device Discovery
• Data Source Discovery
• Metadata Discovery and Semantic Mapping
• Data Classification and Data Security
• Data Discovery and raw index-based Data Profiling
• Develop and test Data Transforms using profiles
• Data Source; Data Read, Transform/clean-up (and Index); EIQ Adapter* w/SDV**; EIQ Indexes
• Twelve ways to build and maintain indexes
• Index schema and names usually same as data source
• Indexes usually do not store data – only queryable representations
• Indexes mapped to SDV
• Alternate use of raw indexes to initially build EIQ Indexes
• Distributed Metadata Repository, incl. Data Governance

*EIQ SuperAdapter and EIQ TurboAdapter

**Standard Data View


EIQ Adapter index update, query and results retrieval

[Diagram. Recoverable labels:]

• Application(s); Middleware; EIQ Federation Server (sub-middleware) w/SDV; EIQ Federation Server; EIQ Server (sub-Middleware); EIQ Adapter* w/SDV**; EIQ Indexes; Data Source; Multiple other data sources
• Applications / middleware connect with standard drivers or Web Services and SQL***
• Queries resolved in the EIQ Adapter and EIQ Indexes
• Result-set pointers to data in source
• User-level access
• Raw results data usually transformed/cleaned-up from source
• Results provided in almost any format
• Data Read, Transform/clean-up (and Index)
• Continual EIQ Indexes updates
• Distributed Metadata Repository, incl. Data Governance

*EIQ SuperAdapter and EIQ TurboAdapter

**Standard Data View

***Future OQL, SPARQL and NoSQL options


How EIQ Adapters address data issues (1 of 2)

• Discover data sources

• Read and profile data using raw indexes

• Develop data quality transforms from data profiles

• Create and maintain indexes external to data sources (EIQ indexes), as follows:

− Read source data using one or more of twelve ways

− Clean, transform and standardize data used for indexes – discard data

− Index schemas and names same as data sources – no major schema transforms

• Map standard data view to EIQ indexes

− Can have more than one standard data view

• Applications/middleware access any and all EIQ Adapters as though a single database through standard drivers, APIs, and Web and data services

Green represents features unique to WhamTech


How EIQ Adapters address data issues (2 of 2)

• Present a single virtual indexed view of all data sources to applications based on standard data view
  ‒ Normally flat – can be relational, ontological, data object or business object
• Execute SQL for structured database queries and unstructured search almost 100% in EIQ indexes
• Generate a list of pointers, URLs, RDFs, file positions, etc., to raw results data in data sources
• Retrieve raw results data, using pointers, from sources through user-level access with appropriate authentication and security
• Clean, transform and standardize raw results data – optionally, not
• Present results to applications/middleware in any format
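
The steps on these two slides can be sketched in a few lines of Python. This is an illustration only and not WhamTech's implementation: the `ExternalIndex` class, the `standardize` clean-up rule, the `retrieve` helper and the in-memory `source` list are all hypothetical stand-ins. The sketch shows the general idea of an index kept outside the data source that stores cleaned keys and pointers rather than the data itself, resolves the query in the index, and then fetches raw rows from the source by pointer.

```python
from collections import defaultdict


def standardize(value):
    """Hypothetical clean-up rule: trim, collapse whitespace, lower-case."""
    return " ".join(str(value).split()).lower()


class ExternalIndex:
    """Toy index kept outside the data source: cleaned value -> row pointers."""

    def __init__(self):
        self._postings = defaultdict(lambda: defaultdict(set))

    def build(self, rows):
        # Read source rows, clean each value, keep only pointers (row positions).
        for pointer, row in enumerate(rows):
            for column, value in row.items():
                self._postings[column][standardize(value)].add(pointer)

    def query(self, **criteria):
        # Resolve an AND of equality predicates entirely in the index.
        result = None
        for column, value in criteria.items():
            hits = self._postings[column].get(standardize(value), set())
            result = set(hits) if result is None else result & hits
        return sorted(result or [])


def retrieve(rows, pointers, clean=True):
    # Fetch raw results from the source by pointer; clean-up is optional.
    fetched = [rows[p] for p in pointers]
    if clean:
        return [{k: standardize(v) for k, v in r.items()} for r in fetched]
    return fetched


# Stand-in for a data source with inconsistent formatting.
source = [
    {"name": "  Ann  SMITH ", "city": "Dallas"},
    {"name": "Bob Jones", "city": "dallas "},
    {"name": "ann smith", "city": "Austin"},
]

index = ExternalIndex()
index.build(source)
pointers = index.query(name="Ann Smith", city="DALLAS")
results = retrieve(source, pointers)
```

Because both the indexed values and the query terms pass through the same `standardize` rule, the query matches the first row even though the source stores it with different spacing and case. The source itself is touched only in `retrieve`, and only for the rows the index has already selected.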


SDF EIQ Adapter index and query process

[Diagram. Recoverable labels:]

• Stages: DISCOVERY; INITIAL INDEX BUILD; CONTINUOUS INDEX UPDATE; QUERY PROCESSING; RESULTS RETRIEVAL
• Components: EIQ Product front-end; Data Discovery; Data Profiler; Read Transform Index (RTI) Tool; Update Server; EIQ Indexes; Data Transforms/clean-ups; Data Retrieval; Automatic Query Processing; Data source-specific Query Transform; Application to Standard Data View Mapping; Standard Data View Mapping to EIQ Indexes; EIQ Adapter; EIQ Federation Server; Other data source EIQ Adapters and EIQ Federation Servers; Data Source; BI / Analytics / Application(s)
• Connections: CONVENTIONAL DRIVER OR BULK LOAD; USER API / DRIVER; STANDARD DRIVER; SQL; Transaction Log; MESSAGE QUEUE; Result-set data source pointers
• Arrow labels: DEVELOP and TEST; USED BY; BUILD


Index updates/changed data capture

Results Level

• Batch updates (flat file export)
• Incremental updates (flat file export)
• Polling*
• Update / event notifications*

Data Schema Level

• Triggers
• Transaction / change / redo logs
• Existing replication / backup / change data capture processes
• Batch updates (schema file export)
• Incremental updates (schema file export)

Either Data Schema Level or Results Level

• Crawler / spider
• Message queues
• RSS feeds*

* = User-level access

LEGEND: Near real-time – low rate; Near real-time – high rate; Batch / incremental – high volume; Batch / incremental – low volume; Preferred option

Axis: DECREASING INTRUSIVENESS
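
As one illustration of the options listed on this slide, the sketch below shows polling-based incremental index maintenance. It is a toy under stated assumptions, not the product's update mechanism: the `last_modified` column, the `poll_changes` and `apply_changes` helpers and the dictionary-based index are all hypothetical. Each poll asks the source only for rows changed since a high-water mark, re-indexes those rows, and advances the mark.

```python
def poll_changes(rows, since):
    """Return rows whose (hypothetical) last_modified stamp is newer than `since`."""
    return [r for r in rows if r["last_modified"] > since]


def apply_changes(index, key_of, changes, watermark):
    """Re-index changed rows and return the advanced high-water mark."""
    for row in changes:
        row_id = row["id"]
        # Drop the old index entry for this row, if any, before adding the new one.
        old_key = key_of.pop(row_id, None)
        if old_key is not None:
            index[old_key].discard(row_id)
            if not index[old_key]:
                del index[old_key]
        new_key = row["status"].strip().lower()
        index.setdefault(new_key, set()).add(row_id)
        key_of[row_id] = new_key
        watermark = max(watermark, row["last_modified"])
    return watermark


# Stand-in for a data source table.
source = [
    {"id": 1, "status": "Open", "last_modified": 10},
    {"id": 2, "status": "open ", "last_modified": 11},
]
index, key_of = {}, {}
watermark = apply_changes(index, key_of, poll_changes(source, 0), 0)

# The source changes; the next poll picks up only the modified row.
source[0].update(status="CLOSED", last_modified=12)
changed = poll_changes(source, watermark)
watermark = apply_changes(index, key_of, changed, watermark)
```

A timestamp poll of this kind needs only read access to the source, but it cannot see deleted rows and is only as fresh as the polling interval. Log-based or trigger-based capture avoids those limits at the cost of deeper access to the source system.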


Example multiple data source SDF configuration

[Diagram. Recoverable labels:]

• Application(s); WhamTech ODBC/JDBC Driver, APIs, Web/data services; TCP / IP
• FIREWALL (two)
• EIQ Federation Server (three)
• EIQ SuperAdapter with Indexes for: Social Media Feed; Hadoop; Mainframe; RDBMS
• EIQ Conventional Adapter; 3rd Party Adapter; Salesforce; ERP System

• Adapters and federation servers independently configurable and accessible at multiple levels

• Potential LIFO/FIFO query processing


Example shared-nothing architecture

[Diagram labels: Application(s); EIQ Federation Server (several); EIQ SuperAdapter (several); Indexes; Data Source]

Indexes can be multiple sharded segments or replicated copies

Out-of-the-box configurable backup, failover and load balancing = high availability
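
A minimal sketch of how a federated lookup over sharded and replicated index segments might work is given below. It is an assumption-based illustration, not a description of the EIQ Federation Server: the `Shard` class, the `lookup_with_failover` and `federated_lookup` functions and the segment names are hypothetical. Each segment is a list of replicas; the query fans out to all segments in parallel, falls back to the next replica when one is unavailable, and merges the returned pointers.

```python
from concurrent.futures import ThreadPoolExecutor


class Shard:
    """Hypothetical index segment: maps a key to pointers into one data source."""

    def __init__(self, name, postings, available=True):
        self.name = name
        self.postings = postings
        self.available = available

    def lookup(self, key):
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return [(self.name, p) for p in self.postings.get(key, [])]


def lookup_with_failover(replicas, key):
    # Try each replicated copy of a segment in turn until one answers.
    for replica in replicas:
        try:
            return replica.lookup(key)
        except ConnectionError:
            continue
    return []


def federated_lookup(segments, key):
    # Fan the query out to every segment in parallel and merge the pointer lists.
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        parts = pool.map(lambda reps: lookup_with_failover(reps, key), segments)
        return sorted(hit for part in parts for hit in part)


# Two segments: the first has a failed primary and a healthy replica.
segments = [
    [Shard("crm-a", {"smith": [3]}, available=False), Shard("crm-b", {"smith": [3]})],
    [Shard("erp-a", {"smith": [7, 9]})],
]
hits = federated_lookup(segments, "smith")
```

Nothing is shared between segments here, so adding a segment or a replica changes only the `segments` list. A real deployment would also need load balancing across healthy replicas, which this sketch leaves out.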


EIQ Adapters for federated data access

ADVANTAGES

• High query success
  – Indexes/results data clean and usable
  – Consistent and multiple indexes across disparate data sources
  – Complete control over query processing
• Data remains at source
• Latest data
• Almost no load on systems
• No major schema transforms
• Any and multiple data sources
• Plug-and-play in existing architectures
• Actively monitor indexes, therefore, data sources
• Real-time updateable hierarchical indexed views
• Rapid query response
• Denormalized indexes
• Advanced text search
• Master data indexes
• Entity Extraction
• Other text tools – categorization, POS, sentiment, etc.
• Fuzzy matching
• Link Indexes™ for performance and link analysis
• Highly flexible
• Row, column (and data element) security indexes
• Data masking, tokenization and encryption options
• User-level access to data sources
• Results from indexes if data sources unavailable
  − Indexes serve as compressed queryable storage for IoT devices
• Data sources only aware of low-level results requests

DISADVANTAGES

• Establishing index updates
• Indexes require storage


Main EIQ Adapter add-ons

• WhamSearch advanced text search – SQL-based and can be combined with structured queries
• WhamEE entity extraction – enables open source GATE
• Data Security Layer – support for AD/LDAP, RBAC, SSO, IAM and RLS, file encryption, access, advanced (basic, included) data masking, tokenization and encryption, indexes and/or packet encryption over and above SSL
• Link Indexes™ - link mapping and link analysis (required for master data management (MDM))
• MDM – seamlessly and automatically combine (optionally, distributed) master data with operational/transactional data
• Hadoop HDFS Smart Connector - external indexing, standard driver access and SQL query processing for Hadoop data at HBase/Hive and HDFS levels
• Mainframe Data File (MDF) Smart Connector - external indexing, standard driver access and SQL query processing for MDF data at the file and block levels
• Highly interactive link visualization/graph database through OEM Keylines


Use cases – can be combined

Query enabler – where data sources may…

• Not process SQL queries, e.g., archive files or application-associated

• Need schema changes and/or indexes to process SQL (or other QL) queries

• Need data transformation, entity extraction, advanced text search or other processing

Query enhancer – where data sources may…

• Need independent indexes and/or indexed views to accelerate queries

• Need data cleansing and standardization to improve query success

• Be at capacity and cannot support additional external queries

Query federator – where data sources may…

• Need to be integrated with each other and existing systems without creating a data warehouse

• Not be moved or copied – data remains in source

• Need real-time indexes and queries, e.g., for operational BI/analytics


Overview Section 5 - Comparison between conventional approaches and vendors, and SmartData Fabric®


Overview Section 4 - SmartData Fabric® Architecture


SDF better than or as good as alternatives

No. | Feature | SmartData Fabric® | Data Warehouse | Conventional Federated Data | Search | Data Lake

1 Minimal time to implement and add new sources ✓ ✓

2 Relatively low cost and high ROI ✓

3 Flexibility of use ✓ ✓

4 Actively monitor data sources ✓ () ✓ or

5 Unstructured data ✓ ✓ ✓

6 Unlimited query options and performance ✓ (✓)*

7 Denormalized views ✓ (✓)* ()

8 Relationship/link mapping ✓ ✓ or ✓ or

9 Write back to data sources ✓ ✓ or

10 No major schema transforms ✓ ✓ or ✓

11 Data source changes readily accommodated ✓ ✓ ✓

12 Full text search ✓ ✓

*with data marts

Page 39: WhamTech SmartData Fabric - Technical Overview… · SmartData Fabric® security-centric distributed virtual data, master data and graph data management, and analytics Notes on

SDF combines best of and overcomes worst of alternatives

No. | Feature | SmartData Fabric® | Data Warehouse | Conventional Federated Data Search | Data Lake

13 Clean and usable data ✓ ✓

14 Consistent and multiple indexes and types ✓ ✓ (✓)

15 Pre-aggregated, calculated and join views ✓ ✓

16 Results when data sources unavailable ✓ ✓ ✓ or ✓

17 Row, column and data element security ✓ ✓ (✓)

18 Install nothing on data source systems ✓ ✓ ✓ or ✓ ✓

19 Structured data ✓ ✓ ✓ ✓ or ✓

20 Data stays in original format ✓ ✓ ✓

21 Data remains in source ✓ ✓ ✓ or

22 User-level access to source data ✓ ✓

23 Latest data available ✓ ✓

24 Drill-down capability ✓ ✓ (✓)

Page 40: WhamTech SmartData Fabric - Technical Overview… · SmartData Fabric® security-centric distributed virtual data, master data and graph data management, and analytics Notes on

SDF slightly disadvantaged compared to alternatives

No. | Feature | SmartData Fabric® | Data Warehouse | Conventional Federated Data Search | Data Lake

25 No index or query load on data sources (✓) ✓ ✓ ✓

26 Data source owners not aware of queries (✓) ✓ ✓ ✓

27 Archive options (✓)* ✓ () ✓

28 Good for application data sources (✓) ✓ ✓ ✓

29 Minimal additional system cost () ✓ ✓ ✓

30 No need for data or index update process ✓ ✓

*Can store and index any of the following, in either its own or a third-party database:

1. Changed data for archive

2. Derived data (aggregations, calculations)

3. Any designated data

Can also index archived data in its original format, e.g., mainframe files stored for SOX compliance
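
The starred note on archive options can be pictured with a small sketch. This is not SmartData Fabric® code or its API: the function names, the snapshot dictionaries and the `city` column are all hypothetical, and "changed data" is assumed here to mean rows that are new or different since the previous snapshot.

```python
from collections import defaultdict


def changed_rows(previous, current):
    """Return rows of `current` that are new or differ from `previous`.

    Both arguments are dicts keyed by a (hypothetical) primary key.
    """
    return {key: row for key, row in current.items() if previous.get(key) != row}


def build_index(rows, column):
    """Map each value of `column` to the sorted keys of the rows holding it."""
    index = defaultdict(list)
    for key in sorted(rows):
        index[rows[key][column]].append(key)
    return dict(index)


# Two hypothetical snapshots of one source table.
previous = {
    1: {"name": "Ann", "city": "Dallas"},
    2: {"name": "Bob", "city": "Austin"},
}
current = {
    1: {"name": "Ann", "city": "Plano"},
    2: {"name": "Bob", "city": "Austin"},
    3: {"name": "Cy", "city": "Plano"},
}

# Item 1: changed data kept for archive, plus an index over it.
archive = changed_rows(previous, current)
archive_index = build_index(archive, "city")

# Item 2: derived data, here a simple aggregation (row count per city).
city_counts = {city: len(keys) for city, keys in build_index(current, "city").items()}
```

The archive and its index live outside the source system in this sketch, which is the point of the note: only the changed or derived data is stored, while the source table itself is left where it is.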

Page 41: WhamTech SmartData Fabric - Technical Overview… · SmartData Fabric® security-centric distributed virtual data, master data and graph data management, and analytics Notes on

Large platform vendors

• Primarily committed to data warehousing / Big Data / Cloud

− Centralized processing mindset

• Conventional federated data access is used as incremental ETL to data warehouses or analytics databases / appliances

• Starting to acknowledge that conventional federated data access is not working

• Seeking improved data virtualization, but not necessarily improved data federation

Page 42: WhamTech SmartData Fabric - Technical Overview… · SmartData Fabric® security-centric distributed virtual data, master data and graph data management, and analytics Notes on

Other data virtualization and federation vendors

• Use conventional federated adapters

• Focus on, and IP in, middleware and query optimization to overcome data source system and data deficiencies

− Reluctant to consider other (better) approaches

• Can include results or data cache – a form of distributed data warehousing

− With or without data cleansing

• Can include some form of link mapping across multiple data sources

Page 43: WhamTech SmartData Fabric - Technical Overview… · SmartData Fabric® security-centric distributed virtual data, master data and graph data management, and analytics Notes on

The End of Overview Section 5 – Comparison between conventional approaches and vendors, and SmartData Fabric®

…and the presentation

Page 44: WhamTech SmartData Fabric - Technical Overview… · SmartData Fabric® security-centric distributed virtual data, master data and graph data management, and analytics Notes on

The End
