InfoArchive Technical Deep Dive - Dell EMC “Deep Dive” 1345 hrs. Get Your Documentum, SAP,...

© Copyright 2016 Dell. All rights reserved.

InfoArchive Technical Deep Dive

Andreas Kalogeropoulos Tord Svensson

MOMENTUM BARCELONA APP AND WIN!

http://bit.ly/mmtm16BCN

BEYOND SILOS Play the BEYOND Game and win a Raspberry Pi pre-loaded with InfoArchive

Safe Harbor Disclaimer

• This presentation contains “forward-looking statements” as defined under the US Federal Securities Laws.

• Dell EMC makes no representation and undertakes no obligations with regard to product planning information, anticipated product characteristics, performance specifications, or anticipated release dates (collectively, “Roadmap Information”).

• Roadmap Information is provided by Dell EMC as an accommodation to the recipient solely for purposes of discussion and without intending to be bound thereby, and is subject to change without notice.

• Roadmap Information is Dell EMC Confidential Information, and is provided under the terms, conditions and restrictions defined in the Dell EMC Non-Disclosure Agreement in place with your organization.

Objectives

• High-level overview of the architecture, then a dive into its details

• Storage considerations

• Disaster Recovery and High Availability

• Compliance

JOIN THE CONVERSATION! #MMTM16

Take the LEAP personality quiz and win!

Connect with us

ECD SERVICES

Genius Labs Garden Level

Foyer

INTERACT WITH THIS SESSION #infoarchive

EMC INFOARCHIVE

An Enterprise Information Archiving Platform that unlocks data of all types, trapped in siloed applications, lowering IT costs, preserving compliance and putting application data to work.

Leave No Application Data Behind

InfoArchive Main Value Propositions

Application Decommissioning
• Cost creep of legacy apps
• Aging portfolio
• Uncontrolled file shares
• Mergers & Acquisitions
• Migrations
• Unsupported software stack
• Maintenance renewals
• Loss of talent/knowledge

Active Archiving
• Backup windows too short
• Production system performance
• Scalability challenges
• Rising costs due to static data

Information Transformation & Reuse
• Extended access
• Push for agility
• Information integration for improved reuse
• Mobile strategy
• Need for analytics

Compliance Engine for All Data Types
• Compliance gaps
• Centralized compliance control
• New legislation, e.g. Dodd-Frank, EU Data Protection Directive
• Uncontrolled file shares
• Long-term access challenges

InfoArchive High-level Architecture

[Architecture diagram; labels grouped by area]

• SOURCE APPLICATION: LEGACY and PRODUCTION, feeding the archive through connectors as XML
• INGESTION MANAGEMENT: Table Archiving, Data Record Archiving, File Archiving, Compound Record Archiving
• ARCHIVE SERVICES: VALIDATION, BATCH, RETENTION, ANALYTICS CONFIG, SEARCH/EXPORT, AUDIT, SECURITY, ENCRYPTION
• ACCESS: HDFS, REST API, SQL/JDBC
• Front ends: InfoArchive UI, CONSUMING APPLICATION, DATA & ANALYTICS UI
• STORAGE PLATFORM: EMC Isilon, Centera, ECS, NAS / SAN

InfoArchive in 2016

Releases: 4.0 (June), 4.1 (Sept), 4.2 (Dec)

Accessibility
• New Architecture
• User Centric Design
• In-Place Compliance
• Solutions
  – Clinical Archiving 1.9

Extreme Archiving
• Cloud Storage Capabilities
  – Amazon S3
• Customized Search and Export
  – Find what you want
  – Export what you need
• eDiscovery Functionality
  – Collections and production sets
• InfoArchive for SAP

Compliance
• Synchronous Ingestion and access
• Clinical Archiving 2.0 – HIMVision
• Cloud-scale capabilities with ECS
• WORM compliance with Centera

RECOGNIZED AS A LEADER IN THE GARTNER MQ FOR STRUCTURED DATA ARCHIVING AND APPLICATION RETIREMENT 2016

Ingestion

[Ingestion diagram; labels grouped by area]

• Extraction (outside IA): an ETL process packages source data into SIP containers. Each SIP container holds XML plus Structured Data, Unstructured Content, or Multiple Content.
• Ingestion (RESTful) into InfoArchive: Reception, Ingest, Commit
• Archive, then Access

Source examples by archiving type:
• Table Archiving: ORACLE, MVS, DB2, etc.
• Data Record Archiving: ORACLE, Baan, SAP, etc.
• File Archiving: Documentum, StreamServe, SharePoint, etc.
• Compound Record Archiving: Email, HrAccess, Siebel, Lotus Notes, etc.
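
As an illustration of the SIP container idea on this slide (XML describing the package, structured data, and optional unstructured content bundled together), here is a minimal Python sketch. The file names `sip.xml` and `pdi.xml`, the element names, and the `build_sip` helper are assumptions made for the example; they are not the InfoArchive SIP schema.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_sip(holding: str, records: list[dict], attachments: dict[str, bytes]) -> bytes:
    """Bundle records and files into one zip container (hypothetical layout)."""
    # Descriptor: which holding the package targets and how many records it carries.
    sip = ET.Element("sip")
    ET.SubElement(sip, "holding").text = holding
    ET.SubElement(sip, "record_count").text = str(len(records))

    # Structured data: one XML element per record, one child element per field.
    pdi = ET.Element("records")
    for rec in records:
        node = ET.SubElement(pdi, "record")
        for key, value in rec.items():
            ET.SubElement(node, key).text = str(value)

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("sip.xml", ET.tostring(sip, encoding="utf-8"))
        zf.writestr("pdi.xml", ET.tostring(pdi, encoding="utf-8"))
        # Unstructured content travels in the same container as the XML.
        for name, data in attachments.items():
            zf.writestr(name, data)
    return buf.getvalue()


package = build_sip(
    "Invoices",
    [{"id": 1, "customer": "ACME", "file": "inv-1.pdf"}],
    {"inv-1.pdf": b"%PDF-1.4 placeholder"},
)
names = zipfile.ZipFile(io.BytesIO(package)).namelist()
print(names)
```

The extraction side (the ETL process outside InfoArchive) would produce such a package, and the resulting bytes are what an ingestion call would receive.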

InfoArchive Client Application: Role Based

Optimized for functional needs

• Business Owner: Search, View Dashboard
• Knowledge User: Search, Background Search, Export
• Retention Manager: Search, Create/Apply/Remove Retention, Hold, Approve Purge, Compliance Dashboard
• Configurator: Compose, Collaborate
• Administrator: Search, Ingestion, Admin activities; Administration of Storage, Space, Job, Permission, Encryption
• IT Owner: View Storage Dashboard
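
A role-based client like this can be pictured as a lookup from role to allowed functions. The sketch below is only an illustration: it assumes the six roles and the function groups on the slide pair up in the order listed, and the function identifiers and the `allowed` helper are invented for the example.

```python
# Assumed pairing of slide roles to functions; identifiers are hypothetical.
ROLE_FUNCTIONS: dict[str, set[str]] = {
    "Business Owner": {"search", "view_dashboard"},
    "Knowledge User": {"search", "background_search", "export"},
    "Retention Manager": {
        "search", "manage_retention", "hold", "approve_purge", "compliance_dashboard",
    },
    "Configurator": {"compose", "collaborate"},
    "Administrator": {"search", "ingestion", "admin_activities", "administration"},
    "IT Owner": {"view_storage_dashboard"},
}


def allowed(role: str, function: str) -> bool:
    """Return True if the role exposes the function; unknown roles get nothing."""
    return function in ROLE_FUNCTIONS.get(role, set())


print(allowed("Knowledge User", "export"))
print(allowed("Business Owner", "export"))
```

Keeping the mapping as data rather than code is one way a UI could show each role only the screens it needs.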

Retention Manager, Administrator, Consultant/Developer, End user

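InfoArchive and OAIS

[OAIS functional model diagram; labels grouped by area]

• Actors: PRODUCER, CONSUMER, MANAGEMENT
• Functions: INGEST, DATA MANAGEMENT, ARCHIVAL STORAGE, ACCESS, ADMINISTRATION, PRESERVATION PLANNING
• Information packages: SIP (from the Producer), AIP (into and out of Archival Storage), DIP (to the Consumer), Descriptive Info (into and out of Data Management)
• Consumer interactions: queries, result sets, orders
• Packages are marked as XML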

Logical Architecture 4.1

Clients: GUI, CLI, JDBC Driver, Application Server

REST Services (thru Gateway)
• Reception / Ingestion
• Dashboard
• Administration (Jobs/Audits/Encryption/Security)
• Search / Export
• Content Retrieval

InfoArchive Server
• System Repository
• MetaData Repository
• Content/Data Repository
• Archive packages (SIP/Tables), XML
• SAN/NAS and/or Isilon, Centera, ECS

Gateway pattern for single point of access and authentication
─ Highly scalable due to stateless authentication
─ Modern pattern for emerging microservices-based architecture
─ Cloud environment friendly

OAuth2-based stateless authentication using modern JSON Web Token (JWT)
─ Self-contained, signed, secure tokens for clustering support

InfoArchive 4.1 Base Architecture (“3 jars”)

[Layer diagram; labels as extracted]

• Layers, top to bottom: IA Web UI (AngularJS), IA Server (Spring Data), xDB, Storage Systems
• Annotations: Spring Boot or .war on Tomcat; REST Services, Authentication (OAuth2/JWT); XML Database; Metadata, Content and System data
• Deployable components: IA Web Server, IA Server, IA xDB, Storage System(s)

InfoArchive 4.1 Scalability

[Cluster diagram showing multiple instances of each tier]

• IA Web UI (IA Web Server): Stateless, standard AppServer load balancing
• IA Server: Stateless, scale-out and specialization
• xDB (database servers): Stateful, scale-out (last to scale horizontally)
• Storage Systems: Stateful, scale-out (last to scale horizontally)

InfoArchive Storage Breakdown

System Repository
• Holding Properties & Configuration (e.g. Searches)
• AIP Information
• Table Information (with chain of custody)
• Partition Criteria
• Configuration Information
• Security information
• Separate Compliance Repository: Retention & Hold information (can grow based on retention rules)

MetaData Repository
• XDB Library
  – PDI XML
  – SIP XML or Table(s) XML
  – RI XML (can grow based on indexes)

Cache-In / Cache-Out

Content/Data Repository
• XDB Library Backup
• RI XML
• CI Container
• Ingestion logs
• PDI XML GZIP (optional)
• Analytic Rendition (optional)
• SIP GZIP (optional)

OAIS Terminology:
• PDI = Preservation Description Information
• RI = Representation Information
• CI = Content Information

InfoArchive Storage Breakdown and RTO/RPO

METADATA REPOSITORY
• Purpose: keep the XDB segments/libraries of the data for search
• Storage: NAS/SAN
• Size: 50% of the original XML (xDB Library); can increase with indexes

CONTENT/DATA REPOSITORY
• Purpose: keep all the content of an AIP/Table
• Storage: WORM or NAS/SAN (can also be XDB BLOB)
• Size: 40% of the original XML (xDB Library.zip), plus aggregated content (CI.zip) (and 10% of the original XML for OriginXML.zip)

SYSTEM REPOSITORY
• Purpose: keep the system/technical information
• Storage: NAS/SAN or SSD
• Size: < 1% of the original XML (small); can increase based on the number of retention policies

Replication
• Storage replication for RPO
• Optional applicative replication for RTO
• Applicative replication for consistency

All events/audits (record lifecycle and administrative journal) are created and pushed to an audit holding, hence no increase in system repository.
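
The percentages on this slide lend themselves to a quick back-of-the-envelope estimate. The Python sketch below applies them to a given volume of original XML. The `estimate_footprint` helper and its `keep_origin_xml` switch are assumptions for illustration (the slide puts the 10% OriginXML.zip figure in parentheses, read here as optional), the system repository uses 1% as an upper bound for "< 1%", and the aggregated content (CI.zip) is left out because it depends on the unstructured files rather than on the XML volume.

```python
def estimate_footprint(original_xml_gb: float, keep_origin_xml: bool = True) -> dict[str, float]:
    """Rough storage estimate per repository, in GB, from the slide's ratios."""
    metadata = 0.50 * original_xml_gb       # xDB library, before extra indexes
    content = 0.40 * original_xml_gb        # xDB Library.zip
    if keep_origin_xml:
        content += 0.10 * original_xml_gb   # OriginXML.zip (assumed optional)
    system = 0.01 * original_xml_gb         # upper bound for "< 1%"
    return {
        "metadata": metadata,
        "content_excluding_ci": content,
        "system_upper_bound": system,
        "total": metadata + content + system,
    }


estimate = estimate_footprint(1000.0)
for name, size_gb in estimate.items():
    print(f"{name}: {size_gb:.1f} GB")
```

Under these assumptions, the XML-derived footprint stays close to the size of the original XML; indexes, retention policies, and the CI containers come on top.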

Scalability & Sizing

Until now, with SIP, sizing has been driven by the required ingestion throughput/SLA, not by the search & retrieval activity to serve
– The key to increasing global batch ingestion throughput is being able to run a higher number of concurrent ingestions
– The search & retrieval workload is unlikely to significantly impact the global sizing

The ingestion workload profile is very different between structured data and unstructured data

Performance with SIP is not sensitive to the already archived volume
– Each ingestion works in a single xDB partition
– A search that includes a partitioning criterion quickly narrows the XQuery scope to a subset of xDB partitions
– Unstructured data retrieval is processed without any underlying XQuery execution

Performance with Table is, as is traditional, sensitive to the already archived volume and the number of concurrent searches

HTTP multipart can be used for large files (> 2 GB), but we recommend a local file transfer before starting the ingestion (through REST Services)
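
The point about partitioning criteria can be illustrated with a small Python sketch of partition pruning: a search that carries the partitioning criterion only touches the partitions whose range overlaps the request, so cost tracks the query range rather than the total archived volume. The date-based partitioning, the `Partition` class, and the `search` helper are assumptions for the example. The slides do not say which criteria InfoArchive partitions on, and this is not xDB or XQuery code.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Partition:
    start: date                      # inclusive
    end: date                        # exclusive
    records: list[dict] = field(default_factory=list)


def search(partitions, day_from, day_to, predicate):
    """Return matching records and how many partitions were actually scanned."""
    # Pruning step: skip every partition whose date range cannot overlap the request.
    candidates = [p for p in partitions if p.start < day_to and day_from < p.end]
    hits = [
        r
        for p in candidates
        for r in p.records
        if day_from <= r["date"] < day_to and predicate(r)
    ]
    return hits, len(candidates)


# Twelve monthly partitions for 2016, with one record in March and one in August.
partitions = [
    Partition(date(2016, m, 1), date(2016 + m // 12, m % 12 + 1, 1)) for m in range(1, 13)
]
partitions[2].records.append({"date": date(2016, 3, 15), "customer": "ACME"})
partitions[7].records.append({"date": date(2016, 8, 2), "customer": "ACME"})

hits, scanned = search(
    partitions, date(2016, 3, 1), date(2016, 4, 1), lambda r: r["customer"] == "ACME"
)
print(f"{len(hits)} hit(s), scanned {scanned} of {len(partitions)} partitions")
```

Adding more months of archived data adds partitions but does not change how many of them a one-month search has to scan.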

LET US KNOW WHAT YOU THOUGHT

Take the Session Survey

1. Open the schedule with the Momentum App
2. Go to the session you attended
3. Open “Session Survey”
4. Answer the 4 questions and submit.

Thank you!

InfoArchive Sessions

Wed Nov 2nd Room H1

0830 hrs. Accelerate your Digital Transformation Journey with InfoArchive

0930 hrs. InfoArchive: A Customer's Journey

1045 hrs. InfoArchive as Part of your Digital Transformation and “Move to the Cloud” Journey - Atos

1145 hrs. InfoArchive “Deep Dive”

1345 hrs. Get Your Documentum, SAP, SharePoint Data Under Control

Hackathon - Configuration

1445 hrs. Digital Transformation Journey of Sanofi

1615 hrs. Sneak Peek: Future InfoArchive Functionality

Hackathon – Docker

Thur Nov 3rd Room H3

0900 hrs. Financial Services: Regulations and Compliance with InfoArchive

1000 hrs. InfoArchive: Ensuring Big Data Compliance and Reducing Risk with Real Time Analytics

InfoArchive Technical (Room J)

Tue Nov 1st

0900 hrs. Hands On Lab

1515 hrs. Hackathon

Wed Nov 2nd

1345 hrs. Hackathon

1615 hrs. Hackathon: InfoArchive @ Docker

Hands On Lab

Thur Nov 3rd

0900 hrs. Hackathon

1100 hrs. Hands On Lab
