InfoArchive Technical Deep Dive - Dell EMC “Deep Dive” 1345 hrs. Get Your Documentum, SAP,...
© Copyright 2016 Dell. All rights reserved.
InfoArchive Technical Deep Dive
Andreas Kalogeropoulos Tord Svensson
Safe Harbor Disclaimer
• This presentation contains “forward-looking statements” as defined under the US Federal Securities Laws.
• Dell EMC makes no representation and undertakes no obligations with regard to product planning information, anticipated product characteristics, performance specifications, or anticipated release dates (collectively, “Roadmap Information”).
• Roadmap Information is provided by Dell EMC as an accommodation to the recipient solely for purposes of discussion and without intending to be bound thereby, and is subject to change without notice.
• Roadmap Information is Dell EMC Confidential Information, and is provided under the terms, conditions and restrictions defined in the Dell EMC Non-Disclosure Agreement in place with your organization.
Objectives
• High-level view and a dive into the details of the architecture
• Storage considerations
• Disaster recovery and high availability
• Compliance
EMC INFOARCHIVE
An Enterprise Information Archiving Platform that unlocks data of all types trapped in siloed applications, lowering IT costs, preserving compliance and putting application data to work.
Leave No Application Data Behind
InfoArchive Main Value Propositions
Application Decommissioning
• Cost creep of legacy apps
• Aging portfolio
• Uncontrolled file shares
• Mergers & Acquisitions
• Migrations
• Unsupported software stack
• Maintenance renewals
• Loss of talent/knowledge
Active Archiving
• Backup windows too short
• Production system performance
• Scalability challenges
• Rising costs due to static data
Information Transformation & Reuse
• Extended access
• Push for agility
• Information integration for improved reuse
• Mobile strategy
• Need for analytics
Compliance Engine for All Data Types
• Compliance gaps
• Centralized compliance control
• New legislation, e.g. Dodd-Frank, EU Data Protection Directive
• Uncontrolled file shares
• Long-term access challenges
InfoArchive High-level Architecture
[Architecture diagram. Recoverable labels:]
• Sources: source application (legacy, production), connectors, XML
• Archiving types: Table Archiving, Data Record Archiving, File Archiving, Compound Record Archiving
• Layers: Ingestion Management, Archive Services, Access, Storage Platform (EMC Isilon, Centera, ECS, NAS/SAN)
• Functions: validation, batch, retention, analytics config, search/export, audit, security, encryption
• Access paths: REST API, SQL/JDBC, HDFS
• Consumers: InfoArchive UI, Data & Analytics UI, consuming application
InfoArchive in 2016
Releases: June 4.0, Sept 4.1, Dec 4.2
Accessibility
• New Architecture
• User Centric Design
• In-Place Compliance
• Solutions
  – Clinical Archiving 1.9
Extreme Archiving
• Cloud Storage Capabilities
  – Amazon S3
• Customized Search and Export
  – Find what you want
  – Export what you need
• eDiscovery Functionality
  – Collections and production sets
• InfoArchive for SAP
Compliance
• Synchronous Ingestion and access
• Clinical Archiving 2.0 – HIMVision
• Cloud-scale capabilities with ECS
• WORM compliance with Centera
RECOGNIZED AS A LEADER IN THE GARTNER MQ FOR STRUCTURED DATA ARCHIVING AND APPLICATION RETIREMENT 2016
Ingestion
[Pipeline diagram. Recoverable content:]
• Extraction (outside IA): an ETL process packages source data as SIP containers (XML; labels shown: Structured Data, Unstructured Content, Multiple Content)
• Ingestion (RESTful), inside InfoArchive: Reception, Ingest, Commit
• Archive, then Access
Source systems by archiving type:
• Table Archiving: ORACLE, MVS, DB2, etc.
• Data Record Archiving: ORACLE, Baan, SAP, etc.
• File Archiving: Documentum, StreamServe, SharePoint, etc.
• Compound Record Archiving: Email, HrAccess, Siebel, Lotus Notes, etc.
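To make the idea of a SIP container more concrete, here is a minimal sketch of what an extraction step outside InfoArchive might produce: one archive file holding a small descriptor, the structured records as XML, and the unstructured content files. The file names (`sip.xml`, `pdi.xml`, `content/`), the element names and the `build_sip` helper are illustrative assumptions, not the actual InfoArchive SIP layout.

```python
import io
import zipfile
from xml.etree import ElementTree as ET


def build_sip(holding, records, attachments):
    """Pack structured records and content files into one zip container.

    All file and element names are hypothetical and chosen for illustration.
    """
    # Structured data: one <record> element per source row.
    pdi = ET.Element("records")
    for record in records:
        node = ET.SubElement(pdi, "record")
        for key, value in record.items():
            ET.SubElement(node, key).text = str(value)

    # Descriptor: tells the receiver which holding the package targets.
    descriptor = ET.Element("sip")
    ET.SubElement(descriptor, "holding").text = holding
    ET.SubElement(descriptor, "record_count").text = str(len(records))

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("sip.xml", ET.tostring(descriptor, encoding="unicode"))
        archive.writestr("pdi.xml", ET.tostring(pdi, encoding="unicode"))
        # Unstructured content travels next to the XML that describes it.
        for name, payload in attachments.items():
            archive.writestr("content/" + name, payload)
    return buffer.getvalue()


sip_bytes = build_sip(
    "Invoices",
    [{"id": 1, "customer": "ACME", "file": "inv-1.pdf"}],
    {"inv-1.pdf": b"%PDF-1.4 placeholder"},
)
```

The resulting bytes would then be handed to the ingestion REST services. The slides do not show that call, so it is left out here.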
InfoArchive Client Application: Role Based
Optimized for functional needs
• Business Owner: Search, View Dashboard
• Knowledge User: Search, Background Search, Export
• Retention Manager: Search, Create/Apply/Remove Retention, Hold, Approve Purge, Compliance Dashboard
• Configurator: Compose, Collaborate
• Administrator: Search, Ingestion, Admin activities; Administration (Storage, Space, Job, Permission, Encryption)
• IT Owner: View Storage Dashboard
Retention Manager, Administrator, Consultant/Developer, End user
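InfoArchive and OAIS
[Diagram: OAIS functional model. Recoverable labels:]
• Functional entities: Ingest, Data Management, Archival Storage, Access, Preservation Planning, Administration
• External roles: Producer, Consumer, Management
• Packages: SIP (from the Producer into Ingest), AIP (into and out of Archival Storage), DIP (from Access to the Consumer)
• Descriptive Info: from Ingest to Data Management, and from Data Management to Access
• Consumer interactions: queries, result sets, orders
• Format label on the diagram: XML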
Logical Architecture 4.1
[Diagram. Recoverable labels:]
• Clients: GUI, CLI, JDBC Driver, Application Server
• REST Services (through Gateway): Reception/Ingestion, Dashboard, Administration (Jobs/Audits/Encryption/Security), Search/Export, Content Retrieval
• InfoArchive Server
• Repositories: MetaData Repository, System Repository, Content/Data Repository
• Archive packages (SIP/Tables), XML
• Storage: SAN/NAS and/or Isilon, Centera, ECS
Gateway pattern for single point of access and authentication
– Highly scalable due to stateless authentication
– Modern pattern for emerging microservices-based architecture
– Cloud environment friendly
OAuth2-based stateless authentication using JSON Web Tokens (JWT)
– Self-contained, signed, secure tokens for clustering support
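The point about self-contained, signed tokens can be illustrated with a small sketch of how a JWT is issued and checked. Any server node that holds the key can validate a request without shared session state, which is what makes clustering straightforward. The sketch assumes an HMAC-SHA256 (HS256) signature and a shared secret. The slides do not say which signing algorithm or key scheme InfoArchive uses, and the function names and claims are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_encode(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: bytes) -> bytes:
    return hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()


def issue_token(claims: dict, secret: bytes) -> str:
    """Build header.payload.signature; the token carries its own claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    parts = [
        _b64url_encode(json.dumps(obj, separators=(",", ":")).encode("utf-8"))
        for obj in (header, claims)
    ]
    signing_input = ".".join(parts)
    return signing_input + "." + _b64url_encode(_sign(signing_input, secret))


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> dict:
    """Validate signature and expiry using only the token and the key."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(_b64url_decode(header_b64))
    if header.get("alg") != "HS256":
        raise ValueError("unsupported algorithm")
    expected = _sign(header_b64 + "." + payload_b64, secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in claims and claims["exp"] <= current:
        raise ValueError("token expired")
    return claims


# Hypothetical usage: any node holding the secret can verify the token.
secret = b"demo-secret"
token = issue_token({"sub": "alice", "role": "Administrator", "exp": 2000}, secret)
claims = verify_token(token, secret, now=1000)
```

Because verification needs no lookup in a session store, requests can be routed to any instance behind the gateway.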
InfoArchive 4.1 Base Architecture (“3 jars”)
• IA Web UI (IA Web Server): AngularJS
• IA Server: Spring Data, REST Services, Authentication (OAuth2/JWT)
• IA xDB: XML database for metadata, content and system data
• Storage system(s)
• Deployment: Spring Boot or .war on Tomcat
InfoArchive 4.1 Scalability
[Diagram: each tier shown with multiple instances]
• IA Web UI (IA Web Server): stateless, standard AppServer load balancing
• IA Server (cluster): stateless, scale-out and specialization
• xDB (database servers): stateful, scale-out (last to scale horizontally)
• Storage systems: stateful, scale-out (last to scale horizontally)
InfoArchive Storage Breakdown
System Repository
• Holding properties & configuration (e.g. searches)
• AIP information
• Table information (with chain of custody)
• Partition criteria
• Configuration information
• Security information
• Separate compliance repository: retention & hold information (can grow based on retention rules)
MetaData Repository
• xDB library: PDI XML, SIP XML or Table(s) XML, RI XML (can grow based on indexes)
Cache-In / Cache-Out
Content/Data Repository
• xDB library backup
• RI XML
• CI container
• Ingestion logs
• PDI XML GZIP (optional)
• Analytic rendition (optional)
• SIP GZIP (optional)
OAIS terminology: PDI = Preservation Description Information, RI = Representation Information, CI = Content Information
InfoArchive Storage Breakdown and RTO/RPO
METADATA REPOSITORY
• Keeps the xDB segments/libraries of the data for search
• NAS/SAN
• 50% of the original XML: xDB library (can increase with indexes)
CONTENT/DATA REPOSITORY
• Keeps all the content of an AIP/Table
• WORM or NAS/SAN (can also be xDB BLOB)
• 40% of the original XML: xDB Library.zip
• Aggregated content: CI.zip
• (10% of the original XML: OriginXML.zip)
SYSTEM REPOSITORY
• Keeps the system/technical information
• NAS/SAN or SSD
• < 1% of the original XML (small; can increase based on the number of retention policies)
Replication
• Storage replication for RPO
• Optional applicative replication for RTO
• Applicative replication for consistency
All events/audits (record lifecycle and administrative journal) are created and pushed to an audit holding, hence no increase in system repository.
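The ratios above lend themselves to a quick back-of-the-envelope estimate. The sketch below applies them to a given volume of original XML. It assumes the parenthesised OriginXML.zip is optional, uses 1% as an upper bound for the system repository, and leaves out the aggregated content (CI.zip), index growth and retention-policy growth, because the slide gives no ratio for them. The function and key names are hypothetical.

```python
# Ratios quoted on the slide, as percentages of the original XML volume.
METADATA_XDB_LIBRARY_PCT = 50      # xDB library; can increase with indexes
CONTENT_XDB_LIBRARY_ZIP_PCT = 40   # xDB Library.zip
CONTENT_ORIGIN_XML_ZIP_PCT = 10    # OriginXML.zip, assumed optional here
SYSTEM_UPPER_BOUND_PCT = 1         # slide says "< 1%", used as an upper bound


def estimate_storage(original_xml_gb, keep_origin_xml=True):
    """Rough per-repository estimate in GB; excludes CI.zip content."""
    content_pct = CONTENT_XDB_LIBRARY_ZIP_PCT
    if keep_origin_xml:
        content_pct += CONTENT_ORIGIN_XML_ZIP_PCT
    return {
        "metadata_repository_gb": original_xml_gb * METADATA_XDB_LIBRARY_PCT / 100,
        "content_repository_gb": original_xml_gb * content_pct / 100,
        "system_repository_max_gb": original_xml_gb * SYSTEM_UPPER_BOUND_PCT / 100,
    }


# Example: 1000 GB of original XML.
estimate = estimate_storage(1000)
```

For 1000 GB of original XML this gives 500 GB for the metadata repository, 500 GB for the XML-derived part of the content repository (400 GB without OriginXML.zip) and at most 10 GB for the system repository.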
Scalability & Sizing
• Until now, with SIP, sizing has been driven by the required ingestion throughput/SLA, not by the search & retrieval activity to serve
  – The key to increasing the global batch ingestion throughput is being able to run a higher number of concurrent ingestions
  – The search & retrieval workload is unlikely to significantly impact the global sizing
• The ingestion workload profile is very different between structured data and unstructured data
• Performance in SIP is not sensitive to the already archived volume
  – Each ingestion works in a single xDB partition
  – A search that includes a partitioning criterion quickly narrows the XQuery scope to a subset of xDB partitions
  – Unstructured data retrieval is processed without any underlying XQuery execution
• Performance in Table is, as in traditional systems, sensitive to the already archived volume and to the number of concurrent searches
• HTTP multipart can be used for large files (>2 GB), but we recommend a local file transfer before starting the ingestion (through REST Services)
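The claim that a partitioning criterion keeps SIP search cost independent of the archived volume can be sketched with a toy model. Records live in date-range partitions, and a search that supplies a date range skips every partition outside it before looking at any record. This is a generic illustration of partition pruning, not xDB's implementation. The `Partition` class, the `search` function and the record fields are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Partition:
    start: date                      # first day covered (inclusive)
    end: date                        # last day covered (inclusive)
    records: list = field(default_factory=list)


def search(partitions, predicate, date_from=None, date_to=None):
    """Return (hits, partitions_scanned); prune by date range first."""
    hits = []
    scanned = 0
    for partition in partitions:
        # Pruning: skip partitions that cannot contain matching dates.
        if date_from is not None and partition.end < date_from:
            continue
        if date_to is not None and partition.start > date_to:
            continue
        scanned += 1
        for record in partition.records:
            in_range = (date_from is None or record["date"] >= date_from) and (
                date_to is None or record["date"] <= date_to
            )
            if in_range and predicate(record):
                hits.append(record)
    return hits, scanned


# Three yearly partitions, one record each.
partitions = [
    Partition(date(year, 1, 1), date(year, 12, 31),
              [{"date": date(year, 6, 1), "customer": "ACME"}])
    for year in (2014, 2015, 2016)
]
is_acme = lambda record: record["customer"] == "ACME"

hits_all, scanned_all = search(partitions, is_acme)
hits_2016, scanned_2016 = search(partitions, is_acme, date_from=date(2016, 1, 1))
```

Without a date range all three partitions are scanned. With `date_from` set to 2016-01-01 only one is, and adding more historical partitions would not change that.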
InfoArchive Sessions
Wed Nov 2nd Room H1
0830 hrs. Accelerate your Digital Transformation Journey with InfoArchive
0930 hrs. InfoArchive: A Customers Journey
1045 hrs. InfoArchive as Part of your Digital Transformation and “Move to the Cloud” Journey - Atos
1145 hrs. InfoArchive “Deep Dive”
1345 hrs. Get Your Documentum, SAP, SharePoint Data Under Control
Hackathon - Configuration
1445 hrs. Digital Transformation Journey of Sanofi
1615 hrs. Sneak Peek: Future InfoArchive Functionality
Hackathon – Docker
Thur Nov 3rd Room H3
0900 hrs. Financial Services: Regulations and Compliance with InfoArchive
1000 hrs. InfoArchive: Ensuring Big Data Compliance and Reducing Risk with Real Time Analytics
InfoArchive Technical (Room J)
Tue Nov 1st
0900 hrs. Hands On Lab
1515 hrs. Hackathon
Wed Nov 2nd
1345 hrs. Hackathon
1615 hrs. Hackathon: InfoArchive @ Docker Hands On Lab
Thur Nov 3rd
0900 hrs. Hackathon
1100 hrs. Hands On Lab