Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered...
![Page 1: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/1.jpg)
Dell Blueprint for Big Data and Analytics
November 2015
Reference Architectures and Engineered Solutions
![Page 2: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/2.jpg)
2 Dell Blueprints
Big Data and Analytics Blueprint Portfolio
Consulting and Deployment:
Custom – see services details for each offer
Training: R730, SC4020
ProSupport Plus
Training Credits: SQL, MS Analytics
ProSupport Plus
Dell Software Suite
Statistica Data Analytics Suite
Dell Boomi Integration Tools
Dell Toad Data Management
Dell SharePlex Replication Connector for Hadoop
Reference Architectures
Dell | Cloudera Apache Hadoop Solution: Start and up to 15 nodes, Scales to 445 nodes, Scales 45+ nodes
Dell SQL DWFT: Start with 730/PS6210S to 17TB, Scales on 730xd to 21TB, Scales on 730/PS6210S to 26TB, Scales on 730/SC4020 to 55TB
Dell | Cloudera | Syncsort Data Warehouse Optimization for ETL Offload RA (June 19, 2015)
Engineered Solutions
Dell QuickStart for Cloudera Hadoop: 5 nodes
Dell In-Memory Appliance for Cloudera Enterprise: Start with 8 nodes, Scales to 16 nodes, Scales from 24 – 48 nodes
Microsoft APS Appliance: PDW 3 nodes, Scales PDW + Hadoop to 6 nodes, Scales PDW + Hadoop 9 – 54 nodes
SAP HANA Appliance: Single Server configurations scale from 128GB – 1.5TB RAM; Scale Out cluster configurations scale from 2-16TB RAM (up to 24TB w/R930 – due September, 2015)
ES Implementations:
Deployment: APS JumpStart
SERVICES
RA Implementations:
Engage your Big Data Overlay Sales Team
![Page 3: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/3.jpg)
Reference Architectures: DWFT for SQL Server 2014
17 TB, 21 TB, 26 TB, and 55 TB configurations
Solution benefits
• Integrated, balanced and verified reference architectures jointly engineered with Microsoft.
• Capacity ranging from 17TB to 55TB.
• Internal storage/SAN storage.
• iSCSI and Fibre Channel networking.
• Dell’s 13G server platform and all-flash Dell storage arrays.
• Feature-rich SAN storage.
Dell differentiation
• Faster deployments: Pre-configured, Dell-led solution.
• Reduced risk: Out-of-the-box offerings.
• DWFT validated RA: Optimized data warehouse performance that avoids over-provisioning of hardware resources.
• Single point of contact/accountability for purchases, services, and support, with deep expertise based on 25 years of experience.
Link to DWFT RAs
Dell PowerEdge servers: R730/R730XD
Dell storage: PS6210S/SC4020
Dell networking switches: S4810
MS Windows Server 2012 R2
Dell OpenManage / iDRAC / Lifecycle Controller
MS SQL Server 2014
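As a rough illustration of how the four validated capacity points might be used when sizing, the sketch below picks the smallest listed configuration that covers a required capacity. The capacity and platform pairs are taken from the portfolio slide; the function name and the "smallest that fits" rule are assumptions made for this example, not a Dell sizing tool.

```python
# Capacity points (TB) and platform labels as listed on the portfolio slide.
DWFT_CONFIGS = [
    (17, "730/PS6210S"),
    (21, "730xd"),
    (26, "730/PS6210S"),
    (55, "730/SC4020"),
]


def smallest_dwft_config(required_tb):
    """Return (capacity_tb, platform) for the smallest listed configuration
    that covers required_tb, or None if the requirement exceeds 55 TB."""
    for capacity_tb, platform in DWFT_CONFIGS:  # list is sorted by capacity
        if capacity_tb >= required_tb:
            return capacity_tb, platform
    return None


print(smallest_dwft_config(20))
```

A requirement above the largest listed point returns `None`, which in practice would mean looking at a different offer in the portfolio.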
![Page 4: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/4.jpg)
Dell | Cloudera Apache Hadoop Solution Reference Architecture
Flexible and scalable solution that simplifies Apache Hadoop
Minimize complexity through an engineered, validated solution based on extensive customer experience
• Scale Out hardware architecture — PowerEdge R730, R730xd and high performance Dell S-Series networking.
• Based on Cloudera Enterprise Apache Hadoop and Red Hat enterprise server.
• Comprehensive and collaborative service and support for the entire solution through its complete lifecycle.
The Dell difference
• Achieve flexibility with a reference architecture approach that allows choice and provides guidance.
• Detailed reference architecture documentation.
• Deployment guidelines detail best practices based on extensive experience with production deployments.
• Increased efficiency: PowerEdge servers are feature and power-optimized to provide lower TCO in addition to saving on space and energy.
Link to Dell | Cloudera Apache Hadoop RA
Link to the Solution Brief
Store, process and analyze all your data
Dell PowerEdge servers
Dell networking switches
Dell Statistica
Dell services
Cloudera Enterprise
Open
Governed
Managed
Secure
Apache Hadoop
![Page 5: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/5.jpg)
Dell | Cloudera | Syncsort Data Warehouse Optimization for ETL Offload Reference Architecture
The first and only reference architecture for ETL offload with Hadoop
Scalable ETL with the flexibility of a reference architecture
• Scale Out hardware architecture: PowerEdge R730, R730xd and high performance Dell S-Series networking.
• Tight integration between Dell, Cloudera and Syncsort provides ease of deployment and maintenance with no performance impact or hurdles down the road.
• Close the skills gap by eliminating the need to develop expertise on MapReduce, Pig, Hive, and Sqoop.
• Fast-track projects with automated conversion of legacy SQL scripts into efficient ETL processes in Hadoop without any coding.
• Comprehensive and collaborative service and support for the entire solution through its complete lifecycle.
The Dell difference
• Faster time to value through an optimized solution jointly designed by three market leaders.
• Detailed reference architecture documentation.
• Deployment guidelines detail best practices based on extensive experience with production deployments.
Link to Dell | Cloudera | Syncsort DWO – ETL Offload RA
Link to the ETL Offload Solution Brief
![Page 6: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/6.jpg)
Dell In-Memory Appliance for Cloudera Enterprise
Big Data appliance optimized for in-memory analytics
Reference architecture scalability with the simplicity of an appliance
• Scale Out hardware architecture, with predefined configurations and scalable in 4 node increments.
• Delivered assembled and ready to install, with minimal site integration requirements.
• Delivered with Cloudera Enterprise, Apache Spark, and Cloudera Impala ready to run.
• Optimized for interactive in-memory analytics and analysis of data, including streaming from connected devices and embedded sensors.
• Comprehensive and collaborative service and support for the entire solution through its complete lifecycle.
The Dell difference
• Based on the established Dell Cloudera Reference Architecture.
• Faster time to value with a pre-configured, turnkey data platform.
• Increased efficiency: PowerEdge servers are feature and power-optimized to provide lower TCO in addition to saving on space and energy.
Starter Configuration
8 Node Cluster
PowerEdge R730: 4 Infrastructure Nodes with ProSupport
PowerEdge R730XD: 4 Data Nodes with ProSupport
Cloudera Enterprise
Dell Networking using S4048-ON and S3048-ON switches
Dell Rack 42U
~176TB (disk raw space)
Mid-Size Configuration
16 Node Cluster
PowerEdge R730: 4 Infrastructure Nodes with ProSupport
PowerEdge R720XD: 12 Data Nodes with ProSupport
Cloudera Enterprise
Dell Networking using S4048-ON and S3048-ON switches
Dell Rack 42U
~528TB (disk raw space)
Small Enterprise Configuration
24 Node Cluster
PowerEdge R730: 4 Infrastructure Nodes with ProSupport
PowerEdge R730XD: 20 Data Nodes with ProSupport
Cloudera Enterprise
Dell Networking using S4048-ON and S3048-ON switches
Dell Rack 42U
~880TB (disk raw space)
Spec Sheet for the Dell In-Memory Appliance for Cloudera Enterprise
Link to the Solution Brief
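The three configurations above imply a consistent raw capacity per data node, which a quick calculation makes visible. The sketch below uses only the node counts and approximate raw capacities quoted on the slide; the dictionary layout and function name are illustrative assumptions.

```python
# Data node counts and approximate raw disk capacity (TB) from the slide.
CONFIGS = {
    "Starter": {"data_nodes": 4, "raw_tb": 176},
    "Mid-Size": {"data_nodes": 12, "raw_tb": 528},
    "Small Enterprise": {"data_nodes": 20, "raw_tb": 880},
}


def raw_tb_per_data_node(config):
    """Approximate raw disk capacity contributed by each data node."""
    return config["raw_tb"] / config["data_nodes"]


per_node = {name: raw_tb_per_data_node(cfg) for name, cfg in CONFIGS.items()}
for name, tb in per_node.items():
    print(f"{name}: ~{tb:.0f} TB raw per data node")
```

All three work out to roughly 44 TB of raw disk per data node, so the quoted capacity grows linearly with the data node count while the four infrastructure nodes stay fixed.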
![Page 7: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/7.jpg)
Dell QuickStart for Cloudera Hadoop
Cost-effective, all-in-one starter bundle for testing and building a Hadoop proof of concept
Easy, affordable, flexible Hadoop starting point
• Five node Hadoop cluster with PowerEdge R730xd and Dell networking, Cloudera Enterprise Basic Edition and RedHat Enterprise included.
• Easy: Dell QuickStart for Cloudera Hadoop includes all hardware, software, networking, training and services.
• Affordable: Build a full Hadoop proof of concept for under $150K.
• Flexible: Build a proof of concept that can also upgrade to a full production cluster.
The Dell difference
• Upgradeable to the full Dell Cloudera Reference Architecture.
• Initial services jumpstart included.
Dell switch
Dell R730: 2x infrastructure nodes
Dell R730XD: 3x data nodes
Link to Datasheet for the Dell QuickStart for Cloudera Hadoop
![Page 8: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/8.jpg)
Engineered Solutions for SAP HANA
Modular, complete in-memory appliance for real-time data analytics
Link to Solution Brief
Link to Tech Sheet
Single server configuration
Scale-out cluster configuration
Engineered solutions SAP HANA detail
• PowerEdge R930 node for all configurations: 4U/4-socket Intel E7-8890v3 (R930 certified for scale-out in September).
• Delivered fully configured with either SLES or RHEL and ready for SAP HANA license keys to be applied.
• Deployment services included in appliance SKU; full SAP HANA transformation consulting and managed services available.
• 38% faster performance than next competitor in SAP BW-EML 1B record benchmark (www.sap.com/benchmark).
The Dell difference
Key differentiation:
• Common node platform, from the smallest to the largest, simplifies the management and maintenance of your system.
• Modular Scalability ensures your ability to grow your scale out system without disruption or “rip and replace”.
• Single vendor for every aspect of the solution, end to end.
![Page 9: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/9.jpg)
Microsoft Analytics Platform System by Dell
• Integrated compute, storage, networking and software appliance for high performance database workload needs.
• Microsoft APS software aggregates, stores and queries relational (SQL)+ non-relational (Hadoop) data in the solution.
• Includes Jumpstart services (3 weeks) for customer training and architecture design.
The Dell difference
• MPP (Massively Parallel Processing) appliance for up to 100x improvement over SMP database workloads.
• Highly scalable solution: starts at 3 nodes and scales to 54 nodes across multiple racks (up to 6 racks). Capacity scales from 21TB to 6PB, with scale-out expansion 3 nodes at a time.
• White glove delivery and installation: Delivered as fully built appliance with software installed and configured (EDT) for the customer with training services (GICS).
Link to APS Solution Brief
Link to Jumpstart Services
Engineered Solution: Microsoft Analytics Platform System by Dell
Real-time management of relational (SQL) and non-relational (Hadoop) data
x2 | SX6036 InfiniBand switches
x2 | N3048 Ethernet switches
x2 | R630 management nodes
x2 | R630 nodes added when HDInsight included in first rack (optional)
3rd Scale Unit for 9 nodes (optional)
x3 | R630 compute nodes
x2 | MD3060e JBODs (102 drives / 18 spare)
2nd Scale Unit for 6 nodes (optional)
x3 | R630 compute nodes
x2 | MD3060e JBODs (102 drives / 18 spare)
Base unit for 3 nodes
x3 | R630 compute nodes
x2 | MD3060e JBODs (102 drives / 18 spare)
Scales from 3 nodes to 54 nodes across 6 racks (up to 6PB)
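The scaling rule described above (3-node scale units, up to 54 nodes across 6 racks) can be sketched as a small calculation. The figure of 9 compute nodes per rack is inferred from 54 nodes over 6 racks and from the three scale units shown for the first rack; the assumption that every rack is filled the same way, and all names in the code, are illustrative only.

```python
import math

NODES_PER_SCALE_UNIT = 3   # expansion step quoted on the slide
MAX_NODES = 54
MAX_RACKS = 6
NODES_PER_RACK = MAX_NODES // MAX_RACKS  # 9, assuming racks fill evenly


def valid_node_counts():
    """Compute node counts reachable by adding one scale unit at a time."""
    return list(range(NODES_PER_SCALE_UNIT, MAX_NODES + 1, NODES_PER_SCALE_UNIT))


def racks_needed(nodes):
    """Racks required for a given compute node count under the assumptions above."""
    if nodes not in valid_node_counts():
        raise ValueError(f"{nodes} is not a multiple of 3 between 3 and {MAX_NODES}")
    return math.ceil(nodes / NODES_PER_RACK)


print(valid_node_counts())
print(racks_needed(12))
```

Under these assumptions a 12-node system spills into a second rack, and only the full 54-node system uses all six racks.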
![Page 10: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/10.jpg)
Software: Dell SharePlex
Database replication and migration
Dell SharePlex
Database replication: Oracle to Oracle, near real-time data integration.
• SharePlex saves DBAs more than five hours a day by automating replication, which has also increased accuracy.
• Only SharePlex provides data compare and repair, in-flight data integrity, plus monitoring and alerting functionalities, all in one affordable solution.
System requirements
Platform: UNIX®, Linux®, Windows®
Memory: SharePlex processes are 64-bit and can exceed 4GB. Per process memory greater than or equal to 256MB.
Additional software: SQL*Plus®
Source environments: Oracle
Target environments: Oracle, Microsoft SQL Server, SAP ASE, Hadoop, Java Message Service (JMS), File
See the platform-specific pre-installation checklist in the installation guide for additional system and database requirements.
For replication, migration or data integration from Oracle to Hadoop, go to SharePlex Connector for Hadoop.
Dell PowerEdge servers
Source databases: Oracle
Target databases: Oracle, SQL Server, Hadoop, SAP ASE, more…
SharePlex database replication
On-premises / Remote / In-the-Cloud
Oracle EBS / PeopleSoft / Siebel / SAP / more…
CRM / Finance / HR / Web Apps / BI
Dell network switches
Dell storage
Link to Datasheet for Dell SharePlex
![Page 11: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/11.jpg)
Software: Dell Statistica
Predictive analytics platform
Dell Statistica
Dell Statistica is an advanced analytics platform that enables organizations to transform unstructured, semi-structured and structured data into actionable business decisions. Statistica excels at creating predictive models that can see into the future.
• Challenger in the Gartner Magic Quadrant for Advanced Analytics Platforms.
• Professional services available for standing up models and driving better reports.
System requirements
Compatible with Windows® XP, Windows Server® 2003 and 2008, Windows Vista® and Windows 7 and 8.
Client requirements: Windows XP (Windows 7 or above recommended); 512 MB RAM (1 GB recommended); 500 MHz processor (2.0 GHz, 64-bit, dual core recommended).
Server requirements: Windows Server 2008 R2 or later; 2 GB RAM (8 GB recommended); 1.0 GHz processor (2.0 GHz, 64-bit, dual core recommended); 2.5 GB disk space; 100 Mb/s or faster network bandwidth.
For complete system requirements, please visit statsoft.com/Products/Licensing.
Dell Statistica
Analytics
Multiple users
Dell PowerEdge servers
Dell — Single user
Link to Datasheet for Dell Statistica
![Page 12: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/12.jpg)
Software: Dell Toad Data Point and Toad Intelligence Central
Data prep and cleansing
Dell Toad Data Point and Toad Intelligence Central
Toad Data Point is a cross-platform query, data integration and preparation tool that simplifies data access, analysis and provisioning for data management professionals. It is specifically built for data analysts, providing nearly limitless data connectivity, desktop data integration, visual query building, and workflow automation.
• Improved data access
• Desktop data integration
• Data preparation
• Improved productivity
System requirements
Compatible with Windows® XP, Windows Server® 2003 and 2008, Windows Vista® and Windows 7 and 8.
Client requirements: Windows XP (Windows 7 or above recommended); 512 MB RAM (1 GB recommended); 500 MHz processor (2.0 GHz, 64-bit, dual core recommended).
Server requirements: Windows Server 2008 R2 or later; 2 GB RAM (8 GB recommended); 1.0 GHz processor (2.0 GHz, 64-bit, dual core recommended); 2.5 GB disk space; 100 Mb/s or faster network bandwidth.
For complete system requirements, please visit statsoft.com/Products/Licensing-Options/System-Requirements.
Dell PowerEdge servers
User
Toad Intelligence Central
Data sources
Toad Data Point
Dell Statistica
Analytics
Link to Datasheet for Dell Toad Data Point
Link to Datasheet for Dell Toad Data Intelligence Central
![Page 13: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/13.jpg)
Software: Dell Boomi
Cloud-based data integration platform
Dell Boomi
The Dell Boomi AtomSphere platform was designed and implemented from the ground up to be an elastic, multi-tenant, hosted platform. It is not a retro-fit of a traditional software solution where multi-tenancy is achieved via multiple installation instances. Dell Boomi has a proven, tenant-isolation implementation that achieves isolation at a process, data and management level by:
• Assigning a unique identifier to each account and tagging all objects associated with the account with this ID.
• Using roles and permissions to control access to account objects and management functions.
• Encapsulating all integration workflow, transformation rules, business logic validations and connector operations as metadata bound to a specific customer account.
• Deploying workflow configuration metadata to an Atom, which acts on it to perform the execution of an integration process.
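The first two isolation mechanisms listed above (tagging every object with an account identifier, and gating access through roles and permissions) can be illustrated with a minimal sketch. This is not Boomi's implementation or API; the `TenantStore` class, its methods, and the sample account and user names are all hypothetical and exist only to show the general pattern.

```python
class TenantStore:
    """Toy multi-tenant object store: every object is keyed by its account ID,
    and every access is checked against per-account permissions."""

    def __init__(self):
        self._objects = {}  # (account_id, name) -> payload
        self._grants = {}   # (user, account_id) -> set of permissions

    def grant(self, user, account_id, permission):
        self._grants.setdefault((user, account_id), set()).add(permission)

    def _check(self, user, account_id, permission):
        if permission not in self._grants.get((user, account_id), set()):
            raise PermissionError(f"{user} lacks {permission!r} on {account_id}")

    def put(self, user, account_id, name, payload):
        self._check(user, account_id, "write")
        # Tagging: the account ID is part of the storage key.
        self._objects[(account_id, name)] = payload

    def get(self, user, account_id, name):
        self._check(user, account_id, "read")
        return self._objects[(account_id, name)]


store = TenantStore()
for perm in ("read", "write"):
    store.grant("alice", "acct-a", perm)
    store.grant("bob", "acct-b", perm)

store.put("alice", "acct-a", "orders-sync", {"steps": ["extract", "map", "load"]})
print(store.get("alice", "acct-a", "orders-sync"))
```

Because the account ID is part of every key and every call is permission-checked, a user of one account cannot read or overwrite another account's objects, even when the object names collide.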
Dell Boomi
Application Application
Link to Datasheet for Dell Boomi
![Page 14: Dell Blueprint for Big Data and Analytics November 2015 Reference Architectures and Engineered Solutions.](https://reader036.fdocuments.us/reader036/viewer/2022062519/5697bfc91a28abf838ca8f94/html5/thumbnails/14.jpg)