© 2015 IBM Corporation
Cloud Developer Certification Preparation
Features of IBM Data Connect in IBM Bluemix
• IT Architect IBM
After you complete this unit, you should understand:
• Key characteristics of IBM Data Connect Service in Bluemix
• How Data Connect improves data collection and analysis
Data is the New Natural Resource
It fuels business decision-making, insights and competitive advantage. Cloud, Big Data and IoT technologies are providing vast amounts of new data in a wide variety of formats that businesses can tap into.
But there are obstacles keeping companies from exploiting data for analytics…
• 80% of the time doing analytics isn’t spent doing analytics, but data preparation
– “up to 80 percent, is spent cleaning, aggregating and organizing it prior to performing any visualization or analytics” – information-management.com
– “business professionals spend more than 40% of their time fixing and validating data before they use it” – Forrester
• Data diversity makes it hard to find, mix & match, & analyze data
– “This complexity necessitates discovery and transformation of data into a consumable format to enable the kind of insightful analysis that uncovers patterns.” – Gartner
• Traditional ETL requires skills that are in limited supply
– “59% of business and technology decision makers say it takes months or years to meet new complex requests to turn data into business intelligence insight” – Forrester
IBM Bluemix Data Connect
A fully managed self-service data preparation and integration service enabling users to easily put data to work
• Access your data regardless of where it resides
• Combine data from multiple relevant data sources
• Transform and cleanse your data and use it with confidence
• Fully managed by IBM in the Cloud to access it from anywhere, anytime
• Pay-as-you-Go and Subscription options to get started quickly
• Powered by Spark for a speedy and responsive experience
• Seamlessly embedded within Watson Analytics
Who benefits:
• Business and Citizen Analysts: Can find and use the data they need to accelerate data-based business decisions using timely, accurate and trusted information
• Developers: Can quickly develop data-rich applications by embedding the Data Connect REST API into new or existing applications
• Data Engineers: Can enable self-service data access to users and deliver data faster and still maintain data governance and security
What can Data Connect do for you?
• Load data for analytics: Access prepared data, wherever it is, and load it to a data service on the cloud
• Control data workflows: Use the Data Connect API to create and control workflow activities from an application
• Map structured data to semi-structured data: Load normalized tabular data into Cloudant NoSQL data stores
• Access data in a hybrid cloud environment: Access data wherever it resides by connectivity to the most common industry data sources, and securely reach into data behind a firewall
• Blend data from multiple sources: Access data from any of the supported sources and combine the data to create a file/table relevant to the target analytics task
• Shape data for analytics: Filter values and columns from the source data, sort, remove duplicates, and understand the quality of the data through standardized scores
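To make the blend, shape and map steps above concrete, here is a minimal plain-Python sketch. It does not use the Data Connect API and is not how the service is implemented. The sample rows and the helper names (`blend`, `shape`, `to_documents`) are hypothetical. The `_id` field follows the Cloudant document convention.

```python
import json

# Illustrative only: plain-Python stand-ins for the blend, shape and map steps.
# None of these names come from the Data Connect API.

customers = [  # hypothetical source A (for example, a relational table)
    {"customer_id": 1, "name": "Ada", "country": "UK"},
    {"customer_id": 2, "name": "Linus", "country": "FI"},
]
orders = [  # hypothetical source B (for example, a CSV file)
    {"order_id": 10, "customer_id": 1, "amount": 25.0},
    {"order_id": 11, "customer_id": 1, "amount": 25.0},
    {"order_id": 11, "customer_id": 1, "amount": 25.0},  # duplicate row
    {"order_id": 12, "customer_id": 2, "amount": 5.0},
]


def blend(left, right, key):
    """Inner-join two lists of row dicts on a shared key column."""
    index = {}
    for row in left:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for r in right for l in index.get(r[key], [])]


def shape(rows, keep_columns, predicate, sort_key):
    """Filter rows and columns, drop duplicate rows, then sort."""
    seen, result = set(), []
    for row in rows:
        if not predicate(row):
            continue
        trimmed = {c: row[c] for c in keep_columns}
        fingerprint = tuple(trimmed.items())
        if fingerprint not in seen:
            seen.add(fingerprint)
            result.append(trimmed)
    return sorted(result, key=lambda r: r[sort_key])


def to_documents(rows, group_key, child_field, child_columns):
    """Fold flat rows into one nested, JSON-style document per group key."""
    docs = {}
    for row in rows:
        doc = docs.setdefault(
            row[group_key], {"_id": str(row[group_key]), child_field: []}
        )
        doc[child_field].append({c: row[c] for c in child_columns})
    return list(docs.values())


blended = blend(customers, orders, "customer_id")
shaped = shape(
    blended,
    keep_columns=["customer_id", "order_id", "amount"],
    predicate=lambda r: r["amount"] >= 10,
    sort_key="order_id",
)
documents = to_documents(shaped, "customer_id", "orders", ["order_id", "amount"])
print(json.dumps(documents, indent=2))
```

The last step shows the idea behind mapping structured data to semi-structured data: several normalized rows collapse into one nested document that a document store could hold.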
Self-Service
A simple and intuitive self-service user interface to visually interact with your data to Access, Combine and Transform it with ease and confidence
• Access data from multiple relevant cloud or on-premises data sources
• Automatically analyzes, profiles and classifies the data upon ingestion
• Delivers quality scores and metrics to understand and trust the content
• Easily refine, enrich and cleanse the data using a robust catalog of transformation operations
• Save, edit, run and schedule activities to deliver the data to a diverse set of targets
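As a rough illustration of what profiling and classifying data on ingestion can involve, the sketch below computes a few per-column metrics in plain Python. The metric names, the classification rules and the sample rows are assumptions made for this example. The slides do not describe how Data Connect computes its quality scores.

```python
import re

# Hypothetical profiling rules; Data Connect's actual scoring is not documented here.


def is_present(value):
    """Treat None and the empty string as missing."""
    return value not in (None, "")


def classify(values):
    """Guess a coarse class for a column from its non-missing values."""
    present = [str(v) for v in values if is_present(v)]
    if not present:
        return "empty"
    if all(re.fullmatch(r"-?\d+(\.\d+)?", v) for v in present):
        return "number"
    if all(re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v) for v in present):
        return "email"
    return "text"


def profile(rows):
    """Return class, completeness and distinct count for every column."""
    report = {}
    for column in rows[0]:  # assumes every row has the same columns
        values = [row.get(column) for row in rows]
        present = [v for v in values if is_present(v)]
        report[column] = {
            "class": classify(values),
            "completeness": len(present) / len(values),
            "distinct": len(set(present)),
        }
    return report


sample = [
    {"id": "1", "email": "ada@example.com", "city": "London"},
    {"id": "2", "email": "", "city": "Helsinki"},
    {"id": "3", "email": "grace@example.com", "city": "London"},
    {"id": "4", "email": "linus@example.com", "city": None},
]
report = profile(sample)
for column, metrics in report.items():
    print(column, metrics)
```

A completeness value below 1.0 flags a column with missing entries, which is the kind of signal a quality score is meant to surface before the data is used.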
Data Access
• Cloud Source: Amazon Redshift, Amazon S3, Apache Hive, Bluemix Object Storage, IBM BigInsights™ on Cloud, IBM Cloudant™ [1], IBM dashDB, IBM DB2® on Cloud, Microsoft Azure, PostgreSQL on Compose, Salesforce, SoftLayer Object Storage
• On-Premises Source: Apache Hive, Cloudera Impala, Hortonworks HDFS, IBM BigInsights™, IBM DB2® LUW, IBM DB2® z/OS, IBM Informix®, IBM PureData for Analytics®, Microsoft SQL Server, MySQL Enterprise & Community Edition, Oracle, Pivotal Greenplum, PostgreSQL, Sybase, Sybase IQ, Teradata
• Cloud Target: Amazon S3, Bluemix Object Storage, IBM Cloudant™ [2], IBM dashDB, IBM BigInsights™ on Cloud, IBM DB2® on Cloud, IBM Watson™ Analytics, Microsoft Azure, PostgreSQL on Compose, SoftLayer Object Storage
• On-Premises Target: Hortonworks HDFS, IBM BigInsights™, IBM DB2® LUW, IBM DB2® z/OS, IBM PureData for Analytics®, Microsoft SQL Server, MySQL Enterprise & Community, Oracle, Teradata
All of the supported targets are compatible with each source – Some version restrictions apply
Data Refinement
75+ Refinement Operations
• Combine: Join
• Reorder: Sort, Columns
• Change Value: Blank Out, Fill Down, Replace, SHA1, MD5, Replace String, Convert String, Null to Zero
• Remove: Filter, Duplicates, Empty Columns Rows, Null Column Rows, Sort Remove Duplicate Rows, One ending newline character, End String
• Rename: Column
• Convert: String to Date, String to Time, String to Number, Number to String
• Standardize: Name, Address, Tax ID, Email Address, Phone Number, Date
• Change Case: Lower Case, Title Case, Upper Case
• Escape: Escape String, Un-escape String
• Quote: Double Quote, Single Quote
• String: Concatenate String, Right Pad String, Return Right Most String, Return Left Most String
• Trim: Collapse Whitespace, Leading & Trailing White Space, Strip All White Spaces, Trim Leading Space
• Date & Time: Derive Timestamp, Time to Number, Time to String, Date to String, Timestamp to Date, Timestamp to Time, Timestamp to String
• Math & Trig: Next Even Number, Exponential of number, Factorial 1, Factorial N, Floor, Next Odd Number, Round, Natural Logarithm, Base 10 Logarithm, Ceiling, Arc cosine, Arc sine, Cosine of angle, Hyperbolic cosine, Degrees, Angle Measure in Radians, Sine of angle, Hyperbolic sine, Tangent of angle, Hyperbolic tangent, Arc tangent, Absolute Number Value, Whole part of real division
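A few of the operations named in the catalog can be approximated with the Python standard library, which may help in picturing what each one does to a value. This is a sketch under assumptions. The function names are made up for this example, and the exact semantics of the corresponding Data Connect operations (for instance, whether Collapse Whitespace also trims the ends) are not stated in the slides.

```python
import hashlib
import re

# Stand-ins for catalog operations; names and exact semantics are assumptions.


def null_to_zero(value):
    """Null to Zero: replace a missing value with 0."""
    return 0 if value is None else value


def title_case(text):
    """Title Case: capitalize the first letter of each word."""
    return text.title()


def collapse_whitespace(text):
    """Collapse Whitespace: squeeze each run of whitespace to one space."""
    return re.sub(r"\s+", " ", text)


def right_pad(text, width, fill=" "):
    """Right Pad String: pad on the right up to the given width."""
    return text.ljust(width, fill)


def sha1_hex(text):
    """SHA1: hex digest of the UTF-8 encoded text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def md5_hex(text):
    """MD5: hex digest of the UTF-8 encoded text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


row = {"name": "  ada   lovelace ", "visits": None, "email": "ada@example.com"}
refined = {
    "name": title_case(collapse_whitespace(row["name"]).strip()),
    "visits": null_to_zero(row["visits"]),
    "email_sha1": sha1_hex(row["email"]),  # hash instead of storing the address
}
print(refined)
```

Chaining small operations like this mirrors how a refinement activity applies a sequence of catalog steps to each column.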
Securely Access Data On-Premises
[Diagram: the Bluemix Data Connect service connects to the Secure Gateway service, which reaches the Secure Gateway client on the on-premises system through an SSH tunnel across the firewall.]
1. Data Connect attempts a remote on-premises connection
2. The request is routed to the Secure Gateway service
3. The Secure Gateway service routes the request to the Secure Gateway client
4. The Secure Gateway client routes the request to the data source
5. If it’s a read, the result is passed back through the Secure Gateway SSH tunnel to Data Connect
6. If it’s a write, the data is written to the on-premises data source
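The six steps above can be traced with a small in-memory simulation. The sketch involves no networking, SSH, or Secure Gateway SDK. Every class and method name is hypothetical and exists only to show the order in which a request passes through the components.

```python
# In-memory model of the request path; all class and method names are hypothetical.


class DataSource:
    """Stands in for the on-premises database."""

    def __init__(self):
        self.rows = [{"id": 1}]

    def handle(self, request):
        if request["op"] == "read":
            return list(self.rows)
        self.rows.extend(request["rows"])  # step 6: a write lands on-premises
        return None


class GatewayClient:
    """Runs behind the firewall and forwards requests to the data source."""

    def __init__(self, source):
        self.source = source

    def forward(self, request, trace):
        trace.append("gateway client -> data source")  # step 4
        return self.source.handle(request)


class GatewayService:
    """Cloud-side service that relays requests to the gateway client."""

    def __init__(self, client):
        self.client = client

    def route(self, request, trace):
        trace.append("gateway service -> gateway client")  # step 3
        return self.client.forward(request, trace)


class DataConnect:
    """Cloud-side caller that starts the on-premises request."""

    def __init__(self, service):
        self.service = service

    def request(self, op, rows=None):
        trace = ["data connect -> gateway service"]  # steps 1 and 2
        result = self.service.route({"op": op, "rows": rows or []}, trace)
        if op == "read":
            trace.append("result -> tunnel -> data connect")  # step 5
        return result, trace


source = DataSource()
data_connect = DataConnect(GatewayService(GatewayClient(source)))

read_result, read_trace = data_connect.request("read")
write_result, write_trace = data_connect.request("write", rows=[{"id": 2}])
print(read_trace)
print(write_trace)
```

The point the model captures is that Data Connect never talks to the data source directly: every request passes through the gateway service and the gateway client in turn.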
Summary
• IBM Data Connect service in Bluemix allows you to:
– Identify relevant data wherever it resides (cloud, on-premises, etc.)
– Transform the data to suit your needs
– Load it to another data store or application (e.g. Watson Analytics) for later use