IBM - SSA · DataStage® to extract, transform and load data from all these sources into a central...


IBM    

Governance - Use Cases

Northern Trust gears up for GDPR through data governance solutions

Mention data governance to many corporate executives and you might get eye rolling and a quick change of subject. Business leaders are more focused on revenue than on meeting regulatory demands. But the rise in enterprise data has also brought government-driven demands for privacy and protections that are forcing companies across the world to adopt tools to meet new requirements.

“Governance is a very abstract concept. Most people want to run away from anything close to governance,” said Sanjay Saxena, senior vice president of enterprise data governance at Northern Trust Corp.

Saxena visited theCUBE, SiliconANGLE’s mobile livestreaming studio, and answered questions from hosts Dave Vellante (@dvellante) and James Kobielus (@jameskobielus) during IBM Fast Track Your Data in Munich, Germany. They discussed the evolution of a data governance program within Saxena’s company and IBM’s involvement. (*Disclosure below.)

Company had little data governance four years ago

Northern Trust, an international financial services company, did not have a significant data governance program four years ago. But market conditions and the expectations of government regulators led the company to develop a robust data control and management program that has slowly gained buy-in from the various department leaders.

“They are accepting the work we are doing, and they want to be a part of it,” Saxena said.

One of the motivations for developing data governance is the General Data Protection Regulation, which requires that companies protect the personal information of European citizens and be able to track where the data flows. The regulation goes into effect in 2018, and penalties for non-compliance are steep.

Northern Trust has moved to meet these new requirements by creating one central repository for all metadata and implementing a process to easily identify any sensitive information. Attention to data governance has actually improved company operations, according to Saxena.

“Customer names and addresses being wrong may not be much to a regulator, but it is really important to our business,” he said.

Saxena turned to IBM for help with data governance because it offered an end-to-end suite instead of segregated tools provided by other vendors.

“We’ve been able to make it a single integrated solution, and that’s really benefited us,” Saxena stated.

(*Disclosure: TheCUBE is a paid media partner for IBM Fast Track Your Data. Neither IBM nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

The Northern Trust Corp. has found the requirements of Europe’s new General Data Protection Regulation to be valuable guidelines in the company’s efforts to ensure clean, quality data, despite the law’s potential as a limiting factor.

“We are treating GDPR as a foundation to our data governance program,” said Sanjay Saxena, senior vice president of enterprise data governance at Northern Trust Corp. Saxena specifically recognizes GDPR as fundamental to the data quality strategy at his company.

Saxena spoke with Rebecca Knight (@knightrm) and Dave Vellante (@dvellante), co-hosts of theCUBE, SiliconANGLE’s mobile livestreaming studio, during the IBM Chief Data Officer Summit event in Boston, Massachusetts. They discussed Saxena’s tools and methods for maintaining data under GDPR, as well as the benefits of highly regulated information for customers and businesses alike. (*Disclosure below.)

As data volume increases and creates issues around discoverability for companies, Saxena is taking an approach rooted in the highest level of protection to provide the best customer experience. “There’s a huge amount of synergy between GDPR and information security. … We are in a systematic fashion trying to figure out where all our sensitive data is and whether it is controlled, protected, etc. … All these things come together very nicely from a GDPR perspective,” he said.

GDPR bolsters CRM initiatives with increased data quality at Northern Trust Corp

Clean data for stronger customer relationships

GDPR provides a guideline for better serving customers when the volume and categories of data are virtually limitless, according to Saxena. “What we are doing in GDPR is mapping out sensitive data across hundreds of applications and creating that baseline for the whole company so that any time a regulator comes in and wants to know where a particular person’s information is, we should be able to tell them in no uncertain terms,” he said.
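As a rough illustration of what mapping sensitive data across applications can look like, the sketch below builds a tiny catalog by tagging column names with sensitivity categories and then answers the question of where a given category of data lives. It is not Northern Trust's or IBM's method: the application names, schemas, and name-matching patterns are all hypothetical, and real tools typically inspect data values as well as column names.

```python
import re

# Hypothetical name-based heuristics; real catalogs also profile the data itself.
SENSITIVE_PATTERNS = {
    "name": re.compile(r"name", re.I),
    "address": re.compile(r"addr|street|postcode", re.I),
    "email": re.compile(r"e?mail", re.I),
}

def tag_columns(schemas):
    """Build catalog entries from {application: {table: [column, ...]}}."""
    catalog = []
    for app, tables in schemas.items():
        for table, columns in tables.items():
            for col in columns:
                for category, pattern in SENSITIVE_PATTERNS.items():
                    if pattern.search(col):
                        catalog.append({
                            "application": app,
                            "table": table,
                            "column": col,
                            "category": category,
                        })
    return catalog

def locations(catalog, category):
    """Return every (application, table, column) tagged with a category."""
    return sorted(
        (e["application"], e["table"], e["column"])
        for e in catalog
        if e["category"] == category
    )

# Hypothetical schemas for two invented applications.
schemas = {
    "crm": {"clients": ["client_name", "email", "balance"]},
    "billing": {"invoices": ["invoice_id", "street_address"]},
}
catalog = tag_columns(schemas)
print(locations(catalog, "address"))
```

Keeping the catalog as flat records makes it easy to answer the regulator-style question in one pass, at the cost of false positives from name matching alone.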

In addition to developing client relationships, regulated data enables more effective marketing for businesses. “Now you have that information tagged, it’s all nicely calibrated in repositories … you can use that for your analytics, you can use that for your topline growth, or even see what your internal processes are that can make you more effective from an operations perspective,” Saxena said.

With an amount of structured and unstructured data that feels incomprehensible, companies like IBM are proving valuable by offering tools to help organize information, according to Saxena. “There are … tools available in the marketplace, including IBM’s tools, which help you map the data. … There are other tools that help you develop a metadata repository,” he stated.

As data becomes increasingly valuable, and quality harder to come by, Northern Trust Corp. is turning its attention to recruitment and talent development to maintain productivity. “It’s hard to recruit in these areas. … We get interns … who have the technology knowledge, and we couple them up with business experts to work in collaborative teams,” Saxena said.

These efforts, coupled with a commitment to strong customer relationships, have Saxena confident in his company’s ability to keep pace with data innovations. “It’s still evolving from my perspective. … Have we attained a state of perfection? No. We are getting there in terms of more optimization, more emphasis, and more money and financials being put on data quality,” Saxena concluded.

(*Disclosure: TheCUBE is a paid media partner for the IBM Chief Data Officer Summit. Neither IBM, the event sponsor, nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)

Jakarta Smart City

Business challenge story

As the Internet of Things (IoT) becomes an increasingly fundamental technology for effective government, cities are embedding sensors in everything from buses and garbage trucks to water systems and public buildings. By analyzing data from those sensors, modern cities will be able to design smarter public services, make wiser policy decisions, and manage day-to-day operations much more efficiently.

Yet even without any investment in sensor networks, today’s cities already contain millions of the most intelligent and versatile “sensors” that have ever existed: human beings. A public-spirited citizen with a smartphone is an incredibly valuable source of data for government agencies, because they will provide accurate feedback on the status of the city’s systems in real time.

The only problem is collecting and analyzing the data fast enough. In Jakarta, a district of 10 million people is divided into five cities, 44 sub-districts, and 267 villages. The city government receives an average of 1,400 messages per day via its custom-built Qlue mobile app, which allows users to submit feedback about public services. On top of this, citizens send an average of 130 SMS messages per day to the governor’s mobile phone, and many more via other channels such as email and Twitter.

Diory Paulus, Head of Data & Analytics at Jakarta Smart City (JSC), a management unit under the Communication, Informatics and Statistics division of the Jakarta Provincial Government, explains: “It’s important to listen and be transparent when citizens send feedback about public services, but in a city as large as Jakarta, the sheer number of messages makes it impossible to respond fast enough if you are handling every message manually. We wanted to find a way to process feedback more quickly and analyze and prioritize the most important issues.

“At the same time, we knew that this was just one of many projects that will involve big data analytics. So, we decided to build a general-purpose big data platform that we could use to capture, store and analyze huge volumes of data for any use-case.”

Transformation story

JSC knew that it needed a platform that could handle every aspect of big data collection, management and analytics. It needs to be able to ingest a huge variety of different types of data: everything from unstructured text on social media to JSON data from the web and relational data from traditional database systems. It also needs to deal with cultural change, as some government departments are only just beginning to move from paper-based processes to digital workflows.

“Data governance is very important,” says Diory Paulus. “The danger with any analytics project is that if you put garbage in, you will get garbage out. And this is especially difficult to guard against when the datasets are so large.”

He continues: “We looked at the Gartner Magic Quadrant and asked several leading companies to show us their products. IBM was one of the few vendors that could answer all our questions and had a truly comprehensive vision, embracing big data collection, data warehousing, advanced analytics and governance.”

IBM® Global Business Services® helped the JSC team design a solution that would provide a big data hub for integrating information from the citizen feedback app and social networks, as well as government services such as transportation, healthcare, water distribution and other departments.

The solution uses IBM InfoSphere® DataStage® to extract, transform and load data from all these sources into a central data lake, built on IBM BigInsights®, and a powerful data warehouse, which runs on IBM PureData® System for Analytics. IBM InfoSphere Information Governance Catalog provides a robust data governance framework, making it easier to align the technical systems with the business requirements and processes. As analysis tools, the solution provides IBM Cognos® Analytics for reporting and dashboarding, IBM SPSS® Modeler for advanced analytics and predictive modeling, and IBM Text Analytics for categorization and sentiment analysis of unstructured text.

“We are taking our first steps with many of these technologies, and the guidance and training we have received from IBM has been very valuable,” says Diory Paulus. “We are very keen to learn and share our knowledge with other government organizations, both within Indonesia and internationally.”

Results story

JSC has already started using the big data platform to streamline the way it handles citizens’ feedback and provide new insight into the most important topics.

Diory Paulus says: “There is a real opportunity to use data to make smarter policy decisions. For example, we can analyze all the feedback we receive, and identify patterns. Recently we looked at the top ten villages, ranked by the number of complaints. We saw that one of our villages had the highest number of complaints for two consecutive months, and when we looked into it, we found that most of the problems were related to garbage collection.

“We plotted all the incidents on a heatmap and overlaid the route of the garbage trucks, which showed us that the most complaints came from some areas where the garbage trucks don’t drive through. We were then able to work with the village council and the Jakarta waste management department to improve the routing and scheduling of the garbage collections, and sure enough, the number of complaints is now decreasing again.

“This is just one example of the power of big data, once you can collect and analyze it effectively. Traffic management is another key area: we can integrate traffic complaints data with data from Waze and the GPS systems in our buses, and get new insight into where the bottlenecks are.”
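The complaint ranking Diory Paulus describes (top villages by number of complaints, then the dominant problem in the worst one) can be sketched in a few lines. This is only an illustration under assumed inputs: the record layout, village names and categories are invented, and JSC's actual analysis ran on the IBM platform rather than on code like this.

```python
from collections import Counter

def top_villages(complaints, n=10):
    """Rank villages by complaint count, highest first."""
    counts = Counter(c["village"] for c in complaints)
    return counts.most_common(n)

def dominant_category(complaints, village):
    """Return the most frequent complaint category for one village, or None."""
    cats = Counter(c["category"] for c in complaints if c["village"] == village)
    return cats.most_common(1)[0][0] if cats else None

# Hypothetical sample records; village names and categories are invented.
complaints = [
    {"village": "Village A", "category": "garbage"},
    {"village": "Village A", "category": "garbage"},
    {"village": "Village A", "category": "traffic"},
    {"village": "Village B", "category": "flooding"},
]

ranking = top_villages(complaints)
worst_village = ranking[0][0]
print(ranking)
print(worst_village, dominant_category(complaints, worst_village))
```

Running the same ranking for each month would show whether one village stays at the top for consecutive periods, which is the pattern that prompted the garbage-route investigation.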

JSC is also investigating ways to use analytics to combat fraud by looking into utilization of the smart cards that citizens use for subsidized school fees and other services. The city is in the process of launching a new “Jakarta One” card, which will integrate a much wider range of payment services. JSC expects that rapid fraud detection capabilities will be increasingly important to ensure that fraudsters cannot take advantage of the system.

Diory Paulus concludes: “With the IBM platform, we are now able to say ‘yes’ whenever a government department asks us for help with data and analytics. We know we can collect and store however much data we need, and we have the analytical power to process it and deliver results quickly.

“We are excited about the next steps on this journey: for example, using Text Analytics to make it easier to categorize and prioritize the messages we receive by email and SMS, or on Twitter. This should save many hours of manual processing, and help us ensure that we always deal with the most important issues as quickly as possible.

“Most important of all, analytics helps us show our citizens that their feedback really makes a difference. By showing the people of Jakarta that we are listening, we can keep them engaged in helping us build a city that is better for everyone.”

CZ

Business challenge story

For the Dutch health insurance company CZ, a fast time-to-market is crucial when testing and launching new software applications for use in its business. CZ not only needs to be able to respond to customer requirements swiftly, but must also meet regulatory deadlines.

For example, each year, insurance companies in the Netherlands need to update their systems to align with changes in the country’s healthcare system. The deadline for completing these changes is “Prinsjesdag” (the day when the reigning Dutch monarch announces the main features of government policy for the coming parliamentary session), and there are significant penalties for any insurer who fails to complete their changes on time.

Ed de Cock, Test Coordinator at CZ, takes up the story: “The pressure is high. Sometimes a new application has to be operational in just a month, while updates of existing applications need to be released every four weeks or even every other day.

“This means our testing environment must work as smoothly and automatically as possible to ensure we can launch applications on schedule. For instance, copying and loading data from our production environment into our testing environment needs to be quick and easy, and it’s also important that the test data is referentially intact, to ensure proper application testing.”

However, to comply with data protection regulations, CZ must also ensure that it protects clients’ privacy as it moves data between its production and non-production environments. These requirements will soon become even more strict, when the upcoming European General Data Protection Regulation (GDPR) comes into force.

Han van der Vinden, Test Manager at CZ, explains: “The regulations state that we must not use real customer data for development and testing purposes, to ensure that our clients’ privacy isn’t threatened in the case of lost or hacked data containing details such as addresses or bank details. It’s not just about complying with regulations; we also need to protect the company’s reputation and ensure that our customers trust us as stewards of their personal information.

“However, we still need to use realistic data to test our applications fully. A solution is to mask our production data (that is, replace any sensitive personal information in the dataset with ‘dummy’ information that isn’t related to any real customers) before we move it over to the test environment.”
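To make the idea of masking while keeping test data referentially intact concrete, here is a minimal sketch. It assumes a keyed hash (HMAC) as the masking technique, so the same real value always maps to the same dummy value in every table. This is a swapped-in illustration, not how CZ's tooling or IBM InfoSphere Optim works; the key, table layouts and replacement names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical key for illustration; a real key would live outside the source code.
SECRET_KEY = b"example-only-key"

# Invented replacement names that belong to no real customer.
FAKE_NAMES = ["Alex Jansen", "Sam de Vries", "Robin Bakker", "Kim Visser"]

def _digest(value: str) -> int:
    """Map a value to a stable integer using a keyed hash."""
    mac = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return int.from_bytes(mac.digest()[:8], "big")

def mask_customer_id(customer_id: str) -> str:
    """Replace a real ID with a dummy one; equal inputs give equal outputs."""
    return f"C{_digest(customer_id) % 10**8:08d}"

def mask_name(name: str) -> str:
    """Replace a real name with one from the dummy list."""
    return FAKE_NAMES[_digest(name) % len(FAKE_NAMES)]

# Hypothetical production rows: claims reference customers by ID.
customers = [{"id": "1001", "name": "Real Person"}]
claims = [{"claim": "A-1", "customer_id": "1001"}]

masked_customers = [
    {"id": mask_customer_id(c["id"]), "name": mask_name(c["name"])}
    for c in customers
]
masked_claims = [
    {"claim": k["claim"], "customer_id": mask_customer_id(k["customer_id"])}
    for k in claims
]
print(masked_customers, masked_claims)
```

Because the mapping is deterministic, the masked claim still points at the masked customer. The modulo step can map two real IDs to the same dummy ID, so a production routine would need a collision check or a wider output range.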

CZ had previously developed its own tool for masking data, but this no longer met privacy legislation standards.

Ed de Cock adds: “Our homegrown tool didn’t anonymize data sufficiently to meet new regulations, and also left room for errors. For example, if you added a new table to your production database, you would also need to remember to add a new table to your testing environment, but because this wasn’t automated, there was a chance of it being forgotten. That could lead to errors and inconsistencies in the testing process.

“What’s more, indicating which dataset you wanted to load into the testing environment was very time-consuming. For instance, if you wanted to select people with certain characteristics, you first needed to run a query to extract that data from the production database, and then load it manually in the testing environment.”

CZ knew it was time to find a new automated approach to test data management that was secure, accurate, and would enable software engineering teams to work at a faster pace.

Results story

With IBM InfoSphere® Optim™ Test Data Management in place, CZ can easily protect its clients’ data, comply with privacy regulations, and meet deadlines for changing regulations, all while cutting costs and boosting efficiency.

“Optim’s subsetting feature is a huge advantage, because it allows us to reduce the amount of data that is copied to our test environments,” says Han van der Vinden. “This reduces the time taken to create copies by around 50 percent, which contributes to faster development cycles overall. It also makes it easier to spot changes in the database structure and keep the test database consistent with the development and production environments.”
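The general idea behind subsetting can be sketched as follows: pick the parent rows that match a selection rule, then copy only the child rows that reference them, so the smaller dataset stays referentially intact. The function, table layout and selection rule below are hypothetical and say nothing about how Optim implements the feature.

```python
def subset(customers, policies, predicate):
    """Keep matching customers plus only the policies that reference them."""
    kept_customers = [c for c in customers if predicate(c)]
    kept_ids = {c["id"] for c in kept_customers}
    kept_policies = [p for p in policies if p["customer_id"] in kept_ids]
    return kept_customers, kept_policies

# Hypothetical rows: each policy references a customer by ID.
customers = [
    {"id": 1, "birth_year": 1950},
    {"id": 2, "birth_year": 1990},
    {"id": 3, "birth_year": 1948},
]
policies = [
    {"policy": "P-10", "customer_id": 1},
    {"policy": "P-11", "customer_id": 2},
    {"policy": "P-12", "customer_id": 3},
    {"policy": "P-13", "customer_id": 3},
]

# Select people with a certain characteristic, here born before 1960.
test_customers, test_policies = subset(
    customers, policies, lambda c: c["birth_year"] < 1960
)
print(len(test_customers), len(test_policies))
```

Driving the selection from a predicate replaces the manual "query, extract, load" steps described earlier with one repeatable call.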

Additionally, the solution’s automation features minimize errors and boost efficiency, as Ed de Cock describes: “Optim helps us select and load data more quickly, apply changes consistently across the testing environment, and ensure that all relevant items are copied when creating test datasets, so we can rely on our test results. The next step is to be able to perform automated tests quickly and accurately. We aim to achieve a 10 percent reduction in overall development cycle time within a year, helping us get new releases into production more quickly.

“By speeding up our software development processes, the solution makes it easier for us to respond swiftly to customer requirements, and meet regulatory deadlines for implementing changes in healthcare policy.”

At the same time as boosting efficiency and accelerating software development, the solution also cuts expenditure. Specifically, by enabling the use of smaller datasets in the test environment, Optim reduces total storage requirements and costs.

Flexibility is another key benefit: whenever new masking requirements arise, CZ can write the specific routines it needs and integrate them seamlessly into its testing processes. This may prove vital in helping CZ comply with new data privacy regulations such as the GDPR.

“The IBM solution empowers us to keep our clients’ personal data safe, protecting the company’s reputation and preserving our customers’ trust,” concludes Ed de Cock. “It also cuts costs and accelerates software development processes, boosting efficiency and enabling us to meet regulatory deadlines.”

Government of Odisha

Business challenge story

The Food Supplies and Consumer Welfare Department of the Government of Odisha offers a lifeline for some of the state’s most vulnerable citizens; its subsidies help to ensure that people living below the poverty line have access to adequate food and other household necessities.

With hundreds of thousands of cases to manage every year, the Department found it challenging to ensure that its benefits were being directed to citizens who were truly in need.

A spokesperson from the Government of Odisha explains: “Most of our benefits claimants truly need our support to make ends meet, but unfortunately there are always individuals who will try to abuse the system. For instance, some people would under-report their income and claim benefits that they were not really eligible for, while others would use multiple ID cards to claim benefits more than once.

“These kinds of fraud divert valuable government resources away from the citizens who need them the most, and we wanted to put a stop to it. The problem was that we lacked an efficient way of confirming claimants’ eligibility for welfare assistance.

“In the past, our best option was to hire external investigators to visit claimants and check on their circumstances, but this was too expensive and time-consuming to apply on a larger scale. We realized that if we wanted to stamp out benefits fraud, we needed to find a smarter way to work. And that’s where IBM came in.”

Transformation story

In a first-of-its-kind project in India’s government sector, the Food Supplies and Consumer Welfare Department used a suite of IBM Information Management solutions to unify citizen data from multiple systems of record, and analyze it to determine which citizens were making wrongful claims.

Over the course of three years, the project team gathered approximately 45 million records from dozens of government offices, including files such as driver’s licenses, electricity bills, income tax returns and pension statements. IBM® InfoSphere® DataStage was used to build reliable, repeatable processes to extract, transform and load this data from the source systems, while IBM InfoSphere QualityStage helped to identify and remediate data quality issues with individual records.

Once all the data has been brought together and stored in a central IBM DB2® database, the solution uses IBM InfoSphere Master Data Management Standard Edition to compare the records from each system, use a probabilistic matching engine to identify records that belong to the same person, and link them to a single “master record” for each citizen.

“The IBM solution provides sophisticated matching algorithms that help us create a single view of each citizen’s income and property, and compare this against our welfare claimant data to identify potential cases of fraud,” recalls a spokesperson. “For instance, if the records show that an individual owns a car or a house, receives a large pension or pays at least a modest amount of income tax, then it means they aren’t living below the poverty line and are not eligible to claim benefits.”
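A toy version of this flow (score pairs of records, group those that appear to belong to the same person, then apply an eligibility rule to each group) might look like the sketch below. It is a simplified stand-in, not the matching engine in IBM InfoSphere Master Data Management: the weights, threshold, field names and sample people are all assumptions, and the "large pension" criterion is left out because the source gives no figure for it.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """String similarity in [0, 1], ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def match_score(r1, r2):
    """Weighted score: name similarity plus exact date-of-birth agreement."""
    name = similarity(r1["name"], r2["name"])
    dob = 1.0 if r1["dob"] == r2["dob"] else 0.0
    return 0.6 * name + 0.4 * dob  # hypothetical weights

def link_records(records, threshold=0.85):
    """Greedily attach each record to the first group whose seed record matches."""
    groups = []
    for rec in records:
        for group in groups:
            if match_score(rec, group[0]) >= threshold:
                group.append(rec)
                break
        else:
            groups.append([rec])  # no match found: start a new master record
    return groups

def flag_ineligible(group):
    """Flag a master record if any linked source shows assets or income tax paid."""
    return any(
        r.get("owns_car") or r.get("owns_house") or r.get("income_tax_paid", 0) > 0
        for r in group
    )

# Invented records from three hypothetical source systems.
records = [
    {"source": "licences", "name": "Ravi Kumar", "dob": "1980-01-01", "owns_car": True},
    {"source": "welfare", "name": "Ravi Kumaar", "dob": "1980-01-01"},
    {"source": "welfare", "name": "Sita Devi", "dob": "1975-05-05"},
]

groups = link_records(records)
for group in groups:
    print([r["source"] for r in group], flag_ineligible(group))
```

The misspelled welfare record is linked to the licence record because the names are close and the birth dates agree, which is what lets the car ownership surface against the welfare claim.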

Looking beyond the welfare system, better information management has the potential to enhance citizen services on a much broader level. With a standardized, centralized store of citizen data, other government departments stand to gain a much more accurate and comprehensive view of citizens, which they can use to drive more efficient and responsive citizen-centric services.

Results story

With the new solutions in place, the Food Supplies and Consumer Welfare Department has a much faster, more accurate and more effective way of identifying fraudulent welfare claims.

“IBM Analytics have completely changed the way that we work,” says a spokesperson. “Today, we can catch fraudsters almost immediately. With the new system, 60 percent of records that previously would have required investigation are flagged up automatically, which means there’s much less need for us to hire investigators to go home-to-home to check up on claimants. This is delivering big administrative cost savings, which we can redirect to other initiatives.”

The solution also helps to improve ongoing fraud detection, as a spokesperson explains: “When someone attempts to register for this welfare program, the system automatically checks their records and, if they do not meet the eligibility criteria, their application will be rejected. By taking a proactive approach, and stopping fraudsters before they even get into the system, we are saving significant effort and resources.”

Most importantly, the new approach is helping to ensure that the right benefits go to the right people.

A spokesperson concludes: “Since introducing IBM Analytics solutions, we have detected around 500,000 wrongful welfare claimants and have removed them from our program. Now, the benefits that they were unjustly receiving can be used to support genuinely needy people instead. We are ensuring that deserving citizens get the help they need, while keeping tighter control over welfare costs and making better use of taxpayer money. It’s a win-win situation for everyone.

“This project is the first of its kind in the Indian government sector, and we are proud to be leading the way and showing what is possible with a modern approach to master data management. We look forward to helping our colleagues in other departments and from other states learn how they can benefit from these technologies too.”