London Jaspersoft Community User Group Event 2 KETL presentation

Post on 14-Apr-2017



Welcome to the London Jaspersoft Community User Group, Thursday 16th June 2016

Introductions and update themes for next event
KETL: Why DQ is important for BI
Implementation case study: Andy Fenn and Alexander McGuire from Workplace Systems
Break
Ernesto: Complex Report Designs with Jaspersoft Studio

http://www.jiem.org/index.php/jiem/article/view/232/130

By 2017, 33% of Fortune 100 organisations will experience an information crisis, due to their inability to effectively value, govern and trust their enterprise information.

Gartner

www.ketl.co.uk

Impact of poor DQ

Estimates vary on the impact of bad data on revenue (10 to 30%!). Audit your own revenue losses from poor data. Factor in opportunity costs too.
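As a rough illustration of the 10 to 30% range quoted above, the following sketch turns an annual revenue figure into a low and high estimate of revenue at risk. The function name, the revenue figure and the opportunity-cost parameter are all hypothetical; a real audit would use your own measured losses rather than a blanket percentage.

```python
def revenue_at_risk(annual_revenue, low_pct=10, high_pct=30, opportunity_cost=0):
    """Return (low, high) estimates of revenue lost to poor data quality.

    low_pct and high_pct default to the 10-30% range quoted on the slide.
    opportunity_cost is an optional flat amount added to both estimates
    (an assumption for illustration only).
    """
    low = annual_revenue * low_pct / 100 + opportunity_cost
    high = annual_revenue * high_pct / 100 + opportunity_cost
    return low, high


# Hypothetical company with 5,000,000 in annual revenue.
low, high = revenue_at_risk(5_000_000)
print(f"Estimated revenue at risk: {low:,.0f} to {high:,.0f}")
```

The width of the resulting range is itself an argument for auditing your own losses instead of relying on industry estimates.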

Measuring the cost of poor DQ

http://www.jiem.org/index.php/jiem/article/view/232/130

Impact of poor DQ in a BI environment

Make DQ part of your BI PoC. It is much harder to go back and address data quality issues after the event.

DQ problems and the resulting ETL issues will likely slow down your BI reporting and put extra strain on your data stores.

Who owns data quality for your BI source systems? This needs to be established, and ideally the BI project team should take responsibility for ensuring that the data in their reports is accurate and consistent.

Get involved in data governance and make DQ a KPI for the BI team.



How is ‘bad’ data entering our systems?

People. Poorly designed data entry fields. Duplicate entries. Multiple data sources. Self-service user entry.


Data profiling measures

1. Accuracy
2. Completeness
3. Timeliness
4. Validity
5. Consistency
6. Uniqueness
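Four of these measures can be computed directly from a dataset, as the minimal sketch below shows using only the Python standard library. The records, field names, email pattern and 365-day freshness threshold are hypothetical and chosen for illustration. Accuracy and consistency are left out because they need a trusted reference source or a second dataset to compare against.

```python
import re
from datetime import date

# Hypothetical customer records; an empty string marks a missing value.
records = [
    {"id": 1, "email": "ann@example.com", "updated": date(2016, 6, 1)},
    {"id": 2, "email": "", "updated": date(2016, 5, 20)},
    {"id": 3, "email": "bob@example", "updated": date(2014, 1, 5)},
    {"id": 3, "email": "bob@example.com", "updated": date(2016, 6, 10)},
]

# Deliberately simple email shape check (an assumption, not a full validator).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, field):
    """Share of rows where the field is present and non-empty."""
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


def validity(rows, field, pattern):
    """Share of non-empty values that match the expected pattern."""
    values = [r[field] for r in rows if r.get(field) not in (None, "")]
    if not values:
        return 0.0
    return sum(1 for v in values if pattern.match(v)) / len(values)


def uniqueness(rows, field):
    """Ratio of distinct values to total rows for the field."""
    values = [r[field] for r in rows]
    return len(set(values)) / len(values)


def timeliness(rows, field, as_of, max_age_days):
    """Share of rows updated within max_age_days of the as_of date."""
    fresh = sum(1 for r in rows if (as_of - r[field]).days <= max_age_days)
    return fresh / len(rows)


profile = {
    "completeness(email)": completeness(records, "email"),
    "validity(email)": validity(records, "email", EMAIL_RE),
    "uniqueness(id)": uniqueness(records, "id"),
    "timeliness(updated)": timeliness(records, "updated", date(2016, 6, 16), 365),
}

for measure, score in profile.items():
    print(f"{measure}: {score:.0%}")
```

Scores like these give a baseline that can be tracked over time, which is one way to treat DQ as a KPI.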

Experian survey on data accuracy


Getting better data

Don’t try a ‘big bang’ approach: it is too daunting. Profile your data. Start with familiar datasets that you know you can improve easily, for quick gains.

You have to start with a very basic idea: data is super messy, and data cleanup will always be literally 80 percent of the work. In other words, data is the problem.

DJ Patil, Chief Data Scientist of the White House

www.ketl.co.uk
13-14 Orchard Street, Bristol BS1 5EH
+44 (0)117 905 5323
info@ketl.co.uk
@KETL_BI

Get in touch

For further information or help with your data project, speak to Helen to see how we can help.

Helen Woodcock
LinkedIn: /in/helenwoodcock
email: helen@ketl.co.uk

References and Further Reading

Data disasters

http://blogs.mazars.com/the-model-auditor/files/2014/01/12-Modelling-Horror-Stories-and-Spreadsheet-Disasters-Mazars-UK.pdf
https://www.sas.com/content/dam/SAS/en_us/doc/whitepaper1/bad-data-good-companies-106465.pdf

Research on corporate data quality

https://www.edq.com/globalassets/uk/papers/global-research-2015_20pp-ext-apr15.pdf
https://www.gartner.com/doc/2636315/state-data-quality-current-practices
https://www.edq.com/uk/resources/infographics/data-machine/

Cost of data quality

http://betanews.com/2015/02/17/why-data-quality-is-essential-to-your-analytics-strategy/
http://www.itbusinessedge.com/interviews/how-to-measure-the-cost-of-data-quality-problems.html
http://www.itbusinessedge.com/blogs/integration/what-does-bad-data-cost.html
http://techcrunch.com/2015/07/01/enterprises-dont-have-big-data-they-just-have-bad-data/
https://www.experian.com/assets/decision-analytics/white-papers/the%20state%20of%20data%20quality.pdf

Data quality in the BI environment

http://searchdatamanagement.techtarget.com/tip/Data-quality-management-for-business-intelligence-projects
http://www.quistor.com/en/blog/entry/why-has-my-bi-become-slow