![Page 1: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/1.jpg)
Thank you to our Sponsors
Breaking Data – Terry Bunio
@Tbunio
Media Sponsor:
![Page 2: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/2.jpg)
When Good Data Goes Bad…
![Page 3: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/3.jpg)
Who Am I?
• Terry Bunio
• Database Administrator
- Oracle, SQL Server 6, 6.5, 7, 2000, 2005, 2012, Informix, ADABAS
• SharePoint fan
• Data Modeler/Architect
- Investors Group, LPL Financial, Manitoba Blue Cross, Assante Financial, CI Funds, Mackenzie Financial
- Normalized and Dimensional
• Agilist
- Innovation Gamer, Team Member, SQL Developer, Test writer, Sticky Sticker, Project Manager, PMO on SAP Implementation
![Page 4: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/4.jpg)
![Page 5: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/5.jpg)
![Page 6: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/6.jpg)
![Page 7: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/7.jpg)
![Page 8: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/8.jpg)
![Page 9: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/9.jpg)
My Blog – www.agilevoyageur.com
![Page 10: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/10.jpg)
Breaking Data
![Page 11: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/11.jpg)
SQL Saturday Winnipeg
• November 22nd – Red River Community College
- Downtown Campus
• First SQL Saturday ever in Winnipeg
• 3rd in Canada after Toronto and Vancouver
• 20 Sessions
• 4 Tracks
- Business Intelligence
- DBA
- Developer
- New Database Technology
![Page 12: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/12.jpg)
March 2 – 3, 2015
Call for Speakers Open!
www.prairiedevcon.com
![Page 13: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/13.jpg)
Question?
• What is broken data?
• How do we fix it?
![Page 14: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/14.jpg)
Objectives
• Mine
- Hopefully introduce a couple of ideas you can take back and improve on
• Yours?
![Page 15: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/15.jpg)
![Page 16: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/16.jpg)
Three types of broken data
• Inconsistent - Easy
• Incoherent - Moderate
• Ineffectual - Hard
![Page 17: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/17.jpg)
Inconsistent
![Page 18: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/18.jpg)
Inconsistent
• All data must have a structure
• Domain
- “In data management and database analysis, a data domain refers to all the values which a data element may contain.” - Wikipedia
![Page 19: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/19.jpg)
Inconsistent
• Domains are a simple way to ensure data consistency
• They are often overlooked because the tools in use don’t promote them
- Hand-rolled SQL DDL and scripts
• Use tools that require you to define Data Domains
- Erwin
- Oracle SQL Developer Data Modeler
• FREE!
• http://www.oracle.com/technetwork/developer-tools/datamodeler/overview/index.html
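For hand-rolled DDL, a domain can be approximated directly in SQL Server. The following is a minimal sketch, not something shown in the talk: the type names `dbo.PostalCode` and `dbo.MoneyAmount` and the `dbo.Customer` table are hypothetical. It uses alias data types so that every column holding the same kind of value shares one definition.

```sql
-- Hypothetical domains, defined once and reused by every column
-- that stores that kind of value.
CREATE TYPE dbo.PostalCode  FROM char(6)        NOT NULL;
CREATE TYPE dbo.MoneyAmount FROM decimal(19, 4) NOT NULL;
GO

-- Hypothetical table that uses the domains instead of raw data types.
CREATE TABLE dbo.Customer (
    CustomerId  int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    PostalCode  dbo.PostalCode,
    CreditLimit dbo.MoneyAmount,
    -- An alias type fixes the base type, length and default nullability only;
    -- a CHECK constraint narrows the allowed values further.
    CONSTRAINT CK_Customer_CreditLimit CHECK (CreditLimit >= 0)
);
```

A modeling tool does this bookkeeping for you; the point of the sketch is only that a postal code declared in ten tables ends up with one definition rather than ten.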
![Page 20: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/20.jpg)
Inconsistent
![Page 21: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/21.jpg)
Inconsistent
• Try to find the inconsistencies in that model!
• Luckily I have used Oracle Data Modeler and defined the following Domains
![Page 22: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/22.jpg)
![Page 23: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/23.jpg)
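When a schema was built without domains, one way to hunt for inconsistencies is to ask the catalog which column names are declared with more than one definition. This is a rough sketch under the assumption that same-named columns are meant to share a domain, which will not hold for generic names such as `Name` or `Code`. `CONCAT` requires SQL Server 2012 or later.

```sql
-- List column names that appear with more than one type/length/precision
-- combination anywhere in the current database.
SELECT  c.COLUMN_NAME,
        COUNT(*) AS column_count,
        COUNT(DISTINCT CONCAT(c.DATA_TYPE, '/', c.CHARACTER_MAXIMUM_LENGTH, '/',
                              c.NUMERIC_PRECISION, '/', c.NUMERIC_SCALE))
            AS distinct_definitions
FROM    INFORMATION_SCHEMA.COLUMNS AS c
GROUP BY c.COLUMN_NAME
HAVING  COUNT(DISTINCT CONCAT(c.DATA_TYPE, '/', c.CHARACTER_MAXIMUM_LENGTH, '/',
                              c.NUMERIC_PRECISION, '/', c.NUMERIC_SCALE)) > 1
ORDER BY distinct_definitions DESC, c.COLUMN_NAME;
```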
Incoherent
![Page 24: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/24.jpg)
![Page 25: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/25.jpg)
Incoherence
![Page 26: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/26.jpg)
Incoherent
• Many databases remain coherent by using Foreign Key Constraints
- These constraints ensure records can’t be stored in one table unless the row they refer to in another table already exists
- These constraints are usually enabled all the time
- They can slow down performance and cause the data to be Ineffectual
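As a small illustration of the rule described above, the sketch below uses two hypothetical tables, `dbo.Province` and `dbo.Office`. With the constraint enabled, a row that points at a missing parent is rejected.

```sql
-- Hypothetical parent and child tables.
CREATE TABLE dbo.Province (
    ProvinceCode char(2)     NOT NULL PRIMARY KEY,
    ProvinceName varchar(50) NOT NULL
);

CREATE TABLE dbo.Office (
    OfficeId     int     NOT NULL PRIMARY KEY,
    ProvinceCode char(2) NOT NULL,
    CONSTRAINT FK_Office_Province
        FOREIGN KEY (ProvinceCode) REFERENCES dbo.Province (ProvinceCode)
);
GO

INSERT INTO dbo.Province (ProvinceCode, ProvinceName) VALUES ('MB', 'Manitoba');

-- Succeeds: the referenced province exists.
INSERT INTO dbo.Office (OfficeId, ProvinceCode) VALUES (1, 'MB');

-- Fails with error 547: there is no province 'ZZ' to refer to.
INSERT INTO dbo.Office (OfficeId, ProvinceCode) VALUES (2, 'ZZ');
```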
![Page 27: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/27.jpg)
Incoherent
• Most databases create the constraints and leave them enabled all the time
• Constraint Double-Whammy
- Slows down the actual insertion/modification of data
- Further slows down code because you validate the code values before you try to insert/update, to avoid throwing a database exception
![Page 28: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/28.jpg)
Incoherent
• Alternative approach
- Leave Constraints disabled
- Attempt to re-enable them periodically (daily or weekly) to report on any invalid data
• You can then correct that data
- Disable constraints again
• In the past this process wasn’t practical because of the length of time it would take
• It now takes only 75 minutes to re-enable 616 Foreign Key constraints on a 1.1 Terabyte MSSQL 2012 database. Thanks Microsoft!
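The cycle above can be sketched for a single table as follows. This is not the script from the demo: the tables `dbo.Branch` and `dbo.Teller` are hypothetical, and the use of `DBCC CHECKCONSTRAINTS` for the reporting step is an assumption (a failed re-enable only names the constraint, not the offending rows).

```sql
-- Hypothetical parent and child tables.
CREATE TABLE dbo.Branch (BranchId int NOT NULL PRIMARY KEY);
CREATE TABLE dbo.Teller (
    TellerId int NOT NULL PRIMARY KEY,
    BranchId int NOT NULL,
    CONSTRAINT FK_Teller_Branch
        FOREIGN KEY (BranchId) REFERENCES dbo.Branch (BranchId)
);
GO

-- 1. Leave the constraints disabled during normal operation.
ALTER TABLE dbo.Teller NOCHECK CONSTRAINT ALL;

-- An orphan row is accepted while the constraint is disabled.
INSERT INTO dbo.Teller (TellerId, BranchId) VALUES (1, 99);

-- 2. Periodically report rows that violate any constraint on the table,
--    including the disabled ones.
DBCC CHECKCONSTRAINTS ('dbo.Teller') WITH ALL_CONSTRAINTS;

-- 3. Correct the invalid data (here, simply remove the orphan).
DELETE FROM dbo.Teller WHERE TellerId = 1;

-- 4. Re-enable with validation; this fails with error 547 if violations remain.
ALTER TABLE dbo.Teller WITH CHECK CHECK CONSTRAINT ALL;

-- 5. Confirm the constraint is enabled and trusted again.
SELECT name, is_disabled, is_not_trusted
FROM   sys.foreign_keys
WHERE  parent_object_id = OBJECT_ID('dbo.Teller');

-- 6. Disable again until the next check.
ALTER TABLE dbo.Teller NOCHECK CONSTRAINT ALL;
```

For a whole database the same statements would be generated per table from `sys.foreign_keys`. Note that while a constraint is disabled or untrusted the optimizer cannot rely on it, which is part of the trade-off.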
![Page 29: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/29.jpg)
Incoherent
• Demo
![Page 30: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/30.jpg)
Ineffectual
• There are three types of database Performance Tuning that you can do to make your data less ineffectual
- Execution Plan / Statistics IO
- SQL Profiler
- OStress
![Page 31: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/31.jpg)
Execution Plan
• Demo
![Page 32: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/32.jpg)
Execution Plan
• 1.sql
![Page 33: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/33.jpg)
![Page 34: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/34.jpg)
Execution Plan
• You then get an Execution Plan tab
• The Execution Plan process has actually become very good at recommending indexes.
• Anyone remember MSSQL Index Wizard?
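The index recommendations shown in the Execution Plan tab are also recorded in the missing-index dynamic management views, so they can be reviewed for the whole workload rather than one query at a time. A minimal sketch follows; the ranking expression in the `ORDER BY` is a common rule of thumb and an assumption here, not an official score.

```sql
-- Missing-index suggestions collected since the last instance restart,
-- roughly ordered by how much they might help.
SELECT TOP (20)
        d.statement             AS table_name,
        d.equality_columns,
        d.inequality_columns,
        d.included_columns,
        gs.user_seeks,
        gs.avg_total_user_cost,
        gs.avg_user_impact      -- estimated percentage improvement
FROM    sys.dm_db_missing_index_details     AS d
JOIN    sys.dm_db_missing_index_groups      AS g
        ON g.index_handle = d.index_handle
JOIN    sys.dm_db_missing_index_group_stats AS gs
        ON gs.group_handle = g.index_group_handle
ORDER BY gs.user_seeks * gs.avg_total_user_cost * gs.avg_user_impact DESC;
```

Treat the output as candidates to evaluate, not as indexes to create blindly: the suggestions do not account for overlapping indexes or for the write cost of maintaining them.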
![Page 35: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/35.jpg)
![Page 36: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/36.jpg)
![Page 37: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/37.jpg)
![Page 38: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/38.jpg)
How to read Execution Plan
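• Index Seek >> Index Scan >> Table Scan (a seek is usually cheaper than an index scan, which is usually cheaper than a table scan)
• Look for steps that account for a large percentage of the overall query cost
- See if those steps are using the right indexes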
• Hover over each step to get details
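A quick way to see the difference between a seek and a scan is to run two logically equivalent queries with the actual execution plan turned on. The sketch below uses a hypothetical `dbo.SalesOrder` table. The first query wraps the indexed column in a function and will typically show an Index Scan; the second compares the bare column to a range and can use an Index Seek.

```sql
-- Hypothetical table with a nonclustered index on the filter column.
CREATE TABLE dbo.SalesOrder (
    SalesOrderId int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    OrderDate    date           NOT NULL,
    Amount       decimal(19, 4) NOT NULL
);
CREATE INDEX IX_SalesOrder_OrderDate ON dbo.SalesOrder (OrderDate);
GO

-- Function applied to the column: every index entry must be examined.
SELECT SalesOrderId
FROM   dbo.SalesOrder
WHERE  YEAR(OrderDate) = 2014;

-- Same rows, expressed as a range on the bare column: the index can be
-- searched directly.
SELECT SalesOrderId
FROM   dbo.SalesOrder
WHERE  OrderDate >= '20140101'
  AND  OrderDate <  '20150101';
```

Because this table has a clustered primary key, neither plan shows a Table Scan; that operator appears only for tables without a clustered index.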
![Page 39: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/39.jpg)
How to read Execution Plan
• Cached plan size – how much memory the plan generated by this query will take up in the stored procedure cache. This is a useful number when investigating cache performance issues because you'll be able to see which plans are taking up more memory.
• Estimated Operator Cost – the optimizer's estimated cost of this step, shown alongside its percentage of the overall query cost
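To compare cached plan sizes across the whole server rather than one plan at a time, the plan cache can be queried directly. This is a minimal sketch that assumes the login has the VIEW SERVER STATE permission.

```sql
-- The ten largest plans currently held in the plan cache.
SELECT TOP (10)
        cp.size_in_bytes / 1024 AS size_kb,
        cp.usecounts,            -- how often the plan has been reused
        cp.objtype,              -- e.g. Adhoc, Prepared, Proc
        st.text AS sql_text
FROM    sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
ORDER BY cp.size_in_bytes DESC;
```

Large plans with a `usecounts` of 1 are usually the first place to look when the cache is under memory pressure.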
![Page 40: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/40.jpg)
How to read Execution Plan
• Estimated Subtree Cost – the accumulated optimizer cost assigned to this step and all previous steps (remember that the plan is read from right to left). The number has no real-world unit; it is a mathematical evaluation the query optimizer uses to determine the cost of the operator in question, and it represents the amount of time the optimizer thinks this operator will take.
• Estimated number of rows – calculated from the statistics available to the optimizer for the table or index in question.
![Page 41: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/41.jpg)
SET STATISTICS IO ON
![Page 42: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/42.jpg)
SET STATISTICS IO ON
• DBCC FREEPROCCACHE
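The two commands on these slides might be combined as in the sketch below, with a query against `sys.objects` standing in for the real query under test. `DBCC FREEPROCCACHE` discards every cached plan on the instance, so the sketch assumes a test server.

```sql
-- Test servers only: removes all cached plans for the whole instance,
-- so the next execution compiles a fresh plan.
DBCC FREEPROCCACHE;
GO

-- Report page reads per table on the Messages tab.
SET STATISTICS IO ON;
GO

-- Stand-in for the query being tuned.
SELECT COUNT(*) FROM sys.objects;
GO

SET STATISTICS IO OFF;
GO
```

The Messages tab then lists, for each table touched, the scan count plus logical, physical and read-ahead reads. Logical reads are the most repeatable number to compare before and after a change, since physical reads depend on what already sits in the buffer cache, which `DBCC FREEPROCCACHE` does not clear.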
![Page 43: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/43.jpg)
![Page 44: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/44.jpg)
SQL Profiler
• Demo
![Page 45: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/45.jpg)
SQL Profiler
![Page 46: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/46.jpg)
![Page 47: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/47.jpg)
SQL Profiler
![Page 48: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/48.jpg)
SQL Profiler
![Page 49: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/49.jpg)
SQL Profiler
![Page 50: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/50.jpg)
![Page 51: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/51.jpg)
SQL Profiler
• You can save traces and replay those traces to simulate load
• There are some limitations though
- Replay RPC events as remote procedure calls
- Replay attention
- Replay DTC transactions
- Replay as part of an automated script === SCALABLE Tool
![Page 52: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/52.jpg)
![Page 53: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/53.jpg)
OStress
• Made up of two utilities:
- Read80Trace (named ReadTrace in current RML Utilities releases)
• Required in order to convert trace files into RML files
- OSTRESS
• Multithreaded ODBC-based query utility. The OSTRESS utility reads input from a command-line parameter. The command-line parameter can be an RML file that is produced by the Read80Trace utility or a standard go-delimited .SQL script file. In stress mode, one thread is created for each connection, and all threads run as fast as possible without synchronization among the threads. You can use this mode to generate a specific type of stress load on the server.
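A go-delimited script of the kind OSTRESS accepts is simply a series of batches separated by `GO`. The file below is a hypothetical `stress.sql`; its queries read only system catalog views so that it runs against any database.

```sql
-- stress.sql: hypothetical OSTRESS input file.
-- Each GO ends one batch; every connection runs the batches in order.
SET NOCOUNT ON;
GO

SELECT COUNT(*) FROM sys.objects;
GO

SELECT TOP (10) name
FROM   sys.tables
ORDER BY name;
GO
```

Assuming a server called MyServer and a database called MyDatabase, a run such as `ostress -SMyServer -E -dMyDatabase -istress.sql -n25 -r10 -oc:\RMLFiles\StressResult` would open 25 connections and repeat the script 10 times on each. Confirm the exact switches against the help output of your RML Utilities version.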
![Page 54: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/54.jpg)
Ostress
• First we need to download and install the tools on the server where we want to run our trace files
- http://www.microsoft.com/en-us/search/Results.aspx?q=ostress&form=DLC
![Page 55: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/55.jpg)
Ostress
• Demo
![Page 56: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/56.jpg)
![Page 57: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/57.jpg)
![Page 58: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/58.jpg)
Convert Trace Files to RML files
• DOS Command
- CD c:\Program Files\Microsoft Corporation\RMLUtils
- ReadTrace -Ic:\TraceFiles\TraceSample.trc -oc:\RMLFiles -T28
• T28 flag
- Important as it allows you to replay the RML file against SQL Server
![Page 59: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/59.jpg)
Ostress
• Now you can simply run Ostress with those RML files
- OSTRESS -creplay.ini -mreplay -T88 -ic:\RMLFiles\*rml -oc:\RMLFiles\ReplayResult
![Page 60: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/60.jpg)
Specific use
• You can run and compare Ostress results when you upgrade SQL Server or other system software and hardware!
• You can compare them using the following command:
- ReadTrace -Ic:\TraceFiles\*.trc -oc:\TraceFiles\ReplayResult\ -dods -f
• You can answer confidently whether the new server can handle the current production load and stress
![Page 61: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/61.jpg)
![Page 62: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/62.jpg)
Review
• The power of this structure is that we can now automate hundreds of threads to replay loads on the database
• This can now also become part of automated testing/continuous integration processes
![Page 63: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/63.jpg)
Whew…
![Page 64: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/64.jpg)
Questions?
![Page 65: Breaking data](https://reader033.fdocuments.us/reader033/viewer/2022052508/55957ed61a28ab01038b479c/html5/thumbnails/65.jpg)