Best Practices of Data Modeling with InfoSphere Data Architect

Best Practices in Data Modeling using InfoSphere Data Architect Dr. Vladimir Bacvanski [email protected]

description

Best practices of data modeling with InfoSphere Data Architect. Covers best practices of logical data models, physical data models, model comparison, working in a team and more. Parts of the presentation are from the InfoSphere Data Architect course http://www.scispike.com/training/infosphere_data_architect_training.html

Transcript of Best Practices of Data Modeling with InfoSphere Data Architect

Page 1: Best Practices of Data Modeling with InfoSphere Data Architect

Best Practices in Data Modeling using InfoSphere Data Architect

Dr. Vladimir Bacvanski [email protected]

Page 2: Best Practices of Data Modeling with InfoSphere Data Architect

www.scispike.com Copyright © SciSpike 2011

Talk Topics

• The value of modeling
• Logical data modeling and best practices
• Physical data modeling and best practices
• Model transformation: hints and tips
• Team productivity and automation

Parts of this presentation are taken from the course: Mastering Data Modeling with InfoSphere Data Architect
http://www.scispike.com/training/infosphere_data_architect_training.html

Page 3: Best Practices of Data Modeling with InfoSphere Data Architect

Demo: InfoSphere Data Architect and Data Models

Page 4: Best Practices of Data Modeling with InfoSphere Data Architect

Value of Data Modeling and InfoSphere Data Architect

Page 5: Best Practices of Data Modeling with InfoSphere Data Architect

Data Models: The Foundation and the Benefits

[Diagram labels: Business, Data Models, Data, Systems]

Business Opportunity:
• Improved Agility
• Reduced Risk
• Reduced Cost
• Increased Effectiveness
• Increased Alignment with IT

Other benefits shown in the diagram:
• Minimized Redundancy
• Compatibility
• Quality
• Security
• Services
• Integration
• Reliability

Page 6: Best Practices of Data Modeling with InfoSphere Data Architect

Data Modeling and Integrated Data Management

[Diagram: Data Models and related areas]
• Model driven development and SOA
• Governance
  – Compliance
  – Policies
  – Standards
• Consistency
  – Naming
  – Semantics
  – Values
  – Security
  – Traceability
• Quality
• Communication & Collaboration

Tools have a central role in integrated data management

Tools are:
– Enablers
– Accelerators

Page 7: Best Practices of Data Modeling with InfoSphere Data Architect

Data Modeling: Traditional High Level Process

[Process diagram]

Activities:
• Logical Data Modeling
• Physical Data Modeling
• Create/Update Data

Inputs and results:
• Process Models
• Data Requirements
• Logical Data Model
• Technical Requirements
• Performance Requirements
• Physical Data Model
• Business Data
• Data

Page 8: Best Practices of Data Modeling with InfoSphere Data Architect

How do we Create Data Models?

Top down
• From knowledge about the problem domain to realization

Bottom up
• From the realization, abstracting to higher levels

Page 9: Best Practices of Data Modeling with InfoSphere Data Architect

Application Development and Data Models

Development Approaches

Process Oriented

Data Oriented

Hybrid (Data and Process in parallel)

Object-Oriented

Service-Oriented

Prototyping

Agile

Page 10: Best Practices of Data Modeling with InfoSphere Data Architect

Data Models, Use Cases and Service Specifications

The vocabulary of use cases and service specifications is best described by information (data) models

Every word in the use case specification that relates to information needs to be present in the model

Use Case / Service Spec.

Information Model

Page 11: Best Practices of Data Modeling with InfoSphere Data Architect

IDA: Key Functional Areas

Key areas of functionality:
• Information Modeling
• Validation
• Discovery
• Database Development
• Lifecycle Management
• Reporting

The tool is easily extendable

Additional features can be delivered as plug-ins
– IBM and third-party

[Diagram: InfoSphere Data Architect on top of the Eclipse Platform]

Page 12: Best Practices of Data Modeling with InfoSphere Data Architect

InfoSphere Data Architect: Key Integrations

[Diagram: InfoSphere Data Architect and its integrations]
• IBM Data Studio
• Optim Solutions
• Rational Software Architect
• Rational RequisitePro
• WebSphere Business Modeler
• Import from other tools
• Industry Solutions
  – Enterprise Modeler Extender
• Open Source

We show only the most common integrations

Page 13: Best Practices of Data Modeling with InfoSphere Data Architect

Logical Data Modeling

Page 14: Best Practices of Data Modeling with InfoSphere Data Architect

Use Multiple Diagrams for a Complex Model

A model is an abstraction of some business area (domain)

A diagram is a visualization of (part of) a model

[Diagram: a Model abstracts the Business Domain; Diagrams visualize the Model]

Best Practice

Page 15: Best Practices of Data Modeling with InfoSphere Data Architect

Demo: Logical Data Models

Page 16: Best Practices of Data Modeling with InfoSphere Data Architect

Use Domain Models Instead of Base Types

Example: the type for the first name of a customer

Defined with base type: First Name: VARCHAR(30)
– This is not optimal:
  • Projects may inconsistently model the name type as VARCHAR of different lengths

Defined with domain type: First Name: NAME
– NAME, defined as VARCHAR(30)

Domain types can be shared across projects

Note: do not confuse the term “Domain Model” in data modeling with the same term in object-oriented (OO) modeling. In OO modeling, domain models (also known as business models) describe the technology-independent problem (business) domain.

Best Practice
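The effect of a shared domain type can be illustrated outside the tool. The Python sketch below is illustrative only and does not use any IDA API: the `DOMAIN_TYPES` registry, the `resolve_type` helper, and the two project dictionaries are hypothetical names. It shows two projects referencing the domain type NAME and resolving to the same base type.

```python
# Hypothetical shared registry of domain types (not an IDA API).
DOMAIN_TYPES = {
    "NAME": "VARCHAR(30)",
}

def resolve_type(type_name: str) -> str:
    """Return the base type for a domain type; base types pass through unchanged."""
    return DOMAIN_TYPES.get(type_name, type_name)

# Two projects model a first-name attribute with the shared domain type.
project_a = {"First Name": "NAME"}
project_b = {"Customer First Name": "NAME"}

resolved_a = {attr: resolve_type(t) for attr, t in project_a.items()}
resolved_b = {attr: resolve_type(t) for attr, t in project_b.items()}

print(resolved_a)
print(resolved_b)
```

Changing the length once in the registry changes it for every attribute that uses NAME, which is the consistency benefit the slide describes.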

Page 17: Best Practices of Data Modeling with InfoSphere Data Architect

Use Glossary Models

A glossary model describes the names and abbreviations that an organization allows for data objects

[Screenshots: the glossary model in the editor and in the Data Project Explorer]

Best Practice

Page 18: Best Practices of Data Modeling with InfoSphere Data Architect

Glossary Words

Prime words
– Key business concepts
– Employee, Company

Class words
– Property / qualifiers of concepts
– Name, Id

Modifier words
– Modifiers of class words
– First, Last, Annual

Page 19: Best Practices of Data Modeling with InfoSphere Data Architect

Abbreviations Best Practices

• Must be meaningful to modelers and users
• Make the abbreviations unambiguous
• Avoid reserved words (e.g. DATE)
• Abbreviate from left to right
• Strive for consistency
• If several words are in a family, they can have the same abbreviation

Many organizations develop their standard abbreviation processes

Best Practice

Page 20: Best Practices of Data Modeling with InfoSphere Data Architect

Define Word Status

Define the status of the words you are using:
• Candidate
  – The default value for new words
• Accepted
  – Accepted by the organization for general use
• Standard
  – The preferred word among the set of synonyms
• Deprecated
  – Replaced by another word
  – Must specify the replacement

Best Practice
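The four statuses and the rule for deprecated words can be sketched as a small data structure. This is a hypothetical Python illustration, not IDA's glossary format: the `Status` enum, the `Word` class, and `check_word` are assumed names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional

class Status(Enum):
    CANDIDATE = "candidate"    # the default value for new words
    ACCEPTED = "accepted"      # accepted by the organization for general use
    STANDARD = "standard"      # the preferred word among the set of synonyms
    DEPRECATED = "deprecated"  # replaced by another word

@dataclass
class Word:
    name: str
    status: Status = Status.CANDIDATE
    replacement: Optional[str] = None  # required when the word is deprecated

def check_word(word: Word) -> List[str]:
    """Return a list of problems found for one glossary word."""
    problems = []
    if word.status is Status.DEPRECATED and not word.replacement:
        problems.append(f"{word.name}: deprecated word must specify its replacement")
    return problems

print(check_word(Word("Staff", Status.DEPRECATED)))
print(check_word(Word("Staff", Status.DEPRECATED, replacement="Employee")))
```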

Page 21: Best Practices of Data Modeling with InfoSphere Data Architect

Use Glossary while Modeling

When entering the name, press Ctrl-Space

Names from the glossary appear

This is a convenient way to stay compliant with the naming rules while modeling

Best Practice

Page 22: Best Practices of Data Modeling with InfoSphere Data Architect

Defining Naming Standards

A naming standard defines the sequences of words that build names

Best Practice
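A naming standard of this kind can be pictured as a pattern over word kinds. The Python sketch below is a hypothetical illustration, not how IDA stores or checks naming standards. It assumes one particular pattern (one prime word, then any number of modifier words, then one class word) and uses the example words from the Glossary Words slide; `GLOSSARY` and `follows_standard` are assumed names.

```python
# Hypothetical glossary: word -> kind (prime, modifier, or class).
GLOSSARY = {
    "Employee": "prime", "Company": "prime",
    "First": "modifier", "Last": "modifier", "Annual": "modifier",
    "Name": "class", "Id": "class",
}

def follows_standard(name: str) -> bool:
    """Check the assumed pattern: one prime word, any modifiers, one class word."""
    kinds = [GLOSSARY.get(word) for word in name.split()]
    if len(kinds) < 2 or None in kinds:
        return False  # too short, or contains a word missing from the glossary
    return (kinds[0] == "prime"
            and kinds[-1] == "class"
            and all(kind == "modifier" for kind in kinds[1:-1]))

print(follows_standard("Employee First Name"))  # True
print(follows_standard("Name Employee"))        # False
```

An organization would choose its own word order; the point is only that a name is checked word by word against the glossary and the agreed sequence.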

Page 23: Best Practices of Data Modeling with InfoSphere Data Architect

Run Name Analysis

Run name analysis before errors and bad names propagate to physical data models

Select a package

Run “Analyze Model”

Best Practice

Page 24: Best Practices of Data Modeling with InfoSphere Data Architect

Model Analysis Rules: Naming Standard

Make sure “Naming standard” is selected

Page 25: Best Practices of Data Modeling with InfoSphere Data Architect

Use Packages to Manage Complex Models

Packages are hierarchical grouping elements that help manage the complexity of a model
– Analogous to folders in a file system
– For this example, we use the Invoice.ldm provided in the IDA help

[Screenshot callouts: Model elements about Invoices; Model elements about Products]

Best Practice

Page 26: Best Practices of Data Modeling with InfoSphere Data Architect

Use Submodels

Break a large model into smaller models

Benefit: modelers can work in parallel on a model
– Submodels can be put under a version control system

Best Practice

Page 27: Best Practices of Data Modeling with InfoSphere Data Architect

Created Submodel

A submodel is stored in a separate file

Submodels that are not used do not need to be opened, which improves performance

[Screenshot callouts: Shortcut to submodel; New model; Original package is here]

Page 28: Best Practices of Data Modeling with InfoSphere Data Architect

VCS: Submodels and Team Sharing

Put submodels in a Version Control System (VCS)
– Only the modified submodel needs to be checked in
– This reduces model collisions
– Merging is easier as the models are smaller

Problems with version control systems and large models:
– Even if only one model property is changed, the whole model needs to be checked in (as it is stored in one file)
– Collision management is cumbersome

Best Practice

Page 29: Best Practices of Data Modeling with InfoSphere Data Architect

Create Overview Diagrams

Best practice: create a diagram that follows the overall organization of your model by dropping diagrams onto the editor
– Readers can navigate by clicking on the diagrams

[Screenshot callouts: A shortcut to a diagram: double-click opens the diagram. You can add shapes, comments and text to the diagram]

Best Practice

Page 30: Best Practices of Data Modeling with InfoSphere Data Architect

Physical Data Modeling

Page 31: Best Practices of Data Modeling with InfoSphere Data Architect

Creating Physical Data Models

Forward engineer:
• Empty model
• Transform from Logical Data Model
• Import from another tool (e.g. ERwin)

Reverse engineer:
• Drag and drop from the Data Source Explorer
• From database
• From DDL

Page 32: Best Practices of Data Modeling with InfoSphere Data Architect

Demo: Physical Data Models

Page 33: Best Practices of Data Modeling with InfoSphere Data Architect

Analyze LDM before Transformation

Window > Preferences > Transform

It is a good practice to have the analyze checkbox checked

Best Practice

Page 34: Best Practices of Data Modeling with InfoSphere Data Architect

Use Consistent Transformation Options

Transformation option groups:
• Naming options
• Data type defaults
• Surrogate keys

Best practice: keep the traceability option checked

Best Practice
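To make the option groups concrete, here is a small Python sketch of a logical-to-physical transformation. It is a simplified illustration under stated assumptions, not IDA's transformation: the `OPTIONS` dictionary, the abbreviations, the default type, and the functions `physical_name` and `transform_entity` are all hypothetical. Each generated column keeps a `source` field pointing back to the logical attribute, which is the idea behind traceability.

```python
# Hypothetical transformation options, loosely mirroring the option groups above.
OPTIONS = {
    "abbreviations": {"Employee": "EMP", "First": "FRST", "Name": "NM"},
    "separator": "_",
    "default_type": "VARCHAR(32)",  # used when an attribute has no type
    "surrogate_key": True,
}

def physical_name(logical_name: str, options: dict) -> str:
    """Abbreviate each word if the glossary knows it, otherwise upper-case it."""
    words = logical_name.split()
    parts = [options["abbreviations"].get(w, w.upper()) for w in words]
    return options["separator"].join(parts)

def transform_entity(entity: dict, options: dict) -> dict:
    """Turn one logical entity into a table description with traceability links."""
    table_name = physical_name(entity["name"], options)
    columns = []
    if options["surrogate_key"]:
        key_name = table_name + options["separator"] + "ID"
        columns.append({"name": key_name, "type": "INTEGER", "source": None})
    for attr in entity["attributes"]:
        columns.append({
            "name": physical_name(attr["name"], options),
            "type": attr.get("type") or options["default_type"],
            "source": f'{entity["name"]}.{attr["name"]}',  # traceability link
        })
    return {"name": table_name, "source": entity["name"], "columns": columns}

employee = {"name": "Employee", "attributes": [
    {"name": "First Name", "type": "VARCHAR(30)"},
    {"name": "Nickname"},
]}
table = transform_entity(employee, OPTIONS)
for column in table["columns"]:
    print(column)
```

Using the same options for every transformation means two modelers transforming the same logical model get the same physical names and types.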

Page 35: Best Practices of Data Modeling with InfoSphere Data Architect

Write Documentation

If there was documentation in the logical data model, it is carried over into the physical data models

Best Practice

Page 36: Best Practices of Data Modeling with InfoSphere Data Architect

Demo: Reverse Engineering Databases

Page 37: Best Practices of Data Modeling with InfoSphere Data Architect

Customize the Generated DDL

Set up your preferences
– Window > Preferences > Data Management > Code Templates

Best Practice

Page 38: Best Practices of Data Modeling with InfoSphere Data Architect

Demo: Publishing and Reporting

Page 39: Best Practices of Data Modeling with InfoSphere Data Architect

Publish your Data Models

Create model and diagram representation for the web or in PDF
– Interesting elements can be selected for reporting
– Use defined templates
– Reports may be customized

Hypertext links connect related elements
– Models can be browsed (almost) as in the modeling tool

Best Practice

Page 40: Best Practices of Data Modeling with InfoSphere Data Architect

Demo: Team Support and Comparison of Models

Page 41: Best Practices of Data Modeling with InfoSphere Data Architect

Use a Team Support / Version Control System

It is a must when working with multiple modelers!
– Retrieve previous versions
– Merge changes: with IDA use model comparison
– Branching and merging

Even when there is just one modeler:
– Provides for history of models
– Ability to retrieve a previous model

Best Practice

Page 42: Best Practices of Data Modeling with InfoSphere Data Architect

Use Model Comparison

We can compare up to three models in IDA

The models need to have a common ancestor model

This is a common situation in teamwork:
– We start with a base model
– Two coworkers independently create two different versions

IDA provides a tool that detects changes to models

IDA can create a delta DDL script

Best Practice
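The teamwork situation above (one base model, two independently edited versions) is a three-way comparison. The Python sketch below shows the general idea only; it is not IDA's comparison algorithm. It assumes each model is flattened into a dictionary from element name to definition, and the function name `three_way_changes` and the sample column names are hypothetical.

```python
def three_way_changes(base: dict, left: dict, right: dict) -> dict:
    """Classify each element by which version changed it relative to the base."""
    result = {"left_only": [], "right_only": [], "same_change": [], "conflict": []}
    for key in sorted(set(base) | set(left) | set(right)):
        b, l, r = base.get(key), left.get(key), right.get(key)
        left_changed, right_changed = l != b, r != b
        if left_changed and right_changed:
            # Both coworkers touched the element: identical edits are harmless.
            result["same_change" if l == r else "conflict"].append(key)
        elif left_changed:
            result["left_only"].append(key)
        elif right_changed:
            result["right_only"].append(key)
    return result

base = {"CUSTOMER.NAME": "VARCHAR(30)", "CUSTOMER.ID": "INTEGER"}
left = {"CUSTOMER.NAME": "VARCHAR(50)", "CUSTOMER.ID": "INTEGER"}
right = {"CUSTOMER.NAME": "VARCHAR(40)", "CUSTOMER.ID": "INTEGER",
         "CUSTOMER.EMAIL": "VARCHAR(80)"}

changes = three_way_changes(base, left, right)
print(changes)
```

Changes made by only one coworker can be merged automatically; only the conflicting elements need a decision, which is why a common ancestor is required.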

Page 43: Best Practices of Data Modeling with InfoSphere Data Architect

Database Model Comparison

[Process diagram, in order:]
• Reverse engineer the database
• Modify model
• Compare with Original Source
• Examine the changes
• Generate delta DDL script
• Execute DDL change script

When maintaining databases, compare the model with the database

Best Practice
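What a delta DDL script contains can be shown with a deliberately small Python sketch. It is an assumption-laden illustration, not the script IDA generates: `delta_ddl` is a hypothetical function, models are plain dictionaries of table to columns, the statements are generic SQL strings, and only new tables, new columns, and dropped tables are handled (type changes and dropped columns are left out).

```python
from typing import List

def delta_ddl(database: dict, model: dict) -> List[str]:
    """Return DDL statements that bring the database in line with the model."""
    statements = []
    for table in sorted(model):
        if table not in database:
            cols = ", ".join(f"{c} {t}" for c, t in model[table].items())
            statements.append(f"CREATE TABLE {table} ({cols});")
            continue
        for column, col_type in model[table].items():
            if column not in database[table]:
                statements.append(
                    f"ALTER TABLE {table} ADD COLUMN {column} {col_type};")
    for table in sorted(database):
        if table not in model:
            statements.append(f"DROP TABLE {table};")
    return statements

# State reverse engineered from the database, and the modified model.
database = {"CUSTOMER": {"ID": "INTEGER"}, "LEGACY": {"ID": "INTEGER"}}
model = {"CUSTOMER": {"ID": "INTEGER", "EMAIL": "VARCHAR(80)"},
         "INVOICE": {"ID": "INTEGER"}}

script = delta_ddl(database, model)
print("\n".join(script))
```

The "Examine the changes" step matters because statements such as the DROP TABLE above are destructive and should be reviewed before the script is executed.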

Page 44: Best Practices of Data Modeling with InfoSphere Data Architect

Getting in Touch + Resources

Email: [email protected]
Blog: http://www.OnBuildingSoftware.com/
Twitter: http://twitter.com/OnSoftware
Hands-on Training: Mastering Data Modeling with InfoSphere Data Architect
http://www.scispike.com/training/infosphere_data_architect_training.html

InfoSphere Data Architect home page:
– http://www-01.ibm.com/software/data/optim/data-architect/

Page 45: Best Practices of Data Modeling with InfoSphere Data Architect

Conclusion

Use the best practices of data modeling with InfoSphere Data Architect!

Learning and applying the best practices will enable you to:
– Work faster with less friction
– Work in a team, reducing conflicts
– Be more agile on a project

Make sure the whole team learns the practices!