Software Group | IBM Israel Software Laboratories SOA ...

Software Group | IBM Israel Software Laboratories SOA Advanced Technologies © 2007 IBM Corporation Joshua Fox Regulatory Compliance through Metadata Mining

Transcript of Software Group | IBM Israel Software Laboratories SOA ...

Page 1: Software Group | IBM Israel Software Laboratories SOA ...

Software Group | IBM Israel Software Laboratories SOA Advanced Technologies

© 2007 IBM Corporation

Joshua Fox

Regulatory Compliance through Metadata Mining

Page 2: Software Group | IBM Israel Software Laboratories SOA ...

Software Group | Israel Software Labs

What Does My IT System Mean?

Real World

Metadata

Page 3: Software Group | IBM Israel Software Laboratories SOA ...

Use Case: Security Marking

A simplified example

Security labeling has many drivers

Focusing here on the semantics

Page 4: Software Group | IBM Israel Software Laboratories SOA ...

Use Case: Security Marking

[Figure: eleven items, three labeled “Weaponization-related” and eight labeled “Not Weaponization-related”]

Page 5: Software Group | IBM Israel Software Laboratories SOA ...

Biotech Lab

A lab takes its first DoD contract

Needs DIACAP approval; cannot risk non-compliance

Needs to apply security markings for access control in the Information Sharing Environment

Page 6: Software Group | IBM Israel Software Laboratories SOA ...

The Metadata

Metadata for structured (machine-read) data

Database schemas

Web service WSDLs

COBOL copybooks

UML & DoDAF Models

Page 7: Software Group | IBM Israel Software Laboratories SOA ...

Security Markings: Find Subject

Find all info services in the semantic area of, e.g., “weaponization”

Metadata Repository holds service descriptions, database schemas, other metadata

Repository also holds standard categories from data dictionary

Tool proposes categorization

Analyst uses this as input, saving valuable manual-analysis time

Semantics

Metadata

Page 8: Software Group | IBM Israel Software Laboratories SOA ...

Historical Metadata (MD) Situation

MD in small quantities

Scattered in: DBA teams, development teams

Page 9: Software Group | IBM Israel Software Laboratories SOA ...

Background

Trends in leading-edge enterprises

Large, cross-organization metadata repositories

Page 10: Software Group | IBM Israel Software Laboratories SOA ...

The Promise:

Governance across the organization,

but…

Page 11: Software Group | IBM Israel Software Laboratories SOA ...

Mess of Metadata

[Figure: many scattered <xsd> schema documents]

Page 12: Software Group | IBM Israel Software Laboratories SOA ...

Heterogeneity in Metadata

Different technologies: XML, RDB, UML

Different structures and terminologies

[Figure: many <xsd> schema documents]

Page 13: Software Group | IBM Israel Software Laboratories SOA ...

Confused Semantics in Metadata

Tank?

Army

Navy

Page 14: Software Group | IBM Israel Software Laboratories SOA ...

Confused Semantics in Metadata

“Secure”

NSA: No eavesdropping

Air Force: Buy it

Army: Guard the perimeter

Marines: Storm it

Navy: Lock the door, turn off the lights

Page 15: Software Group | IBM Israel Software Laboratories SOA ...

Huge Quantities of Metadata

[Figure: a large number of <xsd> schema documents]

Page 16: Software Group | IBM Israel Software Laboratories SOA ...

Older Approaches

Build taxonomy/ontology

Map it to the metadata

Metadata (e.g., XSD)

Ontology

Page 17: Software Group | IBM Israel Software Laboratories SOA ...

Older Approaches Don’t Work

Page 18: Software Group | IBM Israel Software Laboratories SOA ...

Older Approaches Don’t Work

Painstaking human labor

Page 19: Software Group | IBM Israel Software Laboratories SOA ...

Older Approaches Don’t Work

Painstaking human labor

High-cost labor: IT + business knowledge

[Graphic: dollar signs]

Page 20: Software Group | IBM Israel Software Laboratories SOA ...

Older Approaches Don’t Work

Painstaking human labor

High-cost labor: IT + business knowledge: Consultants!

[Graphic: dollar signs]

Page 21: Software Group | IBM Israel Software Laboratories SOA ...

Older Approaches Don’t Work

Painstaking human labor

High-cost labor with IT + business knowledge: Consultants!

Beyond human limits

[Graphic: dollar signs and sad faces]

Page 22: Software Group | IBM Israel Software Laboratories SOA ...

New Opportunities Created By:

Moore’s Law

Great progress in Data Mining: searching, classifying and organizing

Recent innovative uses: Terrorist Threat Analysis, Security, Web 2.0, Google

Page 23: Software Group | IBM Israel Software Laboratories SOA ...

The Time is Right

Well-known search and information-management techniques

Now, apply them to metadata

Page 24: Software Group | IBM Israel Software Laboratories SOA ...

Functional Architecture

[Diagram labels, in the order extracted:]

Compliance

Metadata Repository

Persistence

Semi-automation of mapping

Engine

Business Functionality, Access, Reporting

Real-Life Meaning

Ontology (AKA taxonomy, dictionary, glossary, logical model, categories)

Mapping (ontology <-> metadata)

Page 25: Software Group | IBM Israel Software Laboratories SOA ...

Methodology

1) Prepare Metadata

2) Set up Categories

3) Machine Learning

4) Suggest Category

Page 26: Software Group | IBM Israel Software Laboratories SOA ...

(1) Prepare Metadata

a) Load metadata into repository

b) Pre-process metadata into:

   Text: e.g., “Deployment”, “Location”

   Structure: e.g., “Deployment:Location” to represent Table and Column
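The pre-processing in step (b) can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the function names `split_identifier` and `preprocess` are hypothetical, and splitting on underscores and camelCase boundaries is an assumption about how schema identifiers are typically written.

```python
import re

def split_identifier(name):
    """Split an identifier such as 'Troop_Deployment' or 'unitLocation' into lowercase words."""
    # Insert a space at lowercase-to-uppercase boundaries (camelCase)
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    # Split on anything that is not a letter or digit (underscores, spaces, ...)
    return [w.lower() for w in re.split(r"[^A-Za-z0-9]+", spaced) if w]

def preprocess(table, column):
    """Return text tokens plus one structure token for a table/column pair."""
    text_tokens = split_identifier(table) + split_identifier(column)
    structure_token = f"{table}:{column}"  # keeps the Table:Column relationship
    return text_tokens + [structure_token]

features = preprocess("Deployment", "Location")
print(features)
```

The text tokens let a classifier match on vocabulary, while the structure token preserves which column belongs to which table.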

Page 27: Software Group | IBM Israel Software Laboratories SOA ...

(2) Set up Categories

(AKA taxonomy, ontology, glossary, data dictionary, business model, domain model)

a) Follow Security Classification Guide

b) May use Community-of-Interest (CoI) vocabulary

c) DoD Discovery Metadata Specification (DDMS) for categories

d) Keep it simple!

Page 28: Software Group | IBM Israel Software Laboratories SOA ...

(3) Machine Learning

a) Train on a sample of metadata items

b) Provide semantic category mappings for this sample

c) Standard Bayesian classification algorithms learn which words are common or uncommon in each category
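As an illustration of step (c), here is a small multinomial naive Bayes classifier with add-one smoothing. It is a sketch of the general technique, not the engine used in the talk; the class name `NaiveBayes`, the category "other", and the training words are all hypothetical.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of training items
        self.vocabulary = set()

    def train(self, tokens, category):
        self.doc_counts[category] += 1
        self.word_counts[category].update(tokens)
        self.vocabulary.update(tokens)

    def log_scores(self, tokens):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokens:
                # Smoothed log likelihood of the token given the category
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            scores[category] = score
        return scores

    def classify(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)

# Hypothetical hand-labeled training sample (steps a and b)
training = [
    (["warhead", "yield"], "weaponization"),
    (["missile", "payload"], "weaponization"),
    (["payroll", "employee"], "other"),
    (["cafeteria", "menu"], "other"),
]
model = NaiveBayes()
for tokens, category in training:
    model.train(tokens, category)

suggestion = model.classify(["missile", "yield"])
print(suggestion)
```

Words seen often in one category's training items raise that category's score; smoothing keeps unseen words from zeroing out a category.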

Page 29: Software Group | IBM Israel Software Laboratories SOA ...

(4) Suggest Category for Metadata Item

a) Preprocess metadata

b) Submit to classification engine

c) Receive suggested category

d) Proceed with analysis

[Diagram labels: Classification Engine, Metadata, Analyst]

Humans and machines complementing each other
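One way to make the human/machine hand-off concrete is to pass a suggestion to the analyst only when the engine is reasonably confident, and otherwise flag the item for manual analysis. The sketch below assumes the engine returns a log score per category; the function name `suggest`, the 0.8 threshold, and the sample scores are hypothetical.

```python
import math

def suggest(log_scores, min_confidence=0.8):
    """Turn per-category log scores into a suggestion for the analyst.

    Returns (category, confidence). The category is None when the best
    score is not confident enough and the item needs manual analysis.
    """
    top = max(log_scores.values())
    # Normalize the scores to probabilities; shifting by the maximum
    # avoids numerical underflow in exp()
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(weights.values())
    best = max(weights, key=weights.get)
    confidence = weights[best] / total
    if confidence < min_confidence:
        return None, confidence
    return best, confidence

# Hypothetical engine output for two metadata items
clear_item = {"weaponization": -3.0, "other": -7.0}
unclear_item = {"weaponization": -5.0, "other": -5.2}

print(suggest(clear_item))
print(suggest(unclear_item))
```

The threshold is a tuning choice: a higher value sends more items to the analyst, a lower value trusts the engine more.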

Page 30: Software Group | IBM Israel Software Laboratories SOA ...

Understand Your IT: Use Cases

Legacy Transformation: What business services are hiding in your legacy applications?

Reuse: Where is a service with this business functionality?

Fast Start for Community of Interest

Page 31: Software Group | IBM Israel Software Laboratories SOA ...

Use Case: SOX Reporting

[Figure: eleven items, two labeled “Financial” and nine labeled “Non-Financial”]

Page 32: Software Group | IBM Israel Software Laboratories SOA ...

SOX Compliance

Real World

Metadata

A Telco needs to comply with SOX to avoid penalties

Build reports from all info services with “financial” information

Metadata repository holds services, DB schemas, etc.

Tool proposes categorization

Analyst can find relevant data sources more quickly, then build report

Page 33: Software Group | IBM Israel Software Laboratories SOA ...

Why Mine the Metadata

Services: Invocation-level data is transient

Metadata already expresses semantics of the data

Metadata uncoupled from ever-changing data

[Figure: metadata versus data]

Metadata: Table: Troop_Deployment, Column: Total

Data: table Troop_Deployment, column Total, with values 154,650 and 25,390

Page 34: Software Group | IBM Israel Software Laboratories SOA ...

Mining the Metadata: More Secure

Tool & human analyst do not access actual data

Human analyst can avoid accessing even the metadata

[Figure: metadata versus data]

Metadata: Table: Troop_Deployment, Column: Total

Data: table Troop_Deployment, column Total, with values 154,650 and 25,390

Page 35: Software Group | IBM Israel Software Laboratories SOA ...

Data Mining

Complements metadata mining

Build metadata from data

Differentiate on the resource level

[Figure: metadata versus data]

Metadata: Table: Deployment, Column: Location

Data: table Deployment, column Location, with values “DC LAN 1” and “Baghdad LAN 2”
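"Build metadata from data" could, for example, mean deriving descriptive tokens from a sample of column values and adding them to the column's metadata tokens. The sketch below is one possible reading, not the method from the talk: the function name `value_tokens` and the choice of the most frequent words are assumptions. Unlike pure metadata mining, this step does read actual data values.

```python
import re
from collections import Counter

def value_tokens(sample_values, top_n=3):
    """Derive descriptive tokens for a column from a sample of its values."""
    counts = Counter()
    for value in sample_values:
        # Keep alphabetic words only; numbers such as LAN indexes are dropped
        counts.update(w.lower() for w in re.findall(r"[A-Za-z]+", value))
    return [word for word, _ in counts.most_common(top_n)]

# Sample values from the Deployment.Location column on the slide
sample = ["DC LAN 1", "Baghdad LAN 2"]

# Metadata-derived tokens enriched with data-derived tokens
enriched = ["deployment", "location"] + value_tokens(sample)
print(enriched)
```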

Page 36: Software Group | IBM Israel Software Laboratories SOA ...

Simplicity

Our focus (left) versus the alternative (right):

Metadata | Data

Structured Data | Documents

Coarse-Grained | Fine-Grained

Classification | Search, metadata-internal relationships, transformation-building

Schema-to-Semantics | Schema-to-Schema

Feasible | Long-term Research

Reusable Functionality | Specialized Functionality

Business Value | Technical Value

Page 37: Software Group | IBM Israel Software Laboratories SOA ...

Summary

Real World

Metadata

Too much metadata: humans need help

Use your metadata repository

Understand your metadata

Identify relevant metadata

Comply with regulations using IT metadata

Metadata mining: The time is right

Page 38: Software Group | IBM Israel Software Laboratories SOA ...

Joshua Fox

Metadata Analytics

Israel Software Labs

IBM

[email protected]

http://www.joshuafox.com

Thank you
