Successfully Kickstarting Data Governance's Social Dynamics: Define, Collaborate, Validate
[email protected] 6th 2011, Chicago
It’s all about the context
What would Google do?
Did I take the wrong Gates to the web?
Let’s get more computing power
World View
Information is meaningful data
Information is today’s currency
Lack of context leads to unreliable data
‣ A ‘Customer’ for Marketing is not the same as ‘Customer’ for Finance
‣ Field ‘Customer’ in CRM is not equal to field ‘Customer’ in ERP
Solution: Data Governance
Bringing business and IT together to govern data as an Enterprise Asset
What does my data mean?
How can I involve all stakeholders?
How do I operationalize DG?
Positioning
The Closed World Syndrome
requirements and functionality known and specific
data model agreed locally and refers to organisational concepts
usually cryptically stored in proprietary format (vendor lock-in)
only understood by designer
designed for the purpose of one organisation
all facts about the domain are already stored; facts not stored are presumed false
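The last point, the closed-world assumption, can be made concrete with a minimal sketch. The fact store, the names `ask_closed_world` and `ask_open_world`, and the example companies are all hypothetical and only illustrate the difference between "presumed false" and "unknown":

```python
from enum import Enum


class Answer(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"


# Hypothetical fact store: (subject, predicate, object) triples.
FACTS = {
    ("Acme", "is_a", "Customer"),
}


def ask_closed_world(fact):
    """Closed world: a fact that is not stored is presumed false."""
    return Answer.TRUE if fact in FACTS else Answer.FALSE


def ask_open_world(fact):
    """Open world: a fact that is not stored is simply unknown."""
    return Answer.TRUE if fact in FACTS else Answer.UNKNOWN
```

Asking either function about `("Acme", "is_a", "Customer")` gives `Answer.TRUE`. They differ only for a fact that was never recorded, such as `("Globex", "is_a", "Customer")`: the closed-world version answers `Answer.FALSE`, the open-world version `Answer.UNKNOWN`.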
The Fairy Tale of the Everlasting Golden Record
‣ the golden record is a single version of truth for a limited period of time
‣ considered an install-once fix rather than a discipline to pursue
‣ contradicts the inherent dynamics of online B2B communications
‣ unscalable and unsustainable
‣ need for governance to oversee
Limits of Data Integration in The Extended Enterprise
users, usage context, and applications
largely unknown a priori
ontologies refer to language-neutral, context-independent
concepts agreed by the community
systems must combine by interoperation
Sounds familiar?
What does “Customer” mean?
“Customer” is a type of Party of Person that orders at least two Product Items per Year.
So “Customer” refers to a class with attributes Pname, Paddress,...?
...and a Party can either be an Individual or a Company...
Aha, and what types of Product Item exist?
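The business-side definition in this exchange can be written down as an executable rule rather than as a list of attributes. This is a minimal sketch under stated assumptions: the class names, the `order_dates` field and the `is_customer` function are hypothetical, and only the rule itself ("at least two Product Items per Year", a Party being an Individual or a Company) comes from the dialogue:

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class PartyKind(Enum):
    INDIVIDUAL = "individual"
    COMPANY = "company"


@dataclass
class Party:
    name: str
    kind: PartyKind
    # Hypothetical field: one date per Product Item ordered.
    order_dates: list = field(default_factory=list)


def is_customer(party, year, min_items=2):
    """A Party is a Customer if it ordered at least `min_items` items in `year`."""
    items_in_year = sum(1 for d in party.order_dates if d.year == year)
    return items_in_year >= min_items
```

Under this rule a Party with a single order in a given year is not a Customer for that year, which is exactly the kind of detail a bare class with Pname and Paddress attributes leaves implicit.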
Feed the metadata repository giant?
walled garden walhalla
Empowering Information Governance
Banking customer
‣ Goal: reduce time needed to do end of year closing of general ledger from 40 to 10 days
‣ Problem: takes too long because of manual reconciliation
‣ Root cause: conflicting general ledger account taxonomies (apples and oranges)
‣ Solution: build a shared business vocabulary
Technology company
‣ Goal: meaningful reporting on corporate level
‣ Problem: inaccurate reporting (e.g., on “Customer Install Base”)
‣ Root cause: lack of Governance across organizational boundaries and lack of business ownership of data
‣ Solution: create common understanding, agreement and ownership on the business level
Government
‣ Goal: link data between different government bodies and agencies
‣ Problem: integration is costly and painful
‣ Root cause: bad quality data in various formats
‣ Solution: effects of business rule change can be tested and operationalized for data transformation and validation
Utilities
‣ Goal: obtain correct understanding of assets for reporting
‣ Problem: registration has never been done sufficiently
‣ Root cause: lack of clarity on “what things actually are”
‣ Solution: governance organization
So where do we start?
‣ Each industry has its own specifics – start with what it provides
‣ Each organization has its own specifics – start with what you already have
Cyc
Cyc is an artificial intelligence project that attempts to assemble a comprehensive ontology and knowledge base of everyday common sense knowledge, with the goal of enabling AI applications to perform human-like reasoning.
Source: http://en.wikipedia.org/wiki/Cyc
Cyc (in facts & figures)
‣ Started in 1984 by Doug Lenat.
‣ Name comes from the stressed syllable of 'encyclopedia'.
‣ 70 million dollars and 700 person-years of work,
‣ 600,000 concepts, defined by 2,000,000 axioms, organized in 6,000 microtheories,
‣ But not enough applications to support continued research.
‣ In 2004, the Cyc project was scaled back, and more emphasis was placed on developing applications.
Source: John F. Sowa (1 September 2009)
Wikipedia
"an effort to create and distribute a free encyclopedia of the highest possible quality to every single person on the planet in their own language"
Source: http://en.wikipedia.org/wiki/Wikipedia
Wikipedia (in facts & figures)
‣ Created in 2001 by Jimmy Wales and run by a non-profit organization, the Wikimedia Foundation.
‣ More than 14,000,000 articles in more than 260 languages
‣ There are 11,062,835 registered users, including 1,702 administrators, while the foundation employs fewer than 35 people.
‣ Net loss of 49,000 editors in first 3 months of 2009 versus loss of 4,900 editors in first 3 months of 2008... (WSJ, 23 November 2009)
Source: http://en.wikipedia.org/wiki/Wikipedia:About
Too much freedom?
Evolution evolving
December 3, 2001: initial version.
July 13, 2002: from controversial to commonly accepted in 2 hours.
October 1, 2002: debut of biology grad student at Harvard, good for a total of 79 edits over 3 years.
August 9, 2004: black line indicates deletion as vandalism (half of all vandalisms are corrected within 5 minutes).
March 29, 2005: longest point, discussion to reduce to neutral point of view.
September 19, 2005: edit war, with rollbacks rolled back several times.
from IBM Watson Research
Agile methodology
‣ set up communities to reflect your organizational structure
‣ determine roles and responsibilities to reconcile and validate business vocabularies and rules within these communities
‣ define workflows and tasks to streamline this whole process around the clock
‣ monitor completeness of terms and rules
‣ analyze performance of contributors and tune accordingly
‣ transparency about use and lineage of business vocabulary and rules in technical systems
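The "monitor completeness" step can be sketched as a simple score per vocabulary term. This is only an illustration under assumptions: the required fields (`definition`, `owner`, `steward`) and the function names are hypothetical, and each organization would choose its own criteria:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical list of attributes a term needs before it counts as complete.
REQUIRED_FIELDS = ("definition", "owner", "steward")


@dataclass
class Term:
    name: str
    definition: Optional[str] = None
    owner: Optional[str] = None
    steward: Optional[str] = None


def completeness(term):
    """Fraction of required fields that are filled in (0.0 to 1.0)."""
    filled = sum(1 for f in REQUIRED_FIELDS if getattr(term, f))
    return filled / len(REQUIRED_FIELDS)


def incomplete_terms(terms):
    """Names of terms that still miss at least one required field."""
    return [t.name for t in terms if completeness(t) < 1.0]
```

A report built on `incomplete_terms` gives each community a concrete to-do list, which is one possible way to feed the "analyze performance of contributors" step.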
Structuring communities
‣ Based on the organizational chart (i.e., direct alignment with business units)
‣ Based on functional division (e.g., Sales and Marketing, Finance)
‣ Based on regional division (e.g., per country, region, …)
‣ Based on Lines of Business
‣ Based on existing Subject Areas
‣ Can span different organizations or enterprises
‣ Community entails ownership, which means “know thyself or thy customer”…
Functional example
Workflow and roles
‣ Standard roles exist from various sources: thought leaders, DAMA, frameworks (e.g., steward, owner, council, …)
‣ Various workflows are needed:
‣ Intake, approval, publication, decommission
‣ Notifications
‣ Validation
‣ Promote, demote, hire, fire members
‣ Dependency validation
‣ Every organization has its own roles and workflows – whatever works best for it
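The intake, approval, publication, decommission workflow above can be sketched as a small state machine. The status names and the strictly linear transition table are assumptions made for illustration; a real workflow would likely add rejection paths and role checks:

```python
from enum import Enum


class Status(Enum):
    INTAKE = "intake"
    APPROVED = "approved"
    PUBLISHED = "published"
    DECOMMISSIONED = "decommissioned"


# Hypothetical transition table: each status lists the statuses it may move to.
ALLOWED = {
    Status.INTAKE: {Status.APPROVED},
    Status.APPROVED: {Status.PUBLISHED},
    Status.PUBLISHED: {Status.DECOMMISSIONED},
    Status.DECOMMISSIONED: set(),
}


def advance(current, target):
    """Return the new status, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Keeping the transitions in a table rather than in code branches makes it easy for each organization to substitute its own workflow.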
Stakeholder performance
If airplanes were like your systems…
Courtesy of Poppy Quintal (see www.aecma.org/Publications/SEnglish/senglish.htm)
… would you still board them?
How?
‣ Results: faster and more efficient handling of tasks, fewer errors, less accountability, more precision, easier communication, ... Control over complexity
‣ Combination of limited vocabulary (about 1000), larger unlimited set of more technical terms, rules and standards
Courtesy of Poppy Quintal www.aecma.org/Publications/SEnglish/senglish.htm
Conclusions
‣ Existing tools are insufficient for handling complexity in the information age:
  ‣ There is little focus on the sought-after business audience.
‣ Semantic technology is available: it helps understand what our data means
  ‣ Agreed, clear and formal meaning, ready for use in systems.
‣ Data governance: to keep data understood, we need to “run it right”:
  ‣ Combo of technology, organisation, methodology, and culture.
Questions & Feedback?
Read and watch more:
•Website: www.collibra.com
•Thought leader sessions: http://www.collibra.com/thought-leader-sessions
•Blog: inside.collibra.com
•Twitter: @collibra
•E-mail: [email protected]
Thank you