© copyright 2011 Semantic Insights™. Automated Traceability of Key Success Factors through...
© copyright 2011 Semantic Insights™
Automated Traceability of Key Success Factors through
Lifecycle Documents
Chuck Rehberg, Chief Scientist
Semantic Insights™ (A division of Trigent Software, Inc.)
20-Nov-2011
Executive Summary
• Problem Addressed
– The need to verify that high-level requirements are transformed and reflected through all levels of project documents.
• Solution Proposed - A system that:
– Given a list of Key Success Factors and a potentially large set of documents
– Generates a “Traceability Report” mapping each Key Success Factor to Specific Statements in Specific Sections in Specific Documents
• Solution Requirements
– Domain-specific Semantic Data (Dictionary, Ontology, Experiences, Language)
– List of Key Success Factors stated in natural language
– Set of documents
• New Technologies Employed (recently patented or patent pending)
– Natural Language Processing (non-statistical, dictionary-driven, with word-sense disambiguation (WSD))
– Meaning maps (multiple ways of saying the same thing)
– Generation of focused high-speed rules-based document “readers”
– Natural language report generation
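The shape of a “Traceability Report” (Key Success Factor to document to section to statement) can be illustrated with a small Python sketch. This is only an assumption about how such a mapping might be held in memory; the `Hit` record, the function names, and the sample document and section names are hypothetical and are not taken from SIRA.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One statement matched to a Key Success Factor (hypothetical record)."""
    ksf: str        # the Key Success Factor text
    document: str   # source document name
    section: str    # section heading inside the document
    statement: str  # the matching statement


def build_traceability_report(hits):
    """Group hits as KSF -> document -> section -> [statements]."""
    report = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for hit in hits:
        report[hit.ksf][hit.document][hit.section].append(hit.statement)
    # Convert to plain dicts so that a missing key raises KeyError.
    return {
        ksf: {doc: dict(sections) for doc, sections in docs.items()}
        for ksf, docs in report.items()
    }


def render(report):
    """Render the report as indented plain text."""
    lines = []
    for ksf, docs in report.items():
        lines.append(ksf)
        for doc, sections in docs.items():
            lines.append(f"  {doc}")
            for section, statements in sections.items():
                lines.append(f"    {section}")
                lines.extend(f"      - {s}" for s in statements)
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        Hit("Data are visible across the enterprise.", "overview.doc",
            "2.1 Scope", "All data assets shall be visible to authorized users."),
        Hit("Data are visible across the enterprise.", "taxonomy.doc",
            "4 Capabilities", "Data visibility is a core capability."),
    ]
    print(render(build_traceability_report(sample)))
```

Nesting by document and then by section mirrors the “Specific Statements in Specific Sections in Specific Documents” wording of the report.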
Functional Overview
[Diagram. Labels: Key Success Factors (Mission, Goals, Requirements...); Document Repository (High-level Documents, Intermediate-level Documents, Lower-level Documents); SIRA: find and index all semantic references; Domain-specific Semantic data (Dictionary, Ontology, …); Report mapping each Key Success Factor to Specific Statements in Specific Sections in Specific Documents; CIO; Domain Experts.]
The basic process
• Identify Key Success Factors
– Key Success Factors include natural language statements of Mission, Goals and other Requirements, often taken directly from the initial program documents.
• Provide Domain-specific Semantic data
– This includes: Dictionary, Ontology, Experience and Language
• Provide access to the document corpus to be automatically read and analyzed
• Generate the desired report
Sample “Key Success Factors”
1. Function as one unified DoD Enterprise, creating an information advantage for our people and mission partners.
2. Provide a rich information sharing environment in which data and services are visible, accessible, understandable, and trusted across the enterprise.
3. Provide an available and protected network infrastructure (the GIG) that enables responsive information-centric operations using dynamic and interoperable communications and computing capabilities.
4. Drive the fundamental concepts of net-centricity across all missions of the Department of Defense to ensure that all applicable DoD programs, regardless of Component or portfolio, comply with the DoD net-centric vision and enable agile, collaborative net-centric information sharing.
Unpacking the Semantics of “Key Success Factors” (KSF)
• Each KSF statement embodies a number of semantically distinct assertions.
• For example, from the preceding list:
– “Provide a rich information sharing environment in which data and services are visible, accessible, understandable, and trusted across the enterprise.”
• The System unpacks this KSF statement into these basic requirements:
– Environment provides information.
– Information is shared.
– Services are understandable across the enterprise.
– Services are accessible across the enterprise.
– Services are visible across the enterprise.
– Services are trusted across the enterprise.
– Data are understandable across the enterprise.
– Data are accessible across the enterprise.
– Data are visible across the enterprise.
– Data are trusted across the enterprise.
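The combinatorial part of this unpacking (two subjects times four qualities gives eight assertions) can be sketched in Python. This is a toy illustration that assumes the clause has the fixed shape “X and Y are A, B, and C across Z”; it uses a regular expression rather than the dictionary-driven NLP the slides describe, and it does not derive “Environment provides information.” or “Information is shared.” The names `unpack` and `split_list` are hypothetical.

```python
import re
from itertools import product

# Assumed clause shape: "<subjects> are <qualities> across <scope>."
CLAUSE = re.compile(
    r"(?P<subjects>.+?) are (?P<qualities>.+?) (?P<scope>across .+?)\.?$"
)


def split_list(text):
    """Split 'a, b, and c' or 'a and b' into ['a', 'b', 'c']."""
    parts = re.split(r",\s*(?:and\s+)?|\s+and\s+", text.strip())
    return [p for p in parts if p]


def unpack(clause):
    """Expand a coordinated clause into one assertion per subject/quality pair."""
    match = CLAUSE.match(clause.strip())
    if match is None:
        raise ValueError(f"clause does not fit the assumed shape: {clause!r}")
    subjects = split_list(match["subjects"])
    qualities = split_list(match["qualities"])
    scope = match["scope"]
    return [
        f"{subject.capitalize()} are {quality} {scope}."
        for subject, quality in product(subjects, qualities)
    ]


if __name__ == "__main__":
    clause = ("data and services are visible, accessible, understandable, "
              "and trusted across the enterprise.")
    for assertion in unpack(clause):
        print(assertion)
```

Applied to the relative clause of the sample KSF, the sketch yields the eight “Data are ...” and “Services are ...” assertions listed above.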
Providing Domain-specific Semantic data (continuing the previous example)
• Beyond the normal everyday meanings, you may need to specify domain-specific semantics for these terms:
– Environment
– Information
– “share” as in “to share information”
– Services
– Enterprise
– understandable
– accessible
– visible
– Data
• Such semantic information includes:
– Linguistic metadata such as “part of speech” and usage
– An Ontology specifying relevant generalizations, specializations, composition, and relationships to other concepts.
• Note: The following basic demo uses only the predefined English dictionary and grows the initial Ontology dynamically
High-speed machine reading
• Readers
– The system generates special-purpose high-speed readers capable of quickly “reading” a large set of documents.
– The goal of the reader is to identify statements which semantically overlap each of your Key Success Factors.
– Domain-specific semantic information will be used to increase the accuracy of the results of the high-speed reader.
• Implications and Inferences
– The system further uses domain-specific knowledge to find statements that imply support for your Key Success Factors.
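To make the idea of a generated “reader” concrete, here is a deliberately simple Python sketch: each assertion is compiled once into a rule, and a small meaning map supplies alternative wordings for the same quality. Everything here is an assumption for illustration. The real readers are described as rules-based and semantic, whereas this stand-in uses plain regular expressions, and the entries in `MEANING_MAP` are invented.

```python
import re

# Hypothetical meaning map: alternative wordings for the same quality.
MEANING_MAP = {
    "visible": ["visible", "discoverable"],
    "accessible": ["accessible", "reachable"],
}


def compile_reader(subject, quality):
    """Compile a rule that fires when a sentence mentions the subject
    followed by any known wording of the quality."""
    wordings = MEANING_MAP.get(quality, [quality])
    alternatives = "|".join(re.escape(w) for w in wordings)
    return re.compile(
        rf"\b{re.escape(subject)}\b.*\b(?:{alternatives})\b",
        re.IGNORECASE,
    )


def read(sentences, readers):
    """Yield (rule_name, sentence) for every sentence a reader matches."""
    for sentence in sentences:
        for name, pattern in readers.items():
            if pattern.search(sentence):
                yield name, sentence


if __name__ == "__main__":
    readers = {
        "Data are visible": compile_reader("data", "visible"),
        "Services are accessible": compile_reader("services", "accessible"),
    }
    # Made-up sentences standing in for document text.
    sentences = [
        "All data shall be discoverable by authorized users.",
        "Services must be reachable from any site.",
        "The budget is reviewed annually.",
    ]
    for name, sentence in read(sentences, readers):
        print(f"{name}: {sentence}")
```

Compiling the rules once and reusing them across every sentence is the only “high-speed” aspect this sketch tries to capture.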
Introducing PriArt
Enter Key Success Factor Statement
“Unpacked” KSF Form (demo purposes)
Select Architecture Documents to Read
Architecture/
Appendix B_Draft OV-5a_IEA xxxxxxxxxxxx.doc
Appendix F_Draft GIG 2.0 Alignment with DoD IEA xxxxxxxxxx.doc
AV-1_Initial Draft DoD IEA xxxxxxxxxxxxx.doc
CV-1 (rev1)_Initial Draft DoD IEA xxxxxxxxxxxxx.doc
CV-2_Initial Draft DoD IEA xxxxxxxxxxxx.doc
DoD EIEA AV-1_Vxxxxxxxxxxxx.doc
Draft Activity Decomposition Overview (OV-5a)_IEA xxxxxxxxxx.pptx
Draft Document Framework Description_IEA xxxxxxxxxxxxx.doc
Draft IE Capabilities Taxonomy (CV-2)_IEA xxxxxxxxxxxxxxxxx.doc
Draft IE Capability Vision (CV-1)_IEA xxxxxxxxxxxxxxxxx.doc
Draft IE Operational Concept (OV-1)_IEA xxxxxxxxxxxxxxxxx.doc
Draft Integrated Dictionary (AV-2)_IEA xxxxxxxxxxx.xlsx
Draft Integrated Document_IEA xxxxxxxxxxxxxxxxxxxxxxx.doc
Draft Operational Viewpoint_IEA xxxxxxxxxxxxxxxxxxxx.doc
Draft Overview and Summary (AV-1)_IEA xxxxxxxxxxxxxxxxxxx.doc
Draft Updated EA Compliance Req_IEA xxxxxxxxxxxxxxxxxxx.doc
Operational Context Initial Draft DoD IEA xxxxxxxxxxxxxxx.doc
OV-1_Initial Draft DoD IEA xxxxxxxxxxxxxxx.doc
OV-5a_Initial Draft DoD IEA xxxxxxxxxxxxxxx.doc
OV-6a_Initial Draft DoD IEA xxxxxxxxxxxxxxx.doc
Revised EA Compliance Initial Draft DoD IEA xxxxxxxxxxxxxxx.doc
Report on a set of Architecture Documents showing mapping within one document.
(Draft Integrated Document_IEA v2_Sep Deliverable_20110916.doc)
Report on a set of Architecture Documents showing all references to a selected KSF.
PDF Report follows same format as on-line report
Generated Bibliography
Bibliography
[ 1 ] 1321800012561.doc. Retrieved on 11/20/2011 10:58:38, from
http://192.168.2.104/DoD_IT_Source/Architecture/Draft IE Operational Concept (OV-1)_IEA xxxxxxxxxxxxx.doc
[ 2 ] 1321799657677.doc. Retrieved on 11/20/2011 10:59:15, from
http://192.168.2.104/DoD_IT_Source/Architecture/Draft Integrated Document_IEA xxxxxxxxxxxxxxxx.doc
[ 3 ] 1321800012569.doc. Retrieved on 11/20/2011 11:18:59, from
http://192.168.2.104/DoD_IT_Source/Architecture/Draft Operational Viewpoint_IEA xxxxxxxxxxxxxxxx.doc
[ 4 ] 1321799650176.doc. Retrieved on 11/20/2011 11:20:46, from
http://192.168.2.104/DoD_IT_Source/Architecture/OV-6a_Initial Draft DoD IEA xxxxxxxxxxxxxxx.doc
[. . .]
However, the information source may refer to new terms and new concepts
• SIRA combines both advanced linguistics and semantics to discover and learn newly encountered items
• Example: using linguistic placeholders like “the unknown thing” (#?#)
This Investigation:
Environment provides information.
Services are understandable across the enterprise.
Services are accessible across the enterprise.
Services are visible across the enterprise.
Services are trusted across the enterprise.
Data are understandable across the enterprise.
Data are accessible across the enterprise.
Data are visible across the enterprise.
Data are trusted across the enterprise.
Becomes:
#?# are accessible across the #?#.
#?# are trusted across the #?#.
#?# are understandable across the #?#.
#?# are visible across the #?#.
#?# provides information.
• The investigation now includes anything that asserts these relationships
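The step from specific assertions to “unknown thing” patterns can be sketched in Python. This is an illustration only: it assumes the two sentence shapes that appear in the investigation above, and the functions `generalize` and `to_regex` are hypothetical. Literal pattern matching also stands in for the semantic matching the slides describe, so it is far stricter than the real system.

```python
import re

PLACEHOLDER = "#?#"


def generalize(assertion):
    """Replace the subject (and the scope noun, if any) with the placeholder."""
    match = re.fullmatch(r"(\w+) (are \w+ across the) (\w+)\.", assertion)
    if match:
        return f"{PLACEHOLDER} {match[2]} {PLACEHOLDER}."
    match = re.fullmatch(r"(\w+) (provides \w+)\.", assertion)
    if match:
        return f"{PLACEHOLDER} {match[2]}."
    return assertion  # unknown shape: leave unchanged


def to_regex(pattern):
    """Turn a placeholder pattern into a regex; each #?# matches any phrase."""
    literal_parts = [re.escape(part) for part in pattern.split(PLACEHOLDER)]
    return re.compile("(.+?)".join(literal_parts), re.IGNORECASE)


if __name__ == "__main__":
    investigation = [
        "Environment provides information.",
        "Services are understandable across the enterprise.",
        "Services are accessible across the enterprise.",
        "Services are visible across the enterprise.",
        "Services are trusted across the enterprise.",
        "Data are understandable across the enterprise.",
        "Data are accessible across the enterprise.",
        "Data are visible across the enterprise.",
        "Data are trusted across the enterprise.",
    ]
    # Duplicates collapse: nine assertions become five patterns.
    patterns = sorted(set(generalize(a) for a in investigation))
    for pattern in patterns:
        print(pattern)
```

Each `#?#` becomes a capture group, so a match also reports what filled the placeholder; those fillers are the newly encountered items.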
Applied to a Portfolio of existing Systems
Portfolio/
03a_CDD for GFM DI Incr xxxxxxxxxxxxxxxxxx.pdf
2007-07-23 xxxxxxxxx CDD.doc
a4542xx.pdf
a5265xx.pdf
Army DIMHRS CDD xxxxxxxx.pdf
DAI CDD Appendices approved xxxxxxxxxxx.pdf
Document-NECC CDD xxxxxxxxxxxxx.doc
DRAFT NCES CDD xxxxxxxxxx.doc
Final CDD draft xxxxxxxxx.doc
GCSS-A_MS B_CDD_xxxxxxxxx.doc
GCSS_FoS_MA_ICD_xxxxxxxxxx.pdf
GIGMAICX.pdf
Joint_JET-_NN_CDD_Appendix_xxxxxxxxxxxxxxxxxxx.doc
Lightweight FSP CDD xxxxxxxxxxx.doc
NSWCDD-MP-xxxxxxxxxxx.pdf
Unmanned Systems ICD Draft xxxxxxx.doc
Over 2000 pages of source documents become 30 pages of points relevant to the investigation with bibliography and hyperlinks
The need for more accuracy
• By using “the unknown thing” (#?#) you can identify items which represent the specific kinds of items desired. For example (samples from previous report):
– Environment
  • Cloud Computing
  • SOA
– Information
  • Survival Information
  • potential or impending attack based on intelligence law enforcement and open source information
  • Information on security relevant events
– Services
  • NECS Services
– Enterprise
  • The GI Analytical Environment
  • GCSS-Army
– Data
  • User Profile
• However #?# can also identify items which may be outside our interest. Perhaps this is not relevant:
– “Program-specific assessments from this literature will provide tailored information. [4]”
Enhance the Ontology to increase accuracy
• SIRA naturally uses your investigation and information sources to automatically extend the current Ontology
• However, many relationships such as generalization, specialization, instantiation, and composition are often not explicitly stated in the text. These may be required to identify relevant information.
• By adding semantic information to the Ontology, the system automatically extends the subsequent semantic research to include these concepts/terms (and their synonyms) where appropriate.
• In short, you can introduce a concept and corresponding terms, define the relevant semantic relationships and linguistic metadata, and begin using the term in your investigation right away.
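A minimal Python sketch of this idea: once a concept, its synonyms, and its generalization links are added, placeholder fillers can be kept or dropped depending on whether they are a known kind of the concept under investigation. The `Ontology` class and its methods are hypothetical and are not SIRA's data model. The Environment samples (Cloud Computing, SOA) come from the earlier slide; the synonym “the cloud” is an invented example.

```python
class Ontology:
    """Tiny concept store: synonyms plus 'is a kind of' links (hypothetical)."""

    def __init__(self):
        self.parents = {}   # concept -> set of more general concepts
        self.synonyms = {}  # term -> canonical concept

    def add_concept(self, concept, kind_of=(), synonyms=()):
        """Register a concept, its generalizations, and its synonyms."""
        key = concept.lower()
        self.parents.setdefault(key, set())
        self.synonyms[key] = key
        for parent in kind_of:
            parent_key = parent.lower()
            self.parents.setdefault(parent_key, set())
            self.synonyms.setdefault(parent_key, parent_key)
            self.parents[key].add(parent_key)
        for term in synonyms:
            self.synonyms[term.lower()] = key

    def is_a(self, term, concept):
        """True if term (or its synonym) is the concept or a specialization."""
        start = self.synonyms.get(term.lower())
        if start is None:
            return False
        target = concept.lower()
        seen, stack = set(), [start]
        while stack:  # walk up the generalization links
            node = stack.pop()
            if node == target:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(self.parents.get(node, ()))
        return False


if __name__ == "__main__":
    ontology = Ontology()
    ontology.add_concept("Environment")
    ontology.add_concept("Cloud Computing", kind_of=["Environment"],
                         synonyms=["the cloud"])
    ontology.add_concept("SOA", kind_of=["Environment"],
                         synonyms=["service-oriented architecture"])

    fillers = ["Cloud Computing", "SOA", "Program-specific assessments"]
    relevant = [f for f in fillers if ontology.is_a(f, "Environment")]
    print(relevant)
```

In this sketch a filler that the ontology does not know is simply rejected; a real system would more likely flag it for review than discard it.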
Reporting and Queries
• As a result of machine reading, the information source documents are semantically indexed relative to the investigation.
• The system supports interactive queries of this semantic index. Report templates and content can be dynamically defined to render the query results.
• All semantic information can be exported.
• “Key Success Factors” are just one example of an investigation. There is no restriction on the nature and content of an investigation.
Who we are
• Semantic Insights is the R&D division of Trigent Software, Inc. www.trigent.com
• We focus on developing semantics-based information products that produce high-value results serving the needs of general users requiring little or no training.
• Visit us at www.semanticinsights.com
Chuck Rehberg
• As CTO at Trigent Software and Chief Scientist at Semantic Insights, Chuck Rehberg has developed patented high-performance rules engine technology and advanced natural language processing technologies that empower a new generation of semantic research solutions.
• Chuck has more than twenty-five years in the high-tech industry, developing leading-edge solutions in the areas of Artificial Intelligence, Semantic Technologies, analysis and large-scale configuration software.