Post on 21-Aug-2015
Redefining Perspectives A thought leadership forum for technologists interested in defining a new future
June - 2015
COPYRIGHT ©2015 SAPIENT CORPORATION | CONFIDENTIAL
Session 2
Semantic Search – the technology and its application in financial markets
Search
http://www.indiatechonline.com/images/special_feature/idc-emc-suudy-on-digital-universe-165.jpg
Search
Keyword based Search Engine
Search Engine User
“Give me what I Said”
ENTERPRISE ECOSYSTEM
Search – Enterprise Ecosystem
60% of Enterprise Data are Unstructured
Structured Data
Trading
Reference
Security
Search Silos
Keyword Based Search
Semi Structured Data
Wiki
Vendor Data
Reports
“Give me what I asked”
Unstructured Data
Research
Company Filings
Feeds Data
Search Silos
Custom Search Application
Custom Search Application
Semantic Search – Making Results Relevant
Context & Intent based,
Meaning & Relationships among words
Semantic Search – Making Results Relevant
Disambiguate
“Give me what I Want; Not just what I Say”
Search – Enterprise Semantic Ecosystem
Enterprise Semantic Search
Knowledge Discovery
Enterprise Content Enterprise Semantic Search
Linked Data
DBPedia
Freebase
Internal Knowledge Base
Enterprise Data Models
Content Extraction
Context Mapping
Contextual Meaning
Inferencing
Structured
Unstructured
Semi-Structured
We’ll focus on…
• An investment bank use case from the financial domain
• How a Semantic Search platform is built technically, in line with the use case
The Use Case – Investment Research
Current Scenario
A typical Research team in an Investment Bank performs the following:
• Manually gather research information
• Analyze gathered documents to find requested information
Challenges:
• High volume of research corpus
• Manual Analysis results in:
  • Inaccuracy
  • Longer Response Time
  • Time to Market
  • Lower ROI
Options (Pros/Cons)
• Automate Routine Requests: Faster response, but limited benefit. Problem still remains for Complex Information Requests.
• Outsourcing Research Team: Potential Cost Savings. Problem not solved but moved to a different place. QoS risks.
• Ontology Based Semantic Search: Faster Response. More Relevant and Contextual Search Results. Knowledge Discovery through Inferencing. Domain and technology expertise required.
The Use Case – Investment Research
Build a Semantic Search platform that leverages the latest advancements in Search and Natural Language Processing to make the Investment Research experience significantly more efficient and effective
Maximize ROI on Market Research Spending
Get Insight to Timely Industry Information
Find and Discover Actionable Knowledge
Make Informed Investment Decisions
The Use Cases – Potential Search Queries
‘HSBC Holdings Plc’ ‘Asset Write down’ Asia
Interest rate risk private banks Western Europe
Documents about banks based out of Paris that talk about interest rate volatility in Western Europe
Companies in Eastern Europe whose turnover is greater than $100 million and that face the challenge of nationalization
Show me documents about Retail Banks in South Asia whose P/E ratio is greater than 20.0
Do a proximity search on ‘Regulatory Change’ with reference to ‘Retail Banking’
Looking for documents published by HSBC and authored by Ronit Ghose
Enabling Semantic Search - Approaches
1. Lexicon and Ontological Based Search
2. Statistical Analysis and/or Pattern Matching Search
Enabling Semantic Search – 4 Pillars
Reasoning Engines
Natural Language Processing
Ontology
Semantic Analysis
Enabling Semantic Search – Core Concepts
Model defined using constructs for:
• Concepts – classes
• Relationships – properties (object and data)
• Rules – axioms and constraints
• Instances of concepts – individuals (data)
• Uses W3C standards RDF/S and OWL
Relationships
Concepts/Classes
Instances
What is an Ontology?
It is a knowledge model: an assembly of concepts in which all possible relationships that might exist among the concepts are explicitly mapped. It captures knowledge so that:
• Questions can be answered
• New insights can be generated
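The constructs listed on this slide (concepts, relationships, rules, instances) can be sketched in plain Python. This is only an illustration under stated assumptions: the names `Concept`, `Individual`, `RetailBank`, `headquarteredIn` and `peRatio` are hypothetical, and a real platform would express the model in RDF/S or OWL rather than in Python objects.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Concept:
    """A class in the ontology, optionally with a parent concept."""
    name: str
    parent: Optional["Concept"] = None

    def is_a(self, other: "Concept") -> bool:
        # Walk up the hierarchy until we find `other` or run out of parents.
        node = self
        while node is not None:
            if node == other:
                return True
            node = node.parent
        return False


@dataclass
class Individual:
    """An instance of a concept, with object and data properties."""
    name: str
    concept: Concept
    object_props: dict = field(default_factory=dict)  # links to other individuals
    data_props: dict = field(default_factory=dict)    # literal values


# Hypothetical concepts (classes)
organization = Concept("Organization")
bank = Concept("Bank", parent=organization)
retail_bank = Concept("RetailBank", parent=bank)
city = Concept("City")

# Hypothetical individuals (instance data)
paris = Individual("Paris", city)
example_bank = Individual(
    "ExampleBank",
    retail_bank,
    object_props={"headquarteredIn": paris},
    data_props={"peRatio": 21.5},
)


def satisfies_rule(ind: Individual) -> bool:
    # A simple constraint (rule): every Bank must be headquartered in a City.
    if not ind.concept.is_a(bank):
        return True
    hq = ind.object_props.get("headquarteredIn")
    return hq is not None and hq.concept.is_a(city)
```

Because `RetailBank` is declared as a kind of `Bank`, a question about banks can also be answered with retail banks, which is the kind of relationship-driven answer the ontology is meant to enable.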
Enabling Semantic Search – Core Concepts
Data stored in Triples
Expressed as Subject : Predicate : Object
Internal Knowledge Base
DISCOVER NEW INSIGHTS
Pranab Mukherjee : Lives in : New Delhi
New Delhi : Is in : India
Inferred: Pranab Mukherjee : Lives in : India
Get me documents about Retail Banks in Eastern Europe which have net profit greater than $10 million and are facing challenges of nationalization
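The triple example on this slide (Pranab Mukherjee, New Delhi, India) can be reproduced with a few lines of Python. This is a minimal sketch, not how a triple store or reasoning engine works internally: the single rule "lives in + is in implies lives in" is hard-coded as an assumption, and the function names `infer` and `query` are hypothetical.

```python
# Facts stored as (subject, predicate, object) triples.
triples = {
    ("Pranab Mukherjee", "lives in", "New Delhi"),
    ("New Delhi", "is in", "India"),
}


def infer(facts):
    """Apply one rule until no new triples appear:
    (x, lives in, y) and (y, is in, z)  =>  (x, lives in, z)
    """
    known = set(facts)
    while True:
        derived = {
            (x, "lives in", z)
            for (x, p1, y) in known if p1 == "lives in"
            for (y2, p2, z) in known if p2 == "is in" and y2 == y
        }
        if derived <= known:  # nothing new was derived, so stop
            return known
        known |= derived


def query(facts, subject=None, predicate=None, obj=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for (s, p, o) in facts
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


inferred = infer(triples)
print(query(inferred, subject="Pranab Mukherjee", predicate="lives in"))
```

The query returns both the stated fact (New Delhi) and the inferred one (India), although "lives in India" was never stored explicitly.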
Putting It All Together - Application Process Flow
Content Providers
Content Extraction & Standardization
Standardized Document
Step 1
Content Ingestion
Classification
Ontology Tagging
Meta Data
Document Store
XML and Triple Storage
Indexing & Querying
Step 2
Content Delivery
Search
Engagement
Step 3
Components – Content Extraction & Standardization
Unstructured Content
Text Extraction & Standardization
Metadata
Extracted Textual Content
• Extract Meaning from Unstructured Data
• Transform into Structured Data for Auto Tagging
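A rough sketch of the standardization step, assuming the extracted text arrives as "Key: value" header lines, a blank line, then the body. That input layout, the `standardize` function and the XML element names are all hypothetical; the slides only state that content is standardized and later held in XML storage.

```python
import xml.etree.ElementTree as ET


def standardize(raw_text: str) -> ET.Element:
    """Split 'Key: value' header lines from the body and wrap both in XML."""
    header, _, body = raw_text.partition("\n\n")
    doc = ET.Element("document")
    meta = ET.SubElement(doc, "metadata")
    for line in header.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not 'Key: value'
            tag = key.strip().lower().replace(" ", "_")
            ET.SubElement(meta, tag).text = value.strip()
    # Collapse line breaks and repeated spaces left over from extraction.
    ET.SubElement(doc, "content").text = " ".join(body.split())
    return doc


raw = """Title: Interest rate risk in private banks
Author: Example Author
Publisher: Example Bank

Private banks in Western Europe face
rising interest rate volatility."""

doc = standardize(raw)
print(ET.tostring(doc, encoding="unicode"))
```

Once every source is reduced to the same shape, the tagging step can treat all documents uniformly regardless of the original format.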
Components – Content Ingestion
Ontology Management
A tool that supports lists, controlled vocabularies, taxonomies, thesauri or ontologies:
• Concepts/Terms
• Taxonomy
• Associative Relationships
• Synonyms
http://wiki.opensemanticframework.org/index.php/Ontology_Tools
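One way to see why synonyms and taxonomy matter for search is query expansion. The sketch below assumes a tiny hand-written vocabulary; the `SYNONYMS` and `NARROWER` tables and their entries are hypothetical and would normally come from the ontology management tool.

```python
from typing import List

# Hypothetical controlled vocabulary: preferred term -> synonyms.
SYNONYMS = {
    "asset write down": ["asset write-off", "impairment"],
    "interest rate risk": ["rate risk"],
}
# Hypothetical taxonomy: broader term -> narrower terms.
NARROWER = {
    "western europe": ["france", "germany", "spain"],
    "bank": ["retail bank", "private bank"],
}


def expand(term: str) -> List[str]:
    """Return the term plus its synonyms and all narrower terms (recursively)."""
    key = term.lower()
    expanded = [key] + SYNONYMS.get(key, [])
    for child in NARROWER.get(key, []):
        for t in expand(child):
            if t not in expanded:
                expanded.append(t)
    return expanded


print(expand("Western Europe"))    # ['western europe', 'france', 'germany', 'spain']
print(expand("Asset Write Down"))  # ['asset write down', 'asset write-off', 'impairment']
```

A keyword engine would match only the literal phrase; with the expanded list, a document that mentions France can still satisfy a query about Western Europe.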
Components – Content Ingestion
Content Classification
• Analyze document
• Add metadata ‘tags’, sourced from the Ontology, that describe the document
Example : Classification Results
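As an illustration of ontology-driven tagging (not the output of the classification server shown in the deck), the sketch below counts mentions of each concept's surface forms. The `ONTOLOGY_TERMS` table and the `classify` function are hypothetical, and a production classifier would typically use NLP rather than plain pattern matching.

```python
import re

# Hypothetical ontology slice: concept -> surface forms found in text.
ONTOLOGY_TERMS = {
    "RetailBanking": ["retail banking", "retail bank", "retail banks"],
    "RegulatoryChange": ["regulatory change", "regulatory changes"],
    "InterestRateRisk": ["interest rate risk", "interest rate volatility"],
}


def classify(text: str) -> dict:
    """Return {concept: match count} for every concept mentioned in the text."""
    lowered = text.lower()
    tags = {}
    for concept, forms in ONTOLOGY_TERMS.items():
        # Try longer forms first so the most specific phrase is matched.
        ordered = sorted(forms, key=len, reverse=True)
        pattern = "|".join(re.escape(f) for f in ordered)
        count = len(re.findall(rf"\b(?:{pattern})\b", lowered))
        if count:
            tags[concept] = count
    return tags


doc = ("Regulatory change is reshaping retail banking. "
       "Retail banks also report higher interest rate volatility.")
print(classify(doc))
# {'RetailBanking': 2, 'RegulatoryChange': 1, 'InterestRateRisk': 1}
```

The resulting tags are what gets stored as document metadata, so later searches can match on concepts instead of exact wording.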
Typical Architecture
CORE PLATFORM
STORAGE LAYER
PRESENTATION LAYER
Free-Text Search
Ontology Driven Search
Graph Search
Collaboration
Engagement
CORE SERVICES
Logging
Caching
Security
Monitoring
Indexes
Content Store
Triples
Inferencing
SPARQL
XQUERY
Classification Server
Ontology Server
RuleSets
Inference Engine
ONTOLOGY MGMT
Ontology Creation
RuleSets
Entity Extraction
Inferencing
CONTENT DELIVERY
Query pre-processor
Query Builder
Inference Engine
Results post-processor
CONTENT INGESTION
Import
Classification/ Indexing
Standardization / Structuring
Storage
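The Query Builder box in the content delivery layer can be pictured as a function that turns a parsed query into SPARQL. The sketch below is an assumption-laden example for a query such as "Retail Banks in South Asia whose P/E ratio is greater than 20.0": the `ex:` namespace, the predicates `ex:mentions`, `ex:locatedIn` and `ex:peRatio`, and the `ParsedQuery` structure are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

PREFIX = "PREFIX ex: <http://example.org/finance#>"  # hypothetical namespace


@dataclass
class ParsedQuery:
    """Output of a (hypothetical) query pre-processor."""
    concept: str                          # e.g. "RetailBank"
    region: Optional[str] = None          # e.g. "SouthAsia"
    min_pe_ratio: Optional[float] = None  # e.g. 20.0


def build_sparql(q: ParsedQuery) -> str:
    """Translate the parsed query into a SPARQL SELECT over documents."""
    patterns = [
        "?doc ex:mentions ?company .",
        f"?company a ex:{q.concept} .",
    ]
    if q.region:
        patterns.append(f"?company ex:locatedIn ex:{q.region} .")
    if q.min_pe_ratio is not None:
        patterns.append("?company ex:peRatio ?pe .")
        patterns.append(f"FILTER(?pe > {q.min_pe_ratio})")
    body = "\n  ".join(patterns)
    return f"{PREFIX}\nSELECT DISTINCT ?doc WHERE {{\n  {body}\n}}"


print(build_sparql(ParsedQuery("RetailBank", "SouthAsia", 20.0)))
```

In this picture the inference engine would supply facts the documents never state directly, for example that a company located in a South Asian city is also located in South Asia, before the query is evaluated.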
Semantic Search – Opportunities & Beyond
Augmented Reality
Other Possibilities?
http://augmentedpixels.com/wp-content/uploads/2014/04/augmented-reality-iphone-football-concept.jpg
http://www.ventures-africa.com/wp-content/uploads/2015/01/original_aefd15169aaebd3f037b5ed672db6de1.png