Post on 28-Mar-2015
Investing in the future
Methods for Sharing Online Resources
December 2004
Cormac.Connolly@ESRC.ac.uk
ESRC Society Today
The Background Theory
Methods for Sharing Online Resources…
How does a child recognise what a dog looks like?
What exactly is content?
Waving a magic wand…
Putting theory into practice.
Serendipity Effect
ESRC World
Today’s Content Reality… (nightmare)
Unstructured vs Structured
80% Unstructured
20% Structured
Oracle
DB2
MS SQL
• Automatic
• Data Agnostic
• Language Independent
• Fast
• Scalable
• Accurate
• Dynamic & Realtime
• Includes Voice & Video
• Fully XML compatible
+ and includes Legacy Methods
=
A Unique Combination of Technologies
Autonomy granted single source supplier status by US Government for its unique technology. Also used by UK Government.
Manual Processes
Aggregate content, tag & categorize
Hypertext links to similar content
Personalization from forms/questionnaires, geodemographic profiling
Searching for information
E-mailing information to relevant recipients
Reformatting for multi-channel delivery, e.g. PDF to XML
Answering customer inquiries via a helpdesk
Process Automation
Aggregation
Automatic Categorization
Hyperlinking
Profiling
Personalization
Collaboration
Delivery
Retrieval
Routing
Alerting
Understanding + Automation Removes Manual Processes
Integration Through Understanding
Information Theory and Bayesian Inference
Notes
News Feeds
Internet
Database
Files
Document Management
XML
Audio / Media
• Taxonomies - Fully hierarchical and relational, dynamic, individualized, trainable & editable by example or legacy methods
Requires very few documents for initial training
Fully dynamic views of categorized content
Manual supervision is available if required
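A taxonomy that is "trainable by example" from very few documents can be illustrated with a small naive Bayes categorizer. This is only a sketch under stated assumptions: the slides do not describe how Autonomy's categorization actually works, and the class name, categories and training texts below are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesCategorizer:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.term_counts = defaultdict(Counter)  # category -> term frequencies
        self.doc_counts = Counter()              # category -> number of examples
        self.vocabulary = set()

    def train(self, category, text):
        """Add one example document to a category."""
        terms = text.lower().split()
        self.term_counts[category].update(terms)
        self.doc_counts[category] += 1
        self.vocabulary.update(terms)

    def categorize(self, text):
        """Return the category with the highest log posterior score."""
        terms = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category in self.doc_counts:
            counts = self.term_counts[category]
            denom = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[category] / total_docs)
            for term in terms:
                score += math.log((counts[term] + 1) / denom)
            if score > best_score:
                best, best_score = category, score
        return best


# Hypothetical training set: two example documents per category.
model = NaiveBayesCategorizer()
model.train("pets", "dog labrador puppy walk")
model.train("pets", "cat kitten pet food")
model.train("finance", "market shares fell today")
model.train("finance", "bank interest rate rise")

label = model.categorize("golden labrador puppy")
print(label)
```

The unseen word "golden" does not break the scoring because of the add-one smoothing, which is one reason a handful of examples per category can be enough to get started.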
Example Functions…
Automatic Categorization
Retrieval
Hyperlinking
Personalization
Alerting
Profiling
Clustering
• Clustering - Ability to take information, or people, and cluster them automatically into related groups
• Profiling - User profiles can be automatically matched and connected for collaboration, creating centers of expertise
• Personalization - Fully granular automatic and individual personalization, configurable by users or administrator
• Alerting - Accurate, scalable, proactive alerting. Avoiding problems of keyword systems. Implicit or explicit alert subject setting
• Hyperlinking - Fully automatic hyperlink generation across data types
• Retrieval - Natural language, concept matching, full legacy Boolean, metadata and XML, distributed and federated, refine by example, combinational, cross-lingual, user feedback, results weighting, parametric
Legacy Compatibility Module - LCM
Legacy Systems
• Legacy Indexes
• Legacy Topics
• Legacy Profiling
• Legacy Collaboration Systems
IDOL
• Putting information into context
• Conceptual Indexes
• Conceptual Profiles
• Contextual Categories
• Collaboration & Expertise Networks
An additional benefit is the ability to integrate with a whole host of Document Management Systems and Legacy Retrieval and Collaboration Systems in order to leverage the existing user-document relationships that reside within the knowledge base.
Legacy Compatibility Module
Supported Repositories…
• Oracle 9i
• Oracle Database
• Lotus Notes
• Lotus Quickplace
• Documentum
• ATG Dynamo
• Intershop
• Exchange Server
• FileNet
• iManage Server
• Microsoft SQL
• Sybase
• DB2
• ODBC Databases
• Microsoft SharePoint
• OpenText LiveLink
• PCDocs
• Siebel 2000
• Vignette
Meta Data…
• All document Meta-Data supported
• e.g. Price, Colour, Image, Size, Author, Summary, Type, Security, Meta-tags
• Strings, Numbers, Dates, Bits supported
• Conceptual search + mixed Conceptual / Meta search
• Full Meta-data Boolean search
• Meta-data weighting
• Biasing / Filtering by Meta-Data
• Advanced Compound Sorting
• Boolean Meta-Data Categorization
• Powerful per document free field structure
FULL META DATA SUPPORT
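Filtering by meta-data and compound sorting, as listed above, might look roughly like the following. This is a sketch only: the field names, relevance scores and helper functions are hypothetical, and nothing here reflects the actual Autonomy query interface.

```python
from datetime import date

# Hypothetical search results: a relevance score plus free-form meta-data fields.
results = [
    {"title": "Labrador care", "score": 0.91, "author": "Smith", "published": date(2004, 5, 1)},
    {"title": "Dog training", "score": 0.91, "author": "Smith", "published": date(2004, 11, 15)},
    {"title": "Cat breeds", "score": 0.40, "author": "Jones", "published": date(2003, 2, 10)},
    {"title": "Puppy diet", "score": 0.75, "author": "Smith", "published": date(2002, 7, 30)},
]


def filter_by_metadata(docs, **required):
    """Keep only documents whose meta-data fields equal the required values."""
    return [d for d in docs if all(d.get(k) == v for k, v in required.items())]


def compound_sort(docs):
    """Sort by relevance (highest first), then by publication date (newest first)."""
    return sorted(docs, key=lambda d: (-d["score"], -d["published"].toordinal()))


ranked = compound_sort(filter_by_metadata(results, author="Smith"))
titles = [d["title"] for d in ranked]
print(titles)
```

The two documents with equal relevance are ordered by date, which is the point of a compound sort key: the second field only matters when the first one ties.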
XML Support…
Just another file format:
• Read XML natively
• Products can output XML
• Advanced XML field mappings / operations
• ALL Autonomy operations available on XML
FULL XML SUPPORT
Retrieval Methods…
dog AND pet AND labrador
1. Legacy Methods
2. Bayesian Inference
3. Information Theory
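The Boolean query on this slide can be evaluated against a simple inverted index, as in the sketch below. The documents and function names are hypothetical, and this shows a generic keyword technique rather than any particular product's implementation.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, query):
    """Evaluate a query of the form 'a AND b AND c' by intersecting posting sets."""
    terms = [t.strip().lower() for t in query.split(" AND ")]
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()


# Hypothetical three-document corpus.
docs = {
    1: "my pet dog is a labrador",
    2: "the labrador is a popular dog breed",
    3: "a cat is a quiet pet",
}
index = build_index(docs)
matches = boolean_and(index, "dog AND pet AND labrador")
print(matches)
```

Document 2 is clearly about labradors but is not returned because it never uses the word "pet". Exact keyword matching of this kind has no notion of related concepts.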
Statistics Generation from The Corpus
The DRE, using Bayesian Inference and Shannon’s Information Theory, builds “Bags” of statistics from a corpus of documents
DRE = Dynamic Reasoning Engine
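One way to picture corpus statistics in the Shannon sense is to measure how much information a term carries, using the self-information -log2(p), where p is the fraction of documents containing the term. The sketch below assumes that simple measure and a hypothetical four-document corpus; the slides do not say which statistics the DRE actually stores.

```python
import math
from collections import Counter


def corpus_statistics(docs):
    """Count, for each term, the number of documents that contain it."""
    doc_freq = Counter()
    for text in docs:
        doc_freq.update(set(text.lower().split()))
    return doc_freq


def information_bits(term, doc_freq, n_docs):
    """Shannon self-information -log2(p) of finding `term` in a document.

    Assumes the term occurs in at least one document (p > 0).
    """
    p = doc_freq[term] / n_docs
    return -math.log2(p)


# Hypothetical corpus.
docs = [
    "the dog is a golden labrador",
    "the dog chased the cat",
    "the cat sat on the mat",
    "the market fell today",
]
stats = corpus_statistics(docs)

for term in ("the", "dog", "labrador"):
    print(term, information_bits(term, stats, len(docs)))
```

A word that appears in every document carries no information about which document is meant, while a rare word such as "labrador" carries the most. Weighting terms this way is a common starting point for picking out the key concepts in a text.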
The DRE Identifies Key Concepts…
And Stores Statistics on Document…
To Form a Conceptual Understanding…
Natural Language Response…
“Tell me about the golden labrador...”
Natural Language Response
Conceptual Relations…
Serendipity effect…
“Autonomy shines when finding interesting or unanticipated matches between texts or digital assets … important and needed to drive collaboration… or knowledge sharing activities …” Forrester
…"Autonomy can understand and analyse huge amounts of information….. (it can) categorise the ideas that they contain and build a sophisticated idea of what it is looking at without human help…..Autonomy has developed a program that reads, analyses and acts upon text, a breakthrough in artificial intelligence" Sunday Times
ESRC World…
““Search” belittles Autonomy’s capability as an enabling technology for personalization, knowledge management, and collaboration - an automated “Intelligent Data Operating Layer” for unstructured content…” AMR July 2003
Potential Content Resources
• SOSIG
• UK Data Archive
• Selected MIMAS targets
• IBSS
• ESRC Investment Websites
• 3rd Party Materials
• Commissioned Content