ForgetIT: Beyond the page: Giving content a meaning and value
Concise Preservation by combining Managed Forgetting and Contextualized Remembering
Olivier Dobberkau (R&D)
T3DD2014
Beyond the page - Giving content a meaning and value
TYPO3 Developer Days, 19/22 June 2014, Eindhoven
About
Olivier Dobberkau
R&D dkd
President of TYPO3 Association
@TReverendNeverend
The problem
Welcome to the digital information age ... a never-ending flood of content!
Technology enables us to produce nearly unlimited data
We are still „hunters and gatherers“ somehow
Storage space currently feels „infinite“, but the earth's resources are limited and will run short sooner or later
The pace of technological innovation keeps increasing, bringing new technologies, formats, and standards at an ever higher frequency. So how do we handle this?
Storage capacity is ever increasing
Prices for storage are falling
Easy - let's keep everything!
But there are many more costs:
Retrieval
Maintenance
Indexing
Updates
Deprecated formats
Should we really keep everything as it was created?
“The digital dark age is a possible future situation where it will be difficult or impossible to read historical electronic documents and multimedia, because they have been stored in an obsolete and obscure file format.” Wikipedia
How do we tackle this?
What is preservation?
“Preservation — The protection of cultural property through activities that minimize chemical and physical deterioration and damage and that prevent loss of informational content. The primary goal of preservation is to prolong the existence of cultural property.”
Preservation 101
Preserving a website is not trivial
What do you want to preserve?
Content only?
Content and Design?
How often? Stock prices vs. Company History page
How do you deal with browser differences?
How do you preserve functionality? E.g. insurance fee calculator
The project
Concise Preservation by combining Managed Forgetting and Contextualized Remembering
The project
Deliver a framework for intelligent preservation, incl. pilot applications (personal use case, organizational use case) that already bring value to their target groups.
The Project
EU research project
Part of the Seventh Framework Programme (FP7)
Countries involved: Germany, Sweden, Israel, Turkey, Greece, United Kingdom, Italy
Project duration: 2013-2016
Core concepts
Synergetic Preservation
Contextualised Remembering
Managed Forgetting
Core values
Preservation value
Memory buoyancy
Memory buoyancy and preservation value
Archive or delete
Information not needed
Digital preservation
Forgetting without context
Managed digital preservation
Preservation with context
Preservation with learning
Use cases
Organizational Preservation
Personal Preservation
Organizational use case
Organizational Preservation
Digital Asset Management
Versioning
Archiving a complete Website
Individual genres and their specific requirements
Example: Press Release
Business case / Value proposition
Creating metrics to actually „measure“ the value of content is unique to ForgetIT and will be a USP
Sustainable and integrated tools to manage the process of preservation, which is new to content management systems
The standards and tools used (e.g. CMIS, OData, Apache Stanbol) and the newly created tools within the context of TYPO3 CMS will lead to CMS interoperability and thus prevent future loss of content due to technological evolution (see „preventing the digital dark age“)
Content Value Performance Indicators: potential dimensions to look at
Production: Effort, Complexity, Versions, …
Inner relevance: References, Page impressions, TYPO3 CMS page rank, Memory Buoyancy, …
Outer relevance: Social Media relevance, Google page rank, Backlinks, …
„Meaning": Context, Ontologies, Annotation, …
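As a purely illustrative sketch of how indicators from these four dimensions might be folded into one content value score, the following Python snippet combines one normalized value per dimension with fixed weights. The weights, the field names, and the weighted-average formula are assumptions made for this example; the slides do not define how ForgetIT computes content value.

```python
from dataclasses import dataclass

# Hypothetical weights per dimension; they sum to 1.0 so the score stays in [0, 1].
WEIGHTS = {
    "production": 0.2,
    "inner_relevance": 0.3,
    "outer_relevance": 0.3,
    "meaning": 0.2,
}


@dataclass
class ContentIndicators:
    """One already-normalized value (0.0 to 1.0) per dimension."""
    production: float
    inner_relevance: float
    outer_relevance: float
    meaning: float


def content_value(ind: ContentIndicators) -> float:
    """Weighted average of the four dimension values."""
    return sum(weight * getattr(ind, name) for name, weight in WEIGHTS.items())


# Made-up indicator values for a single press release.
press_release = ContentIndicators(
    production=0.5, inner_relevance=0.8, outer_relevance=0.4, meaning=0.9
)
score = content_value(press_release)
print(f"content value: {score:.2f}")
```

A weighted average is only one possible choice; it keeps each dimension's contribution easy to explain to editors, which matters if the score later drives archive-or-delete suggestions.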
Why TYPO3 CMS?
Open source
Large installation base
We want to raise awareness of the concept of preservation
Architecture
Technology
Content Management Interoperability Services (CMIS)
Standard allowing interoperability between CMS
Abstraction layer
Defined domain model
OASIS Standard
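To make the "defined domain model" point more concrete, here is a minimal Python sketch that describes TYPO3 records using CMIS base types and standard CMIS property ids. The base type ids and property ids come from the CMIS specification; the mapping of TYPO3 tables to base types and the object id scheme are assumptions for illustration, not the mapping used in the project.

```python
# CMIS base type ids as defined by the OASIS CMIS specification.
CMIS_DOCUMENT = "cmis:document"
CMIS_FOLDER = "cmis:folder"

# Hypothetical mapping from TYPO3 database tables to CMIS base types.
TABLE_TO_BASE_TYPE = {
    "pages": CMIS_FOLDER,
    "tt_content": CMIS_DOCUMENT,
    "sys_file": CMIS_DOCUMENT,
}


def to_cmis_properties(table: str, uid: int, title: str) -> dict:
    """Describe a TYPO3 record with a few standard CMIS property ids."""
    base_type = TABLE_TO_BASE_TYPE[table]
    return {
        "cmis:objectId": f"{table}:{uid}",  # id scheme invented for this sketch
        "cmis:baseTypeId": base_type,
        "cmis:objectTypeId": base_type,
        "cmis:name": title,
    }


page = to_cmis_properties("pages", 42, "Company History")
print(page)
```

Because every system that speaks CMIS agrees on these property ids, a record described this way can be read by another CMS or an archive without knowing anything about TYPO3 tables.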
Semantic web
A web that can be processed by machines
Resource Description Framework (RDF)
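RDF expresses statements as subject, predicate, object triples. The short Python sketch below writes three such triples in N-Triples syntax using only the standard library. The `http://example.org/` identifiers are placeholders, and the choice of schema.org terms is an assumption for illustration rather than the vocabulary used in ForgetIT.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SCHEMA = "http://schema.org/"
EX = "http://example.org/"  # placeholder namespace for this sketch

# Each object is tagged as either a URI or a plain string literal.
triples = [
    (EX + "toy-fair", RDF_TYPE, ("uri", SCHEMA + "Event")),
    (EX + "toy-fair", SCHEMA + "name", ("literal", "Global Toy Fair")),
    (EX + "toy-fair", SCHEMA + "location", ("uri", EX + "nuremberg")),
]


def to_ntriples(subject: str, predicate: str, obj: tuple) -> str:
    """Render one triple as a single N-Triples line."""
    kind, value = obj
    if kind == "uri":
        rendered = f"<{value}>"
    else:
        # Escape backslashes and double quotes inside string literals.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        rendered = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {rendered} ."


lines = [to_ntriples(*triple) for triple in triples]
print("\n".join(lines))
```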
Ontologies / Domains
• industry-specific concepts
• geography, time, abstract concepts
• company-related products, events, concepts, ...
Semantic relations in content
This is our set of concepts to annotate content with
• during creation/update
• flows over time
• as the basis for defining value
• future „smart semantic“ editing
But - what does semantic annotation mean? How does it look to us in a press release (tt_news)?
„... to announce that the Global Toy Fair will be held in Nuremberg on February 12th, 2014. LEGO will be presenting its products in Hall ...“
company event
common geography
common date
industry concept of a brand
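The idea of annotating the press release sentence can be sketched in a few lines of Python: a small vocabulary of known concepts plus a date pattern yields typed text spans. The vocabulary, the assignment of each term to a type label from the slide, and the date regular expression are assumptions for this example; a real annotator would rely on an ontology and NLP tooling rather than exact string matching.

```python
import re

# Hypothetical company vocabulary: surface form -> concept type.
# The type labels follow the slide; which term gets which label is assumed.
KNOWN_CONCEPTS = {
    "Global Toy Fair": "company event",
    "Nuremberg": "common geography",
    "LEGO": "industry concept of a brand",
}

# Matches dates written like "February 12th, 2014".
DATE_PATTERN = re.compile(r"[A-Z][a-z]+ \d{1,2}(?:st|nd|rd|th)?, \d{4}")


def annotate(text: str) -> list:
    """Return (start, end, surface, type) tuples sorted by position."""
    found = []
    for surface, concept_type in KNOWN_CONCEPTS.items():
        for match in re.finditer(re.escape(surface), text):
            found.append((match.start(), match.end(), surface, concept_type))
    for match in DATE_PATTERN.finditer(text):
        found.append((match.start(), match.end(), match.group(), "common date"))
    return sorted(found)


text = (
    "... to announce that the Global Toy Fair will be held in Nuremberg "
    "on February 12th, 2014. LEGO will be presenting its products in Hall ..."
)
annotations = annotate(text)
for start, end, surface, concept_type in annotations:
    print(f"{start:3d}-{end:3d}  {surface!r}: {concept_type}")
```

Storing character offsets alongside the type is what would let an editor highlight the annotated spans in the backend.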
dkd's suggestion for how to tackle this:
Treat semantics like teaching the system a foreign (company) language
Implement a semantic „overlay“ in the backend so that annotation can happen while content is created or updated
Suggest annotations if the backend already knows a word/concept
Use these content annotations to level up DAM in TYPO3 CMS
Integrate semantic search in the backend and the frontend
Connect DAM to the Media Mixer from the ForgetIT framework
Text summarization
Generation of visual summaries
• Content Detection analyzes a document to determine which sections are useful in terms of content (e.g. removing the generic menus in a web page; avoids irrelevant material biasing the summary)
• TermRaider extracts representative, weighted terms (words, entities etc.) from documents which can provide a summary (e.g. as a term cloud)
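To illustrate what "representative, weighted terms" can mean, the sketch below scores terms with plain tf-idf using only the Python standard library. Tf-idf is used here as a common, simple weighting scheme; this is not a claim about how TermRaider itself is implemented, and the three sample documents are invented.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(documents: list) -> list:
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(doc) for doc in documents]
    # Number of documents in which each term occurs at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(documents)
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    "LEGO presents new products at the toy fair",
    "The toy fair opens in Nuremberg",
    "The company history page describes the founders",
]
weights = tfidf(docs)
top_terms = sorted(weights[0].items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top_terms)
```

Terms that occur in every document, such as "the", receive a weight of zero, while terms specific to one document rise to the top, which is the property a term cloud summary needs.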
Outlook: Semantic text composition
Semantic text editor
• Tool for inferring and suggesting semantic annotations for text while it is being composed
Outlook: Semantic text composition
Semantic text editor components
• Editor
− An extended version of the open-source HTML-based rich text editor CKEditor, which allows for annotating and tracking arbitrary parts of the text
• Natural Language Processing component
− Named entity recognition locates and classifies atomic elements in text into predefined categories such as people, organizations, and locations
− Coreference resolution identifies which words refer to which things in a text
− Relation extraction extracts binary relations from the text being composed
• Linked Open Data component
− Entity disambiguation distinguishes between different entities that have similar or identical names
− Relation extraction searches for relations among entities
− Context inference finds contextual information about entities mentioned in the text
Annotation/Contextualisation of Images
Image analysis
ForgetIT visual analysis technologies demonstrator
• Concept detection and feature extraction
• Visual quality assessment
• Image clustering
• Face detection
http://multimedia.iti.gr/ForgetIT/CostaRica/demonstrator.html
Image feature extraction and concept detection
Image clustering for summarization
Want to support the ForgetIT project?
How to get involved?
Ideas
Code contributions
Test and evaluate
Take our survey! (1/2)
Organizational Preservation
http://bit.ly/U65uL6
Take our survey! (2/2)
Personal Preservation
http://bit.ly/1kJPNhZ
Timeline
2013
• D10.3
• Mockups
• Proof of concept
2014
• Architecture
• FAL
• Semantic UI / Layer
• DAM Dashboard
• Log Aggregation Toolkit
2015
• Content value framework
2016
• Final
D10.1
Research
Analysis
Application Design
Application Logic and Workflow
Mockups
Use Case I: Press release
• Creating a press release
• Adding meta data
• Semantic annotation
Ingest press release
Automatic annotation
• Initiated by user
• Add entity to own ontology
• Color coded according to type
Ingest press release
Manual annotation
• Selection from text or clipboard
• Add entity to own ontology
Use Case II: Preservation-aware digital asset management
• Searching for assets
• Managing digital assets
• Handling digital assets
Summary
Where to find us
http://www.forgetit-project.eu
Contact
@ForgetITProject
Olivier Dobberkau
Call to Action
Join our efforts in:
creating a semantic layer in TYPO3 CMS
defining the future of DAM within the TYPO3 world
establishing content value measures
preparing TYPO3 and our customers to manage forgetting and preservation of content
Thank you for your attention!