ACM BPM and elasticsearch AMIS25

Page 1: ACM BPM and elasticsearch AMIS25

Search and Find

Luc Gorissen

ACM/BPM and Elastic Search

Page 2: ACM BPM and elasticsearch AMIS25

Luc Gorissen

Previous employers:
- KPN Research
- CMG Wireless Data Solutions
- OraVision
- Oracle

Focus:
- BPM and SOA Suite

[email protected]

+31 6 3622 4226

@LucGorissen

No, no, no

LinkedIn

Page 3: ACM BPM and elasticsearch AMIS25

The Challenge

Starting point: Our ACM/BPM implementation successfully supports our core business processes

Requirement: We need to be able to search through case/process data of the last 7 years

We need:

An ACM/BPM archive where we can search through data of cases/processes of up to 7 years old

Page 4: ACM BPM and elasticsearch AMIS25

The Technology

Company: Elastic

Product: Elasticsearch

Promise: ... a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyze big volumes of data quickly and in near real time.

Can it be done?

Page 5: ACM BPM and elasticsearch AMIS25

Topics

1. Elastic Product Stack
2. Basic Concepts
3. Use Case
4. Use Case Data
5. Use Case Evaluation
6. Recommendation

Page 6: ACM BPM and elasticsearch AMIS25

Elastic Product Stack

... a highly scalable open-source full-text search and analytics engine. It allows you to store, search, and analyze big volumes of data quickly and in near real time.

Features:
• Full-Text Search
• Document-Oriented
• Near-Real-Time
• Horizontally Scalable
• Multi Tenant
• Schema-Free
• REST-API
• Open Source – Apache 2 license
• On top of Apache Lucene
• REST/JSON

Page 7: ACM BPM and elasticsearch AMIS25

Elastic Product Stack

Product          Description
Elasticsearch    Search engine
Elastic Cloud    Elasticsearch Cloud offering
Logstash         Data collection engine
Kibana           Analytics and visualization platform
Beats            Collect data (network, infra, file, winlog) and ship
Shield           Protect access to your data
Watcher          Alerts/notifications from changes in your data
Marvel           Monitor your Elasticsearch cluster

Page 8: ACM BPM and elasticsearch AMIS25

Elastic Product Stack

Maturity

• Complete product stack
• Cloud offering
• Modern technology around solid Apache Lucene core (1999)
• Clients: Ruby, Python, PHP, Perl, .NET, Java, JavaScript, etc.
• Apache Lucene release 6.0.1, May 27, 2016
• Elasticsearch release 2.3.3, May 18, 2016
• Oracle plans to replace Secure Enterprise Search with Elasticsearch in WebCenter products (OOW 2015)
• Support / community group / meet-ups / training

Page 9: ACM BPM and elasticsearch AMIS25

Basic concepts

Supports: availability, scalability, distribution

• Cluster
• Document (JSON)
• Index ABC: Shard 1, Shard 2
• Index ABC: Replica Shard 1, Replica Shard 2
• Distribute over nodes
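
How documents end up spread over the shards of an index can be illustrated with a small sketch. Elasticsearch chooses a primary shard by hashing a routing value (by default the document id) modulo the number of primary shards. The sketch below uses `zlib.crc32` as a stand-in for the engine's own hash function, so the shard numbers it produces will not match a real cluster. The function names and the sample ids are hypothetical.

```python
import zlib
from collections import Counter


def pick_shard(doc_id: str, primary_shards: int) -> int:
    """Map a document id to a 0-based primary shard number.

    Assumption: crc32 is only a stand-in for Elasticsearch's internal hash.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % primary_shards


def total_shard_copies(primary_shards: int, replicas: int) -> int:
    """Every primary shard exists once plus `replicas` extra copies."""
    return primary_shards * (1 + replicas)


if __name__ == "__main__":
    # Hypothetical document ids, spread over the 2 primary shards of the slide.
    ids = [str(n) for n in range(100000, 100020)]
    spread = Counter(pick_shard(doc_id, 2) for doc_id in ids)
    print("documents per shard:", dict(spread))
    print("shard copies for 2 primaries, 1 replica:", total_shard_copies(2, 1))
```

The same document id always maps to the same shard, which is why the number of primary shards is fixed when an index is created.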

Page 10: ACM BPM and elasticsearch AMIS25

Installation development set-up

Installation of Elasticsearch:

[developer@localhost bin]$ tar -xvf elasticsearch-2.3.2.tar.gz
[developer@localhost bin]$ pwd
/home/developer/elasticsearch/elasticsearch-2.3.2/bin
[developer@localhost bin]$ ./elasticsearch

Installation of Kibana (‘Analytics and visualization platform’):

[developer@localhost kibana]$ tar -xvf kibana-4.5.1-linux-x64.tar.gz
[developer@localhost config]$ vi kibana.yml
[developer@localhost bin]$ pwd
/home/developer/kibana/kibana-4.5.1-linux-x64/bin
[developer@localhost bin]$ ./kibana

Page 11: ACM BPM and elasticsearch AMIS25

Use Case: tweets AMISnl

Screen all tweets of AMISnl to see if action is required for the conference

tweetsAMISnl

TwitterSupport
• ScreenTweet (Office Management)
• CtoScreening (CTO)
• TweeterContacted (telemarketeer)
• MarketingScreening (marketing)

Page 12: ACM BPM and elasticsearch AMIS25

Use case

Tweets:

733666488083750912  2016-05-20 14:31:36  RT @robbrecht: Orcas - Automatic deployment for the database https://t.co/4U6QSuROjf @amisnl @OC_WIRE

733652455523811328  2016-05-20 13:35:50  RT @sai_penumuru: Learn something new from my session. #AMIS25 @oracleotn @oracleace https://t.co/1gBagwgotD

733652388272312322  2016-05-20 13:35:34  RT @sai_penumuru: Join me on 2nd-3rd June 2016 for BEYOND THE HORIZON conference in Netherlands. #AMIS25 @oracleace @oracleotn https://t.co…

733621946290671620  2016-05-20 11:34:36  NEWSFLASH! The official #AMIS25 app is now available. Search for 'AMIS 25' in your app store and enjoy! https://t.co/iYOEGG6l90

In total: 3212 tweets

Page 13: ACM BPM and elasticsearch AMIS25

Use Case result: data in JSON format

3212 tweets

Retrieve data from the ACM system with the platform API. Retrieved data:
• CaseActivities
• CaseMileStones
• Comments
• CaseData

Transform to JSON

XML:

<caseActivityDefinition>
  <applicationName>default</applicationName>
  <completedDate>2016-05-19T06:29:13.910+02:00</completedDate>
  <componentName>TwitterSupport</componentName>
  <compositeDn>default/TwitterSupport!1.0*soa_33331876-7da2-4ba6-b28d-fec89397281e</compositeDn>
  <compositeName>TwitterSupport</compositeName>
  <compositeVersion>1.0</compositeVersion>
  <definitionId>default/TwitterSupport!1.0/CtoScreeningProcess</definitionId>
  <displayName>CtoScreeningProcess</displayName>

JSON:

{
  "caseActivityDefinition": {
    "caseId": "100036",
    "completedDate": "2016-05-23T09:39:03.111+02:00",
    "definitionId": "default/TwitterSupport!1.0/CtoScreeningProcess",
    "displayName": "CtoScreeningProcess",
    "instanceId": "116187",
    "name": "CtoScreeningProcess",
    "nameSpace": "http://xmlns.amis.nl/TwitterSupport/CtoScreeningProcess",
    "startDate": "2016-05-23T09:19:08.111+02:00"
  }
}
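
The XML-to-JSON step on this slide could look roughly like the sketch below. It is not the transformation used in the talk: it assumes a flat element with one level of children and no XML namespaces, and the function name `xml_to_json` is hypothetical. The sample is a shortened, closed fragment built from field values shown in the slide's JSON.

```python
import json
import xml.etree.ElementTree as ET


def xml_to_json(xml_text: str) -> str:
    """Turn a flat XML element into a JSON object keyed by the root tag.

    Assumption: one level of child elements, text content only, no namespaces.
    """
    root = ET.fromstring(xml_text)
    fields = {child.tag: (child.text or "").strip() for child in root}
    return json.dumps({root.tag: fields}, indent=2, sort_keys=True)


# Shortened sample; the real platform API output has more fields.
SAMPLE = """<caseActivityDefinition>
  <caseId>100036</caseId>
  <displayName>CtoScreeningProcess</displayName>
  <startDate>2016-05-23T09:19:08.111+02:00</startDate>
</caseActivityDefinition>"""

if __name__ == "__main__":
    print(xml_to_json(SAMPLE))
```

Keeping the root tag as the outer key mirrors the slide's JSON, where every field sits under `caseActivityDefinition`.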

Page 14: ACM BPM and elasticsearch AMIS25

Insert data into ElasticSearch

Insert MileStone data into ElasticSearch archive:

curl -XPUT 'localhost:9200/casemilestones/external/1?pretty' -d '
{
  "caseMilestone": {
    "caseId": "103242",
    "state": "ATTAINED",
    "name": "TweetScreenedMilestone",
    "updatedDate": "2016-05-25T10:27:34.111+02:00"
  }
}'

Callouts on the slide: index (casemilestones); milestone data in JSON (the request body)
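
When many milestones have to be inserted, the same PUT request can be built from a script. The sketch below only constructs the request with the Python standard library and does not send it; passing the result to `urllib.request.urlopen` would do that against a running node. The function name `build_index_request` and the `base_url` default are assumptions, and the URL layout follows the curl command on the slide (index / type / id).

```python
import json
import urllib.request


def build_index_request(index, doc_type, doc_id, doc,
                        base_url="http://localhost:9200"):
    """Build a PUT request that stores `doc` under index/type/id."""
    url = "{}/{}/{}/{}".format(base_url, index, doc_type, doc_id)
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Milestone document taken from the slide.
MILESTONE = {
    "caseMilestone": {
        "caseId": "103242",
        "state": "ATTAINED",
        "name": "TweetScreenedMilestone",
        "updatedDate": "2016-05-25T10:27:34.111+02:00",
    }
}

if __name__ == "__main__":
    request = build_index_request("casemilestones", "external", 1, MILESTONE)
    # Nothing is sent here; urllib.request.urlopen(request) would send it.
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```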

Page 15: ACM BPM and elasticsearch AMIS25

Results use case: data into ElasticSearch

Totals - start:

[developer@localhost elasticsearch-2.3.2]$ curl 'localhost:9200/_cat/indices?v'
health status index          pri rep docs.count docs.deleted store.size pri.store.size
yellow open   caseactivities   5   1          0            0       650b           650b
yellow open   casemilestones   5   1          0            0       260b           260b
yellow open   casecomments     5   1          0            0       650b           650b
yellow open   casedata         5   1          0            0       650b           650b

Totals - end:

[developer@localhost elasticsearch-2.3.2]$ curl 'localhost:9200/_cat/indices?v'
health status index          pri rep docs.count docs.deleted store.size pri.store.size
yellow open   caseactivities   5   1       3693            0    929.4kb        929.4kb
yellow open   casemilestones   5   1      16060            0      1.5mb          1.5mb
yellow open   casecomments     5   1       7207            0    685.1kb        685.1kb
yellow open   casedata         5   1      16060            0      2.2mb          2.2mb
[developer@localhost elasticsearch-2.3.2]$

Timing:

# documents: 43020
Upload time: 9:57 min
Upload speed: ~72 docs / sec
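
The timing figures can be checked against the per-index document counts from the `_cat/indices` output. The short calculation below uses only numbers shown on the slide; the variable names are arbitrary.

```python
# Document counts per index, copied from the "Totals - end" output.
counts = {
    "caseactivities": 3693,
    "casemilestones": 16060,
    "casecomments": 7207,
    "casedata": 16060,
}

total_docs = sum(counts.values())       # 43020 documents in total
upload_seconds = 9 * 60 + 57            # 9:57 min = 597 seconds
docs_per_second = total_docs / upload_seconds

if __name__ == "__main__":
    print(total_docs, upload_seconds, round(docs_per_second, 1))
    # prints: 43020 597 72.1
```

The counts add up to the 43020 documents reported, and 43020 / 597 s matches the stated speed of roughly 72 documents per second.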

Page 16: ACM BPM and elasticsearch AMIS25

Results use case: sample search

[developer@localhost json]$
[developer@localhost ~]$ curl -XPOST 'localhost:9200/casedata/_search?pretty' -d '
> {
> "query": { "match": { "caseData.value": "Lucas"}},
> "_source": ["caseData.caseId", "caseData.value"]
> }
> '
{
  "took" : 96,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 215,
    "max_score" : 1.3943143,
    "hits" : [ {
      "_index" : "casedata",
      "_type" : "external",
      "_id" : "AVTZQ0eNjRs4lNcko-Qb",
      "_score" : 1.3943143,
      "_source" : {
        "caseData" : {
          "caseId" : "102701",
          "value" : "Blog by Lucas Jellema: UX Expo 18th of March – OTN ArchBeat YouTube Video Interview: Jeremy Ashley &amp; Lucas J... http://t.co/9GlLzTJ3U0"
        }
      }
    }, {
      "_index" : "casedata",
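
The search above can also be driven from code. The sketch below builds the same request body and pulls the case ids out of a response. The helper names are hypothetical, and `SAMPLE_RESPONSE` is a cut-down copy of the slide output with the tweet text shortened, not a complete Elasticsearch response.

```python
import json


def match_query(field, text, source_fields):
    """Build a match query that returns only the listed source fields."""
    return {"query": {"match": {field: text}}, "_source": source_fields}


def summarize_hits(response):
    """Return (caseId, score) pairs for every hit in a search response."""
    return [
        (hit["_source"]["caseData"]["caseId"], hit["_score"])
        for hit in response["hits"]["hits"]
    ]


# Cut-down response based on the slide; the "value" text is shortened.
SAMPLE_RESPONSE = {
    "hits": {
        "total": 215,
        "max_score": 1.3943143,
        "hits": [
            {
                "_index": "casedata",
                "_type": "external",
                "_id": "AVTZQ0eNjRs4lNcko-Qb",
                "_score": 1.3943143,
                "_source": {
                    "caseData": {
                        "caseId": "102701",
                        "value": "Blog by Lucas Jellema: UX Expo 18th of March",
                    }
                },
            }
        ],
    }
}

if __name__ == "__main__":
    body = match_query("caseData.value", "Lucas",
                       ["caseData.caseId", "caseData.value"])
    print(json.dumps(body))
    print(summarize_hits(SAMPLE_RESPONSE))
```

Since the archive is searched by case, reducing each hit to its `caseId` gives the list of cases to look up next.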

Page 17: ACM BPM and elasticsearch AMIS25

Kibana

Let’s start looking at the data with Kibana:

What can it add to the archive?

Page 18: ACM BPM and elasticsearch AMIS25

Kibana

Timeline for case activities

Page 19: ACM BPM and elasticsearch AMIS25

Searching with Kibana

Page 20: ACM BPM and elasticsearch AMIS25

Kibana dashboard

Page 21: ACM BPM and elasticsearch AMIS25

Office Documents

Especially for case management, ‘Office Documents’ are important.

Installation of plugin for indexing Office and PDF docs (Apache Tika):

[developer@localhost bin]$ pwd
/home/developer/elasticsearch/elasticsearch-2.3.2/bin
[developer@localhost bin]$ ./plugin install mapper-attachments

Page 22: ACM BPM and elasticsearch AMIS25

‘Office Documents’

Document formats:
• Supported Document Formats
• HyperText Markup Language
• XML and derived formats
• Microsoft Office document formats
• OpenDocument Format
• Portable Document Format
• Electronic Publication Format
• Rich Text Format
• Compression and packaging formats
• Text formats
• Audio formats
• Image formats
• Video formats
• Java class files and archives
• The mbox format

Page 23: ACM BPM and elasticsearch AMIS25

Results use case: searching office documents

Insert documents base64 encoded … and search:

[developer@localhost ~]$ curl -XPOST 'localhost:9200/casedocuments/document/_search?pretty' -d '
> {
> "query": {
> "query_string": {
> "query": "+bonnetje +teeven" }},
> "_source": ["docName"]
> }
> '
{
  "took" : 64,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.43479362,
    "hits" : [ {
      "_index" : "casedocuments",
      "_type" : "document",
      "_id" : "AVT4Siu7Ia99IOtnY-TF",
      "_score" : 0.43479362,
      "_source" : {
        "docName" : "/doc/factuur.docx"
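
The "insert documents base64 encoded" step is not shown on the slide. A minimal sketch of preparing such a document is given below. The `docName` field comes from the search output above; the `file` field name is an assumption and would have to be mapped with type `attachment` for the mapper-attachments plugin to extract text from it. The byte string is a stand-in for the contents of a real Office file.

```python
import base64
import json


def attachment_document(doc_name: str, content: bytes) -> dict:
    """Build an indexable document with the file content base64 encoded.

    Assumption: the index maps "file" as an attachment field.
    """
    return {
        "docName": doc_name,
        "file": base64.b64encode(content).decode("ascii"),
    }


if __name__ == "__main__":
    # Stand-in bytes; a real run would read the file, for example with
    # open("/doc/factuur.docx", "rb").read()
    doc = attachment_document("/doc/factuur.docx", b"bonnetje teeven")
    print(json.dumps(doc, indent=2))
```

Base64 is needed because the binary file has to travel inside a JSON string; the plugin decodes it and passes it to Apache Tika on the Elasticsearch side.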

Page 24: ACM BPM and elasticsearch AMIS25

Use Case Results

• Mature, enterprise grade product

• Easy search, even ‘Office Documents’

• Basic analysis, more investigation required

• Carefully determine what info to put into elasticsearch – Audit trail? TaskQueryService? Other info?

• It is schema-free: easy transitions between Oracle releases

• You will find the caseIdentifier and anything related to the caseIdentifier

• Not an easy overview of case history

Page 25: ACM BPM and elasticsearch AMIS25

Recommendation

Back to ‘the challenge’:

An ACM/BPM archive where we can search through data of cases/processes of up to 7 years old

Aspects:
- TCO: License Costs
- TCO: Yet another technology
- DB versus elasticsearch:
  - Schema-less JSON data store
  - No transactions
  - Near-real-time
- Document Management System / doc types
- Logstash jdbc plugin

Page 26: ACM BPM and elasticsearch AMIS25
