SPARK Search Engine

Page 1: SPARK Search  Engine

SPARK

Search Engine

Page 2: SPARK Search  Engine

Who am I?

Martijn Harthoorn
Programmer at Furore
Implementer of the Search Engine of SPARK

http://spark.furore.com/fhir/patient?...

The work after the question mark.

Page 3: SPARK Search  Engine

The place of Search

REST Service

Storage

Index & Search

MongoDB

Spark

Page 4: SPARK Search  Engine

Paradigm

FHIR client should be easy. FHIR server needs to solve the complex issues.

Search

Search has some…

Page 5: SPARK Search  Engine

First there was Storage

Search

Then there was Search

Page 6: SPARK Search  Engine

Connectathon

To test a client – you must have a tested server
To test a server – you must have a tested client

“One fool can ask more questions than seven wise men can answer”

Page 7: SPARK Search  Engine

Connectathon

“But what if you are wrong?”

Page 8: SPARK Search  Engine

History

Version 1.

- A generics-based implementation
- On top of the FHIR data model
- Programmed per search parameter
- No meta data available yet
- No indexing
- Slow

Page 9: SPARK Search  Engine

History

Version 2.

- Data model independent
- Meta data not available, so manually added
- Lucene.NET as indexer (index in Lucene, database in Mongo)
- Fast
- Standardised all parameter specifics into standard "modifiers"
- All code based on search parameter types
- Joins are client side

Page 10: SPARK Search  Engine

History

Version 3.

- Modified to store the Lucene index in Mongo
- Index storage unreliable
- Never saw the light of day

Page 11: SPARK Search  Engine

History

Version 4. CURRENT

- Index storage in a dedicated Mongo collection
- Build expression tree from parameters
- Chained parameters have full functionality (modifiers, operators)
- Joins are client side

Page 12: SPARK Search  Engine

Indexing

Why indexing?

Page 13: SPARK Search  Engine

Why indexing

http://spark.furore.com/fhir/patient?provider.name:partial=Health
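
The query above is a chained parameter with a modifier: provider is a reference on the patient, name is a parameter of the referenced resource, and partial is the modifier. As a rough sketch of "the work after the question mark", the key can be split into those parts as follows. The names Criterion, parse_criterion and parse_search are hypothetical and only for illustration; this is not Spark's actual code.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import parse_qsl, urlsplit


@dataclass
class Criterion:
    chain: List[str]         # e.g. ["provider", "name"]
    modifier: Optional[str]  # e.g. "partial", or None when absent
    value: str               # e.g. "Health"


def parse_criterion(key: str, value: str) -> Criterion:
    """Split a key such as 'provider.name:partial' into chain and modifier."""
    path, _, modifier = key.partition(":")
    return Criterion(chain=path.split("."), modifier=modifier or None, value=value)


def parse_search(url: str) -> List[Criterion]:
    """Turn everything after the question mark into a list of criteria."""
    query = urlsplit(url).query
    return [parse_criterion(k, v) for k, v in parse_qsl(query)]


criteria = parse_search(
    "http://spark.furore.com/fhir/patient?provider.name:partial=Health"
)
```

A chain longer than one element means the value cannot be matched on the patient itself; it has to be resolved on another resource first, which is where an index pays off.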

Page 15: SPARK Search  Engine

Indexing. HOW-TO

You DO want data de-serialized to an object with all values strongly typed.

You DON’T want to spend time analyzing and interpreting JSON and/or XML.

1. Harvest the Resource
2. Determine data type
3. Groom your data
4. Store data in Index

Page 16: SPARK Search  Engine

Indexing. 1. Harvesting

Resource: Patient
Search parameter: family

Searches for the family name and prefix of every HumanName that is registered with a Patient.

Usage:

http://spark.furore.com/fhir/patient?family=White

Page 17: SPARK Search  Engine

Indexing. 1. Harvesting

Resource: Patient
Search parameter: family

Patient
  List<Name>
    Name (HumanName)
    Name (HumanName)
    Name (HumanName)
      Family
      Prefix
      Given
      Suffix

Using the Visitor pattern

Path from meta data: "patient.Name.Prefix", "patient.Name.Family"
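
Harvesting along such a path can be sketched as below. This is a simplified illustration, not Spark's implementation: plain dictionaries and lists stand in for the strongly typed FHIR objects, a recursive walk stands in for the Visitor pattern, and the names harvest and harvest_paths as well as the sample values are made up.

```python
from typing import Any, Iterator, List


def harvest(node: Any, path: List[str]) -> Iterator[Any]:
    """Yield every value reached by following `path` from `node`.

    Lists are fanned out, so one path can produce many values.
    """
    if isinstance(node, list):
        for item in node:
            yield from harvest(item, path)
    elif not path:
        if node is not None:
            yield node
    elif isinstance(node, dict):
        yield from harvest(node.get(path[0]), path[1:])


def harvest_paths(resource: dict, paths: List[str]) -> List[Any]:
    """Collect the values for all meta data paths of one search parameter."""
    values: List[Any] = []
    for path in paths:
        steps = path.split(".")[1:]  # drop the leading resource name
        values.extend(harvest(resource, steps))
    return values


# Illustrative data only.
patient = {
    "Name": [
        {"Prefix": ["Mrs."], "Family": ["LaVaughn", "Robinson"]},
        {"Family": ["Obama"]},
    ]
}
found = harvest_paths(patient, ["patient.Name.Prefix", "patient.Name.Family"])
```

Because the walk fans out over every list it meets, all three HumanName entries contribute to the same search parameter without any per-parameter code.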

Page 18: SPARK Search  Engine

Indexing. 2. Determine data type

> patient (Patient) > Name (HumanName) > LastName (string)

Data type: string
Search parameter type: string

Selected indexing method:
- Single value – as string
- More values – as string array

Page 19: SPARK Search  Engine

Indexing. 2. Determine data type

> patient (Patient) > Gender (Coding) > Coding (List<Coding>) > Code (CodeableConcept)

Data type: Code
Search parameter type: Token

Selected indexing method: store each codeable concept in an array
- System (uri)
- Code (string)
- Display (string)
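
The two indexing methods described so far (string and token) could be selected by search parameter type roughly like this. It is a sketch under assumptions: the function names, the INDEXERS table and the dictionary layout of the input are hypothetical, and the system URI is a placeholder.

```python
from typing import Any, Dict, List, Union


def index_string(values: List[str]) -> Union[str, List[str]]:
    """A single value is stored as a string, several values as a string array."""
    return values[0] if len(values) == 1 else list(values)


def index_token(concepts: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Store one {System, Code, Display} entry per coding of each codeable concept."""
    entries = []
    for concept in concepts:
        for coding in concept.get("coding", []):
            entries.append({
                "System": coding.get("system", ""),
                "Code": coding.get("code", ""),
                "Display": coding.get("display", ""),
            })
    return entries


# Search parameter type -> indexing method.
INDEXERS = {"string": index_string, "token": index_token}


def index_values(param_type: str, values: list):
    """Pick the indexing method that belongs to the search parameter type."""
    return INDEXERS[param_type](values)


gender = [{"coding": [{"system": "http://example.org/gender",
                       "code": "F", "display": "Female"}]}]
indexed_gender = index_values("token", gender)
```

Keeping the choice in one table matches the idea that all code is based on search parameter types rather than on individual parameters.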

Page 20: SPARK Search  Engine

Indexing. 3. Groom your data

- Remove dashes, dots, slashes from dates etc.

- If you implement a "like" search that matches from the left side (a prefix match), you might want to split names at the dash into multiple entries, so that each part can produce a hit.
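
A minimal sketch of both grooming steps, assuming dates arrive as text and names are split only at dashes. The function names and sample values are hypothetical.

```python
import re
from typing import List


def groom_date(text: str) -> str:
    """Strip dashes, dots and slashes so differently written dates index alike."""
    return re.sub(r"[-./]", "", text)


def groom_name(name: str) -> List[str]:
    """Index the full name plus each part around a dash."""
    parts = [p for p in name.split("-") if p]
    return [name] + parts if len(parts) > 1 else [name]


def prefix_match(entries: List[str], query: str) -> bool:
    """Left-anchored, case-insensitive 'like' search over the index entries."""
    q = query.lower()
    return any(entry.lower().startswith(q) for entry in entries)


groomed = groom_name("Smith-Jones")
```

Without the split, a prefix search for "Jon" would miss "Smith-Jones"; with it, the second half of the name is an entry of its own.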

Page 21: SPARK Search  Engine

Indexing. 4. Store in the index

Field      Value
Resource   "Patient"
Local ID   patient/1
Level      0
Family     ["LaVaughn", "Robinson", "Obama"]
Given      "Michelle"
Gender     [ { System: “…”, Code: “..”, Display: “..” } , …

* Level: The patient is not a contained resource (level 0).

* Family: In Mongo you can store an array that can be searched like a normal string.
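
The note on Family relies on how Mongo matches equality on array fields: a filter such as {"Family": "Obama"} also matches a document whose Family field is an array containing "Obama". The sketch below mimics that rule in plain Python so it runs without a database; the field name LocalId and the helper names are assumptions, not Spark's schema.

```python
from typing import Any, Dict, List

# One index entry, shaped like the table above (Gender left out).
entry: Dict[str, Any] = {
    "Resource": "Patient",
    "LocalId": "patient/1",  # hypothetical field name
    "Level": 0,              # 0 = not a contained resource
    "Family": ["LaVaughn", "Robinson", "Obama"],
    "Given": "Michelle",
}


def field_equals(doc: Dict[str, Any], field: str, value: Any) -> bool:
    """Mimic Mongo equality: an array field matches if any element equals the value."""
    stored = doc.get(field)
    if isinstance(stored, list):
        return value in stored
    return stored == value


def find(index: List[Dict[str, Any]], **criteria: Any) -> List[Dict[str, Any]]:
    """Return the entries that satisfy every field=value criterion."""
    return [doc for doc in index
            if all(field_equals(doc, f, v) for f, v in criteria.items())]


hits = find([entry], Resource="Patient", Family="Obama")
```

This is why single values and arrays can share one field in the index: the search code does not need to know which of the two was stored.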

Page 22: SPARK Search  Engine

Future

Version 5. NEXT

- All parameters based on FHIR data types?
- Joins using Mongo Map-Reduce?

Page 23: SPARK Search  Engine

Complexity

So what is the issue?

Page 24: SPARK Search  Engine

Complexity

Include & Chained parameters

- Joining over references returns multiple resource types
- Client side (not in Mongo database) joins
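
A client-side join for a chained parameter such as provider.name:partial=Health can be pictured as two index lookups glued together in application code. The sketch below uses an in-memory list instead of Mongo and assumes the provider reference points at an Organization; all entries, field names and function names are made up for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical index entries; field names are illustrative only.
INDEX: List[Dict[str, str]] = [
    {"Resource": "Organization", "LocalId": "organization/1", "name": "Example Health Centre"},
    {"Resource": "Organization", "LocalId": "organization/2", "name": "Acme Clinic"},
    {"Resource": "Patient", "LocalId": "patient/1", "provider": "organization/1"},
    {"Resource": "Patient", "LocalId": "patient/2", "provider": "organization/2"},
]


def search(resource: str, field: str, predicate: Callable[[str], bool]) -> List[Dict[str, str]]:
    """One plain lookup: entries of a resource type whose field satisfies the predicate."""
    return [e for e in INDEX
            if e["Resource"] == resource and field in e and predicate(e[field])]


def patients_by_provider_name(fragment: str) -> List[str]:
    # Step 1: resolve the end of the chain (the name of the referenced resource).
    orgs = search("Organization", "name", lambda n: fragment.lower() in n.lower())
    org_ids = {o["LocalId"] for o in orgs}
    # Step 2: join in application code: patients whose reference is in that set.
    patients = search("Patient", "provider", lambda ref: ref in org_ids)
    return [p["LocalId"] for p in patients]
```

When a reference may point at several resource types, step 1 has to be repeated per target type and the id sets merged before step 2, which is part of what makes these joins costly.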

Page 25: SPARK Search  Engine

Complexity

Transactions

- FHIR has bulk POST
- Split between indexing and storage

Page 26: SPARK Search  Engine

Complexity

Multiple types

Some properties do not have a fixed type.

Example: observation.value

Can be a:
- CodeableConcept
- String
- Quantity (number + unit)
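
One way to cope with a property like observation.value is to pick the indexing method from the runtime type of the value. The sketch below does that with Python's functools.singledispatch; the small Quantity, Coding and CodeableConcept classes are simplified stand-ins for the FHIR data types, and the shape of the returned index entries is an assumption.

```python
from dataclasses import dataclass
from functools import singledispatch
from typing import Any, Dict, List


@dataclass
class Quantity:
    value: float
    unit: str


@dataclass
class Coding:
    system: str
    code: str
    display: str = ""


@dataclass
class CodeableConcept:
    coding: List[Coding]


@singledispatch
def index_value(value: Any) -> Dict[str, Any]:
    """Fallback: refuse types that have no indexing method."""
    raise TypeError(f"no indexer for {type(value).__name__}")


@index_value.register(str)
def _(value):
    return {"type": "string", "value": value}


@index_value.register(Quantity)
def _(value):
    return {"type": "quantity", "value": value.value, "unit": value.unit}


@index_value.register(CodeableConcept)
def _(value):
    return {"type": "token",
            "value": [{"System": c.system, "Code": c.code, "Display": c.display}
                      for c in value.coding]}
```

The search side then has to know that one parameter can hit entries of different shapes, which is the complexity this slide points at.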