SPARK Search Engine. Martijn Harthoorn Programmer at Furore Implementer of the Search Engine of...


Page 1

SPARK

Search Engine

Page 2

Who am I?

Martijn Harthoorn
Programmer at Furore
Implementer of the Search Engine of SPARK

http://spark.furore.com/fhir/patient?...

The work after the question mark.

Page 3

The place of Search

REST Service

Storage

Index & Search

MongoDB

Spark

Page 4

Paradigm

FHIR client should be easy. FHIR server needs to solve the complex issues.

Search

Search has some…

Page 5

First there was Storage

Search

Then there was Search

Page 6

Connectathon

To test a client – you must have a tested server
To test a server – you must have a tested client

“One fool can ask more questions than seven wise men can answer”

Page 7

Connectathon

“But what if you are wrong?”

Page 8

History

Version 1.

- A Generics based implementation
- On top of the FHIR data model
- Programmed per search parameter
- No meta data available yet
- No indexing
- Slow

Page 9

History

Version 2.

- Data model independent
- Meta data not available - manually added
- Lucene.NET as indexer (index in Lucene, database in Mongo)
- Fast
- Standardised all parameter specifics into standard “modifiers”
- All code based on search parameter types
- Joins are client side

Page 10

History

Version 3.

- Modified to store the Lucene index in Mongo
- Index storage unreliable
- Never saw the light of day

Page 11

History

Version 4. CURRENT

- Index storage to a dedicated Mongo collection
- Build expression tree from parameters
- Chained parameters have full functionality (modifiers, operators)
- Joins are client side

Page 12

Indexing

Why indexing?

Page 13

Why indexing

http://spark.furore.com/fhir/patient?provider.name:partial=Health
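The query above reads as a chained parameter (provider, then name) with a modifier (partial) and a value (Health). Answering it without an index would mean opening every Patient and following its provider reference. As a rough illustration only, the part after the question mark could be split up as below. The class name SearchCriterium and the function parse_search_url are made up for this sketch and are not taken from Spark.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import parse_qsl, urlsplit


@dataclass
class SearchCriterium:
    """Hypothetical container for one search parameter."""
    chain: List[str]          # e.g. ["provider", "name"]
    modifier: Optional[str]   # e.g. "partial", or None when absent
    value: str                # e.g. "Health"


def parse_search_url(url: str) -> List[SearchCriterium]:
    """Split the query string into chain, modifier and value per parameter."""
    criteria = []
    for key, value in parse_qsl(urlsplit(url).query, keep_blank_values=True):
        # "provider.name:partial" -> name "provider.name", modifier "partial"
        name, _, modifier = key.partition(":")
        criteria.append(SearchCriterium(name.split("."), modifier or None, value))
    return criteria


criteria = parse_search_url(
    "http://spark.furore.com/fhir/patient?provider.name:partial=Health"
)
print(criteria[0])
```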


Page 15

Indexing. HOW-TO

You DO want the data de-serialized to an object with all values strongly typed.

You DON’T want to spend time analyzing and interpreting JSON and/or XML.

1. Harvest the Resource
2. Determine data type
3. Groom your data
4. Store data in Index

Page 16

Indexing. 1. Harvesting

Resource: Patient
Search parameter: family

Searches for the family name and prefix of every HumanName that is registered with a Patient.

Usage:

http://spark.furore.com/fhir/patient?family=White

Page 17

Indexing. 1. Harvesting

Patient

List<Name>

Name (HumanName)

Name (HumanName)

Name (HumanName)

Family

Prefix

Given

Suffix

Resource: Patient
Search parameter: family

Using the Visitor pattern

Path from Meta data: "patient.Name.Prefix" "patient.Name.Family"
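A minimal sketch of harvesting along such paths, assuming a de-serialized, strongly typed object. It uses a plain recursive path walk that fans out over lists, which is simpler than a full Visitor pattern. The classes, the functions harvest and harvest_paths, and the sample names are all invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class HumanName:
    Family: List[str] = field(default_factory=list)
    Prefix: List[str] = field(default_factory=list)
    Given: List[str] = field(default_factory=list)
    Suffix: List[str] = field(default_factory=list)


@dataclass
class Patient:
    Name: List[HumanName] = field(default_factory=list)


def harvest(node: Any, path: List[str]) -> List[Any]:
    """Collect every value reachable from node along path."""
    if isinstance(node, list):
        # Fan out: follow the same remaining path for every list item.
        values = []
        for item in node:
            values.extend(harvest(item, path))
        return values
    if not path:
        return [node]
    child = getattr(node, path[0], None)
    return [] if child is None else harvest(child, path[1:])


def harvest_paths(resource: Any, paths: List[str]) -> List[Any]:
    """Harvest several meta data paths such as "patient.Name.Family"."""
    values = []
    for path in paths:
        segments = path.split(".")[1:]  # drop the leading resource name
        values.extend(harvest(resource, segments))
    return values


# Made-up sample data.
patient = Patient(Name=[
    HumanName(Family=["White"], Prefix=["Mr."], Given=["Jack"]),
    HumanName(Family=["Black"]),
])
print(harvest_paths(patient, ["patient.Name.Prefix", "patient.Name.Family"]))
```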

Page 18

Indexing. 2. Determine data type

> patient (Patient)
  > Name (HumanName)
    > LastName (string)

Data type: string
Search parameter type: string

Selected indexing method:
- Single value – as string
- More values – as string array
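The single value versus string array choice can be sketched as follows. The function name to_string_index_value is hypothetical, and returning None for "nothing harvested" is an assumption of this sketch.

```python
from typing import List, Optional, Union


def to_string_index_value(values: List[str]) -> Optional[Union[str, List[str]]]:
    """Pick the index representation for harvested string values."""
    if not values:
        return None            # assumption: nothing harvested, nothing indexed
    if len(values) == 1:
        return values[0]       # single value: stored as a plain string
    return list(values)        # more values: stored as a string array


print(to_string_index_value(["White"]))
print(to_string_index_value(["LaVaughn", "Robinson", "Obama"]))
```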

Page 19

Indexing. 2. Determine data type

> patient (Patient)
  > Gender (Coding)
    > Coding (List<Coding>)
      > Code (CodeableConcept)

Data type: Code
Search parameter type: Token

Selected indexing method: store each codeable concept in an array
- System (uri)
- Code (string)
- Display (string)
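A sketch of that token layout: one entry with System, Code and Display per coding. The Coding class is a stand-in for the real FHIR model class, and the system URI and gender code are placeholder sample values.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Coding:
    """Simplified stand-in for a FHIR coding."""
    system: Optional[str] = None
    code: Optional[str] = None
    display: Optional[str] = None


def to_token_index_value(codings: List[Coding]) -> List[Dict[str, Optional[str]]]:
    """Store every coding as a small document inside one array."""
    return [
        {"System": c.system, "Code": c.code, "Display": c.display}
        for c in codings
    ]


# Placeholder values, not a real code system.
gender = [Coding(system="http://example.org/gender", code="F", display="Female")]
print(to_token_index_value(gender))
```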

Page 20

Indexing. 3. Groom your data

- Remove dashes, dots, slashes from dates etc.

- If you implement a LIKE search that matches from the left side (a prefix match), you might want to split names at the dash into multiple hits.
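Both grooming steps can be sketched in a few lines. The function names and the choice to keep the full name next to its parts are assumptions of this sketch. Note that stripping separators does not reorder the date parts.

```python
import re
from typing import List


def groom_date(raw: str) -> str:
    """Remove dashes, dots and slashes from a date string."""
    return re.sub(r"[-./]", "", raw)


def split_name(name: str) -> List[str]:
    """Index a dashed name as the full name plus each of its parts."""
    parts = [p for p in name.split("-") if p]
    return [name] + parts if len(parts) > 1 else [name]


def starts_with(entries: List[str], prefix: str) -> bool:
    """Left-anchored, case-insensitive match against any indexed entry."""
    wanted = prefix.lower()
    return any(entry.lower().startswith(wanted) for entry in entries)


print(groom_date("2013-09-27"))
print(split_name("Smith-Jones"))
print(starts_with(split_name("Smith-Jones"), "jon"))
```

Without the split, a left-anchored search for "Jon" would miss "Smith-Jones".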

Page 21

Indexing. 4. Store in the index

Field      Value
Resource   "Patient"
Local ID   patient/1
Level      0
Family     ["LaVaughn", "Robinson", "Obama"]
Given      "Michelle"
Gender     [ { System: “…”, Code: “..”, Display: “..” } , …

* Level: The patient is not a contained resource (level 0).

* Family: In Mongo you can store an array that can be searched like a normal string.
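The array behaviour can be imitated without a database. In MongoDB, an equality query on a field that holds an array matches when any element equals the value, so a filter like {"Family": "Obama"} would find this entry. The sketch below reproduces only that matching rule in plain Python. The helpers field_matches and find are invented, and the Gender field is left out because its values are elided on the slide.

```python
from typing import Any, Dict, List

index_entry: Dict[str, Any] = {
    "Resource": "Patient",
    "Local ID": "patient/1",
    "Level": 0,  # 0: not a contained resource
    "Family": ["LaVaughn", "Robinson", "Obama"],
    "Given": "Michelle",
}


def field_matches(entry: Dict[str, Any], field: str, wanted: Any) -> bool:
    """Equality that treats an array field like a single value."""
    stored = entry.get(field)
    if isinstance(stored, list):
        return wanted in stored      # any element may match
    return stored == wanted


def find(index: List[Dict[str, Any]], field: str, wanted: Any) -> List[str]:
    """Return the local IDs of all entries that match."""
    return [e["Local ID"] for e in index if field_matches(e, field, wanted)]


index = [index_entry]
print(find(index, "Family", "Obama"))
print(find(index, "Given", "Michelle"))
```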

Page 22

Future

Version 5. NEXT

- All parameters based on FHIR data types?
- Joins using Mongo Map-Reduce?

Page 23

Complexity

So what is the issue?

Page 24

Complexity

Include & Chained parameters

- Joining over references returns multiple resource types
- Client side (not in Mongo database) joins
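A client side join for a chained parameter can be pictured as two index lookups glued together in application code. The sketch below uses in-memory lists instead of Mongo collections. The field names, sample data and function names are made up, and treating "partial" as a case-insensitive substring match is an assumption.

```python
from typing import Any, Dict, Iterable, List

# Made-up index entries for two resource types.
organization_index = [
    {"Local ID": "organization/1", "Name": "Health Level Seven"},
    {"Local ID": "organization/2", "Name": "Furore"},
]
patient_index = [
    {"Local ID": "patient/1", "Provider": "organization/1"},
    {"Local ID": "patient/2", "Provider": "organization/2"},
    {"Local ID": "patient/3", "Provider": "organization/1"},
]


def search_partial(index: List[Dict[str, Any]], field: str, text: str) -> List[str]:
    """Step 1: find the referenced resources by a partial text match."""
    needle = text.lower()
    return [e["Local ID"] for e in index if needle in str(e.get(field, "")).lower()]


def search_reference(index: List[Dict[str, Any]], field: str,
                     targets: Iterable[str]) -> List[str]:
    """Step 2: find the resources that refer to any of those IDs."""
    wanted = set(targets)
    return [e["Local ID"] for e in index if e.get(field) in wanted]


def patients_by_provider_name(text: str) -> List[str]:
    """The join itself runs in application code, not in the database."""
    organization_ids = search_partial(organization_index, "Name", text)
    return search_reference(patient_index, "Provider", organization_ids)


print(patients_by_provider_name("Health"))
```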

Page 25

Complexity

Transactions

- FHIR has bulk POST
- Split between Indexing and storage

Page 26

Complexity

Multiple types

Some properties do not have a fixed type.

Example: observation.value

Can be a:
- CodeableConcept
- String
- Quantity (number + unit)
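One way to cope with such a property is to decide the index representation at run time from the actual type of the value. The sketch below uses simplified stand-in classes rather than the real FHIR model, and the output layout and the name index_value are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Union


@dataclass
class Coding:
    system: Optional[str] = None
    code: Optional[str] = None
    display: Optional[str] = None


@dataclass
class CodeableConcept:
    coding: List[Coding] = field(default_factory=list)


@dataclass
class Quantity:
    value: float
    unit: Optional[str] = None


def index_value(value: Union[CodeableConcept, Quantity, str]) -> Dict[str, Any]:
    """Choose how to index a value whose type is only known at run time."""
    if isinstance(value, CodeableConcept):
        entries = [{"System": c.system, "Code": c.code, "Display": c.display}
                   for c in value.coding]
        return {"Type": "token", "Value": entries}
    if isinstance(value, Quantity):
        return {"Type": "quantity", "Value": {"Number": value.value, "Unit": value.unit}}
    if isinstance(value, str):
        return {"Type": "string", "Value": value}
    raise TypeError(f"No indexing rule for {type(value).__name__}")


print(index_value(Quantity(72.5, "kg")))
print(index_value("negative"))
```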