Faceting with Lucene Block Join Query - Lucene/Solr Revolution 2014

Post on 16-Jul-2015


Faceting with Lucene Block Join Query

Agenda

1. Why do we need special faceting for Block Join queries?

2. Proposed Block Join facet component.


Introducing myself

Oleg Savrasov, PhD

A programmer

Working for Grid Dynamics (griddynamics.com)

Work and live in Saint Petersburg, Russia

Online shopping

Jerrica is looking for a dress

Huge number of dresses

Facet filters help

[Figure: facet filters, reduced amount]

Tasks to be solved

● Performant Search

● Facet calculation/filtering

FacetComponent ?

A product has many SKUs

Aggregated facet counts

Facets should count products, not SKUs.

Expected facets:

COLOR

Blue : 1

Red : 1

SIZE

S : 1

M : 1

Flat documents don’t help

False positive match for +COLOR:Blue +SIZE:M
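The false positive can be reproduced with a small sketch. The SKU data below is an assumption chosen to be consistent with the facet numbers on these slides (one product with Blue/S, Red/S and Red/M SKUs), and the helper names `flat_matches` and `sku_matches` are hypothetical.

```python
# Assumed SKUs of a single product (illustrative data, not taken from the slides).
skus = [
    {"COLOR": "Blue", "SIZE": "S"},
    {"COLOR": "Red", "SIZE": "S"},
    {"COLOR": "Red", "SIZE": "M"},
]

# Flat model: one product document whose fields hold the values of all its SKUs.
flat_product = {
    "COLOR": {sku["COLOR"] for sku in skus},
    "SIZE": {sku["SIZE"] for sku in skus},
}


def flat_matches(doc, color, size):
    """Emulate +COLOR:<color> +SIZE:<size> against a flattened product document."""
    return color in doc["COLOR"] and size in doc["SIZE"]


def sku_matches(sku_list, color, size):
    """True only if one single SKU carries both values."""
    return any(s["COLOR"] == color and s["SIZE"] == size for s in sku_list)


flat_hit = flat_matches(flat_product, "Blue", "M")
real_hit = sku_matches(skus, "Blue", "M")
print(flat_hit, real_hit)
```

The flattened document matches because Blue and M both occur somewhere in the product, even though no individual SKU is both Blue and M.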

Separate SKU documents

q = *:*

facet.field = COLOR

facet.field = SIZE

COLOR

Blue : 1

Red : 2

SIZE

S : 2

M : 1

Wrong numbers! There is only one product.
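A minimal sketch of why SKU-level counting overstates the numbers. The three SKU documents and the `product_id` field are assumptions picked to reproduce the counts shown on the slides; the function names are hypothetical.

```python
from collections import Counter

# Assumed SKU documents, all belonging to one product.
sku_docs = [
    {"product_id": "p1", "COLOR": "Blue", "SIZE": "S"},
    {"product_id": "p1", "COLOR": "Red", "SIZE": "S"},
    {"product_id": "p1", "COLOR": "Red", "SIZE": "M"},
]


def sku_level_counts(docs, field):
    """What plain field faceting over SKU documents returns: one count per SKU."""
    return Counter(doc[field] for doc in docs)


def product_level_counts(docs, field):
    """What the shopper expects: each (product, value) pair is counted once."""
    pairs = {(doc["product_id"], doc[field]) for doc in docs}
    return Counter(value for _, value in pairs)


print(dict(sku_level_counts(sku_docs, "COLOR")))      # Red is counted twice
print(dict(product_level_counts(sku_docs, "COLOR")))  # every value counted once
```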

Search products only

q = *:*

fq = scope:product

facet.field = COLOR

facet.field = SIZE

COLOR : 0

SIZE : 0

No such fields in product documents.


Solr Block Join Support (since Lucene 3.4.0)

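[Figure: the index laid out by docId as blocks. Each block holds child (SKU) documents (Green, Blue, Yellow, ...) followed by their parent Product document.]

Query: {!parent which="scope:product"}COLOR:Blue

[Figure: bitsets over docIds. scope:product marks the three parent docs, COLOR:Blue marks two matching child docs, and ToParentQuery marks the two parent docs whose blocks contain a match. Within a block, child docs come first and the parent doc last.]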
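The mapping from matching child documents to their parents can be sketched in a few lines. This is a simplified model under stated assumptions, not Lucene's implementation: the index contents and the `to_parent` helper are hypothetical, and every block is assumed to end with its parent document.

```python
# Hypothetical index: each block lists child (SKU) docs first, then the parent.
index = [
    {"COLOR": "Green"}, {"COLOR": "Blue"}, {"COLOR": "Yellow"}, {"scope": "product"},  # docIds 0-3
    {"COLOR": "Green"}, {"COLOR": "Yellow"}, {"scope": "product"},                     # docIds 4-6
    {"COLOR": "Blue"}, {"COLOR": "Yellow"}, {"scope": "product"},                      # docIds 7-9
]


def to_parent(docs, is_parent, child_matches):
    """Return docIds of parents whose block contains at least one matching child."""
    parent_bits = [is_parent(doc) for doc in docs]  # plays the role of the parent filter bitset
    matched_parents = []
    for doc_id, doc in enumerate(docs):
        if parent_bits[doc_id] or not child_matches(doc):
            continue
        # The parent of a child is the next set bit in the parent bitset.
        parent_id = next(i for i in range(doc_id + 1, len(docs)) if parent_bits[i])
        if not matched_parents or matched_parents[-1] != parent_id:
            matched_parents.append(parent_id)
    return matched_parents


result = to_parent(
    index,
    is_parent=lambda doc: doc.get("scope") == "product",
    child_matches=lambda doc: doc.get("COLOR") == "Blue",
)
print(result)
```

The second block has no Blue child, so its parent is not returned.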

SOLR-5743 Faceting with Block Join support

● Create BlockJoinFacetComponent

● Only DocValues fields are supported

● Facet counts should correspond to the number of parent documents

● ToParentQuery is expected

Faceting over DocSet slices

[Figure: the same block layout by docId (child docs followed by their Product parent), with a DocSet bitset over the child docs cut into one slice per block.]

DocSet slice counts

COLOR Blue : 2

Aggregated counts

COLOR Blue : +1
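The slice-based aggregation can be sketched as follows: a value that occurs several times within one block's slice still adds only +1 to the parent-level count. This is an illustration under assumptions, not the SOLR-5743 patch itself; the index contents, the `child_docset` input and the function name are hypothetical.

```python
from collections import Counter

# Hypothetical index: children first, parent last in each block.
index = [
    {"COLOR": "Blue", "SIZE": "S"},  # docId 0
    {"COLOR": "Blue", "SIZE": "M"},  # docId 1
    {"COLOR": "Red", "SIZE": "M"},   # docId 2
    {"scope": "product"},            # docId 3, parent of 0-2
    {"COLOR": "Blue", "SIZE": "L"},  # docId 4
    {"scope": "product"},            # docId 5, parent of 4
]


def aggregated_facet_counts(docs, child_docset, field):
    """Count, per field value, the number of blocks (parents) that contain it."""
    totals = Counter()
    slice_values = set()  # distinct values seen in the current block's slice
    for doc_id, doc in enumerate(docs):
        if doc.get("scope") == "product":
            # End of the block: every distinct value contributes +1.
            totals.update(slice_values)
            slice_values = set()
        elif doc_id in child_docset and field in doc:
            slice_values.add(doc[field])
    return totals


# Assumed DocSet: the child docs that matched COLOR:Blue.
child_docset = {0, 1, 4}
color_counts = aggregated_facet_counts(index, child_docset, "COLOR")
size_counts = aggregated_facet_counts(index, child_docset, "SIZE")
print(dict(color_counts), dict(size_counts))
```

In the first block Blue appears twice in the slice, yet it raises the aggregated Blue count by one only.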

Block Join Facet Component

BlockJoinFacetCollector

Facets counting

It works!

q = {!parent which="scope:product"}COLOR:Blue

child.facet.field = SIZE

<response>
  ...
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="SIZE">
        <int name="S">14</int>
        <int name="L">22</int>
        <int name="XL">17</int>
      </lst>
    </lst>
  </lst>
</response>
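A client-side sketch of issuing such a request and reading the facet counts. The host, collection name and handler path are assumptions, and the proposed component would have to be registered with that handler; the parameters `q` and `child.facet.field` and the response shape are taken from the slide. No request is actually sent here, the sample response is parsed from a string.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

params = [
    ("q", '{!parent which="scope:product"}COLOR:Blue'),
    ("child.facet.field", "SIZE"),
]
# Hypothetical host, collection and handler path.
url = "http://localhost:8983/solr/products/select?" + urlencode(params)

# Response fragment in the shape shown on the slide.
sample_response = """<response>
  <lst name="facet_counts">
    <lst name="facet_fields">
      <lst name="SIZE">
        <int name="S">14</int>
        <int name="L">22</int>
        <int name="XL">17</int>
      </lst>
    </lst>
  </lst>
</response>"""


def parse_facet_fields(xml_text):
    """Return {field: {value: count}} from a facet_counts/facet_fields section."""
    root = ET.fromstring(xml_text)
    path = "./lst[@name='facet_counts']/lst[@name='facet_fields']/lst"
    facets = {}
    for field in root.findall(path):
        facets[field.get("name")] = {
            entry.get("name"): int(entry.text) for entry in field.findall("int")
        }
    return facets


facets = parse_facet_fields(sample_response)
print(url)
print(facets)
```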

The dress is found

Further improvements

● Thorough profiling

● Performance improvements

● Algorithmic improvements

References

http://www.slideshare.net/MarkHarwood/proposal-for-nested-document-support-in-lucene

http://blog.mikemccandless.com/2012/01/searching-relational-content-with.html

http://blog.griddynamics.com/2013/09/solr-block-join-support.html

Big thanks!

Do you have any questions?

Please vote for SOLR-5743.