Thinking Sphinx Talk at Boston.rb

Posted on 15-Jan-2015


Dan Pickett talks about Thinking Sphinx, the Ruby plugin/gem that interfaces with the Sphinx full-text search engine.

Transcript of Thinking Sphinx Talk at Boston.rb

Searching With

Thinking Sphinx

Dan Pickett

I Know What You’re Thinking...

But, No

The Sphinx We’re Talking About

Yes, the Eye is looking at you

What is Full Text, Indexed Search?

• Searches for keyword matches

• Think of the DB “like” operator on steroids

• File based index (reduces DB load)

• Relevance Ranking / Phrase Proximity

• Two step process

• Query the DB and create indices (indexer)

• Search against created indices (searchd)
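The two-step process above can be sketched in plain Ruby. This is a toy illustration only: `TinyIndex`, `SUSPECTS`, and the ranking rule (count of matched query terms) are hypothetical names and choices made for this example, not part of Sphinx or Thinking Sphinx, whose indices live on disk and rank far more carefully.

```ruby
# Hypothetical in-memory stand-in for the two steps; not Sphinx or
# Thinking Sphinx code.
class TinyIndex
  def initialize
    @postings = Hash.new { |hash, term| hash[term] = [] }
  end

  # Step 1 (the "indexer" role): read every record once and build a
  # term => [record ids] lookup.
  def build(records)
    records.each do |id, text|
      tokenize(text).uniq.each { |term| @postings[term] << id }
    end
    self
  end

  # Step 2 (the "searchd" role): answer queries from the index alone,
  # ranking records by how many distinct query terms they matched.
  def search(query)
    hits = Hash.new(0)
    tokenize(query).uniq.each do |term|
      @postings.fetch(term, []).each { |id| hits[id] += 1 }
    end
    hits.sort_by { |id, count| [-count, id] }.map(&:first)
  end

  private

  def tokenize(text)
    text.downcase.scan(/[a-z0-9]+/)
  end
end

# Assumed sample data standing in for database rows.
SUSPECTS = {
  1 => "Carmen Sandiego, master thief in a red trench coat",
  2 => "Unnamed henchman last seen in Boston",
  3 => "Red herring, not actually a thief"
}

index = TinyIndex.new.build(SUSPECTS)
p index.search("red coat")  # record 1 matches both terms, record 3 only one
p index.search("boston")
```

Once `build` has run, `search` never touches the original records, which is the sense in which an index takes load off the database.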

Can Haz Search?

What’s Out There

• Direct SQL

• Ferret

• SOLR

• Lucene

• Sphinx

Every time you integrate Ferret, an angel weeps for you

‘Nuff Said

Although angels are known to be emotional characters

Courtesy of: Evan Weaver, “Rails Search Benchmarks” 03/17/08

UltraSphinx

Also, Evan Weaver likes Thinking Sphinx

Why Sphinx Rocks

• Relevance Ratings and Phrase Proximity

• Active Development

• searchd Daemon doesn’t hog memory

• Delta Indexing

• Fast Indexing + Querying

• Distributed Capability

You rock too, but Sphinx is cooler

Why TS Rocks

• Maximizes use of the Riddle Client

• Sort modes

• Match modes

• Great support and active community

• Available as a gem and a plug-in

• Beautiful Code

• Pat Allan is the man

That was mean - I apologize for the burn in the last slide.

You are equally as cool as Sphinx

Let’s Play A Game...

Where the F*ck is Carmen Sandiego?©

Courtesy: Bob-Rz @ Deviant Art 02/19/07

“Where the F*ck is Carmen Sandiego?” is a registered trademark of Enlight Solutions, Inc. Well, not really but it sounds cool. Honestly, though, does anyone ever read the fine print? You should be paying attention to the presentation. On we go... seriously, focus people.

Define your Index of Suspects

InstallShield FTL

Let’s Use Rake

• rake ts:config

• rake ts:in

• rake ts:start

• rake ts:stop

• rake ts:restart

Get to Work, Detective

Make Your Arrest

That was easy...

Additional Features

• Match Modes

• Sort Modes

• Polymorphism

• Field Weighting

• Integration with will_paginate

What I Wish I Knew

Protip: Despite its misleading name, Rockapella does not rock

Serious Mullet

What I Wish I Knew

About Integrating TS

• Sometimes the indexer silently fails

• Watch your output

• Disregard the Distributed Index warning

• Use delta indexing

• Run regular index tasks

• Use delayed_job or another queue manager to handle delta indexing
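To show why delta indexing and regular full index runs go together, here is a minimal plain-Ruby sketch of the idea: a large main index rebuilt occasionally, plus a small delta that absorbs changes in between. `DeltaSearch` and its methods are hypothetical names for this illustration, and the "index" is just a hash scanned linearly; it is not how Sphinx or Thinking Sphinx store or search data.

```ruby
# Hypothetical sketch of the delta-indexing idea; not Sphinx or
# Thinking Sphinx code.
class DeltaSearch
  def initialize(records)
    @records = records.dup # stand-in for the database: id => text
    @main = {}             # big index, rebuilt only by full_reindex
    @delta = {}            # small index of changes since the last rebuild
    full_reindex
  end

  # Expensive step: re-read everything, then empty the delta.
  def full_reindex
    @main = @records.dup
    @delta = {}
  end

  # Cheap step: a save only touches the delta.
  def save(id, text)
    @records[id] = text
    @delta[id] = text
  end

  # Search both indices; the delta's copy of a record replaces the
  # stale copy in the main index.
  def search(word)
    merged = @main.merge(@delta)
    merged.select { |_id, text| text.downcase.include?(word.downcase) }.keys.sort
  end

  def delta_size
    @delta.size
  end
end

search = DeltaSearch.new({ 1 => "Carmen spotted in Boston" })
search.save(2, "Henchman spotted in Boston")
search.save(1, "Carmen fled to Cairo")

p search.search("boston") # only record 2; record 1's stale text is masked
p search.delta_size       # the delta grows with every change

search.full_reindex       # the regular index task folds the delta back in
p search.delta_size
```

In this sketch the delta keeps growing until `full_reindex` runs, which is one way to read the advice to keep running regular index tasks. Pushing the `save`-time delta work onto a background queue, as the delayed_job bullet suggests, would presumably keep it out of the request cycle.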

What time is it? Beer o’clock

What I Wish I Knew

About Deploying TS

• Store PID files in a shared folder

• Ensure you’ve set proper permissions

• Set memory limits on indexing

• mem_limit option in sphinx.yml

• For large data sets, indices can be extremely large

• Ensure you have a surplus of storage capacity

Are we done yet? It’s about that time for a beer...

What’s Missing?

• Excerpting

• Strong Facet Support

• ASpell Integration/Spell Check support

Blah, blah, blah - You must be getting thirsty by now

It’s a Young but Awesome Utility

• Clone the source and see for yourself

• freelancing-god/thinking-sphinx

• Cucumber test-suite

• Extremely well architected

• Join the mailing list (Google Groups)

Did he mention Pat Allan is the man, yet?

Thanks

• Follow me on Twitter

• www.twitter.com/dpickett

• Check out my blog

• www.enlightsolutions.com

• Recommend me

Questions?