
Super Awesome Presentation

Dandre Allison, Devin Adair

Comparing the Sensitivity of Information Retrieval Metrics

Filip Radlinski
Microsoft, Cambridge, UK
filiprad@microsoft.com

Nick Craswell
Microsoft, Redmond, WA, USA
nickcr@microsoft.com

How do you evaluate Information Retrieval effectiveness?

• Precision (P)

• Mean Average Precision (MAP)

• Normalized Discounted Cumulative Gain (NDCG)

Precision

• For a given query, count the relevant documents in the top 5 and divide by 5 (Precision at 5; sketched below)

• Average over all queries
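
As a rough illustration, here is a minimal Python sketch of Precision at 5 as described above; the function names and the binary-judgment layout (a set of relevant document ids per query) are assumptions, not from the slides.

```python
def precision_at_k(ranked_docs, relevant_docs, k=5):
    """Fraction of the top-k results that are judged relevant (binary judgments)."""
    top_k = ranked_docs[:k]
    return sum(1 for doc in top_k if doc in relevant_docs) / k

def mean_precision_at_k(rankings, judgments, k=5):
    """Average Precision@k over all queries.

    rankings:  dict mapping query -> ranked list of document ids
    judgments: dict mapping query -> set of relevant document ids
    """
    return sum(precision_at_k(rankings[q], judgments[q], k) for q in rankings) / len(rankings)
```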

Mean Average Precision

• For a given query, for each relevant document in the top 10, take the precision up to its rank

• Sum these precisions and normalize by the number of known relevant documents (sketched below)

• Average over all queries
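
A hedged sketch of Average Precision over the top 10, matching the bullets above; the data layout (ranked lists of document ids, a set of known relevant documents per query) is an assumption used for illustration.

```python
def average_precision(ranked_docs, relevant_docs, k=10):
    """For each relevant document in the top k, take the precision at its rank;
    sum these and normalize by the number of known relevant documents."""
    if not relevant_docs:
        return 0.0
    hits, precision_sum = 0, 0.0
    for rank, doc in enumerate(ranked_docs[:k], start=1):
        if doc in relevant_docs:
            hits += 1
            precision_sum += hits / rank  # precision up to this rank
    return precision_sum / len(relevant_docs)

def mean_average_precision(rankings, judgments, k=10):
    """MAP: average the per-query Average Precision over all queries."""
    return sum(average_precision(rankings[q], judgments[q], k) for q in rankings) / len(rankings)
```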

Normalized Discounted Cumulative Gain

• Normalize the Discounted Cumulative Gain by the Ideal Discounted Cumulative Gain for a given query

• Average over all queries

Normalized Discounted Cumulative Gain

• Discounted Cumulative Gain
– Give more emphasis to relevant documents by using 2^relevance as the gain
– Give more emphasis to earlier ranks by using a logarithmic reduction factor
– Sums over the top 5

• Ideal Discounted Cumulative Gain
– Same as DCG, but sorts the documents by relevance (both sketched below)
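
A minimal sketch of NDCG at 5 following the bullets above: a gain of 2^relevance per document, a logarithmic rank discount, and normalization by the ideal ordering. The exact gain and discount constants vary between implementations (some use 2^relevance - 1), so treat this as illustrative rather than the authors' exact formula.

```python
import math

def dcg(relevances, k=5):
    """Discounted Cumulative Gain over the top k for one query.

    relevances: graded relevance labels of the results, in ranked order.
    Gain is 2**rel (per the slide); some definitions use 2**rel - 1.
    The discount is log2(rank + 1).
    """
    return sum((2 ** rel) / math.log2(rank + 1)
               for rank, rel in enumerate(relevances[:k], start=1))

def ndcg(relevances, k=5):
    """Normalize DCG by the Ideal DCG: the same judgments sorted by relevance."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0
```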

What’s the problem?

• Sensitivity – might reject small but significant improvements

• Bias – judges are removed from the search process

• Fidelity – evaluation should reflect user success!

Alternative Evaluation

• Use actual user searches

• Judges become actual users

• Evaluation becomes user success

Interleaving

System A Results + System B Results

Team-Draft Algorithm

[Diagram: two captains, Captain Ahab and Captain Barnacle, take turns picking results from their own team's ranking into a single Interleaved List]

Crediting

• Whichever team's results receive the most distinct clicks is considered “better” for that query

• In case of a tie, the query is ignored (see the sketch below)
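
The following Python sketch shows one way the Team-Draft interleaving and crediting described above could look. It is an illustration under my own assumptions (document ids as strings, a fixed interleaved-list length, one coin flip whenever the teams are tied), not the authors' implementation.

```python
import random

def team_draft_interleave(ranking_a, ranking_b, length=10):
    """Team-Draft interleaving (sketch): the team with fewer picks so far
    (coin flip on ties) adds its highest-ranked document not yet shown."""
    interleaved, team_a, team_b = [], set(), set()
    while len(interleaved) < length:
        pick_a = len(team_a) < len(team_b) or (
            len(team_a) == len(team_b) and random.random() < 0.5)
        ranking, picks = (ranking_a, team_a) if pick_a else (ranking_b, team_b)
        doc = next((d for d in ranking if d not in interleaved), None)
        if doc is None:
            # This team's ranking is exhausted; let the other team pick instead.
            ranking, picks = (ranking_b, team_b) if pick_a else (ranking_a, team_a)
            doc = next((d for d in ranking if d not in interleaved), None)
            if doc is None:
                break  # both rankings are exhausted
        interleaved.append(doc)
        picks.add(doc)
    return interleaved, team_a, team_b

def credit(clicked_docs, team_a, team_b):
    """Whichever team contributed more distinct clicked documents is preferred;
    a tie means the query is ignored (returns None)."""
    a, b = len(set(clicked_docs) & team_a), len(set(clicked_docs) & team_b)
    return "A" if a > b else "B" if b > a else None
```

For example, `team_draft_interleave(["d1", "d2", "d3"], ["d2", "d4", "d5"], length=4)` returns a merged list plus the two team assignments, which `credit` then compares against the user's clicks.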

Retrieval System Pairs

• Major improvements
– majorAB
– majorBC
– majorAC

• Minor improvements
– minorE
– minorD

Evaluation

• 12,000 queries
– Sampled n times with replacement

• Count the sampled queries where the rankers differ
– Ignores ties

• Report the percentage where the known-better ranker scores better (see the sketch below)
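
A sketch of how the sampling-with-replacement evaluation above could be coded. The per-query outcome list (one of "A", "B", or None per query), the choice of which ranker is "better", and the number of trials are assumptions for illustration.

```python
import random

def agreement_rate(per_query_winner, better="A", n=1000, trials=1000):
    """Sample n query outcomes with replacement, ignore ties, and report the
    fraction of trials in which the known-better ranker wins more queries."""
    agree = 0
    for _ in range(trials):
        sample = [random.choice(per_query_winner) for _ in range(n)]
        wins = sum(1 for w in sample if w == better)
        losses = sum(1 for w in sample if w is not None and w != better)
        if wins > losses:
            agree += 1
    return agree / trials
```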

Interleaving Evaluation

Credit Assignment Alternatives

• Shared top k– Ignore?– Lower clicks treated the same

• Not all clicks are created equal– log(rank)– 1/rank– Top– Bottom
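
One way to read the "not all clicks are created equal" idea is to weight each click by its rank before crediting a team. The weighting functions below are my interpretation of the log(rank) and 1/rank bullets, not the paper's exact definitions.

```python
import math

def weighted_credit(clicks_with_ranks, team_a, team_b, scheme="1/rank"):
    """Credit each clicked document to its team with a rank-dependent weight.

    clicks_with_ranks: list of (doc, rank) pairs, rank being the 1-based
    position of the clicked document in the interleaved list.
    """
    def weight(rank):
        if scheme == "1/rank":
            return 1.0 / rank
        if scheme == "log(rank)":
            return 1.0 / math.log2(rank + 1)
        return 1.0  # uniform weighting, as in the basic crediting scheme
    score_a = sum(weight(r) for doc, r in clicks_with_ranks if doc in team_a)
    score_b = sum(weight(r) for doc, r in clicks_with_ranks if doc in team_b)
    return "A" if score_a > score_b else "B" if score_b > score_a else None
```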

Conclusions

• Performance measured by:
– Judgment-based
– Usage-based

• Surprise surprise, small sample size is stupid
– (check out that alliteration)

• Interleaving is transitive