Making Watson Fast

Daniel Brown, HON111

• Need for Watson to be fast to play Jeopardy successfully
  – All computations have to be done in a few seconds
  – Initial application speed: 1-2 hours processing time per question

• Unstructured Information Management Architecture (UIMA): framework for NLP applications; facilitates parallel processing
  – UIMA-AS: Asynchronous Scaleout

• UIMA chosen at start for these reasons; other optimization work only began after 2 years (after QA accuracy/confidence improved)

UIMA implementation of DeepQA

• Type System
• Common Analysis Structure (CAS)
• Annotator
  – CAS multiplier (CM): creates new “children” CASes
• Flow Controller

• CASes can be spread across multiple systems (processed in parallel) for efficiency
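
The roles listed above can be illustrated with a small Python sketch. It is not UIMA's API (UIMA itself is a Java framework). The `Cas` class, the word-overlap scoring, and all function names are hypothetical stand-ins, chosen only to show a CAS multiplier creating child CASes that are then annotated in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class Cas:
    """Hypothetical stand-in for a CAS: some text plus its annotations."""
    text: str
    annotations: dict = field(default_factory=dict)


def cas_multiplier(question, candidates):
    """CM role: create one child CAS per candidate answer."""
    return [Cas(text=c, annotations={"question": question.text}) for c in candidates]


def score_annotator(cas):
    """Toy annotator: score a candidate by word overlap with the question."""
    q_words = set(cas.annotations["question"].lower().split())
    c_words = set(cas.text.lower().split())
    cas.annotations["score"] = len(q_words & c_words)
    return cas


def run_flow(question, candidates, workers=4):
    """Toy flow controller: multiply, annotate children in parallel, rank."""
    children = cas_multiplier(question, candidates)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scored = list(pool.map(score_annotator, children))
    return sorted(scored, key=lambda c: c.annotations["score"], reverse=True)
```

Because each child CAS is independent, the annotator calls need no coordination, which is the property that lets the work be spread across many machines.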

Scaling out

• Two systems:
  – Development (+question processing)
    • Meant to analyze many questions accurately
  – Production (+speed)
    • Meant to answer one question quickly

Scaling out: UIMA-AS

• (UIMA-AS: Asynchronous Scaleout)
  – Manages multithreading, communication between processes necessary for parallel processing
• Feasibility test: simulated production system with 110 processes, 110 8-core machines
  – Goal: less than 3 seconds; actual: more than 3 seconds
  – Two sources of latency: CAS serialization, network communication
  – Optimizing CAS serialization resulted in runtime of <1s

Scaling out: Deployment

• 400 processes, 72 machines

• How to find time bottlenecks in such a system?
  – Monitoring tool
  – Integrated timing measurements (in flow controller component)
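
As a rough illustration of timing measurements built into a flow controller, the sketch below wraps every stage call with a timer and accumulates the totals per stage. The `TimedFlow` class and its method names are hypothetical; the slides do not describe how Watson's measurements were actually implemented.

```python
import time
from collections import defaultdict


class TimedFlow:
    """Hypothetical flow controller that times every stage it invokes."""

    def __init__(self, stages):
        self.stages = stages                # list of (name, callable) pairs
        self.elapsed = defaultdict(float)   # stage name -> total seconds
        self.calls = defaultdict(int)       # stage name -> number of calls

    def process(self, item):
        """Run the item through each stage in order, recording elapsed time."""
        for name, stage in self.stages:
            start = time.perf_counter()
            item = stage(item)
            self.elapsed[name] += time.perf_counter() - start
            self.calls[name] += 1
        return item

    def slowest(self):
        """Name of the stage with the largest accumulated time."""
        return max(self.elapsed, key=self.elapsed.get)
```

Putting the timer in the controller rather than in each stage means no stage has to be modified to be measured.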

RAM Optimizations

• Wanted to avoid disk read/write time delays, so all (production system) data was put into RAM

• Some optimizations:
  – Reference size reduction
  – Java object size reduction
  – Java object overhead
  – String size
  – Special hash tables
  – Java garbage collection with large heap sizes

• *Full GC between games

Indri Search Optimizations

• Indri search: used to find most relevant 1-2 sentences from Watson database

• Using single processor, primary search takes too long (i.e. 100s)
  – Supporting evidence search even longer
• Solution?
  – Divide corpus (body of information to search) into chunks, then assign each search daemon a chunk
  – (specifically, 50GB corpus of 6.8 million documents, 79 chunks of 100000 documents each, 79 Indri search daemons with 8 CPU cores each; end result, 32 passage queries could be run at once)
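
The divide-and-search idea can be sketched in a few lines of Python. This does not use Indri. The term-matching score, the thread pool standing in for search daemons, and all function names are assumptions made for illustration only.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(docs, num_chunks):
    """Split docs into at most num_chunks slices of nearly equal size."""
    size = max(1, -(-len(docs) // num_chunks))  # ceiling division
    return [docs[i:i + size] for i in range(0, len(docs), size)]


def search_chunk(chunk, query_terms):
    """Toy 'daemon': score each document by how many query terms it contains."""
    results = []
    for doc in chunk:
        words = set(doc.lower().split())
        score = sum(1 for term in query_terms if term in words)
        if score:
            results.append((score, doc))
    return results


def parallel_search(docs, query, num_chunks=4, top_k=2):
    """Search all chunks concurrently and merge the best top_k hits."""
    if not docs:
        return []
    terms = query.lower().split()
    chunks = split_into_chunks(docs, num_chunks)
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        partials = list(pool.map(lambda c: search_chunk(c, terms), chunks))
    merged = [hit for part in partials for hit in part]
    return heapq.nlargest(top_k, merged)
```

Each worker only ever touches its own chunk, so the per-query time is bounded by the slowest chunk rather than by the size of the whole corpus.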

Preprocessing and Custom Content Services

• Watson must first analyze the passage texts before being able to use them
  – Deep NLP analysis: semantic/structural parsing, etc.
• Since Watson had to be self-contained, this analysis could be done before run time (preprocessed)
  – Used Hadoop (distributed data-processing framework with its own distributed file system)
  – 50 machines, 16GB/8 cores each

• Retrieving the preprocessed data?
  – Preprocessed data much larger than unprocessed corpus (~300GB total)
  – Built custom content server: allocated data to 14 machines, ~20GB each
  – Documents then were accessed from these servers
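
One simple way to spread preprocessed documents over a fixed set of servers is to hash each document id to a server index. The slides do not say how Watson's content server allocated its data, so the hash partitioning below, along with the `ContentCluster` class and `server_for` function, is an assumption used purely to illustrate the idea. Only the server count of 14 comes from the slide.

```python
import zlib

NUM_SERVERS = 14  # from the slide; everything else here is illustrative


def server_for(doc_id, num_servers=NUM_SERVERS):
    """Map a document id to a server index using a stable hash (CRC32)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_servers


class ContentCluster:
    """Hypothetical in-memory stand-in for the set of content servers."""

    def __init__(self, num_servers=NUM_SERVERS):
        self.shards = [dict() for _ in range(num_servers)]

    def put(self, doc_id, analysis):
        """Store a document's preprocessed analysis on its assigned shard."""
        self.shards[server_for(doc_id, len(self.shards))][doc_id] = analysis

    def get(self, doc_id):
        """Fetch a document's analysis from the shard that owns it."""
        return self.shards[server_for(doc_id, len(self.shards))][doc_id]
```

A stable hash is used instead of Python's built-in `hash`, which is randomized per process for strings and would send the same id to different servers on different runs.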

End result

• Parallel processing combined with a number of other performance optimizations resulted in a final average latency of less than 3 seconds.
  – No one “silver bullet” solution
