R*: An Overview of the Architecture
R. Williams, et al.
IBM Almaden Research Center
Outline
Environment and Data Definitions
Object Naming
Distributed Catalogs
Transaction Management and Commit Protocols
Query Preparation
Query Execution
SQL Additions and Changes
Environment and Data Definitions
CICS as the underlying communication model
Data distribution:
Dispersed
Replicated
Partitioned (horizontally or vertically)
Snapshot
Figure 1 from paper
Figure 21.4 from CS 432 text
Object Naming
System-Wide Names (SWN): USER@USER_SITE.OBJECT_NAME@BIRTH_SITE
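As a small illustration (not R* code; the user, sites, and table name are made up), a name in this format can be built and split apart like this:

```python
# Minimal sketch (not R* code): composing and splitting a system-wide name
# of the form USER@USER_SITE.OBJECT_NAME@BIRTH_SITE. Names are illustrative.
from typing import NamedTuple

class SystemWideName(NamedTuple):
    user: str
    user_site: str
    object_name: str
    birth_site: str

    def __str__(self) -> str:
        return f"{self.user}@{self.user_site}.{self.object_name}@{self.birth_site}"

    @classmethod
    def parse(cls, swn: str) -> "SystemWideName":
        creator, obj = swn.split(".", 1)            # USER@USER_SITE . OBJECT_NAME@BIRTH_SITE
        user, user_site = creator.split("@", 1)
        object_name, birth_site = obj.split("@", 1)
        return cls(user, user_site, object_name, birth_site)

print(SystemWideName.parse("smith@sanjose.EMPLOYEES@almaden"))
```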
Distributed Catalogs
Local site maintains catalog entries for the objects in its database
Catalog entries may be cached at other sites
Entries are versioned
Each catalog entry holds: SWN, object type, format, access paths, object references (for views), statistics
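A rough sketch of what a versioned catalog entry might hold and how a remote site could validate a cached copy; the field names and the version check are assumptions, not the actual R* layout.

```python
# Sketch only: a versioned catalog entry and a naive cache-validity check.
# Field names and the version-comparison step are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class CatalogEntry:
    swn: str              # system-wide name of the object
    obj_type: str         # e.g. "table" or "view"
    column_format: list   # column names and types
    access_paths: list    # e.g. index names
    statistics: dict      # cardinalities, etc.
    version: int = 1      # bumped whenever the owning site changes the entry

def cached_copy_is_current(cached: CatalogEntry, owner_entry: CatalogEntry) -> bool:
    """A site may use its cached entry only while the version still matches the owner's."""
    return cached.swn == owner_entry.swn and cached.version == owner_entry.version
```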
Transaction Management and Commit Protocol
Transaction number: SITE.SEQ_NUM (or SITE.TIME)
Two-phase commit (2PC)
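As a rough sketch (not the R* implementation; the Participant interface and site names are invented), the code below generates SITE.SEQ_NUM-style transaction identifiers and runs a coordinator through the two phases of 2PC: collect prepare votes, then commit only on a unanimous yes.

```python
# Sketch only: SITE.SEQ_NUM transaction identifiers and a minimal 2PC
# coordinator loop. The Participant abstraction is made up for illustration.
import itertools

class TransactionIdGenerator:
    def __init__(self, site: str):
        self.site = site
        self._seq = itertools.count(1)

    def next_id(self) -> str:
        return f"{self.site}.{next(self._seq)}"   # e.g. "SANJOSE.42"

class Participant:
    """Stand-in for a cohort site; a real site would log votes and use communication sessions."""
    def __init__(self, name: str, will_vote_yes: bool = True):
        self.name = name
        self.will_vote_yes = will_vote_yes

    def prepare(self, txn_id: str) -> bool:       # phase 1: vote yes/no
        return self.will_vote_yes

    def commit(self, txn_id: str) -> None:        # phase 2: commit
        print(f"{self.name}: commit {txn_id}")

    def abort(self, txn_id: str) -> None:         # phase 2: abort
        print(f"{self.name}: abort {txn_id}")

def two_phase_commit(txn_id: str, participants: list) -> bool:
    # Phase 1: ask every participant to prepare and collect the votes.
    if all(p.prepare(txn_id) for p in participants):
        # Phase 2: unanimous yes -> tell everyone to commit.
        for p in participants:
            p.commit(txn_id)
        return True
    # Any "no" vote (or failure) -> abort everywhere.
    for p in participants:
        p.abort(txn_id)
    return False

ids = TransactionIdGenerator("SANJOSE")
two_phase_commit(ids.next_id(), [Participant("SITE_A"), Participant("SITE_B")])
```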
Query Preparation
Name resolution
Authorization check
Distributed compilation
Global plan generation/optimization
Local access path selection
Local optimization
Local view materialization
Figure 2 from paper
Cost Model
Three weighted components: I/O, CPU, and messages
Message cost depends on the # of messages sent and the # of bytes sent
Query Execution
Synchronous vs. asynchronous execution
Distributed concurrency control
Deadlock detection and resolution (see the sketch below)
Crash recovery
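As a generic illustration of the detection step only (a deadlock is a cycle in the transaction wait-for graph), here is a single-site cycle check; R*'s actual algorithm is distributed and exchanges wait-for information between sites, which this sketch does not model.

```python
# Sketch only: detect a deadlock as a cycle in a transaction wait-for graph.
# This is the single-site core of the idea, not R*'s distributed algorithm.
def find_deadlock_cycle(waits_for: dict):
    """waits_for maps a transaction id to the set of transactions it waits on."""
    def dfs(txn, path, on_path):
        if txn in on_path:                        # back edge -> cycle found
            return path[path.index(txn):]
        for blocker in waits_for.get(txn, ()):
            cycle = dfs(blocker, path + [blocker], on_path | {txn})
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = dfs(start, [start], set())
        if cycle:
            return cycle                          # pick a victim in this cycle to abort
    return None

# T1 waits for T2, T2 waits for T3, T3 waits for T1 -> deadlock
print(find_deadlock_cycle({"T1": {"T2"}, "T2": {"T3"}, "T3": {"T1"}}))
```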
Figure 3 from paper
SQL Additions and Changes
DEFINE SYNONYM
DISTRIBUTE TABLE (HORIZONTALLY, VERTICALLY, or REPLICATED)
DEFINE SNAPSHOT
REFRESH SNAPSHOT
MIGRATE TABLE
R* Optimizer Validation and Performance Evaluation for Distributed Queries
Lothar F. Mackert, Guy M. Lohman
IBM Almaden Research Center
Outline
Distributed Compilation/Optimization
Instrumentation
Experiments and Results
Distributed Compilation/Optimization
Issues:
Join site
Transfer methods: ship whole vs. fetch matches (sketched below)
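A minimal sketch of the two transfer methods, with in-memory stand-ins for the two sites (the table contents and plumbing are invented, not R* internals): either ship the whole inner table to the join site in one transfer, or fetch only the inner tuples matching each outer value, one request at a time.

```python
# Sketch only: the two R* transfer methods for a distributed join,
# modeled with in-memory "sites". Table contents are illustrative.
outer_site = [("A", 1), ("B", 2), ("C", 3)]          # (payload key, join value) at the join site
inner_site = {1: "x", 2: "y", 4: "z"}                # join value -> payload at the remote site

def ship_whole(outer, remote_inner):
    """Ship the entire inner table to the join site in one transfer, then join locally."""
    shipped = dict(remote_inner)                     # one big transfer: all inner tuples
    return [(k, j, shipped[j]) for (k, j) in outer if j in shipped]

def fetch_matches(outer, remote_inner):
    """For each outer tuple, send its join value and fetch only the matching inner tuples."""
    result = []
    for (k, j) in outer:
        match = remote_inner.get(j)                  # one small request/response per outer tuple
        if match is not None:
            result.append((k, j, match))
    return result

print(ship_whole(outer_site, inner_site))
print(fetch_matches(outer_site, inner_site))
```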
Cost model:
$$\text{total cost} \;=\; W_{CPU}\cdot(\#\text{ RSS calls}) \;+\; W_{I/O}\cdot(\#\text{ page accesses}) \;+\; W_{MSG}\cdot(\#\text{ msgs sent}) \;+\; W_{BYTE}\cdot(\#\text{ bytes sent})$$
Weights Estimation
CPU weight: inverse of MIPS
I/O weight: average seek, latency, and transfer time
MSG weight: # of instructions per message
BYTE weight: effective transmission speed of the network
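A toy illustration of how these weights combine with a plan's estimated counts into the total cost above; the weight values and counts are invented for illustration, not measured R* numbers.

```python
# Sketch only: combine the four cost weights with a candidate plan's counts.
# The weight values and plan counts below are invented assumptions.
weights = {
    "cpu":  1.0 / 2.0e6,     # W_CPU ~ inverse of instruction rate (here: 2 MIPS)
    "io":   0.025,           # W_I/O ~ avg seek + latency + transfer time per page (s)
    "msg":  0.005,           # W_MSG ~ per-message overhead expressed in seconds
    "byte": 1.0 / 500_000,   # W_BYTE ~ inverse of effective network bandwidth (bytes/s)
}

def total_cost(rss_calls, page_accesses, msgs_sent, bytes_sent, w=weights):
    return (w["cpu"] * rss_calls + w["io"] * page_accesses +
            w["msg"] * msgs_sent + w["byte"] * bytes_sent)

# Compare a "ship whole"-style plan (few messages, many bytes) with a
# "fetch matches"-style plan (many messages, few bytes).
print(total_cost(rss_calls=5000, page_accesses=120, msgs_sent=4,   bytes_sent=400_000))
print(total_cost(rss_calls=5000, page_accesses=120, msgs_sent=500, bytes_sent=40_000))
```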
Figure 2 from paper
Instrumentation
Distributed EXPLAIN
Distributed COLLECT COUNTERS
Force optimizer
Experiment I
Transfer method
Merge-scan join of 2 tables: 500 tuples in each table
50% projection on both tables
100 different values for the join attribute
Join result: 2477 tuples
Figure 4 from paper
Figure 3 from paper
Experiment II
Distributed vs. local join
Join of 2 tables: 1000 tuples in each table
50% projection on both tables
3000 different values for the join attribute
Figure 5 from paper
Figure 6 from paper
Experiment III
Relative importance of cost components
Figure 7, 8, 9, 10 from paper
Experiment IV
Optimizer evaluation
Accurate estimates of # of msgs and bytes sent (<2% difference)
Better estimates when tables are more distributed
Experiment V
Alternative distributed join methods: dynamically created indexes, semijoins, Bloomjoins (semijoin and Bloomjoin sketched below)
2 tables: 1000 tuples in the outer; inner varies from 100 to 6000 tuples
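A rough sketch of the semijoin and Bloomjoin ideas on toy data (the single-hash bit vector is a stand-in for a real Bloom filter): the semijoin ships the outer table's exact join values to the inner site, while the Bloomjoin ships only a compact bit vector that admits some false positives, which the final join removes.

```python
# Sketch only: semijoin vs. Bloomjoin for a distributed join, with toy data.
# The "Bloom filter" here uses one hash function and a small bit array just
# to show the idea; a real Bloom filter would use several hash functions.
outer = [(k, f"o{k}") for k in range(0, 1000, 7)]      # outer table at the join site
inner = [(k, f"i{k}") for k in range(0, 6000, 11)]     # inner table at a remote site

def semijoin(outer, inner):
    """Ship the outer join values; the inner site returns only matching tuples."""
    join_values = {k for (k, _) in outer}              # shipped: the exact value list
    reduced_inner = [(k, v) for (k, v) in inner if k in join_values]
    return [(k, ov, iv) for (k, ov) in outer for (k2, iv) in reduced_inner if k == k2]

def bloomjoin(outer, inner, bits=1024):
    """Ship a small bit-vector filter instead of the exact value list."""
    bloom = [False] * bits
    for (k, _) in outer:
        bloom[hash(k) % bits] = True                   # shipped: 'bits' bits, not full values
    candidate_inner = [(k, v) for (k, v) in inner if bloom[hash(k) % bits]]
    # False positives are removed by the final join at the join site.
    outer_map = {k: ov for (k, ov) in outer}
    return [(k, outer_map[k], iv) for (k, iv) in candidate_inner if k in outer_map]

print(len(semijoin(outer, inner)), len(bloomjoin(outer, inner)))
```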
Figure 11, 12 from paper
Other Experiments
Clustered index: Bloomjoins < Semijoins < R*
50% projection:
Site 1: Bloomjoins < Semijoins < R*
Site 2: Bloomjoins < R* << Semijoins
Wider join column: Bloomjoins < R* << Semijoins