Tolerating Byzantine Faults in Database Systems using Commit Barrier Scheduling

Ben Vandiver, Hari Balakrishnan, Barbara Liskov, and Sam Madden. CSAIL, MIT. Sponsors: Quanta Computer Inc, NSF.

Tolerating Byzantine Faults in Transaction Processing Systems using Commit Barrier Scheduling

Slide 1: Tolerating Byzantine Faults in Database Systems using Commit Barrier Scheduling
Ben Vandiver, Hari Balakrishnan, Barbara Liskov, and Sam Madden
CSAIL, MIT
Sponsors: Quanta Computer Inc, NSF

Slide 2: Non-crash faults in Databases
- Over 50% of reported bugs were non-crash faults
  - Incorrect answers, data or index corruption, etc.
- Previous focus on fail-stop faults
- Better model: Byzantine faults

| Bug Category     | DB2 (2/03-8/06) | Oracle (7/06-11/06) | MySQL (8/06-11/06) |
|------------------|-----------------|---------------------|--------------------|
| DBMS Crash       | 120             | 21                  | 60                 |
| Non-Crash Faults | 131             | 28                  | 64                 |

Only verifiable bugs

Slide 3: Failure Independence
- Heterogeneous replicas
  - Different implementations / versions
- Easiest with non-invasive solution
- Requires standard interface
  - SQL is moderately standard

Slide 4: Client Interaction
- Organized into Transactions
  - Query, Query, ..., Commit / Rollback
- Interactive
- Strong consistency
  - Single-copy serializable

Slide 5: Database Functionality
- Each Database provides
  - Serializable isolation
  - Strict (rigorous) 2-phase locking
- Databases don't execute in issue-order
- Limited control over execution order

[Diagram: statements issued as S1, S2. Replica 1 executes S1, S2; Replica 2 executes S2, S1.]

Slide 6: Replica Coordination
- BFT: well-known solution
  - 3f+1 replicas
  - Globally order client requests
  - Replicas execute in order
  - Exhibits no concurrency
- Goal: mechanism to extract concurrency in database context

Slide 7: Architecture

[Diagram: three Clients connect to the Shepherd, which sends SQL to DB1, DB2, and DB3.]

- Shepherd middleware acts as replication agent
- Single Shepherd
  - Shepherd is very simple -> reasonable to get it right (vs. database)
  - Database world is about pragmatism -> single Shepherd is a practical solution
  - Future work: accomplish Byzantine replication with good performance
- 2f+1 databases
- Shepherd performs agreement

Slide 8: Architecture

[Diagram: three Clients, the Shepherd, and DB1, DB2, DB3, as on the previous slide. Labels: SQL, Result, Result, ?, Vote, Result.]

- Need f+1 matching votes
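The voting step on the Architecture slide (the Shepherd needs f+1 matching votes from 2f+1 databases) can be illustrated with a minimal Python sketch. The function name `tally_votes` and the choice to represent each replica's answer as a hashable value are assumptions made for illustration; they are not taken from the slides or from the prototype.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def tally_votes(answers: Sequence[Hashable], f: int) -> Optional[Hashable]:
    """Return the answer backed by at least f+1 replicas, or None if
    no answer has that many votes among the replies received so far."""
    if not answers:
        return None
    # most_common(1) yields the single (answer, count) pair with the top count.
    answer, votes = Counter(answers).most_common(1)[0]
    return answer if votes >= f + 1 else None


# Hypothetical replies from 2f+1 = 3 replicas, one of which is wrong.
replies = [("row1", "row2"), ("row1", "row2"), ("row1", "garbage")]
accepted = tally_votes(replies, f=1)
```

With 2f+1 replies, two different answers cannot both reach f+1 votes, so checking only the most common answer is sufficient in this setting.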

Slide 9: How to extract concurrency?
- Just issue statements to replicas
  - Likely to get stuck
- Solution: pre-determine which statements conflict
  - Inspecting SQL is very hard
- Free-for-all unlikely to make forward progress
- Goal is: constrain concurrency just enough

Slide 10: Commit Barrier Scheduling
- Primary / Secondary Scheme
  - Run transactions first on the primary
  - Duplicate primary's ordering on the secondaries
- Works best when primary is Sufficiently Blocking
  - Required for performance, not correctness

Slide 11: Commit Barrier Scheduling

[Diagram: three Clients, the Shepherd, and three DBs, one marked Primary. Labels: SQL, Result, ?. Caption: pool of statements executed by the primary to issue to the secondaries.]

Slide 12: Correct Execution
- Statement Ordering Rule
  - Execute statements of transaction in order
- Commit Ordering Rule
  - All replicas commit transactions in the same order
  - Order determined by Shepherd

Slide 13: Execution Trace on Primary

[Diagram: timeline of transactions T1 and T2. Labels, in order: SX, C, SY, SZ, C.]

- Non-faulty primary
- Use 2PL

Slide 14: Extracting Conflict Info

[Diagram: same trace of T1 and T2 (SX, C, SY, SZ, C), annotated "Don't Conflict!"]

Slide 15: Avoiding Conflicts

[Diagram: same trace of T1 and T2 (SX, C, SY, SZ, C), annotated "Might Conflict!"]

Transaction-Ordering Rule: A query from transaction T2 that was executed by the primary after the COMMIT of transaction T1 can be sent to a secondary only after it has processed all queries of T1.

Slide 16: Commit Barrier Scheduling
- Maintain barrier for each replica
- Mark statements and transactions with barriers
- Issue statements and commits when replica's barrier reaches appropriate value
- Simple to implement

Slide 17: Analysis of CBS: Non-faulty primary
- Full concurrency on the Primary
  - Deadlocks detected and resolved locally
- Ample concurrency on Secondaries
  - Allows many statements to run in parallel
  - Secondaries hardly ever block
- Latency increase
- Non-faulty primary is the common case
- Latency issue

Slide 18: Early Return

[Diagram: three Clients, the Shepherd, and three DBs, one marked Primary. Labels: Result, Next SQL Stmt, SQL, SQL, "Pipelined Execution!"]

Slide 19: Early Return Analysis
- Cut latency in half
- Must vote at Commit
  - Sent wrong answer, abort the transaction
- Correctness Condition
  - Clients receive correct answers for all transactions that commit
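The barrier bookkeeping from the Commit Barrier Scheduling slides above can be sketched in Python. This is a simplified illustration, not the HRDB implementation: the class names `Op`, `PrimaryLog`, and `Secondary` are hypothetical, and the rule that a commit is issued on a secondary only when the replica barrier equals the commit's own annotation is an assumption added here to enforce the Commit Ordering Rule. The statement rules follow the "Implement with Barriers" backup slide: annotate a statement with the current barrier when it completes on the primary, increment the barrier before a commit is issued, issue a statement on a secondary once the replica barrier is at least the annotation, and increment the replica barrier after a commit completes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Op:
    txn: str      # transaction name, e.g. "T1"
    kind: str     # "stmt" or "commit"
    name: str     # label, e.g. "SX" or "C"
    barrier: int  # annotation assigned while running on the primary


class PrimaryLog:
    """Shepherd-side record of what the primary ran, annotated with barriers."""

    def __init__(self) -> None:
        self.barrier = 0
        self.ops: List[Op] = []

    def statement_completed(self, txn: str, name: str) -> None:
        # S: annotate with the current barrier upon completion.
        self.ops.append(Op(txn, "stmt", name, self.barrier))

    def commit(self, txn: str) -> None:
        # C: increment the barrier before the commit is issued.
        # Assumption: the commit keeps the pre-increment value as its
        # position in the commit order.
        self.ops.append(Op(txn, "commit", "C", self.barrier))
        self.barrier += 1


class Secondary:
    """Replays the primary's pool of operations under the barrier rules."""

    def __init__(self, ops: List[Op]) -> None:
        self.barrier = 0
        self.pending = list(ops)

    def runnable(self) -> List[Op]:
        ready, seen = [], set()
        for op in self.pending:
            first_of_txn = op.txn not in seen  # Statement Ordering Rule
            seen.add(op.txn)
            if not first_of_txn:
                continue
            if op.kind == "stmt" and self.barrier >= op.barrier:
                ready.append(op)
            elif op.kind == "commit" and self.barrier == op.barrier:
                ready.append(op)
        return ready

    def complete(self, op: Op) -> None:
        self.pending.remove(op)
        if op.kind == "commit":
            self.barrier += 1  # C: increment replica barrier after completion


def run_in_waves(secondary: Secondary) -> List[List[str]]:
    """Repeatedly issue everything that is runnable; each round is one wave."""
    waves = []
    while secondary.pending:
        ready = secondary.runnable()
        if not ready:
            raise RuntimeError("secondary cannot make progress")
        waves.append([f"{op.txn}:{op.name}" for op in ready])
        for op in ready:
            secondary.complete(op)
    return waves


# Trace from the slides: SX and SY run before T1 commits, SZ runs after.
primary = PrimaryLog()
primary.statement_completed("T1", "SX")
primary.statement_completed("T2", "SY")
primary.commit("T1")
primary.statement_completed("T2", "SZ")
primary.commit("T2")
waves = run_in_waves(Secondary(primary.ops))
```

For this trace the secondary runs SX and SY together, then T1's commit, then SZ, then T2's commit: SY is free to overlap with T1, while SZ is held back until T1 has committed on the secondary, as the Transaction-Ordering Rule requires.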

Slide 20: Masking Faults
- Faulty Secondary not a problem
  - Voting resolves wrong answers
- Faulty Primary is a problem
  - Generates invalid schedule
- Goal: correct execution

Slide 21: Faulty Primary Scenario
- T1, T2: Increment A by 1, return A
- A initially 0, should end up 2

| Faulty Primary | Replica R1  | Replica R2  |
|----------------|-------------|-------------|
| T1: A = 1      | T1: A = 1   | T1: waiting |
| T2: A = 1      | T2: waiting | T2: A = 1   |

- f+1 matching votes for both answers!

Slide 22: Other Issues
- Mechanics
- Replica Repair
- Shepherd crashes
- Heterogeneity & SQL

Slide 23: Implementation
- Prototype called HRDB
- Implemented in Java
  - About 3500 semicolon-lines of code
- JDBC interface to clients and databases
- Works with MySQL, DB2, Derby, and SQLServer

Slide 24: Performance

[Chart not recoverable. Surviving labels: "17%", "CBS 3 replica case".]

Slide 25: Heterogeneous Replication
- Ran 2f+1=3 replica system, heterogeneous vendors
  - MySQL, DB2, Commercial DB X
- Sufficiently Blocking holds in practice
- System runs at slowest of f+1 fastest replicas, or primary
- 1 query in one config not sufficiently blocking

Slide 26: Fail-Stop Faults
- Byzantine fault results in repair process, then it's merely slow and looks like this case

Slide 27: Bugs and HRDB
- Successfully masked bugs
  - Heterogeneous vendors & heterogeneous versions
- Found a new bug in MySQL
  - While running TPC-C
  - Present since October 2001
  - Patched in recent release
- Starting to look for bugs actively with HRDB
- Mask bugs
  - Some with different vendor, some with different version
- TPC-C well-trodden territory
- Due to it being a concurrency fault

Slide 28: Conclusion
- First practical Byzantine Fault Tolerant Database
- Failure independence by supporting heterogeneous replicas
- Novel concurrency extraction scheme
- Tool for finding new bugs in databases

Slide 29: Backup Slides

Slide 30: Snapshot Isolation
- Allows read-after-write hazards
- Converts fail-stop to Byzantine faults
- Need write-sets to implement
- Scheme called Snapshot Barrier Scheduling

Slide 31: Implement with Barriers

[Diagram: trace of transactions T1, T2, T3. Labels: SW, C, SJ, SK, C, C, SX, SZ, SY, with barrier values B=0, B=1, B=2, B=3.]

Primary
- S: Annotate with current barrier upon completion
- C: Increment barrier before issue

Secondary
- S: Issue when replica barrier is at least the value of the annotation
- C: Increment replica barrier after completion

Slide 32: Heterogeneity Issues
- Non-determinism in answers
  - Result set ordering
  - Non-deterministic functions in queries
  - Database-assigned row IDs
  - Query Rewriting
- SQL incompatibility
  - Translation Engine
- SQL hiding
  - Views and Stored Procedures
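The "Result set ordering" item under Heterogeneity Issues can be made concrete with a small Python sketch. Two correct replicas may return the same rows in different orders when a query has no ORDER BY, so answers need to be put into a canonical form before they are compared. The function `canonical_digest` and its `ordered` flag are hypothetical names; the slides do not say how HRDB compares answers, so this is one possible approach under that assumption, not a description of the prototype.

```python
import hashlib
from typing import Iterable, Sequence


def canonical_digest(rows: Iterable[Sequence[object]], ordered: bool = False) -> str:
    """Hash a result set so that equal answers from different replicas match.

    ordered=False treats the result as a multiset of rows (no ORDER BY);
    ordered=True keeps row order significant (query had an ORDER BY).
    """
    # Assumption: rendering every value with str() is a good enough
    # normalisation for this illustration; real drivers may need
    # type-specific handling (floats, decimals, timestamps).
    normalised = [tuple(str(value) for value in row) for row in rows]
    if not ordered:
        normalised.sort()
    digest = hashlib.sha256()
    for row in normalised:
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()


# Hypothetical answers from two replicas: same rows, different order.
replica_a = [(1, "alice"), (2, "bob")]
replica_b = [(2, "bob"), (1, "alice")]
same_unordered = canonical_digest(replica_a) == canonical_digest(replica_b)
same_ordered = (canonical_digest(replica_a, ordered=True)
                == canonical_digest(replica_b, ordered=True))
```

Digests like these are hashable, so they could serve directly as the votes compared by the Shepherd. The other sources of non-determinism listed on the slide (non-deterministic functions, database-assigned row IDs) are not addressed by this sketch.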

Slide 33: Future Work
- Replicating the Shepherd
- Efficient Replica Repair
- Finding Bugs

Slide 34: Replica Recovery
- Replicas
  - Fail-stop crashes: Shepherd replays missing transactions
    - Uses transaction log table in database to discover which transactions to replay
  - Byzantine faults: Shepherd repairs faulty state, then replays
    - Efficient repair mechanism under development
- Shepherd
  - Fail-stop crashes: maintains a write-ahead log

Slide 35: Faulty Primary
- Wrong answers result in transaction abort
- Concurrency Faults
  - Can result in secondaries being unable to make progress
  - System is back to Correct but Slow solution
  - Same case as when primary is not sufficiently blocking
- Can be hard to tell if primary is faulty
- Replace primary by doing a view change
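The fail-stop case on the Replica Recovery slide (the Shepherd replays missing transactions, using a transaction log table in the database to discover which ones) can be sketched as follows. The sketch uses Python's built-in `sqlite3` module as a stand-in replica. The table name `txn_log`, the function `replay_missing`, and the shape of the Shepherd's log (a list of transaction IDs with their SQL statements, in commit order) are all assumptions made for illustration and are not taken from HRDB.

```python
import sqlite3
from typing import List, Sequence, Tuple


def replay_missing(replica: sqlite3.Connection,
                   shepherd_log: Sequence[Tuple[int, List[str]]]) -> List[int]:
    """Replay, in commit order, every logged transaction the replica lacks."""
    replica.execute(
        "CREATE TABLE IF NOT EXISTS txn_log (txn_id INTEGER PRIMARY KEY)")
    # The log table tells us which transactions already committed here.
    done = {row[0] for row in replica.execute("SELECT txn_id FROM txn_log")}
    replayed = []
    for txn_id, statements in shepherd_log:
        if txn_id in done:
            continue
        # "with replica" commits on success and rolls back on an exception,
        # so the data changes and the log entry become visible together.
        with replica:
            for sql in statements:
                replica.execute(sql)
            replica.execute("INSERT INTO txn_log (txn_id) VALUES (?)", (txn_id,))
        replayed.append(txn_id)
    return replayed


# Hypothetical replica that crashed after committing transaction 1 only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE acct (id INTEGER PRIMARY KEY, bal INTEGER)")
conn.execute("CREATE TABLE txn_log (txn_id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO acct VALUES (1, 0)")
conn.execute("UPDATE acct SET bal = bal + 1 WHERE id = 1")
conn.execute("INSERT INTO txn_log VALUES (1)")
conn.commit()

increment = "UPDATE acct SET bal = bal + 1 WHERE id = 1"
log = [(1, [increment]), (2, [increment]), (3, [increment])]
replayed = replay_missing(conn, log)
```

Recording the transaction ID in the same database transaction as the data changes keeps the log table consistent with the replica's state, which is what makes a second replay after another crash safe in this sketch.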