Database Architectures
  Centralized
  Client-Server
  Parallel - single site
  Distributed - multiple sites
Database Architectures
  [Diagram: Centralized, (Parallel), Distributed, and Client-Server architectures positioned by where Function and Data reside]
Centralized
  PC, Mini, or Mainframe
  Single Database
  Single Database Manager
  One or More Users
  Data and Function in One Place
Client-Server
  PCs to Mainframes to Minis
  PC to PC
  Mainframe to Mainframe
  Use Desktop Processing Power
  Better User Interface
  Greater Functionality
  Retain Centralized Control of Data
Client-Server: Basic Model
  [Diagram: several Clients connected to one Server; each Client sends a Request and the Server returns a Result]
Servers
  Supercomputer
  Mainframe
  Mini
  PC Server
  All retain all data
Client-Server Architecture
  [Diagram: Data and Function divided between the Server (Back-End) and the Client (Front-End), on a range from Thin Client to Fat Client]
Functionality
  Presentation
  I/O Processing
  Validation
  Business Rules
  Application Logic
  Data Management
  Validation
  Error Handling
“Thin” Client
  Presentation Services Only
    Accept Input
    Format Output
    Display
  Server does all processing
“Fat” Client
  Presentation
  Validation
  Application Logic - Programs
  Data Management
  Send SQL to Server
  Server is just DBMS
“In Between” Client
  Client
    Presentation
    Some Application Logic
  Server
    Some Application Logic
    Data Management and Services
Benefits of Client-Server
  Use Local Processing Power
  Better User Interface
  Some Functionality if System Down
  Use Sunk Costs of PCs
  Support Reengineering
  Support Intranets
  Flexibility, Scalability, Customizability
Challenges of Client-Server
  Cost of (Upgraded) PCs
  Network Reliance
  Distributing Application Updates
  Management of Complex System
  Problem Identification & Resolution
  Application Partitioning
Other Client-Server Architectures
  Traditional is Two-Tiered (client-server)
  Three-Tiered
    Client - Application Server - DB Server
    (PC - Mini - Mainframe)
    (PC - PC Server - Mainframe)
  Beyond Three
    PC - PC Server - Web Server - Mini - Mainframe
Client-Server vs. Distributed
  Client-Server: Application Distribution
  Distributed: Data Distribution
  Often, “client-server” is used to refer to either application distribution or data distribution or both.
Middleware
  What if
    Multiple databases (sources) need to be accessed from a single client?
    Different kinds of clients?
    Mix of clients and servers?
    Want to take advantage of existing base of applications (legacy systems)?
Middleware
  Fat Clients just send SQL transactions
  Other types of transactions may be needed based on the server (system)
Middleware
  Software that shields applications from the complexity of the operating environment.
  [Diagram: three Clients connect through Middleware to two (Legacy) Systems]
Types of Middleware
  Transaction Processing (TP) Monitor
  Database Middleware
  Remote Procedure Call (RPC)
  Message-Oriented Middleware (MOM)
  Object-Request Brokers
    (CORBA - ORB)
TP Monitor
  Synchronous - sender must wait
  Queuing
  Message Delivery
  Insured Delivery
  Either Direction
Database Middleware
  Variety of Clients/Platforms
  Variety of Servers/DBMSs/Platforms
  Specific to DB transactions (SQL)
Message-Oriented Middleware (MOM)
  Asynchronous - clients do not wait
  Queues & Queue Management/Recovery
  Message Delivery
  Insured Delivery
  Either Direction
  (like email or EDI only transactions)
Advantages of Middleware
  Leverage sunk costs (legacy systems)
  Reduce development cost
  Reduce development time
  Increase responsiveness
  Improve overall systems management
  Consolidate diffuse information
Challenges of Middleware
  Cost
  Session management - Transaction state
  Security
  Network reliance
  Diversity of systems - lack of standards
  Constant technology change
  Availability of talent
  Middleware Management
Parallel and Distributed
  Client-Server is an attempt to improve performance
  Reduce time to execute a transaction
    Parallel
  Reduce time to get the data
    Distributed
Parallel Systems
  Single site for data
  Very Large databases
  Operations performed simultaneously
Parallel Database Architectures
  Shared Memory
  Shared Disk
  Shared Nothing
  Hierarchical
Shared Memory
  Advantages
    Extremely efficient communications
  Disadvantages
    Max of 32/64 processors
    Bus becomes bottleneck
Shared Disk
  Advantages
    No bus bottleneck
    Fault tolerance provided
  Disadvantages
    Disk access becomes bottleneck
Shared Nothing
  Advantages
    No disk bottleneck
    Highly scalable
  Disadvantages
    High communication overhead/cost
      Between processors
      To another processor’s data
Hierarchical
  Advantages
    Best of all worlds
  Disadvantages
    Worst of all worlds
    Some high communication overhead/cost
      Between subsystems
    Complexity
Distributed Databases
  Client-Server - distribute functionality
  What about distributing data?
Distributed Databases
  Overview
  Distributed Storage
  Distributed Queries
  Distributed Transactions
  Multidatabase (Middleware)
Distributed Databases
  Multiple locations
  Single logical database
  Several physical databases
  Network connections
Advantages
  Sharing across locations
  Local control
  Availability
Challenges
  Development costs
    People & Equipment
  Testing
  Problem identification & resolution
  Technical expertise
  Network dependence
  Increased processing overhead
Distributed Data Storage
  Replication
  Fragmentation
  Both
Replication
  Data is repeated
  Spectrum of options available
    Temporary replication of specific rows
    Replicate infrequently changed data
    Replicate by site
      Central site - all / each local site - their data only
    Full replication
      Everything everywhere
Concerns with Replication
  Availability needed
  Amount of parallelism in reads
  Overhead of updates
  Keeping replicas updated
  Conflicting updates
Fragmentation
  Partitioning
  Divide data into subsets based on need
  Have to be able to pull back together to get original tables
Fragmentation
  Horizontal
    by rows
    specified conditions
  Vertical
    by column
    each requires primary key (or created key)
  Mixed
    by row and column
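
The two fragmentation styles, and the requirement that fragments can be pulled back together, can be sketched in a few lines of Python. This is only an illustration: the table contents, column names, and helper functions are hypothetical and do not come from any particular DBMS.

```python
# Hypothetical Employee table; rows and column names are illustrative only.
employees = [
    {"id": 1, "name": "Ann", "site": "NY", "salary": 90000},
    {"id": 2, "name": "Bob", "site": "Pgh", "salary": 75000},
    {"id": 3, "name": "Cy", "site": "NY", "salary": 82000},
]

def horizontal_fragment(rows, predicate):
    """Horizontal fragment: the rows that satisfy a specified condition."""
    return [row for row in rows if predicate(row)]

def vertical_fragment(rows, key, columns):
    """Vertical fragment: the chosen columns plus the primary key."""
    return [{key: row[key], **{c: row[c] for c in columns}} for row in rows]

def rejoin_vertical(key, *fragments):
    """Rebuild full rows by joining vertical fragments on the key."""
    merged = {}
    for fragment in fragments:
        for row in fragment:
            merged.setdefault(row[key], {}).update(row)
    return [merged[k] for k in sorted(merged)]

# Horizontal: split by site; the union of the fragments is the original table.
ny = horizontal_fragment(employees, lambda r: r["site"] == "NY")
other = horizontal_fragment(employees, lambda r: r["site"] != "NY")

# Vertical: split by column; every fragment carries the key so it can be rejoined.
public = vertical_fragment(employees, "id", ["name", "site"])
payroll = vertical_fragment(employees, "id", ["salary"])
```

Carrying the key in every vertical fragment is what makes the rejoin possible, which is why each vertical fragment requires a primary (or created) key.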
Fragmentation & Replication
  Repeat as necessary:
    Replicate fragments
    Fragment replicas
  Don’t lose track of what you have and where it is!
Network Transparency
  Distributing data should not require that the user know where or how it’s been distributed.
  The database should be seen as a single entity no matter how fragmented and replicated it becomes.
Network Transparency
  Some DBMSs are starting to provide this level of functionality so transparency exists even at the program level, but in many cases this “transparency” must be programmed into the applications.
  It must always be designed into the database.
Distributed Queries
  How do you query data that is everywhere?
  Efficiency vs. Overhead
    Splitting the query apart
    Keeping track of the data/locations
    Making sure everything gets executed
    Putting the results back together
    Generating network traffic
    Handling partial results
Distributed Queries
  Full replication can avoid the overhead
    Huge increase in update overhead
    Parallel execution no longer possible
    Additional costs of replication
Example
  5 sites - NY, Pgh, Chicago, Dallas, Los Angeles
  Data fragmented by site - no replication
  Query (in Pgh):
    SELECT Name, MAX(Salary) FROM Employee
Option 1 - High Bandwidth
  1. Have all sites send their full employee tables to Pgh.
  2. Build a temporary employee table.
  3. Run the query against this table.
Option 2 - Not so High Bandwidth
  1. Examine the query and determine it can be run separately at each location and the results combined.
  2. Submit just the query to each location.
  3. Wait for the results from each city.
  4. As results return, build a temporary table (5 rows only).
  5. Find the max using the temporary table.
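
The lower-bandwidth option can be sketched as follows. This is a minimal illustration, assuming each site holds its own fragment of the Employee table as (name, salary) pairs; the sample data and function names are hypothetical.

```python
# Hypothetical per-site fragments of Employee: (name, salary) pairs.
SITES = {
    "NY": [("Ann", 90000), ("Raj", 70000)],
    "Pgh": [("Bob", 75000)],
    "Chicago": [("Cy", 82000), ("Dee", 95000)],
    "Dallas": [("Eve", 60000)],
    "Los Angeles": [("Flo", 88000)],
}

def local_max(rows):
    """Runs at one site: return that site's top-paid (name, salary) row."""
    return max(rows, key=lambda row: row[1])

def global_max(sites):
    """Runs at the querying site: combine one result row per site."""
    # The temporary table holds one row per site (5 rows here),
    # not the full employee tables.
    temporary_table = [local_max(rows) for rows in sites.values() if rows]
    return max(temporary_table, key=lambda row: row[1])

result = global_max(SITES)
```

Only one row per site crosses the network, which is where the bandwidth saving comes from. The approach works because a maximum of per-site maximums equals the overall maximum; not every query decomposes this cleanly.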
Distributed Transactions
  Transaction Types
  Coordinators
  Commit Protocols
  Concurrency Controls
  Deadlocks
Transaction Types
  Local - transaction only needs local data
  Global - transaction uses non-local data
    My global becomes someone else’s local
  Either type of transaction must still have ACID properties - global is the concern
System Structure
  Things to do:
  1. Process local transactions (transaction manager)
  2. Process and track global transactions (transaction coordinator)
Global Processing
  1. Recognize as global
  2. Break up transaction
  3. Distribute pieces
  4. Assemble results
  5. Coordinate termination
  6. Handle problems
Coordinator of Coordinators
  Coordinate among sites
  Detect problems
  Attempt to fix
  Share status with others
Coordinator Failure
  Backup Coordinator
    receives all messages - maintains state
    monitors coordinator
    automatically takes over if coordinator down
    avoids delays - increases overhead
  Election
    highest pre-assigned number
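
The election rule can be illustrated with a very small sketch: among the sites that still respond, the one with the highest pre-assigned number becomes coordinator. The `is_alive` callback is a hypothetical stand-in for a real network probe, and the message exchange a real election needs is left out.

```python
def elect_coordinator(site_numbers, is_alive):
    """Return the highest pre-assigned number among sites that respond.

    site_numbers: iterable of pre-assigned integers, one per site.
    is_alive: callable reporting whether a site currently responds
              (hypothetical; stands in for a real network probe).
    """
    candidates = [n for n in site_numbers if is_alive(n)]
    if not candidates:
        raise RuntimeError("no live site available to coordinate")
    return max(candidates)

down = {5}  # pretend the old coordinator, site 5, has failed
new_coordinator = elect_coordinator([1, 2, 3, 4, 5], lambda n: n not in down)
```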
Commit Protocols
  Two-Phase
  Three-Phase
  All sites must commit or all sites have to rollback
  Replicated data only
Two-Phase Commit
  Phase 1
    Send PREPARE to all sites
    Sites respond READY or ABORT
  Phase 2
    If all sites READY,
      COMMIT locally - Send COMMITs
    If not READY or time expires
      ROLLBACK locally - Send ROLLBACK
Two-Phase Commit - Phase 1
  [Diagram: Coordinator and three Sites] Send PREPARE - all sites
Two-Phase Commit - Phase 2
  [Diagram: Coordinator and three Sites] Send COMMIT - all sites
Two-Phase Commit - Phase 1
  [Diagram: Coordinator and three Sites] Site responds ABORT or does not respond
Two-Phase Commit - Phase 2
  [Diagram: Coordinator and three Sites] Send ROLLBACK - all sites
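
The two phases can be sketched as a small in-memory simulation. This is an illustrative sketch only: the `Site` class and its methods are hypothetical, a missing response is modeled as a vote of `None` (standing in for "time expires"), and logging, real messaging, and failure recovery are all omitted.

```python
from enum import Enum

class Vote(Enum):
    READY = "READY"
    ABORT = "ABORT"

class Site:
    """Hypothetical participant; `vote` is what it answers to PREPARE."""
    def __init__(self, name, vote=Vote.READY):
        self.name = name
        self.vote = vote        # None simulates a site that never responds
        self.state = "ACTIVE"

    def prepare(self):
        if self.vote is Vote.READY:
            self.state = "READY"
        return self.vote

    def finish(self, decision):
        self.state = decision   # "COMMIT" or "ROLLBACK"

def two_phase_commit(sites):
    # Phase 1: send PREPARE to all sites and collect the responses.
    votes = [site.prepare() for site in sites]
    # Phase 2: commit only if every site answered READY; an ABORT or a
    # missing response forces a rollback everywhere.
    decision = "COMMIT" if all(v is Vote.READY for v in votes) else "ROLLBACK"
    for site in sites:
        site.finish(decision)
    return decision
```

For example, `two_phase_commit([Site("NY"), Site("Pgh", Vote.ABORT)])` rolls back both sites, while a call in which every site votes READY commits them all.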
Site Failure - Recovery
  COMMIT and ROLLBACK as normal
  If READY only
    Check with coordinator or other sites
    Either COMMIT or ROLLBACK
    If no one found, ROLLBACK
Coordinator Failure
  Ask the sites
    If one has COMMIT, then REDO
    If one has ROLLBACK, then UNDO
    If one doesn’t have READY, UNDO
  If all READY only
    Coordinator must decide
    Sites must wait and locks are held
    “Blocking” occurs
Three-Phase Commit
  Phase 1
    Send PREPARE
    Sites respond READY or ABORT
  Phase 2
    If all sites READY, send PRECOMMIT
    Else, ROLLBACK
    Sites must ACKNOWLEDGE
  Phase 3
    If at least K sites ACKNOWLEDGE, send COMMIT
Coordinator Failure
  Three-Phase Commit prevents blocking
  If coordinator fails
    New coordinator is selected
    Sites queried to determine status
    New coordinator resumes
Network Partitioning
  Network split creates two separate networks
    Each “half” selects a coordinator
    Coordinators make independent decisions
    Result could be different decisions
    Resolution of network problem may create need to resolve database problems
Concurrency Control
  Single Lock Manager
  Multiple Lock Managers
Single Lock Manager
  One site for all locking
  All other sites must go to it
  Can read from anywhere
  Updates must be to all copies
  Advantages: Simple, Easy deadlock detection
  Disadvantages: Bottleneck, Vulnerability
Simple Multiple Lock Managers
  Each site locks a unique partition of the data
    non-replicated data
  Advantages: Fairly simple, reduced bottlenecks
  Disadvantages: Complicated deadlock detection
Majority Protocol
  Each site locks its own data
    replication possible
  Request owner for lock on data that isn’t local
  When multiple owners, n/2 + 1 (majority) must provide the lock
  Advantages: No bottlenecks
  Disadvantages: More messages sent, Complicated deadlock detection, More deadlocks (each gets 1/2)
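
The majority rule, and the "each gets 1/2" deadlock it can cause, can be shown with simple arithmetic. This sketch assumes n/2 is integer division; the function names are hypothetical.

```python
def majority(n):
    """Locks needed out of n replicas: n/2 + 1, using integer division."""
    return n // 2 + 1

def holds_lock(granted, n):
    """True when a majority of the n replica sites granted the lock."""
    return granted >= majority(n)

# With 4 replicas a transaction needs 3 grants. If two transactions each
# obtain 2, neither holds the lock and neither can make progress.
REPLICAS = 4
t1_granted, t2_granted = 2, 2
stuck = not holds_lock(t1_granted, REPLICAS) and not holds_lock(t2_granted, REPLICAS)
```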
Biased Protocol
  Reduced form of Majority Protocol
  For a READ, only need any single lock
  For a WRITE, need all locks
  Advantages: No bottlenecks, Reduced traffic
  Disadvantages: Update traffic, Deadlocks
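
A rough sketch of the read and write rules, with a count of lock requests for a mixed workload, shows where the traffic saving and the update cost come from. The function names are hypothetical, and the count assumes one request message per replica contacted.

```python
def can_read(grants):
    """A READ proceeds once any single replica grants the lock."""
    return any(grants)

def can_write(grants):
    """A WRITE proceeds only when every replica grants the lock."""
    return len(grants) > 0 and all(grants)

def biased_lock_requests(reads, writes, replicas):
    """Lock requests sent: one per read, one per replica for each write."""
    return reads * 1 + writes * replicas

# Read-heavy workload on 5 replicas: 9 reads cost 9 requests,
# while the single write alone costs 5.
requests = biased_lock_requests(reads=9, writes=1, replicas=5)
```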
Primary Copy
  Site designated to hold “primary” copy
    Multiple sites
    Replicated Data
  All locks through that site
  Advantages: Fairly simple, reduced bottlenecks
  Disadvantages: Vulnerability, Complicated deadlock detection
Other Than Locking
  Timestamps
    Centralized generation
    Local generation
  Timestamp tests determine ability to read or write
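
One common form of timestamp test is the basic timestamp-ordering rule, sketched below. That this is the scheme intended here is an assumption. The (counter, site_id) pair used for local generation is likewise an illustrative convention that keeps timestamps from different sites unique, and the class and function names are hypothetical.

```python
class Item:
    """A data item tracking the latest read and write timestamps seen."""
    def __init__(self):
        self.read_ts = (0, 0)
        self.write_ts = (0, 0)

def local_timestamp(counter, site_id):
    """Locally generated timestamp: site_id breaks ties between sites."""
    return (counter, site_id)

def try_read(item, ts):
    """Reject a read that would see a value written 'in its future'."""
    if ts < item.write_ts:
        return False
    item.read_ts = max(item.read_ts, ts)
    return True

def try_write(item, ts):
    """Reject a write that a later transaction has already read or overwritten."""
    if ts < item.read_ts or ts < item.write_ts:
        return False
    item.write_ts = ts
    return True
```

A rejected operation would normally cause its transaction to roll back and restart with a new timestamp; that step is omitted from the sketch.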
Deadlocks & Distributed Data
  Centralized
    One Site
  Distributed
  Centralized - same advantages and disadvantages as other centralized control (database or locking)
Distributed Deadlock Detection
  Each site tracks all transactions accessing its own data
  Dummy transaction for transactions that originated here but are executing elsewhere
  If deadlock found that includes dummy transaction
    Must send deadlock information to other sites
    They check for deadlock
    May have to pass on to another site
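
Why sites have to exchange deadlock information can be seen with wait-for graphs. The sketch below is a simplification: it merges the local graphs of two sites and looks for a cycle, rather than implementing the dummy-transaction message passing described above. Transaction names and edges are hypothetical.

```python
def has_cycle(edges):
    """Detect a cycle in a wait-for graph given (waiter, holder) edges."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, set()).add(holder)

    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True          # reached a node already on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))

# At site A, T1 waits for T2; at site B, T2 waits for T1.
site_a = [("T1", "T2")]
site_b = [("T2", "T1")]
local_deadlock = has_cycle(site_a) or has_cycle(site_b)
global_deadlock = has_cycle(site_a + site_b)
```

Neither site sees a cycle on its own, yet the combined graph has one, which is the situation the dummy transaction is meant to flag.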