Post on 22-Dec-2015
Databases Unplugged: Challenges in Ubiquitous Data Management
Michael Franklin
UC Berkeley
M. Franklin, 12/17/99 2
“Gazillions of Gizmos”
“In ten years, billions of people will be using the Web, but a trillion "gizmos" will also be connected to the Web.” Asilomar Rep. on DB Research, Dec. 1998
You’ve heard it before…
Smartphones, PDAs, Smartcards, badges, wearables, lightswitches, toasters, …
Worldwide sales of Internet-enabled appliances projected to grow from 5.9M units in 1998 to 55.7M units in 2002. IDC via H&Q report
An Explosion in Scale
[Figure: computing eras plotted against Distribution and Personalization, each axis running from Less to More. Batch/RJE and Time Sharing: many people per computer. WS/Server and PC + Network: one person per computer. Information Appliances: many computers per person. Annotation: scaled-down PCs, desktop metaphor.]
(Picture is by way of Randy Katz)
Technical Challenges
Disconnection/Weak Connection - Standard distributed database techniques break down.
Limited resources - Memory, CPU, Power, User Interface, Bandwidth
Movement/Location - Killer Mobile apps use current and future locations.
Scale - Number and diversity of devices.
Reliability - Palm Pilots don’t bounce.
But, is Mobile Data Mgmt Needed?
“Fundamentally, the ability to access all information from anywhere and have ONE unified and synchronized information repository is critical to making appliances useful.” Hambrecht and Quist, iWord, March 1999
“All these information appliances have internal data that "docks" with other data stores. Each gizmo is a candidate for database system technology, because most will store and manage some information.” Asilomar Report
Road Map
Motivation
Alternative scenarios for mobile Databases
Technical/Research challenges
Some solutions
- Consistency
- Data Dissemination
- Data Recharging
Conclusions
How Will it Happen?
Alternatives:
SQL engine on the device (largely standalone)
Extension of enterprise infrastructure
Data Collection (device to infrastructure)
Data Dissemination (infrastructure to device)
PIM-driven information assistant
SQL Engine on the Device
Reasonable for Palmtop, but probably not the toaster or light-switch…
Stand-alone with occasional synchronization.
Footprint versus functionality
- Engine can be made surprisingly small (tens to hundreds of KB).
- Sybase uses “take what you need” library approach
All major vendors are playing in this space: Oracle Lite, Sybase SQL Anywhere, Informix/Cloudscape, DB2 for the Workpad, SQL Server for Windows CE
But, what is the killer app???
Extension of Enterprise
Logical Progression?
Mainframe -> Desktop -> Palm
ERP -> Palm
Device becomes the endpoint of the enterprise infrastructure (queries and updates).
This is happening but must take into account fundamental limitations of the mobile platforms.
Again, examples exist, but the killer app has not yet emerged here.
Data Collection Devices
Inventory Management/Tracking/Sensors/Census
Examples: Symbol Technologies (Palm with a bar code scanner); more futuristic: smart dust.
Asymmetric (device to server) data flow/usage dictates system architecture.
Many applications exist, but no clear need for full function DBMS on the device.
Server-side DB must handle data streams
Data Dissemination
Many Potential Apps
- stock and sports tickers
- traffic information systems
- software distribution
- news and/or entertainment delivery
Asymmetric (server to devices) data flow/usage dictates system architecture.
No clear need for full function DBMS on the device, but intelligent caching and filtering on device is crucial.
Personal Information Management
PIM is the killer app for mobile devices.
So, use PIM to drive the data management architecture.
Example: IBM’s Active Calendar
Calendar provides semantic information on what information will be needed when (and where).
Use this information to pre-stage information from the fixed infrastructure.
This seems to be the most promising approach for driving device DB functionality.
Research Issues
Transactions (not likely) and Consistency.
Distribution of function
- how to split query functionality? adaptive??
New Querying and Access Models
- info filtering and dissemination
- location centric/movement triggers/pervasive (invasive?) computing
- Evidence Accrual – killer app: dating game
Availability and Recovery
Data Caching and Consistency
How to keep distributed data consistent?
Centralized algorithms require connectivity at specific times.
Alternative: Epidemic Algorithms (Peer-to-peer)
Conflict detection: timestamps, version vectors,…
Conflict Handling (update commitment):
- Optimistic (resolution) - Manual except in limited domains,
- Pessimistic (avoidance) - primary copy, write-all or voting-based.
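As an illustration of conflict detection with version vectors, here is a minimal sketch. It assumes each replica keeps a dictionary mapping replica names to update counters; the function name `compare` and the replica names are hypothetical and not taken from the talk.

```python
def compare(a, b):
    """Compare two version vectors (dict: replica name -> update counter).

    Returns "equal", "a dominates", "b dominates", or "conflict".
    A missing entry is treated as a counter of 0.
    """
    replicas = set(a) | set(b)
    a_ahead = any(a.get(r, 0) > b.get(r, 0) for r in replicas)
    b_ahead = any(b.get(r, 0) > a.get(r, 0) for r in replicas)
    if a_ahead and b_ahead:
        return "conflict"      # concurrent updates: neither copy has seen the other
    if a_ahead:
        return "a dominates"   # a has seen everything b has, and more
    if b_ahead:
        return "b dominates"
    return "equal"


# A palmtop and a server both updated the same item while disconnected.
palmtop = {"palm": 2, "server": 1}
server = {"palm": 1, "server": 2}
print(compare(palmtop, server))
```

A "conflict" result is what would then be passed to optimistic resolution; pessimistic schemes aim to prevent that state from arising at all.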
Epidemic Protocol Illustration
(Picture is by way of Ugur Cetintemel)
Deno - Cetintemel and Keleher
Pessimistic, Asynchronous (epidemic), voting-based
“Bounded” weighted-voting:
- Each replica is assigned a currency $c_i$ s.t. $0 \le c_i \le 1.0$
- Total currency in the system is bounded, i.e., $\sum_i c_i = 1.0$
- Currency can be re-distributed for optimization or planned disconnection.
An update’s life:
- Sites issue tentative updates
- Updates and votes are propagated in a pair-wise fashion
- Updates gather votes as they pass through sites
- An update commits when it gathers plurality of votes
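To make the bounded-currency idea concrete, here is a small sketch of re-distributing currency before a planned disconnection. It is an illustration under assumptions, not Deno's actual mechanism: the `transfer` function, the site names, and the starting allocation are all hypothetical. It only shows that a pair-wise transfer keeps every currency in [0, 1.0] and the total at 1.0.

```python
from fractions import Fraction


def transfer(currency, src, dst, amount):
    """Return a new allocation with `amount` of currency moved from src to dst.

    `currency` maps site name -> voting currency (hypothetical representation).
    """
    if not (0 <= amount <= currency[src]):
        raise ValueError("amount must be between 0 and the source's currency")
    updated = dict(currency)
    updated[src] -= amount
    updated[dst] += amount
    return updated


# Hypothetical starting allocation; Fractions keep the arithmetic exact.
currency = {"s1": Fraction(1, 2), "s2": Fraction(3, 10), "s3": Fraction(1, 5)}

# s3 plans to disconnect, so it hands all of its currency to s1 first.
after = transfer(currency, "s3", "s1", currency["s3"])
print(after)
```

With its currency handed off, the disconnected site no longer holds votes that the remaining sites would have to wait for.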
Decentralized Update Commitment
An update u wins an election with plurality
A site s maintains:
- votes(u): the sum of votes u gained so far
- unknown: the sum of votes unknown to s (i.e., $1.0 - \sum_{u} \mathrm{votes}(u)$, summed over all updates u known to s)
u commits iff for all u’ <> u,
votes(u) > votes(u') + unknown and
votes(u) > unknown
Issues: time to commit; abort rates
Example 1 (s1, Oi), votes listed as (site, currency, update) in order of arrival:
1) (s1, 0.20, u1): votes(u1) = 0.20, unknown = 0.80
2) (s5, 0.20, u1): votes(u1) = 0.40, unknown = 0.60
3) (s6, 0.15, u2): votes(u1) = 0.40, votes(u2) = 0.15, unknown = 0.45
4) (s2, 0.15, u1): votes(u1) = 0.55, votes(u2) = 0.15, unknown = 0.30
u1 commits!
Example 2 (s1, Oi), votes listed as (site, currency, update) in order of arrival:
1) (s1, 0.20, u1): votes(u1) = 0.20, unknown = 0.80
2) (s4, 0.20, u2): votes(u1) = 0.20, votes(u2) = 0.20, unknown = 0.60
3) (s6, 0.25, u3): votes(u1) = 0.20, votes(u2) = 0.20, votes(u3) = 0.25, unknown = 0.35
4) (s2, 0.25, u2): votes(u1) = 0.20, votes(u2) = 0.45, votes(u3) = 0.25, unknown = 0.10
u2 commits!
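The commit test can be checked against the two vote traces from the slide. The sketch below is an illustration only: the function name `committed_update` and the list-of-tuples representation are assumptions, while the vote values are the (site, currency, update) triples shown in the figure.

```python
def committed_update(known_votes, total_currency=1.0):
    """Return the update that can commit given the votes known at one site, else None.

    known_votes: iterable of (site, currency, update) triples.
    """
    tally = {}
    for _site, currency, update in known_votes:
        tally[update] = tally.get(update, 0.0) + currency
    # Currency whose vote this site has not yet heard about.
    unknown = total_currency - sum(tally.values())
    for update, votes in tally.items():
        rivals = [v for u, v in tally.items() if u != update]
        # u commits iff it beats every rival even if that rival received all
        # unknown votes, and also beats an update this site has not seen yet.
        if votes > unknown and all(votes > v + unknown for v in rivals):
            return update
    return None


example_1 = [("s1", 0.20, "u1"), ("s5", 0.20, "u1"),
             ("s6", 0.15, "u2"), ("s2", 0.15, "u1")]
example_2 = [("s1", 0.20, "u1"), ("s4", 0.20, "u2"),
             ("s6", 0.25, "u3"), ("s2", 0.25, "u2")]

print(committed_update(example_1[:3]))  # three votes known: nothing commits yet
print(committed_update(example_1))      # fourth vote arrives
print(committed_update(example_2))
```

At most one update can pass the test at a time, because two updates cannot each exceed the other's tally plus the unknown votes.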
Semantic Caching - Dar et al.
Idea: Maintain description of cache contents as a set of logical predicates rather than a list of items.
Potential advantages:
- Less overhead with no need for static clustering (reduces bandwidth requirements).
- Describe missing items with logical remainder query.
- Application/Environment specific replacement functions, e.g. considering direction and velocity.
Issues:
- controlling complexity of cache descriptions
- interacting with real database systems
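The remainder-query idea can be sketched for the simplest case, a single numeric attribute. This is a simplification and an assumption: the cache description here is just a list of half-open ranges [lo, hi) over one attribute, and the function name `remainder` is hypothetical. Real semantic caches describe contents with more general predicates.

```python
def remainder(cached, query):
    """Return the sub-ranges of `query` that no cached range covers.

    cached: list of (lo, hi) half-open ranges describing the cache contents.
    query:  (lo, hi) half-open range requested by the application.
    """
    lo, hi = query
    missing = []
    for c_lo, c_hi in sorted(cached):
        if c_hi <= lo:
            continue                      # cached range lies entirely before the query
        if c_lo >= hi:
            break                         # cached range lies entirely after the query
        if c_lo > lo:
            missing.append((lo, c_lo))    # gap before this cached range
        lo = max(lo, c_hi)                # everything up to c_hi is now covered
        if lo >= hi:
            break
    if lo < hi:
        missing.append((lo, hi))          # tail of the query not covered
    return missing


cache_description = [(10, 20), (30, 40)]
print(remainder(cache_description, (15, 35)))  # only the gap goes to the server
```

Only the returned ranges need to be sent to the server; the rest of the query is answered from the cache, which is where the bandwidth saving comes from.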
Dissemination-Based Info Sys (DBIS)
1) Push vs. Pull is just one dimension along which to compare data delivery mechanisms.
- We’ve identified three.
2) Different mechanisms for data delivery can (and should) be applied at different points in the system.
- Select components from toolkit.
Franklin and Zdonik - Framework in OOPSLA 97, Toolkit description and demo in SIGMOD 99.
DBIS Framework
An architecture that combines data delivery techniques for responsive client access.
3 types of nodes:
- Data sources
- Clients
- Information brokers (can add value)
Any data delivery mode can be used.
- Network transparency
- Possibly dynamic.
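Delivery Options
[Figure: tree of delivery options: Pull or Push, then Aperiodic or Periodic, then Unicast or 1-to-n]
Pull, Aperiodic, Unicast: request/response
Pull, Aperiodic, 1-to-n: request/response w/ snoop
Pull, Periodic, Unicast: polling
Pull, Periodic, 1-to-n: polling w/ snoop
Push, Aperiodic, Unicast: Email lists
Push, Aperiodic, 1-to-n: publish/subscribe
Push, Periodic, Unicast: Email list digests
Push, Periodic, 1-to-n: Broadcast disks, publish/subscribe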
Network Transparency
[Figure: Clients, Brokers, and Sources connected by links]
The type of a link matters only to nodes on each end
DBIS Example
[Figure: an example configuration with a Server DB, a 1-to-n push link, and three proxy caches, each with a unicast pull link. Can vary dynamically.]
DBIS Research Issues
Each data delivery mechanism has unique aspects
- Broadcast Disks - scheduling, caching, prefetching, updates
- On-demand Broadcast - scheduling, data staging
- Publish/Subscribe - large-scale filtering, channelization
Security/Fault-tolerance/Reliability
End-to-End network design and control
Fundamental performance tradeoffs
Exploiting existing and emerging technologies
“Data Recharging”
Mobile devices require 2 resources: power and data
It is impractical to be continuously connected to fixed sources of these.
Devices cope with disconnection using caching:
- Power cached in rechargeable batteries
- Data cached in hot-synched memory
Ideal: make recharging data as simple as power:
Anywhere (with adapters), anytime, flexible connection duration
Joint work w/ Mitch Cherniack and Stan Zdonik getting underway
Data Recharging - Research Agenda
Profile Definition and Maintenance
Update Storage and Preparation
Efficient integration of "recharge" updates with existing cached data.
Recharge, Trickle Charge, Jump Start...
Consistency Guarantees
Global Data Staging
Approaches will be driven by (mostly PIM) applications.
Conclusions
Lots of plausible/useful Mobile data architectures.
- For many, the applications exist today
- Each has its own set of fascinating research opportunities.
PIM is the killer app for mobile data access.
- It can be used to drive the integration with enterprise and Internet data sources.
Successful MDA work lies at the intersection of communications and data management rather than exclusively in either camp.