BOINC: An Open Platform for Public-Resource Computing
David P. Anderson
Space Sciences Laboratory
U.C. Berkeley

Public-resource computing
[Diagram labels: home PCs, business, academic, your computers]
Advantages:
• scale
• free
• growth
• public education
• no policy issues
Challenges:
• low BW at client
• costly BW at server
• firewall/NAT issues
• sporadic connection
• untrustworthy, insecure clients
• server security
• heterogeneity
• need PR, glitzy GUI

Why share an infrastructure?
[Diagram labels: Research lab X, University Y, Public project Z; projects; applications; resource pool]
• Participants install one program, select projects, specify constraints; all else is automatic
• Projects are autonomous
• Advantages of a shared platform:
  – Better long-term resource utilization
  – Better instantaneous resource utilization
  – Faster/cheaper for projects, software is better
  – Easier for projects to get participants
  – Participants learn more

Goals of BOINC (Berkeley Open Infrastructure for Network Computing)
• Public-resource computing/storage
• Multi-project, multi-application
  – Participants can apportion resources
• Handle fairly diverse applications
• Work with legacy apps
• Support many participant platforms
• Small, simple

General structure of BOINC
• Project:
  – Scheduling server (C++)
  – BOINC DB (MySQL)
  – Data servers (HTTP)
  – Web interfaces (PHP)
  – Project back end: work generation, retry generation, result validation, result processing, garbage collection
• Participant:
  – Core agent (C++)
  – App agents

Data model
• File attributes:
  – Name
  – URL list
  – Persistent flag
  – Upload-when-present flag
• Files may originate in client or in project work manager
• Projects can use participant disks for long-term data archival
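
The file attributes above could be modeled roughly as follows. This is only a sketch: the class name `FileInfo`, its field names, the helper `files_to_upload`, and the reading of each flag given in the comments are assumptions for illustration, not BOINC's actual schema.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class FileInfo:
    """Hypothetical record mirroring the four file attributes on the slide."""
    name: str
    # URL list: locations the file can be transferred from or to.
    urls: List[str] = field(default_factory=list)
    # Assumed meaning: keep the file on the participant's disk after use.
    persistent: bool = False
    # Assumed meaning: upload the file once it exists on the client.
    upload_when_present: bool = False


def files_to_upload(files: Iterable[FileInfo], present: Set[str]) -> List[str]:
    """Names of files that are flagged upload-when-present and exist locally."""
    return [f.name for f in files if f.upload_when_present and f.name in present]
```

For example, given an input file marked persistent and an output file marked upload-when-present, only the output file is selected for upload once both are on disk.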

Computing model
• Applications, platforms, app versions
• Workunits
  – Inputs to a computation
  – Estimates of resource requirements
• Results
  – Outputs of a computation

Hosts and scheduling
• Host measurements
  – CPU performance (integer/FP/memory)
  – RAM, cache, disk free/total
  – On/connected statistics
  – Network bandwidth statistics
• Workunit properties
  – RAM/disk/computation requirements
• Scheduling policy
  – Feasibility
  – High/low water mark
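
One way to read the scheduling policy above: a workunit is feasible if the host has enough RAM and disk for it, and a host whose queue has dropped below the low water mark is sent feasible work until the queue reaches the high water mark. The sketch below assumes that reading; all names (`Host`, `Workunit`, `pick_work`) and the use of seconds of queued work as the water-mark unit are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    ram_free: float    # bytes
    disk_free: float   # bytes
    ops_per_sec: float # measured CPU speed


@dataclass
class Workunit:
    ram_needed: float    # bytes
    disk_needed: float   # bytes
    ops_estimate: float  # estimated amount of computation


def is_feasible(host: Host, wu: Workunit) -> bool:
    """A workunit is feasible if it fits in the host's free RAM and disk."""
    return wu.ram_needed <= host.ram_free and wu.disk_needed <= host.disk_free


def pick_work(host: Host, candidates: List[Workunit], queued_seconds: float,
              low_water: float, high_water: float) -> List[Workunit]:
    """Refill the host's queue only when it is below the low water mark,
    then add feasible workunits until it reaches the high water mark."""
    chosen: List[Workunit] = []
    if queued_seconds >= low_water:
        return chosen
    for wu in candidates:
        if queued_seconds >= high_water:
            break
        if is_feasible(host, wu):
            chosen.append(wu)
            queued_seconds += wu.ops_estimate / host.ops_per_sec
    return chosen
```

The gap between the two marks means a sporadically connected host fetches work in batches rather than contacting the server after every result.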

Accounting and result validation
• Standardized unit of credit
  – CPU time * (int+FP+mem)
  – Project-specific benchmark?
• Result validation
  – Compare redundant results, flag incorrect results
• Granted credit:
  – Minimum of claimed credit among correct results
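
The credit rules above can be sketched as follows. The slide does not say how redundant results are compared, so this example assumes a strict-majority vote on the output; that rule and the function names are illustrative assumptions only.

```python
from collections import Counter
from typing import Hashable, List, Optional, Tuple


def claimed_credit(cpu_seconds: float, int_score: float,
                   fp_score: float, mem_score: float) -> float:
    """Claimed credit = CPU time * (int + FP + mem benchmark scores)."""
    return cpu_seconds * (int_score + fp_score + mem_score)


def validate(results: List[Tuple[Hashable, float]]) -> Tuple[Optional[Hashable], float]:
    """results holds (output, claimed_credit) pairs for one workunit.

    Assumed rule: an output is correct if a strict majority of the
    redundant results agree on it. Granted credit is the minimum claimed
    credit among the correct results. Returns (None, 0.0) if no output
    has a strict majority.
    """
    if not results:
        return None, 0.0
    output, votes = Counter(out for out, _ in results).most_common(1)[0]
    if votes * 2 <= len(results):
        return None, 0.0
    granted = min(claim for out, claim in results if out == output)
    return output, granted
```

Granting the minimum claim removes the incentive to inflate claimed credit, since an inflated claim cannot raise what is granted.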

Participant preferences
• Examples:
  – Work only while user away
  – Confirm before connecting
  – Don’t work if on batteries
  – High, low water marks
  – Limits on disk space, bandwidth
  – Application-specific preferences
  – List of projects + authenticators + % allocation
• Edited via Web interface
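
A minimal sketch of how a client might apply a few of these preferences, assuming hypothetical names (`Preferences`, `may_compute`, `resource_shares`) and a simple normalization of the per-project percentages:

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Preferences:
    """Hypothetical subset of the participant preferences on the slide."""
    work_only_while_away: bool = True
    work_on_batteries: bool = False
    # Project name -> % allocation (example keys are made up).
    allocation: Dict[str, float] = field(default_factory=dict)


def may_compute(prefs: Preferences, user_active: bool, on_batteries: bool) -> bool:
    """Decide whether computation is allowed right now."""
    if prefs.work_only_while_away and user_active:
        return False
    if on_batteries and not prefs.work_on_batteries:
        return False
    return True


def resource_shares(prefs: Preferences) -> Dict[str, float]:
    """Normalize the % allocations so the shares sum to 1."""
    total = sum(prefs.allocation.values())
    if total <= 0:
        return {}
    return {project: pct / total for project, pct in prefs.allocation.items()}
```

Normalizing the percentages keeps the shares well defined even if the participant's entries do not add up to exactly 100.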

Application Programming
• Checkpoint/restart
• Filename translation
• Graphics
  – OpenGL-based
  – Application window or screensaver
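
Checkpoint/restart matters because participant machines can stop an application at any time. The sketch below shows the general pattern only and does not use the BOINC API: state is written to a temporary file and renamed into place, so an interruption never leaves a half-written checkpoint. The file layout and function names are assumptions.

```python
import json
import os


def save_checkpoint(path, state):
    """Write state to a temp file, then rename it over the old checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)  # the rename replaces the old file in one step


def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists yet."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return default


def run(path, n_steps, checkpoint_every=10):
    """Toy computation (sum of 0..n_steps-1) that resumes from a checkpoint."""
    state = load_checkpoint(path, {"step": 0, "total": 0})
    while state["step"] < n_steps:
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

Calling `run` again with the same checkpoint path continues from the last saved step instead of starting over.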

Conclusion
• BOINC status
  – Mostly feature-complete
  – Client runs on Linux, Solaris, Windows, MacOS X
  – Small: client is 5,000 lines, server 2,000
• Projects:
  – Astropulse (later this year)
  – Other SETI@home (Parkes etc.)
  – Folding@home, climate prediction
  – Others: rendering? Theorem proving?