Supercomputing Delivered
Georgetown University
School of Continuing Studies
MPTM 900
Annabel Berman
Kyle Facada
Tyler Gray
SUPERCOMPUTING DELIVERED
2
Table of Contents
Foreword
Abstract
Introduction
Company Overview
Proposed Solution
Technological Solution
Grid Computing and the Life Sciences
Current State of BOINC at Georgetown University
Business Plan
Management
Further Considerations
Conclusions
Appendix
● Technical Architecture - Advanced “LATTICE” Implementation
● Reward Badges Available for WCG Participation / Individual Recognition
● Georgetown University Faculty / Staff BOINC Group
● Interview: Arnie Miles GU UIS & Adjunct Assistant Professor of Computer Science
● Interview: Jennifer Smith Library Coordinator of Communications
● Email: Arnie Miles Email: Revisiting Expanding BOINC at GU
● Arnie Miles/Judd Nicholson: BOINC Georgetown Campus Deployment Kick-Off Meeting
● Interview: Michael Cummings & Adam Bazinet; Pioneers for UMD - The Lattice Project
● Draft Email for Tony Cipriano, Written on Request of Georgetown University CTO Judd Nicholson
● Email: Notification from Arnie Miles that BOINC / WCG Rollout Has Started
Works Cited
Foreword
Some of the background research for this project has been derived from the
author’s previous research and efforts towards implementing Berkeley Open
Infrastructure for Network Computing (BOINC) at Georgetown University, with
citations made as appropriate. These efforts, undertaken in 2013 as preparation for
this capstone project, have resulted in this project being able to take on a real-world
significance and purpose that would not have otherwise been possible. To this
effect, our initial submission to Georgetown University’s “H.Roundtables” online
question and answer served to identify and spark interest from three key faculty
members at Georgetown University: Mr. Judd Nicholson, Deputy Chief Information
Officer; Mr. Arnie Miles, IT Strategist & Adjunct Assistant Professor of Computer
Science; and Ms. Jennifer Smith, Coordinator of Communications, Outreach and
Programming at the Lauinger Library. Thanks to their interest, support and
guidance, phase one of the project, specifically the deployment of BOINC software to
all systems under the authority of Georgetown's University Information Services
department, has been completed as of November 19, 2014. With this milestone
accomplished, we’re now on track for a University-wide deployment to occur
through a phased rollout in early 2015.
Abstract
This project paper will demonstrate how any organization that uses
networked computer systems and mobile devices can, at minimal cost,
contribute its unused computing power to solving some of humanity’s greatest
challenges in healthcare and science. We will describe how advancements in
computer hardware and networking led to the development of an innovative
open-source software program known as the Berkeley Open Infrastructure for
Network Computing (herein referred to as BOINC), which has since evolved to
become the platform of record for thousands of organizations around the world
via the IBM-sponsored World Community Grid (WCG). We will demonstrate how our
progress towards implementing BOINC / WCG at Georgetown University makes a
compelling business case for a new benefit corporation (b-corp) or a non-profit
entity specializing in the implementation of distributed computing initiatives
for other organizations.
Keywords: distributed computing, BOINC, World Community Grid
Introduction
Of the many scientific and technological breakthroughs of the last 50 years,
the 1969 Apollo 11 mission and the sequencing of the human genome are unique in
both their impact on society and the degree to which technological innovation
made them possible. Despite possessing only 2 kilobytes of memory and just
1.024 MHz (0.001024 GHz) of processing power, the Apollo 11 computer helped
guide three astronauts safely to the moon (Robertson, 2009). More recently, rapid
advancements in the life sciences resulting from the mapping of the human genome
have created a wealth of complex data on how the human body works. As Bazinet
notes, the analysis of this data has traditionally required very computationally
intensive methods such as stochastic simulation, machine learning approaches,
Bayesian analysis, and Markov-chain Monte Carlo sampling (Bazinet, 2009).
However, with recent advancements in grid computing techniques and the
proliferation of personal computers and smartphones, there has never been a better
opportunity to leverage the power of grid computing to solve some of humanity’s
greatest big data challenges in science and health care.
In the history of computers, the concept of grid, or distributed, computing is a
relatively new innovation. For the purposes of this analysis, we’re adopting
Bazinet’s definition of grid or distributed computing as:
“A model of distributed computing that uses geographically and administratively disparate resources. In grid computing, individual users can access computers and data transparently, without having to consider location, operating system, account administration and other details. In grid computing, the details are abstracted, and the resources are virtualized” (Bazinet, 2009).
Of the several approaches to grid computing, the most widely known effort is
the Berkeley Open Infrastructure for Network Computing, or “BOINC” project, which
is an open-source “middleware” software platform that can be used to analyze
sets of data that are too large and complex for any single computer system to
process. Given that Georgetown University (GU) already possesses all of the
technical and staffing resources necessary to implement BOINC on thousands of
systems, we have used this project to develop a
framework for implementing a multi-phased grid computing initiative that could be
adapted to work for other civic-minded organizations, large and small.
Company Overview
We are proud to present this business model for Supercomputing Delivered as
a viable concept that will provide computing resources to an underserved market of
researchers across the world. This plan is brought to you by the ‘Supercomputing
Delivered’ conceptual creators: Tyler Gray (CEO), Kyle Facada (CTO), and Annabel
Berman (CMO). This plan includes a strategy, business and financial model, and key
management and operation considerations, which have been carefully developed as
a framework to make the BOINC computing concept a viable business venture. An
implementation approach is also provided, to guide our new venture as we take the
necessary steps to enter the market, as well as a cost analysis outlining our
projections for the financial resources required to implement BOINC at varying
levels of complexity.
This business plan is a product of a lengthy and detailed analysis of the
current landscape in this small but growing industry and the BOINC technology
itself. In addition to the business model, we have also provided a proposed
organization structure and market overview, competitive analysis, and other
relevant considerations for the venture.
Proposed Solution
The Berkeley Open Infrastructure for Network Computing, or “BOINC,” is
open-source software for volunteer grid computing; the largest platform currently
using it is an initiative spearheaded by the IBM Corporation known as the World
Community Grid (or WCG). At a basic level, BOINC, via IBM’s WCG configuration,
pools individual computing resources together, allowing them to process large
amounts of data in parallel to effectively create the equivalent of a
multi-million dollar supercomputer. The BOINC software does so by leveraging the
idle time and computing capacity of any desktop computer, laptop computer, tablet,
or even mobile device. This software has the power to offer organizations (particularly
those with limited resources) supercomputing capability at a fraction of the typical
cost.
Supercomputing Delivered is proposing to drive the adoption of BOINC at
colleges and universities that are either not participating at all, or not fully
participating, by means of the popular WCG configuration. Supercomputing
Delivered has an initial 12-month plan to use our unique knowledge, expertise, and
insights to help other universities implement a BOINC program whereby the users
on their networks will participate in the WCG. Our longer-term, three-year goal is to
explore the concept of reselling university computing resources to customers
outside of the higher education community, specifically nonprofits and other
underserved computing markets.
Proposed Technological Solution
In their 2002 work “SETI@home: An Experiment in Public-Resource
Computing,” Dr. Anderson and his colleagues from Berkeley chronicle the series of
developments that made BOINC possible. Thanks to Moore’s Law, popularly
summarized as the observation that the computational speed of systems will, on
average, double every 18 months, today’s high-end consumer personal computers (PCs)
are comparable to the multi-million dollar supercomputers that only governments
and large corporations could afford just a few years prior. As consumers began
connecting to the Internet, the idea of using Internet-connected computers in
unison led to the launch of two projects from which SETI@home would draw
inspiration (Anderson, et al., 2002). SETI@home was not the first project to apply
the concept of distributed computing to solving scientific problems; earlier
efforts such as GIMPS and Distributed.net had previously applied distributed
computing power to solving mathematical problems. SETI@home, however, launched
with a much more compelling hook, namely the search for extraterrestrial
intelligence (Anderson, et al., 2002). In the early 2000s, as computing power
reached new heights and data storage costs plummeted, SETI@home was processing
information at 60 TeraFLOPS (trillion floating-point operations per second), while
the world’s
most powerful supercomputer at the time “only” clocked in at twelve TeraFLOPS
(Anderson, 2003).
Since its release in 1999, SETI@home has been used by over six million
volunteers to analyze 160 terabytes (TB) of data collected by radio
telescopes all over the world. The most notable is the Arecibo Observatory in
Puerto Rico, which was made famous by the 1995 James Bond film GoldenEye (Korpela,
2011). After the radio data is first collected in a centralized database, the BOINC
software breaks the data down into smaller, more manageable parts that are
distributed to individual users and processed for patterns that could potentially
indicate the activity of intelligent life. The 1997 film Contact, starring Jodie
Foster, captured this central concept, as it portrayed the analysis of data from radio
telescopes to discern and unlock potential patterns. However, unlike in the film,
where scientists at SETI made the discovery, the SETI@home project distributes the
work of signal analysis across hundreds of thousands of Internet-connected
computer systems. Another key difference from the film is that
SETI@home has yet to conclusively identify signs of intelligent life coming from
other planets; according to the SETI@home team, however, there have been enough
promising leads that overall enthusiasm for the project has not decreased
(Anderson, 2003).
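The split-and-distribute step described above can be illustrated with a minimal Python sketch. It is not SETI@home's or BOINC's actual code: the function names, the unit size, and the stand-in "analysis" (picking the strongest sample in each slice) are all hypothetical and chosen only to show the shape of the idea.

```python
from typing import Iterator, List


def split_into_work_units(samples: List[float], unit_size: int) -> Iterator[List[float]]:
    """Yield consecutive slices of a recording, each small enough for one volunteer."""
    if unit_size <= 0:
        raise ValueError("unit_size must be positive")
    for start in range(0, len(samples), unit_size):
        yield samples[start:start + unit_size]


def strongest_sample(unit: List[float]) -> float:
    """Stand-in for the real signal analysis: return the largest-magnitude sample."""
    return max(unit, key=abs)


# Hypothetical recording; a real one would hold far more samples.
recording = [0.1, -0.4, 0.2, 3.5, 0.0, -0.3, 0.2, 0.1, -2.0, 0.4]

# The server cuts the recording into work units...
units = list(split_into_work_units(recording, unit_size=4))

# ...and each unit could be analyzed independently on a different computer.
results = [strongest_sample(unit) for unit in units]
```

Because no unit depends on any other, the work can be handed to as many machines as there are units, which is what makes this kind of problem a good fit for volunteer computing.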
To further uncover the participant mix in the SETI@Home project, Dr. David
Anderson conducted a survey to which 130,000 users responded. Dr. Anderson
discovered that the primary reason why most people were motivated to participate
in the project was, surprisingly, for public recognition that came from the
SETI@home leaderboard, a scoreboard of sorts that measured users’ system data
processing speeds. In fact, some users would go so far as to try to manipulate the
system in order to receive credit for completing more work than they really had! In
other words, in the search to discover extraterrestrial life by analyzing radio
signals, people were, in essence, cheating. Thankfully, after uncovering irregular usage
patterns and too-good-to-be-true results, Dr. Anderson and his team added new
features to their software to both mitigate cheating and enhance the accuracy of
future results by performing redundant computations and cross-checking completed
data sets (Anderson, 2003).
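The idea of redundant computation with cross-checking can be sketched as a simple quorum rule: the same work unit is sent to several hosts, and a result is accepted only when enough of them agree. The sketch below is an illustration under assumptions, not the validator Dr. Anderson's team wrote; the function names, the host names, and the quorum of two are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Optional


def validate_by_quorum(results: Dict[str, Hashable], quorum: int) -> Optional[Hashable]:
    """Return the agreed result if at least `quorum` hosts reported it, else None."""
    if not results:
        return None
    value, votes = Counter(results.values()).most_common(1)[0]
    return value if votes >= quorum else None


def credited_hosts(results: Dict[str, Hashable], canonical: Hashable) -> List[str]:
    """Only hosts whose answer matches the agreed result earn credit."""
    return sorted(host for host, value in results.items() if value == canonical)


# Three hosts processed the same work unit; one reports an implausible value.
replies = {"host_a": 42, "host_b": 42, "host_c": 9999}

canonical = validate_by_quorum(replies, quorum=2)
credited = credited_hosts(replies, canonical)
unresolved = validate_by_quorum(replies, quorum=3)  # no value has three votes
```

Under this rule a host that fabricates a result gains nothing unless other, independent hosts happen to fabricate the same value, and a unit with no agreement can simply be sent out again.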
World Community Grid
Since the BOINC software can harness the power of distributed computing for
virtually any type of big data project, BOINC has drawn the attention of academia,
business, information technology and the pharmaceutical industry as a means to
advance scientific research across a variety of disciplines. In 2003, the IBM
Corporation led a consortium of organizations in developing the “World Community
Grid” (WCG), a unified public computing grid that applies the power of grid
computing to scientific projects that benefit humanity. Similar to the
development of SETI@home, the WCG launched with a singular aim: to accelerate
the discovery of a cure for smallpox. With strong public interest, the project
resulted in the identification of 44 new potential treatments for smallpox. Based on
the success of this initial effort, IBM elected to enhance their WCG program by
expanding access for users running operating systems other than Microsoft
Windows (Linux / OSX) and by moving away from the proprietary backend software
running WCG in favor of the open-source BOINC (Cleary, 2008).
Alternative Models of Grid Computing and Middleware Systems: The Lattice
Project
Although we have been focused on desktop-based grid computing systems
utilizing variations of a specific software program, BOINC / WCG are not the only
approaches to employing IT resources at scale. In fact, many large academic
institutions and corporations use their own IT resources to perform analysis and
conduct research for internal stakeholders. What sets BOINC / WCG apart in the
realm of grid computing and competing “middleware” software platforms is the
open-source nature of the software and the fact that end users are donating their
resources for complete strangers to conduct scientific research. Using these two
concepts as a framework, researchers Michael Cummings and Adam Bazinet from
the University of Maryland developed their own novel combination of these
architectures in order to:
“Unite heterogeneous computing resources into a computational Grid system, so that resources are uniformly usable and addressable. Our Grid is composed of institutional resources, such as clusters and workstations, and resources that are volunteered by users running BOINC software… We have made a special effort to unite traditional Grid computing with what is known as desktop or volunteer computing, and our work has benefited greatly as a result” (Bazinet, 2009).
In Bazinet’s description, what makes the Lattice Project especially unique is that
while most BOINC projects are focused on one particular problem, they have created
“a generalized Grid system using Globus, BOINC, and Condor” that is capable of
running several applications at once. Applications that were not originally designed
to work on grid computing systems are now able to, thereby greatly enhancing the
capabilities of the system to tackle even more diverse big data problems (Bazinet,
2009).
In terms of how The Lattice Project fits into our plans for Supercomputing
Delivered at Georgetown University and beyond, we envision an implementation of
the Lattice Project’s architecture as the long-term end goal following initial
deployment of a BOINC / WCG solution. As our experience in implementing BOINC /
WCG at Georgetown has shown us, for institutions that may lack strong computer
science or systems engineering programs, or that are simply smaller than a
major state-funded public institution (like the University of Maryland, for example),
the actions involved in setting up BOINC / WCG can help pave the way for more
robust grid computing initiatives in the future. In the course of researching and
beginning to implement BOINC / WCG at GU, we have identified the key stakeholders
who would be responsible for the complex integrations that would take place in the
future. By involving these individuals and departments early on, we’re building a
coalition towards creating an even more robust program for the benefit of future
students and faculty.
Grid Computing Benefits the Life Sciences
According to the World Health Organization (WHO), malaria is both a
preventable and treatable mosquito-borne disease whose primary victims are
children in Africa under five years old. In its latest estimates, for 2012, the WHO
reports that there were 627,000 deaths from malaria, “with an uncertainty range of
473,000 to 789,000, mostly among African children,” out of a total of 207 million
cases of malaria in 2012 (World Health Organization, 2013). Malaria is strongly
linked to poverty. Despite increases in funding for malaria control over the last eight
years, the WHO’s analysis indicates that an “estimated US $5.1 billion is needed every
year between 2011 and 2020 to achieve universal access to malaria interventions”;
in 2011, however, “only US $2.3 billion was available, less than half of what is
needed” (World Health Organization, 2013).
Furthermore, as frightening as those numbers are, if current trends continue
and the parasite becomes resistant to antimalarial drugs, the potential losses will be
catastrophic. While many nations and non-governmental organizations, such as the
Bill and Melinda Gates Foundation, are attacking the malaria problem from several
angles, the big data analysis project “Fight Malaria @Home” is hoping to use the
BOINC / WCG platform and the power of distributed computing to discover
new drugs that target the strains of malaria that are becoming resistant to current
drugs. In this sense, Fight Malaria @Home has made it possible for anyone with a
computer and Internet connection to contribute to the fight against malaria.
On a biological level, malaria is actually caused by the protozoan parasite
known as Plasmodium falciparum. As the sponsors of Fight Malaria @Home point
out, thanks to the human genome project, “the plasmodium falciparum genome has
been sequenced, the proteome has been mapped, and protein expression has been
confirmed” (O'Brien D. A.). In other words, the molecular structure of the protozoan
parasite that causes malaria has been translated into a machine-readable format.
To understand how distributed computing can help to discover new drugs to
fight malaria, it is first important to establish some background on how modern
medicine fights disease using computer-aided drug discovery. As the National
Institutes of Health explained in a 2008 press release announcing the creation of a
new web-based database on diseases:
“Most drugs work by latching onto proteins and altering a biological process. Researchers can use computational tools to study the structural and biophysical properties of a target protein and, from among tens of thousands of possible ligands, predict the relatively few that bind to the protein in a potentially useful way. These ligands may warrant further study as so-called lead compounds for drug discovery. Computational tools can also indicate which compounds may interact with other proteins and cause unwanted side effects that could limit therapeutic use” (2008).
Just as in conventional warfare, where understanding the composition of enemy
forces as well as their tactics and methods of operation is necessary to win in battle,
so too must scientists study Plasmodium falciparum in order to discover its weak
points. To accomplish this, Fight Malaria @Home is currently analyzing over 19,000
different compounds that have been found to be successful in fighting Plasmodium
falciparum. In their informational video explaining the project, team leaders Dr.
Anthony Chub and Kevin O’Brien explain that while any one of these 19,000
compounds can kill the parasite:
“We don’t know how it works, we don’t know which protein it binds to, we don’t know which pathway it is involved in, and we don’t know if some of them might work in a similar way to existing drugs. So what we’re interested in doing is screening all 19,000 compounds and look for new targets that have never been hit inside the parasite before” (O'Brien D. A., 2012).
It is in analyzing these 19,000 compounds that Fight Malaria @Home is
using the power of BOINC to crunch numbers in support of an alternative
approach to finding new cures, one that would otherwise not have been possible due to
funding constraints and competition from other researchers. In contrast to the
other projects seeking a cure for malaria, as Dr. Chub and O’Brien go on to
explain, “while Alex (leader of competing efforts) is docking millions of compounds
against a small number of target proteins, we are asking if we can find the target
protein that is inhibited by a specific compound. In essence it's the same question -
backwards” (O'Brien D. A.). In terms of their progress towards this goal, in May
2013 the project had its first major success:
“After analyzing the data generated in the first phase (docking all the 400 MalariaBox compounds against all models), we have ranked lists of potential inhibitors for each receptor. We tested the top 20 hits using a model of the Plasmodial form of HDAC (histone deacetylase) in a laboratory (Dr. Marian Brennan, RCSI, Dublin), and found THREE new inhibitors! So that's just one of the 1,500 receptors that we're docking, using only the best docking results, i.e., only the 20 best ligands” (O'Brien D. A.).
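The two directions of the docking question can be sketched over one table of scores. This is an illustration only, not the project's pipeline: the compound names, receptor names, and scores are invented, the function names are hypothetical, and the sketch assumes that a lower (more negative) docking score indicates a better predicted fit.

```python
from typing import Dict, List, Tuple

# (compound, receptor) -> docking score. All values below are made up.
Scores = Dict[Tuple[str, str], float]


def top_ligands(scores: Scores, receptor: str, n: int) -> List[Tuple[str, float]]:
    """Forward question: which compounds dock best against one receptor?"""
    hits = [(c, s) for (c, r), s in scores.items() if r == receptor]
    return sorted(hits, key=lambda pair: pair[1])[:n]


def likely_targets(scores: Scores, compound: str, n: int) -> List[Tuple[str, float]]:
    """Backwards question: which receptors does one compound dock against best?"""
    hits = [(r, s) for (c, r), s in scores.items() if c == compound]
    return sorted(hits, key=lambda pair: pair[1])[:n]


scores: Scores = {
    ("cmpd_1", "HDAC"): -9.2,
    ("cmpd_2", "HDAC"): -6.1,
    ("cmpd_3", "HDAC"): -8.4,
    ("cmpd_1", "receptor_B"): -5.0,
    ("cmpd_3", "receptor_B"): -7.7,
}

best_for_hdac = top_ligands(scores, "HDAC", n=2)
targets_for_cmpd_3 = likely_targets(scores, "cmpd_3", n=1)
```

A ranked list per receptor, cut off at the top few entries, is the kind of shortlist that would then go to a laboratory for testing, as in the top 20 hits the project describes for HDAC.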
BOINC Software: Desktop and Mobile User Experience Review
As part of our analysis, our project team participated, and recruited others
to participate, in BOINC / WCG. Although most organizations tend to be relatively
uniform in terms of their hardware and software configurations, this project aims
to involve organizational
as well as the private resources of individuals in combined projects. Therefore, it is
important to understand first-hand how BOINC / WCG software affects the individual
user experience.
Desktop Software: Windows + OSX
As noted by Arnie Miles, since the BOINC software was originally developed
for Windows, BOINC performs best from both a user and an administrative perspective
on Windows-based PCs. In Mr. Miles’s experience, as well as our own, BOINC did not
significantly degrade the user experience when set up to run only during the system’s
idle time. In terms of desktop and laptop performance, because BOINC is a
resource-intensive application, laptop systems without proper ventilation will generate a
significant amount of heat. For this reason, a separate installation
profile should be created for laptop systems so that they use fewer system
resources than comparable desktop systems.
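One way to express separate desktop and laptop profiles is sketched below in Python. The element names follow BOINC's `global_prefs_override.xml` preferences file as we understand it and should be checked against the BOINC client documentation before any deployment; every numeric value is a hypothetical starting point, not a setting used at Georgetown, and `PROFILES` and `render_profile` are names invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical values: laptops get fewer cores, a lower CPU duty cycle,
# a longer idle wait, and no computation on battery power.
PROFILES = {
    "desktop": {
        "run_on_batteries": 0,
        "run_if_user_active": 0,   # compute only when the machine is idle
        "idle_time_to_run": 3,     # minutes of inactivity before starting
        "max_ncpus_pct": 100,      # share of processor cores BOINC may use
        "cpu_usage_limit": 100,    # share of CPU time BOINC may use
    },
    "laptop": {
        "run_on_batteries": 0,
        "run_if_user_active": 0,
        "idle_time_to_run": 10,
        "max_ncpus_pct": 50,
        "cpu_usage_limit": 60,
    },
}


def render_profile(name: str) -> str:
    """Build the preferences XML for one profile as a string."""
    root = ET.Element("global_preferences")
    for key, value in PROFILES[name].items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


laptop_xml = render_profile("laptop")
desktop_xml = render_profile("desktop")
```

Keeping both profiles in one table makes the difference between them easy to review, and an administrator could push the rendered file to each class of machine with whatever software distribution tool is already in use.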
Mobile Device Support
At present, thanks to the support of IBM and HTC Corporation, there is a free
Android application available for download that allows anyone to run BOINC from
their smartphone. With a sample of 10 participants using devices from a
variety of manufacturers, on both handheld and tablet computers, we asked
volunteers to install and try the software for one weekend and to report on the results.
In terms of how the program affects overall device performance, because the default
settings for BOINC are to run the program only when a device is connected to an AC
power source and Wi-Fi, we did not receive any complaints of reduced performance
on the devices. However, the default storage setting could be a problem for some
users, as the maximum storage space the app can use by default is abnormally high (up to
90% of available capacity). Aside from suggested improvements to setting up the app,
such as implementing single sign-on for users, overall feedback was positive.
Worldwide, user feedback on the application in the Google Play store is also
positive, with 3,009 total reviews and a rating of 4.4 / 5.0 (BOINC Reviews).
Current State of BOINC at Georgetown University
Among the wealth of statistical information available on all BOINC teams, the
leaderboard ranks teams according to their Current Credits, BOINC World
Position, Position in Country, Number of Members and Number of Active Members.
Prior to our undertaking this initiative, the state of BOINC deployment at
Georgetown University in late 2013 was as follows:
Largest & Most Productive Georgetown Group:
“Georgetown University Faculty Staff and Students.”
● BOINC Rank: 407 out of 101,525
● Current Credits: 36,774,236.59
● Position in Country: 368 out of 101,525
● Number of Members: 347
● Active Members: 21
● Top Projects: BOINC Combined, World Community Grid
● Group Homepage: http://boincstats.com/en/stats/-1/team/detail/60725
(BOINCstats/BAM! | BOINC combined - team stats - Georgetown University Faculty Staff and Students).
Second Largest & Most Productive Georgetown Group:
“Georgetown University”
● BOINC Rank: 3,147 out of 101,525
● Current Credits: 12,027,196.93
● Position in Country: 1,794 out of 101,525
● Number of Members: 37
● Active Members: 3
● Top Projects: Climate Prediction, Einstein@Home, Malaria Control, SETI@Home
● Group Homepage: http://boincstats.com/en/stats/-1/team/detail/2165/overview
(BOINCstats/BAM! | BOINC combined - team stats - Georgetown University).
Deploying BOINC at Georgetown University
While there is much to be gained by fully embracing BOINC on campus, there
are a number of issues that are currently being addressed in order to deploy BOINC
/ WCG University-wide. The deployment of BOINC at GU will succeed by overcoming
potential roadblocks through consensus building, a staggered deployment path, and a
diffusion of responsibility across multiple GU departments. As a first step, the
Supercomputing Delivered team worked with Georgetown’s University
Information Services (UIS) team to encourage consolidating the multiple Georgetown
BOINC project teams that currently exist at the University. There are
currently three teams with active members (in order of prevalence: Georgetown
University Faculty Staff and Students, Georgetown University, and Georgetown
Hoyas) (BOINCstats/BAM! | Search).
Potential Security Concerns
As the creators of BOINC acknowledge, there are legitimate concerns about
the security, network, and performance implications of introducing a new system
into any environment. However, as we have demonstrated at Georgetown
University, these concerns can be mitigated by actively seeking out stakeholders and
requesting a formal review of BOINC’s security wiki early on. As described in the
BOINC security wiki, a multilayered system of protections has been built into the
software to protect the integrity of host systems:
“Malicious executable distribution
BOINC uses code signing to prevent this. Even if attackers break into a project's BOINC server, they will not be able to cause clients to accept a false code file.
Intentional abuse of participant hosts by projects
BOINC uses account-based sandboxing: applications run under a specially-created account (Mac/Linux version 5.4+, Windows version 6+). If file and directory permissions are set appropriately, applications will have no access to files outside of the BOINC directory.
Accidental abuse of participant hosts by projects
BOINC prevents some problems: for example, it detects when applications use too much disk space, memory, or CPU time, and aborts them. Projects can minimize the likelihood of causing problems by pre-released application testing. Projects should test their applications thoroughly on all platforms and with all input data scenarios before promoting them to production status” (Berkeley Open Infrastructure for Network Computing, 2009).
Since the implementation of BOINC at Georgetown, the University has not
experienced any security issues with the BOINC software, which suggests that the
minimal initial security risk was worth taking. According to Jennifer Smith, the
Georgetown University Library Coordinator of Communications:
“We have not had any performance or security concerns with the program.
BOINC (only) uses approximately 200MB of memory at (when) the computer
is running. Since the vast majority of our computers have 4GiB or more
memory, this hasn't made any noticeable impact on performance. We've also
configured BOINC is so that it only runs computations (which can slow down
the computer significantly) when the computer is not in use by a user”
(BOINC at Georgetown: Library Coordinator of Communications [Interview]).
Supercomputing Delivered met with Arnie Miles several times in
recent months. Mr. Miles is an Adjunct Professor of Computer Science at GU who
plays an important role within UIS and is spearheading the full deployment of
BOINC across the entire University (see Implementation Path, Stage 1 for more
details). According to Miles, “Since BOINC is run by IBM, there is very low security
impact (according to the Security Office)” (BOINC at Georgetown: UIS [Personal
interview]). As multiple University staff have advocated for BOINC and assured our
team that security is not a concern, our team felt comfortable moving forward with
the project.
Financial Concerns
Implementing BOINC will require resources in the form of staff time, and
potentially increased utility costs due to the higher CPU utilization of machines
running BOINC. In terms of concerns regarding additional overtime costs for staff,
given that BOINC is already in use on GU systems, it stands to reason that additional
GU staff would be willing to volunteer their time out of an interest in the end
product. Initial outreach to GU IT Admin staff on this front has been encouraging.
Financially, BOINC projects could benefit Georgetown researchers in a way far
exceeding the minimal energy and staff costs involved. Through
IBM and government agencies such as the National Science Foundation (NSF),
Georgetown researchers could apply for grants to fund BOINC projects. The NSF
awarded a $998,862 grant for the Einstein@Home project, sponsored by the
University of Wisconsin-Milwaukee (Research Areas).
The only costs Georgetown has incurred thus far were for the staff time needed to
install the BOINC software and set up the programs when BOINC was installed in May
of 2013. According to Smith, “There is a slight increase in our electricity usage, but
we have not been able to ascertain the amount.” Any increase in power
consumption could be mitigated by efficiency savings elsewhere or by fundraising
drives to install clean energy solutions. Since the additional power that BOINC
consumes is so minimal, this is a very minor concern (BOINC at Georgetown: Library
Coordinator of Communications [Interview]).
Stakeholders
Given the wide variety of worthwhile projects to choose from, it is likely that
a strong coalition of students, professors and university officials would support
BOINC. Furthermore, given that BOINC is supported by a grant from the National
Science Foundation, there is a reasonable chance that Georgetown could obtain a
grant to implement and improve on BOINC on its computer systems, thereby
mitigating some of the opposition to the idea.
Implementation Path: Georgetown Deployment of BOINC
Stage 1: Data Collection - Investigation of Current BOINC Environment
and Discussion for Georgetown Campus Distribution
In May of 2013, Library Systems Administrator Aaron Williams handled
most of the implementation of BOINC for the Georgetown Library computer
systems. To date, this is the only location on Georgetown’s campus that has a formal
installation of BOINC/WCG software (BOINC at Georgetown: Library Coordinator of
Communications [Interview]). Since November of 2013, Georgetown has
seen active participation in its BOINC projects, with a particular surge
in credits within the last six months. From November 2013 to November 2014, the
Georgetown University Faculty Staff and Students Group increased its total
credit output by 598 percent. Moreover, since our initial efforts to enact BOINC /
WCG at Georgetown University, as of December 1 the addition of several hundred
new systems has generated 6,333,187 points, or
0.45 TeraFLOPs in additional processing power, in less than two weeks’ time
(BOINCstats/BAM! | BOINC combined - team stats - Georgetown University Faculty
Staff and Students).
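The relationship between a point total and a processing-power figure of this kind can be sketched as follows. This is a rough illustration, not necessarily the calculation used for the figures above. It assumes the standard BOINC credit definition (200 credits per day for a sustained 1 GigaFLOPS) and the World Community Grid convention of 7 points per BOINC credit. The 10-day window is an assumed value, chosen only because it is consistent with “less than two weeks.” All function and constant names are illustrative.

```python
# Rough conversion from World Community Grid (WCG) points to sustained TeraFLOPS.
# Assumptions (not stated in the surrounding text):
#   - 7 WCG points equal 1 BOINC credit
#   - 200 BOINC credits per day correspond to a sustained 1 GigaFLOPS
#   - the points were earned over an assumed 10-day window

WCG_POINTS_PER_BOINC_CREDIT = 7
BOINC_CREDITS_PER_GFLOPS_DAY = 200


def wcg_points_to_tflops(points: float, days: float) -> float:
    """Estimate average sustained TeraFLOPS from WCG points earned over `days`."""
    if days <= 0:
        raise ValueError("days must be positive")
    boinc_credits = points / WCG_POINTS_PER_BOINC_CREDIT
    gflops = boinc_credits / days / BOINC_CREDITS_PER_GFLOPS_DAY
    return gflops / 1000.0


if __name__ == "__main__":
    estimate = wcg_points_to_tflops(6_333_187, days=10)
    print(f"Estimated sustained throughput: {estimate:.2f} TeraFLOPS")
```

Under these assumptions a 10-day window gives roughly 0.45 TeraFLOPS, while a full 14 days would give a lower figure (about 0.32), so the estimate is sensitive to the assumed window.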
Given this surge in the interest in and use of BOINC amongst the Georgetown
community, Supercomputing Delivered is continuing to explore opportunities
to expand the deployment of BOINC/WCG beyond shared computing systems and
those under the direct control of Georgetown University Information Services (UIS).
During our initial interview, Mr. Miles mentioned an upcoming conversation
with Deputy CIO Judd Nicholson, in which he would
request Nicholson’s approval for a campus-wide deployment of BOINC on
any machine under UIS control (computer, server, lab, etc.) (BOINC at Georgetown:
UIS [Personal interview]). On October 23, 2014, Miles sent an email to
Supercomputing Delivered announcing that Nicholson was tentatively on board
with rolling out BOINC/WCG to all systems under the direct control of Georgetown’s
UIS department. Miles then invited our team to be a part of the kickoff meeting, in
which we would be project leads under Nicholson’s supervision (Revisiting
Expanding BOINC at GU [Interview]). This news further demonstrates the
enthusiasm for WCG among GU faculty and the belief that undertaking this initiative will
lead to outsized results for both the early backers and the university at large.
During the kickoff meeting held on October 31, 2014, Miles tasked
Supercomputing Delivered with drafting a letter to Tony Cipriano, who is
responsible for imaging the classroom lab stations (BOINC Georgetown Campus
Deployment: Kickoff Meeting [Interview]). Once we completed this task, the letter
was sent on behalf of CTO Miles, which prompted Mr. Cipriano to
initiate the software deployment process. As a result, over 100 new systems have
come online to start processing data for the Georgetown University WCG group in
only two weeks. This initiative is taking off across Georgetown; the current BOINC
environment is blossoming and our team is at the forefront of this effort.
Stage 2: Secure Endorsement from Key Students and Faculty Members
In September 2012, Georgetown’s Chief Operating Officer deployed the
IdeaScale platform to serve as a permanent forum to solicit feedback and encourage
discussion among students, faculty and staff (Crouch, 2012). Although the
IdeaScale platform (known as Georgetown Idea Roundtable, or “h.Roundtables”) is
not prominently featured on the GU website, since its launch h.Roundtables has still
managed to attract 3,323 users, who have generated more than 300 ideas, over 700
comments and 30,000+ votes (Georgetown University, 2014). In November of 2013, a
member of Supercomputing Delivered submitted to IdeaScale the idea of deploying
BOINC throughout Georgetown-owned PCs, which happened
to coincide with another independent effort to implement BOINC at Georgetown’s
Lauinger Library (Georgetown Ideas). This early interaction was a major factor in
facilitating our early success with implementing BOINC / WCG on a larger scale at
Georgetown University.
Stage 3: Official Briefings and Discussions with Key GU Administrators
& Students
As project leads in Miles and Nicholson’s initiative to
deploy BOINC / WCG across all UIS-controlled systems throughout
Georgetown, we are in a good position to speak with other GU administrators about
expanding the adoption of BOINC to enable Georgetown to claim a top ten position
in the worldwide WCG rankings (BOINC at Georgetown: UIS [Personal interview]).
Now that GU will be deploying BOINC on all UIS-controlled systems, it is important
that we maintain contacts with the administration, initiate communications
with active student users, and foster new relationships to encourage more
growth amongst the student body as well as faculty. It is our hope that all PCs, not
just those operated by UIS, will install BOINC to become a part of this effort.
Stage 4: Formal Adoption of BOINC by GU
The formal adoption process has already begun and will continue in the
coming months, with the widespread rollout of BOINC on all UIS-managed systems.
With continued support from GU IT staff, current GU students, and alumni, our team
believes BOINC can be installed and run on every system at Georgetown. Outreach
to current students to assist with the implementation will be incorporated into the
next stage of outreach efforts, while outreach to alumni will be conducted via social
media networks such as LinkedIn and by requesting language in official university
communications to students and alumni.
Stage 5: Lessons Learned / Media Outreach
Regular outreach to local and national technology reporters will be ongoing
throughout the process, covering the progress, challenges and results of implementing
BOINC at Georgetown University. Continued meetings with Miles, Nicholson, and
the UIS deployment team while we begin to implement BOINC across Georgetown
will be an imperative step in ensuring progress is made. Our team is fully
committed to the effort at Georgetown, as well as beyond.
Post-Georgetown Business Plan
BOINC has taken off successfully at Georgetown, and is on its way to being
widely adopted throughout the entire campus. As Supercomputing Delivered, our
primary business goal is now to bring our knowledge and consulting services to
universities and non-profit organizations that need help maintaining their
BOINC accounts, setting up new projects, or meeting whatever other BOINC needs
they may have, to ensure this platform continues to be widely used. We have conducted
extensive research regarding this business model, and have identified the
building blocks below, which we believe make our business a viable, strong structure.
Market Demand & Value Propositions
BOINC is a reliable, secure product with a demonstrated value and enormous
potential. Our company brings the unique knowledge, expertise, and capability that
higher education institutions need to fully implement and benefit from a BOINC
solution on their campus. Colleges and Universities, as institutions of higher
education and learning, are by nature motivated to invest in tools and technology
that broaden their research capability, making them inclined to utilize our
consulting services.
While there is not a current market for our services, we believe we can create
a market by offering a unique value proposition to our potential institutional
customers. BOINC has the potential to unlock vast computing power and problem
solving/research capability for numerous institutions; however, these institutions
lack the knowledge, skill sets, resources, and organizational commitments to
implement BOINC and use it to generate value and innovate, which is where
Supercomputing Delivered provides an invaluable asset. The key to our success is
twofold: we must demonstrate the value of BOINC to the customer, and then
demonstrate that we can deliver it. Having successfully assisted Georgetown
University in deploying a BOINC system and helped it develop ways to
leverage the capability, we are uniquely suited to advise other institutions as they
seek to replicate the success we have had at Georgetown. The lessons learned and
critical insights we gather at Georgetown will enable us to anticipate and address
the challenges that other organizations will likely encounter as they deploy BOINC,
helping them do so more efficiently and with reduced risk.
Competitive Analysis
At the present time, we see no companies in this specific market
that would directly compete with our service offering. While there are some
companies offering similar computing resources for sale, these are predominantly
offered in a self-service model, rather than as a managed implementation of
open-source software. The chart below provides an overview of the three
organizations that present potential competitive threats to our business venture,
and our response plan to those risks.
Amazon High Performance Computing (HPC)
Overview: Amazon High Performance Computing (HPC) allows scientists or researchers to increase the speed of research by running high performance computing in the cloud. Reduces costs by providing Cluster Compute or Cluster GPU servers on-demand without large capital investments.
Response Plan: Partner with organization and establish business relationship. Ultimately seek hardware referral fees.

Slicify
Overview: Slicify is a marketplace to connect users with extra computing resources to organizations seeking to rent those resources from all over the world. Reduces costs by providing simple hourly rate from a pool of providers. Slicify is a Chinese company based in Hong Kong, which may be of concern to security minded users.
Response Plan: Monitor organization size, service offerings, market penetration, and position in market. Track pricing.

IBM
Overview: IBM is one of the largest providers of IT consulting and professional services in the world. As the host organization for the World Community Grid (WCG), IBM would be able to offer a similar capability. Presently, the World Community Grid is a philanthropic initiative of IBM Corporate Citizenship.
Response Plan: Partner with organization to obtain technical knowledge. Ensure goals/objectives remain aligned to those of WCG program. Monitor IBM service offerings.
Customer Segments
Critical to the success of our venture is identifying who we will provide our
services to, and then finding ways to logically coordinate and tailor our services to
provide the highest level of value to the customer. Supercomputing Delivered has
an initial plan of targeting accredited colleges and universities, segmented by region
and then by size (based on 2013 total enrollment). Small universities are those with
fewer than 10,000 students, medium universities are those with between 10,000 and
20,000, and large universities are those with more than 20,000
students. Our future customer base will include nonprofit organizations in the
Washington, D.C. area with big data analytics needs that are certified 501(c)(3)s or
small business organizations (SBOs), meaning those with revenue of less than $5
million per year.
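The size thresholds above can be expressed as a small classification helper. The sketch below is illustrative only: the names `SizeSegment` and `size_segment` are hypothetical, and treating exactly 10,000 and exactly 20,000 students as “medium” is an assumption based on the wording of the thresholds.

```python
from enum import Enum


class SizeSegment(Enum):
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"


def size_segment(enrollment_2013: int) -> SizeSegment:
    """Classify a university by its 2013 total enrollment.

    Assumption: the boundary values 10,000 and 20,000 count as medium.
    """
    if enrollment_2013 < 0:
        raise ValueError("enrollment cannot be negative")
    if enrollment_2013 < 10_000:
        return SizeSegment.SMALL
    if enrollment_2013 <= 20_000:
        return SizeSegment.MEDIUM
    return SizeSegment.LARGE


if __name__ == "__main__":
    # Hypothetical enrollment figures, used only to exercise each branch.
    for enrollment in (7_500, 17_000, 35_000):
        print(enrollment, size_segment(enrollment).value)
```

Keeping the thresholds in one function makes it straightforward to adjust the segment boundaries later without touching any code that consumes the segments.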
Key Partnerships
Several key partnerships will be established for this business venture. A
relationship management system will be utilized to ensure these partnerships are
well maintained and that regular communication is established with the partner.
The partnerships include: Georgetown University (specifically UIS and the Library),
IBM’s WCG, and Amazon Cloud Services. Georgetown University, where we will
pilot our BOINC deployment framework, will be a critical relationship as we
continue to monitor the success of the program following its implementation. The
World Community Grid and IBM, as the host organizations for the network, will be key
sources of information, particularly from a technical perspective. Finally, as
indicated in our business plan, hardware referrals as part of a BOINC deployment
offer opportunities for incremental revenue. Thus, maintaining a relationship with
Amazon Cloud Services, a major provider of hardware, is essential. Supercomputing
Delivered hopes to establish future partnerships with other hardware/service
providers and other universities and nonprofits as we grow our business.
Channels/Marketing
Established channels will be used to deliver services to University customers
and to drive awareness of our services both at the customer (University) level, and
within communities. We will promote awareness via marketing campaigns on
Georgetown and other University intranet sites. We will immediately establish a
website and social media presence to ensure prospective customers have access to
all available information. Our team will utilize client engagement frameworks,
knowledge repositories, and other leading class tools that help us deliver advisory
services to our clients.
Revenue Streams
Revenue streams refer to the ways in which we will acquire capital and
monetary assets. The three identified revenue streams in our initial business plan
are aligned with, but not directly correlated to, our key activities. The first is
our professional advisory services, in which we market and manage
the implementation of BOINC at a college or university, assess the current BOINC
landscape, and help institutions maximize the value that they obtain from their
participation in the WCG. These fee-based services will be billed to customers
either on a time-plus-cost basis or for a fixed price. The second source of revenue from the
business venture is government and foundation grants. As a non-profit primarily
marketing services to institutions of higher education, our organization will be
eligible for funding sources obtained through federal, state, and local grant
programs or corporate philanthropic initiatives. Finally, our third source of revenue
will be from sales of “Lattice” project-level dedicated implementations.
Key Resources
A variety of resources will need to be leveraged as we seek to implement our
business plan. We have divided them into three major groupings: physical, capital, and
intellectual. Each of these types of resources is a critical enabler that we will use to
market or deliver services to our potential clients, and each is an essential component of
the business plan. Physical resources refer to tangible things that we will need, such as
computers, business supplies, software, and even facilities. These are things
that absolutely must be present in order for our business plan to succeed. Capital
resources refer to monetary assets obtained from our revenue sources. These are
used to maintain the business, acquire physical assets, pay salaries, and invest in
new services to provide to clients. Finally, and likely most important, are our
intellectual resources. These refer to our BOINC phased deployment methodology
outlined above, working knowledge, insights, experience, and the “know how”
required to deliver services across our three key activities, or service offerings. As
our value proposition relies on the knowledge gained from successful prior
deployments of BOINC, particularly the one at Georgetown, we will leverage a
robust knowledge management system. This includes internal knowledge tracking
principles, frameworks, and tools and technology that will allow us to easily access
the information we need as we provide services to customers.
Key Activities
The key activities we will undertake refer to the specific services that we
provide to potential customers, organized by the particular value that
each service provides the client. First, and likely our core
service offering, is the establishment of a BOINC program at the institution. This is a
managed implementation of the software and structure needed to participate in the
WCG network. We first determine whether the organization has a BOINC program or
existing participants, and then provide a cost analysis and governance framework for the
customer. The final outputs of the managed implementation are a formal adoption
plan and the deployment of the program.
Our second major activity or service is the evaluation of the current state of
existing BOINC programs. Our subject matter knowledge and key partnerships will
enable us to assess the customer’s current program, evaluate the potential capacity,
and also develop a business case for deployment at organizations with cost
restrictions or potential program resistance. The outputs include a formalized
current state analysis report that provides a comprehensive overview of the
institution’s program, and a specific performance improvement plan that identifies
proven ways that organizations can drive participation in the program, and engage
and develop leadership buy-in.
Finally, the last major activity in our service portfolio involves helping
institutions with existing programs maximize the value of their program. This
involves analyzing the current research needs for large scale analytic capabilities
and helping researchers at that institution access the WCG. The output of this
major activity is a Value Extraction Toolkit that highlights specific actions the
institution can take to further benefit from its program.
Cost Structure/Financials
The cost structure below shows the main costs associated with
running BOINC. There are three types of BOINC projects a client may decide to run.
The first and most common type is running BOINC software for a pre-existing
project. The other two types are not as common, and our team
anticipates that clients will rarely request our services in setting up their own large
BOINC project (a project with over 5,000 hosts).
Cost Component | Services for Running Current BOINC Project | Smaller BOINC Project for Implementation (< 5,000 hosts) | Bigger BOINC Project for Implementation (5,000+ hosts)
Salaries | $1,000 | $5,000 | $10,000
Network | Covered through University | Covered through University | $2,000 for 100 Mbit
Hardware | $0-$10,000 | $4,000 | $18,000 for Server; $25,000 for AC
Total (Startup) | $0-$10,000 | $4,000 | $43,000
Total (Monthly) | $1,000 | $5,000 | $12,000
(Kondo, D).
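The totals in the cost structure can be reproduced from the individual components. The sketch below assumes that salaries and network are recurring monthly costs and that hardware is a one-time startup cost. That reading is consistent with the stated totals but is not spelled out in the chart. The `ProjectCosts` class and the first-year figure are illustrative additions, not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProjectCosts:
    """Cost components for one project type, in US dollars."""

    monthly_salaries: int
    monthly_network: int   # 0 where the network is covered through the university
    startup_hardware: int  # one-time purchases such as a server or air conditioning

    def startup_total(self) -> int:
        # Assumption: hardware is the only one-time cost.
        return self.startup_hardware

    def monthly_total(self) -> int:
        # Assumption: salaries and network recur every month.
        return self.monthly_salaries + self.monthly_network

    def first_year_total(self) -> int:
        # Illustrative extension: startup cost plus twelve months of operation.
        return self.startup_total() + 12 * self.monthly_total()


# Values taken from the smaller and bigger implementation columns.
SMALL_PROJECT = ProjectCosts(monthly_salaries=5_000, monthly_network=0,
                             startup_hardware=4_000)
LARGE_PROJECT = ProjectCosts(monthly_salaries=10_000, monthly_network=2_000,
                             startup_hardware=18_000 + 25_000)

if __name__ == "__main__":
    for name, costs in (("small", SMALL_PROJECT), ("large", LARGE_PROJECT)):
        print(name, costs.startup_total(), costs.monthly_total(),
              costs.first_year_total())
```

The column for running a current project is left out because its hardware cost is given as a range ($0 to $10,000) rather than a single value.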
Supercomputing Delivered anticipates that its revenue will come from four
different sources. Since this is a new venture with no basis for comparison, we had to
work from many assumptions and estimates. As we prosper and grow, our financials
will continue to change. We estimated that 60% of our revenue will come from our
consulting services and 30% from initial seed money and grants. As stated previously,
government entities such as the NSF have given BOINC projects grants; specifically, the NSF
gave a $998,862 grant for the Einstein@Home Project, sponsored by the University
of Wisconsin-Milwaukee (Research Areas). Additionally, NASA provided funding
for the Orbit@Home project (Graham, Richard M). Companies also provide
researchers and students with grants for computing projects. On September
17, 2012, Nvidia announced that it was selecting ten PhD students globally who were
participating in GPU-based research (inclusive of
BOINC) to receive $25,000 grants each (Attention Geniuses).
We believe that, as a 501(c)(3) organization, we could qualify for grants as a source
of capital. By referring our clients to Amazon Cloud Services as a hardware provider
for BOINC, we hope to generate referral fees from Amazon as we grow as a
business (an estimated 10% of our revenue stream). A major component of
our services to our clients is the ability to optimize their BOINC project results and
display the data in meaningful ways. This monthly optimization of data results
accounts for the remaining 10% of our revenue stream.
The estimated payback period (given our assumed data) is one year, and our Return
on Investment (ROI) is 29%. Based on this data and our assumptions, our venture is
financially sound. Although the profits we project for the first three years are
minimal, they are stable. As a social good, 501(c)(3) service provider, we are not
seeking extravagant profits, especially not upfront. We came to these numbers by
assuming that in year 1, we assist one university in maintaining a BOINC project, and
also assist one university in setting up a small BOINC project. In year 2, we assume
two BOINC sustainment projects and one BOINC implementation
project. In year 3, we assume three sustainment projects
and one small implementation project. We used low estimates of the amount of
work we believe we could generate, so that our cash flow analysis would not be
overestimated. Because we provide staff and consulting services, our cash outflow is the
cost of labor required for each type of project. For implementation projects, labor is
more expensive because they require hired programmers, not just our executive
consulting team (Supercomputing Delivered).
The chart below demonstrates the payback period analysis.
Year | 0 | 1 | 2 | 3
Cash Inflows (Seed Funding/Grant/Revenue) | | $100,000 | $125,000 | $200,000
Cash Outflows | | $72,000 | $84,000 | $144,000
Net Cash Flow | | $28,000 | $41,000 | $56,000
Year-over-Year Change in Net Cash Flow | | | 46.43% | 36.59%
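The figures in the payback chart can be cross-checked with a few lines of arithmetic. The sketch below assumes that the three values in each row belong to years 1 through 3, that the third row is inflows minus outflows, and that the percentages are the year-over-year change in that net figure. The text does not state how the 29% ROI was derived. One reading that reproduces it is cumulative net cash flow divided by cumulative inflows, so that formula is included here as an assumption. The function names are illustrative.

```python
# Cash-flow figures for years 1-3, taken from the payback chart (US dollars).
INFLOWS = [100_000, 125_000, 200_000]
OUTFLOWS = [72_000, 84_000, 144_000]


def net_cash_flows(inflows, outflows):
    """Net cash flow per year: inflow minus outflow."""
    return [i - o for i, o in zip(inflows, outflows)]


def year_over_year_change(values):
    """Fractional change from each year to the next."""
    return [(later - earlier) / earlier
            for earlier, later in zip(values, values[1:])]


def cumulative_net_margin(inflows, outflows):
    """Cumulative net cash flow as a fraction of cumulative inflows.

    Assumption: this is one formula that yields roughly 29%; the source
    does not say which formula it used for ROI.
    """
    total_in = sum(inflows)
    return (total_in - sum(outflows)) / total_in


if __name__ == "__main__":
    net = net_cash_flows(INFLOWS, OUTFLOWS)
    print("Net cash flow:", net)
    print("Change:", [f"{c:.2%}" for c in year_over_year_change(net)])
    print(f"Cumulative net margin: {cumulative_net_margin(INFLOWS, OUTFLOWS):.0%}")
```

Dividing the same cumulative net figure by cumulative outflows instead would give a noticeably higher percentage, so the choice of denominator matters when comparing this ROI with figures from other sources.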
Management and Key Operations
The management of the business venture refers to the methods and
organizational structure that we will employ as we initiate business operations
and execute our strategy. We will initially charter our enterprise as a non-profit
501(c)(3) organization. Our Board of Directors will initially comprise
the three team members, each with a unique role in running or developing an
aspect of the business. Tyler Gray, as the CEO, will have ultimate authority over the
venture and will be the face of the company. He will serve as the project manager of each
deployment. Kyle Facada, as the Chief Technology Officer, will own service delivery,
develop the catalogue of service offerings, and serve as the lead technology
advisor. Annabel Berman, as the Chief Marketing Officer, will focus on building our
customer base, managing key relationships, identifying new opportunities, and
marketing our services via our various channels. She will also own all
administrative and business functions for the firm.
Operationally, we plan to initially deploy our business plan with no salaried
staff as we build our customer base and work to validate our model within the first
year of operations. Following our first year, we will develop and manage to a staff
acquisition plan. This plan ultimately calls for a dedicated team of deployment
consultants, each one aligned with a particular customer segment and providing
customized, personalized service to each of our clients. Lastly, our IT Support
offering will leverage the BOINC Leaderboard initially, until we are able to provide
this service in-house.
Further Considerations
Supercomputing Delivered researched an array of databases to identify any
potential legal or policy issues that could affect the implementation of BOINC. Based
on our research, we found no legal issues arising from BOINC or
WCG, or from individual BOINC projects.
The databases used for this research include Westlaw Business, the EBSCOhost eBook
Collection, and the Georgetown Law State Secrets Archive (Business Law Research,
EBSCOHOST, Georgetown Law State Secrets Archive). The WCG does not have any
policies in place that would restrict our consulting services or our use of its software
in our business pursuit.
Conclusions and Further Study
In reviewing both the history and potential of distributed computing with
BOINC, it is important to remember that we are still in relatively new and uncharted
territory. However, one aspect of this project that certainly isn’t new is the
concept of utilizing existing tools in novel ways to accomplish a seemingly
impossible task. Since BOINC expanded from SETI@Home to include projects in
other disciplines, the sheer volume of projects and results that have been generated
is nothing short of astounding.
On November 3, 2014, Dr. David J. Foran published an update on the World
Community Grid website commemorating the 10-year anniversary of the WCG
project with a series of updates that put the power and importance of grid
computing with BOINC / WCG into perspective. He noted that when a
patient is diagnosed with cancer, the doctor’s analysis is currently the primary
method of determining how aggressively the patient is treated. However, new
precision medicine techniques such as “microarray analysis” are enabling doctors to
analyze large batches of tissue samples simultaneously, to better identify patterns
and unique cancer signatures. While microarray analysis shows great promise, if
pathologists were able to utilize digital pattern recognition algorithms, it would be
“possible for researchers to determine a patient's type and stage of cancer more
precisely, meaning they can prescribe therapies or combinations of treatments that
are most likely to be effective” (Foran, 2014).
To study this approach, Dr. Foran utilized the WCG in 2006 and since then:
“More than 200,000 World Community Grid volunteers from around the globe who donated over 2,900 years of their computing time, we were able to study over 100,000 patient tissue samples to search for cancer signatures. Access to this vast computing power enabled our team to rapidly conduct this research under a much wider range of environmental conditions and to perform specimen analysis at much greater degrees of sensitivity” (Foran, 2014).
Other BOINC projects such as MalariaControl.Net, Rosetta@Home and dozens
more have made even more impressive contributions to their respective disciplines:
• FightAIDS@Home: On November 4, 2014, FightAIDS@Home announced that it had validated the computational analysis done by FightAIDS@Home users, and that this analysis is “providing important confirmation of our methodology and the value of your computational results,” which has increased the team’s understanding of how to disrupt HIV (FightAIDS@Home Research Team).
• The ClimatePrediction.Net project has led to the publication of 14 original pieces of research on topics relating to better weather prediction and the effects of greenhouse gases.
• The Einstein@Home project has discovered more than three dozen new neutron stars (Allen).
• The GPUGrid project has been able to conduct research into cancer, HIV, neural disorders like schizophrenia and find answers to questions “within an atomic level of accuracy (that) requires performing molecular dynamics simulations at the limit of computational power. GPUGRID technology allows us to successfully tackle this problem” (GPUGRID, 2012).
Given the strong track record of BOINC / WCG and the fact that Georgetown
University’s mission is a call to service for all students, faculty and staff to work
together in solving the world’s most pressing problems, “including poverty, disease
and conflict,” we’re thrilled to be able to use this project to contribute to
Georgetown as an institution and to society at large.
Appendix
I. Technical Architecture - Advanced “LATTICE” Implementation (Bazinet).
II. Reward Badges Available for WCG Participation / Individual
Recognition
IV. Interview: Jennifer Smith, Georgetown Library (LAU) Coordinator of Communications
Date: October 21, 2014, 12:00 pm. Teleconference.
Jennifer Smith: Library Coordinator of Communications
Q1 Supercomputing Delivered:
- What prompted LAU Library to implement BOINC?
A1 Jennifer:
- It was implemented 18 months ago. Aaron Williams handled most of the implementation, which started before it came up on IdeaScale.
- The IdeaScale topic came up last November.
- Not very knowledgeable of its current use; believe it has remained consistent since its inception.
Q2 Supercomputing Delivered:
- Have there been security or performance issues on Library systems?
A2 Jennifer:
- We have not had any performance or security concerns with the program. BOINC uses approximately 200MiB of memory at all times that the computer is running. Since the vast majority of our computers have 4GiB or more memory, this hasn't made any noticeable impact on performance. We've also configured BOINC so that it only runs computations (which can slow down the computer significantly) when the computer is not in use by a user.
Q3 Supercomputing Delivered:
- Does it surprise you that more buildings / departments are not participating?
A3 Jennifer:
- From staff perspective, not one of our core competencies, not enough time.
- Fantastic IT dept in library, able to make happen, sure other depts would love to if they had the resources.
- Would have to be implemented by UIS, and there are competing priorities; does not make money.
- We (the library) were able to do it because we have our own IT department.
Q4 Supercomputing Delivered:
- What have the costs been like to run BOINC for the library?
A4 Jennifer:
- The only costs the Library has incurred were staff time installing and setting up the program. Presumably there is a slight increase in our electricity usage, but we have not been able to ascertain the amount.
Q5 Supercomputing Delivered:
- Has interest in BOINC expanded since initiation?
A5 Jennifer:
- We have not actively been recruiting.
- We ran a news story on the library website last Feb to raise awareness.
- I believe that if there was an effort by University administration, BOINC would be more widely adopted.
V. Interview: Arnie Miles, Georgetown University Information Services
(UIS) & Adjunct Assistant Professor of Computer Science
Date: October 14, 2014, 1:00 pm. Teleconference.
Q1 Supercomputing Delivered:
- How did you first hear about / get involved with BOINC?
A1 Arnie:
- I built HPC clusters for GU, at first HPC convention for IBM. I had been doing
the predecessor to WCG.
- I started assigning extra credit assignments for the students of my class to
become involved in WCG. Most of my students still have their accounts.
Everything from my group (Georgetown University Faculty, Staff and Students
group) was left open for everyone to join; the library joined.
Q2 Supercomputing Delivered:
- Are you aware of any prior efforts to implement BOINC at GU?
A2 Arnie:
- I had tried to get the Condor project on campus, in the lab or on faculty machines.
However, I couldn’t get any traction. It would require a lot of busy work and
it is not likely to get kudos. There are not a lot of GU researchers; BOINC, on
the other hand, is an internationally known project that has legs. Our
participation may help researchers in general, and we can point to a project
Georgetown is helping with that has international impact.
- Took the idea to have BOINC rolled out throughout GU computers that are
controlled by UIS as a proposal to Deputy CIO Judd Nicholson. I am waiting
to get on his calendar to talk to him about the implementation approach.
Q3 Supercomputing Delivered:
- Have you had any security or performance concerns?
A3 Arnie:
- It is run by IBM; there is very little impact and low security impact (Security
office stated). There are not really any performance concerns; a couple of
students have complained that it slows their machine down. There have “only been two or three”
anecdotal complaints (cannot prove it is linked to the software by any
means).
Q4 Supercomputing Delivered:
- If this was implemented, do you think it would help GU become better known
for technology?
A4 Arnie:
- WCG would serve as a reputation enhancement for the university; less so for UIS, except as the body that did the work for the university, and less so for the CIS department.
- If one or more researchers from the medical school [became involved] (Georgetown is the 60th largest provider), that prestige would buy grants, and then even more prestige would come as a result.
- With popular projects, we could use the project to increase advertising within the Georgetown community; then people may propose new projects.
- Another level of success [future goal to set]: self-install of WCG on the UIS platform (so any faculty member or student could run it on their own personal machine).
VI. Email / Interview: Arnie Miles (UIS), Revisiting Expanding BOINC at GU
Date: October 23, 2014, 11:10 am. Email.
VII. Kick-Off Meeting / Interview: Arnie Miles (UIS), Judd Nicholson (Deputy CIO), BOINC Georgetown Campus Deployment Kick-Off Meeting
Date: October 31, 2014, 2:00 pm. Teleconference.
Attendees: Arnie Miles (AM), Judd Nicholson (JN), Amy Bruno (AB), Annabel Berman, Tyler Gray, Kyle Facada (Students)
Meeting Minutes:
AM + JN: We’re on the same page regarding implementation of BOINC via WCG. Per JN, it is
“easier to sell (a new project) based on the social good aspect vs. its technical
construction.”
Prior Conversation AM / JN: Moving forward at multiple levels (UIS, all machines,
labs/classrooms), all under a single username (similar to how Lauinger Library is a
user in the Georgetown University BOINC / WCG group).
Next Step: Deploy to UIS desktop PCs, then set up a UIS portal through which GU
faculty + staff can be invited to download a GU-specific customization of BOINC / WCG.
Formal Announcement: JN to communicate with GU Communications Director
Laura (comms) to send out a formal announcement via email and JN’s blog.
Progressing Forward (JN): “This is a great effort for GU to contribute to scientific
research as an institution.” We will proceed with a phased rollout through UIS, then
greater GU.
JN Question: Is there a way for feedback?
AM / TG: Yes, via the public WCG leaderboard, which tracks progress by team and
individual contributions.
AM administers the current GU Faculty / Staff Group. He will report on
individual and group results to differentiate and recognize contributions from
students, faculty, and GU departments.
JN: I believe that as people volunteer, others will also then see the value. We need
periodic milestones (such as the raw number of volunteers and changes to that number,
hours committed, and work completed). Next we will brief GU Chief Information
Officer Lisa Davis.
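The periodic milestones JN describes could be reported as simple differences between two snapshots of the group's statistics. The following is a minimal sketch only: the `Snapshot` fields, the `milestone_report` helper, and all figures are hypothetical placeholders, not actual GU or WCG data.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Group statistics at one point in time (all fields are illustrative)."""
    label: str
    volunteers: int
    hours_committed: float
    results_returned: int


def milestone_report(previous: Snapshot, current: Snapshot) -> dict:
    """Return the raw volunteer count plus the change in each tracked figure."""
    return {
        "period": f"{previous.label} to {current.label}",
        "volunteers": current.volunteers,
        "volunteer_change": current.volunteers - previous.volunteers,
        "hours_change": current.hours_committed - previous.hours_committed,
        "results_change": current.results_returned - previous.results_returned,
    }


# Placeholder numbers for illustration only; not real GU statistics.
october = Snapshot("2014-10", volunteers=40, hours_committed=1200.0, results_returned=3500)
november = Snapshot("2014-11", volunteers=55, hours_committed=1900.0, results_returned=5200)

report = milestone_report(october, november)
for key, value in report.items():
    print(f"{key}: {value}")
```

In practice the snapshot values would be copied from the group's public statistics page at each reporting interval.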
AM + Students Action Items: Set up a meeting with GU Director of Research
Technologies Steve Moore to find out whether any GU students / faculty in the GU
research community would be interested in running a project (IBM).
JN GOAL: Currently, GU is the 60th largest group; aim for top 10 by end of
2015.
Who Does What / Next Steps:
JN: AB will organize follow-up meetings to scope the project / milestones /
deliverables and to bring in new information from follow-up meetings with Steve Moore
(to involve GU researchers) and Tony Cipriano, Senior Technical Manager at GU (for
implementation guidance).
Implementation of BOINC / WCG on GU classroom labs: Tony Cipriano is
responsible for coordinating. Per JN, Students to draft an email for a follow-up meeting to
brief Tony + address any questions or concerns. Per AM, PCs are marginally easier
than OS X (Apple) systems to set up and keep running.
VIII. Interview: Michael Cummings & Adam Bazinet; Pioneers for UMD - The Lattice Project
Date: October 13, 2014, 3:30 pm. Teleconference.
Q1 Supercomputing Delivered:
- What are the advantages you have seen regarding the use and deployment of
BOINC at UMD and in general; why did your team use BOINC software?
A1 UMD:
- BOINC has an advantage because it is easy to connect to any other BOINC
project that exists.
- BOINC offers a lot of options, and you know the resources BOINC uses: you can
set the disk space and CPU, and throttle back as needed. These are shared
resources. It can be set to run while the computer is in use, and is very flexible.
- UMD uses a hybrid of BOINC and CONDOR for our project. If you want local
users, CONDOR is the best approach. If you have big projects, BOINC is a
good approach. If you go with BOINC, established projects benefit the most. The
ideal BOINC user is someone with a big project and a lot of jobs to run. Lattice
runs multiple applications. If researcher X has to do something small, it would
not be worth rolling out a BOINC project. CONDOR can run arbitrary code.
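The resource limits mentioned above (disk space, CPU, throttling, running while the computer is in use) are typically set per machine through the BOINC client's local preferences override file. The sketch below generates such a file under stated assumptions: the element names follow the BOINC client's `global_prefs_override.xml` format as we understand it and should be checked against the BOINC client documentation, the `build_prefs_override` helper is our own, and the values shown are illustrative rather than settings used at UMD or GU.

```python
import xml.etree.ElementTree as ET


def build_prefs_override(cores_pct, cpu_time_pct, disk_gb, run_if_user_active):
    """Return the text of a BOINC preferences override file (a sketch).

    Element names are assumed from the BOINC client's
    global_prefs_override.xml format; verify before deploying.
    """
    root = ET.Element("global_preferences")
    # Share of processor cores BOINC may use.
    ET.SubElement(root, "max_ncpus_pct").text = f"{cores_pct:.1f}"
    # Share of CPU time BOINC may use (the "throttle").
    ET.SubElement(root, "cpu_usage_limit").text = f"{cpu_time_pct:.1f}"
    # Upper bound on disk space, in gigabytes.
    ET.SubElement(root, "disk_max_used_gb").text = f"{disk_gb:.1f}"
    # 1 = keep computing while someone is using the machine, 0 = pause.
    ET.SubElement(root, "run_if_user_active").text = "1" if run_if_user_active else "0"
    if hasattr(ET, "indent"):  # pretty-printing is only available on Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Illustrative values only: half the cores, 60% CPU time, 5 GB of disk.
xml_text = build_prefs_override(50, 60, 5, run_if_user_active=True)
print(xml_text)
```

Generating the file from a script, rather than editing it by hand, would let an administrator push the same conservative limits to every lab machine.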
Q2 Supercomputing Delivered:
- How difficult is it to set up/implement BOINC?
A2 UMD:
- BOINC is pretty straightforward; CONDOR can be a little more involved to
implement. We have helped people set up at other institutions before.
Q3 Supercomputing Delivered:
- Could this be offered as a fee-based service, or implemented via the CONDOR route?
A3 UMD:
- Over roughly the last 20 years, ideas along those lines have been explored. A variety of
businesses have come and gone. Some brokers have formed; others have
acted as consultants for businesses and institutions. None of them have been
viable. The dominant paradigm is cloud computing; people are most likely going to go to
Amazon.
Q4 Supercomputing Delivered:
- What are the estimated costs to maintain and run this internally, hardware- or
energy-wise?
A4 UMD:
- There are: 1) energy costs: the alternative is buying time on dedicated resources
or buying a big computer cluster; it is much cheaper to do distributed desktop
computing, and energy costs are trivial or non-existent; 2) costs for labor and setup:
BOINC is low cost to set up for an existing project; setting up a new project can
be costly.
VIII. Draft Email for Tony Cipriano, Written on Request of Georgetown University CTO Judd Nicholson:
I wanted to touch base with you regarding an exciting initiative that UIS Faculty member Arnie Miles is spearheading with three SCS students (Tyler Gray, Annabel Berman and Kyle Facada) completing their capstone project for a Master’s in Technology Management.
As you may be aware, there is a small but active group of staff and students here at GU who are utilizing both university-owned and personal computers to ‘donate’ their unused computer / processor time to the “World Community Grid” in order to help advance scientific and medical research across a variety of areas.
As UIS has vetted the World Community Grid (WCG) / BOINC software program to run on GU-owned systems, we would like to expand the scope of participation in a phased rollout, beginning with UIS-operated systems (PC and Mac) and eventually expanding to all shared systems in classrooms / open spaces. We would also like to offer all GU staff / students the opportunity to opt in to using their own systems to contribute to the overall standing of the “Georgetown University Faculty and Staff” group via a to-be-created UIS portal.
Our goals for this initiative are to:
● Contribute to GU’s research mandate for the public good by utilizing existing resources for scientific and medical research.
● Increase overall participation, to improve GU’s ranking from #60 to the top ten.
● Track and report on GU’s present and future contributions to WCG projects via GU’s website + social media channels.
In the longer term, we hope to be able to offer GU researchers the ability to run their own big-data analytics projects on this system.
In the coming weeks Arnie will be coordinating a series of meetings to get the ball rolling on this, and we would of course appreciate your guidance and support as we continue to plan and execute this initiative! Please feel free to reach out to Arnie and the SCS students directly for more background on their work to date.
IX. Email: Notification from Arnie Miles that BOINC / WCG Rollout Has Started
Works Cited
Allen, B. (n.d.). About Einstein@Home. Retrieved March 01, 2013, from Einstein @ Home: http://einstein.phys.uwm.edu/
Anderson, D. D. (2003). Public Computing: Reconnecting People to Science. University of California-Berkeley, Space Sciences Laboratory.
ATTENTION GENIUSES: NVIDIA TO GIVE $25,000 GRAD STUDENT GRANTS. (n.d.). Retrieved November 10, 2014, from http://blogs.nvidia.com/blog/2012/09/17/attention-geniuses-nvidia-to-give-25000-grad-student-grants/
Berkeley Open Infrastructure for Network Computing. (2009). BOINC: Security Issues. Retrieved November 1, 2014, from http://boinc.berkeley.edu/trac/wiki/SecurityIssues
Berkeley Open Infrastructure for Network Computing. (2012, 07 17). Detailed Stats - Fight Malaria @Home. Retrieved November 1, 2014, from BOINC Stats: http://boincstats.com/en/stats/136/project/detail
Bazinet, A. (2009, January 1). THE LATTICE PROJECT: A MULTI-MODEL GRID COMPUTING SYSTEM. Retrieved September 30, 2014, from http://drum.lib.umd.edu/bitstream/1903/9892/1/Bazinet_umd_0117N_10846.pdf
BOINCstats/BAM! | BOINC combined - team stats - Georgetown University. (n.d.). Retrieved October 8, 2014, from http://boincstats.com/en/stats/-1/team/detail/2165/overview
BOINCstats/BAM! | BOINC combined - team stats - Georgetown University Faculty Staff and Students. (n.d.). Retrieved October 8, 2014, from http://boincstats.com/en/stats/-1/team/detail/60725
BOINCstats/BAM! | Search. (n.d.). Retrieved October 12, 2014, from http://boincstats.com/en/stats/search/#georgetown
BOINC at Georgetown: Library Coordinator of Communications [Interview]. (n.d.).
BOINC at Georgetown: UIS [Personal interview]. (n.d.).
BOINC Georgetown Campus Deployment: Kickoff Meeting [Interview]. (n.d.).
BOINC Reviews. (2014, January 1). Retrieved November 10, 2014, from https://play.google.com/store/apps/details?id=edu.berkeley.boinc&hl=en
Business Law Research (n.d.). Retrieved November 8, 2014, from http://business.westlaw.com.proxy.library.georgetown.edu/Welcome/WLBSecurities/default.wl?RS=IMP1.0&VR=2.0&SP=003381889-7000&FN=_top&MT=WLBSecurities&SV=Full
Clery, Daniel (2005). "IBM Offers Free Number Crunching for Humanitarian Research Projects". Science 308 (5723): 773a. doi:10.1126/science.308.5723.773a. Retrieved 24 November 2008.
Crouch, M. a. (2012, September 25). IdeaScale Realizing Potential. Retrieved November 1, 2014, from The Hoya: http://www.thehoya.com/opinion/ideascale-realizing-potential-1.2910488#.UY5VsiuAdaQ
EBSCOHOST (n.d.). Retrieved November 8, 2014, from http://web.b.ebscohost.com.proxy.library.georgetown.edu/ehost/search/advanced?sid=8d1a600f-5db7-4926-88f1-0a177ff1d0d6@sessionmgr113&vid=0&hid=110
FightAIDS@Home Research Team (Ed.). (2014, November 6). Teamwork yields experimental support for FightAIDS@Home calculations. Retrieved November 8, 2014, from http://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=396
Foran, D. (2014, November 3). Decade of discovery: New precision tools to diagnose and treat cancer. Retrieved November 10, 2014, from http://www.worldcommunitygrid.org/about_us/viewNewsArticle.do?articleId=395
Georgetown University. (n.d.). Jesuit & Catholic Identity. Retrieved November 1, 2014, from About Georgetown University: http://www.georgetown.edu/about/jesuit-and-catholic-heritage/index.html
Georgetown Ideas. (n.d.). Retrieved November 5, 2014, from https://roundtables.georgetown.edu/a/dtd/Utilize-Idle-Gerogetown-Computers-to-Cure-HIV-and-Malaria/455022-17478.
Georgetown Law State Secrets Archives (n.d.). Retrieved November 8, 2014, from http://apps.law.georgetown.edu/state-secrets-archive/
Graham Richard, M. (n.d.). Orbit@Home Funded by NASA. Retrieved November 10, 2014, from http://michaelgr.com/2007/07/24/orbithome-funded-by-nasa/
GPUGRID. (2012, Nov 26). Stimulating the maturation of HIV protease. Retrieved November 7, 2014, from GPUGRID: http://www.gpugrid.net/science.php?topic=hiv
National Institutes of Health. An Overview of the Human Genome Project. (2012, November 8). Retrieved November 10, 2014, from http://www.genome.gov/12011238
Korpela, E. A. (2011). Status of the UC-Berkeley SETI Efforts. University of California, Berkeley.
Kansas State University. Interview: Michael Cummings & Adam Bazinet; Pioneers for UMD - The Lattice Project [Interview]. (n.d.).
Kondo, D. (n.d.). Cost-Benefit Analysis of Cloud Computing versus Desktop Grids. Retrieved October 4, 2014, from http://mescal.imag.fr/membres/derrick.kondo/pubs/kondo_hcw09.pdf
O'Brien, D. A. (n.d.). Crowd-sourcing antimalarial drug discovery. Retrieved November 2, 2014, from Fight Malaria @ Home: http://www.fight-malaria.org/index.php?option=com_content&view=article&id=1&Itemid=26
O'Brien, D. A. (2012, October 3). Fight Malaria @ Home - Interview with Dr. Ant Chubb and Kevin O'Brien. Retrieved November 2, 2014, from UCD Complex & Adaptive Systems Laboratory: http://www.ucd.ie/casl/newsmedia/caslmedia/title,135839,en.html
Revisiting Expanding BOINC at GU [Interview]. (n.d.).
Research Areas. (n.d.). Retrieved November 1, 2014, from http://www.nsf.gov/awardsearch/showAward?AWD_ID=0555655.
Robertson, G. (2009, July 20). How powerful was the Apollo 11 computer? Retrieved November 2, 2014, from HuffPost Tech: http://downloadsquad.switched.com/2009/07/20/how-powerful-was-the-apollo-11-computer/
World Health Organization. (2014, March). Malaria Fact Sheet. Retrieved November 03, 2014, from http://www.who.int/mediacentre/factsheets/fs094/en/index.html