Technology for Distributed Collaboration

25
Technology for Distributed Collaboration Ian Foster Computation Institute Argonne National Laboratory University of Chicago

description

Technology for Distributed Collaboration. Ian Foster Computation Institute Argonne National Laboratory University of Chicago. www.ci.uchicago.edu. www.ci.anl.gov. In the Next 50 Years, We Must …. Increase energy production by 5, while reducing GHG emissions by 2 or more - PowerPoint PPT Presentation

Transcript of Technology for Distributed Collaboration

Technology forDistributed Collaboration

Ian FosterComputation Institute

Argonne National LaboratoryUniversity of Chicago

2

www.ci.uchicago.edu www.ci.anl.gov

3

In the Next 50 Years, We Must …

Increase energy production by 5, while reducing GHG emissions by 2 or more

Mitigate and adapt to climate change

Address increasingly drug resistant diseases

Provide meaningful livelihoods for 9B people

Innovation

4

We Must Get Smarter …

Maxwell Smart (NBC, 1960s; Warner 2008)

5

The Three Dimensions of Smart

Biology

Technology

Culture

6

Problem Solvingas “Thinking Aloud”

“What if I try A?”

“I wonder how I do B?”

“What do others know about C?”

“Hey, I’ve just learned how to do D!”

How do I reduce cycle time?

7

Thinking Aloud:Reducing Cycle Time

“What if I try A?”

Design, modeling, fabrication tools “I wonder how I do B?”

Wikis, design databases, conversation “What do others know about C?”

Databases, search tools, conversation “Hey, I’ve just learned how to do D!”

Publication, conversation, education

(Distributed) collaboration is a crosscutting theme

8

Technologies forDistributed Collaboration

Conversation Post Fedex Telephone Email, IRC, … Instant messaging Videoconference

Immersive MUDs Access Grid Second Life

Data publication FTP, Gopher, … Web Blogs Semantic Web

Federation Collaborative

bookmarking Grid computing Service-oriented

architecture

9

100

100

200

300

400

500

600

11/1

/04

12/1

/04

1/1/

05

2/1/

05

3/1/

05

4/1/

05

5/1/

05

6/1/

05

7/1/

05

8/1/

05

9/1/

05

10/1

/05

11/1

/05

12/1

/05

1/1/

06

2/1/

06

3/1/

06

4/1/

06

5/1/

06

6/1/

06

7/1/

06

8/1/

06

9/1/

06

10/1

/06

GB

/day

Daily 7-Day Average

Provides access to all IPCC data

>150 TB data downloaded

>300 scientific papers written

GB/day

600

11

Integrating Data and Computing, on Demand

Public PUMA Knowledge Base

Information about proteins analyzed against ~2 million gene sequences

Back OfficeAnalysis on Grid

Millions of BLAST, BLOCKS, etc., on

OSG and TeraGridNatalia Maltsev et al., http://compbio.mcs.anl.gov/puma2

12

Bennett Berthenthal et al., www.sidgrid.org

Social Informatics Data Grid

13

caBIG: sharing of infrastructure, applications, and data.

NIH’sCancer Biomedical Informatics Grid

14

Medical Educationover Access Grid

Credit: Jonathan Silverstein, U.Chicago

15

Access Grid and SARS

Global Communitie

s

17

PRAGMA

New communities:

18

Community WebsiteSoftware Downloads, User-contributed Content, Hardware Reference, & More

19

Globus Downloads Last 24 Hours

Last month

20

Lessons Learned

The power of diversity & scale Open Science Grid: 80 sites, 30K CPUs World Community Grid: 700,000 CPUs Access Grid: several thousand nodes Wikipedia, Flickr, CiteULike, Connotea, …

The challenges of heterogeneity Bandwidth, hardware, interests, trust, understanding,

meaning, timezone, … The challenges of scale

Participants, data, computing, ambition Everything is still far too complicated!

21

Access Grid:

The Power of Context

22

“Thinking Aloud” (for Science or Invention): 10 Year From Now

On-demand access to powerful data, design, analysis, & fabrication resources Service-oriented science & engineering Deep analysis of vast quantities of data Commoditization of design & analysis

Communities of 2, 20, 200, 2K, 2M can self-identify easily within a sea of billions To share information, converse, discover

We understand innovation & collaboration far better than today

23

Some Key Challenges Enable smooth scaling in many dimensions

Number of participants (K-, M-, G-persons?) Internet capabilities (0 to Tbit/sec) Physical resources Amount of data (megabytes to exabytes) Complexity of questions asked & answered Degree of trust, shared language, etc.

Integration with the physical world Active sensors Automated experimental protocols Integrate manufacturing and problem solving

24

Current Activities

Access Grid 3.0 Conversation, context, scale, ease of use Dozens of sites

Collaborative tagging for scientific data Collaborative creation of data exegesis

Resource federation in virtual organizations Grid protocols and software

Major deployments in development economics

25

We Can Contribute to aDemocratization of Science

Personalized manufacturing (FabLab)

Personalized reporting (blogosphere)

Personalized innovation (“Global Knowledge Environment”?)

“So much ingenuity my generation has, and no place to put it” — Charlie Leduff