Welcome

SCAPE: Welcome, First SCAPE Developers’ Workshop. Andy Jackson, The British Library. SCAPEdev1, AIT, Vienna, 6th – 7th June 2011.


Transcript of Welcome

Page 1: Welcome

SCAPE

Andy Jackson, The British Library

SCAPEdev1, AIT, Vienna, 6th – 7th June 2011

Welcome: First SCAPE Developers’ Workshop

Page 2: Welcome

SCAPEdev1 – The Goals

• Get to know each other.

• Get familiar with the major platforms.

• Outline and discuss the initial Preservation Component (a.k.a. tool) integration plan.


Page 3: Welcome

Getting to know me

• Andrew Jackson, at the British Library
• Technical Coordinator for SCAPE, which means…
  • Someone to go to when you get stuck
  • Someone who will look for cross-work-package confusion or integration problems
  • Someone who will propose solutions if necessary
• Chair of the Technical Coordination Committee
  • Raising and resolving technical integration issues, etc.
  • Open – let me know if you want to sit in or raise an issue
  • Will meet via Skype monthly


Page 4: Welcome

Getting to know each other

• Round-the-room
  • Let’s get it out of the way…
• And…
  • Talk together
  • Debug together
  • Eat together
• And finally, put your picture on the SCAPE wiki and/or SharePoint…
  • …if you don’t mind.


Page 5: Welcome

Where are Preservation Components used?

• SCAPE Testbeds (TB)
  • Workflows designed in Taverna run the tools.
• SCAPE Platform (PT)
  • Executing tools and workflows at scale.
• Preservation Planning & Watch (PW)
  • PLATO uses tools on sample files during planning.
• And beyond…
  • Integration into tools, repositories, institutions, command-line interfaces (CLI) & scripting, etc.


Page 6: Welcome

Testbeds & Taverna

• SCAPE Testbeds are generating scenarios and preservation workflows that explore them
• Uses Taverna to build workflows that invoke our tools
  • WSDL/SOAP calls mature
  • RESTful style maturing rapidly
  • Core integration of CLI plugin as of 2.3

• We’ll play with it this morning


Page 7: Welcome

The SCAPE Platform

• Initially, vanilla Hadoop, HDFS and HBase
  • Pure Java preferred
  • Local CLI okay too
• Later on, running Taverna workflows
  • Pure Java, local CLI
  • Or locally deployed web services if necessary
• Later on, may invoke services from inside VMs
  • e.g. when tools need a particular OS
  • Web services would make integration easier
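One way to picture the "local CLI" option is a map step that receives one file path per line and shells out to a command-line tool for each. The sketch below assumes a streaming-style contract (lines in, tab-separated records out); the slides do not say how CLI tools will actually be wired into Hadoop. `TOOL_CMD`, `map_line` and `run_mapper` are hypothetical names, and the "tool" is a stand-in that only prints a file's size.

```python
import subprocess
import sys

# Hypothetical stand-in for a preservation tool: a one-liner that prints the
# size in bytes of the file it is given. A real job would name an actual tool.
TOOL_CMD = [sys.executable, "-c",
            "import os, sys; print(os.path.getsize(sys.argv[1]))"]


def map_line(path):
    """Run the tool on one path and return a 'key<TAB>value' record."""
    proc = subprocess.run(TOOL_CMD + [path], capture_output=True, text=True)
    if proc.returncode != 0:
        # Report the failure as a record so one bad file does not stop the run.
        return f"{path}\tERROR"
    return f"{path}\t{proc.stdout.strip()}"


def run_mapper(lines, out):
    """Read one file path per line and write one record per line."""
    for line in lines:
        path = line.strip()
        if path:  # skip blank lines
            out.write(map_line(path) + "\n")
```

In a streaming-style job this would be driven as `run_mapper(sys.stdin, sys.stdout)`. Starting one process per file is costly, which is one reason a pure Java tool may be preferable where it exists.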


Page 8: Welcome

Why HBase?

• Hadoop+HDFS provides a massive, fault-tolerant file system and processing
• But HDFS does not cope well with lots of 'small' files
  • See here for information: http://www.cloudera.com/blog/2009/02/the-small-files-problem
• HBase architecture is a good fit for our use cases
• HBase is used in many places…
  • Including web archiving at SCAPE partners
• We’ll play with Hadoop and HBase this afternoon
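A rough calculation shows why many small files hurt HDFS. The sketch below assumes the commonly quoted rule of thumb that each file and each block is held as an object of about 150 bytes in the NameNode's memory. That figure is an approximation rather than a measurement, and directory objects are ignored. The function name is hypothetical.

```python
# Rule-of-thumb figure (an assumption, not a measured value): each file and
# each block is an in-memory NameNode object of roughly 150 bytes.
BYTES_PER_OBJECT = 150


def namenode_memory_bytes(num_files, blocks_per_file=1):
    """Estimate NameNode memory used by file and block objects only."""
    objects = num_files * (1 + blocks_per_file)  # one file object plus its blocks
    return objects * BYTES_PER_OBJECT


if __name__ == "__main__":
    for n in (1_000_000, 10_000_000, 100_000_000):
        gb = namenode_memory_bytes(n) / 1e9
        print(f"{n:>11,} small files -> about {gb:.1f} GB of NameNode memory")
```

Under these assumptions, ten million single-block files need about 3 GB of NameNode memory however small each file is. This is the pressure that storing many small records in HBase is meant to avoid.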


Page 9: Welcome

PLATO & Wider Integration

• PLATO Planning Tool (PW)
  • CC, PA and QA during planning process
  • Stable API web services required (WSDL or REST)
• Repository Integration
  • Local CLI for some (e.g. ePrints PLATO integration [*])
  • RESTful for others
• Web Integration
  • RESTful style preferred


Page 10: Welcome

The Challenges

• Reproducible tool invocation across all contexts:
  • CLI, Java, SOAP / REST
  • But ease of development and deployment is critical
• Interoperable data formats and consistent semantics across contexts where required
  • So clients can understand tool outputs correctly

• Tomorrow we’ll explore the proposed integration plan
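To make the two challenges concrete, here is a minimal sketch of one possible approach: describe a tool invocation once as data, and wrap every result in the same JSON envelope so any client can read it. This is not the proposed SCAPE integration plan. `ToolSpec`, `invoke`, the envelope fields and the `DEMO` tool are all hypothetical.

```python
import json
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical tool description; not a SCAPE-defined format."""
    name: str
    version: str
    args: tuple  # argument templates; "{input}" marks the input path

    def argv(self, input_path):
        # Substitute per argument so a path with spaces stays one argument.
        return [a.replace("{input}", input_path) for a in self.args]


def invoke(spec, input_path):
    """Run the tool and return its result in a fixed JSON envelope."""
    proc = subprocess.run(spec.argv(input_path), capture_output=True, text=True)
    return json.dumps({
        "tool": spec.name,
        "version": spec.version,
        "input": input_path,
        "exit_code": proc.returncode,
        "output": proc.stdout.strip(),
    }, sort_keys=True)


# Stand-in tool: upper-cases its argument. A real spec would name a real tool.
DEMO = ToolSpec("upper", "0.1",
                (sys.executable, "-c",
                 "import sys; print(sys.argv[1].upper())", "{input}"))
```

Calling `invoke(DEMO, "sample.txt")` returns a JSON string whose `output` field holds the tool's standard output. Because the spec is plain data, the same description could in principle drive a CLI wrapper, a Java caller or a web service, which is the kind of reproducibility the first bullet asks for.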


Page 11: Welcome

Questions?
