SC1 Health Workshop
Technical overview
4 Oct 2016
Platform goals
◎ Low total cost of ownership
◎ Simple to get started with Big Data
◎ Cater for widely varying use cases
◎ Embrace emerging Big Data technologies
◎ Simple integration with custom components
Key actors
Big Data is
◎ Volume
  o Quantity of data
◎ Velocity
  o Speed at which data is provided
◎ Variety
  o Different formats/models in which data is provided
◎ Veracity
  o Accuracy/truthfulness of the data
Why did we need all this?
Platform architecture
Semantic Big Data
ongoing research!
◎ Semantic Data Lake
  o from data swamp to data lake
  o query contents in the data lake
◎ SANSA stack
  o Big Data analytics on semantic graphs
Support layer
◎ Swarm UI
  o Launch, install and manage pipelines
◎ Pipeline daemon & monitor
  o Determine the order in which steps are executed
  o e.g. upload files before running computations
◎ Integrator UI
  o Present dashboards in a unified interface
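
The ordering job of the pipeline daemon can be illustrated with a topological sort. This is only a minimal sketch, not the platform's actual implementation: the step names and the `execution_order` helper are hypothetical, and it assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it must wait for.
PIPELINE = {
    "upload_files": set(),
    "run_computation": {"upload_files"},
    "publish_dashboard": {"run_computation"},
}


def execution_order(steps):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(steps).static_order())


if __name__ == "__main__":
    # Files are uploaded before the computation runs.
    print(execution_order(PIPELINE))
```

A circular dependency between steps makes `static_order()` raise `graphlib.CycleError`, which is a convenient place to reject an invalid pipeline.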
Platform architecture
Key actors
Platform installation
◎ Manual installation guide
◎ Using Docker Machine
  o On local machine (VirtualBox)
  o In the cloud (AWS, DigitalOcean, Azure)
  o Bare metal
◎ Screencast
Platform development
◎ High-level picture
  o docker-compose.yml describes the pipeline topology
◎ Common components
  o extend a template image with your code
◎ New components
  o build a Docker image for your component
  o in effect, your own little virtual machine for your component
◎ Sharing
  o publish the topology as a git repository
  o publish new components on Docker Hub
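
To make "docker-compose.yml describes the pipeline topology" concrete, the sketch below mirrors the shape of a compose file as a Python dictionary and checks that every `depends_on` entry names a declared service. The service names, image names and the `undefined_dependencies` helper are hypothetical; only the `services`, `image` and `depends_on` keys come from the Compose file format.

```python
# Hypothetical topology mirroring the shape of a docker-compose.yml file.
TOPOLOGY = {
    "services": {
        "storage": {"image": "example/storage"},
        "compute": {"image": "example/compute", "depends_on": ["storage"]},
        "dashboard": {"image": "example/dashboard", "depends_on": ["compute"]},
    }
}


def undefined_dependencies(topology):
    """Return (service, dependency) pairs whose dependency is not declared."""
    services = topology.get("services", {})
    missing = []
    for name, spec in services.items():
        for dep in spec.get("depends_on", []):
            if dep not in services:
                missing.append((name, dep))
    return missing


if __name__ == "__main__":
    # An empty list means every dependency points at a declared service.
    print(undefined_dependencies(TOPOLOGY))
```

A check like this could run before a topology is published as a git repository, so that a mistyped service name is caught early.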
Deployment
Swarm UI
Integrator UI
Workflow UI
More monitoring
This topic is ongoing
◎ Custom user interfaces
◎ System output logs
◎ Monitor network wire format (and visualise)?
◎ Monitor node load (and autoscheduling)?
Concluding remarks
◎ Used in practice
◎ Easy to get started
◎ Improving as we speak