Adventures in Research

Joel Merrick, BBC Research & Development. OpenNebula Conference 2013, Thursday 26 September 2013.

Description

BBC Research & Development are in the process of deploying a department-wide virtualization solution, catering for use cases including web development, machine learning, transcoding, media ingress and system testing. This talk discusses the implementation of a high-performance Ceph storage backend and the challenges of virtualization in a broadcast research and development environment.

Transcript of Adventures in Research

Page 1: Adventures in Research

Adventures in Research

Joel Merrick

BBC Research & Development

OpenNebula Conference 2013


Page 2: Adventures in Research

About me

• From Manchester, UK

• Sysadmin by day, Project Lead for Internal Cloud by night

• Involved with Sahana Foundation in 2008, helping with administration

• First production release running on KVM during the 2010 Haiti Earthquake.

• It’s ready for prime-time


Page 3: Adventures in Research

About BBC R&D

• Established in 1922, shortly after the main organisation

• Initially 2 divisions, Research Department and Development

• Grew rapidly, moving homes several times

• Eventually settled at Kingswood Warren, Surrey

• Amalgamated to R&D in 1993

• Now 3 sites - Centre House, MediaCity UK, 1 Euston Square


Page 4: Adventures in Research

About BBC R&D

Kingswood Warren, Surrey

Centre House, London

MediaCity, Manchester

1ES, London


Page 5: Adventures in Research

Previous Technologies Developed

• Noise Cancelling Microphones

• Conversion from 405-line to 625-line

• Colour Television

• Transatlantic Cable & Satellite

• BBC Micro

• NICAM Stereo

• DAB Digital / DTV / Freeview

• YouView


Page 6: Adventures in Research

Collaboration

Super Hi-Vision with NHK for the London 2012 Olympic Games

http://www.bbc.co.uk/blogs/researchanddevelopment/2012/08/the-olympics-in-super-hi-visio.shtml


Page 7: Adventures in Research

Areas of Research

• Capture: This area covers learning how to recognise and isolate objects within audio and video files automatically, such as individual sound sources or the motion of an actor or athlete, as well as how best to record and store media so it is durable and compatible with other systems.

• Produce: Our research in this area helps keep costs down and make production more efficient by developing the kinds of technology that might radically improve the way programmes are made in the future.

• Deliver: This research aims to develop new ways to distribute our programmes, while ensuring audiences receive them in the best possible quality, wherever they are, whenever they want them and whatever device they are using.

• Discover: This area sees us experimenting with new types of programmes and, with the BBC about to open more than 70 years’ worth of archives, how audiences might find and interact with them.

• Experience: How our audiences experience BBC programmes is our focus here. In this area we anticipate their future expectations and ensure new technology, however complex, is easy to use and accessible for everyone.


Page 8: Adventures in Research

Every Day is Different

• We don’t have one specific kind of workload on the shared platform

• Make it as flexible as possible, but also keep it performant

• Most users don’t really care about the backend technology; they just want a simple yet effective service.


Page 9: Adventures in Research

Some Current Projects (not all, by any means!)

• IP Studio

• Object Based Audio

• Enhanced Subtitling

• World Service Archive Voice Analysis & Scrubbing


Page 10: Adventures in Research

Challenges

• Engineers are left with the flexibility to do their own thing

• Silos of knowledge hinder cross-team interactions

• Time taken to provision

• Inconsistencies

• Harder to manage asset utilisation

• Demand for compute resources and storage will only increase


Page 11: Adventures in Research

Legacy

• Robust internal systems

• Virtualisation in use, but only really on single nodes and in ad hoc situations

• Each team had their favourite distribution

• Very little / no config management or deployment tools in most project areas


Page 12: Adventures in Research

A Different Approach

• Reduce the time drains

• Automate Everything (eventually!)

• Try to standardize where appropriate

• Take ownership of assets

• Make it easy to extend and reproduce the platform


Page 13: Adventures in Research

Early Stages

• Project has been running for about 6 months

• Available to users for only 2 months

• 2 clusters currently online

• Project teams already committing to procurement

• Pan-BBC Interest

• Opportunity to develop best practice as well as better interactions with other areas of the organisation


Page 14: Adventures in Research

Current Uses

• Started hosting Internal Systems Infrastructure

• Build slaves

• Indexing (100GB VM!)

• General hosting

• Hacking on ideas!


Page 15: Adventures in Research

Why Build a Cloud?

• We have ownership!

• We can be more confident in security policy

• We can guarantee the execution venue, so legal stipulations can be met

• Network access is much faster for users, and latency is much lower


Page 16: Adventures in Research

High Level Component View

• OpenNebula 4.2

• KVM

• Ceph (RBD for VMs) - using the snapshot layering driver and a custom libvirt build (see the sketch below)

• Ubuntu 13.04 - may transition back to LTS
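
As a rough illustration of how an RBD-backed disk reaches KVM, a minimal libvirt disk fragment is sketched below; the pool/image name, monitor host and secret UUID are hypothetical, and the custom libvirt build mentioned above is not reflected here.

<!-- minimal sketch: 'one/one-42-0', 'ceph-mon1' and the secret UUID are hypothetical -->
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback'/>
  <source protocol='rbd' name='one/one-42-0'>
    <host name='ceph-mon1' port='6789'/>
  </source>
  <auth username='libvirt'>
    <secret type='ceph' uuid='00000000-0000-0000-0000-000000000000'/>
  </auth>
  <target dev='vda' bus='virtio'/>
</disk>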


Page 17: Adventures in Research

Hardware

Compute Nodes

• Dell R720 - 32x Xeon E5-2670 @ 2.60GHz / 128GB RAM

Storage Nodes

• Dell R320 - 8x Xeon E5-1410 @ 2.80GHz / 24GB RAM

• LSI SAS2308 HBA

• SuperMicro JBOD Chassis


Page 18: Adventures in Research

Network

• All hosts have 10Gbit interconnectivity

• Intel Corporation 82599EB 10-Gigabit SFP+

• Copper TwinAx

• Cisco Nexus 5020 ‘brains’

• FEX 2232 (Fabric Extender) as ToR switch


Page 19: Adventures in Research

OpenNebula Setup

• Currently running 4.2

• Main user interaction is via Sunstone

• Users authenticate against LDAP

• Default view for users is ‘cloud’

• Ceph RBD as VM block storage (see the datastore sketch below)

• CephFS as System Datastore

• Open vSwitch
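
For illustration, a minimal sketch of an OpenNebula 4.2 Ceph image datastore template; the datastore name, pool and bridge host are hypothetical and would be registered with 'onedatastore create ceph_images.ds'.

# hypothetical ceph_images.ds -- names and pool are illustrative only
NAME        = "ceph_images"
DS_MAD      = ceph
TM_MAD      = ceph
DISK_TYPE   = RBD
POOL_NAME   = "one"
BRIDGE_LIST = "ceph-bridge1"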


Page 20: Adventures in Research

Storage Node / Ceph Setup

“Ceph is a distributed object store and file system designed to provide excellent performance, reliability and scalability.”

• XFS-based OSDs (not btrfs)

• 12TB per node initially, growing to 24/48TB per node

• Around 1/8th of a petabyte currently

• No SSDs

• Journals on Disk

• Deployed using ceph-deploy (much better now)

• RBD writeback caching (writethrough also available; see the config sketch below)

• OSDs on all nodes, MONs on a small subset, MDSs on the inverse.
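
A minimal ceph.conf sketch of the choices above; the values are illustrative, not the production configuration.

# /etc/ceph/ceph.conf fragment -- hypothetical values
[osd]
# XFS-backed OSDs rather than btrfs, with the journal co-located on the data disk
osd mkfs type = xfs
osd journal size = 10240

[client]
# RBD writeback caching; setting 'rbd cache max dirty = 0' gives writethrough instead
rbd cache = true

Bootstrapping follows the usual ceph-deploy pattern, along the lines of 'ceph-deploy new <monitor hosts>' followed by 'ceph-deploy osd create <host>:<disk>' per OSD (host and disk names omitted here).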


Page 21: Adventures in Research

Snapshot Layering
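
At the RBD level, layering means new VM disks are thin copy-on-write clones of a protected snapshot of a base image. A minimal sketch with hypothetical pool and image names:

rbd snap create one/ubuntu-base@golden          # snapshot the base image
rbd snap protect one/ubuntu-base@golden         # protect it so it can be cloned
rbd clone one/ubuntu-base@golden one/one-42-0   # thin copy-on-write clone for a new VM disk
rbd children one/ubuntu-base@golden             # list clones still backed by the snapshot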


Page 22: Adventures in Research

Ceph’s Future

• Can only get better!

• Better REST admin APIs

• 8x speed increase in CRC functions in testing

• OpenZFS to leverage journaling?

• Erasure coding to reduce space requirements

• Multi-site replication

• RBD client side SSD caching (specifically for OS deployment)


Page 23: Adventures in Research

Deployment

• Generally Puppet Managed

• VM images generated using VeeWee (see the sketch below)

https://github.com/jedi4ever/veewee

"A great tool for creating and configuring lightweight, reproducible, portable virtual machine environments - often used with the addition of automation tools such as Chef or Puppet."


Page 24: Adventures in Research

Oversubscription

• Not all VMs have CPU-intensive workloads

• Makes financial sense to over-commit resources when applicable

• Shared resources have CPU over-committed by 4x (see the template sketch below)

• Memory is not over-committed

• Project teams can manage their own level on their own equipment
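
In OpenNebula templates the over-commit shows up as the gap between CPU (the share the scheduler reserves on the host) and VCPU (what the guest sees). A minimal sketch with hypothetical sizes:

# VM template fragment -- hypothetical sizes
# CPU = 0.5 reserved on the host with VCPU = 2 presented to the guest gives the 4x over-commit
# MEMORY (in MB) is never over-committed
CPU    = 0.5
VCPU   = 2
MEMORY = 4096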


Page 25: Adventures in Research

Future Work - OpenNebula

• Hypervisor-side SSD caching (bcache, flashcache, EnhanceIO, etc.)... possibly

• Better Ceph integration (attach_disk, etc.)

• Multiple Ceph Pools for tiered storage

• SSD based local storage

• Leverage more of radosgw for S3 compliant storage

• Integrate VM generator into Sunstone/ONE?

• Move to virtio-scsi (see the sketch below)
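
For the virtio-scsi item, the libvirt side would look roughly like the fragment below; names are illustrative only.

<!-- hypothetical fragment: the RBD disk attached through virtio-scsi rather than virtio-blk -->
<controller type='scsi' model='virtio-scsi' index='0'/>
<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <source protocol='rbd' name='one/one-42-0'>
    <host name='ceph-mon1' port='6789'/>
  </source>
  <target dev='sda' bus='scsi'/>
</disk>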


Page 26: Adventures in Research

Future Work - Hardware Pools

• PCI Passthrough Pooling

• Mainly used for SR-IOV Network adapters

• Allow PCI capture devices to be bound to a VM (see the sketch below)

• Drive the SDI Matrix to attach a given soft-patch

• Other use cases?
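
At the libvirt level, binding a host PCI function (an SR-IOV VF or a capture card) to a guest looks roughly like this; the PCI address is hypothetical.

<!-- hypothetical PCI address -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
  </source>
</hostdev>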


Page 27: Adventures in Research

Thanks!

Any Questions?
