Linux.Conf.AU 2009 (LCA09) Slide "OS Circular: Internet bootable OS Archive" by Suzaki


description

OS Circular: Internet bootable OS Archive presented at Linux Conf AU 2009 http://lca2009.linux.org.au/

Transcript of Linux.Conf.AU 2009 (LCA09) Slide "OS Circular: Internet bootable OS Archive" by Suzaki

Page 1:

Research Center for Information Security

OS Circular: Internet bootable OS Archive @Linux.Conf.AU 2009 http://lca2009.linux.org.au/

http://openlab.jp/oscircular/

Kuniyasu Suzaki

Page 2:

Contents

• Motivation and Related Work

• Virtual Disk for OS Circular
  – LBCAS: Loopback Content Addressable Storage

• OS boot on Virtual & Real Machine with LBCAS

• Performance problem and Optimization
  – Relocate blocks for prefetch of page-cache (readahead)
  – Hide network latency

Page 3:

Motivation

• I wanted to use many OSes, but I hated installing them.

• I used live CDs/DVDs (KNOPPIX, FreeSBIE [BSD], BeleniX [OpenSolaris], etc.).
  – However, they are not updated frequently, so I am afraid they may include vulnerable applications.
  – Burning CDs/DVDs wastes time and media.

• I want to boot the latest (well-maintained) OSes from the Internet without installation.

Page 4:

OS Circular (Big Picture)

[Figure: OS suppliers publish block files on an HTTP server and update them in a timely manner. Over the Internet, LBCAS (Loopback Content Addressable Storage) constructs a virtual disk from the block files. Users try an OS without installation on a virtual machine (KVM, QEMU) or on a real machine.]

Page 5:

Related Work

• OS Zoo
  – Distributes QEMU disk files
    • Linux distributions, *BSD, Plan9, OpenSolaris, MINIX
    • Big disk files and slow updates
  – http://www.oszoo.org

• LivePC of Moka5
  – Moka5 is a venture company (Stanford “Collective” group)
  – Streaming service of disk images to a customized VMware
  – http://www.moka5.org

Page 6:

OS Circular

• OS Circular is a framework for distributing disk images over the Internet.

• The disk image is managed by LBCAS (LoopBack Content Addressable Storage).

• Venti of Plan9 is based on the same idea.

• Using LBCAS, a user boots an OS from the Internet on a virtual or real machine.
  – The hard disk works as a cache.

• The cached image is reusable for the next boot and can be applied to mobile computing.

Page 7:

Block files of LBCAS

[Figure: a block device holding an ext2 file system (4KB pages) is split into 256KB blocks. Each block is compressed by zlib and saved as a block file named by the SHA-1 digest of its contents. The block files are re-constructed as a virtual disk with LBCAS.]

Mapping table and block files (map01.idx):

Address            File Name
00000000-0003FFFF  4ad36ffe8…
00040000-0007FFFF  974daf34a…
00080000-000BFFFF  2d34ff3e1…
000C0000-000FFFFF  3310012a…
…                  …
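The address column of the mapping table follows directly from the 256KB block size (0x40000 bytes). As a minimal sketch of that arithmetic (the function names are hypothetical and not part of LBCAS), a byte offset on the virtual disk can be turned into a mapping-table entry like this:

```python
BLOCK_SIZE = 256 * 1024  # 256KB per block file, as on the slide (0x40000 bytes)


def block_index(offset: int) -> int:
    """Return the index of the mapping-table entry that covers a byte offset."""
    return offset // BLOCK_SIZE


def block_range(index: int) -> tuple[int, int]:
    """Return the inclusive (first, last) byte addresses covered by an entry."""
    first = index * BLOCK_SIZE
    return first, first + BLOCK_SIZE - 1


if __name__ == "__main__":
    for offset in (0x00000000, 0x0003FFFF, 0x00040000, 0x000C1234):
        idx = block_index(offset)
        first, last = block_range(idx)
        print(f"offset {offset:08X} -> entry {idx} ({first:08X}-{last:08X})")
```

Entry 3, for example, covers 000C0000-000FFFFF, which matches the fourth row of the table.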

Page 8:

Construct a virtual disk of LBCAS on a Client PC

[Figure: the HTTP server holds the original block device (ext2, 4KB pages, 256KB blocks) as a mapping table (map01.idx) and block files. Client A downloads block files on demand, and the OS runs on the virtual disk constructed from them.]

Page 9:

LBCAS (1/2)

• An LBCAS image is made from an existing normal block device.

• The original block device is split into 256KB pieces, and each piece is compressed by zlib and saved as a “block file”.

• The name of a block file is the SHA-1 value of its contents.
  – Blocks with the same contents are represented by one block file, which reduces total storage space.
  – The basic idea resembles “Venti of Plan9” [USENIX’02].

• Block files are managed by a “mapping table” file.

• Block files are reconstructed into a loopback file by a FUSE wrapper.
  – FUSE is a user-land file system.
    • http://fuse.sf.net

• Each block file is verified against its SHA-1 file name when it is mapped to the loopback file.
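The split, compress, and name steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the LBCAS tool itself: the function names are hypothetical, an in-memory dict stands in for the directory of block files, and the SHA-1 name is assumed to be computed over the uncompressed block (the slides do not say which form is hashed).

```python
import hashlib
import zlib

BLOCK_SIZE = 256 * 1024  # block size used on the slides


def split_image(image: bytes, block_size: int = BLOCK_SIZE):
    """Split a disk image into block files and a mapping table.

    Returns (mapping, store): mapping is the ordered list of block file
    names, store maps each name to its zlib-compressed contents.
    Assumption: the SHA-1 name is computed over the uncompressed block.
    """
    mapping = []
    store = {}
    for start in range(0, len(image), block_size):
        block = image[start:start + block_size]
        name = hashlib.sha1(block).hexdigest()
        mapping.append(name)
        # Identical blocks share one name, so they are stored only once.
        if name not in store:
            store[name] = zlib.compress(block)
    return mapping, store


def read_block(name: str, store: dict) -> bytes:
    """Decompress a block file and check it against its SHA-1 file name."""
    block = zlib.decompress(store[name])
    if hashlib.sha1(block).hexdigest() != name:
        raise ValueError(f"block file {name} failed its SHA-1 check")
    return block


def rebuild_image(mapping, store) -> bytes:
    """Concatenate the verified blocks in mapping-table order."""
    return b"".join(read_block(name, store) for name in mapping)
```

With a toy block size of 8 bytes, the image `b"A" * 8 + b"B" * 8 + b"A" * 8` yields three mapping entries but only two stored block files, which is the deduplication effect described above.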

Page 10:

Structure of LBCAS

• Storage Cache
  – Suppresses downloads

• Memory Cache
  – Suppresses disk access and decompression
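The lookup order implied by the two caches can be sketched as follows. This is a simplified model, not the LBCAS driver: the class name, the `download` callable, and the two dicts standing in for RAM and the local disk are all assumptions made for illustration.

```python
import zlib
from typing import Callable, Dict


class BlockCache:
    """Two-level cache: memory (uncompressed) in front of storage (compressed)."""

    def __init__(self, download: Callable[[str], bytes]):
        self.download = download             # hypothetical HTTP fetch of one block file
        self.memory: Dict[str, bytes] = {}   # uncompressed blocks
        self.storage: Dict[str, bytes] = {}  # compressed block files (stands in for local disk)

    def get(self, name: str) -> bytes:
        # 1. Memory cache hit: no disk access and no decompression.
        if name in self.memory:
            return self.memory[name]
        # 2. Storage cache miss: download the block file once and keep it.
        if name not in self.storage:
            self.storage[name] = self.download(name)
        # 3. Decompress from the storage cache and remember the result.
        block = zlib.decompress(self.storage[name])
        self.memory[name] = block
        return block
```

A block that has been read once is served from memory. After the memory cache is cleared (for example on the next boot), the block is served from the storage cache without another download.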

Page 11:

LBCAS (2/2)

• When a file is updated or created on the original block device, the relevant block files are newly created with new SHA-1 file names. The mapping table file is also renewed.
  – Old block files are reusable.

• HTTP is used for file delivery.
  – It is the most popular protocol and is well designed for the Internet.
    • Inexpensive Web hosting services, proxies, and mirror servers can be utilized for worldwide deployment.

• Block files are network/storage transparent.
  – If the necessary block files are stored in local storage, a network connection is not necessary.

Page 12:

Partial Update of LBCAS

[Figure: the block device (ext2, 4KB pages, 256KB blocks) is updated, for example by “apt-get install …”. Block files are named by SHA-1. The new mapping table (map02.idx) differs from map01.idx only in the entry for the updated block file. The unchanged entries refer to the same files, which are reusable by the FUSE driver. Create Once, Use Many.]

map01.idx    map02.idx
4ad36ffe8…   4ad36ffe8…
974daf34a…   dd4daf34a…
2d34ff3e1…   2d34ff3e1…
3310012a…    3310012a…
…            …
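Because block files are named by their contents, a client that already holds the block files of an old mapping table only needs the names that are new in the updated table. A minimal sketch of that comparison follows; the function name and the short block names in the usage note are hypothetical placeholders, not real digests.

```python
def blocks_to_fetch(old_map, new_map):
    """Return the block file names in new_map that are absent from old_map.

    Names are returned in mapping-table order, without duplicates.
    """
    have = set(old_map)
    needed = []
    for name in new_map:
        if name not in have:
            needed.append(name)
            have.add(name)
    return needed
```

For example, `blocks_to_fetch(["b0", "b1", "b2", "b3"], ["b0", "b1-updated", "b2", "b3"])` returns `["b1-updated"]`: one block file is downloaded and the other three are reused.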

Page 13:

Apply LBCAS to Virtual and Real Machine

• Virtual Machine (easy way)
  – Advantage
    • LBCAS can be passed to a virtual machine as a bootable device.
      – The virtual machine passes control to the MBR of LBCAS.
      – The bootloader, kernel, and initrd are included in LBCAS.
    • The virtual devices are the same on an arbitrary PC, so the transferred OS only has to prepare drivers for them.
  – Disadvantage
    • The native performance of real devices is not available.
      – Video cards and network cards in particular cause problems.

• Real Machine
  – Advantage
    • The native performance is available.
  – Disadvantage
    • LBCAS is not recognized as a bootable device. The boot procedure must be customized.
    • The devices on each PC are different. The transferred OS must detect the devices and set up suitable drivers.

Page 14:

OS Circular (Big Picture)

[Figure: OS suppliers publish block files on an HTTP server, LBCAS constructs a virtual disk from them over the Internet, and users try an OS without installation. On a virtual machine (KVM, QEMU), LBCAS is recognized as a bootable device. On a real machine, LBCAS is not recognized as a bootable device, so the boot procedure must be customized.]

Page 15:

Virtual Machine with LBCAS

Page 16:

Customization of Boot Procedure on Real Machine

• Customization of the kernel and initrd
  – The kernel must recognize LBCAS as the device that contains the root file system.
  – To do so, the “initrd” (initial RAM disk) must set up LBCAS.
    • In order to set up LBCAS, the initrd must set up the network.

• Customization of the “init” process (after initrd)
  – The “init” process has to set up a driver for each device on an arbitrary PC.
    • Most LiveCDs include this function. (Ex: AutoConfig of KNOPPIX)
    • The network card is the exception, because it was set up in the initrd and must be kept for LBCAS.

Page 17:

Boot Procedure on Real Machine

[Figure: boot sequence GRUB → initrd → init process → normal KNOPPIX boot, with LBCAS fetching block files from an HTTP server over the Internet.]

• GRUB menu
    kernel /boot/grub/linux
    initrd /boot/grub/initrd

• initrd
    udhcp ; set up network
    httpstraged http://***/block.lst /media/lbcas
    losetup /dev/loop0 /media/lbcas/KNOPPIX
    mount /dev/loop0 /KNOPPIX

• init process
  – AutoConfig detects devices and loads suitable drivers, except for the NIC.

• The GRUB part can be replaced with kboot (kexec) or gPXE.
  – Set up the network.
  – Download the kernel and initrd from the HTTP server (http://***/linux, http://***/initrd).
  – Reboot with them.

• The client PC only has to prepare gPXE or kboot (kexec). The whole image is downloadable from the Internet.

Page 18:

Summary: Difference on Virtual and Real Machine

[Figure: the OS Circular big picture (OS suppliers, block files on an HTTP server, Internet, LBCAS, users), annotated with where the kernel & initrd come from.]

• Virtual machine (KVM, QEMU): LBCAS is treated as a bootable device, and the kernel & initrd are included in it.

• Real machine: the bootloader (kboot (kexec), gPXE) downloads a kernel & initrd, and the initrd sets up LBCAS for the root file system.

Page 19:

Problem of Performance

• Disk image
  – LBCAS causes fragmentation because of the block size mismatch between the file system and LBCAS.
  – The “readahead” (prefetch of page cache) of the Linux kernel does not match the LBCAS block size.
    • This causes redundant downloads and unnecessary decompression.

• Network latency
  – LBCAS is sensitive to network latency.
    • Many small files are downloaded on demand, and bandwidth expansion techniques (sliding window, multi-connection) are not used.
  – Moka5 solves this problem with streaming download of the disk image.

Page 20:

Optimization for Fragmentation

• The block size mismatch between the file system and the virtual block device causes fragmentation.
  – LBCAS: 256KB
  – File system (ext2): 4KB
  – Kitagawa* reported that the occupancy of requested blocks at boot time was 30% of LBCAS (on KNOPPIX 3.8.2).
    • * [Linux Kongress 2006]

• “ext2optimizer” repacks the data blocks of an ext2 file system so that they are arranged in line.
  – It is based on the profile of the data blocks accessed at boot time.
  – As a result, ext2optimizer reduces the number of block files.
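To make the occupancy figure concrete: one 256KB block file holds 64 file-system blocks of 4KB, so a single requested 4KB block forces the download of a whole block file. The sketch below computes occupancy from a list of requested 4KB block numbers. The function name and the definition used (requested 4KB blocks divided by the 4KB blocks contained in the touched block files) are assumptions for illustration and may differ in detail from the metric in the cited report.

```python
FS_BLOCK = 4 * 1024        # ext2 block size on the slide
LBCAS_BLOCK = 256 * 1024   # LBCAS block file size on the slide
PAGES_PER_BLOCK = LBCAS_BLOCK // FS_BLOCK  # 64 file-system blocks per block file


def occupancy(requested_pages, pages_per_block: int = PAGES_PER_BLOCK) -> float:
    """Fraction of the touched block files that was actually requested.

    requested_pages: iterable of 4KB file-system block numbers read at boot.
    """
    pages = set(requested_pages)
    if not pages:
        return 0.0
    block_files = {page // pages_per_block for page in pages}
    return len(pages) / (len(block_files) * pages_per_block)
```

Four requested blocks scattered over four block files (`[0, 64, 128, 192]`) give an occupancy of 4/256, while the same amount of data packed into one block file (`range(4)`) gives 4/64. Repacking the data blocks in line moves the layout toward the second case.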

Page 21:

Semantic Gap between readahead and LBCAS

• The coverage of readahead varies with the page-cache hit ratio.

[Figure: files in access order on the ext2/3 file system (4KB), the blocks requested for disk access by readahead (4KB to 128KB), and the blocks read and block files downloaded by LBCAS (256KB). Labels: “Cache missed and the coverage is shrunk”, “Hit page-cache”, “Occupancy is low”, “Redundant block”.]

Page 22:

Block Relocation: Ext2optimizer [LinuxKongress06]

[Figure: an ext2 inode (Mode, Owner info, Size, Timestamps, Direct Blocks, Indirect Blocks, Double Indirect, Triple Indirect), shown twice with different data block arrangements.]

• Data blocks are rearranged to be in line. The structure of the metadata is not changed.
• The arrangement is based on the access profile at boot time.
• Features:
  – The normal driver is used.
  – Fragmentation occurs from the point of view of files.
  – The relocation increases page-cache hits, so readahead extends its coverage size.
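The idea of arranging data blocks in line according to a boot-time access profile can be illustrated with a small planning function. This is not the ext2optimizer algorithm: it ignores metadata blocks and block groups, and the function name and the choice to keep unprofiled blocks in their original order are assumptions. It only shows how a profile turns into an old-to-new block mapping.

```python
def relocation_plan(access_profile, total_blocks: int) -> dict:
    """Map old data-block numbers to new ones so profiled blocks become contiguous.

    access_profile: block numbers in the order they were read at boot
    (repeats allowed). Profiled blocks are placed first, in first-access
    order; all other blocks follow in their original order.
    """
    order = []
    seen = set()
    for block in access_profile:
        if block not in seen:
            seen.add(block)
            order.append(block)
    order.extend(b for b in range(total_blocks) if b not in seen)
    return {old: new for new, old in enumerate(order)}
```

For a profile `[7, 2, 7, 5]` on an 8-block device, blocks 7, 2, and 5 move to positions 0, 1, and 2, so a sequential read at boot covers them in order. A real tool would also have to rewrite the block pointers in the inodes so that the metadata structure stays valid.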

Page 23:

Static Analysis by DAVL (Disk Allocation Viewer for Linux)

• Original: fragmentation 0.09%

• Ext2opt: fragmentation 0.27%

Page 24:

Dynamic Analysis: Disk Access at boot time

• Ext2optimizer relocates the data blocks for boot.

[Figure: disk access at boot time. Axes: Address and Time.]

Page 25:

The amount of requested and downloaded data at boot time

• The block size of LBCAS was varied: 64KB, 128KB, 256KB, and 512KB.

• Ext2optimizer reduces the amount of requested and downloaded data.

• A small block size is better in this case.
  – A big block size is better on a long-latency network, because a small block size requires many downloads.

[Figure: two panels, “Amount of requested data” and “Amount of downloaded data”, for block sizes 512KB, 256KB, 128KB, and 64KB. Annotations: “Effect of Ext2Opt”, “Effect of Compress”, “Effect of sufficiency”, “Ext2Opt makes small difference for the amount”.]

Page 26:

Optimization for download methods

• 2 optimizations
  – DLAHEAD (DownLoad AHEAD)
    • The necessary block files are downloaded in advance with extra download connections (default 4).
      – [Preparation] Take a profile of the block files downloaded at boot time.
  – DNS-Balance
    • DNS-Balance is a kind of name resolver that suggests the nearest server.
    • Users find the nearest download site automatically.
      – It prevents intercontinental downloads, because we offer servers in the EU, the US, and Japan.
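The DLAHEAD idea, fetching the profiled block files in advance over several connections, can be sketched with a thread pool. This is a hedged illustration rather than the actual implementation: `download_ahead`, the `fetch` callable, and the dict used as the cache are hypothetical. Only the default of 4 connections comes from the slide.

```python
from concurrent.futures import ThreadPoolExecutor


def download_ahead(profile, fetch, cache, connections: int = 4):
    """Fetch the profiled block files in advance over several connections.

    profile: block file names recorded at a previous boot, in access order.
    fetch:   callable that downloads one block file and returns its bytes
             (hypothetical; for example an HTTP GET).
    cache:   dict-like storage cache, updated in place.
    Returns the list of names that were actually downloaded.
    """
    # Keep profile order, drop duplicates and blocks that are already cached.
    missing = [name for name in dict.fromkeys(profile) if name not in cache]
    with ThreadPoolExecutor(max_workers=connections) as pool:
        # pool.map preserves input order while up to `connections` fetches overlap.
        for name, data in zip(missing, pool.map(fetch, missing)):
            cache[name] = data
    return missing
```

Running several downloads at once hides part of the per-request latency, which is the cost that dominates when many small block files are fetched one by one.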

Page 27:

World Wide deployment

• Prepare some hosting services around the world.

[Figure: world map of servers reached through knoppix.inetboot.net (DNS Balance): London, Philadelphia, Montreal, Houston, Amsterdam, Copenhagen, and Japan (Ring Servers).]

Page 28:

Search for suitable download server

[Figure: the client resolves select.inetboot.net through DNS-Balance (ns.inetboot.net). The name resolves to different web servers for block files, shown as XXX.168.0.10 and YYY.10.0.19.]

• DNS Balance (knoppix.inetboot.net) suggests a suitable server IP address (North America, EU, Asia).

Page 29:

Currently Available OSes on OS Circular

• On real machine
  – KNOPPIX 4.0.2, 5.0.1, 5.1.1
    • KNOPPIX has advanced AutoConfig and can be applied to any PC.
  – The kernel and initrd are downloadable. “gPXE”, “Kboot”, and “kexec” can boot them.

• On virtual machine
  – Plan9 and NetBSD
    • on Xen 2.0.3 DomU (para-virtualization)
      – The details were presented at Ottawa Linux Symposium 2006.
  – Debian Etch, Ubuntu 6.06/6.10/7.04, CentOS 5
    • on Xen-HVM/KVM/QEMU (full virtualization)

Page 30:

LBCAS for Sony PlayStation3 Linux

• PlayStation3 has “kboot” on 4MB Flash.
  – kboot can get the “kernel” and “initrd” via HTTP.

• The disk image is obtained by LBCAS.

Page 31:

Summary

Some services are available. Just try!

http://openlab.jp/oscircular/

Special Thanks

DAVL developers

http://sourceforge.net/projects/davl/

EXT2Optimizer developers

http://unit.aist.go.jp/itri/knoppix/ext2optimizer/index-en.htm