Linux.Conf.AU 2009 (LCA09) Slide "OS Circular: Internet bootable OS Archive" by Suzaki
1
Research Center for Information Security
OS Circular: Internet bootable OS Archive
@Linux.Conf.AU 2009 http://lca2009.linux.org.au/
http://openlab.jp/oscircular/
Kuniyasu Suzaki
2
Contents
• Motivation and Related Work
• Virtual Disk for OS Circular
– LBCAS: Loopback Content Addressable Storage
• OS boot on Virtual & Real Machine with LBCAS
• Performance problem and Optimization
– Relocate blocks for prefetch of page-cache (readahead)
– Hide network latency
3
Motivation
• I wanted to use many OSes, but I hated installing them.
• I used LiveCDs/DVDs (KNOPPIX, FreeSBIE [BSD], BeleniX [OpenSolaris], etc.).
– However, they are not updated frequently, so they may include vulnerable applications.
– Burning CDs/DVDs wastes time and media.
• I want to boot the latest (well-maintained) OSes from the Internet without installation.
4
OS Circular (Big Picture)
[Figure: OS suppliers (who update in a timely manner) publish block files on an HTTP server. Over the Internet, LBCAS (Loopback Content Addressable Storage) constructs a virtual disk from the block files. Users try the OS without installation on a virtual machine (KVM, QEMU) or a real machine.]
5
Related Work
• OS Zoo
– Distributes QEMU disk files
• Linux distributions, *BSD, Plan9, OpenSolaris, MINIX
• Big disk files & slow updates
– http://www.oszoo.org
• LivePC of Moka5
– Moka5 is a venture company (from the Stanford “Collective” group)
– Streaming service of disk images to a customized VMware
– http://www.moka5.org
6
OS Circular
• OS Circular is a framework for distributing disk images over the Internet.
• The disk image is managed by LBCAS (LoopBack Content Addressable Storage).
• Venti of Plan9 is based on the same idea.
• Using LBCAS, a user boots an OS from the Internet on a virtual or real machine.
– The hard disk works as a cache.
• The cached image is reusable for the next boot and can be applied to mobile computing.
7
Block files of LBCAS
[Figure: a block device (ext2, 4KB pages) is split into 256KB blocks. Each block is compressed by zlib and saved as a block file. The block files are reconstructed as a virtual disk with LBCAS.]
Mapping table (map01.idx) and block files
Address              File Name
00000000-0003FFFF    4ad36ffe8…
00040000-0007FFFF    974daf34a…
00080000-000BFFFF    2d34ff3e1…
000C0000-000FFFFF    3310012a…
…                    …
A block file is named by the SHA-1 digest of its contents.
8
Construct a virtual disk of LBCAS on a Client PC
[Figure: the HTTP server holds the original block device (ext2, 4KB pages) as 256KB block files, together with the mapping table map01.idx (4ad36ffe8…, 974daf34a…, 2d34ff3e1…, 3310012a…, …). The OS on Client A downloads block files on demand.]
9
LBCAS (1/2)
• An LBCAS image is made from an existing normal block device.
• The original block device is split into 256KB pieces, and each piece is compressed by zlib and saved as a “block file”.
• A block file's name is the SHA-1 value of its contents.
– Blocks with the same contents are represented by one block file, which reduces total storage space.
– The basic idea resembles “Venti of Plan9” [USENIX’02].
• Block files are managed by a “mapping table” file.
• Block files are reconstructed into a loopback file by a FUSE wrapper.
– FUSE is a user-land file system.
• http://fuse.sf.net
• Each block file is verified against its SHA-1 file name when it is mapped to the loopback file.
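The splitting, naming, and verification steps of LBCAS can be sketched in a few lines of Python. This is only an illustration, not the OS Circular tooling: the function names are hypothetical, the image is held in memory, and the sketch assumes the SHA-1 name is computed over the uncompressed block, which the slides do not specify.

```python
import hashlib
import zlib

BLOCK_SIZE = 256 * 1024  # LBCAS block size given in the slides


def split_into_block_files(image: bytes, block_size: int = BLOCK_SIZE):
    """Split a disk image into content-addressed, zlib-compressed block files.

    Returns (mapping_table, block_files):
      mapping_table: list of (start, end, name) rows, one per block
      block_files:   dict of name -> compressed bytes (duplicates stored once)
    """
    mapping_table = []
    block_files = {}
    for start in range(0, len(image), block_size):
        block = image[start:start + block_size]
        # Assumption: the name is the SHA-1 of the uncompressed block.
        name = hashlib.sha1(block).hexdigest()
        block_files.setdefault(name, zlib.compress(block))
        mapping_table.append((start, start + len(block) - 1, name))
    return mapping_table, block_files


def load_block(name: str, block_files: dict) -> bytes:
    """Uncompress a block file and check it against its SHA-1 name."""
    block = zlib.decompress(block_files[name])
    if hashlib.sha1(block).hexdigest() != name:
        raise ValueError(f"block file {name} does not match its name")
    return block


# Three blocks, two of them identical: only two block files are stored.
image = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"A" * BLOCK_SIZE
table, files = split_into_block_files(image)
print(len(table), len(files))
```

Because identical blocks hash to the same name, the mapping table keeps three rows while only two block files exist, which is the storage saving described above.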
10
Structure of LBCAS
• Storage Cache
– Suppresses downloads
• Memory Cache
– Suppresses disk access and uncompression
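The role of the two caches can be illustrated with a small read path. This is a sketch under assumptions: the class name, the `fetch` callback, the in-memory dict standing in for the on-disk cache, and the LRU eviction policy are all hypothetical and are not taken from the LBCAS implementation.

```python
import zlib
from collections import OrderedDict


class BlockReader:
    """Two-level cache in front of a block-file download function."""

    def __init__(self, fetch, memory_slots=2):
        self.fetch = fetch            # hypothetical downloader: name -> compressed bytes
        self.storage = {}             # stands in for the storage cache (compressed files)
        self.memory = OrderedDict()   # memory cache (uncompressed blocks, LRU order)
        self.memory_slots = memory_slots

    def read(self, name):
        if name in self.memory:       # memory hit: no disk access, no uncompress
            self.memory.move_to_end(name)
            return self.memory[name]
        if name not in self.storage:  # storage miss: download the block file once
            self.storage[name] = self.fetch(name)
        block = zlib.decompress(self.storage[name])
        self.memory[name] = block
        if len(self.memory) > self.memory_slots:
            self.memory.popitem(last=False)  # evict the least recently used block
        return block


downloads = []


def fake_fetch(name):
    """Stand-in for an HTTP download; records which names were requested."""
    downloads.append(name)
    return zlib.compress(name.encode() * 4)


reader = BlockReader(fake_fetch, memory_slots=1)
reader.read("a")
reader.read("b")
reader.read("a")  # evicted from memory, but still served from the storage cache
print(downloads)
```

In this run the third read misses the memory cache but hits the storage cache, so no third download is issued.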
11
LBCAS (2/2)
• When a file is updated or created on the original block device, the relevant block files are newly created with new SHA-1 file names. The mapping table file is also renewed.
– Old block files are reusable.
• HTTP for file delivery
– It is the most popular protocol and is well designed for the Internet.
• Utilize inexpensive Web hosting services, proxies, and mirror servers for worldwide deployment.
• Block files are network/storage transparent.
– If the necessary block files are stored in local storage, a network connection is not necessary.
12
Partial Update of LBCAS
[Figure: an update (apt-get install …) changes part of the block device (ext2, 4KB pages, 256KB blocks). The new mapping table map02.idx (4ad36ffe8…, dd4daf34a…, 2d34ff3e1…, 3310012a…, …) differs from map01.idx (4ad36ffe8…, 974daf34a…, 2d34ff3e1…, 3310012a…, …) only in the updated block file. The block files named by SHA-1 that stay the same are reusable by the FUSE driver.]
Create Once, Use Many
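The effect of a partial update on a client can be shown by comparing the two mapping tables from the slide. The sketch below uses the truncated names exactly as they appear on the slide (they are not full digests), and the helper name `files_to_download` is hypothetical.

```python
def files_to_download(cached_names, new_map):
    """Return block files listed in new_map that are not cached yet, in order."""
    cached = set(cached_names)
    needed = []
    for name in new_map:
        if name not in cached and name not in needed:
            needed.append(name)
    return needed


# Truncated names as shown on the slide for map01.idx and map02.idx.
map01 = ["4ad36ffe8…", "974daf34a…", "2d34ff3e1…", "3310012a…"]
map02 = ["4ad36ffe8…", "dd4daf34a…", "2d34ff3e1…", "3310012a…"]

# A client that already holds every block file of map01 needs one new file.
print(files_to_download(map01, map02))
```

Only the block file whose contents changed gets a new name, so a client with a warm cache fetches that file and the new mapping table.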
13
Apply LBCAS to Virtual and Real Machine
• Virtual Machine (easy way)
– Advantage
• LBCAS can be passed to a virtual machine as a bootable device.
– The virtual machine passes control to the MBR of LBCAS.
– The bootloader, kernel, and initrd are included in LBCAS.
• The virtual devices are the same on any PC, so the transferred OS only has to prepare drivers for them.
– Disadvantage
• The native performance of real devices is not available.
– Video cards and network cards especially cause problems.
• Real Machine
– Advantage
• The native performance is available.
– Disadvantage
• LBCAS is not recognized as a bootable device. The boot procedure must be customized.
• The devices on each PC are different. The transferred OS must detect devices and set up suitable drivers.
14
OS Circular (Big Picture)
[Figure: the OS Circular big picture (OS suppliers, block files on an HTTP server, the Internet, the LBCAS virtual disk, users) with two annotations. On a virtual machine (KVM, QEMU), LBCAS is recognized as a bootable device. On a real machine, LBCAS is not recognized as a bootable device, so the boot procedure must be customized.]
15
Virtual Machine with LBCAS
16
Customization of Boot Procedure on Real Machine
• Customization of kernel and initrd
– The kernel must recognize LBCAS as the device that includes the root file system.
– To do so, the “initrd (initial RAM disk)” must set up LBCAS.
• In order to set up LBCAS, initrd must set up the network.
• Customization of the “init” process (after initrd)
– The “init” process has to set up a driver for each device on any PC.
• Most LiveCDs include this function (e.g., AutoConfig of KNOPPIX).
• The network card is the exception, because it was set up in the initrd and must be kept for LBCAS.
17
Boot Procedure on Real Machine
GRUB → initrd → init process → normal KNOPPIX boot
GRUB Menu:
kernel /boot/grub/linux
initrd /boot/grub/initrd
This part is replaced with kboot (kexec) or gPXE.
initrd (LBCAS block files come from the HTTP server over the Internet):
udhcp ;Setup network
httpstraged http://***/block.lst /media/lbcas
losetup /media/lbcas/KNOPPIX /dev/loop0
mount /dev/loop0 /KNOPPIX
init process:
init
AutoConfig detects devices and includes suitable drivers except NIC
kboot (kexec) / gPXE:
Set up network
Download kernel and initrd from the HTTP server over the Internet
http://***/linux
http://***/initrd
Reboot with them
The client PC only has to prepare gPXE or kboot (kexec). The whole image is downloadable from the Internet.
18
Summary: Difference on Virtual and Real Machine
[Figure: the OS Circular big picture (OS suppliers, block files on an HTTP server, the Internet, the LBCAS virtual disk, users) with the two boot paths compared. Virtual machine (KVM, QEMU): LBCAS is treated as a bootable device. Real machine: the bootloader (kboot (kexec), gPXE) downloads a kernel & initrd, and the initrd sets up LBCAS for the root file system. Both paths are labeled "kernel & initrd".]
19
Problem of Performance
• Disk image
– LBCAS causes fragmentation because of the block size mismatch between the file system and LBCAS.
– The “readahead (prefetch of page cache)” of the Linux kernel does not match LBCAS.
• This causes redundant downloads and unnecessary uncompression.
• Network latency
– LBCAS is sensitive to network latency.
• Many small files are downloaded on demand, and bandwidth expansion techniques (sliding window, multi-connection) are not used.
– Moka5 solves this problem with streaming download of the disk image.
20
Optimization for Fragmentation
• The block size mismatch between the file system and the virtual block device causes fragmentation.
– LBCAS: 256KB
– File system (ext2): 4KB
– Kitagawa* reported that the occupancy of requested blocks at boot time was 30% of LBCAS (on KNOPPIX 3.8.2).
• * [Linux Kongress 2006]
• “ext2optimizer” repacks the data blocks of an ext2 file system so that they are in line.
– It is based on the profile of data blocks accessed at boot time.
– As a result, ext2optimizer reduces the number of block files.
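The occupancy figure can be made concrete with a little arithmetic: one 256KB LBCAS block holds 64 file-system blocks of 4KB, and occupancy is the share of downloaded 4KB blocks that were actually requested. The sketch below is illustrative only; the function name and the sample block numbers are hypothetical and are not the measured KNOPPIX data.

```python
FS_BLOCK = 4 * 1024                    # ext2 block size from the slides
LBCAS_BLOCK = 256 * 1024               # LBCAS block size from the slides
PER_LBCAS = LBCAS_BLOCK // FS_BLOCK    # 64 file-system blocks per block file


def occupancy(requested_fs_blocks):
    """Fraction of downloaded 4KB blocks that were actually requested."""
    requested = set(requested_fs_blocks)
    touched_files = {b // PER_LBCAS for b in requested}
    if not touched_files:
        return 0.0
    return len(requested) / (len(touched_files) * PER_LBCAS)


scattered = [0, 64, 128, 192]  # four requests spread over four block files
packed = [0, 1, 2, 3]          # the same number of requests in one block file
print(occupancy(scattered), occupancy(packed))
```

Packing the requested blocks next to each other raises occupancy because fewer block files have to be downloaded for the same requests.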
21
Semantic Gap between readahead and LBCAS
• The coverage of readahead varies with the page-cache hit ratio.
[Figure: files on the ext2/3 file system (4K blocks) are accessed in the order ① to ④. readahead (4K~128K) determines the blocks requested for disk access, and LBCAS (256KB) downloads the block files that contain the blocks read. Annotations: "Cache missed and the coverage is shrunk"; "Hit Page-Cache"; "Occupancy is low"; "Redundant block".]
22
Block Relocation: Ext2optimizer [LinuxKongress06]
[Figure: an ext2 inode (Mode, Owner info, Size, Timestamps, Direct Blocks, Indirect Blocks, Double Indirect, Triple Indirect) and its data blocks, before and after relocation.]
• Data blocks are rearranged to be in line. The structure of the metadata is not changed.
• The arrangement is based on the access profile at boot time.
• Features:
– The normal driver is used.
– Fragmentation occurs from the point of view of files.
– The relocation increases page-cache hits, so readahead extends its coverage size.
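The idea of profile-based relocation can be sketched as a permutation of block numbers: blocks seen in the boot-time profile are packed first, in first-access order, and all other blocks follow. This is a simplified illustration with hypothetical names and a made-up profile; the real ext2optimizer also has to rewrite the block pointers held in the inodes, which is not shown.

```python
def relocation_map(access_profile, total_blocks):
    """Return old_block -> new_block, packing profiled blocks first in access order."""
    order = list(dict.fromkeys(access_profile))  # unique blocks, first-access order
    accessed = set(order)
    order += [b for b in range(total_blocks) if b not in accessed]
    return {old: new for new, old in enumerate(order)}


def block_files_touched(fs_blocks, per_file=64):
    """Number of 256KB block files covering the given 4KB block numbers."""
    return len({b // per_file for b in fs_blocks})


profile = [500, 3, 130, 500, 70]  # hypothetical boot-time access profile
mapping = relocation_map(profile, total_blocks=512)

before = block_files_touched(profile)
after = block_files_touched(mapping[b] for b in profile)
print(before, after)
```

In this toy profile the accessed blocks sit in four different block files before relocation and in a single block file afterwards.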
23
Static Analyze by DAVL (Disk Allocation Viewer for Linux)
Original: Fragmentation 0.09%
Ext2opt: Fragmentation 0.27%
24
Dynamic Analyze: Disk Access at boot time
• Ext2optimizer relocates the data blocks for boot.
[Figure: disk accesses at boot time; axes: Address and Time.]
25
The amount of requested and downloaded data at boot time
• The block size of LBCAS is varied: 64KB, 128KB, 256KB, and 512KB.
• Ext2optimizer reduces the amount of requested and downloaded data.
• A small block size is better in this case.
– A big block size is better on a long-latency network, because a small block size requires many downloads.
[Figure: amount of requested data and amount of downloaded data for block sizes 64KB, 128KB, 256KB, and 512KB. Annotations: "Effect of Ext2Opt"; "Effect of Compress"; "Effect of sufficiency"; "Ext2Opt makes small difference for the amount".]
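The trade-off between block size and latency can be illustrated with a deliberately simple cost model: every block file costs one round trip plus its transfer time, and files are fetched one after another. The model, the function name, and all numbers below are assumptions made for illustration; they are not measurements from the talk.

```python
def download_time(n_files, file_bytes, rtt_s, bandwidth_bytes_per_s):
    """Sequential on-demand download: one round trip plus transfer per file."""
    return n_files * (rtt_s + file_bytes / bandwidth_bytes_per_s)


BANDWIDTH = 10 * 1024 * 1024  # hypothetical 10 MiB/s link

# Hypothetical boot: small blocks mean more files but fewer total bytes.
small = dict(n_files=2000, file_bytes=64 * 1024)   # 125 MiB in total
big = dict(n_files=400, file_bytes=512 * 1024)     # 200 MiB in total

for rtt in (0.001, 0.2):  # a nearby server and a distant server
    t_small = download_time(rtt_s=rtt, bandwidth_bytes_per_s=BANDWIDTH, **small)
    t_big = download_time(rtt_s=rtt, bandwidth_bytes_per_s=BANDWIDTH, **big)
    print(f"rtt={rtt}s small={t_small:.1f}s big={t_big:.1f}s")
```

With these made-up numbers the small block size wins on the low-latency link and the big block size wins on the high-latency link, which matches the direction of the claim above.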
26
Optimization for download methods
• 2 optimizations
– DLAHEAD (DownLoad AHEAD)
• The necessary block files are downloaded in advance with extra download connections (default 4).
– [Preparation] Take a profile of the block files downloaded at boot time.
– DNS-Balance
• DNS-Balance is a kind of name resolver that suggests the nearest server.
• Users find the nearest download site automatically.
– It prevents intercontinental downloads, because we offer servers in the EU, the US, and Japan.
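The DLAHEAD idea, downloading profiled block files in advance over several connections, can be sketched with a thread pool. This is not the DLAHEAD implementation: the function name, the `fetch` callback, and the dict used as a cache are hypothetical. Only the default of 4 connections comes from the slide.

```python
from concurrent.futures import ThreadPoolExecutor


def download_ahead(profile, fetch, cache, connections=4):
    """Prefetch profiled block files into the cache over parallel connections.

    profile: block-file names recorded at an earlier boot, in access order
    fetch:   hypothetical downloader, name -> bytes
    cache:   dict standing in for the local storage cache
    """
    todo = [name for name in dict.fromkeys(profile) if name not in cache]
    with ThreadPoolExecutor(max_workers=connections) as pool:
        # pool.map keeps the profile order while up to `connections` run at once
        for name, data in zip(todo, pool.map(fetch, todo)):
            cache[name] = data
    return todo


def fake_fetch(name):
    """Stand-in for an HTTP GET of one block file."""
    return name.encode()


cache = {"b": b"b"}  # "b" is already cached from a previous boot
fetched = download_ahead(["a", "b", "c", "a"], fake_fetch, cache)
print(fetched, sorted(cache))
```

Names that are already cached are skipped, so a second boot with the same profile issues no downloads at all.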
27
World Wide deployment
• Prepare some hosting services in the world.
London
Philadelphia
Montreal
Houston
Amsterdam
Copenhagen
Japan: Ring Servers
knoppix.inetboot.net (DNS-Balance)
28
Search for suitable download server
[Figure: the client resolves select.inetboot.net by DNS-Balance (ns.inetboot.net). Depending on the client, select.inetboot.net resolves to a different Web server for block files (XXX.168.0.10 or YYY.10.0.19).]
• DNS-Balance (knoppix.inetboot.net) suggests a suitable server IP address (North America, EU, Asia).
29
Current Available OSes on OS Circular
• On real machine
– KNOPPIX 4.0.2, 5.0.1, 5.1.1
• KNOPPIX has advanced AutoConfig and can be applied to any PC.
– The kernel and initrd are downloadable. “gPXE”, “kboot”, and “kexec” can boot them.
• On virtual machine
– Plan9 and NetBSD
• on Xen 2.0.3 DomU (para-virtualization)
– The details were presented at Ottawa Linux Symposium 2006.
– Debian Etch, Ubuntu 6.06/6.10/7.04, CentOS 5
• on Xen-HVM/KVM/QEMU (full virtualization)
30
LBCAS for Sony PlayStation3 Linux
• PlayStation3 has “kboot” on 4MB Flash
– kboot can get the “kernel” and “initrd” via HTTP.
• The disk image is obtained by LBCAS.
31
Summary
Some services are available. Just try!
http://openlab.jp/oscircular/
Special Thanks
DAVL developers
http://sourceforge.net/projects/davl/
EXT2Optimizer developers
http://unit.aist.go.jp/itri/knoppix/ext2optimizer/index-en.htm