[ACM Press the 33rd annual ACM SIGUCCS conference - Monterey, CA, USA (2005.11.06-2005.11.09)]...



Diskless Linux System with Unionfs for an Educational Computer Center

Hideo Masuda
Kyoto Institute of Technology
Matsugasaki, Sakyou, Kyoto 606-8585, JAPAN
+81 75 724 7956
[email protected]

Akinori Saitoh
Tottori University of Environmental Studies
1-1-1 Wakabadai-kita, Tottori, 689-1111, JAPAN
[email protected]

Michio Nakanishi
Osaka Institute of Technology
1-79-1 Kitayama, Hirakata, Osaka 573-0196, JAPAN
[email protected]

Seigo Yasutome
Osaka Institute of Technology
1-79-1 Kitayama, Hirakata, Osaka 573-0196, JAPAN
[email protected]

ABSTRACT

The total cost of ownership is a major concern for computer centers that maintain hundreds of PCs. In our experience, the most important measure is to reduce hardware faults, particularly hard disk drive failures. This paper describes how we built our educational computer system using diskless PCs. Before constructing the system, we set the following objectives for the client PCs: (1) run Linux, (2) work without a local HD, (3) deploy a single OS image to all PCs, (4) make the OS image easy to update, (5) support as many device configurations as possible, (6) act as clients for UNIX servers (Linux, Solaris, AIX, etc.), and (7) use open-source software. We chose Vine Linux 3.1, a derivative of RedHat Linux that is well organized for the Japanese language environment. To satisfy (3) and (5), we introduced unionfs, a stackable file system. The system directories and files that depend on each PC, such as /tmp, /dev, /var, /etc/mtab, and /etc/fstab, are generated on memory file systems during the boot procedure and are stacked on the common OS image using unionfs. Since this stacking mechanism was implemented only by adding some hook code to /etc/rc.d/rc.sysinit, we believe our idea is applicable to other Linux distributions. Our educational computer system consists of 8 servers, 12 boot servers and 500 diskless PCs, and is now in operation.

Categories and Subject Descriptors D.4.7 [SOFTWARE]: Operating Systems – Organization and Design

General Terms Design, Management, Experimentation.

Keywords Linux, Diskless, Single image, Educational Computer System, Vine Linux.

1. INTRODUCTION

This paper describes the new educational computer system at OSAKA University, JAPAN. OSAKA University has two campuses, Toyonaka and Suita, 7 km apart. Our educational computer system consists of 20 servers and over 900 PCs.

The total cost of ownership is a major concern for computer centers that maintain hundreds of PCs [1]. In this paper, we discuss how we built our educational computer system using diskless Linux PCs [4]. In our experience, the most important measure is to reduce hardware faults, particularly hard disk drive failures. Diskless technology has the following advantages:

The fault rate is reduced, because the PC runs without an HD, which is a failure-prone device.

There is no need to (re-)install the OS, as is otherwise required after replacing a faulty HD.

The OS image is updated only on the server side.

Some educational computer systems already operate diskless computers. Examples are diskless Windows and Linux (dual boot) by VID [2] and diskless MacOS X by NetBoot [3]. Both are proprietary products that require special servers and a license fee, making them costly to use.

Before constructing our system, the following items were listed as the objectives of client PCs:

(1) run Linux,

(2) work without a local HD,

(3) deploy a single OS image to all PCs,

(4) make the OS image easy to update,

(5) support as many device configurations as possible,

(6) act as clients for UNIX servers (Linux, Solaris, AIX, etc.),

(7) use open-source software.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SIGUCCS’05, November 6-9, 2005, Monterey, California, USA. Copyright 2005 ACM 1-59593-173-2/05/0011...$5.00.

2. BASIC DESIGN

2.1 Bootstrap

We assumed that every PC has the PXE (Preboot Execution Environment) feature. PXE is a program built into the NIC (Network Interface Card) that can load OS bootstrap code from the network without a local HD. PXE requires a DHCP server and a TFTP server in the same network segment. The DHCP server must return the special response that PXE expects; ISC's (Internet Software Consortium) dhcpd is sufficient. If a PXE service daemon runs in the same network segment, a boot selector can be used, so that one of multiple OS images can be chosen.

2.2 Location of the main OS images

There are two methods for locating the root partition: (a) on a RAM disk and (b) on NFS. We chose to place the root partition on NFS because we wanted to leave as much free memory as possible for office applications. This choice implies that an NFS server is required to provide the OS image.

Once the root partition is located on NFS, all applications depend on the network connection. If the network goes down, the PC hangs. To cope with this problem, we chose to use a software watchdog. The Linux kernel already has a software watchdog feature: the kernel reboots if /dev/watchdog is not written to by any process within 60 seconds. We implemented a special daemon for automatic reboot that continuously reads a file from NFS and writes to the device after each read, so that the writes stop, and the PC reboots, when NFS stops responding.
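The daemon itself is not listed in this paper. As a minimal sketch of the idea, assuming a hypothetical probe file on the NFS root and a 10 second interval (both are illustrative choices, not values from our system), the loop could look as follows. The function names are likewise hypothetical.

```python
import time


def nfs_is_alive(probe_path):
    """Return True if one byte can be read from a file on the NFS root.

    If the NFS server hangs, this read blocks; the watchdog is then not
    written and the kernel reboots the PC after its timeout.
    """
    try:
        with open(probe_path, "rb") as probe:
            probe.read(1)
        return True
    except OSError:
        return False


def feed_once(probe_path, watchdog):
    """Write to the watchdog device only when the NFS probe succeeds."""
    if nfs_is_alive(probe_path):
        watchdog.write(b"\0")
        watchdog.flush()
        return True
    return False


def run(probe_path="/etc/hostname", device="/dev/watchdog", interval=10):
    """Feed the software watchdog forever (hypothetical defaults)."""
    with open(device, "wb", buffering=0) as watchdog:
        while True:
            feed_once(probe_path, watchdog)
            time.sleep(interval)
```

A real daemon would also have to make sure that the probe read is not served from the local page cache, for example by reading a file that changes on the server; that detail is left out of this sketch.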

2.3 For a single OS image

Under a UNIX-like OS, /etc holds host dependent configuration files and /var holds host dependent status and log files, so /etc and /var cannot simply be shared by all PCs; i.e., /etc and /var are file systems distinct from the single OS image. But some files in /etc can or should be shared across the system, for example /etc/resolv.conf and /etc/nsswitch.conf. Likewise, some files in /var can or should be shared; for example, /var/lib/rpm holds information on installed applications, which is identical on every PC. If /etc and /var were completely independent file systems on each PC, there would be useless copies and a higher cost for keeping updates synchronized. As a result, all files and directories except a part of /etc and much of /var, /dev and /tmp can be shared on the single OS image exported by read-only NFS.

Figure 1. File system mounting structure

3. IMPLEMENTATION

We have implemented our system based on Vine Linux 3.1. Vine Linux is a derivative of RedHat Linux and is well organized for the Japanese language environment.

3.1 Kernel

We configured and built the Linux kernel with the following required options:

    # for NFS root
    CONFIG_ROOT_NFS=y
    CONFIG_IP_PNP=y
    CONFIG_IP_PNP_DHCP=y
    CONFIG_{ETHERNET DEVICE WE USE}=y
    # for watchdog
    CONFIG_WATCHDOG=y
    CONFIG_SOFT_WATCHDOG=y

As the PXE boot loader, we chose pxelinux, because Vine Linux already uses lilo and syslinux, and pxelinux belongs to the syslinux family.
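To illustrate how these kernel options are used at boot time, the following sketch renders a pxelinux configuration that passes the standard NFS-root parameters (root=/dev/nfs, nfsroot=, ip=dhcp) to the kernel. The server address, export path, kernel file name and label are hypothetical placeholders, not the values used in our system.

```python
def pxelinux_config(nfs_server, image_path, kernel="vmlinuz"):
    """Render a minimal pxelinux configuration for an NFS-root client.

    ip=dhcp relies on CONFIG_IP_PNP_DHCP, and root=/dev/nfs relies on
    CONFIG_ROOT_NFS. All argument values are illustrative.
    """
    append = f"root=/dev/nfs nfsroot={nfs_server}:{image_path} ip=dhcp ro"
    lines = [
        "DEFAULT diskless",
        "LABEL diskless",
        f"  KERNEL {kernel}",
        f"  APPEND {append}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address; /export/vine is a made-up path.
    print(pxelinux_config("192.0.2.10", "/export/vine"), end="")
```

The rendered text would be served by the TFTP server from the pxelinux.cfg directory; the "ro" flag mounts the shared image read-only, which matches the read-only NFS export described in Section 2.3.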

3.2 Changes for the distribution

Our system requires some procedures before normal startup in the boot sequence, namely generating /tmp, /dev, /etc and /var.

After the Linux kernel has been loaded, /sbin/init runs first. /sbin/init forks processes according to /etc/inittab. In the /etc/inittab of Vine Linux, Red Hat Linux and the Fedora Project, /etc/rc.d/rc.sysinit runs first in the boot sequence. Our investigation showed that file systems are mounted in this script, so we decided to add the hooks to it.

Each subdirectory that is not shared is organized as shown in Figure 1.

Unionfs is a stackable file system in which two or more directories are joined into one directory. Unionfs joins the shareable files on the single (read-only) OS image with rewritable files on a memory file system. For /etc, a 2 MByte memory file system is stacked over the single OS image by unionfs, and the host dependent files shown in Table 1 are generated by a new hook mechanism called /etc/rc.diskless.
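The stacking step for /etc can be sketched as two mount commands. The sketch below only builds the argument lists and does not run them. It assumes tmpfs as the memory file system and a hypothetical staging directory /.rw/etc that already exists on the read-only image; the dirs= option with =rw and =ro branch suffixes is the unionfs mount syntax.

```python
def etc_overlay_commands(staging="/.rw/etc", target="/etc", size="2m"):
    """Return the mount commands that stack a writable memory file
    system over the read-only /etc of the shared OS image.

    staging is a hypothetical mount point for the memory file system;
    tmpfs is an assumption standing in for "memory file system".
    """
    return [
        # 1. Create the small writable branch in memory.
        ["mount", "-t", "tmpfs", "-o", f"size={size}", "tmpfs", staging],
        # 2. Join it (read-write, on top) with the shared /etc (read-only).
        ["mount", "-t", "unionfs",
         "-o", f"dirs={staging}=rw:{target}=ro", "unionfs", target],
    ]


if __name__ == "__main__":
    for command in etc_overlay_commands():
        print(" ".join(command))
```

Because the writable branch comes first in dirs=, files generated later by the hook (for example fstab or mtab) land in memory and hide the copies on the shared image, while all other files in /etc remain visible unchanged.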


Figure 2. System configuration for updating the OS image

Table 1. The host dependent files and directories on /etc

| File/directory | Reasons |
|---|---|
| fstab | The definitions of removable media devices are rewritten by murasaki (hotplug) or kudzu. |
| modules.conf | The definitions of hardware devices are rewritten by kudzu. |
| murasaki/murasaki.preload | The definitions of hot pluggable devices are rewritten by kudzu. |
| X11/xorg.conf | Special settings for the X Window System are generated. Auto-configuration of X may work in some environments, but some additional settings such as keyboard, mouse, and VNC cannot be reflected by auto-configuration. |
| mtab | Rewritten on every mount, in particular by the user mount feature. Some diskless HOWTOs recommend making mtab a symbolic link to /proc/mounts, but that disables the user mount feature. |
| sysconfig/ | General settings. |
| printcap | The definitions of printers are rewritten depending on the location. Users can only access the printers in the same room. |

Table 2. The host dependent files and directories on /var

| File/directory | Reasons |
|---|---|
| cache/man/whatis | In the single OS image, the manual set is also identical. |
| lib/rpm/ | In the single OS image, the rpm list is also identical. |

As for /var, we use a more sophisticated hook procedure in /etc/rc.diskless. First, it mounts the device given in /etc/fstab onto /var. Note that the initial /etc/fstab was generated earlier in /etc/rc.diskless. Basically, /var is placed on the NFS server to leave as much free memory as possible, and the hook generates shareable files or directories as symbolic links to the single OS image (Table 2). If the mount operation fails, the hook mounts a memory file system whose size is half of the main memory (the default value of mount.mfs(1)) onto /var and generates preliminary files, directories and symbolic links, using /var in the single OS image as the reference. Then /etc/syslog.conf is regenerated for remote logging. This procedure provides fault tolerance against server misconfiguration. Moreover, although /var is a host dependent directory, the files and directories under it change when the single OS image is updated. So the hook also updates /var, using /var in the single OS image as the reference.

For /dev, the Linux kernel has a devfs feature, which generates special files dynamically, but many Linux distributions do not use this feature by default and it may cause trouble. So we chose to have a new hook called /etc/rc.MAKEDEV generate the special files and then mount a 1 MByte memory file system onto /dev using the --bind feature. This was inspired by the /dev creation in NetBSD. /dev on the single OS image contains the smallest set possible: at least /dev/console, /dev/initctl, /dev/null and /dev/tty0.

/tmp is on the memory file system, and we only need to mount it.

In our approach, only these two hooks in one file (/etc/rc.d/rc.sysinit) are required for our diskless Linux system, so updating the original distribution (e.g. applying security fixes) is very easy.

3.3 Auto-configuration of hardware

In Vine Linux, Kudzu is used for the auto-configuration of hardware. It runs in the boot sequence and generates files in /etc according to the PC configuration. As many PCs in an educational computer system have almost identical hardware configurations, Kudzu, which usually takes a long time to run, is not necessary on every boot. Our configuration file therefore holds sets of pre-defined, known hardware configurations together with some additional information; these are compared with the output of dmesg to decide whether or not to run Kudzu. If no set of configuration files matches the dmesg output, Kudzu runs normally in the boot sequence.
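The format of the configuration file is not given in this paper. As a sketch of the decision only, assume each known hardware configuration is described by a hypothetical list of marker strings that must all appear in the dmesg output; the profile names and marker strings in the usage example are made up for illustration.

```python
def match_profile(dmesg_text, profiles):
    """Return the name of the first profile whose markers all occur in
    dmesg_text, or None when no known configuration matches.

    profiles maps a profile name to a list of marker strings
    (a hypothetical format, not the one used in our system).
    """
    for name, markers in profiles.items():
        # An empty marker list would match any machine, so skip it.
        if markers and all(marker in dmesg_text for marker in markers):
            return name
    return None


def needs_kudzu(dmesg_text, profiles):
    """Kudzu is needed only when the hardware is not a known profile."""
    return match_profile(dmesg_text, profiles) is None
```

For example, with profiles = {"room-a": ["PRO/1000", "Pentium(R) 4"]}, a dmesg text containing both strings selects the pre-defined files for "room-a" and skips Kudzu, while any other machine falls back to the normal Kudzu run.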


3.4 Updating the OS image

As Figure 2 shows, one PC (Maintenance Machine) mounts the master OS image (Distribution Original Tree) in read-write mode and applies updates with Apt, a package maintenance tool. Then the boot server makes a new OS image (Merged Tree) and an archive of the differences. Each client PC periodically checks the update time of its OS image (Main Tree). If the PC finds that its OS image has been updated, it invokes the reboot procedure when no users are logged in.
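The client-side check can be sketched as follows. The sketch assumes that the update time is taken from the modification time of a hypothetical marker file on the OS image and that logged-in users are listed with the who command; neither detail is specified in this paper.

```python
import os
import subprocess


def image_mtime(marker_path):
    """Modification time of a marker file on the OS image (hypothetical)."""
    return os.stat(marker_path).st_mtime


def logged_in_users():
    """Return the login names reported by `who` (one per session)."""
    result = subprocess.run(["who"], capture_output=True, text=True)
    return [line.split()[0] for line in result.stdout.splitlines() if line.strip()]


def should_reboot(mtime_at_boot, mtime_now, users):
    """Reboot only if the image is newer than at boot and nobody is logged in."""
    return mtime_now > mtime_at_boot and len(users) == 0
```

A periodic job would record image_mtime() once at boot, call it again on each check, and start the reboot procedure when should_reboot() returns True. Deferring the reboot while users are logged in keeps running sessions from being interrupted by an image update.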

4. RESULTS

4.1 Boot up time

We evaluated the boot up time of the system in our experimental environment (Table 3). We define the boot up time as the interval from the start of PXE until the login panel appears on the X Window System. The results in Table 4 show that the diskless Linux system boots 15 seconds (PC1) and 5 seconds (PC2) later than a normal Linux system booted from a local HD. We do not regard this as a significant difference; therefore we believe that a diskless Linux system is suitable for an educational computer system.

Table 3. Environment of the evaluation

| Component | Specification |
|---|---|
| Server | SGI Origin300, R10000 600MHz x4, 4GB mem, 500GB (FC, RAID5), 1000baseSX |
| PC1 | IBM Intellistation, PentiumIII 500MHz, 128MB mem, 100baseTX (Intel PRO/100B) |
| PC2 | DELL Optiplex GX260, Pentium4 2.6GHz, 512MB mem, 1000baseT (Intel PRO/1000MT) |
| Network | Summit1i, Summit24 |

Table 4. Boot up time

| | Diskless | Local HD |
|---|---|---|
| PC1 | 90 seconds | 75 seconds |
| PC2 | 75 seconds | 70 seconds |
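The overhead of diskless booting can be read off Table 4 directly. The short calculation below only restates the measured values from the table as an absolute and a relative difference; it introduces no new measurements.

```python
# Boot up times from Table 4, in seconds: (diskless, local HD).
boot_times = {"PC1": (90, 75), "PC2": (75, 70)}


def overhead(diskless, local_hd):
    """Return the extra seconds and the percentage relative to local HD."""
    extra = diskless - local_hd
    return extra, extra * 100.0 / local_hd


for pc, (diskless, local_hd) in boot_times.items():
    extra, percent = overhead(diskless, local_hd)
    print(f"{pc}: +{extra} s ({percent:.1f}% slower than local HD)")
```

The relative overhead is smaller on PC2, the newer machine with the faster network interface.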

4.2 Notes on the code

We chose Vine Linux 3.1 as our base Linux distribution and adapted it to our diskless system using unionfs. Our approach makes only small modifications to the Vine Linux distribution, and we believe that these modifications can easily be applied to other Linux distributions.

5. LESSONS LEARNED

In this paper, we described how we built our educational computer system using diskless PCs to reduce the TCO. We chose Vine Linux 3.1 as our base Linux distribution and made small modifications for our diskless system using unionfs. Our system has been in operation since March 2005. We will evaluate how much this approach reduces the TCO, how to measure availability in the face of boot server failures, whether the performance is sufficient, and what else is required of the boot servers.

6. ACKNOWLEDGMENTS

We would like to thank Daisuke Suzuki, Akira Yamada and Koji Matsubayashi (Vine Caves, Ltd.).

This research was supported in part by Grant-in-Aid for Scientific Research (No. 17500050), Ministry of Education, Science and Culture, Japan.

7. RESOURCES

Diskless Linux Mini Howto

http://www.linux.or.jp/JF/JFdocs/archive/Diskless-HOWTO.html

Preboot Execution Environment (PXE) Version 2.1

ftp://download.intel.com/labs/manage/wfm/download/pxespec.pdf

ISC DHCPD

http://www.isc.org/sw/dhcpd/

Tim Hurman, PXE daemon

http://www.kano.org.uk/projects/pxe/

Vine Linux

http://www.vinelinux.org/

Virtual Image Disk

http://www.mintwave.co.jp/tc/vid.html

Apple NetBoot

http://www.apple.com/jp/server/macosx/netboot.html

8. REFERENCES

[1] Hideo MASUDA (editor): ``The Large Scale Educational Computer Systems'', IPSJ MAGAZINE, Vol.45, No.3 (Mar 2004), 225-281 (in Japanese).

[2] Shin-ichi TADAKI, Hirofumi ETO, Kenzi WATANABE and Yoshiaki WATANABE: ``Diskless Dual Boot Terminal System'', IPSJ MAGAZINE, Vol.45, No.3 (Mar 2004), 250-254 (in Japanese).

[3] Koji ANDO and Tetsuro TANAKA: ``MacOS X for Large Scale Educational Computer Systems'', IPSJ MAGAZINE, Vol.45, No.3 (Mar 2004), 243-246 (in Japanese).

[4] Hideo MASUDA and Akinori SAITOH: ``Implementation of Linux system for educational computer system using diskless technology'', a Symposium of IPS DSM2004 (Dec 2004), 87-92 (in Japanese).
