Device Driver Isolation within Virtualized Embedded Platforms

Sebastian Sumpf, Jörg Brakensiek
Abstract—Mobile device manufacturers face the challenge of maintaining device drivers across an increasing number of product variants. Isolating device drivers into a separate domain using virtualization technology offers a way out of the resulting porting dilemma. This paper explains the architecture concept based on the L4/Fiasco microkernel, using an exemplary block device. The paper specifically details challenges and solutions with respect to enabling DMA access and ensuring access protection in the virtualized environment. Benchmark results are given to show the performance impact of the chosen architecture.

Index Terms—Embedded Platform, Device driver, Virtual machine

I. INTRODUCTION

Mobile communication has become the dominant branch of the communications business over the last decade, reflected in total handset sales exceeding the one-billion threshold in 2007 [1].

At the same time, the number of new products and product variants has increased tremendously; between 2006 and 2007, for example, more than 80 products were introduced to the market [2]. Unfortunately, there is no single platform architecture used across all products. This means that porting and maintaining the software stack is necessary for each of them, a huge effort for the manufacturer.

To look at the problem in a simplified manner, we consider that products are built from any combination of one of O operating systems and one of P_Core core platforms or P_Variant platform variants, which contain D_Core device components or their variants (D_Variant). Variants of core components capture implementation differences. We assume that the number of core components is far smaller than the number of their variants. In our simplification, the overall engineering effort then scales with:

O · (P_Core + P_Variant) · (D_Core + D_Variant).

Sebastian Sumpf is a Research Intern with Nokia Research Center, 955 Page Mill Road, Palo Alto, CA 94304 USA (e-mail: [email protected]).
Jörg Brakensiek is a Principal Member of Research Staff with Nokia Research Center, 955 Page Mill Road, Palo Alto, CA 94304 USA (e-mail: [email protected]).

To manage this challenge, rigorous architecture management principles, hardware abstraction, and modularization of the device architecture have been considered [4]. While this helps in the application domain, the challenge for the device drivers remains: in current software architectures, all components are tightly linked.

Virtualization promises to decouple and isolate multiple OS environments running on top of a Virtual Machine Monitor (VMM), as shown in Figure 1.

Figure 1: Multiple Virtual Environments (VM 1: Linux, VM 2: Symbian, VM 3: Device Driver, VM 4: Nokia, VM 5: User, all hosted by the Virtual Machine Monitor on the HW platform)

Decoupling the application OS from isolated device driver domains should drastically reduce the engineering effort; the porting effort now scales with:

O · P_Core + (P_Core + P_Variant) · (D_Core + D_Variant) + O · D_Core.

The first term reflects that each operating system still has to be ported to the core platform. The second term considers that for each platform, all device drivers are subject to change. The third term finally takes into account the remaining effort to port the virtual-device-driver stubs to the different operating systems. We assume that these stubs are not affected by any device variant, as their interfaces are (ideally) hardware independent.

It has to be noted that we do not consider the porting effort of the VMM to the core platform, as this is done either by the platform vendor or by the VMM provider.

Because device drivers are now decoupled from the application OS, their porting effort can be more easily handled by the actual component provider. This can further reduce the porting effort, enabling the manufacturer to focus on the core parts linked to the application OS:

O · (P_Core + D_Core).

The following table compares the porting effort for a number of configurations.

Table 1: Porting effort for different configurations (in % of the original effort), with O = 3 in all cases

Configuration                                             Original effort   Device driver isolation   Core device driver
P_Core = 1, P_Variant = 5, D_Core = 10, D_Variant = 20    540               213 (39%)                 33 (6.1%)
P_Core = 5, P_Variant = 20, D_Core = 10, D_Variant = 20   2250              795 (35%)                 45 (2.0%)
P_Core = 1, P_Variant = 5, D_Core = 20, D_Variant = 80    1800              663 (37%)                 63 (3.5%)
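For the first configuration, for instance, the three estimates evaluate to 3 · (1 + 5) · (10 + 20) = 540 for the original effort, 3 · 1 + (1 + 5) · (10 + 20) + 3 · 10 = 213 with device driver isolation, and 3 · (1 + 10) = 33 once the core device drivers are handled by their component providers.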

In addition to the reduction of the porting effort, device driver isolation addresses three further challenges in heterogeneous operating environments:

1) It drastically reduces the code size (i.e., memory costs) in the case of multiple OS environments (e.g., Symbian and Linux [4]) by sharing the device driver frameworks (which account for 52% of the Linux kernel code [11]).

2) It allows legacy drivers from operating systems other than the application OS to be reused.

3) It increases application domain stability by decoupling the application domain from potential device driver failures.

II. ISOLATED DEVICE DRIVER REFERENCE ARCHITECTURE

A. Proof of concept

In order to evaluate whether isolated device drivers meet performance and security needs, a reference architecture has been defined, as described throughout the remainder of this document.

Our system setup features a device driver VM (DD-VM in Figure 1), which has direct access to the system hardware (I/O registers or I/O ports) using a native device driver. The VMM is configured to redirect hardware interrupts of the device to the DD-VM. An application VM (App-VM) contains special device drivers, which communicate directly with the DD-VM in order to issue device requests. This setup essentially forms a client–server relationship. Note that the guest operating systems might be heterogeneous (as suggested in Figure 1), so the device driver interface of the App-VM might differ from the one the DD-VM implements. From this fact we conclude that the essential part both parties have to agree upon is the communication interface between the App-VM and the DD-VM ([8] tries to define a uniform solution to this problem). This conclusion also defines the requirements to be met by the interface design (a possible message layout is sketched after the list):

1) The App-VM driver (front-end driver) must be able to issue device requests to the DD-VM in an asynchronous fashion.

2) The DD-VM must be able to notify the App-VM about finished device requests, also in an asynchronous way.

3) Remote access to devices should be controlled and enforced by the DD-VM.
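The paper specifies only these requirements, not a concrete wire format. The following C declarations are a minimal sketch of one message layout that would satisfy requirements 1) and 2); all type and field names are our own assumptions rather than the reference implementation's.

/* Hypothetical message layout for the App-VM <-> DD-VM channel. */
#include <stdint.h>

enum blk_req_type { BLK_READ, BLK_WRITE };

/* Requirement 1): an asynchronous device request, front-end to DD-VM. */
struct blk_request {
    uint32_t device_id;    /* remote block device to address */
    uint32_t request_id;   /* unique id, echoed in the completion */
    uint64_t sector;       /* first device sector of the transfer */
    uint32_t type;         /* a blk_req_type value */
    uint32_t page_count;   /* number of guest-physical pages used */
    uint64_t pages[8];     /* their addresses (fixed maximum for the sketch) */
};

/* Requirement 2): an asynchronous completion notification, DD-VM to App-VM. */
struct blk_completion {
    uint32_t request_id;   /* matches a pending blk_request */
    int32_t  status;       /* 0 on success, negative error code otherwise */
};

Requirement 3) is not part of the message format itself; it is enforced by the DD-VM when a request arrives (Section III.E).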

B. Platform description

Figure 2: Reference Architecture Setup using L4 primitives

Our reference implementation uses an L4 microkernel as VMM (L4/Fiasco [18]). Microkernels offer only a very limited number of system calls. The relevant system call for our implementation scenario is IPC (Inter-Process Communication). IPC is used to transfer data between different address spaces, to map memory pages from one address space to another (essentially a page table manipulation), and as a vehicle for interrupt delivery to user processes. IPC thus fulfills requirements 1) and 2) of Section II.A.

We use L4Linux as the VM environment, a para-virtualized version of Linux executing directly on top of the microkernel. L4Linux is deployed as both the DD-VM and the App-VM (Figure 2).

III. DESIGN

Having briefly introduced the IPC mechanism in the previous chapter, we next examine the requirements of the front- and back-end drivers. To date, and despite industry efforts [17], operating systems do not define a uniform device driver interface. Rather, devices are divided into classes (e.g., block devices, character devices, network devices, etc.). Because of the heterogeneous nature of these device classes, the IPC communication channel between VMs has to be implemented separately for each device class. For our reference implementation we chose to provide a block device driver interface, with the aim to support all block devices currently available for Linux. To give an idea that this approach is feasible: together, the client and server code hardly exceed 1000 lines of code.


A. The Linux block layer

The basic unit of block data transfer in Linux is the block input–output (BIO) structure. A BIO contains information about the associated device and device sectors, as well as about the memory pages used for the data transfer. A block device driver usually does not work with BIOs directly. Rather, the driver allocates and registers a request queue for each device. Linux then builds requests by gathering one or more BIOs per request and inserts them into the device's request queue (Figure 3). Exploiting this method, Linux can take advantage of BIO re-ordering (e.g., using the elevator algorithm to group BIOs so as to minimize hard disk positioning and seek times).
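For illustration, the following simplified C model captures the relationships just described. It is a deliberately reduced sketch: the actual 2.6-era kernel definitions in <linux/bio.h> and <linux/blkdev.h> contain many more fields.

/* Simplified model of the Linux block layer structures (sketch). */
#include <stdint.h>

struct bio_vec {                  /* one memory segment of a transfer */
    void    *bv_page;             /* page holding the data */
    uint32_t bv_offset;           /* offset of the data within the page */
    uint32_t bv_len;              /* length of the segment in bytes */
};

struct bio {                      /* basic unit of block I/O */
    uint64_t        bi_sector;    /* first device sector of the transfer */
    int             bi_rw;        /* read or write */
    struct bio_vec *bi_io_vec;    /* memory pages used for the transfer */
    uint16_t        bi_vcnt;      /* number of segments in bi_io_vec */
    void          (*bi_end_io)(struct bio *bio, int error); /* completion hook */
    void           *bi_private;   /* owner-defined cookie */
    struct bio     *bi_next;      /* chaining within a request */
};

struct request {                  /* gathers one or more adjacent BIOs */
    struct bio     *bio_list;     /* BIOs grouped by the elevator */
    struct request *next;         /* position in the device's request queue */
};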

Figure 3: Relationship between block requests, BIOs, and memory

B. The front-end driver

The front-end driver is registered within the App-VM. As stated in requirement 1), its main task is to issue block requests to the DD-VM. It is implemented as a stub driver.

Linux optimizations regarding request queues, and the front-end driver's lack of knowledge of what type of hardware device is actually accessed on the remote side, do not allow us to allocate request queues for remote devices within the front-end driver (Section III.C). Rather, the driver works directly with BIO structures.

Upon receipt of a BIO request from Linux, the front-end driver extracts the necessary information (device identifier, sector number, request type, etc.) from the BIO structure and forwards it via IPC to the DD-VM. Additionally, the guest physical addresses of the memory pages described within the BIO structure are determined and forwarded as well. At this point the BIO request is marked pending and stored inside the front-end driver. To be able to distinguish different BIO requests, a unique identifier is assigned to each BIO.

Whether the BIO request succeeds or fails, the DD-VM notifies the front-end driver, again via IPC, using the provided BIO identifier. The front-end driver determines the correct BIO and marks it as processed, thus ending the I/O request within the application Linux.
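A condensed C sketch of this issue/complete cycle follows. The IPC helper ipc_send(), the address query guest_phys(), the device lookup remote_device_id() and the pending table are hypothetical stand-ins; only the overall flow follows the text.

/* Front-end issue path (sketch); collision handling and error paths omitted. */
#define MAX_PENDING 128                      /* arbitrary bound for the sketch */

static uint32_t next_id;                     /* source of unique BIO ids */
static struct bio *pending[MAX_PENDING];     /* in-flight BIOs, indexed by id */

static int frontend_make_request(struct bio *bio)
{
    struct blk_request req = {
        .device_id  = remote_device_id(bio),      /* hypothetical lookup */
        .request_id = next_id++,
        .sector     = bio->bi_sector,
        .type       = bio->bi_rw ? BLK_WRITE : BLK_READ,
        .page_count = bio->bi_vcnt,
    };

    /* Forward the guest-physical address of every page in the BIO. */
    for (uint16_t i = 0; i < bio->bi_vcnt; i++)
        req.pages[i] = guest_phys(bio->bi_io_vec[i].bv_page);

    pending[req.request_id % MAX_PENDING] = bio;  /* mark the BIO pending */
    return ipc_send(DD_VM_INTERFACE, &req, sizeof(req));
}

/* Driven by the signal thread once the DD-VM reports completion. */
static void frontend_complete(const struct blk_completion *c)
{
    struct bio *bio = pending[c->request_id % MAX_PENDING];

    pending[c->request_id % MAX_PENDING] = NULL;
    bio->bi_end_io(bio, c->status);   /* ends the I/O inside the application Linux */
}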

Figure 4: Relationship between block requests, threads, and IPC. Note that the back-end acts as a server, whereas the front-end acts as a client.

Asynchronous notification is achieved by starting a Linux-independent signal thread within the App-VM that is able to interrupt the Linux thread. The counterpart of the signal thread is executed in the DD-VM and exposes the remote communication interface. Figure 4 displays the actual client–server interaction. Client requests are always issued by the Linux kernel thread and are received by the DD-VM's interface thread. In turn, the DD-VM always communicates with its client through the signal thread. Note that the interfaces both of these threads expose are inherently Linux independent.

C. The back-end driver

Since one of our goals is device driver code re-use, the DD-VM does not implement a stub driver. Instead, Linux starts the interface thread (Figure 4) during boot-up, thus enabling the IPC communication interface. Block device drivers may (locally) register their devices using a simple library call. There is no need for further changes within a device driver, because the IPC communication as well as the request generation is handled entirely by the interface thread. Registered devices can now be imported by eligible clients.
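The main loop of the interface thread can be pictured as follows; ipc_open_wait(), ipc_reply_error(), access_allowed(), device_for() and backend_submit() are hypothetical names for the L4 primitive and the routines described in this and the following sections.

/* DD-VM interface thread (sketch): an open wait accepts requests from
 * any task, so access control must happen explicitly (Section III.E). */
static void interface_thread(void)
{
    struct blk_request req;
    uint32_t sender;

    for (;;) {
        sender = ipc_open_wait(&req, sizeof(req));   /* any VM may send */

        if (!access_allowed(sender, &req)) {
            /* A failed check simply becomes an I/O error at the front-end. */
            ipc_reply_error(sender, req.request_id);
            continue;
        }
        backend_submit(&req, device_for(req.device_id));
    }
}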

Upon receipt of a block request issued by the front-end driver, the DD-VM first applies a security check (Section III.E), where a failure simply translates into an I/O error at the front-end side. On success, the DD-VM generates a new block request in the following manner: In the first step, we allocate a new BIO data structure and enter the vital I/O information received with the block request. Next, the guest physical memory pages provided by the front-end are mapped into the DD-VM's address space using IPC calls to the front-end's signal thread (Figure 5), making these pages accessible from both domains.

In the final step, we add the resulting local page addresses to the BIO descriptor and append the BIO to the request queue of the appropriate block device. This is the reason why we chose to handle raw BIO structures in the front-end driver (Section III.B): request optimizations are now handled in the DD-VM only, where there is explicit device driver knowledge of the underlying system hardware.

Figure 5: Mapping a memory page in address space A to address space B.

To be able to signal I/O completion to a client, we use a custom BIO destructor (Figure 4).
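Combining the steps above, a kernel-flavored sketch of the back-end path could look as follows. map_client_page() and ipc_notify() are hypothetical stand-ins for the page-mapping IPC and the signal-thread notification; bio_alloc(), bio_add_page() and submit_bio() are the regular 2.6-era kernel calls.

/* Back-end request generation and completion signalling (sketch). */
static void backend_end_io(struct bio *bio, int error);

static void backend_submit(struct blk_request *req, struct block_device *bdev)
{
    struct bio *bio = bio_alloc(GFP_KERNEL, req->page_count);

    bio->bi_sector  = req->sector;               /* vital I/O information */
    bio->bi_bdev    = bdev;
    bio->bi_private = (void *)(uintptr_t)req->request_id;
    bio->bi_end_io  = backend_end_io;            /* custom completion hook */

    /* Map each front-end page into the DD-VM and attach it to the BIO. */
    for (uint32_t i = 0; i < req->page_count; i++)
        bio_add_page(bio, map_client_page(req->pages[i]), PAGE_SIZE, 0);

    submit_bio(req->type == BLK_WRITE ? WRITE : READ, bio);
}

static void backend_end_io(struct bio *bio, int error)
{
    struct blk_completion c = {
        .request_id = (uint32_t)(uintptr_t)bio->bi_private,
        .status     = error,
    };

    ipc_notify(APP_VM_SIGNAL_THREAD, &c, sizeof(c)); /* wake the signal thread */
    bio_put(bio);
}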

D. Support for Direct Memory Access (DMA)

DMA requires knowledge of (host) physical addresses. Therefore, the driver VM must be able to obtain the host physical address behind a guest physical address provided by the front-end driver. Even though application VMs may have knowledge of these translations, it is unsafe to trust them: a malicious VM could provide an arbitrary host–guest address mapping, causing a device to overwrite any physical memory location within the system.

We support DMA by taking advantage of the L4 memory management system. One of the basic concepts in L4 is that there is no policy inside the kernel [13]. This postulate especially holds for memory management, which in L4 is handled entirely by so-called data space managers [7] in user space. Memory is allocated by requesting data containers (i.e., data spaces) of the desired size from a data space manager. A task owning a data space can share its data space access rights with other tasks.

We use the data space sharing mechanism to obtain host physical addresses within the DD-VM. Each front-end driver shares its main memory data space with the DD-VM in such a way that the DD-VM can neither read nor write any data from or to this data space. On the other hand, the DD-VM can very well query the underlying data space manager to obtain any host physical address within the given data space.
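A sketch of the resulting address lookup is given below; struct dataspace and ds_phys_addr() stand in for the shared data space and the data space manager query, which we do not reproduce verbatim here.

/* Resolve a front-end guest-physical address into a host-physical
 * address usable for DMA (sketch). */
static int dma_addr_for(struct dataspace *ds, uint64_t guest_phys,
                        uint64_t *host_phys)
{
    /* The DD-VM holds neither read nor write rights on ds; querying
     * the data space manager is the only operation available to it. */
    if (ds_phys_addr(ds, guest_phys, host_phys) != 0)
        return -1;    /* address not backed by the client's data space */

    return 0;         /* *host_phys may now be handed to the device */
}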

E. Security considerations

Multiple VMs should be able to connect to the remote driver. This property is achieved by using L4's open wait IPC semantics at the interface thread. Open wait implies that every task in the system can send an IPC message to the waiting thread. We therefore implemented a small access control list at the back-end side, where access rights (none, read, write, read–write) can be defined on a per-VM basis. These rights may be specified at device partition granularity.

The front-end driver keeps track of the memory pages that are mappable to a DD-VM. Only pages that are used in ongoing block device operations are allowed to be mapped by a DD-VM.
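A minimal version of such an access control list is sketched below; the granularity and the set of rights follow the text, while all names and the example table contents are our own.

/* Per-VM, per-partition access rights (sketch). */
#include <stddef.h>
#include <stdint.h>

enum access { ACC_NONE = 0, ACC_READ = 1, ACC_WRITE = 2, ACC_RW = 3 };

struct acl_entry {
    uint32_t    vm_id;        /* requesting VM (the IPC sender) */
    uint32_t    partition;    /* device partition the entry covers */
    enum access rights;
};

static const struct acl_entry acl[] = {
    { .vm_id = 1, .partition = 0, .rights = ACC_RW   },  /* example entries */
    { .vm_id = 2, .partition = 0, .rights = ACC_READ },
};

/* Grant a request only if an entry covers it; a failure later surfaces
 * as an I/O error at the front-end (Section III.C). */
static int acl_allows(uint32_t vm, uint32_t partition, enum access need)
{
    for (size_t i = 0; i < sizeof(acl) / sizeof(acl[0]); i++)
        if (acl[i].vm_id == vm && acl[i].partition == partition)
            return (acl[i].rights & need) == need;

    return 0;    /* no entry: no access */
}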

As described in [8], device drivers cannot be prevented from writing data to arbitrary host physical memory by means of DMA. At present, this problem can only be solved using an IOMMU [6], where additional address translation and access checks take place.

IV. EVALUATION

In this section we explore the performance impact of the isolated driver model by executing two benchmarks and comparing the results with native driver execution in L4Linux. Most of the performance losses will be due to additional address space switches between the App-VM and the DD-VM, IPC overhead, and page mapping operations. We expect these latencies to be small compared to the cost of I/O operations.

Our benchmarking hardware consists of an Intel Atom platform featuring 512 MB of memory. As the block device we use a 4 GB Micro-SD adapter with a FAT32 file system.

Tiobench [16] measures low-level disk performance through block read–write operations in both sequential and random order.

The results are displayed in Table 2. We used a block size of 8 KB and a file size of 256 MB for the read–write operations. The App-VM was configured with 64 MB of main memory, while the DD-VM was given 32 MB; the latter is due to the ramdisk size of 16 MB used and may be reduced even further.

                Sequential                Random
                Throughput   Latency     Throughput   Latency
                [MB/s]       [ms]        [MB/s]       [ms]
Native Read     8.93         3.41        3.29         9.20
Remote Read     8.71         3.54        1.61         19.26
Native Write    2.58         11.77       0.10         290.25
Remote Write    2.55         11.94       0.05         539.90

Table 2: TIOBENCH Results

As can be observed, the sequential read–write throughput of our remote driver reaches more than 95 percent of the native driver's performance. While the bandwidth of random write operations is negligibly small for both drivers, we are still concerned about the random read case, which achieves only half of the native L4Linux driver's performance.

IOzone [10] is a filesystem benchmark tool which is able to measure a broad variety of file operations. We execute IOzone with the same configuration as used for tiobench (Table 3).

            Write [MB/s]              Read [MB/s]
            Sequential   Random       Sequential   Random
Native      3.76         0.14         9.15         5.89
Remote      3.75         0.07         9.13         0.57

Table 3: IOzone Results

The random-read performance of the remote driver decreases even further and reaches only 10% of the native-driver performance. In the random-read case, the amount of data to transfer per BIO is, for most requests, just one block, whereas sequential reads mainly issue multiple adjacent block requests within a single BIO. After extensive tracing we found that processing a single block request exposes a native-driver latency of about 1.5 ms, compared to ~10 ms (in some cases even 20 ms) using the remote driver. Further investigation showed that these latencies are caused by the microkernel's scheduler: after the App-VM issues a BIO request, one or more scheduling periods (10 ms in L4/Fiasco) may pass before the DD-VM is executed. Because large block requests are processed more slowly by the SD card device, the additional latency does not have much performance impact on sequential reads. On the other hand, reading a single block of data is at least six times faster than the smallest scheduling latency (1.5 ms versus 10 ms), which explains the measured random-read performance. Note that this observation also holds for write operations, which are slower than read operations on the SD device used. Since sequential and random I/O operations represent the two possible extremes, we expect real-life performance to lie somewhere in between.

V. RELATED WORK

VMware Workstation [15] multiplexes the processor between two collaborating operating systems using world switches. Fluke [14] aims at supporting unmodified Linux device drivers running on top of a VMM as user-level applications. Pistachio [12], VirtualLogix [3], and Xen [8] implement driver isolation and re-use by means of virtual machines.

VI. CONCLUSION AND OUTLOOK

This paper presented a reference implementation of an isolated device driver environment. We demonstrated how an application OS can be separated from an isolated driver, and determined, through our benchmark studies, the actual performance penalties introduced by this approach.

We expect that future hardware support mechanisms will specifically simplify the security issues with respect to DMA-capable devices.

We also still need to verify this approach within a heterogeneous operating system setup.

REFERENCES

[1] Nokia Press Release, October 18, 2007; http://www.nokia.com/results/results2007Q3e.pdf
[2] http://en.wikipedia.org/wiki/List_of_Nokia_products
[3] F. Armand, M. Gien, G. Maigné, and G. Mardinian. Shared Device Driver Model for Virtualized Mobile Handsets. In Proceedings of the 2008 Workshop on Virtualization in Mobile Computing (MobiVirt 2008), Breckenridge, 2008.
[4] J. Brakensiek, T. Eriksson, R. Suoranta, and P. Liuha. Modular Service Based Platform Architecture for Mobile Devices. WWRF, Shanghai, 2006.
[5] J. Brakensiek, A. Dröge, H. Härtig, and A. Lackorzynski. Virtualization as an Enabler for Security in Mobile Devices. In 1st Workshop on Isolation and Integration in Embedded Systems (IIES'08), Glasgow, Scotland, UK, April 1, 2008. ISBN 978-1-60558-126-2.
[6] AMD I/O Virtualization Technology (IOMMU) Specification, February 2007.
[7] M. Aron, J. Liedtke, K. Elphinstone, Y. Park, T. Jaeger, and L. Deller. The SawMill Framework for Virtual Memory Diversity. In Australasian Computer Systems Architecture Conference, pages 3–11, Gold Coast, Queensland, Australia, 2000.
[8] K. Fraser, S. Hand, R. Neugebauer, I. Pratt, A. Warfield, and M. Williamson. Safe Hardware Access with the Xen Virtual Machine Monitor. In 1st Workshop on Operating System and Architectural Support for the On-Demand IT Infrastructure, Boston, October 2004.
[9] H. Härtig, J. Löser, F. Mehnert, L. Reuther, M. Pohlack, and A. Warg. An I/O Architecture for Microkernel-Based Operating Systems. Technical Report TUD-FI03-08-Juli-2003, TU Dresden, Dresden, July 2003.
[10] IOzone Filesystem Benchmark Documentation, 2006. http://www.iozone.org
[11] G. Kroah-Hartman. Linux Kernel Development. In Proceedings of the Linux Symposium, Ottawa, Canada, June 2007.
[12] J. LeVasseur, V. Uhlig, J. Stoess, and S. Goetz. Unmodified Device Driver Reuse and Improved System Dependability via Virtual Machines. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI 2004), December 2004.
[13] J. Liedtke. Toward Real μ-kernels. Communications of the ACM, 39(9), pages 70–77, September 1996.
[14] K. T. Van Maren. The Fluke Device Driver Framework. Master's thesis, The University of Utah, December 1999.
[15] J. Sugerman, G. Venkitachalam, and B.-H. Lim. Virtualizing I/O Devices on VMware Workstation's Hosted Virtual Machine Monitor. In Proceedings of the 2001 USENIX Annual Technical Conference, Boston, June 2001.
[16] Tiobench Homepage. http://tiobench.sourceforge.net
[17] UDI Core Specification, Version 1.01, 2001. http://www.projectudi.org
[18] http://www.tudos.org