Device Tree Partitioning

Device tree partitioning for multicore, multi-OS embedded software designs

Faheem Sheikh, Mentor Graphics - November 05, 2014

Performance, security, portability, and software consolidation on a single platform are key factors driving the demand for multi-OS multicore designs in many embedded market segments, including industrial/medical, mobile, and automotive. Broadly speaking, these designs can be categorized as homogeneous or heterogeneous computing domains.

Homogeneous computing is characterized by similar processing units (in terms of instruction set architecture) controlled by a single instance of an OS that can handle all the resources on the platform. Symmetric multiprocessing OSes are an example of this, and were widely deployed in the early days of multicore adoption in the embedded industry. However, embedded systems are diverse, and it is impossible to satisfy the majority of requirements and constraints with a homogeneous multicore design. This is where heterogeneous computing comes into the picture, enabling multiple software stacks, each running on a set of cores suited to performing a particular function.

Heterogeneous multicore computing can be further classified into supervised and unsupervised multicore processing. The supervised class covers designs with manager software mediating between multiple software stacks, while the unsupervised class has a manager-less design where software running on one set of cores might assume the ‘master’ role and set up work for the rest of the processing units.

Frequently, the ‘heterogeneity’ of the system comes from diverse software stacks and not necessarily from the processing units. This means that two similar cores, one running Linux and another running an RTOS, would still be classified as a heterogeneous multicore design, even though they share the same instruction set.

This article describes a resource partitioning scheme for a supervised, heterogeneous multicore embedded system, where the system under consideration contains multiple instances of embedded Linux, each running on a different set of cores. After first reviewing the tools currently available for resource partitioning among multi-OS systems, along with their limitations, it proposes a partitioning algorithm that produces a filtered view of the platform for the guest operating systems in the system’s hypervisor. The resulting workflow is explained with the help of a real-world use case.

The need for resource partitioning

Whether supervised or unsupervised, heterogeneous computing introduces a significant resource partitioning challenge. Consider an unsupervised design where multiple OSes on a single platform need to run in tandem, but where each OS is allowed access to only one set of devices. Alternatively, consider a supervised design with a hypervisor supporting multiple guest operating systems, in which the hypervisor and the guests can all have potentially different views of the hardware platform on which they are running.

Flattened device trees (FDTs). While there are several ways by which embedded software can gather hardware information, flattened device trees [1] are fast becoming the preferred way to enable Linux quickly on new hardware. A major breakthrough in the adoption of device trees has been their inclusion in Linux kernel 3.2 for the ARM architecture [2].

With the growing popularity of FDTs, it is natural to look at the data contained within one and see if it can be processed to satisfy resource partitioning requirements. The idea is to take a master device tree completely describing the hardware platform and convert it into multiple independent device trees that would supply restricted hardware views to multiple associated operating systems.

http://www.embedded.com/print/4436959

FDTs are represented by device tree structures (DTS), a convenient textual representation of the platform in the form of a tree. It is possible to hand-edit these DTS files, inserting or removing data as desired, while deriving new slave device trees. But this manual process is prone to errors. For example, what happens if a device assigned to a particular OS needs to be reassigned to another one? This would require changes in more than one place and recompilation of all the device trees touched.

Automating resource partitioning

What is necessary is a program that auto-generates the device trees according to the requirements of the design under consideration. In that context, a good thing about FDTs is that they come with excellent support in the form of a device tree compiler (DTC) and a runtime library (libfdt) to manipulate FDT data [3]. Using these tools, one can write a utility for resource partitioning among multiple OSes by generating new device tree structures/blobs for each OS in the system. Of course, this requires some additional metadata to be specified covering the design requirements. This can be done by extending the master DTS that describes the platform for a regular single-OS design.

Figure 1 is a flow diagram for a supervised heterogeneous multi-OS design with this tool in action. It shows a hypervisor-based design in which platform information for virtual machines (hypervisor guests) is obtained from a master device tree structure. The tool also extracts relevant information about virtual machines required by the hypervisor. Although not shown here, it is easy to extend this flow to generate platform information in any format, thus supporting OSes that don’t employ device trees.

Figure 1: Flow diagram for resource partitioning based on FDTs

An algorithm for FDT partitioning

Resource configuration based on FDTs can be accomplished with the following steps:

1. Annotate the master DTS with additional information about the multicore, multi-OS system. There are two parts to this annotation. The first takes the form of new bindings, where the additional information is kept under a new node defined at the base of the tree. This information covers design specifications such as how many OSes are present in the system, their memory partitioning, the cores on which they run, and the devices allocated to them. The second part is the labeling of nodes in the master device tree wherever such labels are missing. This is required for referencing master DTS nodes in the newly defined configuration node.

2. Decide which nodes in the master DTS should be retained for each OS. This step is the bulk of the work, as it requires finding the dependency sub-trees. Additionally, there might be some mandatory nodes that need to be included for the output to be considered a valid device tree for that platform, so those need to be marked up as well, along with the dependency sub-trees.

3. Copy the master DTB to a buffer per OS. Run a filter removing all unmarked nodes from the copied DTB.

4. For supervised heterogeneous designs, additional nodes such as virtual devices might be required in the guest DTBs. Some guest-related information, such as memory partitioning, is also required by the hypervisor at runtime. The tool should have the ability to extract this information (preferably in the form of macros and definitions).
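The copy-and-filter part of the steps above can be sketched on a toy in-memory tree. This is only an illustration in Python, not the author's tool and not libfdt: the nested-dict tree model and the names `with_ancestors` and `filter_tree` are assumptions introduced here, and the marked paths are a hypothetical marking result for one guest. The sketch shows that retaining a node implies retaining its ancestors, and that filtering works on a per-guest copy so the master tree stays intact.

```python
import copy

# Toy model (an assumption for illustration): a device tree is a nested dict,
# where each key is a node name and each value is a dict of child nodes.
MASTER = {
    "cpus": {"cpu@0": {}, "cpu@1": {}},
    "timer": {},
    "soc": {"serial@48020000": {}, "gpio@4ae10000": {}},
}


def with_ancestors(paths):
    """Return the given node paths plus the path of every ancestor."""
    keep = set()
    for path in paths:
        parts = path.strip("/").split("/")
        for i in range(1, len(parts) + 1):
            keep.add("/" + "/".join(parts[:i]))
    return keep


def filter_tree(tree, keep, prefix=""):
    """Delete, in place, every node whose path is not in `keep`."""
    for name in list(tree):
        path = prefix + "/" + name
        if path in keep:
            filter_tree(tree[name], keep, path)
        else:
            del tree[name]
    return tree


# Hypothetical marking result for one guest: its CPU, one mandatory node
# and one pass-through device.
marked = with_ancestors(["/cpus/cpu@0", "/timer", "/soc/serial@48020000"])

# Filter a copy, so the master tree can be reused for the next guest.
guest0 = filter_tree(copy.deepcopy(MASTER), marked)
print(guest0)
```

A real tool would perform the same two operations on a DTB buffer rather than on Python dictionaries.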

Supervised dual guest use case

Let’s take a dummy master device tree and use the above steps to partition it for a supervised dual guest-OS configuration. It is assumed that the hardware platform is an ARM SoC, so some essential data from the ARM bindings would appear in the master DTS. The guest OSes are assumed to be Linux, and the hypervisor is assumed to be standalone software running on bare metal.

Listing 1 shows the master device tree supporting two CPUs, each dependent on a base_clk node for its ticks. The master DTS for an ARM SoC has some essential nodes in the shape of timer and gic nodes. It also has only a couple of peripherals, namely a general purpose I/O (GPIO) and a universal asynchronous receiver/transmitter (UART). These peripherals depend on some pin multiplex (pin-mux) settings. Nodes that need to be referenced later on are labelled.

/*
 *
 * Master device tree
 *
 */
/dts-v1/;

/ {
    #address-cells = <1>;
    #size-cells = <1>;

    cpus {
        cpu@0 {
            compatible = "arm, some_processor";
            operating-points = <
                /* kHz    uV */
                /* Only for Nominal Samples */
                500000  880000
            >;
            clocks = <&base_clk>;
            clock-names = "cpu";
        };

        cpu@1 {
            compatible = "arm, some_processor";
            operating-points = <
                /* kHz    uV */
                /* Only for Nominal Samples */
                500000  880000
            >;
            clocks = <&base_clk>;
            clock-names = "cpu";
        };
    };

    /* ARM architectural timer and GIC */
    timer {
        compatible = "arm,armv7-timer";
        interrupts = <1 13 0x308>, <1 14 0x308>, <1 11 0x308>, <1 10 0x308>;
        clock-frequency = <6144000>;
    };

    gic: interrupt-controller@abcd0000 {
        compatible = "arm, processor-gic";
        interrupt-controller;
        #interrupt-cells = <3>;
        reg = <0xabcd0000 0x1000>, <0xabcd0000 0x1000>,
              <0xabcd0000 0x2000>, <0xabcd0000 0x2000>;
    };

    base_clk: base_clk {
        #clock-cells = <0>;
        compatible = "some_platform_clock";
    };

    /* Some dummy platform peripherals */
    soc {
        #address-cells = <1>;
        #size-cells = <1>;

        uart1: serial@48020000 {
            compatible = "some_platform_uart";
            reg = <0x48020000 0x100>;
            interrupts = <0 74 0x4>;
            pinctrl = <&uart1_pins>;
            clock-frequency = <48000000>;
        };

        gpio1: gpio@4ae10000 {
            compatible = "some_platform-gpio";
            reg = <0x4ae10000 0x200>;
            interrupts = <0 29 0x4>;
            gpio-controller;
            #gpio-cells = <2>;
            interrupt-controller;
            #interrupt-cells = <2>;
            pinctrl = <&gpio1_pins>;
        };

        uart1_pins: pinmux_uart1_pins {
            pinctrl-single,pins = <0x60 0x0>;
        };

        gpio1_pins: pinmux_gpio1_pins {
            pinctrl-single,pins = <0x196 0x6>;
        };
    };
};

Listing 1: Base (master) device tree structure

Listing 2 below shows two Linux OSes as guests, each to be run on one of the cores supported by the hardware platform. Guest 0 has access to the UART, while the GPIO is only available to Guest 1. Both guests share a virtIO-based virtual console device supported by the hypervisor [4].

/* Include the primary device tree for this platform */
/include/ "parent.dtsi"

/* Partitioning annotation */
/ {
    /* Two guest OS resource configuration DTS node */
    rcfg {
        compatible = "some_config, some_platform";
        #address-cells = <1>;
        #size-cells = <1>;
        num_guests = <2>;

        memory@0xFC100000 {
            reg = <0xFC100000 0x3E00000 0xFC000000 0x100000>;
        };

        virt_console: virtio_console@0xFC004000 {
            compatible = "virtio,mmio";
            reg = <0xFC004000 0x200>;
            interrupts = <0x0 0x31 0x1>;
            interrupt_parent = <&gic>;
        };

        guest@0 {
            #address-cells = <1>;
            #size-cells = <1>;

            kernel@0xc0000000 {
                compatible = "some_kernel";
                reg = <0x80000000 0x4000000>;
                pcpu = "/cpus/cpu@0";
            };

            pt_devices {
                dev1 = <&uart1>;
            };

            vt_devices {
                dev1 = <&virt_console>;
            };
        };

        guest@1 {
            #address-cells = <1>;
            #size-cells = <1>;

            kernel@0xc0000000 {
                compatible = "some_kernel";
                reg = <0xc0000000 0x4000000>;
                pcpu = "/cpus/cpu@1";
            };

            pt_devices {
                dev1 = <&gpio1>;
            };

            vt_devices {
                dev1 = <&virt_console>;
            };
        };
    };
};

Listing 2: Device tree annotations for supervised heterogeneous dual guest configuration
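The algorithm asks the tool to hand guest information to the hypervisor, preferably as macros and definitions. A rough sketch of that idea follows, using the kernel `reg` and `pcpu` values from Listing 2. The dictionary layout, the function name `hypervisor_header`, and the macro names are all hypothetical; the article does not specify the format the actual tool emits.

```python
# Guest data transcribed from the kernel nodes in Listing 2.
# The dict layout is an assumption made for this sketch.
GUESTS = [
    {"id": 0, "mem_base": 0x80000000, "mem_size": 0x4000000, "pcpu": "/cpus/cpu@0"},
    {"id": 1, "mem_base": 0xC0000000, "mem_size": 0x4000000, "pcpu": "/cpus/cpu@1"},
]


def hypervisor_header(guests):
    """Render guest configuration as C preprocessor definitions.

    The macro names are hypothetical, chosen only for illustration.
    """
    lines = ["#define NUM_GUESTS %d" % len(guests)]
    for guest in guests:
        prefix = "GUEST%d" % guest["id"]
        # The physical CPU index is the unit address after '@' in the path.
        cpu = int(guest["pcpu"].rsplit("@", 1)[1], 16)
        lines.append("#define %s_MEM_BASE 0x%08XU" % (prefix, guest["mem_base"]))
        lines.append("#define %s_MEM_SIZE 0x%08XU" % (prefix, guest["mem_size"]))
        lines.append("#define %s_PCPU %d" % (prefix, cpu))
    return "\n".join(lines) + "\n"


header = hypervisor_header(GUESTS)
print(header)
```

Generating the header from the same annotated tree that produces the guest blobs keeps the hypervisor's view and the guests' views from drifting apart when a device or memory region is reassigned.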

The above two device tree structures are combined to get a single DTS representation for the tool to process. Figure 2 below shows a graphical form of the combined DTS. Nodes in blue come from the master DTS, while the yellow nodes represent the data coming from the partitioning annotation.



Figure 2: Device tree structure with annotated data

The next step is to mark up nodes in the combined device tree shown in Figure 2. This is done by traversing the dependency sub-trees formed by the label-reference combination. The device tree compiler assigns a unique ID to every labelled node in the device tree structure. This ID is stored in a property called ‘phandle’. When a labelled node is referenced, this phandle can be used to traverse to that node.
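The label-to-phandle mechanism can be illustrated with a small lookup table. The sketch below is a toy model in Python: the labels and node paths are taken from Listing 1, but the numbering scheme (sorted labels, counting from 1) and the names `assign_phandles` and `node_by_phandle` are assumptions for illustration and do not reproduce how the device tree compiler actually numbers phandles.

```python
# Labels and node paths taken from Listing 1 (a subset, for brevity).
LABELS = {
    "gic": "/interrupt-controller@abcd0000",
    "base_clk": "/base_clk",
    "uart1": "/soc/serial@48020000",
    "uart1_pins": "/soc/pinmux_uart1_pins",
}


def assign_phandles(labels):
    """Give every labelled node a unique integer ID, counting from 1.

    The ordering used here is arbitrary; only uniqueness matters.
    """
    return {label: index for index, label in enumerate(sorted(labels), start=1)}


phandle_of = assign_phandles(LABELS)

# Reverse index: from a stored ID back to the node it identifies.
node_by_phandle = {phandle_of[label]: path for label, path in LABELS.items()}

# A reference such as `pinctrl = <&uart1_pins>` is stored as the target's ID.
uart1_pinctrl = phandle_of["uart1_pins"]

# Following the reference is then a lookup from that ID to a node path.
print(node_by_phandle[uart1_pinctrl])
```

Each such lookup yields one edge of a guest's dependency graph, from the referencing node to the referenced one.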

Figure 3 shows the dependency graphs for both guests in our use case. Yellow nodes highlight the nodes that would only be valid for guest 0, while blue and orange nodes represent the shared nodes, which need to be retained in either guest’s device tree blob. The dotted edges indicate that the nodes on either side are additional metadata that helps in traversing the dependency graphs, while the solid edges actually link the device nodes that need to be retained. The dotted blue path circles the device nodes that would be marked up for inclusion in the generated device tree blob for guest 0.



Figure 3: Dependency graphs in annotated device tree structure

Figure 4 shows the resulting device tree structure generated by the partitioning tool. The yellow node represents a new node inserted by the partitioning tool to configure a virtual console device. The node has been moved from the annotated data to the base of the new device tree.

Figure 4: Filtered platform information available to one of the guests
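Moving the virtual console from the annotation node to the base of a guest's tree, as Figure 4 depicts, can be sketched as follows. This is a toy Python model: the nested-dict layout, the way `vt_devices` entries name their targets, and the function name `add_virtual_devices` are assumptions for illustration; node names are taken from Listing 2.

```python
import copy

# Fragment of the annotated tree from Listing 2 as nested dicts (toy model).
# Here a vt_devices entry simply holds the name of the node it refers to.
ANNOTATED = {
    "rcfg": {
        "virtio_console@0xFC004000": {"compatible": "virtio,mmio"},
        "guest@0": {"vt_devices": {"dev1": "virtio_console@0xFC004000"}},
    },
}


def add_virtual_devices(guest_tree, annotated, guest):
    """Copy each virtual device listed for `guest` to the root of its tree."""
    rcfg = annotated["rcfg"]
    for name in rcfg[guest]["vt_devices"].values():
        guest_tree[name] = copy.deepcopy(rcfg[name])
    return guest_tree


# A hypothetical, already filtered tree for guest 0.
guest0_tree = {"cpus": {"cpu@0": {}}, "soc": {"serial@48020000": {}}}
add_virtual_devices(guest0_tree, ANNOTATED, "guest@0")
print(sorted(guest0_tree))
```

The guest ends up with the virtual device at the top level of its tree, while the configuration node itself never appears in the guest's view.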

Eliminating tree complexity

Figure 3 above shows that the dependency subtree of a guest is a directed acyclic graph, with edges representing phandle references and vertices representing device nodes. In order to mark all the dependent sub-nodes, one can use the depth-first search (DFS) algorithm [5], whose complexity is O(n+m), where ‘n’ is the number of vertices and ‘m’ the number of edges connecting those vertices. For our case, the overall complexity is linear in the number of guest OSes in the system, as DFS would be repeated for each guest. Filtering and inserting new node(s) adds a constant factor to the complexity.
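A minimal sketch of the per-guest marking pass is given below in Python. The adjacency lists encode the explicit references found in Listings 1 and 2 (guest to CPU and devices, CPU to clock, device to pin-mux settings, virtual console to GIC). The short node names and the function name `mark_dependencies` are assumptions for illustration, not part of the author's tool.

```python
# Dependency graph: each key references the nodes in its list.
GRAPH = {
    "guest0": ["cpu0", "uart1", "virt_console"],
    "guest1": ["cpu1", "gpio1", "virt_console"],
    "cpu0": ["base_clk"],
    "cpu1": ["base_clk"],
    "uart1": ["uart1_pins"],
    "gpio1": ["gpio1_pins"],
    "virt_console": ["gic"],
    "base_clk": [],
    "uart1_pins": [],
    "gpio1_pins": [],
    "gic": [],
}


def mark_dependencies(graph, start):
    """Iterative depth-first search; returns all nodes reachable from `start`."""
    marked = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in marked:
            continue  # a vertex is expanded only once, hence O(n + m)
        marked.add(node)
        stack.extend(graph[node])
    return marked


# One DFS per guest, so the total work grows linearly with the guest count.
marks = {
    guest: mark_dependencies(GRAPH, guest) - {guest}
    for guest in ("guest0", "guest1")
}
shared = marks["guest0"] & marks["guest1"]
print(sorted(shared))
```

In this toy graph the nodes marked for both guests are the clock, the interrupt controller, and the virtual console, while the UART and GPIO branches each end up in exactly one guest's set.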

Faheem Sheikh is a staff engineer in the embedded software division of Mentor Graphics, working on embedded virtualization technology. He has many years of development experience with multicore high-performance computing systems. He has a PhD in computer engineering from Lahore University of Management Sciences.

References

1. G. Likely and J. Boyer, A symphony of flavors: Using the device tree to describe embedded hardware

2. Enabling device tree support on ARM platform

3. Device tree compiler and libfdt sources

4. VirtIO specs

5. Online lecture on directed graphs and DAG

