Some clues for Emulab source code (v1.0)

Lin Xue

[email protected]

June 2010

NOTE

This document walks step by step through how Emulab works, following the Emulab source code: parsing the input, reading and writing data in the DB, running assign, calling Dummynet, and so on. I read the source code because I wanted to find out how Emulab calls Dummynet to create a delay node. It does not go into great detail, but it gives a brief view of the Emulab mechanism and many clues that will help you understand the source code.

1. Emulab parses your NS file

Suppose you have written your own NS file and started your experiment in Emulab. At this point sim.tcl and parse.tcl (/lib/ns2ir/) parse your NS file and update the DB.

sim.tcl defines the class Simulator; parse.tcl defines the parse functions for each parameter. (Up to this point, Emulab looks very similar to NS2.)

For example, if you write a line like this:

set link0 [$ns duplex-link $nodeB $nodeA 30Mb 50ms DropTail]

Here you set up a duplex link between nodeA and nodeB, and you also set the bandwidth, delay, and queue management type, which Dummynet will use later.

Since $ns is an object of class Simulator, look at the following method in sim.tcl:

Simulator instproc duplex-link {n1 n2 bw delay type args}

(Note: in OTcl, instproc adds a method, here duplex-link, to the class Simulator.)

This method receives the parameters from your NS file and passes them on to the lower layer.
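To make this step concrete, here is a minimal Python sketch of what a duplex-link style method has to do with the strings "30Mb" and "50ms". It is not the Emulab code (that is Tcl): the unit tables, the internal units (kb/s and ms), and the field names are my own assumptions for illustration.

```python
import re

# Hypothetical unit tables; the real parsing lives in Emulab's Tcl code.
BW_UNITS = {"kb": 1, "Mb": 1000, "Gb": 1000000}        # result in kb/s
DELAY_UNITS = {"us": 0.001, "ms": 1.0, "s": 1000.0}    # result in ms


def parse_quantity(text, units):
    """Split a string such as '30Mb' into number and unit, then scale it."""
    m = re.fullmatch(r"([0-9]*\.?[0-9]+)([A-Za-z]+)", text)
    if m is None or m.group(2) not in units:
        raise ValueError("cannot parse quantity: %r" % text)
    return float(m.group(1)) * units[m.group(2)]


class Simulator:
    """Toy stand-in for the Simulator class defined in sim.tcl."""

    def __init__(self):
        self.links = []

    def duplex_link(self, n1, n2, bw, delay, qtype):
        # Convert the user's strings to numbers and remember the link.
        link = {
            "nodes": (n1, n2),
            "bw_kbps": parse_quantity(bw, BW_UNITS),
            "delay_ms": parse_quantity(delay, DELAY_UNITS),
            "q_type": qtype,
        }
        self.links.append(link)
        return link


ns = Simulator()
link0 = ns.duplex_link("nodeB", "nodeA", "30Mb", "50ms", "DropTail")
```

The point is only that the three values from the NS line (bandwidth, delay, queue type) end up stored on an object that later steps can write to the DB.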

2. Emulab puts the parameters into DB

As you may know, in Emulab all the parameters from the user's input are put into the DB first.

Inside the duplex-link method, you will see a line like this:

Link $curlink $self "$n1 $n2" $rbw $rdelay $type

This introduces a new class, Link, so you need to find its definition. Look in lanlink.tcl (/lib/ns2ir/), which defines the LanLink class and its two children, Lan and Link. A LanLink contains a number of node:port pairs as well as the characteristics bandwidth, delay, and loss rate. All the links and LANs you define in your NS file go through this file.

As you can see, the method:

LanLink instproc init {s nodes bw d type}

is the constructor of the LanLink class. It takes all the parameters from the user's input (the bandwidth, delay, and queue management type in your NS file). Moreover, since LanLink is the superclass of Link, creating a Link also runs this initialization.
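The relationship between the three classes can be sketched in Python as follows. This is only an illustration of the inheritance described above; the attribute names and the two-node check are my own assumptions, not taken from lanlink.tcl.

```python
class LanLink:
    """Common base: a set of member nodes plus shared characteristics."""

    def __init__(self, sim, nodes, bw, delay, qtype):
        self.sim = sim
        self.nodes = nodes.split()   # "nodeB nodeA" -> ["nodeB", "nodeA"]
        self.bw = bw
        self.delay = delay
        self.qtype = qtype


class Link(LanLink):
    """A LanLink with exactly two members."""

    def __init__(self, sim, nodes, bw, delay, qtype):
        # The superclass constructor runs first and stores the parameters.
        super().__init__(sim, nodes, bw, delay, qtype)
        if len(self.nodes) != 2:
            raise ValueError("a Link connects exactly two nodes")


class Lan(LanLink):
    """A LanLink with any number of members."""


link0 = Link(None, "nodeB nodeA", 30000, 50, "DropTail")
lan0 = Lan(None, "nodeA nodeB nodeC", 100000, 0, "DropTail")
```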

After all the parameters have been collected, you will find:

Link instproc updatedb {DB} and Lan instproc updatedb {DB}

This is where your parameters are written to the DB.

Inside these updatedb methods you can see calls to spitxml_data. This function is in sim.tcl, and it defines exactly how the parameters are written out as XML and then into the SQL DB.

For the bandwidth, delay, and queue information, you will see a call like this:

$sim spitxml_data "virt_lans" $fields $values

This writes all these parameters into the virt_lans table of the DB.

The updatedb methods are called from the run method in sim.tcl:

Simulator instproc run {}

That means that when the "run" line in your NS file is reached, Emulab takes all the parameters you have set and updates the DB. In fact, the run method updates many tables in the DB, including virt_lans, virt_nodes, etc.
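The run, updatedb, and spitxml_data chain can be sketched like this. It is a Python toy, not the Tcl code: the column names in `fields` are hypothetical, and the rows are collected in a dictionary instead of being written as XML.

```python
class Simulator:
    """Toy stand-in: collects rows per table instead of emitting XML."""

    def __init__(self):
        self.objects = []
        self.tables = {}

    def spitxml_data(self, table, fields, values):
        # One call produces one row for the named table.
        self.tables.setdefault(table, []).append(dict(zip(fields, values)))

    def run(self):
        # Every object created while parsing gets a chance to write itself.
        for obj in self.objects:
            obj.updatedb(self)


class Link:
    def __init__(self, name, members, bw, delay, qtype):
        self.name, self.members = name, members
        self.bw, self.delay, self.qtype = bw, delay, qtype

    def updatedb(self, sim):
        # Hypothetical column names; one row per member of the link.
        fields = ["vname", "member", "bandwidth", "delay", "q_type"]
        for member in self.members:
            values = [self.name, member, self.bw, self.delay, self.qtype]
            sim.spitxml_data("virt_lans", fields, values)


sim = Simulator()
sim.objects.append(Link("link0", ["nodeB", "nodeA"], 30000, 50, "DropTail"))
sim.run()
```

After `sim.run()`, `sim.tables["virt_lans"]` holds two rows, one for each end of link0.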

3. Emulab gets the data from DB

Now you know how the data is put into the DB. The next question is how Emulab gets the data back from the DB and uses it to assign specific hardware. As you may know, Emulab uses a program called assign to map the user's request onto specific hardware. The annealing algorithm in assign chooses an assignment of physical machines, as close to optimal as it can find, based on the input and the hardware currently available. You also need to know about assign_wrapper, which is the interface between the DB data representation and the resource allocation algorithm. It calls the solver and uses the output to set up the database state that drives the rest of the process.

So look at assign_wrapper.in (testbed/src/testbed5.0/tbsetup). You will see a very important comment there that describes how Emulab sets up the virtual topology.

That comment mentions the virt_lans table: Emulab is ready to read the parameters you just set, such as the delay and bandwidth, from the virt_lans table.

Then look at the function LoadExperiment(), and from there at LoadVirtLans(). Here Emulab loads all the data from the DB into local variables (using DBQueryFatal).

At this point Emulab has loaded all the parameters you set, including bandwidth, delay, queue type, and so on, into its local variables.
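As a rough picture of what a loader like LoadVirtLans() does, the following Python sketch reads rows from a virt_lans table into a nested dictionary. It uses an in-memory SQLite database as a stand-in for the real DB, and the column names are my assumptions; the real code is Perl and uses DBQueryFatal.

```python
import sqlite3


def load_virt_lans(conn):
    """Read every virt_lans row into {lan: {member: parameters}}."""
    virt_lans = {}
    query = "SELECT vname, member, delay, bandwidth FROM virt_lans"
    for vname, member, delay, bandwidth in conn.execute(query):
        virt_lans.setdefault(vname, {})[member] = {
            "delay": delay,
            "bandwidth": bandwidth,
        }
    return virt_lans


# Stand-in database with two rows for one link (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE virt_lans "
    "(vname TEXT, member TEXT, delay REAL, bandwidth INTEGER)"
)
conn.executemany(
    "INSERT INTO virt_lans VALUES (?, ?, ?, ?)",
    [("link0", "nodeA:0", 25.0, 30000), ("link0", "nodeB:0", 25.0, 30000)],
)
lans = load_virt_lans(conn)
```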

4. Emulab creates the TOP file

Continue reading assign_wrapper.in (testbed/src/testbed5.0/tbsetup). In order to run assign, Emulab needs a file called the TOP file, which records the virtual topology described in the previous section.

First, it opens the TOP file:

open(TOPFILE,"> $topfile")

Second, there are two cases. If there are exactly two virtual members, elsif (@members == 2), the LAN is just a link. If there are more than two virtual members, elsif ($#members != 0), Emulab generates a virtual LAN node.

In both cases, the delay-related values are stored in the variable delaylinks for later use:

$delaylinks{$plink} = [$member,$delay,$bw,$backfill,$loss,$member,$rdelay,$rbw,$rbackfill,$rloss,0];

Of course, all the values on the right-hand side of the assignment come from the local variables that Emulab read from the DB earlier:

my ($delay,$bw,$ebw,$backfill,$loss,$rdelay,$rbw,$rebw,$rbackfill,$rloss) = virtlandelayinfo($lan, $member);

After the TOP file has been written, it is closed:

close(TOPFILE);
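The two cases can be sketched in Python as below. Please treat the output lines as a made-up format for illustration only, not the real TOP file syntax; the plink names and dictionary fields are also hypothetical.

```python
import io


def write_top(out, lans):
    """Write one entry per link/LAN and remember delay info per plink."""
    delaylinks = {}
    for lan, info in sorted(lans.items()):
        members = info["members"]
        if len(members) == 2:
            # Two members: a plain link between them.
            plink = f"link/{lan}"
            out.write(f"link {plink} {members[0]} {members[1]} {info['bw']}\n")
            delaylinks[plink] = (members[0], info["delay"], info["bw"])
        elif len(members) > 2:
            # More than two members: add a virtual LAN node and connect
            # every member to it.
            out.write(f"node {lan} lan\n")
            for member in members:
                plink = f"lan/{lan}/{member}"
                out.write(f"link {plink} {member} {lan} {info['bw']}\n")
                delaylinks[plink] = (member, info["delay"], info["bw"])
    return delaylinks


lans = {
    "link0": {"members": ["nodeA", "nodeB"], "bw": 30000, "delay": 50},
    "lan0": {"members": ["nodeA", "nodeC", "nodeD"], "bw": 100000, "delay": 0},
}
topfile = io.StringIO()
delaylinks = write_top(topfile, lans)
```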

5. Emulab runs assign

Still in assign_wrapper.in, after creating the TOP file, Emulab is ready to assign physical resources to the virtual request. See the function:

sub RunAssign ()

Before starting assign, in addition to the virtual topology file, Emulab also needs the current physical topology, since without a snapshot of the currently available physical resources it cannot compute a good assignment. So it creates a PTOP file, similar to the TOP file:

system("ptopgen $ptopargs > $ptopfile");

(If you are interested in how this file is created, see ptopgen.in (testbed/src/testbed5.0/tbsetup).)

After that, you will see lines like these:

# Run assign

my $cmdargs = "-P $ptopfile $topfile";

$cmdargs = "-uod -c .75 $cmdargs";

$cmd = "assign";

print "$cmd $cmdargs\n";

which produce the system command: assign -uod -c .75 -P $ptopfile $topfile

At this point Emulab runs its assign program, whose source you can read starting from assign.cc (testbed/src/testbed5.0/assign).
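The way the command line is put together can be mirrored in a few lines of Python. The flags are copied from the Perl excerpt above; the file names are placeholders I made up, and the sketch only builds and prints the command, it does not run assign.

```python
import shlex


def build_assign_command(ptopfile, topfile, extra=("-uod", "-c", ".75")):
    """Prepend the extra flags, as the Perl code does with $cmdargs."""
    return ["assign", *extra, "-P", ptopfile, topfile]


cmd = build_assign_command("exp.ptop", "exp.top")
print(shlex.join(cmd))
```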

When assign finishes, Emulab stores the result in the file assign.log and then reads it through the file handle ASSIGNFP, which contains the mapping from virtual to physical resources:

if (!open(ASSIGNFP, "assign.log"))

I have not had time to read through assign itself. It is mostly an implementation of the annealing algorithm (see [1]), which searches in several steps for the optimal assignment. If you figure it out in the future, please share it with me.
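Since I have not read assign itself, the following is only a generic simulated annealing toy in Python, to show the general idea of mapping virtual nodes onto physical nodes by repeated random moves. The cost function (sum of physical distances over the virtual links), the cooling schedule, and all names are my own assumptions and are much simpler than what assign really does.

```python
import math
import random


def anneal(vnodes, pnodes, vlinks, pdist, steps=2000, seed=0):
    """Search for a one-to-one mapping vnode -> pnode with low link cost."""
    rng = random.Random(seed)

    def cost(mapping):
        return sum(pdist[mapping[a]][mapping[b]] for a, b in vlinks)

    # Start from a random one-to-one mapping.
    free = list(pnodes)
    rng.shuffle(free)
    mapping = dict(zip(vnodes, free))
    cur_cost = cost(mapping)
    best, best_cost = dict(mapping), cur_cost
    temp = 1.0
    for _ in range(steps):
        v = rng.choice(vnodes)
        p = rng.choice(pnodes)
        old = mapping[v]
        # If p is already used by another vnode, swap the two.
        other = next((u for u, q in mapping.items() if q == p), None)
        mapping[v] = p
        if other is not None:
            mapping[other] = old
        new_cost = cost(mapping)
        delta = new_cost - cur_cost
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cur_cost = new_cost          # accept the move
            if cur_cost < best_cost:
                best, best_cost = dict(mapping), cur_cost
        else:
            mapping[v] = old             # reject: undo the move
            if other is not None:
                mapping[other] = p
        temp = max(temp * 0.995, 1e-3)   # cool down slowly
    return best, best_cost


pnodes = ["pc1", "pc2", "pc3", "pc4"]
switch = {"pc1": "sw1", "pc2": "sw1", "pc3": "sw2", "pc4": "sw2"}
# Distance 0 to itself, 1 on the same switch, 2 across switches.
pdist = {
    p: {q: 0 if p == q else (1 if switch[p] == switch[q] else 2) for q in pnodes}
    for p in pnodes
}
vnodes = ["nodeA", "delay0", "nodeB"]
vlinks = [("nodeA", "delay0"), ("delay0", "nodeB")]
mapping, total = anneal(vnodes, pnodes, vlinks, pdist)
```

Worse moves are sometimes accepted while the temperature is high, which is what lets the search escape from a poor local choice.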

6. Emulab stores physical link information

Still in assign_wrapper.in, while parsing the assign result from ASSIGNFP, Emulab stores the information about physical links in plinks.

Every edge in the assign result file is read and, by convention, stored in plinks. plinks is indexed by virtual name and contains (pnodeportA,pnodeportB), that is, the port on one physical node and the port on the other physical node that the link connects. The delay node is always the second entry (pnodeportB). This is related to the delay node we are looking for, as we will see later.
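The shape of plinks can be pictured with a small Python sketch. The helper names, the virtual link name, and the "node:port" strings are hypothetical; only the convention (indexed by virtual name, delay node in the second entry) comes from the description above.

```python
def store_plink(plinks, vname, pnodeport_a, pnodeport_b):
    """Index the physical endpoints of a link by its virtual name."""
    plinks[vname] = (pnodeport_a, pnodeport_b)


def delay_node_port(plinks, vname):
    """By convention the delay node is always the second entry."""
    return plinks[vname][1]


plinks = {}
store_plink(plinks, "linksdelaysrc/link0/nodeA,nodeB", "pc1:eth2", "pc9:eth0")
```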

7. Emulab converts the plinks into vlans, delays, and portmap

Still in assign_wrapper.in, Emulab now has the physical information from the assign result. What it wants to do next is convert the physical information into internal data structures such as vlans, delays, and portmap, and then write these variables to the DB again for later use.

Emulab loops over every physical link to collect the information:

foreach $plink (keys(%plinks))

In each iteration over a plink there are several cases, such as:

if (($lan,$virtA,$virtC) = ($plink =~ m|^linksdelaysrc/(.+)/(.+),(.+)$|))

(The node has a single entry in the LAN. The node is nodeportA; the delay node is nodeportB.)

Or:

elsif (($lan,$virtA) = ($plink =~ m|^linkdelaysrc/([^/]+)/(.+)$|))

(The node may have multiple entries in the LAN. The delay node is nodeB and portB.)

And so on.

In each case, the delay parameters are taken from the earlier local variable delaylinks and then put into the local variable nodedelays:

my ($member0,$delay,$bandwidth,$backfill,$loss,$member1,$rdelay,$rbandwidth,$rbackfill,$rloss) = @{$delaylinks{$plink}};

$nodedelays{$delayid++} = [$nodeB,$portB,$portD,$lan,$member0,$delay,$bandwidth,$backfill,$loss,$member1,$rdelay,$rbandwidth,$rbackfill,$rloss];

The nodedelays variable is used later to upload all the delay-related parameters to the DB.
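Here is a Python sketch of the same idea: classify each plink name with the two patterns quoted above, then combine the delay node from plinks with the values saved in delaylinks. The example plink names and the shortened value tuples are my own; the real code carries many more fields.

```python
import re

# The two Perl patterns from the text, written as Python regexes.
LINK_RE = re.compile(r"^linksdelaysrc/(.+)/(.+),(.+)$")
LAN_RE = re.compile(r"^linkdelaysrc/([^/]+)/(.+)$")


def classify_plink(plink):
    """Return ('link', lan, virtA, virtC), ('lan', lan, virtA), or None."""
    m = LINK_RE.match(plink)
    if m:
        return ("link",) + m.groups()
    m = LAN_RE.match(plink)
    if m:
        return ("lan",) + m.groups()
    return None


def build_nodedelays(plinks, delaylinks):
    """Collect one nodedelays entry per delayed plink, numbered from 0."""
    nodedelays = {}
    delayid = 0
    for plink in sorted(plinks):
        if classify_plink(plink) is None:
            continue
        node_b, port_b = plinks[plink][1]    # delay node is the second entry
        nodedelays[delayid] = (node_b, port_b) + tuple(delaylinks[plink])
        delayid += 1
    return nodedelays


plinks = {
    "linksdelaysrc/link0/nodeA,nodeB": (("pc1", "eth2"), ("pc9", "eth0")),
    "linkdelaysrc/lan0/nodeC": (("pc3", "eth1"), ("pc8", "eth0")),
}
delaylinks = {
    "linksdelaysrc/link0/nodeA,nodeB": ("nodeA", 50, 30000),
    "linkdelaysrc/lan0/nodeC": ("nodeC", 10, 100000),
}
nodedelays = build_nodedelays(plinks, delaylinks)
```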

8. Emulab uploads the updated information to DB

This is the second time Emulab puts information into the DB. The first time it stored the user's input; this time it has already run assign and wants to store the updated information.

Still in assign_wrapper.in, in Step 4 - Upload to DB, you will see that Emulab uploads the delay information through:

foreach $delayid (sort {$a <=> $b} keys(%nodedelays))

You will see that Emulab puts the delay information into the delays table:

DBQueryFatal("insert into delays " …….

Of course, there are other tables Emulab must upload; they are handled the same way as the delays table.

One point needs clarification. You may also see code dealing with link delays:

foreach $delayid (sort {$a <=> $b} keys(%linkdelays))

That is another kind of delay, which is applied on the end nodes themselves rather than on a separate delay node.
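The upload loop over nodedelays can be sketched in Python like this. Perl hash keys are strings, which is why the original sorts with the numeric comparison {$a <=> $b}; the sketch does the same with key=int. The delays table here is an in-memory SQLite stand-in with hypothetical columns, not the real Emulab schema.

```python
import sqlite3


def upload_delays(conn, nodedelays):
    """Insert one row per delay entry, in numeric order of the delay id."""
    for delayid in sorted(nodedelays, key=int):
        node, lan, delay, bandwidth = nodedelays[delayid]
        conn.execute(
            "INSERT INTO delays (node_id, vname, delay, bandwidth) "
            "VALUES (?, ?, ?, ?)",
            (node, lan, delay, bandwidth),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE delays "
    "(node_id TEXT, vname TEXT, delay REAL, bandwidth INTEGER)"
)
# String keys on purpose: a plain string sort would put "10" before "2".
nodedelays = {
    "10": ("pc7", "lan0", 5.0, 10000),
    "2": ("pc5", "link1", 20.0, 50000),
    "1": ("pc3", "link0", 50.0, 30000),
}
upload_delays(conn, nodedelays)
rows = conn.execute("SELECT node_id FROM delays ORDER BY rowid").fetchall()
```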

So that's it! This is the entire process of how Emulab reads and writes the DB, runs assign, and so on.

I wrote this by following the delay information; other information in Emulab goes through the same process.

Now you know how Emulab reads the user's input, puts it into the DB, runs assign based on the virtual and physical information, and updates the DB a second time. What I want to find out next is how Emulab is connected to Dummynet.

9. Emulab event system

As you may know, Emulab has its own event system, and you can find plenty of documentation about it. What I want to know is how Emulab handles its delay events, which is the job of the delay agent.

The delay agent is the agent in the Emulab event system that coordinates the control of traffic shaping. Changes can be initiated anywhere: automatic timed changes from Emulab, or manual changes from the Emulab server or from a node. The delay agent allows for reactive traffic shaping, trace playback, etc.

So you need to read main.c (testbed/src/testbed5.0/event/delay-agent). The delay agent is a process that is triggered by delay events.

You can see that it first creates a raw socket, which is later used to configure Dummynet through setsockopt:

s_dummy = socket( AF_INET, SOCK_RAW, IPPROTO_RAW );

Then the delay agent subscribes to its own events:

if (event_subscribe(handle, agent_callback, event_t, NULL) == NULL)

So go to callback.c (testbed/src/testbed5.0/event/delay-agent):

void agent_callback(event_handle_t handle, event_notification_t notification, void *data)

This function is called by the event system when an event notification is received from the server. It checks whether the notification is valid (a sanity check). If it is not, it prints a warning; otherwise it calls handle_pipes, which does the rest of the job.
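The check-then-dispatch pattern of the callback looks roughly like this in Python. The notification fields (objname, eventtype) and the set of event types are assumptions I made for the sketch; the real callback is C and works on event_notification_t.

```python
# Hypothetical event types and notification fields, for illustration only.
KNOWN_EVENTS = {"UP", "DOWN", "MODIFY"}


def handle_pipes(notification, log):
    """Stand-in for the function that does the real work."""
    log.append(("handled", notification["objname"], notification["eventtype"]))


def agent_callback(notification, log):
    """Sanity-check the notification, then hand it to handle_pipes."""
    objname = notification.get("objname")
    eventtype = notification.get("eventtype")
    if not objname or eventtype not in KNOWN_EVENTS:
        # Invalid notification: warn and do nothing else.
        log.append(("warning", objname, eventtype))
        return False
    handle_pipes(notification, log)
    return True


log = []
ok = agent_callback({"objname": "link0", "eventtype": "MODIFY"}, log)
bad = agent_callback({"objname": "link0", "eventtype": "BOGUS"}, log)
```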

Following that clue, go from handle_pipes to handle_link_modify and then to set_link_params.

set_link_params is the function that hands all the Dummynet-related parameters collected earlier to Dummynet.

Inside the function you can see:

if (setsockopt(s_dummy,IPPROTO_IP, IP_DUMMYNET_CONFIGURE, &pipe,sizeof pipe)<0)

This call sends an IP_DUMMYNET_CONFIGURE request, carrying the pipe parameters, to Dummynet through the socket.

10. Emulab calls Dummynet

So now you need to look at ip_dummynet.c (testbed/src/cron_branch/pelab/bw-bottelneck/backfill_dummynet). It is important to know that this is the very file that implements Dummynet on FreeBSD/Linux; you simply have a copy of it in your own source tree. If you want to change how Dummynet behaves, change that file and put it back into the original directory in the FreeBSD/Linux source to compile it.

Here you will see a function:

static int ip_dn_ctl(struct sockopt *sopt)

This is the handler for the various Dummynet socket options (get, flush, config, del). It reacts to the IP_DUMMYNET_CONFIGURE request that Emulab has just made.

As you can see, Dummynet can handle several requests, including IP_DUMMYNET_GET, IP_DUMMYNET_FLUSH, IP_DUMMYNET_CONFIGURE, and IP_DUMMYNET_DEL.

Now look at the function:

static int config_pipe(struct dn_pipe *p)

This is the function that actually sets up the pipe or queue parameters in Dummynet.
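To picture how one control function serves the four requests, here is a Python toy that keeps the pipes in a dictionary. The option names are the ones listed above; everything else (the pipe fields, the dictionary storage, the return values) is my own assumption and has nothing to do with the real kernel data structures.

```python
def config_pipe(pipes, params):
    """Create the pipe if needed, then update its parameters."""
    pipe = pipes.setdefault(params["pipe_nr"], {})
    pipe.update({k: v for k, v in params.items() if k != "pipe_nr"})
    return 0


def ip_dn_ctl(pipes, optname, arg=None):
    """Dispatch one request by option name (toy version)."""
    if optname == "IP_DUMMYNET_GET":
        return {nr: dict(p) for nr, p in pipes.items()}   # copy of the state
    if optname == "IP_DUMMYNET_FLUSH":
        pipes.clear()
        return 0
    if optname == "IP_DUMMYNET_CONFIGURE":
        return config_pipe(pipes, arg)
    if optname == "IP_DUMMYNET_DEL":
        pipes.pop(arg["pipe_nr"], None)
        return 0
    raise ValueError("unknown option: %s" % optname)


pipes = {}
ip_dn_ctl(pipes, "IP_DUMMYNET_CONFIGURE",
          {"pipe_nr": 100, "bandwidth": 30000, "delay": 50})
ip_dn_ctl(pipes, "IP_DUMMYNET_CONFIGURE", {"pipe_nr": 100, "delay": 20})
snapshot = ip_dn_ctl(pipes, "IP_DUMMYNET_GET")
```

A second CONFIGURE for the same pipe number only changes the fields it carries, which is how a delay event can modify one parameter of an existing pipe in this toy.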

OK, now you know how Emulab interacts with Dummynet. I may go deeper into the Dummynet source code in the future, because with what we know so far we can start changing the code for our own software emulator.

11. Where can I find some architecture documents about the source code?

https://users.emulab.net/trac/emulab/wiki/Arch

12. How to build your own delay kernel

https://users.emulab.net/trac/emulab/wiki/kb96

References

[1] "A Solver for the Network Testbed Mapping Problem", Robert Ricci, Chris Alfeld, Jay Lepreau, School of Computing, University of Utah, Salt Lake City, UT 84112, USA

[2] https://users.emulab.net/trac/emulab/wiki
