How to capture a variable in C# and not to shoot yourself in the foot

Page 1: How to capture a variable in C# and not to shoot yourself in the foot

Author: Ivan Kishchenko

Date: 27.01.2017

Back in 2005, with the release of C# 2.0, we gained the ability to pass a variable into the body of an anonymous delegate by capturing it from the current context. In 2008, C# 3.0 brought us lambdas, anonymous types, LINQ queries, and much more. It is now January 2017, and the majority of C# developers are looking forward to the release of C# 7.0, which should give us a bunch of useful new features. However, there are still old features that need fixing, which is why there are plenty of ways to shoot yourself in the foot. Today we are going to talk about one of them. It is related to the rather non-obvious mechanism of variable capture in the body of anonymous functions in C#.

Introduction

As I stated above, we are going to discuss the peculiarities of the mechanism of variable capture in the body of anonymous functions in C#. I should warn you in advance that the article contains a large number of technical details, but I hope that both experienced and beginner programmers will find it interesting and easy to follow.

But enough talking. I'll give you a simple code example, and you tell me what will be printed to the console.

So, here we go.

void Foo()
{
    var actions = new List<Action>();
    for (int i = 0; i < 10; i++)
    {
        actions.Add(() => Console.WriteLine(i));
    }

    foreach (var a in actions)
    {
        a();
    }
}

And now, attention please: here is the answer. The console will print the number 10 ten times.

10
10
10
10
10
10
10
10
10
10

This article is for those who thought otherwise. Let's try to work out the reasons for this behavior.

Why does this happen?

When you declare an anonymous function (an anonymous delegate or a lambda) inside your class, the compiler generates one more class, a container class. It contains fields for all the captured variables and a method that holds the body of the anonymous function. The disassembled structure of the program for the code fragment given above will be as follows:

In this case, the Foo method is declared inside the Program class. The compiler generated a container class <>c__DisplayClass1_0 for the lambda () => Console.WriteLine(i). Inside the container class it generated a field i, which holds the captured variable of the same name, and the method <Foo>b__0, which contains the body of the lambda.
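
To make this more tangible, here is a rough hand-written C# equivalent of what the compiler generates for the lambda. It is only a sketch under assumptions: the names DisplayClass, Lambda and ContainerSketch are stand-ins I made up, because the real names (<>c__DisplayClass1_0 and <Foo>b__0) are not valid C# identifiers.

```csharp
using System;

// Hypothetical stand-in for the generated class '<>c__DisplayClass1_0'.
public sealed class DisplayClass
{
    // The captured variable 'i' lives here as a field.
    public int i;

    // Stand-in for '<Foo>b__0': the body of () => Console.WriteLine(i).
    public void Lambda()
    {
        Console.WriteLine(i);
    }
}

public static class ContainerSketch
{
    public static void Main()
    {
        var container = new DisplayClass { i = 5 };

        // The delegate keeps a reference to the container,
        // not a copy of the field value.
        Action action = container.Lambda;

        container.i = 6;
        action(); // the field is read at call time, so this prints 6
    }
}
```

The point of the sketch is that the delegate reads the field when it is invoked, so any change made to the field before the call is visible to it.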

Let's look at the disassembled IL code of the <Foo>b__0 method (the lambda body) with my comments:

.method assembly hidebysig instance void '<Foo>b__0'() cil managed
{
    .maxstack 8

    // Pushes the current class instance (the equivalent of 'this')
    // onto the stack. It is needed to access
    // the fields of the current class.
    IL_0000: ldarg.0

    // Pushes the value of the 'i' field
    // of the current class instance onto the stack.
    IL_0001: ldfld int32
        TestSolution.Program/'<>c__DisplayClass1_0'::i

    // Calls the method that writes to the console.
    // The values on the stack are passed as arguments.
    IL_0006: call void [mscorlib]System.Console::WriteLine(int32)

    // Exits the method.
    IL_000b: ret
}

All correct: that's exactly what we do inside the lambda, no magic. Let's go on.

As we know, the int type (its full name is Int32) is a structure, which means that it is passed by value, not by reference.

Logically, the value of the i variable should be copied when the container class instance is created. If you answered my question at the beginning of the article incorrectly, then most likely you expected the container to be created right before the lambda declaration in the code.

In reality, after compilation the i variable will not exist in the Foo method at all. Instead, an instance of the container class <>c__DisplayClass1_0 is created, and its field i is initialized with 0 in place of the i variable. Moreover, everywhere we used the local variable i, the field of the container class is used instead.

The important point is that the instance of the container class is created before the loop, because its field i is used in the loop as the iterator.

As a result, we get one instance of the container class for all iterations of the for loop. Each iteration adds a new delegate to the actions list, but every one of those delegates refers to the same, previously created instance of the container class. So when we traverse the items of the actions list with the foreach loop, they all share that instance. And since the for loop increments the iterator after every iteration (even after the last one), the value of the i field inside the container class is 10 once the loop has finished.
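
The same idea can be sketched in plain C#. The code below is my approximation of what Foo looks like after the compiler has rewritten it, not the literal compiler output; DisplayClass, Lambda, locals and SharedContainerSketch are made-up names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SharedContainerSketch
{
    // Hypothetical stand-in for the generated container class.
    private sealed class DisplayClass
    {
        public int i;
        public void Lambda() { Console.WriteLine(i); }
    }

    // Roughly what the first half of Foo turns into.
    public static List<Action> BuildActions()
    {
        var actions = new List<Action>();

        // One container for the whole loop, created before it starts.
        var locals = new DisplayClass();
        for (locals.i = 0; locals.i < 10; locals.i++)
        {
            // A new delegate every time, but always bound to 'locals'.
            actions.Add(locals.Lambda);
        }

        return actions; // locals.i is 10 at this point
    }

    public static void Main()
    {
        foreach (var a in BuildActions())
        {
            a(); // prints 10 ten times
        }
    }
}
```

Every delegate in the list has the same Target object, which is why they all print the final value of the field.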

You can verify this by looking at the disassembled IL code of the Foo method (with my comments):

.method private hidebysig instance void Foo() cil managed
{
    .maxstack 3

    // -========== DECLARATION OF LOCAL VARIABLES ==========-
    .locals init(
        // The 'actions' list.
        [0] class [mscorlib]System.Collections.Generic.List`1
            <class [mscorlib]System.Action> actions,

        // The container class for the lambda.
        [1] class TestSolution.Program/
            '<>c__DisplayClass1_0' 'CS$<>8__locals0',

        // The technical variable V_2 temporarily stores
        // the value of 'i' during the increment.
        [2] int32 V_2,

        // The technical variable V_3 stores
        // the enumerator of the 'actions' list during
        // the iteration of the 'foreach' loop.
        [3] valuetype
            [mscorlib]System.Collections.Generic.List`1/Enumerator<class
            [mscorlib]System.Action> V_3)

    // -================= INITIALIZATION =================-
    // An instance of the list is created and assigned to the
    // 'actions' variable.
    IL_0000: newobj instance void class
        [mscorlib]System.Collections.Generic.List`1<class
        [mscorlib]System.Action>::.ctor()
    IL_0005: stloc.0

    // An instance of the container class is created
    // and assigned to the corresponding local variable.
    IL_0006: newobj instance void
        TestSolution.Program/'<>c__DisplayClass1_0'::.ctor()
    IL_000b: stloc.1

    // A reference to the container class instance is pushed onto the stack.
    IL_000c: ldloc.1

    // The number 0 is pushed onto the stack.
    IL_000d: ldc.i4.0

    // 0 is assigned to the 'i' field of the previous
    // object on the stack (the container class instance).
    IL_000e: stfld int32
        TestSolution.Program/'<>c__DisplayClass1_0'::i

    // -================= THE FOR LOOP =================-
    // Jumps to the command IL_0037.
    IL_0013: br.s IL_0037

    // References to the 'actions' list
    // and to the container class instance
    // are pushed onto the stack.
    IL_0015: ldloc.0
    IL_0016: ldloc.1

    // A pointer to the '<Foo>b__0' method of the container class
    // is pushed onto the stack.
    IL_0017: ldftn instance void
        TestSolution.Program/'<>c__DisplayClass1_0'::'<Foo>b__0'()

    // An instance of the 'Action' class is created, and the pointer
    // to the '<Foo>b__0' method of the container class is passed to it.
    IL_001d: newobj instance void
        [mscorlib]System.Action::.ctor(object, native int)

    // The 'Add' method of the 'actions' list is called
    // to add the instance of the 'Action' class.
    IL_0022: callvirt instance void class
        [mscorlib]System.Collections.Generic.List`1<class
        [mscorlib]System.Action>::Add(!0)

    // The value of the 'i' field of the container class instance
    // is pushed onto the stack.
    IL_0027: ldloc.1
    IL_0028: ldfld int32
        TestSolution.Program/'<>c__DisplayClass1_0'::i

    // The value of the 'i' field is assigned
    // to the technical variable 'V_2'.
    IL_002d: stloc.2

    // The reference to the container class instance and the value
    // of the technical variable 'V_2' are pushed onto the stack.
    IL_002e: ldloc.1
    IL_002f: ldloc.2

    // 1 is pushed onto the stack.
    IL_0030: ldc.i4.1

    // Adds the two topmost values on the stack
    // and pushes the result in their place.
    IL_0031: add

    // The result of the addition is assigned to the 'i' field
    // (in effect, this is the increment).
    IL_0032: stfld int32
        TestSolution.Program/'<>c__DisplayClass1_0'::i

    // The value of the 'i' field of the container class instance
    // is pushed onto the stack.
    IL_0037: ldloc.1
    IL_0038: ldfld int32
        TestSolution.Program/'<>c__DisplayClass1_0'::i

    // 10 is pushed onto the stack.
    IL_003d: ldc.i4.s 10

    // If the value of the 'i' field is less than 10,
    // jumps to the command IL_0015.
    IL_003f: blt.s IL_0015

    // -================= THE FOREACH LOOP =================-
    // The reference to the 'actions' list is pushed onto the stack.
    IL_0041: ldloc.0

    // The technical variable V_3 is assigned the result
    // of the 'GetEnumerator' method of the 'actions' list.
    IL_0042: callvirt instance valuetype
        [mscorlib]System.Collections.Generic.List`1/Enumerator<!0> class
        [mscorlib]System.Collections.Generic.List`1<class
        [mscorlib]System.Action>::GetEnumerator()
    IL_0047: stloc.3

    // The beginning of the try block
    // (the foreach loop is converted to
    // a try-finally construct).
    .try
    {
        // Jumps to the command IL_0056.
        IL_0048: br.s IL_0056

        // Calls the get_Current method of the V_3 variable.
        // The result is pushed onto the stack
        // (a reference to the Action object of the current iteration).
        IL_004a: ldloca.s V_3
        IL_004c: call instance !0 valuetype
            [mscorlib]System.Collections.Generic.List`1/Enumerator<class
            [mscorlib]System.Action>::get_Current()

        // Calls the Invoke method of the Action
        // object of the current iteration.
        IL_0051: callvirt instance void
            [mscorlib]System.Action::Invoke()

        // Calls the MoveNext method of the V_3 variable.
        // The result is pushed onto the stack.
        IL_0056: ldloca.s V_3
        IL_0058: call instance bool valuetype
            [mscorlib]System.Collections.Generic.List`1/Enumerator<class
            [mscorlib]System.Action>::MoveNext()

        // If the result of the MoveNext method is true,
        // jumps to the command IL_004a.
        IL_005d: brtrue.s IL_004a

        // Leaves the try block; the finally block runs,
        // and execution continues at IL_006f.
        IL_005f: leave.s IL_006f
    } // end .try
    finally
    {
        // Calls the Dispose method of the V_3 variable.
        IL_0061: ldloca.s V_3
        IL_0063: constrained. valuetype
            [mscorlib]System.Collections.Generic.List`1/Enumerator<class
            [mscorlib]System.Action>
        IL_0069: callvirt instance void
            [mscorlib]System.IDisposable::Dispose()

        // Finishes the execution of the finally block.
        IL_006e: endfinally
    }

    // Finishes the execution of the current method.
    IL_006f: ret
}

Conclusion

The guys from Microsoft say that this is a feature, not a bug, and that the behavior was introduced intentionally to improve program performance. You can find more information at this link. In reality, it leads to bugs and confuses novice developers.

Interestingly, the foreach loop behaved the same way before C# 5.0. Microsoft was bombarded with complaints in its bug tracker about this unintuitive behavior, and with the release of C# 5.0 it was changed: the compiler now declares the iteration variable inside every loop iteration rather than once before the loop. For all other constructs, the behavior remained unchanged. More information can be found at the link, in the Breaking Changes section.
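
The difference between the two loops is easy to see side by side. The sketch below assumes a C# 5.0 or later compiler; the names ForeachCaptureSketch, BuildWithForeach and BuildWithFor are mine, and I use Func<int> instead of Action only so that the captured values are easy to inspect.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachCaptureSketch
{
    public static List<Func<int>> BuildWithForeach()
    {
        var funcs = new List<Func<int>>();
        foreach (var n in new[] { 0, 1, 2 })
        {
            // Since C# 5.0, 'n' is a fresh variable on every iteration.
            funcs.Add(() => n);
        }
        return funcs;
    }

    public static List<Func<int>> BuildWithFor()
    {
        var funcs = new List<Func<int>>();
        for (int i = 0; i < 3; i++)
        {
            // 'i' is still a single variable shared by all iterations.
            funcs.Add(() => i);
        }
        return funcs;
    }

    public static void Main()
    {
        foreach (var f in BuildWithForeach()) Console.Write(f() + " ");
        Console.WriteLine(); // first line: 0 1 2

        foreach (var f in BuildWithFor()) Console.Write(f() + " ");
        Console.WriteLine(); // second line: 3 3 3
    }
}
```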

You may ask how to avoid such an error. Actually, the answer is very simple: keep track of where and which variables you capture. Just remember that the instance of the container class is created at the point where the captured variable is declared. If the capture occurs in the body of a loop and the variable is declared outside it, you need to copy it into a new local variable inside the loop body. A correct version of the example given at the beginning could look as follows:

void Foo()
{
    var actions = new List<Action>();
    for (int i = 0; i < 10; i++)
    {
        var index = i; // <=
        actions.Add(() => Console.WriteLine(index));
    }

    foreach (var a in actions)
    {
        a();
    }
}

If you execute this code, the console will show the numbers from 0 to 9, as expected:

0
1
2
3
4
5
6
7
8
9

If we look at the IL code of the for loop from this example, we'll see that a new instance of the container class is created on every iteration of the loop. Thus, the delegates in the actions list refer to different instances, each holding the correct value of the iterator.

// -================= THE FOR LOOP =================-
// Jumps to the command IL_002d.
IL_0008: br.s IL_002d

// Creates an instance of the container class, stores it
// in a local variable, and pushes the reference onto the stack.
IL_000a: newobj instance void
    TestSolution.Program/'<>c__DisplayClass1_0'::.ctor()
IL_000f: stloc.2
IL_0010: ldloc.2

// Assigns the value of 'i' to the 'index' field
// of the container class.
IL_0011: ldloc.1
IL_0012: stfld int32
    TestSolution.Program/'<>c__DisplayClass1_0'::index

// Creates an instance of the 'Action' class with a reference to
// the method of the container class and adds it to the 'actions' list.
IL_0017: ldloc.0
IL_0018: ldloc.2
IL_0019: ldftn instance void
    TestSolution.Program/'<>c__DisplayClass1_0'::'<Foo>b__0'()
IL_001f: newobj instance void
    [mscorlib]System.Action::.ctor(object, native int)
IL_0024: callvirt instance void class
    [mscorlib]System.Collections.Generic.List`1<class
    [mscorlib]System.Action>::Add(!0)

// Increments the 'i' variable.
IL_0029: ldloc.1
IL_002a: ldc.i4.1
IL_002b: add
IL_002c: stloc.1

// Pushes the value of the 'i' variable onto the stack.
// This time it is not in the container class.
IL_002d: ldloc.1

// Compares the value of the 'i' variable with 10.
// If 'i < 10', jumps to the command IL_000a.
IL_002e: ldc.i4.s 10
IL_0030: blt.s IL_000a
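
For comparison, here is how the corrected loop might look if we wrote the compiler's transformation by hand. As before, this is an approximation of mine rather than real compiler output, and DisplayClass, Lambda, locals and PerIterationContainerSketch are invented names.

```csharp
using System;
using System.Collections.Generic;

public static class PerIterationContainerSketch
{
    // Hypothetical stand-in for the generated container class.
    private sealed class DisplayClass
    {
        public int index;
        public void Lambda() { Console.WriteLine(index); }
    }

    public static List<Action> BuildActions()
    {
        var actions = new List<Action>();

        // 'i' is no longer captured, so it stays an ordinary local.
        for (int i = 0; i < 10; i++)
        {
            // A fresh container per iteration holds that iteration's copy.
            var locals = new DisplayClass();
            locals.index = i;
            actions.Add(locals.Lambda);
        }

        return actions;
    }

    public static void Main()
    {
        foreach (var a in BuildActions())
        {
            a(); // prints 0 through 9, one number per line
        }
    }
}
```

Here every delegate has its own Target object, so later increments of i cannot affect the values that were already stored.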

Finally, let me remind you that we are all human and we all make mistakes. That's why relying only on people to find bugs and typos is unreasonable and, as a rule, slow and resource-intensive. So it's always a good idea to use technical solutions to detect errors in the code. A machine doesn't get tired and does the work much faster.

Quite recently, our team of PVS-Studio static code analyzer developers created a diagnostic rule aimed at detecting incorrect capture of variables by anonymous functions inside loops. For my part, I suggest checking your code with our analyzer to see whether it can detect bugs in it.

This is where I end my article. I wish you clean code and bug-free programs.