Saturday, February 19, 2011

To Box or not to box?

 

So recently I started refreshing my knowledge of .NET basics since I'm starting to look for a new job, and I ran into a subject that Shani Raba once told me about.

I am talking about boxing and unboxing of value types.

Let's start from the very basics. I'm sure most of you know that the .NET Framework has two kinds of types: value types and reference types. Value types are the ones we use most; these are the common types like Int32, Boolean, Decimal and so on. When they are declared as local variables they are usually stored on the thread's stack, and we don't need to create them with the new keyword. Reference types, on the other hand, are allocated on the managed heap; they are heavier since every object carries the overhead of some additional members, and they are garbage collected.

I must also point out that value types derive from System.ValueType, which derives from System.Object.
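The inheritance chain can be checked with reflection. This is just a minimal sketch; the class and method names (TypeHierarchyDemo, BaseChain) are made up for the illustration.

```csharp
using System;

public static class TypeHierarchyDemo
{
    // Builds a string with the type's full name followed by each of its base types.
    public static string BaseChain(Type type)
    {
        string chain = type.FullName;
        for (Type t = type.BaseType; t != null; t = t.BaseType)
        {
            chain += " -> " + t.FullName;
        }
        return chain;
    }

    public static void Main()
    {
        // prints "System.Int32 -> System.ValueType -> System.Object"
        Console.WriteLine(BaseChain(typeof(int)));

        Console.WriteLine(typeof(int).IsValueType);    // prints "True"
        Console.WriteLine(typeof(string).IsValueType); // prints "False"
    }
}
```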

Look at this code sample:

[Image: code sample "array"]
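The original sample was an image, so the exact code is not available here. A minimal sketch that matches the description in this post, assuming a simple loop that adds 20 integers to an ArrayList (the names BoxingDemo and FillList are mine), could look like this:

```csharp
using System;
using System.Collections;

public static class BoxingDemo
{
    // Adds the integers 0..count-1 to a non-generic ArrayList.
    // ArrayList.Add takes an Object, so every int is boxed on the way in.
    public static ArrayList FillList(int count)
    {
        ArrayList list = new ArrayList();
        for (int i = 0; i < count; i++)
        {
            list.Add(i); // implicit boxing: int -> object
        }
        return list;
    }

    public static void Main()
    {
        ArrayList list = FillList(20);
        Console.WriteLine(list.Count);        // prints "20"
        Console.WriteLine(list[0].GetType()); // prints "System.Int32"
    }
}
```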

So what actually happens here? The ArrayList.Add method takes an Object as a parameter, so each integer we want to insert into the list has to be converted to Object. However, the integer is stored on the stack and the list expects a reference to an Object on the managed heap. In that case the CLR uses a mechanism called boxing.

What is boxing:

1) Memory is allocated on the managed heap, according to the size of the value type plus some extra overhead (the type object pointer and sync block index that every heap object carries).

2) The fields of the value type are copied to the new object on the managed heap.

3) The CLR returns the address of the new object; that address is now the reference to the boxed value.
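One visible consequence of step 2 is that the boxed object holds its own copy of the value. Here is a small sketch of that (BoxCopyDemo and its method name are hypothetical, chosen just for this example):

```csharp
using System;

public static class BoxCopyDemo
{
    // Boxes an int, then changes the original variable, and returns
    // what the boxed copy on the heap still contains.
    public static int BoxedValueAfterChange()
    {
        int original = 5;
        object boxed = original; // allocate on the heap, copy the value, get a reference
        original = 10;           // changes only the variable on the stack
        return (int)boxed;       // the heap copy is untouched
    }

    public static void Main()
    {
        Console.WriteLine(BoxedValueAfterChange()); // prints "5"
    }
}
```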

So in our example we actually created 20 new objects on the managed heap, and as we stated at the beginning of the post, that's a big overhead. Now what will happen if we write the following code?

[Image: code sample "unb"]
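This sample was also an image, so what follows is only my guess at its spirit: a sketch that reads the boxed integers back out of an ArrayList with an explicit cast. UnboxingDemo, SumAll and UnboxToWrongTypeFails are illustrative names, not taken from the original code.

```csharp
using System;
using System.Collections;

public static class UnboxingDemo
{
    // Sums the boxed ints stored in a non-generic ArrayList.
    public static int SumAll(ArrayList list)
    {
        int sum = 0;
        foreach (object item in list)
        {
            sum += (int)item; // explicit unboxing: object -> int
        }
        return sum;
    }

    // Unboxing must target the exact type that was boxed;
    // a boxed int cannot be unboxed directly to long.
    public static bool UnboxToWrongTypeFails()
    {
        object boxed = 42;
        try
        {
            long value = (long)boxed; // throws InvalidCastException
            return value != 42;       // never reached
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        ArrayList list = new ArrayList();
        for (int i = 0; i < 20; i++)
        {
            list.Add(i); // boxing
        }
        Console.WriteLine(SumAll(list));             // prints "190"
        Console.WriteLine(UnboxToWrongTypeFails());  // prints "True"
    }
}
```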

Now we try to convert the Object back to int. That action is called unboxing. What actually happens is that the CLR obtains a pointer to the data inside the Object, the one on the heap, and the value is then copied from there back to the stack. As you can understand, unboxing is much cheaper than boxing, since nothing new is allocated on the heap.

So what can we do to avoid this overhead:

1) Use the proper methods and overloads. For example, the Console.WriteLine method has more than a dozen overloads, so if you have an integer you can pass it directly to WriteLine (which picks the overload that takes an int); there is no need to call ToString first.

2) Box your value type in a single place, and reuse that boxed object everywhere you need it.

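3) Use generics where possible, for example List<int> instead of ArrayList. Be careful with interfaces: casting a value type to an interface type still boxes it, whereas a generic method with an interface constraint does not.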
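The three tips above can be sketched together like this. It is only an illustration; AvoidBoxingDemo and SumGeneric are names I made up, and List<int> is used as the generic replacement for ArrayList.

```csharp
using System;
using System.Collections.Generic;

public static class AvoidBoxingDemo
{
    // Same idea as filling an ArrayList, but List<int> stores the ints
    // directly, so nothing is boxed going in or unboxed coming out.
    public static int SumGeneric(int count)
    {
        List<int> list = new List<int>();
        for (int i = 0; i < count; i++)
        {
            list.Add(i); // no boxing
        }

        int sum = 0;
        foreach (int item in list)
        {
            sum += item; // no unboxing
        }
        return sum;
    }

    public static void Main()
    {
        int number = 42;

        // Tip 1: the WriteLine(int) overload is chosen, so no boxing.
        Console.WriteLine(number); // prints "42"

        // Tip 2: box once and reuse the boxed object,
        // instead of letting each Object parameter box the int again.
        object boxedOnce = number;
        Console.WriteLine("{0} {1} {2}", boxedOnce, boxedOnce, boxedOnce); // prints "42 42 42"

        // Tip 3: generics.
        Console.WriteLine(SumGeneric(20)); // prints "190"
    }
}
```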

 

BRIUT
