Sunday, October 25, 2009

Memory Management in .NET

Resource Allocation
The Microsoft .NET common language runtime requires that all resources be allocated from the managed heap. Objects are automatically freed when they are no longer needed by the application.

When a process is initialized, the runtime reserves a contiguous region of address space that initially has no storage allocated for it. This address space region is the managed heap. The heap also maintains a pointer. This pointer indicates where the next object is to be allocated within the heap. Initially, the pointer is set to the base address of the reserved address space region.

An application creates an object using the new operator. This operator first makes sure that the bytes required by the new object fit in the reserved region (committing storage if necessary). If the object fits, then pointer points to the object in the heap, this object's constructor is called, and the new operator returns the address of the object.

Above fig shows a managed heap consisting of three objects: A, B, and C. The next object to be allocated will be placed where NextObjPtr points (immediately after object C).

When an application calls the new operator to create an object, there may not be enough address space left in the region to allocate to the object. The heap detects this by adding the size of the new object to NextObjPtr. If NextObjPtr is beyond the end of the address space region, then the heap is full and a collection must be performed.

In reality, a collection occurs when generation 0 is completely full. Briefly, a generation is a mechanism implemented by the garbage collector in order to improve performance. The idea is that newly created objects are part of a young generation, and objects created early in the application's lifecycle are in an old generation. Separating objects into generations can allow the garbage collector to collect specific generations instead of collecting all objects in the managed heap.

The Garbage Collection Algorithm
The garbage collector checks to see if there are any objects in the heap that are no longer being used by the application. If such objects exist, then the memory used by these objects can be reclaimed. (If no more memory is available for the heap, then the new operator throws an OutOfMemoryException.)

Every application has a set of roots. Roots identify storage locations, which refer to objects on the managed heap or to objects that are set to null. For example, all the global and static object pointers in an application are considered part of the application's roots. In addition, any local variable/parameter object pointers on a thread's stack are considered part of the application's roots. Finally, any CPU registers containing pointers to objects in the managed heap are also considered part of the application's roots. The list of active roots is maintained by the just-in-time (JIT) compiler and common language runtime, and is made accessible to the garbage collector's algorithm.

When the garbage collector starts running, it makes the assumption that all objects in the heap are garbage. In other words, it assumes that none of the application's roots refer to any objects in the heap. Now, the garbage collector starts walking the roots and building a graph of all objects reachable from the roots. For example, the garbage collector may locate a global variable that points to an object in the heap.

Following fig shows a heap with several allocated objects where the application's roots refer directly to objects A, C, D, and F. All of these objects become part of the graph. When adding object D, the collector notices that this object refers to object H, and object H is also added to the graph. The collector continues to walk through all reachable objects recursively.

Once this part of the graph is complete, the garbage collector checks the next root and walks the objects again. As the garbage collector walks from object to object, if it attempts to add an object to the graph that it previously added, then the garbage collector can stop walking down that path. This serves two purposes. First, it helps performance significantly since it doesn't walk through a set of objects more than once. Second, it prevents infinite loops should you have any circular linked lists of objects.

Once all the roots have been checked, the garbage collector's graph contains the set of all objects that are somehow reachable from the application's roots; any objects that are not in the graph are not accessible by the application, and are therefore considered garbage.

The garbage collector now walks through the heap linearly, looking for contiguous blocks of garbage objects (now considered free space). The garbage collector then shifts the non-garbage objects down in memory (using the standard memcpy function), removing all of the gaps in the heap. Of course, moving the objects in memory invalidates all pointers to the objects. So the garbage collector must modify the application's roots so that the pointers point to the objects' new locations. In addition, if any object contains a pointer to another object, the garbage collector is responsible for correcting these pointers as well.

Following fig shows the managed heap after a collection.




After all the garbage has been identified, all the non-garbage has been compacted, and all the non-garbage pointers have been fixed-up, the NextObjPtr is positioned just after the last non-garbage object. At this point, the new operation is tried again and the resource requested by the application is successfully created.

GC generates a significant performance hit, and this is the major downside of using a managed heap. However, keep in mind that GCs only occur when the heap is full and, until then, the managed heap is significantly faster than a C-runtime heap. The runtime's garbage collector also offers some optimizations using Generations that greatly improve the performance of garbage collection.

You no longer have to implement any code that manages the lifetime of any resources that your application uses. Now it is not possible to leak resources, since any resource not accessible from your application's roots can be collected at some point. Also it is not possible to access a resource that is freed, since the resource won't be freed if it is reachable. If it's not reachable, then your application has no way to access it.

Following code demonstrates how resources are allocated and managed:

class Application

{

public static int Main(String[] args)

{

// ArrayList object created in heap, myArray is now in root

ArrayList myArray = new ArrayList();

// Create 10000 objects in the heap

for (int x = 0; x < 10000; x++)

{

myArray.Add(new Object()); // Object object created in heap

}

// Right now, myArray is a root (on the thread's stack). So,

// myArray is reachable and the 10000 objects it points to are also reachable.

Console.WriteLine(myArray.Count);

// After the last reference to myArray in the code, myArray is not a root.

// Note that the method doesn't have to return, the JIT compiler knows

// to make myArray not a root after the last reference to it in the code.

// Since myArray is not a root, all 10001 objects are not reachable

// and are considered garbage. However, the objects are not

// collected until a GC is performed.

}

}

If GC is so great, you might be wondering why it isn't in ANSI C++. The reason is that a garbage collector must be able to identify an application's roots and must also be able to find all object pointers. The problem with C++ is that it allows casting a pointer from one type to another, and there's no way to know what a pointer refers to. In the common language runtime, the managed heap always knows the actual type of an object, and the metadata information is used to determine which members of an object refer to other objects.

Generations

One feature of the garbage collector that exists purely to improve performance is called generations. A generational garbage collector (also known as an ephemeral garbage collector) makes the following assumptions:

• The newer an object is, the shorter its lifetime will be.

• The older an object is, the longer its lifetime will be.

• Newer objects tend to have strong relationships to each other and are frequently accessed around the same time.

• Compacting a portion of the heap is faster than compacting the whole heap.

When initialized, the managed heap contains no objects. Objects added to the heap are said to be in generation 0, as you can see in following fig. Stated simply, objects in generation 0 are young objects that have never been examined by the garbage collector.








Now, if more objects are added to the heap, the heap fills and a garbage collection must occur. When the garbage collector analyzes the heap, it builds the graph of garbage (shown here in Green) and non-garbage objects. Any objects that survive the collection are compacted into the left-most portion of the heap. These objects have survived a collection, are older, and are now considered to be in generation 1.



As even more objects are added to the heap, these new, young objects are placed in generation 0. If generation 0 fills again, a GC is performed. This time, all objects in generation 1 that survive are compacted and considered to be in generation 2 (see following fig). All survivors in generation 0 are now compacted and considered to be in generation 1. Generation 0 currently contains no objects, but all new objects will go into generation 0.



Currently, generation 2 is the highest generation supported by the runtime's garbage collector. When future collections occur, any surviving objects currently in generation 2 simply stay in generation 2.

Generational GC Performance Optimizations

Generational garbage collecting improves performance. When the heap fills and a collection occurs, the garbage collector can choose to examine only the objects in generation 0 and ignore the objects in any greater generations. After all, the newer an object is, the shorter its lifetime is expected to be. So, collecting and compacting generation 0 objects is likely to reclaim a significant amount of space from the heap and be faster than if the collector had examined the objects in all generations.

A generational collector can offer more optimizations by not traversing every object in the managed heap. If a root or object refers to an object in an old generation, the garbage collector can ignore any of the older objects' inner references, decreasing the time required to build the graph of reachable objects. Of course, it is possible that an old object refers to a new object. So that these objects are examined, the collector can take advantage of the system's write-watch support (provided by the Win32 GetWriteWatch function in Kernel32.dll). This support lets the collector know which old objects (if any) have been written to since the last collection. These specific old objects can have their references checked to see if they refer to any new objects.

If collecting generation 0 doesn't provide the necessary amount of storage, then the collector can attempt to collect the objects from generations 1 and 0. If all else fails, then the collector can collect the objects from all generations-2, 1, and 0.

One of the assumptions stated earlier was that newer objects tend to have strong relationships to each other and are frequently accessed around the same time. Since new objects are allocated contiguously in memory, you gain performance from locality of reference. More specifically, it is highly likely that all the objects can reside in the CPU's cache. Your application will access these objects with phenomenal speed since the CPU will be able to perform most of its manipulations without having cache misses which forces RAM access.

Microsoft's performance tests show that managed heap allocations are faster than standard allocations performed by the Win32 HeapAlloc function. These tests also show that it takes less than 1 millisecond on a 200 MHz Pentium to perform a full GC of generation 0. It is Microsoft's goal to make GCs take no more time than an ordinary page fault.

Disadvantages of Win32 heap:

• Most heaps (like the C runtime heap) allocate objects wherever they find free space. Therefore, if I create several objects consecutively, it is quite possible that these objects will be separated by megabytes of address space. However, in the managed heap, allocating several objects consecutively ensures that the objects are contiguous in memory.

• When memory is allocated from a Win32 heap, the heap must be examined to find a block of memory that can satisfy the request. This is not required in managed heap, since here objects are contiguous in memory.

• In Win32 heap, data structures that the heap maintains must be updated. The managed heap, on the other hand, only needs to increment the heap pointer.

Finalization

The garbage collector offers an additional feature that you may want to take advantage of: finalization. Finalization allows a resource to gracefully clean up after itself when it is being collected. By using finalization, a resource representing a file or network connection is able to clean itself up properly when the garbage collector decides to free the resource's memory.

When the garbage collector detects that an object is garbage, the garbage collector calls the object's Finalize method (if it exists) and then the object's memory is reclaimed. For example, let's say you have the following type (in C#):

public class BaseObj

{

public BaseObj()

{

}

protected override void Finalize()

{

// Perform resource cleanup code here

// Example: Close file/Close network connection

Console.WriteLine("In Finalize.");

}

}

Now you can create an instance of this object by calling:

BaseObj bo = new BaseObj();

Some time in the future, the garbage collector will determine that this object is garbage. When that happens, the garbage collector will see that the type has a Finalize method and will call the method, causing "In Finalize" to appear in the console window and reclaiming the memory block used by this object.

Many developers who are used to programming in C++ draw an immediate correlation between a destructor and the Finalize method. However, object finalization and destructors have very different semantics and it is best to forget everything you know about destructors when thinking about finalization. Managed objects never have destructors.

When designing a type it is best to avoid using a Finalize method. There are several reasons for this:

• Finalizable objects get promoted to older generations, which increases memory pressure and prevents the object's memory from being collected when the garbage collector determines the object is garbage. In addition, all objects referred to directly or indirectly by this object get promoted as well.

• Finalizable objects take longer to allocate.

• Forcing the garbage collector to execute a Finalize method can significantly hurt performance. Remember, each object is finalized. So if I have an array of 10,000 objects, each object must have its Finalize method called.

• Finalizable objects may refer to other (non-finalizable) objects, prolonging their lifetime unnecessarily. In fact, you might want to consider breaking a type into two different types: a lightweight type with a Finalize method that doesn't refer to any other objects, and a separate type without a Finalize method that does refer to other objects.

• You have no control over when the Finalize method will execute. The object may hold on to resources until the next time the garbage collector runs.

• When an application terminates, some objects are still reachable and will not have their Finalize method called. This can happen if background threads are using the objects or if objects are created during application shutdown or AppDomain unloading. In addition, by default, Finalize methods are not called for unreachable objects when an application exits so that the application may terminate quickly. Of course, all operating system resources will be reclaimed, but any objects in the managed heap are not able to clean up gracefully. You can change this default behavior by calling the System.GC type's RequestFinalizeOnShutdown method. However, you should use this method with care since calling it means that your type is controlling a policy for the entire application.

• The runtime doesn't make any guarantees as to the order in which Finalize methods are called. For example, let's say there is an object that contains a pointer to an inner object. The garbage collector has detected that both objects are garbage. Furthermore, say that the inner object's Finalize method gets called first. Now, the outer object's Finalize method is allowed to access the inner object and call methods on it, but the inner object has been finalized and the results may be unpredictable. For this reason, it is strongly recommended that Finalize methods not access any inner, member objects.

If you determine that your type must implement a Finalize method, then make sure the code executes as quickly as possible. Avoid all actions that would block the Finalize method, including any thread synchronization operations. Also, if you let any exceptions escape the Finalize method, the system just assumes that the Finalize method returned and continues calling other objects' Finalize methods.

When the compiler generates code for a constructor, the compiler automatically inserts a call to the base type's constructor. Likewise, when a C++ compiler generates code for a destructor, the compiler automatically inserts a call to the base type's destructor. Finalize methods are different from destructors. The compiler has no special knowledge about a Finalize method, so the compiler does not automatically generate code to call a base type's Finalize method. If you want this behavior-and frequently you do-then you must explicitly call the base type's Finalize method from your type's Finalize method:

public class BaseObj

{

public BaseObj()

{

}

protected override void Finalize()

{

Console.WriteLine("In Finalize.");

base.Finalize(); // Call base type's Finalize

}

}

Note that you'll usually call the base type's Finalize method as the last statement in the derived type's Finalize method. This keeps the base object alive as long as possible. Since calling a base type Finalize method is common, C# has a syntax that simplifies your work. In C#, the following code:

class MyObject

{

MyObject()

{

}

}



causes the compiler to generate this code:

class MyObject

{

protected override void Finalize()

{

base.Finalize();

}

}

Note that this C# syntax looks identical to the C++ language's syntax for defining a destructor. But remember, C# doesn't support destructors. Don't let the identical syntax fool you.

Finalization Internals

When an application creates a new object, the new operator allocates the memory from the heap. If the object's type contains a Finalize method, then a pointer to the object is placed on the finalization queue. The finalization queue is an internal data structure controlled by the garbage collector. Each entry in the queue points to an object that should have its Finalize method called before the object's memory can be reclaimed.

Following fig shows a heap containing several objects. Some of these objects are reachable from the application's roots, and some are not. When objects C, E, F, I, and J were created, the system detected that these objects had Finalize methods and pointers to these objects were added to the finalization queue.





                                          A Heap With Many Objects

When a GC occurs, objects B, E, G, H, I, and J are determined to be garbage. The garbage collector scans the finalization queue looking for pointers to these objects. When a pointer is found, the pointer is removed from the finalization queue and appended to the freachable queue (pronounced "F-reachable"). The freachable queue is another internal data structure controlled by the garbage collector. Each pointer in the freachable queue identifies an object that is ready to have its Finalize method called.

After the collection, the managed heap looks like following fig. Here, you see that the memory occupied by objects B, G, and H has been reclaimed because these objects did not have a Finalize method that needed to be called. However, the memory occupied by objects E, I, and J could not be reclaimed because their Finalize method has not been called yet.





                                          [managed heap after garbage collection]

There is a special runtime thread dedicated to calling Finalize methods. When the freachable queue is empty (which is usually the case), this thread sleeps. But when entries appear, this thread wakes, removes each entry from the queue, and calls each object's Finalize method. Because of this, you should not execute any code in a Finalize method that makes any assumption about the thread that's executing the code. For example, avoid accessing thread local storage in the Finalize method.

The interaction of the finalization queue and the freachable queue is quite fascinating. First, let me tell you how the freachable queue got its name. The f is obvious and stands for finalization; every entry in the freachable queue should have its Finalize method called. The "reachable" part of the name means that the objects are reachable. To put it another way, the freachable queue is considered to be a root just like global and static variables are roots. Therefore, if an object is on the freachable queue, then the object is reachable and is not garbage.

In short, when an object is not reachable, the garbage collector considers the object garbage. Then, when the garbage collector moves an object's entry from the finalization queue to the freachable queue, the object is no longer considered garbage and its memory is not reclaimed. At this point, the garbage collector has finished identifying garbage. Some of the objects identified as garbage have been reclassified as not garbage. The garbage collector compacts the reclaimable memory and the special runtime thread empties the freachable queue, executing each object's Finalize method.




                                                
                                                               [managed heap after garbage collection]



The next time the garbage collector is invoked, it sees that the finalized objects are truly garbage, since the application's roots don't point to it and the freachable queue no longer points to it. Now the memory for the object is simply reclaimed. The important thing to understand here is that two GCs are required to reclaim memory used by objects that require finalization. In reality, more than two collections may be necessary since the objects could get promoted to an older generation. Above fig shows what the managed heap looks like after the second GC.

Dispose Method

Use this method to close or release unmanaged resources such as files, streams, and handles held by an instance of the class that implements this interface. This method is, by convention, used for all tasks associated with freeing resources held by an object, or preparing an object for reuse.

When implementing this method, objects must seek to ensure that all held resources are freed by propagating the call through the containment hierarchy. For example, if an object A allocates an object B, and object B allocates an object C, then A's Dispose implementation must call Dispose on B, which must in turn call Dispose on C. Objects must also call the Dispose method of their base class if the base class implements IDisposable.

If an object's Dispose method is called more than once, the object must ignore all calls after the first one. The object must not throw an exception if its Dispose method is called multiple times. Dispose can throw an exception if an error occurs because a resource has already been freed and Dispose had not been called previously.

Because the Dispose method must be called explicitly, objects that implement IDisposable must also implement a finalizer to handle freeing resources when Dispose is not called. By default, the garbage collector will automatically call an object's finalizer prior to reclaiming its memory. However, once the Dispose method has been called, it is typically unnecessary for the garbage collector to call the disposed object's finalizer. To prevent automatic finalization, Dispose implementations can call the GC.SuppressFinalize method.

Direct Control with System.GC

The System.GC type allows your application some direct control over the garbage collector. You can query the maximum generation supported by the managed heap by reading the GC.MaxGeneration property. Currently, the GC.MaxGeneration property always returns 2.

It is also possible to force the garbage collector to perform a collection by calling one of the two methods shown here:

void GC.Collect(Int32 Generation)

void GC.Collect()

The first method allows you to specify which generation to collect. You may pass any integer from 0 to GC.MaxGeneration, inclusive. Passing 0 causes generation 0 to be collected; passing 1 cause generation 1 and 0 to be collected; and passing 2 causes generation 2, 1, and 0 to be collected. The version of the Collect method that takes no parameters forces a full collection of all generations and is equivalent to calling:

GC.Collect(GC.MaxGeneration);

Under most circumstances, you should avoid calling any of the Collect methods; it is best to just let the garbage collector run on its own accord. However, since your application knows more about its behavior than the runtime does, you could help matters by explicitly forcing some collections. For example, it might make sense for your application to force a full collection of all generations after the user saves his data file. I imagine Internet browsers performing a full collection when pages are unloaded. You might also want to force a collection when your application is performing other lengthy operations; this hides the fact that the collection is taking processing time and prevents a collection from occurring when the user is interacting with your application.

The GC type also offers a WaitForPendingFinalizers method. This method simply suspends the calling thread until the thread processing the freachable queue has emptied the queue, calling each object's Finalize method. In most applications, it is unlikely that you will ever have to call this method.

Lastly, the garbage collector offers two methods that allow you to determine which generation an object is currently in:

Int32 GetGeneration(Object obj)

Int32 GetGeneration(WeakReference wr)

The first version of GetGeneration takes an object reference as a parameter, and the second version takes a WeakReference reference as a parameter. Of course, the value returned will be somewhere between 0 and GC.MaxGeneration, inclusive.

No comments: