Monday, 1 August 2011

Garbage Collection

I reckon that it might actually be feasible, with a couple of nice tricks that should enable a more performant implementation.

Firstly, GC references will be strongly typed, as opposed to non-owning pointers. They will have the type Standard.GC.Reference(T). This will instantly remove the need to follow pointers which won't lead to garbage-collected references, which will hopefully be, say, most of them. If you point to a type which, directly or indirectly, holds no GC references, then we don't need to follow that pointer. In addition, this may well allow compacting GC implementations, because it's instantly obvious which references are GC and which aren't.

Secondly, the GC will follow raw pointers that may point to objects which may hold GC references. This leads to the direct conclusion that it is undefined behaviour in the general case to hold a pointer that does not point to a valid object which is not null, because the GC might follow it if the link eventually leads to a GC reference. This may force changes in iteration idioms- for example, one-past-the-end is now undefined in the general case, unless (as is the case with node-based containers) the end explicitly exists as a node. However, making the end be the last valid object instead would remove the need to explicitly have an end node, which may yield a performance improvement anyway.

Thirdly, objects can override their GC behaviour. This means that for a type such as Vector, the default behaviour would have to be overridden so that the GC can reach any references to any objects contained within, because the compile-time reflection used to generate the default behaviour will not register the Vector's contained types. Node-based algorithms like lists and maps probably won't have that problem.

Now, I've been considering stack-scanning. Of course, native C++ already includes a mechanism for telling which objects are on the stack- exception handling. It's the same principle, we're just interested in (somewhat) different objects. That is, a stack scan would be the same algorithm as throwing an exception, and then not catching it until main(). That necessarily makes it quite an expensive operation- but not necessarily a huge one and may influence a change in exception handling mechanisms. After all, throwing an exception should be a rare event, whereas GC collections happen frequently. Now, I'm not that familiar with the hardware stack, it's not normally the kind of place I program, but as far as I know, scanning it should be as simple as check the return address to know where we are, and modify the stack pointer appropriately to find the variables that we want. Oh, and don't forget the registers, too.

Next time, I will write about general-purpose memory allocation performance and how metaprogramming can make your program run faster.

No comments:

Post a Comment