Tuesday 18 October 2011

Rargh banks annoying

Goddamn useless card readers. I don't need a card reader to send Amazon or Tesco money electronically, so why do I need one to send money to myself? :(:(:(

Sunday 16 October 2011

Rargh

Rargh, university. The bane of my life. So can't wait to get rid of this shit. :(

Tuesday 11 October 2011

Customize-Reach GC

I've decided on a new strategy for garbage collection: the CM GC, or Customize-Mark GC.

The basic idea behind a GC is to keep a list of every object you ever allocated off the GC. Then you look through the stack (and static references) and mark every object referenced from there. Then you go through the newly marked objects and mark every object *they* reference. Rinse and repeat until no new objects get marked. Delete every object that is still unmarked.
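The mark-then-delete loop described above can be sketched roughly like this. It's a toy, not a real collector: `Object`, `Heap` and `collect` are names made up for illustration, and the roots are passed in by hand rather than found by scanning the stack.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical object type: each object simply lists the objects it references.
struct Object {
    std::vector<Object*> refs;
    bool marked = false;
};

// Hypothetical toy heap that owns every object ever allocated through it.
class Heap {
    std::vector<std::unique_ptr<Object>> objects;
public:
    Object* allocate() {
        objects.push_back(std::make_unique<Object>());
        return objects.back().get();
    }

    std::size_t size() const { return objects.size(); }

    // Mark everything reachable from the roots, then delete the rest.
    void collect(const std::vector<Object*>& roots) {
        for (auto& obj : objects) obj->marked = false;

        // Mark phase: keep following references until nothing new turns up.
        std::vector<Object*> worklist(roots.begin(), roots.end());
        while (!worklist.empty()) {
            Object* current = worklist.back();
            worklist.pop_back();
            if (current == nullptr || current->marked) continue;
            current->marked = true;
            for (Object* ref : current->refs) worklist.push_back(ref);
        }

        // Sweep phase: only marked objects survive.
        std::vector<std::unique_ptr<Object>> survivors;
        for (auto& obj : objects) {
            if (obj->marked) survivors.push_back(std::move(obj));
        }
        objects = std::move(survivors);
    }
};
```

If `a` references `b` and a third object `c` is referenced by nobody, then collecting with `a` as the only root should leave two objects in the heap.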

However, I've had a genius idea for a GC. One where you can customize the "mark" phase. Consider, for example, a std::vector or boost::variant. These container objects would need a custom reach function so that they can "reach" their container's contents. This also means that it would be possible to "mark" objects which are only referred to externally.

For example, let's consider the WinAPI message pump. You can associate a custom object with each HWND for use in callbacks. This basically consists of SetWindowLongPtr/GetWindowLongPtr. Obviously, in previous GCs that stored pointer could not point to a GC object: the GC has no way of reaching it, and even if it could, it can't update the stored pointer when the object moves. A custom-mark GC, however, can.

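template <typename T> // T: type of the object stored in the window's user data
class Window {
    HWND hwnd;
public:
    void mark() { // special function
        struct marker {
            HWND hwnd;
            // Lets the GC store a new address if it moves the object.
            void operator=(T* ptr) {
                SetWindowLongPtr(hwnd, GWLP_USERDATA, reinterpret_cast<LONG_PTR>(ptr));
            }
            // Lets the GC read the current address of the object.
            T* operator->() {
                return reinterpret_cast<T*>(GetWindowLongPtr(hwnd, GWLP_USERDATA));
            }
        };
        GC::mark(marker { hwnd });
    }
};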

This can tell the GC where to find a custom pointer to an object, and can update it if the GC decides to move the object. This means that the GC implementation can still be compacting, *and* maintain pointers to GC objects that are held externally.
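To make the "update it if the GC decides to move the object" part concrete without dragging in the WinAPI, here is a rough portable sketch. Everything in it is an assumption for illustration: `Slot`, `ToyGC`, `relocate` and `ExternalStorage` are invented names, and `ExternalStorage::user_data` only stands in for something like the GWLP_USERDATA value of a window.

```cpp
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical slot abstraction: how to read and rewrite one pointer,
// wherever that pointer happens to be stored.
struct Slot {
    std::function<void*()> get;
    std::function<void(void*)> set;
};

// Hypothetical collector that only records slots and relocates objects.
class ToyGC {
    std::vector<Slot> slots;
public:
    void mark(Slot slot) { slots.push_back(std::move(slot)); }

    // Pretend the object at `from` was moved to `to` during compaction:
    // every recorded slot that pointed at `from` is rewritten.
    int relocate(void* from, void* to) {
        int updated = 0;
        for (Slot& slot : slots) {
            if (slot.get() == from) {
                slot.set(to);
                ++updated;
            }
        }
        return updated;
    }
};

// Stand-in for storage the GC cannot see directly: an integer owned by
// someone else, in the same spirit as a window's user data.
struct ExternalStorage {
    std::intptr_t user_data = 0;
};

// Builds a slot that reads and writes the pointer through the external storage.
inline Slot external_slot(ExternalStorage& storage) {
    return Slot{
        [&storage] { return reinterpret_cast<void*>(storage.user_data); },
        [&storage](void* p) { storage.user_data = reinterpret_cast<std::intptr_t>(p); }
    };
}
```

The collector never needs to know where the pointer lives. It only needs a way to read it and a way to write it back, which is what the custom mark function hands over.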

Monday 10 October 2011

Extra Implementations

Well, I've been thinking further about implementations. I want to design a language which can be implemented on the CLR and JVM, as well as native code. Now, all the metaprogramming goodness can be achieved by icky name mangling and is platform independent.

Now, I've been thinking about implementing pointers as integers. Consider the following simple pseudo-C#:

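using System.Collections.Generic;

class memory {
    // Every object created here stays referenced, so the GC will not collect it.
    static List<object> heap = new List<object>();
    static int Create<T>() where T : new() {
        lock (heap) {
            heap.Add(new T());
            return heap.Count - 1; // the index acts as the "pointer"
        }
    }
}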

Obviously this will need to become more complex to actually fulfill the requirements. In any case, this handily converts from a GC reference to a pointer-style type. The integer is POD and can be memory-copied around, the object it refers to needs to be removed manually, and the GC won't collect the object while it sits in the list. In a theoretical implementation of a C++-like language on the CLR, the return value would serve as a "pointer".
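The same integer-as-pointer idea, sketched in C++ so the manual removal is visible. This is only an illustration under assumptions: `HandleTable`, `deref` and `release` are made-up names, and `std::shared_ptr<void>` stands in for the GC reference that the list holds on the CLR.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <vector>

// Hypothetical handle table: the owning references live in `slots`, and the
// rest of the program only ever sees plain integers.
class HandleTable {
    std::vector<std::shared_ptr<void>> slots;
    std::mutex guard;
public:
    using Handle = std::size_t;

    // Allocates a T and returns its index, which acts as the "pointer".
    template <typename T>
    Handle create() {
        std::lock_guard<std::mutex> lock(guard);
        slots.push_back(std::make_shared<T>());
        return slots.size() - 1;
    }

    // The caller must name the type the handle was created with.
    template <typename T>
    T& deref(Handle h) {
        std::lock_guard<std::mutex> lock(guard);
        if (h >= slots.size() || !slots[h]) {
            throw std::out_of_range("dangling handle");
        }
        return *static_cast<T*>(slots[h].get());
    }

    // Manual "delete": until this is called the object stays alive.
    void release(Handle h) {
        std::lock_guard<std::mutex> lock(guard);
        if (h < slots.size()) slots[h].reset();
    }
};
```

A handle can be copied around like any other integer, and every copy goes stale once `release` is called, just like a raw pointer after `delete`.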

The problem is when you start wanting to insert values into memory yourself. For example, if I were to attempt to emulate a class like this:

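#include <algorithm>
#include <new>
#include <string>

class int_or_string {
    bool is_int;
    // Raw storage big enough, and suitably aligned, for either member.
    alignas(int) alignas(std::string)
    unsigned char buffer[std::max(sizeof(int), sizeof(std::string))];
public:
    int_or_string() {
        is_int = false;
        new (buffer) std::string(); // placement new: build the string inside buffer
    }
    // etc
};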

For POD types, there are BinaryWriter classes and such available, so I can probably achieve a reasonably close implementation. In addition, all classes that don't have GC references in them are, arguably, implementable by binary writing them into the memory, since emulated pointers are binary writable. As all arrays must be allocated off the GC, it should be relatively simple to implement the "address" returned from placement new as, again, an index into an array of GC references.

The problem comes when I want to put a GC reference there. The GC won't exactly follow my not-exactly-pointers to find the reference.

I've been thinking about just ignoring the "placement" part of the placement new and allocating the supposedly placed object off the GC. The problem with this comes when you want to start placing objects into other kinds of memory. For example, how would you place objects into memory provided by external libraries?

What I really need is to differentiate between an object placed into memory which will always have a strong pointer pointing to it, and an object which is binary copied into memory. For objects which are always strong pointed to, I can just ignore the placement part and allocate off the GC. For other objects, well, you can't put GC references in there anyway.

This mandates two ways of writing into memory. So far, I've been thinking about a relatively simple kind of buffer = object; for the second kind. I can guarantee at compile-time that object is not a type which contains GC references. For objects which contain GC references, I would require something like placement new, where you would have to maintain a strongly-typed pointer to the object (including through inheritance, or as part of an array) at all times or risk UB.

Saturday 8 October 2011

Why Joel Spolsky is a numpty

Numpty being a technical term, of course. Have a look at this article:

http://www.joelonsoftware.com/articles/Wrong.html

The correct response to the encoded-non-encoded string problem is to create a separate type which mirrors the platform native string, but which either cannot convert to a regular string at all, or converts only through a conversion operator which safely encodes it.

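class unsafe_string {
    std::string mah_string;
public:
    //.. blah blah
    // The only way out to a regular string goes through encode(),
    // which is assumed to be defined elsewhere.
    operator std::string() const {
         return encode(mah_string);
    }
};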

As such, instead of relying on the eye to review the code and make it safe, you can simply make the compiler do the work for you, which is a vastly superior alternative.
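Here is a small self-contained sketch of how that plays out at a call site. The `encode` below is a hypothetical HTML escaper and `render` is a made-up output function; neither comes from the article, they are just there to show the compiler inserting the encoding step on its own.

```cpp
#include <string>
#include <utility>

// Hypothetical encoder: HTML-escapes the characters that matter for markup.
inline std::string encode(const std::string& raw) {
    std::string out;
    for (char c : raw) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            case '"': out += "&quot;"; break;
            default:  out += c; break;
        }
    }
    return out;
}

// Holds untrusted text; the only way out is through encode().
class unsafe_string {
    std::string raw;
public:
    explicit unsafe_string(std::string s) : raw(std::move(s)) {}
    operator std::string() const { return encode(raw); }
};

// Hypothetical output function that only accepts ordinary strings.
inline std::string render(const std::string& safe) {
    return "<p>" + safe + "</p>";
}
```

Passing an `unsafe_string` to `render` compiles, but only because the conversion operator runs first, so the raw text never reaches the output unencoded.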

As for operator overloads, well, I'd just have to provide equivalent methods anyway: operator overloads are just syntactic sugar. Also, I don't know wtf IDE you use, but mine tells me the type on demand and will let me see its declaration on demand as well.