Thursday 30 June 2011

Some new ideas

I had some new ideas today, and they are awesome. Beyond awesome. Let's start.

#1 Precompile templates. Cut whatever's necessary to do so. Currently on the list is the single-pass compilation stuff at least, maybe ADL (which I was going to cut anyway), possibly extension methods (which I was going to add), and quite possibly some of the freedom involved in the current specialization. This is going to be implemented by compiling functions that operate on compiler-internal types down to machine code- effectively compiling the templates, as a language, to machine code. As they can be pre-compiled, concepts checking would be almost free, because it's running in machine code and amounts to a couple of data structure lookups.

#2 Add arbitrary compile-time functions. This means allow all Phase 1 types and functions, which basically means the entire Standard lib except I/O, including dynamic allocation, etc. Do not allow arbitrary string compilation (unless you write a compiler yourself). Phase 2 functions can create types, make copies (types are immutable in the current model), and they can take, read, and create functions in an AST form, which will allow them to take functions and convert them into, say, other languages like HLSL or SQL. These will also be pre-compiled. Phase 2 functions can, as an implementation addition, use C-style interfaces loaded from DLLs or maybe even included as .libs. Exceptions can be done in Intellisense.

#3 Run Phase 2 and Phase 1 functions and use Phase 2 and Phase 1 types at run-time.

TODO: Maybe merge Phase 2 and Phase 1? But that could get messy.

Languages aren't great, they're just theories. Implementations make great things happen.

Wednesday 29 June 2011

Co-variant returns

Co-variant return types. Don't you just love them?

struct Base {
    virtual Base* do_something() = 0;
};
struct Derived : public Base {
    Derived* do_something() { return this; }
};


But as per usual, this particular feature wasn't properly thought through. After all, what if I need to use an owning pointer? Only raw pointers are co-variant.

struct Base {
    virtual std::unique_ptr<Base> do_something() = 0;
};
struct Derived : public Base {
    virtual std::unique_ptr<Derived> do_something() {
        return std::unique_ptr<Derived>(new Derived(*this));
    }
};


Oops - not allowed. What, are my UDTs second class citizens now? I think that this rule should be widened to allow any implicit conversion, especially now that we have explicit conversion operators in C++0x.
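The usual workaround today- keep the virtual function raw-pointer covariant, and wrap it in a non-virtual owning function- is a sketch of just how much boilerplate the current rule costs us:

#include <memory>

struct Base {
    std::unique_ptr<Base> do_something() {
        return std::unique_ptr<Base>(do_something_impl());
    }
protected:
    virtual Base* do_something_impl() = 0;
};
struct Derived : public Base {
    // Deliberately hides Base::do_something with a "covariant" wrapper.
    std::unique_ptr<Derived> do_something() {
        return std::unique_ptr<Derived>(do_something_impl());
    }
protected:
    virtual Derived* do_something_impl() { return new Derived(*this); }
};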

There are other examples of this problem.

template<typename T, typename U> void CheckType(T* t) {
    if (!dynamic_cast<U*>(t)) {
        // Whoops
    }
}


Ok- for raw pointers. But, of course, we can't do that for shared_ptr, so it's time to overload.

template<typename T, typename U> void CheckType(std::shared_ptr<T> t) {
    if (!std::dynamic_pointer_cast<U>(t)) {
        // Whoops
    }
}


Of course, unique_ptr doesn't even provide such a facility, so we'll have to come up with something else- probably using get().

template<typename T, typename U> void CheckType(const std::unique_ptr<T>& t) {
    if (!dynamic_cast<U*>(t.get())) {
        // Whoops
    }
}


I sure hope that I don't have more than a couple smart pointer types, or I'm gonna be here all day. Just let me overload dynamic_cast and be done with it. Why bother allowing us to overload only some pointer operations? This would also allow much more natural syntax for libraries that implemented their own- like QueryInterface() in COM; LLVM has one, and I think Qt does too. Some newer COM interfaces come with a nice template wrapper on QueryInterface() that takes away some of the headache, but it's still not the same as actually supporting dynamic_cast and being generic.
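In the meantime, the closest we can get is rolling the overload set ourselves, once, and writing generic code against that. poly_cast is my name for it- a sketch of exactly the boilerplate overloadable dynamic_cast would replace:

#include <memory>

template<typename To, typename From>
To* poly_cast(From* p) {
    return dynamic_cast<To*>(p);
}
template<typename To, typename From>
std::shared_ptr<To> poly_cast(const std::shared_ptr<From>& p) {
    return std::dynamic_pointer_cast<To>(p);
}
template<typename To, typename From>
To* poly_cast(const std::unique_ptr<From>& p) {
    // Non-owning result: the unique_ptr keeps ownership.
    return dynamic_cast<To*>(p.get());
}

Generic code can then say poly_cast<U>(p) whatever the pointer type- which is what dynamic_cast should have let us write in the first place.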

Tuesday 28 June 2011

The Windows headers

Macro leakage much? :(

If I want to include the DirectX headers, I'll have to include the Windows headers. That means I'll have to suffer the horrific, endless macro leakage. Macro leakage makes me cry. At least I can (try to) forward-declare what I need in my headers, but eventually I'm gonna have to actually include the full header. Forward-declaring stuff isn't too bad when all you need is HWND, but all those types like LRESULT, WPARAM, LPARAM, HRESULT (most of which resolve to the exact same damn thing and are easily expressed in primitive types, even platform-independently) will take forever to declare.

Of course, Microsoft could have taken the x64 opportunity to make a new Windows header that didn't suck so hard, but I guess that would have broken compatibility.

Maybe I should have a little implementation file that will just forward on.
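Something like this, say. HWND genuinely is a pointer to an incomplete struct tag (via DECLARE_HANDLE), so this much can be restated without <Windows.h>; the purely integral handles are where it gets hairy, because the typedefs would have to match Microsoft's exactly:

// windows_fwd.h- a hypothetical forwarding header.
struct HWND__;
typedef HWND__* HWND;

// My own headers can now mention window handles freely:
void init_renderer(HWND window);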

Named parameters - double use

Named parameters. Wouldn't you just love it? Anyone who has had to suffer through using a C++0x Standard container with a custom allocator that can't be default-constructed will have noticed the abysmal repetitiveness of the other parameters you left as default.
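To make that repetitiveness concrete, here's a sketch- arena and arena_allocator are hypothetical stand-ins for any stateful allocator with no default constructor (C++11's minimal allocator requirements assumed):

#include <cstddef>
#include <functional>
#include <new>
#include <set>

struct arena {};

template<typename T> struct arena_allocator {
    typedef T value_type;
    arena* state;
    explicit arena_allocator(arena& a) : state(&a) {}
    template<typename U> arena_allocator(const arena_allocator<U>& o) : state(o.state) {}
    T* allocate(std::size_t n) { return static_cast<T*>(::operator new(n * sizeof(T))); }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};
template<typename T, typename U>
bool operator==(const arena_allocator<T>& a, const arena_allocator<U>& b) { return a.state == b.state; }
template<typename T, typename U>
bool operator!=(const arena_allocator<T>& a, const arena_allocator<U>& b) { return !(a == b); }

int main() {
    arena a;
    // The repetition in question: the default comparator must be restated
    // by hand, purely to reach the allocator parameter behind it.
    std::set<int, std::less<int>, arena_allocator<int>> s(
        std::less<int>(), arena_allocator<int>(a));
    s.insert(42);
}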

However, there's another reason to allow named parameters- multiple variadic templates. They can be disambiguated by the use of naming the parameter they correspond to. Consider this snippet:

template<typename A, typename... AArgs, typename B, typename... BArgs>
std::pair<A*, B*> make_heap_pair(AArgs&&... AArgsRefs, BArgs&&... BArgsRefs) {
    return std::pair<A*, B*>(
        new A(std::forward<AArgs>(AArgsRefs)...),
        new B(std::forward<BArgs>(BArgsRefs)...)
    );
}



Of course, this is relatively contrived. What's rather less contrived is a simplified constructor for, say, std::set<T, CustomComparator, CustomAllocator>.

template<typename T, typename comparator = std::less<T>, typename allocator = std::allocator<T>> class set {
    comparator c;
    allocator a;
public:
    template<typename... compargs, typename... allocargs>
    set(compargs&&... compargsref, allocargs&&... allocargsref)
        : c(std::forward<compargs>(compargsref)...)
        , a(std::forward<allocargs>(allocargsref)...) {
        // Whatever
    }
};


Of course, this constructor is impossible in C++0x, because there'd be no way to know where one argument list finishes and the next begins. So I sure hope you don't need to construct your comparator or allocator in-place, because that kind of interface design is impossible. But it would be possible with named parameters. In addition, flat out fewer overloads would be required, because we could pass parameters at the end of the list and leave the comparator as its default.

In fact, and this is just a quick idea, you could technically use it to disambiguate between two functions whose signatures take identical types.


void f(int x);
void f(int NumberOfSocketsToOpen);

int main() {
    f(
        x = 1
    ); // Not ambiguous
}


Whether or not that would actually be good, I'm not at all sure.

New and delete

Sorry I haven't posted for a whole three! days. I just didn't think to record my thoughts.

Am I the only one who thinks these should be cut- yes, cut- from the language?

Firstly, thanks to variadic templates and perfect forwarding, they can be library functions, not language features (except placement new). And secondly, hiding them behind some Standard types would make the specification much simpler, and bugs much easier to avoid.
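To see how little language support is actually needed- make_unique doesn't exist in C++0x, so this is a sketch of the library function, built from exactly the two features just mentioned:

#include <memory>
#include <utility>

template<typename T, typename... Args>
std::unique_ptr<T> make_unique(Args&&... args) {
    // The only remaining use of the new keyword, buried in one place.
    return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
}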

How would this be done?

Well, we'll invent a type, and we'll call it std::dynamic_array. It will likely be polymorphic. We will give it a size() method for the size of the array and an operator[] for accessing the elements. Then we will simply form unique_ptr and shared_ptr to it as necessary. We will have make_shared, make_unique, make_shared_array, and make_unique_array in my initial design, although I was also thinking of a kind of make_object.
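A rough sketch of the shape I have in mind- ignoring the polymorphism question, and cheating internally with the very machinery it's meant to replace. The point is that user code never touches new[] directly:

#include <cstddef>
#include <memory>

template<typename T>
class dynamic_array {
    std::size_t count;
    std::unique_ptr<T[]> elements;
public:
    explicit dynamic_array(std::size_t n) : count(n), elements(new T[n]) {}
    std::size_t size() const { return count; }
    T& operator[](std::size_t i) { return elements[i]; }
    const T& operator[](std::size_t i) const { return elements[i]; }
};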

Why would we do this?

Firstly, new and delete, as they stand, are bug-prone. I don't just mean the possibility of deleteing a pointer you got back from new[], or delete[]ing a pointer that's actually in the middle of an array you new[]ed, but also not knowing the size of the array. Altering the language in this way would prevent such misuse and clean up the specification. In addition, we can improve it to allow multi-dimensional variable-length allocations as Standard, instead of having to hack them as we do currently.

As a language feature, our dynamic allocation should be the best it can be. The existing new and delete violate many design principles, RAII being the most prominent example, and the array-to-pointer conversion used in new[] is despicable. As such, they should be scrapped.

Saturday 25 June 2011

Numerical literals

Question: What is the type of 1? That's right, it's an int.

But what if I really think that it should be something else? Take, for example, the following code snippet:

void f(unsigned char);
void f(float);
int main() {
    f(1);
}


And voila- it's an ambiguous overload. However, logically, I find this to be a bit WTF. I get that 1 has type int, but the conversion of an integral literal into floating-point is lossy in general, even if it isn't for 1 specifically, whereas unsigned char is an exact match for 1. In fact, for literals, it would be more than feasible for the compiler to determine the exactly appropriate type. Here's a very simple rule: if the number is negative, then the type must be signed; otherwise it is unsigned. The type is the smallest integral type which can represent that value exactly. This would cut the need for the long, long long, and unsigned suffixes, and yield more precise matches in circumstances that would otherwise be ambiguous. And you know what- if it really matters whether an unsigned char or an unsigned short overload is chosen, then I've done it wrong.
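The rule is mechanical enough that you can sketch it with today's metaprogramming facilities- smallest_unsigned is my name, covering the non-negative half of the rule:

#include <cstdint>
#include <type_traits>

template<unsigned long long V>
struct smallest_unsigned {
    typedef typename std::conditional<V <= 0xFFu, std::uint8_t,
            typename std::conditional<V <= 0xFFFFu, std::uint16_t,
            typename std::conditional<V <= 0xFFFFFFFFu, std::uint32_t,
                std::uint64_t>::type>::type>::type type;
};

static_assert(std::is_same<smallest_unsigned<1>::type, std::uint8_t>::value,
    "under the proposed rule, 1 is an unsigned char, and f(1) above is unambiguous");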

So now I've had to re-write f above to actually take an int- even though the full range of an int isn't acceptable and I can really only accept 0-255. That converts an error the compiler could have detected at compile-time into an error that can only be detected at run-time, which I'm certainly not going to write code to detect, and which will only show up when the GPU starts showing the wrong colours. What even happens if you give Direct3D a value outside of the 0-1 normalized float range in a given channel, anyway?

Friday 24 June 2011

Stack Overflow

Here's a question. The premise of Stack Overflow is for questioners to come and get their questions answered by experts. This works great for the questioner, but as I am hardly the first to notice, has a slight gap where the expert is concerned. Why would the expert come and answer questions on the site?

Now, there are some reasons which don't need to be dealt with explicitly. Things like being social with your fellow experts, or maybe you'll come across a question dealing with something you've not seen before, or maybe you just feel really, really hot about your favourite language/technology and feel a compulsion to come and help people out using it or maybe you just like your fellow man.

The trouble with this is when you don't feel like it. Maybe you had a row with another guy in the chat, or maybe Jeff did something again, or maybe you got repeatedly anonymously downvoted, or maybe you saw Eric Lippert get 1,200 upvotes for, frankly, a mediocre answer to a mediocre question. And you stop going there, just for a short time, and then you ask yourself- what am I really missing here?

And now you have all this free time that you previously wasted to work, or sleep, or play games or have sex with your wife or whatever you want to spend it doing, and you ask yourself- why go back?

Exams over

So, now that my exams are over, it's time to get cracking on some coding. I've decided to drop graphics programming for a while- I tend to take breaks between large code revisions- and I'm going to resume work on the interpreted language that I've been cooking in my huge, incredible brain for a little while.

Now, I've been thinking about using alloca() to allocate stack space. This would ensure, firstly, that I don't have to new anything and worry about keeping it around, and secondly, that having my shiz on the actual C++ stack would improve memory locality- something I could sure use if I plan on abusing virtual functions like I am. However, there's a problem with this- alloca() isn't Standard. I really want to keep this Standard.

Now, I know that MSVC and GCC both support it, so I won't be doing too much harm to my more common target platforms. But I don't like the idea of depending on platform-specific behaviour in what should be really platform-independent code.

Monday 20 June 2011

What not to like about LINQ

Everybody loves LINQ. Even I love LINQ- I think it's an excellent design. But it has a very major problem- a problem that I actually find endemic to the .NET Framework in general- and that's leaky abstractions. For example, the IQueryable<T> interface offers methods that can't actually be called using LINQ to SQL- .All(), I believe, for one. Who designs a querying interface on which you can't use all of the query methods?

If SQL querying is a subset of object and XML querying and whatever other implementation you can think of, then it should be defined as a separate interface- ISQLQueryable<T> or something like that. That way, you can know, at compile-time, what methods are actually available. And if you can do more operations on objects or XML or whatever, then IQueryable<T> can inherit from it.

Instead, you get a nasty surprise when you attempt to run a LINQ query using All() and you find out LINQ to SQL doesn't support it.

If you can't send arbitrary things to the database server, and call arbitrary functions on them, in your LINQ query, then you should not use an interface which implies that you can do these things.

Friday 17 June 2011

Reallocate in the Standard allocator interface

Following on from my post about having a concurrent memory stack, I've reached a slight problem. The Standard allocator interface does not provide a way to ask it to reallocate memory. This means that, if you were allocating off a stack, growing a block means allocating a new one and then freeing the old one- but you can't free it, because it's no longer on the top of the stack; you just allocated over it. And then you move the contents- rather pointlessly, because they could have just stayed where they were, with the size of the memory chunk increased trivially.
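What I'd want looks something like this- try_expand is my name, not anything proposed. For a stack allocator, growing the topmost block is a bounds check and a pointer bump; on false, the container falls back to allocate-copy-deallocate as it does now:

#include <cstddef>

template<typename T>
struct stack_allocator {
    unsigned char* base;
    std::size_t top, capacity;

    T* allocate(std::size_t n);            // as today
    void deallocate(T* p, std::size_t n);  // as today

    bool try_expand(T* p, std::size_t old_n, std::size_t new_n) {
        unsigned char* block_end = reinterpret_cast<unsigned char*>(p + old_n);
        if (base + top != block_end)
            return false; // not the topmost allocation- can't grow in place
        std::size_t extra = (new_n - old_n) * sizeof(T);
        if (top + extra > capacity)
            return false; // out of room
        top += extra;     // the block grows in place; nothing moves
        return true;
    }
};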

So then. IOStreams- check. Allocator- check. Ranges- no. Is the Standard library really the pinnacle of library design?

Why formal logic fails

Formal logic, in terms of using it to program a computer, is a bit of a failure. Not only are the vast majority of programs procedural/imperative, but even teaching students formal logic and logical/declarative programming has a massive failure rate. Why is this?

It's because logic isn't specified in any language students can understand.

Let's be quite simple here. If you put a symbol on a page, then a person must understand that symbol in order to make sense of what you're trying to say. The more symbols you use, the greater understanding- and memory- a person requires. This means that the more strange symbols you put on a page at once, the vastly less likely it is that a poor student will actually understand your communication- let alone what it actually means.

Let's run a theoretical experiment. Instead of ∧, we will just use "and". Then, we will replace the other logical symbols similarly- (a or (b and not(c))). So which of these is going to be easier for unlucky students to learn?

Or,

if (holding(at(agent, X), S) and adjacent(X, Y))
    poss(go(X, Y), S)
if (portable(G) and (holding(at(agent, X), S) and holding(at(G, X), S)))
    poss(grab(G), S)
if (holding(has(G), S))
    poss(release(G), S)


When teaching English students in the English language- why not just stick to English? There's a slim chance that if you speak the same language, you may get further. Of course, bonus points for meaningfully defining holding, at, agent, X, S, Y, G, portable, grab, release, has, and poss beforehand.

Thursday 16 June 2011

C++ and interpreted languages

Let's take a look at interpreted languages, shall we? In this case, I'm going to limit the context to interpreted languages being used to extend a host program, or whose explicit purpose is to extend a host program- so we can skip something like PHP, which is usually used on its own. Lua, for example, is definitely covered- and I have quite a bit of experience with Lua, so I'll mostly pull examples from there.

Why on earth would you make such a language dynamic? By this, I'm talking about the usual raft of dynamic language features- dynamic typing, garbage collection, etc.

How the hell is anyone supposed to find it easy to communicate between two languages with completely separate paradigms? If you're going to design Language A to be used to extend Language B, then pretty logically, the core point here is extend. That means that Language A should probably be quite similar to Language B, to make the extension easier.

Instead, all of the extension languages like JavaScript, Lua, and (to a lesser extent) Python, are ridiculously dynamic.

If you have a C++ application, why do you embed Lua? Because you want to interpret the code at run-time. If not for that, you would never use Lua. Why should the need to interpret the code at run-time necessarily imply that a bunch of other decisions also have to be made at run-time (instead of at interpret-time)? Having dynamic types just makes life much harder for the extender.

I think that a language which is intended to extend a statically typed, high-performance language should itself be statically typed and high-performance. Indeed, with the magic of templates, you can extend a mythical ideal extension language's type system from C++ to include whatever C++ types you need- if you like virtual functions.

Then, instead of having to waste your life marshalling them and stuff, it could just work. If I want to marshal shared_ptr<int>- well, my extension language already has that type, so it just goes. I know in advance that C++ types tend to depend on constructors and deterministic destruction, so I'll write my language to respect that automatically. I know that my C++ application, which I am extending, will use value types- as that's the C++ system- so I will write a value-typed script language. This will make all of my C++ types cleanly and easily marshal between languages. In fact, I might even get away with no marshalling whatsoever, and give my script language the apparently incredibly impressive ability to call C++ functions that weren't built explicitly for the sole purpose of interoperation, as long as their return values and arguments are listed types.
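Everything in this sketch is hypothetical- script_context stands for the binding API of the imagined statically typed extension language- but it shows the payoff: an ordinary C++ function becomes callable from script with no marshalling layer at all, because the types already match.

#include <memory>
#include <string>

struct script_context {
    template<typename F>
    void expose(const std::string& name, F function); // hypothetical API
};

// A plain C++ function, not written for interop in any way.
std::shared_ptr<int> make_counter(int start) {
    return std::make_shared<int>(start);
}

void bind(script_context& ctx) {
    // shared_ptr<int> is already a script-side type, so this just works.
    ctx.expose("make_counter", &make_counter);
}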

And thus, I might be able to create an extension language that can actually do what it's supposed to do - extend.

A concurrent memory stack

I wanted to talk today about a concurrent stack. No, not a stack usable from multiple threads- a stack that we use concurrently with the native hardware stack used by automatic variables.

Whilst, on most platforms and under most compilers, you can dynamically allocate off the stack, it's really not a good idea, because as soon as the function ends, that memory is gone. In addition, you may, depending upon the smarts of your compiler and the logic of your function, have to pay for dynamic indexing of your normal, non-dynamically-allocated stack-based variables. On top of that, because you can't exactly call alloca() from inside a constructor, you will have to call it manually every time, which is not great encapsulation, and ensures that you will never be able to use it with a Standard container.

So I wanted to propose a solution: A second stack, allocated off the heap. You'd need one for each thread- not shown here, but we'll accomplish it in C++0x with thread_local. This would be a relatively large chunk of contiguous memory, and intrinsically function in the same way as the current hardware stack does- a simple decrement to allocate memory, all deallocated at once when the scope ends. The difference is that you can manually define "scope", and pass "scope" around and take references to "scopes".
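To pin the idea down, here's a minimal sketch of the two classes. All the names here are mine, alignment is ignored for brevity, and make_scope() would just construct a scope over *this:

#include <cstddef>
#include <new>
#include <vector>

class memory_stack {
    std::vector<unsigned char> buffer;
    std::size_t top;
    friend class scope;
public:
    explicit memory_stack(std::size_t bytes = 1 << 20) : buffer(bytes), top(0) {}
    // Allocation is a bounds check plus a pointer bump, exactly like the
    // hardware stack.
    void* allocate(std::size_t bytes) {
        if (top + bytes > buffer.size())
            throw std::bad_alloc();
        void* result = &buffer[top];
        top += bytes;
        return result;
    }
};

// A scope records the top of the stack on construction and rewinds to it
// on destruction, freeing everything allocated within it in one step.
class scope {
    memory_stack& stack;
    std::size_t saved_top;
public:
    explicit scope(memory_stack& s) : stack(s), saved_top(s.top) {}
    ~scope() { stack.top = saved_top; }
};

And here's how it would be used: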


thread_local memory_stack stack;

std::basic_string<char, std::char_traits<char>, stack_allocator> func(const scope& s);

std::string func() {
    scope my_scope = stack.make_scope();

    auto stack_string = func(my_scope);
    return std::string(stack_string.begin(), stack_string.end());
}
std::basic_string<char, std::char_traits<char>, stack_allocator> func(const scope& s) {
    decltype(func(s)) string(s);

    // Perform some expensive computation involving lots of memory allocation here,
    // but where none of the memory needs to live past the function call.
    return string;
}
int main() {
    // We want this string to live for more than a little while-
    // say we're going to move it into a container or something,
    // which would mean a copy for a stack-based string.
    auto heap_string = func();

    // This string lives only for this function.
    scope s = stack.make_scope();
    auto stack_string = func(s);
}
int main() {
    // We want the string to live for more than a little while
    // let's say, we're going to move it into a container or something
    // which would be a copy for a stack-based string
    auto heap_string = func();

    // The string lives only for this function,
    scope s = stack.make_scope();
    auto stack_string = func(s);
}


This approach is quite efficient- but it has a few problems. For example, we can't pop off the working set of func(s), because the string is in that working set- we have to wait until we don't need the string any more to get rid of all of it. This means that the working set of func(s) could sit around for quite some time, and the problem gets compounded as you start making more calls. So what I might consider is a transition to an instance-based approach. It'll also solve the thread-safety problems and get rid of the icky globals. However, you'd now have to allocate a new stack for every function which wants to use this approach. This will dramatically reduce the amount of memory we could reasonably pre-allocate, as we may now have many instances instead of just one per thread.


std::basic_string<char, std::char_traits<char>, stack_allocator> func(const memory_stack& s);

std::string func() {
    memory_stack m;
    auto stack_string = func(m);
    return std::string(stack_string.begin(), stack_string.end());
}
std::basic_string<char, std::char_traits<char>, stack_allocator> func(const memory_stack& s) {
    decltype(func(s)) string(s);
    memory_stack inner_stack; // owch
    // Perform some expensive computation involving lots of memory allocation here
    // from inner_stack, but where none of the memory needs to live past the function call.
    return string;
}

int main() {
    memory_stack stack;
    auto heap_string = func();

    auto stack_string = func(stack);
}


The other problem is working set size. We can only reasonably pre-allocate so much memory, and it's going to be unsuitable for large, individual allocations, although we can use an unrolled linked list to cope reasonably efficiently with unbounded working sets that are comprised of a lot of smaller allocations.

The other place where I can see this being of use is where an individual object needs dynamic memory, but that dynamic memory will always live for the lifetime of the object- where multiple dynamic allocations are required, of course.

Edit: What is the use case for such a class/such semantics? It's simple- repeated dynamic allocations and de-allocations under a certain size, with a limited scope. If you're building temporary linked lists, maps, hash_maps, or even temporary vectors or strings where the size isn't known up front, then you can go from several allocations to one- or one per max-allocation-size chunk. Be careful of the maximum allocation size, though.

Of course, if all the chunks have the same max allocation size, then they themselves can be allocated from a fixed-size object pool, such as that provided by Boost.

Apparently, I'm not the first one to have this idea, which would hardly be the first time my Epic Uber Groundbreaking™ idea was already thought of by someone else, and it's known as a memory pool. The Boost library only offers a fixed-size pool, so forgive me for not noticing that :P

Wednesday 15 June 2011

When is a language not a language?

I decided to ask another question today. When is a language not a language? This is mostly a hypothetical question that I'd like to discuss/think about, rather than suggest as a practical solution.

For example, let's talk about the hideous mess that is the current C++ grammar. Hideous mess isn't just my opinion- it's not context-free, which makes it an official bitch to parse and compile. In addition to that, there are other compilation-related problems- for example, header files. The trouble with eliminating these problems is that all our old code is stuck with them- if we change the C++ grammar, we would have to re-write every line of existing code.

That's why I'm going to suggest that C++ defines two grammars. And, further to that, that we cut the preprocessor and compilation model entirely.

How could such a thing possibly work?

Firstly, we need to consider that new languages will always supersede old languages eventually. A new language will come along and beat C++. It might not be D, or the JIT generation, but it'll happen. And compilers for that mythical language will have to be implemented. So when you object that implementers would have to implement two grammars- that's what's going to happen anyway, if it isn't happening already. Consider that Microsoft, for example, already compiles two major languages in parallel- C++ and C#. At least, if all we did was define an alternative grammar, they could keep the same back-end generators, assembly optimizers, and that kind of thing.

Secondly, code in the new grammar could be dramatically easier to deal with than code in the existing paradigm. Not least of which because the new grammar could be designed from the ground up to be extended and meet all of C++'s existing needs in a context-free way, making parsing it substantially easier than now. This makes it a lot less than double the work for a compiler implementer.

Thirdly, a new grammar gives us an opportunity to genuinely rectify our mistakes. For example, the preprocessor. Was it a mistake in 1995? Probably not. But right now, it's a huge problem, and we need to eliminate it. Having an old and a new grammar is an excellent way to separate old and new semantics too. When you compile "old-grammar" code, then you can do this- but it's forbidden in "new-grammar" code. Even simple things, like the string literal to char* conversion and the array-to-pointer conversion- the kind of thing that nobody wants to admit really exists in the C++ Standard but always has to. When you compile in "old-grammar" mode, you get headers and all the rest, and "new-grammar" mode will have no preprocessing.

Ultimately, in my opinion, either C++ will make this transition, or another language will come along. I think it would be better if C++ and the C++ Standard chose to make this happen themselves instead of waiting for someone else to do it. I think that C++0x is beating a dead horse- it might twitch, but it'll never get up and plough.

Garbage collection and RAII

I wanted to write a brief article here about garbage collection. A friend linked me today to this article:

http://www.johndcook.com/blog/2011/06/14/why-do-c-folks-make-things-so-complicated/

It proposes splitting C++ into two sections, effectively- a "Bottom" C++ of manual memory management, and presumably pointers and all those other terrible things, and "Top" C++, of garbage collection, and presumably, we can ditch headers and macros and all that as well. Let us celebrate.

Except, I don't genuinely believe that garbage collection is higher than scope-based resource management. In fact, I think that it's lower level than C++'s scope-based resource management. The trouble with garbage collection is that it's only memory. Any other resource, and you're back to malloc and free. And only memory that you allocated from the garbage collector, too. That's the fundamental problem- garbage collectors can only handle their own, pre-programmed situations.

Consider the automatic memory management in C++0x- unique_ptr. I wrote a trivial custom deleter (five lines) and now it will automatically manage my COM objects, tying the lifetime of my reference to a scope. The same works with shared_ptr- I can just plug in a custom deleter and get shared ownership of my reference. Not only can I share ownership of regular memory, but I can share ownership of, well, whatever I want.
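For reference, the deleter in question really is about five lines- a sketch, relying only on the standard COM convention that every interface exposes Release():

#include <memory>

struct com_deleter {
    template<typename T>
    void operator()(T* object) const {
        if (object)
            object->Release(); // COM ownership convention, not delete
    }
};

// Usage, given some raw COM interface pointer we own a reference to:
// std::unique_ptr<ID3D11Device, com_deleter> device(raw_device);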

If, in C#, I want shared ownership or unique ownership of GC-allocated memory- that's great, the GC will handle that for me. But what happens if I want shared ownership of a deterministically destructed resource? Well, now we have a big problem, because the resource class itself has to be Disposable. And you'll have to implement your own reference counting- including thread safety. That's fine- but every class that needs to hold an instance of that class also needs to be Disposable and Disposed. And they don't call themselves- you will have to Dispose() manually in every situation except where the Disposable object is on the function-local stack. How is this any different to C-style free()?

In addition, I've really got to question the ability of a garbage collector to actually manage memory. Now, I've heard of GC/JIT optimizations that will put objects on the native stack. But I've certainly never heard of a JIT or GC that will make an object pool for you. In managed languages, people still implement object pools, just like there are object pools for C++. Is this really automatic memory management?


Let's talk about contiguous memory. Can I ask the GC to allocate me some nice, contiguous memory? Sure- if I allocate it all at once. But can I ask it to reserve some contiguous memory, and then take objects in and out of it as I see fit, and refer to those objects? Well- no. Once I allocate that array, it's done. This isn't so bad for something that's trivially copyable like an int, but once we start talking about non-copyable classes, for example, bad things happen. This is especially bad as you can only have contiguous arrays of value types in C#- something that doesn't even exist in Java- and I sure hope I didn't want to store references to value types instead of to a copy. If we're doing a lot of work on these types, not being able to store them contiguously is an incredible performance hit.

So let's summarize. We have on the left side a system that can automatically free any resource, in shared or unique ownership conditions, with a high degree of customizability and performance. On the right side, we have manual resource freeing, no customizability, and degraded performance.

I'm going to ask- why would anyone consider the system on the right to be "higher level"?