Tuesday 26 July 2011

Language and Library separation, and some Library plans

Library and language separation. When the language is simple, it's easy to enforce a separation. As the language becomes more complex, however, it gets more difficult. For example, I posted earlier about the need for a Standard.Map-like structure with which to define the type 'type'. It would be much easier to recycle the existing definitions of how a Set should behave, not least for the implementers.

What I could do is simply leave the type undefined, and merely mandate that 'type' can be used. In conjunction with the language providing an explicit cast in situations where it's needed, this should allow the type 'type' to be part of the Standard library, and even inherited from. What I don't like about this approach is that you could never write your own Standard library for a specific implementation, even with a copy of the operating system and processor architecture documentation. I guess that I'll have to live with extending that definition to "implementation documentation". Hell, it's not like you couldn't write and use your own compiler at compile-time, then run the resulting code at compile-time, and use it to compile your own program, if you really, really, really wanted to, I guess.

I should just include a note in the "Standard" saying, quite clearly, that the implementation shall document all APIs necessary to implement the whole Standard library.

Maybe I should begin constructing an actual Standard document. It would be much easier to have a full reference of all my ideas, it would be easier to communicate with people, and it would be easier to make sure that I haven't missed anything. Of course, a Standard is not an actual Standard without a long, drawn-out ISO committee waste of time. But a full specification would be an advantage.

Library changes. Firstly, as you may have gathered, there's now the use of sub-namespaces (the existing std namespace was way too cluttered). The containers will go in, imaginatively, Containers. I'm going to strip the "unordered" name from the unordered containers and call them hash containers instead, e.g. HashMap. That's just more descriptive. In addition, I will add an UnorderedVector. The main difference between Vector and UnorderedVector is that an UnorderedVector is just a bucket full of stuff with no guaranteed element order; for example, you can erase in O(1) by swapping the element with the last one and then popping off the back.
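As a rough sketch of that swap-and-pop erase, written in today's C++ and assuming a plain std::vector as the storage. The name unordered_erase is made up for illustration; it is not an existing API, and the eventual UnorderedVector interface may look nothing like this.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of any existing library): erase the element
// at `index` in O(1) by moving the last element into its slot and popping.
// The relative order of the remaining elements is NOT preserved.
template <typename T>
void unordered_erase(std::vector<T>& v, std::size_t index) {
    assert(index < v.size());            // precondition: index must be valid
    if (index != v.size() - 1) {
        v[index] = std::move(v.back());  // fill the hole with the last element
    }
    v.pop_back();                        // drop the now-redundant tail slot
}
```

Erasing index 1 from {10, 20, 30, 40} leaves {10, 40, 30}: nothing is shifted down, which is exactly why the container cannot promise any ordering.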

Most things will remain mostly the same, but I'm definitely going to introduce a full range object and use it where appropriate to simplify the use of algorithms. I'm also going to cut the number of overloads of many functions drastically and use named parameters instead, especially for functions like constructors. As for strings, I am so cutting the string/wstring debacle and just going UTF-16. Java and C# and Windows are all UTF-16, and I figure that it's just the most compatible way to go. I'm not sure what the new IOStreams are going to look like, but they're certainly not going to look like the current ones. For example, I am leaving buffering as an implementation detail and not defining any of it in my interfaces, and I will probably go back to templates instead of virtual functions for polymorphism.
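To give an idea of what a range object buys you, here is a minimal sketch in today's C++. Everything in the sketch namespace (Range, make_range, sort, contains) is a hypothetical name invented for this example, not the final design; it just bundles an iterator pair into one value so each algorithm needs a single argument.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

namespace sketch {

// Hypothetical range type: a pair of iterators bundled into one object.
template <typename Iterator>
struct Range {
    Iterator first;
    Iterator last;
    Iterator begin() const { return first; }
    Iterator end() const { return last; }
};

// Build a Range covering a whole container.
template <typename Container>
auto make_range(Container& c) -> Range<decltype(std::begin(c))> {
    return {std::begin(c), std::end(c)};
}

// Algorithms take one Range instead of a begin/end pair.
template <typename Iterator>
void sort(Range<Iterator> r) {
    std::sort(r.begin(), r.end());
}

template <typename Iterator, typename T>
bool contains(Range<Iterator> r, const T& value) {
    return std::find(r.begin(), r.end(), value) != r.end();
}

}  // namespace sketch
```

With this, sorting a vector v is sketch::sort(sketch::make_range(v)) rather than a call that repeats v.begin() and v.end() at every use.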

In addition, I'm going to add some more functional algorithms, like map and reduce, and I've talked about some Interop functionality. I also want to add some Standard GUI code. Honestly, the point of Standard GUI libraries is not to be the latest and greatest in GUI work, but something simple and functional that can be used to create simpler GUI applications.
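For the functional algorithms, something like the following is what I have in mind, sketched in today's C++ (C++14 or later) on top of std::transform and std::accumulate. The sketch namespace and the exact signatures are assumptions for illustration only.

```cpp
#include <algorithm>
#include <iterator>
#include <numeric>
#include <type_traits>
#include <vector>

namespace sketch {

// Hypothetical `map`: apply f to every element and return the results
// in a new vector, leaving the input untouched.
template <typename T, typename F>
auto map(const std::vector<T>& in, F f) {
    using Result = std::decay_t<decltype(f(in.front()))>;
    std::vector<Result> out;
    out.reserve(in.size());
    std::transform(in.begin(), in.end(), std::back_inserter(out), f);
    return out;
}

// Hypothetical `reduce`: fold the elements into one value, left to right,
// starting from `init`.
template <typename T, typename Acc, typename F>
Acc reduce(const std::vector<T>& in, Acc init, F f) {
    return std::accumulate(in.begin(), in.end(), init, f);
}

}  // namespace sketch
```

Mapping a squaring function over {1, 2, 3, 4} gives {1, 4, 9, 16}, and reducing that with addition from 0 gives 30.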

I'm also definitely considering adding support for DSLs for some relatively ubiquitous languages like SQL (i.e., kind of like LINQ but it'll suck less in terms of error reporting) or XML as Standard.

Threading. I don't want to just wrap atomic operations or mutexes. I want something along the lines of TBB or PPL. I want something smoother, more integrated. I also need to address the use of compile-time threading, which is something I want to hide from the user. I definitely don't want to force serialized compilation, but I also need to perform sometimes complex mutations at compile-time, and I'm going to need a powerful threading library to support that, and possibly a couple of language additions too.

I've also been thinking about mathematical support. Let's face it, there's really no need for everyone to define their own Point class or their own Vector3 class or their own Rectangle class. The Standard can provide that. Everybody's life would be much easier if the Standard provided a little more in terms of BLAS support. 

I also want to version the library separately from the language. If there's a problem, I want it resolved sooner rather than later. I don't want stuff that's bugged sitting around for a decade before it gets fixed. In addition, I want to be able to provide full source code for all functionality that isn't compiler-, OS- or processor-specific, that is, all functionality built on functionality from the language. If I'm going to say "You must provide SharedPointer", then I want to have portable source code to show for it and say "Well, if you can't be bothered to write your own, here's mine, so just copy and paste it and you're done".

What I will definitely not do is ever refer to the C or C++ Standards, or include the C or C++ Standard library. If I ever see anyone tag an SO question "DeadMG++/C++" without discussing interoperation, I'll flip my brains.

Another problem I've been considering is variadics. Ultimately, I feel like now that we can produce complex data structures at compile-time, they should be part of the library. However, I'm not totally sure how I could retain the relatively seamless integration of variadics, mostly as relates to deduction. Of course, regular iteration over them can now be done.

I guess that ultimately, "DeadMG++" really offers a new style, where functions and types are generated, and there's old-style "C++" where they are just literals. The new style is much more powerful and flexible, but the old style can be easier to use.
