circusmachina: Adventures in Computer Linguistics

Obligatory Disclaimer: Where I provide links to languages as examples of my own experiences with those languages, it is best to keep in mind that these have been my experiences, and the opinions I express about those languages are my opinions. Your mileage and conclusions may vary.

Why create a programming language? Any number of languages already exist; surely at least one of them will satisfy the purpose of a given programmer, whether that is greater fundamental control over the computer, improved structure and readability of code, or the quick translation of an idea into an actual process. Yet, when perusing the lists of available languages, it becomes clear that many of them are limited in some way: some have highly specialized use cases, some are limited to a narrow band of platforms, and some are not yet feature-complete. Moreover, when attempting to use some of these languages, one may find the syntax disagreeable, the documentation limited or incomplete, or the developers removing functionality in order to conform to the standards of the language as defined by a major corporation. Each of these languages may satisfy a particular programming requirement but, in trying to find one that I could use to write my game engine, I found none that satisfied me.

Sampling Languages

When setting out to write my game engine, I wanted to find the best possible programming language to use for that purpose. Of course, "best" is a subjective term, but since the end result was to be a game engine, there were certain criteria that could be used to help define the "best" language:

Since a game engine is dependent on quick execution, the language should produce quick and efficient code. Size was of less concern than speed.
Because a game engine is a complex collection of various systems -- almost an operating system in its own right -- the language should handle complexity with ease. In other words, its syntax and structure should allow complex ideas to be recorded in a clean and efficient manner.
I wanted my game to be playable on the widest possible array of systems; therefore, the language should be portable.

These were the three criteria I used to judge the various languages I tried. I was not concerned about libraries since mine was to be a custom game engine. There were only two libraries that I really needed: the OpenGL library and SDL. Since both of these provide C interfaces, there was little concern that they would be inaccessible from whatever language I eventually chose.

Why not C++?

Even though C++ is the most widely used language for game programming, I immediately ruled it out based on prior experience. Then, because I thought such a dismissal was too hasty, I gave C++ a chance anyway -- and immediately rejected it again. Consider the following, which is taken from my early source code for a binary tree:

/** Fetches the leaf with the specified key from the node or its subtrees.
    When called on the root node of a binary tree, this routine will recursively
    search the entire tree for \p thisKey.

    \return The ABinaryLeaf instance that contains \p thisKey, if one was
    found; \p NULL if \p thisKey was not found in the node or its subtrees.
*/
ABinaryLeaf **ABinaryLeaf::fetch(gint64 thisKey)
{
  // Stores the difference between the desired key and another key
  gint64 difference = 0;

  // Compare the desired key with our key
  difference = thisKey - myKey;
  if (difference < 0) {
    if (MyLeftTree)
      return MyLeftTree->fetch(thisKey);
    return &MyLeftTree;
  }
  else if (difference > 0) {
    if (MyRightTree)
      return MyRightTree->fetch(thisKey);
    return &MyRightTree;
  }
  return &this;
}

The point of this method is to find an occurrence of the specified sort key within the node itself, or in one of the branches connected to the node. Whether or not the key is found, the method returns the address of a node pointer: either the pointer that refers to the matching node, or the empty child pointer where a node with that key belongs. This makes it easy to find out where such a value should be inserted in the tree. But the code above does not compile.

Veteran C++ programmers will no doubt see what the problem is right away: the address of this cannot be taken as it can be for a regular pointer. Why not? Isn't this simply a pointer to the current instance? Yes, but no: this has special meaning to the compiler. It is an expression that yields a pointer value, not a pointer variable stored somewhere, so it has no address to take, even though it can be dereferenced like any other pointer. This is just one of many frustrations occasioned by the "special exceptions to the rule" that are rampant in C++. Even better(!), the use of templates and operator overloading allows the programmer to define all sorts of special exceptions to the rules. This, together with the widespread aversion that many C++ programmers seem to have to commenting their code, tends to render C++ code all but unreadable. I do not want to program in a language that requires me to memorize every exception to every rule -- I want a stable syntax!
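One way around the restriction, sketched here under assumptions rather than taken from the engine's source, is to have the caller pass in the slot (the pointer variable) being searched, so that nothing ever needs the address of this. The Leaf struct and the fetchSlot, insertKey, destroyTree, and fetchSlotExample names are hypothetical stand-ins for ABinaryLeaf and its methods, and keys are compared directly instead of by subtraction.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the ABinaryLeaf class above.
struct Leaf {
  std::int64_t key;
  Leaf *left = nullptr;
  Leaf *right = nullptr;
  explicit Leaf(std::int64_t k) : key(k) {}
};

// Returns the address of the slot (a Leaf* variable) that refers to the leaf
// with the given key, or the address of the empty slot where it belongs.
Leaf **fetchSlot(Leaf **slot, std::int64_t key) {
  while (*slot != nullptr && (*slot)->key != key) {
    slot = (key < (*slot)->key) ? &(*slot)->left : &(*slot)->right;
  }
  return slot;
}

// Inserts the key if it is absent; returns true when a new leaf was created.
bool insertKey(Leaf **root, std::int64_t key) {
  Leaf **slot = fetchSlot(root, key);
  if (*slot != nullptr) return false;
  *slot = new Leaf(key);
  return true;
}

// Frees every leaf in the tree.
void destroyTree(Leaf *leaf) {
  if (leaf == nullptr) return;
  destroyTree(leaf->left);
  destroyTree(leaf->right);
  delete leaf;
}

// Usage example: build a three-leaf tree and look keys up by slot.
void fetchSlotExample() {
  Leaf *root = nullptr;
  insertKey(&root, 10);
  insertKey(&root, 5);
  insertKey(&root, 20);
  assert(fetchSlot(&root, 10) == &root);     // the root's own slot
  assert(*fetchSlot(&root, 5) == root->left);
  assert(*fetchSlot(&root, 7) == nullptr);   // empty slot where 7 would go
  destroyTree(root);
}
```

The trade-off is that the search becomes a free function (or a static method) that starts from a pointer variable, such as the tree's root pointer, rather than a method called on a node.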

Why not Objective-C?

In searching for an alternative to C++ I stumbled, quite by accident, across Objective-C. Steve Jobs brought Objective-C from NeXT to Apple, so the primary source of information on the current incarnation of the language is to be found on the Apple developer site. Objective-C was, to my mind, what C++ should have been: an elegant solution to the problem of how to make the C language support object-oriented constructs natively. Even better, I found out that I didn't need Xcode or a Mac in order to use the language: there was a GCC front-end available that could be used on any platform supported by GCC!

For the next couple of months, I programmed my game engine in Objective-C, more or less content with the syntax. There were a couple of things that bothered me: the syntax of if statements requires parentheses, and when the expression includes a method call, you must also use brackets. This can result in clunky-looking code, particularly if you're nesting method calls:

if (([[MyObject string] isEmpty]) || ([[MyObject string] length] == 0)) ...

And although one of Objective-C's greatest strengths is its dynamic binding, which means you don't need to know at compile time whether a particular object implements a given method, there is the potential for a speed penalty when calling methods if the call is dispatched through the centralized messaging routine. This feature also limits the optimizations that can be applied to the code, since the compiler cannot inline a call whose target is unknown until run time.
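The cost can be pictured with a rough C++ analogy. This is only a sketch of the general idea of run-time method lookup, not how any Objective-C runtime is actually implemented; DynamicObject, sendMessage, doubledValue, and dispatchExample are hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical object whose methods are looked up by name at run time.
struct DynamicObject {
  using Method = int (*)(DynamicObject &);
  std::unordered_map<std::string, Method> methods;
  int value = 0;
};

// Centralized dispatch routine: every call pays for a table lookup, and the
// compiler cannot inline the target because it is not known until run time.
// Returns false when the receiver does not respond to the selector.
bool sendMessage(DynamicObject &receiver, const std::string &selector,
                 int &result) {
  auto found = receiver.methods.find(selector);
  if (found == receiver.methods.end()) return false;
  result = found->second(receiver);
  return true;
}

// An ordinary function: a direct call to it is resolved at compile time.
int doubledValue(DynamicObject &self) { return self.value * 2; }

// Usage example: the same method reached both ways.
void dispatchExample() {
  DynamicObject object;
  object.value = 21;
  object.methods["doubledValue"] = &doubledValue;

  int viaMessage = 0;
  assert(sendMessage(object, "doubledValue", viaMessage));  // looked up
  assert(viaMessage == doubledValue(object));               // called directly
  assert(!sendMessage(object, "missing", viaMessage));      // no such method
}
```

Both paths run the same function; the difference is that the dispatched call does a lookup first and can discover only at run time that the method is missing.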

My own tests indicated that GCC's implementation of Objective-C bound calls statically where it could, which meant methods were called directly when their entry points were known. This would tend to keep the speed of calls on par with C and C++, but I don't know if the GCC implementation still does this. When the developers of GCC chose to remove the legacy implementation of the runtime in favor of Apple's higher-level interface, I abandoned the language. My game engine uses a custom memory manager, and it is no longer possible to override the low-level Objective-C memory-management hooks in order to install your own memory management scheme.

It's too bad, too; Objective-C showed real promise. Taking power away from the programmers is a bad move. The whole point of writing programs is to make the computer DO something; when you take away the ability to make the computer do things, you undermine the fundamental purpose of the language.
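For readers unfamiliar with the kind of allocation hook described above, here is a minimal C++ sketch of the idea: a base class routes every allocation of its descendants through a custom manager. It is an analogy only and does not use any Objective-C runtime interface; EngineMemory, EngineObject, Sprite, and allocationExample are hypothetical names, and the "manager" merely counts blocks on top of malloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical memory manager: counts live blocks on top of malloc/free.
struct EngineMemory {
  static std::size_t liveBlocks;

  static void *allocate(std::size_t size) {
    void *block = std::malloc(size);
    if (block == nullptr) throw std::bad_alloc();
    ++liveBlocks;
    return block;
  }

  static void release(void *block) noexcept {
    if (block == nullptr) return;
    --liveBlocks;
    std::free(block);
  }
};
std::size_t EngineMemory::liveBlocks = 0;

// Hypothetical base class: its operator new and operator delete are the
// hooks, so every derived class is allocated through EngineMemory.
struct EngineObject {
  static void *operator new(std::size_t size) {
    return EngineMemory::allocate(size);
  }
  static void operator delete(void *block) noexcept {
    EngineMemory::release(block);
  }
  virtual ~EngineObject() = default;
};

struct Sprite : EngineObject {
  int x = 0;
  int y = 0;
};

// Usage example: creating and destroying a Sprite goes through the manager.
void allocationExample() {
  const std::size_t before = EngineMemory::liveBlocks;
  EngineObject *sprite = new Sprite;
  assert(EngineMemory::liveBlocks == before + 1);
  delete sprite;
  assert(EngineMemory::liveBlocks == before);
}
```

A real engine would replace the malloc call with its own pool or arena; the point is only that the language lets the programmer decide where objects come from.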

Why not Pascal?

Pascal was my second language growing up, after BASIC: I wrote a couple of utility programs with Turbo Pascal while in high school and eventually moved on to Delphi. Since Delphi is more or less limited to Windows (don't get me started on Kylix), and I wanted my engine to be portable, I opted to play with Free Pascal instead.

I like the structure of Pascal; it has a tendency to make code easy to read. Using Free Pascal's implementation of Object Pascal, I was able to get pretty far into coding my game engine before a couple of items brought the project to a standstill.

First, Free Pascal does not provide a convenient way to document your code. You can comment, of course -- and I do -- but there is no tool like Doxygen that can extract these comments to produce stand-alone documentation. A tool exists, but it requires you to document your code separately from the code itself, which seems redundant, since I already do this in my comments. There is a tool named PasDoc that will extract comments from source code, but it is somewhat limited in comparison to Doxygen; finally, there is a tool named Pas2Dox, which will convert Pascal comments and code to something that Doxygen can understand -- but then the representation of your code in the final documentation is not accurate.

Secondly, although Pascal provides structure, this has also caused it to become something of a verbose language, a problem which is compounded by the desire of the Free Pascal developers to support more than one variant of the language. More than anything else, this is what finally caused me to break away from Pascal: my code was beginning to feel bloated by the language itself. Consider this excerpt of my basic thread class:

/// Represents a single, basic thread of program execution
type AThread = class (AnObject)
  protected
    /// The handle of the thread
    myHandle: tThreadId;
    /// Indicates whether or not the thread is running
    myRunning: boolean;
    /// Indicates whether or not the thread has been cancelled
    myCancelled: boolean;
    /// Indicates whether or not the thread has finished
    myFinished: boolean;
    /// Stores the thread return value
    myReturnValue: integer;
    /// Stores the desired stack size of the thread
    myStackSize: longword;
    /// Gets the thread priority
    function getPriority: integer;
    /// Sets the thread priority
    procedure setPriority(const pri: integer);

  public
    /// Initializer
    function init: boolean; override;
    /// Destructor
    destructor destroy; override;
    /// Starts the thread
    function start: boolean; virtual;
    /// Prepares the thread instance for execution as a separate thread
    function prepareToSplit: boolean; virtual; abstract;
    /// Executes the thread
    procedure execute; virtual; abstract;
    /// Finishes execution of the thread
    procedure prepareToJoin; virtual;
    /// Waits for the thread
    function join(const timeout: integer): longword; virtual;
    /// Yields thread execution
    procedure yield; virtual;
    /// Suspends the thread
    function pause: integer; virtual;
    /// Resumes the thread
    function continue: integer; virtual;
    /// Cancels the thread
    procedure cancel; virtual;
    /// Suspends the current thread
    class function suspend: integer; virtual;
    /// Resumes the specified thread
    class function resume(const thisThread: TThreadID): integer; virtual;

  // Properties ----------------------------------------------------------------
    property stackSize: longword read myStackSize write myStackSize;
    property priority: integer read getPriority write setPriority;
    property value: integer read myReturnValue write myReturnValue;
    property isRunning: boolean read myRunning;
    property hasFinished: boolean read myFinished;
    property isCancelled: boolean read myCancelled;
    property handle: TThreadID read myHandle;
end;

By itself, the class doesn't look bloated, does it? But it feels bloated to me, especially when it is just one of many such classes. Look at how almost every method is required to have a modifier appended to it: class or virtual or abstract or override. Look at the redundancy occasioned by the use of properties (this is the reason that I don't code them anymore). The code feels messier than it ought to be, and I think the chief reason for this is the use of properties and the necessity for function modifiers when using classes. The language is too verbose.

Why not any other language?

I've tried several of them: Python (interpreted, and so too slow); Vala (not yet feature complete); FreeBASIC (not yet feature complete); OOC/rock (not yet feature complete); C# (ugly syntax); C with GLib/GObject (probably the best compromise) -- but the end result is always the same: the language does not satisfy all three of the criteria listed above. And when no language will do, you either stop programming, sacrifice one or more of your ideals in the name of just getting the thing done, or write your own language: one that satisfies all of your criteria.

Writing a Custom Computer Language

I initially resisted the idea of writing my own language, not because I didn't want to do so at some point, but because of the time required. I wanted to get my game engine done, and I'd already lost time in trying the languages that were available. But after tackling the problem from various angles, writing and rewriting code in various languages to see if I could find a way to bend the syntax to my will, I always came back to the same conclusion: the language does not yet exist that will do what I want it to do. Having accepted that, it is now time to build that language.

Next: Causerie Begins With...
