Table of Contents | The Language > Operator Overloading
Prev: Inline Objects     Next: Multi-Methods    

Operator Overloading

TADS lets you define custom meanings for many of the “algebraic” operators when applied to regular objects. This is known as overloading an operator, because it assigns a meaning to the operator when it’s used with a datatype outside of its usual domain.

Note that the term is overloading, not overriding, and there’s an important difference between the two. Overriding would mean you’re changing the existing meaning of an operator; TADS doesn’t let you do this. For example, you can’t change the meaning of “+” as it applies to integer values, because the TADS virtual machine has a built-in definition for that operator on that combination of types. Overloading, in contrast, is the extension of an operator to a new type that it normally doesn’t work with, so it’s not superseding anything that’s already built into the system.

If you’re not familiar with operator overloading from other programming languages, and you want an overview of what it’s good for, you can skip ahead to the section on uses.

How operators are applied

Before we talk about how to overload an operator, we have to know where to put the overload. To determine that, it’s important to understand how TADS applies operators to values.

TADS is a dynamically typed language: the datatype of a variable or expression isn’t declared in the source code, but is determined based on the actual values being used at the moment the expression is executed. As such, the exact meaning of an operator can’t be known until operator is carried out on live data. Many operators have meanings that vary according to the types of the values they’re applied to. For example, + can mean integer addition, string concatenation, or list appending.

Each time an operator is executed, its meaning is determined by looking at a specific one of its operands. (An operand is a value that an operator is applied to. In 3 + 4, the operator is + and the operands are 3 and 4.) The choice of which operand controls the meaning is fixed for each operator: it’s always the same, no matter what types are involved.

So, each time an operator is evaluated, TADS first fully evaluates each operand to get the actual, live value of that sub-expression. It then looks at the datatype of the value for the controlling operand.

If this is one of the primitive types, such as integer or nil, TADS simply carries out its built-in definition of the operator for that type. There’s no way to change this through operator overloading; 3 + 4 will always yield 7.

If the controlling operand is an object value, and it’s an object of one of the intrinsic class types (String, List, LookupTable, etc.), the machine checks to see if that class defines a meaning for the operator. For example, String defines + as concatenation, and List defines \[\] as indexing the list. If there’s a built-in meaning for the operator, TADS carries out that meaning. As with the primitive types, there’s no way to change this through operator overloading.

If the controlling operand is an object value, and the object doesn’t have a built-in meaning for the operator, this is where operator overloading kicks in. TADS checks to see if the object defines or inherits an operator method for this specific operator. If so, TADS invokes the method, and uses the return value of the method as the operator result.

If there’s no built-in definition for the operator, and there’s no corresponding operator method defined by the controlling operand value, TADS gives up and throws a run-time error to signal that the operator isn’t valid for that combination of types.

Defining an overloaded operator

An overloaded operator is a special kind of method. You define it alongside ordinary properties and methods when you define a class or object. An overload method uses a special name that starts with the keyword operator, followed by the operator to be overloaded, followed by an ordinary argument list and method body.

For example, to overload the + operator for a class, you’d define a method called “operator +”:

    class ScopeList: object
       operator +(x)
       {
         local l = new ScopeList();
         l.scope = scope;
         l.scope += x;
         return l;
       }

       scope = []
    ;

An operator method is always invoked on the controlling operand of the operator. This is the self value for the method. The remaining operands are passed as arguments to the method. This means that the number of arguments is always one less than the number of operands.

That’s why the operator + we defined in the example above only takes one argument. Since the controlling operand of a binary operator is the left operand, the expression a + b effectively becomes a call to a.operator +(b). (That’s not real syntax, though: you can’t use operator+ to make a call to the method. To call the method you have to use the algebraic syntax, a + b.)

Apart from the special naming syntax, an operator overload method is written just like an ordinary method. It’s fine to trigger side effects (such as displaying text on the screen), call other methods and functions as normal, and call inherited and delegated as usual. Note, though, that the operator syntax can’t be used as a method name in a regular method call or in an inherited or delegated, so you can’t straightforwardly have class Sub’s operator + method inherit, say, class Base’s operator \* method.

The method should return a value to be used as the result of the operator expression. If there’s no explicit return statement, the compiler will supply nil as the default return value, as it does for any other method.

Overloadable operators

The following operators are overloadable:

Operator

Type

Expression Syntax

Definition Syntax

+

binary

a + b

operator +(b)

negate

unary

-a

operator negate()

-

binary

a - b

operator -(b)

\*

binary

a \* b

operator \*(b)

/

binary

a / b

operator /(b)

%

binary

a % b

operator %(b)

^

binary

a ^ b

operator ^(b)

\<\<

binary

a \<\< b

operator \<\<(b)

\>\>\>

binary

a \>\>\> b

operator \>\>\>(b)

\>\>

binary

a \>\> b

operator \>\>(b)

~

unary

~a

operator ~()

\|

binary

a \| b

operator \|(b)

&

binary

a & b

operator &(b)

\[\]

binary

a\[b\]

operator \[\](b)

\[\]=

ternary

a\[b\] = c

operator \[\]=(b, c)

You can only overload the operators listed above. Notably, you can’t overload any of the comparison operators (==, !=, \<, \<=, \>, \>=), the logical operators (&& and \|\|), the “if-nil” operator (??), or the conditional operator (? :). You also can’t overload the regular assignment operator (=), but you can override the index-and-assign operator, \[\]=.

The compound assignment operators, such as += and -=, also can’t be separately overloaded. However, when these operators are executed, the system actually handles them as though they were written out as separate steps, so the non-assignment part will still invoke your overload. For example, the code a += b will invoke a.operator +(b) to compute the value to assign to a. The same applies to the increment and decrement operators, ++ and --. The compiler decomposes a++ into a = a+1, so while you can’t overload ++ separately, you can still intervene in such an expression by overloading +.

You’ve probably noticed the odd man out in the table, “operator negate”. This is unary negation operator, which in actual expressions is just a minus sign “-”. Now, the minus sign is also used as the two-operand subtraction operator. The language gets away with reusing the same symbol for “negative a” and “a minus b” in expressions because it’s always clear from context which is which. But with operator overloading, there’s no such context in one important situation, which is getting a property pointer: does “&operator-” mean “operator minus” or “operator negate”? TADS solves this problem by defining “operator -” solely as the subtraction operator, and renaming the negation operator to “operator negate”.

The indexing operator’s method syntax might also look odd, because you’d normally write the argument to \[\] inside the brackets. But the correct syntax is as shown in the table, with the argument in parentheses after the brackets. This makes the method syntax consistent for all of the different operators.

The index-and-assign operator is worth mentioning because it might not be obvious what the return value from this method would mean. The primary function of the operator is to assign a value to the indexed element. So should it simply return the assigned value? No: it should actually return the container value. In many cases, this is simply self, but not always. Some types create a new object when any change is made to their contents. That’s how List works, for example. When you assign to a list element, a new list is created to reflect the update, leaving the original list object unchanged. This is required because a list is immutable . For this case, the \[\]= operator method returns the newly created container object (the new List object in the List example). If your operator \[\]= method modifies the object’s own contents directly, way Vector and LookupTable do, you should simply return self, since the “new” container reflecting the changes is simply the original container, self.

Property pointers to overload methods

You can’t call an operator overload method by name (you can call it only through the actual operator syntax), but you can get a pointer an overload property. Use the & operator as normal, and simply use the same name as the method definition uses. For example:

    if (obj.propDefined(&operator -))
       "- is defined for 'obj'\n";

List-like objects

One of the consequences of being able to overload operators is that you can create custom objects that mimic the built-in List type. Not only can you use these as though they were lists in your own code, but you can also pass them to most system functions and methods that require List arguments.

For the purposes of system functions and methods, a list-like object is defined as an object that provides the following:

Most intrinsic functions and methods will interpret an object providing these methods as list-like. The “…” argument expansion operator also recognizes this interface.

An object that meets this definition must return an integer value from its length() method. The system will invoke it expecting to get an element count as the result, so a non-integer return value will trigger a type conversion error.

Note that if you plan to use a custom object as though it were a list in your own functions and methods, you might also want to include the Collection method createIterator(). This method is required for an object to be used with the foreach or for..in statements. This isn’t required for system functions that take List arguments, because those callers only expect the basic List-style interface. But by adding createIterator(), you can ensure that your object behaves like a generic Collection for places where you own code might use it with foreach or for..in.

You might also want to define operator \[\]=, since if you’re passing it to code written to expect true List values, the code might assign element values.

Intrinsic classes

It’s legal to add overloaded operators for intrinsic classes. The syntax is the same as for a regular operator overload method, and goes in a modify definition for the intrinsic class, the same as for any other intrinsic class extensions.

You can add an operator to an intrinsic class, but you can’t override an operator that’s already defined. In other words, you can only add operators that the intrinsic class doesn’t define internally. For example, you can’t override operator + for String, because String defines + as the concatenation operator.

If you attempt to override an operator on an intrinsic class that’s already defined by the built-in class, the compiler will accept the definition without a complaint, but the method will never be called at run-time. The virtual machine always looks first for a native definition for an operator, and only looks for an overloaded operator method if the native method doesn’t exist. This is the same way that all intrinsic class methods work, actually: you can’t override String.length() either, because a native version is given priority over an extension in a modify definition.

Limitations

You can only overload the operators specifically listed earlier. Other operators aren’t eligible for overloading.

You can’t change the syntax or precedence of an operator. For example, you can’t redefine \<\< as a unary operator, or change \[\] from a postfix to a prefix, or give + higher precedence than \*.

There’s no way to create new operators.

It’s not possible to override an operator that’s defined for a system type. For example, you can’t redefine \[\] for a List, or + for a String.

There’s no way to override operators on primitive (non-object) types, such as integers.

propDefined(), propType(), getPropList(), and getPropParams() don’t work for the built-in operators of intrinsic classes, such as operator + for a String or operator \[\] for a List. They’ll simply act as though you’re talking about an undefined method. For example, 'x'.propDefined(&operator +) will return nil, even though + certainly does work with strings. The VM-defined operators of intrinsic classes are essentially built in, so there’s no equivalent of an operator method represented in the type system. For the same reason, you can’t inherit from or delegate to a built-in operator of an intrinsic class.

Typical uses

Operator overloading has been a feature of mainstream programming languages since the 1960s; modern examples include C++ and Python. Over the years, some common patterns of usage have emerged, as well as some pitfalls.

The original use case, which motivated the first implementations, was custom numeric types. The canonical example is complex numbers, but there are also vectors (in the mathematical sense), tensors, etc. An extended numeric type is a natural fit for the algebraic operators, since that’s exactly the notation that mathematicians have always used for them. The ability to define the standard operators on new numeric types lets you create types that can be used almost as though they were native. Using algebraic operators can make it easier to write code using the types than if you had to use function calls, and can make the code not only more concise to write, but easier to understand when read.

The second common use is for creating types that aren’t numerical in nature, but where some of the algebraic operators still have intuitive meanings. The String and List classes are built-in examples: in both cases, the + operator has a fairly intuitive meaning, even though it doesn’t involve arithmetic addition. As with numeric types, operator definitions can make custom object types easier to use and make code at once more concise and clearer.

The second-and-a-half use is for creating your own custom objects that mimic built-in objects that use operators. For example, you could create a special “Scope List” class that represents the set of objects in scope, but which can be used as though it were an ordinary list by code that doesn’t need to access its special features. You could do this by defining the + and \[\] (indexing) operators on the object to act the same way they do with a regular List object.

The third use is simply to make code more concise, by using operators for frequently called methods instead of spelling out method names. This is only subtly different from the second case; the distinction we’re drawing is that you can define an operator on an object even when there isn’t an obvious, intuitive meaning that’s somehow analogous to the usual arithmetic meaning of the operator. In these cases you don’t define operators for their clarity, but only for their concision. For example, the C++ standard libraries redefine \>\> and \<\< on file stream objects as “read” and “write” operators. This has nothing to do with the ordinary arithmetic meaning of the operators, so in that sense it’s a completely arbitrary reassignment of the operators, but it at least has a sort of visual logic to it.

This third use of overloading is somewhat controversial. Among C++ developers, it’s taken for granted and largely accepted, because the language’s inventor embedded the idea in the standard C++ libraries from the beginning. However, the criticism is that it hurts code clarity by arbitrarily redefining what an operator does, to such an extent that a reader would never guess the meaning of the code just by looking at it. Most people would intuitively know what + does when applied to a String or List, but unless you’ve read the C++ I/O documentation, you’d probably never guess that \<\< can sometimes write to the console. (In fact, the TADS VM source code itself, which is written in C++, is largely free of overloaded operators out of just these concerns.) If you’re writing an extension or library that other people will use, it’s probably best to use this sort of overloading sparingly.


TADS 3 System Manual
Table of Contents | The Language > Operator Overloading
Prev: Inline Objects     Next: Multi-Methods