"Java Liaison" column
Richard Gillam
March 1999
In the previous installments of this column, I've tried to dispel (or minimize) the big complaints about Java and have focused on its standard libraries and runtime environment. This time, I'd like to look more closely at the core language itself and some of its warts.
The designers of Java decided to make Java look and feel more or less like C++ because
so many programmers are familiar with C++. Since the underlying philosophy of Java was
very different, they had to depart from the C++ model, often radically, in many places.
They also made some smaller syntactic changes to improve readability and usability.
Furthermore, they made a conscious decision to simplify the language, jettisoning
whole features of C++ that they deemed to be unnecessary. This may or may not have been a
good idea, but C++ programmers used to these features will be justifiably annoyed at their
absence in Java. I'd like to look at four such features of C++ and show how (when
possible) their absence in Java can be worked around.
The most conspicuous omission, and one of the ones that spark really heated religious wars, is operator overloading. Java preserves function-name overloading, but not operator overloading.
This was a very intentional omission. The designers of the language felt that operator overloading wasn't really necessary. It added complexity without enabling you to do anything you couldn't do by other means.
Many people have cried foul. They use operator overloading all the time, it's convenient, and it simplifies their code. One example is scientific applications that define their own numeric types, such as Complex, Matrix, Ratio, or Bignum. If you're doing a lot of math on different data types, you don't want to mess with manually converting types everywhere, and more importantly, you don't want to have to do a + b with some types of a and b and add(a, b) with other types. This is confusing and inconsistent.
You could, of course, standardize on the add(a, b) format for all mathematical operations, but you lose automatic operator precedence and everything becomes much longer and uglier. Instead of "(2 * a + 4 * b) / 3", you'd have to write something like "div(add(mult(2, a), mult(4, b)), 3)". Which one of these expressions is easier to read? (Of course, the same formula in typeset mathematical notation would be easier yet, but it's kind of hard to type.)
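To make the comparison concrete, here is a minimal sketch of the function-call style in Java. The Ratio class below is hypothetical (it is not part of any standard library), and the overloads taking a plain long are an assumption added so that calls like mult(2, a) compile; they rely on the function-name overloading that Java does keep.

```java
// Hypothetical Ratio type, written only to illustrate the add(a, b) style.
final class Ratio {
    final long num;
    final long den;

    Ratio(long num, long den) {
        if (den == 0) throw new ArithmeticException("zero denominator");
        if (den < 0) { num = -num; den = -den; }   // keep the denominator positive
        long g = gcd(Math.abs(num), den);          // reduce to lowest terms
        this.num = num / g;
        this.den = den / g;
    }

    private static long gcd(long a, long b) {
        while (b != 0) { long t = a % b; a = b; b = t; }
        return a;
    }

    static Ratio add(Ratio a, Ratio b) {
        return new Ratio(a.num * b.den + b.num * a.den, a.den * b.den);
    }

    static Ratio mult(Ratio a, Ratio b) {
        return new Ratio(a.num * b.num, a.den * b.den);
    }

    // Overloads on the function name stand in for the implicit conversions
    // that C++ operator overloading would give you.
    static Ratio mult(long k, Ratio a) { return mult(new Ratio(k, 1), a); }

    static Ratio div(Ratio a, Ratio b) {
        return new Ratio(a.num * b.den, a.den * b.num);
    }

    static Ratio div(Ratio a, long k) { return div(a, new Ratio(k, 1)); }

    public String toString() {
        return den == 1 ? Long.toString(num) : num + "/" + den;
    }
}

class RatioDemo {
    public static void main(String[] args) {
        Ratio a = new Ratio(1, 2);
        Ratio b = new Ratio(3, 4);
        // The Java spelling of (2 * a + 4 * b) / 3:
        Ratio result = Ratio.div(Ratio.add(Ratio.mult(2, a), Ratio.mult(4, b)), 3);
        System.out.println(result); // prints 4/3
    }
}
```

Note that the nesting of the calls has to encode the precedence by hand, which is exactly the readability cost described above.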
Another thing operator overloading is frequently used for in C++ is accessing individual elements in a collection. Everybody knows what "x = y[3]" or "a[b] = 3" mean, no matter what type of collection y or a might be in these examples. In Java, on the other hand, if the collection is an array, you say "x = y[3]", but if the collection is a string, you say "x = y.charAt(3)", and if the collection is a Vector, you say "x = (foo)y.elementAt(3)" [where "foo" is the type of x]. At best, this is confusing.
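The three spellings can be put side by side in a short sketch. The class and method names here are made up for illustration, and the Vector is declared as Vector<Object> on the assumption that this is the closest modern equivalent of the untyped Vector the column describes, so the cast is still required.

```java
import java.util.Vector;

class ElementAccessDemo {
    // Same logical operation, three different syntaxes.
    static char fourthOfArray(char[] y) {
        return y[3];                         // array: subscript operator
    }

    static char fourthOfString(String y) {
        return y.charAt(3);                  // String: method call
    }

    static char fourthOfVector(Vector<Object> y) {
        return (Character) y.elementAt(3);   // Vector: method call plus a cast
    }

    public static void main(String[] args) {
        char[] array = {'a', 'b', 'c', 'd'};
        String string = "abcd";
        Vector<Object> vector = new Vector<Object>();
        for (char c : array) {
            vector.addElement(Character.valueOf(c));
        }
        System.out.println(fourthOfArray(array));
        System.out.println(fourthOfString(string));
        System.out.println(fourthOfVector(vector));
    }
}
```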
There are a few other things operator overloading is used for in C++. A biggie is implementing smart pointers and proxy classes so that their operation is completely (or almost completely) transparent to the code using them. There are situations where it'd be nice to be able to do this in Java too, but most of the things that C++ smart pointers do automatically (e.g., adjusting reference counts, deleting objects, etc.) aren't necessary in Java.
Despite the usefulness of operator overloading, I suspect that it'll never get added to the language. Operator overloading (at least C++-style operator overloading) does significantly complicate things, although Java's overall design would eliminate some of the biggest hassles and there are simpler, more constrained alternatives to C++-style operator overloading that could be used instead. There's no practical workaround either, other than the obvious one: use regular function calls and put up with the disadvantages described above. Personally, I find this not to be a really big deal, but there are a lot of people that do.
Java doesn't have enumerated types. This is one thing I definitely have missed. In Java, the standard approach taken by the Java Class Libraries is to define named constants of some integral type. For example, where in C++, you'd write something like
enum EScrollbarStatus { kScrollbarsNone, kScrollbarsVertical, kScrollbarsHorizontal, kScrollbarsBoth };
instead you'd do the following in Java:
    public static final int SCROLLBARS_NONE = 0;
    public static final int SCROLLBARS_VERTICAL = 1;
    public static final int SCROLLBARS_HORIZONTAL = 2;
    public static final int SCROLLBARS_BOTH = 3;
This is syntactically uglier, which is a pain, but a bigger problem is that it's not type-safe. In C++, you can define parameters and variables of type EScrollbarStatus. The compiler will then guarantee (short of an explicit cast) that the parameter will have one of the four legal values; everything else will cause a compile error. In Java, you have to check for illegal values manually. In other words, where in C++ you can just say
    void doSomething(EScrollbarStatus status) {
        // do something with "status"
    }
in Java you have to do something more like
    public void doSomething(int status) {
        if (status < SCROLLBARS_NONE || status > SCROLLBARS_BOTH)
            return; // or throw an exception, or something
        // do something with "status"
    }
This both forces more work on the programmer and is brittle. If you add another constant to the enumerated type, you have to remember to check and fix all these validity checks to account for it (you could separate the check into a function of its own, but this gets kind of silly after a while, and it only minimizes the labor).
In C++, enums are often used just to define a bunch of symbolic constants en masse. In Java, you have to put up with a little more syntactic ugliness, but that isn't such a big deal. On the other hand, when you really need a whole new type with a restricted range of values, as in the example above, Java forces you to do more work and introduces the potential for bugs.
There is a workaround for this in Java. Instead of making the individual enumeration values integers, you make them references to individual instances of a private class. It looks something like this:
    private static class ScrollbarStatus { }
    public static final ScrollbarStatus SCROLLBARS_NONE = new ScrollbarStatus();
    public static final ScrollbarStatus SCROLLBARS_VERTICAL = new ScrollbarStatus();
    public static final ScrollbarStatus SCROLLBARS_HORIZONTAL = new ScrollbarStatus();
    public static final ScrollbarStatus SCROLLBARS_BOTH = new ScrollbarStatus();
You usually scope the "enumeration" class inside some other class, and scope the constant definitions in the same class. (The "enumeration" class is a private member of its enclosing class so you can't create more instances of it; interestingly, you can still declare variables or parameters of the enumerated type even though it's private.) The class doesn't have to carry any state or define any methods; we're only interested in object identity (i.e., address in memory) here. Then the function using the type can look like this
    void doSomething(ScrollbarStatus status) {
        // do something with "status"
    }
and you still get the type safety check for free. One problem with this arrangement is that you can't use switch with the constants of the enumerated type because switch only works with integral primitive types. Instead of
    void doSomething(ScrollbarStatus status) {
        switch (status) {
            case SCROLLBARS_NONE:
                // blah blah blah
            case SCROLLBARS_VERTICAL:
                // blah blah blah
            case SCROLLBARS_HORIZONTAL:
                // blah blah blah
            case SCROLLBARS_BOTH:
                // blah blah blah
        }
    }
you have to do
    void doSomething(ScrollbarStatus status) {
        if (status == SCROLLBARS_NONE)
            // blah blah blah
        else if (status == SCROLLBARS_VERTICAL)
            // blah blah blah
        else if (status == SCROLLBARS_HORIZONTAL)
            // blah blah blah
        else
            // blah blah blah
    }
To get around this, or deal with situations where you need real values for the enumeration constants, you have to modify things like this:
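    public static class ScrollbarStatus {
        // private, so only the enclosing class can create instances
        private ScrollbarStatus(int value) { this.value = value; }
        // exposes the underlying value, e.g. for use in a switch
        public int getValue() { return value; }
        private int value;
    }
    public static final ScrollbarStatus SCROLLBARS_NONE = new ScrollbarStatus(0);
    public static final ScrollbarStatus SCROLLBARS_VERTICAL = new ScrollbarStatus(1);
    // and so on...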
Making an enumerated type into a class can be kind of complicated, and probably isn't always worth the trouble, but it very neatly handles the type safety problem. One big problem here is that most of the Java Class Library APIs don't use this approach, so if you want to use this approach with enumeration parameters defined by the JCL, it either doesn't help you or introduces more work.
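Pulled together into something that compiles on its own, the class-based approach might look roughly like the sketch below. Everything other than the SCROLLBARS_ constant names is an assumption made for illustration: the class is a top-level class with a private constructor (one way to keep other code from creating extra instances), and the separate int codes exist only because case labels must be compile-time constants.

```java
// A sketch of a class-based "enumeration"; not code from the Java Class Libraries.
final class ScrollbarStatus {
    // Plain int codes, present only so the values can be used as case labels.
    static final int NONE_CODE = 0;
    static final int VERTICAL_CODE = 1;
    static final int HORIZONTAL_CODE = 2;
    static final int BOTH_CODE = 3;

    static final ScrollbarStatus SCROLLBARS_NONE = new ScrollbarStatus(NONE_CODE);
    static final ScrollbarStatus SCROLLBARS_VERTICAL = new ScrollbarStatus(VERTICAL_CODE);
    static final ScrollbarStatus SCROLLBARS_HORIZONTAL = new ScrollbarStatus(HORIZONTAL_CODE);
    static final ScrollbarStatus SCROLLBARS_BOTH = new ScrollbarStatus(BOTH_CODE);

    private final int value;

    // Private: the four constants above are the only instances that can exist.
    private ScrollbarStatus(int value) { this.value = value; }

    int getValue() { return value; }
}

class ScrollbarStatusDemo {
    // Identity comparison: works even if the class carries no state at all.
    static String describeWithIf(ScrollbarStatus status) {
        if (status == ScrollbarStatus.SCROLLBARS_NONE) return "none";
        else if (status == ScrollbarStatus.SCROLLBARS_VERTICAL) return "vertical";
        else if (status == ScrollbarStatus.SCROLLBARS_HORIZONTAL) return "horizontal";
        else return "both";
    }

    // Switching on the underlying int value instead.
    static String describeWithSwitch(ScrollbarStatus status) {
        switch (status.getValue()) {
            case ScrollbarStatus.NONE_CODE: return "none";
            case ScrollbarStatus.VERTICAL_CODE: return "vertical";
            case ScrollbarStatus.HORIZONTAL_CODE: return "horizontal";
            case ScrollbarStatus.BOTH_CODE: return "both";
            default: throw new IllegalStateException("unknown value");
        }
    }

    public static void main(String[] args) {
        System.out.println(describeWithIf(ScrollbarStatus.SCROLLBARS_VERTICAL));
        System.out.println(describeWithSwitch(ScrollbarStatus.SCROLLBARS_BOTH));
        // describeWithIf(2);  // would not compile: an int is not a ScrollbarStatus
    }
}
```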
When faced with a choice between no macro system at all and a bad macro system, it's kind of hard to know which way to vote. But there are definitely times when a macro system is a nice thing to have, and I occasionally miss its presence in Java.
In fact, there are some things that you don't need a macro system to do in C++, where you really do in Java, so you could argue that a macro system would be even more useful in Java. In C++, stack-based classes are frequently used to automatically release a resource when a particular block of code is exited. You don't need to do this with memory in Java, but you do have to do it with most other resources. Consider files, for example. In C++, you could define a class that represents a file and takes care of closing the file in its destructor. Then all you'd have to do in a function that used it is declare an object of the proper type and do your thing. Whether the function terminates normally or abnormally, you're guaranteed that the file will be closed.
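In Java, on the other hand, you have to close the file manually. So code using one of the built-in Java file stream classes, such as FileInputStream, would have to look something like this:
    import java.io.FileInputStream;
    import java.io.IOException;

    void doSomething() throws IOException {
        FileInputStream f = new FileInputStream("foo.txt");
        try {
            // do stuff with f
        } finally {
            f.close();
        }
    }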
It'd be wonderful if you could take all of this boilerplate code out, so the function basically only consists of the code doing the actual work. Something like
    void doSomething() {
        WITH_OPEN_FILE(f, "foo.txt") {
            // do stuff with f
        }
    }
WITH_OPEN_FILE would do the new and close calls for you, and would also do the try/finally block. It would also automatically define a variable named f for the open file. Making things look and work this way, however, requires that you be able to extend the program syntax, which you can't do. This requires a macro system.
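Short of a macro system, one way to get part of the way there is to pass the body of the block in as an object. This is not something the column itself proposes; it is a rough sketch under the assumption that a callback interface is acceptable, and the names FileAction, OpenFileHelper, and withOpenFile are all hypothetical.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical callback type: the "body" of the block, written as an object.
interface FileAction {
    void run(FileInputStream f) throws IOException;
}

class OpenFileHelper {
    // Opens the file, runs the action, and closes the file whether the
    // action returns normally or throws.
    static void withOpenFile(String name, FileAction action) throws IOException {
        FileInputStream f = new FileInputStream(name);
        try {
            action.run(f);
        } finally {
            f.close();
        }
    }
}

class OpenFileDemo {
    public static void main(String[] args) throws IOException {
        // Create a small temporary file so the example has something to read.
        File temp = File.createTempFile("foo", ".txt");
        temp.deleteOnExit();
        FileOutputStream out = new FileOutputStream(temp);
        try {
            out.write('J');
        } finally {
            out.close();
        }

        final int[] firstByte = new int[1];
        OpenFileHelper.withOpenFile(temp.getPath(), new FileAction() {
            public void run(FileInputStream f) throws IOException {
                firstByte[0] = f.read();   // do stuff with f
            }
        });
        System.out.println((char) firstByte[0]); // prints J
    }
}
```

The boilerplate now lives in one place, but the call site is still noisier than the imagined WITH_OPEN_FILE syntax, and local variables have to be smuggled out through something like the one-element array above.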
Of course, the C/C++ macro system (#define) doesn't allow you to do that either. Since all it does is simple textual substitution, it doesn't really know about the language syntax or let you extend it. You can't use a single macro call to bracket a body of code with extra boilerplate on either side. But in C++, you have the destructor trick, so this doesn't matter.
The designers of Java did take care of one of the other major uses for the destructor trick by extending the language. Instead of having to manually acquire and release monitor locks, you use the synchronized keyword to bracket critical sections of code, and the release of the lock happens automatically. But other uses of the destructor trick, such as opening and closing files or acquiring and releasing GUI system resources (such as windows), still have to be done manually in Java.
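A small sketch shows the automatic release in action. The class and member names are invented for the example; Thread.holdsLock is used only to observe whether the current thread owns the monitor.

```java
class SynchronizedDemo {
    static final Object LOCK = new Object();

    // Reports whether the current thread holds the monitor inside the block.
    static boolean heldInsideBlock() {
        synchronized (LOCK) {
            return Thread.holdsLock(LOCK);
        }
    }

    // Leaves the block abnormally when "fail" is true.
    static void criticalSection(boolean fail) {
        synchronized (LOCK) {
            if (fail) {
                throw new IllegalStateException("something went wrong");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(heldInsideBlock());           // prints true
        try {
            criticalSection(true);
        } catch (IllegalStateException expected) {
            // The monitor was released on the way out, with no explicit unlock call.
        }
        System.out.println(Thread.holdsLock(LOCK));      // prints false
    }
}
```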
The C++ macro system is good for things like inserting canned boilerplate code in a lot of places (say, you've got a bunch of classes that are all supposed to define the same four functions with the same implementation) or isolating platform-specific code from client code (e.g., defining t_int32 to be a 32-bit integer, making it an int on some systems and a long on others). These uses are less important in Java, but do still come up.
A good language-sensitive macro system is hard to do and greatly complicates the language, but it might be possible to add a very limited one, or merely extend the language syntax to cover more of the common cases (e.g., add a keyword that would force an object's finalizer to be called at a predictable time).
const
The most unpleasant and error-prone omission from Java, the one that's given our whole team the most grief, is the absence of const. I have to be careful here to explain what I mean. The const keyword in C++, combined with the C declarator syntax, is a syntactic monstrosity, and I'm not for a minute advocating bringing all that complexity over to Java. I also am not confusing the meanings of const in C++ and final (const's closest analogue) in Java.
Let me explain what I do mean. Say you're writing a function in Java and it's supposed to call another function and pass it a reference to one of your objects. You're guaranteed that the variable that points to that object will still point to it when the function you call returns, but you're not guaranteed that that object's state hasn't been changed by the function you called. If you're receiving a reference to an object from your function's caller, you don't know whether the caller is expecting you to leave the object alone or if you can change it.
This is important. Sometimes you want multiple functions or objects pointing to another object to see changes to it, and sometimes you don't. You have to be careful to make sure just which objects hold references and which objects hold copies, and to uphold these invariants everywhere. Breaking those invariants (or setting them up wrong in the first place) can produce wrong results, and the resulting bugs can be hard to track down.
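The problem is easy to reproduce. In the sketch below, Settings and preview are hypothetical names; the point is only that nothing in the signature of preview tells the caller whether its argument will come back changed.

```java
// Hypothetical mutable class, for illustration only.
final class Settings {
    private int volume;

    Settings(int volume) { this.volume = volume; }

    int getVolume() { return volume; }
    void setVolume(int newVolume) { volume = newVolume; }
    Settings copy() { return new Settings(volume); }
}

class AliasingDemo {
    // Looks harmless from the outside, but quietly modifies its argument.
    static int preview(Settings s) {
        s.setVolume(s.getVolume() * 2);
        return s.getVolume();
    }

    public static void main(String[] args) {
        Settings shared = new Settings(5);
        preview(shared);
        System.out.println(shared.getVolume());    // prints 10: the caller's object changed

        Settings guarded = new Settings(5);
        preview(guarded.copy());                   // defensive copy
        System.out.println(guarded.getVolume());   // prints 5
    }
}
```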
Java dramatically simplified the various access-control and storage-class options available in C++. Variables (class, instance, and local), function parameters, and function return values all come in only two flavors: immediate values and references. If a variable is of a primitive type, it's an immediate value; if it's of an object or array type, it's a reference. If a variable is declared final, that means you're not allowed to change the value of the variable. That means that if the variable is a reference, you can't change it to refer to another object, but you can freely make changes to the object it refers to. There's no way to prevent this.
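A few lines are enough to see the difference between a fixed reference and a fixed object. The class name is made up; StringBuffer is used simply because it is a familiar mutable class.

```java
class FinalReferenceDemo {
    static String demo() {
        final StringBuffer z = new StringBuffer("abc");
        z.append("def");               // allowed: changes the object z refers to
        // z = new StringBuffer();     // would not compile: z itself can't be reassigned
        return z.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());    // prints abcdef
    }
}
```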
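In other words,
    int x;
    Foo y;
    final Foo z;
in Java are roughly equivalent to
    int x;
    Foo& y;
    Foo& const z;
in C++ (loosely speaking: a real C++ reference can never be reseated, so read "Foo& y" here as a reference that can be made to refer to another object and "Foo& const z" as one that can't), while
    const Foo& x;
    Foo x;
    Foo* x;
have no equivalents in Java.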
Because there's no built-in way to prevent the innards of an object from being messed with without permission, you're left having to rely on documentation to tell people the expected semantics of function parameters and return values. And we all know how effective relying exclusively on documentation tends to be. The ultimate result is a lot of bugs, a lot of rummaging in other people's source code looking for the answer, and a lot of defensive copying. You can run into situations in a team environment where the functions on both sides of a call copy defensively:
    Bar func1() {
        Foo f = new Foo();
        // do stuff...
        Bar result = func2((Foo)f.clone());
        return (Bar)(result.clone());
    }

    Bar func2(Foo f) {
        Foo internalF = (Foo)(f.clone());
        // do stuff with internalF, create "result"...
        return (Bar)(result.clone());
    }
Notice how many unnecessary copies are being made here. Depending on the situation, they may actually all be unnecessary. This is terribly inefficient.
The only way to prevent excess defensive copying is to have some kind of protocol that is enforced by the compiler whereby a caller can communicate to a callee whether or not the callee can change the state of an object passed to it, and whereby a callee can communicate to a caller whether or not it can change the state of its return value. This involves a way of identifying which methods on a class change its internal state and which don't. C++ uses the const keyword (in different ways) for all of these things.
The Java language doesn't provide this kind of special support, but you can get pretty close on your own by using interfaces. For each class you think is important enough to protect in this way (the ones that are most expensive to clone, for example), define an interface that contains just the functions in your class that don't change its internal state (i.e., its getter functions). Functions that promise not to change the state of an object referred to by a parameter define that parameter's type as the interface type. Functions that return something you don't want changed return the interface type. In the opposite situations, you use the regular class type instead.
Here's an example:
    public interface ImmutableFoo extends Cloneable {
        public int getX();
        public Bar getY();
        public int getZElement(int index);
    }

    public class Foo implements ImmutableFoo {
        // function bodies omitted for clarity
        public Foo( /* params */ );
        public int getX();
        public void setX(int newX);
        public Bar getY();
        public void setY(Bar newY);
        public int getZElement(int index);
        public void setZElement(int index, int value);
    }
Now if you have a function defined like this:
    void copyState(ImmutableFoo source, Foo destination);
it's promising that it won't mess with the internals of source (which makes sense), but that it reserves for itself the right to mess with destination (and, in fact, it will do so; that's this function's whole point). If a function calling copyState() wants to copy into an object that it only knows as an ImmutableFoo, it has to create a new object to copy into first. (Of course, there's nothing to prevent somebody from casting an ImmutableFoo to a Foo, but this is the same as casting away const in C++, and just as dangerous.)
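Here is a cut-down version of the same idea that compiles and runs as-is. It is only a sketch: the Bar member is dropped, the z array is given a fixed length of three, and getZLength and the CopyStateDemo class are additions made up for the example.

```java
// Read-only view: only the methods that leave the object's state alone.
interface ImmutableFoo {
    int getX();
    int getZElement(int index);
}

// The full class adds the mutators.
final class Foo implements ImmutableFoo {
    private int x;
    private final int[] z = new int[3];   // fixed size, purely for illustration

    public int getX() { return x; }
    public void setX(int newX) { x = newX; }
    public int getZElement(int index) { return z[index]; }
    public void setZElement(int index, int value) { z[index] = value; }
    public int getZLength() { return z.length; }
}

class CopyStateDemo {
    // The parameter types document, and the compiler enforces, which
    // argument may be modified.
    static void copyState(ImmutableFoo source, Foo destination) {
        destination.setX(source.getX());
        for (int i = 0; i < destination.getZLength(); i++) {
            destination.setZElement(i, source.getZElement(i));
        }
        // source.setX(0);   // would not compile: ImmutableFoo has no setters
    }

    public static void main(String[] args) {
        Foo original = new Foo();
        original.setX(7);
        original.setZElement(1, 42);

        Foo copy = new Foo();
        copyState(original, copy);
        System.out.println(copy.getX() + " " + copy.getZElement(1)); // prints 7 42
    }
}
```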
This is a lot of work. That's one of the drawbacks, but frequently it's worth the trouble. A bigger drawback, as with the workaround for enumerated types above, is that the standard Java libraries generally don't work this way (there are some exceptions).
An even bigger drawback is that Java arrays don't either. There's no way to create an immutable Java array. Another problem is that Java arrays aren't completely type safe at compile time; you can define a variable of type Object[] and store any array of objects in it. Then you can try to put an instance of any class into that Object[]. There's a run-time type check that throws an exception (ArrayStoreException) if the type doesn't match the type of the actual array. The reason this is a run-time check is that if you get a Foo[] where you're looking for an Object[], you can still read from it, but not necessarily write to it. A const array type would allow the check to happen at compile time (using a non-const array type would be an indication that you plan to write to it).
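The run-time check is easy to trigger. In this sketch (the class name is invented), the assignment to an Object[] variable compiles without complaint, and the bad store is caught only when it executes.

```java
class ArrayStoreDemo {
    // Returns true if storing an Integer into a String[] seen as Object[] fails.
    static boolean storeFails() {
        Object[] objects = new String[1];      // legal: String[] is assignable to Object[]
        try {
            objects[0] = Integer.valueOf(42);  // compiles, but the array is really a String[]
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(storeFails());      // prints true
    }
}
```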
You can't change the Java APIs or the Java array implementation, but you can protect your use of them by creating wrapper classes and using them instead.
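As one possible shape for such a wrapper, the sketch below hides an int array behind a class with no mutators. ReadOnlyIntArray is a hypothetical name, and copying on the way in and on the way out is an assumption about how defensive you want to be.

```java
// Hypothetical read-only wrapper around an int array.
final class ReadOnlyIntArray {
    private final int[] values;

    // Copies the caller's array so later changes to it aren't visible here.
    ReadOnlyIntArray(int[] values) {
        this.values = values.clone();
    }

    int length() { return values.length; }

    int get(int index) { return values[index]; }

    // Hands back a copy, never the internal array itself.
    int[] toArray() { return values.clone(); }
}

class ReadOnlyIntArrayDemo {
    public static void main(String[] args) {
        int[] raw = {1, 2, 3};
        ReadOnlyIntArray wrapped = new ReadOnlyIntArray(raw);
        raw[0] = 99;                              // doesn't affect the wrapper
        wrapped.toArray()[1] = 99;                // neither does changing a copy
        System.out.println(wrapped.get(0) + " " + wrapped.get(1)); // prints 1 2
    }
}
```

The copy is made once, when the wrapper is created; after that the wrapper can be passed around freely without further defensive copying.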
Our team has run into many glitches, both big and small, in porting our code from C++ to Java and vice versa. I've tried to hit the highlights here. For a more comprehensive treatment of this issue, take a look at http://www.ibm.com/java/education/portingc. In addition, for a lot more information on mutable and immutable objects in Java, check out Mark Davis's column in the March issue of Java Report.
I've left out probably the biggest issue for many hardcore C++ programmers trying to write Java code. That's the fact that Java doesn't have templates. This is a really big topic, and I'm going to save it for the next installment. Hope to see you here again next time.