

Java--Simple and Familiar

2.1 - Main Features of the Java Language
2.2 - Features Removed from C and C++
2.3 - Summary

You know you've achieved perfection in design,
Not when you have nothing more to add,
But when you have nothing more to take away.

Antoine de Saint-Exupéry.

In his science-fiction novel, The Rolling Stones, Robert A. Heinlein comments:

Every technology goes through three stages: first a crudely simple and quite unsatisfactory gadget; second, an enormously complicated group of gadgets designed to overcome the shortcomings of the original and achieving thereby somewhat satisfactory performance through extremely complex compromise; third, a final proper design therefrom.

Heinlein's comment could well describe the evolution of many programming languages. Java presents a new viewpoint in the evolution of programming languages--creation of a small and simple language that's still sufficiently comprehensive to address a wide variety of software application development. While Java is superficially like C and C++, Java gained its simplicity from the systematic removal of features from its predecessors. This chapter discusses two of the primary design features of Java, namely, it's simple (from removing features) and familiar (because it looks like C and C++). The next chapter discusses Java's object-oriented features in more detail. At the end of this chapter you'll find a discussion on features eliminated from C and C++ in the evolution of Java.

Design Goals

Simplicity is one of Java's overriding design goals. Simplicity and removal of many "features" of dubious worth from its C and C++-based ancestors keep Java relatively small and reduce the programmer's burden in producing reliable applications. To this end, the Java design team examined many aspects of the "modern" C and C++ languages[1] to determine features that could be eliminated in the context of modern object-oriented programming.

Another major design goal is that Java look familiar to a majority of programmers in the personal computer and workstation arenas, where a large fraction of system programmers and application programmers are familiar with C and C++. Thus, Java "looks like" C++. Programmers familiar with C, Objective C, C++, Eiffel, Ada, and related languages should find their Java language learning curve quite short--on the order of a couple of weeks.

To illustrate the simple and familiar aspects of Java, we follow the tradition of a long line of illustrious programming books by showing you the HelloWorld program. It's about the simplest program you can write that actually does something. Here's HelloWorld implemented in Java.

    class HelloWorld {
        static public void main(String args[]) {
            System.out.println("Hello world!");
        }
    }
This example declares a class named HelloWorld. Classes are discussed in the next chapter on object-oriented programming, but in general we assume the reader is familiar with object technology and understands the basics of classes, objects, instance variables, and methods.

Within the HelloWorld class, we declare a single method called main() which in turn contains a single method invocation to display the string "Hello world!" on the standard output. The statement that prints "Hello world!" does so by invoking the println method of the out object. The out object is a class variable in the System class that performs output operations on files. That's all there is to HelloWorld.


2.1 Main Features of the Java Language

Java follows C++ to some degree, which carries the benefit of it being familiar to many programmers. This section describes the essential features of Java and points out where the language diverges from its ancestors C and C++.

2.1.1 Primitive Data Types

Other than the primitive data types discussed here, everything in Java is an object. Even the primitive data types can be encapsulated inside library-supplied objects if required. Java follows C and C++ fairly closely in its set of basic data types, with a couple of minor exceptions. There are only three groups of primitive data types, namely, numeric types, character types, and Boolean types.

Numeric Data Types

Integer numeric types are 8-bit byte, 16-bit short, 32-bit int, and 64-bit long. The 8-bit byte data type in Java has replaced the old C and C++ char data type. Java places a different interpretation on the char data type, as discussed below.

There is no unsigned type specifier for integer data types in Java.

Real numeric types are 32-bit float and 64-bit double. Real numeric types and their arithmetic operations are as defined by the IEEE 754 specification. A floating point literal value, like 23.79, is considered double by default; you must explicitly cast it to float if you wish to assign it to a float variable.
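
The sizes and the float-literal rule described above can be checked with a short sketch. The class name NumericTypesDemo and its method names are hypothetical, chosen only for illustration; the SIZE constants come from the standard wrapper classes in java.lang.

```java
class NumericTypesDemo {
    // Bit widths of the integer types, as reported by the wrapper classes.
    static int[] integerWidths() {
        return new int[] { Byte.SIZE, Short.SIZE, Integer.SIZE, Long.SIZE };
    }

    // Bit widths of the real (floating-point) types.
    static int[] realWidths() {
        return new int[] { Float.SIZE, Double.SIZE };
    }

    public static void main(String[] args) {
        // 23.79 is a double literal, so a cast is needed to store it in a float.
        float viaCast = (float) 23.79;
        // Alternatively, an f suffix makes the literal itself a float.
        float viaSuffix = 23.79f;
        double asDouble = 23.79;

        System.out.println(viaCast + " " + viaSuffix + " " + asDouble);
        System.out.println(integerWidths()[3]); // width of long in bits
    }
}
```

Writing `float f = 23.79;` without the cast or the suffix is rejected by the compiler, which is the behavior the paragraph above describes.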

Character Data Types

Java language character data is a departure from traditional C. Java's char data type defines a sixteen-bit Unicode character. Unicode characters are unsigned 16-bit values that define character codes in the range 0 through 65,535. If you write a declaration such as

    char  myChar = 'Q';
you get a Unicode (16-bit unsigned value) type that's initialized to the Unicode value of the character Q. By adopting the Unicode character set standard for its character data type, Java language applications are amenable to internationalization and localization, greatly expanding the market for world-wide applications.
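
As a small illustration of the unsigned 16-bit range, the sketch below widens a few char values to int. The class name CharDemo and the helper codeOf are hypothetical; Character.MAX_VALUE is the standard library constant for the largest char.

```java
class CharDemo {
    // Widening a char to int yields its unsigned 16-bit character code.
    static int codeOf(char c) {
        return c;
    }

    public static void main(String[] args) {
        char myChar = 'Q';
        System.out.println(codeOf(myChar));               // prints 81

        // A Unicode escape names a character by its hexadecimal code.
        char omega = '\u03A9';
        System.out.println(codeOf(omega));                // prints 937

        // The largest char value is the top of the 16-bit range.
        System.out.println(codeOf(Character.MAX_VALUE));  // prints 65535
    }
}
```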

Boolean Data Types

Java has added a boolean data type as a primitive type, tacitly ratifying existing C and C++ programming practice, where developers define keywords for TRUE and FALSE or YES and NO or similar constructs. A Java boolean variable assumes the value true or false. A Java boolean is a distinct data type; unlike common C practice, a Java boolean type can't be converted to any numeric type.

2.1.2 Arithmetic and Relational Operators

All the familiar C and C++ operators apply. Because Java lacks unsigned data types, the >>> operator has been added to the language to indicate an unsigned (logical) right shift. Java also uses the + operator for string concatenation; concatenation is covered below in the discussion on strings.
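
The difference between the two right shifts shows up only for negative values. The following sketch compares them on a 32-bit int; the class and method names are hypothetical.

```java
class ShiftDemo {
    // >> copies the sign bit into the vacated positions (arithmetic shift).
    static int arithmeticShift(int value, int bits) {
        return value >> bits;
    }

    // >>> fills the vacated positions with zeros (logical, "unsigned" shift).
    static int logicalShift(int value, int bits) {
        return value >>> bits;
    }

    public static void main(String[] args) {
        System.out.println(arithmeticShift(-8, 1)); // prints -4
        System.out.println(logicalShift(-8, 1));    // prints 2147483644
        System.out.println(logicalShift(8, 1));     // prints 4 (same as >> for non-negative values)
    }
}
```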

2.1.3 Arrays

In contrast to C and C++, Java language arrays are first-class language objects. An array in Java is a real object with a run-time representation. You can declare and allocate arrays of any type, and you can allocate arrays of arrays to obtain multi-dimensional arrays.

You declare an array of, say, Points (a class you've declared elsewhere) with a declaration like this:

    Point  myPoints[];
This code states that myPoints is an uninitialized array of Points. At this time, the only storage allocated for myPoints is a reference handle. At some future time you must allocate the amount of storage you need, as in:

    myPoints = new Point[10];
to allocate an array of ten references to Points that are initialized to the null reference. Notice that this allocation of an array doesn't actually allocate any objects of the Point class for you; you will have to also allocate the Point objects, something like this:

    int  i;
    
    for (i = 0;  i < 10;  i++) {
        myPoints[i] = new Point();
    }
Access to elements of myPoints can be performed via the normal C-style indexing, but all array accesses are checked to ensure that their indices are within the range of the array. An exception is generated if the index is outside the bounds of the array.
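
A minimal sketch of the bounds check in action follows. The class name BoundsCheckDemo and the helper isInBounds are hypothetical; ArrayIndexOutOfBoundsException is the standard exception the run-time system raises for an out-of-range index.

```java
class BoundsCheckDemo {
    // Returns true if the index can be read, false if the run-time
    // bounds check rejects it.
    static boolean isInBounds(int[] values, int index) {
        try {
            int ignored = values[index]; // every access is range-checked
            return true;
        } catch (ArrayIndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int[] values = new int[10];      // valid indices are 0 through 9
        System.out.println(isInBounds(values, 9));  // prints true
        System.out.println(isInBounds(values, 10)); // prints false
        System.out.println(isInBounds(values, -1)); // prints false
    }
}
```

Catching the exception here is only for demonstration; in ordinary code an out-of-range index usually signals a bug to be fixed rather than a condition to be handled.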

To get the length of an array, read the length field of the array object whose length you wish to know: myPoints.length gives the number of elements in myPoints. For instance, the code fragment:

    howMany = myPoints.length;
would assign the value 10 to the howMany variable.

The C notion of a pointer to an array of memory elements is gone, and with it, the arbitrary pointer arithmetic that leads to unreliable code in C. No longer can you walk off the end of an array, possibly trashing memory and leading to the famous "delayed-crash" syndrome, where a memory-access violation today manifests itself hours or days later. Programmers can be confident that array checking in Java will lead to more robust and reliable code.

2.1.4 Strings

Strings are Java language objects, not pseudo-arrays of characters as in C. There are actually two kinds of string objects: the String class is for read-only (immutable) objects. The StringBuffer class is for string objects you wish to modify (mutable string objects).

Although strings are Java language objects, the Java compiler follows the C tradition of providing a syntactic convenience that C programmers have enjoyed with C-style strings, namely, the Java compiler understands that a string of characters enclosed in double quote signs is to be instantiated as a String object. Thus, the declaration:

    String hello = "Hello world!";
instantiates an object of the String class behind the scenes and initializes it with a character string containing the Unicode character representation of "Hello world!".

Java has extended the meaning of the + operator to indicate string concatenation. Thus you can write statements like:

    System.out.println("There are " + num + " characters in the file.");
This code fragment concatenates the string "There are " with the result of converting the numeric value num to a string, and concatenates that with the string " characters in the file.". Then it prints the result of those concatenations on the standard output.

String objects provide a length() accessor method to obtain the number of characters in the string.
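
The sketch below pulls together the points of this section: immutable String objects, a mutable StringBuffer, concatenation with +, and length(). The class name StringDemo and its method names are hypothetical; the String and StringBuffer calls are standard library methods.

```java
class StringDemo {
    // The + operator converts num to text and concatenates the pieces.
    static String describe(int num) {
        return "There are " + num + " characters in the file.";
    }

    // A StringBuffer is modified in place by append.
    static String buildMutable() {
        StringBuffer buffer = new StringBuffer("Hello");
        buffer.append(" world!");
        return buffer.toString();
    }

    // A String never changes; toUpperCase returns a new object.
    static String immutableExample() {
        String hello = "Hello";
        String upper = hello.toUpperCase();
        return hello + "/" + upper;
    }

    public static void main(String[] args) {
        System.out.println(describe(42));
        System.out.println(buildMutable().length()); // prints 12
        System.out.println(immutableExample());      // prints Hello/HELLO
    }
}
```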

2.1.5 Multi-Level Break

Java has no goto statement. To break or continue multiple-nested loop or switch constructs, you can place labels on loop and switch constructs, and then break out of or continue to the block named by the label. Here's a small fragment of code from Java's built-in String class:


test:  for (int i = fromIndex; i + max1 <= max2; i++) {
           if (charAt(i) == c0) {
               for (int k = 1; k<max1; k++) {
                   if (charAt(i+k) != str.charAt(k)) {
                       continue test;
                   }
               }     /*  end of inner for loop  */
           }
       }             /*  end of outer for loop  */

The continue test statement is inside a for loop nested inside another for loop. By referencing the label test, the continue statement passes control to the outer for statement. In traditional C, continue statements can only continue the immediately enclosing block; to continue or exit outer blocks, programmers have traditionally either used auxiliary Boolean variables whose only purpose is to determine if the outer block is to be continued or exited, or (mis)used the goto statement to exit out of nested blocks. Use of labelled blocks in Java leads to considerable simplification in programming effort and a major reduction in maintenance.
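
The fragment above shows a labelled continue; the companion case, a labelled break, can be sketched as follows. The class name LabelledBreakDemo, the method find, and the label search are hypothetical names used only for this illustration.

```java
class LabelledBreakDemo {
    // Returns {row, column} of the first cell equal to target,
    // or null if the grid does not contain it.
    static int[] find(int[][] grid, int target) {
        int[] found = null;
        search:
        for (int row = 0; row < grid.length; row++) {
            for (int col = 0; col < grid[row].length; col++) {
                if (grid[row][col] == target) {
                    found = new int[] { row, col };
                    break search; // leaves both loops at once
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        int[][] grid = { { 1, 2 }, { 3, 4 } };
        int[] position = find(grid, 3);
        System.out.println(position[0] + "," + position[1]); // prints 1,0
    }
}
```

Without the label, the break would leave only the inner loop and the outer loop would keep scanning, which is where the auxiliary Boolean flag would otherwise be needed.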

The notion of labelled blocks dates back to the mid-1970s, but it hasn't caught on to any large extent in modern programming languages. Perl is another modern programming language that implements the concept of labelled blocks. Perl's next label and last label are equivalent to continue label and break label statements in Java.

2.1.6 Memory Management and Garbage Collection

C and C++ programmers are by now accustomed to the problems of explicitly managing memory: allocating memory, freeing memory, and keeping track of what memory can be freed when. Explicit memory management has proved to be a fruitful source of bugs, crashes, memory leaks, and poor performance.

Java completely removes the memory management load from the programmer. C-style pointers, pointer arithmetic, malloc, and free do not exist. Automatic garbage collection is an integral part of Java and its run-time system. While Java has a new operator to allocate memory for objects, there is no explicit free function. Once you have allocated an object, the run-time system keeps track of the object's status and automatically reclaims memory when objects are no longer in use, freeing memory for future use.

Java's memory management model is based on objects and references to objects. Because Java has no pointers, all references to allocated storage, which in practice means all references to an object, are through symbolic "handles". The Java memory manager keeps track of references to objects. When an object has no more references, the object is a candidate for garbage collection.

Java's memory allocation model and automatic garbage collection make your programming task easier, eliminate entire classes of bugs, and in general provide better performance than you'd obtain through explicit memory management. Here's a code fragment that illustrates when garbage collection happens. It's an example from the on-line Java language programmer's guide:

    class ReverseString {
        public static String reverseIt(String source) {
            int i, len = source.length();
            StringBuffer dest = new StringBuffer(len);

            for (i = (len - 1); i >= 0; i--) {
                // append(char) adds one character to the end of the buffer
                dest.append(source.charAt(i));
            }
            return dest.toString();
        }
    }
The variable dest is used as a temporary object reference during the execution of the reverseIt method. When dest goes out of scope (the reverseIt method returns), the reference to that object has gone away and it's then a candidate for garbage collection.

2.1.7 The Background Garbage Collector

The Java garbage collector achieves high performance by taking advantage of the nature of a user's behavior when interacting with software applications such as the HotJava browser. The typical user of the typical interactive application has many natural pauses where they're contemplating the scene in front of them or thinking of what to do next. The Java run-time system takes advantage of these idle periods and runs the garbage collector in a low priority thread when no other threads are competing for CPU cycles. The garbage collector gathers and compacts unused memory, increasing the probability that adequate memory resources are available when needed during periods of heavy interactive use.

This use of a thread to run the garbage collector is just one of many examples of the synergy one obtains from Java's integrated multithreading capabilities--an otherwise intractable problem is solved in a simple and elegant fashion.

2.1.8 Integrated Thread Synchronization

Java supports multithreading, both at the language (syntactic) level and via support from its run-time system and thread objects. While other systems have provided facilities for multithreading (usually via "lightweight process" libraries), building multithreading support into the language itself provides the programmer with a much more powerful tool for easily creating thread-safe multithreaded classes. Multithreading is discussed in more detail in Chapter 5.
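
As a minimal sketch of language-level synchronization, the example below assumes a shared counter updated by two threads. The class names SynchronizedCounterDemo and Counter and the method runTwoThreads are hypothetical; synchronized is the language keyword, and Thread and Runnable are the standard library types.

```java
class SynchronizedCounterDemo {
    static class Counter {
        private int count = 0;

        // synchronized lets only one thread at a time run these methods
        // on a given Counter object, so no increment is lost.
        synchronized void increment() {
            count++;
        }

        synchronized int get() {
            return count;
        }
    }

    // Starts two threads that each increment the same counter perThread times.
    static int runTwoThreads(final int perThread) {
        final Counter counter = new Counter();
        Runnable work = new Runnable() {
            public void run() {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            }
        };
        Thread first = new Thread(work);
        Thread second = new Thread(work);
        first.start();
        second.start();
        try {
            first.join();   // wait for both threads to finish
            second.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(10000)); // prints 20000
    }
}
```

If the synchronized keyword were removed from increment(), the two threads could interleave their updates and the final total could fall short of 20000.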


2.2 Features Removed from C and C++

The earlier part of this chapter concentrated on the principal features of Java. This section discusses features removed from C and C++ in the evolution of Java.

The first step was to eliminate redundancy from C and C++. In many ways, the C language evolved into a collection of overlapping features, providing too many ways to say the same thing, while in many cases not providing needed features. C++, in an attempt to add "classes in C", merely added more redundancy while retaining many of the inherent problems of C.

2.2.1 No More Typedefs, Defines, or Preprocessor

Source code written in Java is simple. There is no preprocessor, no #define and related capabilities, no typedef, and absent those features, no longer any need for header files. Instead of header files, Java language source files provide the definitions of other classes and their methods.

A major problem with C and C++ is the amount of context you need to understand another programmer's code: you have to read all related header files, all related #defines, and all related typedefs before you can even begin to analyze a program. In essence, programming with #defines and typedefs results in every programmer inventing a new programming language that's incomprehensible to anybody other than its creator, thus defeating the goals of good programming practices.

In Java, you obtain the effects of #define by using constants. You obtain the effects of typedef by declaring classes--after all, a class effectively declares a new type. You don't need header files because the Java compiler compiles class definitions into a binary form that retains all the type information through to link time.
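
One way to picture these substitutions is sketched below. The names Limits, MAX_POINTS, GREETING, and Celsius are all hypothetical, invented for this illustration: static final fields stand in for #define constants, and a small class stands in for a typedef that gives a double a more meaningful name.

```java
class Limits {
    // Plays the role of: #define MAX_POINTS 10
    static final int MAX_POINTS = 10;

    // Plays the role of: #define GREETING "Hello world!"
    static final String GREETING = "Hello world!";
}

// Plays the role of: typedef double Celsius;
// Unlike a typedef, the class is a distinct type with its own methods.
class Celsius {
    final double degrees;

    Celsius(double degrees) {
        this.degrees = degrees;
    }

    double toFahrenheit() {
        return degrees * 9.0 / 5.0 + 32.0;
    }
}

class ConstantsDemo {
    public static void main(String[] args) {
        int[] buffer = new int[Limits.MAX_POINTS];
        System.out.println(buffer.length);                    // prints 10
        System.out.println(new Celsius(100.0).toFahrenheit()); // prints 212.0
    }
}
```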

By removing all this baggage, Java becomes remarkably context-free. Programmers can read and understand code and, more importantly, modify and reuse code much faster and easier.

2.2.2 No More Structures or Unions

Java has no structures or unions as complex data types. You don't need structures and unions when you have classes; you can achieve the same effect simply by declaring a class with the appropriate instance variables.

The code fragment below declares a class called Point.

    class Point extends Object { 
        double  x;
        double  y;
        // methods to access the instance variables
    }
The following code fragment declares a class called Rectangle, that uses objects of the Point class as instance variables.

    class Rectangle extends Object {
        Point  lowerLeft;
        Point  upperRight;
        // methods to access the instance variables
    }
In C you'd define these classes as structures. In Java, you simply declare classes. You can make the instance variables as private or as public as you wish, depending on how much you wish to hide the details of the implementation from other objects.


2.2.3 No More Functions

Java has no functions. Object-oriented programming supersedes functional and procedural styles. Mixing the two styles just leads to confusion and dilutes the purity of an object-oriented language. Anything you can do with a function you can do just as well by defining a class and creating methods for that class. Consider the Point class from above. We've added public methods to set and access the instance variables:

    class Point extends Object { 
        double  x;
        double  y;
        
        public void setX(double x) {
            this.x = x;
        }
        public void setY(double y) {
            this.y = y;
        }
        public double x() {
            return x;
        }
        public double y() {
            return y;
        }
    }
If the x and y instance variables are private to this class, the only means to access them is via the public methods of the class. Here's how you'd use objects of the Point class from within, say, an object of the Rectangle class:

    class Rectangle extends Object { 
        Point  lowerLeft;
        Point  upperRight;
        
        public void setEmptyRect() {
            lowerLeft.setX(0.0);
            lowerLeft.setY(0.0);
            upperRight.setX(0.0);
            upperRight.setY(0.0);
        }
    }
It's not to say that functions and procedures are inherently wrong. But given classes and methods, we're now down to only one way to express a given task. By eliminating functions, your job as a programmer is immensely simplified: you work only with classes and their methods.
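
A self-contained sketch of the same idea follows, with private instance variables so that the accessor methods are the only way in. The class names PointSketch and RectangleSketch and the methods setCorners and area are hypothetical additions for illustration. The corner objects are allocated when the rectangle is created, since a declared but unallocated object reference is null.

```java
class PointSketch {
    private double x;
    private double y;

    public void setX(double x) { this.x = x; }
    public void setY(double y) { this.y = y; }
    public double x() { return x; }
    public double y() { return y; }
}

class RectangleSketch {
    // Allocate both corners up front so the methods below can use them.
    private final PointSketch lowerLeft = new PointSketch();
    private final PointSketch upperRight = new PointSketch();

    public void setCorners(double left, double bottom, double right, double top) {
        lowerLeft.setX(left);
        lowerLeft.setY(bottom);
        upperRight.setX(right);
        upperRight.setY(top);
    }

    // Work that C would express as a free function becomes a method.
    public double area() {
        return (upperRight.x() - lowerLeft.x()) * (upperRight.y() - lowerLeft.y());
    }

    public static void main(String[] args) {
        RectangleSketch rect = new RectangleSketch();
        rect.setCorners(0.0, 0.0, 3.0, 2.0);
        System.out.println(rect.area()); // prints 6.0
    }
}
```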

2.2.4 No More Multiple Inheritance

Multiple inheritance--and all the problems it generates--has been discarded from Java. The desirable features of multiple inheritance are provided by interfaces--conceptually similar to Objective C protocols.

An interface is not a definition of an object. Rather, it's a definition of a set of methods that one or more objects will implement. An important issue of interfaces is that they declare only methods and constants. No variables may be defined in interfaces.
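
A brief sketch of a class that implements two interfaces is shown below. The names Drawable, Resizable, Circle, and MAX_LAYERS are hypothetical, made up for this illustration.

```java
interface Drawable {
    // A field declared in an interface is a constant, not a variable.
    int MAX_LAYERS = 8;

    String draw();
}

interface Resizable {
    double scale(double factor);
}

// One class may implement several interfaces, which covers the common
// uses of multiple inheritance without inheriting any implementation.
class Circle implements Drawable, Resizable {
    private double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    public String draw() {
        return "circle r=" + radius;
    }

    public double scale(double factor) {
        radius = radius * factor;
        return radius;
    }

    public static void main(String[] args) {
        Circle circle = new Circle(2.0);
        Drawable asDrawable = circle;   // usable wherever a Drawable is expected
        Resizable asResizable = circle; // and wherever a Resizable is expected
        System.out.println(asDrawable.draw());       // prints circle r=2.0
        System.out.println(asResizable.scale(1.5));  // prints 3.0
    }
}
```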

2.2.5 No More Goto Statements

Java has no goto statement[2]. Studies illustrated that goto is (mis)used more often than not simply "because it's there". Eliminating goto led to a simplification of the language--there are no rules about the effects of a goto into the middle of a for statement, for example. Studies on approximately 100,000 lines of C code determined that roughly 90 percent of the goto statements were used purely to obtain the effect of breaking out of nested loops. As mentioned above, multi-level break and continue remove most of the need for goto statements.

2.2.6 No More Operator Overloading

There are no means provided by which programmers can overload the standard arithmetic operators. Once again, the effects of operator overloading can be just as easily achieved by declaring a class, appropriate instance variables, and appropriate methods to manipulate those variables.
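
For instance, where C++ code might overload + and * for a vector type, a Java class can offer named methods instead. The class Vector2 and its methods plus and times are hypothetical names in this sketch.

```java
class Vector2 {
    final double x;
    final double y;

    Vector2(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // Stands in for an overloaded + operator.
    Vector2 plus(Vector2 other) {
        return new Vector2(x + other.x, y + other.y);
    }

    // Stands in for an overloaded * operator with a scalar.
    Vector2 times(double factor) {
        return new Vector2(x * factor, y * factor);
    }

    public static void main(String[] args) {
        Vector2 sum = new Vector2(1.0, 2.0).plus(new Vector2(3.0, 4.0));
        Vector2 doubled = sum.times(2.0);
        System.out.println(doubled.x + "," + doubled.y); // prints 8.0,12.0
    }
}
```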

2.2.7 No More Automatic Coercions

Java prohibits C and C++ style automatic coercions. If you wish to coerce a data element of one type to a data type that would result in loss of precision, you must do so explicitly by using a cast. Consider this code fragment:


    int  myInt;
    double  myFloat = 3.14159;
    myInt = myFloat;
The assignment of myFloat to myInt would result in a compiler error indicating a possible loss of precision and that you must use an explicit cast. Thus, you should re-write the code fragment as:

    int  myInt;
    double  myFloat = 3.14159;
    myInt = (int)myFloat;
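
The sketch below shows what the explicit cast does at run time, alongside a conversion that loses nothing and therefore needs no cast. The class name CastDemo and its method names are hypothetical.

```java
class CastDemo {
    // Narrowing double to int requires a cast; the fractional part is
    // discarded (the value is truncated toward zero).
    static int truncate(double value) {
        return (int) value;
    }

    // Widening int to double cannot lose information for these values,
    // so the compiler accepts it without a cast.
    static double widen(int value) {
        return value;
    }

    public static void main(String[] args) {
        System.out.println(truncate(3.14159)); // prints 3
        System.out.println(truncate(-2.7));    // prints -2
        System.out.println(widen(7));          // prints 7.0
    }
}
```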

2.2.8 No More Pointers

Most studies agree that pointers are one of the primary features that enable programmers to inject bugs into their code. Given that structures are gone, and arrays and strings are objects, the need for pointers to these constructs goes away. Thus, Java has no pointers. Any task that would require arrays, structures, and pointers in C can be more easily and reliably performed by declaring objects and arrays of objects. Instead of complex pointer manipulation on array pointers, you access arrays by their arithmetic indices. The Java run-time system checks all array indexing to ensure indices are within the bounds of the array.

You no longer have dangling pointers and trashing of memory because of incorrect pointers, because there are no pointers in Java.


2.3 Summary

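To sum up this chapter, Java is:

Simple--features of dubious worth were removed from its C and C++ ancestors, which keeps the language small.
Familiar--Java looks like C and C++, so programmers who know those languages face a short learning curve.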

Now that you've seen how Java was simplified by removal of features from its predecessors, read the next chapter for a discussion on the object-oriented features of Java.


[1] Now enjoying their silver anniversaries
[2] However, goto is still a reserved word.

The Java(tm) Language Environment: A White Paper