1 Introduction

This paper, as the title suggests, discusses the functionalities specified for the implementation of CLOS (the Common Lisp Object System), an object-oriented extension of the Common Lisp programming language. Here, the term operational semantics is defined informally as the high-level, abstract, external specification of the functionalities of the language, which is separate from and independent of the actual implementation.

The design of CLOS is the central theme of The Art of the Metaobject Protocol. Kiczales et al. present the abstract architecture that supports all the object-oriented features of CLOS in the "backstage". Their implementation of CLOS is intrinsically related to the native features of Common Lisp - after all, CLOS is in essence an extension of Common Lisp, leveraging pre-defined packages and a built-in object hierarchy with strict precedence ordering to provide the object-oriented functionalities. Nevertheless, the authors have meticulously defined and organized the abstract specification independently of the implementation. According to the authors, the concept of the Metaobject Protocol can be ported and applied to other implementations that might utilize architectures and tools vastly different from those of Common Lisp [page 2, Introduction - The Art of the Metaobject Protocol].

The purpose of this paper, therefore, is to explore how some of the functionalities prescribed by the Metaobject Protocol can be and have been deployed in modern implementations of object-oriented languages - C++ and Java in particular - and to what extent they serve their purposes in meeting modern programming requirements.

2 The Common Lisp Object System

In The Art of the Metaobject Protocol, Kiczales et al. describe a set of functionalities for the implementation of CLOS [Chapter 5, pp 135-153].
The foundations and requirements of the implementation are well-defined within a context where the Common Lisp language serves as the basis for supporting all the object-oriented extensions in CLOS. As the Metaobject Protocol is applied, CLOS itself is promoted to be the mechanism for dynamic extension [p 138]. The result of developing this protocol is the specification of a hybrid language that is backward-compatible with Common Lisp programs and at the same time supports a comprehensive suite of extensible object-oriented definitions and operations. This new programming system, developed with a combination of procedural reflection and object-oriented techniques, provides a set of fundamental behaviors for the language and allows these behaviors to be extended by incremental adjustments within the context of the core operations of the system itself.

These techniques, in effect, define CLOS in terms of:

1) user interfaces which operate on object types such as class and method, and which support the features of object-oriented programming such as object instantiation, class inheritance, and polymorphism,
2) a fundamental architecture consisting of the so-called "metaobjects", which are the internal representation of classes, methods, instances, and other data types necessary for supporting the interfaces, and
3) extensible semantics which allow the users to modify the default interfaces and behaviors within the new system itself.

2.1 Object-Oriented Features

In simple terms, CLOS extends Common Lisp to support the semantics and operations of object-oriented programming. Primitive constructs are built in Common Lisp to represent metaobjects, which are the foundation and "backstage" elements that support the CLOS user environment. After bootstrapping itself with these primitive elements, CLOS then defines the system interfaces to support user-defined objects based on the primitive foundation.
Thus CLOS supports both the new object-oriented interfaces and the original Common Lisp syntax and semantics.

In general, object-oriented programming deals with the abstraction of data in object entities. Object-oriented programs define classes to encapsulate data, associate classes with functional methods, and provide polymorphic operations on objects through class inheritance and function overloading. CLOS extends Common Lisp with a self-contained environment which includes all the essential features to support and operate on class data types:

- macros to define and construct classes, generic functions, and methods
- macros to instantiate objects of defined classes
- automatic association of classes with attributes and methods based on a class hierarchy with deterministic inheritance rules
- generic dispatch functions to match a method to its invoking process by specializing on the parameters to the method
- multiply defined methods with the same name but specialized on different parameters
- core framework for method invocation and execution
- method invocation type definitions to provide alternative frameworks for methods

2.2 Metaobject Foundation

The CLOS functional specification is named the "Metaobject Protocol" because at its core is a collection of foundation objects whose sole purpose is to support and manage the object-oriented entities that a CLOS user program defines and manipulates. These objects are called "metaobjects"; they are internal to the CLOS system and can only be accessed from the external interfaces through a glue layer [p.17].
The most interesting of these metaobjects define and support:

- CLOS class, function, and method objects, and their basic behaviors
- operation of generic functions and methods
- introspection into the class objects
- dynamic adjustment and extension of object behaviors

2.3 Dynamic Extension

In a broad sense, any program is an extension of the language in which it is coded, as a language is generally defined as a collection of strings, and a program is a member string of a specific language. Common Lisp, as a language, encourages the users to define and build new functionalities that can be included as operations within its own context (thus some strings in the language, when executed as a program, will extend the language to include other strings as members). The implementation of the Metaobject Protocol for CLOS is a fine example of Common Lisp's power in extensibility.

In the same way, CLOS itself is intrinsically extensible within its own object-oriented context. The CLOS programming interfaces provide functionalities to redefine runtime objects, expand the foundation object hierarchy, and create new or modified behaviors in foundation classes. This capacity for dynamic extension is achieved through flexible external interfaces and inherently polymorphic metaobjects that can be modified and applied within CLOS.

3 Class Inheritance and Polymorphism

In his book, Thinking in C++ [Eckel95], Bruce Eckel lists inheritance and polymorphism as two key concepts in object-oriented programming [pp. 37-40]. As Eckel points out, object-oriented data types differ from conventional data types in that there is a deterministic relationship among object types: "Inheritance expresses ... similarity between types with the concept of base types and derived types. A base type contains all the characteristics and behaviors that are shared among the types derived from it. You create a base type to represent the core of your ideas about some objects in your system.
From the base type, you derive other types to express the different ways that core can be realized."

CLOS supports inheritance with a class hierarchy and a set of specific inheritance rules based on class precedence ordering. These are central concepts that allow the CLOS execution frameworks to specialize on the appropriate object types.

3.1 Class Types and Hierarchy

The CLOS class hierarchy is complicated by several factors: 1) it has to accommodate Common Lisp data types that are created in the CLOS environment, 2) it provides for the definition of specialized metaobject classes [p. 75], and 3) it supports multiple inheritance.

3.1.1 Primitive Data Types

CLOS defines a number of built-in classes for Common Lisp primitive types in the object-oriented environment. These built-in types are created with a different structure from the user-defined classes, and they cannot be instantiated by the regular CLOS object creation commands [Keene p.85]. Moreover, built-in classes, except for the class t, which is the root of the entire class hierarchy, cannot be inherited by user-defined classes. These restrictions limit the usefulness of the built-in classes in the CLOS environment. The advantage of having these data types represented as classes, however, is the ability to write methods that specialize on these objects; as Keene points out, "If a Common Lisp type does not have a corresponding class, you cannot define a method that specializes on that type" [p. 84].

The difficulty, however, is that primitive types such as integers are not well suited to being defined as general class types, since users really do not need to derive subclasses from them. Moreover, the original behavior of the primitive types must be preserved for Common Lisp compatibility. In fact, not all Common Lisp data types can be made into classes in CLOS because of the lack of a strict precedence order for these types within the class hierarchy.
For example, one cannot apply precedence ordering to constants, and thus CLOS cannot resolve which method to apply if an argument to the call specializes on a constant. Although CLOS provides the semantics for "individual methods" - methods that use a parameter specializer of the form (eql form) to specialize on one specific constant - if the argument is not an exact match for such a method, CLOS cannot decide whether one method is more specific than another based on the constant argument [Keene]. Also, although list, the quintessential data type in Common Lisp, does have a corresponding class, methods cannot specialize on the kinds of elements a list contains: lists can hold elements and sublists of different types, which makes it very difficult, if not impossible, to generate a precedence order for such list types.

Similar issues with built-in types arise in C++, which is also a hybrid language. In his original design for C++, Stroustrup had aimed to make "user-defined and built-in types ... behave the same relative to the language rules and receive the same degree of support from the language and its associated tools." The syntax and semantics of the "pseudoconstructors" [Eckel p. 474] were added to C++ so that primitive types such as int, short, etc. can be initialized in the same manner as real objects [Eckel, p. 504]. However, this was the extent of "objectivizing" primitive types in C++. As Stroustrup points out, "the C conversion rules are so chaotic that pretending that int, short, etc., are well-behaved ordinary classes is not going to work. They are either C compatible or they obey the relatively well-behaved C++ rules for classes, but not both" [Stroustrup, p. 380]. Fortunately, method invocation in C++ does not rely on a precedence order for specializing on method parameters, thus eliminating the need for parameter data types to belong to an object-based class hierarchy.

In Java, the primitive data types - int, char, short, long, etc. - are entirely excluded from the class hierarchy.
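The exclusion of Java primitives from the class hierarchy can be observed through the class objects themselves. The following is a minimal sketch rather than anything taken from the sources cited here; the class name PrimitiveVsClassDemo and its method names are hypothetical and chosen only for illustration.

```java
// Hypothetical demo class; the names below are illustrative only.
class PrimitiveVsClassDemo {

    // A primitive type has a class object, but that object reports no
    // superclass: int is not a subclass of Object or of anything else.
    static boolean primitiveIsOutsideHierarchy() {
        return int.class.isPrimitive() && int.class.getSuperclass() == null;
    }

    // An ordinary class such as String is rooted in Object.
    static boolean classIsRootedInObject() {
        return !String.class.isPrimitive()
            && String.class.getSuperclass() == Object.class;
    }

    public static void main(String[] args) {
        System.out.println("int outside hierarchy: " + primitiveIsOutsideHierarchy());
        System.out.println("String rooted in Object: " + classIsRootedInObject());
    }
}
```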
This separation of primitive types and classes turns out to be very practical and efficient, since programmers do not subclass and inherit properties from primitive types. When special treatment of a primitive type is necessary, one can always define a new "wrapper" class that represents that type, with the intrinsic attributes of the primitive type and the methods for it. Java, for example, defines the Integer and Character classes (and wrapper classes for other primitive types in more recent releases) in addition to their primitive counterparts int and char. These classes include methods for conversion and operation on the data type. Since Java is a pure object-oriented language where all information and operations are encapsulated in classes, global function definitions are disallowed. So, when special operations need to be applied to an integer or character, the methods defined in the Integer or Character class are preferred over defining methods in other classes and applying them to the primitive types int or char.

3.1.2 User-Defined Classes

CLOS defines t and standard-object as the roots of all inheritance; t is the anchor for both built-in and user-defined classes. While many built-in classes are direct subclasses of t, all user-defined objects, including the standard metaobject classes and specialized metaobject classes, are derived from standard-object, the direct subclass of t on the user-defined hierarchy. As mentioned earlier, the internal representation and interfaces of built-in classes are quite different from the user-defined ones. The standard-class, which is itself a user-defined class, is the internal representation of a class. Thus it is derived directly from standard-object, just as any other class.

3.1.2.1 Special Class Types

Unlike C++ and Java, CLOS does not impose explicit access restrictions on instance data between objects.
This is not an issue, however, as Kiczales et al. have shown that the semantics of the Metaobject Protocol allow the definition and creation of special metaobjects to represent classes with different behaviors. With these special objects, classes can be created with extended features, such as private and protected access rights to the class data [pp. 89-90]. The special metaobject classes are a very powerful feature that provides for dynamic extension within the context of the CLOS environment, as will be discussed in a later section.

Another special class type in CLOS is the so-called "mix-in" class. Mix-ins are intended for adding "flavors" or properties to subclasses. Mix-in classes are designed to be inherited by other classes and not to be instantiated by themselves, although CLOS does not impose any syntactic or semantic restrictions on them. In C++ and Java, the closest counterparts are abstract classes. Unlike CLOS, however, both languages prohibit the instantiation of abstract classes: Java through the abstract keyword, and C++ through the declaration of at least one pure virtual function in the class. In addition to abstract classes, Java also supports interfaces, which are essentially abstract classes without any method bodies. Thus interfaces in Java are strictly for inheritance, and their methods must be implemented by subclasses. The purpose of interfaces in Java is to get around the single inheritance restriction - while abstract classes can only be singly inherited, a class can "implement" multiple interfaces. The Java class does not have to worry about precedence ordering among multiple interfaces, since the class itself has to implement the actual methods and deal with the data.

Mix-in or abstract classes are a very useful concept for object programming. Used as the base for inheritance, they allow subclasses to share properties and behaviors, and very often provide a consistent external interface to other object modules.

One other class type worth mentioning is the standard-class, which represents and implements the class object.
As mentioned earlier, all class types are subclasses of standard-object in CLOS. So, standard-class inherits the same properties as all other user-defined classes. The standard-class, however, serves a special purpose - every defined class in CLOS is actually an instance of the class standard-class, including standard-class itself. The standard-class object maintains a great deal of information about the class. This is a key piece of the Metaobject Protocol, because object instantiation, introspection, and many of the operations on the object all depend on the class object.

In Java, the root base class is the Object class. The class Class, which is a direct subclass of Object, is the equivalent of the standard-class in CLOS. Whenever a class is defined, an instance of class Class is created to support and maintain the new class. There are significant differences between CLOS and Java class objects, however. In CLOS, subclasses can be derived from the standard-class - this is significant because it implies that special types of classes can be created based on standard-class, as will be shown in a later section. The Java class Class, on the other hand, is final and thus cannot be inherited, so the class type is static and not extensible.
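The relationship between a Java class and its class object can be checked with a few calls on java.lang.Class. The sketch below is only an illustration under assumed names: ClassObjectDemo, its nested class Theater, and the helper methods are hypothetical, while getClass, getSuperclass, getModifiers, and Modifier.isFinal are standard library calls.

```java
import java.lang.reflect.Modifier;

// Hypothetical demo class; Theater stands in for any user-defined class.
class ClassObjectDemo {

    static class Theater {}

    // Every instance can be asked for the class object that describes it.
    static Class<?> classOf(Object instance) {
        return instance.getClass();
    }

    // A class object is itself an ordinary object whose class is Class.
    static boolean isInstanceOfClass(Object classObject) {
        return classObject.getClass() == Class.class;
    }

    // Class is declared final, so no subclass of it can be defined.
    static boolean classIsFinal() {
        return Modifier.isFinal(Class.class.getModifiers());
    }

    public static void main(String[] args) {
        Class<?> c = classOf(new Theater());
        System.out.println(c.getName());
        System.out.println(c.getSuperclass() == Object.class);
        System.out.println(isInstanceOfClass(c));
        // Like standard-class in CLOS, Class is described by an instance of itself.
        System.out.println(isInstanceOfClass(Class.class));
        System.out.println(classIsFinal());
    }
}
```

Because Class is final, a Java program has no way to introduce a new kind of class object, which is exactly the hook that subclassing standard-class provides in CLOS.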
A CLOS standard-class object maintains all the relevant information related to the defined class:

- the name of the class
- its direct superclasses and direct subclasses
- its class and effective methods
- a precedence list

Access to these data is provided in CLOS by a number of functions defined for this purpose.

A Java class object, on the other hand, does not encapsulate any data of the class, but provides the methods for:

- retrieving the class object
- retrieving the class loader
- retrieving all the interfaces of the class
- retrieving the name of the class
- retrieving the superclass
- identifying itself as a class or interface

All the data pertaining to a Java class - the constant pool, field data, method data, and other attributes - are maintained in the method area internal to the Java Virtual Machine. The method area is an implementation-specific structure for providing efficient storage and access for class data [JVM Specification pp. 63-64]. Note that Java user programs can only access limited data from the method area through the class object methods. For introspection purposes, Java prescribes a structured approach with interfaces separate from the class object. The details of introspection for CLOS and Java will be discussed in a later section.

3.1.2.3 Access Restriction by Packages

Both CLOS and Java support the concept of packages, an organized grouping of functional classes, data, and methods. The basic reasons for packages are to avoid name conflicts when programming different modules and to allow linkage to other modules by importing packages.

CLOS also uses packages to restrict public access to class data (although private and public data attributes can be implemented with special metaobject class extensions). When class data and functions are encapsulated in a package, the program can export only the methods that are meant to be public to external users, thus making the slots and other methods of the class effectively private.
Java's usage of packages is more extensive. The semantics of the Java language already support private, protected, and public access, as in C++. By grouping different classes into a package, Java not only resolves the issue of class name conflicts, but also provides special access rights to protected data among classes in the same package and restricts access to protected data between classes in separate packages. Furthermore, Java organizes packages by directory names, so the fully qualified name of a package corresponds to a path name in the file system. This provides a natural way to organize class libraries.

3.1.3 Inheritance Hierarchy

In CLOS, as in Smalltalk and Java, all user-defined classes are derived from the same root base class, so that all user-defined classes share some core behavior and properties. This aspect is very important for Java, which supports only single inheritance. If a built-in class hierarchy does not exist (as is the case with C++), users can define their own base classes and derive classes that cannot be related by inheritance. Then, the only way for one class to include properties and behaviors of an unrelated class is by containment. This raises several issues:

1) Containment expresses only the notion of inclusion of properties, not the intrinsic characteristics of a class. For example, a class Building can include Door and Window classes as properties, but a Theater is really a type of Building, not a property. So, Theater should inherit from Building instead of being contained in it.

2) In cases where every class has to include another class in order to derive properties, many levels of indirection may be created if classes are included in a chain. Suppose there are Building, Theater, Stage, and Curtain classes, all of which are derived from separate and unrelated class hierarchies.
A class Theater-Building will have to inherit from Building and include class Theater, which in turn includes Stage, which includes Curtain. To express a property of Curtain, one will have to apply multiple levels of indirection, which is inefficient, not to mention inelegant:

    theaterBuildingObj->theaterObj->stageObj->curtainObj->colorRed

3) Without the same root base class, a generic reference to objects cannot be used in the program. Eckel points out that, when a program uses a container object to maintain a variety of different object types, it will need to have a generic reference or pointer to objects. The generic reference can work only if all user-defined classes are derived from the same root.

In early implementations of C++, multiple inheritance was not implemented. Furthermore, the language did not (and still does not in general) provide a built-in class hierarchy, and users can construct classes from distinct and unrelated inheritance trees. This created a problem with the container class - there is no way to assign objects from different inheritance hierarchies to the same generic reference. For example, suppose two classes, A and B, are derived from separate base classes, with A derived from a generic base class Object. A pointer to a generic object can be declared to reference an object of class A:

    Object *objPtr;
    A a;
    objPtr = &a;

However, the same objPtr cannot be used to reference an object of class B. One solution is to create a class, OB, that shares the properties of class Object and class B - i.e., one that multiply inherits from both classes. Now, both objects a and ob can be manipulated by objPtr:

    OB ob;
    objPtr = &ob;

This was one reason that multiple inheritance was eventually introduced into C++.

3.1.3.1 Method Resolution by Class Hierarchy

Even though CLOS supports multiple inheritance, it still requires a built-in hierarchy with a common root class, so that method resolution can be done correctly. With t as the root base class, all objects can be specialized to t.
Thus a user can define a method without any specializing parameters. This method will be invoked by the generic dispatching function when the generic function is called with arguments that do not satisfy any specialized parameters:

    (defmethod foo ((objA ClassA) (objB ClassB)) ...)
    (defmethod foo ((objA ClassA) objB) ...)
    (defmethod foo (objA objB) ...)

When foo is invoked with arguments that do not satisfy ClassA or ClassB, the last method will be called, because non-specializing parameters in a method are equivalent to specializing on t. Without a common object-based hierarchy, CLOS would not be able to resolve object precedence among different types, and it would not be able to implement methods that specialize on t as a default, catch-all method.

3.1.4 Object Introspection

One of the principal design goals of CLOS is the ability to "open up", or expose, the internal implementation to the users. By allowing access to the metaobjects, CLOS provides users with the backstage functionalities necessary to expand on the properties and behaviors of the language [Kiczales, p.47]. A very important aspect is class introspection - the run-time inspection of the properties of classes, generic functions, and methods in a program module.

One must keep in mind that data encapsulation does not necessarily mean hiding certain data in an object and keeping the information away from the users. The basic principle of object-oriented programming is the notion that data in a program can be modelled after real-life objects, and that their properties and behaviors can be captured and modified through interfaces to the objects. The motivation for encapsulating private data in an object is to be able to separate the implementation details of the object from its external interfaces. This enforces the usage of the interfaces when the users need to interact with the object.
If the interfaces are kept consistent, the users do not have to concern themselves with the implementation of the object, even if it is modified internally.

Introspection, on the other hand, is a run-time protocol that allows the program to obtain information about the interfaces and sometimes the internal structures of an object. There are two main reasons for introspection: 1) When given a generic object, a program may not have knowledge of the object's interfaces. In such a case, the program must query the object itself for its interfaces. 2) When the users need to extend the functionalities of an object, particularly a metaobject that supports the program itself, they have to examine its internal structure in order to make relevant decisions at run-time. Class introspection, however, as Kiczales et al. point out, is not about "arbitrarily exporting the internal structure of existing implementations" [p. 48]. Rather, it is a formalized and standardized procedure for obtaining the information needed to analyze the properties and behaviors that characterize an object.

These are the same principles that drive the design of the Metaobject Protocol in CLOS: metaobjects are "backstage" representations - which the audience (the users) normally do not see - of the "on-stage" objects in the user programs. However, at times when the language itself needs to be extended, the metaobjects must be conditionally and methodically exposed, and the users are allowed to participate in modifying the behavior of the language through the knowledge of the metaobjects they gain from introspection.

3.1.4.1 Metaobject Representation

One of the most powerful uses of metaobjects in CLOS is the internal representation of classes, generic functions, methods, and instances with standard-class, standard-generic-function, standard-method, and standard-instance metaobjects, respectively.
The interesting part is that these metaobjects are in turn instances of the standard-class. CLOS resolves the circularity problem - i.e., defining the class standard-class by instantiating an object of type standard-class - using a reflective procedure which, in essence, manually bootstraps the primitive structures needed to support the standard-class definition. Since the definition and creation of these metaobjects occur at program run-time, a user program, if allowed to access these objects through exposed interfaces, has complete control over the behavior of not only the program, but also the system that supports the program itself. CLOS introspection is provided by a series of accessor methods applied to these metaobjects to examine the structures of the objects they represent.

3.1.4.2 Accessor Functions

CLOS reinforces the notion of introspection with the specification of accessor methods as part of the class declaration. Accessor methods are either reader or writer methods, which allow a user interface to extract information from or modify the structure of an object, respectively. This technique has become a design pattern that is widely employed in C++ and Java, because accessor methods allow programs to call an interface to access information from an object without having to know the details of how the information is represented and provided internally. In the case of Java, as will be explained in a later section, the core reflection interface actually relies on accessor and signature design patterns to resolve the names of properties and methods in an object.

So, to support introspection through metaobjects, CLOS makes available a number of supplemental accessor functions that allow the users to browse the internal structures of the metaobjects for classes, generic functions, or methods.
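The reader and writer accessor pattern described above is commonly written in Java along the following lines. This is a minimal sketch with hypothetical names (AccessorDemo, Stage, getWidthMeters, setWidthMeters); the choice to store the width internally in centimeters is an assumption made only to show that callers never depend on the internal representation.

```java
// Hypothetical demo; Stage and its width property are illustrative only.
class AccessorDemo {

    static class Stage {
        // Internal representation: whole centimeters. Callers never see this field.
        private long widthInCentimeters;

        // Reader method: extracts information without exposing how it is stored.
        double getWidthMeters() {
            return widthInCentimeters / 100.0;
        }

        // Writer method: modifies the object through the same external unit.
        void setWidthMeters(double meters) {
            widthInCentimeters = Math.round(meters * 100.0);
        }
    }

    public static void main(String[] args) {
        Stage stage = new Stage();
        stage.setWidthMeters(12.5);
        System.out.println(stage.getWidthMeters());
    }
}
```

If the field were later changed to millimeters or to a double, only the two accessor bodies would need to change; code that calls the accessors would be unaffected.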
A CLOS program can take advantage of introspection in the following ways: 1) A program can obtain documentation of the interfaces and properties of a class from the objects themselves, rather than relying on separately distributed manuals. 2) A program can use the information from introspection on an object to make run-time decisions and interface with objects programmatically. 3) Packages can be used to control the exposure of the objects through introspection, thus restricting the behaviors of objects.

3.1.4.3 Component Objects

These attributes of introspection - that the properties and interfaces of an object can be conditionally discovered and analyzed at run-time - provide the foundation for the formalization of "component objects." Component objects, loosely defined, are a standardized approach to developing and packaging objects that can be interfaced with other packaged objects modularly and dynamically. These objects are often described as "re-usable" in the sense that they do not require recompilation or static linking in order to be used in a program. The most popular component object models at the time of this writing are Microsoft's Component Object Model (COM) and Sun's Java Beans.

COM, which is most commonly programmed in C++, provides a rudimentary introspective tool called QueryInterface. Basically, QueryInterface is a member function of every COM component, through which other objects can determine at run-time what interfaces the component makes available. This technique, however, is very limited. As Rogerson points out in Inside COM [pp. 56-57], for QueryInterface to work, the client module making the query has to know what interfaces it is looking for in another object. QueryInterface returns a pointer to the interface of the server module, but if the server object offers an interface that the client does not recognize, then the client will not be able to do anything with it.
In other words, QueryInterface serves only to verify whether an interface is available in another object and, if so, to return a pointer to that interface to the client; it does not provide for the analysis of an object's internal structures. Although COM provides an external type library for run-time objects, and the user program can browse the information about the parameters of the functions in an interface, general programmatic documentation of object properties is not available in COM.

At any rate, C++ semantics in general do not support introspection well. Firstly, C++ objects are statically linked and bound at compile time. In order to support introspection, the internal structures for classes and objects must be redefined with new semantics to include interfaces for retrieving information about classes and objects. Libraries to access the internal objects must also be provided, most likely by the same supplier as the compiler, and these have to be statically linked. Finally, since C++ object types are bound at compile time, a program cannot rely on the type of the reference to an object for its class properties; the object itself must maintain a pointer to its class object, much like a virtual function table pointer, so that the proper information can be retrieved from the object. These restrictions make it undesirable to revamp the basic structures of C++ to include programmatic introspection.

Java, on the other hand, supports a comprehensive set of user-visible interfaces for discovering and analyzing the internal structures of objects. Moreover, the core mechanism that enables components built with Java Beans to interface with each other is the Introspection API.

3.1.4.4 Java Introspection API

The Java Introspection API actually consists of the high-level introspection interfaces and the low-level reflection services.
The low-level "core reflection" API is maintained separately under the java.lang and java.lang.reflect packages, whereas the introspection API is associated with java.beans. This implies that core reflection can be used independently to deal with objects that are not Beans. However, while users can directly apply the reflection interfaces to examine run-time objects, the Java Beans designers recommend that the more general introspection procedure be used instead. The introspection API follows the rules prescribed by the Java Beans design specification, and it is in the best interest of the users to adhere to the standard.

3.1.4.4.2 Core Reflection and Reflection Objects

To support reflection efficiently, Java defines a series of so-called "reflection objects": Class, Array, Constructor, Field, Method, and Modifier objects. These objects are in some ways parallels of the CLOS special metaobjects, in that they are instrumental for a user program to access the internal structure of a class object. CLOS defines standard-class, standard-generic-function, and standard-method as the internal representations of class, generic-function, and method objects, respectively. CLOS also defines a series of accessor functions that are applied to these objects. These functions allow a program to find a class object by name, along with its superclasses, its precedence list, its slots, and its methods. Methods can also be found and analyzed by accessor functions applied to a generic-function object.

The reflection objects in Java provide very similar functionalities. With the accessor methods defined in the class Class, a program can find another class object by its name, along with its constructors, its member classes, its fields, and its methods. In both CLOS and Java, these accessor functions return real references to the desired data, which can then be used programmatically.
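As a rough illustration of how these reflection objects are used together, the sketch below looks up a class object by name and then obtains Constructor, Field, and Method objects from it. The classes CoreReflectionDemo and Curtain, the color field, and the describe method are hypothetical; Class.forName, getConstructor, getField, getMethod, newInstance, set, and invoke are standard calls in java.lang and java.lang.reflect.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Hypothetical demo; Curtain is an illustrative class to be examined at run time.
class CoreReflectionDemo {

    public static class Curtain {
        public String color = "red";

        public Curtain() {}

        public String describe() {
            return "a " + color + " curtain";
        }
    }

    // Creates a Curtain, sets its color, and calls describe(), all through
    // reflection objects rather than direct references to the class members.
    static String describeReflectively(String newColor) {
        try {
            // Find the class object by its name.
            Class<?> c = Class.forName(Curtain.class.getName());

            // Constructor object: instantiate without using "new Curtain()".
            Constructor<?> constructor = c.getConstructor();
            Object instance = constructor.newInstance();

            // Field object: a real reference to the slot, usable for writing.
            Field color = c.getField("color");
            color.set(instance, newColor);

            // Method object: a real reference to the method, usable for invocation.
            Method describe = c.getMethod("describe");
            return (String) describe.invoke(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflection failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeReflectively("blue"));
    }
}
```

Only public members are looked up here; getField and getMethod do not return private members, which is consistent with reflection respecting the language's access rules by default.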
There is one significant difference, however, between CLOS metaobjects and Java reflection objects: the reflection classes belong to the Java class hierarchy, but the principal ones (Class, Constructor, Field, and Method) are declared final and have no public constructors. In other words, subclasses cannot be derived from them, and they cannot be instantiated with the "new" operator. These objects are declared and used primarily to provide object references to other existing objects in a running program. It is interesting to note that in earlier releases of Java (pre-1.1), only the Class class was supported, and it had only limited usage for browsing class object names or verifying ownership of instances. More recent releases of Java include support for Java Beans, so the Class definition has been greatly expanded to include accessor methods for introspection, and the other core reflection objects have also been defined for this purpose. It has been said that the most important part of Java Beans is introspection (??). Probably, the most essential elements of introspection are these core reflection objects.

3.1.4.4.3 High-Level Introspection

The Java introspection API is packaged under java.beans. The Introspector object is the core of the introspection API and is used exclusively for the discovery and analysis of Bean properties. Bean introspection in Java goes beyond core reflection in that it allows the user program to provide customized information that a Bean wants to expose. When a user creates a Bean, the SimpleBeanInfo class may be extended to provide explicit or customized information in a BeanInfo object. The user can override the methods in SimpleBeanInfo to return descriptors for properties, events, methods, and parameters. The BeanInfo object, which accompanies a Bean, gives the user program a standard way to deal with the higher-level information of the Bean as a collection of objects, whereas the reflection API deals with the information of a specific class.
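To make the role of the Introspector concrete, the following is a small sketch that asks java.beans.Introspector for the properties of a Bean which supplies no BeanInfo of its own, so that the properties are inferred from the accessor method names. The names IntrospectorSketch, Thermostat, and properties are hypothetical; Object.class is passed as the stop class only to leave out the inherited "class" property.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.Map;
import java.util.TreeMap;

class IntrospectorSketch {
    // Hypothetical Bean with one read-write and one read-only property.
    public static class Thermostat {
        private int target;
        private String label = "hall";
        public int getTarget() { return target; }
        public void setTarget(int target) { this.target = target; }
        public String getLabel() { return label; }   // getter only: read-only
    }

    // Maps each discovered property name to "<type>:<r|w|rw>".
    static Map<String, String> properties(Class<?> beanClass) {
        try {
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            Map<String, String> result = new TreeMap<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                String access = (pd.getReadMethod() != null ? "r" : "")
                              + (pd.getWriteMethod() != null ? "w" : "");
                result.put(pd.getName(),
                           pd.getPropertyType().getSimpleName() + ":" + access);
            }
            return result;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(properties(Thermostat.class));
    }
}
```

A Bean that wanted to publish different information would supply its own BeanInfo class, for instance by extending SimpleBeanInfo, and the Introspector would consult that first.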
A Java Bean does not need to have a BeanInfo object, and it may override only selected methods of SimpleBeanInfo rather than all of them. The Introspector object always looks first for explicitly defined and publicized information about a Bean in the BeanInfo object, through the overridden methods. For the methods left at their defaults, or for all of the information if the Bean provides no BeanInfo, the Introspector applies core reflection instead. The interesting part about this reflection procedure is that it tries to infer the properties of the Bean by matching the design patterns of the accessor methods it has found. For example, if getXYZ() and setXYZ() methods are found in a Bean, it is assumed that the property XYZ exists, that it is readable and writable (because both getter and setter methods are available), and that it is of the type that getXYZ() returns. Moreover, even though the reflection interface can find private data and methods, the introspecting program will not be able to use them because of restricted access. To support reflection properly, therefore, the developer of a Bean must adhere to the specified design patterns for naming Bean properties. Otherwise, a BeanInfo class with explicit accessor methods should be used for introspection.

3.2 Operational Framework

In the context of object-oriented programming, polymorphism [Greek, "many shapes"] is a feature that allows method invocation to take many different forms. CLOS supports polymorphism with a suite of powerful techniques which include generic dispatch functions, method overloading and overriding, the core execution framework, and method invocation type definitions.

3.2.1 Generic Functions and Specializing Methods

Generic functions in CLOS are somewhat similar to virtual functions in C++. Both generic functions and virtual functions specify the interface, not the implementation, of the methods.
Both generic and virtual functions redirect an invocation to the actual method whose specializing parameters match the arguments of the invocation. Generic functions perform run-time binding of the invoked function to the actual method by a precedence ordering based on the class inheritance hierarchy; virtual functions in C++ use a table of indirect pointers to achieve the effect of run-time binding.

Examples: defining classes in CLOS with a generic function foo:

    (defclass ClassA ...
    (defgeneric foo (x) ...
    (defmethod foo ((objA ClassA)) ...
    (defmethod foo ((objB ClassB)) ...

Assume that ClassB is a subclass of ClassA. When the function is called as

    (foo b)

where b is an instance of ClassB, the second method defined above will be called, because CLOS detects that ClassB is more specific for b than ClassA. Now, if the second method is not defined, then the (foo b) invocation will resolve to the first method, which specializes on ClassA. So, the method invocation resolves the class type being specialized on at run time.

In C++, ClassA and ClassB declare a virtual function foo() with the same signature, with ClassB a subclass of ClassA:

    class ClassA { public: virtual void foo() { ... } };
    class ClassB : public ClassA { public: virtual void foo() { ... } };

Now, declare a pointer to ClassA but assign to it the address of an instance of ClassB:

    ClassB objB;
    ClassA *pObjA = &objB;

Then the call pObjA->foo() will effectively invoke ClassB::foo(). C++ typically supports virtual function dispatching by implementing a virtual function table for each class that declares virtual member functions. Each object that has virtual functions is created with a virtual function pointer that points to the virtual function table for its class. The virtual function table and virtual function pointer are expected to be in the same location for all classes.
Since the layout of the virtual function table of the base class is the same as that of all its subclasses, at compile time a pObjA->foo() call is bound to an indirect call through the address held in the virtual function pointer plus the offset in the virtual function table where the function is located. Then, at run time, when the actual object of ClassB is assigned to pObjA, the virtual function pointer reached through pObjA will point to the virtual function table of objB, and the pObjA->foo() call will effectively become pObjA->pVtable->foo(), where pVtable is the virtual function pointer for ClassB. To be exact, C++ simulates the effect of late binding by an indirect call through a function pointer of the object that is assigned at run time. In effect, the program does not actually need to know the class type of the object. There are some major differences between C++ virtual functions and CLOS generic functions. A generic function is not a member of a class; it is associated with different classes at run time, when it dispatches the invocation to methods based on the specializing arguments supplied by the caller. A virtual function, on the other hand, is a member of a specific class and abides by inheritance rules, although a virtual function does not necessarily have an implementation in the class. Moreover, since generic functions and methods are loosely coupled with classes, it is possible to specialize on more than one class in a method invocation through generic function dispatching (see the section on Multi-methods). A C++ virtual function, or a C++ method in general, can specialize only on the class of which it is a member.

3.2.2 Multiple Inheritance

Both CLOS and C++ support multiple inheritance, where a class can be derived from more than one superclass. This feature, while powerful (and necessary for C++ to resolve the generic reference problem), could be a source of confusion and other problems.
CLOS requires that a deterministic precedence list can be generated from the superclasses of a subclass, based on the order of specification and the order of the superclasses in the inheritance hierarchy. Otherwise, the multiple inheritance is not legal. Stroustrup, however, strongly felt that relying on order dependence to resolve ambiguities in multiple inheritance was not a good idea: it is too error-prone and inefficient. Programmers have to take care that the ordering of the inherited classes and of all their superclasses is consistent, and internal structures must also be maintained for the ordered lookup in the partially ordered set that represents the inheritance [Stroustrup, pp. 118, 259]. So, in C++, the implementation of multiple inheritance is independent of the order in which the superclasses are specified, and ambiguities are resolved at compile time. To get around the classic "diamond" problem - where a class inherits from parents which in turn are derived from the same ancestor - C++ implements "virtual base classes", so that multiple versions of the inherited properties are not derived from the superclasses; this "disambiguates" the invocation of inherited methods and the access of inherited properties. Java manages to avoid these issues because it does not support multiple inheritance of classes. To allow classes to inherit attributes from multiple sources, Java allows a class to "implement" multiple interfaces in a single-inheritance architecture. Interfaces are like abstract classes - they cannot be instantiated - but contain only unimplemented methods, and the user-defined classes must supply the implementation of the methods. So, there is no ambiguity that needs to be resolved, as there is in CLOS and C++. However, as Venners points out, implementing interfaces in a class can potentially have a minor effect on performance.
The instance variables and instance methods in a Java object can be represented as a table with ordered entries: the variables and methods of the base class on top, followed by those of the subclasses. Since class inheritance in Java is linear, the positions of the instance variables and methods of the superclasses can be expected to be the same for all the subclasses with the same parent. Thus a direct reference can be created for any of the variable or method entries in the table by indexing. When a class implements an interface, however, its superinterfaces cannot be expected to have variables and methods positioned at the same entries, because interfaces are multiply inherited and are not required to have a common base. Thus the variables and methods pertaining to the interface have to be searched for and cannot be indexed [p. 229].

3.2.3 Multi-methods

In "ANSI Common Lisp", Paul Graham distinguishes between the message-passing model and the generic function model of method invocation. Both C++ and Java employ message passing, where "methods belong to objects, and are inherited in the same sense that slots are" [p. 192]. With the generic function model, however, methods do not specifically belong to any object but are associated loosely with them. The limitation of message passing is that a method can specialize only on the object to which the method belongs; the semantics of message passing do not provide for specialization involving multiple objects. With generic functions, method invocation can specialize on as many objects as desired, as long as the methods are defined. In a sense, generic function dispatching is a superset of message passing, because message passing can be simulated with generic functions by specializing on one object, but not the other way around. A CLOS method that specializes on more than one parameter is called a multi-method.
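Since Java dispatches only on the receiver, the effect of a multi-method can only be approximated by hand. The following is a minimal sketch, not a feature of the language: it keeps a table of method bodies keyed on a pair of classes and, at call time, searches the superclass chains of both run-time argument types, most specific first and with the first argument taking precedence (similar in spirit to the left-to-right argument precedence of CLOS). All names here (MultiDispatchSketch, define, call, and the Engine and Vehicle classes) are hypothetical, and interfaces are ignored for brevity.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

class MultiDispatchSketch {
    // Hypothetical class hierarchies for the two specialized arguments.
    static class Vehicle {}
    static class Car extends Vehicle {}
    static class Engine {}
    static class DieselEngine extends Engine {}

    // Method table: (class of arg 1, class of arg 2) -> method body.
    private final Map<List<Class<?>>, BiFunction<Object, Object, String>> methods =
            new HashMap<>();

    // Registers a method body specialized on the given pair of classes.
    void define(Class<?> a, Class<?> b, BiFunction<Object, Object, String> body) {
        methods.put(Arrays.<Class<?>>asList(a, b), body);
    }

    // Dispatches on the run-time classes of BOTH arguments.
    String call(Object x, Object y) {
        for (Class<?> a = x.getClass(); a != null; a = a.getSuperclass()) {
            for (Class<?> b = y.getClass(); b != null; b = b.getSuperclass()) {
                BiFunction<Object, Object, String> m =
                        methods.get(Arrays.<Class<?>>asList(a, b));
                if (m != null) {
                    return m.apply(x, y);
                }
            }
        }
        throw new IllegalArgumentException("no applicable method");
    }

    // Builds a table with one general and one more specific "install" method.
    static MultiDispatchSketch installTable() {
        MultiDispatchSketch install = new MultiDispatchSketch();
        install.define(Engine.class, Vehicle.class, (e, v) -> "generic install");
        install.define(DieselEngine.class, Car.class, (e, v) -> "diesel engine into car");
        return install;
    }

    public static void main(String[] args) {
        MultiDispatchSketch install = installTable();
        System.out.println(install.call(new DieselEngine(), new Car()));
        System.out.println(install.call(new Engine(), new Car()));
    }
}
```

The lookup cost and the loss of static type checking in this sketch hint at why such a facility is more naturally provided by the language itself, as it is in CLOS.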
One advantage of multi-methods is obvious: a single method definition (defined outside of classes) can accommodate multiple classes that need to be operated on together or that require the same behavior. For example:

    (defmethod install ((e Engine) (v Vehicle)) ...)

The objects of the Engine and Vehicle classes use the same install method, and the method operates on both objects together. As explained above, there is no equivalent to multi-methods in C++ and Java, even though multi-methods could potentially be very useful in C++: a method could be defined outside of classes (not as a member of any specific class) and yet specialize on as many objects as desired. Presumably, with some change in semantics, multi-methods could be included in C++, but not in Java, because the language specification of the latter dictates that all methods are strictly encapsulated within classes. The designers of C++ have long considered adding multi-methods, which were omitted from the original design, to the language. Now that C++ has evolved and its syntax and semantics have solidified, it seems unlikely that multi-methods will eventually be supported. Within the existing context of C++, multi-methods would require syntactical changes that allow declarations such as:

    class ClassA : public SomeBaseClass { public: void foo(); };
    class ClassB : public SomeOtherBaseClass { public: void foo(); };

The definition of the multi-method would be something like:

    void ClassA&ClassB::foo() { ... }

The invocation would be tricky. As Stroustrup points out, (objA@objB)->foo(1, 'c', 2) is neither efficient nor elegant within the context of existing C++ syntax [p. 298]. For one thing, the C++ compiler would have to resolve the awkward invocation with multiple class specifications, in all the different combinations and orders, at compile time.
A question immediately arises: how would the compiler distinguish the multi-method ClassA&ClassB::foo() from the ordinary invocations of ClassA::foo() and ClassB::foo()? Or should one type of method definition exclude the other, if they have the same signature? The more serious problem, however, is how the "this" pointer could be utilized in this syntax to specify data belonging to a specific object. In the above example, data in ClassA and ClassB must be distinguished, but there is no handle to refer to them, since the "this" pointer is meaningless in this context. Despite all this, Stroustrup refers to a proposal by Douglas Lea which has the potential for implementing multi-methods in C++ in a logical and efficient way. Lea's suggestion is to apply the virtual keyword to the specialized parameters of a multi-method:

    void foo(virtual ClassA&, virtual ClassB&) { ... }

The multi-method would be defined as an external function outside of classes, and its parameters would be references to the objects of the specialized classes. This would provide the semantics for supporting multi-methods in the same manner as CLOS. Still, there are several issues: 1) The semantics of C++ would have to change to accommodate both virtual and non-virtual declarations of function arguments. This means that the language, in addition to the above declaration for a multi-method, would have to support declarations of methods of the ordinary kind: void foo(ClassA&, ClassB&) { ... } 2) Additional structures similar to the virtual function table would have to be organized for the virtual parameters in a multi-method declaration. 3) Multi-methods would have to support variable-length argument lists with default values, as in ordinary C++ method overloading. 4) In CLOS, there is a root base class from which every class derives. Therefore, a "catch-all" method with the root base as its specializing parameters can be written.
Any invocation with arguments that do not satisfy any other method will be caught by this "catch-all" method. In C++, there is no built-in root base; classes can be derived from entirely separate inheritance trees. A "catch-all" multi-method in C++ would have to specialize on the root base of each of the objects in the parameter list. It would have been interesting, as Stroustrup believes, to have implemented multi-methods in C++ early in its existence, before the syntax and semantics were firmly established. It would perhaps be too much work now to add multi-methods and still maintain backward compatibility.

3.2.4 Primary and Auxiliary Methods

CLOS prescribes the notion of method combination types, which specify different frameworks for method invocation. The so-called "standard combination type" is the default behavior for the method invocation procedure. Within the standard method combination type, the core framework is defined to "declare the role of a method" [Keene, p. 102]. The different roles that methods play are defined in the forms of primary and auxiliary methods. Auxiliary methods include before-, after-, and around-methods, the purpose of which is to enhance the functionality of the primary method.

3.2.4.1 Before- and After-methods in the Core Framework

In the core framework of CLOS, the before-methods are called first, in most-specific-first order; then the primary method is called; and then the after-methods are called in least-specific-first order (the reverse of the before-methods). This procedural sequence of method invocation provides new behaviors around the primary method in a flexible and modular manner. Before- and after-methods are called "declarative techniques" in CLOS [Keene, p. 102], because their functionalities are pre-assigned when they are defined, and they follow the invocation rules of the core framework.
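The calling order just described can be imitated by hand in a language without method combination. The following Java sketch is only such an imitation, under the assumption of a two-level hierarchy with hypothetical names (MethodCombinationSketch, Base, Derived, beforeRun, primaryRun, afterRun): the ordering is produced by where each override places its call to the superclass method, not by any declarative mechanism.

```java
import java.util.ArrayList;
import java.util.List;

class MethodCombinationSketch {
    static class Base {
        // Fixed "core framework": before-methods, primary method, after-methods.
        public final void run(List<String> log) {
            beforeRun(log);
            primaryRun(log);
            afterRun(log);
        }
        protected void beforeRun(List<String> log)  { log.add("before Base"); }
        protected void primaryRun(List<String> log) { log.add("primary Base"); }
        protected void afterRun(List<String> log)   { log.add("after Base"); }
    }

    static class Derived extends Base {
        // Most-specific-first: do the subclass work, then defer to the superclass.
        @Override protected void beforeRun(List<String> log) {
            log.add("before Derived");
            super.beforeRun(log);
        }
        // Only the most specific primary method runs (no call to super).
        @Override protected void primaryRun(List<String> log) {
            log.add("primary Derived");
        }
        // Least-specific-first: defer to the superclass, then do the subclass work.
        @Override protected void afterRun(List<String> log) {
            super.afterRun(log);
            log.add("after Derived");
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        new Derived().run(log);
        System.out.println(log);
    }
}
```

Unlike in CLOS, nothing here forces a subclass to keep to the convention; an override that forgets its super call silently drops the inherited before- or after-behavior.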
Although the constructors and copy constructors in C++ and Java, and the destructors in C++, have a similar purpose and functionality, there are in general no semantics to support before- and after-methods in C++ or Java. Also, the constructors of a C++ or Java class are executed from the root base in least-specific-first order, and the destructors in C++ are executed in most-specific-first order. These are the exact opposite of the orders of the CLOS before- and after-methods. The reason is that C++ and Java specify the initialization of objects starting from the root base. When the objects are to be destroyed, the destructors have to be called in the reverse order of the initialization, lest the constructors have created some ordering dependencies. The semantics of the CLOS before- and after-methods were actually implemented in a predecessor of C++ called C With Classes. In each class of C With Classes, a "call()" function could be defined such that whenever a member function other than a constructor was called, the call() function would be executed before the member function was actually called. Another "return()" function could also be defined, to be executed before a member function returned [Stroustrup, p. 57]. The call() and return() functions were dropped from the specification of C++, because few people saw them as important for the language, and they could cause potential problems.

3.2.4.2 Around Methods and the Standard Method Combination

CLOS distinguishes the declarative technique, which uses the before- and after-methods in the core framework, from the imperative technique, which uses around-methods to enhance the core framework and control the calling sequence explicitly. Around-methods provide a wrapping layer around the execution of the core framework. The major difference between around-methods and other methods is that around-methods can optionally control which method to call next by means of the call-next-method function.
CLOS first executes the most specific around-method from the generic function dispatch. If an around-method calls call-next-method, it will call the next most specific around-method. When there are no more around-methods and call-next-method is called, the core framework with before- and after-methods will be invoked. If an around-method does not call call-next-method, it will return its value to the caller of the generic function without calling the core framework. Thus around-methods can conditionally prevent the execution of the core framework. Before-, after-, and around-methods could in principle be implemented in both C++ and Java as standard method combination features. On the one hand, these techniques allow an object to leverage functionalities from its superclasses automatically. On the other hand, they may add complications for programmers, who now have to think about how the methods of one class may affect the usage of methods in another. In other words, when one implements an around-, before-, or after-method in a class, one must take into consideration how it will be used by potential subclasses because of the standard method combination. Users of the subclasses, also, must understand the procedure of the standard method combination and how the auxiliary methods interact with the core method, in order to apply the techniques properly and effectively. Therefore, these auxiliary methods may not be desirable in the C++ and Java framework. Moreover, as Stroustrup observes, the standard method combination can be simulated in C++ (or Java, for that matter) without having to impose the entire framework on the language. The idea is that if a user wants to provide the functionalities of the auxiliary methods in a class, these can be implemented as protected methods with a special tag.
Derived classes can then implement methods that invoke the tagged methods of the superclass before (in the case of simulating around- and before-methods) or after (in the case of simulating after-methods) calling the core method. The tagged methods will interact with other simulated auxiliary methods in their own superclasses, if they are available. This approach allows the flexibility of having auxiliary methods where they are desired, and does not impose a strict method combination framework upon all users.

4. Dynamic Extension of the Language Behavior

Perhaps the most intriguing aspect of the Metaobject Protocol of CLOS is that it allows a program to affect the semantics and behavior of the language within the execution context of CLOS itself. In the Introduction to The Art of the Metaobject Protocol, the authors state upfront the motivation for their approach in the CLOS design: "The protocols followed by this object-oriented program serve two important functions. First, they are used by the designers to specify a distinguished point in that region, corresponding to the language's default behavior and implementation. Second, they allow users to create variant languages, using standard techniques of subclassing and specialization. In this way, users can select whatever point in the region of language designs best serves their needs." Kiczales et al. describe a strategy of developing "intercessory protocols" for user programs to change the behavior of the language. The dynamism in this language design can be realized in CLOS on account of several closely related factors: 1) Common Lisp, itself an extensible language, is employed as the foundation for developing CLOS. The primitive instructions of CLOS - defclass, defgeneric, defmethod - are actually macro extensions of Common Lisp. Therefore, CLOS is inherently extensible. 2) The Metaobject Protocol specifies a design scheme to represent the internal structures supporting CLOS programming within the CLOS context.
This blurs the line between the language that implements the program and the program itself; i.e., the CLOS programming environment itself can be viewed as a CLOS program, with some built-in primitive Common Lisp structures in the core to support user-defined objects. The ramification is that CLOS can be programmatically enhanced, even while it is running a user program. 3) As a result of the above factors, the CLOS internals, or more specifically the core elements that support the CLOS functionalities, can be optionally and selectively exposed to programmers. Therefore, as the authors assert, many variants of the language can be implemented by providing the required points of service within the region of the language design.

4.1 Intercessory Protocol on Metaobjects

In practical terms, CLOS enables the user to modify the basic notions of classes, generic functions, and methods as espoused by the original semantics of CLOS. A user program, in effect, can redesign the language to tailor its features and functionalities for its own purposes, all within the running context of the program itself. It has been shown that intercessory facilities supplementing the metaobjects can efficiently redefine the semantics of standard-class objects, modify class precedences, change inheritance rules, and enhance the properties of objects. These are possible because the design of the Metaobject Protocol makes the internal metaobjects that support the language available to the users, who can then construct special metaobject classes with modified behaviors. Hence, the semantics of standard-class, standard-generic-function, and standard-method can be extended to include additional properties and behaviors. User programs, then, can define objects with new flavors based on the enhanced metaobjects.
Clearly, this extensibility and dynamism come with complexity in the language design and are very much dependent on the intrinsic properties of Common Lisp and the flexibility of the Metaobject Protocol. A static language such as C++, where the notions of classes and methods are immutable and the internal representation of the elements is insulated from the users, is ill-suited for supporting this metamorphic behavior in the language itself. At the very least, a dynamic program must be able to access the elements of its internal representation in order to adjust its own behaviors. This is not something the C++ language is designed for or capable of.

4.1.2 The Extensibility of a Java Program

Java, on the other hand, has the potential for implementing these dynamic features of CLOS. Although a Java program is compiled code, the Java compiler generates platform-independent bytecode that is interpreted by the Java Virtual Machine (JVM). The specification of the JVM, in and of itself, allows a running Java program to make decisions and extend its functionalities in the following ways: 1) A Java program can conditionally access, load, and execute new classes on demand, in effect dynamically extending the program's behavior. 2) A Java program can introspect into other objects and make decisions on the interaction based on the information gathered. 3) The Java Virtual Machine can modify the program's internal structures, such as by changing ordinary instructions to "quick" commands, caching instructions, or applying "just-in-time" compilation to the bytecode to optimize performance.

4.1.2.1 Metaobjects for Java

In effect, the JVM is "the program" that simulates the execution of the bytecode instructions. As shown in the section on Introspection, the JVM actually constructs Class, Constructor, Field, and Method objects to represent the corresponding elements created by a user-defined class. These "reflector objects" basically define the interfaces to the object's properties and behavior.
These reflector objects are somewhat parallel to the metaobjects in CLOS, although Java uses them mainly for introspection in Java Beans. Nonetheless, these internal objects could be used as the basis for extending the language, as in CLOS. In other words, specialized metaobject classes could be derived from the basic reflector objects in a similar fashion as in CLOS, if certain flexibility were applied to the Java language and Virtual Machine specifications. The issue is that even though the JVM has access to all the internal structures representing all the class data, the reflector classes are final: they cannot be subclassed to create specialized metaobject classes. Moreover, the primitive definition of a Java class (i.e., the structure of the class object) is determined statically at compile time, whereas CLOS defines and constructs the class metaobjects within its running environment. Therefore, in order to open up the implementation of Java, the language specification would have to be changed to make the reflector classes non-final, so that special metaobject classes could be derived by the user program. A new syntax would also be needed specifically for defining special metaobject classes, to distinguish them from ordinary Java classes. Lastly, the JVM would need to be enhanced to recognize special class definitions and generate structures with attributes different from those of the default class objects.

4.1.2.2 Specializing Java Classes

As an example, a counted-class can be defined by extending the class Class, provided that class Class is made non-final. A newInstance() method will be defined for Counted-Class, so that an internal count is incremented each time newInstance() is called:

    class Counted-Class extends Class {
        private int count = 0;
        public Object newInstance() { count++; return super.newInstance(); }
        public int getCount() { return count; }
        public void decrementCount() { count--; }
        ...
    };

Internally, then, the JVM will construct a class object for Counted-Class.
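The Counted-Class declaration above is hypothetical and cannot be compiled as Java stands, since Class is final. As a rough approximation only, the following sketch obtains a similar count without any change to the language by wrapping a Class object instead of extending it and creating instances through core reflection. The names CountedClassSketch, Counted, and Widget are assumptions made for this illustration; unlike the proposal in the text, instances created directly with "new" are not counted.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CountedClassSketch {
    // Hypothetical stand-in for Counted-Class: wraps a Class rather than extending it.
    static final class Counted<T> {
        private final Class<T> target;
        private final AtomicInteger count = new AtomicInteger();

        Counted(Class<T> target) { this.target = target; }

        // Creates an instance through reflection and counts it on success.
        T newInstance() {
            try {
                T obj = target.getDeclaredConstructor().newInstance();
                count.incrementAndGet();
                return obj;
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot instantiate " + target.getName(), e);
            }
        }

        int getCount() { return count.get(); }

        void decrementCount() { count.decrementAndGet(); }
    }

    // Hypothetical user class with a no-argument constructor.
    static class Widget {
        Widget() {}
    }

    public static void main(String[] args) {
        Counted<Widget> counted = new Counted<>(Widget.class);
        counted.newInstance();
        counted.newInstance();
        System.out.println(counted.getCount());
    }
}
```

The wrapper has to be used explicitly at every creation site, which is exactly the burden that a metaclass mechanism recognized by the compiler and the JVM would remove.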
When a user program defines a new class with Counted-Class, the compiler should generate bytecode that instructs the JVM to create a class object from the extended Counted-Class. A new syntax, however, is required to specify the extended Class, as in the following:

    class myClass:Counted-Class extends Applet { ... };

The specification myClass:Counted-Class - analogous to the :metaclass keyword in CLOS - indicates to the compiler that this is a special class based on the Counted-Class previously defined. The compiler should generate a class file with attributes referring to Counted-Class. When the JVM loads the class file for myClass, it should recognize from the special attributes that it is a special class and create the internal class object from Counted-Class instead of Class. Subsequently, each time an instance of myClass is created, the JVM will invoke the newInstance() method of Counted-Class, and the count variable of the class object will be incremented. To make this complete, myClass should implement a finalize() method which invokes the getClass() method to obtain the Counted-Class instance. Then finalize() can call a public method of Counted-Class to decrement the count.

4.1.3 Extending Java Class Behavior

Although the above is a simple-minded example, it shows that Java has the potential to extend the language semantics dynamically, if the internal structures of a class can be made accessible to the user. It should be clear that other specialized features, such as slot attributes and default initialization of variables, as shown in the CLOS extensions, can be achieved in Java with an approach similar to the above. With the exception of Class, all the reflector objects in Java are implemented for the purpose of introspection in Java Beans. However, these metaobjects can be specialized and leveraged to support modified behaviors of classes.
The JVM will then create new structures based on these specialized objects and provide the different interfaces to user-defined classes declared with special attributes. The Java compiler and the JVM will need to be modified, but the new functionalities will not conflict with the original specification if the user chooses not to use the extended features. Granted, a Java program is still constrained by compiler-generated code and will probably not be as flexible as CLOS in defining new semantics; even so, the extended features would provide alternative facilities to users who require programming support in addition to what is available in the basic language.

5. Conclusion

Object-oriented programming has come a long way to establish itself in the mainstream of modern programming paradigms. The motivations of OOP - to standardize data abstraction, to separate interfaces from implementation, to facilitate function invocations, and to provide flexible and efficient semantics - are well-suited to the vastly varied requirements of the real world. The design of CLOS, based on the Metaobject Protocol, is unique in the way that it provides for dynamic extension from a single point of default functionalities to different adaptable regions based on the requirements of the users. This paper has discussed selected features of CLOS that make it powerful and extensible, and how they can be adapted to the more popular modern object-oriented languages, namely C++ and Java. Hopefully, this comparative study will help readers gain insight into some of the design issues of these languages and their limitations relative to the features under discussion.