Honors theses guided by Professor Kim Bruce
Most student theses that Kim Bruce has guided over the last five years have been concerned with the design of statically typed, type-safe object-oriented languages. As part of this design effort, careful attention has been paid to the design of the type system and to the formal semantics of the languages.
The theses by van Gent, Schuett, Petersen, and Browne all build on each other to design more sophisticated languages. Van Gent's thesis added imperative features to the language TOOPLE to get the language TOIL. Schuett added support for polymorphism to TOIL to obtain PolyTOIL. Petersen's thesis simplified PolyTOIL by replacing subtyping with matching, and also added a sophisticated module system, resulting in LOOM. Browne added support for concurrency to LOOM, obtaining Concurrent LOOM. Each of these students implemented their languages by constructing interpreters.
Vanderwaart's thesis was a bit different from the earlier ones in that he focused on developing a typed intermediate language for LOOM. He wrote a compiler from LOOM to the typed intermediate language. We hope eventually to hook this up to a compiler backend like the FLINT backend of Standard ML of NJ.
Seligman's thesis examined the object-oriented core of C++. As well as discovering a few ambiguities in the language, he was also able to document just how complex the semantics of C++ really are.
Burstein's thesis involved integrating the innovative features of LOOM into an existing language, Java. His language is based on the proposal described in the paper "Increasing Java's expressiveness with ThisType and match-bounded polymorphism" by Kim Bruce. Burstein's Rupiah compiler added both parametric polymorphism and a "ThisType" construct to Java, with an implementation based on Java's reflection facilities. Foster rewrote the Rupiah compiler based on Sun's Java 1.4 compiler, adding a ThisClass construct as well as type casts and "instanceof", and modifying and improving many aspects of the previous implementation.
Gonzalez's thesis aimed at improving the efficiency of the Java virtual machine in executing code compiled from LOOJ (and incidentally from GJ). The idea is that information about type parameters, ThisType, and ThisClass is kept in annotations in the .class file and used by a modified verifier to check the code as it is loaded into the virtual machine. Because the verifier has access to this extra information, the superfluous type casts generated by the compiler can be omitted, so the virtual machine executes leaner JVML code and overall execution time improves.
Student theses are listed in (reverse) order of completion date.
Abstract Since Java's release in 1996, researchers have been exploring ways to improve its type system. One feature that has received a great deal of consistent attention is F-bounded parametric polymorphism [CCH+89]. After years of debate, Sun Microsystems is finally adding support for parametric polymorphism to the Java language specification in its next major release. However, there are other type system enhancements that add expressiveness to Java. LOOJ, a language extension to Java developed at Williams by Kim Bruce and his students, includes support for F-bounded parametric polymorphism as well as for exact types and the ThisType and ThisClass constructs. Foster's undergraduate thesis [Fos01] explained how these language extensions together add considerably to the expressiveness of the Java language.
Until now, all proposed modifications to the Java language specification have had to work around the inflexible type system of its target platform, the Java Virtual Machine. For example, GJ's compiler inserts extra type casts and NextGen produces complicated type hierarchies to convince the Java bytecode verifier that the bytecode they produce for their Java extensions is safe to execute in the JVM.
This thesis presents LOOJVM, a modified JVM that is able to efficiently run code produced by the LOOJ compiler. LOOJVM includes an enhanced verifier whose static type system is the same as the LOOJ programming language and allows code lacking the traditional superfluous type casts to be verified and run safely. LOOJVM also optimizes bridge method calls for greater efficiency.
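The "superfluous type casts" mentioned above can be illustrated with a small sketch in standard Java. The class and method names here (ErasureCasts, firstGeneric, firstErased) are hypothetical and not taken from the thesis; the second method is only a hand-written approximation of what an erasure-based translation produces, not the actual output of the GJ or LOOJ compilers.

```java
import java.util.ArrayList;
import java.util.List;

public class ErasureCasts {
    // Generic source as a programmer would write it: no cast is visible.
    public static String firstGeneric(List<String> names) {
        return names.get(0);
    }

    // Hand-written approximation of the erased form (an assumption for
    // illustration): the element type is gone from the signature, so a
    // checked cast is needed at the use site to satisfy the verifier.
    @SuppressWarnings("rawtypes")
    public static String firstErased(List names) {
        return (String) names.get(0);
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("LOOJ");
        // Both calls return the same element; only the second spells out the cast.
        System.out.println(firstGeneric(names));
        System.out.println(firstErased(names));
    }
}
```

The cast in firstErased can never fail when the source program type-checks, which is why a verifier that understands the source-level types could in principle accept the code without it.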
Abstract Despite Java's popularity, several practical limitations imposed by the language's type system have become increasingly apparent in recent years. A particularly glaring omission is the lack of a generic mechanism. As a result of this shortcoming, many recent projects have extended Java to support polymorphism in the style of C++ templates or Ada generics. One project, GJ [BOSW98], adds F-bounded parametric polymorphism [CCH+89] to Java via a homogeneous translation (such that only one class file results from each compiled source file), and produces bytecode that is compatible with the standard Java Virtual Machine. However, while GJ's simple translation based on erasure allows for maximum interaction with existing Java code, the new parameterized types that it supports do not operate consistently with Java's semantics for lightweight reflection (i.e., checked type casts and instanceof operations).
We present Rupiah, a language based on features adapted from LOOM [BFP97], a provably type-safe language, and implemented by a translation based on GJ. However, its translation differs from GJ's in that it harnesses Java's built-in reflection to store information about parameterized types. The resulting bytecode correctly executes checked cast and instanceof expressions because it has access to the necessary type information at run-time. We also add a ThisType construct, which solves many of the problems that arise when binary methods are mixed with inheritance, and we replace subtyping with a different relation, matching. Finally, we add exact types, an inheritable virtual constructor mechanism (ThisClass), and compiler features to allow separate compilation of Rupiah source files. These features are implemented in a modified javac compiler. Bytecode emitted by our compiler runs on any Java 1.2 VM. Thus, the Rupiah project contributes a complete implementation of an extension of Java with a more expressive type system that maintains a close fit with existing Java semantics and philosophies.
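To make the idea of keeping type-argument information available at run-time concrete, here is a minimal sketch in standard Java. It is not Rupiah's actual translation, which the abstract does not spell out; the TypedBox class, its elementType field, and the castTo helper are all hypothetical names, and carrying an explicit Class token is simply one well-known way to let a checked cast distinguish a box of String from a box of Integer.

```java
public class TypedBox<T> {
    // Run-time record of the type argument (an assumed design for illustration).
    private final Class<T> elementType;
    private final T value;

    public TypedBox(Class<T> elementType, T value) {
        this.elementType = elementType;
        this.value = value;
    }

    public T get() {
        return value;
    }

    // Checked cast that also compares the stored type argument, which a plain
    // erased cast to TypedBox cannot do.
    public static <U> TypedBox<U> castTo(Object o, Class<U> wanted) {
        if (o instanceof TypedBox) {
            TypedBox<?> box = (TypedBox<?>) o;
            if (box.elementType.equals(wanted)) {
                @SuppressWarnings("unchecked")
                TypedBox<U> typed = (TypedBox<U>) box;
                return typed;
            }
        }
        throw new ClassCastException("not a TypedBox of " + wanted.getName());
    }

    public static void main(String[] args) {
        Object o = new TypedBox<>(String.class, "rupiah");
        TypedBox<String> ok = castTo(o, String.class);
        System.out.println(ok.get());
        try {
            castTo(o, Integer.class);
        } catch (ClassCastException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under pure erasure both casts would succeed at the cast site, and the mismatch would only surface later, when an element is used at the wrong type.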
Abstract In the past few years a large body of work has developed on the use of typed intermediate languages in compilers. It has become apparent that the retention of type information in the intermediate representation of a program is useful for ensuring compiler correctness and facilitating optimizations. The use of intermediate languages resembling typed lambda-calculus in the compilation of functional languages like ML and Haskell has been particularly successful, but not much has been done on using this kind of intermediate format for non-functional languages.
Meanwhile, TOOPL, TOIL and LOOM have been developed as object-oriented programming languages with static type-safety and semantic foundations firmly in mind. Encodings for the object and class constructs of these languages in lambda-calculus are known, and consideration of these encodings has proved fruitful in their design and implementation.
In this thesis, our intention is to exploit the work on the semantic foundations of object-oriented languages and on typed intermediate languages for compilation, in an attempt to design a compiler for LOOM. Our translation makes use of two intermediate languages, each based on a version of polymorphic typed lambda-calculus and similar enough to the intermediate formats of other compilers to make the use of a previously developed back-end possible in principle. We outline the design of our intermediate languages and the translations involved in our proposed compilation strategy, and we discuss our prototype implementation, which compiles a subset of LOOM into lambda-calculus for subsequent interpretation.
Abstract The programming language Java has gained widespread acceptance throughout the computer industry. Java's type system, though, is lacking in flexibility. This lack of flexibility limits the expressiveness of the language, especially for the creation of container classes. To improve Java's expressiveness, we extend its type system through the addition of three constructs: match-bounded parametric polymorphism, ThisType, and exact typing. These constructs allow a programmer to write flexible, extensible, and statically type-safe code. Our current implementation targets the standard Java Virtual Machine through a source-level translation. Translation allows Rupiah programs to be run on existing Java installations, but carries with it a performance cost. We conclude with a comparison of our language changes and implementation and other proposals for extended Java.
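For context on what ThisType is meant to address, the following sketch shows the closest workaround in standard Java: an F-bounded type parameter standing in for "the type of the receiver". This is not Rupiah syntax, and the names Linked, Self, and DoubleNode are hypothetical. The point is that setNext in the subclass accepts only DoubleNode, so it can safely maintain the back pointer.

```java
public class BinaryMethods {
    // F-bounded interface: Self plays the role that ThisType would play.
    interface Linked<Self extends Linked<Self>> {
        Self next();
        void setNext(Self node);
    }

    // A doubly linked node: its binary method takes another DoubleNode,
    // not an arbitrary Linked, so touching node.prev is statically safe.
    static final class DoubleNode implements Linked<DoubleNode> {
        private DoubleNode next;
        private DoubleNode prev;

        public DoubleNode next() {
            return next;
        }

        public DoubleNode prev() {
            return prev;
        }

        public void setNext(DoubleNode node) {
            next = node;
            if (node != null) {
                node.prev = this;
            }
        }
    }

    public static void main(String[] args) {
        DoubleNode a = new DoubleNode();
        DoubleNode b = new DoubleNode();
        a.setNext(b);
        System.out.println(b.prev() == a);
    }
}
```

The workaround forces every implementing class to repeat itself as a type argument, which is the kind of boilerplate a built-in ThisType is intended to avoid.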
Abstract Over the past decade, both object-oriented and concurrent programming languages have become popular tools for solving complex problems, and language designers have recognized that combining these two paradigms can be useful. Concurrency allows object-oriented languages to address naturally concurrent problems and potentially speed up execution. Object-oriented features can improve the organization of concurrent programs and potentially remove some of the low-level process control by encapsulating it in object structure. In this thesis, we explore the addition of concurrency to the object-oriented language LOOM, focusing on providing a safe, easy-to-use language for programmers familiar with object-oriented programming but not necessarily with concurrency. We introduce a large grain of concurrency by identifying processes with objects, and allow these objects to communicate both synchronously and asynchronously using the existing message structure. Special semantics for self-inflicted messages prevent deadlock when processes send messages to themselves, but mutually referential objects can cause deadlock. Conditional synchronization of message reception is handled with separately inheritable pre- and post-condition sections in classes, avoiding extensive redefinitions of synchronization constraints in subclasses and most forms of inheritance anomalies.
Abstract A strong module system is a very important language tool for developing software systems. Classes alone do not allow for sufficient levels of abstraction and separate compilation. Modules can be very helpful in organizing code, providing abstraction, and supporting separate compilation. Abstraction makes it difficult to share types between modules, but transparent types can propagate too much information to allow separate compilation. The use of partially abstract types and manifest types can help to avoid these problems.
Earlier work by Robert van Gent and Angela Schuett under the direction of Professor Kim Bruce resulted in the design and implementation of the language PolyTOIL, a type safe object-oriented language with strong polymorphic features. LOOM is a direct descendant of PolyTOIL which omits subtyping in favor of a more flexible version of matching, including matching-based subsumption. We give an overview of LOOM and of a prototype interpreter for the language. Proofs of the complexity of the matching algorithm and the decidability of type checking are presented. We describe the design and implementation of a module system for LOOM, and present an in-depth discussion of the issues that motivated and affected the design process. Formal type checking and semantic rules are given, and the prototype implementation is described. The module system is evaluated, and proposals are made for further work.
Abstract This thesis presents a formal analysis of the core of C++, the method binding mechanism. The analysis includes features such as inheritance, overloading, overriding, hiding, and conversions. Our model will clearly explain method binding under all these features, without the ambiguity that colloquial descriptions can introduce.
Many of the intricacies of our formal model will demonstrate that the informal descriptions used to explain the language are not sophisticated enough to explain certain examples. The type rules and operational semantics we present should provide the basis for a more accurate conceptual model of the language.
Abstract This thesis describes the language PolyTOIL, a type-safe, polymorphic, object-oriented programming language. PolyTOIL is an extension of the object-oriented language TOIL, also developed at Williams College. The extensions to TOIL include support for bounded and unbounded parametric polymorphism. It is relatively unusual in using a "matching" relationship rather than subtyping to determine constraints on type arguments. Subtype polymorphism (through substitution) and inheritance are also features, though the subtype and inheritance hierarchies are not identified. A PolyTOIL interpreter has been implemented in ML and can be used to run PolyTOIL programs.
Abstract The object-oriented programming language paradigm offers a useful approach toward data abstraction. However, conventional object-oriented languages fail to provide safe static typing systems, a serious drawback especially for large scale programming projects. Bruce presents TOOPLE, a functional object-oriented language which supports structural subtyping, class-based inheritance, and updateable instance variables, while still maintaining completely safe and deterministic type checking and subtyping algorithms.
This paper details the design and prototype implementation of TOIL, an imperative version of TOOPLE. As an imperative language, TOIL hopes to provide a stronger and more efficient platform for realistic object-oriented programs. The semantics of TOIL are described in a manner similar to that used for TOOPLE. Although many of the basic semantic techniques remain the same, TOIL's imperative features require new syntactic constructs as well as fundamental changes in the semantics of the language. A semantic model is used to prove that the type-checking algorithm is consistent.