Monday, November 7, 2011

Semantics of Nested Classes

Although nested classes are widely adopted in many popular programming languages (e.g., C++, C#, Java, Python, Ruby), I have yet to see a good treatment giving the motivation, use cases, and architectural issues associated with the feature. Moreover, another undocumented aspect is how the semantics of nested classes vary from language to language. They appear to be useful for namespace management (though C++ namespace and Java packages also cover this), simulating closures, and callbacks for GUIs and event handling (basically a variant of closure simulation).
First thing is first, what are nested classes?
class A {
  class B { }
}
In essence, they are scoped type declarations and definitions. Instead of defining a class B at the top-level of the program source, one declares a locally scoped one within the definition of another class, A in this case. One immediate consequence of this is a form of namespace management for user-defined aggregate types (i.e., classes).
There are a lot of gotchas due the language design here. The scoping discipline is incomplete. C/C++'s restriction on shadowing prohibits the following code:
 
class B; 

class A {
  B something;
  class B { };
}  
In contrast, the following code compiles:
class A {
  private:
    class B {};
  public:
    B makeB();
}
Despite the fact that A::B is unnameable outside an instance of A, A exports the member function makeB which yields a value of type A::B. Though one cannot name the return value of makeB outside of A, the value can still be used as is in the case in the following:
class A {
  private:
    class B { 
      int x;
       public: 
      B add(B _x) { x += _x; }
    };
  public:
    B makeB();
};

int f() {
  A a0, a1;
  a0.makeB().add(a1.makeB());
}
The above illustrates an important point. C++ nested classes do not declare distinct types for each instance. Although one cannot annotate them as static, they are exactly that. In the example, a0.B and a1.B are of the same type. This is why a0.makeB().add(a1.makeB()) type checks. This means that writing a nested class is equivalent to lifting that class definition out to the top-level and selectively hiding the name of the nested class.
Between different languages, there is a design chasm. In C++, there is only the unitary nested class and its associated semantics. Java distinguishes between static nested classes (similar to the C++ nested class) and inner classes. Java actually has a number of variants of nested classes (https://blogs.oracle.com/darcy/entry/nested_inner_member_and_top). The above C++ example translated into Java fails type checking because of A.B is private.
class A {
      private class B {

        public B add(B rhs) { val += rhs.val; return this; }
        private int val;

      }

      public B makeB() { return new B(); };
}

public class Nested0 {

   public static void main() {
     A a0 = new A();
     A a1 = new A();
     a0.makeB().add(a1.makeB());
   }
}

The call to add is flagged as a problem because a0.makeB()'s type (i.e., A.B) is inaccessible. If A.B were public or protected, however, the type checker is satisfied despite a0.B and a1.B supposedly denoting two distinct inner classes. Thus, at least at this point, inner classes do not provide abstract types in the sense of generative types.

On the other hand, as this discussion on Java decompilation (javap) shows, the semantics of inner classes are such that have a kind of C++

friend
relationship with their outer enclosing class. The discussion on TheServerSide shows how this is implemented: the compiler generates a static method that "breaks the encapsulation wall" to permit accessing private members of the outer class.

References

  1. Herb Sutter wrote an article on nested classes in C++.

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.