Friday, December 2, 2011


The past few days have seen a veritable firestorm of flames between the Java and Scala communities. The proximate cause was a leaked private email outlining what the Yammer folks disliked about Scala. An official post from Yammer followed, but not before many bloggers had a field day with the leaked email. What is interesting is that one such discussion on Stephen Colebourne's blog pointed out that the lack of a "module system" is a principal deficiency of Scala. This point, mind you, was not part of the Yammer email. In fact, it preceded the email by a few days. The post cited the Scala community's response to an inquiry about Scala and "module systems" [Google Groups]. That inquiry initially resulted in some misunderstanding. Some of the responders, probably coming from a higher-order typed background, understood the question as referring to an ML-like module system. That led to a reference to a journal version of Andreas Rossberg and Derek Dreyer's Mixin Modules paper [PDF]. James Iry helpfully pointed out on Colebourne's blog that "module system" is an overloaded term meaning different things to different communities. That said, it is interesting to follow what has been happening in both communities and to discover where the two notions overlap.

Outside of the higher-order typed community, "module system" refers to a linguistic mechanism, together with associated tools, for managing versions of code. The Racket community's notion of modules mostly follows this definition. Along with Jacob Matthews' PLaneT package distribution system, such a "module system" is intended to avoid so-called "DLL-hell" or, in the Java world, JAR-hell. Static typing can help but is not necessary for this form of modularity, which should be obvious from the abundance of package distribution systems in the dynamically typed world, such as Python eggs and Ruby gems. The difficulty lies in cascading versioning problems, the most basic of which is the diamond import problem: a client imports two "modules", each of which may import a different version of the same third module. Rok Strnisa has an OOPSLA paper [PDF] that offers a nice synopsis. In short, the linguistic part of the "module system" in this case explicitly enumerates import dependencies. Java's support for "component programming", another related concept, is mainly embodied in OSGi and Jigsaw.
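To make the diamond import problem concrete, here is a minimal OCaml sketch. The package names, version strings, and the `deps` table are all made up for illustration; real systems such as PLaneT or OSGi do considerably more (version ranges, resolution policies, and so on). The sketch only walks the explicitly enumerated imports and reports any name that is reachable in more than one version.

```ocaml
(* A package is identified by a name and a version string. *)
type pkg = { name : string; version : string }

(* Hypothetical dependency table: each package lists its direct imports.
   "parser" and "logger" disagree on which version of "utils" they need. *)
let deps : (pkg * pkg list) list =
  let p name version = { name; version } in
  [ (p "client" "1.0", [ p "parser" "2.0"; p "logger" "1.1" ]);
    (p "parser" "2.0", [ p "utils" "1.0" ]);
    (p "logger" "1.1", [ p "utils" "2.0" ]);
    (p "utils" "1.0", []);
    (p "utils" "2.0", []) ]

(* Collect every package reachable from [root], including [root] itself. *)
let reachable root =
  let rec go seen = function
    | [] -> seen
    | p :: rest when List.mem p seen -> go seen rest
    | p :: rest ->
      let direct = try List.assoc p deps with Not_found -> [] in
      go (p :: seen) (direct @ rest)
  in
  go [] [ root ]

(* Names reachable in more than one version, paired with those versions. *)
let conflicts root =
  let pkgs = reachable root in
  let names = List.sort_uniq compare (List.map (fun p -> p.name) pkgs) in
  List.filter_map
    (fun n ->
      let versions =
        pkgs
        |> List.filter (fun p -> p.name = n)
        |> List.map (fun p -> p.version)
        |> List.sort_uniq compare
      in
      if List.length versions > 1 then Some (n, versions) else None)
    names
```

With this table, `conflicts { name = "client"; version = "1.0" }` evaluates to `[("utils", ["1.0"; "2.0"])]`, which is exactly the diamond: the client never mentions "utils", yet it inherits two incompatible requirements for it.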

In the higher-order typed world (well, basically in the ML world), the module system primarily serves as a linguistic mechanism for programming-in-the-large (i.e., code organization and namespace management) and for enforcing abstractions to support type-safe generic containers (similar in purpose to the C++ template language). I would go as far as to say that type-safe generic programming ("parametric polymorphism" in ML and Haskell terminology), which enables the implementation of type-safe generic containers, is the primary reason for the ML module system. For example, the Standard ML Basis, similar in purpose to the Standard Template Library (STL), implements generic maps, sets, and much more. To better understand how the ML module system compares with other mechanisms for generics, consider Ron Garcia et al.'s study [PDF], where the authors set out to implement a graph library using support for generics in several languages including C++, ML, Haskell, and Java. Rossberg and Dreyer's journal paper points out that Scala's encoding of modules does not support opaque signature ascription, which is central to the "enforcement of abstractions" part of the deal. This is fine, I suppose, because Scala has an assortment of other means to achieve type abstraction. A while ago, Geoff Washburn, who worked on Scala, wrote a series of posts that demonstrated how the moral equivalent of ML modules, functors (parameterized modules), and maybe even opaque signature ascription can be approximated in Scala using structural types. One pet peeve of mine is that many people make assertions such as "the ML module system does not support recursive modules, higher-order functors, etc." Strictly speaking, this is true, but it overlooks the fact that the "ML module system" in question is the one in the Definition of Standard ML, which has not been updated since 1997.
The ML module system in practice, as embodied in implementations of Standard ML such as Standard ML of New Jersey and MLton, has since evolved beyond the semantics in the Definition. Higher-order functors are now more or less standard, though their precise meaning is not standardized. Recursion is trickier but can be done in certain cases (the subject of Dreyer's work).
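For readers who have not seen ML-style modules, the following is a minimal sketch of a functor and of opaque signature ascription. It is written in OCaml rather than Standard ML (the two module languages are close but not identical), and the names `ORDERED`, `SET`, `MakeSet`, `IntOrd`, and `IntSet` are my own, not taken from the Basis or any of the papers mentioned above.

```ocaml
(* What a generic set needs to know about its elements. *)
module type ORDERED = sig
  type t
  val compare : t -> t -> int
end

(* The interface clients see. [t] is abstract: its representation is hidden. *)
module type SET = sig
  type elt
  type t
  val empty : t
  val add : elt -> t -> t
  val mem : elt -> t -> bool
  val elements : t -> elt list
end

(* A functor: a module parameterized by another module. The ascription
   ": SET with type elt = O.t" is opaque, so clients cannot see that a
   set is really a sorted list. *)
module MakeSet (O : ORDERED) : SET with type elt = O.t = struct
  type elt = O.t
  type t = elt list (* invariant: sorted, no duplicates *)

  let empty = []

  let rec add x = function
    | [] -> [ x ]
    | y :: rest as s ->
      let c = O.compare x y in
      if c = 0 then s (* already present *)
      else if c < 0 then x :: s
      else y :: add x rest

  let mem x s = List.exists (fun y -> O.compare x y = 0) s
  let elements s = s
end

(* Instantiate the functor for integers. *)
module IntOrd = struct
  type t = int
  let compare (a : int) (b : int) = compare a b
end

module IntSet = MakeSet (IntOrd)

let sample = IntSet.(empty |> add 3 |> add 1 |> add 3)
```

Here `IntSet.elements sample` is `[1; 3]`. The abstraction is enforced by the type checker: an expression such as `List.length sample` is rejected outside `MakeSet`, because the ascription hides the equation `IntSet.t = int list`. That hiding is the "enforcement of abstractions" role described above, and it is what lets the implementation rely on its sortedness invariant.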

Notice that there is considerable overlap between the two notions of module system. Both can support hierarchical composition of namespaces. Both utilize type abstraction, but in the version management case it is a means to an end, whereas in the ML module system type abstraction is itself the goal. The ML module system can be, and has been, used as part of a system to enforce version management. In fact, some researchers have adapted ML modules to support (online) dynamic software updates.
