Wednesday, November 16, 2011

Programming Language Gotchas

Looks like I spoke too soon. The October rally was looking a little long in the tooth, and a little warning from the credit agencies was enough to get the market to go risk-off, at least for this afternoon. At least yields did not decline too much (2.01% for the 10-year according to MarketWatch). The Dollar Index (DX), however, has certainly been in an uptrend since the end-of-October trough.

Coming from the Standard ML world of languages makes me take many things for granted. It also probably gives me a rather narrow perspective. Hacking on my trading analysis tools in other languages helps me broaden that somewhat, though I certainly worked in the mainstream languages long before I encountered ML and its brethren. I would like to keep a record of these gotchas as I encounter them. I do not intend these as value judgments but rather as documentation of noteworthy differences from the perspective of a higher-order typed programming language.

Python

  • Global Interpreter Lock (GIL): Guido wants to encourage lock-free concurrency, which is good. One way of encouraging this in Python is to penalize threading and favor multiprocessing. Thread-unsafe memory management is the main architectural reason for the GIL, but the claim is that other code has come to rely on it, especially C extensions and libraries. There is some rumbling in the Unladen Swallow community, and perhaps in some future iteration of CPython, about changing this, but nothing concrete to my knowledge. The real problem here is the need for some alternative high-level lock-free model of concurrency. Valued Lessons has an implementation of actors in Python, but it uses Python threads. In one sense, this isn't an issue unique to Python. Any implementation of a programming language that relies on a significant enough runtime system may opt to keep a global lock on it. This is the case with OCaml. The difference is that other languages are adding more fine-grained concurrency facilities to work around the issue.
  • Lambdas in Python aren't as flexible as blocks are in Ruby. The body of a lambda is limited to a single expression, so it cannot contain statements. This would have been fine in other languages, except that Python does not have a single-line local binding syntax (such as let in ML, Scheme, and Haskell). Furthermore, lambdas cannot be pickled and thus cannot be used with multiprocessing.Pool. The workaround is to lift the lambda to the top level as a named function or to wrap things in a callable object (a class that defines __call__). Since Python favors OO at some level, this is not too surprising. The official word is that complex functionality in lambdas is discouraged and that such functionality should be in a named function.
  • Python uses name mangling to approximate private members: an attribute whose name starts with two underscores is renamed to include the class name. The community feels pretty strongly about discouraging information hiding and encapsulation (the quip goes "we are all consenting adults here"). Name mangling is reminiscent of the current state of the art in session key management, except mangled names typically do not expire unless program termination is considered expiration. This obviously departs considerably from the higher-order typed world view, where all our fancy type abstractions are intended to provide forms of information hiding.
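  • Python had issues with lexical scoping until version 3: a nested function could read a variable from its enclosing function but could not rebind it. Version 3 addresses this with its new nonlocal keyword.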
  • One thing in Python that really sets it apart is the excellent suite of numerical computing libraries and environments, most of which would not be possible without Python's strong C foreign function interface. The core of this is NumPy. It is apparently a rising celebrity in the scientific computing community.
  • lxml is a high-performance XML/HTML parser. It really works quite well. I have been switching between lxml and nokogiri in Ruby. At one point, nokogiri's CSS selector implementation seemed to be more complete. I am not sure this remains the case, but these two are certainly very competitive in terms of performance. Since they both link against the native libxml parser, this should not be too surprising.
  • Python has list comprehension syntax as a convenience.
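
A few of the Python gotchas above can be seen in a short sketch. The names Account, make_counter, and lambda_is_picklable are hypothetical and exist only for illustration; the sketch assumes Python 3 for the nonlocal keyword.

```python
import pickle


class Account:
    """Hypothetical class used only to illustrate name mangling."""

    def __init__(self, balance):
        # Two leading underscores inside a class body trigger mangling:
        # the attribute is actually stored as _Account__balance.
        self.__balance = balance

    def balance(self):
        return self.__balance


def make_counter():
    """Return a closure that rebinds a variable in its enclosing scope."""
    count = 0

    def increment():
        # Python 3 only. Without this declaration, `count += 1` would
        # raise UnboundLocalError because count would be treated as local.
        nonlocal count
        count += 1
        return count

    return increment


def lambda_is_picklable():
    """Return True if a lambda survives pickling (it is not expected to)."""
    try:
        pickle.dumps(lambda x: x * x)
    except (pickle.PicklingError, AttributeError):
        # The exact exception type depends on the Python version.
        return False
    return True


# List comprehension: build a list from an expression in one line.
squares = [x * x for x in range(5)]
```

Here Account(100)._Account__balance is still reachable from outside the class, which shows that mangling renames the attribute rather than hiding it. Each call to the function returned by make_counter() increments the shared count, and lambda_is_picklable() is expected to report that the lambda cannot be pickled, which is why a named top-level function is needed with multiprocessing.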

Ruby

  • Similar to Python, Ruby also has a Global Interpreter Lock (at least CRuby does). There is an interesting discussion here. Certain variants of Ruby have done away with the GIL.
  • Rails is certainly Ruby's current killer app. The combination of ActiveRecord and Rails routing is everywhere on the web.
  • Blocks are first class in Ruby. They can be passed around, applied, mapped, and everything.
  • Ruby has no list comprehension syntax, but using map or each with blocks achieves similar convenience.
  • In Ruby, global ($), class static (@@), instance (@), and local scoped variables must be syntactically distinguished. This eliminates some of the opportunities for shadowing among other things.

Scala

Scala's recent rise to prominence is due to a few proximate causes. From the early days as Martin Odersky's experiment in language design, fusing functional and object-oriented programming mostly from an object-oriented foundation, Scala has quickly gained significant mindshare through Twitter and Foursquare's support.
