Tuesday, November 8, 2011

Buridan's Ass and the decline of Object Orientation


For those of us working as programmers today, one of the main questions is which languages are likely to retain their current traction going forward. Looking at the dominant mainstream languages as candidates, I think that even in the near to mid term the answer will be "none of the above".

This leaves us in somewhat of a dilemma. Karl Popper, in his classic essay The Poverty of Historicism, points out that "It is logically impossible to know the future course of history when that course depends in part on the future growth of scientific knowledge (which is unknowable in advance)", and further, even if we did have all the information in advance, we cannot even solve a three-body problem, let alone account for the myriad of variables present in the general computing ecosystem.

However, to stop ourselves turning into Buridan's ass, we must put aside Poincaré's and Popper's observations on the unknowable and non-deterministic future for the moment and take a view on this. We cannot become experts in all languages (and still have a life), so we must focus on a likely subset.

It is useful here to take a quick look at the current environment. Like many of you, I expect, my professional programming since the 90s has been split between C early on, then Java or similar VM-based languages plus client/server scripting languages, with occasional forays into less mainstream languages. In the wider world, Groovy, Scala and Clojure, which build on the platform independence of virtual machines, are gaining some following; there has also been a re-emergence of languages from the 90s such as Python and Ruby, and of that child of the 80s, Apple's creaky Objective-C, as Mac OS X and iOS have picked up substantial mindshare.

Which brings us to approximately now. While Popper's point is that something novel may well spring up that we could not anticipate, we can still make some guesses based on what is happening around us. There are at least two pygmy elephants in the room that are rapidly growing: on the hardware side, the move to multi-core computers, and in the application domain, the rise of always-connected, push-type applications serving tens of thousands if not millions of simultaneous users. These forces are already affecting current development projects, and they both contrive to push languages and platforms down the same path: horizontal scaling of processing and communications.

The current JVM-based languages, I think, are already starting to struggle in this environment, and this will only become more obvious with time. The JVM simply was not designed with this type of multi-core hardware in mind. The threading model is heavyweight and does not scale that well, the GC is not suited to very large heaps and large numbers of threads, and the Java language itself is tied to the concept of shared memory between threads, which makes parallelisation at any scale challenging. While companies like Azul have shown it is possible to make progress in the GC area, I cannot see how the other issues can easily be overcome without a lot of re-engineering, which is tricky in a 10+ year old runtime.

It also appears obvious that the newer JVM/CLR languages and actor frameworks, while offering some possibilities at the language level, are still inherently bounded by the behaviour of the VM itself.

If we accept this to be the current state of things, we have to look for answers outside the JVM/CLR platforms and also outside most of the existing mainstream compiled languages. C and C++, for example, have no cross-platform standard threading library, and the ones that exist, such as Pthreads, Win32 threads, OpenMP, and Boost, have slightly different semantics, making portability awkward. Python (even the Stackless kind) and Ruby both suffer from the Global Interpreter Lock bottleneck, which is pretty much a blocker for multi-core scaling.

Of the remaining candidates, the two most promising appear to be Erlang and Go (Golang). Without going into too much detail, both heavily emphasise message passing as a primary construct of the language, both are designed explicitly for upwards of thousands of lightweight processes, and both are heavily focused on the communication domain: Erlang by dint of its telecoms hardware background, and Go as a systems language designed with excellent communication protocol support in the core language.
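To make the message-passing style concrete, here is a minimal Go sketch (the worker function and channel names are my own, purely illustrative): work is handed to a pool of lightweight goroutines over channels, with no shared, locked state between them.

package main

import "fmt"

// worker doubles each job it receives and sends the result back.
// Communication happens only over channels; no memory is shared.
func worker(jobs <-chan int, results chan<- int) {
    for j := range jobs {
        results <- j * 2 // stand-in for real work
    }
}

func main() {
    jobs := make(chan int, 100)
    results := make(chan int, 100)

    // Spawning a thousand goroutines is cheap; each starts with a few KB of stack.
    for w := 0; w < 1000; w++ {
        go worker(jobs, results)
    }

    for j := 1; j <= 100; j++ {
        jobs <- j
    }
    close(jobs)

    for i := 0; i < 100; i++ {
        fmt.Println(<-results)
    }
}

The equivalent shape in Erlang would use spawn and send/receive; the point in both cases is that concurrency is expressed as communication between lightweight processes rather than as locks around shared state.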

Having used both of these languages, I find each has a few drawbacks. Erlang's syntax takes a bit of getting used to and, not wishing to offend the functional-only crowd, the purely functional approach will, I think, ultimately limit its appeal. The addition of built-in, non-functional distributed table storage and the in-memory Mnesia DB is a tacit concession that sometimes you need a bit more flexibility than pure functional programming can offer.

Golang is still very immature, and the first official release, Go 1, has not yet made it out of the blocks. The dropping of traditional OO concepts such as class hierarchies and inheritance is a big win in reducing code volume (even Alan Kay noted that messaging was the key takeaway from Smalltalk, not objects), the syntax is easy to pick up, and the idioms are very terse from a programmer's perspective. However, the language is compiled and does not offer some of the flexibility you can get from VM-based environments.
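As a rough sketch of what dropping the class hierarchy looks like in practice (the types below are invented for illustration), Go offers plain structs, implicitly satisfied interfaces, and reuse by embedding rather than by inheritance:

package main

import "fmt"

// Notifier is satisfied implicitly by any type with a Notify method;
// there is no "implements" declaration and no base class.
type Notifier interface {
    Notify(msg string)
}

// EmailClient is a plain struct with a method attached.
type EmailClient struct {
    Addr string
}

func (e EmailClient) Notify(msg string) {
    fmt.Printf("mail to %s: %s\n", e.Addr, msg)
}

// Server reuses EmailClient by embedding (composition), not subclassing.
type Server struct {
    EmailClient
    Name string
}

func main() {
    var n Notifier = Server{EmailClient{"ops@example.com"}, "web-1"}
    n.Notify("disk nearly full")
}

There is no "extends" and no "implements" anywhere: anything with a Notify method satisfies Notifier, which is a large part of why the code stays terse.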

On balance, Go for me has the edge, even with the roughness of things like its primitive GC and compilers (these can only get better with time).

Programmers who have grown up with Java and .NET should keep in mind that the dominance of object-oriented languages is probably going to become a thing of the past. Concurrency at scale is the key challenge of the next few years, and the languages best suited to it come from a different pedigree. For those of you stuck in that OO mindset, you might want to start delving into non-mainstream programming styles before you find yourself going the way of the BCPL programmer.





4 comments:

  1. I think your analysis would be spot on if it weren't for one simple fact: you assume that the best technologies win, and this is clearly not the case.

    Ruby and PHP are perhaps the two slowest programming languages ever invented. And yet, they dominate the web.

    Did Windows win the desktop wars because it was technically superior to Unix? No. And yet, it still won, and continues to dominate. Even where it's losing ground -- to OS X -- the reasons have little or nothing to do with the tech, but design.

    That said, using the right tool/technology/language/framework for the job certainly can be an advantage, and I personally hope to demonstrate the superiority and brilliant simplicity of Go by leveraging its legitimate, inherent advantages as best I can... to build a distributed computing platform... thing. Yeah.

    tl;dr -- I think you're wrong, but I hope you're right.

    ReplyDelete
  2. I think even the inferior technologies that win out tend to be just good enough to fulfil the niche requirements. However, in this case I am wondering whether many of the existing technologies will even meet the threshold for this type of ecosystem.

    However, one should not underestimate the amount of time a seemingly obsolete solution can stay around, especially given the sunk cost of investment by many organisations and developers.

    ReplyDelete
  3. While I agree that programming at scale will become more and more dominant, I completely fail to understand your point about the JVM being unable to support languages that focus on parallel programming. Practice shows otherwise, and with the advent of languages like Scala and Clojure, the JVM is becoming one of the best places to do concurrent stuff. What is so inherently broken in the JVM? Last time I checked, the GC had become fully concurrent and there is a whole bunch of features planned to support concurrency in Java 8.

    You point at Erlang and Go, but I doubt those two languages will have much influence in the "mainstream". Sure, Erlang had its five minutes, but with its fairly obscure (as you point out) syntax it will never gather a major following. Go could easily be the successor to C... had it appeared 20 years ago. Today people have either moved to managed environments or will stick with C (kernel programming etc.). Go will share the fate of its close relative, D.

    On the other hand you miss another contender - good ol' JavaScript. With its ubiquity, familiar C/Java-like syntax (at least on the surface) and powerful runtimes, we will see it more and more widely used, in a range of applications beyond the "Web". JS with its event-oriented approach is already a great tool for programming high-throughput distributed systems (e.g. node.js). With a good message-passing or actor library it could be a tool of choice for concurrent stuff.

    ReplyDelete
  4. OK, so a couple of things here:
    1) I did not say that the JVM was unable to cope with languages like Scala and Clojure, just that whatever the language approach (Lisp-like or otherwise), it is inherently bounded by the functioning of the JVM. So a change of language cannot really change the runtime behaviour, only the programmer's view of it, as the opcodes, for instance, must be the same in order for the JVM to be able to execute them.
    Additionally, any shared memory must be synchronised (or use volatile-type semantics) in order to have correct thread visibility and, in many cases, safe access.
    2) Even with things like lightweight reactor patterns, personal experimentation shows that for 200k connections the JVM takes around 1.2 GB of memory (split between native and heap) just to accept the connections, never mind do any work on them, and this assumes nothing else happens on the VM. Languages that allow closer-to-the-metal connection termination, or have very lightweight network handling (like Go and Erlang), do not have to devote anywhere near this much memory to terminate those sorts of numbers (see the sketch after this list).
    3) The GCs are not fully parallel, and even the parallel ones must bring all the threads to a safe halt in order to perform a full GC of the tenured space. With hundreds of threads this becomes a really big problem; hence Azul's work to introduce a pauseless collector.
    4) I admire your scepticism, but I think multi-core architectures are a sufficiently difficult problem that languages and runtimes designed for them will start to push out the older languages, which focused on single hardware threads and on throughput speed rather than parallel operations. This is an entirely different problem. Also, multithreading in C is hard, and multi-core makes it harder to scale out.
    5) JavaScript on the server, like Node.js, is inherently single-threaded, and the proposal for the future is forking of whole processes: http://stackoverflow.com/questions/2387724/node-js-on-multi-core-machines. I can't really see this working in practice, but who knows, it may become more suitable with time.
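
    To illustrate point 2, here is a minimal sketch of the goroutine-per-connection style I have in mind (the echo handler and port are just placeholders): each accepted connection costs only a goroutine with a small stack, not a heavyweight thread.

    package main

    import (
        "bufio"
        "log"
        "net"
    )

    // handle echoes lines back to the client; each connection runs in its
    // own cheap goroutine rather than an OS thread.
    func handle(c net.Conn) {
        defer c.Close()
        r := bufio.NewReader(c)
        for {
            line, err := r.ReadString('\n')
            if err != nil {
                return
            }
            if _, err := c.Write([]byte(line)); err != nil {
                return
            }
        }
    }

    func main() {
        ln, err := net.Listen("tcp", ":8080")
        if err != nil {
            log.Fatal(err)
        }
        for {
            conn, err := ln.Accept()
            if err != nil {
                log.Print(err)
                continue
            }
            go handle(conn) // one lightweight goroutine per connection
        }
    }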

    My point is that it is horizontal scalability in networking and data handling that is the key. For this you need as much parallelisation as possible, simultaneously, on all the cores you have. That is where the current languages and runtimes struggle, as they are just not built for it in my view. But, as Popper points out, you never know what is round the corner.

    ReplyDelete