Saturday, December 10, 2011

Premature optimisation is the root of all misquotes

One annoying phrase that crops up again and again is that well known Knuth quote: "premature optimisation is the root of all evil". It originally comes from Knuth's Structured Programming with go to Statements. Now, given the very large number of people who quote this axiom, I must ask: hands up, those who have actually read the original paper? Well, that is a poor show of hands.

The majority of times I encounter this it is almost always in either the context of:
  • "its running ok with our set of test users so lets not worry about revisiting any of the code to see if we can make improvements unless it breaks, as management have got a bucket of features to implement"
  • "we don't want to think about how we use data and the GC is so good it all sort of magically just works no matter how we write stuff (or so Brian Goetz assures us)"
  • or more usually "I have no idea what I am talking about, but I read this cool quote from some guy called Knuth that everyone goes on about on some tech blog, so it must be right".
So let us examine what Knuth is really talking about in this context. The paper is about the much maligned goto statement: whether certain classes of programmes can be written without it, and further, whether some programming solutions can be efficient without using it.

The first problem he presents in the paper to illustrate this is:
"Let's suppose that we want to search a table A[1] . . . A[m] of distinct values, in order to find where a given value x appears; if x is not present in the table, we want to insert it as an additional entry. Let's suppose further that there is another array B, where B[,] equals the number of times we have searched for the value A[i]."

His initial suggested solution using goto is:

for i := 1 step 1 until m do if A[i] = x then go to found fi;
not found: i := m+1; m := i; A[i] := x; B[i] := 0;
found: B[i] := B[i]+1;

The suggested solution without goto is:

A[m+1] := x; i := 1;
while A[i] != x do i := i+1;
if i > m then m := i; B[i] := 1;
else B[i] := B[i]+1 fi;
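
For readers who find the Algol-ish notation hard going, here is a rough modern rendering of the goto-free fragment. This is my own sketch in Go, not Knuth's code: it assumes 1-based indexing into slices, with index 0 unused and a spare slot at the end for the sentinel.

// Sketch only, not Knuth's code: the goto-free sentinel search in Go.
// a[1..m] holds the distinct values and b[i] counts searches for a[i];
// index 0 is unused to mirror Knuth's 1-based arrays, and a must have a
// spare slot at a[m+1] for the sentinel.
func searchOrInsert(a, b []int, m *int, x int) {
    a[*m+1] = x // plant x as a sentinel so the loop needs no bounds check
    i := 1
    for a[i] != x {
        i++
    }
    if i > *m { // we hit the sentinel: x was not in the table, so insert it
        *m = i
        b[i] = 1
    } else {
        b[i]++ // x was found at position i
    }
}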

So what, I hear you say; goto has been and gone. However, Knuth then goes on to discuss the performance aspects of this fragment.

"If we assume that programs are translated by a typical "90 % efficient compiler" with bounds-checking suppressed, the corresponding run-time figures are about 14n + 5 and l l n + 21. .... we save about 21%, so the elimination of the go to has also eliminated some of the running time." He expands this solution; introducing loop doubling, discusses whether values can be accessed directly in hardware registers, and covers eliding range checking for subscripts and redundant tests in inner loops. 

Hmm... how does this fit with our initial statement, given that Knuth seems to be talking about the relative performance of small elements like loop structures?

If we read the rest of the paper it should gradually dawn on us that Knuth is actually saying that goto is not always needed to improve efficiency (as was commonly held), and that in the cases where it is more efficient (as he outlines in later sections) it is best applied in critical areas, so as not to break up a more structured coding approach.

Additionally, a large part of the early optimisation discussion is based on the empirical assertion that most of the time in non-IO-bound programmes is spent in about 3% of the code, so why bother with the other 97% when the programme is not re-used? A question to ask yourself is: does my app fit this distribution, or is its time more evenly spread, or does it exhibit some Pareto-type distribution?

This paper is almost entirely rooted in the hot topic of the day: the readability of programmes without gotos and how structured programming should be adopted as an alternative. Note: structured programming in this sense is mostly about the use of for loops and while loops rather than interleaved goto statements, which is perhaps not quite what you were expecting.

This obviously does not have the same meaning today (apart from in languages that still have gotos), and yet it has become a sort of folklore idiom in modern development, quoted irrespective of the language in use or the optimisations being suggested.

Too often programmers use the short form of the statement as an excuse to do no testing or optimising, to avoid thinking about what algorithms, data structures and allocation patterns they are coding for, or to write things in a way that ignores many well understood optimisations, even in code that is known in advance to be executed a massively disproportionate number of times.

While code is often pumped out to fit the current need (and we are all guilty of that, especially on large projects), we should strive to make our initial code as efficient as possible (especially when we know it is going to be heavily used) and, where we cannot, to revisit it at the first opportune moment. The more inefficiencies we leave in the code, the slower and more memory hungry our programme tends to become. Starting out with this in mind usually means more upfront thought and, in my experience, less rather than more code. After all, the most efficient code is the lines that do not exist, and the best way to deal with memory garbage is not to create it in the first place.

Indeed, Knuth returns to the concept of efficiency later in the discussion:
"I'm saying that a similar thing should be considered standard practice for all but the simplest software programs: A programmer should create a program P which is readily understood and well-documented, and then he should optimize it into a program Q which is very efficient. Program Q may contain go to statements and other low-level features, but the transformation from P to Q should be accomplished by completely reliable and well- documented "mechanical" operations.
At this point many readers will say, "But he should only write P, and an optimizing compiler will produce Q." To this I say, "No, the optimizing compiler would have to be so complicated (much more so than anything we have now) that it will in fact be unreliable."

That really does not fit with the current usage of our initial quote. Knuth is actively advocating in this paper that optimisations should be continually applied to all but the most trivial programs. So how do we resolve this apparent contradiction of what is normally understood by the original statement?

If we examine the conclusion of the paper we can see that Knuth was writing in a different context from the one most programmers work in today:

"In our previous discussion we concluded that premature emphasis on efficiency is a big mistake which may well be the source of most programming complexity and grief. We should ordinarily keep efficiency considerations in the background when we formulate our programs. We need to be sub- consciously aware of the data processing tools available to us, but we should strive most of all for a program that is easy to understand and almost sure to work. (Most programs are probably only run once; and I suppose in such cases we needn't be too fussy about even the structure, much less the efficiency, as long as we are happy with the answers)"

Further than that, the entire paper is about the use of goto in a structured programming environment and whether any efficiencies from it are worth the trade-off in structure and readability it introduces. So if we fast forward from 1974 to now, the next time someone utters this idiom it might be worth considering whether they are right and you really do need to re-evaluate your use of goto, or, more likely, it is a case of:

goto no_idea_what_they_are_talking_about;

Taking the paper as a whole, Knuth seems to suggest that optimisation should be done on programs that are run more than once, that optimising a program should be the next logical step after debugging, and that we should aim for as much efficiency as we can achieve.

A better quote that sums up the paper is "Runtime analysis and optimisation should be standard practice for all but the simplest programmes (and you don't often need to use goto)".







Tuesday, November 8, 2011

Buridan's Ass and the decline of Object Orientation


For those of us working as programmers today, one of the main questions is which languages are likely to retain their current traction going forwards? Looking at the dominant mainstream languages as candidates, I think that even in the near-to-mid term the answer will be "none of the above".

This leaves us in somewhat of a dilemma; Karl Popper in his classic essay The Poverty of Historicism points out that "It is logically impossible to know the future course of history when that course depends in part on the future growth of scientific knowledge (which is unknowable in advance)", and further, even if we did have all the information in advance we cannot even solve a three-body problem, let alone try and account for the myriad of variables present in the general computing ecosystem.

However, to stop us turning into Buridan's ass we must put aside Poincaré and Popper's observations on the unknowable and non-deterministic future for the moment and take a view on this. We cannot become experts in all languages (and still have a life), and so must focus on a likely subset.

It is useful here to take a quick look at the current environment. Like many of you, I expect, my professional programming since the 90s has moved from C early on to Java and similar VM-based languages plus client/server scripting languages, with occasional forays into less mainstream languages. In the wider world, Groovy, Scala and Clojure, which build on the independence of virtual machines, are gaining some following, and there has also been a re-emergence of languages from the 90s such as Python and Ruby, and of that child of the 80s, Apple's creaky Objective-C, as Mac OS and iOS have picked up substantial mindshare.

Which brings us to approximately now. While Popper's point is that something novel may well spring up that we could not anticipate, we can still make some guesses based on what is happening around us. There are at least two pygmy elephants in the room that are rapidly growing: on the hardware side, the move to multi-core computers, and in the application domain, the rise of always-connected push-type applications for tens of thousands if not millions of simultaneous users. These forces are already affecting current development projects and both contrive to push languages and platforms down the same path, which is horizontal scaling of processing and communications.

The current JVM-based languages, I think, are already starting to struggle in this environment and this will only become more obvious with time. The JVM just was not designed with this type of multi-core hardware in mind. The threading model is heavyweight and does not scale that well, the GC is not suited to very large heaps and large numbers of threads, and the Java language itself is tied to the concept of shared memory between threads, which makes parallelisation at any scale challenging. While companies like Azul have shown it is possible to make progress in the GC area, I cannot see how the other issues can easily be overcome without a lot of re-engineering, which is tricky in a 10+ year old runtime.

It also appears obvious that the newer JVM/CLR-languages and actor frameworks, while offering some possibilities at the language level, are still inherently bounded by the behaviour of the VM itself. 

If we accept this to be the current state of things, we have to look for answers outside the JVM/CLR platforms and also outside most of the existing mainstream compiled languages. C++ and C, for example, have no cross-platform standard threading library, and the ones that exist, such as Pthreads, Win32 threads, OpenMP and Boost, have slightly different semantics, making portability awkward. Python (even the stackless kind) and Ruby both suffer from the Global Interpreter Lock bottleneck, which is pretty much a blocker for multi-core scaling.

Of the remaining candidates, the two most promising appear to be Erlang and Go (Golang). Without going into too much detail, both heavily emphasise message passing as a primary construct of the language, both are designed explicitly for upwards of thousands of lightweight processes, and both are heavily focussed on the communication domain: Erlang by dint of its telecoms hardware background, and Go as a systems language designed with excellent communication protocol support in the core language.
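
As a flavour of what that message-passing style looks like in practice, here is a tiny Go sketch (the names are mine and purely illustrative, not taken from either language's standard examples): each request carries its own reply channel, and thousands of these cheap goroutines can be run without sharing mutable state between them.

// Purely illustrative sketch of message passing in Go; names are my own.
package main

import "fmt"

type request struct {
    payload string
    reply   chan string // each request carries its own reply channel
}

// worker services requests by communicating, not by sharing memory.
func worker(in <-chan request) {
    for req := range in {
        req.reply <- "handled: " + req.payload
    }
}

func main() {
    in := make(chan request)
    go worker(in) // goroutines are cheap enough to spawn in the thousands

    reply := make(chan string)
    in <- request{payload: "quote", reply: reply}
    fmt.Println(<-reply)
}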

Having used both of these languages, each has a few drawbacks. Erlang's syntax takes a bit of getting used to and, not wishing to offend the functional-only crowd, the purely functional approach will, I think, ultimately limit its appeal. The addition of built-in non-functional distributed table storage and the in-memory Mnesia DB is a tacit concession that sometimes you need a bit more flexibility than pure functional programming can offer.

Golang is still very immature and the first official release, Go 1, has not yet made it out of the blocks. The dropping of OO concepts is a big win in reducing code volume (even Alan Kay noted that messaging was the key takeaway from Smalltalk, not objects), the syntax is easy to pick up and the idioms are very terse from a programmer's perspective. However, the language is compiled and does not offer some of the flexibility you can get from VM-based environments.

On balance Go for me has the edge, even with the coarseness of things like its primitive GC and compiler (as these can only get better with time). 

Programmers who have grown up with Java and .Net should keep in mind that the dominance of object-oriented languages is probably going to become a thing of the past. Concurrency at scale is the key challenge of the next few years, and the languages best suited to it are from a different pedigree. For those of you stuck in that OO mindset, you might want to start delving into non-mainstream programming styles before you find yourself going the way of the BCPL programmer.





Thursday, November 3, 2011

Heretics, Development and the One True Authority.

At the risk of playing up to my own golden age bias, there is a worrying trend among modern developers to be card-carrying members of a particular world view, based on a "One True Authority" that provides a simple and easy answer to all their problems.

It is not difficult to spot the members of the currently popular sects. Every conversation about a development or design issue is a lesson in scripture from the holy text, and no actual detailed thought has to be given to the problem as the good book gives us an easy take-away answer. Among the obvious are the uptight Fowlerites clutching their holy domain models, the process-obsessed Agileans and their inquisition offshoots, the Church of the Latter-day Requirement Gatherers, and the GOFian and Patternism branches with their platonic one-size-fits-all models undistorted by the cave wall of reality.

I am not saying that the many books written in these areas have no useful lessons to give us; rather, my point is that there is a trend towards a single-world-view "Argument From Authority" bias among a lot of developers I interact with, whose views are dominated by these types of computer writings. Argument from authority is a variation of the induction round-tripping problem that leads to "The Authority" being accepted as a source of ex-cathedra knowledge. Once this is your starting point, all such pronouncements must be followed in some semi-automaton manner and become the base from which all your opinions flow without critical evaluation. Now, I think if you talked to the guys who actually wrote the books, many would be the first to assert that this is not how they intended them to be interpreted. However, like all reasonable ideas that are not empirically grounded (and some that are), they are rapidly dogmatised by the faithful.

It is as if the actual business of writing programs that are elegant, efficient, performant and based on empirically demonstrable algorithms and data structures is no longer of importance in the higher level world of the true-believers.

The interesting commonality of these types of books and their die-hard followers is that they are only passingly interested in the nuts and bolts of running programs, focusing instead on the process surrounding programming, metrics of the project, or vague over-arching non-falsifiable approaches that purport to smooth away the ugly realities of real-world issues. At the risk of over-generalising, in my experience many programs generated from these divine principles end up in practice bloated, over-complex, poorly designed in respect of their runtime behaviour, and survive mostly by dint of throwing memory and hardware at them. Such ideology cannot make your programs better if you neglect the runtime algorithmic and data structure fundamentals of lower level programming approaches, no matter how much your belief reassures you that these are mere details, or how closely you stick to the divine path.

A respite from this modern zealotry can paradoxically be found in earlier readings that revolve around the specific programming literature characterised in works like Kernighan and Ritchie's The C Programming Language, Jon Bentley's Programming Pearls or Knuth's epic The Art of Computer Programming.  Many of these types of books and papers tended to address specific techniques that can be empirically derived and tested and provide a useful growing toolkit that you can dip in and out of.

Observationally (and I am generalising again), many developers who are rooted in this sort of background tend to be more open to taking bits and pieces of approaches and techniques as appropriate to solve problems, and do not demand that anyone diverging from this view should suffer the fate of all heretics, and, to quote the Life of Brian, "Even... and I want to make this absolutely clear... even if they do say, "Jehovah"."


Sunday, October 30, 2011

Architecture and the Principle of Least Authority

There is one particular model of system that might look great on the director's PowerPoint slide as a couple of boxes joined together with arrows, but in many cases becomes a death march weighed down by a huge codebase and management process akin to Soviet-era planning, just without the vodka.  And that is the large scale integration project.

These integration style projects are the ones that I seem to have worked on more than most in my investment banking career and usually consist of a common middle-tier architecture platform which integrates existing business specific services to provide a supposed new and improved experience for the end users.

Lots of books and articles are written about large scale issues in development and the organisational and development anti-patterns that lead to such dysfunctional outcomes. However, I think many of these are really descriptions of the effects of a much more widespread narrative cognitive bias that affects many walks of life (most notably politics and economics). This is most commonly manifested in the fallacy of central planning.

Central planning is a siren song for humans: the assumption that we can aggregate enough information and knowledge to overcome complexity and failure through upfront planning and control by a group of superhumanly knowledgeable and experienced individuals (usually ourselves), if only we make the plans detailed enough. From such good intentions spring totalitarianism and fragile systems.

This simple bias is in most cases shared by the developers and the management team, and often unconsciously defines the starting assumptions we take forward on the project. The most common is that the framework needs to understand and exert control over the data that passes between the two parties you are ultimately writing the system for: the end users and the business systems.

In essence you are setting yourself up as the controller and mediator of all such interactions in the system in the same way central government planners cannot stop themselves from trying to assert more and more control in all societal transactions.

While appearing sensible and logical, the effects of this initial mindset start to spread their tentacles into framework concepts such as (and this list is by no means comprehensive):
  • large common data models - which have to account for every team's slightly different requirements and rapidly become a bottleneck, both in development velocity (as the model becomes larger, change becomes more difficult) and in complexity, as it becomes more fragile to the unintended consequences of changes.
  • a common transport format that must be used by all teams (such as XML, JSON, or Protocol Buffers etc), so the backend and client data must be transformed at many stages into a "better" format as it passes through the system. This leads to growing binding and transformation layers which produce inefficient process bottlenecks at the data boundaries through the garbage generated and the impedance mismatch of data.
  • granular access rules based on knowledge of the data structure that attempt to unify multiple downstream permissioning models. This spawns a monster that is usually some variant of runtime introspection and rule engine/static code rule overhead for every method invocation (even if hidden in a 3rd party security framework).
  • re-implementation of business rules in the system that exist downstream as you find you are pulling more and more of the locus of control of the data up into the common layers. These have to be kept in sync as those systems are seldom static and usually are a big tangle of rules as they have evolved organically over a period of time. This also results in the framework having an at best linear order-n problem of overlapping rules as more systems are integrated.
  • caching a growing set of our synthetic data locally, as it takes ages to get the data, transform, enrich with common structures and apply permissions etc. So we now have an expanding cache (in many cases that must be distributed) that has to somehow not get out of sync with the backend master system or refresh itself regularly. In many cases this leads to backend data being replicated locally into DBs from performance pressures, which in turn now suffer from data mastering issues and schema fragility on downstream change.
From a management perspective the growing complexity and scale brings along with it the usual bedfellows of central planning:
  • A larger and larger amount of effort must be put into tracking and planning all the interdependencies and infrastructure, and new "processes" and document requirements start springing up that attempt to contain the growing complexity but in reality are a huge and mostly pointless exercise in hindsight bias.
  • Planning starts to take on heroic proportions as repeated 3-month and 6-month plans are developed in detail for the "platform features" by a team of overworked PMs, only to be thrown away every 3 weeks as real forces and (to the development team) random business direction changes make any detailed planning irrelevant beyond the near term.
Stepping back from the practical effects of this mindset for a moment the question is what can we do about it?

While we may like to think that system design is so new and different, we are really talking about individuals and their biases and how structures arise in organisations. This is nothing new and has been a subject for philosophical writers from John Locke onwards.

However, we shall limit ourselves to one particular aspect of this from which we can derive some practical value, namely how we can inhibit the unconscious tendency towards central over-planning and the fragility it entails, but first a bit of background on the idea. (I shall examine the logical fallacy of forecasting the future disguised as project planning in a later article).

Locke in his second treatise on government addressed the idea of the "principle of least authority": essentially, that we should give up only as much of our individual rights as is needed to make it possible for a society to preserve those rights. "The right to do whatever one thought fit to preserve oneself is given up to be regulated by society so far forth as the preservation of himself and others shall require. When any rights are given up, it is only with an intention in every one to better preserve himself, his liberty, and his property."

In computing, the principle of least authority is often used in security as the model for isolating processes to only the data and resources they need to perform a particular task. Apologies to the security guys for overloading this meaning, but this is not the sense in which I am using it.

Instead, I am using it as a simple principle we can use to try and inhibit our natural planning and centralisation biases when designing common framework systems, and try and limit how much unintended direction and control we exert over other actors in the system while providing them with enough commonly useful runtime artefacts (and in the process greatly simplifying the scope and complexity of our own code).  

In effect, try and imagine that you are in the other teams using your framework, and at each stage ask yourself what is minimal and reasonable for them to give up in order to operate together in the minimally restrictive society that is the overall project, while still taking advantage of commonality.

For most middle-tier frameworks the minimal set of functionality is termination of client connections, routing data efficiently to and from the client and performing some minimal authentication. In many ways this sounds a lot like a bare-bones messaging server in concept!

Some computing idioms implicitly practice this approach, and it is especially prevalent in areas like communication protocols and messaging systems, where it is understood that the purpose of the system is to facilitate other processes while imposing only enough restriction to enable systems to interact freely upon it. See how similar this is to Locke's ideas: by giving up just enough of our computing freedom we ensure that we can communicate easily with another entity, while at the same time enabling almost any data or higher level behaviour to be modelled between the parties.

Many projects do not even consider at this level of abstraction just what they are trying to achieve and whether the common framework should philosophically demand total obedience of all participants at a micro level and centralise its behaviour (usually the default starting assumption), or provide a minimal platform on which individual participants can decide to build as simple or as complex behaviours as is appropriate. Often this only reflects the individual developer and management members' worldview and is not an introspection people readily undertake.

Even many projects that start out with a messaging-like paradigm often find themselves re-implementing downstream functionality as a result of centralisation forces pulling them in that direction like an undertow.

All that said, what about the practical outcome of this? After all, it is all about building a running system, irrespective of the philosophical approach used to get there. I shall take a real example of a system (although for simplicity and business reasons I cannot go into too much detail) on which I had the fortunate opportunity to design a significant portion of the framework and put these ideas into practice, and describe how the approach worked out in the real world.

Project A is an integration project that ultimately will integrate functionality from 20-30 teams (at peak probably > 300 developers & PMs) in the organisation and accordingly result in 20-30 separate development streams. Teams from each area will be tasked with providing specific functionality built on a common framework and the framework developers themselves will be tasked with providing a common middle-tier infrastructure for each team to leverage. From a standing start the project will have at least a handful of business teams working in parallel, and the others will be brought on quickly.

Keeping in mind how centralisation creeps up on us, an early decision was made to impose as little restriction as possible on the other teams, using our simple principle as a guide for the framework design. This resulted in the following design practices:
  • All data passed between the client and the backend systems would be assumed to be as opaque to the framework as possible (a toy sketch of this pass-through idea follows this list).
  • As little as possible (and ideally no) transformation should be needed in the framework paths. A major aim was that data only needed to be understood at the point when a process needed to act upon it.
  • No transformation of format should be mandated as data passed between layers.
  • An asynchronous and synchronous path would be provided between the clients and the downstream systems, but this would be modelled with no business interfaces at all and offer no further abstraction than a single un-typed Object parameter. 
  • Simple client termination handling would be offered, with some minimal abstraction from the connection lifecycle boilerplate.
  • The client modules that deal with a particular team are the only part of the system that needs to understand a particular data item in any detail. Effectively making a contract between the client module and the downstream system, but no other process cares about this data. As they tended to be written by the same team this means that the dev teams had a contract with themselves that was outside the framework control.
  • Pluggable communication protocols would be used as appropriate for each downstream system and would be flexible enough to deal with messaging, bespoke point-to-point, multicast, REST and HTTP and whatever else came up. 
  • Some standards would be encouraged but not mandated for data types and communication, mainly to give teams a simple starting point from which to evolve their thinking.
  • Runtime instances would be specific for each team, but minimally managed by the framework so no team could introduce side-effects or problems for other teams.
  • Authentication and authorisation at any granular level was managed by existing downstream auth systems and entities.
  • A very strong push back against duplicating downstream data.
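
As promised above, here is a toy sketch of the opaque pass-through idea. Every name in it is hypothetical rather than the project's real API; the point is simply that the framework routes raw bytes between client and backend, performs only coarse channel-level authorisation, and never deserialises the payload.

// Toy sketch only; all names are hypothetical, not the real project's API.
package framework

import "errors"

// ErrDenied is returned when coarse channel-level authorisation fails.
var ErrDenied = errors.New("access denied")

// Message is all the framework ever sees: a routing key and opaque bytes.
type Message struct {
    Channel string // coarse routing/permissioning key
    Payload []byte // opaque to the framework; only edge modules parse it
}

// Backend is whatever pluggable transport a team registers for its channel
// (messaging, bespoke point-to-point, REST, and so on).
type Backend interface {
    Send(msg Message) (Message, error)
}

// Router performs no transformation or enrichment: authorise coarsely,
// then pass the payload straight through to the owning backend.
type Router struct {
    backends  map[string]Backend
    canAccess func(user, channel string) bool // "may this user touch this channel?"
}

func (r *Router) Route(user string, msg Message) (Message, error) {
    if !r.canAccess(user, msg.Channel) {
        return Message{}, ErrDenied
    }
    be, ok := r.backends[msg.Channel]
    if !ok {
        return Message{}, errors.New("unknown channel")
    }
    return be.Send(msg)
}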
While this sounds like a recipe for a free-for-all and goes against the grain of many of the enterprise design books you have read, in practice things turned out a bit differently. Let's examine the actual outcome in the light of our hypothesis and statements:

Positives:
  • The framework has not (yet) suffered from massive complexity and bloat. Most features that are accepted centrally are driven bottom up by more than one team needing the same abstract function. Even so we are parsimonious in accepting these as some have turned out to be transient.
  • Re-use is not as prevalent as one would think given the copious literature.
  • All teams were able to get up and running pretty quickly and develop in parallel their own specific functions without too much assistance.
  • Pass-through is a lot more useful than it first appears. Many teams can deliver most of the functionality they require with a mostly pass-through framework.
  • Security introspection of data can be performed surprisingly well under this model with little direct overhead (especially if you do not transform or deserialise the data as a side effect).
  • We did not need to scale the infrastructure team to tens of people, as one commonly finds on such projects, because the code base stayed relatively small.
  • Individual teams do not need to care about a common model and can create abstractions that suit themselves. 
  • It is far easier to perform optimisations when the framework is very bounded in its behaviour.
  • Business interfaces in the middle tier surprisingly did not arise spontaneously in each team's code base; in fact, they hardly arose at all. Most of the business specific functions have stayed down in the backend systems where they belong and have not been dragged up.
  • Most teams got along easily when they realised that they do not need to follow the usual abstraction/transformation/aggregation activities that normally characterise the middle tier.
  • Middle-tier authentication is very coarse (e.g. can user X connect to a particular channel of data?).
  • We have so far not had a compelling need for the over-arching big giant cache pattern or much local db replication apart from a couple of business specific cases.
Negatives:

  • Teams are sometimes unwilling to accept more ownership and responsibility.
  • The top-down process burden has not reduced as much as one would expect. It seems planning is too much of a security blanket for some people to let go of just yet despite all the evidence to the contrary.
  • Deployment and testing have more complexity as each team's application has a certain number of unique behaviours.
  • Network design is more complicated.
  • No matter how simple the framework there are always unexpected uses it gets put to.
  • "Not many restrictions" is sometimes mistaken for "no restrictions".
  • In true whack-a-mole style, some irreducible complexity ends up being inside the framework to keep the interfaces simple.

I am not saying that this is a silver bullet for all your design needs, nor am I saying that a dogmatic insistence on no planning is always right. Rather, we can plan to encourage self-direction in development teams by being extremely considered in the freedoms we take away from other participants. Encouraging bottom-up feature origination can, in practice, result in quite a few positive benefits. As in life, the areas that become successful are often the ones that were not even on the radar at the planning stage.

So what can we take from this experiment? One is that subtle starting assumptions can have very large effects, a second is that coherent systems can be built upwards without detailed planning (as in the real world), and a third is that many things we are conditioned to think as being required may not in practice be all that important.

Such a small alteration in your thinking can produce very large outcome differences if you can challenge your own biases and act to limit your inner central planner. And as a bonus you might find that you are freed just that little bit from the gulag experience that characterises many large scale projects.

If you find yourself reading this and thinking that this is a quick and dirty approach, and that once it is working we could just define a new all-seeing plan to refactor and centralise the functions to make them cleaner and more sensible, I invite you to start again at the top of the article.