Thursday, January 13, 2011

Why are we building so many software technologies that will eventually fail?


Modular systems share some key features to achieve the characteristics shown above. They are built around a message-passing architecture for communication between components, they are highly concurrent, and they often provide the ability to do late binding of modules in various ways. These features encapsulate the components of modular systems very well and, when done right, also help minimize the dependencies between modules. Furthermore, existing dependencies are specified using protocols, and sometimes the message flow is described by state machines as well [1]. This makes such systems very comprehensible and maintainable.
Modular systems are also reliable and fault tolerant by design. If a component fails, only one specific piece of software with known dependencies crashes. It does not bring the whole system to a grinding halt. Other components that depend on the faulty component are informed of the failure, and the faulty component may be restarted. After that, the other components can recover from the fault and carry on.
Thus one can really say that the whole is more than the sum of the parts.
Some good examples of modular systems are Erlang, the QNX Neutrino RTOS, Microsoft's Singularity research project, Google's Android, modern web browsers (like Google's Chrome), the popular internet backends such as Google's search engine and Facebook's social network, and of course the internet as a whole.
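
To make the fault handling described above a bit more concrete, here is a minimal sketch in Python. It is not how Erlang or QNX implement supervision; the names worker, supervise and run_demo are made up for this illustration, and for simplicity the supervisor runs the worker on its own thread instead of in a separate process. The worker owns no shared state and only talks through message queues. When it crashes on a bad message, the supervisor reports the crash as a message and restarts it, and the remaining messages are still processed.

```python
import queue
import threading


def worker(inbox, outbox):
    """A component without shared state: it only reads and writes messages."""
    while True:
        msg = inbox.get()
        if msg == "stop":
            return
        # A malformed message makes this component crash with a ValueError.
        outbox.put(("result", int(msg) * 2))


def supervise(inbox, outbox, events):
    """Run the worker, report every crash and restart it until it stops cleanly."""
    while True:
        try:
            worker(inbox, outbox)
            return
        except Exception as exc:
            # Dependent components learn about the fault through a message.
            events.put(("crashed", type(exc).__name__))


def run_demo():
    inbox, outbox, events = queue.Queue(), queue.Queue(), queue.Queue()
    supervisor = threading.Thread(target=supervise, args=(inbox, outbox, events))
    supervisor.start()
    for msg in ["1", "oops", "3", "stop"]:
        inbox.put(msg)
    supervisor.join()
    # The supervisor thread has finished, so the queue sizes are stable now.
    results = [outbox.get_nowait() for _ in range(outbox.qsize())]
    crashes = [events.get_nowait() for _ in range(events.qsize())]
    return results, crashes


if __name__ == "__main__":
    print(run_demo())
```

The message "oops" kills the worker, but the message "3" that was already waiting in the inbox is still handled after the restart. Nothing outside the worker had to know about its internals to survive the fault.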

Monolithic systems do not rely on a message-passing architecture at their core. They are made up of coarse-grained building blocks that have much more shared state than the fine-grained components of modular systems. Hence, monolithic systems do a great deal of work (on the same machine) in the same address space, in the same thread context, using the same call stack, and therefore have much less concurrency built into them. Synchronizing the threads that cross the building blocks of a monolithic system requires very complex locking patterns.
Monolithic systems can also encapsulate the details of their building blocks quite well when done right. The management of dependencies can be quite good too, but the monolithic architecture often tempts people into hacks. Another problem with dependencies in monolithic systems is that it is often unclear which threads may cross the border from one component to another (it is not just a message that crosses the border, as in modular systems). For these reasons you often need much more knowledge of the building blocks that interact with each other in order to understand and maintain them (e.g. to keep them thread safe as they evolve). A building block or component cannot be treated as a 'black box' as in modular systems. This makes such systems much harder to maintain.
Some good examples of monolithic systems are Microsoft Windows, traditional Linux systems, the Eclipse IDE and a lot of proprietary frameworks that power phones, TVs, cars, medical devices, etc.
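
The thread-crossing problem can be sketched in a few lines of Python as well. This is only an illustration under my own assumptions: the classes Inventory and Orders and the function run_demo are hypothetical and not taken from any of the systems named above. Client threads enter Orders and, on the same call stack, walk straight into Inventory. Every piece of shared state needs its own lock, and whoever maintains either class has to know which foreign threads may arrive.

```python
import threading


class Inventory:
    """Building block whose state is touched directly by foreign threads."""

    def __init__(self, stock):
        self.stock = stock
        # Every method that touches self.stock must remember to take this lock.
        self.lock = threading.Lock()

    def reserve(self, amount):
        # Runs on the caller's thread and call stack, not on a thread of its own.
        with self.lock:
            if self.stock >= amount:
                self.stock -= amount
                return True
            return False


class Orders:
    """Second building block; its callers cross straight into Inventory."""

    def __init__(self, inventory):
        self.inventory = inventory
        self.accepted = 0
        self.lock = threading.Lock()

    def place(self, amount):
        if self.inventory.reserve(amount):
            with self.lock:
                self.accepted += 1


def run_demo(threads=8, orders_per_thread=100, stock=500):
    inventory = Inventory(stock)
    orders = Orders(inventory)

    def client():
        for _ in range(orders_per_thread):
            orders.place(1)

    pool = [threading.Thread(target=client) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return inventory.stock, orders.accepted


if __name__ == "__main__":
    print(run_demo())
```

With the locks in place the result is consistent, but remove either lock and correctness depends on how the threads happen to interleave. In a message-passing design the same state would be owned by exactly one component and changed only in response to messages.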

So why are we building so many software technologies that will eventually fail? Well, modular systems had their drawbacks too, and one big drawback in particular: performance. It was simply not affordable to build highly modular systems with the hardware resources and software tools of earlier times. In the 1960s it started with monolithic operating systems like Multics, but there was also LISP, which showed that we can do better.
Later, in the 1980s, we still built very bad systems like MS-DOS, but there was also Smalltalk, which got it right. Smalltalk was just too fat for the emerging PC era.
Also in the 1980s, QNX started working on its microkernel OS, and Joe Armstrong started working on Erlang. Both needed some time to bring these kinds of systems to the scalability and performance they have today. It also took some time until the hardware resources of the average computer reached a level you could work with. If you do not have many resources, why would you need high scalability?
A modular message-passing architecture is nice, but it has its cost. For operating systems it required features from the CPU to do context switches really fast, and from its MMU (memory management unit) and TLB (translation lookaside buffer) to do message passing efficiently. For programming languages and their high-level language virtual machines it took some research to arrive at the right paradigms and high-performance implementations for message passing, concurrency, reliability, maintainability (e.g. software updates), etc.
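
If you want to get a feeling for that cost on your own machine, a rough micro-benchmark like the following Python sketch can help. It says nothing about kernels, MMUs or TLBs; it only compares a direct function call with a request/reply round trip between two threads, which is an assumption I chose for illustration. The function names are made up. The message-passing variant is typically noticeably slower, and the exact numbers depend heavily on the machine and the runtime.

```python
import queue
import threading
import time


def double(x):
    return x * 2


def direct_calls(n):
    """Monolithic style: the caller's thread runs the code directly."""
    return [double(i) for i in range(n)]


def message_calls(n):
    """Modular style: every request and every reply is a message to another thread."""
    requests, replies = queue.Queue(), queue.Queue()

    def server():
        while True:
            item = requests.get()
            if item is None:  # None is used as the shutdown message
                return
            replies.put(double(item))

    thread = threading.Thread(target=server)
    thread.start()
    results = []
    for i in range(n):
        requests.put(i)
        results.append(replies.get())
    requests.put(None)
    thread.join()
    return results


def timed(fn, n):
    """Return the function's result together with the elapsed wall-clock time."""
    start = time.perf_counter()
    result = fn(n)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    n = 10_000
    direct, t_direct = timed(direct_calls, n)
    messages, t_messages = timed(message_calls, n)
    assert direct == messages
    print(f"direct calls:    {t_direct:.4f} s")
    print(f"message passing: {t_messages:.4f} s")
```

Both variants compute the same results; only the way the request reaches the code differs. That difference is exactly what had to become cheap enough before modular systems were affordable.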
But now we have crossed the turning point, as shown in the image above. Today's hardware resources allow for modular systems in internet backends, in embedded devices like smartphones and in desktop computers, and today's software tools make it easier to build more and more complex software technologies. So we should start building modular systems and refactor or throw away the old monolithic stuff.

But this is not that easy. When Bell Labs tried to replace UNIX with its far better successor Plan 9, they learned an important lesson, as Eric S. Raymond put it:
The long view of history may tell a different story, but in 2003 it looks like Plan 9 failed simply because it fell short of being a compelling enough improvement on Unix to displace its ancestor. Compared to Plan 9, Unix creaks and clanks and has obvious rust spots, but it gets the job done well enough to hold its position. There is a lesson here for ambitious system architects: the most dangerous enemy of a better solution is an existing codebase that is just good enough.

And Linus Torvalds' commentary at LinuxCon 2009 also makes clear that monolithic systems are not easy to maintain:
The Linux kernel has become "bloated, huge and scary" and it isn't "the streamlined, hyper-efficient kernel I envisioned when I started writing Linux."

Happily this is not true for new fields of technology. Yes, it is really hard (if not impossible) to establish new desktop computer systems on the market. But in new areas like smartphones and tablets, new modular systems are spreading and prevailing against their monolithic counterparts. The interesting thing is that they are superior to our desktop systems in many ways :-).
The most interesting area today is internet backends. Here, new technologies (like Erlang) that share the modular design principles and reduce the complexity and effort needed to get systems up and running are used more and more.



Some links:
...