The statistics for civil aviation accidents during 2012 showed that the number of deaths was the lowest since the 1940s, when there were hardly any commercial flights to speak of. Flying is spectacularly safe: you’re much more likely to die on the way to the airport, especially if you go there by motorbike.
Why should this be so? A piece by Bertrand Meyer, a professor at the ETH in Zurich and ITMO in St Petersburg, discusses this very subject (CACM January 2012, p 15). The main reason is, as he puts it, ‘accidents’. Every accident is analysed, for example using the data contained in the ‘black box’, to extract the maximum amount of information about the cause. The results are shared with all the stakeholders – airlines, aircraft and component manufacturers, and infrastructure suppliers, such as airports and air traffic control. Nothing is hidden. The result is continuous improvement, leading to today’s outstanding safety record.
The same cannot be said for the IT industry. Meyer points out that we do not learn from mistakes; they are all too often swept under the carpet. The result of this neglect can be serious, with unstable systems in critical roles. As an example, he cites a major Swiss mobile service provider, which had an outage lasting most of a day. The lack of service prevented many customers from accessing their bank accounts, as security codes are sent to their cell phones as one-time passwords.
I fully agree with Meyer’s argument but I’d like to take it further by looking at the problem from a couple of different points of view to see what extra insights we might gain. The perspectives I’ll consider are first hardware and software platforms, and secondly application projects.
Problems detected in hardware or software platforms potentially affect all users of the products. Fortunately, the suppliers can correct the problems as soon as they know about them and distribute the corrections to users. Software corrections can be rapidly distributed electronically, with indications of the urgency of applying them. This matters because software errors are the most common kind of platform problem.
There are advantages for users of integrated stacks comprising hardware and software. The suppliers of such systems cannot escape responsibility for the problems by pointing the finger at another supplier; the buck stops with them. They can also correct problems by making changes in any part of the stack to come up with the best solution. A software-detected problem, for example, may be best solved by making hardware or firmware changes.
The fact that the supplier of the platforms can fix the problems for its user base does not completely eliminate the need to share information more widely. Newly discovered security weaknesses in particular may affect other products, so there is a good case for sharing information among different vendors.
But it’s in IT application projects that the industry is most wanting in sharing experience. Although there are arguments about the extent of IT project failure, there is no doubt that the situation is very unsatisfactory; vast amounts of money are wasted. There are many causes; I’ve written about at least a couple of them in earlier blog posts (see for example ‘Are IT Procurement Processes an Obstacle to Success?’ and ‘Two Worlds, Two Languages’).
It’s here that we could really learn from experience, but information is all too often buried. Hiding the truth is much harder in the public sector than in the commercial world, but it’s still possible. And even if some details of what went wrong are available, there is a considerable reluctance to learn and do it better next time. Perhaps the major reason is the tendency to blame someone or some group in the event of any failure. The urge to find culprits leads, unsurprisingly, to attempts to dodge the fallout.
All this is at odds with the considerable body of knowledge about the processes of scientific discovery. If IT is to be a real science, perhaps the industry should learn from some of the important works in the philosophy of science, such as those of Karl Popper (see, for instance, his ‘Conjectures and Refutations’, Routledge & Kegan Paul).