I watched the BBC's excellent The Challenger last night, a dramatisation of the Rogers Commission's investigation into the 1986 Challenger disaster, and was reminded what a hero Richard Feynman was.
The shuttle. Conceived in the 1950s, designed from the late 1960s and through the 1970s, five built, flew from 1981, mothballed in 2011, two terrible disasters. Some reading:
Richard Feynman's appendix to the Rogers Commission Report. It showed the management failings of NASA and its partners, and how vested interests, deliberately or accidentally, caused statistical truth and safety policies to be ignored. Striking for its clarity and, I think, humanity.
It appears that there are enormous differences of opinion as to the probability of a failure with loss of vehicle and of human life. The estimates range from roughly 1 in 100 to 1 in 100,000. The higher figures come from the working engineers, and the very low figures from management. [...]
We have also found that certification criteria used in Flight Readiness Reviews often develop a gradually decreasing strictness. The argument that the same risk was flown before without failure is often accepted as an argument for the safety of accepting it again. [...]
the computer software checking system and attitude is of the highest quality. There appears to be no process of gradually fooling oneself while degrading standards so characteristic of the Solid Rocket Booster or Space Shuttle Main Engine safety systems [...]
If a reasonable launch schedule is to be maintained, engineering often cannot be done fast enough to keep up with the expectations of originally conservative certification criteria designed to guarantee a very safe vehicle. In these situations, subtly, and often with apparently logical arguments, the criteria are altered so that flights may still be certified in time. They therefore fly in a relatively unsafe condition, with a chance of failure of the order of a percent (it is difficult to be more accurate).
Official management, on the other hand, claims to believe the probability of failure is a thousand times less. One reason for this may be an attempt to assure the government of NASA perfection and success in order to ensure the supply of funds. The other may be that they sincerely believed it to be true, demonstrating an almost incredible lack of communication between themselves and their working engineers. [...]
For a successful technology, reality must take precedence over public relations, for nature cannot be fooled.
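To get a feel for how far apart the two estimates are, here is a rough sketch of what each implies over the whole programme. It assumes every flight is independent and carries the same fixed risk, which is a simplification of mine and not something Feynman claims. The 135-flight count is the total number of missions flown between 1981 and 2011, and the function and constant names are made up for illustration.

```python
# Rough sketch: what the two per-flight estimates quoted above imply over the
# programme. Assumes independent flights with a constant per-flight risk
# (my simplification, not Feynman's model).

ENGINEER_ESTIMATE = 1 / 100        # "roughly 1 in 100" (working engineers)
MANAGEMENT_ESTIMATE = 1 / 100_000  # "1 in 100,000" (management)
FLIGHTS = 135                      # shuttle missions flown, 1981 to 2011


def p_at_least_one_loss(p_per_flight: float, flights: int) -> float:
    """Probability of at least one loss of vehicle in `flights` flights."""
    return 1.0 - (1.0 - p_per_flight) ** flights


def expected_losses(p_per_flight: float, flights: int) -> float:
    """Expected number of vehicles lost in `flights` flights."""
    return p_per_flight * flights


for label, p in [("engineers", ENGINEER_ESTIMATE),
                 ("management", MANAGEMENT_ESTIMATE)]:
    print(f"{label:>10}: "
          f"P(at least one loss) = {p_at_least_one_loss(p, FLIGHTS):.2%}, "
          f"expected losses = {expected_losses(p, FLIGHTS):.5f}")
```

Under those assumptions the engineers' figure gives about a 74% chance of losing at least one vehicle and an expected 1.35 losses, while the management figure gives about 0.13% and 0.00135. The programme actually lost two.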
On the team that wrote software for the shuttle. Reading this in the late 1990s, I remember being struck by the cultural and procedural differences between this team and the wider "speed at all costs" culture of startups.
The article follows the Capability Maturity Model in arguing that there's an evolutionary spectrum of development practice, from immature, chaotic "fix it on the server" death marches through to a measured "perfection". The truth, though, borne out by successful agile methods (which have themselves matured), is that the approach should fit the business need and context. When change is expensive, get it right first time; when change is cheap, change it when you need to. From the perspective of 2010 and later, this seems obvious - perhaps it wasn't in 1996.
the last three versions of the program -- each 420,000 lines long -- had just one error each. The last 11 versions of this software had a total of 17 errors. [...]
That's the culture: the on-board shuttle group produces grown-up software, and the way they do it is by being grown-ups. It may not be sexy, it may not be a coding ego-trip -- but it is the future of software. When you're ready to take the next step -- when you have to write perfect software instead of software that's just good enough -- then it's time to grow up. [...]
"People have to channel their creativity into changing the process," says Keller, "not changing the software." [...]
The specs for the current program fill 30 volumes and run 40,000 pages. [...]
"Most people choose to spend their money at the wrong end of the process," says Munson. "In the modern software environment, 80% of the cost of the software is spent after the software is written the first time -- they don't get it right the first time, so they spend time flogging it. In shuttle, they do it right the first time. And they don't change the software without changing the blueprint. That's why their software is so perfect."
The space shuttle program as white elephant - this was a punch to the gut, the space bubble pricked.
Most of the really wrong design decisions in the Shuttle system - the side-mounted orbiter, solid rocket boosters, lack of air-breathing engines, no escape system, fragile heat protection - were the direct fallout of this design phase, when tight budgets and onerous Air Force requirements forced engineers to improvise solutions to problems that had as much to do with the mechanics of Congressional funding as the mechanics of flight. In a pattern that would recur repeatedly in the years to come, NASA managers decided that they were better off making spending cuts on initial design even if they resulted in much higher operating costs over the lifetime of the program. [...]
Having failed at its stated goal, the Shuttle program proved adept at finding changing rationales for its existence. It was, after all, an awfully large spacecraft, and it was a bird in the hand, giving it an enormous advantage over any suggested replacement. [...]
In the thirty years since the last Moon flight, we have succeeded in creating a perfectly self-contained manned space program, in which the Shuttle goes up to save the Space Station (undermanned, incomplete, breaking down, filled with garbage, and dropping at a hundred meters per day), and the Space Station offers the Shuttle a mission and a destination. The Columbia accident has added a beautiful finishing symmetry - the Shuttle is now required to fly to the ISS, which will serve as an inspection station for the fragile thermal tiles, and a lifeboat in case something goes seriously wrong. [...]
The Apollo program showed how successful the agency could be when given a clear technical objective and the budget required to meet it. But the Shuttle program has shown the flip side of NASA, as rational goals detach from reality under constantly changing political and funding pressures. NASA has learned valuable bureaucratic lessons - it knows to spread its work over as many jurisdictions as possible, it has learned that chronic funding is always better than acute funding, however much money a one-time outlay might save in the long run, and it has demonstrated that ineffectual projects can be sustained indefinitely if cancelling them is sufficiently awkward. But these are lessons we have already learned for far less on the ground, with Amtrak, and building a more photogenic, spaceborne version of the Sunset Limited in orbit hardly seems like a space policy for the 21st century.