Something to Squawk About
Was the Legacy Transponder Switched off – or Just Faulty with a Known Flaw?

In the aftermath of the Sept. 29, 2006 collision in Brazil between a 737-800 of GOL and an Embraer Legacy 600 of ExcelAire, the two pilots of the Legacy have been widely accused in the press of not having their transponder switched on and squawking their allocated code at the time of the collision. They have also been accused of flying northwest-bound on the airway at an inappropriate altitude.

On joining that airway, they should have descended from Flight Level 370 (FL370) to FL360, odd levels being inappropriate on a northwesterly heading. The press reports have theorized that the two pilots had switched off their transponder so that they could deviate in altitude and track without those maneuvers being detected by Brazilian ATC radar. Why would they have needed to do that? Reporters claim that they might have been demonstrating features of the aircraft to its new owner, who was on board for the delivery flight. However, the crew may have some reasonable defenses in the form of other factors that were in play.

First, the crew was experiencing communications difficulties. Even though a crew may have planned to change levels at an airways intersection, normal practice is to advise ATC before doing so. If comms fail, the standard procedure is to maintain the last assigned level and trust that ATC will see that level on your transponder’s Mode C (Charlie) squawk and vector other traffic out of your way. Relaying through other airborne traffic is another means of overcoming spotty comms over vast tracts of the Brazilian jungle – or so we’re told. Assertions that the Legacy crew diverted by thousands of feet from their assigned altitude may eventually be borne out by their flight data recorder. We have also picked up on a report that both of the GOL 737’s recorders are quite badly damaged.

On the subject of whether the crew could have switched off their transponder, we suspect that this proposition is based on their “squawk” not being received by either (or both) of the Manaus and Brasilia centers. It’s believed that the crew has denied switching off their transponder. Might there be another answer?

Honeywell transponders have long had a fault that causes them to lapse into an idle or standby mode “if the crew takes longer than 5 seconds to change codes when using the rotary knob on the control unit”. This is a quote from FAA Airworthiness Directive AD 2006-19-04 (replicated in a similar European directive, EASA AD 2005-0021). The AD expands upon a prior Jan. 27, 2006 Alert Service Bulletin from Honeywell. In comments to the Notice of Proposed Rulemaking, Embraer, as maker of the Legacy, requested an extension of the compliance date. The effective date of the AD’s Final Rule was Oct. 17, 2006. The condition affected 1,365 airplanes worldwide. The compliance action required by the AD was:

a. Within 14 days after the effective date of this AD, revise the Normal Procedures section of the applicable airplane flight manual to include the following: “After completion of any 4096 ATC Code change (also referred to as Mode A Code), check the status of the transponder. If the transponder indicates that it is in standby mode, re-select the desired mode (i.e., the transponder should be in the active mode).”

b. Within 18 months (of Oct. 17, 2006), replace the Mode S transponder of the COM unit with a new or modified unit …

Prima facie, therefore, it would seem that replacement of the faulty unit could take up to 18 months, and that in the interim flight crews were to double-check the status light on their box after any code change (something that could easily be overlooked).
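
To picture the trap, here is a minimal sketch, in Python, of the failure mode and the interim check. The state names, timings, and the class itself are our own illustrative assumptions, not Honeywell’s firmware logic; the point is simply that a slow code change followed by a missed status check leaves the unit silently in standby.

```python
# Toy model (not Honeywell's logic) of the fault described in AD 2006-19-04:
# a 4096-code change that takes longer than 5 seconds drops the unit to standby,
# and only the interim flight-manual check puts it back into the active mode.

ACTIVE, STANDBY = "ACTIVE", "STANDBY"

class ToyTransponder:
    def __init__(self):
        self.mode = ACTIVE
        self.code = "2000"

    def change_code(self, new_code, seconds_spent_dialing):
        """Dial in a new Mode A (4096) code with the rotary knob."""
        self.code = new_code
        if seconds_spent_dialing > 5:   # the documented fault condition
            self.mode = STANDBY         # unit silently lapses into standby

    def post_change_check(self):
        """The interim procedure: check status after any code change."""
        if self.mode == STANDBY:
            self.mode = ACTIVE          # re-select the active mode

xpdr = ToyTransponder()
xpdr.change_code("7600", seconds_spent_dialing=8)  # a slow code change
print(xpdr.mode)          # STANDBY - no Mode C reply for ATC radar or TCAS
xpdr.post_change_check()
print(xpdr.mode)          # ACTIVE again, but only if the crew remembers the check
```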

The potential impact of this fault was described in an earlier Aug. 2005 document that said: “This type of failure will increase ATC workload and will result in improper functioning of TCAS.”

In other words, head two aircraft along the same airway in opposite directions at the same height and a collision is inevitable. The only thing that could avert it is an operating TCAS in one aircraft and an operating transponder in the other (and preferably, but not necessarily, its associated TCAS set as well). Just one TCAS (the 737’s) would have averted the collision by warning the GOL flight crew – as long as the Legacy’s transponder had been operating.
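
The dependency is worth spelling out. Below is a toy boolean model, not the TCAS II surveillance logic: it assumes, consistent with the Aug. 2005 warning quoted above about “improper functioning of TCAS,” that a TCAS unit can warn its own crew only when it is running, its own transponder is active, and the intruder’s transponder is replying. All names and values are illustrative.

```python
# Toy boolean model of the TCAS dependency described above (not the TCAS II logic).
# A TCAS unit needs to be running, needs its own transponder active, and needs
# replies from the other aircraft's transponder before it can warn its crew.

def crew_gets_warning(own_tcas_on: bool, own_xpdr_active: bool,
                      intruder_xpdr_active: bool) -> bool:
    return own_tcas_on and own_xpdr_active and intruder_xpdr_active

# Scenario resembling the hypothesis: the Legacy's transponder silently in standby.
legacy = {"tcas": True, "xpdr": False}
gol    = {"tcas": True, "xpdr": True}

print(crew_gets_warning(gol["tcas"], gol["xpdr"], legacy["xpdr"]))     # False - no advisory in the 737
print(crew_gets_warning(legacy["tcas"], legacy["xpdr"], gol["xpdr"]))  # False - the Legacy's TCAS is blind too
```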

A Swiss ATC document from Skyguide dated as early as April 13, 2004 expressed great concern about the fault and its potential impact upon aircraft separation. In a later Skyguide safety bulletin dated Sept. 11, 2005, a safety officer agonizes over why their SMS (Safety Management System) had failed to get action to rectify the anomaly.

At that stage, the final deadline for fixing the fault was May 2006, some nine months after the publication of the European AD. Recall that the FAA AD did not take effect until Oct. 17, 2006. You can read that Skyguide bulletin at www.iasa.com.au/skyguide.htm. To see why this delay may have helped set the stage for a collision, read also the two articles linked from www.iasa.com.au/offset.htm.

The moral of the fatal Legacy incident, so far: it takes more than two operating transponders to sidestep collisions.

GOL 737 Crash Developments - Alternative Explanations

Possible systemic explanations for the Embraer Legacy 600 mid-air collision on Airway UZ6 between Brasilia and Manaus are starting to solidify.

The Legacy crew are being accused in the Brazilian press of having switched off their transponder and of being at the wrong flight level (FL370 vice the flight-planned FL360, the correct RVSM level when northwest-bound).

a.  According to the Brasilia and Manaus ATC centers, seven (7) unanswered calls were made to the Legacy as it started up Airway UZ6. It is reasonable to assume that the Legacy crew may have selected the comms-loss transponder code of 7600 once they were unable to contact Center. As described in FAA AD 2006-19-04 (effective Oct. 17, 2006), that code change may have caused the transponder to fault to standby (a fault condition that has existed in certain modern Honeywell transponders since late 2003). It is triggered by taking more than 5 seconds to dial in the new code. The standard operating procedure is to remain at the last assigned level when experiencing comms loss. The crew would have believed that ATC could see that level maintenance (of FL370) from their transponder's transmitted height read-out ... not knowing that their transponder had faulted to STBY.

b.  After the collision, the Legacy crew would have changed their squawk to the Mayday squawk of 7700, and thus reactivated the transponder from its dormant state of STBY (explaining the Brazilian ATC accusation that the crew reactivated their squawk after the collision).

These explanations may be borne out by the Legacy's DFDR; however, the Legacy CVR evidence would have been overwritten - assuming that it was a standard 30-minute continuous-loop device. It is possible that, not knowing what height the Legacy actually was at, and assuming that it might have descended to FL360 (per its flight plan), controllers may simply never have tumbled to any possible conflict between the Legacy and the GOL 737. If so, this may prove to be the tombstone event required to bring about the long-sought lateral offset tracking solution (where each aircraft on an airway sidesteps a half-mile to the right). For more, see the October 9th Air Safety Week and www.iasa.com.au/offset.htm.

Of course, the clincher in all this is that a transponder defaulting to STBY effectively kills the TCAS safety system in both oncoming jets. Without TCAS, airways tracking accuracy and RVSM height-keeping guaranteed that a collision would occur.


The ATC Factor

Note that the Princeton/Rochlin article below goes on at length about the workload on air traffic controllers possibly being subtly increased by extensive computerisation. Given that Brazil is reputed to have spent $122M on ATC upgrades in the last couple of years, one has to ponder the possibility that too much of that money went into technology, and not enough into training and the human factor.

Chapter 7: Expert Operators and Critical Tasks

[¶1.]

In the airliner of the future, the cockpit will be staffed by a crew of two--a pilot and a dog. The pilot will be there to feed the dog. The dog will be there to bite the pilot if he tries to touch anything.--Commercial airline pilot

[¶2.] I've never been so busy in my life, and someday this [highly automated cockpit] stuff is going to bite me.--Another commercial pilot

[¶3.]

Having the Bubble

[¶4.] Over the past few years, my colleagues and I have studied aircraft carrier flight operations, nuclear power plants, air traffic control centers, and other complex, potentially hazardous advanced technologies, using interviews and field observations to find out what it is that makes some operations reliable and others not.1 Out of this research has emerged the beginning of a better language for understanding the difference between these complex, critical, and reliability-demanding operations and more mundane and ordinary ones with which most of us have direct experience.

[¶5.] Every group of operators we interviewed has developed a specialized language that sets them apart. Although every group expressed clearly their very special response to the demands for integration and interpretation placed on them, only in the Navy did we find a compact term for expressing it. Those who man the combat operations centers of U.S. Navy ships use the term "having the bubble" to indicate that they have been able to construct and maintain the cognitive map that allows them to integrate such diverse inputs as combat status, information flows from sensors and remote observation, and the real-time status and performance of the various weapons and systems into a single picture of the ship's overall situation and operational status.2

[¶6.] For the casual visitor to the operations center, the multitude of charts and radar displays, the continuous flow of information from console operators and remote sources of surveillance and intelligence, the various displays that indicate weapons systems status, what aircraft are aloft, and who is in them, the inputs from ship and senior staff, are overwhelming. What surprised us at first was that even experienced officers did not attempt to make overall status assessments on the basis of a casual visit. Only when you have the bubble do these pieces begin to fall into place as parts of a large, coherent picture.

[¶7.] Given the large amount of information, and the critical nature of the task, creating and maintaining the required state of representational mapping, situational awareness, and cognitive and task integration is a considerable strain. On many ships, operations officer shifts are held to no more than two hours. "Losing the bubble" is a serious and ever-present threat; it has become incorporated into the general conversation of operators as representing a state of incomprehension or misunderstanding even in an ambiance of good information.3 In principle, the process could be carried through by logical, deductive chains of reasoning even if the bubble were lost, but even the most experienced of tactical officers would rather relinquish operational control if he loses the bubble than try to press on without it.4

[¶8.] When we mentioned this terminology in air traffic control centers and other operations exhibiting similar behavior, it was met with an immediate and positive acknowledgment. Because it expressed in compact form what they often have difficulty in explaining to outsiders, it has even become widely adopted.5 It was as much of a surprise to them as it was to us to find out just how much behavior and culture was held in common by operators performing complex and safety-critical tasks in a variety of seemingly disparate industries and work environments. Subsequently, we have been able to identify behavior that we would describe as functionally and operationally equivalent in other studies, performed by other field workers who did not have the convenient label to attach to it.6

[¶9.] Operators also shared a number of concerns about the consequences of the lack of understanding of the nature and special character of their work, particularly by the engineering and management divisions of their own organizations.7 The operational divisions consider themselves to be the reservoir of expert knowledge of how the plant or system actually works, in contrast to engineering divisions and other technical professionals (including consultants), who tend to think of it more in formal terms, such as design, drawing, specifications, rules, and procedures. Although operators respect the status and expertise of engineers and other professionals, they are very wary of interference from those who have almost no hands-on experience or other practical knowledge.

[¶10.] This is particularly apparent when professional consultants are called in to improve performance or reliability at the human-machine interface through technical or human factor improvements to plant or operational controls. As nuclear plant operators put it, the "beards" come in here, look around for an hour or two, and then go back and write up all these changes to the controls. Because it will cost the company a fortune, the decisions are going to be made upstairs (i.e., by managers and professional engineers). We'll have to argue for hours and hours about changes that will interfere with the way we work. And sometimes they make them anyway.8

[¶11.] We have heard similar comments from others who operate in such diverse working environments as air traffic control centers, electric utility grid management centers, and military combat command centers. The autonomy of design mentioned in chapter 2 clearly extends even into systems for which the consequences of error or poor user adaptation can be serious. In an observational study of commercial aviation similar in design to our field work, Gras and colleagues found that 75 percent of the pilots felt that engineers did not take the needs of users into account when designing an airplane, and almost 90 percent believed that the logic of designers differed substantively from that of pilots and other users.9

[¶12.] To those who study the history of technology, or of industrialization, these comments may seem typical of those made by industrial workers in the face of technical change. But it would be wrong to write off the comments of this particular group of operators as reflecting no more than preservation of status, maintenance of traditional prerogatives, or reflexive reactionary protection of role and position in the workplace. The operators in the systems we studied differ greatly from the general run of office workers, tool operators, and even skilled plant and process operators. They are highly trained professionals in an already highly automated environment. They were selected for study because they perform extraordinarily complex tasks with demanding requirements and heavy responsibilities for human life and safety, and because they can perform them well only by always performing near the top of their intellectual and cognitive capacities.10

[¶13.] The introduction of advanced automation is not an issue because of any putative threat to jobs or skills. Extensive automation and computerized controls already play a major role in all of these operations. For the most part, these operators easily accept the new representations of work described by those who have studied similar transitions in other types of organizations. Most of them are comfortable with computers as intriguing technical artifacts as well as a means of operation and control, and have at least a passing interest in and knowledge of the technical details of the hardware and software they are using. Their objections arise not from unfamiliarity with what computers are, or what they can do, but from concern over what computers are not, and what they cannot do, or at least what they cannot do well.

[¶14.] Because the technologies they manage are complex and sophisticated, and the range of possible untoward events is not only large but in some sense unbounded, skilled operators have developed a repertoire of expert responses that go beyond simple mastery of the rules, regulations, controls, and process flow charts. Although this is identified in different ways by different operators in different settings, there is a common sense that the cognitive map that distinguishes the expert operator from the merely competent one is not well enough understood to be "improved" upon by outsiders, particularly those whose skills lie more with engineering and technical systems than with operations and human performance.

[¶15.] There is in the human factors research community a growing sense that the operators may have a point, that there is some danger in tinkering with the systems they operate without gaining a better understanding of how they become expert and what that means.11 In particular, there is a recognition that the introduction of computerized methods into control rooms and controls may increase the chance of operational error by interfering not only with the processes by which experts "read" system inputs, but also those by which expert skills come to be developed in the first place. Since all of the systems we studied were safety-critical, and involved the active reduction of risk by human management of inherently hazardous operations, the consequences can be serious.12 And nowhere has this been more apparent than in the ongoing debate about the automation of airline cockpits.13

[¶16.]

Pilot Error

[¶17.] On February 24, 1989, United Airlines Flight 811 from Hawaii to New Zealand suffered major damage when a door blew off in a thunderstorm, taking with it a large chunk of fuselage.14 In addition to severe structural and control damage, nine people were sucked out, and twenty-seven injured. Fortunately for those aboard the 747, Captain David Cronin was one of the older and more experienced pilots in the fleet (and only one month from mandatory retirement at age sixty). Relying primarily on his judgment and thirty-eight years of accumulated experience, he managed to retain control of the aircraft by "feel" and bring it safely back for a gentle landing, a feat that was regarded as near-miraculous by those who examined the airframe afterwards.

[¶18.] A few months later, United Airlines Flight 232 had a disk come apart in one of the engines; the fragments of the disintegrating engine severed all hydraulic lines and disabled all three of the hydraulic control systems.15 For the next forty minutes, Captain Alfred C. Haynes, a thirty-three-year veteran pilot, and his flight crew "rewrote the book" on flying a DC-10, improvising ways to control it, and coming heartbreakingly close to making a nearly impossible landing at the Sioux City, Iowa airport.16 According to investigators and other expert pilots, the ability of the pilot to keep his aircraft under control, at all, let alone try to land it, was almost beyond belief; nor could any flight computer system, however complicated, have taken over from him.17 Although 111 died when the plane turned over at the last moment, 185 survived.

[¶19.] A similar performance with a better outcome was that of Captain Bob Pearson when Air Canada 767 ran out of fuel over Red Lake, Ontario, because of an error in calculating fuel load on the ground. Fortunately for all, Captain Pearson was not only an experienced pilot, but an experienced military glider pilot. He was able to remember the runway orientation at an auxiliary airport in Gimli well enough to bring his streamlined, 132-ton "glider" in for a successful dead-stick landing.18 The investigating team noted that it was amazing he was able not only to fly his sophisticated airliner without power, but to bring it to a safe landing.

[¶20.] Mechanical heavier-than-air flight is an inherently unnatural act. Only if machines and operators perform correctly, according to rather specific rules, can it be maintained. The consequences (hazard) of an uncontrolled air-to-ground impact, from any altitude, cannot be avoided or mitigated. The twin goals of air safety are therefore to reduce the risk through reducing the probability of error or failure, whether from equipment or design or through flawed human action. As aircraft grow more complex, the number of possible mechanical failure modes increases. As their operation grows more complex, the number of things a pilot can do wrong also tends to increase. Automation may reduce some types of erroneous actions, or technical mistakes, but it also creates new categories and types of error.19 Every effort has been made to engineer greater safety into each generation of new aircraft, but there are always technical or economic limits. A modern jet airliner is a tightly coupled technical system with a great deal of complexity and comparatively little tolerance for poor performance in equipment, maintenance, or operation.

[¶21.] In the early days of passenger flying, it was assumed that passengers were knowingly accepting the risk involved, not only from major events such as outright mechanical failure or vicious weather but from other types of mechanical or human failure that were uncertain and difficult to anticipate--perhaps unknowable in advance of actual circumstances. Pilots were not just operators, but "experts," expected to provide not just flying skills, but the ability to cope with the unanticipated and unexpected. That is exactly what happened in the cases mentioned previously; experienced pilots and crews saved many lives by appealing to ad hoc flying experience, procedures based largely on the accumulation of experiential knowledge and familiarity with their aircraft.20

[¶22.] Of course, not every story about airline accidents has the same diagnosis or outcome. Although some are caused by mechanical failures, or errors in maintenance or design, pilots, being human, will err, sometimes seriously, and flight crew performance problems continue to dominate the accident statistics.21 If the implicit assumption is that all pilots are hired specifically to be experts in the cockpit, anything short of a gross physical or environmental disaster so direct and so unambiguous that no action could possibly have saved the aircraft from accident can almost always be blamed post hoc on what the pilot did, or failed to do.22 But what if the situation is presented in the cockpit in such a way that the correct choice is obscured, masked with ambiguity, or does not give the pilot time for interpretation and response? Consider, for example, the following cases, all of which were eventually ruled to be examples of pilot error by the relevant boards.

[¶23.] In January 1989, a British Midland 737 crashed just short of the East Midlands airport in England, killing forty-four of the 126 passengers aboard. The left engine had failed; while trying to reach the airport, the pilot mistakenly shut down the functioning right engine as well.23 In September, another 737 plunged into the water at La Guardia airport when a young, inexperienced copilot hit the wrong controls on takeoff. On February 14, 1990, an Indian Airlines Airbus 320 simply flew into the ground near Bangalore when its crew was distracted and did not notice the loss of flight energy because the automatic controls kept the plane steady.24 On January 20, 1992, another A-320 crashed in mountainous terrain near Strasbourg, apparently because the crew set up the wrong mode on one of their computerized instruments.25

[¶24.] Although there is a certain air of familiarity about these events, the question of the ultimate source or origin of the human action ruled to be in error requires a little more thought. In each case, the proximate cause could be found to be the negligence or error of the flight crew. But in all four there is some question as to whether the information presented for pilot choice was as clear, or as unambiguous, as it should have been. It may be true that more experienced pilots than the ones involved might have noticed that something was wrong more quickly, and recovered the situation (indeed, the pilot of the British Midland flight kept control even without working engines, saving many lives). But, of course, not every aircraft can be flown by experienced and expert pilots in every circumstance, nor is a pilot likely to become expert without making some mistakes.

[¶25.] The assignment of blame to pilot error assumes that people given sufficient information and a working mechanism will not fail unless they err--either by doing something they should not (commission) or by failing to do something they should (omission). All four of these incidents seem instead to fall into a third category of "representational failures," similar to what Karl Weick has called mistakes of "rendition."26 Automation, advanced instrumentation, and other electronic aids that have created what pilots call the "glass cockpit" (because of the replacement of dials and gauges with computerized panel displays) may have reduced the probability of direct pilot errors, but they may well have increased the incidence and importance of indirect, renditional, or interpretive ones.27

[¶26.]

The Glass Cockpit

[¶27.] Every new airliner has had shake-out periods, not infrequently accompanied by crashes resulting from design flaws or other mechanical difficulties. But the Airbus 320 was a source of concern even before it was commercially flown. Because it was the first commercial aircraft with a fully automated cockpit, it was an aircraft "apart" to its pilots as well as to the general public.28 The pilots actuated the control computers rather than the controls, and the control computers were able to impose limits on what the pilots could demand of the airframe. The multitude of lights, dials, gauges, and other direct readouts, each with its own unique function and physical location, were replaced with a few large, multifunction, flat panel displays, capable of presenting a variety of data and information that have been analyzed and integrated by onboard computers to an unprecedented degree. Pilots were not just in a new cockpit, but in an entirely new flying environment.

[¶28.] Because this was the first commercial aircraft for which there was no manual backup or override for the flight controls, much of the attention at first centered on the question of the reliability of the electronic flight controls and their computers. Without manual backups, there was always the prospect that an electrical failure or error in programming or installation could cause an accident beyond pilot intervention.29 Only over time was attention paid to the more indirect and long-term concerns of the pilots about the potential for systemic, operational failures caused by loss of direct involvement with the aircraft rather than by technical malfunction.

[¶29.] Some of the major concerns expressed by pilots to human factors consultants and other researchers interviewing them about glass cockpits have been:30

[¶30.]

  • Too much workload associated with reprogramming flight management systems
  • Too much heads-down time in the cockpit attending to the systems
  • Deterioration of flying skills because of over-reliance on automation
  • Increasing complacency, lack of vigilance, and boredom
  • Lack of situational awareness when automated systems fail, making it difficult to identify and correct problems
  • Reluctance to take over from automated systems even in the face of compelling evidence that something is wrong

[¶31.] The first three of these relate directly to concern that pilots were becoming equipment operators and data managers instead of flyers. But the last three lead to something more subtle.

[¶32.] It was analysis of the first few A-320 accidents that first raised the question in the aviation community of whether control and display automation had created new categories and possibilities for errors or mistakes. In the Indian Airlines crash, the pilots seemed absolutely certain that the automatic control system would not allow their mistakes to become fatal; they were both overconfident and inattentive.31 As a result, all Indian Airlines A-320s were grounded until the aircrews could be retrained. The Strasbourg crash raised the further question of whether human factors engineering, which had resolved so many of the problems of preventing pilot misapprehensions in the increasingly complex cockpits of precomputerized aircraft, had really understood how much the new cockpit environment could interfere with situational awareness.32

[¶33.] The other major thread running through the pilots' concerns was the question of what it means to be a "pilot" at all in a modern, fully automated aircraft. As with the case of the automation of the industrial and business workplace, insertion of a computer into the operating loop moved the pilot one step further from direct involvement with flight. In the new cockpit, the pilot is a flight-system manager, an operator of the control system computers instead of one who flies the aircraft, or adjusts its controlling machinery, directly.33

[¶34.] More often than not, errors and failures in the cockpit are made by younger pilots; only for a few "unlucky" ones has the error been so serious that they were unable to learn from it. For most, the accumulated experience of small (or even medium) errors has been the basis for becoming an expert pilot. But how would learning take place at all once pilots became the controllers or managers of their aircraft instead of flying it? Would younger pilots, trained to fly with automated control and navigation systems that provide little intellectual or tactile feedback, and almost no discretion, still be able to invoke their experience or "feel" some ten or twenty years hence?34

[¶35.] Even as late as the 1930s, aircraft carried few instruments other than airspeed and fuel indicators, oil pressure and temperature (when appropriate), some navigational equipment, and perhaps a simple turn-and-bank indicator. Although instrumentation was greatly boosted during the Second World War, only with the coming of jet aircraft and improved avionics were cockpits stuffed with the multitude of dials and gauges with which we are now familiar. By the 1970s, military pilots also had to master operation of a wide range of complex and sophisticated avionics instrumentation, much of it for the sake of electronic warfare.35 Learning to fly such an aircraft also meant mastering a considerable amount of formal knowledge and technical competence, but there can never be enough sophistication for someone whose primary goal is surviving multiple electronic threats.

[¶36.] The move toward computerized, fly-by-wire control was at first also driven by military requirements. As was rediscovered in Vietnam, and institutionalized in the military's Top Gun school, human performance is still a primary requirement even for the modern military pilot, who still becomes expert by mastering the art of flying, more or less directly.36 But many military aircraft are physically capable of maneuvers the pilot could never execute, or, in some cases, survive.37 Some are even designed with inherently unstable flight dynamics through portions of the flight envelope, either to increase maneuverability or for the sake of low radar profiles ("stealth"). In these regimes, they can only be flown by means of a flight computer.

[¶37.] Military pilots are encouraged to fly to the limits of their aircraft, and the aircraft, and the control systems, are designed to take them to the limits of what can be achieved. In contrast, the premium in commercial flight is on efficiency, on smoothness in flight, and on avoiding the risk of making even small errors. All of these are vastly improved in the glass cockpit. But at what cost? According to one reporter, "Pilots who fly the glass cockpit aircraft say they have never been busier, even though automated cockpits are supposed to relieve workload."38 But busy doing what? Programming and making data entry, primarily, and reading and interpreting all of the visual displays.

[¶38.] Commercial pilots have become much better data monitors than before: but are they as good at monitoring the physical situation required to detect the onset of problems or fly out of unanticipated difficulty? According to recent human factors studies, the answer is still ambiguous.39 It apparently takes even longer to get familiar with a glass cockpit aircraft than a traditional one, and it is not entirely clear whether simulators or other artificial aids are as useful for physical orientation as they are for learning the electronics suite.40

[¶39.] Because of the high salience of aviation safety, and the ability of pilots and other flying associations to make their voices heard, these concerns have finally been given serious attention.41 Even in the avionics industry there is growing criticism--based at least partially on the grounds of experiential learning--of those who design control systems that require placing arbitrary limitations on what the pilot can or cannot make the aircraft do.42 Aviation magazines ran special issues or sets of articles on the subject,43 and the phenomenon spawned a specific name--the "glass cockpit syndrome"--which has since been generalized to describe other cases where human operators are removed from tactile and direct modes of operation and put into automated control rooms where computerized displays are their only sensory inputs.44 As a result, both the airlines and the electronics industry have recently taken a more subdued, indirect, and cautious approach to cockpit automation. Pilots are being encouraged to let discretion decide their use of automatic systems, and airline companies and designers are backing off enough to keep the pilot in the control loop as a flyer as much as a manager.45

[¶40.] Circumstances roughly paralleling those that have arisen over glass cockpits have been described by other operators, ranging from air traffic controllers and nuclear power operators to others operating similarly complex systems that are less hazardous and therefore attract less attention.46 In each case, there is a construct that expresses the integration of technical, process, and situational complexity into a single, spatio-temporal image that organizes and orders the flow of information and allows decisions to be made on the basis of overall, systemic situations and requirements. And in each case, there has been similar concern over the long-term effects of the introduction of advanced, computerized operational controls and process displays. But the outcome in other industries, where operators have neither the status nor the authority of commercial pilots, is far less clear.

[¶41.]

Air Traffic Control

[¶42.] Related to ongoing controversy over the long-term effects of cockpit automation is the continuing discussion over the future automation of air traffic control, which presents a greater challenge for cognitive coherence. The pilot has only one aircraft to manage. Air traffic controllers have to manage a system of aircraft operated by individual pilots, moving in three dimensions in different directions at different altitudes and with different destinations, without losing track of any, keeping them on course and on time, and not letting any of them collide, or even violate the several very strict rules on altitude and distance separation.47

[¶43.] Although there has never been a midair collision between two aircraft under positive control, the rapid growth in airline traffic over the past few years is already stretching the capabilities of existing centers to the limits of what can be achieved using present technology.48 This has prompted the Federal Aviation Administration (FAA) to turn to computer technology as the only way to increase traffic without compromising safety. The FAA was not content to limit itself to marginal upgrades when modernizing its antiquated equipment. A few years ago, it began a process for the design and installation of an automatic control suite to replace the old consoles. But a series of delays for a variety of technical and financial reasons provided time for the controllers, and their human factors consultants, to intervene strongly in the process.

[¶44.] The debate was similar to that over cockpit automation, but at a different level.49 What air traffic controllers do is cognitively more complex and abstract than what pilots do, and the environment they manage is more extended and complexly linked socially, even if technically simpler. The kind of mental map that controllers build is not easily describable, even by the controllers; it has been described as "holistic knowing," a fully accessible interpretive model of the airspace they are managing based on intimate involvement in and experience with every detail of the process.50

[¶45.] If you visit any one of the present en route air traffic control centers, you will see a large, open room filled with an orderly array of desk-sized grey consoles dominated by large, circular display screens.51 Each console has before it an operator, staring intently at the display and talking into a headset. Each is managing a "sector"--a geographical subset of the air space assigned to the center--which may have several dozen aircraft in it. The display screen at each console is two-dimensional and indicative rather than representative. Although it provides a flat map of the sector, and other important landmarks, there is no visual display of the type, altitude, or airspeed of the individual aircraft. Instead, each is represented by a box on the screen carrying some limited information (such as the flight number), and a small track indicating heading. Lining the screen on both sides are racks containing small slips of paper, the flight progress strips (FPSs) with aircraft identification and salient characteristics written upon them.

[¶46.] The novice observer is totally incapable of forming any visual map of the reality that is represented. That is a very special and highly developed skill possessed only by trained operators. To them, the real airspace is being managed through mental images that reconstruct the three-dimensional sector airspace and all of its aircraft. The consoles are there to update and feed information to the operator, to keep the mental image fresh and accurate; they neither create nor directly represent it. The displays and display programs are more primitive than any personal computer they, or their children, might have at home. Nor is there anything remarkable about the computers that control the consoles. Even in those centers where the computers have been recently upgraded, the most polite term for their implementation is "old-fashioned." And the FPSs with their scribbled annotations look like a holdover from another century.

[¶47.] At the present time, en route air traffic control relies primarily on three tools: the radar information provided on the console; radio and telephone communication; and the FPSs sitting in the racks next to the visual displays.52 Controllers have long insisted that the FPSs, continuously rearranged and marked up by hand even while computerized systems are fully operational, are an important, perhaps essential element of their ability to maintain cognitive integration, particularly during heavy traffic periods. Given that automation would impede the present system of managing the strips, and advanced automation might replace them altogether, the argument over whether or not they actually serve a central communicative or cognitive purpose has been the subject of continuing debate.53

[¶48.] The general conclusion has been that there is some risk in replacing them with an automated equivalent--even a display that replicates them.54 It is not just a matter of whether manual operation should be encouraged, or how much is appropriate, but whether and to what degree the reallocation of functions that accompanies automation, the degree to which the "operator" actually controls the aircraft as opposed to being the operator of a system that does the controlling, will affect the performance of the operator when he or she has to step in to retake control.55 Moreover, a great deal of the observed teamwork in air traffic control centers that has been deemed essential for maintaining the culture of safety has been based on the "silent" practices of mutual monitoring and tacit task allocation, and it is not yet clear how new modes of automation will interact with or affect them.56

[¶49.] A second and very pressing concern, even if the manual systems continue to be supported and nurtured, is that the introduction of automation will provide both a means and an excuse for increasing both the density of air traffic and the number of aircraft for which a single control set will be responsible. At most of the present centers, operators are still capable of managing the system "by hand" if the computers go down, using the paper slips at the sides of the consoles to maintain the mental images, and reconfiguring the distribution of aircraft in the airspace to a configuration that can be managed without visual tracking. At the increasing workloads that would accompany further automation, breakdowns or failures would present a double problem. Even at present loads, interference with cognitive integration, or the inability to form a cognitive map quickly enough, whether because of lack of manual aids or a loss of operator skills and practice, could cause a collapse of control that would at best cause the entire air traffic system to shut down until automation could be restored.

[¶50.] At increased load, an electronics failure could well leave operators with a traffic pattern too dense, too complex, or too tightly coupled to be managed by human beings at all, whatever their level of skill or experience. Getting it disentangled safely would be as much a matter of luck as of skill. Yet, the economic goal of increasing traffic density remains at least as important in justifying the system as modernizing and updating the increasingly obsolete electronics.

[¶51.] It has been pointed out by many that those who operate commercial aircraft or air traffic control centers have unique access to decision makers, not only because of the public visibility and salience of what they do, but because those in power are among the greatest consumers of their output. Although controllers had neither the status nor the political clout of pilots, they, too, were eventually able to convince the FAA and its engineers that the present record of safety depends greatly on their ability to form and maintain the "bubble," and that automation that is insensitive to operators and their understanding of their own task environments could pose serious dangers. If all goes well, the new automation system that does finally come online will reduce controller workload without disturbing representation or situational awareness, and will provide a means for accommodating manual aids such as the paper strips.57

[¶52.] The optimistic view is that the new approach represents formal recognition that many of those charged with operating complex technical systems safely do so through a remarkable process of cognitive integration, and that there might be a direct correlation between disrupting their representations and controlling public risk. A more pessimistic view is that commercial aviation is a very special case and not a generalizable one, that there are many other socio-technical systems of comparable risk where elites are less directly threatened, operators less empowered, and the locus of performance and reliability more diffuse.

[¶53.]

Industrial and Other Operations

[¶54.] There are in modern industrialized societies many other industrial and technical systems that depend on operators for achieving a safe balance between performance and efficiency. In many cases, what has evolved over time is a technical array of equipment and controls that is totally bewildering to the outsider, and difficult to learn and master even for their operators. In nuclear power plants, for example, walking the plant to observe almost every job and piece of machinery is an integral part of operator training; gaining this level of familiarity with the machinery is the foundation on which integrated mental maps are built.58

[¶55.] Plant operation traditionally has been a form of integrated parallel processing. Operators integrate the disparate elements of the board conceptually, scanning instruments in known patterns and monitoring for pattern as well as individual discrepancies; when an alarm goes off, or there is some other indication of a problem, they scan rapidly across their varied instruments, seeking known correlations that will help diagnose the severity and cause of the event. There is increasing concern among the new generation of human factors analysts and human performance psychologists that automation, particularly if poorly done, will undermine this ability, reducing the ability of the operators to respond quickly in complex situations.

[¶56.] It has also been suggested that automation could actually increase workload in a crisis. Computer-displayed serial presentation of fast-moving events rather than the present essentially parallel system of more primitive displays may inhibit operators from keeping pace with events or quickly finding the most critical one.59 Individual consoles improve data access, but also tend to block communication and otherwise interfere with collective interaction. This matters not only in emergencies, where rapid and coordinated response may be vital, but even in the course of ordinary task performance, where shared communication, especially regarding errors, is critical to the robustness and reliability of the task.60

[¶57.] Even when task integration is done at the individual rather than the team level, it is not an isolated activity, but part of a collective effort involving all members of the team, crew, or shift. Moreover, the processes of communication are not always verbal or visible. Because all controls and displays in a traditional control room have fixed, known locations, operators are able to maintain a situational awareness of each others' actions and make inferences about their intentions by observing their location and movement at the control panels. The use of CRT (cathode ray tube) displays and diagnostic aids such as expert systems linearizes and divides the process, organizing both data and presentations hierarchically and categorically according to engineering principles.61

[¶58.] The issues facing nuclear plant operators, chemical plant and refinery operators, and many others in the face of automation of plant control are therefore very similar to those facing pilots, or air traffic controllers.62 But plant operators lack the status and public visibility to mount a similar challenge. Their concern is clear: that the new, computerized systems will remove them not only from control of the machinery or other operation, but from real knowledge about it. Although differently expressed, they also fear that expertise will be lost through evolution, that those who eventually supplant them will be trained primarily on computers and management systems, that no one with a deep and instinctive anticipatory sense for trouble will be at the controls, that taking away the easy part of an operator's tasks will make the hard ones more difficult.63

[¶59.] This was brought home to me quite sharply when I was interviewing in a nuclear power plant control room, and heard an operator say that he "did not like the way that pump sounded" when it started up. Although the instrumentation showed no malfunction, the pump was stripped down on the operator's recommendation, at which point it was found that one of the bearings was near failure. My research notes (and those of others doing similar work) contain dozens of similar stories, ranging from detection of the onset of mechanical failures to air traffic control operators intervening in an apparently calm situation because they did not like the way the "pattern" of traffic was developing.

[¶60.] "Smart" computerized systems for diagnostics and control will radically alter the situational and representational familiarity that fosters that level of expertise. As summarized by Hollnagel and colleagues, in their report of a recent workshop on air traffic and aviation automation, "Understanding an automatism is, however, often difficult because their 'principle of functioning' is different and often unknown. Automatisms may be more reliable, but they are also more difficult to understand. In highly automated cockpits the basis for acting may therefore be deficient."64 Chemical plant operators interviewed by Zuboff expressed similar concerns, in almost identical language:

[¶61.]

The new people are not going to understand, see, or feel as well as the old guys. Something is wrong with this fan, for example. You may not know what; you just feel it in your feet. The sound, the tone, the volume, the vibrations . . . the computer will control it, but you will have lost something, too. It's a trade-off. The computer can't feel what's going on out there. The new operators will need to have more written down, because they will not know it in their guts. I can't understand how new people coming in are ever going to learn . . . They will not know what is going on. They will only learn what the computers tell them.65

[¶62.]

The Computer in the Loop

[¶63.] Computer-aided process control can serve as a useful aid to human judgment, in the cockpit as well as on the assembly line. Computer-aided diagnostics may aid pilots in diagnosing engine problems, nuclear power plant engineers in wading through the complexity of design books and blueprints, managers in establishing flow diagrams and sequences, and administrative and finance officers in evaluating how individual actions will ripple through a large and complex organization. And computerized displays and data management can analyze almost instantly a diverse and complex series of inputs and provide to the operator a variety of models and representations of the current situation.

[¶64.] But humans are human, and human error remains. The temptation is to try to reduce risk further by removing even the potential for human error, by inserting computers directly into the action loop to ensure that tasks, and in particular small tasks, are executed correctly. This is what was done to the Airbus to limit the maneuvers the pilot could order, and to military aircraft to prevent the pilots from flying into an unstable or dangerous part of the flight envelope. It is being considered in air traffic control to maintain separation, and in control rooms to prevent certain actions from being taken under predefined combinations of circumstances, such as opening an outside vent when an inside valve is leaking slightly radioactive steam.
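
[Editor's note: a minimal sketch may make the idea of a computer sitting in the action loop concrete. The limits and names below are invented for illustration; this is not Airbus's actual control law, only the general shape of an envelope-protection clamp between the pilot's demand and the control surfaces.]

```python
# Illustrative envelope protection (assumed limits, not any real aircraft's control law):
# the pilot's demand passes through hard clamps before it reaches the control surfaces,
# so the computer, not the pilot, has the final word on what is asked of the airframe.

def clamp(value, low, high):
    return max(low, min(high, value))

def protected_command(pilot_pitch_deg, pilot_bank_deg):
    pitch = clamp(pilot_pitch_deg, -15.0, 30.0)   # assumed pitch limits, degrees
    bank  = clamp(pilot_bank_deg,  -67.0, 67.0)   # assumed bank limits, degrees
    return pitch, bank

print(protected_command(45.0, 80.0))   # a 45-degree pitch-up demand comes out as (30.0, 67.0)
```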

[¶65.] This is the area of most contention between operational approaches, with their emphasis on crew and team performance, whole system overviews, integrative pictures, and near-holistic cognitive maps, and engineering approaches that seek to reduce the prospects for risk one element (and often one person) at a time. On the one hand, most errors are made during routine tasks; if computers take these over, the probability of error should decrease. On the other hand, depriving the human operators of practice, and possibly even context, might increase the chance of error in more critical and discretionary situations.

[¶66.] Human learning takes place through action. Trial-and-error defines limits, but its complement, trial-and-success, is what builds judgment and confidence. To not be allowed to err is to not be allowed to learn; to not be allowed to try at all is to be deprived of the motivation to learn. This seems a poor way to train a human being who is expected to act intelligently and correctly when the automated system fails or breaks down--that is, in a situation that comes predefined as requiring experience, judgment, and confidence as a guide to action.66

[¶67.] Computerized control systems can remove from human operators the opportunities for learning that make for correct technical judgments. In so doing, they may also remove from plants and processes those operators for whom personal responsibility and action was the prime motivation for engaging in such difficult and often unrewarded tasks. Yet, the growing complexity of many modern technical systems and the growth of the demands for both performance and safety that are placed upon them clearly requires better operators as well as better control and display systems. In principle, sensitive and interactive processes of design and implementation could show progress on both fronts. In practice, that is usually not the case.

[¶68.] The implementation of computer control almost inevitably causes the organization to seek greater efficiency sooner or later.67 Even when that is not a primary purpose, managers are almost always quick to realize that automation will allow supposedly accurate situation evaluation and response in times much shorter than that required if humans are involved. For many of the new automated systems, the consequence is to reduce the margin of time in which any human being could evaluate the situation and take an action, rendering supposed human oversight either moot or essentially useless. Those humans who are retained are increasingly put there as insurance that someone will be around to mop up the mess if and when it occurs--or to divert the blame for consequences from the designers and managers. The epigraph at the beginning of this chapter was completed for me by a pilot to whom I repeated it. His version continued with an airline executive explaining why high-salaried pilots were not just phased out entirely. "Because," said the executive, "accidents will always happen, and no one would believe us if we tried to blame the dog."

[¶69.] The engineering solution is often to provide for additional redundancy by providing multiple and overlapping systems to monitor or control critical activities. Three separate and separately programmed control computers may be required, or three independent inertial guidance systems. But safety in the expertly operated systems we have studied depends not only on technical redundancy against possible equipment failures and human redundancy to guard against single-judgment errors, but also on that wonderfully scarce resource of slack--that sometimes small but always important excess margin of unconsumed resources and time through which an operator can buy a little breathing room to think about the decision that needs to be made, and in which the mental map can be adjusted and trimmed.
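
[Editor's note: the engineering half of that picture is easy to sketch: independently computed channels feeding a simple voter, so that no single failed computer can steer the output. The median voter below is an illustrative assumption, not any particular flight computer's voting logic, and nothing in it supplies the slack, time, or judgment the paragraph above argues for.]

```python
# Illustrative redundancy in the 2-out-of-3 style (not any real flight computer's voter):
# take the median of three independently computed channels so one wild value is outvoted.

def vote(channel_a: float, channel_b: float, channel_c: float) -> float:
    return sorted([channel_a, channel_b, channel_c])[1]

print(vote(10.1, 10.2, 10.15))   # healthy channels agree -> 10.15
print(vote(10.1, 99.0, 10.15))   # one failed channel is ignored -> 10.15
```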

[¶70.] In many of the newer automated systems, however, such human requirements as slack, excess capacity, trial-and-error, and shift overlaps are often assumed to be wasteful, an inefficient use of resources to be engineered away.68 In the worst case, that can lead to a plant or process that has no extra resources and no backup, and can collapse much more quickly in the case of extensive failure.

[¶71.] Computer-aided decision making, whether in actual operations or in management, finance, or administration, can serve as a useful aid to human judgment, and a source of rapid access to a considerable body of knowledge otherwise not easily retrieved. Indeed, many of the most successful of the current attempts at deploying "expert systems" fall into this domain. Other, similar systems may aid pilots in diagnosing engine problems, nuclear power plant engineers in wading through the complexity of design books and blueprints, managers in establishing flow diagrams and sequences, and administrative and finance officers in evaluating how individual actions will ripple through a large and complex organization.

[¶72.] As with the implementation of "expert systems" in other domains, the claim being made is that what is being introduced is not an autonomous system, but one that intelligently supplies the correct information needed to make informed decisions in complex task environments. But as computers are given such mundane tasks as managing and presenting information, there is a constant risk of a gradual slide into automation, which, by removing task definition and control from operators, begins to coopt context and rendition, to define and bound both the task environment and the discretionary time or range for human action. If the system designers have thoroughly explored the possible boundaries, well and good. If not, the potential problems are limitless.

[¶73.] For the operator in an emergency, deprived of experience, unsure of context, and pressed into action only when something has already gone wrong, with an overabundance of information and no mechanism for interpreting it, avoiding a mistake may be as much a matter of good luck as good training. The presumption is often that the person in charge is by definition an expert, and is there solely for the purpose of taking correct action when it is needed.69 If that does not happen, the air is filled with cries of error and blame, on the outrageous, but not untypical grounds that it is the job of the operator to make good judgments, evaluate situations quickly, and act correctly, whatever the circumstances, whatever the context, and however infrequently the opportunities to learn from experience.

[¶74.]

Conclusion

[¶75.] What having the bubble expresses in compact form is the formation and maintenance of an integrated, expert, cognitive map of a complex and extended set of operations by human beings performing critical operations in an environment where information is always too plentiful and time always too short. As might be expected from the description, such representations cannot be acquired quickly or simply. They require prolonged and stable apprenticeships, tolerance for the errors and blind alleys of learning, and elaborate and overlapping networks of communications both for their establishment and for their maintenance and support. Aboard U.S. Navy ships, for example, procedures for transferring and communicating information during shift changes are extensive. Tactical operations shifts overlap by up to an hour to ensure smooth transfer to the succeeding tactical officer without a potentially dangerous break in routine or perception. Similar procedures are followed in air traffic control, and, depending upon circumstances, in many other less pressing systems, such as nuclear power plant operations.

[¶76.] The costs are not small. For some, such as air traffic controllers, who feel their responsibility for human lives directly and frequently, these costs are often manifest at the personal as well as the professional level. Even for the others, the stress of their work and the responsibility they bear are considerable. Yet these operators have for the most part chosen their line of work; they were not forced into it. Stress and responsibility also mean continuous challenge, sometimes excitement, and a real pride in their skill and accomplishments that does not require much in the way of public acknowledgment.

[¶77.] For the most part, neither their equipment nor their task environments have been optimally, or even coherently, designed. But the operators are intimately familiar with both and have mastered them. However designed or laid out, the controls, dials, and other instruments of their working environment become part of the structuring of control room processes and procedures, dynamically interpreting and translating the technical environment into action and social process.70

[¶78.] Operators can, and do, adapt to changes, particularly those that can be demonstrated to promise increased situational awareness and better task integration. But adaptation takes time and effort; it is not something that operators should be put through without good and sufficient reason. Unless the instrument or control in question can be demonstrated to be faulty, ambiguous, or misleading, the operators would rather live with what they have than submit to changes that will impose upon them an additional learning burden without providing provable benefits.

[¶79.] Engineers and other experts involved in the specification and design of new automation systems for aviation were at first relatively insensitive to operator concerns and operating procedures.71 The concerns of pilots and controllers about the realities of operation in the real world (as opposed to the idealized one in which designers operate) were brought into the process only as the result of a major struggle, and only when the operators themselves found allies among political and business elites, as well as in the professional community of consultants and advisors. In less salient and more ordinary cases where a similar degree of automation of similarly complex, and similarly hazardous, technical operations is taking place, the very similar concerns of the operators are rarely given serious consideration, either by elites or by designers and managers. And that is, or should be, a major source of potential concern. Otherwise the traps being set by clumsy automation, poor definitions of human factors and human performance, and insensitive computerization may have consequences extending far outside the boundaries of the organization.

NOTES:

The first of the two pilot quotes used as an epigraph to this chapter is taken from Aviation Week & Space Technology, May 4, 1992: 2. The second is taken from Phillips, "Man-Machine Cockpit Interface" (note 32 below).

1 For a further description of our work, see, for example, Roberts, ed., New Challenges; Rochlin, Todd La Porte, and Roberts, "Self-Designing High-Reliability Organization"; La Porte and Consolini, "Working in Practice but Not in Theory"; and the collection of articles in the special "Future Directions in HRO Research" issue of Journal of Contingencies and Crisis Management 4, no. 2 (June 1996), edited by Gene I. Rochlin.

2 Rochlin and others, "Self-Designing High-Reliability Organization"; Roberts and Rousseau, "Having the Bubble"; Rochlin, "Informal Organizational Networking." Note that the bubble is not a metaphor for the cognitive map or representation; rather, "having the bubble" expresses the state of being in control of one. The term seems to derive from earlier times, when the sighting of guns, and the movement of the ship, was read from mechanical level-reading devices consisting of a bubble of air in a curved tube full of fluid, much like the familiar carpenter's spirit level.

3 As is true in other areas where integrative expertise passes over into mastery, not all officers are equally adept, and the ship tries to make sure the best are on duty at the most critical times.

4 To declare publicly that you have "lost the bubble" is an action that is deemed praiseworthy in the Navy because of the potential cost of trying to fake it. Depending upon the situation, either one or more of the other people in the center will feel they have enough of the bubble to step in, or everyone will scramble to try and hold the image together by piecemeal contributions. As a rule, even a few minutes of relief will allow the original officer to reconstruct the bubble and continue.

5 La Porte and Consolini, "Working in Practice but Not in Theory"; Roberts and Rousseau, "Having the Bubble"; Schulman, "Analysis of High-Reliability Organizations."

6 Similar behavior is cited by Perby, "Computerization and Skill in Local Weather Forecasting," for Swedish weather forecasters, and by Zuboff, Age of the Smart Machine, 53, 64ff, for operators of complex industrial facilities.

7 Similar expressions have been found in other empirical work in similar situations. See, for example, Thomas, What Machines Can't Do, 213ff; Gras, Moricot, Poirot-Delpech, and Scardigli, Faced with Automation, 23.

8 Rochlin and von Meier, "Nuclear Power Operations." This "quote" is actually summarized from comments made during an extended interview with three nuclear power plant operators in Europe--where the professional consultants are more likely to be bearded and less likely to have practical, hands-on experience than in the United States.

9 Gras and others, Faced with Automation, 37.

10 But compare these interviews with those of Zuboff, Age of the Smart Machine, or of Thomas, What Machines Can't Do. The people they interviewed expressed pretty much the same range of concerns in situations that were conceptually similar, if somewhat less exacting. Many have four-year college degrees or more, some in engineering or other professional disciplines, and almost all consider regular schooling and retraining as part of the process of maintaining their skills in a changing world.

11 Much of the discussion has taken place at professional conferences whose proceedings are sometimes not as well known as those directed at specific issues such as plant safety or aviation accidents. Interesting articles addressing these points may be found in Göranzon and Josefson, Knowledge, Skill and Artificial Intelligence, and in Bauersfeld, Bennett, and Lynch, Striking a Balance. Among the other relevant literature containing a number of useful sources and references are Bainbridge, "Ironies of Automation"; Roth, Mumaw, and Stubler, "Human Factors Evaluation Issues."

12 I adopt here the conventional terminology of defining risk (exposure to the potential for human harm) as the product of the particular potential harm for any event (hazard), which depends only on the character of the physical artifacts and processes, multiplied by the probability that any particular sequence or event will occur. In simple notation, risk = hazard x probability. In more complex systems, where many hazards are present, this is usually expressed as (total risk) = sum (individual hazards x individual probabilities):

(total risk) = Σ_i (hazard_i x probability_i)

Engineered safety can involve either the reduction of hazard or a reduction in the probability of occurrence. Operators are there not only to try to reduce probabilities, but to manage sequences once they start and control or mitigate their effects. See, for example, Clarke, Acceptable Risk?; Ralph, ed., Probabilistic Risk Assessment.
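
A small worked example of this note's arithmetic, in Python (the event names, hazard values, and probabilities below are hypothetical, chosen only to show how the summation works):

    # Hypothetical hazards (harm per event) and per-year probabilities, used only
    # to illustrate: total risk = sum(hazard_i x probability_i) over all events i.
    hazards = {"minor leak": 5.0, "control failure": 50.0, "major release": 5000.0}
    probabilities = {"minor leak": 1e-2, "control failure": 1e-3, "major release": 1e-6}

    total_risk = sum(hazards[event] * probabilities[event] for event in hazards)
    print(total_risk)  # 0.05 + 0.05 + 0.005 = 0.105

In these terms, engineered safety lowers either a hazard term or its probability, while the operators, as the note observes, also manage sequences and mitigate consequences once they begin.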

13 See, for example, Gras and others, Faced with Automation; "Automated Cockpits: Keeping Pilots in the Loop"; Foushee and Lauber, "Effects of Flight Crew Fatigue on Performance"; Hollnagel, Cacciabue, and Bagnara, "Workshop Report"; Hopkins, "Through the Looking Glass"; Squires, "The `Glass Cockpit' Syndrome."

14 Fisher, "Experience, Not Rules, Led Airliner Crew in Emergency." Also see Aviation Week and Space Technology, March 6, 1989, 18.

15 Weiner, "Jet Carrying 290 Crashes in Iowa."

16 Parker, "Pilots Added Page to DC-10 Manual."

17 Malnic and Kendall, "Struggle to Gain Control of Jet Told."

18 Hughes, "Human Factors Are Critical in Computer-Driven Systems."

19 For a discussion in the aviation context, see Wiener, "Fallible Humans and Vulnerable Systems." Also see Bainbridge, "Ironies of Automation"; Hollnagel and others, "Limits of Automation in Air Traffic Control"; Vortac, Edwards, Fuller, and Manning, "Automation and Cognition in Air Traffic Control."

20 For the purpose of simplicity, I continue to refer in this section to "pilot" error rather than distinguishing errors made by individuals in command from those that arose as part of overall crew performance. In many cases, however, it is difficult to separate the two factors. See, for example, Foushee and Lauber, "Flight Crew Fatigue"; Weick, "Vulnerable System."

21 Foushee and Lauber, "Flight Crew Fatigue."

22 Historical examples include such disparate events as the crimping of DC-10 controls by a cargo door collapse and such external events as lightning (rare) or severe and unexpected wind shear (not uncommon). A recent example is the Lauda Air crash in Thailand in 1991, when an engine control computer deployed a thrust reverser during full power climb-out from the Bangkok airport.

23 "Air Crashes: Murders and Mistakes."

24 "Crash that Killed 92 in India Is Attributed to Pilot Error." Crossette, "Crash of Indian Airlines Plane Kills 89."

25 "Airliner Crashes in France." According to a report in Internet newsgroup comp.risk (RISK 14.74: June 28, 1993), "In France-Soir of Monday 10th May the Commission of Enquiry into the crash of an A320 near Strasbourg on 20th January 1992 . . . is about to deliver its final report. The conclusion on the cause of the accident is "pilot error." The main error was the confusion of the "flight-path angle" (FPA) and "vertical speed" (V/S) modes of descent, selected on the Flight Management and Guidance System (FMGS) console. The pilots were inadvertently in V/S when they should have been in FPA mode. The error was not noticed on the console itself, due to the similarity of the number format display in the two modes. The other cues on the Primary Flight Display (PFD) screen and elsewhere (e.g., altitude and vertical speed indicator) were not noticed since the pilots were overloaded following a last-minute change of flight plan, and presumably were concentrating on the Navigational Display." In this regard, it is interesting to note that another A320 pilot, landing at Frankfurt, refused to switch from Runway 25R to Runway 25L because "my copilot would have had to make something like 12 reprogramming steps with the computer, and I was not in the position to properly monitor what he was doing" (Aviation Week & Space Technology, February 3, 1992: 29).

26 Weick, "Organizational Culture as a Source of High Reliability." In Weick's lexicon, "errors" are lapses in judgment or performance in a decision environment where time and information are adequate and ambiguity low. Mistakes arise from specific individual judgments made in a context where the information is presented poorly, or presented in such a way as to interfere with rather than aid the exercise of informed judgment in the time, or with the information, available. See, for example, Rochlin, "Defining High-Reliability Organizations in Practice"; Weick, "Collapse of Sensemaking in Organizations."

27 See, for example, the detailed analysis of the Tenerife disaster in Foushee and Lauber, "Flight Crew Fatigue"; Weick, "Tenerife."

28 Gras and others, Faced with Automation, 24.

29 There was redundancy of all systems, of course, but it was in the form of multiple systems rather than complementary ones (e.g., hydraulics in case of electronic failures). There were also vivid memories both of the Florida 727 crash where all three engines failed owing to a common-mode mistake in maintenance, and of the Sioux City accident, where the disintegrating engine severed the lines of all three hydraulic systems because they were all routed through the same part of the fuselage.

30 I thank Alain Gras for having helped me condense a very extensive literature into this brief summary. Also see Aviation Week, "Keeping Pilots in the Loop"; Squires, "Glass Cockpit Syndrome."

31 "Airbus May Add to A320 Safeguards." Having set the aircraft on automatic pilot in a landing approach, the pilot in charge was instructing the trainee in the left seat while the airspeed dropped dangerously low. Although the computerized autopilot compensated to prevent a stall, it did it so smoothly that neither pilot was concerned. By the time they did notice, there was not enough energy or thrust to bring the aircraft out of its glide path in time. The overconfidence syndrome was being seen frequently enough to warrant a special conference. Limits were placed on the controls to prevent pilots from overcontrolling, given the lack of tactile feedback, and more (computerized) alarms were added to notify pilots of dangerous excursions within the flight envelope.

32 Phillips, "Man-Machine Cockpit Interface."

33 Hughes, "Mixed Reviews to Glass Cockpits."

34 Stix, "Along for the Ride?" Also see Gras, Moricot, Poirot-Delpech, and Scardigli, Le Pilote, le Contrôleur, et l'Automate.

35 One side effect of this was a major debate among pilots, one that continues to the present day, as to whether a future aircraft would be better off with two-man crews to divide the work, even at the expense of increased weight. Just as the Navy has both the two-seat F-14 and the single-seat F-18, the Air Force has the two-seat F-15 and the single-seat F-16. In both cases, the two-seat aircraft is larger, heavier, and thought to possess an extra measure of sophistication that makes it superior under some circumstances.

36 The question of expertise went a long way toward explaining why the average German pilot had such a short life in the air while a very few had victories numbering in the hundreds. As was explained by several of the German aces after the war, the training given to new pilots, especially toward the end when losses were mounting, was neither as lengthy nor as extensive as that given by the British or Americans, making them more vulnerable. But the few pilots who had survived (some since the Spanish Civil War) and were still fighting had become so expert that they were practically invulnerable in combat, barring equipment problems or serious mistakes. By rotating their pilots home after a fixed number of missions, the Americans raised morale and improved their training, but they also systematically removed from combat those few who had become truly expert.

37 After several crashes of high-performance aircraft were attributed to the controls flying the plane into a regime where the pilot blacked out because of the g-loads, the computers were programmed to put limits on what the pilots could order the aircraft to do--prefiguring the limits placed on commercial aircraft when they also went to fly-by-wire.

38 Hughes, "Mixed Reviews to Glass Cockpits."

39 Phillips, "Man-Machine Cockpit Interface"; Hughes, "Mixed Reviews to Glass Cockpits."

40 Gras and others, Faced with Automation.

41 It has been suggested that this is at least partially because so many of the analysts, engineers, regulators, and politicians involved spend a great deal of their time flying from one place to another.

42 Phillips, "Man-Machine Cockpit Interface."

43 Most notable were the series of articles in Aviation Week, "Keeping Pilots in the Loop." Also see Stix, "Along for the Ride?"

44 Squires, "Glass Cockpit Syndrome." Squires's analysis also extended to the Iran Air incident, as discussed in chapter 9.

45 Because of the limitations placed on flight evolutions and maneuvers by the automatic control systems of the newer airliners, pilots complain that they will not be able to fly the aircraft out of a bad situation in an emergency unless they are given the power to override the limits. The matter is still being discussed.

46 Schulman, "Analysis of High-Reliability Organizations," provides an interesting comparison of air traffic control, nuclear plant operations, and a conventional power plant.

47 La Porte and Consolini, "Working in Practice but Not in Theory."

48 La Porte, "U.S. Air Traffic System."

49 See, for example, La Porte, "U.S. Air Traffic System"; La Porte and Consolini, "Working in Practice but Not in Theory"; Hollnagel and others, "Limits of Automation in Air Traffic Control"; Vortac and others, "Automation and Cognition." The U.S. General Accounting Office, the Federal Aviation Administration, and NASA have also addressed these issues in a long series of publications. For a parallel analysis of the situation in Europe, see Gras and others, Faced with Automation; Gras and others, Le Pilote, le Contrôleur, et l'Automate.

50 Hirschhorn, Beyond Mechanization, 92ff.

51 En route control centers are distinguished from terminal centers, which manage the airspace only in the immediate vicinity of airports, and from facilities that manage traffic on the ground, not only by the scope of their operational area but also by the diversity of their tasks.

52 Vortac and others, "Automation and Cognition," 632.

53 This was equally true in France, where the strips also were both symbols of operator culture and status and a link to traditional ways of forming cognitive maps. See, for example, Gras and others, Faced with Automation, 64-65.

54 An excellent summary of recent work is Hollnagel and others, "Limits of Automation in Air Traffic Control."

55 The question in air traffic control has always been "when" and not "if" the operator has to step in and take over control manually, which supports the observations of the Berkeley research group that air traffic control has the characteristics of a "high-reliability organization." See, for example, La Porte, "U.S. Air Traffic System"; Rochlin, "Defining High-Reliability Organizations in Practice." Even Vortac and others, "Automation and Cognition," who found no significant decline in performance or cognitive integration in controllers operating a simulation of the new automated system, were very cautious about how to implement their findings.

56 Hollnagel and others, "Limits of Automation in Air Traffic Control."

57 Henderson, "Automation System Gains Acceptance." The new system was designed in collaboration with the NASA Ames Research Center, which also did much of the path-breaking human factors interface work on automated cockpits.

58 Rochlin and von Meier, "Nuclear Power Operations." Zuboff, Age of the Smart Machine, made very similar observations about operators of complex chemical plants.

59 Beare and others, "Human Factors Guidelines."

60 See, for example, Norman, Things That Make Us Smart, 142ff.

61 Roth and others, "Human Factors for Advanced Control Rooms." This is quite similar to the observations of Hollnagel and others, "Limits of Automation in Air Traffic Control," about aviation.

62 Roth and others, "Human Factors for Advanced Control Rooms."

63 Bainbridge, "Ironies of Automation."

64 Hollnagel and others, "Limits of Automation in Air Traffic Control," 564.

65 Zuboff, Age of the Smart Machine, 65.

66 See, for example, the excellent discussion in Norman, Things That Make Us Smart, especially at pages 142ff and 225ff.

67 As James Beniger has pointed out, this is nothing new. The tendency to turn information techniques put into place primarily to ensure safety into a means to exert control for greater efficiency was manifest as early as the 1850s, when the telegraph, first put on the lines to prevent collisions, was turned into a mechanism for dispatch; Beniger, Control Revolution, 226ff.

68 Scott Morton, Corporation of the 1990s, 14.

69 Norman, Things That Make Us Smart, 225ff.

70 For a discussion of the interactive process in this context, and its relation to formal theories of structuration (e.g., Giddens, Central Problems in Social Theory), see Thomas, What Machines Can't Do, 225ff.

71 One engineer even described operator beliefs derisively as being little more than operator "black magic," implying that the issue was as much one of social class and education as operating realities.