5 Sep 98
CRM: THE MISSING FACTOTUM IN SR 111
1. The media seemed to be initially off on a tangent with the EVAS [plastic bag] system for maintaining internal pilot vision in the event of dense smoke in the cockpit. In light of events it may well have been a factor, but only that. It's more likely to be eventually shown that there was a much greater deficiency at work in the Swissair accident; one that is shared by most of the world's airlines. The buzzword in airline flight-crew conduct and relationships for the past decade has been CRM [Cockpit [or Crew] Resource Management]. Broadly speaking, CRM means that, without contravening the command rank structure, any flight-crew member is expected to challenge any other when he is dissatisfied with developments or excursions beyond limits - supposedly without fear of retaliation. It also means that there should be no "one-man-bands" [i.e. the workload is shared] and that during high-workload situations crews are limited to the job at hand, i.e. they can't discuss last night's Seinfeld whilst lining up for take-off. Military crews have also adopted this CRM credo - but they have different imperatives to commercial airlines so they do it with a significantly different emphasis. Because of cultural differences, CRM was not always evident in many Asian-crewed cockpits and this failing showed up in a number of accident critiques of the 1970s and 1980s. Before we reveal the suspected SR 111 deficiency it is necessary to run through a typical emergency evolution and build the case. A possible Swissair-style scenario follows.
2. Pilot concern, on a black night, is not for forward vision through the windscreen - that is essentially irrelevant at night in a radar environment until such time as the pilot flying needs to look for VASIS [Visual Approach Slope Indicator System] or HIAL [High Intensity Approach Lights] on finals [at about 2 to 3 miles] for his line-up and above/below glide-path cues. Swissair 111 never got to that stage. Loss of control in their accident was precipitated by some combination of pilot incapacitation, loss of flight instrumentation or a later, sudden and drastic development.
The fact that the aircraft orbited for many minutes following their advisory PAN declaration meant that reducing aircraft weight for landing was the initial and paramount consideration, not an immediate overweight landing due to a worsening situation. If the situation had been deteriorating, a distress call or "Mayday" declaration would have been made early on. The pilots upgraded their distress phase later, about 10 to 15 minutes later, and that was initially thought to be the key to the real accident cause.
3. Let us consider a typical sequence of events for noxious fumes or "smoke in the cockpit". It can start off fairly innocuously in the form of irritated nostrils or eyes as the pungency of noxious fumes or smoke begins to permeate the cockpit atmosphere - possibly nothing visible as yet. The source is probably electrical but could be engine fumes via the engine-driven compressor ducting for the flight-deck air conditioning. Either pilot will eventually notice it, declare it and the Captain would command [typically]: "activate the Fire Bill for Fire of unknown Origin". The copilot [or RH seater] would begin going through the item-by-item checklist just as soon as both pilots had donned the full-face oxygen-fed smoke-masks. One of the first items on the checklist would be to alert the cabin crew. Probably next would be the decisions to divert and descend, as dictated by the escalation of the situation - and then the radioed emergency [or perhaps a distress] message. At this point the trouble-shooting part of the checklist would begin. In modern fly-by-wire aircraft there is much redundancy; dual systems are designed to accommodate all sorts of singular failures. Flight controls, flight and engine instrumentation, even autopilot systems are protected, inasmuch as when one fails there is either another power source or another complete system [or probably both]. For instance, the Captain would have at least two sources of power to his Flight Director system, which gives him vital aircraft attitude information. He would also have at least two sources of data input to it. He also has a backup "peanut gyro" - a smaller artificial horizon which can give him vital backup info in the event of failure of the primary instrument or his "head-up" display. Vital flight instruments such as the attitude indicator, electric altimeter and compass have elementary DC-powered [possibly battery-powered] backups. The copilot's side of the panel is similarly configured. He also has a plethora of possible combinations of vital attitude, airspeed, altitude and direction-indicating sources. The "smoke in the cockpit" drill is a relentless pursuit of the cause and it involves increasingly critical manual steps. At each stage of the trouble-shooting checklist the pilots will be "monitoring off" systems, working their way through non-essential ones and hopefully being able to pause long enough that they can assess whether the situation is improving or not, i.e. whether they have nobbled the root cause.
Meanwhile they have to cope with the fuel dumping, increasing radio traffic, intercom with the cabin, PA announcements and the navigation for the diversion - but they must not forget to "fly the jet". This critical task is complicated by the fact that the auto-pilot will have been de-selected [either by the pilot in initiating descent or by the checklist step that robs it of power]. It would be very easy for the pilots to be sufficiently distracted when reaching for, identifying, mutually confirming and actuating switches that the aircraft could insidiously roll and pitch to an unrecoverable unusual attitude. This is the main hazard of this complex and very active checklist but not the only one - remember the ongoing fuel dump? They're busy so they probably won't. Normally pilots tend to practice instrument flying approaches in the simulator with a routine scan of their familiar, normal, operational flight instruments. They oft-times have to contend with a contemporaneous practice engine [or singular systems] failure. The smoke in the cockpit checklist changes that ball-game. All of a sudden their instrument cross-reference is misshapen and they are easily distracted and probably disoriented by seeing their normal flight instruments [now powered down] toppled or frozen. Normally a singular failure in [say] the Captain's attitude indicator could easily be confirmed by a cross-check of the copilot's serviceable RHS instrument and a double-check of the standby peanut gyros. In the "smoky" instance however you will have many conflicting aircraft attitude cues, most of them invalid. It drastically increases the workload and the adrenaline rate to scan and see apparently unrecoverable flight attitudes - then have to tell yourself that it's only because the instruments have been "offed". Flight attitude management [i.e. control] can be very borderline in this situation when there's a smoke haze in the cockpit, distracting R/T, rampant switch-flicking and an increasing sense of urgency to get the aircraft on the ground. Bear in mind that peripheral vision is being badly affected by the full-face smoke mask [or a set of goggles] and that it is probably misting up on the inside [because you're sweating and the air conditioning is going to go off at some stage of the checklist]. What if you're wearing corrective lenses and they're fogging up as well? Cockpit and instrument lighting will eventually be affected by the checklist. It may end up as only two or three battery-powered floodlights. Communication cross-cockpit is usually via normal speech but in the smoky case there is another distracting abnormality - the pilots can now only communicate over the intercom. Using the telephone to the rear cabin poses an additional problem. What happens when the checklist step that turns off the intercom is reached? What happens when you're so far into the checklist that you lose the power to the warning, caution and failure colored captions? You start to lose track of what you've still got going for you and you begin to doubt all the presentations and indications that you've got left. Developments are always likely to be disconcerting as the checklist progresses and smoke or fumes are less likely to dissipate once the aircon is off.
4. Probably the saddest thing about this accident is that possibly [even probably] the actual malfunction wasn't all that critical. If the system or avionics box had failed properly it would have blown its internal fuse or popped its circuit-breaker and so shut itself down - or simply failed to a dormant state. Many modern systems however will not do this because of the redundancy built into them. In a way it's self-defeating. We are not building systems that have benign failure modes and are sufficiently hooked into central indication warning systems [CIWS] to tell us incontestably that a particular system has failed. We are all familiar with the confusing failure modes and resulting unfathomable messages of our desktop computers' operating systems. Aircraft computer systems are just as liable to tell you "porkies" - or worse, nothing at all.
5. It is a perilous undertaking to embark upon the "smoke" checklist because you are going to be necessarily failing your own systems in bulk. There is supposedly no other way, with current technology, to determine the root cause of an electrical fire. Most pilots would assume the cause is electrical and not air conditioning related but it takes a keen pair of nostrils to discriminate. Even if the cause is not identified and isolated the checklist should provide a solution, i.e. most electrical fires will die away once the amps are removed. Unfortunately by the time the fumes and smoke begin to clear the checklist will normally [and necessarily] have been completed and the aircraft will be in a very crippled state. If the problem is seen to be resolved a mature crew will pause, sit on their hands and reassess their status for recovery. In most instances crews will be very loath to re-activate critical or essential systems, either because it's not SOP [standard operating procedure] [i.e. there's no power-up checklist] or because they are fearful of restarting the emergency. Unlike older aircraft it will not be possible to turn off all electrical buses or trip all AC generators. Modern airliners cannot function in an electrically inert state. However AC and DC distribution has been worked out such that everything except the emergency essential AC and DC buses can be "offed". Pilots should then be left with manual [hydraulics on] flight control, basic instrumentation, functioning manual throttles [i.e. no FADEC], at least one COM radio and a good generator [even if it's only the APU's or Ram Air Turbine's]. The fuel system should be electrically redundant in most cases, i.e. pumps going off should not induce flame-outs. However the MD-11 is unusual in that, to reduce trim drag, it has an integral tail-plane fuel-tank. Transfer pumps and fuel dump pumps are powered from different buses through different circuit-breakers.
Would it be possible that, with a partial electrics-out configuration, with main wing-tanks dumping, the tail-plane transfer pumps weren't powered? It is such a long moment arm that an adverse Center of Gravity controllability situation could soon develop. This sort of fuel-transfer-induced controllability loss was not uncommon in another Boeing aircraft, the B-52 Stratofortress.
Undercarriage and flaps would be readily extensible. The lethal variable, however, is the pilot's newly reconfigured flight instrument panel. It will be anathema to his normal instrument scan technique and the way he's been trained. At best he will be uncomfortable - at worst he will be ricocheting from one unusual attitude to another as he is continually distracted by inert flight instruments, the demands of checklist responses and the hectic workload. In this scenario it would be easy to overlook the ongoing fuel dump.
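The tail-tank question raised above is, at bottom, simple moment arithmetic: the Center of Gravity is the total moment divided by the total mass, so fuel left stranded on a long arm counts for more and more as the wing tanks empty. A minimal sketch follows. Every mass and arm in it is an invented round number chosen for illustration only; none of it is MD-11 data, and the function names are hypothetical.

```python
def centre_of_gravity(items):
    """Return the CG position: total moment divided by total mass.

    items is a list of (mass_kg, arm_m) pairs, with each arm measured
    aft of an arbitrary datum.
    """
    total_mass = sum(mass for mass, _ in items)
    total_moment = sum(mass * arm for mass, arm in items)
    return total_moment / total_mass


# HYPOTHETICAL figures for illustration only - not MD-11 data.
EMPTY_AIRCRAFT = (130_000.0, 30.0)  # (mass in kg, arm in m)
WING_ARM_M = 30.0                   # wing tanks assumed to sit at the empty CG
TAIL_ARM_M = 58.0                   # tail-plane tank on a long moment arm
TAIL_FUEL_KG = 5_000.0


def cg_with_fuel(wing_fuel_kg, tail_fuel_kg=TAIL_FUEL_KG):
    """CG position for a given wing and tail fuel load."""
    return centre_of_gravity([
        EMPTY_AIRCRAFT,
        (wing_fuel_kg, WING_ARM_M),
        (tail_fuel_kg, TAIL_ARM_M),
    ])


# Wing tanks dumped from 80 t to 20 t while the tail fuel stays put.
before_dump = cg_with_fuel(80_000.0)
after_dump = cg_with_fuel(20_000.0)

# Same end state, but with the tail fuel transferred forward first.
after_dump_with_transfer = cg_with_fuel(25_000.0, tail_fuel_kg=0.0)

print(f"CG before dump:               {before_dump:.2f} m aft of datum")
print(f"CG after dump, tail stranded: {after_dump:.2f} m aft of datum")
print(f"CG after dump, tail emptied:  {after_dump_with_transfer:.2f} m aft of datum")
```

With these made-up numbers the CG moves aft as the wing fuel goes overboard and the tail fuel stays where it is. Whether such a shift would ever approach a controllability limit depends entirely on real weights, arms and limits, which this sketch does not attempt to represent.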
6. Modern airline pilots rely routinely and heavily upon Flight Director Systems, head-up displays, altitude alerting, autopilot-controlled "fly-to" points [and programmed course intercepts] as well as ground radar monitoring of their track and altitude. During and after the smoke checklist the aircraft assumes a barely "flyable" configuration in instrument-flying or night conditions that the pilots are not really familiar or comfortable with. Their situation is more precarious because of this than because of the possibility of them being overcome by fumes or toxic smoke or robbed of "inside cockpit" visibility. The possibility of an unrecoverable flight attitude developing or of the aircraft being flown inadvertently into the water during descent becomes the real hazard. The Ground Proximity Warning system may or may not be of much use in such a circumstance. Pilot input response to a high speed, high rate-of-descent GPWS alert may well cause structural failure anyway. This might have been a cause for SR 111 heavy - but their transponder and FDR cut out at almost 10,000 ft. If the aircraft altimeter's altitude transponder output to radar gets "offed" by the checklist, ATC will not see any dangerous descent and so cannot radio a warning.
7. Unfortunately ATC can often stimulate and stoke the criticality factor by being too helpfully voluble. Real emergencies are nowadays rare but all too frequently they take on a life of their own and the resulting R/T pressure-cooker effect can defeat the most disciplined pilot's resolve not to be panicked into precipitate action. Having said that, it is also readily acknowledged that a smoke checklist cannot be slow-tempoed. It too has an irresistibly urgent quality. The pilots cannot afford to dither over whether or not the next debilitating step is necessary when the smoke is building up or not clearing.
8. Incapacitation should not be a real problem with a full-face oxygen mask or properly sealing goggles - but it remains a possibility. As long as the power remains on, a short circuit can continue to burn the length of the shorted wiring's insulation even without the arc-tracking phenomenon of Kapton. A small amount of burning insulation can produce a surprisingly thick smoke haze that can blind the eyes, impede breathing and poison critical thought processes. Just as hypoxia causes us to think in terms of "time of useful consciousness", so must the toxicological anaemia induced by poison gases.
9. In many [if not most] instances the modern airline pilot will be experiencing his first really dire in-flight emergency and will be intensely provoked into commencing [or continuing] the recovery phase - either because it is necessary or because he sees it as the logical conclusion to the furor he's created by declaring the emergency. Resuming his route or holding off due to poor divert airfield weather will rarely be an option because one of the first steps was to dump down to a landing fuel state. Hopefully the Swissair flight-crew remembered to secure the dump before it all went over the side. At the dump-rate of an MD-11 it is possible [but not likely in the accident's time-scale] that the crew simply forgot to turn off the dump until they were alerted by low fuel level warnings. This could have been what precipitated their "upgrade" call to ATC for an immediate recovery [reportedly] 10 minutes after the first advisory. Because it is an embarrassing mistake it is unlikely that a professional flight-crew would want to advertise the fact that they'd compounded their own situation by oversight. The panic to then get on the ground ASAP might then disrupt disciplined procedures and a CFIT [controlled flight into terrain] or "upset" accident would become more likely - given the 5000' overcast that existed. Or perhaps there's a more likely explanation that is related to CRM?
10. Having identified what I think are the problems, do I have any suggestions for modern airline flight-crews or aircraft designers? Well, yes. Designers must be compelled to "design in" benign failure modes and plumb them into a CIWS so that the crews are not kept guessing. Uncertainty is a killer. Mere loss of a probably redundant system or non-critical avionic should not affect the time that the next meal is served or unduly affect navigation. But the fact that it has died should be obvious. Likewise computer monitoring of the aircraft electrical distribution system should alert pilots to any high amp load or fluctuating cycles that could be related to actual or imminent failure. It should not take a bus circuit-breaker [or alternator or inverter] trip to trigger an alert. Push-pull circuit-breakers are simple devices that trip [or pop] because of thermal overload caused by too high an amperage. If they pop and are reset they should pop again if the triggering electrical situation was other than intermittent. If they don't function as they should [particularly when reset - as is permitted] you've got the beginnings of Dante's Inferno airborne. Modern aircraft are chock-a-block with them, all of different ratings and critically so. Most, but not all, are conveniently situated on flight-deck panels, accessible to the crew, but not obvious when popped. All too often they're used by maintenance [and aircrews] as an on/off switch. It's not what they're designed for and in fact it is detrimental to cycle them too frequently [particularly ganged CBs]. In fact, come to think of it, the basic design of the common garden-variety circuit-breaker hasn't changed in donkey's years. Perhaps that's worth looking at. How reliable is that ancient technology once it's married to the electronics of a modern electric airliner?
It is a known fact that arc-tracking [or preliminary "ticking" faults] will not necessarily trip associated circuit-breakers [possibly because in the conductor there is not the high amperage associated with a short; all the heat in an arc-tracking circumstance is traveling externally, along the insulation]. Most aircraft manufacturers are now conscientiously utilizing in their aircraft sophisticated wiring that will not support arc-tracking insulation fires. Simply stated, it means that a fire caused by a short-circuit flashover will not be propagated along that wire by the burning outer covering. Most home handymen and car mechanics will be familiar with a shorted-out overheating wire very rapidly melting its insulation along its full length. Glass-fiber style inert non-flammable outer sheathing tends to retard that. Many aircraft still in service, probably as many as 50%, do not sport that optional extra. The MD-11 didn't. Boeing has revealed that the MD-11 uses a form of wiring insulation called Kapton. This type of wiring insulation was banned from US Navy aircraft years ago for safety reasons.
Designers must also ensure that the CVR and accessible FDR outputs will still be forthcoming no matter what eventual electrical configuration the aircraft ends up in. Avionics plumbing must ensure that ATC can continue to provide a watching-brief backup through observation of the aircraft's transponder track and altitude. ATC-to-crew alerting of high descent rates and cleared altitude penetration is axiomatic - but the transponder must be always powered for them to do this.
11. The built-in failing of the smoke checklist is that eventually you get down to a bare-bones electrical configuration, you're struggling to retain control on partial panel and still the smoke situation's not improving. The likelihood of an unrecoverable unusual attitude developing is very high. My basic contention is that a third man [the old Flight Engineer] would be a boon in off-loading the pilots in such a circumstance. I always found it to be so. I'm afraid that the SR 111 crew were just overloaded into a loss of control accident in IMC that was precipitated by the eliminatory type of trouble-shooting smoke checklist that is common to all multi-engined aircraft.
The fix required is an immediately selectable [one switch], yet minimally basic, electrical configuration from which you can then [only if required] start to ADD buses and systems until the problem recurs. The way in which it's been traditionally done [monitoring OFF systems and buses piece by piece] never ever was going to stop the build-up of smoke and fumes in the long interim. I know from experience that it was always hard to tell when you'd had success after the smoke and fumes have built up. There seemed to be always the lingering taste and smell that was impossible to dispel via the "Smoke and Fumes Elimination Checklist". My innovative suggestion straight off kills most possibilities of the situation compounding over time yet allows you to judiciously reintroduce necessary systems, as and if required, over a calmer, less frenetic period. It should be a reasonably simple modification to most modern airliners.
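The additive procedure proposed above can be sketched in a few lines: drop to the bare-bones set in one action, then re-power one bus at a time and stop as soon as the smoke comes back. This is only an illustration of the logic, not a design. The bus names are hypothetical, and `smoke_returns` stands in for the crew's own judgement after pausing on each step.

```python
def isolate_by_adding(minimal_buses, optional_buses, smoke_returns):
    """Start from the bare-bones configuration and add buses one by one.

    minimal_buses:  buses that stay powered throughout (the one-switch state).
    optional_buses: buses to reintroduce, in the order the crew wants them.
    smoke_returns:  callable taking the list of powered buses and returning
                    True if the smoke or fumes recur (a stand-in for crew
                    observation after a pause).

    Returns (powered, suspect): the buses left powered, and the bus whose
    reintroduction brought the problem back (None if nothing did).
    """
    powered = list(minimal_buses)
    for bus in optional_buses:
        trial = powered + [bus]
        if smoke_returns(trial):
            # Leave the suspect bus off and stop adding, as the text proposes.
            return powered, bus
        powered = trial
    return powered, None


# Hypothetical example: the fault lives on an invented "galley" bus.
FAULTY_BUS = "galley"


def smoke_returns(powered):
    return FAULTY_BUS in powered


powered, suspect = isolate_by_adding(
    minimal_buses=["emergency AC", "emergency DC"],
    optional_buses=["avionics 1", "galley", "cabin lighting"],
    smoke_returns=smoke_returns,
)
print("Left powered:", powered)
print("Suspect bus: ", suspect)
```

The point of the design is that the aircraft spends the whole search in a configuration that is already known to be smoke-free, rather than working down towards one while the fumes accumulate.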
12. Possibly relevant to the Swissair crash [or any similar event] is the logical caveat that designers must allow specific amounts of fuel to be programmed for jettison. The dump valves must auto-close at a remaining specified fuel level in whatever electrical situation the aircraft may get down to. In the absence of a flight engineer systems supervisor this is absolutely vital. In some aircraft it is presently too easy to initiate a fuel-dump early in the checklist and either forget to cease it at the appropriate time - or miss the fact that when you do actuate "dump off" at the correct fuel remaining it does not actually cease, because the ongoing checklist has removed power to the jettison circuit's bus and its solenoid-actuated valve [and possibly also the fuel gauges]. All you've then got left to warn of low fuel is the low-level warning lights.
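The auto-close idea above amounts to one rule: the jettison valve may be open only while the dump is selected and the fuel remaining is still above the programmed figure. A minimal sketch of that rule, with a crude minute-by-minute simulation in which the crew never deselects the dump, might look like this. The quantities and rates are invented for illustration and are not MD-11 figures; the function names are hypothetical.

```python
def dump_valve_open(fuel_remaining_kg, target_remaining_kg, dump_selected):
    """Valve is open only if dump is selected AND fuel is above the target."""
    return dump_selected and fuel_remaining_kg > target_remaining_kg


def simulate_dump(start_fuel_kg, target_remaining_kg,
                  dump_rate_kg_per_min, burn_rate_kg_per_min, minutes):
    """Step the fuel state one minute at a time with dump left selected.

    Returns a list of (minute, valve_open, fuel_remaining_kg) tuples.
    The coarse one-minute step means the sketch can overshoot the target
    by up to one minute's worth of dumping.
    """
    fuel = start_fuel_kg
    history = []
    for minute in range(1, minutes + 1):
        valve_open = dump_valve_open(fuel, target_remaining_kg, True)
        outflow = burn_rate_kg_per_min
        if valve_open:
            outflow += dump_rate_kg_per_min
        fuel = max(0.0, fuel - outflow)
        history.append((minute, valve_open, fuel))
    return history


# HYPOTHETICAL figures for illustration only.
history = simulate_dump(
    start_fuel_kg=60_000.0,
    target_remaining_kg=30_000.0,
    dump_rate_kg_per_min=2_000.0,
    burn_rate_kg_per_min=100.0,
    minutes=30,
)
last_minute, last_valve_open, last_fuel = history[-1]
print(f"After {last_minute} min: valve open = {last_valve_open}, "
      f"fuel remaining = {last_fuel:.0f} kg")
```

In the sketch the dump stops of its own accord even though nobody selects "dump off". The harder engineering requirement in the text, that the valve must still close after the checklist has removed power from its bus, is a matter of how the valve is powered and which way it fails, and is not something this logic can show.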
13. How's about designing the flight deck so that, once it's depressurized, a manually operated ram air vent plus a high capacity battery-powered exhaust fan can discharge the nasty air directly overboard [and not simply rely, once you're depressurized, upon the reduced airflow through an open outflow valve way back aft]? I wouldn't mind betting that in the MD-11, once the aircon is knocked out as part of the checklist [i.e. once depressurized], the flight-deck exhaust fan [if there is one] dies as well. That would mean the nasty air is trapped in limbo. Crews may be concerned that to do this might stoke the "fire". I don't think that's a valid concern. Electrical fires are all about overheated wires and components and charring insulation. I don't think flames will leap up or smoke intensify because of increased circulation. If they do and it's visible - so much the better [you see the source and you selectively and discretely kill its power].
14. There should be lessons learnt and resultant change when the death toll is so appallingly high. Public confidence is at a low ebb. Searching philosophical questions about CRM need to be asked. What are the lessons? Where did CRM break down? Could it happen again? What's the weak link? Can you really lay it at the door of maintenance when it might have been a design issue or started as a simple system failure? Technology is allowed to fail - but it should fail-safe, be fail-tolerant or readily isolatable. Firstly, when you're stuck with only a two-man crew in the smoke and fumes situation, it is far safer to have one dedicated airframe pilot and one checklisting trouble-shooter [and I think it may be proven from the CVR that this factor was their undoing]. Secondly, my gut feeling is that if they'd had a systems-supervising flight-engineer the pilots would have been able to get on with the real task: "flying the jet". That's not just being "hands on" and concentrating on the instrument flying control aspects. It includes radio navigation, listening out, looking out, R/T, FMS, liaison with cabin crew and instrument/avionics system monitoring plus the necessary ongoing ahead lookout on their weather radar. A cross-cockpit double-checking backup that is always vital may well have broken down [i.e. in their final descent, altitude cross-checks for instance]. A flight engineer would look after electrics, hydraulics, pneumatics, circuit-breakers, non-FADEC'd throttles, fuel system [including the jettison], engine related systems and caution panels [and also back up the pilots if he had any spare time]. In the final analysis I think you will find that the TSB will come out with a very honest report that reveals that the SR 111 crew was simply overloaded to buggery by developments and was always, as a duo, one man short in extremis - and therefore a potential accident looking for a situational trigger.
This is really the case with most of the cockpits flying around on RPT [Regular Public Transport]. But, nowadays, particularly in long-haul digital "glass" cockpits with automated systems operation, the surveillance and warning kit is normally reliable. The critical third man is only the linchpin when the situation starts coming unglued and the automated systems are on the fritz. Unfortunately the flight engineer third man has been "designed out" since about 1975 and it will take more than an MD-11 going down to reverse that. You just don't need him in a modern electrified jet until you really need him - and I believe it will be shown that SR 111 would probably have coped well if they'd been endowed with that ultimate component for CRM - the third man's capacity, systems knowledge, tempering influence and divorcement from the "hands on" flying task.
15. The "disappearance off the radar screen", I think you will find, simply means that ATC lost their transponder return. Civil ATC worldwide tends to rely heavily upon challenge and reply secondary radar [IFF in military terms]. Few controllers would be capable of following the primary "paint" blip of a maneuvering target on primary radar nowadays. By the time a controller adjusted his gain, PRF, antenna tilt, sector scan and anti-clutter devices, SR 111 was in the drink. The SR 111 transponder became unpowered either because of a bus failure, structural breakup or because of a checklist step that canned its power. Their remaining COM radio would probably be powered by the bare-bones essential AC and DC buses [the ones that are meant never to be monitored off]. However crew silence would not be strange if an unusual attitude recovery was underway. Believe me, an insidious slow roll and pitch to an unusual attitude can happen to the best of crews who are ensconced in a vital drill. Been there, done that. In a heavy jet, pulling out of an unusual attitude and trying not to overstress in speed or "g" would be a 150% attention-getting task for both. Bearing in mind that large jets tend to frequently shed bits on finals, no-one should be surprised to find that they started their break-up mere seconds into the attempted high-speed recovery. An adverse C of G because of tail-to-main-tank fuel transfer failure during dump is an outside possibility.
16. Over-tasking begets overloading - first the pilots then inevitably the airframe. T'ain't as if it ain't happened before. It's a pity that de-regulated competition means airlines feel that economically they must persist with two-pilot crews - because the third seat is a great training ground for young airline pilots. Qantas has been doing it for many years with their second-officer program and I would not have it any other way. The Qantas safety record speaks for itself. Military crews worldwide are normally augmented because it is a recognized cheap training context for up-and-coming aircraft commanders. Many military pilots moving into commercial cockpits would nowadays have a sense of loss without quite being able to put their finger on what's dropped out of the safety equation.
17. What can "smoked" aircrews do in their presently paired configuration to improve their chances? Firstly management must make sure that the checklist is valid and flexible. A lot of checklists, particularly emergency drills, must be completed once started - or that is the general philosophy. The smoke checklist must only be a guide, should be done at an appropriate rate and "held" at the captain's discretion. It must be structured so that crews cannot "itemize" themselves into a corner unnecessarily. Simulator drills must emphasize the unusual circumstances into which the crews are thrust by the checklist and each drill should be followed through in real time to a logical recovery conclusion [i.e. not frozen and suspended for discussion]. This will allow crews to see the probability of forgetting to secure their fuel dump before they overshoot what's needed. Realistic drills must include smoke, weather environmentals and ATC and cabin crew inputs [or you're perhaps eliminating the "straw that broke the camel's back" and thus short-changing the crews and their future passengers]. Challenge and reply type checklists aren't required in this situation. Two-man crews should split the task because the greatest likelihood is an unrecoverable attitude developing simply because, for a critical moment, no-one is concentrating on "flying the jet" - and that's a much more demanding and time-consuming task on "partial panel". It only takes mere moments for an "upset" to happen. The copilot should concentrate on simply flying the jet and the Captain [with a possibly greater depth of systems knowledge] handles the R/T, PA and the smoke checklist. And it goes without saying that crews should be familiar with the function of each circuit-breaker [not just its label] - just in case it fails to pop, the smoke-source is identified and they need to pull it in anger. Lastly, and management will love this: put the third man back in the cockpit [and call him a fireman if you wish].
The third man always was the critical quotient in the CRM equation. As well as definable duties he has a proper audit function - overseer, and he's anyways training for eventual command. Over five hundred "smoke" instances on the NTSB data-base alone would tend to support this as a prudent safety initiative for modern airliners.
Non-compliance by carriers should be a consideration for passengers deciding who they will trust themselves or their families' lives to. A third set of eyes and a third brain is worth the cost of haulage and an extra ten bucks on my ticket any night.
List of Annexes