Human Error: Mars Climate Orbiter, Mars Polar Lander and Deep Space 2 - Systems, architecture and engineering solutions! - Site Home - MSDN Blogs

Systems, architecture and engineering solutions!

This site will do in-depth analysis of subjects such as service-oriented architecture, software engineering and technologies such as Exchange and SharePoint.

Human Error: Mars Climate Orbiter, Mars Polar Lander and Deep Space 2


Well, could there have been a worse time for unmanned flight than this trio of evil? The Mars Climate Orbiter (MCO), Mars Polar Lander (MPL) and Deep Space 2 (DS2) all failed in the terminal, and most expensive, phase of the flight program.  JPL and NASA made rookie errors in program management, and the post-crash investigations indicate that the program was run in a manner meant to save money. That is always a good idea, unless you fail. In this case the failures were at the most basic levels, and in one case the loss could have been prevented even after the vehicle was in flight.

"People sometimes make errors," said Dr. Edward Weiler, NASA's Associate Administrator for Space Science. "The problem here was not the error, it was the failure of NASA's systems engineering, and the checks and balances in our processes to detect the error. That's why we lost the spacecraft."  (See: RELEASE 99-113)

Without the usual peer reviews the three vehicles were doomed, according to the crash inquiry reports. In the case of the MCO, a navigation team on Earth might have detected the orbital changes and determined the cause.

 

Yet, as with the Columbia shuttle disaster, JPL/NASA hadn’t set up a collaborative atmosphere in which the navigation team had rapid access to the knowledge needed to examine the software systems or to question the program managers responsible for the software.  With the Columbia re-entry disaster, abnormalities were noticed by the shuttle’s ground controllers and the astronauts were not informed.  For the MCO, the navigators observed variations but didn’t have a way to investigate software issues.  The MCO approached orbit insertion significantly lower than expected, and the final thruster burn instructions were given. At that point the MCO dropped below the 85 km safety zone and likely burned up.  With the loss of the MCO, significant communications capabilities were lost as well, capabilities that later vehicles such as the current rovers could have used.

The MPL, the second of the three vehicles, was designed to provide Earth researchers with information about atmospherics, soil and whether water exists in the polar regions of Mars.  Now we know that there is water, but it took quite a few more years to determine this.  The MPL appeared to enter the Martian atmosphere successfully. Then, at some time during descent, it most likely shut down its descent engines too early because the “squat” switches triggered incorrectly, telling the control system that the lander was on the ground.  “Squat” switches have been used in aircraft and planetary landers for decades. In commercial airliners the “squat” switch prevents the pilots from accidentally raising the landing gear while the aircraft is on the ground.  The MPL “squat” switches were not tested in a physical environment; the simulations had indicated that they would work.  The MPL problem points to a failure to make sure that the simulation is checked against at least a few real-life physical tests.
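The failure mode described above, a touchdown signal that fires while the lander is still in the air, can be sketched in a few lines. This is only an illustration under stated assumptions, not the MPL flight software: the class name `TouchdownDetector`, the arming altitude, and the number of consecutive samples are all hypothetical values chosen for the example. The idea is that a sensor reading is ignored until the lander is close to the surface, and that a single transient reading is not enough to declare touchdown.

```python
class TouchdownDetector:
    """Hypothetical touchdown logic for illustration; not the MPL flight code."""

    def __init__(self, arm_altitude_m=40.0, required_consecutive=2):
        # Assumed values: leg sensors are ignored above arm_altitude_m, and the
        # sensor must read "on" for required_consecutive samples in a row.
        self.arm_altitude_m = arm_altitude_m
        self.required_consecutive = required_consecutive
        self.streak = 0
        self.touchdown = False

    def update(self, altitude_m, leg_sensor_on):
        """Feed one sample; return True once touchdown has been declared."""
        if self.touchdown:
            return True  # touchdown is latched once declared
        if altitude_m > self.arm_altitude_m:
            # Not armed yet: discard anything the sensor reports, including
            # transients from leg deployment, so no stale state carries over.
            self.streak = 0
            return False
        # Armed: count consecutive "on" samples, reset on any "off" sample.
        self.streak = self.streak + 1 if leg_sensor_on else 0
        if self.streak >= self.required_consecutive:
            self.touchdown = True
        return self.touchdown


if __name__ == "__main__":
    detector = TouchdownDetector()
    # (altitude in metres, leg sensor reading) samples during a descent
    samples = [(1500.0, True), (1500.0, True), (39.0, False), (1.0, True), (0.0, True)]
    for altitude, sensor in samples:
        print(altitude, sensor, detector.update(altitude, sensor))
```

A sketch like this is exactly the kind of logic that passes a simulation and still needs a physical test, because the simulation only produces the transients that someone thought to model.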

In combination with my blog:

WTF#: F#, the Mars Climate Orbiter, Mars Polar Lander and Deep Space 2, Oops from 1999

I will be talking about, and hopefully demonstrating, how F# and other tools in the Visual Studio family could be used in your software and hardware development to prevent these types of management errors.
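As a small taste of what such tooling buys you: the MCO mishap investigation traced the loss to thruster impulse data produced in pound-force seconds and consumed as if it were newton-seconds. F# has units of measure built into the language. The sketch below shows the same idea in Python, where the check happens at run time rather than at compile time. The `Impulse` class and the `total_impulse` helper are hypothetical names invented for this example; they are not taken from any NASA or JPL code.

```python
from dataclasses import dataclass

# One pound-force second expressed in newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse stored internally in newton-seconds (hypothetical type)."""
    newton_seconds: float

    @classmethod
    def from_newton_seconds(cls, value):
        return cls(float(value))

    @classmethod
    def from_pound_force_seconds(cls, value):
        # The conversion happens once, at the boundary where the data enters.
        return cls(float(value) * LBF_S_TO_N_S)

    def __add__(self, other):
        # Refuse to mix an Impulse with a bare number of unknown units.
        if not isinstance(other, Impulse):
            return NotImplemented
        return Impulse(self.newton_seconds + other.newton_seconds)


def total_impulse(burns):
    """Sum a sequence of Impulse values; bare floats raise TypeError."""
    total = Impulse(0.0)
    for burn in burns:
        total = total + burn
    return total


if __name__ == "__main__":
    burns = [Impulse.from_newton_seconds(2.0), Impulse.from_pound_force_seconds(1.0)]
    print(total_impulse(burns).newton_seconds)
```

The design choice is that a raw number can never reach the summation: every value has to say which unit it came from when it is created, so a mismatch shows up as an error at the interface instead of as a trajectory drift months later.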

To be blunt, NASA and JPL have shown a tendency over the past decade to fail to lead in systems and test as they did in the late 50s, 60s, and 70s.  It is time that these two organizations begin to show the rest of us, myself included, how it is done.  They need to open the documents around these failures, the software and so forth (as allowed by national security), to public scrutiny.  The small amount of easily found information is not enough to be inspiring to the 
