On the safe side
Last week we had a rare event: HiRISE turned off! We call this safe mode, because it’s a safety measure built into the instrument’s software. Whenever any sensor reading, such as a temperature or a voltage, goes out of bounds, the instrument powers down to prevent damage to the electronics. In this case, one temperature sensor went over its upper limit of 35 degrees Celsius. It’s pretty disconcerting when something unexpected like this happens, but at least we know the instrument is protected.
We had the difficult detective job of figuring out what went wrong. It was clear early on that the instrument overheated, but we couldn’t figure out why. Our tool that predicts the temperatures (“HiTemp”) didn’t predict anything that hot. We didn’t take a really large image, which would heat us up (at least, nothing bigger than normal!). The local operations team worked with the health & safety people, the spacecraft engineers at LMA, and some of the software developers at Ball Aerospace who originally designed HiRISE. Together we all investigated the problem.
We studied the telemetry (information from the spacecraft) and the details of the commands that were sent to the instrument, and we re-modeled the temperatures and memory use. The problem was complicated by several other unusual events that occurred around the same time. First, the memory on board the spacecraft (the “Solid State Recorder”, or SSR) had filled up because one of the dishes of the Deep Space Network was broken. This meant we couldn’t send data back to Earth, so it piled up in the memory until it overflowed. Second, HiRISE’s “keep-alive counter” was withheld. This is a steady heartbeat HiRISE sends to MRO that indicates HiRISE is still running. After a certain number of heartbeats are missed, MRO will safe HiRISE. Third, there were some errors in the spacecraft’s software around the same time. The timing was also mysterious: HiRISE safed about 15 minutes after an image. This is a long time afterwards – the image should have been completely done within just a few minutes. Instead, the temperature sensors showed that we continued to heat up for 15 minutes!
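For readers who like to see the idea in code, here is a minimal sketch of how a keep-alive check can work in general. It is a toy illustration, not the actual MRO or HiRISE flight software: the class name, the method, and the threshold of three missed heartbeats are all made up for the example (the real number isn’t given here).

```python
class KeepAliveWatchdog:
    """Toy model of a host that safes an instrument after missed heartbeats.

    Everything here is hypothetical: the names and the default threshold
    are illustrative assumptions, not values from the real spacecraft.
    """

    def __init__(self, max_missed=3):
        self.max_missed = max_missed  # assumed threshold, for illustration only
        self.last_counter = None      # last counter value seen from the instrument
        self.missed = 0               # consecutive cycles with no new counter value
        self.safed = False

    def check(self, counter):
        """Call once per monitoring cycle with the latest counter value seen.

        Returns True once the instrument has been safed.
        """
        if self.safed:
            return True
        if counter is not None and counter != self.last_counter:
            # The counter advanced: the instrument is alive, so reset the tally.
            self.last_counter = counter
            self.missed = 0
        else:
            # The counter was withheld (unchanged or absent) this cycle.
            self.missed += 1
            if self.missed >= self.max_missed:
                self.safed = True
        return self.safed


watchdog = KeepAliveWatchdog(max_missed=3)
# The counter advances for three cycles, then stops changing.
history = [watchdog.check(c) for c in [1, 2, 3, 3, 3, 3]]
print(history)
```

In this sketch the instrument is only safed after three cycles in a row without a new counter value, so a single late heartbeat would not trigger a shutdown.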
Finally, after a day of research, we found an answer. What happened was this: First, HiRISE did all the setup steps to take an image (set the number of lines, etc.). One of these steps turns on the CCDs (sensors) in the camera. Then, right before it was about to actually expose the image, it found out that the memory was full. Since there wasn’t enough room in memory for the data, it didn’t take the image. However, everything was left turned on! So with everything powered on, we continued to heat up until we reached the limits we have set to protect the instrument. Reaching the limit withheld the keep-alive counter, and HiRISE safed. So in fact, the instrument worked exactly as it should have, in order to keep itself out of danger. It was just an unexpected response to this unusual situation.
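The chain of events can be sketched as a small toy model. This is only an illustration of the logic described above, not how HiRISE is actually programmed: the function names, the starting temperature, and the heating rate are invented for the example and are not calibrated to the real instrument. Only the 35 degree limit comes from the event itself.

```python
UPPER_LIMIT_C = 35.0  # upper limit of the temperature sensor that tripped


def attempt_image(memory_free, image_size):
    """Return True if the CCDs are left powered on after the attempt.

    Hypothetical model: setup always powers on the CCDs, and only a
    completed exposure powers them back down.
    """
    ccds_on = True  # setup step: turn on the CCDs
    if image_size > memory_free:
        # Not enough room for the data: the exposure is skipped,
        # but nothing on this path turns the CCDs off again.
        return ccds_on
    # Normal path: expose, read out, then power down.
    ccds_on = False
    return ccds_on


def minutes_until_safe(start_temp_c, heating_c_per_min, ccds_on, max_minutes=120):
    """Return the minute at which the limit is exceeded, or None if it never is."""
    if not ccds_on:
        return None  # nothing is heating the instrument
    temp = start_temp_c
    for minute in range(1, max_minutes + 1):
        temp += heating_c_per_min
        if temp > UPPER_LIMIT_C:
            return minute
    return None


# Full memory: the image is skipped but the CCDs stay on.
left_on = attempt_image(memory_free=0, image_size=100)
# Assumed values for illustration only: start at 20 C, warm by 1 C per minute.
print(left_on, minutes_until_safe(20.0, 1.0, left_on))
```

With these made-up numbers the limit is crossed after a quarter of an hour or so, which is the same flavor of delayed shutdown we saw: the trouble starts at the skipped image, but the safing comes much later.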

With the help of the LMA engineers, we were able to power HiRISE back on the following day and start imaging again very quickly. Thankfully, we were up & running in time for another very special observation that I’ll be writing about soon….


March 3rd, 2008 at 1:50 pm
[...] for example.) PSP_007338_2640 happened to be the first image we took after powering back on after a safing event. So we were examining the image to make sure the camera was still working OK (it is – as you can [...]
March 3rd, 2008 at 9:41 pm
[...] Wow. This picture shows a billowing dust cloud caused by an avalanche from the left part of the picture. The picture was taken straight down and the reddish bands just left of center are a 2300 foot cliff with a 60 degree slope. The white on the left is carbon dioxide frost that is sublimating as the Martian spring advances. You can read more about this photo here and here. It’s interesting that the photo may have ended up being stored for a few weeks before being examined. Generally there are too many photos coming in from the HiRISE camera on the Mars Reconnaissance Orbiter to analyze them as they arrive on Earth. However, the avalanche photo just happened to be the first photo taken with the camera after the camera automatically shut down due to a safing event. [...]
March 5th, 2008 at 5:03 pm
[...] us a comfortable buffer below the scary solid red line. That’s when HiRISE would shut off, or safe. We know from experience by now that this is a big pain in the neck – a lot of work is required to [...]
September 8th, 2008 at 6:24 am
Very cool pictures.