
MSFC To Safety Contractor: Just Ignore Those SLS Software Issues

By Keith Cowing
NASA Watch
December 12, 2016

Keith’s note: According to sources at NASA MSFC, the contractor performing software safety tests found massive errors in the SDF test cases (no hardware testing, just software). The quality check of the test cases was given a stand-down order by George Mitchell, Andy Gamble’s deputy on SLS flight software safety. Mitchell had already told the contractor that they were not allowed to impact the testing or ask for re-testing. Further, he said that these tests had already been accepted as successfully verifying the flight software requirements. In other words, the issues raised by NASA MSFC employees about SLS flight software testing are officially moot.
SLS Flight Software Safety Issues Continue at MSFC, earlier post
SLS Flight Software Safety Issues at MSFC (Update), earlier post


13 responses to “MSFC To Safety Contractor: Just Ignore Those SLS Software Issues”

  1. BeanCounterFromDownUnder says:

    Could someone explain in non-technical language what this actually means? Is it a real problem like it sounds, or is it just technical jargon for issues that have no real impact? Sounds like the former, but …
    Cheers and thanks in advance.

    • fcrary says:

      There isn’t really enough information to tell. It looks like they had a test procedure (a set of tests, the criteria for passing, etc.) and the software passed. It looks like the contractor subsequently decided that this procedure was in some way flawed. If someone is saying “massive errors”, then they must be pretty serious about it. The folks at MSFC have apparently decided that the test procedure was fine, and they aren’t going to go back and revise it and retest.

      Unfortunately, this sort of thing isn’t uncommon when testing flight software. It just isn’t practical to test every scrap of code with every single permutation of each and every input parameter. The normal practice (at least for unmanned spacecraft, which is what I’m familiar with) is to “test as you fly”. The tests cover the expected operating conditions, including off-spec and anomalous conditions like hardware failures. But at some point, in writing the test procedures, people have to make a judgement call and say, “we don’t need to test that case, because the odds of it ever coming up are a million to one against.” Those procedures get a pretty thorough review, so those judgement calls aren’t just one guy’s whim. But reasonable people can disagree, and after the fact, someone can realize there are some situations that weren’t tested and might come up. Sometimes, it can even be a case where it’s clear the code would hit a serious bug.
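
      To make the combinatorics concrete, here is a toy sketch in Python (nothing here is SLS code; the inputs and their values are invented for illustration). Even four inputs with three representative values each multiply into a space nobody tests exhaustively, which is why a procedure lists the specific scenarios someone judged worth running:

        from itertools import product

        # Four hypothetical inputs, each sampled at just a few values.
        engine_states = ["all_nominal", "one_out", "two_out"]
        sensor_health = ["good", "stale", "failed"]
        throttle_cmds = [0.7, 0.9, 1.0]
        wind_profiles = ["calm", "shear", "gust"]

        # Exhaustive coverage: 3 * 3 * 3 * 3 = 81 cases from only four
        # inputs; real flight software has vastly more, so the full
        # space blows up immediately.
        exhaustive = list(product(engine_states, sensor_health,
                                  throttle_cmds, wind_profiles))
        print(f"exhaustive cases: {len(exhaustive)}")

        # What a test procedure actually holds: a judgement-based subset.
        # Everything not on the list is consciously left untested.
        selected = [
            ("all_nominal", "good", 1.0, "calm"),    # nominal ascent
            ("one_out", "good", 1.0, "shear"),       # engine-out, bad weather
            ("all_nominal", "failed", 0.9, "calm"),  # failed sensor, throttled
        ]
        print(f"tested: {len(selected)} of {len(exhaustive)} "
              f"({len(selected) / len(exhaustive):.0%})")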

      But, from what we know, we just can’t tell. There is a disagreement over how comprehensive the tests were. That could be nothing or it could be a very real problem. Without more information, there is no way to tell.

      • Chris Winter says:

        You’re right; but given the history of NASA, this causes me concern. The HST’s main mirror comes to mind.

    • numbers_guy101 says:

      I would really like to get the inside scoop on what’s happening. To put it in plain English, based on stories I’ve seen that sound similar: this kind of conflict among people is often a combination of bad requirements and poorly defined roles.

      So requirements get written around process, but the auditor or reviewer thinks they are there to check results. A requirement might be for the software version, by whatever date, to have the basis for some whatever. Check! It’s hard to fail that audit, unless the reviewer was actually dense enough to think software rev X was supposed to be a nearly done attempt at actually doing something.

      I have been that dense myself. In retrospect … I was corrected and put back on the proper path, helping assure everything got done … whenever …

  2. Daniel Woodard says:

    It would be nice to have more information about the actual problems they identified.

    • Bill Housley says:

      It might just be CYA on the part of the contractor, in case something that was missed in the testing (and there is likely to be some, from what fcrary pointed out above) becomes an in-flight issue later.

  3. mfwright says:

    I wonder if the Saturn rockets had the advantage that computer tech was limited, so the software had to be compact. Unlike SLS, with a zillion lines of code, where trying to isolate fault modes must be much more difficult?

    • Michael Spencer says:

      And hence my not-well-received comment some time back, wondering why we haven’t figured out yet how to build rockets.

      Maybe someone will explain exactly what SLS will do that Saturn V couldn’t do.

      • fcrary says:

        Two examples have occurred to me since your earlier comment. Unfortunately, neither is about SLS in particular.

        Modern software can automate much of the work previously done by the launch control team. The new Japanese Epsilon small launch vehicle requires a team of _eight_ according to one source (20 according to another), compared to 150 people for the previous-generation launch vehicle of similar capability. Along these lines, on the Falcon 9 CRS-1 launch they had an engine cut off early, but successfully completed the primary mission by throttling up the remaining eight engines, burning them longer and/or adjusting the trajectory. Something similar happened on the Saturn V/Apollo 13 launch, but the adjustments were done by the folks on the ground. If I understand it correctly, the Falcon 9 was smart enough to do it on its own.
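
        As a back-of-the-envelope sketch of what that onboard replanning has to accomplish (all numbers invented for illustration; this is not SpaceX’s actual logic):

          # Toy numbers, not real Falcon 9 figures: lose 1 of 9 engines
          # and the remaining 8 must deliver roughly the same total
          # impulse, by throttling up and/or burning longer.
          n_engines, n_remaining = 9, 8
          nominal_burn_s = 180.0                     # assumed burn time

          thrust_fraction = n_remaining / n_engines  # ~0.89 of nominal
          extended_burn_s = nominal_burn_s / thrust_fraction
          print(f"same total impulse: ~{extended_burn_s:.0f} s "
                f"instead of {nominal_burn_s:.0f} s")  # about 202 s

          # Gravity losses grow with the longer burn, so the vehicle must
          # re-solve the trajectory, not just stretch the clock -- that
          # replanning is the part done onboard.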

        Another example is the use of parking orbits. They have become less common, in favor of doing the whole injection at once. That’s more efficient, but it requires each stage and burn to work, with high precision. Parking orbits give the folks on the ground time to replan and refine the next burn, to correct for delivery errors. But if the launch vehicle is precise enough, you don’t need that and can use a more efficient trajectory.
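
        A toy comparison of that trade (figures invented purely for illustration, not drawn from any real mission):

          # Invented numbers: the point is only that a parking orbit turns
          # an unknown ascent error into a measured one before the next burn.
          ascent_error = 10.0  # m/s, assumed dispersion at first cutoff
          growth = 3.0         # assumed growth of an uncorrected error by
                               # the time a later midcourse burn removes it

          # Parking orbit: tracking measures the error, the next burn is
          # retargeted, and the correction costs roughly the error itself.
          with_parking_orbit = ascent_error

          # Direct injection: the error rides along and costs more later.
          direct_injection = ascent_error * growth

          print(f"correction cost, parking orbit: ~{with_parking_orbit:.0f} m/s")
          print(f"correction cost, direct ascent: ~{direct_injection:.0f} m/s")
          # ...so the one-shot approach pays off only if guidance keeps
          # ascent_error small in the first place.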

        This does not address two questions. First, does this apply to SLS, rather than modern launch vehicles in general? I have no idea; I haven’t seen the SLS specs in sufficient detail. Second, are the added efficiencies worth the added complexity? That isn’t clear. When it comes from dropping a 150-person launch team down to eight, I suspect it is (salaries are a bigger part of the cost than many suspect). But using more efficient trajectories is mass efficient; cost efficiency is not the same thing, so I’d call that an open issue.

        • kcowing says:

          You used “modern” and “Software” in a post about SLS. Tsk tsk.

          • fcrary says:

            I’ll plead not guilty. I was talking about Epsilon and Falcon 9, and said I wasn’t sure if those issues applied to SLS. But, in fact, the question was a comparison between SLS and the Saturn V. By those standards, SLS is more modern: code written in a compiled language, not assembly or binary (how did they code flight software on the Saturn V?). But you have a point. Just because it’s developed on modern computers and runs on microprocessors that weren’t available in 1968, that doesn’t make it modern. Just _more_ modern than the computers on a Saturn.

        • Michael Spencer says:

          A thoughtful response with issues I hadn’t considered; automation at that scale is significant; the size of ground teams is an issue we’ve discussed before. The efficiency delta is an open question.

    • fcrary says:

      Yes, it would be vastly easier to test short, simple code than the sort used on modern spacecraft. Object-oriented code is also creeping in and, despite its advantages, that also makes testing more difficult. Multitasking and operating systems are already here. But all these things add capabilities, and it is generally assumed that those capabilities are worth it. I’ve rarely seen that assumption justified. Not that it couldn’t be; it’s just that no one thinks justifying it is worth doing.