HAB Software Woes
John Graham-Cumming
September 2012

Or “My capsule didn’t crash but my software did”
Background
 • More than 30 years of programming experience

 • One HAB (high-altitude balloon) flight
    ◦ GAGA-1
http://blog.jgc.org/2011/04/gaga-1-flight.html
https://github.com/jgrahamc/gaga
Where’s your flight’s complexity?
 • Example: GAGA-1
    ◦ One balloon, parachute, polystyrene box
    ◦ Many metres of cord attached with knots
    ◦ An off-the-shelf camera

    ◦ 2,836 lines of code
    ◦ Common to see defect rates of 2 to 4 per KLOC (thousand lines of code)
    ◦ So GAGA-1 likely has 5 to 10 errors in it
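The estimate above is plain arithmetic, and a minimal sketch makes it easy to rerun for another payload. The function name and defaults are illustrative; only the 2,836 lines and the 2 to 4 defects per KLOC come from the slide. The result is about 5.7 to 11.3, the same ballpark as the slide's rounded 5 to 10.

```python
def estimated_defects(lines_of_code, per_kloc_low=2.0, per_kloc_high=4.0):
    """Return (low, high) expected defect counts for a code base."""
    kloc = lines_of_code / 1000.0
    return kloc * per_kloc_low, kloc * per_kloc_high


low, high = estimated_defects(2836)
print("Expected defects: %.1f to %.1f" % (low, high))
```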
Real Stuff Seen on HAB flights
 • Complete computer crash
 • Altitude going negative
 • Latitude and longitude garbled
 • Cutdown triggered in back of car
 • Long periods of no transmission
 • Not setting the GPS up before launch
 • Not turning the camera on
 • Running out of camera disk space
 • Altitude jumping around rhythmically
The Curse and Joy of Determinism
 • Computers do what you tell them to
    ◦ Precisely what you tell them to
    ◦ Not what you think you told them to do
 • A Curse
    ◦ Will do things you don’t expect
    ◦ Will process bogus input without complaint
 • The Joy
    ◦ Easy to test that it does what’s expected
HAB Is A Harsh Environment
 • Cold
 • Vibration
 • Stuff breaks in flight

 • Software needs to be able to cope with failing hardware
 • Very important to think about failure modes
 • YOUR CODE IS ON ITS OWN OUT THERE
Deadly Sins
 • The “It works!” Fallacy
 • The Last Minute Change
 • Being Far Too Clever
 • Overlooking Odd Behaviour
 • Copying Other People’s Code
 • Assuming Finding A Bug Solves The Problem
The “It works!” Fallacy
 • If you’re an inexperienced (and sometimes experienced) programmer…
    ◦ You hack some code together
    ◦ It works once
    ◦ You assume it will always work

 • Only solution to this is
    ◦ Testing
    ◦ Paranoia
The Last Minute Change
 • Never, ever change anything in code at the last minute, no matter how simple.
 • Example: HABE 1
    ◦ Complete camera failure
    ◦ Maximum integer size in uBASIC on CHDK is 999,999
    ◦ Last minute change of integer from 600,000 to 1,000,000 caused total failure
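A pre-flight range check is one way such a change could be caught on the ground. The sketch below is only an illustration: the 999,999 limit is the figure quoted on the slide, while the function, the settings dictionary and the `interval` name are hypothetical and not taken from the HABE 1 script.

```python
UBASIC_MAX_INT = 999999  # limit quoted on the slide for uBASIC on CHDK


def out_of_range(settings, limit=UBASIC_MAX_INT):
    """Return the names of settings whose values exceed the interpreter limit."""
    return sorted(name for name, value in settings.items() if value > limit)


print(out_of_range({"interval": 600000}))   # the original value fits
print(out_of_range({"interval": 1000000}))  # the last-minute value does not
```

A check like this does not replace the rule on the slide; it only turns one class of silent failure into a visible one.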
Being Far Too Clever
 • Example: GAGA-1
    ◦ Entered the wrong value of 2 * pi in code to do GPS position conversion from radians to degrees

    ◦ Caught before flight because I verified the location of my own back garden

    ◦ Note to self: 2 * pi != 6.2818 (it is approximately 6.2832).

https://github.com/jgrahamc/gaga/blob/master/gaga-1/flight/gaga1/gps.cpp#L113
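To see how much a mistyped constant matters, the sketch below converts the same angle with the library value of pi and with the 6.2818 from the slide. The 51.5 degree latitude is an arbitrary test value, and the formula is an assumed form of the conversion, not a copy of the flight code. With these inputs the two results differ by roughly a hundredth of a degree, which is on the order of a kilometre on the ground.

```python
import math


def radians_to_degrees(angle_rad):
    """Convert radians to degrees using the library value of pi."""
    return angle_rad * 360.0 / (2.0 * math.pi)


WRONG_TWO_PI = 6.2818  # the mistyped constant from the slide

angle = math.radians(51.5)  # hypothetical latitude, expressed in radians
good = radians_to_degrees(angle)
bad = angle * 360.0 / WRONG_TWO_PI
print("correct: %.4f  with typo: %.4f" % (good, bad))
```

Taking the constant from the language's maths library, where one exists, removes the chance of typing it wrong.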
Overlooking Odd Behaviour
 • Example: GAGA-1
    ◦ In tests RTTY output was fine some of the time, garbled at other times
    ◦ Turned out to be interrupts from the GPS messing up the RTTY timing
    ◦ Solution: disable GPS serial interface while sending RTTY string

 • ALWAYS BE HONEST WITH YOURSELF ABOUT YOUR CODE
 • EXPECT THE SPANISH INQUISITION!

https://github.com/jgrahamc/gaga/blob/master/gaga-1/flight/gaga1/tsip.cpp#L229
Copying Other People’s Code
 • Don’t do this: you have no idea what you are copying or who they copied it from
 • Better practice is to look at other people’s code and…
    ◦ Write your own version
    ◦ That you understand
    ◦ That you are able to test
    ◦ Example: GAGA-1
       ▪ Read lots of people’s RTTY code, wrote my own
https://github.com/jgrahamc/gaga/blob/master/gaga-
APRS Tracker using copied code

 If the altitude in metres contained an 8 or a 9, the altitude reported would be wrong

http://sharon.esrac.ele.tue.nl/users/pe1rxq/aprstracker/aprstracker.html
Assuming Finding The Bug Solves The Problem
 • Just because you’ve found A bug doesn’t mean it was THE bug
 • Lots of research in computer science shows bugs tend to cluster
 • Example: CLOUD1, CLOUD2
    ◦ Three bugs in printing latitude, longitude and altitude
    ◦ One fixed on CLOUD1, …
“The One Thing I Didn’t Test”

http://ukhas.org.uk/guides:common_coding_errors_payload_testing
Common problems with microcontrollers (uC)
 • Lack of floating point support
 • Small integers
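Both problems can be reproduced on a PC. The sketch below simulates a metres-to-feet conversion on a target where `int` is 16 bits and there is no floating point, so the code multiplies by 33 and divides by 10. That fixed-point formula is a hypothetical stand-in chosen for illustration, and `ctypes.c_int16` is used only to mimic the wraparound commonly seen when a 16-bit intermediate overflows.

```python
import ctypes


def m_to_f_int16(metres):
    """Metres to feet with a 16-bit signed intermediate (m * 33 / 10)."""
    product = ctypes.c_int16(metres * 33).value  # wraps outside -32768..32767
    return int(product / 10)                     # truncate toward zero, as C does


def m_to_f_wide(metres):
    """Same formula with an intermediate wide enough to hold the product."""
    return (metres * 33) // 10


for m in (500, 1000):
    print(m, m_to_f_int16(m), m_to_f_wide(m))
```

At 500 m both versions agree; at 1000 m the product 33,000 no longer fits in 16 bits and the narrow version reports a negative altitude.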
You might never be a great programmer…

… but you can be a paranoid tester!
Good Things To Do
 • No infinite loops
 • Self-Checking
 • Unexpected Error Handling
 • Handle Exceptions
 • Simulation
 • Simplify, Simplify, Simplify
 • Unit Test
 • Write Log Files
No Infinite Loops
 • Never sit in a loop waiting forever
 • Example: ATLAS 3
    while (1) {
      // Make sure data is available to read
      if (Serial.available()) {
        b = Serial.read();

        if (bytePos == 8) {
          navmode = b;
          return true;
        }

        bytePos++;
      }
      // Timeout if no valid response in 3 seconds
      if (millis() - startTime > 3000) {
        navmode = 0;
        return false;
      }
    }
    }  // closes the enclosing function, whose header is not shown on the slide
https://github.com/jamescoxon/Atlas-Flight-Computer/blob/master/Atlas3/Atlas3_3.pde#L
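The same bounded-wait pattern can be sketched in Python so that it is testable on a PC. Everything here is illustrative: `read_byte` is a hypothetical callable that returns an integer, or `None` when nothing is waiting, and the function is not a port of the ATLAS 3 routine.

```python
import time


def read_nav_mode(read_byte, timeout_s=3.0, wanted_pos=8, clock=time.monotonic):
    """Return the byte at position wanted_pos, or None once timeout_s elapses."""
    start = clock()
    pos = 0
    while clock() - start <= timeout_s:
        b = read_byte()
        if b is None:
            continue  # nothing to read yet; go round and re-check the deadline
        if pos == wanted_pos:
            return b
        pos += 1
    return None  # deadline passed: give up instead of hanging


replies = iter(range(20))
print(read_nav_mode(lambda: next(replies)))
```

Passing the clock in as a parameter is a design choice that lets a test substitute a fake clock instead of waiting for real time to pass.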
Self-Checking
  -- Now enter a self-check of the manual mode settings

  log( "Self-check started" )

  assert_prop(  49, -32764, "Not in manual mode" )
  assert_prop(   5,      0, "AF Assist Beam should be Off" )
  assert_prop(   6,      0, "Focus Mode should be Normal" )
  assert_prop(   8,      0, "AiAF Mode should be On" )
  assert_prop(  21,      0, "Auto Rotate should be Off" )
  assert_prop(  29,      0, "Bracket Mode should be None" )
  assert_prop(  57,      0, "Picture Mode should be Superfine" )
  assert_prop(  66,      0, "Date Stamp should be Off" )
  assert_prop(  95,      0, "Digital Zoom should be None" )
  assert_prop( 102,      0, "Drive Mode should be Single" )
  assert_prop( 133,      0, "Manual Focus Mode should be Off" )
  assert_prop( 143,      2, "Flash Mode should be Off" )
  assert_prop( 149,    100, "ISO Mode should be 100" )
  assert_prop( 218,      0, "Picture Size should be L" )
  assert_prop( 268,      0, "White Balance Mode should be Auto" )
  assert_gt( get_time("Y"), 2009, "Unexpected year" )
  assert_gt( get_time("h"), 6, "Hour appears too early" )
  assert_lt( get_time("h"), 20, "Hour appears too late" )
  assert_gt( get_vbatt(), 3000, "Batteries seem low" )
  assert_gt( get_jpg_count(), ns, "Insufficient card space" )
https://github.com/jgrahamc/gaga/blob/master/gaga-1/camera/gaga-1.lua#L96
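A table-driven self-check of this kind is easy to sketch in Python. The helper below is hypothetical and does not reproduce how `assert_prop` behaves in the Lua script; in particular, reporting every failure rather than stopping at the first one is an assumption made for this example.

```python
def run_self_check(checks, log=print):
    """Run every check and return the messages of the ones that failed.

    checks is a list of (actual, expected, message) tuples.
    """
    failures = [msg for actual, expected, msg in checks if actual != expected]
    for msg in failures:
        log("SELF-CHECK FAILED: " + msg)
    return failures


# Hypothetical readings standing in for values read from the camera.
readings = {"flash_mode": 2, "iso": 200}
run_self_check([
    (readings["flash_mode"], 2, "Flash Mode should be Off"),
    (readings["iso"], 100, "ISO Mode should be 100"),
])
```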
Self-Checking
 • Example: ATLAS 3
 • Makes sure uBlox GPS will work at high altitude; fixes it if not
    if ((count % 10) == 0) {
      digitalWrite(6, LOW);
      checkNAV();
      delay(1000);
      if (navmode != 6) {
        setupGPS();
        delay(1000);
      }
      checkNAV();
      delay(1000);
      digitalWrite(6, HIGH);
    }

https://github.com/jamescoxon/Atlas-Flight-Computer/blob/master/Atlas3/Atlas3_3.pde#L3
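The check-then-repair idea can be written as a small reusable function. In this sketch `check_nav` and `setup_gps` are hypothetical callables standing in for the `checkNAV()` and `setupGPS()` routines named on the slide, and the cap on attempts is an assumption added here; the slide's code instead runs its check once every ten loop counts.

```python
def ensure_nav_mode(check_nav, setup_gps, wanted=6, max_attempts=3):
    """Re-apply the GPS setup until check_nav() reports the wanted mode.

    Returns True once the mode is confirmed, False after max_attempts repairs.
    """
    for _ in range(max_attempts):
        if check_nav() == wanted:
            return True
        setup_gps()
    return check_nav() == wanted


# A fake receiver that starts in the wrong mode and is fixed by one setup call.
state = {"mode": 0}
print(ensure_nav_mode(lambda: state["mode"], lambda: state.update(mode=6)))
```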
Unexpected Error Handling
    def temperature():
        t = at.cmd( 'AT#TEMPMON=1' )

        # Command returns something like:
        #
        # #TEMPMEAS: 0,28
        #
        # OK
        #
        # So split on whitespace first to isolate the field 0,28
        # and then split on comma to get the temperature

        w = t.split()
        if len(w) < 2:
            logger.log( "Temperature read returned %s" % t )
            return -1000

        m = w[1].split(',')
        if len(m) != 2:
            logger.log( "Temperature read returned %s" % t )
            return -1000
        else:
            return int(m[1])

https://github.com/jgrahamc/gaga/blob/master/gaga-1/recovery/util.py
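Separating the parsing from the AT command makes it possible to throw malformed replies at the parser on a PC. The sketch below follows the same split-and-check steps as the slide; the function name is hypothetical, and the `ValueError` guard for a non-numeric field is an addition that is not in the code shown above.

```python
def parse_temperature(reply, error_value=-1000):
    """Extract the temperature from a reply shaped like '#TEMPMEAS: 0,28 OK'."""
    words = reply.split()
    if len(words) < 2:
        return error_value
    fields = words[1].split(',')
    if len(fields) != 2:
        return error_value
    try:
        return int(fields[1])
    except ValueError:
        return error_value  # e.g. '#TEMPMEAS: 0,xx'


for reply in ("#TEMPMEAS: 0,28\n\nOK", "", "ERROR", "#TEMPMEAS: 0,xx"):
    print(repr(reply), parse_temperature(reply))
```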
Handle Exceptions
 • If your language can generate exceptions then you’d better handle them!
 • Example: GAGA-1
    ◦ Recovery computer used Python
    ◦ Exception could have killed it
    ◦ Global exception handler
    except:
        logger.log( "Caught exception in main loop: %s" % sys.exc_info()[1] )

 • Bonus: What’s wrong with that code?
https://github.com/jgrahamc/gaga/blob/master/gaga-1/recovery/gaga-1.py#L144
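The slide leaves its bonus question open, so the sketch below is one possible hardening and not the author's stated answer. It assumes two concerns: a bare `except:` also catches `SystemExit` and `KeyboardInterrupt`, and the logging call inside the handler can itself raise. The names `step`, `log` and `keep_going` are hypothetical.

```python
import traceback


def run_forever(step, log, keep_going=lambda: True):
    """Call step() repeatedly; log any exception and carry on."""
    while keep_going():
        try:
            step()
        except Exception:
            try:
                log("Caught exception in main loop:\n" + traceback.format_exc())
            except Exception:
                pass  # logging failed too; never let that stop the loop
```

Whether a flight computer should also swallow `SystemExit` is a judgment call; on hardware that must never stop, catching everything may be exactly what is wanted.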
Simulation
 • Simulate a flight
 • Example: UKHAS wiki has example of using a PC as a fake GPS
http://www.ukhas.org.uk/guides:common_coding_errors_payload_testing

 • Example: GAGA-1
    ◦ To test the embedded Telit module, I wrote modules that faked the entire Telit Python interface.
https://github.com/jgrahamc/gaga/blob/master/gaga-1/recovery/GPS.py
https://github.com/jgrahamc/gaga/blob/master/gaga-1/recovery/MDM.py
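A fake module can be very small and still drive flight code through a whole ascent and descent on a desk. The class and method names below are hypothetical; they are not the interface of the Telit GPS module or of the GPS.py file linked above.

```python
class FakeGPS:
    """Replays a scripted flight instead of talking to real hardware."""

    def __init__(self, track):
        self._track = list(track)  # (latitude, longitude, altitude_m) tuples
        self._index = 0

    def get_position(self):
        """Return the next scripted fix; repeat the last one when exhausted."""
        fix = self._track[min(self._index, len(self._track) - 1)]
        self._index += 1
        return fix


def simulated_flight(lat, lon, top_m, step_m):
    """Build a straight-up ascent followed by a descent back to the ground."""
    up = list(range(0, top_m + 1, step_m))
    return [(lat, lon, alt) for alt in up + up[-2::-1]]


gps = FakeGPS(simulated_flight(52.0, 0.0, 30000, 10000))
for _ in range(8):
    print(gps.get_position())
```

Code under test would receive a `FakeGPS` wherever it normally receives the real module, which is the substitution the slide describes.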
Simplify, Simplify, Simplify
 • Make your code as simple as possible
 • Never have ‘duplicated’ or ‘copy and paste’ code
 • Break it up into small functions that you understand
 • Make sure you understand the limitations of the functions you call
Unit Test
 • Break your program up into small, separate functions
 • Write tests that call each function and make sure it does what you expect.
 • Lots of ways to do this
    ◦ Use something like cpptest
    ◦ ArduinoUnit
    ◦ Write your own test program
Unit Test Example
 • In the bad APRS program
 • Turn the metres-to-feet code into a separate function: int m_to_f(int m)
    assertEquals(m_to_f(1000),3300)
    assertEquals(m_to_f(2000),6600)
    assertEquals(m_to_f(3000),9900)
    assertEquals(m_to_f(4000),13200)
    assertEquals(m_to_f(5000),16500)
    assertEquals(m_to_f(6000),19800)
    assertEquals(m_to_f(7000),23100)
    assertEquals(m_to_f(8000),26400)
    assertEquals(m_to_f(9000),29700)
    assertEquals(m_to_f(10000),33000)
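The same idea can be run on a PC with Python's `unittest`. The `m_to_f` below is a hypothetical stand-in that uses the 3.3 feet per metre approximation implied by the expected values above (the exact factor is about 3.281); it is not the APRS tracker's code. The second test targets altitudes containing the digits 8 and 9, since those are the ones reported as wrong.

```python
import unittest


def m_to_f(m):
    """Metres to feet using integer arithmetic and a 3.3 ft/m approximation."""
    return (m * 33) // 10


class MetresToFeetTest(unittest.TestCase):
    def test_round_thousands(self):
        expected = {1000: 3300, 2000: 6600, 5000: 16500,
                    8000: 26400, 9000: 29700, 10000: 33000}
        for metres, feet in expected.items():
            self.assertEqual(m_to_f(metres), feet)

    def test_digits_eight_and_nine(self):
        expected = {8: 26, 9: 29, 89: 293, 98: 323, 1800: 5940,
                    1900: 6270, 8999: 29696, 9888: 32630, 28989: 95663}
        for metres, feet in expected.items():
            self.assertEqual(m_to_f(metres), feet)
```

Saved in a file, the tests can be run with `python -m unittest` followed by the file name.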
Write Log Files
 • Write detailed log files to non-volatile memory for post-flight debugging
 • Data sent via RTTY or APRS is limited
 • Log exceptions and errors in detail
 • Make sure you have a timestamp
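A minimal sketch of a timestamped, append-only log follows. The function name and line format are assumptions for illustration, and calling `os.fsync` after every line is one possible way to limit what is lost if power fails; it costs write speed and may not suit every payload.

```python
import os
import time


def append_log(path, message, now=time.time):
    """Append one timestamped line and push it to storage before returning."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now()))
    with open(path, "a") as f:
        f.write("%s %s\n" % (stamp, message))
        f.flush()
        os.fsync(f.fileno())  # ask the OS to write it out now
```

Calling `append_log("flight.log", "GPS lock acquired")` adds one line such as `2012-09-01T10:15:00Z GPS lock acquired`, with the time taken from the system clock in UTC.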
Perform system testing
 • Test your entire system before flight
    ◦ Put your tracker in the garden
    ◦ Get a GPS lock
    ◦ Listen to the RTTY on your radio
    ◦ Look at the decoded RTTY on your computer
    ◦ Test uploaded data on the tracker*

    ◦ *I didn’t do that step; on the day, people had to fix the tracker for me.
