The Road to Debugging Success

 

A colleague recently asked me how to debug an embedded Linux system that crashed in the Unix shell (and only there) when the buffer cache filled its memory. He added that the crash no longer occurred once he emptied the buffer cache.

One may indeed solve this problem through technical brilliance: by understanding the interactions that can lead to this situation. This is probably the input my colleague expected from me: a recommendation to add FOOBAR=xyzzy to the file /boot/config.bliz. However, in most cases we lack the knowledge and experience to troubleshoot problems this way. Therefore, the road that actually works involves more humdrum drudge work, such as the following.

  • Examining logs and adding logging statements
  • Tracing packets or system calls
  • Enabling debugging support and running a debugger
  • Adding test cases
  • Reviewing recent code changes
  • Constructing a minimal failing test case
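The last item, constructing a minimal failing test case, is the kind of drudge work that can be partly automated. The following is a minimal sketch, not a definitive implementation: it greedily drops chunks of an input while the failure still reproduces. The function name minimize and the predicate fails are hypothetical; in practice fails would run the candidate input on the target and report whether the crash still occurs.

```python
from typing import Callable, List, Sequence


def minimize(items: Sequence[str],
             fails: Callable[[List[str]], bool]) -> List[str]:
    """Greedily shrink `items` while `fails` keeps returning True.

    `fails` is a caller-supplied (hypothetical) predicate that returns
    True when the candidate input still reproduces the problem.
    """
    current = list(items)
    if not fails(current):
        raise ValueError("the starting input does not reproduce the failure")

    chunk = max(len(current) // 2, 1)
    while True:
        i = 0
        removed = False
        while i < len(current):
            # Try the input without the chunk starting at position i.
            candidate = current[:i] + current[i + chunk:]
            if candidate and fails(candidate):
                current = candidate   # the chunk was irrelevant: drop it
                removed = True
            else:
                i += chunk            # the chunk is needed: keep it
        if chunk > 1:
            chunk //= 2               # retry with finer-grained chunks
        elif not removed:
            return current            # no single item can be removed


if __name__ == "__main__":
    # Toy stand-in for a real reproduction: the "crash" needs both X and Y.
    steps = ["a", "b", "X", "c", "Y", "d"]
    print(minimize(steps, lambda c: "X" in c and "Y" in c))
```

The result is minimal only in the sense that removing any single remaining item makes the failure disappear; it is not guaranteed to be the smallest possible input.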

In this case I recommended two actions:

  • Compile bash with debugging support to see where it crashes
  • Run a memory diagnostic test

Both actions require work: downloading the source code of bash and compiling it, or finding a suitable memory diagnostic test and hooking up a console to the embedded system. However, these are the actions that can uncover the fault, or at least provide clues for narrowing it down.
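To give a flavour of what a memory diagnostic does, here is a rough user-space sketch, assuming a Python interpreter is available on the target. The function name pattern_test and its parameters are illustrative only. It writes a few bit patterns into a buffer and reads them back. Because of CPU caches and virtual memory it exercises far less than a dedicated tool such as memtester or Memtest86+, so a clean run proves little, whereas any reported mismatch is a strong hint of faulty hardware.

```python
from typing import Iterable, List, Tuple


def pattern_test(size: int,
                 patterns: Iterable[int] = (0x00, 0xFF, 0x55, 0xAA)
                 ) -> List[Tuple[int, int, int]]:
    """Fill a buffer with each byte pattern and verify it reads back intact.

    Returns a list of (offset, expected, actual) tuples, one per mismatch.
    """
    buf = bytearray(size)
    mismatches = []
    for pattern in patterns:
        buf[:] = bytes([pattern]) * size      # write the pattern everywhere
        if buf.count(pattern) == size:        # fast path: all bytes match
            continue
        for offset, value in enumerate(buf):  # slow path: locate the faults
            if value != pattern:
                mismatches.append((offset, pattern, value))
    return mismatches


if __name__ == "__main__":
    bad = pattern_test(1 << 20)  # 1 MiB; raise this to cover more memory
    if bad:
        print(f"{len(bad)} mismatches, first: {bad[0]}")
    else:
        print("no mismatches")
```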

In debugging, the role of brilliance is overrated.


Last modified: Thursday, February 16, 2017 10:55 am

Unless otherwise expressly stated, all original material on this page created by Diomidis Spinellis is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.