The Road to Debugging Success
A colleague recently asked me how to debug a Linux embedded system that crashed in the Unix shell (and only there), when its memory got filled through the buffer cache. He added that when he emptied the buffer cache the crash no longer occurred.
One may indeed solve this problem through technical brilliance: understanding the interaction that can lead to this situation. This is probably the input my colleague expected from me: a recommendation to add FOOBAR=xyzzy in file /boot/config.bliz. However, in most cases we lack the knowledge and experience for troubleshooting problems through this path. Therefore, the road that actually works involves more humdrum drudge work:
- Examining logs and adding logging statements
- Tracing packets or system calls
- Enabling debugging support and running a debugger
- Adding test cases
- Reviewing recent code changes
- Constructing a minimal failing test case
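The last item in this list, constructing a minimal failing test case, is drudge work that can be partly automated. The following is a minimal sketch, assuming the failure can be reproduced by a function that takes a list of inputs (commands, files, configuration lines) and reports whether the fault still occurs. The names minimize and fails are hypothetical, and the approach is a simplified greedy reduction rather than a full delta-debugging implementation.

```python
from typing import Callable, List, Sequence


def minimize(items: Sequence, fails: Callable[[List], bool]) -> List:
    """Greedily shrink `items` while `fails` keeps returning True.

    `fails` is a hypothetical, caller-supplied predicate that reruns the
    scenario with the given inputs and returns True if the fault occurs.
    """
    current = list(items)
    if not fails(current):
        raise ValueError("the full input does not trigger the failure")

    # Start by trying to drop half of the input at a time.
    chunk = max(len(current) // 2, 1)
    while True:
        start = 0
        removed_any = False
        while start < len(current):
            # Try the input with one chunk left out.
            candidate = current[:start] + current[start + chunk:]
            if fails(candidate):
                current = candidate      # chunk was irrelevant: drop it
                removed_any = True
            else:
                start += chunk           # chunk is needed: keep it, move on
        if chunk == 1 and not removed_any:
            break                        # no single item can be removed
        if not removed_any:
            chunk //= 2                  # retry with finer granularity
    return current


if __name__ == "__main__":
    # Toy failure: the fault needs both item 3 and item 17 to be present.
    result = minimize(list(range(20)), lambda xs: 3 in xs and 17 in xs)
    print(result)
```

In practice the predicate is the expensive part: it might reboot the device and replay the remaining steps. The reduction loop merely decides which steps to try leaving out next.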
In this case I recommended two actions:
- Compile bash with debugging support to see where it crashes
- Run a memory diagnostic test
Both actions require work: downloading the source code of bash and compiling it, or finding a suitable memory diagnostic test and hooking up a console to the embedded system. However, these are the actions that can uncover the fault, or at least provide clues for narrowing it down.
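To give a flavor of what a memory diagnostic test does, here is a rough sketch that fills a buffer with fixed bit patterns and reads them back. It is only an illustration: the function name pattern_test and the choice of patterns are assumptions. A user-space script like this one works through virtual memory and the CPU caches, so it cannot target specific physical addresses the way a dedicated tool such as memtest86+ or memtester does.

```python
from typing import List, Sequence, Tuple


def pattern_test(
    size_bytes: int,
    patterns: Sequence[int] = (0x00, 0xFF, 0xAA, 0x55),
) -> List[Tuple[int, int, int]]:
    """Write each byte pattern over a buffer and verify it on read-back.

    Returns a list of (offset, expected, actual) tuples, which is empty
    when every byte read back matches what was written.
    """
    buf = bytearray(size_bytes)
    mismatches = []
    for pattern in patterns:
        expected = bytes([pattern]) * size_bytes
        buf[:] = expected                     # write phase
        if buf != expected:                   # quick whole-buffer check
            # Slow path: locate the individual bad bytes.
            for offset, actual in enumerate(buf):
                if actual != pattern:
                    mismatches.append((offset, pattern, actual))
    return mismatches


if __name__ == "__main__":
    bad = pattern_test(16 * 1024 * 1024)      # 16 MiB, an arbitrary size
    print("mismatches found:", len(bad))
```

The alternating patterns 0xAA and 0x55 flip every bit between passes, which is a common way to catch bits stuck at 0 or 1.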
In debugging, the role of brilliance is overrated.