Over the past few months, the Avatar code has had a few crashes that leave no recognisable or usable stack for GDB to read. It has also had a few hangs, with strace showing futex_wait (in an application that doesn’t use threads), and GDB on the core (after killing the process) showing __kernel_vsyscall. Unfortunately, I’m not really a programmer, so my efforts to track down the cause have probably been a bit haphazard.

The most annoying part so far is that yesterday we hit the hang four times, so I ran strace against the binary and piped the output across the ‘net to my PC, where I’ve got a rolling 40,000-line buffer. Twenty-four hours later, at a constant 2 Mbit/s, we still haven’t hung.
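For anyone wondering what a rolling buffer looks like in practice, here is a minimal Python sketch of the idea. It is not the tool I actually use; the names `fill_rolling_buffer` and `main` are made up for illustration, and it assumes the trace arrives line by line on standard input.

```python
import sys
from collections import deque


def fill_rolling_buffer(stream, buffer):
    """Append every line from stream to buffer.

    If buffer is a deque with maxlen set, the oldest line is dropped
    automatically once the buffer is full.
    """
    for line in stream:
        buffer.append(line)


def main(max_lines=40_000):
    # Hypothetical entry point: keep only the last max_lines lines of stdin.
    buffer = deque(maxlen=max_lines)
    try:
        fill_rolling_buffer(sys.stdin, buffer)
    except KeyboardInterrupt:
        # Ctrl-C after a hang: fall through and dump what we have.
        pass
    sys.stdout.writelines(buffer)
```

With a call to `main()` added at the bottom, this could sit at the receiving end of the pipe. The buffer lives outside the read loop so that interrupting the script after a hang still leaves the last 40,000 lines available to dump.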

I call Heisenbug.