GDB is great for stepping through machine code to figure out what is going on.
It uses debug information under the hood to present you with a tidy backtrace
and also determine how much machine code to print when you type disassemble.
This debug information comes from your compiler. Clang, GCC, rustc, etc. all
produce debug data in a format called DWARF and then embed that debug
information inside the binary (ELF, Mach-O, …) when you pass -ggdb or
equivalent.
Unfortunately, this means that by default, GDB has no idea what is going on if
you break in a JIT-compiled function. You can step instruction-by-instruction
and whatnot, but that’s about it. This is because the current instruction
pointer is nowhere to be found in any of the existing debug info tables from
the host runtime code, so your terminal is filled with ???. See this example
from the V8 docs:
#8 0x08281674 in v8::internal::Runtime_SetProperty (args=...) at src/runtime.cc:3758
#9 0xf5cae28e in ?? ()
#10 0xf5cc3a0a in ?? ()
#11 0xf5cc38f4 in ?? ()
#12 0xf5cbef19 in ?? ()
#13 0xf5cb09a2 in ?? ()
#14 0x0809e0a5 in v8::internal::Invoke (...) at src/execution.cc:97
Fortunately, there is a JIT interface to GDB. If you implement a couple of functions in your JIT and run them every time you finish compiling a function, you can get the debugging niceties for your JIT code too. See again a V8 example:
#6 0x082857fc in v8::internal::Runtime_SetProperty (args=...) at src/runtime.cc:3758
#7 0xf5cae28e in ?? ()
#8 0xf5cc3a0a in loop () at test.js:6
#9 0xf5cc38f4 in test.js () at test.js:13
#10 0xf5cbef19 in ?? ()
#11 0xf5cb09a2 in ?? ()
#12 0x0809e1f9 in v8::internal::Invoke (...) at src/execution.cc:97
Unfortunately, the GDB docs are somewhat sparse. So I went spelunking through a bunch of different projects to try and understand what is going on.
GDB expects your runtime to expose a function called
__jit_debug_register_code and a global variable called
__jit_debug_descriptor. GDB automatically adds its own internal breakpoints
at this function, if it exists. Then, when you compile code, you call this
function from your runtime.
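Here is a minimal sketch of what that looks like in C. The type and symbol declarations follow the GDB manual's "JIT Compilation Interface" section; the helper `jit_register_symfile` is a name I made up for illustration, and it assumes you already have an in-memory object file to point at.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Declarations as documented in the GDB manual. */
typedef enum {
  JIT_NOACTION = 0,
  JIT_REGISTER_FN,
  JIT_UNREGISTER_FN
} jit_actions_t;

struct jit_code_entry {
  struct jit_code_entry *next_entry;
  struct jit_code_entry *prev_entry;
  const char *symfile_addr;
  uint64_t symfile_size;
};

struct jit_descriptor {
  uint32_t version;
  uint32_t action_flag; /* holds a jit_actions_t value */
  struct jit_code_entry *relevant_entry;
  struct jit_code_entry *first_entry;
};

/* GDB sets a breakpoint here, so keep it out of line and non-empty. */
void __attribute__((noinline)) __jit_debug_register_code(void) {
  __asm__ volatile("" ::: "memory");
}

/* GDB looks this global up by name; the version field must be 1. */
struct jit_descriptor __jit_debug_descriptor = {1, JIT_NOACTION, NULL, NULL};

/* Hypothetical helper: publish one in-memory object file to GDB.
   The symfile bytes must stay alive for as long as the entry is registered. */
struct jit_code_entry *jit_register_symfile(const char *symfile,
                                            uint64_t size) {
  assert(symfile != NULL);
  struct jit_code_entry *entry = calloc(1, sizeof *entry);
  if (entry == NULL) return NULL;
  entry->symfile_addr = symfile;
  entry->symfile_size = size;
  /* Push onto the head of the doubly linked list. */
  entry->next_entry = __jit_debug_descriptor.first_entry;
  if (entry->next_entry != NULL) entry->next_entry->prev_entry = entry;
  __jit_debug_descriptor.first_entry = entry;
  /* Say which entry changed and how, then trip GDB's breakpoint. */
  __jit_debug_descriptor.relevant_entry = entry;
  __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
  __jit_debug_register_code();
  return entry;
}
```

You would call the helper once per compiled function, right after you finish building its object file. Without a debugger attached, the call to `__jit_debug_register_code` does nothing.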
In slightly more detail:
- Create a jit_code_entry linked list node that points at your object
  (“symfile”)
- Link it into the __jit_debug_descriptor linked list
- Call __jit_debug_register_code, which gives GDB control of the process so it can
  pick up the new function’s metadata
- Later, when the code goes away, unlink its entry and call __jit_debug_register_code again
This is why you see compiler projects such as V8 including large swaths of code just to make object files:
- (… RepackEntries), but I’m not sure exactly what it does
Because this is a huge hassle, GDB also has a newer interface that does not require making an ELF/Mach-O/…+DWARF object.
This new interface requires writing a binary format of your choice. You make the writer and you make the reader. Then, when you are in GDB, you load your reader as a shared object.
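To make "a binary format of your choice" concrete, here is a minimal sketch of a writer and reader pair. Everything in it is hypothetical: the `sym_record` layout, the magic value, and the function names are mine, not GDB's. It assumes the writer and the reader run on the same machine, so struct padding and endianness match.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SYM_MAGIC 0x4a495453u /* arbitrary tag chosen for this sketch */

/* Hypothetical fixed-size record describing one JIT-compiled function. */
struct sym_record {
  uint32_t magic;
  uint32_t name_len;
  uint64_t code_begin;
  uint64_t code_size;
  char name[64];
};

/* Writer side (runs in the JIT). Returns 0 on success, or -1 if the
   name does not fit in the record. */
int sym_write(struct sym_record *out, const char *name, uint64_t begin,
              uint64_t size) {
  assert(out != NULL && name != NULL);
  size_t len = strlen(name);
  if (len >= sizeof out->name) return -1;
  memset(out, 0, sizeof *out);
  out->magic = SYM_MAGIC;
  out->name_len = (uint32_t)len;
  out->code_begin = begin;
  out->code_size = size;
  memcpy(out->name, name, len);
  return 0;
}

/* Reader side (runs inside the debugger plugin). Treats the bytes as
   untrusted and validates them. On failure, *out may be partially written. */
int sym_read(const void *memory, size_t memory_size, struct sym_record *out) {
  if (memory == NULL || out == NULL || memory_size < sizeof *out) return -1;
  memcpy(out, memory, sizeof *out);
  if (out->magic != SYM_MAGIC) return -1;
  if (out->name_len >= sizeof out->name) return -1;
  out->name[out->name_len] = '\0';
  return 0;
}
```

The JIT would fill in a record with `sym_write` and register those bytes as the symfile; the reader plugin would call something like `sym_read` on the memory GDB hands it and then report the name and code range back.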
The reader must implement the interface specified by GDB:
GDB_DECLARE_GPL_COMPATIBLE_READER;
extern struct gdb_reader_funcs *gdb_init_reader (void);
struct gdb_reader_funcs
{
  /* Must be set to GDB_READER_INTERFACE_VERSION. */
  int reader_version;
  /* For use by the reader. */
  void *priv_data;
  gdb_read_debug_info *read;
  gdb_unwind_frame *unwind;
  gdb_get_frame_id *get_frame_id;
  gdb_destroy_reader *destroy;
};
Here are some details from Sanjoy Das.
Only a few runtimes implement this interface:
I think it also requires at least the reader to declare that it is
GPL-compatible via the macro GDB_DECLARE_GPL_COMPATIBLE_READER.
Since I wrote about the perf map interface recently, I have it on my mind. Why can’t we use it in GDB?
I suppose it would be possible to try and upstream a patch to GDB to support
the Linux perf map interface for JITs. After all, why shouldn’t it be able to
automatically pick up symbols from /tmp/perf-...? That would be great
baseline debug info for “free”.
In the meantime, maybe it is reasonable to create a re-usable custom debug reader:
- … /tmp/perf-... as you normally would
- (… /tmp the magic number?)
It would be less flexible than what both the DWARF and custom readers support: it would only be able to handle filename and code region. No embedding source code for GDB to display in your debugger. But maybe that is okay for a partial solution?
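The parsing half of such a reader is small. Here is a minimal sketch, assuming the usual perf map line format of hex start address, hex size, and symbol name separated by spaces. The struct and function names are hypothetical, and names longer than 255 bytes are truncated.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical in-memory form of one perf map line. */
struct perf_map_entry {
  uint64_t start;
  uint64_t size;
  char name[256];
};

/* Parse one "START SIZE name" line (hex start and size; the name is the
   rest of the line and may contain spaces). Returns 0 on success, -1 if
   the line is malformed. */
int perf_map_parse_line(const char *line, struct perf_map_entry *out) {
  assert(line != NULL && out != NULL);
  int fields = sscanf(line, "%" SCNx64 " %" SCNx64 " %255[^\n]",
                      &out->start, &out->size, out->name);
  return fields == 3 ? 0 : -1;
}

/* Does this entry's code region contain the given program counter? */
int perf_map_contains(const struct perf_map_entry *entry, uint64_t pc) {
  assert(entry != NULL);
  return pc >= entry->start && pc - entry->start < entry->size;
}
```

A reader built on this would parse each line of the map file and report every entry's name and address range to GDB.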
Update: Here is my small attempt at such a plugin.
V8 notes in their GDB JIT docs that because the JIT interface is a linked list and we only keep a pointer to the head, we get O(n²) behavior. Bummer. This becomes especially noticeable since they register additional code objects not just for functions, but also trampolines, cache stubs, etc.
Since GDB expects the code pointer in your symbol object file not to move, you have to make sure to have a stable symbol file pointer and stable executable code pointer. To make this happen, V8 disables its moving GC.
Additionally, if your compiled function gets collected, you have to make sure to unregister the function. Instead of doing this eagerly, ART treats the GDB JIT linked list as a weakref and periodically removes dead code entries from it.
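Here is a minimal sketch of deferred unregistration along those lines. It is not ART's actual implementation: the `tracked_entry` wrapper, its `dead` flag, and both function names are hypothetical. The GDB-facing declarations are repeated from the GDB manual so the block stands alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum { JIT_NOACTION = 0, JIT_REGISTER_FN, JIT_UNREGISTER_FN };

struct jit_code_entry {
  struct jit_code_entry *next_entry;
  struct jit_code_entry *prev_entry;
  const char *symfile_addr;
  uint64_t symfile_size;
};

struct jit_descriptor {
  uint32_t version;
  uint32_t action_flag;
  struct jit_code_entry *relevant_entry;
  struct jit_code_entry *first_entry;
};

void __attribute__((noinline)) __jit_debug_register_code(void) {
  __asm__ volatile("" ::: "memory");
}

struct jit_descriptor __jit_debug_descriptor = {1, JIT_NOACTION, NULL, NULL};

/* Hypothetical wrapper: GDB only ever looks at the embedded entry. */
struct tracked_entry {
  struct jit_code_entry entry; /* must stay first so the cast below is valid */
  bool dead;                   /* set by the GC when the code is collected */
};

/* Unlink one entry and notify GDB. The entry must remain valid until
   __jit_debug_register_code returns. */
void jit_unregister(struct jit_code_entry *entry) {
  assert(entry != NULL);
  if (entry->prev_entry != NULL)
    entry->prev_entry->next_entry = entry->next_entry;
  else
    __jit_debug_descriptor.first_entry = entry->next_entry;
  if (entry->next_entry != NULL)
    entry->next_entry->prev_entry = entry->prev_entry;
  __jit_debug_descriptor.relevant_entry = entry;
  __jit_debug_descriptor.action_flag = JIT_UNREGISTER_FN;
  __jit_debug_register_code();
}

/* Periodic sweep: unregister and free every entry marked dead.
   Assumes every node in the list was allocated as a tracked_entry. */
size_t jit_sweep_dead_entries(void) {
  size_t removed = 0;
  struct jit_code_entry *cur = __jit_debug_descriptor.first_entry;
  while (cur != NULL) {
    struct jit_code_entry *next = cur->next_entry;
    struct tracked_entry *tracked = (struct tracked_entry *)cur;
    if (tracked->dead) {
      jit_unregister(cur);
      free(tracked);
      removed++;
    }
    cur = next;
  }
  return removed;
}
```

The GC only has to flip `dead`, which is cheap; the list surgery and the trip through GDB's breakpoint happen later, in batches, whenever the runtime decides to sweep.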