Sunday, July 30, 2023

gcc plugin to collect cross-references, part 1

Every IDA Pro user likes cross-references: they are very useful, but they apply only to objects in global memory. What if I want cross-references for virtual methods and class/record fields, for example to see which functions call a specific virtual method? Unfortunately IDA Pro cannot show this, partly because this information is not stored in debug info and partly because of its weak type-propagation algorithm. A call to a virtual method typically looks like this:

 mov     rax, [rbp+var_8] ; this
 mov     rax, [rax]       ; this._vptr
 add     rax, 10h
 mov     rcx, [rax]       ; load method from vtable, why not mov rcx, [rax+0x10]?
 call    rcx              ; or even better just call [rax+0x10]?
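To make the three loads in the listing easier to follow, here is a minimal sketch that models the same dispatch by hand. Everything in it (`Object`, `Method`, `object_vtable`, `call_slot2`) is a hypothetical illustration, not what gcc generates internally; it assumes 8-byte pointers, so slot index 2 sits at byte offset 0x10.

```cpp
#include <cassert>

// Hypothetical hand-rolled model of a polymorphic object: the first word of
// the object points to a table of function pointers (the "vtable").
struct Object;
using Method = int (*)(Object *self);

struct Object {
    const Method *vptr;  // plays the role of this->_vptr
    int value;
};

static int slot0(Object *self) { return self->value; }
static int slot1(Object *self) { return self->value + 1; }
static int slot2(Object *self) { return self->value * 2; }

// Three slots; with 8-byte pointers slot 2 lives at byte offset 0x10.
static const Method object_vtable[] = { slot0, slot1, slot2 };

// Mirrors the listing: load the vptr, load the slot, make an indirect call.
int call_slot2(Object *obj) {
    assert(obj != nullptr && obj->vptr != nullptr);
    const Method *vtbl = obj->vptr;  // mov rax, [rax]      ; this._vptr
    Method fn = vtbl[2];             // add rax, 10h / mov rcx, [rax]
    return fn(obj);                  // call rcx
}

// Small usage helper: builds an object and dispatches through slot 2.
int demo() {
    Object o{object_vtable, 21};
    return call_slot2(&o);
}
```

Here `demo()` returns 42. Nothing in the machine code says which method sits in slot 2, and that is exactly the information a disassembler is missing.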

Let's think about where we can get this kind of cross-reference. Surely the compiler must have it somewhere inside in order to generate native code, right? So, generally speaking, the compiler is your next best friend (right after the disassembler and the debugger).

Run gcc with the -c -fdump-final-insns options on a simple C++ test file and check what a call to a virtual method looks like:
(call_insn # 0 0 2 (set (reg:DI 0 ax)
        (call (mem:QI (reg/f:DI 1 dx [orig:85 _4 ] [85]) [ *OBJ_TYPE_REF(_4;this_7(D)->3B) S1 A8])
            (const_int 0 [0]))) "vtest.cc":31:21# {*call_value}

What? What is _4, what type does this have, and what does ->3B mean in place of a method name? Looking ahead, I can say that all the needed information really is stored in the RTL, though the function dump_generic_node (from tree-pretty-print.cc) is just too lazy to show it properly. So it seems we can develop a gcc plugin to extract these cross-references (in fact, for the first couple of months of development this was not at all obvious).
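The contents of vtest.cc are not shown here, so the following is only a hypothetical stand-in of the same shape: a virtual method called through `this` inside a member function, which is the situation the `this_7(D)` in the dump above suggests. The class and method names are invented for illustration. Compiled with the same -c -fdump-final-insns options, a file like this should produce a comparable call_insn, though the exact register numbers and offsets will differ.

```cpp
// Hypothetical stand-in for vtest.cc; all names here are illustrative.
#include <cassert>

struct Base {
    virtual ~Base() {}
    virtual int first() { return 1; }
    virtual int second() { return 2; }
};

struct Derived : Base {
    int second() override { return 20; }

    // Both calls go through this->_vptr. Derived is not final, so without
    // optimization the compiler has to emit indirect calls here.
    int run() { return second() + first(); }
};

// A virtual call through a base pointer, the other common form.
int call_through_base(Base *b) {
    assert(b != nullptr);
    return b->second();
}
```

There is no `main` because the file only needs to compile with -c; the interesting part is the dump, not the executable.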

why gcc?

Because gcc is the de facto standard, and you can expect to be able to build almost any source code with it (usually after numerous loud curses and finally reading the documentation), even on Windows. Nevertheless, let's consider other popular alternatives.