Monday, September 11, 2023

location lists from dwarf5

Over the past weekend I added support for variable location lists from DWARF5 (stored in the separate section .debug_loclists) to my dwarfdump. As usual, lots of bugs were found

First - they are present only for functions but not for methods. This is probably a real bug and can have a serious impact when debugging

Second - the generated expressions are not optimal. Let's see an example:

locx 53e
4FB5DF - 4FB5F4: DW_OP_piece 0x8, DW_OP_reg0 RAX, DW_OP_piece 0x8, DW_OP_breg0 RAX+0, DW_OP_lit3, DW_OP_lit8, DW_OP_mul, DW_OP_plus, DW_OP_stack_value, DW_OP_piece 0x8
4FB5F4 - 4FB60F: DW_OP_piece 0x8, DW_OP_reg2 RCX, DW_OP_piece 0x8, DW_OP_breg0 RAX+0, DW_OP_lit3, DW_OP_lit8, DW_OP_mul, DW_OP_plus, DW_OP_stack_value, DW_OP_piece 0x8

As you can see, these expressions are the same but cover adjacent address ranges. Why not use a single expression for the range 4FB5DF - 4FB60F?
Update: I have made a patch to estimate the number of identical adjacent location lists. The C++ extractor from codeql has 149850 lists in total, of which 116 (0.077%) are redundant
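
The merging idea can be sketched roughly as follows. This is only an illustration and not the actual patch: the LocEntry struct and the merge_adjacent function are hypothetical names, and the sketch counts folded entries within one list, assuming that entries are sorted by address and that the expression is kept as raw bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one location list entry: the half-open address
// range [start, end) plus the raw bytes of its DWARF expression.
struct LocEntry {
  uint64_t start;
  uint64_t end;
  std::vector<uint8_t> expr;
};

// Fold an entry into the previous one when the ranges touch and the
// expressions are byte-identical. If `redundant` is not null it receives
// the number of entries that were folded away.
// Assumption: `in` is already sorted by start address.
std::vector<LocEntry> merge_adjacent(const std::vector<LocEntry>& in,
                                     std::size_t* redundant = nullptr) {
  std::vector<LocEntry> out;
  std::size_t folded = 0;
  for (const LocEntry& e : in) {
    if (!out.empty() && out.back().end == e.start &&
        out.back().expr == e.expr) {
      out.back().end = e.end;  // extend the previous range
      ++folded;
    } else {
      out.push_back(e);
    }
  }
  if (redundant) *redundant = folded;
  return out;
}
```

Comparing raw bytes is the most conservative choice here: two expressions that are equivalent but encoded differently are left alone.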

DW_OP_mul just pops a couple of values from the stack and pushes back the result of their multiplication (see the evaluation logic in the method execute_stack_op), so the sub-expression DW_OP_lit3, DW_OP_lit8, DW_OP_mul can be rewritten as just DW_OP_lit24
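
A peephole pass doing this kind of folding could look something like the sketch below. It is not what any compiler actually does: the Op struct and fold_lit_mul are hypothetical names, the expression is assumed to be already decoded into a vector of operations, and branches (DW_OP_skip / DW_OP_bra), which could jump into the middle of the folded sequence, are ignored. The opcode values are the ones from the DWARF specification.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Opcode values from the DWARF specification.
constexpr uint8_t DW_OP_mul = 0x1e;
constexpr uint8_t DW_OP_lit0 = 0x30;
constexpr uint8_t DW_OP_lit31 = 0x4f;

// Hypothetical decoded operation: opcode plus one optional operand.
struct Op {
  uint8_t code;
  int64_t arg;
};

inline bool is_lit(const Op& op) {
  return op.code >= DW_OP_lit0 && op.code <= DW_OP_lit31;
}

// Replace "DW_OP_litA, DW_OP_litB, DW_OP_mul" with a single DW_OP_lit(A*B)
// when the product still fits into the DW_OP_lit0..DW_OP_lit31 range.
// Assumption: the expression contains no branches into the folded ops.
std::vector<Op> fold_lit_mul(const std::vector<Op>& in) {
  std::vector<Op> out;
  for (const Op& op : in) {
    std::size_t n = out.size();
    if (op.code == DW_OP_mul && n >= 2 && is_lit(out[n - 1]) &&
        is_lit(out[n - 2])) {
      unsigned product = (out[n - 1].code - DW_OP_lit0) *
                         (out[n - 2].code - DW_OP_lit0);
      if (product <= 31) {
        out.pop_back();  // drop the second literal
        out.back() = Op{static_cast<uint8_t>(DW_OP_lit0 + product), 0};
        continue;        // DW_OP_mul itself is not emitted
      }
    }
    out.push_back(op);
  }
  return out;
}
```

Larger products would need DW_OP_const1u and friends; the sketch simply leaves those cases untouched.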

It's also curious to check which other compilers support location lists:

gcc

Yes, with options -g -gdwarf-5 -fvar-tracking

golang

No - they don't even support DWARF5 at all

openwatcom v2

No - judging by the funny comments

WHEN YOU FIGURE OUT WHAT THIS FILE DOES, PLEASE DESCRIBE IT HERE!

Friday, September 1, 2023

gcc plugin to collect cross-references, part 7

Part 1, 2, 3, 4, 5 & 6

Let's check whether we can extract another kind of constants - numerical ones. In theory there are no problems - they have types INTEGER_CST, REAL_CST, COMPLEX_CST and so on. And you can even meet them - mostly in programs written in Fortran
In most code they are usually replaced with RTX equivalents like
  • INTEGER_CST - const_int (or const_wide_int)
  • REAL_CST - const_double

const_double is the easy case, but const_ints are really ubiquitous: they can appear in RTX even when they do not occur in the operands of the assembler code. So the main task is to select only a small subset of them. Let's consider what we can filter out

field offsets

Luckily this hard part has already been solved in the previous part

local variable offsets on the stack

RTX has the field frame_related:
1 in an INSN or a SET if this rtx is related to the call frame, either changing how we compute the frame address or saving and restoring registers in the prologue and epilogue

This flag affects both parts of a set. For loading something from the stack it looks something like:
set (reg:DI 0 ax [83])
        (mem/f/c:DI (plus:DI (reg/f:DI 6 bp)
                (const_int -8 [0xfffffffffffffff8]))

and for storing to the stack like:
set (mem/f/c:DI (plus:DI (reg/f:DI 6 bp)
                (const_int -8 [0xfffffffffffffff8])) [4 this+0 S8 A64])
        (reg:DI 5 di [ _0 ]))

conditions

Yes, if_then_else almost always follows a compare:
(set (reg:CCZ 17 flags)
        (compare:CCZ (reg:QI 2 cx [orig:83 _2 ] [83])
            (const_int 0 [0]))) "vtest.cc":44:19 5 {*cmpqi_ccno_1}
(jump_insn 10 9 11 2 (set (pc)
        (if_then_else (eq (reg:CCZ 17 flags)
                (const_int 0 [0]))
            (label_ref 16)
            (pc))) "vtest.cc":44:19 891 {*jcc}
All these bulky constructions will be translated to just jz, so no const_int 0 will appear in the output

EH block index

like in each function call:

(expr_list:REG_EH_REGION (const_int 0 [0])