Tuesday, June 7, 2022

position independent sw64 code

Let's see what PIC looks like on sw64, using a function from libLLVM-7.so.1 (a huge shared library, 45Mb in size) as an example:

1000ED0   ldih    GP, PV, 0x1D3

PV almost always contains the address of the called function, and ldih adds its displacement shifted left by 16 bits, so GP is now 1000ED0 + 1D30000 = 2D30ED0
1000ED4   ldi     GP, GP, -0x1290

GP is now 2D30ED0 - 1290 = 2D2FC40. I expected this base address to always lie inside .got, but that is not true: it can point anywhere, sometimes even outside the ELF module! All remaining references use this base address in the GP register:
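The arithmetic above can be replayed with a small Python sketch. It assumes Alpha-style semantics for the two instructions (ldih adds a sign-extended 16-bit displacement shifted left by 16, ldi adds a sign-extended 16-bit displacement); the helper names `sext16`, `ldih` and `ldi` are mine, not taken from any toolchain.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def sext16(value):
    """Sign-extend a 16-bit displacement field to a Python int."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def ldih(base, disp):
    # Assumed semantics: base + (sign-extended disp << 16)
    return (base + (sext16(disp) << 16)) & MASK64

def ldi(base, disp):
    # Assumed semantics: base + sign-extended disp
    return (base + sext16(disp)) & MASK64

pv = 0x1000ED0                 # address of the function, as held in PV
gp = ldih(pv, 0x1D3)           # 1000ED0   ldih GP, PV, 0x1D3
print(f"after ldih: {gp:X}")
gp = ldi(gp, -0x1290)          # 1000ED4   ldi  GP, GP, -0x1290
print(f"after ldi:  {gp:X}")
```

Running it should reproduce the two intermediate values from the listing, 2D30ED0 and 2D2FC40.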

1000ED8   ldih    PV, GP, 0
1000EDC   ldl     PV, PV, -0x4EC0
...
1000F14   call    RA, PV, 0
1000F18   ldih    GP, RA, 0x1D3 ; 2D30F18
1000F20   ldi     GP, GP, -0x12D8 ; 2D2FC40

Wait, WHAT? They use the return address in RA to reload GP with the same value, 2D2FC40. Even worse, they restore GP even in the epilogue, where it is not used.
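To see why the displacements differ (-0x1290 in the prologue, -0x12D8 after the call) while the result stays the same, here is a rough sketch of how such a pair can be derived from an anchor address and the target GP. The function `split_gp_disp` is a hypothetical helper of mine, and it assumes the low half is a signed 16-bit value, as on Alpha.

```python
def split_gp_disp(anchor, gp):
    """Return (hi, lo) with anchor + (hi << 16) + lo == gp and lo a signed 16-bit value."""
    delta = gp - anchor
    lo = delta & 0xFFFF
    if lo & 0x8000:            # low half is sign-extended by ldi,
        lo -= 0x10000          # so the high half must compensate
    hi = (delta - lo) >> 16
    return hi, lo

GP = 0x2D2FC40
# Anchors: function entry (value of PV) and the return address (value of RA)
for anchor in (0x1000ED0, 0x1000F18):
    hi, lo = split_gp_disp(anchor, GP)
    print(f"{anchor:X}: ldih {hi:#x} / ldi {lo:#x}")
```

Both anchors give the same high part 0x1D3, and only the low part changes with the distance from the anchor to GP.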

Let's estimate the size overhead. libLLVM-7.so.1 has 41337 functions and 8432116 instructions, 781997 of which set the value of GP. Rate: 781997 / 8432116 = 0.092740
Let's assume that each function needs to set up GP anyway, so the required number of instructions is 41337 * 2 = 82674. The remainder is 781997 - 82674 = 699323
Remove the unneeded GP setups from epilogues: 699323 - 82674 = 616649
This amount can easily be cut in half: just store the calculated value of GP on the stack with stl gp, sp, offset (+41337 instructions) and reload it when needed with a single ldl gp, sp, offset instead of the two-instruction ldih/ldi pair
So the actual number of instructions could be 616649 / 2 + 41337 + 82674 = 432336
New rate: 432336 / 8432116 = 0.05127
The overhead is 0.092740 - 0.05127 = 0.04147, i.e. about 4.1%
Cool: 4.1% of a 45Mb library is almost 2Mb of code that is just unnecessary
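The estimate can be rechecked with a few lines of Python. This is only a sketch of the same back-of-the-envelope model: two instructions per GP setup, one unneeded setup per epilogue, one stl per function for the spill. The function name `estimate_gp_overhead` is made up for the illustration.

```python
def estimate_gp_overhead(functions, instructions, gp_setup):
    """Redo the estimate: returns (current_rate, reduced_count, new_rate, overhead)."""
    current_rate = gp_setup / instructions
    prologue = functions * 2                 # ldih + ldi that every function needs anyway
    epilogue = functions * 2                 # GP restores in epilogues, assumed unneeded
    after_calls = gp_setup - prologue - epilogue
    reloads = (after_calls + 1) // 2         # one ldl instead of ldih + ldi, rounded up
    spills = functions                       # one stl gp, sp, offset per function
    reduced_count = reloads + spills + prologue
    new_rate = reduced_count / instructions
    return current_rate, reduced_count, new_rate, current_rate - new_rate

current, reduced, new, overhead = estimate_gp_overhead(41337, 8432116, 781997)
print(f"current rate: {current:.6f}")
print(f"reduced GP instructions: {reduced}")
print(f"new rate: {new:.5f}")
print(f"overhead: {overhead:.1%}")
```

Plugging in counts from another library is then a one-line change, which makes it easy to check whether the same ratio holds elsewhere.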
