Finally I added register tracking to my Perl SASS disassembler
Now I can do some full-featured analysis of SASS - like finding candidate pairs of instructions to swap or to run in so-called "dual" mode - and all of this in barely 1200 LoC of Perl code
Let's think about what it means for a pair of instructions to be fully independent:
- they should belong to the same block - for example, in the case of
    IADD R8, -R3, RZ
    .L_x_14:
    FMUL R11, R3.reuse, R3
  the instructions should be treated as located in different blocks
- they should not depend on the same barriers
- neither of them should update registers used by the other
So I implemented building of the control-flow graph (CFG), plus barrier and register tracking
Building of CFG
Unfortunately I was unable to find a lean & mean algorithm for CFG building - it seems that these belong to the category 'everyone has known this for a long time'. What makes it even worse - the SASS address space is not contiguous on old SMs, because instructions are grouped into groups of 7 or 3 instructions and the first 64-bit word of each group is a control word. Besides, unlike classical SSA, a block can have more than 2 outgoing edges - for indirect branches
So I invented my own clumsy and not very efficient algorithm - see the comments before function dg
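To make the non-contiguous layout concrete, here is a minimal sketch of the address arithmetic. It assumes that every word, control or instruction, is 8 bytes, and that each group starts with exactly one control word, as described above. The helper names insn_offset and is_control_word are hypothetical and are not taken from the disassembler.

```perl
use strict;
use warnings;

# Hypothetical helper: byte offset of the n-th real instruction (0-based)
# when every group of $group instructions is preceded by one control word.
# Assumes all words (control and instruction) are 8 bytes long.
sub insn_offset {
    my ($n, $group) = @_;
    my $full = int($n / $group);   # complete groups before this instruction
    my $slot = $n % $group;        # position inside its own group
    return ($full * ($group + 1) + 1 + $slot) * 8;
}

# Hypothetical helper: does this byte offset hold a control word?
sub is_control_word {
    my ($off, $group) = @_;
    return (int($off / 8) % ($group + 1)) == 0;
}

# With groups of 3 the words at 0x00, 0x20, 0x40 ... are control words,
# so a branch target can never legally point at them.
printf "insn %d at %#x\n", $_, insn_offset($_, 3) for 0 .. 5;
```

So any code that walks addresses linearly has to skip these holes, which is one more reason textbook CFG recipes do not apply directly.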
It has 3 passes:
- disassemble and collect labels and jump instructions. Btw, what do the so-called "Pre-return", "Pre-break" etc. mean?
- sort the collected data by address and build blocks - the complexity of this pass is the classical sorting bound O(N * log(N))
- resolve cross-references between blocks - surprisingly this is the most compute-intensive part - if we have M blocks then the complexity of the 3rd pass is O(M * M * log(M))
It took ~400 LoC
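The three passes can be sketched on a toy instruction list as follows. This is not the code of function dg: it is a simplified leader-based variant, and the input format (the addr, op, targets and cond fields) is hypothetical. It stores successor edges as block start addresses, which avoids the expensive search of the 3rd pass but also ignores everything SASS-specific.

```perl
use strict;
use warnings;

# Toy input: 'targets' lists branch destinations (more than one for an
# indirect branch); 'cond' marks a predicated branch that may fall through.
my @insns = (
    { addr => 0x08, op => 'IADD' },
    { addr => 0x10, op => 'BRA', targets => [0x28], cond => 1 },
    { addr => 0x18, op => 'FMUL' },
    { addr => 0x28, op => 'FADD' },   # 0x20 would be a control word
    { addr => 0x30, op => 'EXIT' },
);

sub build_cfg {
    my @code = sort { $a->{addr} <=> $b->{addr} } @_;
    # Pass 1: collect leaders - the entry point, every branch target and
    # the instruction following a branch or EXIT.
    my %leader = ( $code[0]{addr} => 1 );
    for my $i (0 .. $#code) {
        my $in = $code[$i];
        my @targets = @{ $in->{targets} || [] };
        next unless @targets or $in->{op} eq 'EXIT';
        $leader{$_} = 1 for @targets;
        $leader{ $code[$i + 1]{addr} } = 1 if $i < $#code;
    }
    # Pass 2: walk the sorted code and start a new block at every leader.
    my @blocks;
    for my $in (@code) {
        push @blocks, { start => $in->{addr}, insns => [], succ => [] }
            if $leader{ $in->{addr} };
        push @{ $blocks[-1]{insns} }, $in;
    }
    # Pass 3: derive outgoing edges from the last instruction of each block.
    for my $i (0 .. $#blocks) {
        my $last    = $blocks[$i]{insns}[-1];
        my @targets = @{ $last->{targets} || [] };
        push @{ $blocks[$i]{succ} }, @targets;
        my $falls = $last->{op} ne 'EXIT' && (!@targets || $last->{cond});
        push @{ $blocks[$i]{succ} }, $blocks[$i + 1]{start}
            if $falls && $i < $#blocks;
    }
    return \@blocks;
}

my $cfg = build_cfg(@insns);
for my $blk (@$cfg) {
    printf "block %#x -> %s\n", $blk->{start},
        join(', ', map { sprintf '%#x', $_ } @{ $blk->{succ} });
}
```

Because targets is a list, a block ending in an indirect branch simply gets as many outgoing edges as there are known destinations.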
Barriers tracking
Nothing special - just collect all the barriers used and then cross-check them for a pair of instructions in function sched_check
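A cross-check of this kind could look roughly like the sketch below. It is not the code of sched_check: the per-instruction fields wait and set (bitmasks of barrier slots) and both helper names are assumptions made only for illustration.

```perl
use strict;
use warnings;

# Hypothetical per-instruction barrier info: 'wait' is a bitmask of the
# barriers an instruction waits on, 'set' a bitmask of those it arms.
sub barrier_mask {
    my ($in) = @_;
    return ($in->{wait} // 0) | ($in->{set} // 0);
}

# Two instructions share a barrier if their masks intersect.
sub share_barriers {
    my ($x, $y) = @_;
    return (barrier_mask($x) & barrier_mask($y)) != 0 ? 1 : 0;
}

my $load = { op => 'LDG',  set  => 1 << 0 };   # arms barrier 0
my $user = { op => 'FADD', wait => 1 << 0 };   # waits on barrier 0
my $free = { op => 'IADD' };                   # touches no barrier

print "LDG/FADD share a barrier\n" if share_barriers($load, $user);
print "LDG/IADD are barrier-independent\n" unless share_barriers($load, $free);
```

Treating "waits on" and "arms" the same way is deliberately conservative: any common barrier slot disqualifies the pair.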
Registers tracking
This was the hardest part of the work - suddenly I realized that I need not just the history of register usage for a whole block but a "snapshot" for each individual instruction. So I had to add this capability first to my C++ code and then make a binding for Perl. As a side effect I can also track register "reuse"
The main logic of the register dependency check is located in function is_interleaved
The code for collecting/dumping the register history adds another ~400 LoC
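To show why per-instruction snapshots matter, here is a minimal sketch of a pairwise register check. It is not the code of is_interleaved: the operand parsing naively assumes the form "OP dst, src, src...", which is wrong for stores, predicates and many other real SASS forms, and the names reg_usage and regs_conflict are hypothetical.

```perl
use strict;
use warnings;

# Build a per-instruction "snapshot": which registers are written and read.
# Naive assumption: the first operand is the destination, the rest are
# sources. RZ has no number and is therefore ignored.
sub reg_usage {
    my ($text) = @_;
    my ($op, $rest) = split ' ', $text, 2;
    my ($dst, @src) = split /\s*,\s*/, $rest;
    my %use = ( writes => {}, reads => {} );
    $use{writes}{$1} = 1 if $dst =~ /\bR(\d+)\b/;
    for my $s (@src) {
        $use{reads}{$1} = 1 if $s =~ /\bR(\d+)\b/;
    }
    return \%use;
}

# A pair conflicts if one writes a register the other reads or writes.
sub regs_conflict {
    my ($x, $y) = @_;
    for my $r (keys %{ $x->{writes} }) {
        return 1 if $y->{reads}{$r} || $y->{writes}{$r};
    }
    for my $r (keys %{ $y->{writes} }) {
        return 1 if $x->{reads}{$r};
    }
    return 0;
}

my $iadd = reg_usage('IADD R8, -R3, RZ');
my $fmul = reg_usage('FMUL R11, R3.reuse, R3');
my $fadd = reg_usage('FADD R3, R8, R11');   # made-up consumer of R8 and R11

print "IADD/FMUL: ", regs_conflict($iadd, $fmul) ? "conflict" : "independent", "\n";
print "IADD/FADD: ", regs_conflict($iadd, $fadd) ? "conflict" : "independent", "\n";
```

Note that IADD and FMUL pass this check because both only read R3, yet in the example from the beginning they are still not a valid pair: the label between them puts them into different blocks.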