Monday, November 10, 2025

barriers & registers tracking for sass disasm

I finally added register tracking to my Perl SASS disasm

Now I can do some full-featured analysis of SASS - like finding candidate pairs of instructions to swap or to run in so-called "dual" mode - and all of this in barely 1200 LoC of Perl code

Let's think about what it means for a pair of instructions to be fully independent:

  1. they must belong to the same block - for example, in
      IADD R8, -R3, RZ
    .L_x_14:
      FMUL R11, R3.reuse, R3
    the label starts a new block, so the two instructions must be treated as located in different blocks
  2. they must not depend on the same barriers
  3. neither of them may update registers used by the other

So I implemented building of a control-flow graph (CFG), plus barrier & register tracking

Building of CFG

Unfortunately I was unable to find a lean & mean algorithm for CFG building - it seems to belong to the category 'everyone has known it for a long time'. What makes it even worse is that the SASS address space is not contiguous on old SMs, because instructions are grouped into blocks of 7 or 3 instructions and the first 64-bit word of each block is a control word. Besides, unlike in classical SSA, a block can have more than 2 outgoing edges - because of indirect branches
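To make the non-contiguous layout concrete, here is a minimal sketch in Python (not the Perl code of the disasm) that maps the sequential index of an instruction to its byte offset. It assumes 64-bit instruction words and exactly one 64-bit control word in front of every group; the function name insn_offset and its parameters are hypothetical.

```python
WORD = 8  # size in bytes of one 64-bit word (assumption: instructions are 64-bit too)

def insn_offset(index: int, group: int) -> int:
    """Byte offset of the index-th instruction (0-based) when every
    `group` instructions (7 or 3) are preceded by one control word."""
    block, slot = divmod(index, group)
    # each block occupies group + 1 words; slot 0 sits right after the control word
    return block * (group + 1) * WORD + (slot + 1) * WORD

# groups of 3: control words at 0x00, 0x20, ...; instructions at 0x08, 0x10, 0x18, 0x28, ...
offsets = [insn_offset(i, 3) for i in range(4)]
```

The gaps at every control word are why "next instruction" cannot be computed as address + 8 when looking for fall-through edges.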
 
So I invented my own clumsy and not very efficient algorithm - see the comments before function dg
It has 3 passes:
  1. disassemble and collect labels/jump instructions. By the way, what do the so-called "Pre-return", "Pre-break" etc. mean?
  2. sort the collected data by address and build blocks - the complexity of this pass is the classical sorting bound O(N * log(N))
  3. resolve cross-references between blocks - surprisingly this is the most compute-intensive part - if we have M blocks then the complexity of the 3rd pass is O(M * M * log(M))

It took ~400 LoC
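For comparison, here is a minimal Python sketch of the textbook "leaders" approach with the same three-pass shape. This is not the algorithm from function dg; the instruction tuples, the kind names and build_cfg itself are hypothetical, and every branch target is assumed to be the address of a decoded instruction.

```python
from bisect import bisect_right

def build_cfg(insns):
    """insns: list of (addr, kind, targets) sorted by addr, where kind is
    'plain', 'branch' (may fall through), 'jump' or 'exit'.
    Returns (starts, edges): sorted block start addresses and a dict
    mapping each block start to a sorted list of successor block starts."""
    addrs = [addr for addr, _, _ in insns]
    # pass 1: collect leaders - entry point, jump targets, instructions after jumps
    leaders = {addrs[0]}
    for i, (_, kind, targets) in enumerate(insns):
        if kind != 'plain':
            leaders.update(targets)
            if i + 1 < len(insns):
                leaders.add(addrs[i + 1])
    # pass 2: sort leaders, each one starts a block
    starts = sorted(leaders)

    def block_of(addr):
        # binary search for the block that contains addr
        return starts[bisect_right(starts, addr) - 1]

    # pass 3: resolve edges between blocks
    edges = {s: set() for s in starts}
    for i, (addr, kind, targets) in enumerate(insns):
        src = block_of(addr)
        if kind in ('branch', 'jump'):
            for t in targets:  # an indirect branch may list many targets
                edges[src].add(block_of(t))
        nxt = addrs[i + 1] if i + 1 < len(insns) else None
        if kind in ('plain', 'branch') and nxt in leaders:
            edges[src].add(nxt)  # fall-through into the next block
    return starts, {s: sorted(e) for s, e in edges.items()}

code = [
    (0x08, 'plain', []), (0x10, 'branch', [0x28]), (0x18, 'plain', []),
    (0x28, 'plain', []), (0x30, 'jump', [0x08]), (0x38, 'exit', []),
]
starts, edges = build_cfg(code)
```

Because targets is a list, a block ending in an indirect branch simply gets one edge per known target, so nothing limits a block to two successors.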

Barriers tracking

Nothing special - just collect all used barriers and then cross-check them for a pair of instructions in function sched_check
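A minimal Python sketch of such a cross-check, not the actual sched_check. It assumes that each instruction may set one write barrier, set one read barrier and wait on a mask of barriers, and that barrier numbers are small integers; all names here are hypothetical.

```python
def used_barriers(write_bar=None, read_bar=None, wait_mask=0):
    """Bitmask of every barrier an instruction touches: the ones it sets
    (write_bar / read_bar, barrier numbers or None) and the ones it waits on."""
    mask = wait_mask
    for bar in (write_bar, read_bar):
        if bar is not None:
            mask |= 1 << bar
    return mask

def barriers_conflict(mask_a, mask_b):
    """True if the two instructions have at least one barrier in common."""
    return (mask_a & mask_b) != 0

load = used_barriers(write_bar=0)        # sets barrier 0
consumer = used_barriers(wait_mask=0b1)  # waits on barrier 0
other = used_barriers(read_bar=2)        # sets barrier 2 only
```

Here load and consumer share barrier 0 and so cannot be candidates, while load and other pass this particular check.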

Registers tracking 

This was the hardest part of the work - I suddenly realized that I need not just the history of register usage for the whole block but a "snapshot" for each individual instruction. So I first had to add this capability to my C++ code and then make a binding for Perl. As a side effect I can also track register "reuse"
 
The main logic of the register dependency check is located in function is_interleaved
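The third criterion from the list at the top can be sketched in Python as below. This is not the code of is_interleaved; it assumes each per-instruction snapshot boils down to a set of written and a set of read registers, with RZ left out because it is the constant zero register. The name regs_independent is hypothetical.

```python
def regs_independent(a, b):
    """a, b: (writes, reads) pairs of register-name sets for two instructions.
    True if neither instruction updates a register that the other one reads."""
    a_writes, a_reads = a
    b_writes, b_reads = b
    return not (a_writes & b_reads) and not (b_writes & a_reads)

# IADD R8, -R3, RZ and FMUL R11, R3.reuse, R3 from the example at the top
iadd = ({"R8"}, {"R3"})
fmul = ({"R11"}, {"R3"})
# a made-up consumer of R8, to show a real dependency
fadd = ({"R4"}, {"R8", "R11"})

ok = regs_independent(iadd, fmul)
dep = regs_independent(iadd, fadd)
```

Both example instructions only read R3, so they pass the register check and are rejected solely because of the label between them. A stricter variant could also reject pairs that write the same destination register.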
 
The code for collecting/dumping register history added another ~400 LoC
