Friday, June 27, 2025

curse of IMAD

I found a strange case while disassembling some forms of IMAD (which is, by the way, the raison d'être of the GPU). The official nvdisasm shows:

IMAD.WIDE R2, R7, R6, c[0x0][0x168] ; /* 0x00005a0007027625 */

my nvd:

; IMAD line 63362 n 1196 15 render items 1 missed: wide
 /*40*/  IMAD R2,P7,R7,R6,c[0][0x168] &req={0}; 

The problem here is not only the missing P7; at least that operand has a default value:

CLASS "imad_wide__RRC_RRC"
FORMAT PREDICATE @[!]Predicate(PT):Pg Opcode /WIDEONLY:wide /FMT("S32"):fmt
Register:Rd
','Predicate("PT"):Pu
','Register:Ra {/REUSE("noreuse"):reuse_src_a}
','Register:Rb {/REUSE("noreuse"):reuse_src_b}
',' [-] C:Sc[UImm(5/0*):Sc_bank]*   [SImm(17)*:Sc_addr]

Both P7 and PT have the same encoded value, 7 (and, by the way, wide has no corresponding encoding field). The mask for this instruction ends with "011000100101", so its lowest nibble is 0x5.
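To make the P7/PT point concrete, here is a small sketch of how a disassembler might hide an operand that holds its default value. The helpers `pred_name` and `render_operands` are hypothetical names of mine, and the 3-bit width of the predicate field is an assumption; this is not taken from nvdisasm or from nvd.

```python
PT_INDEX = 7  # per the format above: P7 and PT share the encoded value 7

def pred_name(value):
    """Name a predicate field value; index 7 is printed as PT.

    Assumption: the field is 3 bits wide (values 0..7).
    """
    if not 0 <= value <= 7:
        raise ValueError("predicate field is assumed to be 3 bits wide")
    return "PT" if value == PT_INDEX else f"P{value}"

def render_operands(rd, pu, ra, rb, hide_default=True):
    """Join operands, optionally dropping Pu when it equals its default PT."""
    ops = [f"R{rd}"]
    if not (hide_default and pu == PT_INDEX):
        ops.append(pred_name(pu))
    ops += [f"R{ra}", f"R{rb}"]
    return ", ".join(ops)

print(render_operands(2, 7, 7, 6))                      # R2, R7, R6
print(render_operands(2, 7, 7, 6, hide_default=False))  # R2, PT, R7, R6
```

With the default hidden, the operand list looks exactly like the one for a form that has no Pu at all, so the text alone no longer tells you which class was decoded.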

The main problem is that the IMAD form with Reg, Reg, Reg operands (no Pu predicate) has a different mask:

CLASS "imad__RRC_RRC"
FORMAT PREDICATE @[!]Predicate(PT):Pg Opcode /LOOnly("LO"):wide /FMT("S32"):fmt
Register:Rd
','Register:Ra {/REUSE("noreuse"):reuse_src_a}
','Register:Rb {/REUSE("noreuse"):reuse_src_b}
',' [-] C:Sc[UImm(5/0*):Sc_bank]*   [SImm(17)*:Sc_addr] 

Its mask ends with "011000100100", so its lowest nibble is 0x4.

As you can see, the original instruction bytes are 0x00005a0007027625, so nvdisasm simply produced incorrect output.
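The comparison is easy to check by hand, but here is a minimal sketch of it. It assumes the 12-bit mask tails quoted above line up with the lowest 12 bits of the 64-bit word printed by nvdisasm; `OPCODE_TAILS` and `match_classes` are hypothetical names, and the full masks are longer than what is shown here.

```python
# Trailing 12 mask bits quoted above for the two IMAD classes.
# Assumption: they correspond to bits 0..11 of the instruction word.
OPCODE_TAILS = {
    "imad_wide__RRC_RRC": "011000100101",
    "imad__RRC_RRC": "011000100100",
}

def match_classes(word):
    """Return class names whose 12-bit tail equals the low 12 bits of word."""
    low12 = word & 0xFFF
    return [name for name, bits in OPCODE_TAILS.items()
            if int(bits, 2) == low12]

word = 0x00005A0007027625  # from the nvdisasm comment
print(hex(word & 0xFFF))   # 0x625
print(match_classes(word)) # ['imad_wide__RRC_RRC']
```

The two tails differ only in the lowest bit (0x625 vs 0x624), which is why the last nibble is enough to tell the classes apart.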

Why does this happen? My hypothesis is that Nvidia simply doesn't have its own official SASS assembler, so the output of nvdisasm is never used or verified.
