I've added register tracking to both nvd and pa; you can enable it with the -T option. And I have lots of bad news.
nvdisasm lies
CS2R R100, SRZ
FORMAT PREDICATE @[!]Predicate(PT):Pg Opcode /QInteger("64"):sz
Register:Rd
','SpecialRegister:SRa
PREDICATES
IDEST_SIZE = 32 + ((sz==`QInteger@"64"))*32;
FORMAT PREDICATE @[!]UniformPredicate(UPT):UPg Opcode
UniformRegister:URd
PREDICATES
IDEST_SIZE = 32;
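To spell out what the first FORMAT implies, here is a minimal sketch that evaluates its IDEST_SIZE predicate. It assumes the quoted value in `/QInteger("64"):sz` is the default that applies when the disassembly prints no size suffix; that is a reading of the MD notation, not something documented. The function names are hypothetical.

```python
def idest_size_cs2r(sz="64"):
    """IDEST_SIZE = 32 + (sz == "64") * 32, as in the first FORMAT above.

    The default "64" is an assumption about how the MD marks defaults.
    """
    return 32 + (sz == "64") * 32


def dest_registers(first_reg, size_bits):
    """Expand a destination into the 32-bit registers it would cover."""
    return [f"R{first_reg + i}" for i in range(size_bits // 32)]


# "CS2R R100, SRZ" is printed without a suffix, so sz takes the assumed default.
print(dest_registers(100, idest_size_cs2r()))
# An explicit 32-bit form, for comparison.
print(dest_registers(100, idest_size_cs2r("32")))
```

Under that assumption the instruction writes a register pair, although the listing mentions only R100.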
Lack of documentation
Predicates
FORMAT PREDICATE @[!]Predicate(PT):Pg Opcode /ICmpAll:icmp /REDUX_SZ("S32"):fmt /Bop:bop /EXONLY:ex
Predicate:Pu
','Predicate:Pv
PREDICATES
IDEST_SIZE = 0;
IDEST2_SIZE = 0;
FORMAT PREDICATE @[!]Predicate(PT):Pg Opcode /OFMT_F16_V2_BF16_V2("F16_V2"):ofmt /FCMP:cmp /H_AND("noh_and"):h_and /FTZ("noftz"):ftz /Bop:bop
Predicate:Pu
','Predicate:Pv
PREDICATES
IDEST_SIZE = 0;
IDEST2_SIZE = 0;
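The MD gives no hint here (both IDEST_SIZE and IDEST2_SIZE are 0), so I don't know whether these instructions set only their first predicate Pu or both Pu and Pv. By the way, the famous IMAD has a very curious MD for some of its forms: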
FORMAT PREDICATE @[!]Predicate(PT):Pg Opcode /HIONLY:wide /FMT("S32"):fmt /XONLY:X
Register:Rd
','Predicate("PT"):Pu
','Register:Ra {/REUSE("noreuse"):reuse_src_a}
','Register:Rb {/REUSE("noreuse"):reuse_src_b}
',' [~] Register:Rc {/REUSE("noreuse"):reuse_src_c}
',' [!]Predicate:Pp
Usually IMAD means multiply-and-add, so Rd = Ra * Rb + Rc. But this form has two predicates, Pu and Pp, so should its semantics be Rd = Ra * Rb * Pu + Rc * Pp?
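To make the candidate readings concrete, here is a small sketch of three of them on 32-bit values. None is confirmed by the MD. The first is the usual multiply-and-add and the second is the guess from the question above. The third is purely an assumption that is not taken from the MD: with the /X modifier, Pp could be a carry-in and Pu a carry-out. The `[~]` and `[!]` negation markers are ignored, and all function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def imad_plain(ra, rb, rc):
    """Usual reading: Rd = Ra * Rb + Rc, truncated to 32 bits."""
    return (ra * rb + rc) & MASK32


def imad_predicated_guess(ra, rb, rc, pu, pp):
    """The guess from the text: Rd = Ra * Rb * Pu + Rc * Pp."""
    return (ra * rb * int(pu) + rc * int(pp)) & MASK32


def imad_carry_guess(ra, rb, rc, carry_in):
    """Assumed carry reading: Pp feeds a carry in, Pu receives the carry out."""
    total = ((ra * rb) & MASK32) + (rc & MASK32) + int(carry_in)
    return total & MASK32, bool(total >> 32)


print(imad_plain(3, 4, 5))
print(imad_predicated_guess(3, 4, 5, pu=True, pp=False))
print(imad_carry_guess(0xFFFFFFFF, 1, 0, carry_in=True))
```

Comparing the three against real register values would show which, if any, matches the hardware.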
Barriers
FORMAT PREDICATE @[!]Predicate(PT):Pg Opcode /ONLY32:sz
BD:barReg
','CBU_STATE_NONBAR:cbu_state
PREDICATES
IDEST_SIZE = 0;
IDEST2_SIZE = 0;
BD "B10"=10 , "B11"=11 , "B14"=14 , "B4"=4 , "B5"=5 , "B6"=6 , "B7"=7 , "B0"=0 , "B1"=1 , "B2"=2 , "B3"=3 , "B15"=15 , "B12"=12 , "B8"=8 , "B9"=9 , "B13"=13;
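The BD table is listed out of order, so a short sketch that parses it into a dictionary makes it easier to see that it is just B0..B15 mapped to 0..15. The parser only handles this one `NAME "label"=value , ...;` shape, and `parse_md_enum` is a hypothetical helper, not part of any tool.

```python
import re

BD_LINE = (
    'BD "B10"=10 , "B11"=11 , "B14"=14 , "B4"=4 , "B5"=5 , "B6"=6 , "B7"=7 , '
    '"B0"=0 , "B1"=1 , "B2"=2 , "B3"=3 , "B15"=15 , "B12"=12 , "B8"=8 , '
    '"B9"=9 , "B13"=13;'
)


def parse_md_enum(line):
    """Split 'NAME "label"=value , ...;' into (NAME, {label: value})."""
    name, _, body = line.partition(" ")
    pairs = re.finditer(r'"([^"]+)"\s*=\s*(\d+)', body)
    return name, {m.group(1): int(m.group(2)) for m in pairs}


name, bd = parse_md_enum(BD_LINE)
# Print the barrier registers in numeric order.
for label, index in sorted(bd.items(), key=lambda kv: kv[1]):
    print(name, label, index)
```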
ptxas produces code that is far from perfect
Never used registers
LDC R1, c[0x0][0x17c]
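A minimal sketch of the kind of check that exposes such a register: collect everything written and everything read, then take the difference. The tuple layout is hypothetical and only mimics what a register tracker might record. Every instruction except the LDC is made-up filler, and the check ignores control flow.

```python
def never_read_registers(instructions):
    """Return registers that are written somewhere but never read.

    `instructions` is a list of (text, dests, sources) tuples; this layout
    is an assumption made for the example.
    """
    written, read = set(), set()
    for _text, dests, sources in instructions:
        written.update(dests)
        read.update(sources)
    return sorted(written - read)


listing = [
    ("LDC R1, c[0x0][0x17c]", ["R1"], []),    # from the listing above
    ("MOV R2, R3", ["R2"], ["R3"]),           # made-up filler
    ("IADD R4, R2, R2", ["R4"], ["R2"]),      # made-up filler
    ("STG [R6], R4", [], ["R6", "R4"]),       # made-up filler
]

print(never_read_registers(listing))
```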
mov r154, r3
@p0 bra ; to some block with EXIT
mov r154, r3 ; srsly?
iadd r8, r2, r8 ; r8 = r2 + r8
The two iadd instructions below can be replaced with the single iadd3 that follows them:
iadd r8, r8, r8 ; r8 = r8 + r8
iadd r8, r8, ur4 ; r8 = r8 + ur4
iadd3 r8, r8, ur4, r8
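A quick sanity check of that replacement, assuming plain 32-bit wraparound addition and no side effects on flags or predicates (the helper names are hypothetical):

```python
import random

MASK32 = 0xFFFFFFFF


def iadd(a, b):
    """32-bit add with wraparound."""
    return (a + b) & MASK32


def iadd3(a, b, c):
    """32-bit three-input add with wraparound."""
    return (a + b + c) & MASK32


def two_iadds(r8, ur4):
    r8 = iadd(r8, r8)    # iadd r8, r8, r8
    r8 = iadd(r8, ur4)   # iadd r8, r8, ur4
    return r8


def single_iadd3(r8, ur4):
    return iadd3(r8, ur4, r8)  # iadd3 r8, r8, ur4, r8


random.seed(0)
for _ in range(10000):
    r8, ur4 = random.getrandbits(32), random.getrandbits(32)
    assert two_iadds(r8, ur4) == single_iadd3(r8, ur4)
print("both sequences agree on all sampled inputs")
```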