I've made a native SASS disassembler by adding C++ codegen (it can be produced by ead.pl with the -C option). It works by dynamically loading the right disasm module; see the list of supported architectures in the map s_sms. For now it supports only an operands dump with the -O option; instructions are not rendered yet (because rewriting a bunch of duck-typed Perl code in C++ is boring and tedious work). You can also dump attributes with the -e option. You can build those modules with something like "make sm90.so". By the way, dumb gcc allocates ~600kb on the stack for local vars; with the -Os option stack consumption shrinks to normal values, but each module then takes 10 minutes to compile.
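A minimal sketch of what loading a per-architecture module at run time could look like, using POSIX dlopen/dlsym. Everything except the file name sm90.so is an assumption made for illustration: the table sm_modules only imitates the idea of s_sms (the real map may be keyed and shaped differently), the exported symbol name "get_disasm" is hypothetical, and so is the factory signature.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <map>
#include <string>

// Hypothetical table in the spirit of s_sms: SM version -> module file name.
static const std::map<int, std::string> sm_modules = {
    {90, "sm90.so"},  // the only module name taken from the post
};

// Returns the module file name for an SM version, or "" if it is not listed.
std::string module_for_sm(int sm) {
    auto it = sm_modules.find(sm);
    return it == sm_modules.end() ? std::string() : it->second;
}

// Hypothetical factory signature; the real modules may export something else.
using factory_fn = void *(*)();

// Loads the module for `sm` and resolves its factory symbol.
// Returns nullptr if the SM is unknown, the file cannot be loaded,
// or the symbol is missing. On success *handle_out owns the module.
factory_fn load_disasm(int sm, void **handle_out) {
    assert(handle_out != nullptr);
    *handle_out = nullptr;
    std::string name = module_for_sm(sm);
    if (name.empty()) return nullptr;
    std::string path = "./" + name;  // assumption: modules live in the current directory
    void *h = dlopen(path.c_str(), RTLD_NOW);
    if (!h) return nullptr;
    void *sym = dlsym(h, "get_disasm");  // assumed symbol name
    if (!sym) {
        dlclose(h);
        return nullptr;
    }
    *handle_out = h;
    return reinterpret_cast<factory_fn>(sym);
}
```

On older glibc versions this needs to be linked with -ldl. The caller is expected to dlclose() the returned handle once the disassembler object is no longer used.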
Tests show zero unrecognized instructions (and I am truly proud of this). However, if you do find one, I also added the -N option to dump its content as a bit-mask, which you can then pass to ead.pl with the same -N option to see what happened.
On the other hand, it seems that nvidia is trying to hide something important from us. Let's check libcublas.so from v12; we can notice lots of sections:
- .nv.merc.nv.info - genuine nvdisasm is unable to show their content
- .nv.capmerc.text - however, the instructions they contain are clearly in some other format and cannot be disassembled. I added the -s option to disassemble a single section by its index, so you can try it yourself
- and they obviously have corresponding relocs in sections .nv.merc.rela.text
- and even .nv.merc.rela.debug_frame & .nv.merc.symtab
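To pick out such sections for a closer look, a small sketch like the following could filter section names by the two prefixes seen above. It assumes the names were already read from the ELF section header table in index order (reading them is not shown), so the returned positions correspond to section indexes; the function names are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True if `s` starts with `prefix`.
bool has_prefix(const std::string &s, const std::string &prefix) {
    return s.compare(0, prefix.size(), prefix) == 0;
}

// True for the section families listed above (.nv.merc.* and .nv.capmerc.*).
bool is_merc_section(const std::string &name) {
    return has_prefix(name, ".nv.merc.") || has_prefix(name, ".nv.capmerc.");
}

// Returns positions of matching names; with names given in section-header
// order these are the section indexes.
std::vector<std::size_t> merc_section_indexes(const std::vector<std::string> &names) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < names.size(); ++i)
        if (is_merc_section(names[i])) out.push_back(i);
    return out;
}
```

Each index found this way is a candidate to feed to the -s option.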
Known problems
Duplicates
[NonZeroRegister:Ra + SImm(17/0)*:Ra_offset]
[ZeroRegister("RZ"):Ra + SImm(17/0)*:Ra_offset]
The problem here is that the enum NonZeroRegister also contains the value 255 for RZ, so the two forms are totally indistinguishable: for the first form the mask for Ra is xxxxxxxx, while for the second form the generated mask for the Ra field is 11111111. As you can guess, such input matches both, and ead.pl shows exactly the same decoding for each:
LDC R1,C[0x0][0x28]
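The overlap can be reproduced with a tiny mask matcher. This is only a sketch of the idea, not the matcher used by ead.pl: it assumes a mask is a string of '0', '1' and 'x' (don't care) with the most significant bit first, and the function names are invented here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Does `value` match a mask made of '0', '1' and 'x' (don't care)?
// Assumption: the leftmost character is the most significant bit of the field.
bool mask_matches(const std::string &mask, std::uint64_t value) {
    assert(mask.size() <= 64);
    for (std::size_t i = 0; i < mask.size(); ++i) {
        if (mask[i] == 'x') continue;  // any bit is accepted here
        unsigned bit = (value >> (mask.size() - 1 - i)) & 1u;
        if ((mask[i] == '1') != (bit == 1)) return false;
    }
    return true;
}

// True if every value accepted by `narrow` is also accepted by `wide`,
// i.e. the two forms cannot be told apart for inputs matching `narrow`.
bool mask_subsumes(const std::string &wide, const std::string &narrow) {
    if (wide.size() != narrow.size()) return false;
    for (std::size_t i = 0; i < wide.size(); ++i)
        if (wide[i] != 'x' && wide[i] != narrow[i]) return false;
    return true;
}
```

With Ra = 255 (RZ), both mask_matches("xxxxxxxx", 0xff) and mask_matches("11111111", 0xff) hold, and mask_subsumes("xxxxxxxx", "11111111") is true, which is exactly why both forms decode the same input.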
Branches
...RSImm(58)*:sImm
PROPERTIES INSTRUCTION_TYPE = INST_TYPE_DECOUPLED_BRU_DEPBAR_RD_SCBD; BRANCH_TARGET_INDEX = INDEX(sImm) ; BRANCH_TYPE = BRT_BRANCH ;
FORMAT PREDICATE @[!]Predicate(PT):Pg Opcode /DEPTH("nodepth"):depth [!]Predicate("PT"):Pp ','Register:Ra SImm(58/0)*:Ra_offset
PROPERTIES INSTRUCTION_TYPE = INST_TYPE_DECOUPLED_BRU_DEPBAR_RD_SCBD; BRANCH_TARGET_INDEX = INDEX(Ra) ; BRANCH_TYPE = BRT_BRANCH ;
FORMAT PREDICATE @[!]Predicate(PT):Pg Opcode /ABSONLY:abs /CALL_DEPTH("INC"):depth [!]Predicate("PT"):Pp ','C:Sa[UImm(5/0*):Sa_bank]* [SImm(17)*:Sa_addr]
PROPERTIES INSTRUCTION_TYPE = INST_TYPE_DECOUPLED_BRU_DEPBAR_RD_SCBD; BRANCH_TARGET_INDEX = INDEX(Sa) ; BRANCH_TYPE = BRT_CALL ;
Other quirks
FORMAT PREDICATE @[!]Predicate(PT):Pg Opcode /U(noU):uniform /LMT(noLMT):lmt { CC(CC):TestCC/Test(T):CCTest }
ENCODING Opcode12 = Opcode; Pred = Pg; PredNot = Pg@not; CCC_1 = Test; CA = 0; Imm24 = sImm; U = U; LMT = LMT; !NencBRA; !RegA;
There is no encoding for the field TestCC, nor for CCTest. Instead there is just a 5-bit field Test, whose enum Test has all 32 values, with T = 0xf. On the other hand, CC is a 1-bit enum with the single value CC=1. I have no idea which bit of Test should be used for CC.