Tuesday, June 23, 2026

RE of PTX grammar from ptxas, part 4

Parts 1, 2 & 3

First of all, note that the mask of instruction attributes is 20 bytes long, so I updated the dump accordingly.
The structure of this attributes descriptor is 0xd8 bytes in size and has, among others, these fields:
  • mask at offset 0
  • name of instruction at 0xC8
  • index at 0xD0
Instructions are selected first by name, and then the right form is chosen by operand types. This means that while the order of the attributes does not matter, the relative order of the operand types does: the leftmost is the type of operand 0, the next is the type of operand 1, and so on.
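To make the layout and the selection rule concrete, here is a minimal sketch. The offsets and sizes 0xd8, 0, 0xC8, 0xD0 and 20 come from the text above. Everything else is an assumption made for illustration: that the name field is an 8-byte little-endian pointer, that the index is a 32-bit value, and that operand types are available as an ordered list per form (the helper names and the input format of `build_form_table` are hypothetical, not taken from ptxas or iptx.pl).

```python
import struct
from collections import defaultdict

DESC_SIZE = 0xD8              # size of one attributes descriptor (from the post)
MASK_OFF, MASK_LEN = 0x00, 20  # attribute mask (from the post)
NAME_OFF = 0xC8               # assumption: 8-byte little-endian pointer to the name
INDEX_OFF = 0xD0              # assumption: 32-bit little-endian index


def parse_descriptor(blob, offset=0):
    """Split one 0xd8-byte descriptor into the three known fields."""
    if len(blob) - offset < DESC_SIZE:
        raise ValueError("truncated descriptor")
    mask = bytes(blob[offset + MASK_OFF: offset + MASK_OFF + MASK_LEN])
    (name_ptr,) = struct.unpack_from("<Q", blob, offset + NAME_OFF)
    (index,) = struct.unpack_from("<I", blob, offset + INDEX_OFF)
    return {"mask": mask, "name_ptr": name_ptr, "index": index}


def build_form_table(forms):
    """forms: iterable of (name, operand_types, index); hypothetical input."""
    table = defaultdict(dict)
    for name, operand_types, index in forms:
        # Operand types are positional, so they are kept as an ordered tuple.
        table[name][tuple(operand_types)] = index
    return table


def select_form(table, name, operand_types):
    """Pick by name first, then by the exact ordered operand types."""
    return table.get(name, {}).get(tuple(operand_types))


def same_attributes(a, b):
    """Attribute order is irrelevant, so attributes are compared as sets."""
    return set(a) == set(b)
```

With a table built from two hypothetical forms of the same name, `select_form(table, name, types)` returns an index only when every operand type matches in position, while `same_attributes` accepts the same attributes in any order.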

 

Names of numerical pseudo-instructions

In part 3 I pointed out that there are 473 names consisting only of digits, like "1030557441". Grigory Evko suggested that these are adler32 hashes of builtin function names, so I found a huge function for instruction 0xc6 (_gen_proto) that returns 1078 prototypes like
.weak .func (.reg .f32 %fv1) __cuda_sm20_div_rz_f32 (.reg .f32 %fa1, .reg .f32 %fa2)
and then matched them against the numeric names by hash, so now we know all the real names.
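The matching step can be sketched roughly as follows. This is not the code from iptx.pl. It assumes that the hash is taken over the bare ASCII function name and that the numeric name is the unsigned decimal form of the hash; both details are guesses made for illustration, and the function names are hypothetical.

```python
import re
import zlib

# Matches the function name in a prototype such as
# ".weak .func (.reg .f32 %fv1) __cuda_sm20_div_rz_f32 (...)".
# The parenthesized return parameter is optional.
PROTO_NAME = re.compile(r"\.func\s*(?:\([^)]*\)\s*)?([A-Za-z_$][\w$]*)")


def proto_name(prototype):
    """Extract the function name from one prototype line, or None."""
    m = PROTO_NAME.search(prototype)
    return m.group(1) if m else None


def adler32_decimal(name):
    """Adler-32 of the ASCII name as an unsigned decimal string (assumed form)."""
    return str(zlib.adler32(name.encode("ascii")) & 0xFFFFFFFF)


def resolve_numeric_names(numeric_names, prototypes):
    """Map each numeric pseudo-instruction name to a real name, if any matches."""
    by_hash = {}
    for proto in prototypes:
        name = proto_name(proto)
        if name is not None:
            by_hash[adler32_decimal(name)] = name
    return {n: by_hash[n] for n in numeric_names if n in by_hash}
```

Numeric names that stay unresolved after this pass would point either at a different hash input than assumed here or at prototypes missing from the extracted list.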


EBNF grammar

You can see it here
To build it, run iptx.pl -e
The last two columns are the operand suffix & encoding.

 

How complete is it?

That's a good question. If we accept that the attribute descriptors contain the full list of attributes for each instruction, then the 20-byte masks have 121 non-zero bits:
FD FF FF FF F1 FF FF 9F F9 FF E7 CF FF F3 DF FF FF 00 00 00
I was able to identify 114 of them, which is 94%.
Also, I have so far extracted 119 tables with attribute names, and only 11 are still not connected (check them with iptx.pl -t).
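The bit count and the percentage above are easy to re-check. A small sketch using only the mask bytes and the figure of 114 quoted in the text:

```python
MASK_HEX = "FD FF FF FF F1 FF FF 9F F9 FF E7 CF FF F3 DF FF FF 00 00 00"
mask = bytes.fromhex(MASK_HEX)  # bytes.fromhex ignores the spaces


def popcount(data):
    """Number of set bits across all bytes."""
    return sum(bin(b).count("1") for b in data)


used_bits = popcount(mask)
identified = 114  # number of identified bits, as stated in the text

print(len(mask), used_bits, f"{identified / used_bits:.1%}")  # 20 121 94.2%
```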

On the other hand, the function that processes attributes contains 3 switch tables with 139, 140 & 173 cases (in the last one, ~90% of the entries lead to the error "Unexpected instruction types specified").
