Saturday, July 19, 2025

sass instructions: LUT operations

I was asked yesterday why I didn't transform the sample from my previous post

iadd r8, r2, r8 ; r8 = r2 + r8
iadd r8, r8, r8 ; r8 = r8 + r8
iadd r8, r8, ur4 ; r8 = r8 + ur4

into the simpler

imad r8, r8, 2, ur4 ; r8 = r8 * 2 + ur4

While this is technically correct, the problem is that the ISA is non-orthogonal. You can use my ina to check the available forms of IMAD with uniform registers, and it turns out there are only 2 of them:

  1. @Pg IMAD E:wide E:fmt E:Rd E:Pu E:Ra E:reuse_src_a E:Rb E:reuse_src_b -E:URc
  2. @Pg IMAD E:wide E:fmt E:Rd E:Pu E:Ra E:reuse_src_a E:URb -E:Rc E:reuse_src_c

And there are no forms with an immediate value for Ra/Rb. So you can only generate something like:

imad r8, r8, rXX, ur4

And UIMAD does have forms with immediate values, but they take uniform registers only:

  1. @UPg UIMAD E:wide E:fmt E:X E:URd E:UPu E:URa ,Sb ~E:URc !E:UPp
  2. @UPg UIMAD E:wide E:fmt E:URd E:UPu E:URa ,Sb -E:URc
  3. etc

But all this is child's play compared to LUT operations. In short, you can have 256 combinations of logical operations over 3 operands, selected by an 8-bit index. nvdisasm shows them like:

LOP3.LUT R0, R3, R0, RZ, 0x30, !PT 

Very informative, yeah. So I used sympy to generate a table of simplified expressions. However, I am too old and lazy to write Python scripts by hand, so the solution was pretty obvious:

  • make a perl script that enumerates all possible combinations and generates a python script
  • which in turn generates a string table
  • and then sed adds quotes and commas
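
For the curious, the same kind of table could also be produced directly in Python, without the perl and sed steps. This is only a minimal sketch and not necessarily what my scripts do: it assumes sympy's `SOPform` is good enough as a simplifier (it yields a sum-of-products form, so it will never print XOR), and it assumes the usual LOP3 bit order where `a` is the most significant bit of the index. The names `lut_expr` and `table` are made up for illustration.

```python
from sympy import symbols
from sympy.logic.boolalg import SOPform

a, b, c = symbols("a b c")

def lut_expr(lut):
    """Return a simplified sympy expression for an 8-bit LOP3 LUT index."""
    # Bit idx of lut is the output for inputs
    # a = bit 2 of idx, b = bit 1 of idx, c = bit 0 of idx.
    minterms = [[(idx >> 2) & 1, (idx >> 1) & 1, idx & 1]
                for idx in range(8) if (lut >> idx) & 1]
    return SOPform([a, b, c], minterms)

# One string per LUT index, ready to be dumped as a string table.
table = [str(lut_expr(lut)) for lut in range(256)]
print(f"LUT {0x30:02x}: {table[0x30]}")
```

Index 0x00 comes out as constant false and 0xff as constant true, so a real table would probably want to special-case those.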
And now my disasm shows much clearer output:

LOP3.LUT PT,R0,R3,R0,RZ, 0x30,!PT &req={5}; LUT 30: a & ~b

So here a = R3, b = R0 and result R0 = R3 & ~R0
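
To double-check such a decoding, the LUT can be emulated in a few lines. A minimal sketch, assuming the convention described in the PTX documentation for lop3, where the index is obtained by applying the operation to the constants 0xF0 (a), 0xCC (b) and 0xAA (c). The helpers `lop3` and `lut_index` are my own illustrative names, not part of any tool.

```python
MASK32 = 0xFFFFFFFF

def lop3(a, b, c, lut):
    """Emulate LOP3.LUT on 32-bit values for a given 8-bit LUT index."""
    result = 0
    for idx in range(8):
        if not (lut >> idx) & 1:
            continue
        # Pick the bit positions where (a, b, c) match the pattern in idx:
        # bit 2 of idx stands for a, bit 1 for b, bit 0 for c.
        ta = a if idx & 4 else ~a
        tb = b if idx & 2 else ~b
        tc = c if idx & 1 else ~c
        result |= ta & tb & tc
    return result & MASK32

def lut_index(fn):
    """Compute the LUT index of a bitwise function of three operands."""
    return fn(0xF0, 0xCC, 0xAA) & 0xFF

# a & ~b gives back the index from the disassembly above
print(hex(lut_index(lambda a, b, c: a & ~b)))  # 0x30
```

With this, `lop3(r3, r0, 0, 0x30)` should equal `r3 & ~r0` for any 32-bit values, which matches the decoded expression.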
