Unfortunately, the only SASS assembler I know of has several drawbacks:
- it has been inactive for the last couple of years. I emailed its author and he didn't reply. I hope he is well
- it doesn't support the modern sm1xx architectures
- its matmul solver sometimes produces wrong instructions
- and it doesn't support many EIATTRs
The last problem is not specific to CuAssembler - it is more general: it seems that nvdisasm produces output which cannot be used to assemble cubin files
Also, we still don't know the format of some sections like SHT_CUDA_RELOCINFO. All this makes the task of rebuilding cubin files very hard
However, do we really need to rebuild cubin files? In my experience 99.9% of desired patches just set or remove some instruction attributes like register reuse, caching policy, wait groups for USCHED_INFO etc. - just boring tuning to squeeze out the last couple of percent of performance
So the train of thought was something like
- it would be good to make a plugin for a hex editor that disassembles the SASS instruction at some known offset and shows a GUI where I could patch some fields
- I am talentless at creating GUIs - so perhaps it would be better to dump the instruction fields in text form and then just edit them
- hey - if you can parse this text representation and patch it back to SASS - you don't need a hex editor at all - you could just use a sed-like tool to patch instructions via a script
and so, being lazy and impatient, I wrote such a tool - it's called ced. The name's similarity to sed is no coincidence - it lets you run a text script to patch or replace some SASS instructions inside cubin files
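To make the "patch some fields at a known offset" idea concrete, here is a minimal Python sketch. It is not how ced works internally, just an illustration of the idea. It assumes a 16-byte little-endian instruction word (the 128-bit encoding used by recent sm architectures), and the bit position and width in the example are hypothetical, not a real SASS field layout.

```python
def get_field(word: int, lo: int, width: int) -> int:
    """Return the `width`-bit field that starts at bit `lo` of `word`."""
    return (word >> lo) & ((1 << width) - 1)


def set_field(word: int, lo: int, width: int, value: int) -> int:
    """Return `word` with the field [lo, lo + width) replaced by `value`."""
    mask = (1 << width) - 1
    if value < 0 or value > mask:
        raise ValueError(f"value {value} does not fit in {width} bits")
    return (word & ~(mask << lo)) | (value << lo)


def patch_instruction(image: bytearray, offset: int, lo: int, width: int,
                      value: int, size: int = 16) -> None:
    """Patch one field of the instruction stored at `offset` in `image`.

    `size` is the instruction size in bytes; 16 is an assumption here.
    """
    if offset < 0 or offset + size > len(image):
        raise ValueError("instruction lies outside the image")
    word = int.from_bytes(image[offset:offset + size], "little")
    word = set_field(word, lo, width, value)
    image[offset:offset + size] = word.to_bytes(size, "little")


# Hypothetical example: write 0b1010 into a 4-bit field at bit 105
# of the second instruction in a zero-filled buffer.
image = bytearray(32)
patch_instruction(image, 16, lo=105, width=4, value=0b1010)
```

Everything else in the instruction stays untouched, which is exactly why this kind of patching does not require rebuilding the whole cubin.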
Syntax of ced scripts
selecting section/function to patch
- s section_index - can be obtained from readelf -S or from my nvd
- sn section_name
- fn function_name. Note that a single section can contain several functions
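If you have neither readelf nor nvd at hand, the section indexes and names can also be read straight from the ELF section header table. Below is a small Python sketch of that; it assumes the cubin is a little-endian ELF64 image and ignores corner cases such as extended section numbering. The function name list_sections is mine, not part of ced.

```python
import struct


def list_sections(data: bytes):
    """Return (index, name, sh_type, sh_offset, sh_size) per ELF64 section."""
    # e_ident: magic, EI_CLASS == 2 (64-bit), EI_DATA == 1 (little-endian)
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    (e_shoff,) = struct.unpack_from("<Q", data, 0x28)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", data, 0x3A)

    headers = []
    for i in range(e_shnum):
        # First six fields of Elf64_Shdr: name, type, flags, addr, offset, size
        sh_name, sh_type, _flags, _addr, sh_offset, sh_size = struct.unpack_from(
            "<IIQQQQ", data, e_shoff + i * e_shentsize)
        headers.append((sh_name, sh_type, sh_offset, sh_size))

    # Section names live in the string table pointed to by e_shstrndx
    _, _, str_off, str_size = headers[e_shstrndx]
    strtab = data[str_off:str_off + str_size]

    result = []
    for i, (sh_name, sh_type, sh_offset, sh_size) in enumerate(headers):
        end = strtab.find(b"\0", sh_name)
        if end < 0:
            end = len(strtab)
        name = strtab[sh_name:end].decode("ascii", "replace")
        result.append((i, name, sh_type, sh_offset, sh_size))
    return result
```

Typical usage would be `for idx, name, *_ in list_sections(open(path, "rb").read()): print(idx, name)`, where path is your cubin file; the printed index is what the s command expects and the name is what sn expects.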
replacing instruction
patching selected fields of instruction
abSize
CInteger@U8 CInteger@U8 -> 0
CInteger@U8 CInteger@S8 -> 1
CInteger@U8 CInteger@U16 -> 2
CInteger@U8 CInteger@S16 -> 3
CInteger@U8 CInteger@"32" -> 4
...
Command line options
ced [options] path2cubin path2script
- -k - dump fields and values of instructions
- -v - verbose mode, e.g. dump field names etc.
- -t - dump symbols
- -d - dump lots of mostly useless stuff for debugging
Limitations
- arch specific
- unknown
Building
For historical reasons all the source code lives in the subdir test. First edit test/Makefile to set the ELFIO/FP16 header directories and then just run
cd test
make
Next you need to build smxx.so for your CUDA card - this involves translating the MD into C++ code with a giant Perl script, so you must also install Perl (I used 5.30 but probably any standard perl5 from your Linux distro will be ok). I used the Confess module for debugging - you can either install it from CPAN or just remove it from PERL_OPTS in test/Makefile
Final step - set and export the env var SM_DIR to the full path of the directory with the smxx.so shared libraries:
export SM_DIR=`pwd`
Happy hacking!