Thursday, June 20, 2024

frame sizes in dwarfdump

Today I added dumping of stack frame sizes to my dwarfdump (well, where they exist). The format of the .debug_frame section was obviously invented by Martian misanthropes, so the patch is huge and ugly
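To give a feeling of what has to be decoded, here is a minimal Python sketch (not the actual dwarfdump patch, which is not shown here) that walks a raw CFA instruction stream from .debug_frame and reports the largest CFA offset it sees. Treating that value as the frame size is an assumption of this sketch, the function names are hypothetical, and only a handful of DWARF call-frame opcodes are handled; CIE/FDE header parsing is left out completely.

```python
def read_uleb128(data, pos):
    """Decode one unsigned LEB128 value; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def max_cfa_offset(program, initial_offset=0):
    """Return the largest CFA offset set by a CFA instruction stream.

    Hypothetical helper: handles only a small subset of opcodes and
    raises on anything else instead of silently mis-parsing.
    """
    pos = 0
    current = initial_offset
    largest = initial_offset
    while pos < len(program):
        op = program[pos]
        pos += 1
        high = op & 0xC0
        if high == 0x40 or high == 0xC0 or op == 0x00:
            pass  # DW_CFA_advance_loc, DW_CFA_restore, DW_CFA_nop: no operands
        elif high == 0x80:
            # DW_CFA_offset: register in low 6 bits, ULEB128 offset follows
            _, pos = read_uleb128(program, pos)
        elif op in (0x02, 0x03, 0x04):
            # DW_CFA_advance_loc1/2/4: fixed-size address delta
            pos += {0x02: 1, 0x03: 2, 0x04: 4}[op]
        elif op == 0x0C:
            # DW_CFA_def_cfa: register, then offset
            _, pos = read_uleb128(program, pos)
            current, pos = read_uleb128(program, pos)
        elif op == 0x0D:
            # DW_CFA_def_cfa_register: register only, offset unchanged
            _, pos = read_uleb128(program, pos)
        elif op == 0x0E:
            # DW_CFA_def_cfa_offset: new offset
            current, pos = read_uleb128(program, pos)
        else:
            raise NotImplementedError(f"CFA opcode 0x{op:02x} not handled")
        largest = max(largest, current)
    return largest
```

For example, the byte sequence 0e b0 05 is DW_CFA_def_cfa_offset with the ULEB128 value 688, so max_cfa_offset(bytes([0x0E, 0xB0, 0x05])) gives 688.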

Sample of output for some random function from mips kernel:

// Addr 0x183D27C .text
// Frame Size 18
// FileName: drivers/char/random.c
// LocalVars:
//  LVar0, tag 6965A71
//    int ret
//  LVar1, tag 6965A8F
//    bool branch
ssize_t random_read_iter(struct kiocb* kiocb,struct iov_iter* iter);

and the same in JSON format (numbers here are decimal, so frame size 0x18 becomes 24):

"110516780":{"type":"function","file":"drivers/char/random.c","type_id":"110427157","name":"random_read_iter","addr":"25416316","section":".text","frame_size":"24","params":[{"name":"kiocb","id":"110516807","type_id":"110466611"},{"name":"iter","id":"110516828","type_id":"110470510"}],"lvars":["110516849":{"type":"var","file":"drivers/char/random.c","owner":"110516780","type_id":"110426600","name":"ret"},
"110516879":{"type":"var","file":"drivers/char/random.c","owner":"110516780","type_id":"110427073","name":"branch"}]}

Unfortunately dwarfdump can't work with kernel modules because they are actually just object files, and for this sad reason they have relocations even for debug sections. So to deal with these files properly I need to apply relocations first, and this is an arch-specific action (which I prefer to avoid)

Binutils has 2 solutions to this problem:

  1. objcopy calls bfd_simple_get_relocated_section_contents from libbfd.so, which means the tool has to depend on it
  2. readelf has its own relocation code in apply_relocations, and this is a huge pile of code

And I really don't like either of the above approaches

Tuesday, June 18, 2024

function stack size in GCC

Let's continue our wandering in an endless dead end
GCC has struct stack_usage and even a field su inside struct function. And the latter is accessible via the global var cfun. So it should be easy to patch dwarf2out.cc yet one more time, for example to extract the stack size (like function output_stack_usage_1 does) and put it inside the DW_AT_frame_base block, right?

NO

As you can see, function allocate_stack_usage_info is called only when
In other cases field function->su is zero
There should probably be a heartbreaking conclusion about the quality of open source in general and gcc in particular...

Saturday, June 15, 2024

stack frames size in DWARF

As you might suspect, the stack size in the kernel is quite meager, so it's very important to know how much stack your driver occupies. So I conducted inhumane experiments on my own driver lkcd to check whether stack frame size can be extracted from DWARF debug info. The biggest function in my driver is lkcd_ioctl, so let's explore it

mips

Prolog of lkcd_ioctl looks like:
addiu   sp,sp,-688
 
Let's try to find this number in the output of objdump -g

Sunday, June 9, 2024

kotest fix for mips

kotest refused to count string literals for MIPS kernel modules. The reason was that gcc does not emit sizes/object types for unnamed string literals - in asm files they look like
$LC0:
        .ascii  "const string %f\012\000"

At the same time it does for named literals:
        .type   fmt_msg, @object
        .size   fmt_msg, 15
fmt_msg:
        .ascii  "enter with %d\012\000"

I am too lazy to investigate which ancient specification from the past century it follows. Fortunately this is an easily repairable problem - just calculate the size of a symbol as the distance to the next one (or to the end of the section). Since I suspect that this is not the only architecture with similar gcc behavior, I added the -f option to do this kind of size recalculation
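The recalculation itself can be sketched in a few lines of Python. This is only an illustration of the idea, not the code behind -f: the symbol records are plain dicts with hypothetical keys, and the sample addresses are made up. Note that the distance to the next symbol also swallows alignment padding, so the result can be slightly larger than the real string.

```python
def fill_missing_sizes(symbols, section_sizes):
    """Return copies of symbols where zero sizes are replaced by the
    distance to the next symbol in the same section (or to its end).

    symbols: list of dicts with keys 'name', 'section', 'value', 'size'
    section_sizes: dict mapping section name -> section size in bytes
    """
    by_section = {}
    for sym in symbols:
        by_section.setdefault(sym["section"], []).append(sym)

    fixed = []
    for section, syms in by_section.items():
        syms = sorted(syms, key=lambda s: s["value"])
        end = section_sizes[section]
        for i, sym in enumerate(syms):
            size = sym["size"]
            if size == 0:
                # first symbol at a strictly higher address, else section end
                next_addr = next(
                    (s["value"] for s in syms[i + 1:] if s["value"] > sym["value"]),
                    end,
                )
                size = next_addr - sym["value"]
            fixed.append({**sym, "size": size})
    return fixed


# Made-up layout resembling the asm above: two unsized literals around fmt_msg
sample = [
    {"name": "$LC0", "section": ".rodata.str", "value": 0, "size": 0},
    {"name": "fmt_msg", "section": ".rodata.str", "value": 20, "size": 15},
    {"name": "$LC1", "section": ".rodata.str", "value": 36, "size": 0},
]
sizes = {s["name"]: s["size"] for s in fill_missing_sizes(sample, {".rodata.str": 48})}
```

Symbols that already carry a size (like fmt_msg) are left alone; only the unsized ones get the computed distance.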

Some results for mips32 kernel 6.0:
find ~/linux60/ -type f -name "*.ko" | xargs ./kotest | awk -f total.awk
1890293
1497092

potential memory savings are almost 1.9 MB from moving string literals used only in .init.text, plus almost 1.5 MB more from unloading some unnecessary sections

Monday, May 27, 2024

kotest

The Linux kernel allows you to have discardable sections in an LKM, and this creates a problem of links between two kinds of memory. As you can guess, keeping a pointer to an already unloaded area can be very dangerous, so I made a simple tool, kotest, to check such links. It divides the sections of an ELF file into two categories and checks all relocations - relocs between areas of the same type are considered ok. To keep track of whether some symbol from a persistent area is used only from discardable sections I also use a couple of reference counts
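A rough Python model of that logic is below. It is a sketch under assumptions, not kotest's code: relocations are reduced to (source section, target symbol) pairs, the rule "section name starts with .init or .exit means discardable" is my simplification, and all function and variable names are hypothetical.

```python
from collections import defaultdict

# Assumption of this sketch: these prefixes mark discardable sections
DISCARDABLE_PREFIXES = (".init", ".exit")


def is_discardable(section):
    return section.startswith(DISCARDABLE_PREFIXES)


def analyze(relocs, symbol_section):
    """relocs: iterable of (source_section, target_symbol) pairs.
    symbol_section: dict mapping symbol name -> section it lives in.

    Returns (dangerous, init_only):
      dangerous - links from persistent code/data into discardable areas
      init_only - persistent symbols referenced only from discardable areas
    """
    from_discardable = defaultdict(int)
    from_persistent = defaultdict(int)
    dangerous = []
    for src_section, target in relocs:
        src_disc = is_discardable(src_section)
        dst_disc = is_discardable(symbol_section[target])
        if dst_disc:
            if not src_disc:
                dangerous.append((src_section, target))
            continue  # target is already discardable, nothing to save
        if src_disc:
            from_discardable[target] += 1
        else:
            from_persistent[target] += 1
    init_only = sorted(t for t in from_discardable if from_persistent[t] == 0)
    return dangerous, init_only
```

The two counters per symbol are the "couple of reference counts": a persistent symbol becomes a candidate for moving only when the count of references from persistent sections stays at zero.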

 

command line options

  • -b take into account variables in .bss
  • -h make hexdump of found vars
  • -v verbose mode 
To run on lots of LKMs use something like
find path_to_kernel_root -type f -name "*.ko" | xargs kotest
 
To get a summary you can run awk -f total.awk on the output of the previous command

 

is it reliable to analyze only fixups?

No - there are false positives. Consider an excerpt from ip_vs.ko, function ip_vs_register_nl_ioctl:
.init.text:0000000000016155   mov     rdi, offset ip_vs_genl_family
.init.text:000000000001615C   mov     cs:ip_vs_genl_family.module, offset __this_module
.init.text:0000000000016167   mov     cs:ip_vs_genl_family.ops, offset ip_vs_genl_ops
.init.text:0000000000016172   mov     cs:ip_vs_genl_family.mcgrps, 0
.init.text:000000000001617D   mov     qword ptr cs:ip_vs_genl_family.n_ops, 10h
.init.text:0000000000016188   call    __genl_register_family

it turns out that ip_vs_genl_ops (located inside the .rodata section) is referred to only from function ip_vs_register_nl_ioctl in .init.text, but actually it cannot be moved to a discardable area because it was registered with genl_register_family. Kotest cannot analyze usage of addresses and so gives a false positive:
.rodata + 5A0 (ip_vs_genl_ops) rref 1 xref 0 add size 768
 
Another issue is string merging by ld. Let's assume that we have a couple of strings: "foobar" referred to from some function(s) in the .text section and "bar" referred to from code in .init.text. The linker can (and usually does) put only the string "foobar" into .rodata, and the fixup to string "bar" will point into the middle of this single string "foobar"
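The effect is easy to reproduce with a toy model. The Python sketch below is not ld's algorithm, just a simplified illustration of tail merging for NUL-terminated strings (the function name is hypothetical): a string that is a suffix of an already stored one gets an offset inside it instead of its own copy.

```python
def merge_strings(strings):
    """Toy tail-merge: return (blob, offsets) where offsets maps each
    string to its position inside the merged NUL-terminated blob."""
    blob = b""
    offsets = {}
    # Place longer strings first so shorter suffixes can land inside them
    for s in sorted(strings, key=len, reverse=True):
        raw = s.encode() + b"\0"
        pos = blob.find(raw)  # a match always ends on an existing terminator
        if pos < 0:
            pos = len(blob)
            blob += raw
        offsets[s] = pos
    return blob, offsets


blob, offsets = merge_strings(["bar", "foobar"])
```

Here blob holds only "foobar" plus its terminator, and "bar" resolves to offset 3 inside it, so the reference from .init.text really points into data that persistent code also uses.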
 
So consider the output of kotest as an estimated upper bound of the memory that can potentially be saved by moving data into a discardable area

 

why not use famous objtool?

Because of NIH syndrome objtool employs a disassembler, and as a consequence it is slow and supports only a few architectures. Kotest is based on elfio and can process both 32- and 64-bit ELF files from any arch (and it is very fast)


LKM loading

Thursday, May 16, 2024

linux input handles

Try to convince me that input_register_handle is not the best place for installing a keylogger; it's even strange that they were too embarrassed to hook their holy cow eBPF up there. Long story short - there are 3 structures in the linux kernel for servicing input devices:

  1. input_dev chained in list input_dev_list (non-exported, of course)
  2. input_handler chained in list input_handler_list
  3. input_handle with pointer to input_handler and attached to input_dev (in list h_list)

So a keylogger could

  • just call input_register_handle
  • to be more stealthy - patch function pointers in an already registered input_handler (very convenient that sysrq_handler lacks the event method)
  • attach its own input_handle to the desired input_dev but without registering a corresponding input_handler - yes, this is perfectly legal
  • patch function pointers directly in input_dev

Guess in three tries what exactly you can extract from sysfs?
So I added dumping of all the above-mentioned structures to my lkcd. Sample of output

Wednesday, May 8, 2024

asm injection stub

Let's check what this stub should do when injected into some linux process via __malloc_hook/__free_hook (btw this implicitly means that you cannot use this dirty hack for processes linked with musl or uClibc - they just don't have those hooks)
  • because our stub can be called from two different hooks, we should store somewhere which entry point we were called through
  • restore the old hook values
  • call dlopen/dlsym and then the target function (and pass it the address of the injection stub for delayed munmap. No, you can't free that memory directly in your target function - try to guess why)
  • get the right old hook and jump to it if one was installed, or just return to the code that called __malloc_hook somewhere in libc

So I collected all the parameters needed to do the job in a table dtab consisting of 6 pointers

  1. __malloc_hook address
  2. old value of __malloc_hook
  3. __free_hook address
  4. old value of __free_hook
  5. pointer to dlopen
  6. pointer to dlsym
after this table we also have a couple of string constants for the injected.so full path and function name. Also, because we must set up 2 entry points, I decided to put 1 byte with the distance between the first and the second (to make injection logic more universal) right after dtab. Sounds easy, so let's check how this logic can be implemented on some still living processors (given that RIP alpha, sparc, hp-pa etc)
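The data layout described above can be sketched with Python's struct module. This is my reading of the description, with several assumptions: 64-bit little-endian pointers, the distance byte directly after the six pointers, then the two NUL-terminated strings back to back. The function name and all sample values are hypothetical.

```python
import struct

POINTER_COUNT = 6  # the six dtab entries listed above


def build_stub_data(malloc_hook, old_malloc_hook, free_hook, old_free_hook,
                    dlopen_ptr, dlsym_ptr, entry_distance, so_path, func_name):
    """Pack dtab, the entry-distance byte and both strings into one blob.

    Assumes 8-byte little-endian pointers; a 32-bit target would use '<6I'.
    """
    dtab = struct.pack(
        "<6Q",
        malloc_hook, old_malloc_hook,
        free_hook, old_free_hook,
        dlopen_ptr, dlsym_ptr,
    )
    distance = struct.pack("<B", entry_distance)  # must fit in one byte
    strings = so_path.encode() + b"\0" + func_name.encode() + b"\0"
    return dtab + distance + strings


# Made-up addresses, only to show the resulting offsets
data = build_stub_data(0x7F0000001000, 0, 0x7F0000001008, 0,
                       0x7F0000200000, 0x7F0000200100,
                       12, "/tmp/injected.so", "inject")
```

With this layout the distance byte sits at offset 48 and the library path starts at offset 49, so the stub can reach everything relative to the start of dtab.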