Wednesday, December 7, 2022

timers in linux kernel

Timers are a very important artifact for forensics; for example, Volatility even has a plugin to dump timers from the Windows kernel. Unfortunately, Volatility cannot dump timers from the Linux kernel, so I added such a dump to my lkcd (with the -T option)

Kernel timers are just timer_list structures, and the most important field is

void (*function)(unsigned long); 

because if your machine has been rootkitted, one of the timers will probably contain an address from some unknown module. Timers are chained into linked lists via the entry field, and many of these lists are stored in the vectors array inside the per-cpu variable timer_base. As you can see, there can be 2 instances of this structure; this depends on the undocumented config option CONFIG_NO_HZ_COMMON

Some timers are part of a so-called workqueue, inside the delayed_work structure. In that case timer_list.function contains the address of the exported function delayed_work_timer_fn
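
The check itself is simple once the timers are dumped. Below is a minimal sketch of it, not the actual lkcd code: it assumes you already have a list of (timer address, function address) pairs and a list of known code regions (kernel text plus loaded modules). The Region type, both helper names and all addresses are made up for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Region(NamedTuple):
    """A known code range: kernel text or a loaded module (hypothetical type)."""
    name: str
    start: int
    end: int  # exclusive


def owner_of(addr: int, regions: List[Region]) -> Optional[str]:
    """Return the name of the region containing addr, or None."""
    for region in regions:
        if region.start <= addr < region.end:
            return region.name
    return None


def suspicious_timers(timers: List[Tuple[int, int]],
                      regions: List[Region]) -> List[Tuple[int, int]]:
    """Keep timers whose callback lies outside every known region."""
    return [(t, fn) for t, fn in timers if owner_of(fn, regions) is None]


# Made-up addresses, only to show the idea.
regions = [
    Region("kernel", 0xFFFFFFFF81000000, 0xFFFFFFFF82000000),
    Region("ext4", 0xFFFFFFFFC0100000, 0xFFFFFFFFC0180000),
]
timers = [
    (0xFFFF888000001000, 0xFFFFFFFF810A1230),  # callback inside kernel text
    (0xFFFF888000002000, 0xFFFFFFFFC0900040),  # callback in no known module
]
for timer, fn in suspicious_timers(timers, regions):
    print(f"timer {timer:#x}: function {fn:#x} is outside known code")
```

A timer that belongs to delayed_work will always resolve to delayed_work_timer_fn, so for those the interesting address is the work function, which this sketch does not look at.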

Tuesday, November 29, 2022

linux drivers cross-compilation

Just a reminder for myself on how to build a driver for arm64 on an x64 machine running Ubuntu

Install the right gcc

for arm64 we need gcc-aarch64-linux-gnu:

sudo apt-get install gcc-aarch64-linux-gnu

Build Kernel

You cannot use the installed kernel and must build one for the target architecture, in my case arm64 (note that gcc uses the prefix aarch64; C is for consistency). Clone or unpack the kernel source tree into some directory KROOT and then

make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- menuconfig
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu-
make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- modules

Patch Makefile

The usual trick is to use something like

MACHINE ?= $(shell uname -m)
ifeq ($(MACHINE),x86_64)

but this gives you the architecture of the host machine, so you must rewrite all such cases to use the ARCH variable (and set up make -C $(KROOT))

Building

and finally you can cross-compile your driver with something like

make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- -f Makefile.arm64

Monday, October 31, 2022

BTI incompatible exported functions in kernel 5.15.0-53

if BTI is enabled, the first instruction encountered after an indirect jump must be a special BTI instruction

from here

I downloaded Ubuntu for arm64 (jammy-desktop-arm64.iso) and decided to check whether there are functions that don't contain BTI c at the start

There are 17804 such functions. System.map-5.15.0-53-generic contains 62819 functions in total. Next I just intersected them with the exported ones: 1269

This is an obvious bug, maybe in gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0

At least some of these functions are really important, like register_ftrace_function
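
The check can be sketched in a few lines. This is not the tool I used, just an illustration: it assumes you already have the first bytes of every function (keyed by name) and the set of exported names. The helper names and the sample data are made up; the only hard fact is the A64 encoding of bti c, 0xD503245F.

```python
import struct
from typing import Dict, List, Set

BTI_C = 0xD503245F  # A64 encoding of "bti c"


def starts_with_bti_c(code: bytes) -> bool:
    """True if the first little-endian instruction word is 'bti c'."""
    if len(code) < 4:
        return False
    (insn,) = struct.unpack_from("<I", code, 0)
    return insn == BTI_C


def exported_without_bti(first_bytes: Dict[str, bytes],
                         exported: Set[str]) -> List[str]:
    """Names of exported functions that do not start with 'bti c'."""
    missing = {name for name, code in first_bytes.items()
               if not starts_with_bti_c(code)}
    return sorted(missing & exported)


# Made-up sample: func_a starts with bti c, the other two with stp x29, x30, [sp, #-16]!
sample = {
    "func_a": bytes.fromhex("5f2403d5"),
    "func_b": bytes.fromhex("fd7bbfa9"),
    "func_c": bytes.fromhex("fd7bbfa9"),
}
print(exported_without_bti(sample, {"func_a", "func_b"}))
```

Note that this only tests for bti c exactly, like my original check; it does not treat any other instruction as a valid landing pad.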

Friday, July 29, 2022

dirty secrets of ld.so

As you may know, you can set the library path under Linux in several ways:

  • the LD_LIBRARY_PATH environment variable, but it can be removed somewhere inside the program, so /proc/pid/environ is useless (as usual, the official API exposes only useless trash while carefully hiding anything really important)
  • via the --library-path option to ld.so, like /lib64/ld-linux-x86-64.so.2 --library-path path someprogram. Again, the command line can be patched
  • via /etc/ld.so.conf; this file can also be patched after your program was launched
So a good question is "is there some trusted source to see what library path was set for a running program?"

Yes, this is ld.so itself, because it uses this data while dynamically loading modules. Long story short: the value from --library-path and LD_LIBRARY_PATH is stored in the variable library_path, and the whole directory set in rtld_search_dirs
Bad news: they are not exported and, even worse, they are hard to find even with a disassembler

Saturday, July 9, 2022

PoC to blind pamspy

Let's disassemble the JIT code from this spyware:

 [8] prog 0xffffb02dc0133000 id 160 len 46 jited_len 215 aux 0xffff8ccb58fab400 used_maps 1 used_btf 0 func_cnt 0
     tag: 0F 86 19 76 BC 37 68 B3
  stack_depth: 16
  num_exentries: 0
  type: 2 BPF_PROG_TYPE_KPROBE
  expected_attach_type: 0 BPF_CGROUP_INET_INGRESS
  used maps:
   [0] 0xffff8ccbc1b1c600 - rb
...
ffffffffc07bc801 e80a38e6f1  call 0xffffffffb2620010 ; bpf_ringbuf_submit
ffffffffc07bc806 31c0        xor eax, eax
ffffffffc07bc808 415e        pop r14
ffffffffc07bc80a 415d        pop r13
ffffffffc07bc80c 5b          pop rbx
ffffffffc07bc80d c9          leave
ffffffffc07bc80e c3          ret

and in eBPF opcodes:
43 85 00 00 00 C0 CF 02 00 call 0x2CFC0 ; bpf_ringbuf_submit
44 B7 00 00 00 00 00 00 00 mov r0, 0
45 95 00 00 00 00 00 00 00 ret

Here 0x2CFC0 is the offset of bpf_ringbuf_submit from __bpf_call_base
The last call submits data to the bpf map rb of type BPF_MAP_TYPE_RINGBUF. If we could patch this function, no data would be passed to user mode. How are these native function addresses filled in at all?
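
The address arithmetic can be checked with a short sketch. It decodes the x86 call from the jitted listing and the eBPF call opcode (the leading 43 in the opcode dump is the instruction index, not part of the encoding), then subtracts one from the other to get __bpf_call_base. The function names are mine; the bytes and addresses are the ones shown above.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def x86_call_target(addr: int, raw: bytes) -> int:
    """Target of a 5-byte 'call rel32' (opcode E8) located at addr."""
    if len(raw) != 5 or raw[0] != 0xE8:
        raise ValueError("not a call rel32")
    (rel,) = struct.unpack_from("<i", raw, 1)  # signed 32-bit displacement
    return (addr + 5 + rel) & MASK64


def decode_bpf_insn(raw: bytes):
    """Split an 8-byte eBPF instruction into (opcode, dst, src, off, imm)."""
    opcode, regs, off, imm = struct.unpack("<BBhi", raw)
    return opcode, regs & 0x0F, regs >> 4, off, imm


target = x86_call_target(0xFFFFFFFFC07BC801, bytes.fromhex("e80a38e6f1"))
opcode, dst, src, off, imm = decode_bpf_insn(bytes.fromhex("85000000C0CF0200"))
call_base = (target - imm) & MASK64

print(f"call target      {target:#x}")     # bpf_ringbuf_submit
print(f"imm              {imm:#x}")
print(f"__bpf_call_base  {call_base:#x}")
```

With the values from the dump this gives 0xffffffffb2620010 for the call target and 0xffffffffb25f3050 for __bpf_call_base.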

Thursday, June 30, 2022

size of ebpf jit code on different processors

It doesn't make much sense, but since I now have several JIT compilers, why not compare the size of the jitted code on different processors?

I chose 3 ebpf programs

  1. simple BPF_PROG_TYPE_CGROUP_SKB with only comparison, 8 opcodes
  2. BPF_PROG_TYPE_RAW_TRACEPOINT with 3 maps, 68 opcodes
  3. fairly complex BPF_PROG_TYPE_RAW_TRACEPOINT with 6 maps, 1824 opcodes
results

processor  1st  2nd    3rd
x64         54  312   8195
arm64       99  567  12959
powerpc     78  546  11462
risc-v     102  470   9494
s390        78  534  12622
sparc       79  482  10446

Wednesday, June 29, 2022

verification of jitted ebpf code

There are some projects for eBPF in user mode, but for verification purposes you need the same code that is used in the kernel. So I ripped out some JIT code to run it in user mode

  • x64
  • powerpc
  • risc-v
  • s390
  • sparc
  • sunway sw64
And now we can verify jitted code: we have the actual generated code for some eBPF program, next we run the JIT on the eBPF opcodes in user mode, and finally we can compare the two
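
The last step is a plain byte comparison. A minimal sketch, with a helper name of my own: it returns the offset of the first difference between the code dumped from the kernel and the code produced by the usermode JIT, or None if they match. It assumes both buffers were generated for the same image address; otherwise relative call displacements would differ and show up as mismatches.

```python
from typing import Optional


def first_mismatch(kernel_code: bytes, user_code: bytes) -> Optional[int]:
    """Offset of the first differing byte, or None if the buffers are equal."""
    for offset, (a, b) in enumerate(zip(kernel_code, user_code)):
        if a != b:
            return offset
    if len(kernel_code) != len(user_code):
        # One buffer is a strict prefix of the other.
        return min(len(kernel_code), len(user_code))
    return None


# Tail of the jitted code shown in the pamspy post, with one byte changed.
dumped = bytes.fromhex("31c0415e415d5bc9c3")
rejit = bytes.fromhex("31c0415e415d5bc9cc")
print(first_mismatch(dumped, dumped))  # None
print(first_mismatch(dumped, rejit))   # 8
```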

Saturday, June 25, 2022

pmu events

Some details

PMUs are stored in the tree pmu_idr, synchronized with the mutex pmus_lock, and as usual they can be used to blind eBPF. How? Let's see:

Generally speaking, there are usually four steps involved in attaching an eBPF program to a perf event:

  1. Open the perf event
  2. Load the eBPF program
  3. Set the eBPF program on the perf event
  4. Enable the perf event
We are interested in step 4: enabling the perf event involves calling the pmu->event_init and pmu->add methods. Worse, all pmu structures are located in the .data section and are thus writable. So today I added some code to dump them:

Monday, June 20, 2022

ebpf opcodes patching

Today I made a disassembler for eBPF opcodes. Let's see what they look like:
85 00 00 00 C0 10 02 00 call 0x210C0

In jitted code this is call 0xffffffffb4c14110. ffffffffb4c14110 - 210C0 = FFFFFFFFB4BF3050, the address of __bpf_call_base. Suppose that we have some paranoid code in kernel mode and don't want to be traced with all this eBPF black magic. What can we do on a machine without JIT?

First, we could just patch the first opcode to
95 00 00 00 00 00 00 00 ret

Second, we could find some empty native function in the kernel (or even reuse __bpf_call_base) and patch the address of, let's say, htab_map_update_elem to it. Can any Linux eBPF-based EDR detect this?
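
Both patches are trivial at the byte level. The sketch below applies them to a copy of the opcodes in a Python buffer, just to show the encoding; it does not touch kernel memory, and the function names are mine. The call imm is rewritten as new target minus __bpf_call_base, so redirecting to __bpf_call_base itself gives imm 0.

```python
import struct

INSN_SIZE = 8
BPF_CALL = 0x85
BPF_EXIT_INSN = bytes.fromhex("9500000000000000")  # "ret" in the disasm above


def patch_to_exit(prog: bytes) -> bytes:
    """Replace the first instruction with exit."""
    if len(prog) < INSN_SIZE or len(prog) % INSN_SIZE:
        raise ValueError("program must be a whole number of 8-byte opcodes")
    return BPF_EXIT_INSN + prog[INSN_SIZE:]


def retarget_call(prog: bytes, index: int, call_base: int, new_target: int) -> bytes:
    """Point the call at instruction `index` to new_target (imm = target - base)."""
    buf = bytearray(prog)
    offset = index * INSN_SIZE
    if buf[offset] != BPF_CALL:
        raise ValueError("instruction is not a call")
    struct.pack_into("<i", buf, offset + 4, new_target - call_base)
    return bytes(buf)


CALL_BASE = 0xFFFFFFFFB4BF3050  # __bpf_call_base from the text
prog = bytes.fromhex("85000000C0100200" "9500000000000000")

print(patch_to_exit(prog).hex(" "))
print(retarget_call(prog, 0, CALL_BASE, CALL_BASE).hex(" "))
```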

Wednesday, June 15, 2022

ebpf maps

As you can see from the function bpf_map_alloc_id, all bpf maps are stored in the balanced tree map_idr and synchronized on the spinlock map_idr_lock. No surprise that you can't view them in user mode: there is the bpf command BPF_MAP_GET_NEXT_ID, but it can only enumerate map IDs. So today I added some code to view bpf maps: lkmem -c -d -B gives output like

bpf_maps at 0xffffffff929c1880: 15
 [0] id 3 UDPrecvAge at 0xffff99e344f48000
  type: 1 BPF_MAP_TYPE_HASH
  key_size 8 value_size 8
 [1] id 4 UDPsendAge at 0xffff99e344cb4c00
  type: 1 BPF_MAP_TYPE_HASH
  key_size 38 value_size 8

Also, the disassembly of jitted eBPF code began to look better:
mov rdi, 0xffff99e344f48000 ; UDPrecvAge
call 0xffffffff90c191f0 ; __htab_map_lookup_elem

This letter explains that the JIT replaces the sequence of opcodes
bpf_mov r1, const_internal_map_id
bpf_call bpf_map_lookup

with a direct load of the 64-bit address of the map (BPF_LD_IMM64 pseudo op). But this code is not optimal: every such instruction occupies 10 bytes. Let's consider the case where we employ a constant pool and put all map addresses somewhere after the function. Sure, this will require at least 8 bytes for each address, plus perhaps some space for alignment. But now we can produce code like:
mov rdi, qword [map1_addr wrt rip] ; 7 bytes
call __htab_map_lookup_elem
...
; somewhere after function
map1_addr: resq 1 ; jit should put real address of map here

If a function has 3 or more references to the same map, the jitted code becomes smaller
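
The break-even point is easy to verify. A small sketch using only the sizes from the text: 10 bytes for the inline 64-bit load, 7 bytes for the rip-relative load, 8 bytes for the pool entry. The padding parameter is my addition to model the possible alignment space.

```python
MOVABS_SIZE = 10      # mov rdi, imm64
RIP_LOAD_SIZE = 7     # mov rdi, qword [map1_addr wrt rip]
POOL_ENTRY_SIZE = 8   # one 64-bit address after the function


def inline_size(refs: int) -> int:
    """Bytes spent when every reference loads the address inline."""
    return refs * MOVABS_SIZE


def pooled_size(refs: int, padding: int = 0) -> int:
    """Bytes spent with one pool entry shared by all references to a map."""
    if refs == 0:
        return 0
    return refs * RIP_LOAD_SIZE + POOL_ENTRY_SIZE + padding


for refs in range(1, 6):
    print(refs, inline_size(refs), pooled_size(refs))
```

With no padding the pool wins by exactly one byte at 3 references (29 vs 30), and by 3 more bytes for each additional reference.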

Tuesday, June 7, 2022

position independent sw64 code

Let's see what PIC looks like for sw64, using a function from libLLVM-7.so.1 (a huge shared library, 45 MB in size) as an example:

1000ED0   ldih    GP, PV, 0x1D3

PV almost always contains the address of the called function, so the value of GP is now 2D30ED0
1000ED4   ldi     GP, GP, -0x1290

The value of GP is now 2D30ED0 - 1290 = 2D2FC40. I expected this base address to always be located inside .got, but this is not true: it can lie anywhere, sometimes even outside the ELF module! All remaining references use this base address in the GP register:

1000ED8   ldih    PV, GP, 0
1000EDC   ldl     PV, PV, -0x4EC0
...
1000F14   call    RA, PV, 0
1000F18   ldih    GP, RA, 0x1D3 ; 2D30F18
1000F20   ldi     GP, GP, -0x12D8 ; 2D2FC40

Wait, WHAT? They use the return address in RA to fill GP with the same value 2D2FC40. Even worse, they restore the value of GP even in the epilogue, where it is not used
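
The GP arithmetic can be replayed in a few lines. This is a sketch that assumes ldih adds its immediate shifted left by 16 bits to the base register and ldi adds a signed immediate; that is inferred from the values in the listing, not taken from any sw64 manual.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def ldih(base: int, imm: int) -> int:
    """Assumed semantics: base + (imm << 16)."""
    return (base + (imm << 16)) & MASK64


def ldi(base: int, imm: int) -> int:
    """Assumed semantics: base + signed imm."""
    return (base + imm) & MASK64


# Prologue: PV holds the function address 0x1000ED0.
gp_from_pv = ldi(ldih(0x1000ED0, 0x1D3), -0x1290)

# After the call: RA holds the return address 0x1000F18.
gp_from_ra = ldi(ldih(0x1000F18, 0x1D3), -0x12D8)

print(f"{gp_from_pv:X} {gp_from_ra:X}")
```

Both sequences produce 2D2FC40, so the second one recomputes a value the function already had.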

Let's estimate the size overhead. libLLVM-7.so.1 has 41337 functions and 8432116 instructions, of which 781997 set the value of GP. Rate: 781997 / 8432116 = 0.092740
Let's assume that each function needs to set up GP anyway, so the required number of instructions is 41337 * 2 = 82674. The remainder is 781997 - 82674 = 699323
Remove the unneeded GP setups from epilogues: 699323 - 82674 = 616649
This amount can easily be cut in half, since each two-instruction reload becomes a single load: just store the calculated value of GP on the stack with stl gp, sp, offset (+41337 instructions) and then reload it when needed with ldl gp, sp, offset
So the actual number of instructions could be 616649 / 2 + 41337 + 82674 = 432336
New rate: 432336 / 8432116 = 0.05127
The overhead is 0.092740 - 0.05127 = 4.1%
Cool, almost 2 MB of code is simply unnecessary
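
The same estimate as a script, so the numbers can be rechecked or rerun for another library. Only the three input counts come from the measurements above; the variable names are mine, and the half is rounded up.

```python
functions = 41337
instructions = 8432116
gp_setup = 781997            # instructions that set GP

rate = gp_setup / instructions

prologue = functions * 2                 # ldih + ldi per function
remaining = gp_setup - prologue          # GP reloads outside prologues
no_epilogues = remaining - prologue      # drop the useless epilogue reloads

# Each remaining two-instruction reload becomes one ldl, plus one stl per function.
proposed = (no_epilogues + 1) // 2 + functions + prologue
new_rate = proposed / instructions
overhead = rate - new_rate

print(f"rate      {rate:.6f}")
print(f"proposed  {proposed}")
print(f"new rate  {new_rate:.5f}")
print(f"overhead  {overhead * 100:.1f}%")
```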

Saturday, June 4, 2022

reversing of sunway sw64 ISA

It seems that the Chinese are hiding information about sw64, yet another homemade processor of theirs: try to find some technical details with google, baidu or gitee. At the same time they ported Linux to this processor, and you can even find some details in the openEuler project. I think this conspiracy is very funny, and at the very least it violates the licenses of binutils/clang/gcc etc

Anyway, let's see if we can reverse the ISA for sw64 having only a Linux image and some source code from the Linux kernel (spoiler: and also write a processor module for ida pro)

registers

Try to compare the registers of sw64 with Alpha AXP: can you find any difference? At least we now know that the processor has 32 general-purpose registers and 32 floating-point ones, so the register-encoding fields must be 5 bits wide


ELF relocs

Relocs can be extracted from arch/sw_64/include/asm/elf.h. So the next thing I wrote was a small ida pro plugin to apply these relocs. Nothing special; it was almost an exact copy of the same plugin for LoongArch

mnemonics

Sunday, May 22, 2022

ida pro plugin to handle loongson elf relocs

It seems that you can't just go ahead and implement your own proc_def_t for a processor module, because the ida pro sdk doesn't include the needed symbols; you will just get something like

1>reg.obj : error LNK2019: unresolved external symbol "public: __cdecl proc_def_t::proc_def_t(struct elf_loader_t &,class reader_t &)" (??0proc_def_t@@QEAA@AEAUelf_loader_t@@AEAVreader_t@@@Z) referenced in function "public: virtual __int64 __cdecl xxx_t::on_event(__int64,char *)" (?on_event@xxxson_t@@UEAA_J_JPEAD@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_supports_relocs(void)const " (?proc_supports_relocs@proc_def_t@@UEBA_NXZ)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual char const * __cdecl proc_def_t::proc_handle_reloc(struct rel_data_t const &,struct sym_rel const *,struct elf_rela_t const *,struct reloc_tools_t *)" (?proc_handle_reloc@proc_def_t@@UEAAPEBDAEBUrel_data_t@@PEBUsym_rel@@PEBUelf_rela_t@@PEAUreloc_tools_t@@@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_create_got_offsets(struct Elf64_Shdr const *,struct reloc_tools_t *)" (?proc_create_got_offsets@proc_def_t@@UEAA_NPEBUElf64_Shdr@@PEAUreloc_tools_t@@@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_perform_patching(struct Elf64_Shdr const *,struct Elf64_Shdr const *)" (?proc_perform_patching@proc_def_t@@UEAA_NPEBUElf64_Shdr@@0@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_can_convert_pic_got(void)const " (?proc_can_convert_pic_got@proc_def_t@@UEBA_NXZ)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual unsigned __int64 __cdecl proc_def_t::proc_convert_pic_got(class segment_t const *,struct reloc_tools_t *)" (?proc_convert_pic_got@proc_def_t@@UEAA_KPEBVsegment_t@@PEAUreloc_tools_t@@@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual char const * __cdecl proc_def_t::proc_describe_flag_bit(unsigned int *)" (?proc_describe_flag_bit@proc_def_t@@UEAAPEBDPEAI@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_load_unknown_sec(struct Elf64_Shdr *,bool)" (?proc_load_unknown_sec@proc_def_t@@UEAA_NPEAUElf64_Shdr@@_N@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual char const * __cdecl proc_def_t::proc_handle_dynamic_tag(struct Elf64_Dyn const *)" (?proc_handle_dynamic_tag@proc_def_t@@UEAAPEBDPEBUElf64_Dyn@@@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_is_acceptable_image_type(unsigned short)" (?proc_is_acceptable_image_type@proc_def_t@@UEAA_NG@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual void __cdecl proc_def_t::proc_on_start_data_loading(struct elf_ehdr_t &)" (?proc_on_start_data_loading@proc_def_t@@UEAAXAEAUelf_ehdr_t@@@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_on_end_data_loading(void)" (?proc_on_end_data_loading@proc_def_t@@UEAA_NXZ)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual void __cdecl proc_def_t::proc_on_loading_symbols(void)" (?proc_on_loading_symbols@proc_def_t@@UEAAXXZ)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_handle_symbol(struct sym_rel &,char const *)" (?proc_handle_symbol@proc_def_t@@UEAA_NAEAUsym_rel@@PEBD@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual void __cdecl proc_def_t::proc_handle_dynsym(struct sym_rel const &,unsigned int,char const *)" (?proc_handle_dynsym@proc_def_t@@UEAAXAEBUsym_rel@@IPEBD@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual int __cdecl proc_def_t::proc_handle_special_symbol(struct sym_rel *,char const *,unsigned short)" (?proc_handle_special_symbol@proc_def_t@@UEAAHPEAUsym_rel@@PEBDG@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_should_load_section(struct Elf64_Shdr const &,unsigned int,class _qstring<char> const &)" (?proc_should_load_section@proc_def_t@@UEAA_NAEBUElf64_Shdr@@IAEBV?$_qstring@D@@@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual bool __cdecl proc_def_t::proc_on_create_section(struct Elf64_Shdr const &,class _qstring<char> const &,unsigned __int64 *)" (?proc_on_create_section@proc_def_t@@UEAA_NAEBUElf64_Shdr@@AEBV?$_qstring@D@@PEA_K@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual char const * __cdecl proc_def_t::calc_procname(unsigned int *,char const *)" (?calc_procname@proc_def_t@@UEAAPEBDPEAIPEBD@Z)

1>reg.obj : error LNK2001: unresolved external symbol "public: virtual unsigned __int64 __cdecl proc_def_t::proc_adjust_entry(unsigned __int64)" (?proc_adjust_entry@proc_def_t@@UEAA_K_K@Z)

1>D:\ida75\procs\xxx64.dll : fatal error LNK1120: 21 unresolved externals


So I wrote a plugin to handle ELF relocs for this newly fashionable Chinese processor.
Source
some description of relocs can be found here

Monday, May 16, 2022

ida pro plugin for unpacking lzma compressed linux kernel

UOS Linux for mips64 contains a strange Linux kernel which cannot be unpacked with the famous extract-vmlinux
Let's see what happens:
zimage_start = (unsigned long)(&__image_begin);
zimage_size = (unsigned long)(&__image_end) -
    (unsigned long)(&__image_begin);
...
/* Decompress the kernel with according algorithm */
__decompress((char *)zimage_start, zimage_size, 0, 0,
	   (void *)VMLINUX_LOAD_ADDRESS_ULL, 0, 0, error);

The problem is that System.map does not contain the symbols __image_begin and __image_end. Investigation showed that the compressed body of the kernel is located in the .data section, so the only unknown parameters for unpacking are the load address and the size of the unpacked data. Fortunately, the lzma algorithm used here puts the size of the unpacked data in the last DWORD of the data. And the address can be taken from System.map, from the symbol _text

So the logic of the plugin is
  • get the filename of the input file
  • make the right name for System.map from it
  • read this System.map
  • try to find xrefs in the .data section; the only two will be __image_begin and __image_end
  • unpack
  • add a new segment (this was the most terrible part of development: ida pro crashed several times with memory dumps)
  • put the unpacked data into the newly added segment
  • profit
Link to github
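
Outside of ida pro the core of this logic can be sketched with the Python standard library. This is not the plugin code: the function names are mine, the blob is assumed to be exactly the bytes between __image_begin and __image_end, the trailing size DWORD is assumed to be little-endian, and the stream is assumed to be in the legacy .lzma format.

```python
import lzma
import struct


def symbol_address(system_map: str, name: str) -> int:
    """Find a symbol in System.map text ('address type name' per line)."""
    for line in system_map.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[2] == name:
            return int(parts[0], 16)
    raise KeyError(name)


def unpack_kernel(blob: bytes) -> bytes:
    """Decompress the image and check it against the size in the last DWORD."""
    (expected,) = struct.unpack("<I", blob[-4:])
    data = lzma.decompress(blob[:-4], format=lzma.FORMAT_ALONE)
    if len(data) != expected:
        raise ValueError(f"size mismatch: got {len(data)}, expected {expected}")
    return data


# Synthetic stand-ins for a real System.map and a real compressed image.
system_map = "ffffffff80200000 T _text\nffffffff80200400 T kernel_entry\n"
payload = b"vmlinux" * 1000
blob = lzma.compress(payload, format=lzma.FORMAT_ALONE) + struct.pack("<I", len(payload))

load_address = symbol_address(system_map, "_text")
image = unpack_kernel(blob)
print(f"load at {load_address:#x}, {len(image)} bytes unpacked")
```

The load address and the unpacked size are exactly the two parameters needed to create the new segment.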

Tuesday, January 25, 2022

plugin for Binary Ninja

Due to the sad fact that IDA Pro is moving to the cloud (just think about confidentiality), I decided to look at an alternative, Binary Ninja. The first impression was terrible

  • a totally unknown API. Guys, why not make some compatibility layer with IDAPython?
  • counterintuitive types in LLIL: for example, a constant ptr has type RegisterValue. Whut?
  • I found a bug in the conversion of LLIL types to Python types (and I suspect it is not the only one)
Anyway, after a couple of weeks I was able to write a simple plugin for checking functions that leave some Linux kernel resource locked. Perhaps it can be adapted for the Windows kernel too