Thursday, October 3, 2024

TLS in gcc RTL

Let's check what TLS looks like in RTL. I wrote a simple test:
(insn 29 8 10 3 (set (reg:SI 0 ax [orig:82 _1 ] [82])
        (mem/c:SI (const:DI (unspec:DI [
                        (symbol_ref:DI ("tls_state") [flags 0x2a]  <var_decl 0x7f61bce052d0 tls_state>)
                    ] UNSPEC_NTPOFF)) [2 tls_state.idx+0 S4 A32 AS1])) "stest.cc":15:37 81 {*movsi_internal}
     (nil))

We can see an UNSPEC expression here - its number can be extracted as XINT (in_rtx, 1) when GET_CODE (in_rtx) == UNSPEC. The first problem is that UNSPEC_XXX values are machine-specific. For example, the same code cross-compiled for mips:
(insn 60 13 61 (set (reg:SI 3 $3)
        (unspec:SI [
                (const_int 0 [0])
            ] UNSPEC_TLS_GET_TP)) "stest.cc":15:37 706 {*tls_get_tp_si_split}
     (nil))
(insn 61 60 15 (set (reg:SI 2 $2 [201])
        (reg:SI 3 $3)) "stest.cc":15:37 313 {*movsi_internal}
     (nil))
(insn 15 61 16 (set (reg:SI 3 $3 [202])
        (high:SI (const:SI (unspec:SI [
                        (symbol_ref:SI ("tls_state") [flags 0x2a]  <var_decl 0x7ff96a9dfcf0 tls_state>)
                    ] 306)))) "stest.cc":15:37 313 {*movsi_internal}
     (nil))

Second, in the machine description UNSPEC_NTPOFF is listed under the "Relocation specifiers" comment, while the TLS group proper contains only:

;; TLS support
  UNSPEC_TP
  UNSPEC_TLS_GD
  UNSPEC_TLS_LD_BASE
  UNSPEC_TLSDESC
  UNSPEC_TLS_IE_SUN

Third, let's recompile the same file with the -fPIC option:
(call_insn/u 9 8 11 3 (parallel [
            (set (reg:DI 0 ax)
                (call:DI (mem:QI (symbol_ref:DI ("__tls_get_addr")) [0  S1 A8])
                    (const_int 0 [0])))
            (unspec:DI [
                    (symbol_ref:DI ("tls_state") [flags 0x10]  <var_decl 0x7fcea13c92d0 tls_state>)
                    (reg/f:DI 7 sp)
                ] UNSPEC_TLS_GD)
        ]) "stest.cc":15:37 1048 {*tls_global_dynamic_64_di}
     (expr_list:REG_EH_REGION (const_int -2147483648 [0xffffffff80000000])
        (nil))
    (nil))
(insn 11 9 46 3 (set (reg:SI 0 ax [orig:82 _1 ] [82])
        (mem/c:SI (reg/f:DI 0 ax [88]) [2 tls_state.idx+0 S4 A32])) "stest.cc":15:37 81 {*movsi_internal}
     (nil))

Notice that the address of the TLS variable is obtained with __tls_get_addr, but the subsequent access to it is not marked in any way - it is just the regular set / mem / component_ref chain. Disgusting

A note to myself on what else cannot be extracted from gcc RTL:

  • pointers to members -
     although cp/cp-tree.def defines OFFSET_REF & PTRMEM_CST, in real RTL they are just integer constants
  • pointers to member methods - similarly
  • TLS variables accessed via __tls_get_addr are indistinguishable from regular global vars
  • to be continued

Monday, September 30, 2024

gcc plugin to collect cross-references, part 9

Let's extract some useful results from my gcc plugin for collecting cross-references (previous parts: 1, 2, 3, 4, 5, 6, 7 & 8)

I noticed that the plugin was unbearably slow on big source files (like when compiling itself) - something above 30 minutes, so I added profiling to it (see details here). The profiler showed that consumed user time was only 18 seconds - a glaring sign that the plugin spends most of its time waiting on some kind of lock. After some meditation (and strace/ltrace) I finally found that the root of the problem was sqlite - it sleeps in fdatasync on each write to its DB! The cure is to execute
PRAGMA synchronous=OFF
This gives a non-illusory x100 speed-up. Can I now consider myself one of the mythical "x100 programmers", he-he?

The next problem is how to store data from multiple gcc processes into a single sqlite DB (e.g. with parallel make -j X). I decided to use Apache Thrift - old and unfashionable, but it really works and you can make a multithreaded RPC server in 500 lines of C++ code. The server just listens on some port and stores all data from RPC clients into the sqlite DB. Its command line arguments are path2db port_number. To shut down the RPC server use rpcq port_number. Of course you can make your own implementation of the RPC server, for example with PHP & DB/2 (unfortunately Thrift does not support Cobol)

As a consequence the plugin now has 3 implementations of data storage:

  1. in plain text files, self-test takes 14s
  2. in sqlite, self-test takes 18s
  3. in sqlite via an RPC server listening on some arbitrary TCP port, self-test takes 23s. The connection string looks like -fplugin-arg-gptest-db=localhost:port_number

Btw the sqlite version has size 2.67MB, the Thrift rpc client - 4.65MB. Something is fundamentally wrong with "modern frameworks"

DB schema 

The schema is very simple - in essence it consists of two tables:

symtab for storing symbols

  • id - primary key
  • mangled name of symbol/function or value of literal
  • fname - for functions this is name of file where they were declared
  • bcount - number of basic blocks in the function, can be used as a rough complexity metric

xrefs to link symbols with the functions that refer to them

  • id of function
  • bb - index of basic block
  • arg - if non-zero - index of function's argument
  • what - id of referred symbol
  • kind - one letter type of xref:
    • 'c' - function call
    • 'v' - call of virtual method
    • 'l' - literal
    • 'r' - just reference to some global symbol
    • 'f' - field of some struct/class/union
    • 'F' - constant (with -fplugin-arg-gptest-ic option)
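To make the two tables concrete, here is a minimal in-memory C++ sketch of the schema plus one lookup over it. It is not the plugin's code: the struct names, the function referrers and the field name `name` (the second symtab column is not named above) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical in-memory mirror of the symtab table.
struct Symbol {
    int id;             // primary key
    std::string name;   // mangled name or literal value (column name assumed)
    std::string fname;  // file where a function was declared
    int bcount;         // number of basic blocks
};

// Hypothetical in-memory mirror of the xrefs table.
struct Xref {
    int func;   // id of the referring function
    int bb;     // index of basic block
    int arg;    // if non-zero: index of the function's argument
    int what;   // id of the referred symbol
    char kind;  // 'c', 'v', 'l', 'r', 'f' or 'F'
};

// Return ids of functions that have an xref of the given kind to symbol `what`.
std::set<int> referrers(const std::vector<Xref>& xrefs, int what, char kind) {
    std::set<int> out;
    for (const Xref& x : xrefs)
        if (x.what == what && x.kind == kind)
            out.insert(x.func);
    return out;
}
```

With kind 'v' this returns the functions that call a given virtual method, which in the real DB would be a join of xrefs against symtab.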

Fetching results

I wrote a perl script to search and dump xrefs. For example, to find functions calling virtual methods of class FPersistence:

Friday, September 20, 2024

Tracking arguments of functions in gcc

I continue to improve my gcc plugin for collecting cross-references: 1, 2, 3, 4, 5, 6 & 7. This week I decided to see if I can extract the sources of complex types like records. The most prominent kind of them are function arguments - they are easy to identify in asm (but not so easy to bind in gcc RTL expressions)

Given a function declaration fdecl, we can extract its arguments with something like:

for (tree arg = DECL_ARGUMENTS (fdecl); arg; arg = DECL_CHAIN (arg))
  {
    rtx a = DECL_RTL_IF_SET (arg);
    // REG_EXPR is only meaningful for a REG rtx, so check REG_P first
    if (a && REG_P (a) && REG_EXPR (a))
      {
        // do something with the argument in REG_EXPR (a)
      }
  }

 
I need only arguments that are a pointer or reference to a record/union, so I filter them in method check_arg

That was the easiest part; now let's try to find out how arguments can be tracked in RTL.

Sunday, September 15, 2024

bug in gcc?

It seems that gcc does not always emit COMPONENT_REF when accessing fields of structures passed by reference. For example, today I added a simple static function append_name(aux_type_clutch &clutch, const char *name)

It references field txt of structure aux_type_clutch, but the RTL looks like:

(insn 20 19 21 4 (set (reg/f:DI 0 ax [89])
        (mem/f/c:DI (plus:DI (reg/f:DI 6 bp)
                (const_int -8 [0xfffffffffffffff8])) [258 clutch+0 S8 A64])) "gptest.cpp":1364:24 80 {*movdi_internal}
     (nil))
(insn 21 20 22 4 (parallel [
            (set (reg/f:DI 0 ax [orig:83 _2 ] [83])
                (plus:DI (reg/f:DI 0 ax [89])
                    (const_int 8 [0x8])))
            (clobber (reg:CC 17 flags))
        ]) "gptest.cpp":1364:24 230 {*adddi_1}
     (expr_list:REG_EQUAL (plus:DI (mem/f/c:DI (plus:DI (reg/f:DI 19 frame)
                    (const_int -8 [0xfffffffffffffff8])) [258 clutch+0 S8 A64])
            (const_int 8 [0x8]))
        (nil)))

The first instruction just loads the parm_decl (of type aux_type_clutch) into register RAX, like

mov     rax, [rbp+clutch]

and the second just adds the constant 0x8 (the offset of field txt) to RAX:

add     rax, 8

From the RTL it's impossible to track this constant back to an offset in a COMPONENT_REF

What is even stranger - for methods you can track field accesses for parameters passed by reference (like this) - for example in the constructor of the same aux_type_clutch:

(insn 12 11 13 2 (set (mem:SI (plus:DI (reg/f:DI 0 ax [94])
                (const_int 40 [0x28])) [4 this_12(D)->level+0 S4 A64])
        (const_int 0 [0])) "gptest.cpp":465:4 81 {*movsi_internal}
     (nil))

Thursday, September 5, 2024

hidden executable pages in linux kernel, part 2

In part 1 I described how memory is managed by hardware. Now let's dig into how the kernel sees memory. Not surprisingly, we should check the same structures that malicious drivers update while hiding

Modules

This is the list of module structures with its head in modules and lock module_mutex. It is projected onto the file /proc/modules, but the sizes in that file are sloppy - function module_total_size calculates the total size of a driver (including discarded sections!). So we should use only some selected fields:
  • on kernel >= 6.4 mem[MOD_TEXT].base & mem[MOD_TEXT].size
  • on kernel < 4.5 module_core & core_text_size
  • otherwise core_layout.base & core_layout.text_size
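The version split above can be written down as a small compile-time selector. This is a userspace sketch only: in a real LKM the choice would be made with the preprocessor against LINUX_VERSION_CODE, and the names kver, TextLayout and text_layout are hypothetical. kver mirrors how the kernel's KERNEL_VERSION macro packs a version number.

```cpp
#include <cassert>

// Same packing as the kernel's KERNEL_VERSION(a, b, c) macro.
constexpr unsigned kver(unsigned a, unsigned b, unsigned c) {
    return (a << 16) + (b << 8) + (c > 255 ? 255 : c);
}

// Which fields of struct module describe the text area (hypothetical enum).
enum class TextLayout {
    ModuleCore,   // module_core & core_text_size
    CoreLayout,   // core_layout.base & core_layout.text_size
    ModuleMemory  // mem[MOD_TEXT].base & mem[MOD_TEXT].size
};

// Thresholds are the ones from the list above: >= 6.4, < 4.5, otherwise.
constexpr TextLayout text_layout(unsigned version_code) {
    return version_code >= kver(6, 4, 0) ? TextLayout::ModuleMemory
         : version_code <  kver(4, 5, 0) ? TextLayout::ModuleCore
         : TextLayout::CoreLayout;
}
```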

vmap_area_list

It is projected onto the file /proc/vmallocinfo, which requires root access. Sure, sophisticated rootkits can intercept it, but that's ok since we use it for cross-scan only

 

False positives

As you can guess, not every executable page belongs to some driver - there are a couple of exceptions

Wednesday, August 28, 2024

hidden executable pages in linux kernel

The standard method to find rootkits like this (or like this) is cross-scanning: collect the PTEs of memory without the NX bit, then subtract the pages belonging to LKMs - the set difference gives us hidden executable memory. Let's check how we can scan PTEs under linux
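The set-difference step itself is simple and can be sketched in plain C++. This is an illustration under assumptions, not code from an actual scanner: the input lists are presumed to be already collected, and the names Range and hidden_pages are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Text range of a known module: [base, base + size). Hypothetical type.
struct Range {
    uint64_t base;
    uint64_t size;
};

// Return the executable pages that are not covered by any known module range.
std::vector<uint64_t> hidden_pages(const std::vector<uint64_t>& exec_pages,
                                   const std::vector<Range>& modules) {
    std::vector<uint64_t> out;
    for (uint64_t page : exec_pages) {
        bool known = false;
        for (const Range& m : modules) {
            if (page >= m.base && page - m.base < m.size) {
                known = true;
                break;
            }
        }
        if (!known)
            out.push_back(page);
    }
    return out;
}
```

Anything left in the result still has to be checked against legitimate non-module executable areas before it is called hidden.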

disclaimer

This article is not a digest of Intel or linux documentation - I'll just describe how you can traverse page tables from an LKM. Also code

Let's start with some simple things:

cat /boot/config-$(uname -r) | grep -E 'X86_5LEVEL|PGTABLE_LEVELS'
CONFIG_PGTABLE_LEVELS=5
CONFIG_X86_5LEVEL=y

So my kernel has 5 levels, and this corresponds exactly to the hardware:

  1. pte_t - PTE, 9 bits of page address (size of page is 4096 bytes - low 12 bits)
  2. pmd_t - PDE, another 9 bits of address
  3. pud_t - PDPTE, 9 bits
  4. p4d_t - PML4, 9 bits
  5. pgd_t - PML5, 9 bits

Total: 12 + 5 * 9 = 57 bits
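The same arithmetic as a small C++ sketch that splits a virtual address into the page offset and the five 9-bit indexes. The names VaParts and split_va are hypothetical; index[0] corresponds to the PTE level and index[4] to the PGD (PML5) level, matching the numbering of the list above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

struct VaParts {
    std::array<unsigned, 5> index;  // index[0] = PTE ... index[4] = PGD (PML5)
    unsigned offset;                // low 12 bits inside a 4096-byte page
};

// Split a 57-bit virtual address: 12 offset bits plus 5 groups of 9 bits.
VaParts split_va(uint64_t va) {
    VaParts p{};
    p.offset = static_cast<unsigned>(va & 0xFFF);
    for (int level = 0; level < 5; ++level)
        p.index[level] = static_cast<unsigned>((va >> (12 + 9 * level)) & 0x1FF);
    return p;
}
```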

It's really amazing how differently memory management is implemented in different operating systems running on the same hardware. For example, in Windows all PTEs are located in a huge contiguous sparse array and you can get the address of the PTE for some memory address with the very simple function MiGetPteAddress. Let MiGetPteAddress(addr1) be addr2. We can then continue this process for all paging levels - get MiGetPteAddress(addr2) and so on - to find out whether all 5 parts of the address are valid. And this can be used in the reverse direction - to skip scanning huge PTE areas if they are not present in memory
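A rough sketch of that lookup in C++. It rests on loud assumptions: 4-level paging (48-bit addresses) and the fixed PTE base 0xFFFFF68000000000 that older 64-bit Windows versions used (newer ones randomize it). The names kPteBase and pte_address are hypothetical, and this is a model of the idea rather than the actual MiGetPteAddress code.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: historical fixed base of the self-mapped PTE array, 4-level paging.
constexpr uint64_t kPteBase = 0xFFFFF68000000000ULL;

// Address of the PTE that maps `va`: base + (virtual page number * 8).
// (va >> 9) equals (va >> 12) * 8 once the low 3 bits are masked off.
constexpr uint64_t pte_address(uint64_t va) {
    return kPteBase + ((va >> 9) & 0x7FFFFFFFF8ULL);
}
```

Applying pte_address to its own result gives the address of the entry one level up, which is the "continue this process for all paging levels" step.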

Unfortunately in linux PTEs are not stored in one huge contiguous memory area. So we need to start at the top level (from the PGD) and scan all tables on the lower levels. The root pgd_t is stored in init_mm.pgd. As usual, the variable init_mm is not exported

<sarcasm>Linux widely known for the consistency, completeness and backward-compatibility of its API and being developers-friendly in general</sarcasm>

Next we need a way to find valid pXX_t entries. It seems that there are functions pXX_present, pXX_bad and so on. The right sequence of calls is

  • pXX_none
  • pXX_leaf - this is damn good name for functions to check for large pages
  • pXX_bad
  • and finally pXX_offset to get item for next level
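The order of the checks can be modelled in userspace with a toy entry type. Everything here is a stand-in: Entry, Walk and classify are hypothetical names, and the boolean fields only imitate the results of pXX_none, pXX_leaf and pXX_bad. The sketch also assumes that a large-page entry may fail the pXX_bad test, which would be one reason to test for a leaf first.

```cpp
#include <cassert>

// Toy stand-in for pgd_t/p4d_t/pud_t/pmd_t, not the kernel types.
struct Entry {
    bool none = true;   // models pXX_none
    bool leaf = false;  // models pXX_leaf (large page)
    bool bad = false;   // models pXX_bad
};

enum class Walk { NotMapped, LargePage, Corrupt, Descend };

// Apply the checks in the order listed above to a single entry.
Walk classify(const Entry& e) {
    if (e.none) return Walk::NotMapped;  // nothing mapped here, skip the subtree
    if (e.leaf) return Walk::LargePage;  // maps memory directly, no lower table
    if (e.bad)  return Walk::Corrupt;    // not a sane pointer to a lower table
    return Walk::Descend;                // safe to call pXX_offset for the next level
}
```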

Unfortunately there are also so called hugeTLB pages (enabled with CONFIG_HUGETLB_PAGE):

grep 'HUGETLB_PAGE' /boot/config-$(uname -r)
CONFIG_HUGETLB_PAGE=y
CONFIG_HUGETLB_PAGE_FREE_VMEMMAP=y

As you may expect, functions pmd_huge & pud_huge are not exported either (and p4d_huge & pgd_huge are just dumb macros)

Finally we need to check if some page is executable. This is very hardware specific - for example

  • Arc has flag _PAGE_EXECUTE
  • aarch64 has flag _PAGE_KERNEL_EXEC
  • powerpc has _PAGE_EXEC
  • s390 has _PAGE_NOEXEC

so for some archs there is a function pte_exec, while for others there is pte_no_exec. It's also curious that there are no analogs for pud/pmd etc - so I actually have zero ideas how to check executability for large & huge pages
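For x86-64 specifically, one possible workaround is to look at the raw entry value instead of relying on a per-level helper. The sketch below assumes the x86-64 entry layout, where bit 0 is Present and bit 63 is the no-execute (NX/XD) bit at every paging level; the names kPresent, kNx and x86_entry_executable are hypothetical, and this is not what the kernel API offers.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: x86-64 paging entry layout.
constexpr uint64_t kPresent = 1ULL << 0;  // entry is valid
constexpr uint64_t kNx = 1ULL << 63;      // no-execute bit

// True if a raw page-table entry (any level) is present and not marked NX.
constexpr bool x86_entry_executable(uint64_t raw) {
    return (raw & kPresent) != 0 && (raw & kNx) == 0;
}
```

A page is executable only if no entry on the path to it has NX set, so a walker would have to combine this check across all levels it passes through.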

However, this is not the end of suffering. Quick check:

grep address /proc/cpuinfo
address sizes    : 39 bits physical, 48 bits virtual
shows that they lie - my hardware actually supports only 48-bit addresses, so the kernel should have only 4 levels of paging. Try to guess how they swept the trash under the carpet?

Tuesday, August 20, 2024

bpf_verifier_ops

Let's dissect some typical ebpf spyware. It sets up uprobes on

  • SSL_read_ex
  • SSL_read
  • SSL_write_ex
  • SSL_write
  • gotls_write_register
  • gotls_read_register
  • gotls_exit_read_register

and uses bpf functions probe_read_user & probe_read_user_str to steal data and map_update_elem & ringbuf_submit to store data in bpf maps

How can we mitigate this?

The official way is to use LSM - function __sys_bpf calls security_bpf, so we could register an LSM hook with index bpf via security_add_hooks. This effectively prevents loading of the ebpf program, which sometimes is not what we want - for example in the case of honeypots there is a high chance that the usermode program will just exit after the failed ebpf program load, and you can't monitor which connections it would try to establish

Another way is to patch bpf_func_proto for selected functions, like I did. However this is a brutal method and it affects all ebpf programs (I still believe that some are not spyware, he-he)

Luckily there is a way to blind only some types of ebpf programs - method get_func_proto in bpf_verifier_ops. I made a PoC to blind the aforementioned 4 functions for BPF_PROG_TYPE_TRACING & BPF_PROG_TYPE_KPROBE only

Now we have another problem - how to check the integrity of bpf_verifier_ops? I've also added this check to my lkcd. Example output when the PoC ublind is loaded looks like:

[24] type BPF_PROG_TYPE_TRACING at 0xffffffffc1357720 - ublind!s_trace_patched
  get_func_proto: 0xffffffffc13551e0 - ublind!my_func_proto
  is_valid_access: 0xffffffffaee24e20 - kernel!tracing_prog_is_valid_access