воскресенье, 5 апреля 2020 г.

static code analysis

This cool article is good case to show how you can employ static code analysis for extracting some unexported symbols from binary code - in this case we need ExNPagedLookasideLock & ExNPagedLookasideListHead

Sure the first thing you need is disassembler. If you search at GitHub "x86 disasm" you will get something about 20 repositories, but we need one that satisfies some requirements:
  • disasm to some intermediate code and not in string output
  • can be used in kernel mode (just in case if you want to do it) which means that it must be written in C

So just choose the one with the most comprehensible code - bcs they all contains bugs and you anyway will fix them and/or add missed instructions

Lets start with exported function ExInitializeNPagedLookasideList. Simplest cases - xp 64bit:


...
 mov     [rcx+54h], r10d
 lea     rdx, [rcx+40h]
 lea     r8, ExNPagedLookasideLock
 lea     rcx, ExNPagedLookasideListHead
 add     rsp, 28h
 jmp     ExInterlockedInsertTailList


You can just collect all lea instructions which loading addresses from .data section, r8 will hold address of lock and rcx list. ok, this was simple, so lets see more complex case - w2k3 64bit
 sub     rsp, 28h
 test    cl, 0Fh
 jz      short loc_52A748
 mov     r8, [rsp+28h]
 mov     rdx, rcx
 lea     rcx, aInitializeslis
 call    DbgPrint
 mov     ecx, 80000002h
 call    ExRaiseStatus
 int     3                              ; Trap to Debugger


Ups, Bad Thing happened - you can`t disasm after int3 instruction bcs this is something that should not happen during normal execution. This is where you need graph codeflow. Obvious choice is to consider codeblocks as nodes and jumps to other code blocks as edges. Sure each node can have lots of directed edged (and some of this edges even can point to the same node - so we have directed graph with loops). The good news is that you don`t need to build whole graph - bcs you anyway discovering all your nodes and edges dynamically. It`s like Lee algo for pathfinding in maze - you just need to have set of codeblocks which you processing, another set of edges which you will process in next cycle, and some storage for already processed blocks. So algo is simple:
  1. add your initial address (in this case address of ExInitializeNPagedLookasideList) in current set
  2. process all of codeblocks in your current set and extract links to other codeblocks in next set
  3. when you finished with current set - add all processed blocks to some "storage of processed" - it can be something simple like std::list<std::pair< PBYTE, DWORD >> or some heavy weapon like Boost interval container library (or some interval tree written in plain C) . The main idea is that this storage can tell you if your address already was processed
  4. now you have set of newly discovered addresses, filter out already processed and put them in your current set
  5. if this set is empty - algo ends
  6. goto step 2

So in our sample from w2k3 disasm will stop at address of int3 but set of newly discovered addresses will contains new address 52A748:
loc_52A748:  
 ...
 lea     r8, ExNPagedLookasideLock
 lea     rcx, ExNPagedLookasideListHead
 add     rsp, 28h
 jmp     ExInterlockedInsertTailList


exactly like in previous case. Lets see more complex case - w7 64bit:

ExInitializeNPagedLookasideList:
 sub     rsp, 48h
 and     [rsp+48h+var_10], 0
 movzx   eax, [rsp+48h+Depth]
 mov     [rsp+48h+var_18], ax
 mov     eax, [rsp+48h+Tag]
 mov     [rsp+48h+var_20], eax
 mov     rax, [rsp+48h+Size]
 mov     [rsp+48h+var_28], rax
 call    ExInitializeNPagedLookasideListInternal
 add     rsp, 48h
 retn


Ups-2, there was nothing like loading some addresses. But you can collect calls and also add them to edge storage. Function ExInitializeNPagedLookasideListInternal is very big and complex so lets see something interesting at address 140166277:
loc_140166277:
 add     [rbx+4B00h], edi
 lock bts cs:ExNPagedLookasideLock, 0
 jnb     short loc_1401662A6
 lea     rcx, ExNPagedLookasideLock
 call    KxWaitForSpinLockAndAcquire
 add     [rbx+4B04h], edi
 add     [rbx+4B08h], eax
 mov     r9d, eax
 jmp     short loc_1401662A9


Function KxWaitForSpinLockAndAcquire is not exported but you can recognize lock with instruction lock bts [address in .data section]. Now you know that you have lock address and in one of two blocks after this you will have safely access to list - so you need associate some State with all of your edges. Lets say for this simple case that state 0 means that we still search for lock and state 1 means that we got lock and now expect list address. So you can mark addresses 1401662A6 and 1401662A9 as having state 1 and then process them in some other way

There is one important question - how safe you can skip processing of nodes with state 0 when you already have some edges with state 1? Answer - it depends from your state machine. You can draw on paper all your states and if this graph has no branches - you can skip all processing in this cycle blocks with lower state:

But lets see other state graph (branches can occur when we try to find several addresses or there are several way to find the same address. And you clearly doing something wrong if your state graph has loops):

In this case you can safely skip only edges with state 0 when you in state 1. But you can`t skip edges with state 1 or 2 when you processing block with state 3

Комментариев нет:

Отправить комментарий