ASM. Secrets of PEB Structure Exploitation

Tr0jan_Horse



The abbreviation PEB stands for "Process Environment Block", the environment block of a running process. It has long attracted the attention of code researchers, because it holds a user-mode copy of process data that otherwise lives in the kernel and is inaccessible from user level. On 64-bit systems the block is 0x380 = 896 bytes in size and describes about 90 fields. You could hide an elephant in that much space, so the PEB is bound to contain interesting values and references, which are what this article is about.

Table of contents:

1. Introduction
2. Collecting process information without calling Win32 API
3. Potential attacks on the console
4. Structure «KUSER_SHARED_DATA»
5. Conclusion




1. Introduction

First, let's figure out where the "Process Environment Block" came from and why it is needed at all.
A process is a kernel object, so when a program is launched from disk or the user-mode function CreateProcess() is called, the creation request goes to the kernel, Ntoskrnl.exe. There the executive creates the main EPROCESS structure, through which the OS manages the process from then on. Since a process is just a container (it is its thread that actually executes code), an ETHREAD structure for the main thread is created in the kernel right away as well. The process can later create additional threads, and for each of them the kernel creates a separate ETHREAD. Thus every process has exactly one EPROCESS structure (its passport) and one or more ETHREAD structures.

However, part of the operating system lives in the user address space: services, Win32 API libraries, and so on. In the course of their duties they also need to consult the EPROCESS structure from time to time. That would be fine, except that transitions from user mode to the kernel and back are expensive, and when they are repeated several hundred times per second the performance of the whole OS sags. So the engineers created what is, in a sense, a copy of the kernel EPROCESS and placed it in user space for their charges. That copy is the hero of this article, the PEB, together with the per-thread block rigidly tied to it, the TEB.

This move not only sped up user applications but also created fertile ground for malware tricks, since previously confidential kernel information became publicly readable. Microsoft had to choose between fending off potential threats and improving performance, and ultimately chose the latter. I will say in advance that the PEB has few fields whose modification can disrupt the process, but such fields do exist.

A pointer to the PEB is written into each TEB structure of the process, and a pointer to the TEB itself is stored in the hidden part of a segment register: fs on x32 systems and gs on x64 systems. It follows that the OS gives the TEB not just an address in the memory of the current process but a dedicated segment (although in protected mode the line between the two is thin). The !address command of the WinDbg debugger returns information about any memory region (it works only in user mode). Judging by the log, the PEB sits in the private memory of the process (MEM_PRIVATE), 4096 bytes in size (one virtual page with Read\Write attributes):


Code:

0:000> !address $peb <------- pseudo-register
--------------------------------------------
Usage: PEB
Allocation Base: 000007ff`fffdf000
Base Address: 000007ff`fffdf000
End Address: 000007ff`fffe0000
Region Size: 00000000`00001000
Type: 00020000 MEM_PRIVATE
State: 00001000 MEM_COMMIT
Protect: 00000004 PAGE_READWRITE


2. Collecting process information without calling Win32 API

The PEB structure is a gold mine where you can find many useful things. The main advantage of reading it directly is that the extracted data lets you write position-independent shellcode that works "in snow and rain" on any version of Windows. The key step is to take the base address from the gs register and then offset from it to the required fields. The fields hold not only integer values but also pointers to other system structures far beyond the PEB itself, which greatly expands our capabilities.

The diagram below shows how the structures reachable through the PEB relate to each other. I shaded the secondary fields in gray to focus on those that we will dump to the console. You will find the entire PEB structure in the article attachment; the diagram shows only a fragment, its first 30h members:


PebLinks.webp



2.1. Output of information from the main PEB

As mentioned above, on 64-bit Windows the pointer to the PEB is located at offset 60h in the TEB, which in turn is always addressed through the gs segment register. So the PEB is reachable with a single assembler instruction, mov rsi,[gs:60h], after which RSI holds a pointer to the PEB.



Code:
0: kd> dt _teb Process*
nt!_TEB
+0x060 ProcessEnvironmentBlock : Ptr64 _PEB    ;<--- address of the PEB in the TEB structure
0: kd>
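The PEB itself is addressed by fixed offsets in the same way. As a rough offline illustration (Python with ctypes, working on a plain byte buffer rather than on live process memory), the first members of the 64-bit PEB can be laid out as below. The class name PEB64Head and the helper read_peb_head are made up for this sketch, and only the leading fields are modeled:

```python
import ctypes


class PEB64Head(ctypes.Structure):
    """Leading members of the 64-bit PEB (illustrative subset only)."""
    _fields_ = [
        ("InheritedAddressSpace",    ctypes.c_uint8),      # +0x00
        ("ReadImageFileExecOptions", ctypes.c_uint8),      # +0x01
        ("BeingDebugged",            ctypes.c_uint8),      # +0x02
        ("BitField",                 ctypes.c_uint8),      # +0x03
        ("Padding0",                 ctypes.c_uint8 * 4),  # explicit alignment gap
        ("Mutant",                   ctypes.c_uint64),     # +0x08
        ("ImageBaseAddress",         ctypes.c_uint64),     # +0x10
        ("Ldr",                      ctypes.c_uint64),     # +0x18
        ("ProcessParameters",        ctypes.c_uint64),     # +0x20
        ("SubSystemData",            ctypes.c_uint64),     # +0x28
        ("ProcessHeap",              ctypes.c_uint64),     # +0x30
    ]


def peb_offsets():
    """Return a {field name: byte offset} map for the modeled fields."""
    return {name: getattr(PEB64Head, name).offset
            for name, _ in PEB64Head._fields_}


def read_peb_head(raw: bytes) -> PEB64Head:
    """Interpret a dumped byte buffer as the start of a 64-bit PEB."""
    return PEB64Head.from_buffer_copy(raw)
```

The offsets produced by peb_offsets() are the numbers that the symbolic names in the listings stand for, for example PEB.ProcessParameters is 20h and PEB.ProcessHeap is 30h.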



From there we simply read the fields by their offsets, using the value in RSI as the base.
With a description of the structure at hand there is no problem at all, because the field name can be written instead of a hex offset. This is how it looks in practice:

C-like:
mov     rsi, [gs:0x60]    ;//<---- pointer to the PEB
push    rsi
movzx eax, [rsi + PEB.BeingDebugged]
mov ebx, [rsi + PEB.NtGlobalFlag]
mov r10d,[rsi + PEB.NumberOfProcessors]

cinvoke printf,<10,' BeingDebugged.......: %d',\
10,' NtGlobalflag........: 0x%x',\
10,' ImageBase...........: 0x%016I64x',\
10,' ProcessHeap.........: 0x%016I64x',\
10,' ProcessorCores......: %d',10,0>,\
rax,rbx,\
[rsi + PEB.ImageBaseAddress],\
[rsi + PEB.ProcessHeap],r10




The output shows the debug flag of our application (yes/no), then the global system debug flags NtGlobalFlag, 0x400 (in this case: not being debugged), the base address at which the image is loaded, 0x00400000, the address of the default process heap, and in the last field the number of CPU cores. Note that this did not require a single Win32 API call, although normally a whole chain would be needed: IsDebuggerPresent(), GetModuleHandle(), GetProcessHeap() and cpuid.
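To make the 0x400 value less mysterious, here is a small sketch that decodes NtGlobalFlag bits. It is an illustration under assumptions, not part of the original sources: the constants follow the GFlags naming as far as I know it, only four flags are listed, and the function names are invented for the example. The three heap flags together (0x70) are what is usually seen when a process was created by a debugger.

```python
# Subset of NtGlobalFlag bits (GFlags names, assumed; the full list is longer).
FLG_HEAP_ENABLE_TAIL_CHECK = 0x10
FLG_HEAP_ENABLE_FREE_CHECK = 0x20
FLG_HEAP_VALIDATE_PARAMETERS = 0x40
FLG_POOL_ENABLE_TAGGING = 0x400

KNOWN_FLAGS = {
    FLG_HEAP_ENABLE_TAIL_CHECK: "FLG_HEAP_ENABLE_TAIL_CHECK",
    FLG_HEAP_ENABLE_FREE_CHECK: "FLG_HEAP_ENABLE_FREE_CHECK",
    FLG_HEAP_VALIDATE_PARAMETERS: "FLG_HEAP_VALIDATE_PARAMETERS",
    FLG_POOL_ENABLE_TAGGING: "FLG_POOL_ENABLE_TAGGING",
}

# Heap-checking bits that a debugger-created process typically carries.
DEBUG_HEAP_MASK = (FLG_HEAP_ENABLE_TAIL_CHECK
                   | FLG_HEAP_ENABLE_FREE_CHECK
                   | FLG_HEAP_VALIDATE_PARAMETERS)


def describe_nt_global_flag(value: int) -> list:
    """Return the names of the known bits that are set in value."""
    return [name for bit, name in sorted(KNOWN_FLAGS.items()) if value & bit]


def looks_debugger_created(value: int) -> bool:
    """True when all three debug-heap bits (0x70) are set."""
    return value & DEBUG_HEAP_MASK == DEBUG_HEAP_MASK
```

With these definitions, 0x400 decodes to pool tagging only, which is consistent with the "not being debugged" reading above, while 0x70 would point to a debugger-created process.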


peb_1.webp



Now, following the well-trodden path, we read more fields of the PEB structure and take in another dose of information:


C-like:
pop     rsi
push    rsi
cinvoke printf,<10,' LdrLoaderData.......: 0x%016I64x',\
10,' SharedMemoryBase....: 0x%016I64x',\
10,' StaticServerData....: 0x%016I64x',\
10,' ProcessParameters...: 0x%016I64x',10,0>,\
[rsi + PEB.Ldr],\
[rsi + PEB.ReadOnlySharedMemoryBase],\
[rsi + PEB.ReadOnlyStaticServerData],\
[rsi + PEB.ProcessParameters]
pop     rsi
push    rsi
mov eax, [rsi + PEB.OSMajorVersion]
mov ebx, [rsi + PEB.OSMinorVersion]
movzx ebp, [rsi + PEB.OSBuildNumber]
mov r10d,[rsi + PEB.OSPlatformId]
mov r11, [rsi + PEB.CSDVersion]
cinvoke printf,<10,' OSMajorVersion......: %d',\
10,' OSMinorVersion......: %d',\
10,' OSBuildNumber.......: %d',\
10,' OSPlatformId........: %d',\
10,' CSDVersion..........: %ls',10,0>,\
rax,rbx,rbp,r10,r11

Apart from the operating system version (NT 6.1 = Win7), build number, and service pack,
the fields of interest are LdrLoaderData and ProcessParameters, which store pointers to structures outside the PEB perimeter. They are discussed next:
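For reference, the OSMajorVersion and OSMinorVersion pair can be turned into a product name with a simple lookup. This is only a sketch: the table lists client editions (server editions share the same numbers), and the names NT_VERSIONS and os_name are invented for the example.

```python
# (major, minor) -> client edition name; server editions share these numbers.
NT_VERSIONS = {
    (5, 1): "Windows XP",
    (5, 2): "Windows XP x64 / Server 2003",
    (6, 0): "Windows Vista",
    (6, 1): "Windows 7",
    (6, 2): "Windows 8",
    (6, 3): "Windows 8.1",
    (10, 0): "Windows 10 / 11",
}


def os_name(major: int, minor: int) -> str:
    """Map the PEB OSMajorVersion/OSMinorVersion pair to a readable name."""
    return NT_VERSIONS.get((major, minor), f"unknown NT {major}.{minor}")
```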


peb_2.webp




2.2. Purpose of the structure "PEB_LDR_DATA"

So the first field in the screenshot above points to the "PEB_LDR_DATA" structure. Its fields are filled in by the system image loader, so from them you can find the base addresses of the DLLs loaded into our process, their names, the entry point of each, the size each occupies in memory, and much more. This is what the structure prototype looks like:



C-like:
struct PEB_LDR_DATA
Length                   dd  0
Initialized              dd  0
SsHandle                 dq  0
InLoadOrderModuleList    LIST_ENTRY
InMemoryOrderModuleList  LIST_ENTRY
InInitOrderModuleList    LIST_ENTRY
EntryInProgress          dq  0
ShutdownInProgress       dq  0
ShutdownThreadId         dq  0
ends



The first three and last three fields are of no interest to us; the middle is occupied by the LIST_ENTRY linked lists. All three lists carry the same information; they simply let us enumerate the libraries in the order the system loader loaded them, in the order of their location in memory, or in the order of initialization. Usually it is enough to take the link from the InLoadOrderModuleList field, which is a pointer to the loader's LDR_DATA_TABLE_ENTRY structure.

This brings us to the descriptor of the first module, and the link to the next one in the chain is in its first field, InLoadOrderLinks. To detect the last LDR_DATA_TABLE_ENTRY structure we need to save a reference point in advance, because a LIST_ENTRY linked list is circular: moving forward along the links, we always return to where we started.
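The circular-list idea is easy to see in a toy model. The sketch below does not read real loader structures; ListEntry, insert_tail and walk are invented names, and the head object plays the role of the LIST_ENTRY embedded in PEB_LDR_DATA. The walk stops when the forward link leads back to the head.

```python
class ListEntry:
    """Toy stand-in for LIST_ENTRY: a forward link and a backward link."""

    def __init__(self, name=None):
        self.name = name
        self.flink = self   # an empty list points at itself
        self.blink = self


def insert_tail(head: ListEntry, entry: ListEntry) -> None:
    """Append entry just before the head, keeping the ring closed."""
    last = head.blink
    entry.flink = head
    entry.blink = last
    last.flink = entry
    head.blink = entry


def walk(head: ListEntry) -> list:
    """Follow flink from the head until we arrive back at the head."""
    names = []
    entry = head.flink
    while entry is not head:
        names.append(entry.name)
        entry = entry.flink
    return names
```

Note that the blink of the first entry is the head itself, so reading that backward link is one way to obtain the stop marker without knowing the address of the head in advance.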


C:
struct LDR_DATA_TABLE_ENTRY     ;<------------- LIST_ENTRY from “PEB_LDR_DATA” points here
InLoadOrderLinks        LIST_ENTRY   ;------> address of the next “LDR_DATA_TABLE_ENTRY” structure in the chain
InMemoryOrderLinks      LIST_ENTRY
InInitOrderLinks        LIST_ENTRY
DllBase                 dq  0
EntryPoint              dq  0
SizeOfImage             dd  0
Padding1                dd  0
FullDllName             UNICODE_STRING
BaseDllName             UNICODE_STRING
Flags                   dd  0
LoadCount               dw  0
TlsIndex                dw  0
SectionPointer          dq  0
CheckSum                dd  0
Padding2                dd  0
TimeDateStamp           dq  0
EntryPointActContext    dq  0
PatchInformation        dq  0
ForwarderLinks          LIST_ENTRY
ServiceTagLinks         LIST_ENTRY
StaticLinks             LIST_ENTRY
ContextInformation      dq  0
OriginalBase            dq  0
LoadTime                dq  0
ends
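The FullDllName and BaseDllName members are UNICODE_STRING descriptors, not the text itself. On x64 such a descriptor is 16 bytes: a 2-byte Length, a 2-byte MaximumLength, 4 bytes of padding, and an 8-byte Buffer pointer at offset 8, which is why a "+8" is added to the field name whenever the string pointer is needed. A minimal offline sketch, assuming a dumped memory region held in a bytes object (the function name and its parameters are invented for the example):

```python
import struct

# Length (u16), MaximumLength (u16), 4 bytes padding, Buffer (u64)
UNICODE_STRING_FMT = "<HH4xQ"
UNICODE_STRING_SIZE = struct.calcsize(UNICODE_STRING_FMT)  # 16 bytes on x64


def read_unicode_string(memory: bytes, base: int, address: int) -> str:
    """Decode the UNICODE_STRING located at an absolute address.

    memory is a dump of a region that starts at absolute address base.
    """
    length, _max_length, buffer_ptr = struct.unpack_from(
        UNICODE_STRING_FMT, memory, address - base)
    start = buffer_ptr - base
    # Length is given in bytes and does not include the terminating zero.
    return memory[start:start + length].decode("utf-16-le")
```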

As a rule, malware uses these loader structures to find the base of Ntdll without calling the dynamic-loading APIs LoadLibrary() + GetProcAddress(), since those are like a red rag to antivirus software. With the base in hand, the DLL's export section can be parsed manually to find the address of the desired function in process memory.

In my example I do not focus on one library but loop through them all. Functions with a similar purpose are NtQuerySystemInformation() from Ntdll, CreateToolhelp32Snapshot() + EnumProcessModules() from Kernel32.dll, and, for completeness, EnumerateLoadedModulesEx() from DbgHelp.dll. With the /i switch the debugger reports the number of instructions in the disassembly listing. As we can see, these cost at least 230 instructions (not counting nested API calls), while we spent only 5 (the rest is the loop and the printf output). Draw your own conclusions.
C-like:
;//  0: kd> uf /i NtQuerySystemInformation  ;<--- 253 instructions scanned
;//  0:000> uf /i CreateToolhelp32Snapshot  ;<--- 232 instructions scanned
;//-----------------------------------------------------------------------
pop     rsi
push    rsi
mov rsi,[rsi + PEB.Ldr]
mov rsi,[rsi + PEB_LDR_DATA.InLoadOrderModuleList]       ;//<--- link to the first “LDR_DATA_TABLE_ENTRY”
mov rax,[rsi + LDR_DATA_TABLE_ENTRY.InLoadOrderLinks+8]  ;//<--- its Blink, i.e. the list head inside PEB_LDR_DATA

mov [firstEntry],rax     ;//<----- Remember where the list starts!!!

@@: push rsi
mov rbx,[rsi + LDR_DATA_TABLE_ENTRY.BaseDllName+8]
mov eax,[rsi + LDR_DATA_TABLE_ENTRY.SizeOfImage]
shr eax,10
cinvoke printf,<10,' %02d  %016I64x  %016I64x  %5d Kb  %ls',0>,[counter],\
[rsi + LDR_DATA_TABLE_ENTRY.DllBase],\
[rsi + LDR_DATA_TABLE_ENTRY.EntryPoint],rax,rbx
pop     rsi
inc [counter]
mov rsi,[rsi + LDR_DATA_TABLE_ENTRY.InLoadOrderLinks]
cmp rsi,[firstEntry]     ;//<----- Back at the start of the list?
jnz     @b



During the experiments it turned out that not all fields of the LDR_DATA_TABLE_ENTRY structure hold valid values, so you shouldn't trust them blindly. For example, in the screenshot below the entry point address of Ntdll is zero for some reason (most likely simply because ntdll.dll has no entry point in its PE header), although for the other DLLs it is correct (I compared it with WinDbg's output). The following, however, will always be correct: the module's base in memory, its size, its creation time TimeDateStamp, and the names Full\BaseDllName (I was too lazy to check all the rest).


peb_3.webp




2.3. Structure «RTL_USER_PROCESS_PARAMETERS»

Having taken a superficial look at the loader's LDR_DATA_TABLE_ENTRY structure, let's return to the PEB and this time look into RTL_USER_PROCESS_PARAMETERS, which is under the control of the kernel's Executive subsystem. Its prototype is given in the dump below, and the pointer to it is in the PEB at offset 0x20. Note that the address points into the heap area of our process and is therefore fully writable:



Code:
0:000> dt _peb 000007fffffda000 Process*
ntdll!_PEB
+0x020 ProcessParameters    : 0x00000000`00272080  _RTL_USER_PROCESS_PARAMETERS
+0x030 ProcessHeap          : 0x00000000`00270000  Void
+0x0f0 ProcessHeaps         : 0x00000000`77961c40  -> 0x00000000`00270000 Void
+0x100 ProcessStarterHelper : (null)
+0x300 ProcessStorageMap    : (null)

0:000> !address 00272080

Usage:             <unclassified>
Allocation Base:   00000000`00270000
Base Address:      00000000`00270000
End Address:       00000000`00276000
Region Size:       00000000`00006000
Type:              00020000  MEM_PRIVATE
State:             00001000  MEM_COMMIT
Protect:           00000004  PAGE_READWRITE  <---- R\W attributes

Code:
0: kd> dt _RTL_USER_PROCESS_PARAMETERS -v

ntdll!_RTL_USER_PROCESS_PARAMETERS, 30 elements, 0x400 bytes
   +0x000 MaximumLength    : Uint4B
   +0x004 Length           : Uint4B
   +0x008 Flags            : Uint4B
   +0x00c DebugFlags       : Uint4B
   +0x010 ConsoleHandle    : Ptr64 to Void
   +0x018 ConsoleFlags     : Uint4B
   +0x020 StandardInput    : Ptr64 to Void
   +0x028 StandardOutput   : Ptr64 to Void
   +0x030 StandardError    : Ptr64 to Void
   +0x038 CurrentDirectory : struct  _CURDIR,         2 elements, 0x18 bytes
   +0x050 DllPath          : struct  _UNICODE_STRING, 3 elements, 0x10 bytes
   +0x060 ImagePathName    : struct  _UNICODE_STRING, 3 elements, 0x10 bytes
   +0x070 CommandLine      : struct  _UNICODE_STRING, 3 elements, 0x10 bytes
   +0x080 Environment      : Ptr64 to Void
   +0x088 StartingX        : Uint4B
   +0x08c StartingY        : Uint4B
   +0x090 CountX           : Uint4B
   +0x094 CountY           : Uint4B
   +0x098 CountCharsX      : Uint4B
   +0x09c CountCharsY      : Uint4B
   +0x0a0 FillAttribute    : Uint4B
   +0x0a4 WindowFlags      : Uint4B
   +0x0a8 ShowWindowFlags  : Uint4B
   +0x0b0 WindowTitle      : struct  _UNICODE_STRING, 3 elements, 0x10 bytes
   +0x0c0 DesktopInfo      : struct  _UNICODE_STRING, 3 elements, 0x10 bytes
   +0x0d0 ShellInfo        : struct  _UNICODE_STRING, 3 elements, 0x10 bytes
   +0x0e0 RuntimeData      : struct  _UNICODE_STRING, 3 elements, 0x10 bytes
   +0x0f0 CurrentDir       : [32] struct _RTL_DRIVE_LETTER_CURDIR, 4 elements, 0x18 bytes
   +0x3f0 EnvironmentSize  : Uint8B
   +0x3f8 EnvironmentVer   : Uint8B
0: kd>

One of the interesting fields in this structure is Environment, through which you can reach the environment variables. Practically without API calls we get, for example, the user and computer names, paths to profile folders, information about the processor, and much more. Keep in mind that the text strings in Environment are stored not in the usual ANSI format but as Unicode, 2 bytes per character. In my example I did not print the entire long block to the console and set the counter to the first 10 strings only (you can change the value in the RCX register if you wish).



C-like:
mov     rsi,[gs:60h]
mov rsi,[rsi + PEB.ProcessParameters]
mov rsi,[rsi + RTL_USER_PROCESS_PARAMETERS.Environment]
add rsi,10h
mov rcx,10

@@:      push    rsi rcx     ;// RSI = source, RCX = counter
mov rdi,buff    ;// RDI = destination for the Unicode-to-ANSI conversion
@copy:   lodsw               ;// take 2 bytes (a Word) from the source,
stosb ;//    ...and store one Byte in the destination.
or ax,ax       ;// end of string? (terminating zero)
jnz @copy ;// no - continue..

cinvoke CharToOem,buff,buff        ;//<---- so that possible Cyrillic prints correctly on the console
cinvoke printf,<10,' %s',0>,buff

pop     rcx rsi
shl eax,1       ;// pointer correction: printf returned len+2 characters,
sub rax,2       ;// ...so (len+2)*2-2 = string size in bytes incl. terminator
add rsi,rax     ;// next string
loop    @b
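The layout of the environment block itself may be easier to follow in a high-level sketch: it is a sequence of UTF-16LE strings, each terminated by a 16-bit zero, with an empty string marking the end of the block. The function below works on a dumped copy of the block in a bytes object; parse_environment and its limit parameter are invented for this illustration and are not taken from the FASM source.

```python
def parse_environment(block: bytes, limit=None) -> list:
    """Split a UTF-16LE environment block into its NAME=value strings."""
    strings = []
    pos = 0
    while pos + 1 < len(block):
        # Scan 16-bit units until the terminating zero of this string.
        end = pos
        while end + 1 < len(block) and block[end:end + 2] != b"\x00\x00":
            end += 2
        if end == pos:          # empty string: end of the whole block
            break
        strings.append(block[pos:end].decode("utf-16-le"))
        if limit is not None and len(strings) >= limit:
            break
        pos = end + 2           # skip the terminator, go to the next string
    return strings
```

Calling it with limit=10 mirrors the counter of 10 used in the listing above.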




peb_5.webp



Besides the variables, the RTL_USER_PROCESS_PARAMETERS structure holds the size of our window in pixels, its title, the full path to the executable file, and, if you're lucky, the path to the PowerShell system snap-in. Here, though, we'll focus on the attributes of the console window: the standard input/output handles StdHandle and the handle of the console itself. Normally this information is returned by the GetStdHandle() and GetConsoleWindow() functions:



C-like:
pop     rsi
mov rsi,[rsi + PEB.ProcessParameters]
mov rax,[rsi + RTL_USER_PROCESS_PARAMETERS.ConsoleHandle]
mov rbx,[rsi + RTL_USER_PROCESS_PARAMETERS.StandardInput]
mov rdi,[rsi + RTL_USER_PROCESS_PARAMETERS.StandardOutput]
mov rbp,[rsi + RTL_USER_PROCESS_PARAMETERS.Environment]
mov r10,[rsi + RTL_USER_PROCESS_PARAMETERS.ImagePathName+8]
mov r11,[rsi + RTL_USER_PROCESS_PARAMETERS.ShellInfo+8]
cinvoke printf,<10,\
10,' Console   Handle....: 0x%04x <---- DANGER! Attacks possible!',\
10,' StdInput  Handle....: 0x%02x',\
10,' StdOutput Handle....: 0x%02x',\
10,' Environment.........: 0x%08I64x',\
10,' Window Title........: %ls',\
10,' PowerShell Path.....: %ls',10,0>,rax,rbx,rdi,rbp,r10,r11


peb_4.webp




3. Potential attacks on the console

Up to this point we have only collected information from the PEB without putting it to practical use.
In this second round we will run small experiments, trying to find a vulnerability in the Windows console window. The flaw is tied to the handle stored in the ConsoleHandle field of the RTL_USER_PROCESS_PARAMETERS structure. That's why I marked it "DANGER! Attacks possible!" in the screenshot above.

The point is that a process can have only one console, and when it requests a second one with AllocConsole(), the system returns the error “Access denied!” with the code STATUS_ACCESS_DENIED=0xC0000022 (WinError=5). If you look up the prototype of this function on MSDN, you will find a warning to that effect:


AllocConsole.webp



Since this is a user-mode API from the Kernel32 library, the check for an existing console window must be done somewhere nearby. Let's disassemble the function in the debugger. And indeed: first the code enters a critical section (so that other threads do not interfere), and right after that we see the ConsoleHandle field tested for zero. If the application has a GUI and no console window, the jump je kernel32!AllocConsole+0x54 leads to console creation; otherwise the code leaves the critical section with the error "Access denied!". Well, that's just too lame.


C-like:
0:000> uf /i AllocConsole  ; 49 instructions scanned

kernel32!AllocConsole:
00000000`76cf48a0 fff3             push    rbx
00000000`76cf48a2 4881ec80060000   sub     rsp,680h
00000000`76cf48a9 488b05a06a0800   mov     rax,qword ptr [kernel32!_security_cookie]
00000000`76cf48b0 4833c4           xor     rax,rsp
00000000`76cf48b3 4889842470060000 mov     qword ptr [rsp+670h],rax
00000000`76cf48bb 488d0d5e660800   lea     rcx,[kernel32!DllLock]
00000000`76cf48c2 ff1500830100     call    qword ptr [kernel32!RtlEnterCriticalSection]
;//------------------------------------------------------------------

00000000`76cf48c8 654c8b1c253000   mov     r11,qword ptr  gs:[30h]
00000000`76cf48d1 498b4360         mov     rax,qword ptr [r11+60h]     -----> got the link to the PEB
00000000`76cf48d5 488b5020         mov     rdx,qword ptr [rax+20h]     -----> PEB -> RTL_USER_PROCESS_PARAMETERS
00000000`76cf48d9 48837a1000       cmp         qword ptr [rdx+10h],0   -----> check whether the “ConsoleHandle” field is set (y/n)
00000000`76cf48de 7414             je      kernel32!AllocConsole+0x54

;//------------------------------------------------------------------
kernel32!AllocConsole+0x40:
00000000`76cf48e0 488d0d39660800   lea     rcx,[kernel32!DllLock]
00000000`76cf48e7 ff15d3820100     call    qword ptr [kernel32!RtlLeaveCriticalSection]
00000000`76cf48ed b9220000c0       mov     ecx,0C0000022h              -----> STATUS_ACCESS_DENIED
00000000`76cf48f2 eb64             jmp     kernel32!AllocConsole+0xb8
...............



Thus, to allocate the forbidden second console it is enough to zero the ConsoleHandle field of the RTL_USER_PROCESS_PARAMETERS structure and call AllocConsole() again. Interestingly, the previous console turns into a zombie: its window no longer has a handle, so the system cannot find it. Moreover, nothing prevents creating a third or fourth console window this way, or even 1000 of them in a loop. These ownerless windows cannot even be closed, and they hang on the desktop until the next reboot. In other words, what we get is a "Denial of Service" (DoS) vulnerability. However, the bug is fixed on 64-bit Win7 SP1 systems, so it works only on 32-bit WinXP.


4. Structure «KUSER_SHARED_DATA»

In closing, I would like to say a few words about one more kernel structure mapped into user space: KUSER_SHARED_DATA.
Just like the PEB, it is designed to reduce the number of transitions from user mode to the kernel, so that system services do not bother the boss over every little thing. Unlike the PEB, the address of this structure is rigidly fixed at 0x00000000`7ffe0000 on both x32 and x64 systems (see the x64dbg debugger).

The most interesting fields are: the system clock ticks that GetTickCount() returns, the size of installed physical RAM (given in 4-KB page frames), the number of physical CPUs with their logical cores, and the bitness of the application, x32\64. The OS kernel simply maps this structure from its own body into user space, so there is no point in trying to modify it: the user-mode view is read-only, and the kernel keeps refreshing the contents anyway.


C-like:
mov     rsi,0x7ffe0000
movzx eax, [rsi + KUSER_SHARED_DATA.ImageNumberLow]
mov ebx, [rsi + KUSER_SHARED_DATA.ActiveConsoleId]
mov ebp, [rsi + KUSER_SHARED_DATA.NumberOfPhysicalPages]
imul rbp, 4096
shr rbp, 20
movzx edi, [rsi + KUSER_SHARED_DATA.ActiveGroupCount]
mov r10d,[rsi + KUSER_SHARED_DATA.ActiveProcessorCount]
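mov r11d,[rsi + KUSER_SHARED_DATA.TickCountMultiplier]
imul r11, [rsi + KUSER_SHARED_DATA.TickCount]     ;// 64-bit product: ticks * multiplier ...
shr r11, 24     ;// ... >> 24 = milliseconds, the way GetTickCount() derives them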
cinvoke printf,<10,\
10,' ******* KUSER_SHARED_DATA *******',\
10,' ImageNumber.........: 0x%04x',\
10,' ConsoleId...........: 0x%02x',\
10,' PhysicalMemory......: %u Mb',\
10,' TotalProcessors.....: %d',\
10,' ProcessorCores......: %d',\
10,' TickCount...........: %I64u',10,0>,rax,rbx,rbp,rdi,r10,r11
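The two conversions in this listing are plain arithmetic, so here they are once more as a small sketch. The function names are invented, and the multiplier 0x0FA00000 used in the usage note is only an assumed sample value (it corresponds to a 15.625 ms tick); the real one has to be read from TickCountMultiplier.

```python
PAGE_SIZE = 4096  # bytes per physical page frame


def physical_memory_mib(number_of_physical_pages: int) -> int:
    """Convert NumberOfPhysicalPages to mebibytes (pages * 4096 >> 20)."""
    return (number_of_physical_pages * PAGE_SIZE) >> 20


def tick_count_ms(tick_count: int, multiplier: int) -> int:
    """Convert raw ticks to milliseconds.

    TickCountMultiplier is a fixed-point factor with 24 fractional bits,
    hence the shift by 24 after the multiplication.
    """
    return (tick_count * multiplier) >> 24
```

For example, 0x100000 pages come out as 4096 MiB, and 1000 ticks with the sample multiplier 0x0FA00000 come out as 15625 ms.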



sharedData.webp



5. Conclusion

The article was written in the hope that a novice reverse engineer will get at least something useful out of it. In essence, just make it a rule: if a code listing contains a construct like mov rax,[gs:0x60] (a read through the fs\gs segment registers), expect trouble. For example, on x32 systems malware often installed its SEH exception handler this way, to avoid calling the conspicuous SetUnhandledExceptionFilter(). In the attachment you will find the full FASM source code of the blocks above, an include file describing the PEB64 structures, and a ready-made x64 application for testing. Good luck to everyone, bye!
 