
Table of contents:
1. Introduction
2. Coordinates of excavation sites
• Hardware devices
• System registry
• Files in the Windows folder
• Active processes and modules
3. Practical part
4. Conclusions
1. Introduction
In our age of a rapidly developing IT industry, disguise plays a huge role. Malware "with low social responsibility" has no wish to lie down voluntarily on the operating table and have its carcass trepanned, so it tries with all its might to hide its presence in the system. Static and dynamic analysis tools, that is, disassemblers and debuggers, have to fight a rather smart opponent: as practice shows, malicious code is usually one step ahead. For this reason, a reverse engineer must be able to recognize, in long listings of someone else's code, the sequences of actions that betray an attempt to detect and evade virtual machines and sandboxes.
Static analysis of malicious code in IDA is practically safe for the researcher, although in most cases the payoff is questionable and tends to zero. It is usually used at the preparatory stage to view relationships, cross-references, or the list of imports from system DLLs. After that you have to bring in the heavy artillery of debuggers to track the course of events in real time. If you do not protect the perimeter of your host system at this stage, the malicious code can get out of control, which may lead to infection of the OS and loss of confidential data. This is where virtual machines such as VirtualBox and VMware, or emulators such as Qemu and Bochs (less productive in this context), come to the rescue.
Sandboxes add a special flavor to this reverse engineering toolkit. A sandbox is a kind of test environment in which all actions of suspicious code are monitored and all modified files and settings are saved, but nothing happens to the real host system. In a well-configured sandbox you can run any file without a twinge of conscience, fully confident that it will not affect the OS. Tools of this class are used not only for safety, but also to analyze everything the malware does after its carcass is launched.
However, any self-respecting malware is well aware of the internal structure of these tools. After all, they are ordinary software on the host, which means that thorough excavation can easily turn up their artifacts. By and large, traces of debuggers, virtual machines and sandboxes cannot be hidden in principle, and the virus hurries to take advantage of that. If it detects even the slightest hint of unfavorable soil, then for the sake of disguise it may refuse altogether to "spread flora in especially large ...", stupidly committing hara-kiri with a confusing death note like "The system does not have the DLL library I need", or simply vanishing into a black hole. The list below contains only the best-known ways malware authors bypass analysis tools, since the rest are derivatives of them.
2. Coordinates of excavation sites
Malicious applications use various methods to identify the execution environment, and its specific features make it easier to find. Both sandboxes and virtual machines usually cannot emulate a user's workstation 100% accurately - for example, VMs have limited system resources, their own drivers and DLLs, often use hard-coded user and computer names, etc. So why not take advantage of this?
2.1. Hardware devices
Nowadays a typical workstation has at least a 2-core CPU, 2 GB of RAM, and an 80 to 120 GB hard drive. Malware's "artificial intelligence" can check whether the environment meets these minimums. Note that, unlike emulators, virtual machines and sandboxes report the original CPU vendor name, so there is no point in relying on it. The processor frequency is also reported normally, but in most cases a VM has only one core, a meager amount of RAM, and a dynamically growing hard drive of about 10 to 20 GB. This applies only to virtual machines, since sandboxes in this respect depend entirely on the workstation's specifications.
With default VM settings, devices have predictable names: for example, you can check the HDD and DVD-ROM model, and the manufacturer and name of the graphics controller (along with the display resolution in pixels). We can also look for virtual devices that are obviously absent from a typical host system, such as the named pipes and ALPC ports used for guest-host communication. Particular attention should be paid to the MAC address of the network adapter, whose first three bytes are the manufacturer's identifier (OUI).
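The MAC check reduces to a prefix lookup. Below is a minimal sketch in Python, assuming the OUI prefixes printed by the article's own test program further down; the table name, the function name `vm_vendor_from_mac`, and the returned labels are hypothetical and serve only as illustration.

```python
# Hypothetical lookup table: the OUI prefixes are the ones printed by the
# article's test program; the vendor labels are simplified for illustration.
KNOWN_VM_OUIS = {
    "08:00:27": "VirtualBox",
    "00:21:f6": "VirtualBox",
    "52:54:00": "QEMU/Vagrant",
    "00:50:56": "VMware",
    "00:0c:29": "VMware",
    "00:05:69": "VMware",
    "00:1c:14": "VMware",
}


def vm_vendor_from_mac(mac):
    """Return a hypervisor label for a MAC address, or None if the OUI is unknown."""
    # Normalize case and separators: "08-00-27-AA-BB-CC" -> "08:00:27:aa:bb:cc"
    normalized = mac.strip().lower().replace("-", ":")
    oui = normalized[:8]  # first three bytes, e.g. "08:00:27"
    return KNOWN_VM_OUIS.get(oui)
```

A hit only means the adapter uses a vendor-assigned virtual prefix. The MAC address can usually be changed in the VM settings, so a miss proves nothing.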
Instructions execute at a very different speed on a virtual platform than on real hardware, so, to bring the picture closer to reality, a VM may intercept the system timer API. For example, the Sleep() function takes a single argument, a delay in milliseconds: it suspends the active thread and wakes it up again once the specified time has elapsed. Having intercepted the call, the VM inspects this argument and makes the measured pause come out exactly equal to the value we passed, whereas on a real system, because of the coarse hardware timer (the RTC crystal ticks at only 32.768 kHz), there will almost always be a deviation of about +/- 1%. For example, if we request Sleep(1000) for a 1-second delay and time it with GetTickCount(), a virtual machine gives exactly 1000 ms, while real hardware gives a little less (say 990) or a little more (say 1010). With a certain degree of probability, this allows code to tell that it is running on a virtual machine.
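The idea can be sketched in a few lines of portable Python. This is not the Win32 code itself: `time.sleep()` and `time.monotonic()` stand in for Sleep() and GetTickCount(), and the function names and the zero-tolerance rule are assumptions made for the example.

```python
import time


def measure_sleep_ms(requested_ms):
    """Sleep for requested_ms and return how long the pause really took, in ms."""
    start = time.monotonic()
    time.sleep(requested_ms / 1000.0)
    return (time.monotonic() - start) * 1000.0


def suspiciously_exact(samples_ms, requested_ms, tolerance_ms=0.0):
    """True if every measured pause matches the requested delay within tolerance.

    Real hardware is expected to drift by a few milliseconds, so a run of
    perfectly exact values is the suspicious case described in the text.
    """
    return all(abs(s - requested_ms) <= tolerance_ms for s in samples_ms)
```

Several samples are worth taking: a single exact reading can happen by chance on real hardware, while a long run of them is harder to explain.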
It should be noted that malware is no pushover and has plenty of other ways to obtain system time ticks. It can find out whether the user-mode Sleep() or GetTickCount() has been intercepted: it is enough to read a few bytes of the function prologue in memory, then load the original Kernel32.dll from disk, find the same function in it, and compare the two prologues. If they differ, the VM has hooked the API by splicing, and the malware goes one level down, sending the current thread to the kingdom of Morpheus with NtDelayExecution() from the native Ntdll.dll library.
C-like:
0: kd> u Sleep            <---- prologue
kernel32!Sleep:
00000000`7740efa8 ff2542e80700 jmp qword [kernel32!_imp_Sleep (00000000`7748d7f0)]
00000000`7740efae 90 nop
00000000`7740efaf 90 nop
00000000`7740efb0 90 nop
0: kd> u NtDelayExecution
ntdll!ZwDelayExecution:
00000000`77678e00 4c8bd1 mov r10,rcx
00000000`77678e03 b831000000 mov eax,31h
00000000`77678e08 0f05 syscall
00000000`77678e0a c3 ret
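The prologue comparison itself is a plain byte-for-byte check. Here is a minimal Python sketch, assuming the two byte strings have already been obtained (one from the loaded module in memory, one from the file on disk); the function name is hypothetical, and the "spliced" sample is an invented jmp rel32 patch rather than bytes taken from a real hook. Relocations, which can legitimately change some bytes, are ignored here.

```python
def prologue_differs(in_memory, on_disk, length=6):
    """Compare the first `length` bytes of a function in memory and on disk."""
    if len(in_memory) < length or len(on_disk) < length:
        raise ValueError("need at least `length` bytes from each copy")
    return in_memory[:length] != on_disk[:length]


# Clean prologue of kernel32!Sleep, bytes taken from the listing above.
clean = bytes.fromhex("ff2542e80700")
# Hypothetical spliced prologue: jmp rel32 (opcode E9) written over the start.
spliced = bytes.fromhex("e9") + bytes(4) + bytes.fromhex("00")
```

Calling `prologue_differs(spliced, clean)` reports a difference, while comparing the clean bytes with themselves does not.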
The new generation of malware uses another rather clever trick: it does not call the timer API at all, but reads the ticks directly from KUSER_SHARED_DATA, a kernel structure readable from user mode that Windows always maps at the same address, 0x7ffe0000, on both x32 and x64. The qword-sized TickCountQuad field is located at offset 0x320.
C-like:
0: kd> dt _KUSER_SHARED_DATA 0x7ffe0000 Tick.
ntdll!_KUSER_SHARED_DATA
+0x000 TickCountLowDeprecated : 0
+0x004 TickCountMultiplier : 0xf99a027
+0x320 TickCountQuad : 0x1e2423
+0x32c TickCountPad : 0
0: kd>
As we can see, the multiplier sits at offset 0x004. It converts raw timer ticks into milliseconds and lets one structure layout serve both x32 and x64 systems. Thus, instead of calling the GetTickCount() API (which returns the number of milliseconds since the system was started), you can safely take the ticks straight from KUSER_SHARED_DATA since, judging by the disassembly listings, the API does the same thing under the hood:
C-like:
0: kd> u GetTickCount
kernel32!GetTickCount:
00000000`7730ef40 ff25f2e90700 jmp qword [kernel32!_imp_GetTickCount (00000000`7738d938)]
00000000`7730ef46 90 nop
0: kd> dps 00000000`7738d938 L5
00000000`7738d938 000007fe`fd101120 KERNELBASE!GetTickCount
00000000`7738d940 000007fe`fd108350 KERNELBASE!GlobalMemoryStatusEx
00000000`7738d948 000007fe`fd113730 KERNELBASE!GetVersion
00000000`7738d950 000007fe`fd13b560 KERNELBASE!GetWindowsDirectoryA
00000000`7738d958 00000000`00000000
;//--------------------------------- x64 ----------------------------
0: kd> u KERNELBASE!GetTickCount
KERNELBASE!GetTickCount:
000007fe`fd101120 8b0c250400fe7f mov ecx,dword [SharedUserData+0x004 (00000000`7ffe0004)]
000007fe`fd101127 488b04252003fe7f mov rax,qword [SharedUserData+0x320 (00000000`7ffe0320)]
000007fe`fd10112f 480fafc1 imul rax,rcx <------ multiply the two fields
000007fe`fd101133 48c1e818 shr rax,18h <------ drop the 24 fractional bits, leaving milliseconds
000007fe`fd101137 c3 ret
;//--------------------------------- x32 ----------------------------
kd> uf GetTickCount
kernel32!GetTickCount:
7c80934a ba0000fe7f mov edx,offset SharedUserData (7ffe0000)
7c80934f 8b02 mov eax,dword ptr [edx]
7c809351 f76204 mul eax,dword ptr [edx+4]
7c809354 0facd018 shrd eax,edx,18h
7c809358 c3 ret
kd>
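The arithmetic of the x64 listing above fits into one Python function. This is a sketch that mirrors the imul/shr pair rather than reading real memory: the two field values are passed in as plain integers, and the function name is made up for the example.

```python
MASK64 = (1 << 64) - 1


def get_tick_count_ms(tick_count_quad, tick_count_multiplier):
    """Reproduce KERNELBASE!GetTickCount from two KUSER_SHARED_DATA fields."""
    # imul rax,rcx : 64-bit product of TickCountQuad and TickCountMultiplier
    product = (tick_count_quad * tick_count_multiplier) & MASK64
    # shr rax,18h  : the multiplier is fixed-point with 24 fractional bits;
    # the caller sees a DWORD, so only the low 32 bits survive
    return (product >> 24) & 0xFFFFFFFF
```

With the values from the dt dump above (TickCountQuad = 0x1e2423, TickCountMultiplier = 0xf99a027) the multiplier works out to roughly 15.6 ms per tick, so the result is a little under sixteen times the raw tick count.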
2.2. System registry
The Windows registry is a treasure trove of information. Here are the most interesting keys and what they hold:
• HKLM\HARDWARE\Description\System - the BIOS version, which differs between a real and a virtual machine
• HKLM\SYSTEM\ControlSet001\Services - services, among which there are many artifacts pointing to a VM or sandbox
• HKLM\SYSTEM\CurrentControlSet\Enum\IDE - hard drive and CD-ROM names
2.3. Files in the Windows folder
Both a sandbox and a VM inevitably litter the Windows and Program Files system folders during installation. So you can check their contents, where we will find the libraries and drivers of the virtual machine hypervisor. On the host, VirtualBox creates an Oracle directory in the programs folder, but since we run the malware inside the VM rather than on the host, we need to check for DLLs in the System32 folder of the virtual environment. There is a decent zoo there, and the names of all the libraries begin with the VBox prefix (VBox***.dll). The search is easy to organize with the FindFirstFile()/FindNextFile() functions, which can look for files by a mask with an asterisk (*).
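The mask search can be illustrated without touching a real System32 folder. The Python sketch below filters an in-memory list of file names with the same VBox*.dll wildcard; the function name and the sample file names used with it are assumptions for the example, and the comparison is made case-insensitive to mimic Windows file name matching.

```python
import fnmatch


def find_vbox_dlls(file_names, mask="VBox*.dll"):
    """Return the names that match the wildcard mask, ignoring case."""
    pattern = mask.lower()
    return [name for name in file_names
            if fnmatch.fnmatchcase(name.lower(), pattern)]
```

On a real system the input list could come from `os.listdir()` on the System32 directory of the guest; an empty result simply means no such libraries were found there.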
2.4. Active processes and modules
The most common sandbox, Sandboxie, has a distinctive feature: all its files carry the prefix Sbie (Sbie_xxx). The software core consists of three files: the SbieDrv.sys driver, the SbieSvc.exe service, and the SbieDll.dll inject library, which is injected into every process launched from the sandbox. Thus, to catch a sandboxed launch by the tail with 99% probability, it is enough to list all the DLL modules loaded into our carcass and check whether SbieDll.dll is among them.
As a control shot, you can use another trick. For complete isolation of the execution environment, sandboxes create a separate desktop inside the current window station. You can find out the name of the Sandboxie desktop with Process Hacker, or with Process Explorer by M. Russinovich from the Sysinternals suite. There it is called "Default", while on a real clean machine the system uses the window station name "WinSta0". This fact can be used as evidence.
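Once the module list is in hand, the check is a simple name filter. Here is a minimal Python sketch, assuming the full module paths have already been enumerated by some other means; the function name and the sample paths used with it are hypothetical.

```python
import ntpath


def sandboxie_modules(module_paths):
    """Return the file names of loaded modules that carry the Sbie prefix."""
    hits = []
    for path in module_paths:
        name = ntpath.basename(path)  # handles Windows paths on any platform
        if name.lower().startswith("sbie"):
            hits.append(name)
    return hits
```

A non-empty result is the giveaway described above: an ordinary process has no reason to carry a library with this prefix.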
3. Practical part
In the following example, I have collected all of the above under one roof.
To run on both 32- and 64-bit Windows, the code was deliberately written as 32-bit. Since it has hints of malware about it, the antivirus engines on VirusTotal will clearly curse at us, although there really are no malicious actions in it beyond the usual scan of the ground underfoot. But try explaining that to a dumb antivirus... oh well. In general, here are the main points:
1. We read the CPU name string with the CPUID instruction. This takes a loop from EAX=0x80000002 to EAX=0x80000004. At every step CPUID returns 16 characters of the string in the four general-purpose registers EAX, EBX, ECX, EDX. We simply store them in the buffer and print it to the console.
2. The easiest way to find out the number of processor cores is to read the "NumberOfProcessors" field at offset 0x64 of the PEB structure.
The actual (not vendor-declared) processor frequency at the current moment can be calculated by reading the CPU tick counter with the RDTSC instruction, pausing for 1 second with Sleep(1000), and reading the counter again. The difference between the second and the first reading is the frequency in Hz.
3. The amount of installed physical RAM is returned by the GlobalMemoryStatusEx() API; nothing unusual here.
4. As mentioned above, the virtual machine hooks the timer API, so by sleeping for 1 second via Sleep(1000) and timing the pause with GetTickCount(), you can expose it.
5. In some cases it helps to read the machine name with GetComputerName(): if we get something absurd in response, we draw conclusions.
6. The string with the graphics adapter type is returned by EnumDisplayDevices(), and the display resolution by GetSystemMetrics().
7. Using RegOpenKeyEx() + RegQueryValueEx(), we look into the Windows registry to get the BIOS version.
8. Using functions from the Setupapi.dll library, such as SetupDiEnumDeviceInfo(), you can get the name strings of the hard drive and CD-ROM drive, which give the VM away with all its giblets.
9. We open the disk via CreateFile() and then use IOCTL_DISK_GET_DRIVE_GEOMETRY to find out the disk capacity in gigabytes.
10. The MAC address of the network card is returned by GetAdaptersInfo() from the Iphlpapi.dll library.
11. FindFirstFile()/FindNextFile() search for files by the VBox*.dll mask; all that remains is to print them to the console.
12. The Dbghelp.dll library has the EnumerateLoadedModules() function, which is just what the doctor ordered for listing all the modules in our process.
13. And finally, EnumDesktops() returns to its callback the names of all desktops in the current station.
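Steps 1 and 9 of the list above boil down to simple data shuffling, which is easy to show outside assembly. The Python sketch below assumes the register values and the geometry fields have already been read; both function names are hypothetical. The first packs three CPUID results into the 48-byte name string, and the second multiplies out the DISK_GEOMETRY fields.

```python
import struct


def brand_string(leaves):
    """Build the CPU name from CPUID leaves 0x80000002..0x80000004.

    `leaves` is a sequence of three (EAX, EBX, ECX, EDX) tuples; each
    register contributes four ASCII characters in little-endian order.
    """
    raw = b"".join(struct.pack("<4I", *regs) for regs in leaves)
    return raw.split(b"\0", 1)[0].decode("ascii", "replace").strip()


def disk_size_gib(cylinders, tracks_per_cylinder, sectors_per_track, bytes_per_sector):
    """Disk capacity in GiB from the four DISK_GEOMETRY fields."""
    total = cylinders * tracks_per_cylinder * sectors_per_track * bytes_per_sector
    return total / (1024 ** 3)
```

For instance, a geometry of 1305 cylinders, 255 tracks per cylinder, 63 sectors per track and 512 bytes per sector comes out just under 10 GiB, which is the kind of small disk the article associates with a default VM.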
C-like:
format pe console
include 'win32ax.inc'
include 'equates\iphlpapi.inc'
include 'equates\setupapi.inc'
entry start
;//-----------
section '.data' data readable writeable
ai IP_ADAPTER_INFO
devInfo SP_DEVINFO_DATA
memStat MEMORYSTATUS_EX
dispDev DISPLAY_DEVICEA
align 16
struct DISK_GEOMETRY
totalCyl dd 0,0 ; - Cylinders (64-bit value)
mediaType dd 0 ; - Media type
trackCyl dd 0 ; - Tracks per cylinder
secTrack dd 0 ; - Sectors per track
byteSec dd 0 ; - Bytes per sector
ends
diskGeo DISK_GEOMETRY
IOCTL_DISK_GET_DRIVE_GEOMETRY = 0x00070000
sizePointer dd 0
hndl dd 0
type dd 0
mByte dd 1024*1024
gByte dd 1024*1024*1024
result dq 0
buff db 512 dup (0) ; scratch buffer (WIN32_FIND_DATA alone needs 320 bytes)
;//-----------
section '.text' code readable executable
start: invoke SetConsoleTitle,<'*** Virtual Machine & Sandbox Detect ***',0>
;//----- CPU info ----------------------------------------
mov edi,buff
mov ecx,3
mov eax,0x80000002
@@: push eax ecx
cpuid
stosd
xchg eax,ebx
stosd
xchg eax,ecx
stosd
xchg eax,edx
stosd
pop ecx eax
inc eax
loop @b
cinvoke printf,<10,' CPU name string.....: %s',0>,buff
mov eax,[fs:30h]
mov eax,[eax+64h]
cinvoke printf,<10,' CPU cores...........: %d',0>,eax
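rdtsc ; EDX:EAX = time-stamp counter, only the low dword is kept
push eax
invoke Sleep,1000
rdtsc
pop ebx
sub eax,ebx ; ticks elapsed in about 1 sec
shr eax,20 ; divide by 2^20: rough conversion from Hz to MHz
cinvoke printf,<10,' CPU frequency.......: %d.0 MHz',0>,eax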
;//----- Memory size -------------------------------------
invoke GlobalMemoryStatusEx,memStat
push dword[memStat.dqTotalPhys+4]
push dword[memStat.dqTotalPhys]
fild qword[esp]
fidiv [mByte]
fstp [result]
add esp,8
cinvoke printf,<10,' Physical memory.....: %.1f Mb',0>,\
dword[result],dword[result+4]
;//----- Timer test --------------------------------------
cinvoke printf,<10,' Check 1 sec delay...: ',0>
invoke GetTickCount
push eax
invoke Sleep,1000
invoke GetTickCount
pop ebx
sub eax,ebx
cinvoke printf,<'%4u.0 msec',0>,eax
;//----- NetBIOS computer name ---------------------------
mov [sizePointer],128
invoke GetComputerName,buff,sizePointer
invoke CharToOem,buff,buff
cinvoke printf,<10,10,' Computer name.......: %s',0>,buff
;//----- Graphics adapter --------------------------------
invoke EnumDisplayDevices,0,0,dispDev,1
lea ebx,[dispDev.DeviceString]
cinvoke printf,<10,' Display device......: %s',0>,ebx
invoke GetSystemMetrics,SM_CXSCREEN
push eax
invoke GetSystemMetrics,SM_CYSCREEN
pop ebx
cinvoke printf,<'. %d x %d',0>,ebx,eax
;//----- BIOS version from the registry ------------------
call ClearBuff
invoke RegOpenKeyEx,HKEY_LOCAL_MACHINE,<'Hardware\Description\System',0>,0,KEY_READ,hndl ; read-only: KEY_ALL_ACCESS on HKLM fails without elevation
mov [sizePointer],64
invoke RegQueryValueEx,[hndl],<'SystemBiosVersion',0>,0,type,buff,sizePointer
cinvoke printf,<10,' Bios Version........: %s',10,0>,buff
invoke RegCloseKey,[hndl]
;//----- Hard drive and CD names via SetupApi.dll --------
call ClearBuff
invoke SetupDiGetClassDevs,GUID_DEVCLASS_CDROM,0,0,DIGCF_PROFILE
push eax
invoke SetupDiEnumDeviceInfo,eax,0,devInfo
pop eax
invoke SetupDiGetDeviceRegistryProperty,eax,devInfo,SPDRP_FRIENDLYNAME,0,buff,128,0
cinvoke printf,<10,' CD/DVDROM name.....: %s',0>,buff
call ClearBuff
invoke SetupDiGetClassDevs,GUID_DEVCLASS_DISKDRIVE,0,0,DIGCF_PRESENT
push eax
invoke SetupDiEnumDeviceInfo,eax,0,devInfo
pop eax
invoke SetupDiGetDeviceRegistryProperty,eax,devInfo,SPDRP_FRIENDLYNAME,0,buff,128,0
cinvoke printf,<10,' Hard drive name.....: %s',0>,buff
;//----- Disk size ---------------------------------------
call ClearBuff
invoke CreateFile,<'\\.\PhysicalDrive0',0>,0,FILE_SHARE_READ + FILE_SHARE_WRITE,\
0,OPEN_EXISTING,0,0
push eax
invoke DeviceIoControl,eax,IOCTL_DISK_GET_DRIVE_GEOMETRY,\
0,0,diskGeo,sizeof.DISK_GEOMETRY,type,0
pop eax
invoke CloseHandle,eax
mov eax,[diskGeo.byteSec]
imul eax,[diskGeo.secTrack]
imul eax,[diskGeo.trackCyl]
mov [type],eax
fild qword[diskGeo.totalCyl]
fimul [type]
fidiv [gByte]
fstp [result]
cinvoke printf, <10,' Hard drive size.....: %.1f Gb',10,0>,dword[result],dword[result+4]
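;//----- NIC MAC address ---------------------------------
invoke GetAdaptersInfo,ai,sizePointer ; 1st call: the size is too small, so only the required size is returned
invoke GetAdaptersInfo,ai,sizePointer ; 2nd call: fills in the adapter list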
movzx eax,[ai.Address+0]
movzx ebx,[ai.Address+1]
movzx ecx,[ai.Address+2]
movzx edx,[ai.Address+3]
movzx esi,[ai.Address+4]
movzx edi,[ai.Address+5]
cinvoke printf,<10,' NIC MAC address.....: %02x:%02x:%02x:%02x:%02x:%02x',\
10,' --------',\
10,' 08:00:27: <----- VBox v5.2',\
10,' 00:21:f6: v3.3',\
10,' 52:54:00: Vagrant or QEMU',\
10,' --------',\
10,' 00:50:56: <----- VMware Workstation',\
10,' 00:0c:29: ESXi Host ',\
10,' 00:05:69: ESXi,GSX',\
10,' 00:1c:14: VMware',10,0>,\
eax,ebx,ecx,edx,esi,edi
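;//----- Libraries in Windows\system32 -------------------
call ClearBuff
invoke FindFirstFile,<'C:\Windows\System32\VBox*.dll',0>,buff
mov [hndl],eax
cinvoke printf,<10,' VBox \System32 dll..: %s',0>,buff+44 ; cFileName is at offset 44 of WIN32_FIND_DATA
cmp [hndl],-1 ; FindFirstFile returns INVALID_HANDLE_VALUE (-1) on failure, not 0
je @nofiles
@next: invoke FindNextFile,[hndl],buff
or eax,eax
jz @f
cinvoke printf,<10,23 dup(' '),'%s',0>,buff+44
jmp @next
@@: invoke FindClose,[hndl] ; search handles are closed with FindClose, not CloseHandle
@nofiles: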
;//----- Sandbox: libraries loaded into the process ------
cinvoke printf,<10,10,' Sandboxie dll test..: ',0>
invoke GetCurrentProcess
invoke EnumerateLoadedModules,eax,EnumCallback,0
;//----- Enumerate desktops ------------------------------
cinvoke printf,<10,' Sandboxie Desktop...: ',0>
invoke EnumDesktops,0,DesktopCallback,0
@exit: cinvoke _getch
cinvoke exit, 0
;//********************************************************
proc ClearBuff
mov edi,buff
mov ecx,128/4
xor eax,eax
rep stosd
ret
endp
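proc EnumCallback uses esi, Name,Base,Size,Context ; esi must be preserved for the caller
mov esi,[Name]
@@: lodsb ; scan to the terminating zero
or al,al
jnz @b
sub esi,12 ; step back over 'SbieDll.dll',0 (12 bytes)
lodsd
cmp eax,'Sbie'
jnz @01
sub esi,4
cinvoke printf,<'%s <--- Found!',0>,esi
@01: mov eax,1 ; TRUE = continue enumeration
ret
endp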
proc DesktopCallback Desktop,lParam
push esi
mov esi,[Desktop]
cmp dword[esi],'Defa'
jnz @02
cinvoke printf,<'%s <------- Found! ',0>,[Desktop]
@02: pop esi
mov eax,1
ret
endp
;//********************************************************
section '.idata' import data readable writeable
library msvcrt,'msvcrt.dll',kernel32,'kernel32.dll',\
user32,'user32.dll',iphlpapi,'iphlpapi.dll',\
advapi32,'advapi32.dll',setupapi,'setupapi.dll',dbghelp,'dbghelp.dll'
import dbghelp, EnumerateLoadedModules,'EnumerateLoadedModules'
include 'api\msvcrt.inc'
include 'api\kernel32.inc'
include 'api\user32.inc'
include 'api\iphlpapi.inc'
include 'api\setupapi.inc'
include 'api\advapi32.inc'
Now let's look at the herd of shot hares.
This is what the code returned on my real machine with Win7 on board.
Note the value of the 1 sec delay, which in this case equals 998 ms (timer error), although I requested exactly 1000. Otherwise nothing special, just the actual hardware of my test machine.
And here is the log of the same code launched in the Sandboxie sandbox.
In principle, everything is the same, apart from the foreign library SbieDll.dll, injected into my process without asking, and the "Default" desktop. If it were not for these nuances, detecting the sandbox would be very problematic.
But under VirtualBox you don't need to be a fortune teller: artifacts pop up like mushrooms after rain.
Firstly, no matter how many times I run this code, the Sleep(1000) test at the top of the log always reports exactly the value I passed, 1000 ms, which certainly raises suspicion. And then everything gives the VM away: graphics, BIOS, disks, the physical network address, and the DLLs in the system folder. It seems that the developers did not set themselves the task of hiding anything at all, although it would not have been that difficult to implement. On the other hand, virtual machines have a completely different purpose, which they cope with perfectly.
4. Conclusions
In closing, I would like to say that if malware is so afraid of virtual machines and sandboxes, it can be deceived by creating a fake test environment. For example, write some harmless library, name it SbieDll.dll, and have it loaded at startup, or drop a bunch of text files into the system folder, renamed to VBox*.dll after the list above, and so on in the same spirit. Then, after running its checks, the malicious code may take the bait and voluntarily refuse to infect the system, believing it has landed in a virtual debugging environment. Although not a 100% guarantee, this will further strengthen the system's defenses.
I have attached an executable file for testing. I ran the experiments on a VirtualBox virtual machine and was too lazy to install VMware just for this purpose. So if anyone has VMware with any Windows from XP to 10 of any bitness installed on it, please show how the machine reacts to this code. I am particularly interested in the names of hardware devices, including disks, the MAC address, and everything else. I will be very grateful. Good luck to everyone, bye!