Malware Analysis Series (MAS): Part 3
Code Injection Review
Code injection is a supported operation on Windows systems and, of course, it is a quite useful evasion method: malware can inject (write) malicious code into a memory region (some people use the term “segment”) of its own process (self-injection) or of a remote one (remote injection), and this payload is executed in the target context as if it were part of it, without leaving much evidence. Furthermore, the source process (the malware) can cleanly terminate itself while the malicious payload keeps running inside a supposedly benign process (for example, explorer.exe or svchost.exe). In the end, it’s a stealthy approach for evading security defenses.
It’s quite interesting to note that a long list of mitigations and protections such as Code Integrity Guard, Extension Point Disable Policy, Control Flow Guard, Dynamic Code Restriction and Arbitrary Code Guard (a kind of update of Dynamic Code Restriction) has existed since Windows 8.1 (mainly on Windows 10 and 11), so it isn’t so easy to perform code injection on these Windows versions without being detected and prevented. Further information about these mitigations and protections can be found at: https://docs.microsoft.com/en-us/windows/security/threat-protection/overview-of-threat-mitigations-in-windows-10 , https://docs.microsoft.com/en-us/microsoft-365/security/defender-endpoint/customize-exploit-protection and https://techcommunity.microsoft.com/t5/core-infrastructure-and-security/windows-10-memory-protection-features/ba-p/259046
There are excellent public documents explaining several code injection techniques but, in summary, the main ones are the following:
- DLL Injection: this old technique is used to force a process to load a DLL. Main potentially involved APIs: OpenProcess( ), VirtualAllocEx( ), WriteProcessMemory( ) and CreateRemoteThread( ) | NtCreateThread( ) | RtlCreateUserThread( ).
- PE Injection: in this technique malicious code is written into a remote process, or even into the malware’s own process (self-injection), and then forced to execute there. Main related APIs: OpenThread( ), SuspendThread( ), VirtualAllocEx( ), WriteProcessMemory( ), SetThreadContext( ) and ResumeThread( ) | NtResumeThread( ).
- Reflective Injection: this technique is similar to PE Injection, but the malicious code avoids using LoadLibrary( ) and CreateRemoteThread( ), for example. There are many interesting derivations of this method and one of them (also used by Cobalt Strike) relies on the following APIs: CreateFileMapping( ), MapViewOfFile( ), OpenProcess( ), memcpy( ) and NtMapViewOfSection( ). In the end, the code in the remote process can be executed by calling OpenProcess( ), CreateThread( ), NtQueueApcThread( ), CreateRemoteThread( ) or RtlCreateUserThread( ). It’s interesting to note that a variant could use VirtualQueryEx( ) and ReadProcessMemory( ) too.
- APC Injection: this code injection technique allows a program to execute code in a specific thread by queuing it to that thread’s APC queue. The injected code is executed when the thread enters an alertable state (caused by calls such as SleepEx( ), SignalObjectAndWait( ), MsgWaitForMultipleObjectsEx( ), WaitForMultipleObjectsEx( ), or WaitForSingleObjectEx( )). Therefore, it’s common to also see APIs such as CreateToolhelp32Snapshot( ), Process32First( ), Process32Next( ), Thread32First( ), Thread32Next( ), QueueUserAPC( ) and KeInitializeApc( ) (the latter in kernel mode) involved in this technique.
- Hollowing or Process Replacement: this technique, in a nutshell, is used by malware to “drain out” the entire content of a process and insert malicious content into it. Some involved APIs are CreateProcess( ), NtQueryInformationProcess( ), GetModuleHandle( ), Zw/NtUnmapViewOfSection( ), VirtualAllocEx( ), WriteProcessMemory( ), GetThreadContext( ), SetThreadContext( ) and ResumeThread( ).
- AtomBombing: this technique is a variant of APC Injection and works by splitting the malicious payload into separate strings, creating an atom for each string, copying them into an RW segment of the target (using GlobalGetAtomName( ) and NtQueueApcThread( )) and setting the context by using NtSetContextThread( ). Therefore, further APIs involved are OpenThread( ), GlobalAddAtom( ), GlobalGetAtomName( ) and QueueUserAPC( ).
- Process Doppelgänging: this technique can be regarded as a kind of evolution of Process Hollowing. The key difference between the two techniques is that while Process Hollowing replaces the process’s content (image) before the process is resumed, Process Doppelgänging replaces the image before the process is even created, by overwriting the target image with a malicious one before it is loaded. The key concept here is that NTFS operations can be performed within transactions, so either all operations inside a transaction are committed together or none of them is. Meanwhile, the malicious image exists and is visible only inside the transaction, not to any other process. Therefore, the malicious image is loaded into memory and the malware then removes the malicious payload from the file system (by rolling back the transaction), as if the file had never existed. Some APIs involved in this technique: CreateTransaction( ), CreateFileTransacted( ), NtCreateSection( ), NtCreateProcessEx( ), NtQueryInformationProcess( ), NtCreateThreadEx( ) and RollbackTransaction( ).
- Process Herpaderping: this technique is similar to Process Doppelgänging, but there’s a subtle difference in its procedure. Process Herpaderping is based on the fact that security defenses usually monitor process creation by registering a callback routine on the kernel side using PsSetCreateProcessNotifyRoutineEx( ), or inspect files during a driver’s cleanup dispatch routine (IRP_MJ_CLEANUP), and that this callback is invoked only when the first thread is created, not when the process object is created. That’s the key issue: if an adversary creates and maps a process, modifies the file image afterwards and only then creates the thread, security products inspect the modified file rather than the content that was actually mapped. In practice, the checking order can be compromised if the adversary is able to create the malicious binary on disk, open a handle to it, map it as an image section using NtCreateSection( ) (including the SEC_IMAGE flag), create a process using the section handle (NtCreateProcessEx( )), modify the file content so that it no longer looks malicious and create a thread (NtCreateThreadEx( )) while this “good image” is on disk. That’s the point: when the thread is created, the process callback is triggered and the content of the file on disk (the good one) is checked, so security defenses believe that everything is fine because the image on disk is not harmful, while the truly malicious one is in memory. In other words, security defenses may fail to detect that the image on disk differs from the image in memory. A few APIs used in this technique: CreateFile( ), NtCreateSection( ), NtCreateProcessEx( ) and NtCreateThreadEx( ).
- Hooking Injection: in this technique, functions related to hooking activities such as SetWindowsHookEx( ) and PostThreadMessage( ) are used to inject a malicious DLL.
- Extra Window Memory Injection: using this technique, malware injects code into a process by using the Extra Window Memory (also known as EWM), whose size is up to 40 bytes and which is appended to the instance of a class during the registration of window classes. The trick is that the appended space is enough to store a pointer that might forward the execution to malicious code. Some possible APIs involved in this technique are FindWindowA( ), GetWindowThreadProcessId( ), OpenProcess( ), VirtualAllocEx( ), WriteProcessMemory( ), SetWindowLongPtrA( ) and SendNotifyMessage( ).
- Propagate Injection: this technique has been used by malware threats such as RIG Exploit Kit and Smoke Loader to inject malicious code into the explorer.exe process (medium integrity level) and other persistent ones. It’s based on enumerating (EnumWindows( ) → EnumWindowsProc → EnumChildWindows( ) → EnumChildWindowsProc → EnumProps( ) → EnumPropsProc → GetProp( )) windows that use SetWindowSubclass( ) (further information at https://docs.microsoft.com/en-us/windows/win32/api/commctrl/nf-commctrl-setwindowsubclass). As you may remember, this function installs a window subclass callback and, as you know, callbacks are treated as hooking methods in the security world. How does it work? Once subclassed windows are found (by checking the UxSubclassInfo and/or CC32SubclassInfo properties, which provide the subclass header), it’s possible to preserve the old window procedure while also assigning a new one to the window by updating the CallArray field. When an event is sent to the target process, the new procedure is called and, afterwards, the old one is also called (keeping the previous and expected behavior). Therefore, the malware inserts a malicious payload (shellcode) into the memory and updates the subclass procedure using SetPropA( ). When this new property is invoked (through a window message), the execution is forwarded to the payload. Some Windows APIs involved in this technique are FindWindow( ), FindWindowEx( ), GetProp( ), GetWindowThreadProcessId( ), OpenProcess( ), ReadProcessMemory( ), VirtualAllocEx( ), WriteProcessMemory( ), SetProp( ) and PostMessage( ).
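From an analyst’s point of view, the API lists above can be used as rough triage hints. The following Python sketch scores a list of imported (or dynamically resolved) API names against a few of these lists. The `TECHNIQUE_APIS` table, the function names and the 0.5 threshold are assumptions made only for illustration, and the table is deliberately incomplete:

```python
# Illustrative API "signatures" taken from the lists in this section.
# They are incomplete, hypothetical triage hints, not detection rules.
TECHNIQUE_APIS = {
    "DLL Injection": {"OpenProcess", "VirtualAllocEx", "WriteProcessMemory",
                      "CreateRemoteThread"},
    "PE Injection": {"OpenThread", "SuspendThread", "VirtualAllocEx",
                     "WriteProcessMemory", "SetThreadContext", "ResumeThread"},
    "APC Injection": {"CreateToolhelp32Snapshot", "Thread32First",
                      "Thread32Next", "QueueUserAPC"},
    "Process Hollowing": {"CreateProcess", "NtUnmapViewOfSection",
                          "VirtualAllocEx", "WriteProcessMemory",
                          "GetThreadContext", "SetThreadContext",
                          "ResumeThread"},
    "AtomBombing": {"OpenThread", "GlobalAddAtom", "GlobalGetAtomName",
                    "QueueUserAPC"},
    "Process Herpaderping": {"CreateFile", "NtCreateSection",
                             "NtCreateProcessEx", "NtCreateThreadEx"},
}


def normalize(api_name: str) -> str:
    """Map Zw* to Nt* and drop a trailing ANSI/Unicode suffix (A/W)."""
    name = api_name.strip()
    if name.startswith("Zw"):
        name = "Nt" + name[2:]
    # Strip 'A'/'W' only when it follows a lowercase letter (CreateProcessW),
    # so names such as QueueUserAPC are left untouched.
    if len(name) > 2 and name[-1] in "AW" and name[-2].islower():
        name = name[:-1]
    return name


def rank_techniques(imported_apis, min_score=0.5):
    """Return (technique, score) pairs, best match first."""
    seen = {normalize(api) for api in imported_apis}
    ranked = []
    for technique, signature in TECHNIQUE_APIS.items():
        score = len(signature & seen) / len(signature)
        if score >= min_score:
            ranked.append((technique, round(score, 2)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    sample = ["CreateProcessW", "ZwUnmapViewOfSection", "VirtualAllocEx",
              "WriteProcessMemory", "GetThreadContext", "SetThreadContext",
              "ResumeThread"]
    for technique, score in rank_techniques(sample):
        print(f"{technique}: {score}")
```

Several techniques share the same APIs, so more than one candidate is usually returned, and a sample that resolves its functions dynamically may show none of these names in its import table. The result is only a starting point for manual analysis.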
This short and quick review of code injection techniques will be useful to understand how malware tries to remain undetected, and it will also indirectly help you to understand unpacking techniques.
A quite usual example of a code injection sequence from malware threats is shown below (a decompiled output from IDA Pro) and, certainly, you’ll be able to identify the technique used through information presented previously in this section:
Unpacking Methods
It’s quite complicated to classify and, mainly, describe unpacking techniques but, in general, there are a few methods to unpack a malware sample, such as using a debugger, an automated tool or a web service, or even writing your own unpacking code to accomplish the task statically. The chosen method depends on the specific context and situation.
a. Debugger + breakpoint on specific functions
This is the best-known method and consists of loading the malware into a debugger, setting up software breakpoints on well-known APIs, most of which are related to memory management and manipulation, and looking for executables and/or shellcode to be extracted from memory. Using x64dbg/x32dbg ([ctrl]+g or bp on its CLI), it’s really simple to insert software breakpoints on the following APIs:
- CreateProcessInternalW( )
- VirtualAlloc( )
- VirtualAllocEx( )
- VirtualProtect( ) | ZwProtectVirtualMemory( )
- WriteProcessMemory( ) | NtWriteProcessMemory( )
- ResumeThread( ) | NtResumeThread( )
- CryptDecrypt( ) | RtlDecompressBuffer( )
- NtCreateSection( ) + MapViewOfSection( ) | ZwMapViewOfSection( )
- UnmapViewOfSection( ) | ZwUnmapViewOfSection( )
- NtWriteVirtualMemory( )
- NtReadVirtualMemory( )
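Typing these breakpoints one by one becomes tedious, so a small helper can generate the corresponding `bp` lines for the x64dbg command bar or a script file. This is a minimal sketch: the list mirrors the APIs above using their Nt* names (an assumption, since the Zw* and Nt* exports of ntdll.dll refer to the same routines), and whether each symbol resolves depends on the corresponding module already being loaded in the debuggee:

```python
# APIs from the list above; Nt* names are used for the native calls.
BREAKPOINT_APIS = [
    "CreateProcessInternalW",
    "VirtualAlloc",
    "VirtualAllocEx",
    "VirtualProtect",
    "NtProtectVirtualMemory",
    "WriteProcessMemory",
    "ResumeThread",
    "NtResumeThread",
    "CryptDecrypt",
    "RtlDecompressBuffer",
    "NtCreateSection",
    "NtMapViewOfSection",
    "NtUnmapViewOfSection",
    "NtWriteVirtualMemory",
    "NtReadVirtualMemory",
]


def build_breakpoint_script(apis):
    """Return one 'bp <symbol>' line per unique API, keeping the order."""
    seen = set()
    lines = []
    for api in apis:
        if api not in seen:
            seen.add(api)
            lines.append(f"bp {api}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the commands so they can be pasted into the debugger.
    print(build_breakpoint_script(BREAKPOINT_APIS), end="")
```

Keeping the list in one place also makes it easy to trim it for a given sample, because breaking on every VirtualAlloc( ) call can be very noisy.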
During the unpacking procedure we might face some issues (for example, anti-debugging techniques being used by the malware) and other side effects. Therefore, some notes before and after unpacking could be useful:
- Set up breakpoints after malware has reached its entry point (after the system breakpoint).
- As mentioned previously, it’s recommended to use an anti-debugging plugin and, in a few cases, to ignore all exceptions in the 0x00000000 to 0xFFFFFFFF range (on x64dbg, go to Options → Preferences → Exceptions to include this range).
- Sometimes ignoring exceptions could be a bad idea because malware could use them to trigger the unpacking procedure. Additionally (and out of the scope of this article) there are threats that use interrupts and exceptions to call APIs.
- Learning about all the listed APIs and their respective arguments by using MSDN is key to unpacking malware threats successfully.
- If you’re breaking on VirtualAlloc( ), it’s recommended to set the breakpoint on its exit point (ret 10), where the address of the allocated region is available in EAX. Additionally, sometimes it is easier to follow the allocated content in the dump by setting a memory write breakpoint.
- In some cases, the malware extracts its payload into memory but destroys the PE header, so you’ll have to reconstruct the entire header, though it’s a simple procedure using a hex editor like HxD.
- The extracted payload might be in mapped or unmapped format. If it’s in mapped format, the import table probably looks broken and you need to fix it by realigning the section headers manually with PE-bear (favorite method) or by using a tool like pe_unmapper. You might also need to fix the base address, and the entry point if it’s zeroed.
- To reconstruct a destroyed IAT it’s recommended to use Scylla (embedded in x64dbg). It will be necessary to enter the OEP, and one of the methods to find it is by looking for code transitions given by instructions such as jmp eax, call eax, call [eax], and so on.
- A few unpacked malware samples don’t have any function in the IAT, so there are two possibilities: either the sections are misaligned (mapped version) or the unpacked malware resolves all its functions dynamically.
- Another good alternative to find the OEP is through code instrumentation with a framework like PIN (https://www.intel.com/content/www/us/en/developer/articles/tool/pin-a-dynamic-binary-instrumentation-tool.html).
- Tools like tiny_tracer (https://github.com/hasherezade/tiny_tracer) use PIN to make instrumentation easier and can be used to learn about the functions being called by the malware (quite useful for unpacking and learning about anti-analysis techniques) and also to find a possible OEP.
- In many cases, the unpacked code could be only the first stage of a malware, so it’s necessary to repeat the steps to unpack the next stages.
- A few malware samples perform self-overwriting, so you might have to set a breakpoint on the .text section to detect the execution of the unpacked binary.
- Depending on the extracted binary (a shellcode, for example), it might not be able to run outside of a specific process context, so it’d be necessary to inject it into a running process (for example, explorer.exe) to perform further analysis.
- How can you check whether the extracted malware might be the final one? There isn’t a definitive answer, but a few indications might be found by looking for network functions from DLLs such as WS2_32.dll (Winsock) and Wininet.dll, plain text strings, crypto functions (mainly if the malware is ransomware), and other evidence. It’s a good approach to open the extracted code in IDA Pro, mainly after having realigned the sections and/or reconstructed the IAT.
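The section realignment mentioned above for payloads dumped in mapped format can be sketched in a few lines of Python. The idea is that, in a mapped dump, the data of each section sits at its virtual address, so the raw fields of every section header are rewritten to match the virtual ones. This is a simplified illustration of that single step (the function name is hypothetical and it ignores file alignment, the image base and the entry point); it isn’t a replacement for PE-bear or pe_unmapper:

```python
import struct


def realign_mapped_dump(data: bytes) -> bytes:
    """Make raw section fields equal to the virtual ones (mapped dump)."""
    buf = bytearray(data)
    if buf[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", buf, 0x3C)
    if buf[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")

    # COFF header fields, relative to the PE signature.
    (num_sections,) = struct.unpack_from("<H", buf, e_lfanew + 6)
    (opt_header_size,) = struct.unpack_from("<H", buf, e_lfanew + 20)
    section_table = e_lfanew + 24 + opt_header_size

    for index in range(num_sections):
        header = section_table + index * 40  # each section header is 40 bytes
        virtual_size, virtual_address = struct.unpack_from("<II", buf, header + 8)
        # SizeOfRawData (offset 16) and PointerToRawData (offset 20).
        struct.pack_into("<II", buf, header + 16, virtual_size, virtual_address)
    return bytes(buf)
```

After this step, opening the result in a PE viewer should show whether the import table is readable again. If it still looks broken, the sample probably resolves its functions dynamically or the header itself was damaged.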
b. Debugger + break on DLL loading
This is an old and simple technique to unpack malware: stop the debugger on each DLL load and examine the memory map for PE files potentially extracted into memory (pay attention: don’t focus only on RWX segments, because many malware samples extract their payload into RW regions and, just before transferring execution to the extracted executable, change the region’s permissions to RWX by using VirtualProtect( )). No doubt it can consume some time, but it remains efficient in many cases. Common debuggers (x64dbg, OllyDbg and Immunity Debugger) have a configuration option to break on each DLL load. On x64dbg, go to Options → Preferences → Events and mark DLL load. On OllyDbg, go to Options → Debugging Options → Events and mark “Break on new module (DLL)”.
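Once a suspicious region has been saved to disk, looking for embedded PE files can be partially automated. The following sketch (a hypothetical helper, not part of any debugger) scans a dump for an MZ signature whose e_lfanew field points to a valid PE signature. It assumes the header is intact, so payloads with a destroyed PE header won’t be found this way:

```python
import struct


def find_pe_offsets(dump: bytes, max_lfanew: int = 0x1000):
    """Return offsets where a plausible PE file starts inside the dump."""
    offsets = []
    pos = dump.find(b"MZ")
    while pos != -1:
        # The DOS header is 0x40 bytes; e_lfanew lives at offset 0x3C.
        if pos + 0x40 <= len(dump):
            (e_lfanew,) = struct.unpack_from("<I", dump, pos + 0x3C)
            pe_pos = pos + e_lfanew
            if 0 < e_lfanew < max_lfanew and dump[pe_pos:pe_pos + 4] == b"PE\0\0":
                offsets.append(pos)
        pos = dump.find(b"MZ", pos + 1)
    return offsets


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        for offset in find_pe_offsets(handle.read()):
            print(hex(offset))
```

The `max_lfanew` bound is an arbitrary sanity limit chosen for this example to reduce false positives from random "MZ" byte pairs.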
c. Automated method
A malware analyst can use tools to automate the unpacking procedure. Aleksandra Doniec (Hasherezade) has provided excellent tools for this purpose:
- hollows-hunter: https://github.com/hasherezade/hollows_hunter/releases
- pe-sieve: https://github.com/hasherezade/pe-sieve/releases
- mal_unpack: https://github.com/hasherezade/mal_unpack/releases
Her tools take a similar approach to each other: you should run the malware in an isolated virtual machine and execute the appropriate command. Some syntax examples that can be used for a quick approach are shown below, though all of these tools contain useful options that are worth checking:
- hollows_hunter.exe /pname <filename> /loop /imp
- mal_unpack.exe /exe <filename> /timeout <timeout: ms>
- pe-sieve64.exe /pid <process ID>
- pe-sieve64.exe /pid <process ID> /dmode 3 /imp 3
The unpacked binaries with some additional information are saved into a directory created by the tool.
d. Process Hacker
Another trivial (and limited) way to extract binaries from memory is through Process Hacker: double-click the running process, go to the “Memory” tab, look for interesting regions/base addresses (RWX), double-click one of them and press the “Save” button. Of course, it’s easier to find the malicious binary/payload in case of self-injection. In case of remote injection you’ll need to reverse the malware to understand which target process is being injected into, or make an “educated guess” and look for the injected code in well-known targets like explorer.exe or svchost.exe, for example. Once again, it’s a limited and simple approach, but sometimes it can save time.
e. Using a public/paid Internet service
You can use an Internet service such as the amazing Unpacme (https://www.unpac.me/#/), which offers an automated unpacking service. There’s a free public plan (10 submissions per month) and there are paid plans that are quite interesting for researchers and companies. Furthermore, it offers an API to interface your customized application with the Unpacme service (https://api.unpac.me/).
f. Writing an unpacker code
Although this approach sounds time consuming, it’s quite usual to write Python code to accomplish unpacking, mainly in shellcode cases or when the malware uses several anti-VM and anti-debugging techniques. In addition, we gain the advantage of automating the unpacking process when handling similar malware cases.
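To give an idea of what such code may look like, here is a minimal static unpacker for a purely hypothetical packing scheme: a PE file encoded with a single-byte XOR key. Real packers are usually far more elaborate (multi-byte keys, compression, custom ciphers), so this sketch shows only the general workflow of guessing a key and validating the decoded result:

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Check the MZ signature and the PE signature pointed to by e_lfanew."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return data[e_lfanew:e_lfanew + 4] == b"PE\0\0"


def unpack_single_byte_xor(blob: bytes):
    """Return (key, decoded) for the first key yielding a PE, else None."""
    for key in range(256):
        # Cheap pre-check on the first two bytes before decoding everything.
        if bytes(b ^ key for b in blob[:2]) != b"MZ":
            continue
        decoded = bytes(b ^ key for b in blob)
        if looks_like_pe(decoded):
            return key, decoded
    return None


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as handle:
        result = unpack_single_byte_xor(handle.read())
    if result is None:
        print("no single-byte XOR key produced a PE file")
    else:
        key, payload = result
        with open(sys.argv[1] + ".unpacked", "wb") as out:
            out.write(payload)
        print(f"key=0x{key:02x}, {len(payload)} bytes written")
```

Once the decoding routine of a malware family is understood, the same structure (decode, validate, save) can be reused for every sample of that family.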
We will meet in the next article. 👋