Advanced Malware-Analysis workflows: De-obfuscation workflows on Windows


This is a draft.

De-obfuscation and Anti Anti-analysis techniques - a necessary skill set on Windows

Why is this necessary knowledge?

I started reverse-engineering on Mac OS X. Apple uses a different binary format, called Mach-O. Back then Apple supported several architectures. I started with PPC. Later I learned x86 and x86-64.
On Mac OS 10.4 (Tiger) we could do almost everything with GDB, otool, otx, class-dump and so on. Reverse engineering software wasn’t really hard on Mac OS X.

On Windows, especially nowadays, reverse engineering is much harder. There are effective obfuscation techniques, and DRM uses them as well. So there are legitimate use cases for obfuscation: software vendors and innovators can protect their work. For a Malware Analyst this means we cannot classify a sample as malicious just because obfuscation or protection techniques make it hard to reverse engineer.

Malware uses obfuscation as well, often commercial grade. Sometimes Malware authors use pirated obfuscation tools, like Themida.

Themida® uses the SecureEngine® protection technology that, when running in the highest priority level, implements never seen before protection techniques to protect applications against advanced software cracking.

But let’s not jump into that right now. If you are interested in Malware Analysis, check out my other blog posts and wiki articles on Malware Analysis, Forensics and Reverse Engineering. If you are new to the topic, consider this one an advanced follow-up article.

This topic is so vast that this article can only scratch the surface and has to remain focused on the workflows.


Workflow - basic steps to de-obfuscate a sample

  1. Test the acquired sample - is it sane? Does it work at all?
  2. I can attempt to detect an obfuscation technique. There are tools for that, which have signatures and methods for this.
  3. Check for shortcuts, like memory forensics. Maybe we can dump a secret or even a cryptor password? Is there any contextual information you are looking for? Hostnames, IPs, sockets… Sometimes you can just use this and skip all the rest. But better safe than sorry.
    a. Keep in mind: Memory forensics is about what the Malware did. Code analysis is about what the Malware could do. Behavior analysis is about what the Malware does.
    b. Remember this and use careful language in the report. You perform the rest of the analysis to gain certainty.
  4. Instrument the executable with a debugger. OllyDBG, IDA Pro, BinNavi, x64dbg… maybe even WinDBG
    a. Breakpoints, single-stepping… hard work
  5. dump the process after unpacking / decoding / decryption
  6. keep it running, rebuild imports table with LordPE or Scylla - result is an unprotected sample
  7. load the file into IDA Pro for example
  8. and if you are like me you start some Behavioral Analysis in the lab and remote-debug the sample in BinNavi. But you can also do it directly on the lab VM of course.

If everything works here, you can perform code level analysis and report what the Malware did or attempted to do, enumerate the capabilities and do your timeline analysis. Find IOCs, define some NIDS and Endpoint Protection rules, and do your job.
Maybe you can search through a network ringbuffer for the presence of that Malware in your company. Lots of options… but first things first.

Detect packers, cryptors, encoders in PE files

I am listing the tools here, because Woodman is dead. I am not aware of another comprehensive tool list.


There was a popular tool called PEiD to detect obfuscation techniques in Windows PEs. It can also help to detect the Original Entry Point (OEP). The tool is no longer actively maintained, but many tools still use its signatures.


AnalyzePE’s pescanner uses PEiD signatures.


You can take a look at this one. It has worked for me, in the lab.

Detect It Easy or DIE

DIE is being developed by horsicq. It can identify more packers and goes one step deeper. It has plugins for HIEW and CFF Explorer. Therefore it’s my first choice these days.

Here’s the GitHub project.

Many programs of the kind (PEID, PE tools) allow to use third-party signatures. Unfortunately, those signatures scan only bytes by the pre-set mask, and it is not possible to specify additional parameters. As the result, false triggering often occur.
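
To make the limitation concrete, here is a minimal sketch of how a byte signature with a wildcard mask is matched. It is not the code of PEiD or DIE. The function names and the signature itself are made up for illustration. Any file that happens to start with the same bytes will match, which is where the false positives come from.

```python
def parse_signature(text):
    """Turn 'AA ?? BB' into a list of ints, with None as the wildcard."""
    return [None if tok == "??" else int(tok, 16) for tok in text.split()]


def matches_at(data, offset, signature):
    """Return True if the masked signature matches data at the given offset."""
    if offset < 0 or offset + len(signature) > len(data):
        return False
    return all(
        want is None or data[offset + i] == want
        for i, want in enumerate(signature)
    )


# Hypothetical signature, for illustration only: PUSHAD, then MOV ESI, imm32.
SIG = parse_signature("60 BE ?? ?? ?? ??")

entry_bytes = bytes.fromhex("60be00104000")  # made-up entry point bytes
print(matches_at(entry_bytes, 0, SIG))
```

The signature can only say "these bytes, at this place". It cannot express conditions like section names or entropy, which is the gap the quote above describes.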

Exeinfo (PE)

Exeinfo might be useful.

Exeinfo PE is a different tool. It also has hints on how to deal with certain obfuscations.


TitanEngine is a framework.

We have designed TitanEngine in such fashion that writing unpackers would mimic analyst’s manual unpacking process. Basic set of libraries, which will later become the framework, had the functionality of the four most common tools used in the unpacking process: debugger, dumper, importer and realigner.

According to the docs it works with MarioPack, CryptoCrackPE, ExeFog, MEW, PackMan, nPack, yC, MEW 5, PeX … UPX, tELock, AHPack, AlexProtector, FSG (simplified), DEB… and integrates into Python.
You can use it with PEiD and Olly. Worth a look, maybe. I haven’t come across most of these packers so far. I see Themida a lot though.

Anti Analysis

Some Malware tries to evade automated detection (which AV companies rely on). These labs have certain characteristics their targets don’t have. Like virtualization, enterprise hardware, the presence of analysis tools, specific system names, IP address ranges…

To detect virtualization Malware can enumerate the graphics adapters and other drivers, or probe for VM back-channel I/O to detect guest utilities.

Other means of Anti Analysis may include the Malware timing its own execution. When you set a breakpoint, the time between routines changes. If the Malware runs too slowly, that can indicate debugging or some other form of instrumentation.
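
The timing idea can be sketched in a few lines. This is a simulation in Python, not code from a real sample: a real sample would typically read a timer such as GetTickCount or the RDTSC instruction. The function names, the budget and the sleep that stands in for a breakpoint are all assumptions.

```python
import time


def looks_instrumented(work, budget_seconds):
    """Run work() and report whether it took longer than the budget."""
    start = time.perf_counter()
    work()
    elapsed = time.perf_counter() - start
    return elapsed > budget_seconds


def quick_routine():
    sum(range(1000))


def paused_routine():
    time.sleep(0.2)  # stands in for an analyst sitting on a breakpoint
    sum(range(1000))


print(looks_instrumented(quick_routine, 5.0))
print(looks_instrumented(paused_routine, 0.1))
```

When you see two timer reads around a routine, followed by a subtraction and a compare, this is the pattern to suspect.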

Some Sandbox environments have default characteristics: certain usernames (Sandbox-User), certain ways to rename an executable (sample-123.exe), a clipboard that is always empty (Dyre checked for that), only a few CPUs, a mouse cursor that never moves, a very small disk, a foreground window that never changes its color (Tinba checked that). If the clipboard is empty and window focus never changes, that means no one is using the system. Or at least not a person.
Another way for Malware to evade an automated sandbox analysis environment is to cause Popups, which have to be clicked. The sandbox environment might not be able to do this automatically.

Detect a Debugger - Anti Debugging

Windows has an API call for this: IsDebuggerPresent. It returns a non-zero value in EAX if a debugger is present.

CALL IsDebuggerPresent
TEST EAX,EAX
JNZ foo.123456

Make it a JZ, flip the logic… Or let’s just NOP it. With Olly you can fill the rest with NOPs to keep size, to prevent the program from becoming misaligned.
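
If you prefer to patch outside of Olly, the same idea fits in a few lines. This is a minimal sketch, assuming you already know the file offset and the length of the instruction you want to neutralize. The bytes and offsets below are made up.

```python
NOP = 0x90


def nop_out(image, offset, length):
    """Return a copy of image with `length` bytes at `offset` replaced by NOPs."""
    if offset < 0 or offset + length > len(image):
        raise ValueError("patch range is outside the image")
    patched = bytearray(image)
    patched[offset:offset + length] = bytes([NOP]) * length
    return bytes(patched)


# Made-up bytes: TEST EAX,EAX (85 C0), a short JNZ (75 10), then RETN (C3).
code = bytes.fromhex("85c07510c3")
patched = nop_out(code, 2, 2)  # overwrite only the two-byte JNZ
print(patched.hex())
```

The patched buffer has exactly the same length as the original, so nothing after the patch moves.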

There might be further integrity checks, which involve hashes. You’d handle these in a similar way.

Afterwards in Olly: Edit → Copy all modifications to executable , Save file… And work on the modified sample.

There are 100s of ways… take a look at OllyExt. It’s very useful to overcome generic anti debugging techniques.

BinDiffing for the patches - check

There are a couple of quick ways to check for the modifications.

  • the Windows command fc
  • on Linux you can use vbindiff and radiff2
  • You can use BinDiff if you have a lot of patches, but since you need to create IDBs it takes a little longer. It’s free now.
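
For a handful of patched bytes a tiny script does the job as well. This is a minimal sketch of a byte-wise comparison, similar in spirit to fc in binary mode. The function name and the sample bytes are made up.

```python
def diff_offsets(original, patched):
    """Return (offset, old_byte, new_byte) for every differing position."""
    if len(original) != len(patched):
        raise ValueError("files differ in size; a patch should keep the size")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(original, patched))
        if a != b
    ]


before = bytes.fromhex("85c07510c3")  # made-up original bytes
after = bytes.fromhex("85c09090c3")   # the same bytes with a NOP'ed jump
for offset, old, new in diff_offsets(before, after):
    print(f"{offset:08X}: {old:02X} -> {new:02X}")
```

For real files you would read both with open(path, "rb").read() and pass the results in.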


Detect a sniffer - Anti Sniffing

Windows has an API call, which a program can use to enumerate the processes. It’s CreateToolhelp32Snapshot. Very intuitive…

Since this returns a list, you’d see a function which handles the data from this call. There will most likely be some strcmp calls, which compare with “wireshark.exe”, for example. I’d note these offsets, and load the Malware into Olly. You will most likely have to reverse engineer a loop in assembly.
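
The logic of that loop, reduced to its core, looks roughly like this. It is a simulation: the process list is passed in as plain strings instead of coming from the Windows API, the blacklist is made up, and the case-insensitive comparison is my assumption.

```python
ANALYSIS_TOOLS = {"wireshark.exe", "procmon.exe"}  # illustrative blacklist


def spotted_tools(process_names, blacklist=ANALYSIS_TOOLS):
    """Return the running process names that appear on the blacklist."""
    wanted = {name.lower() for name in blacklist}
    return sorted(name for name in process_names if name.lower() in wanted)


running = ["explorer.exe", "Wireshark.exe", "svchost.exe"]  # made-up snapshot
print(spotted_tools(running))
```

In the disassembly the blacklist shows up as the string operands of the compare calls, which is why these strings are such a good starting point.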

Hint: if you want to work with a Debugger and IDA Pro on Windows 10 and see the same offsets in both, you need to opt the process out of ASLR.
That means disabling DynamicBase, so that the addresses aren’t randomized in Olly. I use CFF Explorer for that: go to Optional Header → DLL Characteristics and untick DLL can move. This disables ASLR for that binary.
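
The same flag can also be cleared with a script. This is a minimal sketch, assuming a well-formed PE file. The function name is mine. The flag value 0x0040 and the field position (70 bytes into the optional header, for both PE32 and PE32+) come from the PE format. The sketch does not touch the PE checksum.

```python
import struct

IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040
DLL_CHARACTERISTICS_OFFSET = 70  # within the optional header, PE32 and PE32+


def clear_dynamic_base(image):
    """Return a copy of a PE image with the DYNAMIC_BASE flag cleared."""
    if image[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # 4 bytes signature + 20 bytes COFF file header, then the optional header.
    field = e_lfanew + 24 + DLL_CHARACTERISTICS_OFFSET
    (flags,) = struct.unpack_from("<H", image, field)
    patched = bytearray(image)
    struct.pack_into(
        "<H", patched, field, flags & ~IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE
    )
    return bytes(patched)
```

You would read the sample with open(path, "rb").read(), pass it through clear_dynamic_base and write the result to a new file, so the original stays untouched.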

In Olly the “trick” is to press CTRL + g for goto, and to punch in the offset of the string comparisons, for example. You can set a breakpoint in Olly (F2) at the strcmp, and check the stack for the strings, which are going to be compared.

In IDA Pro strings can be x-ref’ed.


Hide code and data

In order to hide code and data, obfuscation techniques can be used. These can be sophisticated, like polymorphic code or non-standard instruction usage.

Disassemblers can use the Linear Sweep method, and Recursive Traversal. To identify code versus data they do some guessing and poke around. Heuristics. This can fail.

In order to make them fail, a Malware can branch into the middle of a multibyte instruction. If there is a jump into a multibyte instruction, the disassembler can get confused. OllyDBG might even crash. But if you execute the program, Olly will realign the code.
In order to spot this you need to pay some attention to the hex.

Or Malware Authors can just add a lot of junk code, possibly dead segments. They can add some junk between pusha and popa. This can also be used for AV evasion, because even today some AVs on the market are still file signature based.

RETN can irritate Olly and other Debuggers

You might see something like that in Olly:

CALL 0xNextOffset

What happens here is that the CALL instruction pushes the return address (the EIP of the next instruction) onto the stack. With a POP, that EIP is written into EBX. The RETN is also a jump: it looks at the top of the stack and jumps there. If the program jumps there, the bytes will be executed as instructions. But Olly will assume that whatever follows the RETN is only data.

String de-obfuscation - basics

  • We can use xorsearch or brutexor.

  • might be useful to identify common encoding patterns.

    /opt/balbuzard/ -l 1 foo.bin > /tmp/foo.txt
    vim /tmp/foo.txt

It’s part of a toolkit for this task.
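
The basic idea behind tools like xorsearch is simple enough to sketch: try every single-byte key and check whether a string you expect shows up. This is a minimal sketch of that idea, not the implementation of any of these tools. The sample data is made up.

```python
def xor_bytes(data, key):
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def find_xor_keys(data, needle):
    """Return every single-byte key under which `needle` appears in `data`."""
    return [key for key in range(1, 256) if needle in xor_bytes(data, key)]


# Made-up sample data, encoded with the key 0x5A.
secret = xor_bytes(b"GET http://example.invalid/gate.php", 0x5A)
for key in find_xor_keys(secret, b"http"):
    print(hex(key), xor_bytes(secret, key))
```

This only covers single-byte keys. Multi-byte keys, rotations and additions need the real tools.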

Other tools to automatically de-obfuscate strings

  • Kahu converter utilities (very powerful). This is not a Malware site as far as I can tell. But at the moment Chrome redirects me to a red warning page, and I need to confirm that I accept the risks. Sometimes Malware authors turn the tables, and blacklist security tools this way.
  • NoMoreXOR - a tool to help guess a file’s 256-byte XOR key by using frequency analysis
  • unXOR, xortool, xorBruteForcer

Dynamic string building

  • (in REMnux). This can locate strange dynamically defined strings, which are concatenated and converted. This works based on a disassembler.

Memory Forensics

You may be able to acquire a memory image to use Memory Forensics.

Unpacking Malware

Trick jumps with SEH chain linking

This is a beautiful example why a Malware Analyst needs to have basic Exploit developer skills.

PECompact is an example of a packer that uses trick jumps. It makes use of Windows SEH (Structured Exception Handling) to trick a program into jumping without using traditional x86 jumps.

We know stack frame based protection, like /GS (the Buffer Security Check, which guards the stack frame with a cookie).

SEH however is frame-based exception handling: each handler is registered in a record on the stack. If you are a developer you know the try - catch exception handling keywords in C++ or Java.

However these SEH records will be added automatically, by the compiler. (The Visual Studio linker option /SAFESEH additionally stores a table of the valid handlers in the image.) The Malware can trick the program into jumping into an exception handler of its choice. With a trick jump.
This way the Malware is riding the SEH chain, into its own routines. Similar to certain exploits, which do this to exploit stack buffer overflows.

The SEH chain is a linked list data structure. The first field of each element is a pointer to the next element in the chain. The second is a pointer to the exception handler, comparable to a pointer to the catch block.
These elements get pushed onto the stack, and therefore appear in reverse order. That’s important to keep in mind when you inspect SEH chains at runtime.

(image by Corelan Team)

You see a design pattern here. Chain of Responsibility. An exception is triggered.
Then there is a cascading chain reaction. It deepens into more generic exception handlers. The first might catch a buffer-length exception, the second an arithmetic cast exception and the final one might be some Windows generic one. Each of these elements in the linked list points to the next element until the list ends at 0xFFFFFFFF.

If an exception happens, there is a lookup to FS:[0]. That’s where the first SEH list element (a struct) resides. FS:[0] is in the TEB (Thread Environment Block).
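
To get a feeling for the data structure, here is a minimal sketch that walks such a chain in a raw dump of the stack. It is a simulation under assumptions: a 32-bit process, records of two little-endian DWORDs (next pointer, then handler), and made-up addresses. The function name is mine.

```python
import struct

END_OF_CHAIN = 0xFFFFFFFF


def walk_seh_chain(memory, base, first_record):
    """Yield (record_address, handler_address) for each record in the chain.

    `memory` is a bytes dump of the stack region, `base` its virtual address.
    """
    seen = set()
    address = first_record
    while address != END_OF_CHAIN:
        offset = address - base
        if address in seen or not 0 <= offset <= len(memory) - 8:
            raise ValueError("SEH chain is corrupt or leaves the dump")
        seen.add(address)
        next_record, handler = struct.unpack_from("<II", memory, offset)
        yield address, handler
        address = next_record


# Made-up stack dump with two records; FS:[0] would hold 0x0012FF10.
BASE = 0x0012FF00
stack = bytearray(0x100)
struct.pack_into("<II", stack, 0x10, 0x0012FF40, 0x00401000)
struct.pack_into("<II", stack, 0x40, END_OF_CHAIN, 0x7C839AC0)

for record, handler in walk_seh_chain(bytes(stack), BASE, 0x0012FF10):
    print(f"record {record:08X} -> handler {handler:08X}")
```

A handler address that points into the packer stub instead of a system DLL is the thing to look for.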

Long story short: if you see something like this, take a closer look.

MOV EAX,0x12345678
PUSH EAX
PUSH DWORD PTR FS:[0]
MOV DWORD PTR FS:[0],ESP
XOR EAX,EAX
MOV DWORD PTR DS:[EAX],ECX

With the MOV of ESP into FS:[0] a new element gets added to the SEH chain. It’s like a catch block injection, at 0x12345678. The two PUSH instructions have laid out the list element structure on the Stack.
Then the MOV DWORD PTR DS:[EAX],ECX triggers an error, because EAX at this point has been zeroed with the self-XOR. So this program first injects a SEH element into the chain, and then triggers an exception.

In Olly you can inspect something like this via View → VEH / SEH chain. This will show the current state of the program’s SEH handlers (catch blocks).

Use OllyDump to incrementally narrow towards OEP

Usually the obfuscation routine inserts a stub, which gets executed before the original code. What we are looking for is getting the Original Entry Point (OEP). We’d ride the SEH chain on a debugger, set a breakpoint on a new SEH chain element and use OllyDump(Ex).
OllyDump for OllyDBG 1 can fix the IAT automatically. OllyDump however usually required me to rebuild the PE header with CFF Explorer.

Many packers store the OEP in EAX before unpacking. Usually if we step through a packer, a JMP EAX, or a CALL EAX marks the end of the stub.
Once this has executed, the Malware is unpacked / decoded in memory. And we can dump it. I usually dump along the PUSH instructions, towards the CALL or JMP. I inspect the strings, and if they are readable, chances are good that I am getting closer, or that I have the OEP.

Next step is to use Scylla or Imports Fixer to rebuild the imports. There also is Universal Import Fixer (UIF). UIF gets used in conjunction with tools like Scylla. It massages the process for tools like Scylla.

Ransomware and VM lab snapshots

If you dynamically reverse engineer a Ransomware it’s important that you keep a snapshot. Because it will lock you out of your lab VM. And I am saying this as a burned child…

Ransomware is a huge problem, and illustrates the business model of Malware. How do you not pay?

Thread Local Storage (TLS) callback functions

TLS (not Transport Layer Security) here is the Windows feature that allows threads to maintain different values for a variable. TLS callbacks will be executed before the entry point. So it’s possible to execute code before the entry point with a TLS callback function.
This is highly annoying if you drop the Ransomware onto Olly, it executes the TLS callback, and the TLS callback function locks you out. :boom:

You can do static analysis on TLS callbacks. pescanner can list TLS callbacks. IDA Pro lets you choose the Entry Point (CTRL + E).

You can also use Olly, and configure it to make the first pause at a TLS callback (if defined). Navigate to Options → Debugging → Start and use the radio button. The default in Olly is WinMain (if location is known). But this is not good if you deal with Ransomware that implements its evil lock-out routines as TLS callback functions.
Then don’t forget to disable ASLR (Dynamic Base), so that the offsets correspond between IDA Pro and Olly.

Hooks and Hollows - Process Hollowing

Malware can do the following:

  1. use CreateProcessA to create a process in a suspended state on Windows. Windows will create the process and pause it.
  2. then call NtUnmapViewOfSection. This will hollow out the memory contents of the other process. A suspended process won’t crash when this happens.
  3. In that hollowed out process memory gets allocated with VirtualAllocEx.
  4. WriteProcessMemory writes code into that process
  5. Process gets resumed with ResumeThread (because the Windows kernel works based on Threads)
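
If you have an API trace from a sandbox or from your own breakpoint logging, you can check for this sequence automatically. This is a minimal sketch, assuming the trace is just a list of API names in call order. The trace below is made up, and real samples may use other variants of these calls.

```python
HOLLOWING_SEQUENCE = [
    "CreateProcessA",
    "NtUnmapViewOfSection",
    "VirtualAllocEx",
    "WriteProcessMemory",
    "ResumeThread",
]


def contains_in_order(trace, sequence=HOLLOWING_SEQUENCE):
    """Return True if every call in `sequence` appears in `trace`, in order."""
    remaining = iter(trace)
    # `call in remaining` consumes the iterator up to the first match,
    # so each later call has to appear after the earlier one.
    return all(call in remaining for call in sequence)


trace = [  # made-up trace with unrelated calls in between
    "LoadLibraryA",
    "CreateProcessA",
    "NtUnmapViewOfSection",
    "VirtualAllocEx",
    "WriteProcessMemory",
    "WriteProcessMemory",
    "SetThreadContext",
    "ResumeThread",
]
print(contains_in_order(trace))
```

A match is only a hint. You still have to confirm that the process was created suspended and what gets written into it.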

The best debugging technique here might be to set a hardware breakpoint on a direct or indirect call of such an API. An indirect call is the CALL EAX style instruction, where the address of the API call has been placed into a register. A hardware breakpoint remains set even if you restart the process in Olly. Note that some weird hypervisors might not allow you to set these…

Malware Analysis of Process Hollowing samples

  • First I’d take a look at the parameters of CreateProcessA. These will be on the Stack. If the creation flag is 4 (CREATE_SUSPENDED), I’d suspect that this is the beginning of the Process Hollowing approach.
  • I’d single-step over to the next API call (F8). Always focused on the calls.
  • And I’d find an NtUnmapViewOfSection. This is the hollowing approach, which carves out the process, which has been created in a suspended state.
  • WriteProcessMemory would write something into that process. There should be a Buffer address, which holds that… In Olly you’d right-click that Buffer argument on the Stack window and hit Follow in Dump
    • if the dump under the disassembly listing shows an MZ somewhere, as an indicator that I am seeing a PE header, I am on the right track. So that executable is written into the hollowed out process here. These dumps can be packed as well.
    • right-click the dump area, Backup → Save data to file …. This allows us to take care of this executable in a targeted fashion. From there you’d have to restart the analysis… from the beginning. You might need to cut out the section from MZ to the end with a hex editor.
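
Cutting out the part from MZ to the end can be scripted instead of done by hand in a hex editor. This is a minimal sketch, assuming the saved buffer contains one embedded PE file. It checks that the MZ header actually points to a PE signature, because the two bytes MZ alone show up by chance. The function name is mine.

```python
import struct


def carve_pe(dump):
    """Return the bytes from the first plausible PE header to the end, or None."""
    start = dump.find(b"MZ")
    while start != -1:
        if start + 0x40 <= len(dump):
            # e_lfanew at offset 0x3C points to the PE signature.
            (e_lfanew,) = struct.unpack_from("<I", dump, start + 0x3C)
            pe = start + e_lfanew
            if dump[pe:pe + 4] == b"PE\0\0":
                return dump[start:]
        start = dump.find(b"MZ", start + 1)
    return None


# Made-up dump: some junk, a stray "MZ", then a minimal fake header.
header = bytearray(0x100)
header[0:2] = b"MZ"
struct.pack_into("<I", header, 0x3C, 0x80)
header[0x80:0x84] = b"PE\0\0"
dump = b"\xcc" * 7 + b"MZ" + b"\xcc" * 0x50 + bytes(header)

carved = carve_pe(dump)
print(len(dump), len(carved))
```

The result still includes everything after the PE file in the buffer, so trailing junk may have to be trimmed before other tools accept it.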

So what we did is: we instrumented a Malware, which used an injection approach called Process Hollowing. We dumped out the target payload, which was supposed to be injected.

Use Olly tracing to determine the exit cause of a self-defending Malware

For tracing it’s usually better to have Olly use Hardware breakpoints. It’s at Options → Debugging → Use HW breakpoints for stepping. Tracing into program crashes is another Exploit developer technique a Malware Analyst needs to know. Olly has a recording function. It will not only log every instruction after you Trace Into it. It will also log the contents of the registers, so that you can see what the conditions were for the jumps.

Another way to do this is using differential debugging, code coverage. You’d see where the program steps out, but not necessarily why. Of course you can perform differential debugging (with BinNavi) and trace into a function at the same time. Usually that’s overkill.

Here’s the workflow:

  1. Check the IAT with LordPE or something. You see something like kernel32:LoadLibraryA gets imported.
  2. When Olly holds at the OEP, set a HW breakpoint on that call, and step into
  3. crash :boom:
  4. Run until the breakpoint, and Trace Into instead of Step Into
  5. crash :boom:
  6. Check the trace log, and you may see that it aggressively crashes itself
  7. work upwards. Why did it do that? Did it detect the breakpoint? Any 0xCC (INT3) in a CMP?

Malware authors know how debuggers work. Olly temporarily patches the program when you set a breakpoint. Try different kinds of breakpoints. Or use a HW breakpoint instead of a soft-breakpoint. But some Malware can look into Debug registers via SEH chain riding, because Windows passes this to an exception handler.
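
The soft-breakpoint check itself is easy to picture. This is a minimal and deliberately naive sketch of what such a self-check does: scan its own code bytes for 0xCC. The sample bytes are made up. A real check has to deal with 0xCC bytes that legitimately occur as operands or padding.

```python
INT3 = 0xCC


def find_int3(code, start=0, length=None):
    """Return the offsets of every 0xCC byte in the given range of `code`."""
    end = len(code) if length is None else start + length
    return [i for i in range(start, min(end, len(code))) if code[i] == INT3]


# Made-up function body: PUSH EBP; MOV EBP,ESP; POP EBP; RETN
clean = bytes.fromhex("558bec5dc3")
# The same bytes after a debugger replaced the first one with INT3.
with_breakpoint = b"\xcc" + clean[1:]

print(find_int3(clean))
print(find_int3(with_breakpoint))
```

This is why a hardware breakpoint, which doesn’t modify the code bytes, gets past this kind of check.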

Shellcode analysis

What are Malware authors trying to hide? Their Exploits?

I have seen lots of interesting Shellcode in Malware. After all the de-obfuscation that’s a huge win. But we also have to know how to read Shellcode. Shellcode can do everything assembly code can do… in theory. Shellcode has traditionally been used to spawn shells, but nowadays it does much more.

Most Shellcode in Malware is boring. Droppers and Downloaders.

Shellcode consists of opcodes like 0x90 written as \x90, which is a NOP. In Olly you often see the hex values, and sometimes the disassembling didn’t work. So reading a little opcode is a good skill for a Malware analyst.

But since we are all humans, we want to be able to disassemble Shellcode. Many people compile Shellcode anyway, with position independent code compilers.

Disassemble Shellcode with rasm2

Remove the \x prefix and line up the shellcode like 9090909090... Like this.

Save the Shellcode into a file, named foo.txt

~ » more foo.txt

Excellent. Now use GNU tr to remove the \x parts:

~ » cat foo.txt | tr --delete '\\x'    

Now let’s pass this to rasm2. The minus, -, indicates that we use stdin

~ » cat foo.txt | tr --delete '\\x'   | rasm2 -a x86 -D -     
0x00000000   1                       90  nop
0x00000001   1                       90  nop
0x00000002   1                       90  nop
0x00000003   1                       90  nop
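
If you need the raw bytes instead of a hex string, for example to write them into a binary file, the conversion is a one-liner in Python. This is a minimal sketch. The function name is mine.

```python
def shellcode_from_text(text):
    """Convert a '\\x90\\x90' style string into raw bytes."""
    hex_digits = text.replace("\\x", "")
    # Drop whitespace and line breaks before decoding the hex digits.
    return bytes.fromhex("".join(hex_digits.split()))


raw = shellcode_from_text(r"\x90\x90\x90\x90")
print(raw.hex(), len(raw))
```

From there, open("foo.bin", "wb").write(raw) gives you a file that hex editors and disassemblers can load directly.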

Use IDA Pro and Olly to analyze Shellcode

In order to get these tools to work with the Shellcode, you need to wrap it into an executable.

REMnux has a tool for this. You’d run it with the option -s foo.txt, and then check the result:
file foo.exe

Unlike rasm2, this needs the \x prefixes in the input.