C++ Reverse Engineering - concepts and tools - Ghidra and WinDbg Preview

This is a growing Wiki article about C++ reverse engineering.

C++ Reverse Engineering - concepts and tools

Did you know that many endpoint protection agents and clients are written in C++? And that antivirus vendors aren’t great at defensive application development?
Or that industrial interfaces which interact with hardware often rely on C components? That many embedded systems are developed in C++? Or that large parts of Windows are written in C and C++?

C++ is everywhere… in your OS. In your browser. In your kernel. In your life. C++ is life.

Motivation

C++ reverse engineering can be useful for vulnerability analysis in

  • operating systems (including Windows drivers / kernel extensions and modules)
  • native components in mobile apps (like parsers, e. g. PDF, XML, …)
  • runtime environments, such as the Java Runtime Environment
  • Embedded / Firmware, …
  • CTFs
  • ah and web browsers[1]

C++ Reverse Engineering - concepts

Basic concepts can be exemplified using Radare2 (r2). This tool can be used to pre-assess the effort and to improve the debugging strategy.

Why Radare2 is useful for initial target exploration

Radare2 is a free, open-source, Unix-style command-line program that uses the Capstone disassembler for the CPU architectures that are relevant in the following context. It implements linear and recursive disassembly algorithms, just like IDA Pro, Hopper, Binary Ninja, etc.
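
To make the two strategies concrete, here is a minimal Python sketch on a made-up toy instruction set. The opcodes, the decode helper and the byte string are hypothetical and have nothing to do with the internals of Radare2 or Capstone. A linear sweep decodes byte after byte, while a recursive traversal follows the control flow from the entry point:

```python
# Hypothetical toy ISA: opcode -> (mnemonic, instruction length in bytes).
# Jump operands are absolute offsets stored in the byte after the opcode.
OPCODES = {0x01: ("nop", 1), 0x02: ("jmp", 2), 0x03: ("ret", 1), 0x04: ("jz", 2)}


def decode(code, offset):
    """Decode one instruction; unknown or truncated bytes become 'db' (data)."""
    name, size = OPCODES.get(code[offset], ("db", 1))
    if offset + size > len(code):
        return "db", 1
    return name, size


def linear_sweep(code):
    """Decode from the first byte to the last, ignoring control flow."""
    listing, offset = [], 0
    while offset < len(code):
        name, size = decode(code, offset)
        listing.append((offset, name))
        offset += size
    return listing


def recursive_traversal(code, entry=0):
    """Decode only what is reachable by following jumps from the entry point."""
    seen = {}
    worklist = [entry]
    while worklist:
        offset = worklist.pop()
        while 0 <= offset < len(code) and offset not in seen:
            name, size = decode(code, offset)
            seen[offset] = name
            if name in ("ret", "db"):
                break  # the path ends here
            if name == "jmp":
                offset = code[offset + 1]  # unconditional: continue at the target
                continue
            if name == "jz":
                worklist.append(code[offset + 1])  # conditional: visit the target later
            offset += size  # fall through to the next instruction
    return sorted(seen.items())


# jmp 3 | one data byte that happens to look like a jmp opcode | nop | ret
program = bytes([0x02, 0x03, 0x02, 0x01, 0x03])
print(linear_sweep(program))
print(recursive_traversal(program))
```

On this input the linear sweep misreads the data byte at offset 2 as a jump and swallows the real instruction at offset 3, while the recursive traversal never touches the data byte. Real tools typically mix both strategies with heuristics, and that is where the guesswork comes in.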

Commercial disassemblers like Binary Ninja or IDA Pro do more to improve the disassembly results semi-automatically (“interactively”): call analysis, control- and data-flow analysis, static taint analysis, etc. This matters for complex targets, such as C++ based applications (for Microsoft Windows environments). It matters less for small, purposely crafted CTF applications. The security community often underestimates how different target applications can be.

Radare2 is only used to exemplify the concepts. Its auto-analysis (-AA, aaaa) is not necessarily the most robust option, given the guesswork and heuristics it applies.
Yes, the analysis can be tuned to improve the results. The main drawback of Radare2 is not a lack of options or the quality of the disassembly. It’s the design, which favors a large accumulation of features over industry-grade robustness and correctness. Then again, what’s “industry-grade” in this market segment?!

Why not use them all?

Ghidra performs advanced disassembly and analysis. It can be used to reverse engineer complex real-world target apps, where “guesswork” is needed.

It also comes with a robust decompiler that is usable from within Radare2. At the level at which I reverse-engineer C++ applications, I rarely need IDA Pro. I don’t do this for a living.

Example target: ARM 32 based C++ application

[email protected]:~/Source/ghidraExampleSource$ for binary in example*; do file $binary && echo ""; done
example: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, for GNU/Linux 3.2.0, BuildID[sha1]=55598202b6ee85ab37e4da04a6d1f1bff8a98547, with debug_info, not stripped

example_stripped: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-armhf.so.3, for GNU/Linux 3.2.0, BuildID[sha1]=55598202b6ee85ab37e4da04a6d1f1bff8a98547, stripped
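
What file reports here comes straight from the first bytes of the ELF header. As a rough sketch (Python, standard library only; the describe_elf helper and its small lookup tables are my own illustration and cover only a handful of values), the same fields can be read like this:

```python
import struct

# Small, deliberately incomplete lookup tables (numeric values from the ELF spec).
ELF_CLASSES = {1: 32, 2: 64}
ELF_ENCODINGS = {1: ("LSB", "<"), 2: ("MSB", ">")}
ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared object"}
ELF_MACHINES = {3: "x86", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_elf(header: bytes) -> dict:
    """Decode the ELF header fields that `file` prints first."""
    if len(header) < 32 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    if header[4] not in ELF_CLASSES or header[5] not in ELF_ENCODINGS:
        raise ValueError("unsupported ELF class or data encoding")
    bits = ELF_CLASSES[header[4]]                  # EI_CLASS
    endian_name, endian = ELF_ENCODINGS[header[5]]  # EI_DATA
    # e_type and e_machine follow the 16-byte e_ident array.
    e_type, e_machine = struct.unpack_from(endian + "HH", header, 16)
    # e_entry sits at offset 24 and is 4 bytes (ELF32) or 8 bytes (ELF64) wide.
    entry_format = "I" if bits == 32 else "Q"
    (e_entry,) = struct.unpack_from(endian + entry_format, header, 24)
    return {
        "bits": bits,
        "endianness": endian_name,
        "type": ELF_TYPES.get(e_type, hex(e_type)),
        "machine": ELF_MACHINES.get(e_machine, hex(e_machine)),
        "entry": e_entry,
    }


def describe_elf_file(path: str) -> dict:
    """Convenience wrapper: read the first 64 bytes of a file and decode them."""
    with open(path, "rb") as handle:
        return describe_elf(handle.read(64))
```

Calling describe_elf_file("example") on the binary above should report 32 bits, LSB, ARM and executable, matching the file output. The entry value is the address where execution starts, which is the next stop in the workflow.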

Radare2 build

On Linux Mint 19.1 (Tessa), the packaged Radare2 version is:

[email protected]:~/$ apt-cache policy radare2
radare2:
  Installed: 2.3.0+dfsg-2

The current release version is 3.7, so the packaged version is too old to use here. Instead, this is a build from the Git repository:

[email protected]:~/Source/radare2$ radare2 -v
radare2 3.7.1 22628 @ linux-x86-64 git.3.7.1-177-geeb62abe9
commit: eeb62abe93a3b531759a5ae6b3ceefc679ca183c build: 2019-09-02__09:41:05

The disassembly listing uses Capstone (version 5); the decompiler listing uses Ghidra 9.0.4. This is a 32-bit ARM listing.

The pdgo command shows the offsets next to the decompiler output.

Result:

  • ELF 32 bit ARM binary loaded into OpenSource disassembler / decompiler
  • used Ghidra successfully from within Radare2
  • performed auto-analysis on a small C++ target application
  • discovered a function (main), the program's logical entry point (the entry address in the ELF header normally points to the _start stub, which eventually calls main)

Primer on Ghidra Linux ARM-32 and C++ reverse engineering

The reverse engineering workflow is the same: Binary-format -> Entry Point -> Callgraph -> Cross-References (X-refs) -> Imports, Functions, Syscalls, …

Original Entry Point

TBD

Structs

TBD

Classes

TBD

Primer on WinDbg Preview for C++ Reverse Engineering on x86-64 (Windows 10)

WinDbg (Preview) is a tool for software diagnostics in Windows user mode and kernel mode. It helps to structure and plan a debugging session. The main difference between the standard WinDbg from the Windows SDK and the Preview version is the usability and presentation of debugging.

With WinDbg we aim to identify a recurring, reproducible problem (a bug), using the debugger engine for instrumentation. WinDbg lets us implement both automated and interactive debugging strategies.

Basics on WinDbg (Preview)

This wiki article gives example walkthroughs to share a hands-on understanding of WinDbg Preview.
Most of the debugger commands appear to be the same as in the standard WinDbg.

Versions, compatibility, environment

  • Windows 10 Version 1903
  • WinDbg Debugger client 1.0.2001.02001 / Debugger engine 10.0.19528.1000
  • Intel Core i7-7700 / ESXi guest VM (lab) - x86-64

Time-travel debugging

TBD

Ghidra and Pharos OOAnalyzer

Pharos[2] is a CMU project with practical relevance[3] for reverse engineering. It comes with plugins for Ghidra and IDA Pro. It makes reverse-engineering of object-oriented C++ code easier.

Build the OOAnalyzer Ghidra plugin on Windows

Usually, I install Windows build tools with sdkman[4] or chocolatey[5]. Here, for demonstration purposes, it’s advisable to skip the unit tests. Otherwise, you have to adapt the build script in the subfolder, because some test libraries are missing. On Windows, I only build the Ghidra extension.
Manually building the entire Pharos tool suite requires development skills aimed at producing research-grade reverse-engineering tools.

In the following listing, % marks the prompt and ... marks sections that were left out.

% C:\Users\wishi\Source\pharos\tools\ooanalyzer\ghidra\OOAnalyzerPlugin (master -> origin)
% choco install gradle
...
% set GHIDRA_INSTALL_DIR=%userprofile%\Source\ghidra_9.1.1_PUBLIC_20191218\ghidra_9.1.1_PUBLIC\
% gradle buildExtension -x test

> Task :buildExtension

Created ghidra_9.1.1_PUBLIC_20200210_OOAnalyzerPlugin.zip in C:\Users\wishi\Source\pharos\tools\ooanalyzer\ghidra\OOAnalyzerPlugin\dist
...

The zip-file is the extension. This needs to be loaded into Ghidra (I am using 9.1.1 in this section, as you saw in the listing above).

Test it by loading notepad.exe (should be x86_64 LE). This may take a while… but we need some time anyway to set up Pharos and get the tools for the C++ class reverse-engineering.

The decompile.exe process belongs to Ghidra. In the meantime, go to your favorite Docker host (a Linux system, hence the different prompt). The (base) prefix comes from Anaconda Python and can be ignored.

(base) [email protected]:~/Source$ docker pull seipharos/pharos
Using default tag: latest
latest: Pulling from seipharos/pharos
5667fdb72017: Pull complete
...
(base) [email protected]:~/Source/blog$ docker run --rm -it -v /home/marius/Source/blog/:/blog seipharos/pharos
[email protected]:/# cd blog/
[email protected]:/blog# ls
notepad.exe

In the Docker container, everything is pre-installed. We are just using it interactively.

The results won’t be perfect, but we are on the cutting edge… it’s ok.

In my test, it loaded 4 classes. This is before we load the results:

This is after we have the results loaded (pay attention to the Symbol Tree widget):

Result: a workflow to semi-automatically reverse-engineer C++ classes with open-source tools. Pick a larger C++ target and you’ll see more classes.

BinDiff and Ghidra - software similarity classification analysis on Call Graphs and Basic Blocks

Future versions (version 6) of BinDiff (freely available as a result of Google's acquisition of Zynamics[6]) may ship with Ghidra support. I was able to test this at a very early stage.

The plugin can be installed via the beta fork of BinExport.

Once I diffed two basic C programs, I was able to analyze the results at the level of call graphs and basic blocks.
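
To give an idea of what comparing on call graphs and basic blocks means, here is a toy Python sketch of structural matching. It is not BinDiff's algorithm. The per-function counts (basic blocks, control-flow edges, calls), the function names and the match_unique helper are assumptions for illustration. Functions whose signature is unique in both binaries are paired, and whatever is left over is a candidate for manual review:

```python
from collections import defaultdict


def signature(function):
    """Structural fingerprint of a function: (basic blocks, CFG edges, calls)."""
    return (function["blocks"], function["edges"], function["calls"])


def index_by_signature(functions):
    """Group function names by their structural fingerprint."""
    groups = defaultdict(list)
    for name, function in functions.items():
        groups[signature(function)].append(name)
    return groups


def match_unique(old, new):
    """Pair functions whose fingerprint occurs exactly once in each binary."""
    old_index, new_index = index_by_signature(old), index_by_signature(new)
    matches = {}
    for sig, old_names in old_index.items():
        new_names = new_index.get(sig, [])
        if len(old_names) == 1 and len(new_names) == 1:
            matches[old_names[0]] = new_names[0]
    return matches


# Hypothetical counts, e.g. exported from a disassembler for two builds.
old_build = {
    "sub_1000": {"blocks": 5, "edges": 6, "calls": 2},
    "sub_2000": {"blocks": 3, "edges": 2, "calls": 0},
    "sub_3000": {"blocks": 1, "edges": 0, "calls": 1},
}
new_build = {
    "sub_1100": {"blocks": 5, "edges": 6, "calls": 2},
    "sub_2100": {"blocks": 3, "edges": 2, "calls": 0},
    "sub_3100": {"blocks": 2, "edges": 1, "calls": 1},  # structure changed
}

matched = match_unique(old_build, new_build)
unmatched = sorted(set(old_build) - set(matched))
print(matched, unmatched)
```

Here sub_3000 stays unmatched because its structure changed between the builds, which is exactly the kind of function you would look at first when patch-diffing.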

For me, this looks promising for C++ reverse engineering.

WinDbg (Preview) and Ghidra - combined integrated static and dynamic analysis

In reverse-engineering, there are two schools of thought: static analysis and dynamic analysis. C++ target applications may be so complex that limiting yourself to one kind of analysis is not sufficient.
For more information on this:

Ret-Sync is a tool that bridges Ghidra (and other static disassemblers) to different debuggers on different platforms. This helps to investigate applications quickly and efficiently, with behavioral information, execution traces and static information in mind.
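
One detail any such bridge has to deal with is address translation: the debugger sees a module at its runtime base (which moves with ASLR), while the static database uses the image base recorded in the file. A minimal sketch of that arithmetic in Python (the helper names and the example addresses are made up, and this is not Ret-Sync's code):

```python
def to_static(runtime_address, runtime_base, static_base):
    """Map an address seen in the debugger to the disassembler's address space."""
    return runtime_address - runtime_base + static_base


def to_runtime(static_address, runtime_base, static_base):
    """Map an address from the disassembler to the running process."""
    return static_address - static_base + runtime_base


# Hypothetical values: module loaded at 0x7FF612340000 by the OS,
# analysed in the disassembler with an image base of 0x140000000.
RUNTIME_BASE = 0x7FF612340000
STATIC_BASE = 0x140000000

instruction_pointer = 0x7FF612345678
print(hex(to_static(instruction_pointer, RUNTIME_BASE, STATIC_BASE)))
```

The offset into the module stays the same on both sides, so only the two base addresses need to be exchanged between debugger and disassembler.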

Ret-Sync opens a listener from Ghidra, once you install the plugin.

Then you can attach WinDbg Preview to the target application.

I built the WinDbg extension in Visual Studio, targeting x86_64. The build succeeds without errors for the current master branch (Feb 20):

C:\Users\wishi\windbg
λ ls -l
...
-rw-rw-rw-   1 user     group     1163264 Feb 16 14:52 sync.dll
...

In order to load the extension (the DLL contains all you need) into WinDbg Preview (which you probably installed from the Microsoft Store), I set a fitting environment variable:

_NT_DEBUGGER_EXTENSION_PATH=%USERPROFILE%\windbg

I recommend setting this at the system level so that you can focus on invoking WinDbg with the proper flags, scripts and command-line arguments from any Windows terminal.

With respect to patch-diffing and vulnerability hunting, WinDbg Preview offers a robust remote debugging option.

This is especially useful if you have a debug system on a different patch status than the system that is used for the static analysis.

  1. You’d use BinDiff to statically compare the disassembly. Once you have narrowed down which functions you want to investigate, you proceed with reachability analysis from the application entry points (file and network I/O, …).
  2. Next, you choose which version of the target application to instrument.
  3. Then you overlay the static and dynamic results.
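
The reachability step in item 1 can be sketched as a breadth-first search over the call graph. This is a minimal Python illustration under stated assumptions: the call graph is given as a plain dictionary (for example exported from the disassembler), and all function names below are invented:

```python
from collections import deque


def reachable_from(call_graph, entry_points):
    """Return every function reachable from the given entry points (BFS)."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        caller = queue.popleft()
        for callee in call_graph.get(caller, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def triage(call_graph, entry_points, changed_functions):
    """Keep only the changed functions an attacker-facing entry point can reach."""
    reachable = reachable_from(call_graph, entry_points)
    return sorted(name for name in changed_functions if name in reachable)


# Hypothetical call graph: caller -> list of callees.
call_graph = {
    "recv_packet": ["parse_header"],
    "parse_header": ["copy_payload"],
    "open_file": ["read_config"],
    "debug_dump": ["format_log"],
}
entry_points = ["recv_packet", "open_file"]    # network and file I/O
changed = {"copy_payload", "format_log"}       # flagged by the diff

print(triage(call_graph, entry_points, changed))
```

In this made-up graph only copy_payload survives the triage, since format_log is only called from a function that no I/O entry point reaches. That shortlist is what you would then instrument in the debugger.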

  1. Although most browsers are more than 90% FOSS (Free and Open Source Software), the code of the underlying operating system and its memory manager may not be available. ↩︎

  2. https://insights.sei.cmu.edu/sei_blog/2019/07/using-ooanalyzer-to-reverse-engineer-object-oriented-code-with-ghidra.html - Blog ↩︎

  3. http://cmu-sei-podcasts.seimedia.libsynpro.com/reverse-engineering-object-oriented-code-with-ghidra-and-new-pharos-tools - Podcast ↩︎

  4. https://sdkman.io ↩︎

  5. https://chocolatey.org ↩︎

  6. https://security.googleblog.com/2016/03/bindiff-now-available-for-free.html ↩︎