/advent

21 - Reversing with AI

Welcome to Day 8 of the Advent of Radare2!

Today, we will dig into the fascinating intersection of artificial intelligence and reverse engineering, specifically examining how advanced language models can enhance various reverse engineering tasks.

Our primary focus will be on decompilation techniques, where we’ll explore how these powerful AI tools can assist in transforming low-level machine code back into more readable and maintainable high-level source code.

Traditional decompilers often produce very verbose output for high level to interpret, necessitating manual cleanup and analysis. However, with the integration of artificial intelligence, tools like decaí and r2ai are improving this process, offering generative suggestions and more readable decompilation outputs.

Generative decompilation brings the ability to use natural language to manipulate and tweak the output of the decompilation as well as propagating changes across functions, describing function behaviours and permitting “zero-click” interactions to solve problems.

Decompiler Options

Decompilation, the process of reconstructing high-level source code from low-level machine code, presents significant challenges. The complexity of modern architectures, compiler optimizations, and the inherent loss of information during compilation make perfect reconstruction nearly impossible.

Several tools aim to address these difficulties, offering varying levels of accuracy and sophistication. Popular options include standalone commercial decompilers like Binary Ninja or IDA Pro, or NSA^Wfree ones like GHIDRA.

On the other side, radare2 itself boasts several decompilation plugins, delegating the heavy work of decompilation to external projects outside the core. Following its modular design it is possible to use Ghidra’s decompiler, as well as many other options like r2dec, JADX, revng, and many others available via r2pm.

$ r2pm -s decompiler
radeco              [syspkg] Radare Decompiler in Rust
pickledec           pickle decompiler for radare2
revng               rev.ng decompiler
r2retdec            retdec decompiler plugin for radare2 (adds pdz command)
r2apktool           [r2-r2pipe-python] APK decompiler alternative to apktool
r2ghidra            Ghidra Disassembler, Analysis and Decompiler Plugins
decai               r2ai/OpenAI/Anthropic/ollama based decompiler, formerly known as r2ai-decai
jadx                jadx decompiler (requires Java)
r2snow              [decompiler] snowman decompiler integration with radare2
r2jadx              jadx decompiler integration plugin for radare2
r2angr              angr decompiler for radare2
pdq                 QuickDecompiler based on r2js and esil
$

Natively, r2 comes with the pdc command which combines the pseudo disassembly generated with the asm.pseudo=true and the emu.str emulation option to find out to render a pretty-unoptimized pseudo-decompilation full of goto and label statements.

This output is not ideal for the users, but it’s aligned with the information in radare2, which is in many situations better than the alternatives.

Comparing decompilers

Let’s inspect the output of different decompilers for a versions of the following function:

C

int sum(int a, int b) {
    return a + b;
}

r2ghidra

[0x100003eec]> pdg
int32_t sym._sum(int32_t param_1, int32_t param_2)
{
    // [00] -r-x section size 152 named 0.__TEXT.__text
    return param_1 + param_2;
}

r2dec

[0x100003eec]> pdd
/* r2dec pseudo code output */
/* a.out @ 0x100003eec */
#include <stdint.h>

void sum (int64_t arg_10h, int64_t arg1, int64_t arg2) {
    int64_t var_8h;
    int64_t var_ch;
    x0 = arg1;
    x1 = arg2;
    /* [00] -r-x section size 152 named 0.__TEXT.__text */
    var_ch = w0;
    var_8h = w1;
    w8 = var_ch;
    w9 = var_8h;
    w0 = w8 + w9;
    return;
}
[0x100003eec]>

pdc

[0x100003eec]> pdc
 // callconv: x0 arm64 (x0, x1, x2, x3, x4, x5, x6, x7, stack);
void sym._sum (int64_t arg1, int64_t arg2, int64_t arg_10h) {
    loc_0x100003eec:
        sp = sp - 0x10 // [00] -r-x section size 152 named 0.__TEXT.__text
        [var_ch] = w0 // arg1
        [var_8h] = w1 // arg2
        w8 = [var_ch]
        w9 = [var_8h]
        w0 = w8 + w9
        sp = sp + 0x10 // a.c:0
        return x0;
}

decai (using pdc as backend)

[0x100003eec]> decai -d
int sum(int num1, int num2) {
    return num1 + num2;
}
[0x100003eec]>

Decai can be used as a frontend for any other decompiler, but it’s using pdc by default (see decai -e cmds) because it’s the more verbose and detailed, which provides better context for language models to optimize the output in a better way. Another benefit of pdc compared to r2ghidra/r2dec/.. is that it works for all the architectures, so it can be used even for brainfuck, pickle, lua or z80.

R2AI Components

R2AI is not just one thing, but instead, a set of different programs, scripts and plugins that bridge the gap between radare2 and local and remote language models.

Decai is a lightweight plugin for radare2 that focuses on decompilation. It leverages AI models to enhance the readability of decompiled code by performing various cleanups and optimizations. Internally, decai utilizes the pdc (pseudo decompiler) to generate initial decompilation output. This output is then processed through AI models to refine the code, making it more comprehensible.

r2ai serves as the bridge between radare2 and AI models. It facilitates communication with local or remote language models, enabling features like auto-solving tasks, function renaming, and type propagation. By integrating r2ai with decai, users can automate the decompilation process, resulting in cleaner and more informative code representations.

NOTE Actually there’s an ongoing work on rewriting r2ai, decai and r2ai-plugin into a single codebase in pure C, which may probably end up replacing those three packages.

r2ai-plugin that’s the same as r2ai but loaded from radare2 using the rlang api to comunicate internally with radare2 to execute commands and share context with the currently loaded session.

r2ai-server shellscript wrapper to start different language model, this can simplify the way to start the r2ai webserver, llamafile, llama.cpp, ollama, etc.

shai a shellscript to be placed in your PATH to query language models from bash or even with vim.

To install any of these use r2pm -ci decai

AI Based Decompilation

The decompilation process begins with decai invoking the pdc command to generate pseudo-C code from the target function. This initial output often contains artifacts and lacks meaningful variable names and types. To address this, decai employs AI models and a custom system prompt:

By utilizing the recursive call graph obtained via the afla command in radare2, decai can propagate types and rename functions throughout the binary, resulting in a more coherent decompilation output.

This is the default prompt used by decai. But we can redefine it at any time or save it in your radare2rc:

[0x00000000]> decai -r
Rewrite this function and respond ONLY with code, NO explanations, NO markdown, Change 'goto' into if/else/for/while, Simplify as much as possible, use better variable names, take function arguments and and strings from comments like 'string:'
[0x00000000]>

Remote Language Models

If you have an API key for any API endpoint r2ai will pick them from your home:

You can select which one to use by changing the decai -e api=? configuration variable.

NOTE For decompilation purposes Claude and OpenAI are the best options

Running it in Local

But hey, we don’t wanna drain rivers or leak our research! We want to use local models!

Turns out huggingface is probably the best repository to find and download models. We will focus on the GGUF ones (a standard file format), which makes them easy to use in most llama based api servers like llamacpp or the r2ai webserver.

To accomplish this, we’ll install r2ai and utilize the -m flag to specify our desired model (you can use -M to view available model suggestions first). After entering the initial command, you’ll be prompted to select the quantization level and size of the GGUF file for download. I typically recommend selecting Q5_K_M, and it’s important to note that larger file sizes don’t necessarily correlate with better performance.

These are some of the recommended models for decompilation in local:

$ r2pm -r r2ai
[r2ai:0x100003a58]> -M
-m Qwen/Qwen2.5-Coder-7B-Instruct-GGUF
-m unsloth/Llama-3.2-1B-Instruct-GGUF
-m QuantFactory/granite-8b-code-instruct-4k-GGUF
-m bartowski/Gemma-2-9B-It-SPPO-Iter3-GGUF
[r2ai:0x100003a58]> -m Qwen/Qwen2.5-Coder-7B-Instruct-GGUF
[r2ai:0x100003a58]> hello
Hello! How can I assist you today?
[r2ai:0x100003a58]>

We will now start the webserver in r2ai:

[r2ai:0x100003a58]> -w
[12/11/24 17:26:33] INFO     r2ai.server - INFO - [R2AI] Serving at port    web.py:336

As explained before we can also start the webserver using r2ai-server like this:

$ r2pm -r r2ai-server -l
r2ai
llamafile
llamacpp
koboldcpp
$ r2pm -r r2ai-server -m /path/to/Qwen2.5-Coder-32B

Models downloaded through r2ai are automatically stored in the ~/.r2ai.models directory. Those can be reused from decai, r2ai-server or r2ai-native by specifying the model name or the full path to the GGUF file.

Settings in Decai

We probably don’t need to change anything in decai to use the r2ai webserver as backend.

> decai -e api=r2ai
> decai -e host=localhost
> decai -e port=8080

If we want to make it use pdc, pdg and pdd all together to combine all the decompiled code into a single function we can just specify it like this:

> decai -e cmds=pdc,pdg,pdd

NOTE Other useful commands to be fed to decai are pdsf or CCf.

Language models understand the constructions and transformations required to transpile the code between different programming languages.

[0x100003eec]> decai -e lang=python
[0x100003eec]> decai -d
def sum_numbers(num1, num2):
    result = num1 + num2
    return result
[0x100003eec]> decai -e lang=bash
[0x100003eec]> decai -d
sum() {
    local num1=$1
    local num2=$2
    echo $((num1 + num2))
}
[0x100003eec]> decai -e lang=lisp
(defun sum (arg1 arg2)
  (let ((result (+ arg1 arg2)))
    result))
[0x100003eec]> decai -e lang=sql
[0x100003eec]> decai -d
CREATE OR REPLACE FUNCTION sum_numbers(num1 INT, num2 INT)
RETURNS INT AS $$
BEGIN
    RETURN num1 + num2;
END;
$$ LANGUAGE plpgsql;
[0x100003eec]>

Vulnerability Analysis

Language models are also trained with several commits, pull requests, CVE descriptions and many other sources of knowledge about vulnerabilities and bad practices. So we can use decai to conversate with the model to do any of the following actions:

$ r2 overflow
[0x100003ebc]> decai -d
#include <string.h>
#include <stdio.h>

int main (int argc, char **argv, char **envp) {
    char buffer[32];
    if (argc > 1) {
        strcpy(buffer, argv[1]);
    } else {
        strcpy(buffer, "world");
    }
    printf("Hello %s\n", buffer);
    return 0;
}

Suggest a new function name:

[0x100003ebc]> decai -n
afn HelloWorldPrinter @ 0x100003ebc
[0x100003ebc]>

Explain me the function:

[0x100003ebc]> decai -x
The function checks the number of command line arguments.
If there is only one argument, it copies the string
"world" into a buffer, otherwise it copies the second
argument into the buffer. Then it prints "Hello %s\n"
with the buffer as the string argument. It also includes
a stack protection mechanism to prevent buffer overflow
attacks.

Choose Japanese for “human language”:

[0x100003ebc]> decai -e hlang=japanese
[0x100003ebc]> decai -x
この関数は、コマンドライン引数をチェックし、引数が1つだ
け与えられた場合にはその引数を"Hello %s\n"という形式で
出力します。引数が与えられなかった場合には"world"という
文字列を出力します。また、スタックオーバーフローを防ぐ
ためのチェックも行っています。

Let’s ask about vulnerability analysis with -V:

[0x100003ebc]> decai -V
The vulnerability in this code is a buffer overflow
vulnerability. The `strcpy` function does not check the
size of the input string and the size of the destination
buffer. If the input string (in this case `argv[1]`) is
larger than the destination buffer (in this case `buffer`),
it will cause a buffer overflow. This can lead to
arbitrary code execution, denial of service, or
information disclosure vulnerabilities.

Here is a simple exploit that can cause a segmentation
fault by overflowing the buffer:

$ ./program $(python -c "print 'A'*40")

This will create a string of 40 'A' characters, which
is larger than the size of the buffer (32 characters),
and pass it as an argument to the program. This will
overflow the buffer and cause a segmentation fault.

Please note that this is a simple demonstration of how
the buffer overflow can be exploited and real-world
exploits can be much more complex and dangerous.

Playground

Today’s “challenge” consists in putting all the knowledge we learned today:

Experiment with different prompt structures to see how they affect the decompilation results. For instance, providing more context or specific instructions can lead to more accurate and informative outputs.

Conclusion

Integrating AI into the decompilation process through tools like Decai and r2ai marks a significant advancement in reverse engineering. By leveraging powerful language models, these tools automate and enhance code analysis, making it more accessible and efficient. As AI models continue to evolve, we can anticipate even greater improvements in decompilation accuracy and readability.

Hope you come back tomorrow!