24 - Tracing Functions

Greetings on Day 24 of the Advent of Radare!

Today we focus on tracing function calls within specific memory ranges using radare2 and r2frida. We will learn how to set boundaries for code analysis, filter analysis results within a range, and understand the relationships between functions by recreating a backtrace call graph.

We are reaching the end of the advent calendar, so make sure to git pull and run sys/install.sh for your radare2 installation. Get ready to learn more about r2frida, code analysis, and graphs to enhance your reverse engineering skills!

Finding The Functions

To begin, we need to specify a range in memory. While we usually run aaaa without much thought, when working with large binaries or debugging, it’s often preferable to define limits. This helps avoid analyzing all system libraries and program codepaths when we’re only interested in specific ones.

We previously learned about the search.in variable, which selects where to scan memory for values. The same concept applies to analysis through the anal.in variable.

I will assume you have r2frida installed, so we’ll load a system session:

$ r2 frida://0
INFO: Mounted io on /r2f at 0x0
 -- Are you still watching?
[0x1042c7818]>

We want to analyze only the functions in the executable map of libr_flag.dylib, so we grep the memory map listing for it:

[0x1042f0000]> :dm~flag
0x00000001042f0000 - 0x00000001042f8000 r-x .../libr_flag.dylib
0x00000001042f8000 - 0x00000001042fc000 r-- .../libr_flag.dylib
0x00000001042fc000 - 0x0000000104300000 r-- .../libr_flag.dylib

Our range of interest starts at 0x00000001042f0000 and ends at 0x00000001042f8000. Therefore we run:

[0x1042c7818]> -e anal.in=range
[0x1042c7818]> -e anal.from=0x00000001042f0000
[0x1042c7818]> -e anal.to=0x00000001042f8000
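
The same range can be derived mechanically from the map listing. The following is a minimal Python sketch and not part of radare2 or r2frida: exec_range and anal_commands are hypothetical helper names, and the parsing assumes the line layout shown in the listing above (start, dash, end, permissions, path).

```python
def exec_range(dm_output, name):
    """Return (start, end) of the first executable map whose path contains name."""
    for line in dm_output.splitlines():
        fields = line.split()
        # Assumed layout: <start> - <end> <perms> <path>
        if len(fields) < 5 or fields[1] != "-":
            continue
        start, end, perms, path = fields[0], fields[2], fields[3], fields[4]
        if name in path and "x" in perms:
            return int(start, 16), int(end, 16)
    return None


def anal_commands(start, end):
    """Build the three config commands that limit analysis to start..end."""
    return [
        "-e anal.in=range",
        f"-e anal.from={start:#x}",
        f"-e anal.to={end:#x}",
    ]


# The three lines printed by the map listing above.
sample = """\
0x00000001042f0000 - 0x00000001042f8000 r-x .../libr_flag.dylib
0x00000001042f8000 - 0x00000001042fc000 r-- .../libr_flag.dylib
0x00000001042fc000 - 0x0000000104300000 r-- .../libr_flag.dylib
"""

start, end = exec_range(sample, "libr_flag")
for cmd in anal_commands(start, end):
    print(cmd)
```

Only the map with the x permission is selected, which matches the manual choice made above.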

And now we are ready to run a command that uses these vars (hint: it’s not aaaaaa).

[0x1042f0000]> aac
[0x1042f0000]> aflc
106

If we run afl, we will see that the functions have names like sub.fcn.10408e0d4_10408e0d4.. because we haven’t loaded the symbols from this library yet! To address that, we must run these commands before aac:

[0x1042c7818]> s 0x00000001042f0000
[0x1042f0000]> .:is*
[0x1042f0000]> aac

The .:is* command sets flags from the symbols of the binary loaded at the current address. If we were unlucky and analyzed the functions before loading the symbols, we can still fix that with the following command:

[0x1042f0000]> .afna@@F

. - interpret the output of afna as commands to be executed
@@F - run the command on every function

Scanning for Calls

Now that we have found all the functions within the range of libr_flag, we can go a step further and analyze the code looking for CALL instructions. To do this, let’s disable the anal.in restrictions.

[0x1028f3818]> -e anal.in=bin.ormaps.x

We must now define the range of memory to scan for instructions. This is a search command, so we configure search.in the same way we did for anal.in:

[0x1042c7818]> -e search.in=range
[0x1042c7818]> -e search.from=0x00000001042f0000
[0x1042c7818]> -e search.to=0x00000001042f8000

Alternatively, we can set the boundaries relative to the current offset and scan only the next 1KB of memory:

e search.from=$$
e search.to=$$+1K
e search.in=raw

Once the memory range is defined, we can search for all function calls within this region using the /at command. This command performs a linear disassembly of the specified range and reports the instructions of the requested type, in this case CALL.

/at call

This command outputs a list of addresses where function calls occur within the defined range, helping us locate and focus on function calls only.

Filtering Unique Calls

The output of the /at command may contain repeated or indirect call destinations. To identify unique destinations, we can filter the results with radare2’s ~ operator, sorting the addresses and eliminating duplicates.

[0x1028f3818]> /at call~$$[4]~?
     106
[0x1028f3818]> /at call~[4]~?
     293
[0x1028f3818]>

~[4]: Filters the output to extract only the field at index 4 (the call destination). Column indexes start at 0.
$$: Sorts the list and removes duplicates, outputting only unique call addresses.
~?: Counts the lines in the output.
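
To make the three modifiers concrete, here is a rough Python equivalent of the pipeline. It only illustrates the semantics: the sample rows are invented and do not reproduce the real /at output layout, column and sort_uniq are hypothetical names, and the sketch assumes (judging by the two counts above) that deduplication applies to the selected column rather than to whole lines.

```python
def column(lines, index):
    """Like ~[index]: keep one whitespace-separated field per line (0-based).

    Lines with fewer fields are skipped in this sketch.
    """
    out = []
    for line in lines:
        fields = line.split()
        if index < len(fields):
            out.append(fields[index])
    return out


def sort_uniq(lines):
    """Like $$: sort the lines and drop duplicates."""
    return sorted(set(lines))


# Invented rows: only the field at index 4 matters for this example.
rows = [
    "0x1000 4 call bl 0x2000",
    "0x1010 4 call bl 0x2040",
    "0x1020 4 call bl 0x2000",
]

targets = column(rows, 4)
print(len(targets))             # counterpart of ~[4]~?
print(len(sort_uniq(targets)))  # counterpart of ~$$[4]~?
```

The gap between the two printed numbers is the number of repeated destinations, which is the same reading we apply to 293 versus 106 in the session above.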

Let’s save the list of unique calls in an internal file:

[0x1028f3818]> /at call~$$[4] > $calls

Note that we used $calls. This is an internal file: all file operations on names that start with $ use in-memory files, so we don’t need the filesystem to save the output of commands.

Tracing Calls

R2Frida is specifically designed to trace function calls, and it excels at recording logs of each call while tracking arguments passed, return values, and even backtraces.

In today’s post, we will focus on capturing backtraces and saving the results to a log file.

In the previous section, we obtained a list of unique call destinations. Now, we will set a trace on each of them using this line:

[0x1028f3818]> :dt $$ @@.$calls

:dt - add a debugger trace hook on r2frida
$$ - target function in the current offset
@@. - repeat on each offset found in the file
$calls - the file name to read offsets from.
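
Conceptually, the iterator expands into one trace command per address in the file. The Python sketch below spells out that expansion. It is an illustration only: expand_trace_commands is a hypothetical name, the addresses are invented, and it assumes the file holds one hexadecimal address per line.

```python
def expand_trace_commands(calls_text):
    """Return one ':dt <addr>' command per non-empty line of the address list."""
    commands = []
    for line in calls_text.splitlines():
        addr = line.strip()
        if not addr:
            continue
        # Parse through int() so a malformed entry raises instead of
        # silently producing a bogus command.
        commands.append(f":dt {int(addr, 16):#x}")
    return commands


# Invented contents for $calls, one destination per line.
calls = "0x1042f1a30\n0x1042f2b48\n\n0x1042f3c10\n"

for cmd in expand_trace_commands(calls):
    print(cmd)
```

Writing the expansion out by hand like this can help when only a subset of the destinations should be hooked.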

There’s only one thing left to do, which is enabling backtraces:

[0x1021a7818]> :e hook.backtrace=true

We can now resume the execution of the program and get our terminal full of annoying logs!

[0x1021a7818]> :dc

Static Instruction Tracing

Tracing is monitoring and recording program execution at runtime, including function calls, system calls, memory access, and other events for analysis and debugging.

In radare2’s debugging mode, a tracepoint is a breakpoint with the trace bit enabled, which logs and continues execution when hit (instead of stopping).

Common tracing targets:

Function Calls
Function Enter/Exit (Frida hooks)
Instructions (Frida Stalker)

Instruction-level tracing monitors program execution at the machine instruction level, recording instruction sequences and their effects (memory access, cache misses, CPU cycles, register values). Main applications include:

Reverse Engineering: Understanding function relationships, arguments, and execution patterns.
Fuzzing and Exploit Development: Analyzing code coverage and identifying execution paths.
Performance Optimization: Identifying inefficient code patterns and unnecessary operations.
Assembly Optimization: Measuring instruction costs to write more efficient low-level code.
Compiler Development: Understanding how high-level code translates to machine instructions for better optimization strategies.

While instruction-level tracing is computationally expensive, its detailed insights often justify the cost, particularly for security, optimization, and debugging purposes.

Tracing Facilities

Radare2’s dt commands and trace configuration variables can track instruction execution counts and code coverage during ESIL emulation or debugging.

Let’s take a look at them!

[0x00000000]> dt?
Usage: dt  Trace commands
| dt                                 list all traces
| dt [addr]                          show trace info at address
| dt*                                list all traced opcode offsets
| dtj                                list instruction trace logs in json
| dt+ [addr] [times]                 add trace for address N times
| dt-                                reset traces (instruction/calls)
| dt=                                show ascii-art color bars with the debug trace ranges
| dta 0x804020 ...                   only trace given addresses
| dtc[?][addr]|([from] [to] [addr])  trace call/ret
| dtd[qi] [nth-start]                list all traced disassembled (quiet, instructions)
| dte[?]                             show esil trace logs
| dtg                                graph call/ret trace
| dtg*                               graph in agn/age commands. use .dtg*;aggi for visual
| dtgi                               interactive debug trace
| dts[?]                             manage trace sessions, used for step back (EXPERIMENTAL)
| dtt [tag]                          select trace tag (no arg unsets)
| dtt.
[0x00000000]> e~trace
asm.trace = false
asm.trace.color = true
asm.trace.space = false
asm.trace.stats = true
dbg.trace = false
dbg.trace.continue = true
dbg.trace.eval = true
dbg.trace.inrange = false
dbg.trace.libs = true
dbg.trace.tag = 0
graph.trace = false
[0x00000000]>

Let’s enable the debug traces for esil and get a linear disassembly from the emulated instructions:

[0x100003a58]> e dbg.trace=1
[0x100003a58]> aeim
[0x100003a58]> 10ds
[0x100003a58]> dtd
0x100003a58 pacibsp
0x100003a5c stp x28, x27, [sp, -0x60]!
0x100003a60 stp x26, x25, [sp, 0x10]
0x100003a64 stp x24, x23, [sp, 0x20]
0x100003a68 stp x22, x21, [sp, 0x30]
0x100003a6c stp x20, x19, [sp, 0x40]
0x100003a70 stp x29, x30, [sp, 0x50]
0x100003a74 add x29, sp, 0x50
0x100003a78 sub sp, sp, 0x640
0x100003a7c mov x19, x1
[0x100003a58]>
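
A listing like this is enough to compute execution counts and a simple coverage figure outside of radare2. The following Python sketch shows one way to do it. The names hit_counts and coverage are hypothetical, the input is assumed to be "address mnemonic" lines as printed above, and the coverage helper assumes fixed 4-byte instructions, which holds for this arm64 sample but not for variable-length architectures.

```python
from collections import Counter


def hit_counts(dtd_output):
    """Count how many times each address appears in a trace listing."""
    counts = Counter()
    for line in dtd_output.splitlines():
        fields = line.split(maxsplit=1)
        if fields and fields[0].startswith("0x"):
            counts[int(fields[0], 16)] += 1
    return counts


def coverage(counts, start, end, insn_size=4):
    """Fraction of fixed-size instruction slots from start up to end that were hit."""
    slots = range(start, end, insn_size)
    if not slots:
        return 0.0
    hit = sum(1 for addr in slots if addr in counts)
    return hit / len(slots)


# First four lines of the listing above.
trace = """\
0x100003a58 pacibsp
0x100003a5c stp x28, x27, [sp, -0x60]!
0x100003a60 stp x26, x25, [sp, 0x10]
0x100003a64 stp x24, x23, [sp, 0x20]
"""

counts = hit_counts(trace)
print(len(counts), "unique addresses")
print(coverage(counts, 0x100003a58, 0x100003a80))
```

Because the counts are kept per address, instructions inside loops show up with values above one, which is the information a coverage-guided workflow usually wants.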

Challenge

Today’s challenge involves reading the hook-urls.js script for r2frida, which can be loaded with :. hook-urls.js inside an r2frida session, and getting it to work to create a backtrace graph for a program other than Safari.

hook-urls.js

NOTE: A backtrace graph is a graph that merges all the backtrace linked lists into a graph containing all the control flow paths that lead to the functions we hooked. You can think of it as a kind of callgraph that focuses on one specific function, showing all the real code paths.
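
The merging step described in the note can be sketched in a few lines of Python. This is not the logic of hook-urls.js, only an illustration of the idea: backtrace_graph and to_dot are hypothetical names, the frame names are placeholders, and each backtrace is assumed to list frames from the hooked function outwards to its callers.

```python
def backtrace_graph(backtraces):
    """Merge backtraces into a set of (caller, callee) edges.

    Each backtrace lists frames from the hooked function outwards,
    so frame i was called by frame i + 1.
    """
    edges = set()
    for frames in backtraces:
        for callee, caller in zip(frames, frames[1:]):
            edges.add((caller, callee))
    return edges


def to_dot(edges):
    """Render the edges as Graphviz DOT text."""
    lines = ["digraph backtraces {"]
    for caller, callee in sorted(edges):
        lines.append(f'  "{caller}" -> "{callee}";')
    lines.append("}")
    return "\n".join(lines)


# Placeholder frames; real input would come from the hook logs.
traces = [
    ["r_flag_get", "cmd_seek", "main"],
    ["r_flag_get", "cmd_print", "main"],
]

print(to_dot(backtrace_graph(traces)))
```

Storing edges in a set is what turns many linked lists into one graph: a caller and callee pair that appears in several backtraces contributes a single edge.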

Commands like this are not available in any other tool, and the potential to script all of this brings tremendous opportunities for static and dynamic analysis.

Summary

Code analysis and function discovery are highly configurable in radare2, and understanding the syntax of the commands is key to solving our daily problems. (Remember, you can even order pizzas with radare2)

Take your time to re-read all the #aor24 posts and practice all you want, join the chats and suggest improved workflows and new actions.

Stay tuned for tomorrow’s final Advent of Radare post!