09 - ARM Analysis

Welcome to day 9 of the Advent of Radare2!

Today, we’ll explore how to set up Radare2 for analyzing ARM binaries and how ARM instructions are constructed at the bit level.

The ARM architecture has many implementations, SoCs and CPUs, each of which can be configured in several ways depending on bit-width, endianness, CPU type, and instruction mode (such as Thumb and ARM). Understanding how to configure these options, along with the role of the base address and memory regions on RISC architectures like ARM, is key when analyzing binaries for those targets.

Setting Up the Arch

The first step is to set the architecture with asm.arch. This eval variable can be changed with the -a and -e flags/commands.

NOTE that radare2 flags can also be used in the repl; therefore r2 -a arm is the same as r2 -c '-a arm'.

These are the four eval variables we need to tweak for our purposes:

asm.arch : arch plugin to use
asm.bits : 16 (thumb), 32 (ARM32) or 64 (aarch64)
asm.cpu : specify which CPU model to use
cfg.bigendian : select little or big

For example:

$ r2 -a arm -b 32 arm32firmware.bin
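When several firmware images need the same settings, it can help to build the command line programmatically. The following is a minimal sketch, not part of radare2: build_r2_argv is a hypothetical helper name, and it only relies on the -a, -b and -e flags mentioned above.

```python
def build_r2_argv(path, arch="arm", bits=32, cpu=None, big_endian=None):
    """Build an argv list that launches r2 with the eval variables above.

    Hypothetical helper for illustration; it only assembles strings and
    does not run radare2.
    """
    argv = ["r2", "-a", arch, "-b", str(bits)]
    if cpu is not None:
        # asm.cpu has no dedicated flag in this sketch, so go through -e.
        argv += ["-e", f"asm.cpu={cpu}"]
    if big_endian is not None:
        value = "true" if big_endian else "false"
        argv += ["-e", f"cfg.bigendian={value}"]
    argv.append(path)
    return argv


# Same invocation as the example above.
print(" ".join(build_r2_argv("arm32firmware.bin")))
# A Thumb, big-endian variant of the same image.
print(" ".join(build_r2_argv("arm32firmware.bin", bits=16, big_endian=True)))
```

The resulting list can be handed to subprocess.run or joined into a shell command.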

Selecting the CPU model

ARM processors come in various CPU types, each with slight instruction variations. For example, the Cortex-A and Cortex-R families share most of their instruction set but differ slightly in the instructions they support. Radare2’s asm.cpu option lets you specify the CPU type, allowing for better accuracy when disassembling.

To list the available CPU options for the arm.gnu plugin:

[0x100003a58]> -a arm.gnu
[0x100003a58]> -e asm.cpu=?
v2
v2a
v3M
v4
v5
v5t
v5te
v5j
XScale
ep9312
iWMMXt
iWMMXt2
wd
[0x100003a58]>

For ARM64, you’ll typically see options like cortex or v9. Selecting the correct CPU type ensures that Radare2 interprets architecture-specific instructions properly.

e asm.cpu=cortex

NOTE for shell/r2shell orthogonality reasons, -e is an alias for e.

Endianness

ARM32 binaries can be either little-endian or big-endian. Little endian is common in most ARM systems nowadays (the ability to choose was removed after the first releases of the AArch64 architecture, which also had ARM32 compatibility modes). Big endian was common in old embedded systems. For development simplicity, distros started to compile everything as little endian whenever the CPU supported it, so that more software would work even where big endian was not properly supported.

The default endianness in radare2 is defined by the host system. So, unless we are on a PowerPC or an s390x mainframe, we are probably on a little endian host, and if we want to use big endian we must change this variable:

[0x00000000]> -e cfg.bigendian=true

Let’s see how the endianness affects the disassembly with a simple example:

[0x0000815c]> e cfg.bigendian = 0
[0x0000815c]> pd 1
0x0000815c      0d20a0e1       mov r2, sp
[0x0000815c]> e cfg.bigendian = 1
[0x0000815c]> pd 1
0x0000815c      0d20a0e1       stceq p0, c10, [r0, -0x384]!
[0x0000815c]>

What we are seeing here is that these 4 bytes disassemble to valid instructions in both little and big endian, but the first one probably makes more sense, so we will assume it’s a little endian binary. But… what is endianness, and why do old systems and network infrastructures use big endian instead?

CPUs read the bytes of a multi-byte value in a specific order:

from left to right, most significant byte first (big endian)
from right to left, least significant byte first (little endian)

Since humans put the most significant digits on the left side of numbers, we think in “big endian”. We can examine this with the wv4 command, which is also affected by cfg.bigendian:

[0x00000000]> e cfg.bigendian = 0
[0x00000000]> wv4 1
[0x00000000]> p8 4
01000000
[0x00000000]> e cfg.bigendian = 1
[0x00000000]> wv4 1
[0x00000000]> p8 4
00000001
[0x00000000]>
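The same experiment can be reproduced outside radare2. This is a small sketch using Python's standard struct module; pack_u32 and read_word are illustrative names, not radare2 APIs. It also reads the four bytes from the earlier pd 1 example in both byte orders, which shows why the two disassemblies differ: the top four bits of an ARM32 word are the condition field, and they change from 0xe (always) to 0x0 (eq).

```python
import struct


def pack_u32(value, big_endian):
    """Serialize a 32-bit unsigned value, like wv4 does under cfg.bigendian."""
    return struct.pack(">I" if big_endian else "<I", value)


def read_word(data, big_endian):
    """Interpret raw bytes as one unsigned integer in the given byte order."""
    return int.from_bytes(data, "big" if big_endian else "little")


print(pack_u32(1, big_endian=False).hex())  # 01000000
print(pack_u32(1, big_endian=True).hex())   # 00000001

# Bytes of the instruction at 0x0000815c in the earlier listing.
insn = bytes.fromhex("0d20a0e1")
for big in (False, True):
    word = read_word(insn, big)
    cond = word >> 28  # ARM32 condition field lives in bits 31..28
    print(f"big_endian={big} word=0x{word:08x} cond=0x{cond:x}")
```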

Network protocols put all their header values in “network order”, which is basically big endian, and this is probably the reason why most networking systems used big endian CPUs. Nowadays, modern CISC architectures provide instructions to load or store using a specific endianness, so the performance loss is negligible.

There are some fun historical stories about this, but the likely reason for using little endian is that bytes are read as a stream: for backward compatibility, or to support variable-size integers packed in bytes, reading them in stream order, least significant byte first, was the better decision.

But wait, is there any other endian? Of course! There are a couple of middle endians! Those appeared when CPUs needed to extend their data buses to address more memory: to avoid rewiring the whole instruction decoder, designers couldn’t change the encoding of the instruction set, so they just packed an extra byte or two at the end to read the wider value. For example:

01 00 00 -> 1 in little endian
00 00 01 -> 1 in big endian
00 01 00 -> 1 in middle endian
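One way to think about all three layouts is as a table saying which byte holds which power of 256. The sketch below decodes the three examples above with such a table. The MIDDLE ordering is an assumption chosen only to match the illustrative 00 01 00 layout (the text does not say where the two upper bytes go), and decode is a hypothetical helper name.

```python
def decode(data, significance):
    """Decode bytes where significance[i] is the index of the byte
    holding bits 8*i .. 8*i+7 of the value."""
    value = 0
    for weight, index in enumerate(significance):
        value |= data[index] << (8 * weight)
    return value


LITTLE = (0, 1, 2)  # first byte is the least significant
BIG = (2, 1, 0)     # last byte is the least significant
MIDDLE = (1, 0, 2)  # assumption: middle byte is the least significant

print(decode(bytes.fromhex("010000"), LITTLE))  # 1
print(decode(bytes.fromhex("000001"), BIG))     # 1
print(decode(bytes.fromhex("000100"), MIDDLE))  # 1
```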

Swapping to Thumb

ARM32 supports both the 32-bit ARM instruction set and the 16-bit Thumb instruction set (and its Thumb2 extension), which is optimized for smaller code size. Radare2 provides ahb (analyze hint bits) to specify explicitly which sections use ARM or Thumb mode.

Forcing ARM or Thumb Mode with ahb

You can use ahb to hint specific regions as either ARM (32-bit) or Thumb (16-bit). For example:

Set Thumb Mode (16-bit):

ahb 16

Set ARM Mode (32-bit):

ahb 32

This is helpful when disassembling binaries that switch between modes, such as those that may use ARM for performance-critical functions and Thumb for space efficiency. To remove all hints and revert to the default configuration, use:

ahb-*

This command clears all hints, making Radare2 fall back on asm.arch and asm.bits for disassembly.

Runtime Mode Changes

The ARM architecture supports runtime switches between endianness and instruction modes (ARM and Thumb). These behaviors are controlled through the Current Program Status Register (CPSR).

Endianness Switching

[0x00000000]> -a arm; -b 32
[0x00000000]> arp~flg
flg cpsr    .32 64  0
flg tf  .1  .517    0   thumb
flg ef  .1  .521    0   endian
flg itc .4  .522    0   if_then_count
flg gef .4  .528    0   great_or_equal
flg jf  .1  .536    0   java
flg qf  .1  .539    0   sticky_overflow
flg vf  .1  .540    0   overflow
flg cf  .1  .541    0   carry
flg zf  .1  .542    0   zero
flg nf  .1  .543    0   sign
[0x00000000]> dr 1
sp = 0x00000000
r15 = 0x00000000
tf = 0x00000000
ef = 0x00000000
jf = 0x00000000
qf = 0x00000000
vf = 0x00000000
cf = 0x00000000
zf = 0x00000000
nf = 0x00000000
[0x00000000]>

The CPSR contains the E bit (ef in the register profile above, bit 9 of the register):

0: Little-endian (default for most systems).
1: Big-endian.

Example Assembly Code:

mrs r0, cpsr        ; Read CPSR into r0
bic r0, r0, 0x200   ; Clear the E bit (bit 9): little-endian
orr r0, r0, 0x200   ; ...or set the E bit instead: big-endian
msr cpsr_x, r0      ; Write back the extension field (bits 8-15), which holds the E bit
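The bit positions can be cross-checked against the arp output shown earlier. In that listing cpsr is a 32-bit register at byte offset 64, and each flag is given as a bit offset (.517 for tf, .521 for ef). The sketch below assumes that reading of the profile columns and derives the masks from it; the dictionary is copied from the listing, and the helper names are illustrative.

```python
# Values copied from the `arp~flg` listing: cpsr sits at byte offset 64,
# and every flag line carries an absolute bit offset (the ".NNN" column).
CPSR_BYTE_OFFSET = 64
FLAG_BIT_OFFSETS = {
    "tf": 517, "ef": 521, "jf": 536, "qf": 539,
    "vf": 540, "cf": 541, "zf": 542, "nf": 543,
}


def flag_mask(name):
    """Mask of a flag inside the 32-bit CPSR value."""
    bit = FLAG_BIT_OFFSETS[name] - CPSR_BYTE_OFFSET * 8
    return 1 << bit


def set_flag(cpsr, name):
    return (cpsr | flag_mask(name)) & 0xFFFFFFFF


def clear_flag(cpsr, name):
    return cpsr & ~flag_mask(name) & 0xFFFFFFFF


print(hex(flag_mask("tf")))  # 0x20
print(hex(flag_mask("ef")))  # 0x200
print(hex(flag_mask("nf")))  # 0x80000000
```

The derived values agree with the architectural layout: T is bit 5, E is bit 9 and N is bit 31.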

Switching Between ARM and Thumb Modes

The CPSR’s T bit determines the instruction set:

0: ARM mode.
1: Thumb mode.

Example:

mrs r0, cpsr        ; Read CPSR into r0
orr r0, r0, 0x20   ; Set T bit
msr cpsr_c, r0      ; Write back to CPSR, switching to Thumb mode

Alternatively, the ARM processor takes the instruction set mode from the least significant bit of the target address on an interworking branch such as bx or blx. Jumping to an address with the lowest bit set (e.g., <addr> + 1) switches the processor into Thumb mode without modifying the CPSR directly.
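When following such branches by hand or in a script, the target has to be split into a mode and a real address. A minimal sketch of that rule follows; interworking_target is a hypothetical helper name.

```python
def interworking_target(address):
    """Split a branch target into (mode, real address).

    The lowest bit selects Thumb; the instruction itself lives at the
    address with that bit cleared.
    """
    mode = "thumb" if address & 1 else "arm"
    return mode, address & ~1


print(interworking_target(0x8001))  # ('thumb', 32768)
print(interworking_target(0x8000))  # ('arm', 32768)
```

The returned mode maps directly to the ahb 16 or ahb 32 hint for the destination.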

Instruction Decoding

Radare2 provides an abstraction that encapsulates multiple disassembler implementations so they work seamlessly for the user. This flexibility even lets us use command-line tools like GNU gas to assemble instructions directly from the .encode callback of the arch plugin. But the underlying implementation is opaque to radare2, which means that the internals of the instruction decoding are not available to it.

The most detail we can get is the opex struct inside the JSON representation, available for all the Capstone plugins:

[0x100003a74]> aoj~{}
[
  {
    "opcode": "add x29, sp, 0x50",
    "disasm": "add x29, sp, 0x50",
    "pseudo": "x29 = sp + 0x50",
    "description": "add two values",
    "mnemonic": "add",
    "mask": "ffffffff",
    "esil": "0x50,sp,+,fp,=",
    "sign": false,
    "id": 6,
    "opex": {
      "operands": [
        {
          "type": "reg",
          "value": "fp"
        },
        {
          "type": "reg",
          "value": "sp"
        },
        {
          "type": "imm",
          "value": 80
        }
      ]
    },
    "addr": 4294982260,
    "bytes": "fd430191",
    "val": 80,
    "size": 4,
    "type": "add",
    "esilcost": 0,
    "cycles": 1,
    "failcycles": 0,
    "delay": 0,
    "stackptr": 0,
    "family": "cpu"
  }
]
[0x100003a74]>

The opex object provides the type and value of the operands of each opcode. But sometimes we need to understand which specific bits represent each of these operands. Without access to the internals of the decoder, we need to use aob.

Instruction Bits

The aob (Analyze Opcode Bits) command breaks down an instruction into its bit-level representation. This command works independently of the plugin used, which means that we can decode at the bit level any instruction from any architecture supported by radare2. That is 84 architectures:

$ rasm2 -L | wc -l
      84
$

This is particularly useful for understanding encoding schemes, debugging instruction decoding, or fuzzing with invalid instructions. Let’s see what the output of this command looks like and how it works internally.

[0x100003a74]> aob
22211111 33333322 03333333 100x00x0  : add x29, sp, 0x50

Each group of bits corresponds to fields in the instruction. This visual breakdown aids in analyzing how operands and opcodes are represented.

[0x100003a74]> aobv
22211111 33333322 03333333 100x00x0       add, x29, sp, 0x50
\\\____________\\_____________________ 2r sp     = 01
   \\\\\___________________\__________ 1r x29    = 073
         \\\\\\____\\\\\\\____________ 3i 0x50   = 037
              \_________\\_\\_\___ 0o add    = 04001

This ascii art representation lets us see which bits and locations are taken to handle every operand and the instruction itself:

The 2r, 1r, 3i, 0o labels correspond to the instruction fields: the digit is the field number used in the bit string, and the letter marks a register (r), an immediate value (i) or the operation code (o).

The octal number on the right is the packed representation of all the sequential bits of this operand. This gives us a much wider understanding of how this instruction, or its family, is composed.

Use aobj for structured data:

[0x100003a74]> aobj
{
  "opstr": "add x29, sp, 0x50",
  "size": 4,
  "bytes": [[2,2,2,1,1,1,1,1],[3,3,3,3,3,3,2,2],[0,3,3,3,3,3,3,3],[1,0,0,-1,0,0,-1,0]],
  "flipstr": "22211111 33333322 03333333 100x00x0",
  "args": [[23,24,26,27,29,30],[1,2,3,4,0,31],[5,6,7,8,9],[10,11,12,13,14,15,16,17,18,19,20,21,22]],
  "vals": [2049,31,59,1]
}

JSON output is ideal for scripting, such as fuzzers, that require bit manipulation or decoding.
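As a starting point for such scripts, the sketch below parses the aobj output shown above (pasted in as a string) and counts how many bits belong to each field. It also checks two things that appear to hold in this sample: the "bytes" matrix is the same information as "flipstr" with -1 standing for x, and each entry of "args" has as many positions as its field has bits. Those interpretations are inferred from this one output, not from radare2 documentation, and the function names are illustrative.

```python
import json
from collections import Counter

# The aobj output from above, pasted verbatim.
AOBJ = """
{
  "opstr": "add x29, sp, 0x50",
  "size": 4,
  "bytes": [[2,2,2,1,1,1,1,1],[3,3,3,3,3,3,2,2],[0,3,3,3,3,3,3,3],[1,0,0,-1,0,0,-1,0]],
  "flipstr": "22211111 33333322 03333333 100x00x0",
  "args": [[23,24,26,27,29,30],[1,2,3,4,0,31],[5,6,7,8,9],[10,11,12,13,14,15,16,17,18,19,20,21,22]],
  "vals": [2049,31,59,1]
}
"""


def field_widths(info):
    """Count the bits assigned to each field label in flipstr."""
    return Counter(info["flipstr"].replace(" ", ""))


def bytes_as_flipstr(info):
    """Rebuild the flipstr from the "bytes" matrix (-1 becomes 'x')."""
    return " ".join(
        "".join("x" if bit < 0 else str(bit) for bit in row)
        for row in info["bytes"]
    )


info = json.loads(AOBJ)
widths = field_widths(info)
print(dict(widths))
print(bytes_as_flipstr(info) == info["flipstr"])
print(all(widths[str(i)] == len(a) for i, a in enumerate(info["args"])))
```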

Instruction Fuzzing and Discovery

By flipping bits (e.g., toggling the x bits), you can craft new instructions to test how they behave on real hardware or to check for missing decodings in the disassembler. This is an effective way to explore reserved, invalid or undocumented instructions: write a program that executes them on different CPUs and compare the behaviours.
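A candidate generator for this kind of experiment can be sketched in a few lines. It takes the instruction bytes and the aob bit string from the example above and enumerates every combination of the x bits. One loud assumption: each 8-character group is taken to describe one byte in file order with the most significant bit first, which is not confirmed by the text. x_bit_variants is a hypothetical helper name, and the output still has to be disassembled or executed separately.

```python
from itertools import product


def x_bit_variants(hexbytes, flipstr):
    """Return hex strings for every combination of the bits marked 'x'.

    Assumption: group N of flipstr maps to byte N, most significant bit first.
    """
    data = bytes.fromhex(hexbytes)
    groups = flipstr.split()
    if len(groups) != len(data):
        raise ValueError("flipstr must have one 8-bit group per byte")
    # Collect (byte index, bit mask) for every 'x' position.
    masks = [
        (byte_index, 0x80 >> bit_index)
        for byte_index, group in enumerate(groups)
        for bit_index, label in enumerate(group)
        if label == "x"
    ]
    variants = []
    for choice in product((0, 1), repeat=len(masks)):
        buf = bytearray(data)
        for flip, (byte_index, mask) in zip(choice, masks):
            if flip:
                buf[byte_index] ^= mask
        variants.append(buf.hex())
    return variants


for candidate in x_bit_variants("fd430191", "22211111 33333322 03333333 100x00x0"):
    print(candidate)
```

The first candidate is always the original instruction, which makes it a convenient baseline when comparing results.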

Community Challenge

Automatic mode detection of ARM/Thumb modes remains a challenge in Radare2.

Today’s challenge is to pick up ticket #21294, find the root cause, and improve the code analysis or the bin parsers so that they provide the right analysis hints for asm.bits without human checks!

See you tomorrow on the 10th day of the AoR!