
04 - Searching strings

Welcome to the 4th day of the Advent of Radare2!

Today’s knowledge pill is about finding specific strings within a binary, a common task not only in reverse engineering but also in forensics when performing data carving. Remember? That was the original reason why radare was created.

Searching for a simple message like “Hello” may sound straightforward, but binary formats, encodings, and compiler optimizations can make it more complex.

In this post, we’ll learn about the different Radare2 commands that scan memory to locate strings, and explore external tools like rafind2 for scanning arbitrary files directly from the shell.

Plain ASCII Strings

Let’s begin with something simple: searching for plain ASCII strings in a binary. In radare2 we use the / command to search. To find a string such as “Hello”, simply use:

[0x00000000]> / Hello

This command searches for the ASCII “Hello” sequence and returns any hits. Radare2 will display the offset of each match, allowing you to jump to the location and examine the surrounding code or data.
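
To make the idea concrete, here is a minimal Python sketch of what a plain byte-sequence search does conceptually. It is not how radare2 implements / internally; the find_all helper and the sample buffer are hypothetical, and the output format only imitates r2's hit lines.

```python
def find_all(data: bytes, needle: bytes):
    """Yield every offset where needle occurs in data (overlapping hits included)."""
    pos = data.find(needle)
    while pos != -1:
        yield pos
        pos = data.find(needle, pos + 1)

# Hypothetical sample buffer standing in for the contents of a binary.
sample = b"\x7fELF\x00Hello\x00padding\x00Hello, world\x00"

hits = list(find_all(sample, b"Hello"))
for n, off in enumerate(hits):
    # Imitates the "offset hitN_M" layout of r2 search results.
    print(f"0x{off:08x} hit0_{n}")
```

On a real file you would read the bytes with open(path, "rb").read() and pass them in place of the sample buffer.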

Sounds simple, but things can become a little more complex if we don’t know how to answer the following questions:

NOTE that all the / subcommands honor the search.* eval variables (which can also be set with -e). This means that if we are not spotting anything, it may be because we need to change the search scope. By default, when loading binaries, r2 will scan only the portions of the physical file that are mapped in memory.

That’s the default value:

[0x100003a58]> e search.in
io.maps
[0x100003a58]>

But we can specify any other range:

[0x00000000]> e search.in=?
raw
flag
block
bin.section
bin.sections
bin.sections.rwx
bin.sections.r
bin.sections.rw
bin.sections.rx
bin.sections.wx
bin.sections.x
io.map
io.maps
io.maps.rwx
io.maps.r
io.maps.rw
...
[0x00000000]>

We can even specify our own range via search.from and search.to. If we just want to scan the entire file, without parsing the binary headers or mapping any portion at a virtual address, we can launch r2 -n.

Wide Strings

Sometimes strings are stored in UTF-16 or another wide encoding. To find them, use the /w command. UTF-16 strings are common in binaries compiled for Windows, especially those written in languages that support Unicode by default. Here’s an example of how to search for the “Hello” string in UTF-16:

[0x00000000]> /w Hello

Wide strings take 2 bytes per character and can be written with ww. Note that text encodings are also affected by endianness, and we can even find UTF-32 strings, mainly used for single code points or glyphs.

[0x00000000]> ww hello
[0x00000000]> x 16
- offset -   0 1  2 3  4 5  6 7  8 9  A B  C D  E F  0123456789ABCDEF
0x00000000  6800 6500 6c00 6c00 6f00 0000 0000 0000  h.e.l.l.o.......
[0x00000000]> e cfg.bigendian = true
[0x00000000]> ww hello
[0x00000000]> x 16
- offset -   0 1  2 3  4 5  6 7  8 9  A B  C D  E F  0123456789ABCDEF
0x00000000  0068 0065 006c 006c 006f 0000 0000 0000  .h.e.l.l.o......
[0x00000000]>

Having fixed-size runes to store the characters of a string makes indexing characters and computing lengths faster.

To search for wide strings we use the /w command.

[0x00000000]> /w hello
0x00000001 hit0_0 680065006c006c006f00
[0x00000000]>
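
The hit at 0x00000001 above can look surprising. The following Python sketch, which only uses the standard codecs and assumes /w looks for the little-endian byte pattern shown in the hit line, reproduces the situation: the buffer holds the big-endian form written by the second ww, and the little-endian pattern happens to match one byte later.

```python
text = "hello"

# The two byte layouts produced by ww with cfg.bigendian off and on.
little = text.encode("utf-16-le")   # 68 00 65 00 ...
big = text.encode("utf-16-be")      # 00 68 00 65 ...

print(little.hex())
print(big.hex())

# 16-byte buffer as left by the second "ww hello" (big-endian) in the session.
buf = big + b"\x00" * (16 - len(big))

# Searching for the little-endian pattern still matches, shifted by one byte,
# because the zero bytes line up with the next character's high byte.
hit = buf.find(little)
print(f"0x{hit:08x}")
```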

Note that UTF-8 strings are also detected by default with normal / scans. Feel free to explore the other subcommands listed by /? to discover further capabilities and use cases in case you need them.

Base64

Many programs use base64 encoding to “hide” their strings. radare2 can decode them all in just a single line, but we can also encode the string we like and scan the binary for that pattern like this:

[0x100003a58]> p6es hello
aGVsbG8=
[0x00000000]> p6es ello
ZWxsbw==
[0x00000000]> p6es llo
bGxv
[0x00000000]>
[0x100003a58]> / `p6es hello`
...

It’s important to note that base64 encodes its input in groups of 3 bytes (each group producing 4 output characters), so the same substring can end up with a different representation depending on where it falls within those groups.
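
A small Python sketch of this alignment effect, using only the standard base64 module. The encodings_by_alignment helper and the "A" filler bytes are hypothetical, chosen just to shift the word by 0, 1 and 2 bytes.

```python
import base64

def encodings_by_alignment(secret: bytes) -> dict:
    """Encode secret preceded by 0, 1 and 2 filler bytes to show the 3 alignments."""
    out = {}
    for shift in range(3):
        prefix = b"A" * shift  # hypothetical filler standing in for preceding data
        out[shift] = base64.b64encode(prefix + secret).decode()
    return out

table = encodings_by_alignment(b"hello")
for shift, encoded in table.items():
    print(shift, encoded)
```

Because of this, when hunting for a keyword inside a longer base64 blob it may be worth searching for the encoded forms of a few shorter substrings too, as done with "ello" and "llo" in the session above.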

To decode the base64 strings you can use a script that takes every string, runs p6d on it, and writes a comment (or sets the comment on the given flag) with the decoded string. Then we can use CC~base64~... to filter for our keywords.
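
As a rough standalone illustration of that idea, without driving radare2 at all, the Python sketch below looks for base64-looking runs in a buffer and keeps the ones that decode to printable ASCII. The regular expression, the 8-character minimum and the decode_candidates name are assumptions made for this example, not radare2 behavior.

```python
import base64
import binascii
import re

# Assumption: a candidate is 8+ base64 alphabet characters plus optional padding.
B64_RUN = re.compile(rb"[A-Za-z0-9+/]{8,}={0,2}")

def decode_candidates(data: bytes):
    """Yield (offset, encoded, decoded) for runs that decode to printable ASCII."""
    for m in B64_RUN.finditer(data):
        token = m.group()
        if len(token) % 4:
            continue  # not a whole number of 4-character groups
        try:
            raw = base64.b64decode(token, validate=True)
        except (binascii.Error, ValueError):
            continue
        if raw and all(32 <= b < 127 for b in raw):
            yield m.start(), token.decode(), raw.decode("ascii")

# Hypothetical sample data containing one encoded message.
sample = b"\x00\x01aGVsbG8gd29ybGQ=\x00junk\x00"
found = list(decode_candidates(sample))
for off, enc, dec in found:
    print(f"0x{off:08x} {enc} -> {dec}")
```

Expect false positives on real binaries, since many ordinary identifiers are also valid base64.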

This is the help for the base64 facilities in radare2:

[0x100003a58]> p6?
Usage: p6[d|e][s|z]  [len]  base64 decoding/encoding
| p6d[s|z] [len]    decode current block as base64
| p6e[s|z][len]     encode current block in base64
| p6ez              encode base64 zero-terminated string
| p6es hello world  encode given string to base64
| p6ds AAA=         decode given base64 string
| p6dz              decode null-terminated base64 string in block
[0x100003a58]>

Strings in Programs

When loading known file formats like programs or libraries that contain executable code, we can rely on the linker having packed all the strings used by the program into a single section. That section is usually mapped as read-only and non-executable, and contains all the null-terminated strings concatenated one after the other.

RBin is smart enough to spot that region by checking the sections, scanning it for strings and making them available as the “binary strings” list shown by the iz command.

[0x100003a58]> iS~str
3   0x00007a15   0x506 0x100007a15   0x506 -r-x CSTRINGS         3.__TEXT.__cstring
[0x100003a58]>

For a deeper scan, we can use izz which scans all the mapped regions instead of just the cstrings one.

Additionally, we can use izzz, which ignores the mapped areas and the virtual addressing setup (io.va). This helps find strings inside binary headers or parts of the executable that never reach memory when executed.

There are usually more strings there than the ones referenced by the program, so we can use that area to find long unreferenced strings and overwrite them with our own for patching purposes. If we need to replace a string with a longer one, this trick may work better than resizing strings.

Note that the algorithm implemented under iz is also the one available in /z, which can be used to do manual scans on the regions defined by the search.in, search.from and search.to variables.

[0x100003a58]> /z?
| /z min max  search for strings of given size
[0x100003a58]>

Note that the minimum and maximum boundaries used for scanning strings can also be configured in rabin2 with the following command-line option:

$ rabin2 -h | grep min:max
 -N [min:max]    force min:max number of chars per string (see -z and -zz)

...as well as through the bin.str.min and bin.str.max variables in radare2.
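
To illustrate what a min/max bounded string scan does, here is a simplified Python sketch. It only handles printable ASCII runs, and skipping runs longer than max_len is an assumption of this example; the actual iz and /z logic in radare2 handles more encodings and may treat long runs differently. The scan_strings name and the sample buffer are hypothetical.

```python
import re

def scan_strings(data: bytes, min_len: int = 4, max_len: int = 64):
    """Yield (offset, text) for printable ASCII runs within the length bounds."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for m in pattern.finditer(data):
        run = m.group()
        if len(run) <= max_len:  # assumption: longer runs are simply skipped
            yield m.start(), run.decode("ascii")

# Hypothetical buffer with short and long printable runs.
sample = b"\x00\x01abc\x00Enter Password\x00\xffOK\x00password123\x00"

default_scan = list(scan_strings(sample))
narrow_scan = list(scan_strings(sample, min_len=4, max_len=12))
for off, text in default_scan:
    print(f"0x{off:08x} {text}")
```

Lowering min_len surfaces short strings such as "OK" at the cost of much more noise, which is the usual trade-off when tuning these bounds.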

rafind2 from the Shell

Radare2 includes an external tool called rafind2, which provides the searching capabilities of radare2 directly from the command line. rafind2 is especially useful if you want to search for strings without loading the entire binary into Radare2 and you don’t care about interactivity.

Here’s how to use rafind2 to search for the “Hello” string:

$ rafind2 -s Hello /path/to/binary

The -s flag specifies the search string, and the remaining arguments are the target files. rafind2 will scan them for instances of “Hello” and output any matches, similar to Radare2’s / command.

Other interesting flags from rafind2 are:

For example:

$ rafind2 -s libS /bin/ls /bin/sleep
/bin/ls: 0x46a1
/bin/ls: 0x10689
/bin/sleep: 0x44e1
/bin/sleep: 0xc531
$

NOTE All the programs in radare2 use the IO library to access “files” and read data from there, so it’s important to note that this /bin/ls can be also replaced with ptrace://pid or tcp: or any other URI handler listed by r2 -L to (for example) scan memory from running processes.

Assembly Constructed Strings

In modern programming languages like Rust and Swift, strings may not appear as direct ASCII or Unicode sequences in the binary. These languages often use assembly-level string construction, which compiles down to instructions that build strings dynamically at runtime. Radare2’s /az command allows us to search for such assembly-constructed strings.

Here’s how you can use /az:

[0x100002db0]> /az
0x100001634 Enter Password
0x100001704 Password
0x100001b4c 123
0x100001b58 password
0x100001bc0 password123
0x100001c38 123
0x100001d40 Submit
0x100002004 Success
0x10000201c Error
0x100002114 OK
0x100002f10 forp

[0x100002db0]> 0x100001634
[0x100001634]> pd 10
 ;-- asm.str.0_Enter_Password_1:
 0x100001634      a0c88dd2       mov x0, 0x6e45             ; 'En' ; string "Enter_Password"
 0x100001638      80aeacf2       movk x0, 0x6574, lsl 16    ; 'te'
 0x10000163c      400ec4f2       movk x0, 0x2072, lsl 32    ; 'r '
 0x100001640      002aecf2       movk x0, 0x6150, lsl 48    ; 'Pa'
 0x100001644      616e8ed2       mov x1, 0x7373             ; 'ss'
 0x100001648      e1eeadf2       movk x1, 0x6f77, lsl 16    ; 'wo'
 0x10000164c      418eccf2       movk x1, 0x6472, lsl 32    ; 'rd'
 0x100001650      01c0fdf2       movk x1, 0xee00, lsl 48
 0x100001654      23140094       bl sym SwiftUI.LocalizedStringKey.stringLiteral...cfC
 0x100001658      ff4300d1       sub sp, sp, 0x10
[0x100001634]>

The /az command performs a linear disassembly of all the program instructions, identifying code patterns that construct lists of characters with immediates and shifts. This search is more CPU-intensive than ASCII or UTF-16 searches, but it’s invaluable for languages and frameworks that construct strings dynamically. The main reason is that release builds of programs compiled with modern versions of LLVM end up inlining short strings by packing the immediates in registers instead of touching memory, for performance and size reasons.

Note that /az creates flags prefixed with asm.str. as well as comments with the “string:” prefix, which is enough to make the pseudo decompilation readable.

It’s important to note that in this case the Swift API can take a single short string packed in two 64-bit registers, x0 and x1, which together provide 16 bytes of room for the inlined string.
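
The packing can be checked by hand. This Python sketch rebuilds x0 and x1 from the immediates and shifts shown in the disassembly above and reads the bytes back in little-endian order. The pack_movk helper is hypothetical, and treating the trailing 0x00 and 0xee bytes as terminator and metadata rather than text is an assumption based on this one listing.

```python
def pack_movk(chunks):
    """Combine (imm16, shift) pairs the way a mov followed by movk instructions would."""
    value = 0
    for imm, shift in chunks:
        value &= ~(0xFFFF << shift)       # movk replaces only its own 16-bit lane
        value |= (imm & 0xFFFF) << shift
    return value

# Immediates and shifts copied from the pd 10 output.
x0 = pack_movk([(0x6e45, 0), (0x6574, 16), (0x2072, 32), (0x6150, 48)])
x1 = pack_movk([(0x7373, 0), (0x6f77, 16), (0x6472, 32), (0xee00, 48)])

# arm64 is little-endian here, so the lowest byte of x0 is the first character.
raw = x0.to_bytes(8, "little") + x1.to_bytes(8, "little")

# Assumption: the text ends at the first NUL; the final byte (0xee) is not text.
text = raw.split(b"\x00", 1)[0].decode("ascii")
print(text)
```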

Challenge

To practice what we learned here, take this base64 string, which contains a program that asks for a password. Using ONLY rax2, rafind2 and radare2 you should be able to spot the password!

Post the solution in your favorite social network and learn from other responses to improve your skills!

Use this oneliner to extract the binary (base64 -D is the macOS spelling; GNU coreutils uses base64 -d):

$ cat file.txt | base64 -D | xz -d > file.bin
/Td6WFoAAATm1rRGBMDPEfiTAyEBHAAAAAAAANaLJFHgyfcIx10AZ76Zr95iOx0/aCMide+MdjEVIJPn
bwxkGuaPAv06L03zaTHI1hbGPBGlsNs8tO99SHN9xdc7yyszQnogDKwj1t1sWXar5/Y+AbQWG4PfUDVC
tGiHty76xEwgo2+3Q9ER9q4lNNE12drZ3+PRLa97z9uGD4X0B5ze0jC0NmCW4XyPHVTk3u/wcQ8ngJNJ
k3FoQtfepbaTR7nDG9Ze+4G8bC0Sf7Jp/GVJ/g5/a3EbcZwYYY8VBUMSh9u8nor9XMAkgNgXSrWWi/Mm
x1GVotePgwrO+ClVMycDdjUXIRc9DX78gVRiVUtjCaC1nRIdzDTPDBjQGh27O2nKmJ13TcV7KagOhLKI
xIUnbVHius7esB6/SUCVHAcQOmWJrOQrPV8WqoJlr2xisHBbbKJVjTsp3RyQza1evv9fPKZKz5rQXFDd
h5tLVpIMb4VQ92XFPPwAh2hSBlBjtzTXg2+csJPZH/AW+c9QCzvuCfml04OYiDJk2kZsl8mPc5fha5X+
lmXTcDLZS4++385iRbzthCYWopNyK4Qh4LoMuRZ3InP5cO1KqU9UWmeZ96c4iwNdGhBf6/GuLMySTTK0
YPY0tB0RyPfu4hH5bdF18a7UnUzWWlI8z25Y5DXIP96aA0BjXH/oq3djh7I26pnN1zerddfhHRdHKJgI
0RwZ4Ghe+5yZ/8LhErhM9u3RS+pke12KadEueeScckDdbR4eabURoE2qZhEz4WuDPuFKfekWwVFhbHqu
0rFN8FBKYz9K1y7sZ8iMCnuEFVWQtImVZm22HS69mkcqoWf7qp2ShggDE5YSyd4WqkijWroQbQBVXt+g
2tSGNkRzWHxEx37M7WGCEHr3JWTqfU4iUNSHnpEX66T4pqMSOEb8Xzi1sCBs9uruHu5lXvXdHArUXIXJ
ML9p18lATx1FRzjCuaX9y1zGVBSln6H+l/ZwMYa3OvAqwUP4PG5L5UwqRytI1NSEaDypk2N5vKgPTt3Y
suY16NL93TBXLJsNAG/iXj0QRaea5Ctq70go546jKw8dM5pULLbZHX9swWzzR7ZMz5HQmJkuUxDSi2jO
jZOZ6NaVua1Zb/R/c8CxLBrW+9B2F3fL+4FYj5LbNxcWfI3Vt3Irm+ku6Fn2aeDQH33LY6FM3BlIsow9
rHYqAQ4gHymCfng4fPEpVz6inabuNqpoVaFlCAaaV7FkoRJAzx5XP6CZQNZ9FVZsDbUwtmRJLGsNYggW
2WClPrZpYgJAbf2BRJFENWVt1lCw0RFh9kuVYl1/2TvU2E+ZKn9sADoBn8IIr6vsVtqDkqDhgRAwIjNO
eggnHJOYWsL90JYY3usfB2lhD3sdFauOOHCeD7jjIFBrNLehZ/vCzGdpyTpq6GmbsxPnif3C1on6istP
9Q7cFzxL+wcvWv0tMRtsd7A03u36jyN3NLBZvuCUidt5EVMm+XG4CkDFhgKLo9mGaEzVfRBkLR4b3+NG
a1U4h+v1dAQiBu1UMeAteQwaX894jkPGH5RDc7jeqbGlsKEAlWpzxGbQTefU7+P4b8HygRhLRgMTdsIF
/Wi+DgYJABbkGAF5N7XhQtwgkLsCnwyCs3Hnt1w3kKNYzR4+54ANRov2uW+cQWK3Vef9pgXEI7N+kCy6
gcrfL7mqLetdTLqu7Teb4BdrKaXWMDEZ+pyfTfFYpw6kkrmlw01TONBKfzoC8dXNYDxgJew+LHgk7si7
n7DR2zBqZ6xkBNjgCuPYUFL4E+nJz7s3x3jTCnFh0nkDIaRJkx/QMlL7vbwFL9lSa/RhltqsWXTdeEAH
IrPNRXFFSZxgh7T2ig+AGeGZ505KU/GbwMsIABhzxZa2+BJ1ZJnNoK4CsXXsuUXV9przhHS9A/P81YGU
q2MmEZost0LnXo6oPzFx/XrVOEc+TeiYutzlBmqnpSE0cWR3bqy+NNxmTLs9wK4wyJQo38JbSY2eFwV3
ckTxtaW62aPRD6iZnZ2XWG7Ho71Y0Fq+/tH3Aw3pn4DChfX/OqDrDYdMLnh5GP5FixNPQQ4/zY98Mrkp
r39kE7HBOQGnd23zwtiE5Tco6LIFoyE824zhw8YEoZmMgJuypuKbNrup675pggX2o3blZ3NiCJqmXOBD
krE4S6TIpEKdgwoUpdLXLi4vFjC3NHNgEyGafAOoK3glv5n/WYFAcDQdmT5r8Nf55yZab/VPMvQHfkL2
a9qJKUxvesQWyZgbiyXSYJOKzJYq5mrHkiI8ZG/SLzzjyBbvXeFVj98eRvVFpFciCyN93nr890p31RHL
8a/gEeQG4myOoy3eySwhf6FcjMkkcaOP851PG71trq4OkR6bfKqDZ0aTz7TQRWtx9M5OIuEd5ZgWPBY7
zxc4ui6X+UMsERHSxuTLRr4caiV6UuYbURou/17JTq0U6ao0/EjXuIdgVVAgbXzFWsATRqD5Q/VY4atf
WjYp9F0uM9V3OzU9OVJoJZRqjMGc2lseCl3dqAUQP81D1cSPkfdvcppYBKhLiMdLRYzLyn2EOXRBE0Ki
IRzxLFftVvBAEhsUq5uDAA8sOZhBvgZ6p0mtDVW0SEa7YJKe34PqqKqIchVzjoCD7vZwKMdy78+M+/OO
Nguh180nwwQOChO2n/4IJxLhvD2nKU+xr7bE83jVygzfOhUPa6YdRafPL5m95g1a5Fltil3H2rfNu4rR
LHgj7B99yYvC3JhADtqmYhcHBfjHV81JvOFi9vHtfxAdW4VHUEFTX3m6iMtN3cLres7AaOd1YRcUa+bc
oX6wN99Jvau5kmfSUxlYG1gmL2Rn1LZzkzu2hv/koDOqlSlWuagmo5XUmz3hffZvJjYc9QAV2KLeNDWB
V7If+dxvptpxyPn/OXa4esIx9tndwQcOiB8R56mOczkm7NKR4PswFuQ9O+3k3RTtiBlSExHcb6jVniLX
BQGto1wP6T3g867npIVCtMeFPCq7acJjobXCSWQ8xSd2CGOyiW9TXOYmtew3Vo2f8vEQHrliFaFg2VfJ
DetnDpISAABs+oUmSA1prwAB6xH4kwMAjOr8dbHEZ/sCAAAAAARZWg==

Summary

Finding strings in binaries can reveal important clues, whether they’re ASCII, Unicode, or dynamically constructed. In this post we learned the basics and got some practical tips and exercises to get fluent in the process.

See you tomorrow in another advent post of radare2!

–pancake