# Flare-On 2018 - Challenge 12 - Subleq'n'RSSB

After explaining my solution for the last challenge of Flare-On 2018, I want to move one and analyze the details of the second stage. I will entirely focus on two OISC machines: Subleq1 and RSSB2. Take it as a personal training to tackle these processors and a way to get rid of the curiosity I wasn't able to eliminate when I firstly scored this challenge.
The goal is to create an IDA Pro processor module to reverse the code fully statically. This is possible—for humans at least—only if the code can be translated to an higher level representation. We first analyze the Subleq and then switch to the RSSB.

## Subleq

A Subleq instruction has 3 operands `A B C`. The memory pointed by `A` is subtracted from the memory pointed by `B`. The result is stored into the memory address `B`. If the subtraction result is less than or equal to 0, then the instruction pointer takes the value of `C`. If the subtraction result is greater than 0—or `C` is 0—the instruction pointer is incremented by 3 to point to the next instruction.
These conditions are not enough to have readable code. We can divide them in four new instructions:

• `CLEAR`: if `A == B` and `C == 0`;
• `SUB`: if `A != B` and `C == 0`;
• `JMP`: if `A == B` and `C != 0`;
• `JLE`: if `A != B` and `C != 0`.

With four different instructions it already makes more sense to write an IDA processor. Look at the IDA graph view result. Not bad at all. We have unconditional and conditional jumps, we recognize an abvious loop and an exit branch. Ah, I manually fixed the addresses of some indirect jumps to get this graph. Graph view would show unlinked branches otherwise.

So far so good. A nice looking picture but still quite unreadable. Reversing with four instruction for sure is less painful than using only a `subleq` instruction. However, we can try to do better and introduce more high level mnemonics. A great help comes from the official solution of challenge 11 of Flare-On 2017. The author, Nick Harbour, published some macros he wrote and that we can re-implement in IDA.

We start with `MOV` that he wrote in this way:

``````.macro MOV(%a, %b) // Move %a to %b
{
MOVJMP(%a, %b, Z)
}
{
subleq %b, %b
subleq %a, Z
subleq Z, %b
subleq Z, Z, %c
}
``````

The `MOV` reveals a new pattern. As you can see, there are three `MOV`s modifying a fourth one. Check what is modified in the macro above. The three `MOV`s overwrite the `%b` operand of the `MOVJMP` macro. We call it `MOV_DEREF_DST`: move `A` to the memory pointed by deferencing `B`. The macro is:

``````.macro MOV_DEREF_DST(%a, %b)
{
MOV(%b, _DEREF_DST)
MOV(%b, _DEREF_DST + 1)
MOV(%b, _DEREF_DST + 7)
_DEREF_DST:
MOV(%a, Z)
}
``````

In a similar way we can write `MOV_DEREF_SRC` to overwrite the `%a` operand of the `MOVJMP` macro at run-time:

``````.macro MOV_DEREF_SRC(%a, %b)
{
MOV(%a, _DEREF_SRC + 3)
_DEREF_SRC:
MOV(Z, %b)
}
``````

We can also add some arithmetic instructions like `ADD`, `INC` and `DEC`. The `ADD` can be written like this:

``````.macro ADD(%a, %b)
{
subleq %a, Z
subleq Z, %b
subleq Z, Z
}
``````

`%a` is negated and stored in register `Z` and then subtracted from `B`, thus giving an addition. `INC` is an `ADD` with one:

``````.macro INC(%a)
{
_ONE:
dw 0x1
}
``````

Last one is `DEC`:

``````.macro DEC(%a)
{
subleq Z, Z, _SUB
_ONE:
dw 0x1
_SUB:
subleq _ONE, %a
}
``````

Now our IDA processor has 10 instructions but there is one more sequence we can convert to a macro. In the picture you can see that `A` is subtracted from `B` and the result is checked to be less than or equal to 0. Immediately after, the result is checked to be greater than 0. This is a `JNE` instruction that we can translate to: ``````.macro JNE(%a, %b, %c) // Jump to %c if %a != %b
{
MOV(%b, _VAR)
SUB(%a, _VAR)
subleq _VAR, Z, _LESS_EQUAL
subleq Z, Z, _GREATER
_LESS_EQUAL:
subleq Z, Z
subleq Z, _VAR, _EQUAL
_GREATER:
subleq Z, Z, %c
_VAR:
dw 0x0
_EQUAL:
}
``````

The instruction set is quite complete and the new IDA graph view is much much lighter, beside the fact that the code can be reversed completely statically at this point. Compare the graph on the right with the one at the beginning of the post.
The really cool thing is that we can also see this Subleq code implements concepts like stack pointer, base pointer for the function frame or function calls. Look at the picture for example. Once the code is lifted to something more meaningful, the RE pain vanishes. This Subleq works as interpreter for another inner VM, a Reverse Subtract and Skip if Borrow (RSSB) machine. This also explains why the Subleq traces were incredibly verbose. I release the IDA processor on GitHub implementing all the macros explained so far, so that you can have a closer look at the lifted code. Be aware that I didn't create a macro for indirect jumps—like the one in the `POP&RET` picture. Either you create it as an exercise, or you extract the value of the `POP` right before a function return and manually patch the code in IDA. Last thing: this IDA module is an experimental toy highly depending on the macros used in this challenge.
Finally the code of the Subleq captured from IDA and translated to Python.

``````def dispatch(mem):
ip = mem
if mem[ip] == -1:
return False

mem = (mem[mem[ip]] - mem) & 0xffff
mem = mem - 65536 if mem > 32767 else mem

if mem[ip] != 2:
mem[mem[ip]] = mem

if mem < 0:
return True
return False

mem = [] # RSSB code
size = len(mem)

while mem <= size:
if mem[mem] == -2:
break

if dispatch(mem):
mem += 1
mem += 1

if mem == 1:
mem = 0
chr_out = chr(code)
mem = 0
if mem == 1:
mem = 0
mem = ord(chr_in)``````

RSSB instructions have one operand `A` which points to memory. Both Subleq and RSSB have the IP register mapped to memory location 0x0, but the latter has a second register mapped at location 0x1: the accumulator `ACC`. Every `rssb` instruction subtracts `ACC` from `A` and stores the result in both `A` and `ACC`. If the result is less than 0, the next instruction is skipped. Jumps can be taken by manipulating the program counter.
RSSB is more extreme than Subleq as you can see. Reversing pure RSSB instructions would be mind-blowing. We must absolutely recognize some macros, write an IDA processor module for RSSB and make the code reversable with ease—as we just did for the Subleq.

The most basic instructions we can implement out of a single `rssb` are:

• `CLEAR_ACC`: if `A == 1`, because `A` will point to `ACC` thus giving `ACC-ACC`;
• `NOP`: if `A == -1`, because the RSSB in question jumps to the next instruction without any consideration.

At this point we should stare at our code with three instructions and spot some patterns which can be converted to an higher-level representation.
The first two are `SUB` and `ADD`, with these macros:

``````.macro SUB(%a, %b)   // Subtract %a from %b and store in %b
{
rssb ACC          // ACC is at mem 1
rssb Z            // Z is at mem 2
}

.macro SUB_V2(%a, %b)   // Subtract %a from %b and store in %b
{
rssb ACC          // ACC is at mem 1
}

.macro ADD(%a, %b)   // Add %a to %b and store in %b
{
}
``````

Let's explain them a bit. In the `SUB` we reset the accumulator and put `%a` in it. Two `rssb Z` are used to preserve the sign of `%a` and two `NOP` to do not skip instructions depending on the sign. The last instruction will be `%b-%a` with `%a` being in the accumulator. `SUB` can be written also with three `rssb Z` without interleaving `NOP`s, `SUB_V2`. The effect will be the same. The `ADD` is mostly equal except for the missing `NOP`s. We don't need them because we want to invert the sign of `%a` to have `%b-(-%a)`.

Subtraction can also be used to set memory to zero:

``````.macro CLEAR_MEM(%a)  // Set memory at %a to 0
{
SUB(%a, %a)
}
``````

Now let's add unconditional branches with a `JMP` macro:

``````.macro JMP(%a)    // Jump to %a
{
ADD(_VAL, IP)        // IP is at mem 0
_VAL:
dw (_VAL - %a) // compute jump length
}
``````

You notice how we directly modify the program counter by adding the length of the jump to it.
What about `INC` and `DEC`:

``````.macro INC(%a)    // Increment %a by 1
{
JMP(_CONTINUE)
_ONE:
dw 0x1
_CONTINUE:
}

.macro DEC(%a)    // Decrement %a by 1
{
SUB(_ONE, %a)
JMP(_CONTINUE)
_ONE:
dw 0x1
_CONTINUE:
}
``````

We are still missing `MOV` and conditional branches, the most important instructions to understand the control flow of a program. The challenge author implemented the `MOV` by using `CLEAR_MEM` in conjunction with `ADD`. The respective macro is going to be:

``````.macro MOV(%a, %b)   // mov %a to %b
{
CLEAR_MEM(%b)
}
``````

After pushing our new macros inside an IDA processor for RSSB, it gets easier to recognize new—more complicated—macros. You can see above a `MOV_DEREF_DST`. Move `A` to the memory pointed by dereferencing `B`. The pretty looking macro will be:

``````.macro MOV_DEREF_DST(%a, %b)   // Move %a to ptr %b
{
MOV(%b, _CLEAR)
MOV(%b, _CLEAR + 4)
_CLEAR:
SUB_V2(Z, Z)
MOV(%b, _DEREF_DST + 4)
_DEREF_DST:
}
``````

`MOV_DEREF_SRC` is quite similar but a bit shorter. The macro for `MOV_DEREF_SRC` is:

``````.macro MOV_DEREF_SRC(%a, %b)   // Move ptr %a to %b
{
CLEAR_MEM(%b)
MOV(%a, _DEREF_SRC + 1)
_DEREF_SRC:
}
``````

Conditional branches in our RSSB are more fascinating than all other instructions. I recognized a `JGE` / `JLE` macro by analyzing the last low-level snippet of code remaining in my IDA disassembly. The RSSB code in IDA is quite long and would clutter the blog post. Go here to have a look at the big picture.
Try to reason some minutes on the code and maybe you can agree it makes sense to translate it to a conditional branch. The pattern is not unique tho. Our RSSB code shows different instructions combinations achieve a `JGE` behavior. For example the pattern at offset `0xfd6` of the RSSB memory is slightly different from the pattern at offset `0x17a4`.

Below is what could be a `JGE` macro. Be aware it doesn't match 1:1 with our RSSB.

``````.macro JGE(%a, %b, %c)   // Jump to %b if %a >= 0 else %c
{
CLEAR_MEM(_VAR1)
CLEAR_MEM(_VAR2)
rssb ACC              // Skip this if %a < 0
rssb _VAR1            // _VAR 1 is 0 if %a >= 0 else -%a
MOV(_VAR1, _VAR2)}
MOV(_JL, _VAR_JL)
MOV(_JGE, _VAR_JGE)
DEC(_VAR2)
JMP(_L1)

_VAR1:
dw 0x0
_VAR2:
dw 0x0
_VAR_JL:
dw 0x0
_VAR_JGE:
dw 0x0
_VAR3:
dw 0x0
_JL:
dw %c
_JGE:
dw %b

_L1:
rssb _VAR2          // Here _VAR2 is -1 if %a >= 0 else is 0
INC(_VAR2)
ADD(_VAR2, _VAR3)   // Create a jump length of 14
rssb _VAR3         // _VAR3 is 0 if %a >= 0 else 14

CLEAR_MEM(_VAR3)

rssb ACC          // We get here if %a >= 0

CLEAR_MEM(_VAR3)

rssb ACC          // We get here if %a < 0
``````

The RSSB IDA processor now counts 14 instructions that allow us to see memory movements, arithmetic operations and control flow decisions. IDA graph view should make a lot of sense by now. And in fact… In the first part of this blog post I said my final goal is to reverse the RSSB statically with IDA Pro. Well, it's possible now. The macros made their job and reversing the machine doesn't cause a crazy headache anymore. There is also here the concept of stack, push, pop, base pointer and function call like in the Subleq. There are indirect jumps as well for which I didn't provide any macro, but target addresses can be solved by embedding in the IDA module a tiny stack emulator. The alternative is to patch the code manually as I did.
Actually there is a branch missing in the graph: it goes from the bottom to the small loop you can see in the top part of the image. I didn't include it on purpose, or IDA would align the basic blocks in a weird way. Head here for a closer look to the full graph after lifting the code.

I publish on GitHub also the toy RSSB processor for IDA. What I said for the Subleq module holds for the RSSB one. You can use the IDA processors to experiment, play with it and modify it, but do not expect they will work with other VM code. These processors are NickHarbour-dependent :) and have been coded with no special effort.

Ah, in the cat'n'grep writeup I was wondering what was the big number added to the sum of all input chars up to `@` excluded. Mistery solved—it's `(i << 10) + (i << 11)`, signed and over 16 bits of course. `i` is equal to the index of `@` in the password.

The python code doing the same encryption as of the RSSB is:

``````def wrap(val):
return (ord(val) - 32) << 7 | (ord(val) - 32)

def twos_ca(val):
if val >> 15:
return -((1 - val) & 0xffff) + 1
return val

def encrypt(text):
res = []
tot_sum = 0

for i, c in enumerate(text):
c = ord(c)
if c - 0x40 == 0:
tot_sum += (twos_ca(i<<10) + twos_ca(i<<11))
break
else:
tot_sum += c

for i in range(15):
v = wrap(text[i*2:i*2+2])
v = v ^ ((i << 5) | i)
res.append(tot_sum + v)
return res``````

If you never tried to create an IDA processor module, you can take the following resources as a starting point:

Unfortunately most of the resources you find online are for IDA 6 API. The Subleq and RSSB processors I wrote are for IDA >= 7.