Emanuele Cozzi @invano

CSAW 2018 quals - 1337

The NOPS team couldn’t miss CSAW CTF 2018 qualification round. We scored 6th in Europe and 17th worldwide. I mostly focused on reversing with kvm, rev 500, and 1337 initially rev 300, later upgraded to rev 500.

Today an Eurecom former student sent me an email asking how I solved 1337. The promise was to write a quick high-level email and then stay available for more in-depth questions from him. Well, this thing went out of control and I ended up writing an email-writeup.

I thought of sharing this publicly, since there are still no writeups around for 1337. Maybe someone else will find it useful to understand what was a possible way to approach this challenge. I modified the email-writeup a bit and added some pictures to make it more readable.

The first thing to notice in 1337 were the sections named as UPX0 and UPX1. This is typical naming of UPX-packed PEs. However 1) entropy is low, 2) if you open the binary in IDA you can see something wrong is going on by just looking at the control-flow graph with some zoom-out. The CFG is not the typical UPX one. The author goal here was to fool you to think the binary is packed…

Once you go past this, the second thing you notice is that IDA cannot disassemble it properly. Some instructions stay unexplored, lots of functions cannot be defined, asm tricks and so on. This is the point where I switched to dynamic analysis. Windows VM plus x64dbg, my favourite debugger.

The execution under debugger revealed interesting features:

  1. the binary triggers a continuous exception 0xC0000005 (access violation exception)
  2. the binary installs an exception handler (mmmmhhhh)
  3. if you write a wrong password the binary prints Nope! on GUI (weird, doesn’t happen from console)
  4. if you put a breakpoint, 1337.exe exits and debugger stops

At this point it was pretty clear 1337 is doing anti-dbg the hard way. You cannot use breakpoints. Either you patch the installed exception handler or you use hardware breakpoints (no checks for hwbpt but you only have 4 slots).

Let’s switch to strings to see if we get lucky. There are 2 format strings: one is %12s, the other %24s. A bit of IDA and a bit of debugger and you see %12s is used to get your password at address 0x5104C3.

1337 fake input

Here you can see that your input is checksummed using CRC32 and compared with the constant 0x2477c. If you pass the check, 1337 prints flag{your_input}. As soon as I noticed this, I immediately said myself the author decided to create a trap. Should we really bruteforce 12 chars? Also, you reach this point only if you use normal breakpoints.. definitely a trap!

So I switched to the %24s format string referenced at address 0x529055 and I started using hardware breakpoints. The key point here was to focus on the real algorithm without wasting time on useless asm instructions inserted everywhere just to make a lot, but really a lot, of noise. Because of this noise and the asm tricks, you cannot even use a decompiler (unless you clean the code). Pure asm.

After IDA + x64dbg + feelings/guessing I was able to explore the code in the correct way and finally reconstruct the full algorithm. See the feelings/guessing part. I only want the flag, I don’t need to understand how the entire binary works.

Overall behavior:

  • get 24B from stdin
  • divide the input in 3 chunks of 8B each
  • for each chunk, do crazy math and store the result
  • get a “special” string from the PE resource directory / string table
  • compare the previous results with this “special” buffer byte by byte
  • if you pass the check, then you get the flag.

The crazy math looks like this when decompiled:

1337 crazy math

Also note the __PAIR__ functions. That’s an IDA macro to compute an unsigned long value from 2 arguments. In our case the binary is 32bits but does 64bits math.

The code above can be translated to a way more readable python one liner (thanks @reyammer here :D):

   def mul(val):
      # val is a 8B input chunk
      res = (val >> 62) + pow(5, val, 2**64) - 1
      return res & 0xffffffffffffffff

How is the final buffer loaded from resources? There is a jump table right before looping on the next 8B chars of the input. This jump table makes it possible to load the final buffer. However, this buffer it’s overwritten with a different resource on every iteration. Case 4 of the jump table loads the bytes we want.

1337 jumptable

Also remember that every byte equal to 0x2d is swapped with 0x0.

1337 2d to zero

You have the math, you have the final buffer, you have to go back to the correct input that let you pass the check. Because of mul properties, you can see that the lowest byte of val only affects the lowest byte of res. See the picture below. Cool, you can bruteforce the input from the lowest byte to the highest. I wrote a recursive bruteforcer, since you can potentially have multiple solutions.

1337 pow bytes

Troubles begin exactly at this point. No way I can get a printable input even if I was totally sure about the 1337’s algorithm. I pinged the author to check if the hard-coded buffer was correct and yes, it was.

The organizers released an hint during the second day: “Use Hebrew as system language.” That means they exploit Windows code pages for chars encoding. If your system language set is not Hebrew, LoadStringA (used to load strings from resources) gives you a “corrupted” buffer.

1337 hebrew

I noticed the resources declared as LANG_HEBREW before the hint… but this encoding trick was totally new to me. I thought language is Hebrew because, well, the challenge author must be Hebrew at this point :).

Look at here: https://en.wikipedia.org/wiki/Windows-1255. If the code point is 0x0045 (‘E’), LoadStringA translates it to 0x45. What if the code point is 0x05DE? If your system is not using Hebrew code page, that value will be translated to 0x3F.. Instead, with the right language set it should be 0xEE.

Consider the first 8B buffer used in the final check:

B0 75 85 3F 08 3F B9 BE

1337 is wrongly converting unicode code points 05D5 and 05D4 to 0x3F instead of 0xE5 and 0xE4. We must convert every char correctly or we will not get any flag.

The final check consists of a 24*1B memcmp between the result computed from the input and the expected value. memcmp is called through call eax from address 0x577098.

I fixed the magic buffers with Windows cp-1255 in mind and ffiiiiinaalllyy I got the flag: flag{4.\/3ry.1337.|>45$vVp@9/}

You can find the full script HERE.

P.S. 22nd char of the magic buffer is F893. It’s not belonging to cp1255 and a quick google search suggested it’s an invalid character. I preferred to bruteforce it (result is 0xDF) and save time for other challenges.