Reverse Engineering

Challenge

Link

dive in the lake (50 pts)

Here

FUNTRAN (115 pts) UPSOLVE

Here

TheGoodGod (264 pts)

Here

smolene's latest obsession (338 pts) UPSOLVE

Here

dive in the lake (50 pts)

Description

Can you make it to 320m?

Solution

Given ELF 64 bit, open it using IDA

Program receive input using argument and in this case there are 3 arguments needed. All arguments passed to function f.

function f just add those 3 arguments and return the result. Looking at another function i dont see any suspicious function available, so there should be something in another place (outside .text section). The first approach i do is executing strings command to the executable

So there are some suspicious section such as .debug_info, searching that section i found this reference https://stackoverflow.com/questions/77169835/issue-in-parsing-dwarf-debug-info-section-in-an-elf-file. .debug_info section has relation with dwarf, searching about how to dump dwarf information from ELF file i found dwarfdump command. Execute dwarfdump command i got this result

We can see that the start address is 0x00001147 which is the end of function f.

Searching about dumped instruction i found that those instruction called dwarf expression https://llvm.org/docs/AMDGPUDwarfExtensionAllowLocationDescriptionOnTheDwarfExpressionStack/AMDGPUDwarfExtensionAllowLocationDescriptionOnTheDwarfExpressionStack.html. Next step i do is searching any useful command in GDB that related to dwarf expression and found that we can dump the instruction using info scope command https://yourlabs.org/posts/2014-04-13-gdb-debugging-basics/ .

My approach is trying to reconstruct the whole instruction in python with reference https://dwarfstd.org/doc/DWARF5.pdf. Here is my reimplemented code

def push(arr, val):
  arr.insert(0, val)

def and_op(arr):
  tmp1 = arr.pop(0)
  tmp2 = arr.pop(0)
  tmp3 = tmp1 & tmp2
  push(arr, tmp3)

def mul_op(arr):
  tmp1 = arr.pop(0)
  tmp2 = arr.pop(0)
  tmp3 = tmp1 * tmp2
  push(arr, tmp3)

def mul_op(arr):
  tmp1 = arr.pop(0)
  tmp2 = arr.pop(0)
  tmp3 = tmp1 + tmp2
  push(arr, tmp3)

def ne_op(arr):
  tmp1 = arr.pop(0)
  tmp2 = arr.pop(0)
  return tmp1 != tmp2


stack = []

rdi = 123
rsi = 123
rdx = 123

push(stack, 1)
push(stack, rdi)
push(stack, rsi)
and_op(stack)
push(stack, rdx)
and_op(stack)
push(stack, 7219272754963824708)

if(ne_op(stack) == 0):
  push(stack, 3)
  mul_op(stack)

push(stack, 9259542123273814144)
push(stack, 9259542123273814144)
push(stack, 9259542123273814144)
push(stack, rdi)
and_op(stack)
push(stack, 0)

if(ne_op(stack) == 0):
  push(stack, rsi)
  and_op(stack)
  tmp = stack.pop(0)
  
  if(tmp == 0):
    push(stack, rdx)
    and_op(stack)
    tmp = stack.pop(0)
    if(tmp == 0):
      push(stack, 3)
      mul_op(stack)
  
  push(stack, rdi)
  push(stack, rdi)
  mul_op(stack)

  push(stack, rsi)
  push(stack, rsi)
  mul_op(stack)
  
  add_op(stack)
  push(stack, 18116903027530606121)
  
  if(ne_op(stack) == 0):
    push(stack, 3)
    mul_op(stack)
  
  push(stack, rdx)
  push(stack, rdx)
  mul_op(stack)

  push(stack, rsi)
  push(stack, rsi)
  mul_op(stack)

  add_op(stack)
  push(stack, 16612709672999228116)
  
  if(ne_op(stack) == 0):
    push(stack, 7)
    mul_op(stack)
  
  print(stack.pop(0)) # should be 189

Lets analyze each part

So we need to get value 189 which is the result of all correct input
- 1*3*3*3*7
rdi & rsi & rdx == 7219272754963824708
rdi, rsi, rdx are printable (<=0x7f)
- rdi, rsi, rdx & 0x8080808080808080 == 0
rdi*rdi + rsi*rsi == 18116903027530606121
rdx*rdx + rsi*rsi == 16612709672999228116

We can see that the constraint looks like solveable using SMT solver like z3. So the last step is creating script to find the valid value for rdi, rsi, and rdx using z3. Here is my script

from z3 import *
from Crypto.Util.number import *

rdi = BitVec('x', 64)
rsi = BitVec('y', 64)
rdx = BitVec('z', 64)

s = Solver()

s.add(rdi & rsi & rdx == 7219272754963824708)
s.add(rdi*rdi + rsi*rsi == 18116903027530606121)
s.add(rdx*rdx + rsi*rsi == 16612709672999228116)
s.add(rdi & 0x8080808080808080 == 0)
s.add(rsi & 0x8080808080808080 == 0)
s.add(rdx & 0x8080808080808080 == 0)

print(s.check())
model = s.model()

flag = long_to_bytes(model[rdi].as_long())[::-1]
flag += long_to_bytes(model[rsi].as_long())[::-1]
flag += long_to_bytes(model[rdx].as_long())[::-1]
print(flag)

Flag : EPFL{_1tTookS3venDwarf5}

FUNTRAN (50 pts)

Description

Solution

TheGoodGod (264 pts)

Description

History teaches us there are good gods and bad ones. Good ones write their own deobfuscator. Bad ones just guess the flag directly.

Solution

Given ELF 64 bit, open it using IDA

There is obfuscation on main function, through debugging we can know which instruction executed and which are not. In this case i try to dump executed instruction by automating the next instruction process.

#!/usr/bin/python3
import string
import os
import string

class SolverEquation(gdb.Command):
    def __init__ (self):
        super (SolverEquation, self).__init__ ("solve-equation",gdb.COMMAND_OBSCURE)
    
    def invoke (self, arg, from_tty):
        
        gdb.execute("pie b 0x15ce")
        gdb.execute("pie run < x.txt")
        arch = gdb.selected_frame().architecture()
        dict = {}
        main_iter = 1564 # trial and error
        for _ in range(main_iter):
            print(_)
            current_pc = addr2num(gdb.selected_frame().read_register("pc"))
            disa = arch.disassemble(current_pc)[0]
            addr = current_pc - 0x555555554000
            if(addr not in dict):
                dict[addr] = [disa["asm"], disa["length"]]
            gdb.execute("ni")
        for i in dict:
            print(f"{hex(i)} : {dict[i]}")
        
def addr2num(addr):
    try:
        return int(addr)  # Python 3
    except:
        return long(addr) # Python 2
SolverEquation()

0x15ce : ['and    r11,0x7fff', 7]
0x15d5 : ['mov    rax,r11', 3]
0x15d8 : ['imul   rax,r11', 4]
0x15dc : ['add    r11,rax', 3]
0x15df : ['and    r11,0x1', 4]
0x15e3 : ['ja     0x5555555555e7', 2]
0x15e5 : ['push   rbp', 1]
0x15e6 : ['mov    rbp,rsp', 3]
0x15e9 : ['sub    rsp,0x20', 4]
0x15ed : ['and    rax,0x7fff', 6]
0x15f3 : ['mov    r11,rax', 3]
0x15f6 : ['imul   r11,rax', 4]
0x15fa : ['add    rax,r11', 3]
0x15fd : ['and    rax,0x1', 4]
0x1601 : ['je     0x555555555605', 2]
0x1605 : ['mov    DWORD PTR [rbp-0x14],edi', 3]
0x1608 : ['mov    QWORD PTR [rbp-0x20],rsi', 4]
0x160c : ['lea    rax,[rip+0xa1a]        # 0x55555555602d', 7]
0x1613 : ['mov    rdi,rax', 3]
0x1616 : ['mov    eax,0x0', 5]
0x161b : ['call   0x555555555050 <printf@plt>', 5]
0x1620 : ['and    r11,0x7fff', 7]
0x1627 : ['mov    rdx,r11', 3]
0x162a : ['imul   rdx,r11', 4]
0x162e : ['add    r11,rdx', 3]
0x1631 : ['and    r11,0x1', 4]
0x1635 : ['ja     0x55555555563a', 2]
0x1637 : ['mov    rax,QWORD PTR [rip+0x2a22]        # 0x555555558060 <stdin>', 7]
0x163e : ['mov    rdx,rax', 3]
0x1641 : ['mov    esi,0x39', 5]
0x1646 : ['and    r11,0x7fff', 7]
0x164d : ['mov    rax,r11', 3]
0x1650 : ['imul   rax,r11', 4]
0x1654 : ['add    r11,rax', 3]
0x1657 : ['and    r11,0x1', 4]
0x165b : ['je     0x55555555565f', 2]
0x165f : ['lea    rax,[rip+0x2a1a]        # 0x555555558080', 7]
0x1666 : ['mov    rdi,rax', 3]
0x1669 : ['call   0x555555555080 <fgets@plt>', 5]
0x166e : ['test   rax,rax', 3]
0x1671 : ['jne    0x55555555567d', 2]
0x167d : ['and    rcx,0x7fff', 7]
0x1684 : ['mov    rax,rcx', 3]
0x1687 : ['imul   rax,rcx', 4]
0x168b : ['add    rcx,rax', 3]
0x168e : ['and    rcx,0x1', 4]
0x1692 : ['je     0x555555555696', 2]
0x1696 : ['mov    BYTE PTR [rip+0x2a1b],0x0        # 0x5555555580b8', 7]
0x169d : ['mov    QWORD PTR [rbp-0x8],0x0', 8]
0x16a5 : ['jmp    0x555555555712', 2]
0x1712 : ['cmp    QWORD PTR [rbp-0x8],0x38', 5]
0x1717 : ['jbe    0x5555555556a7', 2]
0x16a7 : ['and    rax,0x7fff', 6]
0x16ad : ['mov    rdx,rax', 3]
0x16b0 : ['imul   rdx,rax', 4]
0x16b4 : ['add    rax,rdx', 3]
0x16b7 : ['and    rax,0x1', 4]
0x16bb : ['je     0x5555555556c2', 2]
0x16c2 : ['mov    rax,QWORD PTR [rbp-0x8]', 4]
0x16c6 : ['lea    rdx,[rax*4+0x0]', 8]
0x16ce : ['lea    rcx,[rip+0x29ab]        # 0x555555558080', 7]
0x16d5 : ['mov    rax,QWORD PTR [rbp-0x8]', 4]
0x16d9 : ['add    rax,rcx', 3]
0x16dc : ['movzx  eax,BYTE PTR [rax]', 3]
0x16df : ['movsx  eax,al', 3]
0x16e2 : ['mov    esi,eax', 2]
0x16e4 : ['lea    rax,[rip+0x29d5]        # 0x5555555580c0', 7]
0x16eb : ['mov    rdi,rax', 3]
0x16ee : ['call   0x555555555189', 5]
0x16f3 : ['and    rax,0x7fff', 6]
0x16f9 : ['mov    rcx,rax', 3]
0x16fc : ['imul   rcx,rax', 4]
0x1700 : ['add    rax,rcx', 3]
0x1703 : ['and    rax,0x1', 4]
0x1707 : ['je     0x55555555570d', 2]
0x170d : ['add    QWORD PTR [rbp-0x8],0x1', 5]
0x1719 : ['and    r10,0x7fff', 7]
0x1720 : ['mov    rcx,r10', 3]
0x1723 : ['imul   rcx,r10', 4]
0x1727 : ['add    r10,rcx', 3]
0x172a : ['and    r10,0x1', 4]
0x172e : ['je     0x555555555734', 2]
0x1734 : ['mov    edx,0x39', 5]
0x1739 : ['lea    rax,[rip+0x28e0]        # 0x555555558020', 7]
0x1740 : ['mov    rsi,rax', 3]
0x1743 : ['and    r8,0x7fff', 7]
0x174a : ['mov    r11,r8', 3]
0x174d : ['imul   r11,r8', 4]
0x1751 : ['add    r8,r11', 3]
0x1754 : ['and    r8,0x1', 4]
0x1758 : ['ja     0x55555555575b', 2]
0x175a : ['lea    rax,[rip+0x291f]        # 0x555555558080', 7]
0x1761 : ['mov    rdi,rax', 3]
0x1764 : ['call   0x55555555528c', 5]
0x1769 : ['test   eax,eax', 2]
0x176b : ['jne    0x5555555557e8', 2]
0x17e8 : ['and    rax,0x7fff', 6]
0x17ee : ['mov    r11,rax', 3]
0x17f1 : ['imul   r11,rax', 4]
0x17f5 : ['add    rax,r11', 3]
0x17f8 : ['and    rax,0x1', 4]
0x17fc : ['ja     0x5555555557ff', 2]
0x17fe : ['lea    rax,[rip+0x83d]        # 0x555555556042', 7]
0x1805 : ['mov    rdi,rax', 3]
0x1808 : ['call   0x555555555030 <puts@plt>', 5]
0x180d : ['mov    eax,0x0', 5]

From the trace we can see that there are two type of obfuscation.

Dont execute jump in ja
Execute jump in je

So there is a pattern although it use different register on each obfuscation part. With those knowledge we can create script to automatically deobfuscate the program.

from idaapi import *
import idc
import ida_bytes

def deobfuscate(address):
	start_address = address
	insn = insn_t()
	last_inst = ""
	list_inst = []
	
	for i in range(500):
		create_insn(start_address)
		size = decode_insn(insn, start_address)
		inst = idc.print_insn_mnem(start_address)
		operand = []
		
		for j in range(3):
			tmp = idc.print_operand(start_address, j)
			if(tmp == ''):
				break
			operand.append(tmp)
		
		joined_operand = ','.join(operand)
		list_inst.append([start_address, size, inst, joined_operand])
	
		if(inst == "jz" and last_inst == "and"):
			tmp = joined_operand.split("_")[-1].split("+")
			new_addr = 0
			for j in range(len(tmp)):
				new_addr += int(tmp[j], 16)
			ida_bytes.del_items(start_address + size)
			ida_bytes.del_items(new_addr + 1)
			
			for j in range(1,7):
				tmp_inst = list_inst[-1*j]
				for k in range(tmp_inst[1]):
					patch_byte(tmp_inst[0]+k, 0x90)
			for j in range(start_address, new_addr):
				patch_byte(j, 0x90)
			
			start_address = new_addr
		elif(inst == "ja" and last_inst == "and"):
			for j in range(1,7):
				tmp_inst = list_inst[-1*j]
				for k in range(tmp_inst[1]):
					patch_byte(tmp_inst[0]+k, 0x90)
		elif(inst == "retn"):
			break
		else:
			start_address += size
		last_inst = inst

list_address = [0x1189, 0x128C, 0x15ce]
for start_address in list_address:
	deobfuscate(start_address)

The script will automatically nop obfuscation part then we can decompile the function. To create a function we can follow step below

Undefine instruction (press u)
Make instruction (press c)
Create function on push rbp (press p)

In some part we will facing issue regarding the old xreference, to patch it we can do undefine instruction then make instruction on address that has xreference. Here is the final code after deobfuscation

Here is the analysis for each function

main
- receive input
- call sub_1189 0x39 times. sub_1189 process our input and store it to unk_40c0
- call sub_12a3. sub_12a3 process our input.
- compare unk_40c0 with unk_2010 (static values)
sub_1189
- Do some arithmetic operation
- Each 4 bytes on input will be stored on 3 bytes result
  - i = [0,1,2,3]
    (4*i) >> 3 = [0, 0, 1, 1] -> index on a1
    ((4*i) >> 3) + 1 = [None, 1, None, 2] -> index on a1
sub_12a3
- Compare the count of character on a1 and a2, e.g:
  - a1 = "aabc", a2 = "caba"
  - sub_12a3(a1, a2) -> True
    Since count of "a" in a1 and a2 is 2
    Since count of "b" in a1 and a2 is 1
    Since count of "c" in a1 and a2 is 1

Basically we can bruteforce 4 bytes input then compare 2 bytes result since those 2 bytes will not processed in further iteration. There will be many possibility but we can reduce the possibility and accelerate the process by modifying charset for each product iteration. In this case i implement that algorithm as recursive function. Here is my solver

import string
from itertools import product

def sub_1189(a1, a2, a3):
	a1[a3 >> 3] ^= (a2 >> (a3 & 7))
	if((a3 & 7) != 0):
		a1[(a3 >> 3) + 1] ^= (a2 << (8 - (a3&7)))&0xff

def recur(arr, flag, charset):
	if(len(flag) >= 56 ):
		print(f"Possible Flag : {flag}")
	for i in product(charset, repeat=4):
		arr2 = arr[:]
		tmp = flag + ''.join(i).encode()
		for j in range(len(flag), len(tmp)):
			sub_1189(arr2, tmp[j], 4*j)
		if(arr2[:len(tmp)//2] == cmp_val[:len(tmp)//2]):
			charset_tmp = charset[:]
			for k in charset_tmp:
				tmp2 = tmp.count(k.encode())
				tmp3 = charset_ori.count(k)
				if(tmp2 == tmp3):
					charset_tmp.remove(k)
			recur(arr2, tmp, charset_tmp)

charset_ori = [0x33, 0x33, 0x33, 0x34, 0x34, 0x34, 0x34, 0x35, 0x35, 0x36, 0x38, 0x45, 0x46, 0x4c, 0x50, 0x5f, 0x5f, 0x5f, 0x5f, 0x5f, 0x5f, 0x5f, 0x5f, 0x5f, 0x5f, 0x5f, 0x61, 0x61, 0x61, 0x64, 0x65, 0x65, 0x65, 0x66, 0x68, 0x68, 0x69, 0x69, 0x6c, 0x6c, 0x6d, 0x6e, 0x6e, 0x72, 0x73, 0x74, 0x74, 0x74, 0x74, 0x76, 0x77, 0x7a, 0x7a, 0x7a, 0x7b, 0x7d]
charset_ori = [chr(i) for i in charset_ori]

flag = b"EPFL"

for i in flag:
	charset_ori.remove(chr(i))

uniq = []
for i in charset_ori:
	if(i not in uniq):
		uniq.append(i)

cmp_val = [0x40, 0x42, 0xb8, 0x4f, 0x92, 0xe5, 0x26, 0x33, 0xee, 0xa0, 0xc1, 0x97, 0xbc, 0x4f, 0x81, 0x43, 0x81, 0xe2, 0xdc, 0x2b, 0x92, 0xf9, 0xf, 0x73, 0x96, 0x18, 0x2b, 0x33, 0xd0]

arr = [0 for _ in range(57)]

for i in range(len(flag)):
	sub_1189(arr, flag[i], 4*i)
recur(arr, flag, uniq)

Flag : EPFL{3z_disa55em8le_4_an_3z_r3v_with_4n_ez_fl46_at_th4t}

smolene's latest obsession (338 pts)

Description

Solution

PreviousLakeCTF Quals NextCryptography

Last updated 1 year ago