Program receive input using argument and in this case there are 3 arguments needed. All arguments passed to function f.
function f just add those 3 arguments and return the result. Looking at another function i dont see any suspicious function available, so there should be something in another place (outside .text section). The first approach i do is executing strings command to the executable
So we need to get value 189 which is the result of all correct input
1*3*3*3*7
rdi & rsi & rdx == 7219272754963824708
rdi, rsi, rdx are printable (<=0x7f)
rdi, rsi, rdx & 0x8080808080808080 == 0
rdi*rdi + rsi*rsi == 18116903027530606121
rdx*rdx + rsi*rsi == 16612709672999228116
We can see that the constraint looks like solveable using SMT solver like z3. So the last step is creating script to find the valid value for rdi, rsi, and rdx using z3. Here is my script
History teaches us there are good gods and bad ones. Good ones write their own deobfuscator. Bad ones just guess the flag directly.
Solution
Given ELF 64 bit, open it using IDA
There is obfuscation on main function, through debugging we can know which instruction executed and which are not. In this case i try to dump executed instruction by automating the next instruction process.
#!/usr/bin/python3import stringimport osimport stringclassSolverEquation(gdb.Command):def__init__ (self):super(SolverEquation, self).__init__ ("solve-equation",gdb.COMMAND_OBSCURE)definvoke (self,arg,from_tty): gdb.execute("pie b 0x15ce") gdb.execute("pie run < x.txt") arch = gdb.selected_frame().architecture()dict={} main_iter =1564# trial and errorfor _ inrange(main_iter):print(_) current_pc =addr2num(gdb.selected_frame().read_register("pc")) disa = arch.disassemble(current_pc)[0] addr = current_pc -0x555555554000if(addr notindict): dict[addr]= [disa["asm"], disa["length"]] gdb.execute("ni")for i indict:print(f"{hex(i)} : {dict[i]}")defaddr2num(addr):try:returnint(addr)# Python 3except:returnlong(addr)# Python 2SolverEquation()
From the trace we can see that there are two type of obfuscation.
Dont execute jump in ja
Execute jump in je
So there is a pattern although it use different register on each obfuscation part. With those knowledge we can create script to automatically deobfuscate the program.
The script will automatically nop obfuscation part then we can decompile the function. To create a function we can follow step below
Undefine instruction (press u)
Make instruction (press c)
Create function on push rbp (press p)
In some part we will facing issue regarding the old xreference, to patch it we can do undefine instruction then make instruction on address that has xreference. Here is the final code after deobfuscation
Here is the analysis for each function
main
receive input
call sub_1189 0x39 times. sub_1189 process our input and store it to unk_40c0
call sub_12a3. sub_12a3 process our input.
compare unk_40c0 with unk_2010 (static values)
sub_1189
Do some arithmetic operation
Each 4 bytes on input will be stored on 3 bytes result
i = [0,1,2,3]
(4*i) >> 3 = [0, 0, 1, 1] -> index on a1
((4*i) >> 3) + 1 = [None, 1, None, 2] -> index on a1
sub_12a3
Compare the count of character on a1 and a2, e.g:
a1 = "aabc", a2 = "caba"
sub_12a3(a1, a2) -> True
Since count of "a" in a1 and a2 is 2
Since count of "b" in a1 and a2 is 1
Since count of "c" in a1 and a2 is 1
Basically we can bruteforce 4 bytes input then compare 2 bytes result since those 2 bytes will not processed in further iteration. There will be many possibility but we can reduce the possibility and accelerate the process by modifying charset for each product iteration. In this case i implement that algorithm as recursive function. Here is my solver
import stringfrom itertools import productdefsub_1189(a1,a2,a3): a1[a3 >>3]^= (a2 >> (a3 &7))if((a3 &7) !=0): a1[(a3 >>3) +1]^= (a2 << (8- (a3&7)))&0xffdefrecur(arr,flag,charset):if(len(flag)>=56 ):print(f"Possible Flag : {flag}")for i inproduct(charset, repeat=4): arr2 = arr[:] tmp = flag +''.join(i).encode()for j inrange(len(flag), len(tmp)):sub_1189(arr2, tmp[j], 4*j)if(arr2[:len(tmp)//2]== cmp_val[:len(tmp)//2]): charset_tmp = charset[:]for k in charset_tmp: tmp2 = tmp.count(k.encode()) tmp3 = charset_ori.count(k)if(tmp2 == tmp3): charset_tmp.remove(k)recur(arr2, tmp, charset_tmp)charset_ori = [0x33,0x33,0x33,0x34,0x34,0x34,0x34,0x35,0x35,0x36,0x38,0x45,0x46,0x4c,0x50,0x5f,0x5f,0x5f,0x5f,0x5f,0x5f,0x5f,0x5f,0x5f,0x5f,0x5f,0x61,0x61,0x61,0x64,0x65,0x65,0x65,0x66,0x68,0x68,0x69,0x69,0x6c,0x6c,0x6d,0x6e,0x6e,0x72,0x73,0x74,0x74,0x74,0x74,0x76,0x77,0x7a,0x7a,0x7a,0x7b,0x7d]charset_ori = [chr(i)for i in charset_ori]flag =b"EPFL"for i in flag: charset_ori.remove(chr(i))uniq = []for i in charset_ori:if(i notin uniq): uniq.append(i)cmp_val = [0x40,0x42,0xb8,0x4f,0x92,0xe5,0x26,0x33,0xee,0xa0,0xc1,0x97,0xbc,0x4f,0x81,0x43,0x81,0xe2,0xdc,0x2b,0x92,0xf9,0xf,0x73,0x96,0x18,0x2b,0x33,0xd0]arr = [0for _ inrange(57)]for i inrange(len(flag)):sub_1189(arr, flag[i], 4*i)recur(arr, flag, uniq)
Flag : EPFL{3z_disa55em8le_4_an_3z_r3v_with_4n_ez_fl46_at_th4t}