Xihu Lunjian (西湖论剑) CTF: gghdl Writeup

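This was one of the more interesting challenges in Xihu Lunjian. It uses the GHDL library to simulate the execution of a hardware description language (HDL) design, which I assume implements some kind of character encoding. In the last hour of the contest I noticed that every character always encodes to the same value, so I got my teammate Ren to help and we spent about 40 minutes debugging by hand to build a lookup table. We solved it with very little time to spare. To avoid another race against the clock like that, I wrote an automated debugging script in IDAPython after the contest. This post records how it works and the pitfalls I ran into.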

Challenge link

0x00. Program flow

Let's first go through the overall flow of the program.

It is a while + switch structure:

image-20211124222707805

case 0 reads the input, which must be at least 44 characters long:

image-20211124222741905

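There is presumably an encryption step in between, which we can ignore. Jumping straight to case 5, the program decides whether the comparison succeeded by testing *(a1 + 272) == 44. So wherever *(a1 + 272) is incremented is where the real comparison happens: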

image-20211124223047937

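case 7 modifies *(a1 + 272) in several places, and the comparison appears to be done in segments: 8 bytes per segment, 4 bytes in the last one, compared byte by byte: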

image-20211124223235617

Every segment uses the same comparison function, which I renamed to check:

image-20211124223432909

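Dynamic debugging shows that both the ciphertext and the encrypted input are made up of 03 and 02 bytes, which I will call the "32 encoding" for now. Each character is encoded as 8 bytes, and every byte is either 03 or 02.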

image-20211124224041018

A little more debugging shows that each character always encodes to the same value.
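To make the "32 encoding" concrete, here is a minimal sketch that unpacks one dumped 64-bit value into its 8 bytes. The value is the encoding of '1' taken from the table dumped later in this writeup; the helper name qword_to_bytes is my own, and little-endian order is assumed because the target is x86-64.

```python
def qword_to_bytes(value):
    """Return the 8 bytes of a 64-bit value in little-endian (memory) order."""
    return value.to_bytes(8, "little")


# Dumped encoding of '1', copied from the decode table later in this writeup.
code_for_1 = 216739043520741891
raw = qword_to_bytes(code_for_1)

# Show the bytes as they would appear in a memory dump.
print(" ".join("%02x" % b for b in raw))  # 03 02 03 02 03 03 02 03

# Every byte should be either 0x02 or 0x03.
print(set(raw) <= {0x02, 0x03})  # True
```

Because one character fits in exactly one qword, a single 64-bit read is enough to capture its whole encoding.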

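0x01. Manual debugging approach

check takes two arguments, passed in the rdi and rsi registers. The first argument a1 points to the address of the encoded input character, and the second argument a2 points to the address of the ciphertext: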

image-20211124224726705

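Set a breakpoint on check, build a set of common characters "0123456789qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM{}_-", split it into chunks, and feed them in over two debugging runs, dumping the encoding of every character to build a lookup table. Dump the ciphertext along the way, then decode each ciphertext character through the table to get the flag. This is how I solved it during the contest: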

image-20211124225012855
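The chunking step can be sketched as follows. This is only an illustration: split_charset is a hypothetical helper, and padding with '#' is an assumption borrowed from the charset definitions used by the scripts later in this writeup. The chunk size of 44 comes from the minimum input length the program accepts.

```python
CHARSET = "0123456789qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM{}_-"
INPUT_LEN = 44  # minimum input length the program accepts
PAD = "#"       # filler character (assumption, mirrors the later scripts)


def split_charset(charset, size=INPUT_LEN, pad=PAD):
    """Split charset into inputs of exactly `size` characters, padding the last."""
    chunks = [charset[i:i + size] for i in range(0, len(charset), size)]
    return [chunk.ljust(size, pad) for chunk in chunks]


# One line per debugging run; each line is a valid 44-character input.
for chunk in split_charset(CHARSET):
    print(chunk)
```

With 66 characters in the set, this yields two inputs, which matches the two debugging runs described above.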

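However, building the table by hand in the debugger is very time-consuming and error-prone.

0x02. Automating the debugging with IDAPython

IDAPython can do the debugging instead of a human. Here is the complete script first; the explanation follows piece by piece: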

from idaapi import *
from idc_bc695 import LocByName
import time

# Connect to the remote debug server and start the target.
set_remote_debugger('192.168.253.148', '-1')
start_process()
# Block until the process is suspended so the script and debuggee are in sync.
wait_for_next_event(WFNE_SUSP, -1)

decode_map = {}  # encoded qword -> character code
cipher = []      # encoded ciphertext, one qword per character
charset = b'1234567890qwertyuiopasdfghjklzxcvbnm{}_-####'
#charset = b'QWERTYUIOPASDFGHJKLZXCVBNM##################'

for i in range(44):
    run_to(LocByName('check'))
    wait_for_next_event(WFNE_SUSP, -1)
    # rdi -> encoded input character, rsi -> encoded ciphertext character
    decode_map[get_64bit(get_64bit(get_reg_val('rdi')))] = charset[i]
    cipher.append(get_64bit(get_64bit(get_reg_val('rsi'))))
    print('Dump map: ' + str(['%x: %c' % (k, v) for k, v in decode_map.items()]))
    print('Dump ciphertext: ' + str([hex(i) for i in cipher]))

print('decode_map = ' + str(decode_map))
print('cipher = ' + str(cipher))

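First we set the IP and port of the remote debug server; I pass -1 so that the default port 23946 is used. (Strictly speaking, the IDAPython signature is set_remote_debugger(host, pass, port=-1), so the second argument is the server password and the port is the third argument, which already defaults to -1; the default port is used either way.) The start_process function then starts the debugging session: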

set_remote_debugger('192.168.253.148', '-1')
start_process()

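Note that the IDAPython script runs asynchronously with respect to the debuggee. The script does not wait for the process to start before executing whatever follows start_process(); it carries on immediately. So we need to configure the debugger to suspend the process on startup: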

image-20211124225615727

Then wait for the suspend event, which synchronizes the script with the debuggee:

wait_for_next_event(WFNE_SUSP, -1)

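Once the debuggee has started and been suspended, the script receives the event and continues. decode_map is the lookup table we are building, mapping each encoding to its character. cipher holds the ciphertext to be dumped. The two charset definitions are the common characters; together they are longer than 44 characters, so they have to be split in two and dumped over two debugging runs: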

decode_map = {}
cipher = []
charset = b'1234567890qwertyuiopasdfghjklzxcvbnm{}_-####'
#charset = b'QWERTYUIOPASDFGHJKLZXCVBNM##################'

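run_to(LocByName('check')) makes the debuggee run until it reaches the check function and stop there. Because the script and the actual debuggee run asynchronously, we again need wait_for_next_event(WFNE_SUSP, -1) to wait for the suspend event: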

for i in range(44):
    run_to(LocByName('check'))
    wait_for_next_event(WFNE_SUSP, -1)
    decode_map[get_64bit(get_64bit(get_reg_val('rdi')))] = charset[i]
    cipher.append(get_64bit(get_64bit(get_reg_val('rsi'))))
    print('Dump map: ' + str(['%x: %c' % (k, v) for k, v in decode_map.items()]))
    print('Dump ciphertext: ' + str([hex(i) for i in cipher]))
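The "run, then wait for the suspend event" pair is easy to get wrong if one half is forgotten, so it may be worth wrapping both calls in one helper. The sketch below is an illustration under assumptions: run_until is a hypothetical name, and FakeDbg is a stand-in I made up so the snippet can run outside IDA. Inside IDA the dbg argument would be the idaapi module, which provides run_to, wait_for_next_event and WFNE_SUSP as used in the script above.

```python
def run_until(dbg, ea, timeout=-1):
    """Ask the debugger to run to `ea`, then block until the process is suspended.

    `dbg` is any object exposing run_to, wait_for_next_event and WFNE_SUSP;
    inside IDA that would be the idaapi module itself.
    """
    dbg.run_to(ea)
    return dbg.wait_for_next_event(dbg.WFNE_SUSP, timeout)


class FakeDbg:
    """Stand-in for idaapi so the sketch runs outside IDA (not a real API)."""
    WFNE_SUSP = 1  # placeholder value for this fake only

    def __init__(self):
        self.calls = []

    def run_to(self, ea):
        self.calls.append(("run_to", ea))
        return True

    def wait_for_next_event(self, wfne, timeout):
        self.calls.append(("wait", wfne, timeout))
        return 0


fake = FakeDbg()
run_until(fake, 0x401000)  # 0x401000 is a made-up address for the demo
print(fake.calls)
```

In the real script the loop body would then start with something like run_until(idaapi, LocByName('check')), keeping the two calls together.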

These two lines do the actual dumping of the mapping table and the ciphertext:

decode_map[get_64bit(get_64bit(get_reg_val('rdi')))] = charset[i]
cipher.append(get_64bit(get_64bit(get_reg_val('rsi'))))
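The table is only usable for decoding if no two characters share an encoding, and a plain dict assignment would silently hide such a collision. As an optional safeguard, here is a small sketch of a checked insert. The record helper is hypothetical and not part of the original script; the sample values are copied from the table dumped in this writeup.

```python
def record(table, code, ch):
    """Store code -> ch, refusing to silently overwrite a different character."""
    old = table.setdefault(code, ch)
    if old != ch:
        raise ValueError("code %x maps to both %r and %r" % (code, old, ch))


table = {}
record(table, 216739043520741891, ord("1"))  # encoding of '1' from the dump
record(table, 217020518514229763, ord("#"))  # padding character
record(table, 217020518514229763, ord("#"))  # repeated padding is harmless
print(table)

try:
    # A deliberately wrong entry, to show that the conflict is detected.
    record(table, 216739043520741891, ord("2"))
except ValueError as exc:
    print("conflict:", exc)
```

Repeated '#' padding maps to the same code each time, so it passes the check without any special handling.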

The script while running:

image-20211124230554421

Output:
image-20211124230607029

Merge the mapping tables dumped in the two runs and decode the ciphertext with the result to get the flag:

decode_map = {216739043520741891: 49, 144962924459524611: 50, 217020518497452547: 51, 144680349971186179: 52, 216737944009114115: 53, 144961824947896835: 54, 217019418985824771: 55, 144681445187846659: 56, 216739039225774595: 57, 144681449482813955: 48, 216739043520742147: 113, 217019418985825027: 119, 216737944025891587: 101, 144962924459524867: 114, 144680349971186435: 116, 216739039225774851: 121, 216737944009114371: 117, 216739039242552067: 105, 217019414707634947: 111, 144681449482814211: 112, 216739043537519363: 97, 217020518497452803: 115, 144680349987963651: 100, 144961824964674307: 102, 217019419002602243: 103, 144681445204624131: 104, 144962920181334787: 106, 217020514219262723: 107, 144680345692996355: 108, 144962920164557571: 122, 144681445187846915: 120, 217020518514230019: 99, 144961824947897091: 118, 144962924476302083: 98, 144961820669707011: 110, 216737939730924291: 109, 217020514202485507: 123, 216737939714147075: 125, 217019414690792195: 95, 216737939730924035: 45, 217020518514229763: 35}
cipher = [144680349987898115, 216739043537453827, 217020518497387267, 217020518514164483, 144680349971120899, 144961824964608771, 217020514202485507, 144681449482813955, 144961824947896835, 144962924459524611, 216739039225774595, 216739043520741891, 216739039225774595, 144961824947896835, 144962924459524611, 216737939730924035, 216739043537519363, 144962924476302083, 216739043537519363, 144680349987963651, 216737939730924035, 144680349971186179, 144681449482813955, 217020518514230019, 144681445187846659, 216737939730924035, 144681445187846659, 217020518497452547, 216739043520741891, 144681445187846659, 216737939730924035, 144961824964674307, 144681449482813955, 216739043537519363, 144961824947896835, 144962924476302083, 216739043520741891, 144681445187846659, 144961824947896835, 144680349971186179, 144681445187846659, 144962924459524611, 217020518514230019, 216737939714147075]
decode_map2 = {216739043520676611: 81, 217019418985759491: 87, 216737944025826051: 69, 144962924459459331: 82, 144680349971120899: 84, 216739039225709315: 89, 216737944009048835: 85, 216739039242486531: 73, 217019414707569411: 79, 144681449482748675: 80, 216739043537453827: 65, 217020518497387267: 83, 144680349987898115: 68, 144961824964608771: 70, 217019419002536707: 71, 144681445204558595: 72, 144962920181269251: 74, 217020514219197187: 75, 144680345692930819: 76, 144962920164492035: 90, 144681445187781379: 88, 217020518514164483: 67, 144961824947831555: 86, 144962924476236547: 66, 144961820669641475: 78, 216737939730858755: 77, 217020518514229763: 35}
decode_map.update(decode_map2)
print(''.join(map(lambda x : chr(decode_map[x]), cipher)))

Output:

DASCTF{06291962-abad-40c8-8318-f0a6b186482c}
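If a ciphertext value were missing from the tables, the one-liner above would fail with a KeyError. A slightly more forgiving variant, sketched here as an assumption rather than what I actually ran, merges any number of tables and marks unknown codes with a placeholder. The decode helper and the table names are hypothetical; the values are the first seven ciphertext entries and their table entries, copied from the dumps above.

```python
def decode(cipher, *tables, unknown="?"):
    """Merge the given code->char tables and decode cipher, marking gaps."""
    merged = {}
    for table in tables:
        merged.update(table)
    return "".join(chr(merged[c]) if c in merged else unknown for c in cipher)


# Entries copied from the two dumped tables (first run, second run).
table_first = {217020514202485507: 123}
table_second = {
    144680349987898115: 68, 216739043537453827: 65, 217020518497387267: 83,
    217020518514164483: 67, 144680349971120899: 84, 144961824964608771: 70,
}
# First seven values of the dumped ciphertext.
cipher_head = [
    144680349987898115, 216739043537453827, 217020518497387267,
    217020518514164483, 144680349971120899, 144961824964608771,
    217020514202485507,
]

print(decode(cipher_head, table_first, table_second))  # DASCTF{
```

Any '?' in the result would point to a character that still needs to be added to the charset and dumped in another run.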

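In theory the same thing can be done with DBG_Hooks; for an example, see the article 倚天屠龙(一):妙用IDA Pro–利用IDAPython编写调试插件. In practice, though, I ran into baffling problems with it. The code is as follows: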

from idaapi import *
from idc_bc695 import LocByName

decode_map = {}
cipher = []
charset = b'1234567890qwertyuiopasdfghjklzxcvbnm{}_-####'
#charset = b'QWERTYUIOPASDFGHJKLZXCVBNM##################'

class DebugHook(DBG_Hooks):

    # Called whenever a breakpoint is hit.
    def dbg_bpt(self, tid, ea):
        if ea == LocByName('check'):
            decode_map[get_64bit(get_64bit(get_reg_val('rdi')))] = charset[len(cipher)]
            cipher.append(get_64bit(get_64bit(get_reg_val('rsi'))))
            print('Dump map: ' + str(['%x: %c' % (k, v) for k, v in decode_map.items()]))
            print('Dump ciphertext: ' + str([hex(i) for i in cipher]))
        continue_process()
        return 0

    # Called when the debuggee exits; print the final results.
    def dbg_process_exit(self, pid, tid, ea, code):
        print('decode_map = ' + str(decode_map))
        print('cipher = ' + str(cipher))

set_remote_debugger('192.168.253.148', '-1')
debughook = DebugHook()
debughook.hook()
start_process()
wait_for_next_event(WFNE_SUSP, -1)

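With a breakpoint set on check, the result is shown below. Most of the time the data read back is wrong, which is a real pitfall. I suspect it is a design flaw in IDAPython: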

image-20211124231930132

Waiting one second before reading does not help either:

image-20211124232151785

What about a different approach, such as dumping the data when the debuggee is suspended?

class DebugHook(DBG_Hooks):

    def dbg_suspend_process(self):
        ea = get_reg_val('rip')
        if ea == LocByName('check'):
            decode_map[get_64bit(get_64bit(get_reg_val('rdi')))] = charset[len(cipher)]
            cipher.append(get_64bit(get_64bit(get_reg_val('rsi'))))
            print('Dump map: ' + str(['%x: %c' % (k, v) for k, v in decode_map.items()]))
            print('Dump ciphertext: ' + str([hex(i) for i in cipher]))
        continue_process()
        return 0

    def dbg_process_exit(self, pid, tid, ea, code):
        print('decode_map = ' + str(decode_map))
        print('cipher = ' + str(cipher))

This works, but the debugger hangs for a while when the session starts, and I don't know why. The result:

image-20211124232554171

So unless there is no other option, it is better to avoid DBG_Hooks.