[Studying] Analyzing DarkComet

Introduction

DarkComet is one of the most well-known remote access trojans (RATs), widely used during the late 2000s and early 2010s. Although it was originally developed as a remote administration tool, threat actors later abused it to gain unauthorized access to victim systems.

In this article, I analyze the DarkComet payload through static analysis and reverse engineering. The goal is to understand how the malware establishes command-and-control (C2) communication and how its core functionalities are implemented internally.

This article is part of my series Inside Different Generations of RATs, where I study the evolution of RAT malware by reversing different families from various time periods. If you are interested in the full series, please refer to the article linked above.

DarkComet

DarkComet is a remote access trojan (RAT) developed by Jean-Pierre Lesueur (known as DarkCoderSc), an independent programmer and computer security coder from France. Although the RAT was originally developed in 2008, it began to proliferate in early 2012. DarkComet was later discontinued, partly because of its use during the Syrian civil war to monitor activists and partly because the author feared legal consequences for undisclosed reasons [1].

Different versions of DarkComet can still be found on the internet.

This article mainly discusses version 5.3.1.

Experimental Environment Setup

To safely conduct malware analysis, the environment should be isolated using virtual machines. Under no circumstances should malware be executed on a personal or production system.

| Tool | Description |
| --- | --- |
| ExeInfo PE | Executable analysis tool used to detect packers, compilers, and basic file properties |
| Detect It Easy (DIE) | File identification tool used to detect packers, protectors, and compiler signatures |
| UPX | Open-source executable packer used to compress and decompress binaries |
| Wireshark | Network protocol analyzer used to capture and inspect live or recorded traffic |
| Ghidra | Open-source software reverse engineering framework developed by the NSA |

| Device | IP Address | Description |
| --- | --- | --- |
| Windows XP x86 (VM) | 192.168.85.2 | Victim machine used for executing both the controller and payload |
| Windows 10 x64 (VM) | 192.168.85.3 | Analysis machine used for reverse engineering |

The two virtual machines were configured within an isolated internal network to prevent unintended external communication. If you want to know how to set up your experimental environment, please view:

Usage

Launch DarkComet.exe.

Version 1.3

Version 5.3.1

DarkComet provides two panels for configuring the payload (server stub):

Minimal version

Installer version

Here, the term “server” might be confusing from a modern perspective. In network terms, as in the previous articles of this series, the GUI application acts as the server and the payload acts as the client. In that era, however, “server” often referred to the payload, because the payload is the component that provides the remote control service.

To avoid confusion, this article uses the following terms:

  • Controller: attacker-side application
  • Payload: executable deployed on the victim machines

The minimal version builder lets the user create either a “small” payload or a “normal” payload. I discuss the difference in the reverse engineering section. In this article, I refer to them as small.exe and normal.exe, respectively.

The DarkComet controller can bind multiple listening ports, each of which serves as a TCP communication channel for payloads:

Once the target machine is compromised, the controller can control the remote machine using multiple functionalities:

Protocol Analysis

The communication channel is protected with the RC4 algorithm, which I demonstrate in the reverse engineering section:

Reverse Engineering

Open the payload executables with ExeInfo and DIE (Detect It Easy):

small.exe

small.exe

normal.exe

normal.exe

normal.exe appears to be unpacked. However, it has relatively high entropy:

Nevertheless, Ghidra still displays the imported modules correctly, so the high entropy does not appear to come from packing. It seems that the Delphi compiler itself leads to high-entropy output. This is plausible because compilers such as Visual C++ or Delphi often add stubs or perform optimizations when generating executables.
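To put a number on "relatively high entropy", the byte-level Shannon entropy of a file can be computed by hand. The following is a minimal sketch, not the exact method used by DIE or ExeInfo PE; the function name `shannon_entropy` is my own. It returns a value between 0 (a single repeated byte) and 8 (uniformly distributed bytes), and packed or encrypted data tends to sit close to 8.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Small in-memory demonstration; for a real sample, pass the file contents
# read in binary mode, e.g. open(path, "rb").read().
low = shannon_entropy(b"\x00" * 1024)        # one repeated byte
high = shannon_entropy(bytes(range(256)))    # every byte value exactly once
print(f"repeated byte: {low:.3f}, uniform bytes: {high:.3f}")
```

Computing the value per section rather than for the whole file usually gives a clearer picture, because a single compressed resource can raise the overall figure of an otherwise ordinary executable.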

Therefore, in this case, only small.exe is packed. We can unpack it manually or with upx.exe. If you are interested in manually unpacking UPX, please refer to this article.

Using upx.exe:

upx.exe -d small.exe -o unpacked_small.exe
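As a quick cross-check of what ExeInfo PE and DIE report, a sample can be scanned for the strings that stock UPX normally leaves behind: the section names UPX0 and UPX1 and the UPX! magic. The sketch below is only a heuristic under that assumption. The helper name `upx_markers_found` is hypothetical, and a modified packer can rename or strip these markers.

```python
UPX_MARKERS = (b"UPX0", b"UPX1", b"UPX!")


def upx_markers_found(data: bytes) -> list:
    """Return the well-known UPX marker strings that occur anywhere in data."""
    return [marker.decode("ascii") for marker in UPX_MARKERS if marker in data]


# Synthetic bytes stand in for a real sample here; for a file on disk, pass
# open(path, "rb").read() instead.
fake_packed = b"MZ" + b"\x00" * 64 + b"UPX0\x00\x00\x00\x00UPX1\x00\x00\x00\x00UPX!"
print(upx_markers_found(fake_packed))
print(upx_markers_found(b"MZ plain executable bytes"))
```

Running it on a sample before and after `upx.exe -d` should show the markers disappearing if the unpacking succeeded.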

Next, we can reverse engineer both normal.exe and unpacked_small.exe. After further investigation, I realized that the two executables are actually identical.

Same sections

Same defined data

Same message handler
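The "same sections" observation can also be checked outside Ghidra by hashing the raw data of every PE section in both files. The sketch below parses only the fields it needs from the PE headers with the standard library. The helper names `pe_sections` and `differing_sections` are hypothetical. A UPX round trip does not have to reproduce the original file byte for byte, so a reported difference is a starting point for inspection, not proof that the code differs.

```python
import hashlib
import struct


def pe_sections(data: bytes) -> dict:
    """Map each PE section name to (virtual_size, sha256 of its raw data)."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # COFF header fields, relative to the start of the PE signature.
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    table = pe_offset + 24 + opt_size  # section table follows the optional header
    sections = {}
    for index in range(num_sections):
        entry = table + index * 40  # each section header is 40 bytes long
        raw_name, vsize, _vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", data, entry
        )
        name = raw_name.rstrip(b"\x00").decode("ascii", "replace")
        digest = hashlib.sha256(data[raw_ptr:raw_ptr + raw_size]).hexdigest()
        sections[name] = (vsize, digest)
    return sections


def differing_sections(a: bytes, b: bytes) -> list:
    """Return the sorted names of sections that are missing or differ."""
    sa, sb = pe_sections(a), pe_sections(b)
    return sorted(n for n in sa.keys() | sb.keys() if sa.get(n) != sb.get(n))
```

Calling `differing_sections` on the contents of the two payload files returns an empty list when every section matches in name, virtual size, and raw bytes.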

Initially, I thought small.exe might be a stage-1 payload that downloads or loads a stage-2 payload from the C2 server. However, it turns out that it is simply called “small” because it is packed with UPX. Therefore, the reverse engineering analysis in this article focuses on normal.exe.

Notice that DarkComet uses different integer and string values to identify C2 commands:

The communication is protected using the following function:

void FUN_004616b4(int param_1,int param_2,int *param_3)

{
  longlong *plVar1;
  uint *puVar2;
  uint uVar3;
  uint uVar4;
  int iVar5;
  int iVar6;
  uint uVar7;
  undefined4 *in_FS_OFFSET;
  undefined4 uStack_43c;
  undefined1 *puStack_438;
  undefined1 *puStack_434;
  int local_424;
  uint local_420 [256];
  int local_20;
  byte local_19;
  int *local_18;
  longlong *local_14;
  longlong *local_10;
  int local_c;
  int local_8;

  local_424 = 0;
  local_10 = (longlong *)0x0;
  local_14 = (longlong *)0x0;
  puStack_434 = (undefined1 *)0x4616df;
  local_18 = param_3;
  local_c = param_2;
  local_8 = param_1;
  FUN_004059cc(param_1);
  puStack_434 = (undefined1 *)0x4616e7;
  FUN_004059cc(local_c);
  puStack_438 = &LAB_00461980;
  uStack_43c = *in_FS_OFFSET;
  *in_FS_OFFSET = &uStack_43c;
  iVar5 = 0;
  if (local_c != 0) {
    iVar5 = *(int *)(local_c + -4);
  }
  if (iVar5 != 0) {
    iVar5 = 0;
    if (local_8 != 0) {
      iVar5 = *(int *)(local_8 + -4);
    }
    if (iVar5 != 0) {
      puStack_434 = &stack0xfffffffc;
      FUN_0046124c(local_8,&local_424);
      FUN_004055c8(&local_8,local_424);
      iVar5 = 0;
      if (local_c != 0) {
        iVar5 = *(int *)(local_c + -4);
      }
      if (iVar5 < 0x101) {
        FUN_00406928((int *)&local_10,(int)&DAT_004611a0,1);
        uVar4 = 0;
        if (local_c != 0) {
          uVar4 = *(uint *)(local_c + -4);
        }
        plVar1 = (longlong *)thunk_FUN_004059f0(&local_c);
        FUN_004611c4(local_10,plVar1,uVar4);
      }
      else {
        FUN_00406928((int *)&local_10,(int)&DAT_004611a0,1);
        plVar1 = (longlong *)thunk_FUN_004059f0(&local_c);
        FUN_004611c4(local_10,plVar1,0x100);
      }
      uVar4 = 0;
      puVar2 = local_420;
      do {
        *puVar2 = uVar4;
        uVar4 = uVar4 + 1;
        puVar2 = puVar2 + 1;
      } while (uVar4 != 0x100);
      uVar4 = 0;
      iVar5 = 0;
      puVar2 = local_420;
      do {
        iVar6 = 0;
        if (local_c != 0) {
          iVar6 = *(int *)(local_c + -4);
        }
        uVar4 = (uint)*(byte *)((int)local_10 + iVar5 % iVar6) + uVar4 + *puVar2 & 0x800000ff;
        if ((int)uVar4 < 0) {
          uVar4 = (uVar4 - 1 | 0xffffff00) + 1;
        }
        local_19 = (byte)*puVar2;
        *puVar2 = local_420[uVar4];
        local_420[uVar4] = (uint)local_19;
        iVar5 = iVar5 + 1;
        puVar2 = puVar2 + 1;
      } while (iVar5 != 0x100);
      uVar3 = 0;
      uVar7 = 0;
      FUN_00406928((int *)&local_14,(int)&DAT_004611a0,1);
      uVar4 = 0;
      if (local_8 != 0) {
        uVar4 = *(uint *)(local_8 + -4);
      }
      plVar1 = (longlong *)thunk_FUN_004059f0(&local_8);
      FUN_004611c4(local_14,plVar1,uVar4);
      iVar5 = 0;
      if (local_8 != 0) {
        iVar5 = *(int *)(local_8 + -4);
      }
      if (-1 < iVar5 + -1) {
        local_20 = iVar5;
        iVar5 = 0;
        do {
          uVar3 = uVar3 + 1 & 0x800000ff;
          if ((int)uVar3 < 0) {
            uVar3 = (uVar3 - 1 | 0xffffff00) + 1;
          }
          uVar7 = uVar7 + local_420[uVar3] & 0x800000ff;
          if ((int)uVar7 < 0) {
            uVar7 = (uVar7 - 1 | 0xffffff00) + 1;
          }
          local_19 = (byte)local_420[uVar3];
          local_420[uVar3] = local_420[uVar7];
          local_420[uVar7] = (uint)local_19;
          uVar4 = local_420[uVar3] + local_420[uVar7] & 0x800000ff;
          if ((int)uVar4 < 0) {
            uVar4 = (uVar4 - 1 | 0xffffff00) + 1;
          }
          *(byte *)((int)local_14 + iVar5) =
               *(byte *)((int)local_14 + iVar5) ^ (byte)local_420[uVar4];
          iVar5 = iVar5 + 1;
          local_20 = local_20 + -1;
        } while (local_20 != 0);
      }
      uVar4 = 0;
      if (local_8 != 0) {
        uVar4 = *(uint *)(local_8 + -4);
      }
      FUN_00405c6c(local_18,uVar4);
      uVar4 = 0;
      if (local_8 != 0) {
        uVar4 = *(uint *)(local_8 + -4);
      }
      plVar1 = (longlong *)thunk_FUN_004059f0(local_18);
      FUN_004611c4(plVar1,local_14,uVar4);
    }
  }
  *in_FS_OFFSET = uStack_43c;
  puStack_434 = &LAB_00461987;
  puStack_438 = (undefined1 *)0x46195f;
  FUN_00405530(&local_424);
  puStack_438 = (undefined1 *)0x461972;
  FUN_004060f8((int *)&local_14,"\x11\nTByteArray\x01",2);
  puStack_438 = (undefined1 *)0x46197f;
  FUN_00405554(&local_c,2);
  return;
}

256-entry S-box (each entry holds one byte value but is stored as a uint):

uint local_420[256]

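Key Scheduling Algorithm (KSA). The first loop initializes the S-box to the identity permutation:

uVar4 = 0;
puVar2 = local_420;
do {
  *puVar2 = uVar4;
  uVar4 = uVar4 + 1;
  puVar2 = puVar2 + 1;
} while (uVar4 != 0x100);

The second loop in the listing then mixes the key into the S-box. For each index iVar5 from 0 to 255, it adds the key byte at offset iVar5 % key_length (read from local_10) and the current entry *puVar2 to uVar4, reduces the sum modulo 256, and swaps the two S-box entries.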

Pseudo-Random Generation Algorithm (PRGA), simplified from the third loop. Here i is uVar3, j is uVar7, S is local_420, and k is the data offset iVar5:

i = (i + 1) & 0xff;
j = (j + S[i]) & 0xff;

swap(S[i], S[j]);

t = (S[i] + S[j]) & 0xff;

data[k] ^= S[t];
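The decompiled listing masks with 0x800000ff and then patches negative results, whereas the simplified PRGA uses a plain & 0xff. That pattern is how the compiler expresses a signed remainder by 256. The sketch below emulates the 32-bit arithmetic to show that the two forms agree for the non-negative values RC4 produces. The helper names `to_int32` and `decompiled_mod256` are my own.

```python
def to_int32(value: int) -> int:
    """Interpret the low 32 bits of value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def decompiled_mod256(x: int) -> int:
    """Mirror the listing: x & 0x800000ff, then the fix-up for negative values."""
    u = x & 0x800000FF
    if to_int32(u) < 0:
        # Same expression as in the listing, truncated to 32 bits.
        u = (((u - 1) | 0xFFFFFF00) + 1) & 0xFFFFFFFF
    return to_int32(u)


# For the non-negative sums that occur in RC4, the fix-up branch is never
# taken and the result equals a plain "& 0xff".
assert all(decompiled_mod256(x) == (x & 0xFF) for x in range(1024))
```

Since the S-box indices and key bytes are never negative, treating every `& 0x800000ff` block as `& 0xff` does not change the behavior of the routine.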

Therefore, we can rewrite the decompiled code as the following pseudocode:

function RC4_EncryptDecrypt(data, key):

    key_len = length(key)
    data_len = length(data)

    S = array[256]

    //S-box
    for i from 0 to 255:
        S[i] = i

    //KSA
    j = 0
    for i from 0 to 255:
        j = (j + S[i] + key[i mod key_len]) mod 256
        swap(S[i], S[j])

    //PRGA
    i = 0
    j = 0
    for k from 0 to data_len-1:

        i = (i + 1) mod 256
        j = (j + S[i]) mod 256

        swap(S[i], S[j])

        t = (S[i] + S[j]) mod 256
        keystream = S[t]

        data[k] = data[k] XOR keystream

    return data
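The pseudocode translates almost line for line into a runnable script, which is handy for checking the reconstruction against published RC4 test vectors. This is a minimal sketch of standard RC4, not a reimplementation of the whole DarkComet routine. The preprocessing call FUN_0046124c on the data is not modelled, and no key is supplied because none was recovered in this analysis. The function name `rc4` is my own.

```python
def rc4(data: bytes, key: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the operation is its own inverse)."""
    if not key:
        # The decompiled routine silently does nothing for an empty key;
        # this sketch raises instead to avoid a division by zero.
        raise ValueError("key must not be empty")

    # S-box initialization followed by the key-scheduling algorithm (KSA).
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation algorithm (PRGA), XORed over the data.
    out = bytearray(len(data))
    i = j = 0
    for k, byte in enumerate(data):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out[k] = byte ^ s[(s[i] + s[j]) % 256]
    return bytes(out)


# Widely published RC4 test vector: key "Key", plaintext "Plaintext".
ciphertext = rc4(b"Plaintext", b"Key")
print(ciphertext.hex())             # bbf316e8d940af0ad3
print(rc4(ciphertext, b"Key"))      # b'Plaintext'
```

Because RC4 is a stream cipher, the same function both encrypts and decrypts, which matches the single routine found in the payload.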

Conclusion

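This article presented a reverse engineering analysis of the infamous RAT DarkComet. Its many features give attackers full control over compromised Windows systems such as the Windows XP victim used here. The analysis showed that small.exe is simply normal.exe packed with UPX, and that the payload protects its C2 traffic with RC4.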

From the 2000s to the early 2010s, many RATs adopted the RC4 algorithm to protect their communication channels.

References

1. https://en.wikipedia.org/wiki/DarkComet
