Content:
Overview
DRIDEX is a well-known and long-lasting type of banking Trojan that first emerged in late 2014. It’s notorious for targeting banks and financial institutions to steal sensitive information like banking details and user credentials. Usually, Dridex spreads through phishing emails, often using Word or Excel files that have harmful VBA macros.
The APT group called TA505 is notorious for its involvement in various cyber threats, including Dridex, as well as other well-known malware like TrickBot and the destructive Locky ransomware.
Dridex tries to avoid being detected and analyzed. It does this by hiding its actions when communicating with Windows, and it uses a technique called “string hashing” to make it more challenging for security researchers to understand what it’s doing. It also uses a method called “Vectored Exception Handling” to further complicate its analysis.
Tools used:
- DiE
- Virustotal
- PEbear
- PEiD
- capa
- IDA Pro
- hashdb
Static Analysis
IOCS
name | malware.bin |
arch | i386 |
md5 | df1b0f2d8e1c9ff27a9b0eb50d0967ef |
sha1 | fdd07c89c8ed656964dfa1a6cff271e170eda0c2 |
Sample | MalwareBazaar |
Detect-it-Easy
To obtain general information about the executable, load the file into DiE, where you can find the compilation date of the binary, which is September 19, 2020. It is a PE32 binary and includes only two imports: OutputDebugStringA and Sleep. Properties such as a low number of strings and high entropy (e.g., .rdata: 7.76, .text: 6.5) suggest that this binary may be packed.
Virustotal
Searching the MD5 hash on VirusTotal yielded a report where 53 out of 70 antivirus vendors flagged it as malicious. The hash was associated with the Dridex, Graftor, and Cridex family tags. You can review the detailed report here.
Yara detection
- Detects aPLib decompression code often used in malware
- Dridex v4 dropper C2 parsing function
- Heaven’s Gate: Switch from 32-bit to 64-mode
Sigma detection
- Detects suspicious calls of DLLs in rundll32.dll exports by ordinal
- Adversaries may abuse the Windows Task Scheduler to perform task scheduling for initial or recurring execution of malicious code
- Code integrity failures may indicate tampered executables.
Capa
capa detects capabilities in executable files. You run it against a PE, ELF, .NET module, or shellcode file and it tells you what it thinks the program can do.
The report reveals several key insights, including the use of string encryption, anti-disassembly techniques, and PE parsing.
PEiD
To determine the packer and encryption algorithm used for string encryption, you can employ PEiDs plugin KANAL. PEiD is an intuitive application that relies on its user-friendly interface to detect packers, cryptors and compilers found in PE executable files – its detection rate is higher than that of other similar tools since the app packs more than 600 different signatures in PE files.
It detects the use of CRC32, Base64, and aPLib algorithms.
Technical Analysis
To conduct further analysis, load the binary into IDA Pro. Inside IDA Pro, we observe that the binary passes certain hashes as arguments to a function and subsequently invokes the returned values as if they were functions. This suggests that the binary is likely resolving API functions dynamically.
The first argument in both calls to the function sub_6015C0 is identical, indicating that it likely represents the hash of a DLL module.
To identify the implementation of the algorithm being used, let’s analyze the function sub_6015C0.
API hashing
The function sub_607564 takes the dllHash as input and parses through the PEB (Process Environment Block) structure to retrieve the base name of the module associated with the hash.
It iterates through all the modules, computes the RC4 hash of their base names, performs an XOR operation with the value 0x38BA5C7B, and subsequently compares this result with the dllHash.
With knowledge of the algorithm name and XOR key, we can utilize “HashDB,” which is an API hash lookup plugin for IDA Pro.
To perform a HashDB lookup, simply right-click on a hash and select the “HashDB Lookup” option.
Appgate has created a script which which resolves API calls automatically and inserts a comment where the function is called, so we can easily search where in the code certain DLLs or APIs are being used.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
import idautils
import pefile
import zlib
import os
def get_path_dirs():
return [x for x in os.path.expandvars("%PATH%").split(";") if x]
def generate_hashes_table(key):
hashes = []
for path_dir in get_path_dirs():
for dll_name in os.listdir(path_dir):
dll_hash = get_dridex_hash(dll_name, dll=True, key=key)
dll_dict = {'name': dll_name, 'hash': dll_hash, 'imports': []}
hashes.append(dll_dict)
return hashes
def get_dridex_hash(s, dll, key):
return hex(((zlib.crc32(s.upper().encode() if dll else s.encode()) & 0xffffffff) ^ key))
def api_resolver(dll_hash, api_hash, hashes, key):
for item in hashes:
if dll_hash != item['hash']:
continue
if not item['imports']:
for path_dir in get_path_dirs():
if not os.path.exists(os.path.join(path_dir, item["name"])):
continue
pe = pefile.PE(os.path.join(path_dir, item["name"]))
exp_list = []
for exp in pe.DIRECTORY_ENTRY_EXPORT.symbols:
try:
exp_list.append(exp.name.decode())
except:
continue
for import_name in exp_list:
_hash = get_dridex_hash(import_name, dll=False, key=key)
item['imports'].append({'name': import_name, 'hash': _hash})
for api in item['imports']:
if api_hash == api['hash']:
return "{}!{}".format(item['name'], api['name'])
return "{}!unknown".format(item['name'])
def resolve_apis(resolver_offset, hashes_table, xor_key):
for xref in idautils.XrefsTo(resolver_offset):
off = idc.prev_head(xref.frm)
# This loop will search for the hash that is being passed by the function
# It's limited to 100 searches to avoid possible infinite loops.
dll, api = None, None
for i in range(1, 101):
if i == 100:
print("[-] Cannot find hash for address: %s" % hex(xref.frm))
break
# If it's not a "push" operation, keep looking
if idc.print_insn_mnem(off) != "push":
off = idc.prev_head(off)
continue
# If a "push" is identified, checks if it's the DLL or the API hash
if not dll:
dll = hex(idc.get_operand_value(off, 0))
off = idc.prev_head(off)
continue
# If the DLL was already found, then the second push is the API hash
api_name = api_resolver(dll, hex(idc.get_operand_value(off, 0)), hashes_table, xor_key)
comment = "Unknown" if not api_name else api_name
idc.set_cmt(xref.frm, comment, True)
break
# ---------------------- Main ---------------------- #
def main(xor_key, resolver_function):
hashes = generate_hashes_table(xor_key)
resolve_apis(resolver_function, hashes, xor_key)
main(0x38BA5C7B, 0x6015C0)
I have adjusted the function address and XOR key to match our specific sample.
Vectored Exception Handler(INT 3)
Now that the APIs are resolved, we have noticed that in many instances, after obtaining the address of the resolved API, the code has been modified to include “int3” instructions, designed to trap a debugger. The presence of “int3” instructions can complicate dynamic analysis since debuggers often use “int3” for setting breakpoints.
Instead of using a “call” instruction to invoke the resolved APIs, the program registers sub_687D40 as an exception handler through the RtlAddVectoredExceptionHandler API. This implies that when the program encounters an “int3” instruction, the kernel invokes sub_687D40 to manage the interrupt and transfer control to the API stored in the EAX register.
Once the handler returns the EXCEPTION_CONTINUE_EXECUTION code, the user thread’s context is reinstated, and the malware resumes execution from the “retn” instruction.
To improve the flow graph and eliminate these debugger traps, I discovered this script from LEXFO’s blog.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
from idaapi import get_segm_by_name
from idc import patch_byte, add_bpt, set_bpt_cond, BPT_EXEC, load_and_run_plugin
import ida_search
load_and_run_plugin("idapython", 3)
def find_all_occurences(start, end, bin_str, flags):
occurences = list()
ea = start
while ea <= end:
occurence = ida_search.find_binary(ea, end, bin_str, 0, flags)
ea = occurence + 1
occurences.append(occurence)
return occurences[0:-1]
def patch_binary():
segment = get_segm_by_name('.text')
occurences = find_all_occurences(segment.start_ea, segment.end_ea, "CC C3", ida_search.SEARCH_DOWN)
datas = [0xFF, 0xD0]
for occurence in occurences:
for i, byte in enumerate(datas):
patch_byte(occurence + i, byte)
return True
Although the original script is intended for IDAPython 6.x, I have made modifications to make it compatible with IDAPython 7.x, referencing the changes from this article : Porting from IDAPython 6.x-7.3, to 7.4.
String Decryption
The binary contains encrypted strings in the .rdata section, and Capa suggests that these strings are encrypted using RC4. By examining their cross-references (xref), we can understand how they are decoded.
In the image provided, it is evident that the first 40 bytes serve as the key to decrypt the subsequent bytes within the function sub_61E5D0.
We can confirm this by attempting to decrypt the hexadecimal bytes using a tool like CyberChef.
IOC Extraction
Appgate has released a Tool to Extract the Botnet ID, C2 IP Adresses, Decrypted string, Decrypt Network Communication from the dridex sample.
Stealer(CyberDefender Challenge)
Challenge Author: Nidal Fikri
Very Difficult(4.7)
Category : Malware Analysis
Tags : Malware Analysis, Reverse Engineering, Threat Hunting
Instructions:
- Uncompress the challenge (pass: cyberdefenders.org)
Challenge Description
Scenario:
Your enterprise network is experiencing a malware infection, and your SOC L1 colleague escalated the case for you to investigate. As an experienced L2/L3 SOC analyst, analyze the malware sample, figure out what it does and extract C2 server and other important IOCs.
P.S.: Make sure to analyze files in an isolated/virtualized environment as some artifacts may be malicious.
Question and Answers
Q1. The provided sample is fully unpacked. How many sections does the sample contain?
Ans: In Detect-it-Easy, it is apparent that the executable contains four sections, namely:
- .text
- .rdata
- .data
- .reloc
Q2. How many imported windows APIs are being used by the sample?
Ans: Within the sample, two Windows APIs are imported, namely:
- OutputDebugStringA
- Sleep
Q3. The sample is resolving the needed win APIs at run-time using API hashing. Looking at the DllEntryPoint, which function is responsible for resolving the wanted APIs?
Ans. sub_6015C0 accepts two hashes as its arguments: the first one being the DLL module hash, and the other being the API hash.
Q4. Looking inside the function described in question 3, which function is responsible for locating & retrieving the targetted module (DLL)?
Ans. The function sub_607564 parses the PE file and structures such as _PEB, _PEB_LDR_DATA, and _LDR_DATA_TABLE_ENTRY in order to retrieve the DLL base name.
Q5. What type of hashing is being used for the API hashing technique?
Ans. CRC32 is utilized for API hashing, as confirmed by Capa’s reporting and validated through testing with HashDB.
Q6. What is the address of the function which performs the hashing?
Ans. The function sub_61D620 performs hashing, XORs the output with the XOR key, and then compares the result to the hash in the sample.
Q7. What key is being used for XORing the hashed names?
Ans. 0x38BA5C7B
Q8. What information is being accessed at the address 0X60769A?
Ans. BaseAddress
Q9. Looking inside the function described in question 3, which function is responsible for locating & retrieving the targetted API from the module export table?
Ans. The function sub_6067C8 iterates through all the API imports in the PE header. It subsequently retrieves the address of LdrGetProcedureAddress and obtains the desired API.
Q10. Diving inside the function described in question 8, what is being accessed at offset 0X3C within the first passed parameter?
Ans. e_lfanew
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
//0x40 bytes (sizeof)
struct _IMAGE_DOS_HEADER
{
USHORT e_magic; //0x0
USHORT e_cblp; //0x2
USHORT e_cp; //0x4
USHORT e_crlc; //0x6
USHORT e_cparhdr; //0x8
USHORT e_minalloc; //0xa
USHORT e_maxalloc; //0xc
USHORT e_ss; //0xe
USHORT e_sp; //0x10
USHORT e_csum; //0x12
USHORT e_ip; //0x14
USHORT e_cs; //0x16
USHORT e_lfarlc; //0x18
USHORT e_ovno; //0x1a
USHORT e_res[4]; //0x1c
USHORT e_oemid; //0x24
USHORT e_oeminfo; //0x26
USHORT e_res2[10]; //0x28
LONG e_lfanew; //0x3c
};
Q11. Which windows API is being resolved at the address 0X5F9E47 ?
Ans. By using HashDb and the script, it becomes evident that it resolves “CreateThread.”
Q12. Looking inside sub_607980, which DLL is being resolved?
Ans. NTDLL.dll
Q13. Also Looking inside sub_607980, which API is being resolved?
Ans. RtlAddVectoredExceptionHandler
Q14. What is the appropriate data type of the only argument at function sub_607D40?
Ans. When referring to the MSDN documentation, it becomes apparent that the argument type should be “_EXCEPTION_POINTERS.”
1
2
3
4
5
6
PVECTORED_EXCEPTION_HANDLER PvectoredExceptionHandler;
LONG PvectoredExceptionHandler(
[in] _EXCEPTION_POINTERS *ExceptionInfo
)
{...}
Q15. After reverse-engineering sub_607980 and knowing its purpose, Which assembly instruction is being abused for further anti-analysis complication? (especially when running the sample)
Ans. int3, for more details refer here
Q16. After reverse-engineering sub_607980 and knowing its purpose, Which assembly instruction is being used for altering the process execution flow? (Also adds anti-disassembly complication)
Ans. ret
Q17. There are important encrypted strings in the .data section. Which encryption algorithm is being used for decryption?
Ans. RC4, for more details, refer here
Q18. What is the address of the function that is responsible for strings decryption?
Ans. 0X61E5D0
Q19. What are the two first decrypted words (strings) at 0X629BE8?
Ans. Program Manager
Q20. What is the key used for decrypting the strings in question 16?
Ans. D5 BB C5 3E 12 94 70 92 5A 59 E6 EA 6A A9 E6 C4 8B C4 8D 50 93 D5 1C D4 33 88 41 26 BA E4 A8 15 60 E7 B1 91 48 93 3C DB
Q21. What is the length (in bytes) of the used key in question 16?
Ans. 40
Q22. What is the address of the function that is responsible for connecting to the C&C?
Ans. The function sub_623370 utilizes APIs that are employed to establish a connection with the C&C (Command and Control) server and retrieve commands.
Q23. What is the first C&C IP address in the embedded configuration?
Ans. The Appgate config extractor identified the first IP as 192.46.210.220.
Q23. What is the port associated with the first C&C IP address?
Ans. 443
Q24. How many C&C IP addresses are in the sample configuration?
Ans. 4
Q25. What is the address of the function which downloads additional modules to extend the malware functionality?
Ans. The function sub_623820 employs HttpSendRequestW and InternetReadFile to interact with the network and retrieve files.
With that, the challenge is successfully completed.