These labs are from Chapter 14 (Malware-Focused Network Signatures) of the book “Practical Malware Analysis” by Michael Sikorski and Andrew Honig, worked through for practice.

Tools used :

Detect-it-Easy
PEiD
IDA Pro
Procmon
Wireshark
Inetsim
ApateDNS
x64dbg

Lab14-01

We start by opening the executable in Detect-it-Easy (DiE) for static analysis. It imports functions from KERNEL32.dll, ADVAPI32.dll, and urlmon.dll. The interesting imports are URLDownloadToCacheFileA, GetCurrentHwProfileA, and GetUserNameA. It also contains some interesting strings.

The PEiD plugin KANAL detects Base64 encoding in the executable.

After setting up Inetsim, ApateDNS and Procmon, we run the malware and observe that it makes a normal-looking GET request and downloads a PNG file to the path “C:\Users\vboxuser\AppData\Local\Microsoft\Windows\INetCache\IE\JYK9PB68\a.png”.

Let’s open the executable in IDA Pro. In the main function, it first calls GetCurrentHwProfileA and retrieves the GUID from it. It then stores the last 12 characters of the GUID in the format “%c%c:%c%c:%c%c:%c%c:%c%c:%c%c”. Next it calls GetUserNameA to retrieve the name of the user associated with the current process, and concatenates the GUID and username with a “-” between them. Finally, it passes the resulting string to the function sub_4010BB.

Looking into the function sub_4010BB, it calls another function, sub_401000, which contains the reference to the string detected by KANAL.

So sub_4010BB appears to be the Base64 encoding function.

Back in the main function, it calls another function, sub_4011A3, in a while loop. This function builds a URL in the format:

sprintf(buffer, "http[:]//www[.]practicalmalwareanalysis[.]com/%s/%c.png", base64Encoded, base64Encoded[strlen(base64Encoded)-1]);
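
The beacon construction described above can be sketched in Python. This is only an illustration under assumptions: the function names, the sample GUID, and the sample username are made up, the GUID tail is taken after stripping the surrounding braces, and the “a” padding is modelled by simply swapping “=” in standard Base64 output.

```python
import base64

# Defanged on purpose, matching the format string shown in the walkthrough.
BASE_URL = "http[:]//www[.]practicalmalwareanalysis[.]com"


def custom_b64(data: bytes) -> str:
    # Standard Base64, with '=' padding swapped for 'a' as observed in the sample.
    return base64.b64encode(data).decode("ascii").replace("=", "a")


def build_beacon_url(guid: str, username: str) -> str:
    # Assumption: the last 12 characters of the GUID (braces stripped) are used.
    tail = guid.strip("{}")[-12:]
    # Format them as HH:HH:HH:HH:HH:HH.
    pairs = ":".join(tail[i:i + 2] for i in range(0, 12, 2))
    encoded = custom_b64(f"{pairs}-{username}".encode("ascii"))
    # The file name is the last character of the encoded string.
    return f"{BASE_URL}/{encoded}/{encoded[-1]}.png"


if __name__ == "__main__":
    # Hypothetical GUID and username, for illustration only.
    print(build_beacon_url("{12345678-1234-1234-1234-80006d61726b}", "user"))
```

Because the encoded string usually ends in padding, the requested file name is usually a.png, which matches the file observed during dynamic analysis.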

It then creates a process to run the downloaded file by calling CreateProcessA.

Moving to advanced dynamic analysis, I loaded the file in x32dbg and set breakpoints at the calls to sub_4010BB and URLDownloadToCacheFileA.

In the following image we can see the format of GUID and username before passing it to Base64-encoding function.

We can see the value returned by the function sub_4010BB.

To confirm, I decoded it using Python:

There seems to be an extra “a” at the end of the encoded text, but after going through the function, I observed that “a” is used for padding instead of “=”.
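
A minimal sketch of such a decoding step is shown below. The function name is hypothetical, and the approach is a heuristic: it treats up to two trailing “a” characters as padding, which can misread a string whose real data happens to end in “a”.

```python
import base64


def decode_beacon(encoded: str) -> bytes:
    # Heuristic: Base64 padding is at most two characters long, so strip up to
    # two trailing 'a' characters and restore standard '=' padding instead.
    stripped = encoded
    for _ in range(2):
        if stripped.endswith("a"):
            stripped = stripped[:-1]
    padded = stripped + "=" * (-len(stripped) % 4)
    return base64.b64decode(padded)


if __name__ == "__main__":
    # Hypothetical sample built the same way the malware builds its beacon.
    plain = b"80:00:6d:61:72:6b-user"
    sample = base64.b64encode(plain).decode("ascii").replace("=", "a")
    print(sample, "->", decode_beacon(sample))
```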

We can see the URL used, by setting a breakpoint at URLDownloadToCacheFileA.

Finally, to check the path where the file is downloaded, we can set a breakpoint at the call instruction to CreateProcessA.

Question and Answers

Question 1: Which networking libraries does the malware use, and what are their advantages?
Answer : urlmon.dll is the only networking library imported. It is used for Object Linking and Embedding (OLE) based API calls. URLDownloadToCacheFileA is imported from this library.
The advantage of this library is that the HTTP packets it sends look like typical packets from the victim’s browser.

Question 2: What source elements are used to construct the networking beacon, and what conditions would cause the beacon to change?
Answer : GUID and username are the source elements. These elements will be different for different machines.

Question 3: Why might the information embedded in the networking beacon be of interest to the attacker?
Answer : This information can be used to keep track of the infected machines.

Question 4: Does the malware use standard Base64 encoding? If not, how is the encoding unusual?
Answer : No, it uses “a” as padding character instead of “=”.

Question 5: What is the overall purpose of this malware?
Answer : It downloads a file and then executes it.

Question 6: What elements of the malware’s communication may be effectively detected using a network signature?
Answer : The domain name, and the colons and dash found after decoding the Base64 string. The file name is the last character of the Base64-encoded text.

Question 7: What mistakes might analysts make in trying to develop a signature for this malware?
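Answer : An analyst might assume that the GET request and the name of the requested file are the same on every system, or that the GUID and username are the same for every target. Neither holds, since the encoded path and the file name depend on the host’s GUID and username.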

Question 8: What set of signatures would detect this malware(and future variants)?
Answer : alert tcp $HOME_NET any -> $EXTERNAL_NET $HTTP_PORTS (msg:"Lab14-01 colons and dash"; urilen:>32; content:"GET|20|/"; depth:5; pcre:"/GET\x20\/[A-Z0-9a-z+\/]{3}6[A-Z0-9a-z+\/]{3}6[A-Z0-9a-z+\/]{3}6[A-Z0-9a-z+\/]{3}6[A-Z0-9a-z+\/]{3}6[A-Z0-9a-z+\/]{3}t([A-Z0-9a-z+\/]{4}){1,}\//"; sid:37284429; rev:1;)
The logic behind the rule is simple. The string before Base64 encoding has the format “HH:HH:HH:HH:HH:HH-username”. Base64 turns every 3 input characters (24 bits) into 4 output characters of 6 bits each, and the 4th output character comes entirely from the low 6 bits of the 3rd input character. For “:” (ASCII 0x3A) those bits are 111010, i.e. 58, and the character at index 58 (counting from 0) in the string “ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/” is “6”. Similarly, the dash (ASCII 0x2D) gives 101101, i.e. 45, which maps to “t”.
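
The arithmetic can be checked with a short Python sketch. The helper names and the sample request line are assumptions made for illustration, and the regular expression is a condensed equivalent of the pcre in the rule, not a replacement for it.

```python
import base64
import re

B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

# Condensed form of the rule's pcre: five groups ending in '6', one ending in 't',
# then one or more further 4-character groups and a slash.
BEACON_RE = re.compile(
    r"GET /(?:[A-Za-z0-9+/]{3}6){5}[A-Za-z0-9+/]{3}t(?:[A-Za-z0-9+/]{4})+/"
)


def fourth_char(third_byte: str) -> str:
    # The 4th Base64 character of a 3-byte group is the low 6 bits of the 3rd byte.
    return B64[ord(third_byte) & 0x3F]


def matches_beacon(request_line: str) -> bool:
    return BEACON_RE.search(request_line) is not None


if __name__ == "__main__":
    print(fourth_char(":"))  # 6
    print(fourth_char("-"))  # t
    # Hypothetical beacon built from a made-up GUID tail and username.
    enc = base64.b64encode(b"80:00:6d:61:72:6b-user").decode("ascii").replace("=", "a")
    print(matches_beacon(f"GET /{enc}/a.png HTTP/1.1"))
```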

Lab14-02

Analyse the malware found in file Lab14-02.exe. This malware has been configured to beacon to a hard-coded loopback address in order to prevent it from harming your system, but imagine that it is a hard-coded external address.

Question and Answers

Question 1: What are the advantages or disadvantages of coding malware to use direct IP addresses?

Answer : Coding malware to use direct IP addresses has both advantages and disadvantages. Let’s explore them:

Advantages:

By using direct IP addresses, malware can bypass the DNS resolution process.
Malware that uses direct IP addresses does not rely on specific domain names or URLs. This can be advantageous for attackers who want to maintain control over their infrastructure without worrying about domain takedowns or changes in DNS configurations.

Disadvantages:

Direct IP addressing can make malware communication more traceable.
Direct IP addressing limits the flexibility and scalability of malware operations. If the IP address of a command-and-control (C&C) server changes, the malware needs to be updated with the new IP address, requiring manual intervention by the attacker.

Question 2: Which networking libraries does this malware use? What are the advantages or disadvantages of using these libraries?

Answer : In Detect-it-Easy, we can see that it is using WININET.dll for networking.

Advantages :

It makes it easier for the malware to interact with network resources, such as making HTTP requests, managing cookies, or handling proxy configurations.
Cache and cookies are handled automatically by the operating system. A side effect is that if the cache is not cleared before downloading files, the malware may unintentionally retrieve a cached file instead of the latest code.

Disadvantages :

It requires the User-Agent and any other desired optional headers to be hard-coded in the malware.

Question 3: What is the source of the URL that the malware uses for beaconing? What advantages does this source offer?

Answer : We can see the URL being used in Fakenet during the dynamic analysis.

The string resource section in the executable contains the URL. It is being loaded using LoadStringA function.

The advantage of loading the string from the resource section is that the embedded resource can be modified, allowing the Command and Control (C2) infrastructure to be changed, or allowing the malware to function as a backdoor to multiple C2 servers, without modifying and recompiling the binary.

Question 4: Which aspect of the HTTP protocol does the malware leverage to achieve its objectives?

Answer : There seems to be some random-looking data in the User-Agent field of the first GET request. We can see it in the Fakenet screenshot above.

Upon further investigation, the malware creates two threads: one sends data encoded with a customised Base64 encoding in the User-Agent, and the other uses a hard-coded User-Agent, i.e. “Internet Surf”.

Question 5: What kind of information is communicated in the malware’s initial beacon?

Answer : After decoding the initial beacon, we can clearly see that it is the standard output of cmd.exe being run.

From this we know that the initial beacon is an encoded command-shell prompt.

Question 6: What are some disadvantages in the design of this malware’s communication channels?

Answer : The User-Agent used in second GET request is hard-coded which makes it easy to form a signature to detect it.

alert tcp $HOME_NET any -> any $HTTP_PORTS (msg:"PMA Lab14-02 Suspicious User-Agent - Internet Surf"; flow:to_server; content:"User-Agent|3A 20|Internet|20|Surf"; http_header; classtype:trojan-activity; sid:6348756; rev:1;)

The encoded User-Agent always starts with “!<”, which can also be used in signature.

alert tcp $HOME_NET any -> any $HTTP_PORTS (msg:"PMA Lab14-02 Suspicious User-Agent - Starts with !<"; flow:to_server; content:"User-Agent|3A 20|!<"; http_header; classtype:trojan-activity; sid:1337351; rev:1;)

Question 7: Is the malware’s encoding scheme standard?

Answer : No. The malware uses Base64 with a customised index string (alphabet).
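
Decoding Base64 with a non-standard index string can be sketched in Python by translating each character back to the standard alphabet first. The CUSTOM string below is a made-up placeholder (the standard alphabet rotated by 13 positions), not the index string from the sample, so it would need to be replaced with the 64-character string recovered from the binary. The function names are also hypothetical.

```python
import base64

STD = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
# Placeholder only: substitute the index string extracted from the malware.
CUSTOM = STD[13:] + STD[:13]


def custom_b64encode(data: bytes, alphabet: str = CUSTOM) -> str:
    # Encode normally, then map each standard character to the custom alphabet.
    std_text = base64.b64encode(data).decode("ascii")
    return std_text.translate(str.maketrans(STD, alphabet))


def custom_b64decode(text: str, alphabet: str = CUSTOM) -> bytes:
    # Map custom characters back to the standard alphabet, then decode.
    std_text = text.translate(str.maketrans(alphabet, STD))
    std_text += "=" * (-len(std_text) % 4)  # restore padding if it was dropped
    return base64.b64decode(std_text)


if __name__ == "__main__":
    sample = custom_b64encode(b"C:\\Windows\\system32>")
    print(sample, "->", custom_b64decode(sample))
```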

Question 8: How is communication terminated?

Answer : Communication is terminated with the “exit” string, and while exiting, the malware deletes itself.

Question 9: What is the purpose of this malware, and what role might it play in the attacker’s arsenal?

Answer : After conducting our examination, it has been determined that the objective of this malicious software is to create a reverse TCP command shell. This shell allows for data transmission using a user-agent in an attempt to evade network analysis methods. It is evident that the malware is designed to eliminate itself, indicating that it is likely utilized solely for the purpose of initial system access before additional malware or persistence mechanisms are established. Consequently, it can be inferred that this component serves as a temporary and disposable tool to achieve the intended goal.

Lab14-03

This Lab builds on Lab14-01. Imagine that this malware is an attempt by the attacker to improve his technique. Analyze the malware found in file Lab14-03.exe.

Question and Answers

Question 1: What hard-coded elements are used in the initial beacon? What elements, if any, would make a good signature?

Answer: In the strings, we can see some hard-coded headers used in the beacon. The elements are:

User-Agent
UA-CPU
Accept
Accept-Language
Accept-Encoding

The malware author mistakenly included the header name in the hard-coded User-Agent value (“User-Agent:..”). This results in the User-Agent field being set to “User-Agent:User-Agent:..”, which may be used as a signature.

Question 2: What elements of the initial beacon may not be conducive to a long-lasting signature?

Answer: The domain and URL found in the malware are hard-coded, but sub_401457 first checks whether the configuration file is already present. If it is, the function reads its content; only if it is not does it create the file and write the hard-coded URL to it.

The URL should not be used for a long-lasting signature, as the malware has built-in functionality to update this URL.

Question 3: How does the malware obtain commands? What example from the chapter used a similar methodology? What are the advantages of this technique?

Answer: After reading the file from the internet, it compares the content with “<no” and calls a function sub_401000.

In sub_401000, it tries to check whether the content starts with “noscript” tag.

Using this technique, the malware can hide its commands within otherwise legitimate content. Another advantage is that the author can change the HTML on the server side to update the command or URL used on the client side.

Question 4: When the malware receives input, what checks are performed on the input to determine whether it is a valid command? How does the attacker hide the list of commands the malware is searching for?

Answer: It checks for an initial noscript tag followed by a URL. The URL should end with “96`”. Between the URL and 96, there must be two sections composed of the command and argument (similar to /command/1213141516). The first letter of the command must be one of the allowed commands (d, n, r, s).

The content should be of the form : noscript.URL/command/argument/96`

The attacker hides the commands by using the first character to switch between predefined commands.

Question 5: What type of encoding is used for command arguments? How is it different from Base64, and what advantages or disadvantages does it offer?

Answer: The malware simply takes the index of each character in the string “/abcdefghijklmnopqrstuvwxyz0123456789:.” to encode it.

For example the character “a” will be encoded to 01.

The disadvantage of this encoding is that it can be reversed easily.
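
How easily the encoding reverses can be seen in a short Python sketch. It assumes each character is represented by a two-digit decimal index into the string, which is consistent with “a” encoding to 01. The function names are hypothetical.

```python
ALPHABET = "/abcdefghijklmnopqrstuvwxyz0123456789:."


def encode_arg(text: str) -> str:
    # Each character becomes its two-digit decimal index in ALPHABET.
    return "".join(f"{ALPHABET.index(ch):02d}" for ch in text)


def decode_arg(digits: str) -> str:
    # Read the digits two at a time and look each index up in ALPHABET.
    return "".join(
        ALPHABET[int(digits[i:i + 2])] for i in range(0, len(digits), 2)
    )


if __name__ == "__main__":
    print(encode_arg("a"))               # 01
    print(decode_arg("08202016370000"))  # http://
```

Since the alphabet only covers lowercase letters, digits, and the characters “/”, “:” and “.”, the scheme is suited to URLs and little else.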

Question 6: What commands are available to this malware?

Answer: The available commands are:

d: Download and run the executable
r: modifies the configuration file
s: Sleep
n: Quit

Question 7: What is the purpose of this malware?

Answer: This malware is a downloader and launcher. It uses web-based command and control and can easily adjust as malicious domains are identified.

Question 8: This chapter introduced the idea of targeting different areas of code with independent signatures (where possible) in order to add resiliency to network indicators. What are some distinct areas of code or configuration data that can be targeted by network signatures?

Answer: Areas that can be included in the signature:

statically defined domain and path
hard-coded HTTP headers such as UA-CPU, User-Agent
HTTP response contains “noscript.URL[96`]”

Question 9: What set of signatures should be used for this malware?

Answer: One signature can be made from the User-Agent string with the additional User-Agent header:

alert tcp $HOME_NET any -> any $HTTP_PORTS (msg:"PMA Lab14-03 Duplicate User-Agent and known hardcoded headers"; flow:to_server; content:"Accept|3A 20|*/*|0D 0A|Accept-Language|3A 20|en-US|0D 0A|UA-CPU|3A 20|x86|0D 0A|Accept-Encoding|3A 20|gzip|2C 20|deflate|0D 0A|User-Agent|3A 20|User-Agent|3A 20|Mozilla/4.0|20|(compatible|3B 20|MSIE|20|7.0|3B 20|Windows|20|NT|20|5.1|3B 20|.NET|20|CLR|20|3.0.4506.2152|3B 20|.NET|20|CLR|20|3.5.30729)"; http_header; classtype:trojan-activity; sid:1337352; rev:1;)
