Summary
Over the past year, a major change in the tactics of ransomware adversaries has been to exfiltrate data from the victim's environment. The data then serves as the material for an extortion threat on top of the ransom for encrypted data. This additional tactic became a trend followed by most major ransomware families early last year, 2020 1. To support this tactic, some ransomware operators have added a specific type of malware to their intrusion set to perform this exfiltration 2. The five samples analyzed here perform this type of data exfiltration. They upload a set of files from the victim's computer to command and control servers hosted on IP addresses 51.81.153[.]212, 51.161.82[.]135, and 51.77.110[.]6. All of these IP addresses are owned by OVH SAS, a French hosting company. The malware follows the exfiltration with a single-line PowerShell command that stops the malware's running process and then deletes the malware file that was executed. The malware has a type of anti-analysis behavior called "Relocate API Code" according to the Malware Behavior Catalog's 3 categorization 4. The malware reads a copy of system DLLs into memory and resolves imports from there. This causes a problem for debuggers such as x64dbg 5.
Interestingly, these files share code with an earlier malware sample with completely different capabilities. This earlier file has been observed alongside TrickBot, CobaltStrike, and ransomware 6. This earlier malware additionally uses the same anti-analysis technique, but does not exfiltrate data. It has the capability to download a CobaltStrike beacon and execute it 7. In addition to this overlap in code and behavior, the command and control (C2) infrastructure domains are registered via the same registrar. Also, the C2 IP addresses are owned by the same hosting company, OVH.
Anti-analysis Trick
Relocate API Code
The first behavior one observes when running these samples in a sandbox or in a debugger is that an exception is raised at the point where the imports are resolved. Debugging past this point is not possible without circumventing an anti-analysis trick. This circumvention starts by examining the first encoded string the samples decode. Encoded strings are found in two general forms in these samples. The first form is stored alongside the rest of the file's strings in the .data section. Some of these can be seen in Figure 1 with one example highlighted.
Figure 1: Encoded Strings
Notice the "mOrxxxx" characters that trail each of the encoded strings. These trailing characters will be examined below.
The other location where encoded strings are found is split up character-by-character as stack strings. Each byte is moved one-by-one to a location on the stack before the decoding operation occurs. The first string of this type is in the function that resolves imports from kernel32.dll. This encoded string is shown in Figure 2.
Figure 2: First Encoded String
This first string decodes to "C:\Windows\System32\kernel32.dll". This path is then used to read the DLL from the filesystem into memory. Imports are then resolved against this copy of the DLL rather than the system DLL. The function calls to copy the DLL are shown in Figure 3.
Figure 3: Copy DLL Function
In the debugger's environment, this read fails with an exception which then prevents the imports from being properly resolved 8. A detailed explanation of what's happening here can be found on the OALabs YouTube channel 9. To circumvent this trick, one can create a copy of the DLLs on the filesystem and change their names along with the decoded path strings. This way the DLLs can be read correctly and the imports properly resolved. This change can be done on the fly in the debugger after the strings are decoded. An example of changing this on the fly using the filename kernel33.dll is shown in Figure 4.
Figure 4: Change DLL Name
Alternatively, the single encoded byte needed for this change can be modified in the sample with a hex editor to make this change permanent. This also makes restarting the analysis in the debugger less annoying. This byte difference is highlighted using HexFiend's 10 comparison function and can be seen in Figure 5.
Figure 5: Original Compared to Patch
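The comparison in Figure 5 can be approximated in a few lines of Python. This is only a sketch of the idea: the function name `byte_differences` and the placeholder buffers are illustrative, and the actual offset and byte values of the patch are not reproduced here.

```python
from typing import List, Tuple


def byte_differences(original: bytes, patched: bytes) -> List[Tuple[int, int, int]]:
    """Return (offset, original_byte, patched_byte) for every differing position."""
    if len(original) != len(patched):
        raise ValueError("an in-place patch should not change the file size")
    return [
        (offset, a, b)
        for offset, (a, b) in enumerate(zip(original, patched))
        if a != b
    ]


# Placeholder buffers standing in for the original and patched samples.
original = bytes([0x10, 0x20, 0x30, 0x40])
patched = bytes([0x10, 0x20, 0x31, 0x40])
for offset, before, after in byte_differences(original, patched):
    print(f"offset 0x{offset:x}: {before:#04x} -> {after:#04x}")
```

Running this against the real original and patched files should report exactly one differing offset if only the single encoded byte was changed.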
After the DLL path has been decoded, the various imports from that DLL are then decoded from similar stack strings. One example with LoadLibraryW is shown in Figure 6. Again, please note the trailing "mOrxxxx" string immediately after the encoded bytes of the "LoadLibraryW" string.
Figure 6: Encoded LoadLibraryW with Trailing Additional String
Once the import strings are decoded, a custom implementation of GetProcAddress is used on the copy of kernel32.dll to resolve the imports. The results of this process can be seen in Figure 7.
Figure 7: After Imports Resolved
Examining the exports for these samples shows a DLL name "Input.exe" as well as one exported function "bsearch". These exports are shown in Figure 8.
Figure 8: Exports
This bsearch function is a version of the binary search algorithm 11. It appears once in the samples as part of the custom GetProcAddress implementation. This function call is highlighted in Figure 9.
Figure 9: Function Call to bsearch in Custom GetProcAddress
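A binary search fits this job because the export name table of a PE file is sorted, and each name maps through an ordinal table to an entry in the address table. The following is a minimal sketch of that lookup, not the samples' actual code. The function name `find_export` and the addresses in the example are hypothetical.

```python
from typing import List, Optional


def find_export(names: List[bytes], ordinals: List[int],
                addresses: List[int], wanted: bytes) -> Optional[int]:
    """Binary-search the sorted export names and return the matching address."""
    low, high = 0, len(names) - 1
    while low <= high:
        mid = (low + high) // 2
        if names[mid] == wanted:
            # The name index selects an ordinal, which selects the address.
            return addresses[ordinals[mid]]
        if names[mid] < wanted:
            low = mid + 1
        else:
            high = mid - 1
    return None


# Hypothetical export data: names are sorted, addresses are made up.
names = [b"CloseHandle", b"CreateFileW", b"LoadLibraryW", b"ReadFile"]
ordinals = [2, 0, 3, 1]
addresses = [0x1000, 0x2000, 0x3000, 0x4000]
print(hex(find_export(names, ordinals, addresses, b"LoadLibraryW")))
```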
After the imports from kernel32.dll have been resolved, the next DLL is ntdll.dll. The process shown above is repeated for this DLL. The alternative name used here was npdll.dll. The path after this change is shown in Figure 10.
Figure 10: DLL Name Change
Some of the resolved imports give an idea of what's to come and what the capabilities are for these samples. Examples of this are the imports of "HttpAddRequestHeadersW" and "EnumProcesses" as shown in Figures 11 and 12.
Figure 11: Import HttpAddRequestHeadersW
Figure 12: Import EnumProcesses
The rest of the DLLs that are loaded after ntdll.dll are not loaded using this same anti-analysis trick. These other DLLs are user32.dll, wininet.dll, and psapi.dll. The steps to decode and resolve imports from each DLL are divided into separate functions. Each of these functions is shown in Figure 13.
Figure 13: Resolve Import Functions
In the code of these samples, another interesting library function calling pattern is to call NtAllocateVirtualMemory using syscall to allocate memory. This pattern of function call is shown in Figure 14.
Figure 14: Syscall Used on NtAllocateVirtualMemory
Collect Environment Information
The first set of capabilities in these samples is to collect information about the victim's environment. The first bit of information collected is the name of the computer. The call to GetComputerNameExA is shown in Figure 15.
Figure 15: Collect Computer Name
The next bit of environmental information collected is the physical and virtual memory status. This is done via a call to GlobalMemoryStatusEx 12 which is shown in Figure 16.
Figure 16: Measure Physical and Virtual Memory Status
The memory status is not sent back to the command and control (C2) infrastructure. It is probably used in the file processing algorithm because the primary purpose of these samples is to exfiltrate files from the victim's computer. These files must be copied from the filesystem to memory for processing before being sent to the C2.
The next data point collected is the username that ran the malware file. This data point does not appear to be sent back to the C2 according to the fields in the network traffic. The API call to GetUserNameA is shown in Figure 17.
Figure 17: Collect Username
Next the samples check the free disk space via GetDiskFreeSpaceExA as shown in Figure 18.
Figure 18: Check Disk Free Space
Next the OS version and product information is collected via calls to GetVersionExA and GetProductInfo. These calls are shown in Figure 19.
Figure 19: Gather OS Version and Product Information
Malware Configuration
After the environment information has been collected, the malware configuration strings are then decoded. As opposed to the stack strings used in the import resolution process, the configuration strings are normal strings in the .data section as shown above. All of these strings have a trailing "mOrxxxx" string. Interestingly, this additional data does not cause problems for the decoding process. The reason is that the decoding function works on a null-terminated string. Examining the encoded strings closely, this null termination can be seen before the additional characters. An example of this null termination in a configuration string is shown in Figure 20.
Figure 20: Null Termination
An example of this null termination in a stack string is shown in Figure 21.
Figure 21: Null Termination in Stack String
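The effect of the null terminator can be illustrated with a short sketch. It only shows how a routine that stops at the first null byte never reaches the trailing characters. It does not reproduce the decoding algorithm itself, and the encoded bytes below are placeholders rather than values from the samples.

```python
def encoded_portion(buffer: bytes) -> bytes:
    """Return the bytes before the first null byte, or the whole buffer if none."""
    end = buffer.find(b"\x00")
    return buffer if end == -1 else buffer[:end]


# Placeholder encoded bytes, a null terminator, then the trailing marker.
raw = b"\x13\x37\x42\x99" + b"\x00" + b"mOrxxxx"
print(encoded_portion(raw))
```

Only the four placeholder bytes are passed on for decoding, and the "mOrxxxx" trailer is ignored.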
The configuration strings for one particular sample 13 are shown in Figure 22.
Figure 22: Decoded Configuration Strings
The second configuration string from the top in Figure 22 above is used in a field called "key" in the C2 traffic along with the exfiltrated data. Each of the samples analyzed here has a different key string, except for two samples that share one. The following table shows each of these strings along with the first five characters of the SHA256 hash of the file the string was collected from.
SHA256 Prefix | Key |
dcc4a | 46rnyegq235etnerhgf43trrthgbfRYdfnhg |
68af2 | 8953n7b8ewurdfb3njnyuridrwdb |
934c5 | huve3fn298vmfu293jKVFDSfvjjfe893 |
a7cf0 | huve3fn298vmfu293jKVFDSfvjjfe893 |
8cfd5 | 3f9n8uv0n43809vn3d092v09290 |
Exfiltration
The data exfiltration process starts by enumerating the logical drives that are available on the victim's computer. This is determined using a call to GetLogicalDriveStringsW and is shown in Figure 23.
Figure 23: Determine Available Logical Drives
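GetLogicalDriveStringsW fills a buffer with a sequence of null-terminated wide strings, with one extra null at the end. A minimal sketch of splitting such a buffer into drive roots follows. The function name `parse_drive_strings` and the sample buffer are illustrative and are not taken from the malware.

```python
from typing import List


def parse_drive_strings(buffer: bytes) -> List[str]:
    """Split a UTF-16LE, double-null-terminated buffer into drive root strings."""
    text = buffer.decode("utf-16-le")
    return [drive for drive in text.split("\x00") if drive]


# Hand-built example buffer describing two drives.
sample = "C:\\\x00D:\\\x00\x00".encode("utf-16-le")
print(parse_drive_strings(sample))
```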
For each of these logical drives, a function is called that walks the file system searching for targeted files and exfiltrating them. Interestingly, this walk function is recursive. This recursion is shown in Figure 24.
Figure 24: Walk Function Recursion
The first step taken in the walk function is to add an asterisk to the path that is the input of the function. This is numbered "1" below. Then memory is allocated twice in a row. This is numbered "2" below. The string with the trailing asterisk is then written to one of the two allocated memory locations. This is numbered "3" below. Finally, this string is used to call "FindFirstFileW" with the second allocated memory location as the output location that receives the structure resulting from the API call. These are all shown in Figure 25.
Figure 25: Finding Files
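The shape of this recursive walk can be sketched in portable Python. This is an analogy rather than a translation: `os.scandir` stands in for the FindFirstFileW enumeration with the trailing asterisk, and the function name `walk` and the temporary directory layout are made up for the demonstration.

```python
import os
import tempfile
from typing import Iterator


def walk(path: str) -> Iterator[str]:
    """Yield every file path under `path`, recursing into subdirectories."""
    try:
        entries = list(os.scandir(path))
    except OSError:
        return  # unreadable directory: skip it
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            yield from walk(entry.path)  # the function calls itself
        elif entry.is_file(follow_symlinks=False):
            yield entry.path


# Build a small throwaway tree and walk it.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a", "b"))
    for relative in ("top.doc", os.path.join("a", "b", "deep.xlsx")):
        with open(os.path.join(root, relative), "w") as handle:
            handle.write("x")
    found = sorted(os.path.relpath(p, root) for p in walk(root))
print(found)
```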
As the malware walks the file system, any files that contain one of the following strings in the filename are exfiltrated.
.doc | .xls | .pdf |
.docx | .xlsx |
Interestingly, the algorithm used to find these files is probably not what the adversary intended. Rather than checking for a file extension as a suffix, it matches the above strings anywhere in the filename (as an infix). Because of this, any file with .xlsx already matches .xls, for example, so the longer entries are redundant. These target file extensions are shown in Figure 26.
Figure 26: Target File Extensions
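The difference between the two matching strategies is easy to see in a short sketch. The function names are illustrative, the comparison is assumed to be case-sensitive, and ".pdf" is included because the post later tests the malware with a filename containing that string.

```python
TARGETS = (".doc", ".xls", ".docx", ".xlsx", ".pdf")


def matches_infix(filename: str) -> bool:
    """Behavior described above: a target string anywhere in the name matches."""
    return any(target in filename for target in TARGETS)


def matches_suffix(filename: str) -> bool:
    """What was probably intended: the name must end with a target extension."""
    return filename.endswith(TARGETS)


for name in ("budget.xlsx", "report.pdf.backup.txt", "notes.txt"):
    print(name, matches_infix(name), matches_suffix(name))
```

The second filename is selected by the infix check even though it is a text file, which is the kind of unintended match the analysis describes.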
Next the file size is determined using a call to GetFileSize. This information is included in the exfiltrated data. The API call is shown in Figure 27.
Figure 27: Determine File Size
As the file system walk proceeds, the path strings are emitted as debug strings via a call to OutputDebugStringW. This is followed immediately by bytes for a Windows carriage return line feed. These two are shown in Figure 28.
Figure 28: Emit Debug Strings
This malware can exfiltrate large files. It does this by dividing the file into chunks according to a hard coded "frame size". This hard coded size is 32535 bytes and is highlighted in Figure 29.
Figure 29: Frame Size
The above also shows all the other fields that are used in the C2 traffic. The frame number starts at "-0" then "1", "2" etc. The "filecrc" field is the cyclic redundancy check (CRC) which is used as a checksum for error detection 14. The last two fields are the filename and the computer name.
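The framing can be sketched as follows. The frame size comes from the analysis above. Everything else is an assumption for illustration: the function name `frames`, numbering frames from zero as plain integers, and the use of the standard CRC-32 from `zlib`, since the exact CRC variant used by the samples is not stated.

```python
import zlib
from typing import Iterator, Tuple

FRAME_SIZE = 32535  # hard-coded frame size reported in the analysis


def frames(data: bytes, frame_size: int = FRAME_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (frame_number, chunk) pairs covering `data` in order."""
    for number, start in enumerate(range(0, len(data), frame_size)):
        yield number, data[start:start + frame_size]


payload = bytes(70000)  # stand-in for a file's contents
checksum = zlib.crc32(payload)  # assumption: standard CRC-32 over the whole file
chunks = list(frames(payload))
print(len(chunks), [len(chunk) for _, chunk in chunks], f"{checksum:08x}")
```

A receiver can reassemble the file by concatenating the chunks in frame order and comparing the checksum.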
Using a constructed test file named "testfile.doc" the TLS encrypted C2 traffic is intercepted and analyzed using Burp 15 and Inetsim 16. The body parameters and the request headers from this test file are shown in Figure 30.
Figure 30: Test Document Shown Being Exfiltrated in C2 Network Traffic
Interestingly, the configuration strings include "GET" in addition to "POST". However, this capability does not appear to be used in these samples. A check is performed which determines which of the two request methods is used. This check is shown at the top of Figure 31. The POST and GET options are shown in the center, and the call to HttpOpenRequestW is shown at the bottom.
Figure 31: Request Method Options
The algorithm used to determine which files are exfiltrated is flawed in that it will match files that are not Word, Excel, or PDF documents. It will exfiltrate any file that contains the target strings anywhere in the filename. A test of this is shown in Figure 32 which uses a fake file with the extension ".txt" and ".pdf" in the middle of the filename.
Figure 32: Unexpected Exfiltration
After all the filesystem walking has completed, the next function called sends a dummy "end of transmission" file out to the C2. The filename of the file is ".lock" and the content is "locked". This file only exists in memory and network traffic. It is not written to the filesystem. The strings for this dummy file are shown in Figure 33.
Figure 33: Dummy File Strings
This dummy file as seen in the network traffic is shown in Figure 34.
Figure 34: End of Transmission Dummy File
After all of the above is finished, the last action taken by the samples is to run a PowerShell command from a string. This command gets the process ID of the parent process of the command itself. It uses that process ID to kill the malware's process. It then deletes the malware file from the filesystem. This is to clean up after the data exfiltration is complete. The command is executed by a call to CreateProcessA as shown in Figure 35.
Figure 35: PowerShell Cleanup Command
The full text of the PowerShell command is shown in Figure 36.
PowerShell Command
Evolving Variants
The earliest observed variant of this malware family was compiled on April 24th, 2021 according to the PE header TimeDateStamp field. It was then first seen in the Titanium Platform on April 25th, 2021. This earliest variant did not include the PowerShell cleanup that was used in the later two variants. The comparison of the older sample 18 and the newer sample 19 analyzed using Relyze 20 is shown in Figure 36.
Figure 36: Addition of PowerShell Cleanup Capability
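The compile time mentioned above lives in the TimeDateStamp field of the COFF header, which sits eight bytes after the "PE\0\0" signature. A minimal sketch of reading it follows. The function name `pe_timestamp` is illustrative, and the buffer is a synthetic header built for the demonstration rather than one of the samples. Keep in mind that this field can be forged by whoever builds the file.

```python
import struct
from datetime import datetime, timezone


def pe_timestamp(image: bytes) -> datetime:
    """Return the COFF TimeDateStamp of a PE image as a UTC datetime."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ header")
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # After the signature: Machine (2 bytes), NumberOfSections (2), TimeDateStamp (4).
    (stamp,) = struct.unpack_from("<I", image, e_lfanew + 8)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


# Synthetic header carrying the timestamp listed for Sample 1 in the IOCs.
stamp = int(datetime(2021, 4, 24, 17, 34, 20, tzinfo=timezone.utc).timestamp())
image = bytearray(0x100)
image[0:2] = b"MZ"
struct.pack_into("<I", image, 0x3C, 0x80)
image[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<I", image, 0x88, stamp)
print(pe_timestamp(bytes(image)).isoformat())
```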
Another difference between this oldest sample and most of the newer ones is the inclusion of a program database (PDB) path. This path is shown in Figure 37.
Figure 37: PDB Path String
The two newer samples have very few differences. Code-wise, there is one single function that is ~55% different between the two. The rest of the code is identical or nearly identical. This small difference is shown in Figure 38.
Figure 38: Difference in Newer Samples
Another interesting difference between the two newer samples is that the newest sample does not call back to a hostname for C2 communication. It is configured to call back to the C2 URL on a bare IP address.
Related Malware
One function in particular that is found in each of the three samples analyzed above is the string decoding function. Encoding and decoding functions are a good area of code to examine closely and to hunt for in malware repositories. This type of function can be stable across samples and tends to be reused by an adversary in multiple campaigns even if the capabilities of the malware are radically different. Building a YARA rule based on this function reveals an older malware sample 21 which was blogged about by researchers at Walmart 22. The results of a retrohunt in the Titanium Platform using this YARA rule are shown in Figure 39. The result shown in orange is a false positive. This file is a copy of one of the other actual malware files but appears to have been modified by a researcher.
Figure 39: Retrohunt Results
This sample is definitely related to the three samples analyzed here. It uses the same anti-analysis trick as well as the same string obfuscation including the "mOrxxxx" trailing characters. Both the standard string form as well as the stack strings are found in this sample. However, this older sample is a CobaltStrike beacon loader rather than a data exfiltrator. The YARA rule for detecting this code overlap is provided at the end of the blog. As opposed to earlier YARA rules that I have written, I read the advice from Marc Ochsenmeier on Twitter about adding comments with the meaning of opcodes in YARA rules meant for sharing 23. This rule and future rules will include the assembly as well as the bytecode strings.
In addition to this older malware sample that is definitely related, a hunt for other malware that contains variations of the string "mOrxxxx" reveals a multitude of potentially related malware samples. These are nearly uniformly detected as malicious. A future blog will address these additional files. The results of a retrohunt for the full string "mOrxxxx" are shown in Figure 40.
Figure 40: Full String mOrxxxx
The results of retrohunts for this string with two Xs and one X are shown in Figures 41 and 42.
Figure 41: Retrohunt for String with Two Xs
Figure 42: Retrohunt for String with One X
Analysis of the results of these retrohunts will be the topic of a future blog post.
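For triaging a local file, the three string variants can be counted with a single regular expression. This is a rough sketch and not the query used for the retrohunts. The function name `trailer_counts` is illustrative, and the pattern only covers the ASCII form of the marker with one to four trailing "x" characters.

```python
import re
from typing import Dict

TRAILER = re.compile(rb"mOrx{1,4}")


def trailer_counts(data: bytes) -> Dict[int, int]:
    """Count marker occurrences, keyed by the number of trailing 'x' characters."""
    counts: Dict[int, int] = {}
    for match in TRAILER.finditer(data):
        xs = len(match.group()) - len(b"mOr")
        counts[xs] = counts.get(xs, 0) + 1
    return counts


# Placeholder buffer containing one marker of each length.
blob = b"AA" + b"mOrxxxx\x00" + b"BB" + b"mOrxx\x00" + b"CC" + b"mOrx\x00"
print(trailer_counts(blob))
```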
IOCs
Sample 1
File
Filename | Input.exe |
Filename | v2c.exe |
MD5 | 1010bec081572dc3bd16e26a1e37d815 |
SHA1 | bfc0219efb60fb270cee0b7b102afc0d4b0a121a |
SHA256 | dcc4ac1302ac5693875c4a4b193242cbb441b77cd918569c43fe318bcf64fe3d |
Import Hash | 85ce0801668e488873e72eeb306503da |
SSDEEP | 768:ycscKP14scGOqEMQcanOPBbEbeFpUGC/YDR5Ws:yV3cGOqEMQcanOJFpUGC/Y9 |
Timestamp | 2021-04-24T17:34:20Z |
PDB | E:\work\proj\file_sender\x64\file_sender.pdb |
Magic | PE32+ executable (GUI) x86-64, for MS Windows |
File Type | Win64 EXE |
File Size | 35328 |
First Seen | 2021-04-25 |
User Agent
Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) discord/0.0.309 Chrome/83.0.4103.122 Electron/9.3.5 Safari/537.36 |
URL
hxxps[://]files[.]pablotech[.]info/uploadFile.php |
Hostname
files[.]pablotech[.]info |
Domain
Name | Registrar | IANA ID |
pablotech[.]info | Hosting Concepts B.V. d/b/a Registrar.eu | 1647 |
Sample 2
File
Filename | Input.exe |
Filename | sender.exe |
MD5 | e3300ec2f31f5730970c5bb066d2f0ed |
SHA1 | c768882e102a5dd3d1c17d306698c5cfc3d9d8d5 |
SHA256 | 68af250429833d0b15d44052637caec2afbe18169fee084ee0ef4330661cce9c |
Import Hash | 6473877da5764bbd5a9b16892ef13b69 |
SSDEEP | 768:zp2FXczP/cpWyB/3RtUcGOqEMIcqfz/YghUx:zp2FsTcB/UcGOqEMIcqfz/Yg4 |
Timestamp | 2021-04-28T03:00:08Z |
Magic | PE32+ executable (GUI) x86-64, for MS Windows |
File Type | Win64 EXE |
File Size | 36352 |
First Seen | 2021-05-03 |
User Agent
Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) discord/0.0.309 Chrome/83.0.4103.122 Electron/9.3.5 Safari/537.36 |
URL
hxxps[://]figures[.]pablotech[.]info/uploadFile.php |
Hostname
figures[.]pablotech[.]info |
Domain
Name | Registrar | IANA ID |
pablotech[.]info | Hosting Concepts B.V. d/b/a Registrar.eu | 1647 |
Sample 3
File
Filename | Input.exe |
Filename | v2c.exe |
MD5 | 4af8b45c9b0f73d47a499d92064b6c2e |
SHA1 | 424f3c281f46e4cf2350c78cfa89a87873e0b994 |
SHA256 | 934c557e52bd47fa312ea4098e05781145d0b81c9dc543ef42b266813bdb05d4 |
Import Hash | 6473877da5764bbd5a9b16892ef13b69 |
SSDEEP | 768:9vutX7Qp6CPRp8Yh/ZYWcGOqEMUcgk9/Y7hCeUpU:K7QpJp8YFrcGOqEMUcg0/Y7lk |
Timestamp | 2021-05-17T20:33:40Z |
Magic | PE32+ executable (GUI) x86-64, for MS Windows |
File Type | Win64 EXE |
File Size | 36352 |
First Seen | 2021-05-18 |
User Agent
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0 |
URL
hxxps[://]51.161.82[.]135/uploadFile.php |
IP Address
IP | Owner | ASN |
51.161.82[.]135 | OVH SAS | 16276 |
Sample 4
File
Filename | Input.exe |
Filename | v2.exe |
MD5 | 7c801e3c256d2e9e1f4462fe84e44c68 |
SHA1 | 4cd9cecd1d093f290e6f8f0ad6d5e76dbedbf3d9 |
SHA256 | a7cf0f72bb6f1e0a61fbf39e3a3a36db6540250caeef35b47fb51a8959f40984 |
Import Hash | 9f86f12427bca134faaa21bcc0d849d3 |
SSDEEP | 768:vkcGOqEMccVhPO4TrASVqipOHMd6m/YFh50:ccGOqEMccV7rAZipOHA/YFT |
Timestamp | 2021-05-24T23:06:16Z |
PDB | E:\work\proj\file_sender\x64\file_sender.pdb |
Magic | PE32+ executable (GUI) x86-64, for MS Windows |
File Type | Win64 EXE |
File Size | 37376 |
First Seen | 2021-06-01 |
User Agent
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0 |
URL
hxxps[://]51.161.82[.]135/uploadFile.php |
IP Address
IP | Owner | ASN |
51.161.82[.]135 | OVH SAS | 16276 |
Sample 5
File
Filename | file_sender.exe |
Filename | sender.exe |
MD5 | 12a7595d94e142847a04f11659ed183d |
SHA1 | f80a2f102ca0297d053c75e0676049dc87cb3f35 |
SHA256 | 8cfd554a936bd156c4ea29dfd54640d8f870b1ae7738c95ee258408eef0ab9e6 |
Import Hash | 9f86f12427bca134faaa21bcc0d849d3 |
SSDEEP | 768:sPcGOqEMccNNPayYDcfHyIY2QUy2h08wx:2cGOqEMccNEDuhY2FS84 |
Timestamp | 2021-06-15T10:57:36Z |
PDB | E:\work\proj\file_sender\x64\file_sender.pdb |
Magic | PE32+ executable (GUI) x86-64, for MS Windows |
File Type | Win64 EXE |
File Size | 36352 |
First Seen | 2021-06-16 |
User Agent
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0 |
URL
hxxp[://]51.77.110[.]6/uploadFile.php |
IP Address
IP | Owner | ASN |
51.77.110[.]6 | OVH SAS | 16276 |
YARA Rule
private rule WindowsPE
{
    condition:
        uint16(0) == 0x5A4D and uint32(uint32(0x3C)) == 0x00004550
}

rule DataExfiltrator_Decoder
{
    meta:
        author = "Malware Utkonos"
        date = "2021-05-07"
        description = "String decoding function found in data exfiltration malware."
        exemplar = "dcc4ac1302ac5693875c4a4b193242cbb441b77cd918569c43fe318bcf64fe3d"
    strings:
        $a = { 4489442418    // mov dword [rsp+0x18 {arg_18}], r8d
               88542410      // mov byte [rsp+0x10 {arg_10}], dl
               48894c2408    // mov qword [rsp+0x8 {arg_8}], rcx
               4883ec28      // sub rsp, 0x28
               8b442440      // mov eax, dword [rsp+0x40 {arg_18}]
               33d2          // xor edx, edx {0x0}
               b904000000    // mov ecx, 0x4
               48f7f1        // div rcx
             }
    condition:
        WindowsPE and $a
}
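For readers without YARA at hand, the logic of the rule can be approximated in Python. This is a sketch of the same two checks (a valid MZ/PE header and the presence of the byte sequence), not a replacement for the rule. The function names are illustrative, and the test buffer is synthetic.

```python
import struct

# The byte sequence from $a in the rule above.
DECODER_PROLOGUE = bytes.fromhex(
    "4489442418"  # mov dword [rsp+0x18], r8d
    "88542410"    # mov byte [rsp+0x10], dl
    "48894c2408"  # mov qword [rsp+0x8], rcx
    "4883ec28"    # sub rsp, 0x28
    "8b442440"    # mov eax, dword [rsp+0x40]
    "33d2"        # xor edx, edx
    "b904000000"  # mov ecx, 0x4
    "48f7f1"      # div rcx
)


def is_windows_pe(data: bytes) -> bool:
    """Mirror the WindowsPE condition: MZ at offset 0, PE signature at e_lfanew."""
    if len(data) < 0x40 or struct.unpack_from("<H", data, 0)[0] != 0x5A4D:
        return False
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if e_lfanew + 4 > len(data):
        return False
    return struct.unpack_from("<I", data, e_lfanew)[0] == 0x00004550


def matches_rule(data: bytes) -> bool:
    """Mirror the rule condition: WindowsPE and $a."""
    return is_windows_pe(data) and DECODER_PROLOGUE in data


# Synthetic buffer: minimal MZ/PE markers followed by the byte sequence.
fake = bytearray(0x100)
fake[0:2] = b"MZ"
struct.pack_into("<I", fake, 0x3C, 0x80)
fake[0x80:0x84] = b"PE\x00\x00"
fake += DECODER_PROLOGUE
print(matches_rule(bytes(fake)))
```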
References:
1 https://research.checkpoint.com/2020/ransomware-evolved-double-extortion/
2 https://docs.oasis-open.org/cti/stix/v2.1/cs01/stix-v2.1-cs01.html#_5ol9xlbbnrdn
3 https://github.com/MBCProject/mbc-markdown/tree/master/yfaq
4 https://github.com/MBCProject/mbc-markdown/blob/master/anti-behavioral-analysis/evade-debugger.md
5 https://x64dbg.com/
6 https://medium.com/walmartglobaltech/trickbot-crews-new-cobaltstrike-loader-32c72b78e81c
7 Ibid.
8 Thanks to Sandor Nemes for assistance in understanding this behavior.
9 https://www.youtube.com/watch?v=242Tn0IL2jE&t=1053s
10 https://hexfiend.com/
11 https://en.cppreference.com/w/c/algorithm/bsearch
12 https://docs.microsoft.com/en-us/windows/win32/api/sysinfoapi/nf-sysinfoapi-globalmemorystatusex
13 934c557e52bd47fa312ea4098e05781145d0b81c9dc543ef42b266813bdb05d4
14 https://en.wikipedia.org/wiki/Cyclic_redundancy_check
15 https://portswigger.net/burp
16 https://www.inetsim.org/
17 dcc4ac1302ac5693875c4a4b193242cbb441b77cd918569c43fe318bcf64fe3d
18 Ibid.
19 68af250429833d0b15d44052637caec2afbe18169fee084ee0ef4330661cce9c
20 https://www.relyze.com/
21 0234f80c6fd3768f9619d6fcd50d775ec686719fcc665007bfd1606bbe787744
22 https://medium.com/walmartglobaltech/trickbot-crews-new-cobaltstrike-loader-32c72b78e81c
23 https://twitter.com/ochsenmeier/status/1379546812437118980