Executive Summary
Last week, ReversingLabs’ machine learning-based threat hunting system detected malicious code in a legitimate-looking package, aiocpa, that was engineered to compromise cryptocurrency wallets. RL reported the malicious package to the Python Package Index (PyPI) so that it could be taken down, and the PyPI team then published its own blog post about the package. Shortly after, researchers at Phylum reported on RL’s discovery of the package. RL’s research team dug deeper because of the apparent uniqueness of the malicious campaign.
Unlike the majority of the attacks targeting open source repositories like npm and PyPI, the malicious actors behind aiocpa were not impersonating or typosquatting legitimate looking packages. Instead, they published their own crypto client tool in order to steadily attract a user base that would later be compromised through a malicious version update. Using RL Spectra Assure to carry out differential analysis of two package versions, our team was able to determine how exactly these attackers carried out this distinctive campaign.
Here’s how the RL research team discovered aiocpa, how attackers took a stealthier route to carry out the campaign, and how machine learning-based threat hunting can make or break a development team’s ability to spot a software supply chain attack before it does damage.
Discussion
On November 21, RL’s research team used the company’s Spectra platform and its machine learning-based threat hunting to detect a PyPI package, aiocpa, that contained files with behaviors similar to those of previously seen malicious packages published on the PyPI package repository.
Figure 1: ML Threat Hunting policy violation
Diving into the package content and looking at the utils/sync.py file that triggered the detection immediately revealed the presence of a suspicious, obfuscated code pattern.
Figure 2: Obfuscated code in sync.py file
During the regular analysis of threats from open source packages, RL researchers often encounter malware using this type of code obfuscation, which includes several recursive layers of Base64 encoding and zlib compression. Deobfuscating it gives insight into the package’s malicious functionality. In this case, deobfuscation yielded a simple wrapper around the CryptoPay initialization function designed to exfiltrate all arguments to a remote Telegram bot. These arguments included sensitive information like tokens related to crypto trading, which can then be used to steal crypto assets.
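The layered encoding described above can be unwound statically. The following is a minimal sketch, not the tooling RL used: it assumes every layer stores its payload as a Base64 literal holding zlib-compressed data, and the names peel_layers and B64_RUN, along with the 24-character floor, are hypothetical choices made for illustration. Nothing is executed; the code only decodes bytes.

```python
import base64
import binascii
import re
import zlib

# Assumption: a run of 24 or more Base64 characters is worth trying to decode.
# The floor is arbitrary and only keeps ordinary identifiers out of the search.
B64_RUN = re.compile(rb"[A-Za-z0-9+/]{24,}={0,2}")


def peel_layers(data: bytes, max_layers: int = 64) -> tuple[bytes, int]:
    """Undo repeated Base64 + zlib wrapping without executing any code.

    Returns the innermost bytes that could be recovered and the number of
    layers that were removed.
    """
    layers = 0
    while layers < max_layers:
        candidates = B64_RUN.findall(data)
        if not candidates:
            break
        # Treat the longest Base64-looking run as the embedded payload.
        blob = max(candidates, key=len)
        try:
            data = zlib.decompress(base64.b64decode(blob, validate=True))
        except (binascii.Error, zlib.error):
            # Not a Base64 + zlib layer, so what we have is the final content.
            break
        layers += 1
    return data, layers
```

The loop stops as soon as the longest candidate string fails to decode or decompress, so plain source code passes through unchanged. Real samples may add other transformations (string reversal, different codecs) that this sketch does not handle.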
Figure 3: Deobfuscated infostealer code
RL reported the malicious package to the PyPI security team immediately upon detection, as noted in the PyPI team’s coverage of the incident. PyPI quarantined and then removed the package.
It is also worth noting that the malicious actor was observed trying to take over an existing PyPI project named pay, probably to gain access to an established user base or because the attacker estimated that such a package name would attract more victims. This is visible from a request on the PyPI support GitHub pages submitted on September 3rd, just two days after the first version of the aiocpa package was published.
Figure 4: Package takeover request
Package name takeover is another serious supply chain infection vector that PyPI users should be aware of. Picture this: You include project pay as a transitive dependency, it gets taken over by a threat actor, and a new malicious version is published to PyPI. Would you be able to detect such a change? Would it get automatically propagated to your software solution? Heed this advice from the PyPI security team: “The possibility for project name transfers is a reminder to pin your dependencies and versions - and level up by using hashes to prevent unwanted updates to existing package/version constraints.” In addition, it’s essential to perform a security assessment of third-party packages, code, tools, and extensions used in your software development, since all these areas serve as potential vectors attackers can exploit to compromise your organization.
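To make the hash-pinning advice concrete, here is a minimal sketch that computes the SHA-256 digest of a downloaded artifact and formats a pinned requirements line. pip accepts `--hash=sha256:...` entries in requirements files and enforces them in its hash-checking mode (`--require-hashes`). The function names and the file name in the usage note are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def pinned_requirement(name: str, version: str, artifact: Path) -> str:
    """Format a requirements.txt line that pins both version and hash."""
    return f"{name}=={version} --hash=sha256:{sha256_of(artifact)}"
```

For example, `pinned_requirement("somepkg", "1.2.3", Path("somepkg-1.2.3.tar.gz"))` yields a line that can be placed in a requirements file. With every dependency pinned this way, a newly published version or a replaced artifact no longer installs silently.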
Typical security assessment
A first glance at the package’s project page didn’t reveal any reason for suspicion. It looked like a well-maintained crypto pay API client package, with several versions published since September 2024. It also had a well-organized documentation page.
Figure 5: PyPI home page of the package
The profile information of the maintainer didn’t look like one of those throwaway accounts with generic blue-and-white avatars. Also, the same maintainer published another package that had been maintained since March 2024. Looking at the linked GitHub page lent us even more confidence that this was a legitimate account with lots of code contributions dating all the way back to January 2024.
Figure 6: Malicious GitHub account details
A regular developer following the standard security assessment protocol, checking whether the project is well maintained and whether the maintainer account looks legitimate, would have no reason to judge this package as suspicious. The project also had a respectable download count of more than 10k, another signal that would suggest a trustworthy package.
A close look at the source code hosted in the referenced GitHub repository didn’t raise red flags either, as the malicious code was never added to the GitHub project. Instead, it was surreptitiously implanted in the package that was published to PyPI.
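Because the implant existed only in the published artifact, comparing an unpacked PyPI release against a checkout of the referenced repository is one way to surface it. The sketch below is illustrative only and assumes both trees have already been extracted to local directories with the same layout; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_hashes(root: Path, suffix: str = ".py") -> dict[str, str]:
    """Map each matching file's relative path to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob(f"*{suffix}"))
        if path.is_file()
    }


def diff_trees(published: Path, repository: Path) -> dict[str, list[str]]:
    """List files that exist only in the published package or differ from the repository."""
    pub = file_hashes(published)
    repo = file_hashes(repository)
    return {
        "only_in_published": sorted(pub.keys() - repo.keys()),
        "modified": sorted(k for k in pub.keys() & repo.keys() if pub[k] != repo[k]),
    }
```

Any entry in either list deserves a manual look. Build steps can legitimately change files, so a difference is a prompt for review rather than proof of tampering.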
Advanced security assessment was required
While traditional application security testing (AST) tools would not help in this case, advanced software supply chain security tools like RL Spectra Assure provide deeper visibility into such behaviors. Spectra Assure allows organizations to assess and manage third-party software security risks, and stands out among other tools in that it’s based on behavior indicator extraction during static analysis. In this case, RL researchers used the platform’s behavioral differential analysis, in which two versions of the same package are compared to see the extracted behaviors of each at the file level. Differences between versions for each file can then be reviewed, which uncovers the presence of unexpected or unusual behavior patterns.
In this package, malicious code was detected in versions 0.1.13 and 0.1.14, published on November 20. Comparing these versions with any of the previous versions, or with the content of the referenced GitHub repository, quickly surfaced the problematic behavior patterns shown in Figure 7.
Figure 7: Behavior patterns surfaced by differential analysis of package versions
Behaviors like expression execution and Base64 decoding, combined with the presence of unusually long strings, are commonly observed in malware that is published to the PyPI package repository.
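The idea of comparing extracted behaviors between versions can be illustrated with a toy example. This is not how Spectra Assure works internally; it is a rough sketch that uses Python's ast module to flag the three patterns named above and report which ones are new in a later version. The indicator labels, the function names, and the 500-character threshold are all assumptions.

```python
import ast

# Assumption: literals longer than this count as "unusually long".
LONG_STRING = 500


def _call_name(node: ast.Call) -> str:
    """Return the bare name of the called function, or '' if it is not a simple name."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return ""


def indicators(source: str) -> set[str]:
    """Extract crude behavior indicators from source code by parsing, never running, it."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = _call_name(node)
            if name in {"exec", "eval"}:
                found.add("executes-expression")
            elif name in {"b64decode", "decompress"}:
                found.add(f"calls-{name}")
        elif isinstance(node, ast.Constant) and isinstance(node.value, (str, bytes)):
            if len(node.value) > LONG_STRING:
                found.add("long-string-literal")
    return found


def new_indicators(old_source: str, new_source: str) -> set[str]:
    """Indicators present in the new version of a file but absent from the old one."""
    return indicators(new_source) - indicators(old_source)
```

Run per file across two releases, a non-empty result from `new_indicators` points a reviewer at exactly the files whose behavior changed, which is far less work than reading every file in both versions.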
When RL researchers encounter such patterns in malware that gets published to open source package repositories like PyPI and npm, we combine them into Spectra Assure’s Threat Hunting (TH) policies. TH policies simplify threat detection, and enable your team to find threats in large binaries and packages.
Looking at the content of large projects with hundreds of files is an overwhelming task. Extracting behaviors can simplify that task, but reviewing all of the extracted behaviors and judging if something is malicious or not is challenging. Spectra Assure’s TH policies provide a way to pinpoint threats in large and comprehensive projects, while behavior indicators give developer and security teams a detailed explanation of what the detected threat is capable of.
Spectra Assure’s TH policies, which incorporate research from the RL team, can be combined with RL’s automated threat detection system to provide a comprehensive software bill of materials (SBOM) and risk assessment that help prevent software supply chain threats.
Conclusion
This incident is a clear reminder that open-source software security threats are growing and becoming harder to detect. The efforts taken by the threat actors to disguise their malicious creation meant that even reasonable efforts to assess the quality and integrity of the package in question would not be enough to identify the supply chain threat. With the ever-growing sophistication of threat actors and the complexity of modern software supply chains, dedicated tools need to be incorporated into your development process to help prevent these threats and mitigate related risks.
Indicators of Compromise (IOCs)
Indicators of Compromise (IOCs) refer to forensic artifacts or evidence related to a security breach or unauthorized activity on a computer network or system. IOCs play a crucial role in cybersecurity investigations and cyber incident response efforts, helping analysts and cybersecurity professionals identify and detect potential security incidents.
The following IOCs were collected as part of ReversingLabs’ investigation of this software supply chain campaign.
PyPI packages:
package_name | version | SHA1 |
aiocpa | 0.1.13 | a1187d2a4acfe8ddaee3c7be79a9bb838142903a |
aiocpa | 0.1.13 | 7007be259829d72e73ff63ad409770ca56cfc418 |
aiocpa | 0.1.14 | fc36c157075dd4302f71ed2660e19a61016b085c |
aiocpa | 0.1.14 | 01f7db47368bffa279fb15c688518774454650cf |
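As a minimal sketch of how these IOCs could be used, the following checks whether a local file's SHA1 digest matches one of the hashes listed above. The hash values are copied from the table; the function names are hypothetical, and the sketch assumes the hashes refer to the distribution files as downloaded.

```python
import hashlib
from pathlib import Path

# SHA1 values copied from the IOC table for aiocpa 0.1.13 and 0.1.14.
AIOCPA_SHA1 = {
    "a1187d2a4acfe8ddaee3c7be79a9bb838142903a",
    "7007be259829d72e73ff63ad409770ca56cfc418",
    "fc36c157075dd4302f71ed2660e19a61016b085c",
    "01f7db47368bffa279fb15c688518774454650cf",
}


def sha1_of(path: Path) -> str:
    """Return the hex SHA1 digest of a file."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_ioc(path: Path) -> bool:
    """True if the file's SHA1 is one of the known malicious aiocpa artifacts."""
    return sha1_of(path) in AIOCPA_SHA1
```

Pointing `matches_ioc` at the files in a local package cache or artifact mirror gives a quick check for whether either malicious release was ever downloaded.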