Executive Summary
On December 4, a malicious version, 8.3.41, of the popular AI library ultralytics, which has almost 60 million downloads, was published to the Python Package Index (PyPI). The package contained code that downloaded the XMRig coinminer. The attackers compromised the project's build environment by exploiting a known and previously reported GitHub Actions script injection.
Discussion
Version 8.3.41 of the popular AI library ultralytics was released on GitHub on December 4 and published to the PyPI package repository. As in the aiocpa incident covered in a recent RL research team post, the content of the GitHub repository didn’t match the content of the corresponding PyPI package. Malicious actors compromised the project's build environment and injected the malicious code after the code review stage of the process was finished.
What made the ultralytics incident worse than aiocpa: Project maintainers didn’t properly locate the compromise. As a result, version 8.3.42, published on December 5 to address the incident and serve as a safe upgrade from the malicious version, ended up containing the same malicious code. Finally, a clean version, 8.3.43, was published on the same day, resolving this supply chain attack. On December 7, however, two additional versions, 8.3.45 and 8.3.46, were published to PyPI containing the malicious downloader code in the __init__.py file. This was likely done using a compromised PyPI API token stolen during the initial compromise of the build environment.
This supply chain compromise had the potential to impact a huge number of users since ultralytics (see RL Spectra Assure Community listing) is a GitHub project with more than 30,000 stars — and the PyPI package shows about 60 million downloads.
Infection vector
The recent compromise of the trusted npm package @solana/web3.js (see Spectra Assure Community listing) had a similar impact radius, but it was caused by the compromise of one of the maintainer accounts. In this case, the intrusion into the build environment used a more sophisticated vector: a known GitHub Actions script injection that was previously reported by the security researcher Adnan Khan.
Figure 1: Comment on the GitHub issue pages explaining the infection vector
With this GitHub Actions script injection, a malicious actor can fork any repository that uses ultralytics/actions and open a pull request from a branch whose name contains injection payload code, thereby achieving arbitrary code execution.
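The class of bug described above can be illustrated with a minimal Python sketch. It is not the ultralytics/actions workflow itself: the render_inline helper, the example branch name, and the HEAD_REF variable are hypothetical, and they only mimic how an expression such as ${{ github.head_ref }} is pasted into script text before any shell parses it.

```python
def render_inline(template: str, context: dict[str, str]) -> str:
    """Mimic naive expression expansion (hypothetical helper).

    Values are pasted into the script text verbatim, before a shell sees it.
    """
    for key, value in context.items():
        template = template.replace("${{ " + key + " }}", value)
    return template


# Hypothetical attacker-controlled branch name carrying a harmless payload.
branch = "feature$(echo INJECTED)"

# Unsafe pattern: the untrusted value becomes part of the script itself,
# so a shell would treat $(...) as a command substitution and run it.
unsafe_script = render_inline(
    'echo "Branch: ${{ github.head_ref }}"',
    {"github.head_ref": branch},
)

# Safer pattern: the script text is constant and the untrusted value is
# passed as an environment variable, which the shell expands as data only.
safe_script = 'echo "Branch: $HEAD_REF"'
safe_env = {"HEAD_REF": branch}

print(unsafe_script)
print(safe_script, safe_env)
```

The design point is that the safer variant never lets attacker-controlled text reach the shell parser as code; the value is only ever read from the environment.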
Two maliciously crafted pull requests, #18018 and #18020, were designed to enable backdoor access to the compromised environment. Figure 2 shows the malicious code embedded in the name of the forked branch.
Figure 2: Malicious pull requests designed to trigger execution of the payload code
Based on information provided by ultralytics maintainers, both the user account behind these pull requests, openimbot, and the remote connection established after the malicious payload executed originated from Hong Kong.
Figure 3: User data behind the malicious pull request
This GitHub user, openimbot, has an interesting contribution history: a long period of inactivity between the end of August and the beginning of December, when the reported attack was executed. Researchers might conclude that the account was taken over sometime during this period, but that is not necessarily correct.
Figure 4: Contributions history of the openimbot account, which initiated the compromise
The RL Spectra Assure platform can help prevent this type of attack and quickly pinpoint the malicious content inserted into the package. Figure 5 shows the file-based behavior differences between non-malicious version 8.3.40 and malicious version 8.3.41.
Figure 5: Behavior differences between non-malicious version 8.3.40 and malicious version 8.3.41
From the behavior diff, security teams can conclude that the malicious code was inserted into the files downloads.py and model.py. Figure 6 shows that the code inserted into the model.py file is designed to check the machine type of the system on which the code is executed. Malware often uses this type of behavior to deliver a payload specific to the infected machine.
Figure 6: New behavior introduced in model.py file
Inspection of the source code confirms the above assumption. The code designed to download a platform-specific payload is visible in Figure 7.
Figure 7: Code responsible for downloading and executing payload
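A rough way to surface this kind of machine-type check during review is to scan Python sources for platform queries. The sketch below is a simple heuristic, not how Spectra Assure works, and it assumes the checks are made through the standard library calls platform.system() and platform.machine(), which the figures above do not confirm. Legitimate code uses these calls too, so a hit only marks a place worth reading.

```python
import ast

# Standard-library calls that reveal the host OS or CPU architecture.
SUSPICIOUS_CALLS = {("platform", "machine"), ("platform", "system")}


def find_platform_checks(source: str) -> list[tuple[int, str]]:
    """Return (line number, call name) for each direct platform.* call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and (node.func.value.id, node.func.attr) in SUSPICIOUS_CALLS
        ):
            hits.append((node.lineno, f"{node.func.value.id}.{node.func.attr}"))
    return sorted(hits)


if __name__ == "__main__":
    sample = (
        "import platform\n"
        "if platform.system() == 'Linux':\n"
        "    arch = platform.machine()\n"
    )
    for lineno, call in find_platform_checks(sample):
        print(f"line {lineno}: {call}()")
```

Running the scan on two releases and comparing the hits highlights platform checks that appear only in the newer one.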
File system-related behavior changes visible in Figure 8 suggest that the code that performs the actual downloading of the file and writing it to disk is located in the downloads.py file. The code difference between the downloads.py files in versions 8.3.40 and 8.3.41 confirms that conclusion.
Figure 8: New behavior introduced in downloads.py file
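A file-level comparison of two unpacked releases can be sketched in a few lines of Python. This is a plain content-hash diff, far simpler than the behavior diff shown in the figures, and it assumes both versions have already been extracted into local directories; the function names are illustrative.

```python
import hashlib
from pathlib import Path


def hash_tree(root: Path) -> dict[str, str]:
    """Map each .py file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*.py"))
        if path.is_file()
    }


def changed_files(old_root: Path, new_root: Path) -> dict[str, list[str]]:
    """Report files added, removed, or modified between two unpacked trees."""
    old, new = hash_tree(old_root), hash_tree(new_root)
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

The same comparison can be run between a PyPI artifact and a checkout of the matching GitHub tag, which is where the mismatch in this incident would have shown up.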
Based on the information the RL research team currently has, the malicious payload served in this case was simply an XMRig miner, and the malicious functionality was aimed at cryptocurrency mining. But it is not hard to imagine the potential impact and damage if threat actors had decided to plant more aggressive malware, such as backdoors or remote-access trojans (RATs).
With the additional compromised packages published on December 7, users of the ultralytics project should use increased caution given the risk that additional secrets or access credentials were leaked and could be abused to compromise some of the release artifacts.
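As a first check, users can compare the locally installed ultralytics version against the four malicious releases named in this post (8.3.41, 8.3.42, 8.3.45 and 8.3.46). The following is a minimal sketch using Python's standard importlib.metadata; the function names are illustrative, and a clean result says nothing about secrets that may already have leaked.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Malicious releases named in this post.
COMPROMISED_VERSIONS = {"8.3.41", "8.3.42", "8.3.45", "8.3.46"}


def installed_ultralytics_version() -> Optional[str]:
    """Return the installed ultralytics version, or None if not installed."""
    try:
        return version("ultralytics")
    except PackageNotFoundError:
        return None


def is_compromised(installed_version: Optional[str]) -> bool:
    """True if the given version string is one of the malicious releases."""
    return installed_version in COMPROMISED_VERSIONS


if __name__ == "__main__":
    found = installed_ultralytics_version()
    if found is None:
        print("ultralytics is not installed in this environment")
    elif is_compromised(found):
        print(f"ultralytics {found} is a known malicious release; remove it")
    else:
        print(f"ultralytics {found} is not one of the known malicious releases")
```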
Indicators of Compromise (IoCs)
Indicators of Compromise (IoCs) are forensic artifacts or evidence related to a security breach or unauthorized activity on a computer network or system. IoCs play a crucial role in cybersecurity investigations and cyber incident response efforts, helping analysts and cybersecurity professionals identify and detect potential security incidents.
The following IoCs were collected as part of the RL research team’s investigation of this software supply chain incident.
PyPI packages
package_name | version | SHA1 |
ultralytics | 8.3.42 | ee304a92a9e68e7923d7a37a370c7556ac596250 |
ultralytics | 8.3.42 | 7c6136cf4e857582c2f086673359be94e7e4b702 |
ultralytics | 8.3.41 | dd0577b10e73792f2b2315af63b872fe4123ec9c |
ultralytics | 8.3.41 | bea3060707e6f3fec47aa2af64ea2e774b56e9f5 |
ultralytics | 8.3.45 | 059beed5bcdfea16c05b4d45560c97abfd4af3de |
ultralytics | 8.3.45 | a1f1e3ede7c7e6ae650a294630214ce7fa596255 |
ultralytics | 8.3.46 | 62b6532384bdd9b96af5ac684d87f52efb48f7de |
ultralytics | 8.3.46 | 96f496ac5c64f3c884676dd99d6edae2d079b1e6 |
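The hashes in the table can be matched against locally cached artifacts. The sketch below assumes the SHA1 values refer to the distribution files (for example wheels or source archives) as downloaded from PyPI; the function names are illustrative.

```python
import hashlib
from pathlib import Path
from typing import Optional

# SHA1 digests from the table above, mapped to the affected version.
KNOWN_BAD_SHA1 = {
    "ee304a92a9e68e7923d7a37a370c7556ac596250": "8.3.42",
    "7c6136cf4e857582c2f086673359be94e7e4b702": "8.3.42",
    "dd0577b10e73792f2b2315af63b872fe4123ec9c": "8.3.41",
    "bea3060707e6f3fec47aa2af64ea2e774b56e9f5": "8.3.41",
    "059beed5bcdfea16c05b4d45560c97abfd4af3de": "8.3.45",
    "a1f1e3ede7c7e6ae650a294630214ce7fa596255": "8.3.45",
    "62b6532384bdd9b96af5ac684d87f52efb48f7de": "8.3.46",
    "96f496ac5c64f3c884676dd99d6edae2d079b1e6": "8.3.46",
}


def sha1_of(path: Path) -> str:
    """Compute the SHA1 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def match_ioc(path: Path) -> Optional[str]:
    """Return the malicious version a file matches, or None if no match."""
    return KNOWN_BAD_SHA1.get(sha1_of(path))
```

Calling match_ioc on each file in a package cache or artifact mirror flags any stored copy of the malicious releases.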