Dependency Confusion As A Tool For Targeted NPM Hacks
We chatted with ReversingLabs Reverse Engineer Karlo Zanki about how NPM packages have been caught serving malware via compromised software updates.
EPISODE TRANSCRIPT
PAUL ROBERTS
Welcome, everyone. This is the latest edition of ConversingLabs, which is ReversingLabs' biweekly podcast where we talk about threats, threat hunting, threats to software supply chain, and all manner of interesting topics like that. I'm your host, Paul Roberts. I'm the cyber content lead here at ReversingLabs. I'm very pleased this week to have one of my colleagues, Karlo Zanki, who is a reverse engineer at ReversingLabs with us. Hey, welcome, Karlo.
KARLO ZANKI
Thank you, Paul. I'm glad to be here also.
PAUL ROBERTS
It's great to have you. You have been the author of a number of really interesting research reports for ReversingLabs in your time here, and you do a lot of really fascinating work, and we're going to talk about some of it this week regarding dependency confusion attacks, which is a very interesting area of investigation. Just a little bit of housekeeping before I get going. For our attendees, we are going to take questions at the end of our conversation with Karlo. So if you've got them, we're on Zoom Webinars, so just use the Q&A feature to pose your questions. There is a chat feature. I'm going to share some links to some of the writing that we've done on NPM-related threats and attacks in the chat. And feel free to communicate with me or the other hosts or other attendees. And following the conversation today, we're going to just have a couple of questions for you about whether you're potentially interested in some follow-up webinars or discussions around dependency confusion attacks. You'll also have the option to get a ConversingLabs t-shirt. The rare ConversingLabs t-shirt. So hang around until the end if you want to score a t-shirt. Karlo, welcome. How are you?
KARLO ZANKI
I'm great, thanks, Paul. How are you?
PAUL ROBERTS
I'm doing well. Hey, could you tell us just a little bit about the work that you do here at ReversingLabs?
KARLO ZANKI
Well, here at ReversingLabs, I'm mainly responsible for research and finding interesting stories for presentation of our... My everyday job is looking at potential malware and trying to find something dangerous in software.
PAUL ROBERTS
So today we're talking about some really interesting discoveries that you made just in the last couple of weeks regarding dependency confusion attacks or threats on the NPM platform, which is kind of an open source platform. Could you talk just a little bit about what dependency confusion is and how dependency confusion attacks work before we delve into what you specifically discovered?
KARLO ZANKI
Well, dependency confusion is a type of attack where the attacker doesn't directly try to penetrate your infrastructure. Instead, they try to circumvent your defenses by making your build environment or one of your developers install the malicious payload on your system without direct interaction. So we aren't talking about phishing emails or something like that, where they are trying to trick you into installing something by clicking on it. Instead, this is a bit more complex attack, where the attacker tries to find one of your weak spots and make your misconfiguration do the job for him. So basically he relies on you having private dependencies, meaning that your source code depends on private packages, and on such packages not existing in the public NPM repository, or anything, whatever.
PAUL ROBERTS
It doesn't have to be NPM, right?
KARLO ZANKI
No, it could be any public package repository. Most of them are, let's say, vulnerable to dependency confusion attacks, because that's a feature, not a vulnerability, as they would say. They provide you with an option to configure your system so that it isn't vulnerable, but by default it is. If you miss the proper configuration steps, you will end up with a vulnerable development environment.
PAUL ROBERTS
Right.
KARLO ZANKI
So they rely on you having a private dependency package inside your build environment infrastructure, and on that package not being listed in the public NPM repository. What they do then is publish their own malicious package to the public repository, give it the same name your package has in your private repository, and give it a higher version number, in the hope that the procedures in your environment will fetch the package from the public repository, because its higher version number suggests that it's a newer version of your package that you would want to install.
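The version-preference behavior described here can be illustrated with a minimal sketch. It is not how NPM or any real package manager is implemented; it only models a misconfigured resolver that picks the highest version number without caring which registry a package came from. The package name `acme-billing-utils`, the registry labels, and the version numbers are all hypothetical.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.5.0' into (1, 5, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def naive_resolve(
    name: str, registries: Dict[str, Dict[str, List[str]]]
) -> Tuple[str, str]:
    """Pick the highest version of `name` across ALL registries.

    This mimics a misconfigured resolver that ignores where a package
    comes from. Returns (registry_label, version).
    """
    candidates = [
        (parse_version(version), label, version)
        for label, packages in registries.items()
        for version in packages.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"{name} not found in any registry")
    _, label, version = max(candidates)
    return label, version


# Hypothetical data: an internal package and an attacker's public copy
# published under the same name with a much higher version number.
registries = {
    "private": {"acme-billing-utils": ["1.0.0", "1.5.0", "2.0.0"]},
    "public": {"acme-billing-utils": ["99.0.0"]},
}

print(naive_resolve("acme-billing-utils", registries))
```

With this data the resolver returns the public copy, which is exactly the outcome the attacker is counting on.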
PAUL ROBERTS
Right. So applications these days might use a variety of both public, open source modules and internal, privately developed modules, and they all kind of come together. And these attacks seem to play on where a module is sourced from and exploit some configuration weaknesses in these platforms, NPM or PyPI or GitHub or whatever, where they might just preference the higher version number over whether something is a public package or a private package. And there was some really interesting, pretty extensive research done about a year ago that revealed this as an avenue to attack some really sophisticated companies, Apple, Tesla and others, who were vulnerable to this, and who found that you could push malicious code into these environments using these dependency confusion strategies. How common is it, Karlo, that folks who might be using a platform like NPM or GitHub or what have you would have it misconfigured or set up in such a way that, again, a public package with a higher version number but the same name as a private, internally developed package might get loaded?
KARLO ZANKI
Well, misconfiguration is a general vulnerability in any software system. OWASP puts misconfiguration very high in the top ten in their vulnerability list, so it's common in any kind of development, and so also in development based on public source repositories. A lot of developers of any kind forget to do proper configuration. Either they don't know how to do it, or they configure it well on two or three systems and forget in just one place. So as soon as you have to do some configuration, and by default the configuration is not protecting you, there is a high probability that some percentage of people will remain vulnerable.
PAUL ROBERTS
Let's move the story up to about the end of April, I think April 29th. The company Snyk, a cybersecurity company, published some research saying that they had discovered some targeted dependency confusion attacks. They didn't name the firm that it was associated with, but these were sort of in the wild, what appeared to be a dependency confusion attack on the NPM platform. Talk about what Snyk uncovered and then kind of how you engaged with that.
KARLO ZANKI
Well, Snyk detected a dependency confusion attack and did a detailed analysis. And let me first say, today, every day, you can see a lot of NPM packages trying to create a proof of concept for dependency confusion. But most of them don't have real, fully functional malware inside them. They just perform a beacon, or something like that. What was interesting to Snyk was that they found packages that contained fully functional malware, so it would get installed and it performed all kinds of malicious operations. It includes PowerShell, so it's fully functional malware. And what was more interesting was that it was well obfuscated and also included encryption of the final payload. So that triggered some interest in them. They did a good write-up about it, found a public deobfuscator for it, and pretty much did the whole analysis. What was interesting was that these packages were published on the 12th of April. They reported them around the end of April. In the meantime, up until today, there were several other packages of the same format with the same content published to the NPM repository targeting other companies, which we can detect based on the username of the NPM publisher of the packages. So all the names were targeting German companies, which was interesting.
PAUL ROBERTS
So you said that this is going on a lot. Like if you were to just monitor or scan all the activity on NPM in a given week or whatever, you would see a lot of these types of precursors to what looked like dependency confusion attacks. But that a lot of them might be the work of people like you, they might be the work of security researchers or security companies kind of testing things out, right? Or who knows what. Non-malicious kind of attack. Hard to know what's behind them. Could be malicious, could not be malicious, but it's not malware per se.
KARLO ZANKI
In this case we had fully functional malware. That's what was interesting.
PAUL ROBERTS
Right.
KARLO ZANKI
And all these packages were obviously from the same actor. The actor did a lot of work to remain anonymous. All the accounts were created with Proton Mail as the mail provider, which means that they didn't want anyone to know who was behind this stuff.
PAUL ROBERTS
Right.
KARLO ZANKI
So it all looks like a genuine malware.
PAUL ROBERTS
And in terms of, like, what was in there. So when you're looking at a dependency confusion attack, one thing you noted was there's one of these NPM packages that has a legitimate name, appears to be the name of an internally developed, non-public module. We don't know where it's from, we don't know who developed it, we don't know where it's used, but it looks like a legitimate name. And then there were these two other packages that were linked to that, that clearly had sort of gibberish, nonsense names. When you go to analyze those, Karlo, what are you looking for when you find stuff like that?
KARLO ZANKI
Well, generally the most important thing is that you always want to find out what the malware is capable of and what the command and control server is to which it phones back. So that's generally what you look for. But what's always interesting is new techniques in malware obfuscation or, I don't know, bypassing of any kind. So something that differentiates it from other malware you have previously seen. And of course, you want to find out who is behind it. Is it somehow connected to some well-known actor from before, or anything? In this case, Snyk did the research, JFrog did the research, we did our research. None of us could pinpoint the actor behind this. That was quite interesting.
PAUL ROBERTS
Right, because all you have to go on is really these maintainer accounts at NPM or GitHub. They're associated with an email address, but it's a kind of straw man email address, a Proton Mail address, and that's kind of the end of the line in terms of what you have as a trail leading to the attacker. Just looking at the packages themselves, if you're on the target side, the defender side, are there red flags? Are there things that you should be looking for if you're concerned about an attack like this on your own organization?
KARLO ZANKI
The first thing you want to look at are version numbers. Let's say you have a private package. You usually put version numbers in sequential order: 1.0, 1.5, 2. And if you suddenly hit a large discrepancy in your version order, you should find that pretty unexpected. But again, that would require you to look at the dependencies you have installed in your software, which, if you have a lot of dependencies in your software, is not an easy task. If you have two or three dependencies, it's not hard to track. But if you have a bit more complex software, which has tens of dependencies, and we don't need to talk about hundreds, it's enough if you have 10 or 20 dependencies, it's hard to keep track of those version numbers. Usually the human brain can hold seven or eight things in context, and beyond that something falls out. So you want to look for high version numbers, and also, again, if you get infected, you want to monitor network activity and do the typical malware detection stuff.
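Since tracking version numbers by hand gets hard past a handful of dependencies, the check can be automated. The following is a minimal sketch, assuming you keep (or can reconstruct) the sequence of versions installed for one internal package; the function name, the threshold, and the sample history are hypothetical.

```python
from typing import List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '1.5.0' into (1, 5, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def find_version_jumps(
    history: List[str], max_major_jump: int = 1
) -> List[Tuple[str, str]]:
    """Return (previous, current) pairs where the major version number
    increases by more than `max_major_jump` between consecutive installs."""
    jumps = []
    for previous, current in zip(history, history[1:]):
        delta = parse_version(current)[0] - parse_version(previous)[0]
        if delta > max_major_jump:
            jumps.append((previous, current))
    return jumps


# Hypothetical install history of one internal package.
history = ["1.0.0", "1.5.0", "2.0.0", "99.0.0"]
print(find_version_jumps(history))
```

A flagged pair is only a hint that something unexpected was installed, not proof of compromise; the threshold would need tuning to your own release habits.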
PAUL ROBERTS
Right. So Snyk discovered these three packages. They wrote it up in a blog post end of April. What was the process you used to go from the work that they did to discovering additional packages, related packages on NPM with some different contents and functionality in them?
KARLO ZANKI
Well, actually I did my everyday job and looked for malicious packages, and when I found the package, I tried to find out more about the encryption. I think it was about some constant seen in the code. And then, of course, the first Google search result was Snyk's blog.
PAUL ROBERTS
Right. You said, "okay..."
KARLO ZANKI
When you find something interesting, you usually try to get more information about that sample to see if it was already found or anything like that.
PAUL ROBERTS
Right. And we know from the - oh, so we're talking about these in the context of targeting German firms. And we're saying that because, first of all, in the Snyk blog post, one of the maintainer accounts I think was Bosch...
KARLO ZANKI
Bosch, yes.
PAUL ROBERTS
Bosch NPM modules or something like that. So there was this Bosch, obviously a big German electronics firm.
KARLO ZANKI
Yeah, but when you have just one maintainer, it's not easy to conclude that.
PAUL ROBERTS
It's a big leap to sort of say this must be targeting Bosch. Right.
KARLO ZANKI
But in our case we had three more packages which also contained names of different German companies, and from that we could conclude that it all ties to German companies.
PAUL ROBERTS
Right. One of the accounts was Bertelsmann NPM. There was the Bosch node modules, there was another one, DB Schenker, which is a rail and logistics company in Germany as well. But putting that all together, to you and to most of us it looks like, okay, this seems to be a targeted attack against German firms, and your mind kind of goes from there. Okay, well, who might want to target German firms, and so on. But the basis for that, again, is the maintainer names and the module names, and typosquatting or this kind of confusion is a part of dependency confusion attacks. Right. So when we talk about typosquatting, that's often used in, like, phishing attacks or watering hole attacks: you set up a domain to sort of look like another domain and then lure people to it and hope that they don't notice that it's different. How does typosquatting work in the context of dependency confusion attacks? Is it a really important component of it or not?
KARLO ZANKI
Well, I wouldn't name it typosquatting, since in most cases the name of the package needs to be the same as the private package. But typosquatting is also quite often seen in public repositories, because there are more vectors for performing typosquatting. The traditional one-letter change is one type. But in NPM, for example, you have scopes, @scope/name, and what you can do is take the basic package name and extend it, or remove the scope from it, to perform typosquatting. So you have more vectors, more possibilities to perform typosquatting in public repositories than in some other, traditional typosquatting landscapes.
PAUL ROBERTS
Right, but you make a really good point, which is for dependency confusion to work, the attack module, so to speak, the public attack module has to actually have the exact same name as the private module. So that would seem to be like if you were worried about this as a development organization, one thing it seems like you should be doing is scanning like NPM or GitHub for modules that have the same name as the private modules that you maintain, right?
KARLO ZANKI
Yes, that's one way to find them. But what you can also do to protect yourself is to use NPM scopes and reserve your scope on NPM, so someone else can't use that against you.
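One way to act on the scope advice is to audit which internal packages are still published without your organization's scope. This is a minimal sketch; the scope `acme` and the package names are hypothetical, and it assumes you already have a list of your internal package names.

```python
from typing import Iterable, List


def unscoped_internal_packages(
    internal_names: Iterable[str], org_scope: str
) -> List[str]:
    """Return internal package names that do NOT live under @org_scope/.

    Unscoped internal names are the ones an outsider could register on a
    public registry under the exact same name.
    """
    prefix = f"@{org_scope}/"
    return sorted(name for name in internal_names if not name.startswith(prefix))


# Hypothetical inventory of internal packages.
internal_names = ["@acme/logging", "acme-billing-utils", "@acme/auth", "ui-kit"]
print(unscoped_internal_packages(internal_names, "acme"))
```

Anything this returns is a candidate for renaming under the reserved scope.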
PAUL ROBERTS
Right. How is it that private modules that you develop internally become known publicly? Right? Like, how does that happen? And what mistakes do development organizations make?
KARLO ZANKI
For example, you accidentally have some build script which publishes to a public repo, and mistakenly it puts out something that wasn't defined in the .gitignore file or something like that. When you perform a publish in NPM, you automatically publish to whichever repo you have configured. And if you didn't configure it to be private, it can again end up somewhere public. There are a lot of ways.
PAUL ROBERTS
Is there an easy way to search for that? Is there an easy way to scan for that? We got a list of private modules, we know their names. Let's go ahead and just scan all these different repos and look for any reference to them.
KARLO ZANKI
Yeah, the research from 2021 did just that: it searched for private names in public repos. So there are several ways to do this.
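A defender can run the same kind of search against their own names. The sketch below assumes the public NPM registry answers a GET on `https://registry.npmjs.org/<name>` with HTTP 404 when no package of that name exists; the helper names and the sample package names are hypothetical, and the lookup is injectable so the logic can be exercised without network access.

```python
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Iterable, List

PUBLIC_REGISTRY = "https://registry.npmjs.org/"


def exists_on_public_registry(name: str) -> bool:
    """Return True if the public registry knows a package called `name`."""
    # Scoped names such as @scope/name need the slash percent-encoded.
    url = PUBLIC_REGISTRY + urllib.parse.quote(name, safe="@")
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return False
        raise  # Any other error is unexpected; surface it to the caller.


def find_public_collisions(
    private_names: Iterable[str],
    exists: Callable[[str], bool] = exists_on_public_registry,
) -> List[str]:
    """Return the private package names that also exist publicly."""
    return [name for name in private_names if exists(name)]


# Example with a stand-in lookup instead of a real network call.
fake_public = {"acme-billing-utils"}
print(find_public_collisions(
    ["acme-billing-utils", "acme-logging"],
    exists=lambda name: name in fake_public,
))
```

Run periodically with the real lookup, any hit on an internal name is worth investigating right away.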
PAUL ROBERTS
Bad guys could do it if they know the name, but of course, as a developer, as a good guy right. Or a potential target, something, you could do that.
KARLO ZANKI
Yes.
PAUL ROBERTS
Interesting. Talk about from the attack standpoint, we talk about dependency confusion, but this is really just a way to place malware in target environments at the end of the day. Right. So could you just talk about kind of connect the dots between these malicious NPM modules like the ones that we saw? So those get loaded maybe by mistake. How do you move from that to actually getting the malicious binaries onto the target environment? How does that process work?
KARLO ZANKI
Well, I'm not sure if I understood the question totally, but once you publish malicious code, upload it to the public repository, you don't need to do anything personally, directly. It's all now on the target and the build environment. If the build environment is configured to, let's say, update dependencies daily, it will do it instead of you. You don't have to perform any further action, you don't have to send phishing emails, anything like that. Your biggest work is in preparation: finding a good target, preparing a good package with a good version. And you don't even have to try too hard to keep it hidden, because since this stuff is done automatically in the background, with no visible damage to the environment, the developer often doesn't see that the dependency got updated.
PAUL ROBERTS
Right. And at that point really it's just a malware infection, right? At that point you're really talking about...
KARLO ZANKI
When that JavaScript is run, Node provides post-install scripts, which enable you, after installation is finished, to launch some Node script which you have in the package. So once you've specified it to launch after the installation, it's up. If you configure it properly to call back to you, probably through DNS or anything that can get out through several layers of protection, you know that you have a version. Now it's up to you what you want to do next. You can put anything else on it.
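Because the install-time script is the trigger, one simple triage step is to list which install hooks a package declares. This is a minimal sketch that reads a `package.json` manifest and reports the `preinstall`, `install`, and `postinstall` entries of its `scripts` section; the sample manifest and the function name are hypothetical.

```python
import json
from typing import Dict

# npm lifecycle scripts that run automatically around installation.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def find_install_hooks(package_json_text: str) -> Dict[str, str]:
    """Return the install-time scripts declared in a package.json manifest."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


# Hypothetical manifest of a suspicious package.
manifest_text = json.dumps({
    "name": "acme-billing-utils",
    "version": "99.0.0",
    "scripts": {"postinstall": "node index.js", "test": "echo ok"},
})
print(find_install_hooks(manifest_text))
```

Plenty of legitimate packages use install hooks, so a hit here only tells you which script to read first.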
PAUL ROBERTS
And your analysis found that these modules had kind of a full range of capabilities that a malicious downloader would have.
KARLO ZANKI
Yes: download packages, upload information, execute shell commands, several commands. The Snyk blog also did a good technical description of the malware itself. So it's definitely a good read.
PAUL ROBERTS
Okay, so we published and JFrog published and Snyk published these blog posts saying, like, NPM dependency confusion attacks are being used to attack German organizations. And then within the last 12 hours, we got a tweet from this German company that I wasn't familiar with, but maybe other people were, Code White Security, saying, thanks so much for your great research. Basically, yeah, this is a red team attack that we have been doing on behalf of our clients, and you discovered it, and good job. Let's see, what should we think about this? So lo and behold, yes, there was an attack on German firms. We were right about that. No, it was not malicious per se. It was part of a planned kind of red team assessment of their security. Good, I guess, that Bertelsmann and Bosch and these other companies are looking at this as an avenue of attack. What do we make of this, Karlo, and what should we conclude?
KARLO ZANKI
Well, personally, my opinion is that, okay, you're a red teamer, you want to do something to prove that something is vulnerable. But usually it's not so common to put fully functional malware as a payload for your threat intel. Because in this case, anyone can copy those packages that were public. Some of them still haven't been removed from NPM, so any malicious actor can copy the fully functional malware, modify it for his own account, and use it again on something else. What's specific about public repositories is that you can always create another account and publish another piece of software. Most of that software doesn't go through some detailed code review which would prevent you from publishing. Security works in a way that someone reports malware and then they remove it. So in this case, when somebody puts a red team tool up which is fully functional malware, anybody else can also download it, rework it, and publish it for his own purposes.
PAUL ROBERTS
This is also happening in the open, basically in public.
KARLO ZANKI
It's one thing when you put red teaming tools in a closed environment of the company you are pentesting. But when you publish it, fully functional, to a public repository, you need to be aware that someone else can detect it and use it for his own really bad purposes.
PAUL ROBERTS
Right.
KARLO ZANKI
So that's why most of the Red Team testing tools on NPM are usually just beacons.
PAUL ROBERTS
Right. One thing I thought is listen, companies that are hired to do red teaming like their job is to get into your environment, right? That's what you're paying them to do. And so they're going to stay on top of whatever the latest and best means for compromising organizations are. The fact that this company used NPM dependency confusion is an indication that the word is out basically that this is a very reliable means to penetrate even sophisticated organizations. White hat companies doing red team assessments are using it but by extension we should also assume that the bad guys are doing this as well.
KARLO ZANKI
What's also positive from this case is that it has definitely raised awareness of the problem. It will probably spread some news among companies, and that's a good thing about it. But, well, sometimes it's good to make some publicity and sometimes not. There's always good and bad.
PAUL ROBERTS
As the saying goes, there's no such thing as bad publicity, right?
KARLO ZANKI
Yeah, that's true.
PAUL ROBERTS
Some people may take issue with that. So I guess the question is, if you're an organization, an enterprise, you're doing software development, you're obviously consuming open source and third party code from different platforms, GitHub, NPM, what have you. You've made a lot of investments in kind of traditional security detection and monitoring tools: IDS, IPS, endpoint, maybe you've got some cloud based stuff as well. So this is a new kind of attack vector that you may not be aware of or have been aware of, but clearly it's one that's being used. What can you really do from a monitoring and detection standpoint to stay on top of this and to feel confident that, if somebody tries a dependency confusion attack against my organization, we're either insulated from it, we're not vulnerable, or we're going to detect it when it happens? What would you recommend? Because there aren't really, like, security tools that do just this, that look just for this type of thing.
KARLO ZANKI
Yes, definitely what you should do is, like always, use the best recommendations for security, because those are the places where you can always catch it when it tries to get outside, if it's not just doing wiping. So you need to stay alert. You can't put aside the traditional methods. But what you need to do, if you use any kind of public source code repository and you have a private network, is, let's say, monitor the communication with the outside public repositories, for example through some intermediate, protected mirror of such public repositories, controlled by yourself, where you then do some analysis of anything that gets into it. And you also need to, let's say, create a process of validating configurations. You need to define a good procedure for proper configuration of your development environment because, as stated by OWASP, misconfiguration is still one of the top problems and top vulnerabilities in any kind of software solution. So you need to do configuration properly. There are a lot of public documents explaining how it is properly done. There are also tools which can help you track dependencies and such stuff. You can also use them.
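A configuration-validation process can start small. The sketch below assumes NPM's `.npmrc` convention of mapping a scope to a registry with a line of the form `@scope:registry=<url>`, and checks that your organization's scope points at your own registry. The scope `@acme`, the URL `https://npm.internal.example/`, and the function names are hypothetical.

```python
from typing import Dict


def scope_registries(npmrc_text: str) -> Dict[str, str]:
    """Extract '@scope' -> registry URL mappings from .npmrc contents."""
    mapping = {}
    for raw_line in npmrc_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # Skip blank lines and comments.
        key, separator, value = line.partition("=")
        key = key.strip()
        if separator and key.startswith("@") and key.endswith(":registry"):
            mapping[key[: -len(":registry")]] = value.strip()
    return mapping


def scope_is_pinned(npmrc_text: str, scope: str, expected_registry: str) -> bool:
    """Return True if `scope` is mapped to `expected_registry`."""
    return scope_registries(npmrc_text).get(scope) == expected_registry


# Hypothetical .npmrc contents.
npmrc_text = """\
; default registry for public packages
registry=https://registry.npmjs.org/
@acme:registry=https://npm.internal.example/
"""
print(scope_is_pinned(npmrc_text, "@acme", "https://npm.internal.example/"))
```

A check like this could run on every developer machine and build agent, which addresses the "configured well on two or three systems and forgot one" failure mode mentioned earlier.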
PAUL ROBERTS
We're getting ready to wrap up. I think we are going to be doing questions and answers. So if you have a question for Karlo about anything we've talked about, use the Q&A feature to ask your question and Karlo and I will get to it. Karlo, the final question is, if you're concerned that an attack like this may have taken place, what can you do to sort of retrace the steps here and piece together something like a dependency confusion attack after the fact, to figure out whether you've been targeted or victimized? What would be the steps that you would use to try and reconstruct an attack and figure out what happened? Is it pretty straightforward, just kind of looking at...
KARLO ZANKI
The first thing you need to know is your dependencies. You need to see what you use from outside repositories. You can do that by analyzing your code. You don't need any further action beyond code analysis. You need to look at the places where your dependencies have been declared and see what you have installed. That's the most important thing. And then you can see if that is really what came from your private repo. If you see some, let's say, anomalies like bigger version numbers or unexpected names, then that's a sign that you could potentially be compromised.
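Checking where each installed dependency actually came from can be sketched against NPM's lockfile. The example below assumes a `package-lock.json` in the newer layout that has a top-level `packages` object whose entries carry `version` and `resolved` fields; older lockfiles are laid out differently and are not handled. The package names, registry URL, and function name are hypothetical.

```python
import json
from typing import Iterable, List, Tuple


def suspicious_resolutions(
    lockfile_text: str, internal_names: Iterable[str], private_registry: str
) -> List[Tuple[str, str, str]]:
    """Return (name, version, resolved_url) for internal packages that
    were NOT downloaded from the private registry."""
    lock = json.loads(lockfile_text)
    internal = set(internal_names)
    findings = []
    for path, entry in lock.get("packages", {}).items():
        if not path:
            continue  # The empty key describes the root project itself.
        # 'node_modules/a/node_modules/@acme/b' -> '@acme/b'
        name = path.split("node_modules/")[-1]
        resolved = entry.get("resolved", "")
        if name in internal and not resolved.startswith(private_registry):
            findings.append((name, entry.get("version", "?"), resolved))
    return findings


# Hypothetical lockfile: one internal package resolved from the public registry.
lockfile_text = json.dumps({
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "acme-app", "version": "1.0.0"},
        "node_modules/acme-billing-utils": {
            "version": "99.0.0",
            "resolved": "https://registry.npmjs.org/acme-billing-utils/-/acme-billing-utils-99.0.0.tgz",
        },
        "node_modules/acme-logging": {
            "version": "1.5.0",
            "resolved": "https://npm.internal.example/acme-logging/-/acme-logging-1.5.0.tgz",
        },
    },
})
for finding in suspicious_resolutions(
    lockfile_text,
    ["acme-billing-utils", "acme-logging"],
    "https://npm.internal.example/",
):
    print(finding)
```

An internal name resolved from anywhere other than your own registry, especially with an unexpectedly high version, is the anomaly described above.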
PAUL ROBERTS
And then if a compromise has taken place, obviously you can see from the Snyk write-up or from our write-up or JFrog's. Often there are indicators of compromise, command and control servers that are used and what have you, that you would use.
KARLO ZANKI
The biggest thing which will catch your eye is probably the name. Someone can modify this code and put in another package name; someone can change the command and control server, encryption keys, anything like that can be changed. But if they're targeting you with a dependency confusion attack, then they are targeting your dependencies from your private repository. So that's the place you need to check, and see if a public package with the same name exists on NPM.
PAUL ROBERTS
Okay, final final question and again if people have questions, ask them because we're about to wrap it up. Final, final question. Are the platform vendors themselves, NPM, GitHub, et cetera, are they engaging with this issue or this problem at all in terms of features or kind of tightening up their business logic to address these attacks or are they sort of like this is the way it's supposed to work?
KARLO ZANKI
Well, my opinion is that they handle it like, this is how we want it to work. You make sure you configure it well and you will be safe.
PAUL ROBERTS
Don't be dumb.
KARLO ZANKI
Yes, read the manual.
PAUL ROBERTS
Got it. Okay, we have one question from the audience, and Carolynn is here. Hi, Carolynn, how are you doing?
CAROLYNN VAN ARSDALE
Hi, everybody. Hi, nice to see you all. My name is Carolynn van Arsdale. I'm a cyber content creator here at ReversingLabs. Happy to be here. We have one question for you, Karlo. Are you ready?
KARLO ZANKI
Yes.
CAROLYNN VAN ARSDALE
So this one question we've got for you is what data source could help defenders detect such attacks? Question for...
KARLO ZANKI
Yes. There is no central source of data. A lot of companies have been doing a lot of research on public repositories recently. That's where software supply chain attacks are becoming really popular. In the last year and a half, a lot of companies have been tackling this problem, doing their own research. There is nothing centralized like the NVD for vulnerabilities. There's no central place where you can find such information. What I personally do is follow a few Twitter accounts which have proven to give the most information. You can always watch some cyber news, because usually in cyber news somebody publishes it; even when the outlet isn't the one behind the research, this information is usually shared. But the fastest you will know is on Twitter. Security professionals like Twitter a lot. You will find it on Twitter, probably before some bigger article is written. So follow some interesting people on Twitter, and that's probably the best source of information.
PAUL ROBERTS
We can share some of those accounts too in a follow up session. Okay, I'm going to launch the poll, and then I got one more question for you, Karlo. So we got a few questions for the audience. First of all, are you interested in learning more? So if you want to go into a deeper dive on dependency confusion attacks, it doesn't have to be NPM, let us know. We've got experts like Karlo who are available to tell you more. Karlo, one of the things we've seen, in fact, there was just a tweet about it recently also is kind of folks kind of taking over domains associated with different code repositories and sort of that there's this kind of ecosystem that is vulnerable to abuse or misuse around open source code and code maintainers and so on. Is there anything really that organizations or companies can do about that risk, of just, again, somebody takes over a domain associated with some big commonly used software module or component. Somebody got a single maintainer module that somebody takes over and starts putting back doors in or stuff like that. We've seen that happen frequently as well. Any easy way to guard against that?
KARLO ZANKI
Well, I'm not sure if there is an easy way. Things were designed to work that way. You should be responsible for your own domain. You should take care that it stays yours. You should protect your secrets. In some other cases, if you want to protect your account, nobody will probably do that instead of you. There could possibly be some two-factor authentication or anything like that, similar to what you see protecting your personal email accounts. That would probably solve some of the problems. But until that's implemented, it's probably your responsibility to do that on your own.
PAUL ROBERTS
It is something that we're really starting to address right now, which is that a lot of this open source ecosystem is really based on trust, right? Trust between the different participants, and that is obviously vulnerable to misuse and abuse. Okay, final question, folks, is the big one, which is: are you interested in the ConversingLabs t-shirt? So if you are, just let us know and we will follow up with an email, get your address, and send you one of the exclusive ConversingLabs t-shirts. And Karlo, is there anything that I didn't ask you that you wanted to say, or a point you wanted to make that I didn't give you a chance to make?
KARLO ZANKI
No. Thank you. It was a really nice conversation.
PAUL ROBERTS
Yeah, I agree. Karlo Zanki, thank you so much for coming on and speaking to us on ConversingLabs. And we'll have you on again. I'm sure you're working on interesting stuff all the time, so it would be really great to have you back to talk about it. And thanks to Carolynn. Thanks once again for your help. And thanks to all of our attendees once again. We'll have another one of these coming up in a couple of weeks. We're going to do some live stuff from the RSA conference beginning of June, so stay tuned. Karlo, have a great day and thanks so much for joining us.
KARLO ZANKI
Thanks, Paul. Thanks to everybody who followed to the end. It was nice for me. Hope it wasn't boring.
PAUL ROBERTS
I don't think it was. All right, man, thank you so much. Thanks everyone.