Until recently, malicious actors who wanted to get their hands on sensitive corporate data or IT systems followed a few well-worn paths. Those included phishing attacks on privileged employees, attacks on the organization’s public-facing applications and infrastructure, and exploitation of weak configurations or remote code execution (RCE) flaws.
Security monitoring and detection tools have evolved to address these risks. But now bad actors are following an even more effective, less scrutinized path into vulnerable IT environments — one that exploits secrets buried in both public and private code repositories.
Here's how the bad guys are gaining access to secrets in your software — and what your software and security teams can do to keep them safe.
Secrets are left exposed in code repositories
The trend of hackers turning to open source repositories as a platform to launch attacks and facilitate compromises is being driven by changes in software development practices. For example, agile development and continuous integration/continuous deployment (CI/CD) methodologies require the use of many production-critical secrets such as environment variables, tokens, and keys to set up and maintain development processes and pipelines.
Malicious actors are pursuing those development secrets because they can use them to gain access to protected IT environments — and to facilitate lateral movement from low-value IT assets, such as developer laptops, to high-value IT assets, such as code repositories or Amazon S3 storage buckets.
Attackers can also use credentials and other development secrets to elevate privileges and plant malicious code such as ransomware and wiper malware that can wreak havoc in targeted organizations. Unfortunately, it's easier for attackers to uncover secrets left exposed in your organization’s code than you might think.
Why such secrets are so easy to find is clear: In the rush to develop and test new features, many developers place secrets such as API keys somewhere in the codebase. That’s fine, as long as developers take steps to limit use of that token or scrub it from the code prior to release.
But often that doesn’t happen, said Karlo Zanki, a researcher at ReversingLabs. A stored secret that is never restricted or removed gives anyone who obtains it unfettered use of the access token, he said.
“For example, (developers) might put an access token in the code, but not configure it in the correct way. They might not set a use limit or apply two-factor authentication.”
—Karlo Zanki
Recent examples of secrets leaks include:
- Secrets published to PyPI repositories. An analysis of secrets disclosures on the Python Package Index (PyPI) turned up many packages published by someone going by the name “18550288233” that contained API access tokens, including AWS and Google Cloud API keys, as well as database access credentials and the IP addresses of the endpoints where those databases were hosted. The exposed secrets were spread across dozens of different packages, suggesting that the developer was either unaware of them or indifferent to them.
- Leaky functions. The security researcher Eaton Zveare manipulated the JavaScript of a poorly designed application to bypass user authentication to Toyota’s Global Supplier Preparation Information Management System (GSPIMS). While scanning the JavaScript code for “API keys, secret API endpoints, or anything else that might be interesting,” he discovered a secret function designed to let Toyota administrators adopt any GSPIMS user’s “view” of the application. The feature returned a valid JSON Web Token (JWT) for any Toyota user when provided with only an email address as input. Properly leveraged, the flaw could give an attacker “total, global control over the entire system,” Zveare wrote.
Weaknesses in development tools and platforms expose secrets
Development tools and platforms can also be a target. Zanki’s analysis of the PyPI package dynamo_lock, for example, revealed that it contained web service access credentials for Amazon’s DynamoDB database; Amazon's Simple Queue Service (SQS); and Gigya’s identity management platform in a file, config.py.
Zanki subsequently determined that the credentials were inadvertently exposed. The tool the developer used to create the PyPI package simply added all the files from the project directory structure to the package, which was then published to the PyPI repository — secrets and all. The developer failed to notice the inclusion of the config.py file in the release artifacts that were produced before the package was published.
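One way to catch this kind of mistake is to inspect the release artifacts before they are uploaded. The following is a minimal sketch, not the tooling Zanki or the package author used: the deny-list of file names and the function names `archive_members` and `flag_suspicious` are assumptions chosen for illustration, and a real project would tune the list to its own conventions.

```python
import tarfile
import zipfile
from pathlib import PurePosixPath

# Hypothetical deny-list: file names that commonly hold local settings or
# credentials and should not normally ship inside a published package.
SUSPICIOUS_NAMES = {"config.py", ".env", "credentials.json", "secrets.yaml"}


def archive_members(path):
    """Return the file names inside a wheel (zip) or an sdist (tar archive)."""
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as zf:
            return zf.namelist()
    # tarfile.open detects gzip/bz2/xz compression transparently.
    with tarfile.open(path) as tf:
        return tf.getnames()


def flag_suspicious(members, suspicious=SUSPICIOUS_NAMES):
    """Return the archive members whose base name is on the deny-list."""
    return [m for m in members if PurePosixPath(m).name in suspicious]
```

Calling `flag_suspicious(archive_members(path))` on each built archive as a last step before publishing, and failing the release when the result is non-empty, would surface a stray config.py while there is still time to remove it.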
A similar platform failure was behind the leak at Travis CI in 2021, when a vulnerability in its application exposed secrets for any developer forking a public repository and printing files during a build process, Ars Technica reported. An estimated 700 million clear-text logs were viewable to anyone on the Internet for about eight days before Travis CI noticed and patched the flaw, potentially exposing API tokens and other secrets for thousands of development projects.
Automating secrets discovery is escalating
When mistakes like that happen, whether through tool failure or human error, the risk of losing those secrets to attackers is growing. Why? Because, with so many secrets potentially lurking in public repositories, malicious actors have turned to automation to facilitate secrets discovery, said Ashlee Benge, Director of Threat Intelligence Advocacy at ReversingLabs.
“There are specific types of places where secrets tend to be exposed, and they all have a sort of ‘fingerprint’ or set of characteristics that allow a bad actor to scan the Internet, watching and waiting for new secrets to be exposed.”
—Ashlee Benge
Malicious actors write scripts that use search engines to look for telltale URL patterns, such as those containing the terms ‘api’ and ‘log.txt.’ They are also on the lookout for open or poorly secured Amazon S3 storage buckets that may contain credentials or other sensitive data. By automating the search for these red flags, bad actors can receive alerts almost immediately when new credentials become exposed, Benge said.
Targeted hacks put developers in the crosshairs
In addition to exploiting secrets exposed through inadvertent leaks or tool failures, attackers also steal secrets directly from developers through hacks and targeted cyber campaigns. Hacks of employees or corporate IT assets, as well as malware outbreaks that extend to developers and development environments, have been behind several high-profile security incidents in recent years.
For example, CircleCI suffered a hack of its continuous integration platform in January 2023, and password management application provider LastPass was hacked in August 2022. Both breaches were the result of attacks on individual developers in which the engineers’ corporate laptops were compromised. In the LastPass case, a threat actor was then able to gain access to the company's cloud-based development environment.
With CircleCI, the breach exposed private code repositories of CircleCI customers, including secrets stored in those non-public repositories. In the case of LastPass, those attacks resulted in the theft of cloud backups, including system configuration data, API secrets, third-party integration secrets, and encrypted and unencrypted LastPass customer data.
Of course, not every secret discovered in source code constitutes a security breach. For example, many code repository scans turn up “credentials” that are simply placeholder values tucked away in the developer’s code or comments. These appear legitimate, but don’t provide access to any protected system or data.
Also, development and security teams that are aware of the risk that exposed credentials pose commonly place so-called “canary tokens” in their code bases. These appear to be legitimate access tokens, but do not provide access to protected resources. These teams closely monitor those false tokens in order to flag efforts by hackers to exploit exposed secrets in their code bases.
However, while some secrets are innocuous or are fakes planted by security teams, ReversingLabs data suggests that plenty of dangerous secrets still exist in codebases, and it remains easy for attackers to discover tokens for resources like AWS and Google Cloud infrastructure that have been left exposed in developers’ code.
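The distinction between placeholders, canary tokens, and secrets that deserve a closer look can be expressed as a simple triage step applied to scan results. This is a rough sketch under stated assumptions: the placeholder markers, the `triage` function, and the idea of passing in a set of known canary values are all hypothetical, and a "review" result only means the value was not ruled out, not that it is live.

```python
import re

# Assumed markers of placeholder values; extend this for your own codebase.
PLACEHOLDER_PATTERN = re.compile(
    r"(example|changeme|placeholder|dummy|your[_-]?(api[_-]?)?key|x{6,})",
    re.IGNORECASE,
)


def triage(value, canary_tokens=frozenset()):
    """Classify a candidate secret as 'canary', 'placeholder', or 'review'."""
    # Canary tokens are planted on purpose, so they are matched exactly
    # against the set the security team maintains.
    if value in canary_tokens:
        return "canary"
    # Values built from one or two repeated characters, or containing an
    # obvious placeholder word, are very unlikely to be real credentials.
    if PLACEHOLDER_PATTERN.search(value) or len(set(value)) <= 2:
        return "placeholder"
    # Everything else needs a human or a validation step to decide.
    return "review"
```

For instance, `triage("AKIAIOSFODNN7EXAMPLE")` returns `"placeholder"` because the value contains the word "EXAMPLE", while an unfamiliar high-variety string falls through to `"review"`.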
What’s needed: Secrets visibility
Due to the diversity of causes for the leaking or exposure of development secrets, it’s unlikely that a “silver bullet” solution to the problem will emerge. There is no easy fix for problems as complex as developer awareness, malware, hacks of developers and development environments, and vulnerabilities in third-party development tools and platforms.
But there are changes you can make that will improve the resilience of your development organization to secrets leaks. Among them is gaining better visibility into the presence of stored secrets within code repositories and the management of tokens. Over the last year, a growing number of organizations such as CircleCI and GitHub have introduced free and automated scanning tools to locate secrets hidden in code.
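To give a sense of what automated scanning involves, here is a minimal sketch of a pattern-based scanner. It is not how the CircleCI or GitHub tools work internally. The first rule reflects the documented shape of AWS access key IDs (a four-letter prefix such as AKIA followed by 16 uppercase letters or digits); the second rule, along with the names `RULES`, `scan_text`, and `scan_tree`, is an assumption made for illustration.

```python
import re
from pathlib import Path

# Illustrative rules only; production scanners ship hundreds of patterns
# and usually validate findings before alerting.
RULES = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    # Loose, assumed heuristic: a secret-like name assigned a quoted literal.
    "hardcoded_assignment": re.compile(
        r"(?:api[_-]?key|secret|token|password)\w*\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


def scan_text(text):
    """Return (line_number, rule_name) pairs for every rule that matches."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


def scan_tree(root):
    """Scan files under root, yielding (path, line_number, rule_name)."""
    for path in Path(root).rglob("*"):
        # Skip directories and Git's internal object store.
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file; nothing to scan
        for lineno, name in scan_text(text):
            yield path, lineno, name
```

Running `scan_tree(".")` in a pre-commit hook or CI job gives the kind of visibility described above, though pattern matching alone will produce false positives and miss secrets that do not follow a known format.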
The use of such tools and practices is voluntary today, but that may not be the case for long. The federal government wants to operationalize software supply chain security practices among firms that sell software and services to its agencies. That means that techniques for addressing secrets exposure within your development organization may soon move from “nice-to-have” to “must-have.”