Backward incompatibilities, the lack of standard schemas for components, and projects staffed by too few developers are just some of the risks threatening the security of free and open-source software (FOSS), a study released by the Linux Foundation, the Open Source Security Foundation (OpenSSF), and Harvard University has found.
The Census III study is based on 12 million observations of FOSS libraries used in production apps at more than 10,000 companies. It follows two earlier efforts to identify risky practices in the FOSS universe, Census I in 2015 and Census II in 2022.
The new study aims to shed additional light on the most commonly used FOSS packages at the application library level, explained the authors of the report, Frank Nagle and Richie Zitomer, of Harvard Business School; Kate Powell, of the Laboratory for Innovation Science at Harvard; and David A. Wheeler, of OpenSSF and the Linux Foundation.
"This effort builds on the Census I report that focused on the lower level critical operating system libraries and utilities, and the Census II report that sought to improve our understanding of the language-level FOSS packages that software applications rely on," the researchers wrote. "Such insights will help identify critical FOSS packages for resource prioritization to address security issues in this widely used software."
Here are the key challenges facing open-source software in 2025.
1. Backward compatibility
In one of the high-level findings in the report, the researchers pointed to risks created when significant differences exist between old and new software versions. A case in point identified in the report is the transition from Python 2 to Python 3, introduced in 2008. It noted that Python 2 is still used by 7% of Python developers — and by an even higher percentage in certain segments, such as data analysis (29%), computer graphics (24%), and DevOps (23%).
“This illustrates that transitioning to a new version of software, when significant incompatibilities are introduced, can take more than a decade if it occurs at all,” the researchers wrote. “It also suggests the potential security risks from backward incompatibility. Python 2 is no longer supported by the Python developers, including for security vulnerabilities, yet in some market segments it still has 23-29% of the users.”
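To make this kind of incompatibility concrete, here is a minimal sketch of two well-known behavior changes between Python 2 and Python 3. The examples are illustrative only and are not drawn from the Census III report; the variable names are hypothetical.

```python
# Runs under Python 3. Comments describe how Python 2 behaved differently.

# 1. Division: in Python 2, "/" between two ints truncated, so 7 / 2 gave 3.
#    In Python 3, "/" is true division and "//" is floor division.
true_div = 7 / 2     # 3.5 in Python 3
floor_div = 7 // 2   # 3 in both versions

# 2. Text vs. bytes: in Python 2, b"alice" was an ordinary str, so this
#    concatenation worked. Python 3 keeps str (text) and bytes separate
#    and refuses to combine them implicitly.
try:
    "user=" + b"alice"
    mixed_ok = True
except TypeError:
    mixed_ok = False

# The explicit Python 3 form: decode the bytes before combining with text.
combined = "user=" + b"alice".decode("utf-8")

print(true_div, floor_div, mixed_ok, combined)
```

Code that silently relied on the older behavior has to be audited and rewritten, which is part of why such migrations can take years.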
Ngoc Bui, a cybersecurity expert at Menlo Security, said backward incompatibility between Python versions can introduce security risks by hindering the adoption of newer, more secure versions.
“Organizations that rely on older versions due to compatibility issues may remain vulnerable to known exploits. Maintaining code for both versions increases complexity, raising the chance of errors.”
—Ngoc Bui
Reliance on outdated libraries and frameworks further expands the potential attack surface, Bui said.
Tim Mackey, head of software supply chain risk strategy at Black Duck Software, said backward compatibility is something that should be supported only "for a very short period of time, not something that you run for a long time."
"If you can avoid it in production, you should," said Mackey, whose company was one of the study’s partners, along with Fossa, Snyk, and Sonatype.
2. Standardized schema efforts make progress
The researchers also found that there are promising efforts to implement a standardized naming schema for software components, which would improve supply chain security and future Census efforts:
“[T]here is a critical need for a standardized software component naming schema that is in widespread use. Until one is widely used, strategies for software security, transparency, and more will have limited effect.”
Until such a schema is widely used, organizations will remain unable to communicate with one another on a global level, the study's authors continued. “Given the increasing frequency and sophistication of cybersecurity incidents in which the software supply chain plays a part, there is precious little time to waste,” they said.
Michael J. Mehlberg, CEO of Dark Sky Technology, said an ideal standardized naming scheme would be a descriptive convention that clearly specifies a software component’s or package’s origin, type, version, and other details needed to distinguish it from other components or packages. It should also provide enough information for vulnerability and risk assessment tools to investigate the software’s trustworthiness, he said.
A package URL (PURL) is one such naming scheme that meets all of these criteria, he said. A PURL is like a universal address for a software package, making it easier to reference and share packages consistently.
Without a standardized naming schema like PURL, software composition, vulnerability, and risk analysis tools can’t do their job adequately, essentially preventing anyone from understanding potential problems and pitfalls that could crop up from using an open-source component or package, Mehlberg said.
“Real problems like dependency management, which enables precise tracking of what software is used and what vulnerabilities might exist in that software, become an issue. Risks like typosquatting, where maliciously named packages mimic legitimate packages, become possible. And things like automation, which are absolutely essential for software applications that may use hundreds or thousands of open-source components, become extremely difficult, if not impossible.”
—Michael J. Mehlberg
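As a rough illustration of how a PURL encodes those details, the sketch below splits a package URL into its parts. It assumes the general layout given in the PURL specification, `pkg:type/namespace/name@version?qualifiers#subpath`. It is a simplified parser, not a complete implementation: it skips the type-specific normalization rules, and the names `Purl`, `parse_purl`, and `_split_right` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple
from urllib.parse import unquote


@dataclass
class Purl:
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str] = None
    qualifiers: Dict[str, str] = field(default_factory=dict)
    subpath: Optional[str] = None


def _split_right(text: str, sep: str) -> Tuple[str, Optional[str]]:
    """Split on the last occurrence of sep; the tail is None if sep is absent."""
    head, found, tail = text.rpartition(sep)
    return (head, tail) if found else (text, None)


def parse_purl(purl: str) -> Purl:
    """Simplified PURL parser (assumption: no type-specific normalization)."""
    scheme, colon, rest = purl.partition(":")
    if not colon or scheme.lower() != "pkg":
        raise ValueError(f"not a package URL: {purl!r}")
    rest = rest.lstrip("/")  # tolerate the optional "pkg://" form

    # Peel optional parts off the right-hand side first.
    rest, subpath = _split_right(rest, "#")
    rest, query = _split_right(rest, "?")

    pkg_type, slash, rest = rest.partition("/")
    if not slash or not pkg_type:
        raise ValueError(f"missing package type: {purl!r}")

    rest, version = _split_right(rest, "@")
    head, tail = _split_right(rest, "/")
    namespace, name = (None, head) if tail is None else (head, tail)
    if not name:
        raise ValueError(f"missing package name: {purl!r}")

    qualifiers = {}
    for pair in (query or "").split("&"):
        if pair:
            key, _, value = pair.partition("=")
            qualifiers[key.lower()] = unquote(value)

    return Purl(
        type=pkg_type.lower(),
        namespace=unquote(namespace) if namespace else None,
        name=unquote(name),
        version=unquote(version) if version else None,
        qualifiers=qualifiers,
        subpath=unquote(subpath) if subpath else None,
    )


example = parse_purl(
    "pkg:maven/org.apache.xmlgraphics/batik-anim@1.9.1?packaging=sources"
)
print(example)

# Same name and version, different ecosystem: the two identifiers differ.
same_name_differs = parse_purl("pkg:pypi/django@1.11.1") != parse_purl(
    "pkg:npm/django@1.11.1"
)
print(same_name_differs)
```

Because the ecosystem type and namespace are part of the identifier, two packages that merely share a name do not compare as equal, which is the property analysis tools need.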
The need for a standardized schema has been brought into sharp relief by the increased use of software bills of materials (SBOMs), said Mackey.
“SBOM transparency has highlighted the fact that there could be the same name for a thing that does different work. And that starts to present a security issue as much as a stability issue.”
—Tim Mackey
The component-naming problem is something that is more acute now that we have SBOMs, Mackey said. “But more importantly, it becomes something that the teams who are trying to patch stuff need to worry about now that people are thinking about their software supply chain more than just what they receive from a specific vendor,” he said.
“Without standardization, a person who's looking at the supply chain side of the equation might not have all of the information to go and say that components with the same name but different functions are, in fact, distinct things. They could end up overwriting a perfectly legitimate implementation of software with something that's just going to break everything.”
—Tim Mackey
“So it's a little bit less of a security problem in terms of an attacker,” Mackey said. “It's more a case [that] from a patch management or vulnerability response perspective, if you have the wrong thing, then you're going to make incorrect decisions that could lead to some amount of failure.”
3. The myth of “many eyes”
The Census III report also warned of the risks of thinly staffed open-source projects. Among 47 of the top 50 non-npm projects in 2023, it noted, 17% had a single developer accounting for more than 80% of commits authored; 40% had only one or two developers accounting for that share; 64% had four or fewer; and 81% had 10 or fewer developers authoring 80% of commits.
The researchers wrote:
“These findings are counter to the typically held belief that thousands or millions of developers are responsible for developing and maintaining FOSS projects. Many projects do receive contributions from many people, but often a few people do most of the work on any given project.”
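A concentration figure like the ones above can be approximated from a project's commit history. The sketch below is one possible way to compute it, not the report's actual methodology: it counts the smallest number of authors who together account for at least a given share of commits. The function name `authors_for_share`, the "at least" threshold, and the sample data are all assumptions for illustration.

```python
from collections import Counter


def authors_for_share(commit_authors, share_percent=80):
    """Return the smallest number of authors covering share_percent of commits.

    commit_authors is an iterable with one author name per commit.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no commits to analyze")

    covered = 0
    # Walk authors from most to least active, accumulating their commits.
    for rank, (_, commits) in enumerate(counts.most_common(), start=1):
        covered += commits
        # Integer comparison avoids floating-point rounding at the threshold.
        if covered * 100 >= share_percent * total:
            return rank
    return len(counts)


# Hypothetical projects: one dominated by a single author, one more spread out.
solo_project = ["ana"] * 85 + ["ben"] * 10 + ["chi"] * 5
team_project = ["ana"] * 50 + ["ben"] * 25 + ["chi"] * 15 + ["dev"] * 10

solo_result = authors_for_share(solo_project)
team_result = authors_for_share(team_project)
print(solo_result, team_result)
```

In the first hypothetical project one author alone clears the 80% mark; in the second it takes three.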
Low staffing has been the dirty little secret of open-source software, Mackey said. “You've heard about the many-eyes thing: 'With open source, everybody's going to take a look at it. It's going to be awesome and wonderful.'” But that's not the reality, Mackey said.
“It's awesome and wonderful if you're the Linux kernel, where you've got lots of developers. Or it's awesome if you're Kubernetes. Lots of developers there. But once you get beyond the top 25, you really start to get into code that only a handful of people know. And that becomes a real problem.”
—Tim Mackey
In practice, software development quality and security are produced through comprehensive tests and peer reviews, said Shane Miller, a senior fellow with the Atlantic Council's Cyber Statecraft Initiative.
“When resources are strained, test coverage and code reviews are often compromised, and software will have more bugs, security vulnerabilities, and operational failures. FOSS projects with a small number of developers may be resource-strained and more susceptible to those quality and security challenges as a result.”
—Shane Miller
Adrien Guinet, head of engineering for the cybersecurity group at SandboxAQ, said the security incident with the XZ Utils project earlier this year is an example of the risks posed by low-staffed open-source projects. XZ Utils is an open-source data compression utility available on almost all installations of Linux and other Unix-like operating systems. Fortunately, the flaw was exposed before being distributed through updates to the Linux supply chain.
“XZ Utils was maintained by only one person, who was having personal issues. He was manipulated into adding another maintainer that effectively backdoored the Debian version of this library.”
—Adrien Guinet
Many OSS libraries that became critical to infrastructure over time are in the same state, Guinet said. “It's related to the very hard problem of how to fund open-source developers and projects.”
4. Memory safety is key: The rise of Rust
The Census III report noted that significant advances have been made using memory-safe languages, particularly Rust. “Use of components from Rust package repositories have increased considerably since Census II, signaling an industry response to memory-safety vulnerabilities,” the researchers wrote.
The report explained that memory-safety vulnerabilities are one of the most prevalent types of disclosed software vulnerability. The root cause is that the C and C++ programming languages, unlike almost all other programming languages, do not prevent memory safety errors by default.
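As a rough illustration of what preventing memory-safety errors by default means, the sketch below uses Python, which is also a memory-safe language, rather than Rust. The buffer contents, the index, and the helper name `read_byte` are hypothetical. An out-of-range read is refused with an exception, whereas the equivalent unchecked array access in C or C++ is undefined behavior and may silently read whatever sits in adjacent memory.

```python
buffer = bytearray(b"PIN=1234")  # 8 bytes of hypothetical data
bad_index = 64                   # well past the end of the buffer


def read_byte(data, index):
    """Return the byte at index, or None if the index is out of bounds."""
    try:
        return data[index]
    except IndexError:
        # The runtime checked the bounds and refused the access.
        return None


in_range = read_byte(buffer, 0)              # byte value of "P"
out_of_range = read_byte(buffer, bad_index)  # rejected, so None

print(in_range, out_of_range)
```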
Based on data from software composition analysis (SCA) scans, the researchers found that the number of Rust components being used has increased since 2020. In Census II, the average percentage of direct components from a Rust package repository was 0.02%. In Census III, that percentage rose to 0.12%. This represents a 500% increase in direct Rust application components reported by SCAs.
“However, the drastic increase in direct Rust application components since Census II points to increased adoption of the Rust language. This is significant progress towards using more memory-safe languages in OS.”
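The 500% figure follows directly from the two averages cited above. The short check below is a sketch, with a hypothetical helper name `percent_increase`; it uses exact fractions so that decimal rounding does not blur the result.

```python
from fractions import Fraction


def percent_increase(old, new):
    """Relative change from old to new, expressed as a percentage."""
    old, new = Fraction(old), Fraction(new)
    return (new - old) / old * 100


census2_share = "0.02"  # average % of direct Rust components, Census II
census3_share = "0.12"  # average % of direct Rust components, Census III

increase = percent_increase(census2_share, census3_share)
ratio = Fraction(census3_share) / Fraction(census2_share)
print(increase, ratio)
```

A 500% increase means the Census III share is six times the Census II share, although at 0.12% Rust remains a small fraction of direct components overall.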
Black Duck's Mackey said that memory safety initiatives from the U.S. Cybersecurity and Infrastructure Security Agency (CISA) over the last year have started to see significant uptake.
“Rust is effectively the poster child for memory-safe programming languages, even though what CISA is focusing on is: 'Don't use C and C++ [and] assembly language anymore. Use a modern language.' That would encompass other languages, but Rust gets a lot of the mindshare as a memory-safe programming language. So we're seeing adoption of Rust in open source, which is a fantastic move forward.”
—Tim Mackey
Mackey cautioned, however, that while an application may be written in Rust, it could still depend on an operating system written in another language. “The application may depend on libraries that come from the operating system that may not be memory-safe,” he explained. “So my code might be memory-safe, but my dependencies may not be.”
The significant rise in the use of Rust package repositories enhances security through Rust’s strong memory-safety guarantees, said Jason Soroko, a senior fellow at Sectigo. However, this growth may also introduce risks if the rapid adoption outpaces thorough auditing and maturity of Rust libraries, he said.
“New or less-tested packages might harbor undiscovered vulnerabilities, and the dependency on a growing ecosystem requires robust security practices to ensure that the benefits of Rust are not undermined by potential weaknesses.”
—Jason Soroko
Are we getting closer to a consensus?
Application security practitioners do not have a common vocabulary, risk analysis framework, or security process for FOSS projects, Cyber Statecraft's Miller said. “The Census III takes us a little closer to an important goal line, and I hope there will be more work like this from a variety of sources in the future,” she said.
Census I and II shed a lot of light on the governance requirements that people should be thinking about, and Census III continues with that trend, Mackey said. “That's one of the big reasons why we support the effort. I see a lot of positive that has come out of the last two censuses, and I expect that's going to continue with Census III.”
Roger Grimes, a defense evangelist at KnowBe4, said there is nothing inherent in FOSS projects that makes them more or less secure than non-FOSS projects, but he acknowledged that the lack of dedicated, paid resources can make it challenging to maintain strong security over time. What makes a FOSS project secure has more to do with its individual leaders, their vision, and their continued enforcement of their ideals, he said.
“With that said, every single previous broad initiative to strengthen FOSS has failed. I don't expect these findings and recommendations to be any different. It is difficult to get agreed-upon consensus in individual projects, loved by independent people.”
—Roger Grimes