Back2Source next
Better matching of binaries with source code
Sometimes, the released binaries of an open source package do not match its source code. Or the source code does not match the code in a version control repository. There are many reasons for this discrepancy, but in all cases, this is a potentially serious issue because the binary cannot be trusted. Additional (or different) code in the binary could be malware, a vector for unknown software vulnerabilities, or a source of FOSS license compliance issues.
"Back to source" creates analysis pipelines in ScanCode.io that systematically map and cross-reference the binaries of a FOSS package to its source code and source repository, and report discrepancies. We call this the deployment to development analysis (d2d): it maps deployed code (the binaries) to development code (the sources). The goal is to make this "trust but verify" approach applicable to all binaries.
- The project's own website: https://aboutcode.org
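To make the deployment to development idea concrete, here is a minimal sketch of the simplest possible mapping step: matching deployed files to development files by exact content checksum and reporting whatever is left unmapped. It is an illustration only, not the ScanCode.io implementation. The function names (`index_by_checksum`, `map_deployed_to_development`) and the directory layout are hypothetical, and exact-checksum matching only covers files that are copied verbatim into a release; compiled or transformed binaries need other mapping strategies.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1_of(path: Path) -> str:
    """Return the SHA-1 hex digest of a file's content."""
    return hashlib.sha1(path.read_bytes()).hexdigest()


def index_by_checksum(root: Path) -> dict[str, list[str]]:
    """Map each content checksum to the relative paths under root that have it."""
    index: dict[str, list[str]] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            index.setdefault(sha1_of(path), []).append(rel)
    return index


def map_deployed_to_development(
    dev_root: Path, deployed_root: Path
) -> tuple[dict[str, list[str]], list[str]]:
    """Hypothetical d2d-style step: map deployed files to development files.

    Returns (mapped, unmapped):
    - mapped: deployed relative path -> development paths with identical content
    - unmapped: deployed relative paths with no identical development file;
      these are the discrepancies that would need review.
    """
    dev_index = index_by_checksum(dev_root)
    mapped: dict[str, list[str]] = {}
    unmapped: list[str] = []
    for path in sorted(deployed_root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(deployed_root).as_posix()
        sources = dev_index.get(sha1_of(path))
        if sources:
            mapped[rel] = sources
        else:
            unmapped.append(rel)
    return mapped, unmapped


if __name__ == "__main__":
    # Build two tiny throwaway trees to show the report shape.
    with tempfile.TemporaryDirectory() as tmp:
        dev = Path(tmp) / "development"
        deployed = Path(tmp) / "deployed"
        (dev / "src").mkdir(parents=True)
        (deployed / "lib").mkdir(parents=True)
        (dev / "src" / "util.py").write_text("def add(a, b):\n    return a + b\n")
        # Same content as the development file: should be mapped.
        (deployed / "lib" / "util.py").write_text("def add(a, b):\n    return a + b\n")
        # Content with no development counterpart: should be flagged.
        (deployed / "lib" / "extra.py").write_text("print('not in the sources')\n")

        mapped, unmapped = map_deployed_to_development(dev, deployed)
        print("mapped:", mapped)
        print("unmapped (needs review):", unmapped)
```

The unmapped list is the interesting output in this sketch: it holds the deployed files for which no identical development file exists, which is the kind of discrepancy a reviewer would then examine.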
Why does this actually matter to end users?
FOSS packages are under constant attack from malicious actors who try to inject malware payloads and backdoors into the code of these projects to compromise their users' systems. In early 2024, the popular XZ Utils open source file compression utility was subject to such an attack. See https://en.wikipedia.org/wiki/XZ_Utils_backdoor and https://tukaani.org/xz-backdoor for details on this complex and mostly successful compromise of the open source "software supply chain".
"Back to source" was able to detect and flag the backdoor-related code as modified, possibly tainted and requiring review, using code that was published and available before the XZ backdoor became known. With the upcoming Cyber Resilience Act (CRA), systematic analysis of the integrity of FOSS package sources and binaries will become imperative.
Based on this early success, this project extends "Back to source" to analyze and map more package binaries to their corresponding sources and to refine the existing analysis. It will add support for more binary formats and package ecosystems, such as WinPE/COFF binaries on Windows, Mach-O binaries on macOS and iOS, Android binaries, and Java, Ruby and Python packages that mix scripts with native binary libraries.
Run by AboutCode Europe ASBL
This project was funded through the NGI0 Core Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 101092990.