Grant
Theme fund: NGI0 Commons Fund
Start: 2025-08

Code Genetics

Scanning tool for identifying code origins

Reuse and remixing are inherent to the nature of FOSS. As a result, it is often difficult to determine which project is the true upstream original where a piece of code was first created. Knowing this is critical for both security and license compliance: for example, there are several known cases of people forking a FOSS project and changing its main license to suit their needs.

Reviewing a codebase for its origin cannot be fully automated (yet) and requires extensive human review to disambiguate and establish correct provenance of code detected through scanning, matching, package manifests and other clues.

AboutCode's Code Genetics features will be integrated in DejaCode, ScanCode.io and PurlDB. They will aggregate scan results from complementary FOSS tools including ScanCode and MatchCode, with further work planned to integrate other tools such as BANG, OWASP depscan or BIDS, in order to help automatically identify the true, correct code origin.

The purpose of the Code Genetics project is to significantly reduce the amount of human review of scan results, limiting it to a small subset of ambiguous, complex cases where the correct code origin cannot be identified automatically.

The outcome of this project will be to aggregate origin scans in AboutCode, design a system of policies and rules to automate scan reviews, integrate these features in PurlDB, ScanCode.io, MatchCode and DejaCode as needed to efficiently review and curate scan results, and finally share the curated data as open digital commons using FederatedCode.
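
As a rough illustration of what a rule for automating scan reviews might look like, the sketch below accepts an origin automatically only when several tools agree on a single candidate with high confidence, and sends everything else to human review. All names here (`OriginCandidate`, `decide_origin`, the thresholds, the example Package URLs and scores) are hypothetical and chosen for illustration only; they are not part of any AboutCode tool or API.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OriginCandidate:
    """One origin proposed by one scanning tool (hypothetical structure)."""
    tool: str     # name of the tool that reported the candidate
    purl: str     # Package URL of the proposed origin
    score: float  # tool-reported confidence, assumed to be in [0, 1]


def decide_origin(candidates, min_score=0.9, min_tools=2):
    """Return (purl, "auto") if exactly one origin is confirmed, else (None, "review").

    An origin counts as confirmed when at least `min_tools` distinct tools
    report it with a score of at least `min_score`. Both thresholds are
    assumptions made for this sketch.
    """
    tools_by_purl = defaultdict(set)
    for candidate in candidates:
        if candidate.score >= min_score:
            tools_by_purl[candidate.purl].add(candidate.tool)

    confirmed = [
        purl for purl, tools in tools_by_purl.items() if len(tools) >= min_tools
    ]
    if len(confirmed) == 1:
        return confirmed[0], "auto"
    # No confirmed origin, or several competing ones: a human has to decide.
    return None, "review"


clear_case = [
    OriginCandidate("scancode", "pkg:pypi/foo@1.0", 0.95),
    OriginCandidate("matchcode", "pkg:pypi/foo@1.0", 0.97),
    OriginCandidate("matchcode", "pkg:pypi/foo-fork@1.0", 0.50),
]
ambiguous_case = clear_case + [
    OriginCandidate("scancode", "pkg:pypi/bar@2.0", 0.93),
    OriginCandidate("matchcode", "pkg:pypi/bar@2.0", 0.94),
]

clear_result = decide_origin(clear_case)
ambiguous_result = decide_origin(ambiguous_case)
```

In this sketch only the ambiguous case reaches a reviewer, which mirrors the stated goal of limiting human review to the cases that cannot be resolved automatically.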

Run by AboutCode Europe ASBL

This project was funded through the NGI0 Commons Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 101135429. Additional funding is made available by the Swiss State Secretariat for Education, Research and Innovation (SERI).