Great OCR for SANE

Integrate OCR capabilities into open source scanning tools

We have become dependent on search engines, which let us locate a document among billions of webpages using a few specific words. However, not every document is born digital, and some reach the web only indirectly. And users with, for instance, visual disabilities cannot read documents that are 'just' pixels.

The SANE project is a collection of open-source scanner drivers and related software. SANE tools allow users to convert their documents, photos and similar material from a completely unsearchable and non-discoverable analog form into a digital representation that can be easily shared and distributed.

The SANE-OCR project enables users to close the gap right at the stage when physical documents are converted from their incoming "analog" form to a searchable digital form, using a completely open-source stack. The traditional result of scanning is just the visual image (essentially a photo); with SANE-OCR, the output additionally contains the recognized text, obtained through optical character recognition (OCR). This produces documents which are searchable and discoverable.

Run by Kodo Baitas, MB

This project was funded through the NGI0 Discovery Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 825322.
