Verticals + Search


Federating self-hosted search hubs

Searx is a popular meta-search engine that aims to protect the privacy of its users. In the typical use case, a few users trust one instance. However, third-party services can easily fingerprint those users by combining the IP address of the searx instance with the users' queries. The project aims to create a searx federation to solve this issue. First, a protocol needs to be defined that allows the instances to discover each other. Then, each instance will be able to proxy its HTTPS requests through other instances, so the user still only has to trust one instance. Each instance will also spread its requests across the other instances according to their response times, and make sure that IP addresses are used evenly, or at least in the best possible way. To ensure the latter, the statistics page will be enhanced and made available through an API that other instances will use. The federation will make sure that bots can't abuse this pool of IP addresses.
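The idea of spreading requests by response time while keeping IP address usage even can be illustrated with a small sketch. This is not the federation protocol itself, which the project still has to define. The `Peer` fields, the `pick_peer` function and the `penalty_ms` parameter are hypothetical names chosen for illustration, and the cost formula is only one simple way to trade speed against even usage.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    """Statistics another instance might publish (hypothetical fields)."""
    url: str
    avg_response_ms: float  # average response time reported by the peer
    requests_sent: int = 0  # requests this instance already routed via the peer


def pick_peer(peers: list[Peer], penalty_ms: float = 50.0) -> Peer:
    """Choose the peer with the lowest cost and record the use.

    cost = average response time + penalty_ms * requests already sent.
    Fast peers are preferred, but every use makes a peer slightly less
    attractive, so traffic gradually spreads over all known peers.
    """
    if not peers:
        raise ValueError("no peers known")
    chosen = min(
        peers,
        key=lambda p: p.avg_response_ms + penalty_ms * p.requests_sent,
    )
    chosen.requests_sent += 1
    return chosen
```

With two peers, one answering in 200 ms and one in 425 ms, the faster peer is chosen first and receives most of the traffic, while the slower peer is still used once the faster one has accumulated enough load. A real implementation would presumably also need to refresh the statistics periodically and to ignore peers that are unreachable.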

Why does this actually matter to end users?

Search and discovery are among the most important and essential use cases of the internet. When you are in school and need to give a presentation, when you are looking for a job, trying to promote your business or finding relevant commercial or public services you need, most of the time you will turn to the internet and, more importantly, the search bar in your browser to find answers. Searching for information and making sure your name, company or idea can be discovered is crucial for users, but they actually have little control over this. Search engines set the terms for what results you see, how your website can be discovered and what information is logged about your searches. What terms are set remains obscure for users, and they can only follow the rules laid out for them, instead of deciding on their own what, where and how to find the information they are looking for.

More transparent, customizable and privacy-friendly search puts the user in the driver's seat and can provide them with meaningful results. Searx does this by aggregating results from more than 70 search services while avoiding any user tracking or profiling. With every search, users can decide which engines they want to use and which they don't, what search language must be used, and other options; these settings are saved on the device and therefore cannot be tracked. Users are also free to run their own instance of searx, giving them complete control over the source code that makes that version of searx tick (which they can alter however they like) and ensuring additional privacy protection.

This project builds on the open and customizable setup of searx to provide users with extra privacy-protective measures. Interested third parties can potentially use the IP address of a searx instance and the specific queries of its users to uniquely identify and follow them. This project will solve this issue while also making it easier for users to switch between and maintain searx instances. Ultimately this new tooling can help to make searx more user-centric, stable and privacy-friendly.

Run by NLnet and NGI Zero

This project was funded through the NGI0 Discovery Fund, a fund established by NLnet with financial support from the European Commission's Next Generation Internet programme, under the aegis of DG Communications Networks, Content and Technology under grant agreement No 825322.