Verticals + Search

Vertical use cases, Search, Community

This page contains a concise overview of projects funded by NLnet foundation that belong to Verticals + Search (see the thematic index). There is more information available on each of the projects listed on this page - all you need to do is click on the title or the link at the bottom of the section on each project to read more.

Agorakit — Groupware which is a friendly online home to communities

Agorakit is a web-based, open source organization tool for collectives. By creating collaborative groups, people can discuss topics, organize events, store files and keep everyone updated as needed. The tool is very easy to use: participants only need to register with an email address. The very low barrier to entry and easy-to-use interface make it an ideal tool for heterogeneous groups of people with broadly different backgrounds and skills. Those seem like simple features, but to have access to all of them in the same product without friction is, in our very humble opinion, unique to Agorakit. The scope of this project is to enhance documentation, ease use and installation, and allow external communication (including federation).

>> Read more about Agorakit

Mifos X (Apache Fineract) — Type safety for/refactoring of Apache Fineract banking software

Apache Fineract is a sophisticated core banking system that provides comprehensive financial technology solutions. It offers features for client data management, loan and savings portfolio management, integrated real-time accounting, as well as extensive reporting capabilities. By commoditising core banking infrastructure, Fineract empowers communities and organisations of any size to integrate financial services everywhere.

Mifos X includes a payment orchestration engine and mobile banking apps, lowering the threshold to participate in the digital economy. In the scope of this project, type-safety is added to the software, QueryDSL is introduced to generate code and a significant amount of technical debt is resolved.

>> Read more about Mifos X (Apache Fineract)

Bonfire Search & Discovery — Improving search and discoverability in the Fediverse

Bonfire is a modular ecosystem for federated networks. The project creates interoperable toolkits that people can use to easily build their own apps to meet their specific needs. Users are then free to interact with multiple people and groups using these apps hosted on their own device, regardless of what federated software these other people use. Federated topics within the Bonfire ecosystem can consist of a hashtag, a category in a taxonomy, a location, etc. This enables users to find a topic they are interested in, see everything that was tagged with that (publicly or in their network), and follow it to receive any new tagged content. This will be interoperable with existing fediverse apps like Mastodon without requiring extra development on their end, and will create a decentralised graph of topics that can help relevant information flow from instance to instance.

All content on a Bonfire instance (including remote content coming in via follows or federated topics) will also be aggregated in a local search index with which the user can search their own data, information from people or groups they follow, as well as content from topics or locations they are interested in from around the fediverse. This search will happen locally on their device (which is a plus for privacy), with results appearing instantly while typing a query, and being able to filter the results (e.g., by object or activity type, hashtags, topics, or language). Every line of Bonfire’s code is available to be used or forked, in a collection of libraries that can be assembled and re-assembled to create all kinds of full-featured apps. One example is Bonfire's mutual aid extension where users can post and search for requests and offers across different instances according to topic and/or geographical location.

>> Read more about Bonfire Search & Discovery

Castopod — Podcasting in the fediverse

Castopod is an open-source podcast hosting solution for everyone, that can connect to the Fediverse through the W3C ActivityPub standard (Pixelfed, Mastodon, Pleroma…). Castopod is user friendly, and allows for easy discovery everywhere. Whether you are a beginner, an amateur or a professional, you will get everything you need: you can create, upload, publish, manage server subscriptions (WebSub embedded server). You can allow users to listen to your podcast directly, but just as easily connect to commercial directories (Apple, Google, Spotify…).

Take back control: interact with your audience on your platform (like, share, comment), the social network IS the podcast. In addition to supporting W3C ActivityPub, you can also export to proprietary social networks (Twitter, Instagram, YouTube, Facebook). Castopod is easily hosted on any PHP/MySQL server: unzip it and you and other podcasters are ready to broadcast professionally.

>> Read more about Castopod

COCOLIGHT — Lightweight version of Communecter

COmmunecter is an open source social and societal platform. COCOLIGHT is a low-tech, lightweight client able to connect to any COmmunecter server, allowing both read and contribution modes. It is easy to install and fully ActivityPub compliant, federating organizations, events, projects and open badges. It allows creating networks of many interconnected COPI instances that exchange information and data.

>> Read more about COCOLIGHT

Conzept encyclopedia — An alternative encyclopedia

The Conzept encyclopedia is an attempt to create an encyclopedia for the 21st century: a modern topic-exploration tool based on Wikipedia, Wikidata, the Open Library, Archive.org, YouTube, the Global Biodiversity Information Facility and many other information sources. It is a semantic web app built for fun, education and research. Conzept allows you to explore any of the millions of topics on Wikipedia from many different angles - such as science, art, digital books and education - both as a defined semantic entity ("thing") and as a string. In addition, client-side topic classification allows for fast, higher-level logic throughout the whole user experience. Conzept also has a uniquely integrated user interface, which gives you a single well-designed view of all this information (in any of the 300+ Wikipedia languages), without cognitive overload.

>> Read more about Conzept encyclopedia

ArtistHub — Allow creative artists to gain visibility and build reputation on the web

The Artist Hub is a progressive web app developed by The Creative Passport MTU, that allows users - Music makers - to connect different data sources and display their feeds all in the same global wall arranged in chronological order. Music makers will be able to create a custom fan page on a self-hostable server where all their music and related content can be placed and shared with their fans.

The underlying architecture for subscribing to and receiving posts/updates from connected services will be built using ActivityPub. The idea behind this architecture is a free and open-source way for music makers to share their content without needing to post to a number of different websites and social media and for fans to have the freedom to choose their platform of choice for engaging with that content.

We will use ActivityPub to aggregate data from a number of platforms. This will enable us to offer support for video (using PeerTube), audio (using Funkwhale), images (using PixelFed) and text (using Mastodon).

>> Read more about ArtistHub

Decidim revamp — Tools for participatory democracy

Decidim is a free and open digital infrastructure for participatory democracy. Decidim allows anyone to create and configure a web platform to be used as a political network for democratic participation. The platform is freely available for organisations and institutions seeking to initiate participatory processes such as deliberation, decision-making, collaboration, direct democracy and co-design.

In order for the project to reach a new stage of technical maturity, the project will overhaul the user experience through a complete redesign of its interface. It is necessary to review, order and, if necessary, remove features. This project is focused on doing the less visible, but necessary work, to make the code clean and sustainable in the long term.

>> Read more about Decidim revamp

DeltaBot — Social discovery over mail-based chat

Why should humans be the only ones searching for new content that is relevant to them, if bots can be made to do the same on their behalf? The DeltaBot project will research and develop decentralized, e2e-encrypting and socially trustworthy bots for Delta Chat (https://delta.chat). Bots will bridge with messaging platforms like IRC and Matrix, offer media archiving for their users and provide ActivityPub and RSS/Atom integration to allow users to discover new content. Our project is not only to provide well-tested and documented chat bots in Python but also to help others write and deploy their own custom bots. Bots will perform e2e-encryption by default and we'll explore seamless ways to resist active MITM attacks.

>> Read more about DeltaBot

Discourse ActivityPub — Connecting internet discussions with ActivityPub

Discourse is a modern open source discussion platform. In some ways it can work similarly to email, but it is much better suited to large-scale group discussions that in turn become searchable (i.e. indexable) items of knowledge on the world wide web (given that the forum is publicly viewable). We are building a two-way mirror for Discourse topics, compatible with the ActivityPub standard. The first iteration of this will be "Live Topic Links": when a topic is created on Forum A by pasting a URL to a topic on another Discourse instance (Forum B), the user is prompted "would you like to sync replies between this forum and the forum you're linking to?" If the user clicks "yes," replies to the mirror topic on Forum B would be synced back to the topic on Forum A, and vice versa (if Forum B has "whitelisted" Forum A).

>> Read more about Discourse ActivityPub

EGIL SCIM client — System for Cross-domain Identity Management

Managing student information in an effective, secure and GDPR-compliant way is crucial for the digitalized school. EGIL is an open source client that facilitates the exchange of student information with external providers of study material or administrative services in a standardized way. It supports attributes based on SCIM (RFC 7642-7644) and extensions, provides an interface to common directory services, and supports federated solutions between a large number of school principals and service providers. This project will improve EGIL's federative capabilities, submit an Internet-Draft on the subject of federated account provisioning, and provide a proof of concept for using SCIM as the standard for exchange of student information. This will eliminate the problems caused by using several different exchange protocols and formats between school principals and service providers.

>> Read more about EGIL SCIM client

EduLuanti — Education platform centered around 3D/cube world Luanti

EduLuanti (previously known as the MinetestEdu project) is an open-source initiative designed to provide French teachers with tools for using the Minetest video game in the classroom. The aim is to encourage the adoption of open-source tools among educators and students in France and abroad, while contributing to the Luanti community with the development of educational features and customisable graphical elements with a focus on improved filtering of educational mods and enhanced manipulation of 3D data. This initiative follows on from the UNEJ (Urbanités Numériques En Jeux) project, which was developed in the north of Paris and is one of several projects using Luanti for education.

>> Read more about EduLuanti

EDeA — A forge suitable for open hardware development

The short version: EDeA is a novel approach to allow exploration of and improve discovery within the open hardware ecosystem - in order to help make open hardware designs and components discoverable and reusable.

At this moment in time, pretty much everything surrounding open hardware development is manual. Beyond just typing something into a generic search engine there isn't really suitable tooling available to search across what already exists. Accessible and usable distributions, collaboration tools and version control are what drove the free and open source software revolution, now open hardware needs to take the same leap forward.

Open hardware electronics projects are growing in numbers, thanks to crowdfunding, a strong developer community, and sophisticated open source electronic design automation (EDA) tools like KiCad. There is a logical association between circuit schematic and printed circuit board (PCB) layout, but the two are handled by separate programs, and therefore one can’t simply copy-paste design blocks. In 2020 it is still next to impossible to reuse proven parts of different designs without needless reimplementation. By leveraging KiCad’s pcbnew and eeschema scripting, a new way of building modular, reusable electronics opens up. We are creating a catalog and community portal for discovery and development of proven circuit modules: power management, signal conditioning, data conversion, micro-controllers, etc.

>> Read more about EDeA

Email for expert news — Keep up to date with a flow of publication

Full text search can help locate text within a certain corpus, but it doesn't help much with staying up to date with the continuous development of a certain field. Ingesting the daily flood of potentially relevant publications is time-consuming, and so sharing and delegating effort makes a lot of sense. Bims (Biomed News) and NEP (New Economics Papers) are long standing projects in this vein, based on PubMed and RePEc, respectively. They are early examples of expertise sharing systems that deliver digests - human curated sets of the most relevant new publications. Dedicated experts filter the flow of incoming publications in different domains, allowing everyone to stay up to date with the latest developments through publicly available periodic reports on a variety of topics.

This project aims to build a new software tool to allow users to subscribe to these reports across different fields of interest. Subscribers get a fully personalised report, meaning they will not have to deal with distractions such as duplicate items. The software aims to be generic, so it may be applied to any serial data of records formatted in a structured way.

>> Read more about Email for expert news

Empowering Mobilizon — Find, create, organise and curate events

Mobilizon empowers users to create collaborative platforms for promoting local events, activities, and groups. Utilizing the ActivityPub protocol, these platforms facilitate information sharing, allowing users to publish their events on one Mobilizon instance and broadcast them across others when appropriate. Designed with user-friendliness in mind, Mobilizon aims to reduce local advertisers' reliance on major tech companies. Currently, dozens of Mobilizon instances are operational, collectively attracting thousands of users. However, this is not enough to harness the full potential of the network effect and drive meaningful societal change. Numerous enhancement requests and areas for improvement have been identified, and it is crucial to refine and prioritize these initiatives. Should we enhance federation with ActivityPub? Develop solutions to combat spam? Allow users to join a waiting list for fully booked events? Improve categorization and search functionalities? Address persistent bugs? Optimize response times? To tackle these challenges, we aim to establish a governance structure involving other instance administrators. Together, we can prioritize the most impactful changes and integrate them into our roadmap, ultimately making it easier for the community to discover and engage with local activities.

>> Read more about Empowering Mobilizon

Explain — Deep search on open educational resources

The Explain project aims to bring open educational resources to the masses. Many disparate locations of learning material exist, but as yet there isn’t a single place which combines these resources to make them easily discoverable for learners. Using a broad array of deep content metadata extraction techniques developed in conjunction with the Delft University of Technology, the Explain search engine indexes content from a wide variety of sources. With this search engine, learners can then discover the learning material they need through a fine-grained topic search or by uploading their own content (e.g. exams, rubrics, excerpts) for which they require additional educational resources. The project focuses on usability and discoverability of resources.

>> Read more about Explain

FOSS Warn — Aggregate source of emergency alerts

The FOSS Public Alert Server lets clients receive push notifications (via UnifiedPush) about official emergency alerts worldwide. Besides infrastructure like sirens, radio, and Cell-Broadcast, CAP (Common Alerting Protocol) alerts are another way of alerting the public. CAP alerts are used for a wide variety of emergencies, from extreme weather to contaminated drinking water to pandemics. Our server bundles over 280 official CAP alert publishers worldwide and can easily be extended to more sources. This project aims to bundle the underlying alerting infrastructure into a single trustworthy source of information, not to replace it.

Having a shared global public source of information reduces the user's dependency on local emergency apps - which are often only available for the two largest mobile platforms. Furthermore such a converged effort makes it much simpler to develop clients for devices other than cell phones (like desktop PCs or smart speakers). Thirdly it can make traveling safer. Finding and installing the right local emergency apps to receive emergency alerts when traveling is quite the hurdle. With our solution, it would suffice to install one app for the world. One such app is FOSS Warn, an Android app that for now receives alerts for Germany and Switzerland. Within this project, FOSS Warn will be extended to work worldwide with the new server infrastructure.

>> Read more about FOSS Warn

FairSync — Simplify aggregation and discovery of places and events

How can we make it possible to search across different maps and lists of events maintained by different organisations? By connecting them, of course! FairSync develops and collects best practices to synchronize maps and events and to federate messengers and identities active in the global movement for sustainability. System integrators are faced with fast-evolving APIs and protocols when they try to discover and connect systems and make search easier.

We will work on master-master replication frameworks for metadata-enriched data sets and test them with platform providers for sustainability affairs. One approach is the "lazy master scheme": a common update propagation strategy where changes to a primary copy are first committed at the master node; afterwards, the secondary copy is updated in a separate transaction at the slave nodes.
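The lazy master scheme described above can be illustrated with a minimal sketch. Everything in it is hypothetical and not taken from FairSync's code: the `Node` and `LazyPrimary` names, the in-memory dictionaries standing in for databases, and the sample record. It only shows the ordering of the two steps: the write is committed on the primary ("master") copy first, and the secondaries ("slave" nodes) are updated later in a separate step.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical replica holding a key-value copy of the data set."""
    name: str
    data: dict = field(default_factory=dict)


@dataclass
class LazyPrimary:
    """Primary copy that commits locally first and propagates later."""
    primary: Node
    secondaries: list
    pending: list = field(default_factory=list)

    def write(self, key, value):
        # Step 1: commit the change on the primary copy only.
        self.primary.data[key] = value
        # Queue the change so it can be shipped to the secondaries later.
        self.pending.append((key, value))

    def propagate(self):
        # Step 2 (separate, later): apply queued changes to every secondary.
        for key, value in self.pending:
            for node in self.secondaries:
                node.data[key] = value
        self.pending.clear()


# Made-up example: two maps that should end up with the same event record.
primary = Node("map-a")
replica = Node("map-b")
store = LazyPrimary(primary, [replica])
store.write("event:42", {"title": "Repair cafe"})
stale = dict(replica.data)  # snapshot taken before propagation: still empty
store.propagate()
```

Between `write` and `propagate` the secondary is briefly out of date, which is the trade-off a lazy scheme accepts in exchange for fast commits on the primary.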

We will try to advance such immediate update propagation in this project using protocols such as ActivityPub or the InCommon API. Federation of identities will be managed with SAML or OAuth2 protocols with fairlogin as a common identity provider.

>> Read more about FairSync

Faircamp 1.0 — Self-hostable, maintenance-free websites for audio producers

Faircamp is a static site generator for audio producers, empowering artists, labels and everyone else working with sound to distribute their work on their own, with low resource requirements and little to no maintenance effort. The aims within this project are to address usability, accessibility and cultural concerns, to improve documentation, to implement missing core architecture components and complete the embedding functionality, as well as complementary bugfixing and smaller feature additions.

>> Read more about Faircamp 1.0

searx — Federating self-hosted search hubs

Searx is a popular meta-search engine, with the aim of protecting the privacy of its users. In the typical use case, a few users trust one instance. However, third-party services can easily fingerprint the users using the IP address of the searx instance and the users' queries. The project aims to create a searx federation to solve this issue. First, a protocol needs to be defined to allow the instances to discover each other. Then, each instance will be able to proxy the HTTPS requests through other instances, so the user only has to trust one instance. Also, each instance will spread the requests to other instances according to their response time, and make sure that IP addresses are evenly used, or at least in the best possible way. To ensure the latter, the statistics page will be enhanced and made available through an API that other instances will use. The federation will make sure that bots can't abuse this pool of IP addresses.
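As an illustration of spreading requests according to response time, here is a minimal sketch. It is not the searx federation protocol, which the text above describes as still to be defined. The weighting rule (probability inversely proportional to measured response time), the function names and the example hostnames are all assumptions made for this example.

```python
import random


def selection_weights(response_times):
    """Map instance name -> probability, favouring faster instances."""
    inverse = {name: 1.0 / seconds for name, seconds in response_times.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def pick_instance(response_times, rng=random):
    """Choose one instance at random according to the weights above."""
    weights = selection_weights(response_times)
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical measurements in seconds, as a statistics API might report them.
measured = {
    "searx.example.org": 0.2,
    "search.example.net": 0.4,
    "sx.example.com": 0.4,
}
weights = selection_weights(measured)
chosen = pick_instance(measured, random.Random(0))
```

Because the choice is random rather than "always the fastest", slower instances still receive a share of the traffic, which is one simple way to keep the pool of IP addresses in use more evenly.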

>> Read more about searx

First Classify Documents — Categorise different types of official documents

With governments all over the world turning to digital filing systems, millions of paper files still wait to be digitized. One major challenge in this process is a structured approach to classifying and ordering documents. It is an unfortunate fact that many public documents are bitmap images of texts. For instance, tenders are published digitally but the actual resulting contracts are not published in a way that allows them to be indexed and queried - which hinders civil society in their ability to access these documents. Open source OCR software needs to become better to get good results with this. This project developed a system for models to distinguish between different types of official documents, able to classify state documents according to structure, keywords, document name, word and page count, metadata and context.

>> Read more about First Classify Documents

Folksonomy engine for the food ecosystem — Data modelling by the community

Everybody is interested in the food they eat, from many different angles, ranging from taste, cost, ingredients and nutrition to its impact on health, the environment and society. We also happen to have many different names for the same food, the way we prepare it and other properties - sometimes only used very locally. That means it is not always easy for everyone to effectively search open data sets like Open Food Facts. Open Food Facts - sometimes referred to as the "Wikipedia for food products" - is the biggest open food database in the world.

The Folksonomy engine for the food ecosystem created within this project will unleash an ocean of new data and uses regarding food. Citizens, researchers, journalists, professionals, artists, communities, and innovators will be able to define and add new properties of their choice to food products on Open Food Facts for their own use or to enrich the shared knowledge. Open Food Facts already feeds hundreds of data reuses. Thousands more will become possible thanks to the new user defined properties.

>> Read more about Folksonomy engine for the food ecosystem

Funkwhale — ActivityPub-driven audio streaming and sharing

Funkwhale is a federated platform that provides tools for managing, publishing, and sharing audio content using the ActivityPub protocol. In this project, the team will expand the use of ActivityPub and extend the integrations with other ActivityPub-powered platforms. The flagship web app will be redesigned, adding support for more content types in its API, creating new features that integrate with MusicBrainz, and making the mobile Android offering feature-complete, as well as adding a Tauri-based cross-desktop app.

>> Read more about Funkwhale

Funkwhale — ActivityPub-driven audio streaming and sharing

Funkwhale is a free, decentralized and open-source audio streaming and sharing platform, built on top of the ActivityPub protocol. It enables users to create communities of interest around music and audio content in general, listen to their private music library or distribute their own productions on the network. Each Funkwhale pod, or server, can communicate with other pods to exchange audio content, metadata or for user interactions. In this project, Funkwhale will improve the publication experience for creators, release its first stable version, improve content discovery inside the platform through better sharing and search mechanisms. We will also continue research and development for Retribute, a community wealth sharing platform meant to support creators on Funkwhale or any other platform.

>> Read more about Funkwhale

GNU social — Modernizing the original FOSS Social Network

GNU social is a free social networking platform, easily self-hostable and highly accessible, that enables both private and public decentralized communications. With NLnet NGI Zero's support, the project is undergoing a change of main focus from microblogging to groups and tags. With this, GNU social will be a space for communities where users can express their passions and explore new ones. Users will be able to immerse themselves in easily filterable content relevant to their interests, and to create and join communities. It's hard to pinpoint an existing alternative service that promotes the same level of functionality in terms of tagging, filtering and connecting with people that share common interests, especially considering the available degree of accessibility, customization and expansion via plugins.

>> Read more about GNU social

Taler for local currencies. — Free software banking backend for local currencies

This project is about extending GNU Taler’s LibEuFin software to make it suitable as a core banking system for local or regional currencies, in combination with the Taler payment system. The innovation comes from employing FLOSS technology, and having a centrally managed and yet privacy-preserving payment system.

Our focus will be on creating interfaces to allow regional currency administrators to control the platform, including account creation, controlling money supply, analyzing transactions, and setting of relevant policies. Additionally, we will support onboarding of customers, including offering them a way to trade fiat currency (e.g. EUR) for the local currency or vice versa (if permitted by the currency conversion policies of the platform).

We will work with cities and regions that have deployed regional currencies (or are planning to do so) to better understand their needs and adapt our plans according to their use-cases.

>> Read more about Taler for local currencies.

Geolexica reverse — Reverse Semantic Search and Ontology Discovery via Machine Learning

Ever forgotten a specific word but could describe its meaning? Internet search engines more often than not return unrelated entries. The solution is reverse semantic search: given an input of the meaning of the word (search phrase), provide an output with dictionary words that match the meaning. The key to accurate reverse search lies in the machine’s ability to understand semantics. We employ deep learning approaches in natural language processing (NLP) to enable better comparison of meanings between search phrases and word definitions. Accuracy will be significantly increased. The project outcome will be employed on Geolexica as a pilot application and testbed for evaluation. The ability to identify entities with similar semantics facilitates ontology discovery in the Semantic Web and in Technical Language Processing (TLP).
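The input/output shape of reverse search (phrase in, ranked dictionary terms out) can be sketched as follows. Note that this deliberately swaps the deep learning comparison described above for a much cruder stand-in, word-overlap cosine similarity, purely so that the example stays self-contained. The function names and the three-entry dictionary are made up for illustration and are not Geolexica data.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lower-case the text and count its word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def reverse_lookup(phrase, dictionary):
    """Return terms ranked by how similar their definition is to the phrase."""
    query = bag_of_words(phrase)
    scored = [(cosine(query, bag_of_words(definition)), term)
              for term, definition in dictionary.items()]
    return [term for score, term in sorted(scored, reverse=True) if score > 0]


# Made-up miniature dictionary, for illustration only.
toy_dictionary = {
    "geoid": "surface of constant gravity potential that approximates mean sea level",
    "datum": "reference frame used to define positions of points on the earth",
    "raster": "grid of cells that each store a value for an area",
}
ranked = reverse_lookup("reference used to define positions", toy_dictionary)
```

A word-overlap measure like this fails as soon as the search phrase uses synonyms of the words in the definition, which is the gap that semantic models are meant to close.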

>> Read more about Geolexica reverse

Real time graph database search engine — Live filtering on graph database streams

Based is the world's first open source pub/sub real time graph database. It allows for millions of concurrent connections to changes in data or relationships, and offers built-in features such as authentication, internationalisation, server-side scripts for automation, time-series data, and user management. This saves money, complexity, and maintenance. In this project we will work on a full text indexing engine, that will give developers and end users the ability to query text in real time – and get back any updates in text instantly. The search engine is geared toward working with our database, but is applicable to any database in which users are interested in text search that updates in real time and indexes dynamically.

>> Read more about Real time graph database search engine

The Open Green Web — Ethical meta-search filter on green hosted websites

The world wide web has become a mainstay of our modern society, but it is also responsible for a significant use of natural resources. Over the last ten years, The Green Web Foundation (TGWF) has developed a global database of around 1000 hosters in 62 countries that deliver green hosting to their customers, to help speed a transition away from a fossil fuel powered web. This has resulted in roughly 1.5 billion lookups since 2011 - through its browser based plugins, manual checks on the TGWF website and its API, provided by an open source platform. But what if you want to take things one step further? This project will create the world's first search engine with ethical filtering, that will exclusively show green hosted results. In addition to giving a new choice of search engine to environmentally conscious web users, all the code and data will be open sourced. This creates a reference implementation for wider adoption across industry of search providers, increasing demand and visibility around how we power the web. The project builds upon the open source search engine Searx, and will collaborate with the developers of that search tool to make "green" search an optional feature for all installs of Searx.

>> Read more about The Open Green Web

Great scanning and OCR for mobile devices —

The aim of this project is to improve scanning and optical character recognition on mobile devices. Currently the cameras of many mobile devices have relatively noisy output whenever lighting conditions are less than optimal. Additionally, it's almost impossible to achieve scans that are distortion free, as mobile devices don't have a surface against which the document under scan could be reliably pressed. These two problems lead to difficulties in performing optical character recognition on acquired images, as most recognition algorithms require an input that is noise and distortion free. The solution that will be developed by this project will solve both of these problems by acquiring multiple scan images from different angles. The same objects can then be matched across the source images, providing two benefits: the noise can be cancelled out and the 3D shape of the document under scan can be derived. Such information can then be used to unfold the document to 2D space and provide a noise and distortion-free image to optical character recognition algorithms. The solution will be implemented taking into account the performance limitations of mobile devices, and a major optimization effort will be spent to achieve an acceptable latency of the complex image processing algorithms.
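The noise-cancelling benefit of multiple captures can be shown with a minimal sketch. It covers only the averaging step and assumes the hard part, matching the same objects across frames, has already been done, so that every frame is pixel-aligned. The 3D shape recovery and unfolding are not modelled. The function name and the pixel values are made up for illustration.

```python
def average_frames(frames):
    """Average several already-aligned grayscale frames pixel by pixel."""
    count = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [sum(frame[y][x] for frame in frames) / count for x in range(width)]
        for y in range(height)
    ]


# Three made-up noisy captures of the same 2x2 patch of a document.
frames = [
    [[103, 198], [49, 2]],
    [[98, 203], [52, 0]],
    [[99, 199], [49, 1]],
]
denoised = average_frames(frames)
```

Independent random noise tends to average out across frames while the underlying document content stays the same, which is why more captures generally give a cleaner input for recognition.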

>> Read more about Great scanning and OCR for mobile devices

Hypermachines: Realtime and Collaborative P2P Search — Realtime and Collaborative P2P Search

Modern search systems don't work offline, rely on proprietary indexes, and give users limited interfaces for content discovery. Our earlier work on the Hypercore Protocol produced a collection of data structures and networking modules for building low-latency, secure P2P applications. With this project, we will extend the Hypercore Protocol with a novel mechanism for distributing sandboxed computation, called Hypermachines, that can be combined with the existing data structures in our stack to power a next-generation search system. Hypermachines are deterministic JavaScript programs, akin to lightweight smart contracts, that introduce algorithmic transparency and compositionality into our ecosystem. Users can create powerful indexing pipelines that merge their Hypermachine datasets together, yielding a highly-composable, collaborative search engine. By storing indexing logic directly alongside data structures, users can see exactly how indexes are produced, verify that they were produced correctly, and modify them according to their needs. We imagine a future in which Hypermachines power a decentralized marketplace for collaborative, transparent, and fast search engines.

ipfs-search.com — Search engine for the Interplanetary File System

ipfs-search.com is a Free and Open Source (FOSS) search engine for directories, documents, videos, music on the Interplanetary Filesystem (IPFS), supporting the creation of a decentralized web where privacy is possible, censorship is difficult, and the internet can remain open to all.

>> Read more about ipfs-search.com

IN COMMON — Public platform to map and act together for the Commons

IN COMMON emerged as a transnational European collective from a network of non-profit actors to identify, promote, and defend the Commons. We decided to start a common pool for Information Technologies with the aim to create, maintain, and share with the public geo-localized data that belong to our constituents and to articulate citizen movements around a free, public and common platform to map and act together for the Commons. IN COMMON forms a cooperative data library that provides collective maintenance to ensure data is always accurate.

>> Read more about IN COMMON

In-document search — Interoperable Rich Text Changes for Search

There is a relatively unexplored layer of metadata inside the document formats we use, such as Office documents. This metadata makes it possible to answer queries like: show me all the reports with edits made within a timespan, by a certain user or by a group of users. Or: show me all the hyperlinks inside documents pointing to a web resource that is about to be moved. Or: list all presentations that contain this copyrighted image. Such embedded information could be exposed to and used by search engines far better than is now the case. The project expands the ODF toolkit library to dissect file formats, and will potentially have the very useful side effect of maturing the understanding of document metadata at large, and for collaborative editing of documents in particular.

>> Read more about In-document search

Practical Tools to Build the Context Web — Declarative setup of P2P collaboration

In a nutshell, the Perspectives project makes collaboration behaviour reusable, and workflows searchable. It provides the conceptual building blocks for co-operation, laying the groundwork for a federated, fully distributed infrastructure that supports endless varieties of co-operation and reuse. The declarative Perspectives Language allows a model to be translated instantly into an application in which multiple users contribute to a shared process, each with her own unique perspective.

The project will extend the existing Alpha version of the reference implementation into a solid Beta, with useful models/apps, aspiring to community adoption to further the growth of applications for citizen end users. Furthermore, necessary services such as a model repository will be provided. This will bring Perspectives out of the lab, and into the field. For users, it will provide support for the modelling language in well-known IDEs, including syntax colouring, go-to definition and autocomplete.

Real life is an endless affair of interlocking activities. Likewise, Perspectives models of services can overlap and build on common concepts, thus forming a federated conceptual space that allows users to move from one service to another as the need arises in a most natural way. Such an infrastructure functions as a map, promoting discovery and decreasing dependency on explicit search. However, rather than being an on-line information source to be searched, such as the traditional Yellow Pages, Perspectives models allow their users (individuals and organisations alike) to interact and deal with each other on-line. Supply-demand matching in specific domains (e.g. local transport) integrates readily with such an infrastructure. Other patterns of integrating search with co-operation support form a promising area for further research.

>> Read more about Practical Tools to Build the Context Web

Indigenous — Indieweb mobile clients

Indigenous is a collection of native, web and desktop applications which allows you to engage with the Internet as you do on social media sites, but posts it all on your own website. Use the built-in reader to read and respond to posts across the internet. Indigenous doesn't track or store any of your information; instead you choose a service you trust or host it yourself. Posts are collected on your website or on a service which supports W3C Microsub, and writing posts uses the W3C Micropub specification. Popular services that support both are WordPress, Micro.blog and Drupal, with more coming soon.

>> Read more about Indigenous

Inochi2D — Open source 2D animation/puppeteering framework

Inochi2D is an open source, BSD 2-clause licensed toolkit and ecosystem for real-time 2D puppet animation, for use in game development, virtual avatars and other multimedia applications. Our ecosystem features an SDK and two tools. Inochi Creator allows the user to create a puppet by rigging layered 2D art via warping meshes, physics, dynamic masking and real-time lighting, in order to create the illusion of depth and liveliness. Inochi Session allows the use of Inochi2D puppets for livestreaming, teleconferencing and more, by mapping external tracking data to a puppet's rigging. Together, the SDK and tools allow anyone to express themselves without restrictive licensing terms.

With this grant our goal is to improve the user experience and portability of our tooling via the creation of a new UI toolkit which is purpose-built just for Inochi2D, called libsoba. We also plan to finish and release a major update to Inochi2D, version 0.9, which aims to make Inochi2D more future proof and portable, making it viable to use in game engines such as Godot and Unity, and on the web via WebASM, WebGL and WebGPU.

>> Read more about Inochi2D

Inventaire — Wikidata-based social sharing of reading experiences

The Inventaire Project is an effort to move forward on the front of accessing information on resources using libre software powered by open knowledge. This ideal is being materialized in the form of inventaire.io, a libre book sharing webapp, inviting everyone to make the inventory of their physical books, declare what they want to do with it (giving, sharing, selling), as well as who should be able to see it (shared publicly through e.g. ActivityPub, or only visible by your friends and groups).

To power those inventories with structured bibliographic data, inventaire.io is also playing the role of a Wikidata-federated open and contributive bibliographic database, extending wikidata.org data with Wikidata-compatible entities (CC0, shared data schema) tailored to our needs, but ready to be pushed to Wikidata when the data contributor deems it appropriate. This linked open data architecture allows users to build their inventories on a huge open knowledge graph, that we believe will, in time, offer exceptional discovery capabilities. This project addresses many features, such as improved privacy settings, accessibility, creating publisher collections and data federation.

>> Read more about Inventaire

Inventaire recommender — Book recommendations in Inventaire

To power those inventories with structured bibliographic data, inventaire.io is also playing the role of a Wikidata-federated open and contributive bibliographic database, extending wikidata.org data with Wikidata-compatible entities (CC0, shared data schema) tailored to our needs, but ready to be pushed to Wikidata when the data contributor deems it appropriate. This linked open data architecture allows users to build their inventories on a huge open knowledge graph, that we believe will, in time, offer exceptional discovery capabilities. Now that this first base of inventories and contributive bibliographic data has reached a certain level of maturity, we want to start moving forward on the next challenges: introduce curation and recommendation mechanisms, improve search tools, offer finer privacy settings, and move forward on decentralization.

>> Read more about Inventaire recommender

Karrot — Save and share food waste

Karrot started as a free and open-source tool to support grassroots initiatives that save and share food waste, but it has been gradually re-designed to become a more general purpose tool to support various groups of people in their face-to-face activities on a local, autonomous, solidarity-driven and voluntary basis. Some of its defining features are the self-assignment of tasks, full transparency of members' actions and the absence of admin roles, with a trust-based system used instead. In order to better support the diverse ways in which people self-organize and practice commoning, this project will further develop features focused on the needs of end users through a participatory design process. We will work with the themes of collective agreements, role assignment and going beyond group boundaries for organising, which includes exploring options for federating. In the same way that we envision the software being used, we will continue to work for the governance and organisation of the Karrot project itself to be community-driven, transparent and democratic.

>> Read more about Karrot

LO/CODE Book project — Professional typography inside LibreOffice

The project enhances readability of text documents by adding highly customizable paragraph-level line breaking and microtypography to the LibreOffice/Collabora Online Writer word processors. It creates a new type of software, with the print quality of proprietary DTP programs and with productivity of word processors. It saves paper and screen area with a compact paragraph layout and readable multi-column pagination. It should result in proposals to enhance the OpenDocument format standard (ISO/IEC 26300) which will be submitted for standardization, encouraging future standards to support enhanced readability, especially for people with reading difficulties.

>> Read more about LO/CODE Book project

Lemmy — ActivityPub for link aggregation

Lemmy is an open-source, easily self-hostable link aggregator that you can use to share and discover interesting new ideas - and discuss them with the world. It is designed to work in the Fediverse, and to communicate natively with other ActivityPub services, such as Mastodon, Funkwhale and Peertube.

Lemmy aims to create a decentralized alternative to widely used proprietary services like Reddit. For a link aggregator, this means a user registered on one server can subscribe to communities on any other server, and have discussions with users registered elsewhere. The front page of popular link aggregators is where many people get their daily news, so Lemmy has the potential to help alter the social media landscape.

>> Read more about Lemmy

Lemmy private communities — Add private communities to Lemmy federated link aggregator

Lemmy is an open-source, easily self-hostable link aggregator that you can use to share and discover interesting new ideas - and discuss them with the world. Lemmy is a good decentralized alternative to widely used proprietary services like Reddit. It is designed to work in the Fediverse by virtue of its implementation of the W3C ActivityPub standard, and to communicate natively with other ActivityPub services such as Mastodon, Funkwhale and Peertube. Users registered on one server of any of these services should be able to effortlessly subscribe to communities on any other server, where they can have discussions with users registered elsewhere.

In this project, the team will deliver many noteworthy upgrades ranging from a more stable API, to group federation, two-factor authentication and improved moderation. In addition the project will work on the new native client Jerboa (for the Android OS). Also for the nostalgically inclined, the project is working on a new frontend inspired by traditional web forums like phpBB.

>> Read more about Lemmy private communities

librarian — Custom meta-search

Search engines are the default way of finding information on the internet. Although there is a host of search engines for users to choose from - ranging from library catalogs to cooking portals - there is currently only a small number of dominant search engines that practically decide who finds what on the internet. This situation has the following disadvantages: 1) by designing their algorithms, these dominant search engines influence our world view, 2) the huge amount of user data they record creates severe risks of data leaks and misuse, and finally 3) search engines can misuse their market power to gain advantages in other lines of business (e.g. the mobile phone market).

Federated web search is a technology where users connect to a so-called broker which forwards their search request to suitable search engines and combines the results. Using federated search lessens the risks posed by a few dominant search engines: it shows a blend of search results created by different algorithms, it prevents search engines from recording data of individual users, and its search results are usually more diverse. Still, for federated web search to become widely used, it has to overcome the following challenges: 1) while exploiting user behavior is known to improve search effectiveness, brokers exploiting this data also risk leaks and misuse, 2) as brokers typically serve many users, they are not able to include search engines for personal content, such as email, social media or cloud storage, because the public broker cannot know the user’s credentials to access these services, and finally 3) brokers consider the same base set of search engines for every user, while considering a more focused set of engines could improve search results, given the diversity of users.

To address these challenges while avoiding the disadvantages of dominant search engines, this project will investigate a radical change to the federated search architecture: users run a broker on their own computer using a browser plugin. In this architecture the broker can safely analyze the user's behavior to improve search results, as the data is accumulated on a per-user basis on disconnected computers. Furthermore, the search requests forwarded to search engines use the user's credentials and thus can access search engines for personal data, such as email. Finally, starting from sensible defaults, each user can configure their broker according to their individual needs.
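
The broker pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: each search engine is modelled as a hypothetical callable that returns a ranked list of URLs, and the merge strategy (round-robin interleaving with de-duplication) is one simple choice among many.

```python
from itertools import zip_longest


def broker_search(query, engines):
    """Forward a query to every engine and interleave the ranked results.

    `engines` is a list of callables taking a query string and returning a
    ranked list of result URLs (a hypothetical interface for illustration).
    """
    result_lists = [engine(query) for engine in engines]
    merged, seen = [], set()
    # Take the first hit of every engine, then the second of every engine, etc.
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Hypothetical stand-ins for real engines, e.g. a web index and a personal mailbox.
def web_engine(query):
    return ["https://example.org/a", "https://example.org/shared", "https://example.org/b"]


def mail_engine(query):
    return ["https://example.org/shared", "mail://inbox/42"]


if __name__ == "__main__":
    for url in broker_search("train tickets", [web_engine, mail_engine]):
        print(url)
```

Because the broker runs locally in this architecture, the list of engines can include ones that need the user's own credentials, and any behavioural data used for ranking never has to leave the machine.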

>> Read more about librarian

Libre Car Control — Automotive development platform, protocol analyzer and hacking multi-tool

The Engine Control Unit (ECU) is a microprocessor-based system that receives input from various sensors, analyzes the data, and controls various driving functions based on the input. LibreCar is a small and affordable device which can emulate an actual ECU, the electronic control module that manages the control of a vehicle. Acting as an all-in-one device for building, testing, monitoring, and experimenting with automotive ECUs, LibreCar is built around a unique FPGA-based architecture, so its digital hardware can be fully customized to suit the application at hand. As a result, it can act as a no-compromise automotive protocol analyzer, an automotive-hacking multi-tool, or an automotive development platform. It is a fully reconfigurable test instrument that provides all the hardware, gateware, firmware, and software you will need to work with, and indeed to master, the automotive domain: for example, rapid prototyping of compliant and non-compliant automotive devices, and analysis of automotive protocols such as Diagnostics, XCP and DLT for security research.

>> Read more about Libre Car Control

MaDada — Using LinkedData to improve FOI processes

MaDada is a free open source platform that simplifies and opens up the process of access by the general public to data and information held by the French government. Making use of the Freedom Of Information (FOI) law, the platform guides citizens in filing requests, but also acts as an open data archive and a platform for right-to-know or transparency campaigns, by publishing the whole process: the request history, the resulting correspondence, and the data obtained through it. Launched in October 2019 by Open Knowledge Foundation France members, MaDada has helped 250+ users make over 1200 FOI requests to French public bodies, and is beginning to play an important role in addressing right-to-know, transparency and open government issues.

MaDada is based on the open source software Alaveteli (https://alaveteli.org), which has been adapted and deployed to more than 25 countries in 20 different languages and jurisdictions. Alaveteli offers efficient functions for users to request and manage FOI requests. The NLnet funding will help the project develop and improve discovery and search features of public bodies on madada.fr and Alaveteli software - for instance, in France alone there are more than 60,000 public authorities. This will take advantage of existing digital commons such as Wikidata, and open standards such as schema.org and DCAT.

>> Read more about MaDada

Mangaki — Advanced group recommendations

Within a set of search results, what should you do to find the optimal solution for not just a single user but a group? Mangaki is building an open source library for privacy-preserving group recommendations of items. While many content providers suggest recommendations at a personal level, these are often directed to a single user, or are restricted to a generic “family” category. Whenever, say, a group of friends wants to watch a movie, it is often hard to decide what to watch, because people can have really different tastes.

Recommendations are also very privacy-sensitive. A straightforward way might be to share our complete viewing history, but that can certainly lead to embarrassing and awkward situations. So how can we collectively compute a list of relevant items without disclosing all of our data unencrypted? The Mangaki project is making an open source library for group recommendations that works in a scalable and distributed way.
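
To make the group aspect concrete, here is a minimal Python sketch of two common ways to aggregate individual predicted ratings into a group ranking: averaging, and "least misery" (score each item by its unhappiest member). It works on plaintext ratings and therefore does not show the privacy-preserving part that the project targets; the function names and sample data are hypothetical.

```python
def group_scores(ratings, strategy="average"):
    """Combine per-member scores into one score per item.

    `ratings` maps member name -> {item: predicted score}. Only items scored
    by every member are considered.
    """
    if not ratings:
        return {}
    per_member = list(ratings.values())
    common_items = set(per_member[0]).intersection(*per_member[1:])
    scores = {}
    for item in common_items:
        values = [member[item] for member in per_member]
        if strategy == "average":
            scores[item] = sum(values) / len(values)
        elif strategy == "least_misery":
            scores[item] = min(values)
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return scores


def recommend(ratings, strategy="average", top_n=3):
    """Return the top_n items, best first; ties are broken alphabetically."""
    scores = group_scores(ratings, strategy)
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


if __name__ == "__main__":
    ratings = {
        "ana": {"A": 5, "B": 3, "C": 4},
        "ben": {"A": 1, "B": 3, "C": 4},
    }
    print(recommend(ratings, "average"))       # A and B tie on average
    print(recommend(ratings, "least_misery"))  # A drops because ben dislikes it
```

The two strategies can disagree: an item one member loves and another dislikes may do well on average but poorly under least misery, which is often the more comfortable choice for a small group.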

>> Read more about Mangaki

Manyfold — ActivityPub-powered tool for storing and sharing 3d models

Manyfold is a web application for managing collections of 3d models, with a focus on the needs of the 3d printing community. It is designed to be self-hosted, and lets users browse, organise, and analyse their downloaded models. With NLnet’s support, the project has recently launched federation features using ActivityPub, progressive transmission of 3d models, and a wide range of core feature enhancements. The next phase of the project will build on this base to create richer social features, better ways to get models into and out of the system, features to help financially support creators, and improvements to search and discovery features, all of which will help build an open, decentralised ecosystem for 3d model hosting.

>> Read more about Manyfold

Marginalia Search — A fresh take on search

Marginalia Search is an experimental Internet search engine for the independent web designed and optimized to run on cheap consumer hardware. The overarching goal of the development effort is to bring the project into a more mature state; to improve search quality and range, reduce the amount of manual operations, and to produce and offer portable data in order to bolster adjacent efforts in the search and discovery space.

>> Read more about Marginalia Search

Mautic Portability — Portable marketing campaigns for Mautic

Mautic is an open source marketing automation platform. It helps organizations to better understand their customers throughout their lifecycle, and, combined with what they already know about the customer and how they interact with marketing campaigns, enables full personalization of the digital experience across multiple channels. This project lays the foundation for an important feature which is much-requested and much-needed, to create a library of example campaign workflows and associated resources which marketers can install with a single click, saving time and improving best practice adoption. This project sees the establishment of an export and import functionality for campaigns and all associated resources. This project will also enable the export and import of this data between Mautic instances, further improving data portability.

>> Read more about Mautic Portability

MeiliSearch — Modern and responsive search

Advanced content search for apps and websites has become an increasingly protected craft. When owners of big content repositories need search at scale, they have to choose between hiring expensive search specialists or outsourcing search in its entirety. Search doesn’t need to be this complicated. It should be simple enough to be self-hosted with the developers you already have, and it should be understandable & open enough that you can resort to a managed cloud without fear of lock-in.

MeiliSearch is blazing fast and very light on resources. It packs advanced search capabilities like search-as-you-type, relevancy, typo-tolerance, synonyms and filters, all set up and configured in minutes. Our primary path to widespread adoption is integration with other developer ecosystems. Every new language, framework, platform or application that’s supported brings in a new audience of developers that wouldn’t otherwise know we even exist.

>> Read more about MeiliSearch

Mepo — Lightweight mobile map search

Mepo is a fast, simple, and hackable OSM map viewer for desktop Linux and mobile Linux devices (like the PinePhone, Librem 5, and postmarketOS devices) and the various user interfaces of both environments (Wayland and X included). Mepo works both offline and online, features a minimalist interface compatible with touch, mouse and keyboard, and offers a UNIX-philosophy inspired underlying design, exposing a powerful command language called mepolang that can be scripted to provide and customize functionality such as bounding-box search scripts, bookmarks, routing, and more.

>> Read more about Mepo

Modular Meta-Press.es — Reusable decentralised meta-search engine

Meta-Press.es is a search engine dedicated to online press. It runs on your own computer as a browser WebExtension and gives you back control of your information sources by letting you choose (and pinpoint) the newspapers to search in. Sources can be contributed by users, covering any domain where it is the chronological order that matters: press (TV, radios…), scientific press, online agendas…

Using Meta-Press.es is free, avoids ads and does not trigger the tracking mechanisms of online newspapers when discovering the results. With the new developments within this project, Meta-Press.es will break out of web browsers to become available server-side and for mobile users. Also, contributing your favorite sources will finally be possible "all by mouse" and without specific computer science knowledge (the traditional method via CSS selectors remains available).

>> Read more about Modular Meta-Press.es

Meta-Press.es — A press search engine in your browser

Meta-Press.es is a press search engine, in the shape of a browser add-on. When using it, everything happens between the user's computer and the queried newspapers. Using Meta-Press.es, there is no data sent to third party (including our servers). We're not asking the users to believe that we respect their privacy, it's a matter of verifiable fact that we do. That means there is no single point of failure, of surveillance or of censorship.

>> Read more about Meta-Press.es

Meta-Press.es — Retrieve news feeds and search locally

Meta-Press.es is an add-on (in the standard WebExtension format) which gives super powers to your web browser. Meta-Press.es equips your browser with the capacity to query hundreds of online press sources in a few seconds and get you the relevant results. It is a drop-in replacement for centralised services like Google News, and in addition helps you to create press reviews (via selection and export of results from automated searches).

Using Meta-Press.es, it's your web browser that does the work, without any middleman between information sources and you. Your privacy is respected even against the ad or social trackers of the newspapers (as those mechanisms aren't triggered by Meta-Press.es searches). Unlike its news portal competitors, Meta-Press.es transparently shows what was queried and what was not - and you can choose your own information sources (via source selection filters and even source selection pick-up). Everything happens directly on the user device and under control of the user, avoiding single points of censorship and in support of Freedom of the Press and media diversity.

>> Read more about Meta-Press.es

Mobilizon — Find, create and organize events

Mobilizon is a free, libre and federated groups and events management platform. Most proprietary social media platforms collect behavioral data and social graphs by hosting groups and events management tools (such as Facebook events, MeetUp, etc.). This can become a problem, even more so when your group works on topics like activism, raising awareness and empowering citizens. Mobilizon allows for a federation of interconnected hosts, which avoids data concentration by design while permitting interactions between users across the federation. This group and event management tool has been designed by asking about and considering the needs of mobilized citizens. It includes features that have since been implemented by mainstream social media as well (multiple profiles for each account), and does not reproduce mechanisms driven by the attention economy. As such, Mobilizon is not a social media platform: it does not pander to egos, but focuses on being a toolkit to manage communities. On top of the event publishing tool, it features a group discussion tool (akin to a minimalist forum), a group page management tool (that can be used as a one-page website), a group public and private posts tool (similar to a blog), and a group link directory (to organize links to online documents, resources, etc.). With this grant, Framasoft aims to improve Mobilizon's search results (within an instance as well as throughout the federation) and recommendations. We also want to help people find groups and events close to their interests or their location, as well as allow them to import their events from other platforms when possible (Facebook, MeetUp, etc.).

>> Read more about Mobilizon

MoboSearch — Providing an alternative view on the Android App ecosystem

Mobile phones play a major role in our society, yet they still suffer from severe limitations in how they handle apps. As a result, most people are unaware of the dangers of privacy leaks and are typically offered very constrained search capabilities within one single source of information, the app store. MoboSearch is a new search engine and information portal for apps, empowering users beyond the existing app stores. The system exposes privacy and security information, like app permissions, and gives users new, easy and flexible search capabilities that allow them to make an informed choice and that increase people's awareness. Openness and interoperability ensure that the system can offer and receive data, so as to cooperatively enable a better and healthier app ecosystem.

>> Read more about MoboSearch

OSF Crawler Cooperation — Support Infrastructure for Open Search initiatives

The Open Search Foundation (OSF) is working to build a European mainstream search engine alternative, in line with European regulations and values such as privacy and fair participation. Our project builds on the foundations of that future OSF search engine, in an attempt to combine the existing crawling efforts of OSF participants. This is implemented at real internet scale: petabytes of data, billions of webpages, a hundred million websites, with terabytes of communication between the components per day. The scale and regulations call for a concept which has not been implemented before. Existing web-search related projects are invited to contribute their ideas to our larger concept, which could become not just an alternative to Google Search but could also have many other uses - even in early stages.

>> Read more about OSF Crawler Cooperation

OpenCarLink — Security tooling for vehicle OBD2 ports

OpenCarLink is an initiative aimed at revolutionizing vehicle diagnostics and security through the development of an open hardware device for vehicle OBD2 ports. By supporting communication protocols such as DoIP, CAN, K-Line, and Single-Wire CAN, OpenCarLink enables users to perform remote diagnostics, real-time emissions tracking, enhanced vehicle security through penetration testing, and increased driver safety via behavioral data tracking. This project promotes an open and innovative future for the European mobility sector by helping to circumvent manufacturer limitations. By releasing the hardware design under an open-source license, OpenCarLink fosters an environment where enthusiasts, researchers, and professionals can contribute to and benefit from the advancements in vehicle diagnostics and control. With a focus on democratizing access to the DoIP protocol, OpenCarLink challenges the restrictive policies and secrecy that currently dominate the automotive industry, helping to pave the way for a more open and informed society.

>> Read more about OpenCarLink

Personal Food Facts — Privacy protecting personalized information about food

Open Food Facts is a collaborative database containing data on 1 million food products from around the world, in open data. This project will allow users of our website, mobile app and our 100+ mobile apps ecosystem, to get personalized search results (food products that match their personal preferences and diet restrictions based on ingredients, allergens, nutritional quality, vegan and vegetarian products, kosher and halal foods etc.) without sacrificing their privacy and having to send those preferences to us.
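
The privacy property described above, matching products against personal preferences without sending those preferences to a server, can be sketched as a purely local filter. In this minimal Python illustration the product fields (`name`, `vegan`, `allergens`) and the preference keys are hypothetical and do not reflect the actual Open Food Facts data schema or API.

```python
def matches(product, prefs):
    """Return True if a product satisfies the user's locally stored preferences."""
    if prefs.get("vegan") and not product.get("vegan", False):
        return False
    unwanted = set(prefs.get("avoid_allergens", []))
    if unwanted & set(product.get("allergens", [])):
        return False
    return True


def personalize(products, prefs):
    """Filter generic (non-personalized) search results on the user's device."""
    return [product["name"] for product in products if matches(product, prefs)]


if __name__ == "__main__":
    # Generic results as they might arrive from a server; the server never sees `prefs`.
    products = [
        {"name": "Oat drink", "vegan": True, "allergens": ["gluten"]},
        {"name": "Peanut butter", "vegan": True, "allergens": ["peanuts"]},
        {"name": "Milk chocolate", "vegan": False, "allergens": ["milk"]},
    ]
    prefs = {"vegan": True, "avoid_allergens": ["peanuts"]}
    print(personalize(products, prefs))
```

The design choice is that the server returns the same generic data to everyone, and all personal filtering or ranking happens on the client, so diet restrictions and allergies never leave the device.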

>> Read more about Personal Food Facts

Open Hospitality Network — Federated hospitality with ActivityPub

Hospitality is part of human tradition, practiced long before any software infrastructure existed. People share with others their homes, and exchange life’s stories and adventures - often without even mention of money. The internet age allowed hosts and travelers from all around the world to find each other more easily, and spontaneous communities emerged online. Nowadays, many hospitality exchange platforms exist which help travelers and hosts find each other.

Open Hospitality Network wants to unify hospitality exchange communities into one federated system conveniently serving travelers and hosts. We envision a variety of platforms to exist, united in diversity, where each of them is built around their own unique culture, yet they all communicate with each other in federation. We'd like them together to create a resilient ecosystem outlasting any particular founders and exchange platforms. Following a collaborative process, we are building software from the community for the community, software that on the one hand helps connect existing communities and on the other enables new federated communities to spring up and flourish.

>> Read more about Open Hospitality Network

Openki.net — Make local events and meetups discoverable

How do you discover what you can learn from the people around you? How do you search what other people in the same region have to offer, like a training course or a debating event?

Openki is an interface between technology and culture. It provides an interactive web platform developed with the goal to remove barriers for universal education for all. The platform makes it simple to organise and manage "peer-to-peer" courses. The platform can be self-hosted, and integrates with OpenStreetMap. At the moment Openki is focused on facilitating learning groups and workshops. The project will improve the tool so it can be used not only to organise courses (with the collaboration of many different actors, in a more participatory way) but much more broadly, for bottom-up project initiation, for grassroots organizations and for facilitating societal dialogue.

>> Read more about Openki.net

OpenStreetMap-NG — Alternative implementation of OpenStreetMap

OpenStreetMap-NG is an innovative rethinking of how open mapping platforms can be built and maintained, as an alternative to the current openstreetmap.org setup. Leveraging Python and other widely used technologies and guided by user-centric design principles, this project creates a more accessible, privacy-respecting, and developer-friendly mapping platform. By prioritizing both solid technical foundations and ease of use, OpenStreetMap-NG wants to make open-source mapping more approachable while pushing the boundaries of what's possible.

>> Read more about OpenStreetMap-NG

Openki Roles — Restructuring role management in libre tool for crowd-sourced education

How do you discover what you can learn from the people around you? How do you search what other people in the same region have to offer, like a training course or a debating event?

Openki is an interface between technology and culture. It provides an interactive web platform developed with the goal of removing barriers to universal education. The platform makes it simple to organise and manage "peer-to-peer" courses. The platform can be self-hosted, and integrates with OpenStreetMap. At the moment Openki is focused on facilitating learning groups and workshops. The project will add course templates, streamline roles when organising courses and redesign parts of the interface in order to improve the overall user experience.

>> Read more about Openki Roles

Organic Maps convergent UI with Qt Quick/Kirigami — Declarative cross-platform UI for navigation

Map navigation software is a crucial part of computer systems today, be it on Mobile, Desktop, Automotive and so on. For quite some time already, we have had a brilliant open-source maps application, now named Organic Maps. Its features make it a strong competitor to commercial-grade software; among them are privacy, fully offline maps, low battery consumption, navigation, points of interest (POI) and much more. Currently, the application shows its strength on mainstream mobile operating systems only. On other systems, its capabilities are quite limited, mainly because of the lack of a proper User Interface for them.

This project aims to create a convergent, touch-friendly Organic Maps User Interface for Linux, backed by the feature-rich Qt Quick/QML application framework, which is well suited to this task. This would allow feature parity between Mobile and Desktop Linux systems, and would also create solid ground for further unification of the User Interface across other platforms.

Organic Maps bookmarks, hike and bike — Improved bookmarks, address search, map styles and driving

Organic Maps is a free, open-source offline map application available for Android and iOS. It provides a privacy-focused alternative to Google and Apple Maps, empowering individuals who value their privacy and freedom from the surveillance ecosystems created by these companies. The app offers downloadable outdoor maps of the entire world, offline multi-point navigation, offline search on the map, saved bookmarks and trails, KML/KMZ/GPX interoperability, elevation contours, track recording, and more. This project focuses on enhancing core functionality: optimizing offline search, expanding bookmark management, and introducing new features for hikers and bikers.

>> Read more about Organic Maps bookmarks, hike and bike

PRESC Classifier Copies Package — Implementing Machine Learning Copies as a Means for Black Box Model Evaluation and Remediation

The ubiquitous use over the Internet, and in particular in search engines, of often proprietary black-box machine learning models and APIs in the form of Machine Learning as a Service makes it very difficult to control and mitigate their potential harmful effects (such as lack of transparency, privacy safeguards, robustness, reusability or fairness). Machine Learning Classifier Copying allows us to build a new model that replicates the decision behaviour of an existing one without needing to know its architecture or to have access to the original training data. A suitable copy makes it possible to audit the already deployed model, mitigate its shortcomings, and even introduce improvements, without the need to build a new model from scratch, which requires access to the original data.

This project aims to build a practical implementation of this innovative technique into PRESC, an existing free software tool for the evaluation of machine learning classifiers, so that classifier copying is automated and copies can be easily created by developers using machine learning, in order to reuse, evaluate, mitigate and improve black-box models, to safeguard personal data privacy in their machine learning models, or for any other application.
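As a rough illustration of classifier copying, the sketch below labels randomly sampled synthetic points by querying an opaque model and fits a simple nearest-neighbour copy on those labels. Everything in it is an assumption made for illustration: `black_box`, the uniform sampling range and the 1-nearest-neighbour copy are hypothetical stand-ins, not the PRESC API or the project's actual method.

```python
import random

def black_box(point):
    """Hypothetical opaque classifier; in practice this could be a remote API."""
    x, y = point
    return 1 if x + y > 1.0 else 0

def sample_points(n, seed=0):
    """Draw synthetic query points; no original training data is needed."""
    rng = random.Random(seed)
    return [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(n)]

def build_copy(query, points):
    """Label synthetic points with the original model and fit a 1-NN copy."""
    labelled = [(p, query(p)) for p in points]

    def copy(point):
        # Predict the label of the closest labelled synthetic point.
        nearest = min(
            labelled,
            key=lambda item: (item[0][0] - point[0]) ** 2 + (item[0][1] - point[1]) ** 2,
        )
        return nearest[1]

    return copy

def agreement(model_a, model_b, points):
    """Fraction of points on which two classifiers give the same label."""
    same = sum(1 for p in points if model_a(p) == model_b(p))
    return same / len(points)

train_points = sample_points(500, seed=0)
copy_model = build_copy(black_box, train_points)
fidelity = agreement(black_box, copy_model, sample_points(200, seed=1))
print(f"agreement on fresh points: {fidelity:.2f}")
```

The copy only ever sees the labels returned by the original model, so it can be inspected, audited or retrained locally. The agreement score on fresh points gives a rough indication of how faithfully the copy reproduces the original decision behaviour.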

>> Read more about PRESC Classifier Copies Package

The PeARS app — Building low-resource Web search applications from cognitive models

It is widely believed that Web search engines require immense resources to operate, making it impossible for individuals to explore alternatives to the dominant information retrieval paradigms. The PeARS project aims at changing this view by providing search tools that can be used by anyone to index and share Web content on specific topics. The focus is specifically on designing algorithms that will run on entry-level hardware, producing compact but semantically rich representations of Web documents. In this project, we will use a cognitively-inspired algorithm to produce queryable representations of Web pages in a highly efficient and transparent manner. The proposed algorithm is a hashing function inspired by the olfactory system of the fruit fly, which has already been used in other computer science applications and is recognised for its simplicity and high efficiency. We will implement and evaluate the algorithm on the task of document retrieval. It will then be integrated into a Web application aimed at supporting the growing practice of 'digital gardening', allowing users to research and categorise Web content related to their interests, without requiring access to centralised search engines.
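The fruit-fly-inspired hashing idea can be sketched roughly as follows: a document vector is expanded through sparse random projections, and only the most active units are kept as a compact binary code. This is a minimal sketch of the general technique, not the PeARS implementation; the function names, the toy vocabulary and the parameter values are assumptions chosen for illustration.

```python
import random

def make_projection(input_dim, hash_dim, samples_per_unit, seed=0):
    """Each output unit sums a small random subset of the input features."""
    rng = random.Random(seed)
    return [rng.sample(range(input_dim), samples_per_unit) for _ in range(hash_dim)]

def fly_hash(vector, projection, top_k):
    """Sparse binary hash: project, then keep only the top_k most active units."""
    mean = sum(vector) / len(vector)
    centered = [x - mean for x in vector]
    activations = [sum(centered[i] for i in indices) for indices in projection]
    ranked = sorted(range(len(activations)), key=lambda j: activations[j], reverse=True)
    # The hash is represented as the set of indices of the winning units.
    return frozenset(ranked[:top_k])

def overlap(hash_a, hash_b):
    """Number of active units two hashes share; a crude similarity score."""
    return len(hash_a & hash_b)

# Hypothetical toy vocabulary; a real system would use a much larger one.
vocabulary = ["garden", "soil", "seed", "plant", "search", "index", "query", "web"]

def bag_of_words(text):
    """Count how often each vocabulary term occurs in the text."""
    words = text.lower().split()
    return [words.count(term) for term in vocabulary]

projection = make_projection(len(vocabulary), hash_dim=32, samples_per_unit=3, seed=42)
doc_hash = fly_hash(bag_of_words("plant a seed in garden soil"), projection, top_k=4)
query_hash = fly_hash(bag_of_words("garden soil"), projection, top_k=4)
print(sorted(doc_hash), overlap(doc_hash, query_hash))
```

Because each hash is just a handful of integer indices, documents can be stored and compared cheaply, which is what makes this family of algorithms attractive on entry-level hardware.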

>> Read more about The PeARS app

Peertube-Desktop — Enjoy and share federated videos

Cuttlefish is a client for PeerTube that will allow for searching and discovering new and interesting videos online with more privacy. PeerTube is a federated video hosting service based on the W3C ActivityPub standard. By using WebTorrent - a version of BitTorrent that runs in the browser - users help serve videos to other users. Cuttlefish is a desktop client for PeerTube, but will work on GNU/Linux-based phones (like the Librem 5 or Pinephone) as well.

We want the experience of watching PeerTube videos and using PeerTube in general to be better, by making a native application that will become the best and most efficient way to hook into the federation of interconnected video hosting services. It will have improved search, and will allow people to continue sharing watched videos with other PeerTube users for longer periods of time, instead of discarding the video when done watching. It will also help bridge PeerTube's gap between the - now separated - BitTorrent and WebTorrent networks by speaking both of those protocols.

>> Read more about Peertube-Desktop

peermaps — Peer to peer cartography

Peermaps is a p2p, offline-friendly way to distribute, view, and embed map data. Instead of fetching data from a centralized tile provider, you fetch data from other peers on the network. Right now we have all of OpenStreetMap processed into a 100GB archive in our p2p spatial database and rendering formats and seeded to hyperdrive and ipfs. This data is hooked up to a proof-of-concept web map viewer.

For this grant, we will build on our proof-of-concept to release a user-oriented map viewer as a web application with search functionality on peermaps.org along with a developer-oriented tool to embed web maps in an iframe. In addition to (p2p) web development, this project will involve research on peer queries for offline and online location-based search, optimizations to the spatial database and p2p layer, and WebGL graphics improvements, in order to produce a usable p2p mapping alternative.

>> Read more about peermaps

PeerTube - Remote Transcoding — Remote Transcoding for distributed video sharing network

PeerTube is a free-libre and federated alternative to centralized video platforms such as YouTube, Twitch or Vimeo. It empowers content creators (institutions, video-makers and live streamers, communities, etc.) to self-host their own collective video platform without being isolated in the wide web. The technical choices behind PeerTube (ActivityPub federation, peer-to-peer broadcasting) lower the technical and financial bar to self and collective hosting: you no longer need Google's server farm and Amazon's money to host your own PeerTube server (an instance) and synchronize it with other servers to share video catalogs!

There is still one technical bottleneck: video transcoding. This step is essential for a smooth video broadcasting experience. Transcoding happens at every video upload or during live-streams, and consumes a lot of CPU power. Instances hosting lots of content creators or live streamers tend to rapidly need to upgrade the CPU power of their server, to avoid a bottleneck that only happens episodically. Allowing transcoding work to happen remotely could solve a number of important logistical problems in a more efficient, resilient, affordable and eco-friendly manner.

>> Read more about PeerTube - Remote Transcoding

A Distributed Software Stack For Co-operation — Facilitating easy ad hoc cooperation

Perspectives aims to be to co-operation what ActivityPub is to social networks. It provides the conceptual building blocks for co-operation, laying the groundwork for a federated, fully distributed infrastructure that supports endless varieties of co-operation. The declarative Perspectives Language allows a model to be translated instantly into an application that supports multiple users contributing to a shared process, each with her own unique perspective. The project builds a reference implementation of the distributed stack that executes these models of co-operation, and makes the information concerned searchable.

PixelDroid — Share and browse photos in the fediverse with a mobile app

PixelDroid is an Android client for Pixelfed, the federated image sharing platform based on W3C ActivityPub. Our goal is to bring the Pixelfed platform to Android and provide a mobile user experience that excites. We aim to provide feature-parity with the Pixelfed web client as well as add additional features - like image and video editing, capturing and uploading directly from the app. During the project we will also make it easy to use multiple accounts, even across different instances. Additionally, we want to contribute to the Pixelfed API with testing and additional documentation.

>> Read more about PixelDroid

Pixelfed — ActivityPub driven decentralised photo sharing platform

Pixelfed is an open source and decentralised photo sharing platform, in the same vein as services like Instagram. The twist is that you can yourself run the service, or pick a reliable party to run it for you. Who better to trust with your privacy and the privacy of the people that follow you? The magic behind this is the ActivityPub protocol - which means you can comment, follow, like and share from other Pixelfed servers around the world as if you were all on the same website. Timelines are in chronological order, and there is no need to track users or sell their data. The project has many features including Discover, Hashtags, Geotagging, Photo Albums, Photo Filters and a few still in development like Ephemeral Stories. The goal of the project is among others to solidify the technical base, add new features and design and build a mobile app that is compatible with Mastodon apps like Fedilab and Tusky.

>> Read more about Pixelfed

Plaudit — Make good science discoverable through endorsements

Plaudit is open source software that collects endorsements of scholarly content from the academic community, and leverages those to aid the discovery and rapid dissemination of scientific knowledge. Endorsements are made available as open data. The NGI Search & Discovery Grant will be used to simplify the re-use of endorsement data by third parties by exposing them through web standards.

>> Read more about Plaudit

Poliscoops — Make political news and online debate accessible

PoliFLW is an interactive online platform that allows journalists and citizens to stay informed, and keep up to date with the growing group of political parties and politicians relevant to them - even those whose opinions they don't directly share. The prize-winning political crowdsourcing platform makes finding hyperlocal, national and European political news relevant to the individual far easier. By aggregating the news political parties share on their websites and social media accounts, PoliFLW is a time-saving and citizen-engagement enhancing tool that brings the internet one step closer to being human-centric. In this project the platform will add the news shared by parties in the European Parliament and national parties in all EU member states, showcasing what it can mean for access to information in Europe. There will be a built-in translation function, making it easier to read news across country borders. PoliFLW is a collaborative environment that helps to create more societal dialogue and better informed citizens, breaking down political barriers.

>> Read more about Poliscoops

Pomme d’API — Improvements around the Open Food Facts API

Open Food Facts is an open and collaborative database of 3.5M food products from around the world. This project will improve the Open Food Facts API to make it easier for the 250+ apps and services that use it daily to access and contribute food products data. In particular, it will focus on providing easier means to contribute photos and data, better structured data, OpenAPI specifications, and extensive documentation.

>> Read more about Pomme d’API

pretalx — Open source tooling for events and conferences

When attending events like conferences, visitors are often subjected to privacy-invading proprietary apps by organisers. With printed programmes typically no longer made available, visitors are put on the spot: either they install some unknown app and allow themselves to be tracked, or they don't know which sessions to attend. Pretalx is an open source project for events and conferences. It provides a Call for Proposals interface, tools for review (including fully double-blinded ones), scheduling, speaker communication, and attendee feedback. pretalx has a variety of plugins and can be self-hosted. This gives conference organisers, speakers and attendees complete control over the data they share. This project will completely redo the writable API of pretalx, making it a strong privacy-friendly option for any event being organised.

Pretalx is one of the leading open source tools capable of handling the full organisation of events from Call for Proposals to user feedback, and is used by many large open source events already (MozFest, FOSDEM, Pycon, NSEC, etc).

>> Read more about pretalx

Private Searx — Add private resources to the open source Searx metasearch engine

Searx is a popular meta-search engine letting people query third party services to retrieve results without giving away personal data. However, there are other sources of information stored privately, either on the computers of users themselves or on other machines in the network that are not publicly accessible. To share it with others, one could upload the data to a third party hosting service. However, there are many cases in which it is unacceptable to do so, because of privacy reasons (including GDPR) or in case of sensitive or classified information. This issue can be avoided by storing and indexing data on a local server. By adding offline and private engines to searx, users can search not only on the internet, but on their local network from the same user interface. Data can be conveniently available to anyone without giving it away to untrusted services. The new offline engines would let users search local file systems, open source indexers and databases, all from the UI of searx.

>> Read more about Private Searx

Protomaps — Self-hostable maps based on OpenStreetMap data

Protomaps is a free and open source map of the world, deployed as a single file you can host yourself. It enables interactive, zoomable mapping applications with only static storage and HTTP Range Requests. It uses the OpenStreetMap dataset as a primary source; its configurable toolchain can create maps with specific areas, custom data, and different cartographic styles. It’s used in earth science, journalism and the public sector. Protomaps has no vendor lock-in, permits end-to-end data sovereignty, and can ensure end-user privacy.
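To illustrate how a client can read one small part of a single large file using HTTP Range Requests, the sketch below builds a `Range` header for a byte span and simulates a static server answering it. The helper names and the in-memory archive are hypothetical; the actual internal layout of a Protomaps archive is not shown here.

```python
def range_header(offset, length):
    """Build an HTTP Range header; byte ranges are inclusive on both ends."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    return {"Range": f"bytes={offset}-{offset + length - 1}"}

def serve_range(blob, headers):
    """Simulate how a static file server would answer a single-range request."""
    spec = headers["Range"].split("=", 1)[1]
    first, last = (int(part) for part in spec.split("-"))
    # The last byte position is inclusive, hence the + 1 in the slice.
    return blob[first:last + 1]

# Stand-in for one large map file sitting on static storage.
archive = bytes(range(256)) * 4

headers = range_header(300, 16)
tile = serve_range(archive, headers)
print(headers["Range"], len(tile))
```

The design point is that the server needs no map-specific logic: any storage that honours byte-range requests can serve the file, and the client decides which bytes it needs.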

>> Read more about Protomaps

Re-isearch Schmate — Extending re-Isearch with a flat vector datatype for embeddings

Schmate is the development name for the evolving next iteration of re-Isearch adding vector datatypes for embeddings and applications like retrieval augmented generation (RAG). Schmate (pronounced "SHMAH-teh") is Yiddish for rag (שמאטע).

In contrast to typical vector stores, the proposed re-Isearch+ shall offer a full passage information retrieval system (index and retrieval) using a combination of dense and sparse vectors as well as structure. It is dense passage retrieval (DPR) and a whole lot more. It addresses the stumbling blocks of chunking, tightly integrates ingest, tokenisation, a number of alternative vector stores and similarity algorithms and, above all, uses a novel understanding of document structure (implicit and explicit) to provide better contextual passage retrieval and solve the problem of misaligned context. This builds on the observation that meaning is also communicated through structure, and so needs to be viewed in the context of structure. Since structure, like the words, is meant by the sender (writer) to be received and understood by the reader, our approach is to exploit the original author's organization of content to determine appropriate passages rather than relying solely on the chunks.

>> Read more about Re-isearch Schmate

Re-isearch — Vectorise text with a flexible unit of retrieval

Project re-isearch: a novel multimodal search and retrieval engine using mathematical models and algorithms different from the all-too-common inverted index (popularized by Salton in the 1960s). The design allows it to have no limits on the frequency of words, term length, number of fields or complexity of structured data, and even to support overlap, where fields or structures cross each other's boundaries (common examples are quotes, lines/sentences, biblical verses, annotations). Its model enables a completely flexible unit of retrieval and modes of search.

Initial project outcome: a freely available and completely open-source (and multiplatform) C++ library, bindings for other languages (such as Python) and some reference sample code using the library in some of these languages.

>> Read more about Re-isearch

Great OCR for SANE — Integrate OCR capabilities into open source scanning tools

We have become dependent on search engines, allowing us to locate a document using some specific words across billions of webpages. However, not every document is born digital, and some reach the web only in an indirect way. And users with, for instance, visual disabilities cannot read documents that are 'just' pixels.

The SANE project is a collection of open-source scanner drivers and related software. SANE tools allow the users to convert their documents, photos and any other similar material from a completely unsearchable and non-discoverable analog form into a digital representation, which can be easily shared and distributed.

The SANE-OCR project enables users to close the gap right at the stage when physical documents are converted from their incoming "analog" form to a searchable digital form - using a completely open-source stack. The traditional result of scanning is just the visual image (essentially a photo); with SANE-OCR, the output additionally contains the text recognized using optical character recognition (OCR). This produces documents which are searchable and discoverable.

>> Read more about Great OCR for SANE

SIP RELOAD — REsource LOcation And Discovery, a peer-to-peer (P2P) signaling protocol

SIP is a mature internet technology to establish sessions of any type across the internet. RELOAD stands for REsource LOcation And Discovery and is a peer-to-peer (P2P) signaling protocol standardised in IETF that provides its clients with an abstract storage and messaging service between a set of cooperating peers that form an overlay network. RELOAD defines a security model based on a certificate enrollment service that provides unique identities. NAT traversal is a fundamental service of the protocol.

The goal is to implement a P2P communications network based on IETF standards that allows people to communicate securely without the traditional interposed third parties like SIP service providers.

This is done both by establishing direct encrypted channels between the participants as well as using digital identities based on X509 certificates to identify the participants in a conversation, which will prevent third parties from inserting themselves into the conversation by attempting to impersonate one of the participants.

The outcome would be a working RELOAD implementation, with a functional backend for connecting and discovering peers based on their identity which is backed by an email address that will then also function as a working SIP address.

>> Read more about SIP RELOAD

SWH package manager Data Ingestion — Add Package managers to Software Heritage

Software Heritage's ambition is to collect, preserve, and share all software that is publicly available in source code form. In this project we improve the SWH scanner tool which compares any set of files with the SWH archive. This is very useful for detecting license violations or security issues. The goal of the project is to take the scanner from a research prototype to a widely available and usable tool. This involves work around its packaging, user interface, robustness and performance. We will be re-purposing the advanced graph-comparison algorithm from the Mercurial DVCS to minimize the load on the SWH archive. We will also expand the list of existing source code origins by creating new listers and loaders for the Maven, Go, Packagist, RubyGems, Bower, CPAN and pub.dev/Dart package managers.
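One way to compare local files against an archive without uploading them is to compare content identifiers. The sketch below computes a Software Heritage content identifier (SWHID) for a byte string, which uses the same hash as a Git blob, and checks it against a set of known identifiers. The `known_ids` set and the `split_known` helper are hypothetical stand-ins for a lookup against the archive; how the scanner actually queries the archive is not shown here.

```python
import hashlib

def content_swhid(data: bytes) -> str:
    """Content SWHID: SHA-1 over a Git-style blob header plus the raw bytes."""
    header = b"blob %d\x00" % len(data)
    return "swh:1:cnt:" + hashlib.sha1(header + data).hexdigest()

def split_known(files, known_ids):
    """Partition {name: bytes} into names found in, and missing from, known_ids."""
    found, missing = [], []
    for name, data in sorted(files.items()):
        (found if content_swhid(data) in known_ids else missing).append(name)
    return found, missing

# Hypothetical stand-in for identifiers already present in the archive.
known_ids = {content_swhid(b"hello world\n")}

local_files = {"greeting.txt": b"hello world\n", "notes.txt": b"my own notes\n"}
found, missing = split_known(local_files, known_ids)
print(found, missing)
```

Only identifiers would need to leave the machine in such a scheme, which keeps the comparison cheap and avoids sending file contents anywhere.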

>> Read more about SWH package manager Data Ingestion

Storing Efficiently Our Software Heritage — Faster retrieval within Software Heritage

Software Heritage (https://www.softwareheritage.org) is the single largest collection of software artifacts in existence. But how do you store this in a way that you can find something fast enough, taking into account that these are billions of files with a huge spread in file sizes? "Storing Efficiently Our Software Heritage" will build a web service that provides APIs to efficiently store and retrieve the 10 billion small objects that today comprise the Software Heritage corpus. It will be the first implementation of the innovative object storage design that was created in early 2021. It has the ability to ingest the SWH corpus in bulk: it makes building search indexes an order of magnitude faster, helps with mirroring, etc. The project is the first step towards a more ambitious and general purpose undertaking that will make it possible to store, search and mirror hundreds of billions of small objects.

>> Read more about Storing Efficiently Our Software Heritage

Adera — Relevant scientific research results

The project summary for this project is not yet available. Please come back soon!

>> Read more about Adera

searx — A privacy-respecting, hackable metasearch engine

Searx (/sɜːrks/) is a free metasearch engine, available under the GNU Affero General Public License version 3, with the aim of protecting the privacy of its users. Across all categories, Searx can fetch and combine search results from more than 80 different engines. This includes major commercial search engines like Bing, Google, Qwant, DuckDuckGo and Reddit, as well as site-specific searches such as Wikipedia and Archive.is. Searx is a self hosted web application, meaning that every user can run it for themselves and others - and add or remove any features they want. Meanwhile, numerous publicly accessible instances are hosted by volunteer organizations and individuals alike. The project will consolidate the many suggestions and feature requests from users and operators into the first full-blown release (1.0) for Searx, as well as spend the necessary engineering effort in making the technology ready for even wider deployment.

>> Read more about searx

Dynamic indexing for real time graph database — Provide faster query results through algorithmic preprocessing

Based is an open source real time data platform with a suite of features that help developers build more performant applications faster and with more flexibility. It’s built on a self-developed real time graph database and the WebSocket protocol to ensure performance and scaling.

One of the features is an automatic indexing system that keeps track of frequently performed queries by monitoring a set of (real time) parameters and assigning values to queries, that in turn inform which parts of the graph to index. This index has to work with the Based real time graph database and optimise its performance, which means the index also has to be aware of any changes in schema structure or updates in indexed data. This is achieved through the existing subscription engine in Based. Our hope is that this project can lay the groundwork for more efficient indexing systems for all graph databases.

Software Heritage — Collect, preserve and share the source code of all software ever written

Software Heritage is a non-profit, multi-stakeholder initiative with the stated goal to collect, preserve and share the source code of all software ever written, ensuring that current and future generations may discover its precious embedded knowledge. This ambitious mission requires proactively harvesting from a myriad of source code hosting platforms over the internet, each one having its own protocol, and coping with a variety of version control systems, each one having its own data model. This project will, among other things, help ingest the content of over 250000 open source software projects that use the Mercurial version control system and that will be removed from the Bitbucket code hosting platform in June 2020.

>> Read more about Software Heritage

Sonar: a modular peer-to-peer search engine — Modular peer-to-peer search engine

Sonar is a project to research and build a toolkit for decentralized search. Currently, most open-source search engines are designed to work on centralized infrastructure. This proves to be problematic when working within a decentralized environment. Sonar will try to solve some of these problems by making a search engine share its indexes incrementally over a P2P network. Thereby, Sonar will provide a base layer for the integration of full-text search into peer-to-peer/decentralized applications. Initially, Sonar will focus on integration with a peer-to-peer network (Dat) to expose search indexes securely in a decentralized structure. Sonar will provide a library that allows search indexes to be created, shared and queried. A user interface and content ingestion pipeline will be provided through integration with the peer-to-peer archiving tool Archipel.

>> Read more about Sonar: a modular peer-to-peer search engine

sourcehut — Graph query support for software development platform

SourceHut is a free-software platform providing infrastructure for free-software projects, providing hosted repositories, mailing lists, bug trackers, real-time chat tools, and continuous integration infrastructure, among other services, and facilitating collaboration and project discovery via a federated project index. SourceHut focuses on performance, accessibility, and robustness, and since 2018 has provided a reliable platform supporting the thousands of FOSS projects that depend on its services. The NLnet project will expand the integration between SourceHut services, and between SourceHut and independently operated third-party services, primarily through the development of a comprehensive federation of GraphQL APIs.

>> Read more about sourcehut

Space Tube — Group-to-group instant messaging

Space Tube is a service utilising the Matrix protocol to allow groups to communicate with other groups. A group member adds the Space Tube bot to their shared chat platform (e.g. a Discord server, Slack organisation or Element space), and can then create a channel (or tube) that sends messages to and from another group's chat platform. This allows groups to form relationships as groups that don't rely on individual people within those groups connecting them together. These group relationships can then scale to much larger directly participatory structures.

This project will automate the process of creating tubes so that it can be done in a few seconds by a non-technical user. It will also expand tube functionality by allowing tubes to connect more than two groups at once and providing links to a graphical interface to support more complex group interactions such as agreeing to proposals or sharing resources.

>> Read more about Space Tube

Stract — Explorative search engine

Search has become an intrinsic part of the way we explore the web. Sadly as of late, most of the current search engines fail to live up to this responsibility.

Stract is a fully open source, independent and user-centric search engine for the web. In short, our goal is to do web search right.

The funding from NLnet will be used to improve the performance of our index, improve the performance of our web graph, add a live index for news articles and blog posts, and finally improve our currently insufficient documentation.

>> Read more about Stract

TALER Bullion — Infrastructure for GNU Taler Payments with non-fiat Currencies

Depending on how you design a money system, its properties can be quite different. Regular currencies are typically steered towards (slight) inflation by the public bodies that steward them, by means of a gradual influx of money. This benefits "active money" (investors) which yields economic growth. Of course this also makes prices for consumers continually rise, and savings de-valuate over time in terms of purchasing power. The rate at which this devaluation takes place is a policy instrument, and of course one that should be used wisely. When these systems were first designed, money was backed up by physical assets such as gold and silver which offered more predictable long term purchasing power. Some users still prefer for their savings to be backed up by something of concrete value they own.

GNU Taler is a well-designed system for (online) payments, and it is eminently suitable for safely trading (the ownership of) stored gold, silver and similar systems based on real value. Besides its obvious use case as a payment system for regular currencies, the system can also be used to revitalise gold and silver for storage and payment systems; they still exist today but are decoupled. The purpose of this project is to solve problems with trust relations, such as passing (the ownership of) gold or silver between vault operators, or between gold storage and payment systems, so it can become practically useful money on an international scale, in service of people outside the financial industry.

>> Read more about TALER Bullion

GNU Taler Tryton/GNUHealth integration — GNU Taler module for Tryton ERP/GNU Health

This project will develop a Tryton module that allows users to integrate payments with GNU Taler into their financial workflow, whether from a webshop, a factory or a hospital. Tryton is a popular libre business management system used for e-commerce and enterprise resource planning. There are many modules for financial accounting, sales, inventory and stock, CRM, shipping, subscription management, etc. Existing payment provider integrations within Tryton are limited to specific proprietary payment providers. A Taler-based option would allow organisations to handle Taler-based payments (incoming as well as outgoing).

GNU Health (which is built on Tryton) provides a suite of libre alternatives for hospital management software, health information systems and electronic health records. Integration of privacy-preserving payments with TALER in GNU Health will deliver a much-needed contribution to medical privacy, providing the first digital alternative (next to cash payment) which allows patients to pay for their personal medical treatment and medication directly and with full discretion, keeping the doctor-patient privilege intact.

>> Read more about GNU Taler Tryton/GNUHealth integration

TOS;DR OTA backend — Integrate Terms of Service; Didn't Read with Open Terms Archive

Open Terms Archive is a digital common that produces (since 2020) datasets of the evolution of contractual documents (Terms of Service, Privacy Policy…) over time, enabling analysis and comparison. It aims at shifting the power balance from big tech actors towards researchers, end users and regulators. The “Terms of Service; Didn't Read” (ToS;DR) project enables (since 2011) crowd-reading and rating of these same contractual documents. These documents are obtained from the web with a dedicated engine that stores them in a private database and suffers from lack of maintenance.

The goal of the effort is to replace the historical ToS;DR crawler with the public Open Terms Archive datasets, thus increasing the reliability and auditability of the source data, since the annotations will be based on public datasets produced by replicable instances instead of being based on a one-off database used only by ToS;DR itself. This will also enable establishing a common data format for annotating documents.

>> Read more about TOS;DR OTA backend

GNU Taler wallet app for iOS — Mobile GNU Taler payments for portable Apple devices

GNU Taler (Taxable Anonymous Libre Electronic Reserves) is a privacy-preserving electronic instant payment system that is fully free software. It uses electronic coins stored in wallets on the customer's device. Coins are like cash. Users can use Taler to pay in existing currencies (e.g. EUR, USD, BTC), or use it, for instance, to create new regional currencies. The Taler wallet is currently available as a browser-based WebExtension and as an Android app, but not yet as an iOS app. This project will develop a user-friendly and accessible iOS wallet app for the GNU Taler payment system. With the iOS Taler wallet app, users will be able to make payments with their iPhone, similar to how they would use proprietary payment systems like Apple Pay.

>> Read more about GNU Taler wallet app for iOS

Transparency Toolkit — A decentralized hosted archiving service with search

Transparency Toolkit is building a decentralized hosted archiving service that allows journalists, researchers, and activists to create censorship-resistant searchable document archives from their browser. Users can upload documents in many different file formats, run web crawlers to collect data, and manually contribute research notes from a usable interface. The documents are then OCRed (when needed) and indexed in a searchable database. Transparency Toolkit provides a variety of tools to help analyze and understand the documents with text mining, searching/filtering, and manual collaborative analysis. Once users are ready, they can make some or all of the documents available in a public searchable archive. These archives will be automatically mirrored across multiple instances of the software and the raw data will be stored in a distributed fashion.

>> Read more about Transparency Toolkit

HTML export for Typst — Markup based typesetting for multichannel publishing

Typst is a markup-based typesetting system that is designed to be as powerful as LaTeX while being much easier to learn and use. Currently, Typst outputs documents only as PDF, yet there is strong demand for generating HTML. We want to extend Typst such that it can create high-quality HTML and PDF versions from the same document, which is currently not possible with comparable programs. As a result, Typst could be used in a variety of new scenarios, such as the generation of websites and e-books. Furthermore, this will improve the accessibility of the output documents.

>> Read more about HTML export for Typst

URL Frontier 2.0 — Enterprise features for URLFrontier

URLFrontier provides a crawler-neutral API and service implementation for a crawl frontier, which can power various web crawlers regardless of their implementation language and scale. This API defines the operations that a web crawler typically performs when communicating with a web frontier, e.g. get the next N URLs to crawl, update the information about URLs already processed, change the crawl rate for a particular hostname, get the list of active hosts, get stats, etc. The aim of this project is to turn what is currently a working piece of software (the result of an earlier grant from NGI Zero Discovery) into an enterprise-grade solution. The improvements will mainly concern the service implementation, e.g. monitoring/reporting, clustering/discovery and robustness/resilience. The project will improve the usability of the system by adding configurable logging and metrics reporting, improve the performance of the service for very large volumes of data by adding efficient parallelization across multiple nodes, and improve the overall robustness through more graceful failure modes and more efficient restarts.

>> Read more about URL Frontier 2.0

URL Frontier — Develop an API between web crawler and frontier

Discovering content on the web is possible thanks to web crawlers. Luckily there are many excellent open source solutions for this; however, most of them have their own way of storing and accessing the information about the URLs. The aim of the URL Frontier project is to develop a crawler-neutral API for the operations that a web crawler performs when communicating with a web frontier, e.g. get the next URLs to crawl, update the information about URLs already processed, change the crawl rate for a particular hostname, get the list of active hosts, get statistics, etcetera. It aims to serve a variety of open source web crawlers, such as StormCrawler, Heritrix and Apache Nutch.
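To make the frontier operations listed above concrete, here is a minimal in-memory sketch in Python. It is only an illustration: the class name `ToyFrontier`, its method names and the one-URL-per-host batching policy are assumptions made for this example, and they do not reflect the project's actual gRPC schema or service implementation.

```python
from collections import defaultdict, deque
from urllib.parse import urlsplit


class ToyFrontier:
    """In-memory stand-in for a crawl frontier (hypothetical, not URLFrontier's API)."""

    def __init__(self, default_delay=1.0):
        self.default_delay = default_delay      # seconds between fetches per host
        self.queues = defaultdict(deque)        # host -> URLs waiting to be crawled
        self.delays = {}                        # host -> custom crawl delay
        self.next_allowed = defaultdict(float)  # host -> earliest next fetch time
        self.status = {}                        # url -> "queued" | "in_flight" | "done"

    def put_urls(self, urls):
        """Add newly discovered URLs, skipping ones already known."""
        for url in urls:
            if url not in self.status:
                self.status[url] = "queued"
                self.queues[urlsplit(url).hostname].append(url)

    def get_urls(self, max_urls, now):
        """Return up to max_urls URLs, at most one per host, respecting crawl delays."""
        batch = []
        for host, queue in self.queues.items():
            if len(batch) >= max_urls:
                break
            if queue and now >= self.next_allowed[host]:
                url = queue.popleft()
                self.status[url] = "in_flight"
                self.next_allowed[host] = now + self.delays.get(host, self.default_delay)
                batch.append(url)
        return batch

    def update(self, url, outlinks=()):
        """Mark a URL as processed and enqueue the links found on it."""
        self.status[url] = "done"
        self.put_urls(outlinks)

    def set_delay(self, host, seconds):
        """Change the crawl rate for one hostname."""
        self.delays[host] = seconds

    def active_hosts(self):
        """List hosts that still have URLs waiting."""
        return sorted(host for host, queue in self.queues.items() if queue)

    def stats(self):
        """Count URLs per state."""
        counts = defaultdict(int)
        for state in self.status.values():
            counts[state] += 1
        return dict(counts)


frontier = ToyFrontier(default_delay=1.0)
frontier.put_urls(["https://a.example/1", "https://a.example/2", "https://b.example/1"])
frontier.set_delay("b.example", 5.0)
first = frontier.get_urls(max_urls=10, now=0.0)   # one URL per host
frontier.update("https://a.example/1", outlinks=["https://c.example/"])
second = frontier.get_urls(max_urls=10, now=1.0)  # a.example is allowed again
```

Passing the clock in as `now` keeps the sketch deterministic; a real service would track time itself and persist its queues.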

The outcomes of the project are to design a gRPC schema, then provide a set of client stubs from the schema as well as a robust reference implementation and a validation suite to check that implementations behave as expected. The code and resources will be made available under the Apache License as a sub-project of crawler-commons, a community that focuses on sharing code between crawlers. One of the objectives of URL Frontier is to involve as many actors in the web crawling community as possible and get real users to give continuous feedback on our proposals.

>> Read more about URL Frontier

variation graph (vgteam) — Privacy enhanced search within e.g. genome data sets

Vgteam is pioneering privacy-preserving variation graphs, which make it possible to capture complex models and aggregate data resources with formal guarantees about the privacy of the individual data sources from which they were constructed. Variation graphs relate collections of sequences together as walks through a graph. They are traditionally applied to genomic data, where they support the compression and query of very large collections of genomes.

But there are many types of sensitive data that can be represented in a variation graph form, including geolocation trajectory data: the trajectories of individuals and vehicles through transportation networks. Epidemiologists can, for instance, use a public database of personal movement trajectories to do geophylogenetic modeling of a pandemic like SARS-CoV-2. The idea is that one cannot see individual movements, but rather large-scale flows of people across space that would be essential for understanding the likely places where an outbreak might spread. This is essential information to understand at scientific and political level how to best act in case of a pandemic, now and in the future.

The project will apply formal models of differential privacy to build variation graphs which do not leak information about the individuals whose data was used to construct them. For genomes, the techniques allow us to extend the traditional models to include phenotype and health information, maximizing their utility for biological research and clinical practice without risking the privacy of participants who shared their data to build them. For geolocation trajectory data, people can share data in the knowledge that their social graph is not exposed. The tools themselves are not limited to the above use cases, and open the doors to many other types of applications both online (web browsing histories, social media usage) and offline.
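As a rough illustration of how differential privacy can apply to walks through a graph, the sketch below releases per-edge traversal counts using the textbook Laplace mechanism. This is an assumption chosen for the example, not a description of the project's actual methods; the function names, the clipping parameter `max_edges_per_walk` and the sample trajectories are all hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_edge_counts(walks, public_edges, epsilon, max_edges_per_walk, rng):
    """Release per-edge traversal counts with epsilon-differential privacy.

    Each individual contributes one walk. Only the first `max_edges_per_walk`
    distinct public edges of a walk are counted, so adding or removing one
    individual changes the count vector by at most that many (L1 sensitivity).
    Noise is added to every public edge, including those with a zero count.
    """
    edge_set = set(public_edges)
    counts = {edge: 0 for edge in public_edges}
    for walk in walks:
        seen = []
        for edge in zip(walk, walk[1:]):
            if len(seen) >= max_edges_per_walk:
                break
            if edge in edge_set and edge not in seen:
                seen.append(edge)
        for edge in seen:
            counts[edge] += 1
    scale = max_edges_per_walk / epsilon  # Laplace scale = sensitivity / epsilon
    return {edge: count + laplace_noise(scale, rng) for edge, count in counts.items()}


# Hypothetical trajectories through a small transport network.
walks = [["A", "B", "C"], ["A", "B", "D"], ["A", "C"]]
public_edges = [("A", "B"), ("B", "C"), ("B", "D"), ("A", "C"), ("C", "D")]
release = noisy_edge_counts(walks, public_edges, epsilon=1.0,
                            max_edges_per_walk=2, rng=random.Random(0))
```

A smaller `epsilon` gives stronger privacy and noisier flows. The set of edges has to be public and fixed in advance, otherwise the mere presence of an edge in the output would leak information.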

>> Read more about variation graph (vgteam)

WeasyPrint — Print rendering engine for HTML and CSS

WeasyPrint helps web developers create high quality print documents. It turns simple HTML pages into gorgeous statistical reports, invoices, tickets… From a technical point of view, WeasyPrint is a visual rendering engine for HTML and CSS that can export to PDF, independent of rendering engines like WebKit or Gecko. It aims to support web standards for printing. WeasyPrint is free software made available under a BSD license. The CSS layout engine is written in Python, designed for pagination, and meant to be easy to hack on.

>> Read more about WeasyPrint

Web Annotation — Building blocks for interoperable annotation systems

The idea of web annotation is to support the creation and exchange of annotations on any visited page; thereby enabling people to make, share, and discover corrections, rebuttals, side-notes, or other contextually relevant resources. Using the W3C’s Web Annotation standard, and contributing to the incubating Apache Annotator project, this project works on modules and tools that facilitate a diverse ecosystem of interoperable annotation systems.
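For readers unfamiliar with the W3C Web Annotation Data Model, the following Python sketch builds one annotation as JSON: a comment attached to a quoted passage on a page. The helper name `make_annotation` and the example URLs are made up for illustration, and nothing here is taken from the Apache Annotator code base.

```python
import json


def make_annotation(annotation_id, page_url, quoted_text, comment, prefix="", suffix=""):
    """Build a Web Annotation that attaches a comment to a quoted passage."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": annotation_id,
        "type": "Annotation",
        "body": {"type": "TextualBody", "value": comment, "purpose": "commenting"},
        "target": {
            "source": page_url,
            # A TextQuoteSelector finds the passage by its text and surrounding context.
            "selector": {
                "type": "TextQuoteSelector",
                "exact": quoted_text,
                "prefix": prefix,
                "suffix": suffix,
            },
        },
    }


annotation = make_annotation(
    annotation_id="https://annotations.example/anno/1",  # hypothetical identifier
    page_url="https://news.example/article",             # hypothetical page
    quoted_text="the figure doubled",
    comment="The cited report says it rose by 40%, not 100%.",
    prefix="According to officials, ",
    suffix=" last year.",
)
serialized = json.dumps(annotation, indent=2)
```

Because the annotation is plain JSON that points at its target by URL and quoted text, any system that understands the standard can store, exchange or display it.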

>> Read more about Web Annotation

XWiki — Bring wiki capabilities into the Fediverse

XWiki is a modern and extensible open source wiki platform. Up until now, XWiki has focused on providing the best collaboration experience and features to its users. We're now taking this to the next level by having XWiki be part of the larger federation of collaboration and social software (a.k.a. fediverse), thus allowing users to collaborate externally. XWiki is embracing the W3C ActivityPub specification. Specifically, we're implementing the server part of the specification, to be able both to view activity and content happening in external services inside XWiki itself and to make XWiki's activity and content available from these other services too. A specific but crucial use case is to allow content collaboration between different XWiki servers, sharing content and activity.

>> Read more about XWiki

XR Fragments — Discover, reference, navigate and query 3D online content

After the hype of early (and proprietary) virtual reality technologies like Second Life cooled down, there has recently been a renewed push towards the “3D” web, which uses virtual reality technologies (also marketed under new brand names like "Metaverse"). While many technological building blocks are now available, seamlessly surfing the 3D web still seems quite far away for a simple reason: browsers exit fullscreen/WebXR mode when switching web addresses, essentially removing the immersive experience when navigating. While such a limitation comes from obvious security considerations, it also pushes VR/AR headset owners into walled gardens for a more pleasant experience.

XR Fragments is developing a simple public protocol for networked 3D webrings to discover, reference, navigate and query 3D online content (read-only). This enables immersive 3D navigation, liberates 3D content from being locked away inside games and walled gardens, and makes it possible to query objects inside 3D asset files without the need for server-side backends.

>> Read more about XR Fragments

YaCy Grid SaaS —

YaCy Grid Search-as-a-Service provides document crawling and indexing functionality for everyone. Users of this new platform will be able to create their custom search portal by defining their own document corpus. Such a service is an advantage as a privacy or branding tool, but also allows scientific research and annotation of semantic content. User-group specific domain knowledge can be organized for custom applications such as fueling artificial intelligence analysis. This should benefit, for example, private persons, journalists, scientists and large groups of people in communities like universities and companies. Instances of the portal should be able to support themselves financially: there is turn-key infrastructure to handle payments for crawling/indexing volumes as a periodic subscription, while search requests are free for everyone. The portal will consist of free software, and users can download the portal software itself together with the acquired search index data, so everyone can start running a portal for themselves whenever they want.

>> Read more about YaCy Grid SaaS

Cpdf Accessibility — Implement PDF/UA in cpdf

The Cpdf accessibility project extends the popular open-source PDF processing tool Cpdf to support PDF/UA (ISO 14289), the standard for accessible PDF. PDF/UA helps those with disabilities who use screen readers and other tools to navigate documents by tagging PDFs with metadata describing the logical structure of the content. Such metadata can also help all users by allowing reliable text re-flow, and better searching within documents. There is very little open-source tooling for accessible PDF at present, so this will represent a significant step forward. The work will involve adding functionality to Cpdf for the inspection and manipulation of existing PDF/UA files, and the creation of new ones from scratch. These tools will be useful to PDF/UA developers as well as to end users.

>> Read more about Cpdf Accessibility

cables.gl — Creative tool for graphics and 3D content

Cables is a tool which allows people to create beautiful, interactive, visual web content without knowing how to type a line of code. Your work is easily exportable at any time, so you can embed it into your website, use it in an immersive VR experience, or integrate it into other kinds of creative output. Cables patches can be published, shared, copied and remixed by the entire community. This allows people to constantly learn new things from each other.

By developing a standalone version, that works outside of the browser, cables will open up even more for contributions from the open source community. It will be, at the same time, a development environment for contributors, and an offline version of the cables editor. As a side effect, using it with native modules on any major platform and operating system will open up a whole new area of how and where to use cables to create content.

>> Read more about cables.gl

dweb-search — Index DHT based distributed webs

dweb-search is a Free and Open Source (FOSS) search engine for directories, documents, videos and music on the Interplanetary Filesystem (IPFS), supporting the creation of a decentralized web where privacy is possible, censorship is difficult, and the internet can remain open to all. This project implements a publicly accessible IPFS thumbnail service and creates a UI specifically to explore music or videos.

>> Read more about dweb-search

elRepo.io - Resilient, distributed content sharing — Resilient, human-centered, distributed content sharing and discovery.

In this project AlterMundi and NetHood collaborate to develop a critical missing part in decentralized and distributed p2p systems: content search. More specifically, this project will implement advanced search for elRepo.io, the self-hosted and distributed culture-sharing platform currently under active development by AlterMundi and partners. Search functionalities will expand on the already proven coupling of the libxapian searching and indexing library and turtle routing. The distributed search functionality will be implemented to be flexible and modular. It will become the meeting point of three complementary threads of ongoing work: Libre technology and tools for building Community Networks (LibreRouter & LibreMesh), fully decentralized, secure and anonymous Friend2Friend software (Retroshare), and a transdisciplinary participatory methodology for local applications in Community Networks (netCommons).

mCaptcha — Privacy-friendly Proof of Work (PoW) based CAPTCHA system

Existing CAPTCHA systems expect visitors to identify objects to prevent spam, which makes the web inaccessible to persons with cognitive, auditory, and visual special needs. They log Internet Protocol (IP) addresses and use tracking technologies, like cookies, to track and profile their users across the internet. IP logging and cookie-based tracking are privacy-invasive, inaccurate, and impossible to use with anonymizing technologies like Tor and VPNs. Censors can abuse the opaque nature of these systems to prevent certain groups from accessing certain types of information. Independent testing for bias is not possible since the documentation doesn't exist for their methods and algorithms.

mCaptcha is an attempt at creating a self-hosted alternative to reCAPTCHA and hCaptcha with a focus on privacy, transparency, user experience, and accessibility. mCaptcha’s Proof of Work (PoW) mechanism uses strong cryptographic principles that guarantee idempotency and transparency. mCaptcha doesn’t log IP addresses and doesn’t require tracking user activity across the internet. Censors can’t use mCaptcha to deny access to information without detection. Also, the PoW mechanism requires minimal user interaction to solve the CAPTCHA, which will significantly improve the accessibility of the web.
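The general idea behind a proof-of-work CAPTCHA can be sketched with a generic hashcash-style puzzle: the visitor's browser searches for a nonce that is costly to find, and the server checks it with a single hash. The text above does not specify mCaptcha's actual algorithm, challenge format or difficulty parameters, so everything in this Python sketch (the function names, the use of SHA-256, the `challenge:nonce` encoding) is an assumption made for illustration.

```python
import hashlib


def _digest_value(challenge, nonce):
    """Hash the challenge together with a nonce and read the digest as an integer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def solve(challenge, difficulty_bits):
    """Client side: find the first nonce whose digest has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while _digest_value(challenge, nonce) >= target:
        nonce += 1
    return nonce


def verify(challenge, nonce, difficulty_bits):
    """Server side: a single hash is enough to check the submitted work."""
    return _digest_value(challenge, nonce) < (1 << (256 - difficulty_bits))


challenge = "example-challenge"  # hypothetical server-issued string
nonce = solve(challenge, difficulty_bits=12)
accepted = verify(challenge, nonce, difficulty_bits=12)
```

Each extra difficulty bit doubles the expected work for the client, while verification stays a single hash. No personal data such as IP addresses or cookies is needed to decide whether the work was done.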

>> Read more about mCaptcha
