This page contains a concise overview of projects funded by NLnet foundation that belong to Verticals + Search (see the thematic index). There is more information available on each of the projects listed on this page - all you need to do is click on the title or the link at the bottom of the section on each project to read more. If a description on this page is a bit technical and terse, don't despair — the dedicated page will have a more user-friendly description that should be intelligible for 'normal' people as well. If you cannot find a specific project you are looking for, please check the alphabetic index or just search for it (or search for a specific keyword).

Extending PeerTube — Adding advanced search capabilities to PeerTube

This project aims to extend PeerTube to support the availability, accessibility, and discoverability of large-scale public media collections on the next generation internet. Although PeerTube is technically capable of supporting the distribution of large public media collections, the platform currently lacks the practical examples and extensive documentation needed to achieve this in a timely and cost-efficient way. This project will function as a proof-of-concept that showcases several compelling improvements to the PeerTube software by [1] developing and demonstrating the means needed to this end by migrating a large corpus of open video content, [2] implementing trustworthy open licensing metadata standards for video publication through the PeerTube platform, and [3] emphasizing the importance of accompanying subtitle files by recommending ways to generate them.

Search and Displace — Find and redact privacy sensitive information

The goal of this project is to establish a workflow and toolchain that address the problem of mass search and displacement of document content where the original documents exist in a range of forms, including a wide variety of digital document formats, both binary and more modern compressed XML forms, and potentially even older documents where the only surviving form is printed or handwritten. The term "displacement" is meant to encompass actions taken on the discovered content that go beyond straight replacement, including content tagging and redaction, as well as more complex contextual and user-refined replacement on an iterative basis. It is assumed that this process will run as a server application, with documents uploaded as needed, either individually or in bulk. The solution would be built in a modular fashion so that future deployments could deploy and/or modify only the parts needed. In practical terms this involves the creation of an open source toolchain that facilitates searching for private and confidential content inside documents, for instance attachments to email messages or documents that are to be published on a website. The tool can subsequently be used for the secure and automated redaction of sensitive documents. Building it as a modular solution enables it to be used "standalone" with a simple GUI, via the command line, or embedded within third-party systems such as document management systems, content management systems and machine learning systems. In addition, a modular approach will facilitate the use of the solution both with different languages (natural and programming) and with different specialities, e.g. government archives, winning tenders, legal contracts, court documents, etc.
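To make the distinction between tagging and redaction concrete, here is a minimal sketch in Python of what "displacement" could look like on plain text. It is not the project's implementation: the function name `displace`, the two patterns, and the output markers are all hypothetical and chosen only for illustration. A real toolchain would first have to extract text from the document formats mentioned above.

```python
import re

# Hypothetical patterns, chosen only for illustration; they are far from
# exhaustive and are not taken from the project itself.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d \-]{7,}\d"),
}


def displace(text, mode="redact"):
    """Find sensitive spans and either tag or redact them.

    Returns the rewritten text plus a list of (label, original) findings,
    which a reviewer could use to refine the result iteratively.
    """
    findings = []
    for label, pattern in PATTERNS.items():
        def handle(match, label=label):
            findings.append((label, match.group(0)))
            if mode == "tag":
                # Tagging keeps the content but marks it for later review.
                return f"<{label}>{match.group(0)}</{label}>"
            # Redaction removes the content and leaves a placeholder.
            return f"[{label} REDACTED]"
        text = pattern.sub(handle, text)
    return text, findings


sample = "Mail jane@example.org or call +31 20 123 4567."
redacted, found = displace(sample)
tagged, _ = displace(sample, mode="tag")
print(redacted)
print(tagged)
```

Returning the findings alongside the rewritten text keeps the search step separate from the action taken on each match, which mirrors the modular approach described above.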

