Final report Open Sesame project
storage and querying middleware for the Semantic Web
1. Background of Sesame
Sesame is a Java software toolkit and server architecture for storing, querying and inferencing over RDF and RDF Schema, the description models that drive the vision of the Semantic Web.
Automated processing of RDF-based metadata on the web requires software that supports expressive querying and scalable storage, and that can reason with the information provided in the metadata. Sesame's purpose is to provide this functionality in a variety of deployment scenarios, both as a Java software library and as an easy-to-deploy web-enabled database server.
2. Goals of the project
As defined in the Open Sesame project proposal, the overall goal of the project was the successful distribution and development of Sesame as Open Source software.
To this end, the project was defined in terms of project tasks. In this report, we will briefly address how each task has been completed within this project.
3. Tasks
- [T1 + T2] Code base preparation and infrastructure setup
-
The code base was cleaned up, and proper code commentary and Javadoc documentation were added. The complete Sesame code base has been made publicly available through a public CVS server on the SourceForge.net site. Mailing lists for feedback and for developer discussions were set up.
- [T3] Creation of an installation procedure
-
Detailed installation/configuration documentation was created. Also, a graphical user interface for easy remote maintenance of Sesame servers was developed.
- [T4] Writing tutorial documentation on RQL and Sesame
-
Tutorials and user manuals were created for Sesame, for RQL and for its successor query language, SeRQL.
- [T5] Development of an API for communication with repositories
-
A client library for HTTP access to Sesame servers was developed. More recently, a revised API for repository-based access was developed that will allow more flexible deployment of Sesame in a variety of environments.
- [T6] Implementation of a generic SQL Repository Abstraction Layer
-
During the project, the RAL was renamed to SAIL (Storage And Inference Layer), and made more generic. Currently, the SAIL consists of a set of generic interfaces and a number of implementations. A generic SQL implementation supports major database platforms such as MySQL, PostgreSQL and Oracle 9. An in-memory implementation supports fast non-persistent storage of RDF models.
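The layering described above can be illustrated with a minimal Java sketch. It is not the actual SAIL API: the names `TripleStore`, `MemoryStore`, `Triple` and `countStatementsAbout` are hypothetical and chosen only for this illustration. The point it shows is that code written against a generic storage interface does not need to know whether the statements live in main memory or in a relational database.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class StorageLayerSketch {

    /** A single RDF statement; plain strings stand in for URIs and literals. */
    record Triple(String subject, String predicate, String object) {}

    /** Hypothetical generic storage interface (an assumption, not the SAIL API). */
    interface TripleStore {
        void add(Triple t);

        /** Returns all statements matching the pattern; null acts as a wildcard. */
        List<Triple> match(String subject, String predicate, String object);
    }

    /** Non-persistent implementation that keeps all statements in main memory. */
    static class MemoryStore implements TripleStore {
        private final Set<Triple> triples = new LinkedHashSet<>();

        @Override
        public void add(Triple t) {
            triples.add(t);
        }

        @Override
        public List<Triple> match(String s, String p, String o) {
            List<Triple> result = new ArrayList<>();
            for (Triple t : triples) {
                if ((s == null || s.equals(t.subject()))
                        && (p == null || p.equals(t.predicate()))
                        && (o == null || o.equals(t.object()))) {
                    result.add(t);
                }
            }
            return result;
        }
    }

    /** Client code depends only on the interface, so any backend can be used. */
    static int countStatementsAbout(TripleStore store, String subject) {
        return store.match(subject, null, null).size();
    }

    public static void main(String[] args) {
        TripleStore store = new MemoryStore();
        store.add(new Triple("ex:sesame", "rdf:type", "ex:Software"));
        store.add(new Triple("ex:sesame", "ex:language", "Java"));
        store.add(new Triple("ex:jena", "rdf:type", "ex:Software"));
        System.out.println(countStatementsAbout(store, "ex:sesame"));
    }
}
```

A database-backed implementation would implement the same interface and translate `match` into a query, leaving `countStatementsAbout` unchanged.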
- [T7] Implementation of complete support for RQL in the query engine
-
Initially, the RQL engine was updated regularly to follow the specs. However, divergence between the needs of the Sesame user community and the agenda of the developers of the RQL spec led to the development of a new query language, called SeRQL. SeRQL is better suited to the needs of the Sesame user community and is regarded by many in the RDF community as a very strong contender for a QL standardization effort. SeRQL is at least as expressive as RQL.
Also, a third query engine was developed for Sesame, supporting the popular low-level query language RDQL. This query language is also supported by several other RDF tools, such as Jena.
Summarizing, after completing this task, Sesame now supports three query languages: SeRQL, RDQL and RQL.
- [T8] Implementation of partial support for DAML+OIL querying
-
During the course of the project, the scope of this task was widened to support inferencing on RDF-based models. The main focus of this task has been the development of an exhaustive forward-chaining inferencer for RDF and RDF Schema that supports the semantics of RDF as defined by the RDF Model Theory. After successful completion of a prototype implementation, focus shifted to improving the performance of the inferencer.
The successor to DAML+OIL is OWL. A start has been made with supporting the more complex metamodel of OWL through a customizable rule-based inferencer.
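To illustrate what exhaustive forward chaining means, the following Java sketch applies two RDF Schema entailment rules repeatedly until no new statements can be derived. It is a simplified illustration, not the Sesame inferencer: the class and method names are hypothetical, only the subclass transitivity and type inheritance rules are covered, and no attention is paid to performance.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ForwardChainingSketch {

    /** A single RDF statement; plain strings stand in for URIs. */
    record Triple(String s, String p, String o) {}

    static final String TYPE = "rdf:type";
    static final String SUBCLASS = "rdfs:subClassOf";

    /** Returns the explicit statements plus everything the two rules entail. */
    static Set<Triple> closure(Set<Triple> explicit) {
        Set<Triple> all = new HashSet<>(explicit);
        boolean changed = true;
        while (changed) {
            // Collect new statements separately so that 'all' is not
            // modified while it is being iterated over.
            List<Triple> inferred = new ArrayList<>();
            for (Triple a : all) {
                for (Triple b : all) {
                    if (!b.p().equals(SUBCLASS) || !a.o().equals(b.s())) {
                        continue;
                    }
                    // (x subClassOf y), (y subClassOf z) => (x subClassOf z)
                    if (a.p().equals(SUBCLASS)) {
                        inferred.add(new Triple(a.s(), SUBCLASS, b.o()));
                    }
                    // (x type y), (y subClassOf z) => (x type z)
                    if (a.p().equals(TYPE)) {
                        inferred.add(new Triple(a.s(), TYPE, b.o()));
                    }
                }
            }
            // addAll returns true only if at least one statement was new,
            // so the loop stops once a fixpoint has been reached.
            changed = all.addAll(inferred);
        }
        return all;
    }

    public static void main(String[] args) {
        Set<Triple> explicit = new HashSet<>();
        explicit.add(new Triple("ex:rex", TYPE, "ex:Dog"));
        explicit.add(new Triple("ex:Dog", SUBCLASS, "ex:Mammal"));
        explicit.add(new Triple("ex:Mammal", SUBCLASS, "ex:Animal"));
        for (Triple t : closure(explicit)) {
            System.out.println(t);
        }
    }
}
```

Because all entailed statements are computed when data is added, queries can afterwards be answered by simple lookups; the price is extra storage and more work on updates.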
- [T9] Extension of administration capabilities
-
The client interface of Sesame was extended with an option for deleting statements on a statement-pattern basis. To enable this, it was necessary to develop a truth maintenance algorithm that cooperates with the RDF model theory inferencer: when a statement is deleted, any inferred statements that depended on it and can no longer be derived must be removed as well. The task was completed with the implementation of such an algorithm, followed by tests and performance improvements.
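The problem that truth maintenance solves can be shown with a deliberately simple Java sketch. It is not the algorithm developed in this project: it keeps the explicitly asserted statements apart from the inferred ones and recomputes the full closure after every change, which is correct but would be far too slow for large repositories. All names are hypothetical, and only two RDF Schema rules are applied.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TruthMaintenanceSketch {

    /** A single RDF statement; plain strings stand in for URIs. */
    record Triple(String s, String p, String o) {}

    static final String TYPE = "rdf:type";
    static final String SUBCLASS = "rdfs:subClassOf";

    private final Set<Triple> explicit = new HashSet<>(); // asserted by users
    private Set<Triple> all = new HashSet<>();            // asserted + inferred

    void add(Triple t) {
        explicit.add(t);
        all = closure(explicit);
    }

    /** Deletes explicit statements matching the pattern; null is a wildcard. */
    int remove(String s, String p, String o) {
        int before = explicit.size();
        explicit.removeIf(t -> (s == null || s.equals(t.s()))
                && (p == null || p.equals(t.p()))
                && (o == null || o.equals(t.o())));
        // Naive truth maintenance: discard all inferences and derive them again.
        all = closure(explicit);
        return before - explicit.size();
    }

    boolean holds(Triple t) {
        return all.contains(t);
    }

    /** Forward chaining over subclass transitivity and type inheritance. */
    static Set<Triple> closure(Set<Triple> base) {
        Set<Triple> result = new HashSet<>(base);
        boolean changed = true;
        while (changed) {
            List<Triple> inferred = new ArrayList<>();
            for (Triple a : result) {
                for (Triple b : result) {
                    if (!b.p().equals(SUBCLASS) || !a.o().equals(b.s())) {
                        continue;
                    }
                    if (a.p().equals(SUBCLASS) || a.p().equals(TYPE)) {
                        inferred.add(new Triple(a.s(), a.p(), b.o()));
                    }
                }
            }
            changed = result.addAll(inferred);
        }
        return result;
    }

    public static void main(String[] args) {
        TruthMaintenanceSketch repo = new TruthMaintenanceSketch();
        repo.add(new Triple("ex:rex", TYPE, "ex:Dog"));
        repo.add(new Triple("ex:Dog", SUBCLASS, "ex:Mammal"));
        Triple inferred = new Triple("ex:rex", TYPE, "ex:Mammal");
        System.out.println(repo.holds(inferred));
        repo.remove("ex:Dog", SUBCLASS, null);
        System.out.println(repo.holds(inferred));
    }
}
```

Keeping the explicit statements separate is what makes the deletion safe: a statement that was both asserted and inferable survives the removal of its supporting statements, while a purely inferred statement disappears with them.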
- [T10] Integration of visualization software
-
It was soon realized that integrating actual visualization software into Sesame would not be a good idea: the focus of Sesame is middleware, and visualization is something that should happen on top of that, not as part of it. Therefore this task was redefined as enabling existing visualization tools for RDF to cooperate smoothly with Sesame.
Sesame plugins for several visualization packages have been developed in the course of this task. A popular graphical visualization and editing tool for RDF, called IsaViz, now supports exporting to and importing from a Sesame server. Also, a plugin was developed for the popular ontology editor OntoEdit, developed by the German company OntoPrise.
- [T11] Generating developer interest by PR on the web
-
Through a website with a demonstration server, documentation, and public announcements on Semantic Web-oriented fora and mailing lists, developer interest has been consistently generated and sustained throughout the project.
Over the course of the project, membership of the Sesame mailing lists has grown to 131. The popularity of the Sesame demonstration website has grown steadily as well; it currently attracts 10,000 visitors per month. Downloads of the Sesame software also keep growing, currently averaging 400 downloads for each major new release.
Regular feedback in the form of bug reports and requests for enhancements is received from a small number of active people on the mailing lists. Several people have made one-time contributions to the code base, such as Holger Lausen, who developed Oracle support in Sesame, and Jacco van Ossenbruggen, who supplied several bugfixes.
However, the number of actual developers on Sesame has grown only slightly, with the core team (Jeen Broekstra and Arjohn Kampman) still doing most of the development work and other developers contributing infrequently.
- [T12] Implementation of a SOAP protocol handler
-
Halfway through the project it was agreed that this task would be dropped from the project schedule, as a SOAP protocol handler had already been developed by OntoText and needed little further development. It was agreed that the person-months would be redistributed across other tasks in the project.
- [T13] Publication of papers on Sesame
-
Since the start of the project, publications about Sesame have appeared in several conference proceedings, journals and books. A few highlights:
- Jeen Broekstra, Arjohn Kampman and Frank van Harmelen. Sesame: An Architecture for Storing and Querying RDF and RDF Schema. In Proceedings of the First International Semantic Web Conference (ISWC 2002), Sardinia, Italy, June 9-12 2002, pg. 54-68. Springer-Verlag Lecture Notes in Computer Science (LNCS) no. 2342. (variations of this article were later re-published in several books and journals).
- Jeen Broekstra, Arjohn Kampman. Inferencing and Truth Maintenance in RDF Schema: exploring a naive practical approach. In Workshop on Practical and Scalable Semantic Systems (PSSS) 2003, Second International Semantic Web Conference (ISWC), Sanibel Island, Florida, USA, October 20-24 2003.
- J. Broekstra et al.: A Metadata Model for Semantics-Based Peer-to-Peer Systems, 1st Workshop on Semantics in Peer-to-Peer and Grid Computing at the WWW12, Budapest, May 20, 2003.
- H. Stuckenschmidt, R. Vdovjak and J. Broekstra. Index Structures and Algorithms for Querying Distributed RDF Repositories. To appear at the World Wide Web Conference 2004.
- Shelley Powers. Practical RDF: Solving Problems with the Resource Description Framework. O'Reilly & Associates, 2003. ISBN 0-596-00263-7.
- [T14] Writing technical documentation
-
Development documentation was further refined, consisting mainly of concise Javadoc descriptions of APIs and a technical document describing the general architecture of Sesame. All of this documentation is currently available in the Sesame distribution.
- [T15] Organizing a SW tools workshop
-
Work has been done to organize a tutorial at the 2003 Semantic Web Conference. Unfortunately the proposal was not accepted by the conference organizers. While disappointing, we deem this particular task (a dissemination task) non-critical to the success of the Open Sesame project.
4. Conclusions
With the exception of task 15, all tasks have been completed with substantial results. It is therefore agreed that the project has been completed successfully.
Overall, the project has led to the development of Sesame as a serious software package, often cited by experts in the field as one of the top two software architectures for RDF storage, inferencing and querying. The Open Sesame project has led to a stable and performant system with numerous features, a relatively active user community, and a small set of dedicated developers. It is our hope that in the future the community aspect of Sesame will be extended even further, and that more developers will join and help develop the software further.
5. Overview of major releases
The following is an overview of Sesame release dates and a short summary of the new functionality that each release introduced. Download statistics are available from release 0.7 onwards, and reflect the number of downloads of each release up to December 4, 2003.
- 2002-02-21 Sesame 0.1
-
- First public release of Sesame
- 2002-03-12 Sesame 0.2
- 2002-04-11 Sesame 0.3
- 2002-05-08 Sesame 0.4
-
- Introduction of the RDF Explorer
- First release of the RDQL query engine
- Support for deleting statements
- Prototype Truth Maintenance system
- 2002-07-16 Sesame 0.5
-
- Stable Truth Maintenance system
- 2002-08-30 Sesame 0.6
-
- Ontology Middleware Module first public release. OMM supports change tracking, advanced security features and partial DAML+OIL reasoning functionality
- Support for RMI access
- Support for SOAP access
- Improved web interface
- Support for in-memory storage of RDF models
- 2002-11-15 Sesame 0.7 (downloads: 731)
-
- Support for Oracle 9i
- First public release of Sesame RDF parsers
- 2003-03-18 Sesame 0.8 (downloads: 340)
-
- Update to reflect the revised RDF specifications, released by the W3C on January 23.
- First release of "Configure Sesame!", a graphical user interface for Sesame server configuration.
- Complete re-implementation of the RDBMS support.
- 2003-05-14 Sesame 0.9 (downloads: 230)
-
- First release of SeRQL query engine
- Sesame's own RDF parser now the default parser
- 2003-08-14 Sesame 0.95 (downloads: 229)
-
- Revision of the RDF model API
- Revision of the SAIL API, the core of the Sesame system
- Advanced query optimization for SeRQL
- Custom inferencing
- First release of an inferencer for in-memory repositories
- 2003-08-21 Rio 0.96 (downloads: 219)
-
- First public release of the Rio RDF parser toolkit as a separate component.
- 2003-09-23 Sesame 0.96 (downloads: 390)
-
- Major extensions to the SeRQL query language
- 2003-11-21 Sesame 0.97 (downloads: 76)
-
- Introduction of a new programmer's API: the repository API
- Expected soon: Sesame 1.0
-
- Completely stabilized APIs and finished documentation