Last update: 2003-08-13

Summary of IRDP Research Exchange

international exchange of scholars for software projects

(February - July 2000) My first task was to get familiar with the source code of the Globe Location Service (LS) prototype since I would be working with it throughout my stay at the VU. The LS was having poor response times (~3ms) to lookup and update operations, even when there was no disk I/O or network latency between the resolver and the LS node (both running on the same machine). Thus, the most productive way to understand how the LS worked was to try to find where all this time was being spent. I worked on this for about 1 month: learning about each logical layer in a node (Datagram, Messenger, RPC and Operation), running the Java profiler, and inserting my own timers throughout the code. Unfortunately, I was not able to find any one function or object that was extraordinarily slow. The 3ms response time appeared to be equally distributed throughout the logical layers.

However, during this preliminary investigation, I did become comfortable enough with the code to be able to add serious modifications to it. The LS prototype lacked 2 components that had been designed and published in research papers: crash recovery and node partitioning mechanisms. Crash recovery deals with the loss of update operations, and node partitioning splits a logical node into many physical ones for load distribution purposes. Implementation of these was relatively straightforward; for details of the algorithms, see the papers (which can be found at www.cs.vu.nl/globe/).

However, one minor problem appeared when I went to merge the code of these 2 components. Originally, we thought that crash recovery and node partitioning were independent of each other, which led to a design that placed the partitioning algorithms in a new layer between the Messenger and RPC layers. The layers below the partitioning layer dealt only with physical nodes, and those above it dealt with only logical ones. However, the crash recovery mechanisms were mainly located in the RPC layer, but handled only physical nodes. Therefore, the partitioning algorithms had to be placed into the RPC layer.

Soon after I had completed the merge, we discovered that the crash recovery algorithms were partially incomplete. The paper describing crash recovery (and what I based my implementation on) dealt only with the loss of update RPCs but not lost lookup RPCs. In the prototype, if a reply to a lookup request was lost, the request would stay in the node's memory forever.

To prevent lost replies from using up a node's resources, I added a time-out mechanism for RPC requests. When a time-out occurs, the request and associated objects are removed from the node.

I'd estimate that I spent 3 months during this phase of the exchange: implementing crash recovery and partitioning and merging them, implementing RPC time-outs, and lots of testing.

The final 2 months of the exchange I spent working on a more serious performance analysis of the LS node. Instead of looking at absolute times as before, this analysis focused on comparing the relative execution times of the logical components within a single node. These included: RPC ordering and reliability, partitioning, disk I/O, etc. By comparing the time spent in each logical component, we hoped to find the performance penalties associated with the trade-offs made during the LS design. For example, the high-level lookup and update operations are made simple by having complex RPC and Messaging layers.

A list of the all experiments performed, along with a more detailed summary of the analysis can be found at www.cs.vu.nl/~ldcrawl/perf. Briefly, for the experiments involving a database on disk, disk I/O was the major bottleneck. Possible ways to improve this are optimizing the underlying disk database (easy), and making the LS node concurrent for operations on different object handles (hard). I also performed experiments using an in-memory database. Unfortunately, the node only appeared to speed up by a factor of 2-3, depending on the types of operations being performed. In this setting, Java was the bottleneck. To improve this, one might look at different Java interpreters or use the Manta compiler (written by a group at the VU).

Dan Crawl

Summary of IRDP Research Exchange

Project ReX

Navigate projects

Search