MOO-cows Mailing List Archive

[Subject Prev][Subject Next][Thread Prev][Thread Next][Subject Index][Thread Index]

Re: hash & balanced trees




It is never a bad idea to `over-engineer' when it comes to algorithms 
over data structures.  What seems like an impossible number of elements 
or key density will occur sooner than later.  My disposition is to go 
with a balanced tree approach, with some form of balancing act, a la AVL,
immediately after insertion or deletion of an element.  However as it was 
pointed out previously, one might choose to relax the balancing 
contraints on the tree, at least temporally, until some server idle time.

Whatever mechanism is implemented in the server code (in c where the 
computation complexity is not obscured by some additional layers), the 
issue remains as to what object interface will be exposed at the moo 
level.  A $dictionary object should have the following interface verbs:

/* Adding terms to a b-tree */

    defineb/3,          % +BTree, +Uniqueness, +Order

        /* This predicate can be ignored. It allows you to specify a
           funny sorting order and whether or not duplicates are
           allowed.

           Creates a b-tree named BTree, with the given order, Order.
           The argument BTree must be a ground term. The Order argument
           can be used to specify the sorting order. The default for
           Order is '+', which means sort the entire term in standard
           order. An Order argument such as +(-,+) means "sort the
           principal functor in standard order, the first argument in
           reverse standard order, and the last argument in standard
           order". So you can have order arguments like -(+,-,-(+,-)),
           etc.

           If Uniqueness = true, then duplicates are not allowed in the
           b-tree. Duplicates are allowed by default.
        */

    recordb/3,          % +Btree, +Key, +Term

        /* Saves the Key and its associated Term under the given Btree */

    replaceb/4,         % +BTree, +Key, +Old, +New

        /* Replaces the term Old saved under Key with the new term New */

/* Reviewing terms in a B-tree */
    retrieveb/3,                % +BTree, ?Key, ?Term

        /* Fetches the Keys and Terms from the Btree according to Order */

    retrieveb_nth/3,            % +BTree, +N, -Key

        /* Fetches the nth Key from the Btree */

    btree_count/2,              % +BTree, -Count

        /* Returns how many elements are in the given Btree */

    % could be implemented: retrieveb_terms/3, betweenb/7, betweenkeysb/4,
    % retrieveb_keys/2, retrieve_terms/2, what_btrees/1

/* Removing terms from a B-tree */

    removeb/2,          % +BTree, +Key

        /* Removes all elements stored under Key in BTree */

    removeb/3,          % +BTree, +Key, +Term

        /* Backtracks. Removes the given Key and Term from Btree. */

    removeallb/1]).     % +BTree

        /* Frees a Btree */

/* The btree predicates.

Here is a summary of what we have.

         | avl trees
--------------------
insert   | O(log N)
delete   | O(log N)
find     | O(log N)
walk     | O(N)

You might be saying to yourself: but why use avl's, why not 2-3 trees or
something else? As far as I know, there is no algorithm that limits
comparisons as well as avl.  If there is, tell me.  Comparisons are the
most expensive part here.  2-3 trees are optimal in limiting disk
access.  But all our stuff is in memory. 

*/

These are our notes from adding balanced trees to Quintus, (thanks Tom:) 
and proved to be adequate for most tasks.  The outstanding issues were 
related to multiple processes accessing these structures in shared 
memory, where one wanted to allow updates to the tree while a tree walk 
was in progress . . . 

Comments and suggestions ?

l8r,
v


References:
Home | Subject Index | Thread Index