Ethereal-dev: Re: [Ethereal-dev] Performance. Ethereal is slow whenusinglargecaptures.

Note: This archive is from the project's previous web site, ethereal.com. This list is no longer active.

From: "Ronnie Sahlberg" <ronnie_sahlberg@xxxxxxxxxxxxxx>
Date: Sun, 16 Nov 2003 14:06:43 +1100
Hmm. reflecting over my idea.
Thats suboptimal and possibly harmful for performance.

New attempt.
Searching a binary tree is constant time Ordo(log2(n)).
Searching sequentially is worst case Ordo(n), average case would be
Ordo(n/2) and best case would be Ordo(1)
(for teh case when the item we search for happens to be the first one in the
list).

By blindly using a tree here we could actually degrade performance.


One way to do it properly would be :
Document that the 8 first entries in a value_string as the fast entries.
Make sure that when using value_strings the most common ones are
stored in the first 8 items. The first value is the fastest one of them all.
Make sure the most common value is this one, for example
0 or NO_ERROR for SMB would be the most common SMB error code and should
thus be the first entry.

define a new struct :
struct val_string {
    bin_tree *tree;  /*defaults to NULL    no tree (yet) */
    value_string *original_value_string_struct;
}

Then things like val_to_str() would be cloned to take val_string instead of
value_string as argument: it would do :
  scan the first 8 (arbitrary value) entries in val_string->value_string and
see if we find a match.
  if we found a match, great this is the fast path,
    return the match
  if not we have more work to do. :-(
  if val_string->tree==NULL this means we havent built a constant-time tree
yet,
     so build a tree and keep it perfectly balanced.
     store the root of the tree in val_string->tree
  now, since the fast path failed, search the tree val_string->tree for the
matching item.

It would build the trees on demand.
Since we do not need to do random inserts/removes or destroy the tree,
only create it and lookup
there are optimizations we can do.
For example we only do lookups, we never walk the tree which means that we
can skip keeping the "parent" pointer around in the nodes
(that is 4 bytes less we need to alloc)
so our node could be something like
struct val_string_tree {
    struct val_string_tree *left;
    struct val_string_tree *right;
    guint32 value;
    gchar *string;   /* pointing into the value_string structure*/
}

(we could ofcourse replace value/string (8 bytes) with just one single
pointer to the row in the value_string array,
but then since we need to check index much more often than we dereference
*string  it ause a lot more pointer dereferencing which
takes cycles and also poisons the cache.  It would be unclear without
testing if that would be faster or not.
)


----- Original Message ----- 
From: "Ronnie Sahlberg"
Sent: Sunday, November 16, 2003 1:30 PM
Subject: Re: [Ethereal-dev] Performance. Ethereal is slow
whenusinglargecaptures.


>
> ----- Original Message ----- 
> From: "Biot Olivier"
> Sent: Sunday, November 16, 2003 12:50 PM
> Subject: RE: [Ethereal-dev] Performance. Ethereal is slow when
> usinglargecaptures.
>
>
> > A second performance increase proposal I had in mind is to use hash
lookup
> > instead of sequential lookup for large value_string tables. Part of this
> can
> > be done with #define's (where either the existing linear lookup or the
the
> > yet-to-define hash lookup can be chosen), and the remaining part can be
> > produced with scripts (similar to makeregdotc). But if we're eliminating
> the
> > number of (re)dissection passes this may not be relevant anymore...
>
> Excellent suggestion.
> Both SMB and DCE have value strings for the response code that are huge.
> Both of these protocols are very common.
> But there are also a lot of (the majority) value_strings that only contain
a
> small number of entries.
>
> We could do something like
>
> create :
> struct val_string {
>    int num_items;    /*initially set to 0 */
>    bin_tree *tree;  /* binary tree holding  value and string values */
>    value_string      /* the original value string tree */
> }
>
> Then we could create alternate functions to access the val_string that are
> functionally equivalent to the ones accessing value_string.
> Lots of work but will probably be worth it.
>
> Each time val_to_string() is called we also do :
>    if(val_string->num_items==0){   /* we have not converted this one yet
*/
>      val_string->num_items=count_num_items_in(val_string->value_string);
>      if(val_string->num_items>8){   /* only create the tree if it holds >8
> (arbitrary value) items, if less items its probably not worth it */
>
>
val_string->tree=create_a_binary_tree_of_all_value_string_values(val_string-
> >value_string);  /* make sure the tree is perfectly balanced */
>      }
>   }
>   if(val_string->tree){  /* do we have a quick access tree or not ? */
>     look_it_up_in_the_tree();
>   } else {
>     do_the_normal_lookup_walking_value_string_sequentially();
>   }
>
>
> We do our own binary tree handler.   We dont have to worry about
> inserting/removing values or deleting the tree since it will never change.
> We only implement an optimized function to build a tree given a
value_string
> and to look up a value in the tree.
>
>
> Then we just convert all the value_string declarations one by one into
> val_string declarations.
>
> _______________________________________________
> Ethereal-dev mailing list
> Ethereal-dev@xxxxxxxxxxxx
> http://www.ethereal.com/mailman/listinfo/ethereal-dev