Arc Forum | Your use of hash tables to represent nodes in the AVL tree bothers me a lot. Ye...

Arc Forum

0 points by prestonbriggs 5198 days ago | link | parent

Your use of hash tables to represent nodes in the AVL tree bothers me a lot. Yes, you get convenient notation, but it's disgusting, both practically and algorithmically, in terms of both space and time.

To see why, re-implement your stuff in something simple, like C or assembly, then measure space and time consumption. Compare using structs for nodes versus hash tables.

Just because it's wrapped up in pretty syntax doesn't make it efficient for use in all circumstances.

Might play with a C version of alists, versus hash tables, versus binary tree while you're at it. They're all useful, but in different circumstances. Implementing every "dictionary" with one or the other data structure is a mistake. You need to choose the correct data structure to match your application. It's an interesting engineering problem.

Preston

4 points by waterhouse 5198 days ago | link

What I was doing was getting the AVL tree to work (which, by the way, took a certain amount of thinking, debugging, and rewriting). Implementing them with hash tables, with convenient notation, was entirely appropriate for the task at hand. And now that I've got it in working form, it's easy to go and replace the hash table parts with structs or something. In fact, I think that was pretty clearly on my mind, because I described in footnote 2, in some detail, how one might implement the nodes using things other than hash tables. It was not my intention that the nodes should forever remain hash tables. I think you misunderstand me greatly.

And by the way, I did re-implement it in C, using structs, before I wrote the original post. Here's an excerpt.

  avl_node * node_r(st_h *d, avl_node *x, avl_node *y) {
    if (x->dp > 1+y->dp) {
      if (x->lf->dp > x->rt->dp)
        return node(x->dt, x->lf, node(d, x->rt, y));
      else return node(x->rt->dt, node(x->dt, x->lf, x->rt->lf), node(d,x->rt->rt, y)); }
    else {
      if (y->rt->dp > 1+y->lf->dp) {
        if (y->rt->dp > y->lf->dp)
  	return node(y->dt, node(d,x,y->lf), y->rt);
        else return node(y->lf->dt, node(d,x,y->lf->lf), node(y->dt, y->lf->rt, y->rt)); }
      else return node(d,x,y);
    }}

Feels good, man. (The above code, by the way, fails to free() the obsolete nodes. That could be fixed if necessary.) Fortunately, I didn't really have to debug that code; I cribbed directly from my Arc code.

-----

2 points by akkartik 5198 days ago | link

A node has a constant number of fields. Who cares what the time complexity of lookup is?

Perhaps it's inefficient in use of space. But isn't it premature optimization to worry about that?

I don't see how showing an implementation for AVL trees claims to be efficient 'in all circumstances' (what is?), or how it claims that you can use one data structure for everything.

-----

1 point by prestonbriggs 5198 days ago | link

The "constant numbers of fields" is the reason there's a better alternative.

A hash table has an _expected_ constant lookup time. But it might be as bad as O(number of fields). He blows all his O(log n) complexities by putting in an expected O(1) lookup for all his field manipulations.

It's probably premature optimization to use an AVL tree in place of an ordinary binary tree, or even an alist.

I didn't see him promoting AVL trees for all uses, but I did see him suggesting they could/should replace alists.

Of course I don't mind a guy learning about AVL trees, or anything else, by implementing them in Arc or anything else. Indeed, I recommend such exercises to everyone; we learn a lot. But my feedback regarding his implementation of the nodes stands.

Preston

-----

1 point by akkartik 5198 days ago | link

"It's probably premature optimization to use an AVL tree in place of an ordinary binary tree, or even an alist."

Yeah that makes sense, thanks for bringing that up. The constants will probably drown out anything else for small structures.

---

"A hash table has an _expected_ constant lookup time. But it might be as bad as O(number of fields)."

Let me rephrase. A node has a statically constant, small number of fields. Who cares what the time complexity of lookup is?

"He blows all his O(log n) complexities by putting in an expected O(1) lookup for all his field manipulations."

I think you mean O(k) not O(1)? Since k is not n and exactly 4, which is a small number, it's not clear this is a problem.

---

May I suggest referring to the post or idea or implementation rather than to 'the guy'? I'm also finding your use of words like 'disgusting' distracting. (unless you're really feeling flamey :)

-----

1 point by prestonbriggs 5198 days ago | link

Yeah, it is distracting and I'm sorry about that. Similarly, I'm sorry to publicly snipe, 'cause I really do admire folks who make an effort to learn by doing, and better yet, write up their experiences.

But "disgust" was the word I wanted. I can't point at his choice and say "this is all wrong". Instead, I point at it and claim that it "smells" wrong to me, I'd never do it that way, bad taste, etc. Then I tried to argue about why my taste is offended.

Regarding O(1) versus O(k), I actually wrote "expected O(1)". That is, _expected_ constant time, versus guaranteed constant time. Though your point that the number of fields is a constant, 4, is certainly true.

Preston

-----

1 point by aw 5198 days ago | link

By the way, if anyone would like to use struct's from Arc it would be easy to implement. (http://docs.racket-lang.org/guide/define-struct.html)

-----