Arc Forum | I would love to see a Haskell version of this.

4 points by lojic 6671 days ago | link | parent

I would love to see a Haskell version of this.

7 points by raymyers 6671 days ago | link

Ask and ye shall receive. http://ray.codezen.org/wiki/doku.php?id=binary-search-tree

-----

4 points by lojic 6671 days ago | link

Very nice - thanks!

  Haskell:
  36 total lines; 1,167 chars
  Arc:
  53 total lines; 1,215 chars

I don't know the formula for a "code tree".

-----

4 points by gregwebs 6671 days ago | link

This doesn't tell the whole story. The Haskell version is doing some balancing. For an AVL tree with rotations implemented through pattern matching, see here: http://www.nabble.com/Re%3A-Why-functional-programming-matte...

The line deriving(Eq,Show) means that you can now do this:

  Prelude BST> Tree 'a' Empty Empty
  Tree 'a' Empty Empty
  Prelude BST> let a = Tree 'a' Empty Empty
  Prelude BST> a
  Tree 'a' Empty Empty
  Prelude BST> let b = Tree 'b' Empty Empty
  Prelude BST> b
  Tree 'b' Empty Empty
  Prelude BST> a == b
  False
  Prelude BST> let c = Tree 'c' a b
  Prelude BST> let d = Tree 'c' a b
  Prelude BST> c == d
  True

All this is guaranteed at compile time. Moreover, the Haskell version is much more readable. The use of modules removes unnecessary function prefixing, and there is no doubt about what the arguments to functions are when they are pattern matched.
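To make those two points concrete, here is a minimal sketch of such a tree. It is not raymyers' code from the linked page: the constructors `Tree` and `Empty` follow the GHCi session above, but the bodies of `insert` and `find` are assumptions written for illustration, and this version does no balancing.

```haskell
-- A tree is either Empty or a node holding a value and two subtrees.
-- deriving (Eq, Show) gives structural == and a printable form for free.
data Tree a = Empty
            | Tree a (Tree a) (Tree a)
            deriving (Eq, Show)

-- Each equation pattern-matches on the shape of the tree, so the
-- arguments are named right where they are used.
insert :: Ord a => a -> Tree a -> Tree a
insert x Empty = Tree x Empty Empty
insert x t@(Tree y l r)
  | x < y     = Tree y (insert x l) r
  | x > y     = Tree y l (insert x r)
  | otherwise = t                      -- already present: tree unchanged

find :: Ord a => a -> Tree a -> Bool
find _ Empty = False
find x (Tree y l r)
  | x < y     = find x l
  | x > y     = find x r
  | otherwise = True
```

Because the functions live in a module, callers can write `BST.insert` or plain `insert` rather than carrying a `bst-` prefix on every name.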

Curiously, though, the Arc version actually has fewer tokens by my measurement:

  ruby -e 'p File.read(ARGV[0]).split(/\s|\.|!/).reject{|s| s==""}.map{|tok| tok.gsub(/\(|\)/,"")}.flatten.uniq.size'

  bst.arc 205
  bst.hs  253

But here is the interesting thing. Now let's count just the unique tokens (this is what the .uniq call in the command above does; drop it to get the totals):

  bst.arc 41
  bst.hs   47

This is in part due to Haskell's pattern matching semantics, where the same function definition is repeated multiple times and the '|' character is repeatedly used. Also, there are more functions in the Arc version, and in both pieces of code, variable names are usually one character and function names have multiple characters.
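For anyone who wants to repeat the measurement without Ruby, the same rough tokenizer can be sketched in Haskell. The names `isSep`, `tokens`, `totalTokens`, and `uniqueTokens` are made up for this example, and it is only an approximation of real tokenization, just as the one-liner is.

```haskell
import Data.Char (isSpace)
import Data.List (nub)

-- Separators mirror the Ruby regex /\s|\.|!/ : whitespace, '.', and '!'.
isSep :: Char -> Bool
isSep c = isSpace c || c == '.' || c == '!'

-- Split on separators, skipping empty pieces, then strip parentheses
-- from each token, as the gsub in the one-liner does.
tokens :: String -> [String]
tokens s = case dropWhile isSep s of
  ""   -> []
  rest -> let (tok, more) = break isSep rest
          in filter (`notElem` "()") tok : tokens more

-- Total number of tokens.
totalTokens :: String -> Int
totalTokens = length . tokens

-- Number of distinct tokens.
uniqueTokens :: String -> Int
uniqueTokens = length . nub . tokens
```

Applied to the string `"(def f (x) (+ x 1))"`, this gives 6 tokens in total and 5 unique ones, since `x` appears twice.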

-----

2 points by rkts 6671 days ago | link

> The haskell version is doing some balancing.

Balancing that causes remove to be O(n). Probably not a good idea.

-----

2 points by raymyers 6670 days ago | link

A more balanced tree will speed up 'find' and 'insert'. If those actions are performed far more often than 'remove', it would indeed be a good idea. Without knowing the use case, I cannot say which I would prefer.

-----

3 points by pg 6671 days ago | link

It's fairly easy to do by hand. Just count the number of tokens-- in the sense of things whose first character a parser would not treat as merely another character in the name it was previously reading-- that are not closing delimiters of a pair.

-----

2 points by vrk 6671 days ago | link

If by that you mean counting the nodes, you need to parse the source first. You can either do it by hand (if you know the grammar) or tweak the compiler/interpreter, if it doesn't have any statistics option.

-----

4 points by lojic 6671 days ago | link

Is there a bug in bstMin? Shouldn't it reference bstMin on the rhs?

-----

2 points by raymyers 6670 days ago | link

Fixed. Thanks!

-----

1 point by pg 6671 days ago | link

It's missing bst-trav. (The functions that are supposed to be part of the external interface are the ones that begin bst-...)

Also, is numerical < hard-wired as the comparison function?

-----

2 points by gregwebs 6671 days ago | link

No need for bst-trav since there is an elements function. You can just map over the elements.

In the data declaration 'a' is a generic type, but from the functions that compare elements Haskell will infer that the type must be an instance of class Ord (i.e. implement <, <=, >=, >, max, min, with == coming from its superclass Eq). If the type is not an instance of Ord, it will tell you so at compile time.
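A small sketch of that inference, under the assumption of a plain unbalanced tree: `member`, `Colour`, and `okLookup` are hypothetical names for this example and are not taken from the linked code.

```haskell
data Tree a = Empty | Tree a (Tree a) (Tree a)
  deriving (Eq, Show)

-- No type signature is given here on purpose: because the body uses <,
-- the compiler infers  member :: Ord a => a -> Tree a -> Bool.
member _ Empty = False
member x (Tree y l r)
  | x < y     = member x l
  | y < x     = member x r
  | otherwise = True

-- A type with no Ord instance (hypothetical, for illustration only).
data Colour = Red | Green deriving (Eq, Show)

-- Char is an instance of Ord, so this type-checks.
okLookup :: Bool
okLookup = member 'b' (Tree 'b' Empty Empty)

-- Uncommenting the next line is rejected at compile time with an error
-- along the lines of "No instance for (Ord Colour)":
-- badLookup = member Red (Tree Green Empty Empty)
```

So numerical < is not hard-wired; any type with an Ord instance works, and anything else fails before the program runs.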

-----

2 points by pg 6670 days ago | link

> no need for bst-trav since there is an elements function. You can just map over the elements

Mapping over elements is not the same if you're looking for the 4th element of a huge tree, which is probably what you have if you're using a bst.

-----

7 points by raymyers 6670 days ago | link

Actually yes, in Haskell it is the same. Lazy evaluation means you could take the 4th element of a map over an infinite list. For example: "map (*2) [1..] !! 4"
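Here is a small sketch of why that works for a tree too. It assumes `elements` is an ordinary in-order traversal built with `++`; the actual definition on the linked page may differ, and `fifthDouble` and `firstTwo` are names invented for this example.

```haskell
data Tree a = Empty | Tree a (Tree a) (Tree a)

-- In-order traversal as an ordinary list; (++) is lazy, so the list is
-- only built as far as a consumer demands.
elements :: Tree a -> [a]
elements Empty        = []
elements (Tree x l r) = elements l ++ [x] ++ elements r

-- The example from the comment: index 4 of a map over an infinite list.
fifthDouble :: Integer
fifthDouble = map (*2) [1..] !! 4          -- 10

-- The right subtree is deliberately undefined. Asking for the first two
-- elements never touches it, so this evaluates without an error.
firstTwo :: [Int]
firstTwo = take 2 (elements (Tree 2 (Tree 1 Empty Empty) undefined))  -- [1,2]
```

Note that `!!` is zero-based, so `!! 4` picks the fifth element. Laziness avoids traversing the rest of the tree, but reaching an early element still costs a walk down the left spine.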

-----

2 points by rkts 6671 days ago | link

I would argue that relying on overloaded comparison functions is a bad idea. Not all things have a 'natural ordering,' and even for those that do you might want to override it at some point.

-----

2 points by raymyers 6670 days ago | link

I think reasonable people can differ on this. Passing comparators as arguments is an approach more typical of dynamically-typed languages (e.g. CL and Scheme 'sort'). In a statically-typed language, it is more common to use a comparison operator that specializes on type (e.g. Haskell Prelude 'sort'). If a set of objects doesn't have a 'natural ordering', it might not belong in a Binary Search Tree in the first place.

-----

3 points by pg 6670 days ago | link

What do you do if you want to maintain a list of strings sorted according to their length?

-----

4 points by raymyers 6670 days ago | link

If I wanted to keep using the Ord type-class for comparison, I might use a container type, like this:

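    -- Wrapper type so strings are ordered by length, not alphabetically.
    data Elem = Elem String deriving (Show, Eq)
    instance Ord Elem where
        -- Define compare (not only <) so every Ord method is well-defined.
        compare (Elem x) (Elem y) = compare (length x) (length y)
    
    -- insert, remove and Empty come from the BST module.
    tree1 = insert (Elem "Some string") Empty
    tree2 = remove (Elem "Some string") tree1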

If I really wanted to pass in the comparison function, I would have to change my code slightly. See: "A More Faithful Translation" (http://ray.codezen.org/wiki/doku.php?id=binary-search-tree).
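For readers who cannot reach that page, the comparator-passing style might look roughly like the sketch below. This is not the code from "A More Faithful Translation"; `insertBy` and `byLength` are hypothetical names, and the tree is unbalanced.

```haskell
import Data.Ord (comparing)

data Tree a = Empty | Tree a (Tree a) (Tree a)
  deriving (Eq, Show)

-- insertBy is a hypothetical name: like an Ord-based insert, but the
-- ordering is supplied by the caller as the first argument.
insertBy :: (a -> a -> Ordering) -> a -> Tree a -> Tree a
insertBy _   x Empty = Tree x Empty Empty
insertBy cmp x t@(Tree y l r) =
  case cmp x y of
    LT -> Tree y (insertBy cmp x l) r
    GT -> Tree y l (insertBy cmp x r)
    EQ -> t                            -- treated as a duplicate

-- In-order traversal, used here only to inspect the result.
elements :: Tree a -> [a]
elements Empty        = []
elements (Tree x l r) = elements l ++ [x] ++ elements r

-- Strings kept in order of length rather than alphabetically.
byLength :: Tree String
byLength = foldr (insertBy (comparing length)) Empty ["ccc", "a", "bb"]
```

With this, `elements byLength` is `["a","bb","ccc"]`. One consequence of ordering by length alone is that two different strings of the same length count as duplicates, so only the first one inserted is kept.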

-----

0 points by gregwebs 6670 days ago | link

You have just seen a bunch of code that gets the job done, whereas there is still an argument about whether the Lisp code should be passed a comparison function. If you are going to make this argument you should show some actual code, because it is obvious from looking at this example which way is better.

-----

2 points by pg 6670 days ago | link

> You have just seen a bunch of code that gets the job done,

It seems to me more that you have redefined the job to be what the code does.

-----

3 points by raymyers 6670 days ago | link

Or maybe we just interpreted the job as "a non-destructive binary search tree". Sorry if we misread the syllabus :)

-----

3 points by gregwebs 6670 days ago | link

Sorry, that comment doesn't sound nice and is dumb because I didn't realize what the previous commenter was getting at.

-----