Arc Forum

6 points by cchooper 6318 days ago

I know I'll take a lot of flak for this, but I'm going to disagree vehemently with this suggestion. Someone has to play devil's advocate, and I guess it's going to be me :)

One of the things I've liked about Arc so far is the tiny number of types: characters, strings, symbols, numbers, lists, hashes, functions and macros. Having a very small number of types means that every function can be written to do something useful on each type. For example, I just wrote a library for converting any Arc value into its JSON equivalent so I could send them to a web client. It was really easy, because I had so few types to deal with. No need for polymorphic whatnot and abstract do-daas. Just use a simple case statement and you're there.
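The "simple case statement" approach can be sketched roughly as follows. This is a Python analogue, not the Arc library mentioned above (which is not shown in the thread); the name `to_json` and the handling of each type are assumptions for illustration, and string escaping is deliberately minimal.

```python
def to_json(value):
    """Serialize a value built from a small, closed set of types to JSON text."""
    if value is None:
        return "null"
    # bool must be tested before int, because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)  # assumes finite numbers; NaN and infinity are not valid JSON
    if isinstance(value, str):
        # Minimal escaping only (backslash and double quote); control characters are not handled.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    if isinstance(value, list):
        return "[" + ",".join(to_json(item) for item in value) + "]"
    if isinstance(value, dict):
        members = (to_json(str(key)) + ":" + to_json(val) for key, val in value.items())
        return "{" + ",".join(members) + "}"
    raise TypeError("no JSON equivalent for " + type(value).__name__)


print(to_json([1, "two", {"three": True, "four": None}]))
```

With only a handful of cases to cover, the whole converter is one flat dispatch, and an unsupported value fails loudly instead of being silently mis-encoded.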

Elaborate type systems can solve a lot of problems. For example, I had a problem with alists, because I wanted them to be converted into objects, but general lists needed to be converted to arrays. How do you tell the difference between an alist and a normal list? If you had some kind of type system that allowed you to declare that your list is an alist then that would solve it. This is how complex type systems get started, but it's all downhill from there. Once your language starts relying on its type system for its power, you begin taking huge risks.
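The alist ambiguity can be made concrete with a small Python sketch. The `Alist` wrapper, `to_plain` and `encode` are hypothetical names introduced here; they stand in for "some way to declare that this list is an alist" and are not part of any library discussed in the thread.

```python
import json


class Alist:
    """Hypothetical marker: a list of [key, value] pairs meant to be read as a mapping."""

    def __init__(self, pairs):
        self.pairs = pairs


def to_plain(value):
    """Turn declared alists into dicts and leave ordinary lists as lists."""
    if isinstance(value, Alist):
        return {key: to_plain(val) for key, val in value.pairs}
    if isinstance(value, list):
        return [to_plain(item) for item in value]
    return value


def encode(value):
    return json.dumps(to_plain(value))


pairs = [["x", 1], ["y", 2]]
print(encode(pairs))         # same data, read as a list of lists
print(encode(Alist(pairs)))  # same data, declared to be an alist
```

The underlying data is identical in both calls; only the explicit declaration tells the encoder which JSON shape is wanted.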

The problem is that when complex type systems fail, they fail hard. Despite endless academic papers on the subject, and a plethora of practical programming languages, no one has yet developed a type system that is powerful enough to allow people to express everything they want in a reasonably complex program. That wouldn't be a problem if the type system got you 99% of the way there and you had to do a small workaround for the last 1%. Unfortunately, what usually happens is that the 1% left over breaks the whole program. Suddenly you can't use that really useful parsing library, because it's defined over an abstract type that your type can't quite be mapped into. Suddenly the polymorphism features of your language become useless because your type isn't quite polymorphic in the way the language designers anticipated (perhaps you want function polymorphism but the language supports object polymorphism, or vice versa). All type systems fail eventually, and if your language relies on its type system for its power, then you might as well be writing machine code for all the use it will be.

Let's take just two examples. OOP started with multiple inheritance, but then people decided this was too error prone, so people moved to single inheritance. But that wasn't expressive enough, so we got interface inheritance, then generics, then mixins, then "composition over inheritance". Despite all this, you still can't express the relationship between circles and ellipses in any OO language.

Example 2: Haskell was inspired by the very powerful Hindley-Milner type system, which has type variables, polymorphic types and abstract types, but this still wasn't good enough, so they added type classes. The ML people still didn't think this was good enough, so they added functors. Some people still didn't think this was good enough, so they proposed higher-order functors. There are endless proposals to add generics to the language too, and then there's dependent typing and all that malarkey. No matter how powerful your type system is, it's never enough.

Arc doesn't get its power from having a complex type system. It gets it from being able to express computations on a small number of types in a very concise manner. This kind of power can fail too, but it doesn't fail hard because Arc has the tools to make any problem simple: functions, macros and reprogramming the programming language. Type systems can save you writing lots of similar functions for slightly different types, but so can macros because they can generate the code for you. Type systems can distinguish between different implementations of cdr and car for different types, but you can also do that by redefining cdr and car to cope with those types, or by passing your own versions of cdr and car along with the object (perhaps even stored in the object itself: see the appendix on object systems in ANSI Common Lisp).
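The idea of passing your own versions of car and cdr along with the object can be sketched like this. It is a Python illustration under assumed names (`collect`, `is_empty`), not code from Arc or from ANSI Common Lisp.

```python
def collect(seq, car, cdr, is_empty):
    """Walk any sequence using only the accessors supplied by the caller."""
    items = []
    while not is_empty(seq):
        items.append(car(seq))
        seq = cdr(seq)
    return items


# A string treated as a sequence of characters.
chars = collect("abc",
                car=lambda s: s[0],
                cdr=lambda s: s[1:],
                is_empty=lambda s: s == "")

# A linked list built from nested pairs and terminated by None.
linked = (1, (2, (3, None)))
numbers = collect(linked,
                  car=lambda p: p[0],
                  cdr=lambda p: p[1],
                  is_empty=lambda p: p is None)

print(chars, numbers)
```

`collect` never inspects the type of `seq`, so no type declaration is needed to make it work on a new kind of sequence; the caller just supplies three small functions.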

Complex type systems give you power, but they are a seduction that we should resist. They allow you to solve problems by constructing interesting types, but the set of problems you can solve is always strictly limited by the type system (and just because you can write your own type system doesn't mean the problem goes away, unless you can get your system to work seamlessly with everyone else's system).

A powerful set of functions over a small number of types is a much better way of doing this. Types are the sockets that allow functions to be connected together. You can either work with a small number of types that allow you to connect all your functions to each other, or you can create all kinds of different types and then use polymorphism/inheritance/type variables/has-a relationships to try and patch up the fragmentation problem. If you really have lots of types that are similar enough for your functions to work across them, then you should probably use fewer types.

On the other hand, I could be talking nonsense. Let's assume that until we have evidence to the contrary :)

4 points by sacado 6317 days ago

Wow, I think you won the prize for the longest post so far. And it is even a very clever one, actually. And I think your view and almkglor's are not so far from each other.

You state that there should only be the few basic types currently defined in Arc. Paul's idea was to eventually get rid of strings (they are a special kind of list) and even numbers (they are a special kind of list too...). But in the end he didn't, and won't, at least for numbers. He also said that this view (as few basic types as possible) eventually forced him to develop a basic type system (with annotate, type, rep and isa) to distinguish between the raw list '(a a a a a) read as the number 5, as the string "aaaaa" or as an actual list of 5 symbols.

One way or another, you need explicit types if you want some kind of dynamic typing. Assembly language works the opposite way: e.g. you state (explicitly or not) that the arg of your function is a number. If the user gave you what he considers a string, too bad for him, because you can't distinguish between them. That means your function can't be polymorphic and you are stuck in an even more constrained space than with user types.

You need an isa function (call it isa or hasa, never mind for now). For example, car should take a list, and nothing else. To enforce that, you have to check its argument's type. If you want to redefine car so as to take scanners, generators, ... into consideration, that's easy too: just define your own version of car: if arg is a list, call the original car, else arg is a generator à la Python, so funcall it:

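  (let _car car                ; keep the original car under another name
    (def car (x)
      (if (acons x)            ; a pair's type is 'cons, so test with acons
        (_car x)
        (x))))                 ; otherwise x is a generator: call it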

OK, you're right so far, cchooper: predefined types are enough for these situations, and that's how we should handle such cases. Now imagine we want to deal with lists, generators and arrays defined through FFI. You have to distinguish between the latter two, but how can you do that? Encapsulating them in a cons whose car is a discriminant between both types will not work here, because the wrapper would itself be a cons and so be treated as a list. That's why you need a way to define these new types, and that is what annotate is for.

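  (let _car car
    (def car (x)
      (if (acons x)               ; plain list: use the original car
          (_car x)
        (isa x 'generator)        ; annotated generator: call its representation
          ((rep x))
        (isa x 'array)            ; annotated FFI array; a-get stands for its accessor
          (a-get x 1)
          (err "Not a valid type for car : " type.x))))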
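For readers who do not have Arc at hand, the same tag-based dispatch can be approximated in Python. This is only a sketch: `Tagged`, `annotate`, `type_of`, `rep` and `isa` are hypothetical stand-ins loosely modelled on the Arc operators of the same names, and the standard-library `array` plays the role of the FFI array.

```python
from array import array


class Tagged:
    """A value paired with an arbitrary tag, loosely modelled on Arc's annotate."""

    def __init__(self, tag, rep):
        self.tag = tag
        self.rep = rep


def annotate(tag, value):
    return Tagged(tag, value)


def type_of(x):
    # Tagged values report their tag; everything else reports its Python type name.
    return x.tag if isinstance(x, Tagged) else type(x).__name__


def rep(x):
    return x.rep if isinstance(x, Tagged) else x


def isa(x, tag):
    return type_of(x) == tag


def car(x):
    if isa(x, "list"):
        return x[0]
    if isa(x, "generator"):
        return rep(x)()   # the representation is a zero-argument function
    if isa(x, "array"):
        return rep(x)[0]  # assumption: the car of an array is its first element
    raise TypeError("Not a valid type for car: " + str(type_of(x)))


print(car([1, 2, 3]))
print(car(annotate("generator", lambda: 42)))
print(car(annotate("array", array("i", [7, 8]))))
```

The generator and the array are told apart purely by their tags, which is the job the paragraph above assigns to annotate.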

Now, about the distinction between isa and hasa. The wonderful thing about annotate is that it is very generic! It does not provide you with a way to say your data is of a specific type; it lets you annotate your data with whatever data you want! The fact that it works with isa is actually a side effect: annotate does not care about isa. You can annotate with a symbol for sure, but also a string, a number, a list, a macro, a closure, a continuation, whatever!

That means you can do something like this, for example:

  ; listtab expects (key value) lists, so each entry is a two-element list
  (annotate (listtab `((car ,(fn (self) (self 'car)))
                       (cdr ,(fn (self) (self 'cdr)))))
            x)

  (let _car car
    (def car (x)
      (if (acons x)
        (_car x)
        (let tx type.x           ; the tag is a table of methods
          (tx!car rep.x)))))

And you've got an object system where the car function is embedded into the data when you don't apply it to lists. That's it: you used annotate as it is now defined to create a has-a behavior. Almkglor has got many other funny ideas about typing, and I think all of them can be implemented simply with annotate and by wrapping the old definitions of core functions and axioms in usertype-aware ones.
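The "methods stored in the tag" trick can also be sketched in Python. Everything here (`Tagged`, `annotate`, the `string_methods` table) is a hypothetical stand-in for the Arc version described above, not an existing API.

```python
class Tagged:
    """A value carrying an arbitrary tag; here the tag is a table of functions."""

    def __init__(self, tag, rep):
        self.tag = tag
        self.rep = rep


def annotate(tag, value):
    return Tagged(tag, value)


def car(x):
    if isinstance(x, list):
        return x[0]
    # Not a list: look the operation up in the object's own tag.
    return x.tag["car"](x.rep)


def cdr(x):
    if isinstance(x, list):
        return x[1:]
    return x.tag["cdr"](x.rep)


# A table of methods that makes a plain string answer to car and cdr.
string_methods = {"car": lambda s: s[0], "cdr": lambda s: s[1:]}
s = annotate(string_methods, "abcde")

print(car(s), cdr(s))
```

Nothing in `car` or `cdr` knows about strings; supporting another representation only requires building another table and annotating the data with it.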

Maybe Arc needs a few more facilities right there (for example, having to use rep on annotated data is, I think, the biggest mistake of that type system. Please, pg, correct this!), but I think we've got everything we need. Almkglor only proposes a few macros and some discipline in libraries, but this can be done with the current language definition (and ignored by everybody but him :)

-----

4 points by sacado 6317 days ago

Just tried that, it works :

  (= _car car _cdr cdr)

  (def hasa (obj typ)
    (if (no typ)
      t
      (and (type.obj (car typ)) (hasa obj (cdr typ)))))

  (def car (x)
    (if (acons x)
        (_car x)
      (hasa x scanner)
        (let tx type.x
          (annotate tx (tx!car rep.x)))
        (err "bad object for car")))

  ; define scanner type-class
  (= scanner '(car cdr))

  (= mycar [_ 0] mycdr [cut _ 1])
  ; Say s can answer to car and cdr
  (= fns (listtab `((car ,mycar) (cdr ,mycdr))))
  (= s (annotate fns "abcde"))
  (car s)
  -> #3(tagged #hash((cdr . #<procedure: mycdr>) (car . #<procedure: mycar>)) #\a)
  (car rep.s)
  -> #\a

It's a bit ugly, probably it could get a little better (the display is ugly, IMHO) but those wanting it could start doing funny things.

-----

4 points by cchooper 6317 days ago

> Wow, I think you won the prize for the longest post so far.

Well someone has to load-test this thing!

> And it is even a very clever one, actually.

Thanks :)

> having to use rep on annotated data is, I think, the biggest mistake of that type system.

This is exactly the problem that got me thinking about types. I've been tempted to create a few types with annotate already, but each time I stopped because I didn't want to have to reimplement every function to work on my new type. Each time, I found a different solution to the problem that didn't require new types, which started me thinking "hey, perhaps we don't need new types after all!"

But you're right that you'll always need new types eventually. The solution you suggested is, I think, the right one. It's a bit like the object system in ANSI Common Lisp (pg even used hash tables to store the object's methods!) but it uses annotate to associate methods with objects.

So I'll modify my position and say that you should avoid creating new types, but if you have to do it, duck typing is the way to go.

-----

2 points by almkglor 6317 days ago

For completeness ^^

  (redef hasa (obj typ)
    (if (and (isa obj 'cons) (is typ scanner))
      t))

-----

1 point by sacado 6317 days ago

You could read this about annotate et al. : http://www.paulgraham.com/ilc03.html

I particularly like this one: "I expect type names will ordinarily be symbols, but they don't have to be. Either argument can be of any type. I can't imagine why users would want to have type labels other than symbols, but I also can't see any reason to prevent it."

-----

1 point by cchooper 6317 days ago

You know, the first thing I thought when I read that x years ago was "You could pass around a load of functions as the type to do polymorphism"! I wonder if everyone has the same thought.

-----

2 points by almkglor 6318 days ago

The point is mostly that those problems probably stem from is-a semantics. Has-a semantics might work better.

In such a case, has-a semantics means that a circle "has-a" function to compute its area, a function to compute its circumference, etc. An ellipse "has-a" function to compute its area, a function to compute its circumference, etc. It doesn't matter whether the user thinks of circles as special cases of ellipses or not.
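A minimal Python sketch of this has-a view, with hypothetical `Circle`, `Ellipse` and `total_area` names: neither shape inherits from the other, and the consumer only relies on each object having an `area` function. Only area is shown, to keep the example short.

```python
import math


class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Ellipse:
    def __init__(self, semi_major, semi_minor):
        self.semi_major = semi_major
        self.semi_minor = semi_minor

    def area(self):
        return math.pi * self.semi_major * self.semi_minor


def total_area(shapes):
    # Only requirement: every shape "has-a" zero-argument area function.
    return sum(shape.area() for shape in shapes)


print(total_area([Circle(1.0), Ellipse(2.0, 0.5)]))
```

`total_area` never asks whether a circle is an ellipse, so the is-a question that troubles class hierarchies does not come up.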

Perhaps the way to go would be to support interfaces without requiring type checking. Basically you simply say "all I care about is that this object can be passed to that function".

That said a semiformal way of expressing this - probably by giving a name to an interface (which is just a set of function symbols that an object supports) - might be useful. This way wouldn't necessarily be checked by the program - it might be useful to have it be read by the programmers as part of the code/documentation.

  (typeclass 'scanner 'car 'cdr)
  ; programmer reads: a scanner is anything that somehow supports 'car and 'cdr

  (def foo (a)
    " A ridiculously complex library function which does
      a lot of useful things and which the library user
      probably doesn't want to read in full, because he or
      she is using the library so that he or she doesn't
      have to think about it.
      See also [[bar]] "
    (must-have a 'scanner)
    ; programmer reads:
    ; "Anything I define which supports 'car and
    ; 'cdr can work on this function"
    (ridiculously-complex-expression-involving a))
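One possible reading of `typeclass` and `must-have`, sketched in Python. The registry, both function names and the `StringScanner` class are assumptions made for illustration; the proposal above leaves open whether the check is enforced at all, and this sketch picks a runtime check as just one option.

```python
TYPECLASSES = {}


def typeclass(name, *function_names):
    """Give a name to a set of operations an object is expected to support."""
    TYPECLASSES[name] = tuple(function_names)


def must_have(obj, name):
    """Raise TypeError unless obj has a callable attribute for every listed operation."""
    missing = [fn for fn in TYPECLASSES[name]
               if not callable(getattr(obj, fn, None))]
    if missing:
        raise TypeError(f"{type(obj).__name__} is not a {name}: missing {', '.join(missing)}")
    return obj


typeclass("scanner", "car", "cdr")


class StringScanner:
    """A user-defined object that happens to support car and cdr."""

    def __init__(self, text):
        self.text = text

    def car(self):
        return self.text[0]

    def cdr(self):
        return StringScanner(self.text[1:])


def second(a):
    must_have(a, "scanner")  # reader sees: anything with car and cdr will do
    return a.cdr().car()


print(second(StringScanner("abc")))
```

`StringScanner` never declares that it is a scanner; it qualifies simply by having the two operations, which is the point of the has-a approach.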

-----

3 points by cchooper 6317 days ago

OK, maybe I flipped out a bit when I heard the word 'type' mentioned. It brought up nightmares of using Java and C++ and so forth.

The kind of thing you're suggesting here does look powerful (and most importantly, optional!) If you combine it with sacado's suggestion for putting the methods in the tag, then you have a very powerful type system indeed.

I just have one contentious thing left to say: when you move away from is-a typing to has-a typing, does it really make sense to use the word 'type' at all? Aren't we really talking about what your functions can do, rather than what your objects are? For example, if you define car and cdr to work on strings, have you added strings to a new sequence type or have you expanded the power of your functions? I prefer to think that you've done the latter, and save the word 'type' for the basic is-a types that every language has to have.

It's the word 'type' I'm objecting to now, not the general idea. Perhaps we need to get out of the typing mindset in order to really break new ground.

-----

3 points by almkglor 6317 days ago

The value of types is the name you associate with an object. Instead of giving a really long list of "functions that should work on the object" you say "an object of this type". So instead of saying "an object that has 'car and 'cdr" you just say "scanner".

Brevity, brevity.

-----