Arc Forumnew | comments | leaders | submitlogin
Stab at implementing Arc lists with mpair's, appears to fix queue bug (awwx.ws)
9 points by aw 5544 days ago | 23 comments


3 points by waterhouse 5542 days ago | link

I did a couple of tests, and stuff that involves a lot of set-c[ad]ring seems to happen 3-4 times as fast. Nice!

  ;All tests done using "racket -i -f as.scm", Racket v5.0
  ;Where x is (n-of 1000000 (rand))
  ;With mpairs
  arc> (time:no:sort < x)
  time: 12408 msec.
  nil
  
  ;Normal arc3.1
  arc> (time:no:sort < x)
  time: 36004 msec.
  nil

  ;This function destructively reverses a list, as in CL
  ; but it's actually slower than 'rev, for some reason
  (def nreverse (xs)
    (and xs
         (let tail nil
           (while xs
             (let rest cdr.xs
               (= cdr.xs tail
                  tail xs
                  xs rest)))
           tail)))

  ;With mpairs
  arc> (time:no:zap nreverse x)
  time: 719 msec.
  nil

  ;Normal arc3.1
  arc> (time:no:zap nreverse x)
  time: 2732 msec.
  nil
On the other hand, it seems to slow down normal list operations somewhat, about 20%.

  ;The format of the output of "time" is "time: nnn msec."
  ; so we will take the output of a bunch of iterations,
  ; read all sexprs, and take the second of every three.

  ;With mpairs
  arc> (map cadr (tuples (readall:tostring (repeat 30 (time:len x))) 3))
  (66 65 63 63 62 62 62 63 66 63 63 69 74 66 63 371 97 92 74 71
   71 71 64 71 63 71 63 63 63 63)

  ;With normal arc3.1
    arc> (map cadr (tuples (readall:tostring (repeat 30 (time:len x))) 3))
  (62 62 61 52 52 52 52 52 52 51 51 51 57 52 52 56 56 61 61 52 52
   52 52 54 56 61 52 52 54 61)
On the whole, probably worth it.

And, by the way, for your amusement and my pleasure, I have a dinky little version of mergesort that runs 20% faster than the destructive version defined in arc.arc runs in arc3.1. It's stable, too. http://pastebin.com/XFz6T9xW

-----

1 point by aw 5542 days ago | link

Thanks for the tests!

On the other hand, it seems to slow down normal list operations somewhat, about 20%

hmmm, since since most Arc lists are now built out of mpairs in this hack, I wonder if it would be any faster to test for mpair's first:

  (xdef car (lambda (x)
               (cond ((mpair? x)   (mcar x))
                     ((pair? x)    (car x))
                     ...
etc.

-----

1 point by thaddeus 5543 days ago | link

Out of curiosity, why does arc avoid immutability? I thought the point was that mutability meant race conditions amongst concurrent threads could cause data busts. And although I know arc provides a queue mechanism, which effectively accomplishes the same thing, why not just build a transaction mechanism on top of the immutable data structures, kinda like clojure's dosync?

-----

2 points by aw 5542 days ago | link

Well, I can only speak for myself, but let's take a look at what immutability is useful for.

Suppose you have a list that you know shouldn't be changed. If you have immutability in the language, you can make the list immutable, and then if some buggy code tries to change the list you'll get an error.

However, bug-free correct code won't try to modify the list. (If it did, it would get an error, and it wouldn't be correct code). So for correct code, it doesn't matter whether the list is actually enforced to the immutable or not, it will be immutable in practice because the correct code won't try to change it.

So being able to enforce immutability doesn't help code that is already correct. What it does do is maybe help you find bugs quicker. If you have some buggy code that is modifying a list when it shouldn't be, you'll find out right away instead of getting some other error or problem later.

So, you go to this extra effort to mark certain lists as immutable (or maybe have things immutable by default and then make particular things mutable), and in return for this extra effort some kinds of bugs are easier to figure out.

Is it worth it?

Well, for me, just in my own personal experience when I've been tracking down my multithreading bugs, I haven't had a problem with accidentally mutating things. The kind of bugs I run into are typically things like not doing enough locking or doing too much locking. So I could run around marking all kinds of things immutable in my own code and it wouldn't help me find the bugs that I actually have.

Of course, that doesn't mean that there might not be some future situation in which I'd find immutability helpful, such as where I was accidentally modifying some list I didn't want to modify and I wasn't sure where I was doing it. Then I'd happily make the list immutable, so I could quickly find out where my bug was.

Which would be easy to do. Even in Arc 3.1 with PLT 4, which makes Scheme's "immutable" pairs mutable, just add a datatype to the Arc runtime, have "car" and "cdr" work on the datatype, but don't add setters. Since this would be so easy to add, that no one has done it suggests to me that maybe no one has found it particularly useful yet.

Another situation in which I suspect people like enforced immutability when they're working in a team of programmers and they don't trust the skills other the other programmers very much. When the other programmers introduce bugs it's easy to see that it's their fault when they get an error.

It may be that individual programmers working on their own code often don't care about having immutability enforced by the language as much because they already know where they are and aren't modifying things.

-----

1 point by akkartik 5542 days ago | link

IMO to add things like immutability and modules defeats the 'philosophy' of lisp. It makes it harder to build lisps atop PLT.

---

When you just load a series of files in PLT you can declare your functions in any order and in any file. Nice.

But turn those files into modules and now you have errors.

  compile: unbound identifier in module
I've spent crazy quantities of time trying to decompose arc's implementation into files (http://arclanguage.org/item?id=11869) while getting around modules.

---

According to R5RS (http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z...):

  "In Scheme, combinations must have at least one subexpression, so that
  () is not a syntactically valid expression."
This means I can't make the arc compiler R5R6 scheme without lots of crazy checking all over the place about whether to emit () or '().

(And to decompose arc into arbitrary files I need to come up with a weird language that accepts () like mzscheme mode, but doesn't use modules.)

---

The arc implementation is far more verbose than it could be because of PLT's phase separation rules -- it must implement its own macro system (http://arclanguage.org/item?id=11529)

---

If anybody has suggestions around these constraints I'm all ears.

-----

1 point by thaddeus 5542 days ago | link

Thanks for your reply, your thoughts are allowing me to work through the trade offs...

I view it as a convenience. Similar to HN or this forum app, many of my apps keep data in-memory (to improve performance) however unlike news.arc my apps tend to allow concurrent data changes among multiple users. News.arc doesn't need to worry about this 99% of the time (where 1 user, 1 story) and I'm thinking arc is very news.arc centric.

Using Clojure I just find it easier to wrap my statement in a dosync, walk away and not worry about writing more code to manage the transaction to shared in-memory data.

Not trying to push Clojure, but having just spent a tonne of time learning Clojure, I naturally would come back to the watering hole asking questions. :)

-----

1 point by aw 5542 days ago | link

Share your discovery by giving us an example in Clojure and Arc, and showing us how the Clojure example is shorter or how it was easier to write for you.

Arc is a very hackable language, so if there is a facility there that is useful and not just marketing hype ("makes threads worry-free!"), someone may add it for you.

-----

1 point by thaddeus 5542 days ago | link

At a high level, assuming the data structure "data" is initialized:

Arc (without transaction code):

  (= data!1 (value-of-function-taking-a-variable-amount-of-time))
Clojure (with transaction code):

  (dosync 
     (alter data assoc 1 (value-of-function-taking-a-variable-amount-of-time))) 
Aside from arc's update statement being shorter, the point is I literally only have to wrap the update statement in a (dosync ...) to achieve atomicity, concurrency, and isolation semantics. Clojure manages 'state' and 'time' for me.

I could start writing a bunch of code to show the equivalent of dosync in arc, but it would be implementation specific and I believe the point is shown that I don't have a coding exercise in Clojure it's part of the language, therefore a convenience.

Addtionally I can keep two data structures in sync at all times.

  (dosync 
     (let [x (value-of-function-taking-a-variable-amount-of-time)]
       (alter data assoc 1 x)
       (do-something-else)
       (alter data2 assoc 1 x)))
For those interested you can view Rich's session on state and identity here: http://www.infoq.com/presentations/Value-Identity-State-Rich....

-----

1 point by aw 5542 days ago | link

Hmm, you don't need to reimplement Clojure in Arc, but showing an example in Arc that's buggy and then how the same thing can be written as easily in Clojure in a bug-free way would be helpful.

I haven't had trouble with "atomicity, concurrency, and isolation semantics" that I've wanted to do myself in Arc so far, so without an example I don't have a way to tell why this might be better.

-----

1 point by thaddeus 5542 days ago | link

I know you'd rather have code exposing the problem, but I don't have the environment in front of me or the cycles to set it up.

So... Bob triggers a process at a time-stamp of 1. The process takes some input arguments and the current value of a record in a data-structure, as additional input arguments, then builds upon it. The process takes a time-length of 4. Afterwards it stores the result back into the record.

Alice triggers the same process on the same record at time-stamp 2 only her process takes a time-length of 1.

Using Arc the end data structure will contain Bobs change, an incorrect value. It should have done Bobs process, then using it's result as an input argument for Alice's process to produce the final result.

The reason why I am not writing a function for you is that these are common problems/scenarios that require complex setup. They are common enough that we have databases requiring ACID compliance to overcome them.

There are a variety of business examples that would expose these problems. ie ROFR's (right of first refusal algorithms): where processes grab data as input args then allow user checks/processes, yet rights are granted at the initial time of execution.

I know these things could be accomplished in arc, but I believe you're going to be writing a lot implementation specific code, that would just get in your way. You're going to write much more than 'dosync' and you're likely going to have to troubleshoot that code more often than you would like. No one writes 'correct' code all the time.

I also wonder why arc avoids it. It seems like arc is going out of it's way to bypass immutable data-structures rather than build upon them and progress to offer convenient features like built in transaction semantics.

^_^

-----

1 point by akkartik 5541 days ago | link

I think arc's trying to avoid multiple processes. If you need them this isn't the right tool. But I think you can stretch arc pretty far without needing multiple processes.

"..convenient features like builtin transaction semantics."

To my jaundiced eye a feature that requires a lot of cycles of setup doesn't seem very convenient ;) It may well be impossible to do without, and transaction semantics may be useful because they make your life easier, but they aren't convenient.

It's hard to be a hacker's language with such advanced/complicated/complex features. Arc's approach seems to be, "if it's complicated throw it out." Hackers will find a way to live within its limitations. And we often do.

-----

4 points by thaddeus 5541 days ago | link

> If you need them this isn't the right tool.

I agree. I was never intending to use arc for doing so, I was just asking about the rationale of going against scheme and other functional programming languages evolution. Just seeing what others think.

> ... feature that requires a lot of cycles of setup doesn't seem very convenient.

I was referring to crafting an example problem using Arc, to show the value of the feature - not the feature itself. The feature implementation is dead simple and convenient.

> It's hard to be a hacker's language with such advanced/complicated/complex features. Arc's approach seems to be, "if it's complicated throw it out."

Depends on how you view it. I'm not a language junkie, but I value languages that have a good balance between power, efficiency, and usability. Arc provides this. So does Clojure. Clojure has more features. You see power in arc, you understand all of its' minimalistic feature sets. I see feature sets provide power simply because they exist - and they still they have a very low complexity requirement to achieve a given result. (ie . wrapping an update statement in a 'dosync' is pretty much free of any complexity yet you get a lot in return). That, to me, is valuable. That to me is power.

It's funny, I'm starting to notice some rosy glasses around here. No offense and this isn't just from your comment, but I get the impression if a feature gets implemented in arc, then arc users see the power arc provides and say "look how cool arc is - its powerful". Yet if someone says language "x" has a cool feature that arc doesn't have then all of the sudden it's a "framework", "cruft" or extra complexity that's not needed anyway....Really? I don't buy it. These other languages are doing great things too.

> Hackers will find a way to live within its limitations. And we often do.

Honestly - I think arc users live within its limitations simply because there's really no guiding force or buy in from either the author or a large enough community to evolve it's feature sets at even a reasonable pace. That's not to criticize everyone's contributions, but rather a relative experiential comparison.

-----

2 points by akkartik 5541 days ago | link

"not the feature itself. The feature implementation is dead simple and convenient."

But the implementation is irrelevant! What matters is how easy the feature is to use. [Insert rant about how 37signals confuses interface complexity with implementation complexity. Implementations should be complicated. http://www.joelonsoftware.com/items/2010/08/19.html]

---

I don't think I have rose-colored glasses at all. I didn't mean to imply that transactions aren't useful, or that convenience is all. We all use other languages here. My point is this: if you want to get stuff done, don't go dreaming of concurrency and transactional semantics in arc. It's just too hard, PLT's foundations are too crappy from that perspective. It would be a waste of your time.

Arc is a place to experiment on how far we can go with a minimal software stack. I wouldn't use it for other kinds of experiments.

---

Update: Rereading the thread, I suspect I'm saying nothing you don't know, except perhaps that I agree with you. Your core question seems to be, "Why do we not like immutable when PLT seems to like it." To that I have no intelligent response.

-----

2 points by aw 5540 days ago | link

PLT's foundations are too crappy

Before I worked on improving the port of Arc to PLT 4 (the bit about using MzScheme's mutable pairs to implement Arc lists instead of whacking immutable pairs with pointers), I spent some time looking around to see if there was another runtime I'd want to use instead. Because the Arc runtime is less than a thousand lines of code, it would be easy to move it to something else if it were better, and I wouldn't want to take the time improving the port to PLT 4 if I were going to use something else instead.

PLT has some weirdness such as for example needing to use custodians to forcibly close socket ports. But in terms of a runtime that supports concurrency AND call-with-current-continuation AND dynamic variables ("parameters") that work in the presence of continuations and threads and exceptions AND a rich set of synchronization primitives AND tails calls AND is reliable AND is fast... if there's anything better than PLT, I sure would like to know about it, so that I can start using it!

-----

1 point by garply 5540 days ago | link

Have you looked at Termite (that is, Gambit)? It seems to support continuations well - continuations are even serializable (along with ports, threads, etc) - and I think concurrency is handled exceptionally elegantly (built-in message-passing a la Erlang).

I never got many chances to use such exotic features, but I believe the combination of message passing and serializable continuations would allow you to do fun stuff like easily transfer code execution between different computers.

My limited experience suggested it was fast. It's also compilable to C. I suspect tail calls are properly handled - doesn't seem like something Feeley would let slide. I'm not sure about reliability.

I stopped using it because of lack of documentation and poor library support.

-----

1 point by akkartik 5539 days ago | link

I was probably overstating things, and certainly overstating my knowledge of them. I was focusing entirely on message-passing concurrency, like in Erlang. 'Crappy' was a reference not to a specific runtime but to the threads model in general and how errorprone it is.

What do you think of SBCL? I get the sense it is faster. Is that not true?

-----

1 point by aw 5539 days ago | link

re message-passing concurrency in PLT, PLT has both message-passing (either synchronous channels or asynchronous buffered channels, your choice) and concurrency built in as a standard part of the language. Second, even if PLT didn't have that, you can easily implement "message-passing concurrency" on top of a threads model.

As for SBCL, I haven't seen a comparison, so I don't know. Common Lisp doesn't have call-with-current-continuation so Arc's ccc wouldn't be fully supported, but I don't know if anyone's Arc programs use full continuations so it may not matter.

-----

1 point by akkartik 5539 days ago | link

I was wondering about runtime quality without reference to arc or specific features.

From what people say Erlang has a really high-performance fine-grained message-passing implementation. PLT has the primitives, but not nearly the same runtime quality. Is that mistaken?

-----

1 point by aw 5538 days ago | link

"not nearly the same runtime quality"

Which means what exactly? Slow? Crashes a lot? Doesn't sparkle?

Do you have a single shred of evidence for these claims, whatever they are? Do those "people"? Or is this just all blabble?

If you actually want to know if Erlang has a faster message-passing implementation than PLT you don't need to compare what I say with what "people" say. Write a program in Erlang and PLT and see which one is faster.

-----

1 point by akkartik 5537 days ago | link

Ok. You seem to be saying PLT is the best thing for everyone, and that nobody should use anything else. I'll just assume something's being lost in translation, and shut up. It's not a very productive topic anyway.

-----

1 point by aw 5537 days ago | link

You seem to be saying PLT is the best thing for everyone

Sigh. If you say the Earth is round, and I ask "what's your evidence?", that does not mean that I'm claiming the Earth is flat.

-----

1 point by akkartik 5537 days ago | link

You don't just ask, you ask using words like "babble" and "shred of evidence". Is that necessary? It sounds defensive of PLT. But perhaps you think my argument is poor. So let me try to clarify.

You seem to be saying reputations are irrelevant. They're a shortcut. They're for when the program I'd have to write to test some property would require a week to months of work. If you want to ignore reputations, be my guest. But reputation is how you found out about PLT in the first place. You didn't run some idealized experiment comparing every single platform in existence for precisely the nuanced characteristics of the program you were planning to write.

I hear lots of people online touting Erlang as high performance at hundreds of cores, and highly fault tolerant. Have you not heard this? Do I need to cite chapter and verse for things I assume people are aware of? I don't hear anybody saying similar things for PLT. Am I just not listening in the right places? Feel free to cite evidence yourself.

I use PLT and arc because of programming convenience. I don't spend much time thinking about how good the runtime quality is because it's frankly not that important to me. It doesn't give me much confidence when I see 12 segfaults a day on my readwarp server[1]. So yes, it does 'crash a lot'[2]. In this very thread you're switching away from using immutable cons pairs because they seem to be not working as advertised. Is that not 'evidence'?

You've said things like "if there's anything better than PLT.." so it sounds like you're thinking in terms of an absolute ordering that you can stack language/platform combinations on. Surely the ordering is in the context of your use case. Or would you write the linux kernel in PLT scheme? Based on what you know now? How would you scale news.arc to two servers? I don't care that PLT 'has message passing'; if I wrote something that required lots of machines I know I would switch to Erlang or C, and put up with crappy syntax. You can't have the best programming environment and the best runtime - because no single community focuses equally on both.

I was careless in my phrasing of 'some people'. Sorry. I didn't expect to get pounced on, and I didn't mean for the conversation to get yet more noisy. It takes two to do that.

[1] http://arclanguage.org/item?id=11355

[2] I don't know if I'm doing something wrong. There's no FFI stuff going on.

-----

1 point by aw 5536 days ago | link

I apologize. I seem to have been especially grumpy in the past few days for some reason, and I regret it came out at you...

-----