Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> I'd very much like to see zero-cost abstractions in a new language, so that there is no surprise differences between various ways of iterating over collections.

I never got around to writing in detail about it, but I have that goal with iteration in Wren, and came up with a little iteration protocol[1] that I think accomplishes it.

In most languages a loop structure like:

    for (i in sequence) {
      print(i);
    }
Desugars to something like:

    {
      var __iterator = sequence.getIterator()
      while (__iterator.moveNext()) {
        var i = __iterator.getValue()
        print(i)
      }
    }
You've got one fundamental piece of state: a cursor that tracks where you currently are in the sequence. This is separate from the sequence itself because you can iterate over the same sequence multiple times, sometimes even simultaneously.

And you have three fundamental operations you need to perform:

1. Produce a new cursor pointing to the beginning of the sequence.

2. Advance the cursor and determine if we've reached the end.

3. Get the value at the current cursor position.

In most languages, the last two operations are defined as methods on the cursor itself. That means for any given sequence, it needs to define a cursor type specific to that sequence (and storing a reference to it), and you have to allocate that object at runtime, mainly so that you can use it as a dispatch token to find its implementations of those two operations.

That's kind of lame. In Wren, it desugars a little differently:

    {
      var __iterator = null 
      while (__iterator = sequence.iterate(__iterator)) { 
        var i = sequence.iteratorValue(__iterator) 
        print(i) 
      }
    }
The first two operations are combined into a single iterate() method that uses a "null" parameter to mean "produce a new cursor" and any other parameter to mean "advance". If it returns null or false, we're done with the sequence. Otherwise, it returns a cursor pointing to the next element in the sequence.

To get the value at the current cursor position, we call iteratorValue() on the sequence passing the cursor in.

This means the cursor object is pure state—it's just a bit of data that gets passed repeatedly to the sequence. The sequence itself has the dynamic dispatch of the methods for its own iteration protocol.

What this means in practice is that for simple things like lists, you can use a dead simple cursor type: a number.

The implementation of the protocol for lists looks pretty much like this (except its actually in C in Wren):

    class List {
      iterate(iterator) {
        // Start the at first element.
        if (iterator == null) return 0

        // Stop at the end.
        if (iterator >= count) return null

        // Advance.
        return iterator + 1
      }

      iteratorValue(iterator) {
        return this[iterator]
      }
    }
Since numbers are unboxed in Wren, it means many sequences can be iterated without any dynamic allocation at all. It makes it easier to implement the iteration protocol because you don't always have to define a class that stores a reference back to your sequence.

[1]: http://wren.io/control-flow.html#the-iterator-protocol



How do you unbox something that could be null? Why not have a separate method on the sequence to define the initial value?


> How do you unbox something that could be null?

Wren is dynamically typed, so you don't really have to think about explicit boxing or unboxing.

> Why not have a separate method on the sequence to define the initial value?

Yes, I am definitely considering that. There's no real reason why that needs to be combined with the method to advance, and I'm honestly not even sure why I originally designed it that way.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: