
Are there any major conceptual differences to Theano? Not that I wouldn't appreciate a more polished, well funded competitor in the same space.

It looks like using TensorFlow from Python will feel quite familiar to a Theano user, starting with the separation of graph building and graph running, and extending down into the details of how variables, inputs, and 'givens' (called feed dicts in TensorFlow) are handled.
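To make the parallel concrete, here is a rough pure-Python sketch of the build-then-run pattern both libraries share. The class and function names are made up for illustration; this is not either library's actual API.

```python
class Placeholder:
    """A symbolic input: it has no value until run time."""
    pass

class Add:
    """A symbolic node representing the sum of two sub-graphs."""
    def __init__(self, a, b):
        self.a, self.b = a, b

def run(node, feed_dict):
    """Walk the graph, substituting fed values for placeholders."""
    if isinstance(node, Placeholder):
        # like TensorFlow's feed_dict or Theano's givens
        return feed_dict[node]
    if isinstance(node, Add):
        return run(node.a, feed_dict) + run(node.b, feed_dict)
    return node  # a plain constant

# Phase 1: build the graph (no arithmetic happens here).
x = Placeholder()
y = Add(Add(x, x), 1)

# Phase 2: run it, feeding concrete values for the placeholders.
print(run(y, {x: 3}))  # -> 7
```

The point is that `y` is a description of a computation, not a result; the real libraries hand such a description to an optimizing backend before running it.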



I'm looking at the RNN implementation right now (https://github.com/tensorflow/tensorflow/blob/master/tensorf...). It looks like the loop over the time frames is actually in Python itself.

    for time, input_ in enumerate(inputs): ...
This confuses me a bit. Maybe the sequence length is not symbolic but must be fixed in advance.

I also haven't seen a theano.scan equivalent, though one isn't needed in many cases where you know the shape in advance.
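For readers who don't know Theano: `scan` runs a step function over a sequence inside the compiled graph, so the loop count can be symbolic. A rough pure-Python sketch of what it computes (the names here are illustrative, not the real Theano signature):

```python
def scan(step_fn, sequence, outputs_info):
    """Apply step_fn along sequence, threading an accumulator through
    and collecting every intermediate output -- the essence of a
    recurrence like an RNN time loop."""
    state = outputs_info
    outputs = []
    for x in sequence:
        state = step_fn(x, state)
        outputs.append(state)
    return outputs

# Cumulative sum as a recurrence, the classic scan example:
print(scan(lambda x, acc: acc + x, [1, 2, 3, 4], 0))  # -> [1, 3, 6, 10]
```

The difference from unrolling in Python is that Theano compiles this loop once, with the sequence length left symbolic, instead of building one graph node per time step.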


I think this loop actually still only builds the graph -- which is what `scan` would do. The computation still happens outside of Python. That is, in TensorFlow they perhaps don't need `scan` because a loop with repeated assignments "just works"... Let's try this:

It seems like in TensorFlow you can say:

    import tensorflow as tf
    sess = tf.InteractiveSession()  # magic incantation

    state = init_state = tf.Variable(1)  # initialise a scalar variable

    states = []
    for step in range(10):
        # each iteration adds a new `add` node to the graph,
        # rather than updating `state` in place:
        state = tf.add(state, state)
        states.append(state)

    sess.run(tf.initialize_all_variables())
At this point, `states` is a list of symbolic tensors. Now, if you query their values:

    print sess.run(states)
    # => [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
you get what you would naively expect. I don't think that would work in Theano. Cool.


Why wouldn't this work in Theano?

    >>> import theano
    >>> import theano.tensor as T
    >>> state = theano.shared(1.0)
    >>> states = []
    >>> for step in range(10):
    >>>     state = state + state
    >>>     states.append(state)
    >>> 
    >>> f = theano.function([], states)
    >>> f()
    [array(2.0),
     array(4.0),
     array(8.0),
     array(16.0),
     array(32.0),
     array(64.0),
     array(128.0),
     array(256.0),
     array(512.0),
     array(1024.0)]


Thanks! When I tried this before, I thought compilation was stuck in an infinite loop and gave up after about a minute. But you're right, it works. Though on my machine, this took two and a half minutes to compile (ten times as long as compiling a small convnet). For 10 recurrence steps, that's weird, right? And the TensorFlow thing above runs instantly.


Agreed. Theano has trouble dealing efficiently with very deeply nested graphs.


You're right. There is not currently a theano.scan equivalent that dynamically loops over a dimension of a tensor.

That said, you can do a lot with truncated BPTT and LSTM. See the sequence modeling tutorial on tensorflow.org for more details.
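A hedged sketch of truncated BPTT's outer loop, to show why a dynamic-length `scan` often isn't needed: the long sequence is cut into fixed-length chunks, and the hidden state (but not its gradient) is carried across chunk boundaries. `rnn_chunk` here is a made-up stand-in for one unrolled forward pass, not any real API.

```python
import math

def rnn_chunk(chunk, state, w=0.5):
    # toy recurrence: state <- tanh(w * state + x) for each frame
    for x in chunk:
        state = math.tanh(w * state + x)
    return state

def truncated_bptt(sequence, num_steps, init_state=0.0):
    state = init_state
    for start in range(0, len(sequence), num_steps):
        chunk = sequence[start:start + num_steps]
        # In a real trainer, backprop would happen here, truncated at
        # the chunk boundary; only the state value is carried forward.
        state = rnn_chunk(chunk, state)
    return state

print(truncated_bptt([0.1] * 100, num_steps=20))
```

The forward computation is identical to processing the whole sequence at once; only the gradient flow is cut at chunk boundaries, so a fixed `num_steps` unroll suffices.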



