
Are there any major conceptual differences to Theano? Not that I wouldn't appreciate a more polished, well funded competitor in the same space.

It looks like using TensorFlow from Python will feel quite familiar to a Theano user, starting with the separation of graph building and graph running, and extending down into the details of how variables, inputs, and 'givens' (called feed dicts in TensorFlow) are handled.
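To make the parallel concrete, here is a rough pure-Python sketch of the build-then-run pattern both libraries share. The class and function names are made up for illustration; this is not either library's actual API.

```python
class Placeholder:
    """A symbolic input: it has no value until run time."""
    pass

class Add:
    """A symbolic node representing the sum of two sub-graphs."""
    def __init__(self, a, b):
        self.a, self.b = a, b

def run(node, feed_dict):
    """Walk the graph, substituting fed values for placeholders."""
    if isinstance(node, Placeholder):
        # like TensorFlow's feed_dict or Theano's givens
        return feed_dict[node]
    if isinstance(node, Add):
        return run(node.a, feed_dict) + run(node.b, feed_dict)
    return node  # a plain constant

# Phase 1: build the graph (no arithmetic happens here).
x = Placeholder()
y = Add(Add(x, x), 1)

# Phase 2: run it, feeding concrete values for the placeholders.
print(run(y, {x: 3}))  # -> 7
```

The point is that `y` is a description of a computation, not a result; the real libraries hand such a description to an optimizing backend before running it.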



I'm looking at the RNN implementation right now (https://github.com/tensorflow/tensorflow/blob/master/tensorf...). It looks like the loop over the time frames is actually in Python itself.

    for time, input_ in enumerate(inputs): ...
This confuses me a bit. Maybe the sequence length is not symbolic but must be fixed in advance.

I also haven't seen a theano.scan equivalent, though one isn't needed in many cases where you know the shape in advance.
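For readers who don't know Theano: `scan` runs a step function over a sequence inside the compiled graph, so the loop count can be symbolic. A rough pure-Python sketch of what it computes (the names here are illustrative, not the real Theano signature):

```python
def scan(step_fn, sequence, outputs_info):
    """Apply step_fn along sequence, threading an accumulator through
    and collecting every intermediate output -- the essence of a
    recurrence like an RNN time loop."""
    state = outputs_info
    outputs = []
    for x in sequence:
        state = step_fn(x, state)
        outputs.append(state)
    return outputs

# Cumulative sum as a recurrence, the classic scan example:
print(scan(lambda x, acc: acc + x, [1, 2, 3, 4], 0))  # -> [1, 3, 6, 10]
```

The difference from unrolling in Python is that Theano compiles this loop once, with the sequence length left symbolic, instead of building one graph node per time step.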


I think this loop actually still only builds the graph -- which is what `scan` would do. The computation still happens outside of Python. That is, in TensorFlow they perhaps don't need `scan` because a loop with repeated assignments "just works"... Let's try this:

It seems like in TensorFlow you can say:

    import tensorflow as tf
    sess = tf.InteractiveSession()  # magic incantation

    state = init_state = tf.Variable(1)  # initialise a scalar variable

    states = []
    for step in range(10):
        # each iteration adds a new `add` node to the graph,
        # rather than updating `state` in place:
        state = tf.add(state, state)
        states.append(state)

    sess.run(tf.initialize_all_variables())
At this point, `states` is a list of symbolic tensors. Now, if you query their values:

    print sess.run(states)
    # => [2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
you get what you would naively expect. I don't think that would work in Theano. Cool.


Why wouldn't this work in Theano?

    >>> import theano
    >>> import theano.tensor as T
    >>> state = theano.shared(1.0)
    >>> states = []
    >>> for step in range(10):
    >>>     state = state + state
    >>>     states.append(state)
    >>> 
    >>> f = theano.function([], states)
    >>> f()
    [array(2.0),
     array(4.0),
     array(8.0),
     array(16.0),
     array(32.0),
     array(64.0),
     array(128.0),
     array(256.0),
     array(512.0),
     array(1024.0)]


Thanks! When I tried this before, I thought compilation was stuck in an infinite loop and gave up after about a minute. But you're right, it works. Though on my machine, this took two and a half minutes to compile (ten times as long as compiling a small convnet). For 10 recurrence steps, that's weird, right? And the TensorFlow thing above runs instantly.


Agreed. Theano has trouble dealing efficiently with very deeply nested graphs.


You're right. There is not currently a theano.scan equivalent that dynamically loops over a dimension of a tensor.

That said, you can do a lot with truncated BPTT and LSTM. See the sequence modeling tutorial on tensorflow.org for more details.
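A hedged sketch of truncated BPTT's outer loop, to show why a dynamic-length `scan` often isn't needed: the long sequence is cut into fixed-length chunks, and the hidden state (but not its gradient) is carried across chunk boundaries. `rnn_chunk` here is a made-up stand-in for one unrolled forward pass, not any real API.

```python
import math

def rnn_chunk(chunk, state, w=0.5):
    # toy recurrence: state <- tanh(w * state + x) for each frame
    for x in chunk:
        state = math.tanh(w * state + x)
    return state

def truncated_bptt(sequence, num_steps, init_state=0.0):
    state = init_state
    for start in range(0, len(sequence), num_steps):
        chunk = sequence[start:start + num_steps]
        # In a real trainer, backprop would happen here, truncated at
        # the chunk boundary; only the state value is carried forward.
        state = rnn_chunk(chunk, state)
    return state

print(truncated_bptt([0.1] * 100, num_steps=20))
```

The forward computation is identical to processing the whole sequence at once; only the gradient flow is cut at chunk boundaries, so a fixed `num_steps` unroll suffices.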



