"To develop robots, you have two options: You can either simulate an environment and robot with software and hope the results are accurate enough that you can load it into a machine and watch it walk."
This is much harder than you'd think.
Here's a fun story from 15 years ago when a friend of mine tried to do some simple AI:
His goal was to have a humanoid shape, created in software, learn how to walk using simple AI. The idea was that it would obey some basic physical laws (gravity, the limits of its joints, etc.), do something random, check whether or not it was closer to the goal of walking, tweak its parameters, and iterate. The chosen goal was not to fall over.
He set up the program and let it run over the weekend, giving it millions of tries. The hope was that when he came back to the office it would have learned how to walk, or at least to stand up without falling.
His disappointment was huge when he came into the office: The simulated robot was sitting down with its knees bent, thus having achieved the goal of not falling over.
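The loop in that anecdote can be sketched in a few lines. This is my own toy model, not the friend's program: the "robot" is reduced to a single stance-height parameter, with a made-up rule that taller stances topple more often. Random hill-climbing on the reward "don't fall over" then converges on crouching rather than walking:

```python
import random

def falls(height, rng):
    # Toy physics: in this model, taller stances are more likely to topple.
    return rng.random() < height

def train(iterations=10_000, seed=0):
    rng = random.Random(seed)
    height = 0.9           # start roughly upright
    best_score = -1.0
    for _ in range(iterations):
        # Do something random: nudge the stance parameter.
        candidate = min(1.0, max(0.0, height + rng.gauss(0, 0.05)))
        # Check against the goal: fraction of 20 trials without falling.
        score = sum(not falls(candidate, rng) for _ in range(20)) / 20
        # Keep the tweak if it does at least as well.
        if score >= best_score:
            height, best_score = candidate, score
    return height

print(train())  # typically ends up near 0: it "learns" to sit, not walk
```

Nothing here rewards forward motion, so the degenerate solution is not a bug in the optimizer; it is the optimum of the stated goal.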
I've heard a similar story about small self-driving cars.
The cars would drive around using a random algorithm, then copy and tweak the algorithm of the longest running car when they crashed. The researcher left the room to let the cars work, only to come back and find that each of the cars had deduced that the perfect solution was to remain perfectly still. After all, if they didn't move, they couldn't crash!
That's like the story of a game-learning program that found the best way to "not lose" at Tetris was to pause right before a brick extended above the top of the level, and leave it paused indefinitely.
Funny anecdotes, but "artificial intelligence" (a term people misuse when what they really mean is task optimization) requires setting the right goals. You know, like moving from point A to point B.
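That point about goals can be made concrete with a toy version of the car anecdote (my own sketch, not the researchers' setup): a "car" picks a cruising speed, faster speeds crash more often, and we compare two reward functions. Rewarding "don't crash" makes standing still optimal; rewarding distance covered makes the car move:

```python
import random

def episode(speed, rng):
    # Toy model: crash risk grows with the square of speed.
    crashed = rng.random() < speed ** 2
    distance = 0.0 if crashed else speed
    return crashed, distance

def best_speed(reward, trials=200, seed=1):
    rng = random.Random(seed)
    candidates = [i / 20 for i in range(21)]   # speeds 0.0 .. 1.0
    def avg(s):
        return sum(reward(*episode(s, rng)) for _ in range(trials)) / trials
    return max(candidates, key=avg)

# Goal "don't crash": never moving is the optimum (speed 0.0).
print(best_speed(lambda crashed, dist: 0.0 if crashed else 1.0))

# Goal "get from A to B": reward distance, and a positive speed wins.
print(best_speed(lambda crashed, dist: dist))
```

Same optimizer, same toy world; only the reward changed, and the degenerate behavior disappears.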
Yes, seems like more anecdote than reality. In reality, you wouldn't just 'code an AI', leave it running for the weekend, and then act all surprised when it has bugs in it. You'd work your way through, e.g., Sutton's tasks: drive a car up a hill (https://en.wikipedia.org/wiki/Mountain_Car), try not to fall off a cliff (http://webdocs.cs.ualberta.ca/~sutton/book/ebook/node65.html), balance an arm (http://webdocs.cs.ualberta.ca/~sutton/book/ebook/node110.htm...), and so on. And since these mostly learn really quickly, and one's initial implementation will be buggy, you wouldn't go away for a weekend and leave it running; you'd just sit there running it for a minute or two, fixing bugs, running again, and so on.
Fortunately, this is not an issue in reinforcement learning any more. It's quite simple to train humanoids in simulators to walk. But that knowledge is completely untransferable to the real world.