Reconnecting to my original point way up-thread: these "innovations" have not substantially expanded the types of models we are capable of expressing (though they have certainly expanded the size of the models we can train), not nearly to the degree that backprop, convnets, and LSTMs did decades ago. This matters because AGI will require several more expansions in the types of models we are capable of implementing.
Right, and the LSTM was invented 20 years ago. Twenty years from now, the great new thing will be something published today. It takes time for new innovations to gain popularity and find their uses. That doesn't mean innovations aren't being made!
Today lots of people -- ones with even less background, putting in less effort -- try and succeed.
This is not a small change, even if it is the product of small changes.