I like this because it describes the structure of an LSTM cell fully and precisely, in a way that mostly avoids ML/stats jargon.
From this article I can easily and correctly read off the model architecture it describes as a composition of smooth maps on modules over the real numbers, which is more than one can say for a lot of papers.
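
To make "a composition of smooth maps" concrete, here is a sketch of the usual LSTM cell equations. This assumes the standard variant without peephole connections; the symbols (the weight matrices W and U, the biases b, the input dimension n, and the hidden dimension m) are my own choices for illustration and may not match the article's notation.

```latex
% One step of an LSTM cell, viewed as a map
%   F : R^n x R^m x R^m -> R^m x R^m,   (x_t, h_{t-1}, c_{t-1}) |-> (h_t, c_t).
% Assumed shapes: W_* in R^{m x n}, U_* in R^{m x m}, b_* in R^m.
% \sigma is the logistic sigmoid, applied componentwise; \odot is the
% componentwise (Hadamard) product.
\begin{align*}
  f_t         &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
  i_t         &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
  o_t         &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
  \tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c)  && \text{(candidate cell state)} \\
  c_t         &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
  h_t         &= o_t \odot \tanh(c_t)
\end{align*}
```

Each line is built from affine maps, the componentwise sigmoid and tanh, and the componentwise product, all of which are smooth, so the whole step is a smooth map on finite-dimensional real vector spaces.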