Very nicely written post. I particularly like how you attached a link to your codebase on repl.it so anyone who is interested can tinker with the code.
One thing I have been wondering for some time is whether a vanilla RNN can learn negations (i.e. 'not good' == 'bad') and valence shifts (e.g. modifier words like 'very', which carry no sentiment themselves but can amplify or dampen the sentiment of the words they modify; negation words like 'not' can be seen as a special case of valence shifter that inverts the sentiment of the following word).
My suspicion is that vanilla RNNs cannot model negations and valence shifters, since they infer the sentiment of a sentence by 'adding up' the sentiment connotations of its constituent words, whereas negations and valence shifts work more like multiplications than additions.
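To make the additive argument concrete, here is a toy bag-of-words scorer (a hypothetical stand-in for what I suspect the RNN effectively learns, with made-up per-word scores). Because 'not' can only contribute a fixed offset, no choice of scores lets it flip the sign of whatever word follows it:

```python
# Toy additive sentiment scorer; the per-word scores are illustrative assumptions.
scores = {'good': 1.0, 'bad': -1.0, 'not': 0.0, 'very': 0.0}

def additive_sentiment(sentence):
    # Sum the fixed score of each word -- the purely additive scheme.
    return sum(scores.get(w, 0.0) for w in sentence.split())

print(additive_sentiment('good'))      # 1.0
print(additive_sentiment('not good'))  # 1.0 -- same as 'good', which is wrong
```

Any nonzero score for 'not' merely shifts every sentence containing it by a constant, so 'not good' and 'not bad' move in the same direction; handling negation needs something multiplicative (or gated).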
I see you already have such examples in your dataset so I thought I'd do some experiments. I simplified your original dataset to the following:
train_data = {
    'good': True,
    'bad': False,
    'not good': False,
    'not bad': True,
    'very good': True,
    'very bad': False,
    'not very good': False,
    'not very bad': True
}
test_data = {
    'very not bad': True,
    'very not good': False
}
While the test cases do not reflect how people actually speak, the hope is that the model can transfer what it learned to infer their sentiment. In my runs, however, training failed to converge with the default parameter settings (hidden_size=64).
It would be interesting to see how other architectures (e.g. LSTMs, or non-recurrent models like Transformers) fare with negations and valence shifters.
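For anyone who wants to try the LSTM comparison, a minimal PyTorch sketch on this toy vocabulary might look like the following (the vocabulary, embedding size, and class names are my assumptions, not the post's actual code):

```python
import torch
import torch.nn as nn

# Hypothetical vocabulary for the toy dataset above.
vocab = {'good': 0, 'bad': 1, 'not': 2, 'very': 3}

class LSTMClassifier(nn.Module):
    def __init__(self, vocab_size=4, embed_dim=8, hidden_size=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, 2)  # two classes: negative / positive

    def forward(self, token_ids):
        # token_ids: (batch, seq_len); classify from the last time step's hidden state.
        h, _ = self.lstm(self.embed(token_ids))
        return self.out(h[:, -1])

model = LSTMClassifier()
ids = torch.tensor([[vocab[w] for w in 'not very good'.split()]])
logits = model(ids)  # shape (1, 2); feed to cross-entropy loss during training
```

The LSTM's multiplicative gates are exactly the kind of mechanism that could, in principle, implement a sign flip conditioned on having seen 'not'.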
P.S.: When computing softmax, it is better to use a built-in implementation or at least apply the log-sum-exp (max-subtraction) trick to prevent numerical overflow/underflow.
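Concretely, the trick is to subtract the maximum logit before exponentiating, which leaves the result mathematically unchanged but keeps exp() in a safe range:

```python
import numpy as np

def softmax(x):
    # Subtracting the max means the largest exponent is exp(0) = 1,
    # so exp() never overflows; the softmax value is unchanged because
    # the common factor exp(-max) cancels in the ratio.
    shifted = x - np.max(x)
    exps = np.exp(shifted)
    return exps / np.sum(exps)

softmax(np.array([1000.0, 1000.0]))  # stable; the naive version returns nan here
```

A naive `np.exp(x) / np.exp(x).sum()` overflows to inf for logits around 1000 and yields nan after the division.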