# remove sparse terms
all.tdm.75 <- removeSparseTerms(all.tdm, 0.75) # 3117 / 728215
What does that mean? Does he remove words that are only said once or twice?
Can anyone point me to a text explaining the difference between identifying characteristic words using log-likelihood and using tf-idf?