The library seems to work more like a dictionary lookup for swear words. For example, it correctly tags "fucking idiot" as negative, but it also tags "fucking awesome!" as negative. Maybe the training set's features were unigrams?
From my point of view, this happens because of the learning algorithm the library uses. Because the text is tokenized into individual words, "fucking" gets a very high probability of being profane, since it is profane in any context. For example, you cannot say "fucking awesome!" in a professional environment. If you added "fucking awesome!" to clean_data.csv, you would label it 1 (profane), not 0 (not profane).
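To make the point concrete, here is a minimal sketch of how word-level (unigram) features behave. This is not the library's actual code or data: the toy training set, the helper names (`tokenize`, `unigram_rates`, `predict`), and the averaging rule are all assumptions made up for illustration. It only shows that when each word is scored on its own, a word that always appears in profane examples pulls any phrase containing it toward the profane label.

```python
from collections import Counter

# Hypothetical toy training set (NOT the library's data): 1 = profane, 0 = not profane.
TRAIN = [
    ("fucking idiot", 1),
    ("fucking awesome", 1),
    ("fucking terrible", 1),
    ("awesome work", 0),
    ("great idea", 0),
    ("terrible weather", 0),
]


def tokenize(text):
    """Lowercase, drop exclamation marks, split on whitespace (unigram features)."""
    return text.lower().replace("!", "").split()


def unigram_rates(train):
    """For each word, the fraction of training texts containing it that are profane."""
    seen = Counter()
    profane = Counter()
    for text, label in train:
        for tok in set(tokenize(text)):
            seen[tok] += 1
            profane[tok] += label
    return {tok: profane[tok] / seen[tok] for tok in seen}


def predict(text, rates, threshold=0.5):
    """Average the per-word rates; words never seen in training count as 0.0."""
    toks = tokenize(text)
    if not toks:
        return 0
    score = sum(rates.get(t, 0.0) for t in toks) / len(toks)
    return 1 if score >= threshold else 0


RATES = unigram_rates(TRAIN)

print(RATES["fucking"])                      # word-level score, independent of neighbours
print(predict("fucking idiot", RATES))       # labelled profane
print(predict("fucking awesome!", RATES))    # also labelled profane
print(predict("awesome work", RATES))        # labelled not profane
```

In this sketch, "fucking" only ever appears in examples labeled 1, so its word-level score is as high as it can be, and the word next to it cannot change that. Whether this is a bug depends on the labeling: if the task is profanity detection rather than sentiment, labeling "fucking awesome!" as 1 is the intended result.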