INDEX
Explanations
Twitter usernames or handles
usernames or handles in Twitter posts
Negative Logits
LIMITED        -0.76
Enabled        -0.73
diapers        -0.67
notebooks      -0.66
fingerprints   -0.66
Islamists      -0.63
striking       -0.62
Lisp           -0.62
Eighth         -0.62
AFB            -0.61
Positive Logits
veyard         0.92
izon           0.88
CBC            0.86
noon           0.85
rentice        0.84
anie           0.83
DonaldTrump    0.82
Anonymous      0.81
news           0.81
official       0.79
Activations Density: 0.090%