INDEX
Explanations
comments or messages written by users expressing their opinions
phrases regarding user comments and participation on social media
Negative Logits
headquartered    -0.74
subord           -0.67
hart             -0.66
sac              -0.66
enshr            -0.65
deregulation     -0.63
treaties         -0.63
negotiating      -0.63
Spending         -0.63
³³³³³³³³         -0.63
Positive Logits
Redditor         1.10
commenters       1.07
                 1.04
                 0.99
username         0.98
feedback         0.96
ickr             0.93
estamp           0.85
commenter        0.85
submissions      0.83
Activations Density 0.656%