Explanations
statements or actions by political figures
statements or actions taken by individuals in positions of authority or influence
Negative Logits
venge         -0.56
stub          -0.55
offspring     -0.53
otin          -0.52
physical      -0.52
scaven        -0.51
respective    -0.51
rafted        -0.50
1966          -0.49
tag           -0.49
Positive Logits
doms          0.73
forcefully    0.60
Mattis        0.60
CNBC          0.60
İĭ            0.59
Talks         0.57
scathing      0.57
bluntly       0.57
ozy           0.56
Amnesty       0.55
Activations Density: 0.718%