Explanations
Any mention of a specific organization looking for publicity or engaging in political activities
references to political organizations and campaigns
Negative Logits
ミ          -0.80
icipated    -0.74
iverpool    -0.69
atars       -0.68
urchase     -0.66
エル        -0.66
eatures     -0.65
byss        -0.65
ierrez      -0.65
uilt        -0.65
Positive Logits
..."        1.39
…"          1.27
.")         1.25
[           1.19
,'"         1.18
['          1.17
."[         1.11
..."        1.10
.'"         1.09
.''         1.07
Activations Density 0.898%