INDEX
Explanations
references to political figures and actions
statements or discussions about political controversies
Negative Logits
Token       Logit
)",         -0.69
');         -0.69
analyse     -0.64
)"          -0.63
());        -0.62
)."         -0.62
Firstly     -0.61
.",         -0.61
"},         -0.60
Whilst      -0.60
Positive Logits
Token         Logit
etheless      0.93
similarly     0.84
downright     0.84
decidedly     0.82
Gorsuch       0.78
Canaver       0.78
bona          0.75
nonetheless   0.73
willfully     0.72
DeVos         0.72
Activations Density 2.016%
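
The density figure above is presumably the share of sampled tokens on which this feature fires. A minimal sketch of that calculation follows, assuming density means the fraction of tokens with activation above a threshold. The function name `activation_density`, the `threshold` parameter, and the sample values are hypothetical and not taken from the dashboard.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of sampled tokens on which the
    feature is active; the dashboard does not state its exact definition.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical per-token activations for one feature (illustrative only).
sample = [0.0, 0.0, 1.2, 0.0, 0.4, 0.0, 0.0, 2.1]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 2.016% would mean the feature is active on roughly one in fifty sampled tokens.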