INDEX
Explanations
statistics and figures
expressions related to political events or groups
Negative Logits
Token     Logit
..."      -0.64
Allaah    -0.55
-->       -0.55
.</       -0.54
...)      -0.54
`.        -0.54
.","      -0.54
Allah     -0.53
``(       -0.52
.<        -0.51
Positive Logits
Token      Logit
meanwhile  0.66
odore      0.59
endum      0.52
Jacobs     0.48
ccording   0.47
irony      0.47
however    0.46
Lauder     0.46
responded  0.46
aback      0.45
Activations Density: 1.792%