Explanations
references to political strategies and criticisms of opposition
Negative Logits
_TestCase    -0.17
ÑijÑĢ         -0.15
ênh          -0.15
icher        -0.14
atz          -0.14
zeÅĦ         -0.14
isOk         -0.14
éģº          -0.14
еÑĢин        -0.14
uml          -0.13
Positive Logits
vulnerabilities  0.19
vulnerable       0.18
lose             0.16
Vulner           0.16
vulner           0.16
challenged       0.16
exposed          0.15
fall             0.15
fail             0.15
loses            0.15
Activations Density: 0.287%