Explanations
keywords and phrases related to violence or conflict
Negative Logits
Bowman          -0.16
losing          -0.15
lost            -0.15
jas             -0.14
U+FE0F          -0.14
LOSE            -0.14
lost            -0.14
젠              -0.14
antal           -0.13
losing          -0.13
Positive Logits
remained        0.20
remain          0.18
Remain          0.17
olest           0.16
bleibt          0.16
stayed          0.16
Remaining       0.16
остат           0.15
остан           0.15
simultaneously  0.15
Activations Density: 0.016%