INDEX
Explanations
mentions of Israel and its military actions
Negative Logits
bury      -0.17
antu      -0.16
ogle      -0.16
utsche    -0.15
.TR       -0.15
ìĦľëĬĶ    -0.15
chten     -0.15
باÙĦ      -0.15
ansom     -0.15
.jet      -0.14
Positive Logits
608       0.17
å¡        0.16
hone      0.15
AVED      0.14
ancel     0.14
itet      0.14
/lang     0.13
LD        0.13
Shank     0.13
Solic     0.13
Activations Density: 0.012%