INDEX
Explanations
references to chemical warfare and its implications
NEGATIVE LOGITS
getNode      -0.14
درÛĮ         -0.14
statt        -0.14
çŁ¿          -0.13
IENTATION    -0.13
_PRIV        -0.13
.literal     -0.13
iron         -0.13
bero         -0.13
νομ          -0.12
POSITIVE LOGITS
sar          0.34
nerve        0.32
biological   0.31
ric          0.28
chemical     0.28
VX           0.28
mustard      0.28
Biological   0.28
agents       0.26
agent        0.26
Activations Density 0.027%