INDEX
Explanations
phrases related to escaping or fleeing from danger
Negative Logits
locker    -0.18
šit       -0.18
yen       -0.15
iams      -0.15
quirer    -0.14
sus       -0.14
екси      -0.14
xious     -0.14
hei       -0.14
hud       -0.14
Positive Logits
into       0.20
khỏi       0.20
hatch      0.19
Hatch      0.19
detection  0.18
into       0.18
notice     0.18
per        0.16
pell       0.15
reality    0.15
Activations Density 0.024%