INDEX
Explanations
phrases related to avoiding danger or getting out of a situation
Negative Logits
Token      Logit
anlı       -0.14
ogra       -0.14
лет        -0.14
804        -0.14
chine      -0.13
bakan      -0.13
lav        -0.13
azio       -0.13
ardır      -0.13
icators    -0.13
Positive Logits
Token      Logit
Dodge      0.29
dodge      0.24
alive      0.24
alive      0.23
here       0.23
Alive      0.22
Alive      0.20
ensa       0.19
town       0.19
there      0.19
Activations Density 0.019%