INDEX
Explanations
actions involving movement and transition
Negative Logits
overn     -0.15
lant      -0.15
DEX       -0.14
ibling    -0.14
ingu      -0.14
ige       -0.13
itchen    -0.13
ep        -0.13
wyn       -0.13
bach      -0.13
Positive Logits
&type     0.16
.fm       0.15
дин       0.15
ean       0.14
adle      0.14
æĦı       0.14
ãĥ¥       0.14
chio      0.14
onus      0.14
.walk     0.13
Activations Density: 0.232%