Explanations
references to slowness or gradualness in actions or descriptions
Negative Logits
alat    -0.16
chw     -0.16
laş     -0.16
546     -0.16
onte    -0.16
tá      -0.15
ulet    -0.15
iled    -0.15
819     -0.14
erap    -0.14
Positive Logits
poke      0.36
-motion   0.27
-paced    0.27
paced     0.26
est       0.25
ww        0.25
paced     0.23
inski     0.23
慢        0.23
-moving   0.23
Activations Density 0.017%