Explanations
instances of the word "up" indicating movement or progression
Negative Logits
Token     Logit
INGLE     -0.15
oval      -0.15
unn       -0.14
ingle     -0.14
lettes    -0.14
nb        -0.14
innen     -0.13
uv        -0.13
gons      -0.13
erde      -0.13
Positive Logits
Token     Logit
roke      0.17
elo       0.17
acier     0.15
CONF      0.14
swing     0.14
鹿        0.14
isify     0.14
UTE       0.14
iese      0.14
erty      0.13
Activations Density: 0.012%