Explanations
prepositions and determiners indicating location or direction
Negative Logits
oko       -0.17
mpar      -0.15
Cout      -0.15
iggins    -0.15
ilyn      -0.14
pery      -0.14
466       -0.14
rie       -0.14
Äįi       -0.14
è©        -0.14
Positive Logits
oken      0.16
pedia     0.16
AZY       0.16
_PRINTF   0.15
rena      0.15
blr       0.15
imos      0.14
utin      0.14
gezocht   0.14
Emer      0.14
Activations Density: 0.001%