Explanations
the preposition "in" as an indicator of location
Negative Logits
Token     Logit
coon      -0.18
elman     -0.17
bao       -0.15
anson     -0.15
abra      -0.15
wish      -0.15
ropol     -0.15
FFE       -0.14
ductor    -0.14
apult     -0.14
Positive Logits
Token     Logit
èİ        0.16
zc        0.16
Bel       0.15
752       0.15
.vars     0.15
Ń         0.14
Rubio     0.14
608       0.13
anni      0.13
ninger    0.13
Activations Density: 0.021%
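
To make the density figure concrete, here is a minimal sketch of how such a number could be computed, assuming density means the fraction of tokens on which the feature's activation is above zero. That definition, the function name activation_density, the threshold parameter, and the sample data are all assumptions made for illustration; none of them come from this page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of tokens whose activation exceeds `threshold`.

    Assumption: "density" is the share of tokens on which the feature fires.
    The page does not state how its figure was computed.
    """
    if not activations:
        raise ValueError("activations must not be empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Synthetic data, invented for illustration: 100,000 tokens, 21 of which
# activate the feature. This corresponds to a density of 21 / 100,000.
sample = [0.0] * 100_000
for i in range(21):
    sample[i * 1000] = 1.0

density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.021% corresponds to the feature firing on roughly 21 out of every 100,000 tokens.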