Explanations
prepositions and references to academic departments or institutions
Negative Logits
abar    -0.16
ffm     -0.15
963     -0.15
ople    -0.15
å¥Ĺ     -0.14
ngle    -0.14
bd      -0.14
mond    -0.14
abis    -0.14
niž     -0.13
Positive Logits
Lump    0.14
ango    0.14
stadt   0.13
cán     0.13
sted    0.13
ixa     0.13
vard    0.13
ãĤ¥     0.13
verse   0.12
rele    0.12
Activations Density: 0.054%