Explanations
instances of location-related words, particularly "here" and "there."
Negative Logits
gis      -0.18
raÄį     -0.15
ër       -0.15
ëĮ       -0.14
lass     -0.14
illy     -0.14
feu      -0.14
elay     -0.14
theid    -0.14
wap      -0.14
Positive Logits
zelf      0.16
<article  0.14
sole      0.14
abant     0.14
adian     0.13
gun       0.13
Gun       0.13
åĨł       0.13
mechan    0.13
unga      0.13
Activations Density 0.020%