INDEX
Explanations
place names, specifically geographical locations
Negative Logits
zan      -0.20
isle     -0.15
thane    -0.15
erus     -0.15
uri      -0.15
Ler      -0.14
azard    -0.14
stri     -0.14
inter    -0.14
222      -0.13
Positive Logits
olle             0.16
Reality          0.16
tridge           0.15
reality          0.15
asz              0.15
brig             0.15
ayar             0.14
Reality          0.14
ComputedStyle    0.14
McCorm           0.14
Activations Density 0.015%