Explanations
references to geographical locations and entities
Negative Logits
Token      Logit
rist       -0.18
jang       -0.16
odial      -0.15
เ          -0.15
usp        -0.15
-UA        -0.14
стит       -0.14
quip       -0.14
Disallow   -0.14
idan       -0.14
Positive Logits
Token        Logit
Holland      0.43
Netherlands  0.41
Dutch        0.40
Nederland    0.32
holland      0.32
olland       0.27
Hague        0.26
Amsterdam    0.25
Belgium      0.24
.nl          0.24
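The page does not say how the two logit lists above are computed. One common way to produce lists like these is to project a feature's output direction through the model's unembedding matrix and sort the resulting per-token logits. The sketch below illustrates that idea only. The function name `top_logit_effects`, the variable names, and the toy matrix are all hypothetical, and the toy numbers merely echo a few values from the lists.

```python
import numpy as np

def top_logit_effects(direction, unembed, vocab, k=10):
    """Return the k most negative and k most positive (token, logit) pairs.

    direction: (d_model,) feature output direction (assumed, not from the source)
    unembed:   (d_model, vocab_size) unembedding matrix (assumed)
    vocab:     list of token strings, one per column of `unembed`
    """
    logits = direction @ unembed      # one logit contribution per vocabulary token
    order = np.argsort(logits)        # indices sorted from lowest to highest logit
    negative = [(vocab[i], float(logits[i])) for i in order[:k]]
    positive = [(vocab[i], float(logits[i])) for i in order[::-1][:k]]
    return negative, positive

# Toy example: a 2-dimensional model with a 4-token vocabulary.
vocab = ["Holland", "Dutch", "quip", "rist"]
unembed = np.array([
    [0.43, 0.40, -0.14, -0.18],
    [0.00, 0.10, 0.00, 0.10],
])
direction = np.array([1.0, 0.0])

negative, positive = top_logit_effects(direction, unembed, vocab, k=2)
print("negative:", negative)
print("positive:", positive)
```

Under this reading, a positive value means the feature pushes the model toward predicting that token, and a negative value means it pushes away from it.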
Activations Density 0.054%
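The density figure is not defined on the page. Assuming it means the fraction of sampled tokens on which the feature has a nonzero activation, a value of 0.054% corresponds to roughly 54 active tokens per 100,000. The sketch below computes a density under that assumption. The function name `activation_density`, the threshold parameter, and the synthetic activations are hypothetical.

```python
import numpy as np

def activation_density(activations, threshold=0.0):
    """Fraction of tokens whose activation exceeds `threshold` (assumed definition)."""
    acts = np.asarray(activations)
    return float(np.mean(acts > threshold))

# Synthetic data: 100,000 tokens, of which 54 activate the feature.
acts = np.zeros(100_000)
acts[:54] = 1.0

density = activation_density(acts)
print(f"{density:.3%}")
```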