INDEX
Explanations
geographic and cultural references related to specific regions
Negative Logits
Token    Logit
nea      -0.15
bd       -0.15
agic     -0.15
empty    -0.15
ldr      -0.14
owitz    -0.14
obil     -0.14
hrad     -0.14
earth    -0.14
onta     -0.14
Positive Logits
Token      Logit
occ        0.32
Occ        0.28
occ        0.25
الجن       0.25
Occ        0.24
Oriental   0.23
orient     0.23
ใต         0.22
_occ       0.21
الشم       0.21
Activations Density 0.067%