Explanations
Proper nouns and geographical references, particularly those related to India and its administrative divisions.
Negative Logits
Token     Logit
690       -0.16
Byl       -0.15
rw        -0.15
itable    -0.15
etro      -0.14
_hosts    -0.14
Hier      -0.14
iedo      -0.14
волÑı     -0.14
pena      -0.14
Positive Logits
Token     Logit
obre      0.16
Sp        0.16
oub       0.15
cü        0.15
ady       0.14
æ¢        0.14
uve       0.14
odi       0.14
uide      0.14
ovic      0.14
Activations Density 0.041%