INDEX
Explanations
references to cities, particularly Kuala Lumpur
Negative Logits
Philippine    -0.16
pron          -0.16
Indonesian    -0.15
nesia         -0.15
_ma           -0.15
kariy         -0.15
Indonesia     -0.14
Indones       -0.14
Abb           -0.14
arness        -0.14
Positive Logits
Perl          0.19
Johan         0.17
laut          0.16
antan         0.16
impulse       0.15
Bes           0.15
raman         0.15
Palestin      0.14
Hulu          0.14
ãĤ²           0.14
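The last positive-logit token, ãĤ², looks like the printable form that GPT-2-style byte-level BPE vocabularies use for raw UTF-8 bytes, not like text damaged in extraction. The page does not name the tokenizer, so treat that as an assumption. Under it, the sketch below rebuilds the standard byte-to-character table and inverts it to recover the underlying text. The helper names `bytes_to_unicode` and `decode_token` are illustrative only.

```python
def bytes_to_unicode():
    """Build the byte -> printable-character table used by GPT-2-style
    byte-level BPE (assumed here; the page does not state the tokenizer)."""
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(ord("¡"), ord("¬") + 1))
        + list(range(ord("®"), ord("ÿ") + 1))
    )
    chars = printable[:]
    offset = 0
    for b in range(256):
        if b not in printable:
            # Non-printable bytes are shifted to code points 256 and above.
            printable.append(b)
            chars.append(256 + offset)
            offset += 1
    return {b: chr(c) for b, c in zip(printable, chars)}


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a displayed vocabulary token back to the text it encodes."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Tokens can end mid-character, so replace incomplete sequences
    # instead of raising.
    return raw.decode("utf-8", errors="replace")


print(decode_token("ãĤ²"))
```

If the assumption holds, the three displayed characters are the bytes E3 82 B2, which is the UTF-8 encoding of the katakana ゲ (U+30B2). Plain ASCII tokens such as Perl pass through unchanged.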
Activations Density 0.056%