Explanations
references to Indonesia or Indonesian identity
Negative Logits
للمعارف    -0.63
keinem     -0.53
sammen     -0.52
ら         -0.50
Lähteet    -0.50
iprot      -0.49
wachten    -0.49
ريل        -0.49
político   -0.49
hoffen     -0.48
Positive Logits
Indonesia      0.92
Indonesia      0.85
Indonesian     0.76
AddTagHelper   0.76
Indones        0.71
ertale         0.67
ID             0.65
JAKARTA        0.65
indonesia      0.65
UnsafeEnabled  0.65
Activations Density: 0.055%