INDEX
Explanations
references to specific countries or the concept of a country itself
Negative Logits
olin          -0.18
pron          -0.15
engin         -0.15
鳴            -0.15
latin         -0.14
esar          -0.14
OutOfRange    -0.14
bane          -0.14
Bene          -0.14
.telegram     -0.14
Positive Logits
izia          0.16
oglob         0.15
omore         0.15
рич           0.15
mour          0.14
шев           0.14
Mour          0.14
realistic     0.14
尾            0.14
acha          0.14
Activations Density 0.023%