Explanations
references to nationalities or ethnicities
Negative Logits
Tot      -0.15
aint     -0.15
IPC      -0.15
داشت     -0.15
мот      -0.15
abin     -0.14
deer     -0.14
婆       -0.14
átor     -0.14
opsis    -0.14
Positive Logits
abra     0.16
carrier  0.15
Roch     0.15
ROWS     0.15
jez      0.15
ADDE     0.14
ραν      0.14
ुभ       0.14
-local   0.14
-span    0.13
Activations Density 0.024%