Explanations
references to political beliefs and ideologies
Negative Logits
ibo       -0.16
jem       -0.15
ibe       -0.15
idor      -0.15
ç¿Ĵ       -0.15
GBK       -0.14
ÑĨÑİ      -0.14
unde      -0.14
áž        -0.14
ãĥŃãĥ¼    -0.14
Positive Logits
sop         0.15
946         0.15
omorphic    0.15
ARA         0.14
ara         0.14
Order       0.13
088         0.13
太éĥİ       0.13
Ary         0.13
Marketable  0.13
Activations Density: 0.366%
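
Several tokens in the logit lists (for example ç¿Ĵ, ÑĨÑİ, ãĥŃãĥ¼) look like the printable stand-ins that GPT-2 style byte-level BPE tokenizers use for raw bytes. The page does not say which tokenizer produced them, so this is an assumption. Under it, a minimal sketch for mapping such a string back to readable text could look like the following. The helper name decode_token is hypothetical, and the byte table is rebuilt by hand rather than taken from any library.

```python
def bytes_to_unicode():
    """Byte -> character table in the style of GPT-2 byte-level BPE (assumed tokenizer)."""
    # Bytes that are already printable keep their own code point.
    bs = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to an unused code point starting at 256.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return {b: chr(c) for b, c in zip(bs, cs)}


# Inverse table: display character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token):
    """Hypothetical helper: turn a byte-level token string back into text."""
    raw = bytearray()
    for ch in token:
        if ch in CHAR_TO_BYTE:
            raw.append(CHAR_TO_BYTE[ch])
        else:
            # Character is outside the table (already decoded); keep its UTF-8 bytes.
            raw.extend(ch.encode("utf-8"))
    # Tokens can end mid-character, so invalid sequences are replaced, not raised.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["ibo", "\u00e7\u00bf\u0134", "\u00d1\u0128\u00d1\u0130"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Plain ASCII tokens such as ibo pass through unchanged. A token that does not decode to valid UTF-8 under this table either covers only part of a multi-byte character or was not produced by this kind of tokenizer, so the result should be read as a hint rather than a definitive decoding.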