INDEX
Explanations
mentions of specific religious or ethnic groups and discussions surrounding their treatment or representation
Negative Logits
rez
-0.16
arro
-0.15
ancel
-0.15
ORD
-0.15
buie
-0.14
ispecies
-0.14
вв
-0.14
zd
-0.14
akov
-0.14
SCRI
-0.14
Positive Logits
например
0.15
ResponseType
0.15
TELE
0.14
erable
0.14
MERCHANTABILITY
0.14
ğ
0.13
наприклад
0.13
مثلا
0.13
ilder
0.13
AMA
0.12
Activations Density 0.252%