Explanations
references to nationalities or ethnic identities
nationality and gender descriptors
Negative Logits
intptr      -0.43
Glej        -0.38
depiction   -0.35
Frankel     -0.35
aksi        -0.35
flourished  -0.34
См          -0.34
disparu     -0.34
aldı        -0.34
问题的      -0.33
Positive Logits
يتيمه          0.66
female         0.59
casian         0.58
########.      0.57
abancı         0.55
femininas      0.54
<=",           0.54
MethodManager  0.53
tvguidetime    0.52
EconPapers     0.52
Activations Density: 0.101%