INDEX
Explanations
references to authors and their works related to race and identity studies
Negative Logits
åĤ      -0.16
kola    -0.16
stab    -0.16
indo    -0.15
bÄĽ     -0.15
zn      -0.14
iae     -0.14
á¿Ĩ     -0.14
iaux    -0.14
BOSE    -0.13
Positive Logits
Thur    0.18
-Th     0.17
oke     0.16
Th      0.16
Th      0.15
/**<    0.15
th      0.15
,Th     0.15
thur    0.14
cod     0.14
Activations Density: 0.051%