Explanations
references to Indigenous identities and cultures
Negative Logits
ÑģÑı     -0.18
rij      -0.17
loff     -0.16
lix      -0.16
sg       -0.16
nero     -0.15
ses      -0.15
Ùī       -0.15
naire    -0.15
shire    -0.15
Positive Logits
-born       0.28
/native     0.24
born        0.23
born        0.20
vore        0.20
tongue      0.20
/original   0.19
Hawaiian    0.19
-made       0.18
bred        0.18
Activations Density 0.018%
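
Two entries in the negative logits list, ÑģÑı and Ùī, look like raw byte-level BPE token strings rather than readable text. The sketch below shows one way to decode them. It assumes the model uses a GPT-2-style byte-to-character table, which the page does not state. The name decode_token is hypothetical and chosen only for illustration.

```python
def bytes_to_unicode():
    """Rebuild the GPT-2-style byte -> printable-character table.

    Assumption: the dashboard's tokenizer uses this byte-level scheme.
    """
    # Bytes that are already printable map to themselves.
    bs = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    cs = bs[:]
    n = 0
    # Every remaining byte is shifted to a code point at 256 and above.
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))


# Inverse table: displayed character -> original byte value.
UNICODE_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a displayed token string back to the UTF-8 text it encodes."""
    raw = bytes(UNICODE_TO_BYTE[ch] for ch in token)
    # Token fragments can split a multi-byte character, so avoid raising.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["\u00d1\u0123\u00d1\u0131", "\u00d9\u012b", "rij"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, ÑģÑı decodes to the Cyrillic string "ся" and Ùī decodes to the Arabic letter "ى". Plain ASCII tokens such as rij come back unchanged. A character outside the table raises KeyError, which would suggest the token came from a different tokenizer.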