INDEX
Explanations
words and phrases related to personal identity and social connections
Negative Logits
iya     -0.15
kraj    -0.15
kon     -0.15
emez    -0.14
Spo     -0.14
koc     -0.14
comm    -0.14
lô      -0.14
Cle     -0.14
dik     -0.14
Positive Logits
että    0.16
aad     0.15
ellen   0.15
allon   0.15
gii     0.15
ksi     0.15
ELLOW   0.14
herits  0.14
meille  0.14
aks     0.14
Activations Density 0.004%