Explanations
references to ownership and individuality
Negative Logits
Token          Logit
novità         -0.71
vectorielles   -0.70
er             -0.68
culoare        -0.66
faptul         -0.63
malattie       -0.63
Protestants    -0.62
tuturor        -0.62
gră            -0.61
amac           -0.61
Positive Logits
Token          Logit
own            1.46
OWN            1.24
Own            1.21
sendiri        1.08
Own            1.07
personal       1.03
own            1.02
PreferredItem  0.98
eigenen        0.97
selves         0.92
Activations Density: 0.052%