INDEX
Explanations
references to familial and possessive pronouns
Negative Logits
Gratuit      -0.16
ãģĹãĤĩ       -0.15
Consortium   -0.15
ÑģÑĤве        -0.15
exus         -0.14
Ñģов         -0.14
ucu          -0.14
essen        -0.14
еÑĢин        -0.14
TES          -0.13
Positive Logits
own          0.17
ãĥ¼ãĥ¬       0.17
own          0.16
pler         0.15
ewe          0.14
ga           0.14
URITY        0.14
atof         0.14
ement        0.14
bye          0.14
Activations Density 0.240%