Explanations
reflexive pronouns and expressions of self-reference
Negative Logits
Desta          -0.56
\}\\           -0.56
NonQuery       -0.55
tonic          -0.54
enrique        -0.54
asilan         -0.54
jkl            -0.53
manas          -0.53
panoramique    -0.52
Ergänzung      -0.52
Positive Logits
itself         1.71
itself         1.67
Itself         1.50
themselves     1.23
themselves     1.16
himself        1.12
Himself        1.09
сама           1.08
zelf           1.07
selve          1.03
Activations Density 0.076%