INDEX
Explanations
instances of the pronoun "he" and its variations
Negative Logits
aren         -0.16
ظÙĩ          -0.16
Were         -0.15
*=*=         -0.15
_compiler    -0.14
são          -0.14
weren        -0.14
luv          -0.14
ajs          -0.14
Were         -0.14
Positive Logits
/her         0.44
/she         0.44
himself      0.42
aviest       0.30
avier        0.29
inous        0.28
arken        0.27
idelberg     0.26
aped         0.26
Himself      0.25
Activations Density: 0.351%