Explanations
references to personal pronouns and possessive forms of "he" and "his."
Negative Logits
reeNode       -0.08
ogs           -0.07
omik          -0.07
lasses        -0.07
¶Į            -0.07
eln           -0.07
ковÑĸ         -0.07
евид          -0.07
ertainment    -0.07
usercontent   -0.07
Positive Logits
or            0.10
/her          0.08
/she          0.06
onom          0.06
彼女          0.06
(.)           0.06
sil           0.06
she           0.06
yo            0.06
osph          0.06
Activations Density 0.004%