INDEX
Explanations
occurrences of the word "its" indicating possession or relevance
Negative Logits
his     -0.16
han     -0.15
aise    -0.14
hand    -0.14
hip     -0.14
hy      -0.14
egg     -0.14
的地方  -0.14
стр     -0.14
ig      -0.14
Positive Logits
own     0.26
lef     0.26
iner    0.25
itself  0.23
obre    0.18
esser   0.18
panic   0.17
elve    0.17
urm     0.17
/her    0.17
Activations Density: 0.110%