Explanations
various forms of possessive pronouns and references to characters' physical interactions with one another
Negative Logits
得            -0.16
钟            -0.14
Bone          -0.14
backgrounds   -0.14
Heads         -0.14
logan         -0.14
illi          -0.14
yourselves    -0.14
erot          -0.14
Feeling       -0.14
Positive Logits
left          0.25
right         0.23
face          0.21
sing          0.20
arms          0.19
arm           0.19
index         0.19
bare          0.18
jacket        0.18
hand          0.18
Activations Density: 0.133%