INDEX
Explanations
expressions of affection and emotional connections between characters
Negative Logits
ultipart    -0.16
upe         -0.15
abol        -0.15
unik        -0.15
Exposure    -0.14
ifter       -0.14
amer        -0.14
割          -0.14
@js         -0.14
bev         -0.14
Positive Logits
hug         0.26
arms        0.25
embrace     0.24
hugged      0.23
arm         0.22
/arm        0.21
embraced    0.19
抱          0.18
embraces    0.18
hugs        0.18
Activations Density 0.132%