INDEX
Explanations
instances of character interactions and relationships in narratives
Negative Logits
арч      -0.20
oto      -0.16
raud     -0.15
Arc      -0.14
bid      -0.14
andest   -0.14
Xxx      -0.13
aur      -0.13
mour     -0.13
utan     -0.13
Positive Logits
vag      0.16
uckles   0.16
uy       0.15
orny     0.15
ick      0.14
вор      0.14
Ease     0.14
ucks     0.14
ecom     0.14
orns     0.13
Activations Density 0.081%