INDEX
Explanations
references to character dynamics, particularly between heroes and villains
Negative Logits
onis        -0.15
hero        -0.15
umont       -0.15
çij         -0.15
tae         -0.14
Sherlock    -0.14
Heroes      -0.14
_hero       -0.14
unread      -0.14
704         -0.14
Positive Logits
evil        0.30
villain     0.28
vill        0.26
Evil        0.26
Vill        0.24
villains    0.23
evil        0.23
superv      0.21
evils       0.21
boss        0.20
Activations Density: 0.215%