INDEX
Explanations
mentions of character names or roles in storytelling contexts
NEGATIVE LOGITS
Token     Logit
unnel     -0.16
Aviv      -0.16
erap      -0.15
ocket     -0.15
IPP       -0.14
chip      -0.14
_softc    -0.14
ÑĤол      -0.14
moire     -0.14
afort     -0.14
POSITIVE LOGITS
Token     Logit
Actor     0.35
actor     0.35
Actor     0.33
actors    0.32
actor     0.31
Actors    0.29
.actor    0.28
.Actor    0.26
_actor    0.26
(actor    0.24
Activations Density 0.001%
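
The logit lists above rank vocabulary tokens by how strongly the feature pushes their output logits up or down. A minimal sketch of one way such a ranking could be produced, assuming the values come from a dot product between a feature direction and each token's unembedding vector. The page does not state how its numbers were computed, and the function names, the vectors, and the leading space in " actor" are all hypothetical, chosen only for illustration:

```python
# Hypothetical sketch: rank vocabulary tokens by a feature's direct logit effect.
# Every name and number here is an assumption for illustration, not taken
# from the tool that produced this page.

def logit_effects(feature_direction, unembedding):
    """Dot the feature direction with each token's unembedding vector."""
    return {
        token: sum(f * w for f, w in zip(feature_direction, vector))
        for token, vector in unembedding.items()
    }

def top_and_bottom(effects, k):
    """Return the k most positive and the k most negative (token, value) pairs."""
    ranked = sorted(effects.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k], ranked[::-1][:k]

# Toy 3-dimensional model with a 4-token vocabulary (made-up vectors).
feature_direction = [1.0, 0.0, -1.0]
unembedding = {
    " actor": [0.30, 0.10, -0.05],
    "Actor": [0.23, 0.00, -0.10],
    "chip": [-0.04, 0.20, 0.10],
    "unnel": [-0.10, 0.00, 0.06],
}

effects = logit_effects(feature_direction, unembedding)
positive, negative = top_and_bottom(effects, k=2)

for token, value in positive + negative:
    print(f"{token!r:10} {value:+.2f}")
```

In this toy setup the two "actor" variants rank highest and "unnel" lowest, mirroring the shape of the lists above. Entries that look identical in the lists, such as the two "Actor" rows, may be distinct tokens that differ only in leading whitespace lost during extraction.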