INDEX
Explanations
dialogue and conversational interactions among characters
Negative Logits
vetica     -0.16
_ASSUME    -0.15
δά         -0.15
coquine    -0.14
íĥģ        -0.14
orum       -0.14
.until     -0.14
ecycle     -0.14
vider      -0.14
dge        -0.14
Positive Logits
done       0.29
ain        0.28
been       0.28
Ain        0.27
AIN        0.26
don        0.25
seen       0.24
ain        0.24
Cain       0.23
gonna      0.23
Activations Density 0.299%