INDEX
Explanations
instances of dialogue or speech in the text
Negative Logits
ux        -0.16
holm      -0.15
ippi      -0.15
_BEGIN    -0.15
edin      -0.14
spo       -0.14
iping     -0.14
itia      -0.14
ITLE      -0.13
isu       -0.13
Positive Logits
anship    0.16
pent      0.16
yourself  0.16
pent      0.16
your      0.15
How       0.15
sir       0.15
mitters   0.15
Is        0.14
缮        0.14
Activations Density: 0.044%