INDEX
Explanations
dialogue interactions involving multiple characters
Negative Logits
itational    -0.77
arcity       -0.70
Forge        -0.68
iffe         -0.66
ourse        -0.64
requently    -0.64
irie         -0.64
qus          -0.63
reated       -0.62
ourses       -0.62
Positive Logits
huh          1.66
eh           1.48
haha         1.34
sir          1.28
yeah         1.09
tho          0.95
lol          0.95
anyways      0.93
though       0.91
alright      0.89
Activations Density 0.270%