INDEX
Explanations
dialogue that includes emotional or tense interactions between characters
Negative Logits
erti        -0.16
alars       -0.16
ogan        -0.15
owi         -0.15
oga         -0.15
porno       -0.14
dna         -0.14
karak       -0.14
esis        -0.14
lemen       -0.14
Positive Logits
BX          0.14
asel        0.14
cko         0.14
.Glide      0.13
od          0.13
kh          0.13
781         0.12
Airlines    0.12
mare        0.12
_browser    0.12
Activations Density: 1.872%