INDEX
Explanations
motifs of confusion, frustration, and emotional disturbance in narratives
Negative Logits
Token    Logit
stral    -0.18
gie      -0.15
ven      -0.15
iel      -0.14
Sark     -0.14
ijke     -0.14
kl       -0.14
:        -0.13
its      -0.13
/ref     -0.13
Positive Logits
Token    Logit
ingly    0.41
Tactics  0.16
edly     0.16
eyle     0.16
rowad    0.15
edo      0.15
вас      0.15
άνι      0.15
//~      0.15
uku      0.15
Activations Density: 0.074%