Explanations
instances of emotional expression and reactions
Negative Logits
(token missing)  -0.59
StructEnd        -0.55
win              -0.53
mukaan           -0.52
fédé             -0.48
dikutip          -0.47
#                -0.46
win              -0.46
ITHUB            -0.46
//               -0.46
Positive Logits
nod      0.81
smile    0.79
hiss     0.79
gasp     0.78
cry      0.78
blink    0.77
sob      0.76
grin     0.76
sigh     0.76
giggle   0.74
Activations Density 0.387%